js

Why OFFSET Pagination Breaks at Scale—and What to Use Instead

Discover why OFFSET pagination fails with large datasets and learn scalable alternatives like cursor and keyset pagination.

Why OFFSET Pagination Breaks at Scale—and What to Use Instead

I was building an API for a social media platform when I hit a wall. Users were complaining that their feeds were showing the same posts twice. Our database was groaning under the load of simple OFFSET queries. That’s when I realized most pagination tutorials only scratch the surface. They teach you the basics that break at scale. Today, I want to share what I learned about making pagination fast, consistent, and reliable.

Have you ever clicked “next page” only to see items you just looked at? That’s the problem we’re solving.

Let’s start with why the common approach fails. Most developers learn offset-based pagination first. You take a page number and a limit, calculate an offset, and fetch your data. It looks clean in tutorials.

// This is what many APIs start with
app.get('/posts', async (req, res) => {
  const page = Number(req.query.page) || 1;
  const limit = 20;
  const offset = (page - 1) * limit;
  
  const posts = await db.query(
    'SELECT * FROM posts ORDER BY created_at DESC OFFSET $1 LIMIT $2',
    [offset, limit]
  );
  
  res.json({ posts, page });
});

This works fine with small datasets. But what happens when you have a million records? The database must scan through all those rows to reach your offset. It’s like asking someone to find page 5000 in a book by counting every page from the beginning. The performance gets worse with each page.

There’s another issue. What if someone adds a new post while you’re browsing? The entire order shifts. You might see the same post on two different pages, or miss some entirely. This creates a frustrating experience.

So what’s the alternative? Let me introduce you to cursor-based pagination. Instead of using page numbers, you use a pointer to the last item you saw. The client doesn’t need to know about page 1, 2, or 3. They just ask for items after a specific point.

Think of it like a bookmark in that book. You don’t say “show me page 5000.” You say “show me what comes after my bookmark.”

Here’s how it works in practice. First, we need to create cursors. A cursor is just an encoded version of the last item’s unique identifier. We usually include the item’s ID and a timestamp.

// A simple cursor encoder
export class CursorEncoder {
  static encode(id: string, timestamp: Date): string {
    const data = `${id}:${timestamp.getTime()}`;
    return Buffer.from(data).toString('base64url');
  }
  
  static decode(cursor: string): { id: string; timestamp: Date } {
    const data = Buffer.from(cursor, 'base64url').toString('utf8');
    const [id, timestamp] = data.split(':');
    return { id, timestamp: new Date(Number(timestamp)) };
  }
}

Now, how do we use this in a query? Instead of OFFSET 10000, we ask for items that come after our cursor.

async function getPostsAfterCursor(cursor: string, limit: number) {
  const decoded = CursorEncoder.decode(cursor);
  
  const query = `
    SELECT id, content, created_at
    FROM posts
    WHERE created_at < $1 OR (created_at = $1 AND id < $2)
    ORDER BY created_at DESC, id DESC
    LIMIT $3
  `;
  
  return await db.query(query, [
    decoded.timestamp,
    decoded.id,
    limit + 1  // Fetch one extra to check if there's more
  ]);
}

Notice the LIMIT $3 + 1? We fetch one extra item to know if there are more results. If we get 21 items when the limit is 20, we know there’s a next page. We return 20 items and include a “hasNextPage” flag.

This approach has several benefits. The query performance stays constant regardless of how far you’ve paginated. The results remain consistent even if new items are added. And users get a smooth experience without duplicates.

But what if you need to go backward? Cursor pagination supports that too. You just need to reverse your query logic and remember to reverse the results.

async function getPostsBeforeCursor(cursor: string, limit: number) {
  const decoded = CursorEncoder.decode(cursor);
  
  // Note: We flip the comparison and order
  const query = `
    SELECT id, content, created_at
    FROM posts
    WHERE created_at > $1 OR (created_at = $1 AND id > $2)
    ORDER BY created_at ASC, id ASC
    LIMIT $3
  `;
  
  const results = await db.query(query, [
    decoded.timestamp,
    decoded.id,
    limit + 1
  ]);
  
  // Reverse since we fetched in ascending order
  return results.reverse();
}

Now let’s talk about a different pattern: keyset pagination. It’s similar to cursor pagination but focuses on using indexed columns for filtering. The idea is simple - instead of scanning through rows, you jump directly to where you need to be.

Consider a table of products with prices. If you’re paginating by price, keyset pagination would look like this:

async function getProductsByPrice(lastPrice: number, lastId: string) {
  const query = `
    SELECT id, name, price
    FROM products
    WHERE price < $1 OR (price = $1 AND id < $2)
    ORDER BY price DESC, id DESC
    LIMIT 20
  `;
  
  return await db.query(query, [lastPrice, lastId]);
}

This uses a compound condition on price and ID. The database can use an index on (price, id) to find the exact position instantly. No scanning required.

What about complex queries with joins? This is where deferred joins shine. The problem with joining tables in paginated queries is that you’re joining all the rows before limiting them. That’s expensive.

Deferred joins solve this by doing the pagination first, then joining. Here’s the pattern:

async function getPostsWithAuthors(limit: number, offset: number) {
  // First, get just the post IDs we need
  const postIds = await db.query(`
    SELECT id
    FROM posts
    ORDER BY created_at DESC
    LIMIT $1 OFFSET $2
  `, [limit, offset]);
  
  // Then, join with authors using only those IDs
  const posts = await db.query(`
    SELECT p.*, a.name as author_name
    FROM posts p
    JOIN authors a ON p.author_id = a.id
    WHERE p.id = ANY($1)
    ORDER BY p.created_at DESC
  `, [postIds.rows.map(row => row.id)]);
  
  return posts.rows;
}

This two-step process can be dramatically faster. You’re only joining the exact rows you need to display, not every row in the table.

Each of these patterns has its place. Cursor pagination is excellent for infinite scroll feeds. Keyset pagination works well when you have clear, indexed columns to paginate by. Deferred joins save you when queries get complex.

But how do you choose? Start with your access patterns. What columns are users filtering and sorting by? Are they browsing forward through time or jumping around? Your answers will guide your choice.

Remember to handle edge cases. What happens when a cursor is invalid? What if someone tries to tamper with it? Always validate and sanitize your cursors.

function validateCursor(cursor: string): boolean {
  try {
    CursorEncoder.decode(cursor);
    return true;
  } catch {
    return false;
  }
}

Also consider encryption for sensitive cursors. Base64 encoding is transparent. Anyone can decode it and see your data structure. For sensitive applications, you might want to encrypt your cursors.

One more thing: testing. Pagination logic can be tricky. Make sure you test with real data volumes. Test what happens when items are added or deleted during pagination. Test backward navigation. Test with empty results.

I’ve seen teams spend weeks optimizing database queries when the real problem was their pagination approach. These patterns can give you instant performance improvements without changing your infrastructure.

The next time you’re building an API that needs to serve large datasets, think beyond OFFSET and LIMIT. Your users will thank you for the consistent experience. Your database will thank you for the reduced load. And you’ll thank yourself for avoiding those late-night performance fires.

What patterns have you found effective for handling large datasets? Have you encountered other pagination challenges? I’d love to hear about your experiences in the comments below. If you found this helpful, please share it with other developers who might be struggling with pagination at scale.


As a best-selling author, I invite you to explore my books on Amazon. Don’t forget to follow me on Medium and show your support. Thank you! Your support means the world!


101 Books

101 Books is an AI-driven publishing company co-founded by author Aarav Joshi. By leveraging advanced AI technology, we keep our publishing costs incredibly low—some books are priced as low as $4—making quality knowledge accessible to everyone.

Check out our book Golang Clean Code available on Amazon.

Stay tuned for updates and exciting news. When shopping for books, search for Aarav Joshi to find more of our titles. Use the provided link to enjoy special discounts!


📘 Checkout my latest ebook for free on my channel!
Be sure to like, share, comment, and subscribe to the channel!


Our Creations

Be sure to check out our creations:

Investor Central | Investor Central Spanish | Investor Central German | Smart Living | Epochs & Echoes | Puzzling Mysteries | Hindutva | Elite Dev | JS Schools


We are on Medium

Tech Koala Insights | Epochs & Echoes World | Investor Central Medium | Puzzling Mysteries Medium | Science & Epochs Medium | Modern Hindutva

Keywords: pagination,api performance,cursor pagination,keyset pagination,database optimization



Similar Posts
Blog Image
Building Event-Driven Microservices with NestJS, RabbitMQ and TypeScript: Complete 2024 Developer Guide

Master event-driven microservices with NestJS, RabbitMQ & TypeScript. Learn architecture patterns, distributed transactions & testing strategies.

Blog Image
Complete Guide to Integrating Next.js with Prisma ORM: Build Type-Safe Full-Stack Applications

Learn how to integrate Next.js with Prisma ORM for full-stack TypeScript apps with end-to-end type safety. Build faster with modern database tooling and optimized rendering.

Blog Image
Build Type-Safe Event-Driven Microservices: NestJS, RabbitMQ, and Prisma Complete Tutorial

Learn to build scalable event-driven microservices with NestJS, RabbitMQ & Prisma. Complete guide with type-safe schemas, error handling & Docker deployment.

Blog Image
Build a Real-Time Collaborative Document Editor with Operational Transforms, Socket.io, Redis, and MongoDB

Learn to build a real-time collaborative document editor with Operational Transforms using Socket.io, Redis & MongoDB. Complete tutorial with conflict resolution & scaling tips.

Blog Image
Complete Guide to Integrating Next.js with Prisma ORM: Build Type-Safe Full-Stack Applications

Learn how to integrate Next.js with Prisma ORM for type-safe database operations. Build full-stack apps with seamless data management and TypeScript support.

Blog Image
Complete Guide to Server-Sent Events with Node.js and TypeScript for Real-Time Data Streaming

Master Node.js TypeScript SSE implementation for real-time data streaming. Complete guide covers server setup, connection management, authentication & performance optimization.