The N+1 problem in GraphQL: DataLoader and the batching fix.

The N+1 problem is one of the first performance issues you hit with GraphQL. It occurs when resolving a list of N items, and each item triggers an additional query to fetch related data — resulting in 1 query for the list and N queries for the relations, for a total of N+1.

How the problem occurs

Consider a GraphQL query that fetches posts with their authors:

query {
  posts {
    id
    title
    author {
      name
      email
    }
  }
}

The naive resolver implementation:

const resolvers = {
  Query: {
    posts: () => db.query('SELECT * FROM posts LIMIT 20'),
  },
  Post: {
    // This runs once per post
    author: (post) => db.query(
      'SELECT * FROM users WHERE id = $1', [post.author_id]
    ),
  },
};

Fetching 20 posts triggers 1 query for the posts, then 20 separate queries for each author — 21 queries total. At 100 posts it’s 101 queries. At 1000 posts, it’s 1001.

You can see this in query logs: many nearly-identical SELECT * FROM users WHERE id = $1 queries running in sequence.

DataLoader: batch and cache

DataLoader, originally built by Facebook, solves this by batching multiple single-item lookups into one query within a single event loop tick.

import DataLoader from 'dataloader';

// The batch function receives an array of keys and returns an array of values
// in the same order
const userLoader = new DataLoader(async (userIds) => {
  const users = await db.query(
    'SELECT * FROM users WHERE id = ANY($1)',
    [userIds]
  );

  // DataLoader requires results in the same order as the input keys
  const userMap = new Map(users.map(u => [u.id, u]));
  return userIds.map(id => userMap.get(id) ?? null);
});

Now update the resolver to use the loader:

const resolvers = {
  Post: {
    author: (post) => userLoader.load(post.author_id),
  },
};

When 20 posts resolve simultaneously (within the same event loop tick), DataLoader collects all 20 author_id values and makes a single batched query. The result: 2 queries total instead of 21.

Request-scoped loaders

DataLoader also caches within a request — calling userLoader.load('usr_123') twice returns the same cached result the second time. This is only safe when the loader is scoped to a single request. Sharing a loader across requests would return stale data.

// Create loaders per request in your GraphQL context
function createContext(req) {
  return {
    user: req.user,
    loaders: {
      user: new DataLoader(async (ids) => {
        const users = await db.query(
          'SELECT * FROM users WHERE id = ANY($1)', [ids]
        );
        const map = new Map(users.map(u => [u.id, u]));
        return ids.map(id => map.get(id) ?? null);
      }),
      post: new DataLoader(async (ids) => {
        const posts = await db.query(
          'SELECT * FROM posts WHERE id = ANY($1)', [ids]
        );
        const map = new Map(posts.map(p => [p.id, p]));
        return ids.map(id => map.get(id) ?? null);
      }),
    },
  };
}

// In resolvers, access loaders through context
const resolvers = {
  Post: {
    author: (post, _args, context) =>
      context.loaders.user.load(post.author_id),
  },
  Comment: {
    post: (comment, _args, context) =>
      context.loaders.post.load(comment.post_id),
    author: (comment, _args, context) =>
      context.loaders.user.load(comment.author_id),
  },
};

Batching with filters

DataLoader’s batch function receives an array of keys. For more complex lookups, use a composite key:

// Batch loading posts by author, with a status filter
const postsByAuthorLoader = new DataLoader(
  async (keys) => {
    // keys = [{ authorId: '1', status: 'published' }, ...]
    const authorIds = [...new Set(keys.map(k => k.authorId))];

    const posts = await db.query(`
      SELECT * FROM posts
      WHERE author_id = ANY($1)
      ORDER BY created_at DESC
    `, [authorIds]);

    // Group by author
    const byAuthor = new Map();
    for (const post of posts) {
      if (!byAuthor.has(post.author_id)) byAuthor.set(post.author_id, []);
      byAuthor.get(post.author_id).push(post);
    }

    return keys.map(({ authorId, status }) =>
      (byAuthor.get(authorId) ?? []).filter(p => p.status === status)
    );
  },
  { cacheKeyFn: (key) => `${key.authorId}:${key.status}` }
);

When DataLoader isn’t enough

DataLoader batches requests within a single tick. For deeply nested schemas where each level triggers resolution before the next level starts, some N+1 patterns remain. In those cases, alternatives like query planning at the schema level (Graphile’s look-ahead, Pothos with query complexity analysis) or moving to a join-based query approach for specific resolvers are worth exploring.

But for the standard case of resolving related records in a list query, DataLoader turns N+1 into 1+1, and that change in query count makes the difference between an endpoint that degrades under load and one that doesn’t.