Back to blog

How to Solve the N+1 Query Problem in GraphQL: Using DataLoader

GraphQL's ability to fetch nested data structures in a single query is its most celebrated feature. However, behind the scenes, this flexibility introduces a major database performance trap: the N+1 Query Problem.

If you fetch a list of items and their associated relations (e.g., a list of posts and the author details of each post), a naive GraphQL resolver will execute one database query for the list, and then a separate query for every single item in that list.

In this guide, we will analyze why the N+1 problem occurs, explore why standard SQL joins do not fit GraphQL’s resolver model, and implement a solution using Facebook's DataLoader library.

How the N+1 Query Problem Happens

GraphQL executes query resolvers recursively, moving from the top of the query tree downwards.

Suppose a client submits this query:

query {
  posts {
    title
    author {
      name
    }
  }
}

The GraphQL engine processes this in stages:

  1. It calls the posts resolver, which queries the database for all posts:
    • Query 1: SELECT * FROM posts; (Returns, say, 50 posts).
  2. For each of the 50 post objects, the engine resolves the nested author field. It executes the author resolver 50 times:
    • Query 2: SELECT * FROM users WHERE id = post1.author_id;
    • Query 3: SELECT * FROM users WHERE id = post2.author_id;
    • ...
    • Query 51: SELECT * FROM users WHERE id = post50.author_id;

Instead of pulling all the data in one or two database operations, the server makes 51 queries (1 primary query + N relation queries). If your site has high traffic, your database connection pool will exhaust instantly.

The Solution: Facebook's DataLoader

The standard tool to fix the N+1 query problem is DataLoader.

DataLoader is a utility library that solves the problem using two strategies: Batching and Caching.

  • Batching: Instead of querying the database immediately when a resolver requests a relation, DataLoader waits briefly (coalescing requests in the same Node.js Event Loop microtask queue Tick). It groups all individual keys into an array and requests them in a single batch query (e.g., SELECT * FROM users WHERE id IN (1, 2, 3...);).
  • Caching: If multiple posts share the same author, DataLoader caches the user record for the duration of the current HTTP request, preventing duplicate database checks.

Implementing DataLoader in Node.js

Let's write a resolver setup using DataLoader.

Step 1: Install DataLoader

pnpm add dataloader

Step 2: Define the Batch Loading Function

A batch loading function accepts an array of keys and must return a Promise that resolves to an array of values of the same length and in the same order as the keys.

import DataLoader from 'dataloader';

// DB user query helper
async function batchGetUsers(userIds: readonly number[]) {
  // Execute a single SQL query using the IN operator
  const users = await db.query('SELECT * FROM users WHERE id = ANY($1)', [userIds]);
  
  // CRITICAL: Map database rows back to match key order exactly
  const userMap = new Map(users.map((user) => [user.id, user]));
  return userIds.map((id) => userMap.get(id) || null);
}

// Instantiate the loader per request context
export const userLoader = new DataLoader(batchGetUsers);

Step 3: Wire Up the Resolvers

Instead of calling the database helper directly in the nested user resolver, pass the key to the loader:

const resolvers = {
  Query: {
    posts: async () => {
      // Query 1: Fetches all posts
      return db.query('SELECT * FROM posts');
    },
  },
  Post: {
    author: async (parentPost, args, context) => {
      // DataLoader intercepts this call, batches all author_ids, 
      // and executes one SELECT ... WHERE id IN (...) query.
      return userLoader.load(parentPost.author_id);
    },
  },
};

How the Database Query Graph Changes

By introducing DataLoader, the 51 database queries collapse into 2 queries:

  1. Query 1: SELECT * FROM posts; (Returns 50 posts).
  2. Query 2: SELECT * FROM users WHERE id IN (3, 7, 12, ...); (Fetches all unique authors in one query).

This dramatically reduces database loads, cuts response times, and scales naturally with nested lists.

Conclusion

The N+1 query problem is a natural side effect of GraphQL’s decoupled resolver architecture. By configuring a per-request DataLoader instance, you can batch individual relation lookups into combined SQL IN operations and cache duplicates during the request lifecycle, ensuring your GraphQL APIs remain fast.