Solve the AppSync Lambda Resolver N+1 Problem with BatchInvoke

If you're not familiar with the GraphQL N+1 problem then consider the following query.

query {
  authors {
    id
    name
    books {
      id
      name
    }
  }
}

The "authors" resolver is executed once and returns a list of N authors. The "books" resolver is then executed once for each author. This is the N+1 problem because you execute N resolvers for the books plus 1 for the authors.

As your query becomes deeper, the number of resolvers executed grows multiplicatively. Let's assume there are 10 authors, each with 5 books. Fetching the authors and books executes 11 resolvers (1 + 10), but if we also request publisher information for each book we now execute 61 resolvers (1 + 10 + 10 x 5).

query {
  authors {
    id
    name
    books {
      id
      name
      publisher {
        id
        name
      }
    }
  }
}
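The resolver counts above can be reproduced with a few lines of arithmetic. This is only a sketch: countResolverExecutions is a hypothetical helper, not part of AppSync, and it assumes every parent returns the same number of items and that every nested resolver uses Invoke.

```typescript
// Hypothetical helper: counts resolver executions when every resolver uses Invoke.
// fanOut[i] is the number of items each parent returns at nesting level i,
// where every returned item triggers one nested resolver.
function countResolverExecutions(fanOut: number[]): number {
    let executions = 1; // the root resolver ("authors") runs once
    let parents = 1;
    for (const itemsPerParent of fanOut) {
        parents *= itemsPerParent; // total items produced at this level
        executions += parents; // one nested resolver execution per item
    }
    return executions;
}

// 10 authors, each triggering a "books" resolver.
const authorsAndBooks = countResolverExecutions([10]);
// 10 authors x 5 books, each book triggering a "publisher" resolver.
const withPublishers = countResolverExecutions([10, 5]);
console.log(authorsAndBooks, withPublishers);
```

Each extra level of nesting multiplies the previous level's count, which is why deep queries become expensive so quickly.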

When configuring AppSync to use a Lambda function as a resolver you can use the Invoke or BatchInvoke operation.

The Invoke operation will cause AppSync to invoke your Lambda function every time it needs to execute the resolver. For the "authors" resolver in our example that is acceptable because it is only executed once.

import { AppSyncResolverHandler } from 'aws-lambda';

export const authorsHandler: AppSyncResolverHandler<unknown, { id: string; name: string }[]> = async (event) => {
    const authors = [
        { id: '1', name: 'Tajeddigt Olufemi' },
        { id: '2', name: 'Lelio Miodrag' },
        { id: '3', name: 'Aineias Vladimir' },
        { id: '4', name: 'Sachin Lamya' },
        { id: '5', name: 'Kamakshi Cosme' },
        { id: '6', name: 'James Sharmila' },
        { id: '7', name: 'Holden Wulfflæd' },
        { id: '8', name: 'Quidel Bahdan' },
        { id: '9', name: 'Reinout Johanna' },
        { id: '10', name: 'Til Nikica' },
        { id: '11', name: 'Metoděj Maxima' },
        { id: '12', name: 'Đình Ester' },
        { id: '13', name: 'Màxim Kristina' },
        { id: '14', name: 'Yedidia Jafar' },
        { id: '15', name: 'Lone Mariusz' },
        { id: '16', name: 'Vitya Franjo' },
        { id: '17', name: 'Malvolio Lochlann' },
        { id: '18', name: 'Evette Dierk' },
        { id: '19', name: 'Nnenna Basileios' },
        { id: '20', name: 'Dmitrei Iya' },
    ];
    return authors;
};

For the "books" resolver we really want to use the BatchInvoke operation. To use it with the direct Lambda resolver integration add the MaxBatchSize property to your resolver definition.

  BooksResolver:
    Type: AWS::AppSync::Resolver
    Properties:
      ApiId: !GetAtt Api.ApiId
      DataSourceName: !GetAtt BooksDataSource.Name
      TypeName: Author
      FieldName: books
      MaxBatchSize: 1000

AppSync will now invoke your Lambda with an array of up to MaxBatchSize events. This allows your Lambda to process multiple resolver executions in one invocation instead of individually.

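import { AppSyncBatchResolverHandler } from 'aws-lambda';

// Type parameters: arguments, result for a single event, and the source (parent author).
export const lambdaHandler: AppSyncBatchResolverHandler<
    unknown,
    { id: string; name: string }[],
    { id: string; name: string }
> = async (events) => {
    // Return one list of books per event, in the same order as the events.
    return events.map((event) => {
        const books: { id: string; name: string }[] = [];
        for (let i = 1; i <= 20; i++) {
            // event.source is the author this resolver execution belongs to.
            books.push({ id: `${event.source.id}-${i}`, name: `Book ${i}` });
        }
        return books;
    });
};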

By handling a batch of resolver executions in a single Lambda invocation you can improve performance by reducing the number of cold starts, and lower costs by making fewer Lambda invocations.

Tip: In production you may want to experiment with values for MaxBatchSize, as you are trading off the risk of a cold start and cost against being able to process requests in parallel.

What about our publishers?

You should always use BatchInvoke for nested Lambda resolvers when possible. At each level of nesting AppSync determines the events that need to be sent to your Lambda resolver, then passes them in arrays of up to MaxBatchSize events, resulting in the minimum number of Lambda invocations. With a sufficiently high MaxBatchSize value you may only need one Lambda invocation for each level of nesting in your query.
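To get a feel for the effect of MaxBatchSize, here is a rough estimate of invocations per nesting level. The batchInvocations helper is hypothetical, and it assumes AppSync fills every batch up to MaxBatchSize, which is a simplification.

```typescript
// Hypothetical helper: estimates Lambda invocations for one nesting level.
// Assumes every batch is filled up to maxBatchSize before a new one starts.
function batchInvocations(resolverExecutions: number, maxBatchSize: number): number {
    return Math.ceil(resolverExecutions / maxBatchSize);
}

// 10 authors with 5 books each: 10 "books" events and 50 "publisher" events.
const booksLevel = batchInvocations(10, 1000); // one invocation covers all authors
const publisherLevel = batchInvocations(50, 1000); // one invocation covers all books
const smallBatches = batchInvocations(50, 20); // a smaller batch size needs several
console.log(booksLevel, publisherLevel, smallBatches);
```

Under these assumptions the 61 resolver executions from the earlier example collapse into three Lambda invocations: one for authors, one for books and one for publishers.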

By using BatchInvoke you may also be able to optimize your resolvers through more efficient processing of requests.

In our example we request the publisher for each book, and the same publisher is likely to appear many times. With Invoke, each resolver execution would use GetItem to fetch the publisher record from DynamoDB, so the same record would be retrieved repeatedly. With BatchInvoke we can build a list of unique publisher IDs, use BatchGetItem to fetch them all at once from DynamoDB, then map over the events to return the correct publisher record for each resolver. This reduces the number of DynamoDB reads and makes the query quicker.
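A minimal sketch of that deduplication, under some loud assumptions: each book carries a publisherId field (not shown in the earlier examples), the event type is a local stand-in for the aws-lambda types, and loadPublishers is a hypothetical function that in practice would wrap a DynamoDB BatchGetItem call.

```typescript
type Publisher = { id: string; name: string };

// Local stand-in for the batch event shape; publisherId on the book is an assumption.
type BookEvent = { source: { id: string; publisherId: string } };

// Hypothetical loader: a real one would wrap a DynamoDB BatchGetItem call.
type PublisherLoader = (ids: string[]) => Promise<Publisher[]>;

function createPublisherHandler(loadPublishers: PublisherLoader) {
    return async (events: BookEvent[]): Promise<(Publisher | null)[]> => {
        // Collect each publisher ID once, however many books reference it.
        const uniqueIds = Array.from(new Set(events.map((event) => event.source.publisherId)));

        // Fetch all unique publishers in one call to the loader.
        const publishers = await loadPublishers(uniqueIds);
        const byId = new Map<string, Publisher>();
        for (const publisher of publishers) {
            byId.set(publisher.id, publisher);
        }

        // One result per event, in the same order as the events.
        return events.map((event) => byId.get(event.source.publisherId) ?? null);
    };
}
```

Injecting the loader keeps the handler testable without DynamoDB. BatchGetItem accepts at most 100 keys per request and can return unprocessed keys, so a real loader would need to chunk the IDs and retry as required.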