What is pagination in GraphQL?
Querying a list of data in discrete parts
Pagination means fetching one part of a large dataset at a time, and continuing to fetch parts until the whole dataset has been received. This helps optimize our app's performance, because trying to get all the data at once results in a slow load time and slow rendering of the results in the UI.
For example, let's say you have a blog website, and on the blog page, you display blog posts in a list view. If you have 1000 blog posts in your database, the blog list page will fetch the 1000 blog posts and display them.
Fetching 1000 blog posts in one go will eat up load time. Mapping through the 1000 fetched blog posts and rendering them in the UI will also eat up browser memory and hurt performance.
In this case, we might experience a slow response from our browser, and the UI will respond sluggishly to our actions.
A common way to mitigate this is to paginate our queries. For example, we can fetch 10 blog posts first, then, as the list view is scrolled, fetch the next 10 blog posts and append them to the UI. This way we optimize our app's performance.
Doing this in GraphQL is pretty easy. We use the limit and offset arguments.
Now, in pagination, data is broken into pages. We can define how many records a page contains. That's where offset and limit come in: limit is the number of records a page will contain, and offset is the number of records to skip.
Let's say that, out of our 1000 blog posts, we want each page to have 10 blog posts. The limit will then be 10. The offset tells the server how many records to skip, so fetching starts from the record right after the last skipped one.
Initially, to fetch the first page, the pagination details will be:
This will fetch 10 blog posts starting from the 1st record.
The second page will have:
This will fetch 10 records starting from the 11th record.
This is just like this SQL statement:
Notice that the client increments the offset to tell the server where in the database to start fetching the next records.
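The page arithmetic described above can be sketched in plain JavaScript. This is only an illustration over an in-memory array, assuming 1-based page numbers; `pageToOffset` and `fetchPage` are hypothetical helper names, not part of GraphQL or Strapi.

```javascript
// Hypothetical in-memory stand-in for the 1000 blog posts in the database.
const blogPosts = Array.from({ length: 1000 }, (_, i) => ({ id: i + 1 }));

// Convert a 1-based page number into the number of records to skip.
function pageToOffset(page, limit) {
  return (page - 1) * limit;
}

// Skip `offset` records, then take at most `limit` records.
function fetchPage(records, limit, offset) {
  return records.slice(offset, offset + limit);
}

// Page 1 skips 0 records; page 2 skips the first 10.
const firstPage = fetchPage(blogPosts, 10, pageToOffset(1, 10));
const secondPage = fetchPage(blogPosts, 10, pageToOffset(2, 10));
```

The first page starts at the 1st record and the second page starts at the 11th, which matches the limit and offset values described above.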
A GraphQL query of our blog posts will be this:
Let's use Strapi to help demonstrate how pagination in GraphQL works.
Scaffold a Strapi-GraphQL project
Run this command to create a Strapi project:
This will create a Strapi project with folder name graphql-prj.
We will install a GraphQL plugin that will enable us to build GraphQL endpoints in Strapi.
Move into the graphql-prj folder:
Run the below command in the folder:
This installs the GraphQL plugin.
Navigate to http://localhost:1337/graphql in your browser.
You will see the GraphQL playground UI show up. Now, we have to register first before we can perform authorized requests.
The registration page will open up. Enter your details and click on "Let's Start".
Create a new collection, name it blogPosts. Set it to have the fields:
Next, go to "Settings" -> "Roles".
On both "Authenticated" and "Public" pages, scroll down to the "Permissions" section and click on the "Select all" check box. Scroll up and click on "Save".
Now, we are set. Add mock data for blogPosts; you can add 20 blog posts.
Create a new tab in the GraphQL UI playground, use this mutation to create mock blog posts:
After you are done, move back to the GraphQL UI playground. That's where we play it all out.
There are many types of pagination we can use:
- Offset-based pagination
- Cursor-based pagination
- Relay style pagination
Offset-based pagination is the type we discussed at the beginning of this article.
Let's see how we can use offset-based pagination in our Strapi endpoints.
We want to fetch blog posts with a limit of 2 posts.
Our first dataset will be this:
- limit (integer): Define the number of returned entries.
- start (integer): Define the number of entries to skip.
The limit is set to 2, and the start is set to 0. The start is Strapi's offset.
The result will be this:
Check the ids and see that the offset began from the first (0) index. It started at index 0 and took two blog posts.
Let's get the next two blog posts. To do that, the start will be incremented:
Since the previous query started from index 0 and returned two posts, this one starts from index 2 in the database.
So the result will be this:
See the id started from 3.
How performant is offset-based pagination?
Offset-based pagination works well for datasets that are not too large.
What do we mean?
In offset-based pagination, the offset tells the database where to start the query, right? Yes, but the database engine does not work that way.
The database starts reading from the beginning and only begins collecting records when it reaches the specified offset.
So, if we have a dataset of 1M records and we want to query data from offset 2000, the database will walk 2000 steps through the dataset from index 0 before it starts collecting data at index 2000.
It is just like this in JS:
It iterates through the dataset from the start, then it starts collecting results from the offset. That's how offset-based pagination works in databases.
So we see that a large offset impacts the performance of offset-based pagination.
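A rough sketch of that walk, assuming the dataset is a plain array. `offsetPaginate` is a hypothetical name, and real database engines are more sophisticated than this loop; the `visited` counter is only there to make the wasted work visible.

```javascript
// Walk the dataset from index 0, skipping records until `offset` is reached,
// then collect up to `limit` records. `visited` counts every record touched.
function offsetPaginate(dataset, limit, offset) {
  const result = [];
  let visited = 0;
  for (let i = 0; i < dataset.length && result.length < limit; i++) {
    visited++;
    if (i < offset) continue; // still skipping: work is done, nothing is collected
    result.push(dataset[i]);
  }
  return { result, visited };
}

// Hypothetical dataset of 5000 records, each represented by its index.
const dataset = Array.from({ length: 5000 }, (_, i) => i);
const page = offsetPaginate(dataset, 10, 2000);
```

Here only 10 records are returned, but 2010 records are visited to produce them, and that overhead grows with the offset.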
Cursor-based pagination is the most efficient method of paging and should always be used where possible. - Facebook
Cursor-based pagination uses a record, or a pointer to a record, in the dataset to paginate results. The cursor refers to a record in the database.
The cursor is set to the id 1 of a record in the database. The database will start from the record with id 1 and collect 2 records below it.
See the result:
Now the last record here is id 2. To query for the next set of records, we pass in the next record id as the cursor.
This will make the GraphQL engine start from the record with id 3 and fetch 2 records below it.
See the id is acting as the cursor column. The next cursor is set from the last result. We used the id as the cursor pivot because it is auto-incremented and we can't have a case of missing/duplicate records.
Here the cursor refers to the created_at field. It will start from the record with created_at field set to "2020-03-12:08-00" and collect 12 records below it.
The result will be this:
These are the records it returns. Now, see that the last record has a created_at value "2020-03-03:07-00". This last record becomes the next cursor in the next query.
We see that in cursor-based pagination, the pagination is done based on a field of the records in the database. The cursor is more like a pointer. You tell the database where to start from, and it does not iterate through the records from the start before it begins collecting. It skips straight to the row the cursor points to and starts from there.
The equivalent in JS is this:
It finds the starting point in the dataset using the cursor, then iterates up to the limit and populates the result.
This is basically how cursor-based pagination works. It uses a record id in the database to locate its starting point, then slices off the required chunk. The last record in the returned chunk becomes the next cursor.
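A minimal in-memory sketch of this idea. It follows the worked example with ids 1 and 3 above, where the record the cursor points at is itself included and the next cursor is the id right after the returned chunk; an implementation that uses the last returned record as the cursor would start just after it instead. `cursorPaginate` is a hypothetical name, and `findIndex` merely stands in for the index lookup a database would perform.

```javascript
// Locate the record the cursor points at, then slice off `limit` records.
function cursorPaginate(dataset, limit, cursor) {
  // A database would use an index here; findIndex is only an in-memory stand-in.
  const start = dataset.findIndex((record) => record.id === cursor);
  if (start === -1) return { result: [], nextCursor: null };

  const result = dataset.slice(start, start + limit);
  const next = dataset[start + limit]; // first record after the chunk, if any
  return { result, nextCursor: next ? next.id : null };
}

// Hypothetical records with auto-incremented ids.
const records = [1, 2, 3, 4, 5, 6].map((id) => ({ id }));
const pageOne = cursorPaginate(records, 2, 1);
const pageTwo = cursorPaginate(records, 2, pageOne.nextCursor);
```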
Note: Using IDs as the cursor is one way of implementing cursor-based pagination, but there are other ways beyond the scope of this article. We can use other fields in our records, provided their values are unique.
We have to return more information about the remaining data so that we know which query to perform next.
The information can be:
- Total count: This tells us the total count of data in the database.
- Has next page: This indicates whether we have exhausted the pages or there are more pages.
- Next cursor: The cursor to use in the next query.
This information is computed and returned in the query result. This uses a concept called Connection.
Connection is a concept that started with RelayJS and is used for GraphQL. This concept is used by Facebook, Twitter, and GitHub for fetching the paginated records.
A Connection is an object that contains Edges and Nodes.
- Connection: This object contains metadata about the paginated field. This is mostly used in cursor-based pagination because it gives us extra info we use to perform the next query.
- Edges: This object contains the info about the parent and the child it contains. It connects two nodes, representing some kind of relationship between the two nodes. Each edge holds metadata about each object in the returned paginated result.
- Nodes: the objects held in the edges.
Let's see a complete example of cursor-based pagination in an Express.js GraphQL server.
First, we scaffold an Express.js project:
Move into it, initialize a Node.js environment, install dependencies, and create files:
The index.js is the entry point of our server, the schema.js will hold our GraphQL schema types, and the resolvers.js will hold resolvers.
We are going to build the GraphQL endpoints for a News app. It will have a news query where the news feed can be paginated. The query will be like this:
The totalCount is the total number of news items in the database. This is useful if we want to display the number of news items on our UI. The edges object, as we already know, contains info about each node; here it holds the cursor of each node. The node is the object for each news item. The pageInfo contains general information about the particular page returned. Here, the startCursor holds the cursor that will be used to fetch the next set of news items. hasNextPage is a boolean that tells us whether there are more news items or not.
So let's flesh out the schema:
The NewsItem is a type for a single news item. See it has fields id, title, and body.
The Edge is the type for an edge object. It has node and cursor fields.
The PageInfo is the type for page information. It has startCursor and hasNextPage fields.
The NewsResult holds the shape of the returned query result. It has totalCount, edges and pageInfo.
The Query has news, this tells us the root queries that can be performed. The news expects a first parameter, this parameter will be the limit of records to return from the database. The next parameter is afterCursor which points to the record where the database will start collecting records from.
Now, let's implement the resolvers:
We used the afterCursor argument to get the index of the cursor in the database. We used the Array#slice to collect the records starting from the afterCursor index, and the amount/number of records to collect using the first argument.
The result is mapped using the Array#map method, each news item in the result is converted to an edge. In the map() callback function, each news item is converted to a node, and the cursor is set using the news item id.
Further down, we set the startCursor. We got the last record in the edges and referenced the node object. This allows the dev to use this startCursor to query for another set of news items.
Last, an object literal (that is of NewsResult type) containing the totalCount, edges object and pageInfo object, is returned.
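The resolver described above might look roughly like the following sketch. It assumes numeric ids, an in-memory `news` array with mock data, and that a missing `afterCursor` means "start from the beginning"; the server wiring is left out, so this is only the resolver map and not a definitive implementation.

```javascript
// Hypothetical mock data standing in for the news records in the database.
const news = Array.from({ length: 25 }, (_, i) => ({
  id: i + 1,
  title: `News ${i + 1}`,
  body: `Body of news ${i + 1}`,
}));

const resolvers = {
  Query: {
    news: (parent, { first, afterCursor }) => {
      // Find the index of the cursor record; start from 0 when no cursor is given.
      const found =
        afterCursor == null ? 0 : news.findIndex((item) => item.id === afterCursor);
      const start = found === -1 ? news.length : found; // unknown cursor: empty page

      // Slice off `first` records and convert each news item into an edge.
      const edges = news
        .slice(start, start + first)
        .map((node) => ({ node, cursor: node.id }));

      // As described above, startCursor is taken from the last edge's node.
      const lastEdge = edges[edges.length - 1];
      return {
        totalCount: news.length,
        edges,
        pageInfo: {
          startCursor: lastEdge ? lastEdge.node.id : null,
          hasNextPage: start + first < news.length,
        },
      };
    },
  },
};
```

Calling `resolvers.Query.news(null, { first: 5, afterCursor: 1 })` directly is a quick way to check the shape of the result before plugging the resolver map into a server.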
Now, we add our server code in index.js:
The main point here is that we imported the schema and the resolvers, then, passed them as a parameter to the ApolloServer(...) constructor.
We are all set. Run your server:
Open your browser and navigate to localhost:3000/graphql. We can query for news feed like:
Here, we want 5 news items starting from the news items with id 1.
The totalCount says there are 25 news items in the database. The edges contain an array of the news items. In the pageInfo, hasNextPage indicates that there are more pages. The startCursor states that the cursor to use for the next page is 6.
So let's use this startCursor to query for the next news items:
You see it, records from id 5 are collected and the number of the records collected is 5. The next cursor in line is 10.
Cursor-based pagination is generally faster than offset-based pagination because it can use an index to find results quickly, provided the cursor column is indexed. The database looks up the cursor in the index, jumps to that record, and starts there. The time that offset-based pagination spends iterating from the start is saved here.
Relay style pagination
Relay is a GraphQL client library for the React framework.
Relay keeps the management of data-fetching easy, whether your app has tens, hundreds, or thousands of components. And thanks to Relay’s incremental compiler, it keeps your iteration speed fast even as your app grows. - relay.dev
Relay uses cursor-based pagination to paginate results.
Let's see it in play:
The loadNext function is used to fetch the next records in the database. Relay uses cursor-based pagination. See that first indicates the number of records that will be fetched from the database, and after is the point where it will start.
Types of pagination UX
Pagination can be displayed in various forms on a webpage. The forms can be in:
- Numbered pages
- Sequential pages
- Infinite scroll
Numbered pages: The UI shows the page numbers and the current page we are viewing. It also has a next button that we can click to move to the next page.
The page numbers are also clickable, so we can click on any of them to jump to a certain page. A perfect example of a website that uses the Numbered pages form is Google.
We see at the bottom of our search results on Google.com the numbered pages of the search hits. We can click on the pages to move to a particular page.
Sequential pages: In this form of UX pagination, the pages are not numbered like what we have in the Numbered pages form.
Sequential pages have next and previous buttons. This next button allows us to move to the next page in the sequence.
The previous button allows us to move back to the previous page we just left. So we see here that the navigation between pages is sequential, we move from point to point without jumping off the sequence.
Infinite scroll: This is one of the most widely used forms of pagination. It involves loading a small subset of the data and rendering it in the visible viewport of the browser. When the page is scrolled, more content is loaded and appended to the UI.
This happens every time you scroll to the bottom: the next discrete part of the data is loaded and appended, until all the records in the database have been fetched.
Detecting when the user has reached the bottom is more reliable with the introduction of IntersectionObserver than with scroll events alone.
Facebook and Twitter famously use this type of UX pagination. Many don't see this as pagination because it does not load "pages", but from a technical point of view it does not matter whether it loads pages or not; you just have to scroll to get more content.
Implementation: Numbered pages
Let's see how we can implement Numbered pages UX pagination in GraphQL. It is quite simple to implement.
Let's say you have a news app, and you are using the Numbered pages pagination.
We will have our schema like this:
We have our schema types here. The news query returns the news. The page argument specifies the page number the query will return. The NewsItem type holds the schema for a single news item.
Now, let's create the resolvers:
The data.json contains a mock news array. We get the page value from the args param, then use Array#slice to deliver a subset of the news.
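A sketch of such a resolver, assuming 1-based page numbers and a fixed page size. `PAGE_SIZE` and the inline `newsItems` array are assumptions made for illustration; the inline array stands in for the contents of data.json.

```javascript
// Assumed page size; the schema only takes a page number.
const PAGE_SIZE = 10;

// Hypothetical mock news, standing in for the array loaded from data.json.
const newsItems = Array.from({ length: 35 }, (_, i) => ({
  id: i + 1,
  title: `News ${i + 1}`,
  body: `Body of news ${i + 1}`,
}));

const resolvers = {
  Query: {
    // Return the slice of news that belongs to the requested page.
    news: (parent, { page = 1 }) => {
      const start = (page - 1) * PAGE_SIZE;
      return newsItems.slice(start, start + PAGE_SIZE);
    },
  },
};
```

With 35 items and 10 items per page, page 4 holds the last 5 items and any page beyond that comes back empty.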
Now, our query will be this:
The above query will load the first page. It will be like clicking on page 1 in the UI.
Clicking on page 2 in the UI performs the query below:
Clicking on page 3 will load page 3 by doing this:
This is what happens whenever we click on the page buttons. The query for that page is performed and the result is rendered in the DOM.
This is very simple.
Implementation: Sequential page
This will have the same backend as the Numbered pages because, in Sequential pages, we only have the Next and Previous buttons to navigate the pages sequentially without jumping any page.
So our implementation in the above section works for here.
For example, on the initial load of the news app, the query for the first page is made.
Now, clicking on Next will query the second page:
Clicking on the Next button again will query the third page:
Clicking on the Previous button will query the second page:
Clicking on the Previous button again will query the first page:
Now, the Previous button will become disabled because there is no earlier page.
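On the client, the Next and Previous buttons only need to track the current page number and pass it to the news query. A small sketch of that state, where `createPager` is a hypothetical helper and the total number of pages is assumed to be known:

```javascript
// Track the current page and expose whether each button should be enabled.
function createPager(totalPages) {
  let current = 1; // the initial load shows the first page
  return {
    get page() { return current; },
    get hasPrevious() { return current > 1; },
    get hasNext() { return current < totalPages; },
    // Move one page forward, never past the last page.
    next() {
      if (current < totalPages) current++;
      return current;
    },
    // Move one page back, never before the first page.
    previous() {
      if (current > 1) current--;
      return current;
    },
  };
}

const pager = createPager(3);
```

The value returned by `next()` or `previous()` is the page number to send in the next query, and `hasPrevious` is what decides whether the Previous button is disabled.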
Implementation: Infinite scroll
As we stated earlier, infinite scroll can use the Intersection Observer API to detect when an element near the bottom of the list is scrolled into the viewport.
The implementation will be similar to what we have above, but the pages will now be in terms of the number of items displayed in the viewport.
We can load 10 items at a time, i.e. each time the bottom of the list is scrolled into view, 10 new items are loaded and displayed.
So the query will look like this:
The noOfItems argument is the number of news to load for each query.
Our schema will now look like this:
In the Query type of our schema, the news query expects a noOfItems parameter.
The resolver will be this:
This query is called till all the records in the database have been fetched. This is very easy to maintain too.
Problems with UX page numbering
Items or records are constantly added or removed. This often happens while users are on the webpage navigating between pages, and it can cause issues in the data displayed. We will look at the issues next.
This is best explained with an example. Let's say we have a record:
Let's say we query for 2 records like this:
The query will collect the data from 1 to 2.
If new records are added after the query:
When the next page is to be queried:
Now, the query result should have started from 3. data_C in the old list, but in the new list it starts from 3. data_M and ends at 4. data_A.
Now we have a duplicate record, 4. data_A, which was already returned by the last query. In the DOM, data_A will be rendered twice:
This leads to a poor UX with duplicate records on the DOM.
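The duplicate can be reproduced with a few lines of JavaScript. This is a sketch over plain arrays, assuming new records are added at the top of the list; `data_K` and `data_L` are made-up names for the new records that arrive alongside data_M, and `fetchPage` is a hypothetical helper.

```javascript
// Offset-based page fetch over an in-memory list.
function fetchPage(records, limit, offset) {
  return records.slice(offset, offset + limit);
}

let feed = ["data_A", "data_B", "data_C", "data_D"];

// First page: positions 1 and 2.
const page1 = fetchPage(feed, 2, 0);

// Three new records are added at the top before the next page is requested.
feed = ["data_K", "data_L", "data_M", ...feed];

// Second page: positions 3 and 4 of the list as it is now.
const page2 = fetchPage(feed, 2, 2);

// Everything the UI has appended so far, and the items that appear twice.
const rendered = [...page1, ...page2];
const duplicates = rendered.filter((item, i) => rendered.indexOf(item) !== i);
```

Because the whole list shifted down by three positions, the second page overlaps with what the first page already returned.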
Records can also be skipped when data is deleted. Let's explain this with an example as well.
We have this data:
We query for the first page:
Now, suppose records 1. and 2. are deleted from the database, and we query for the second page:
The database will be this:
The query will start from 3. data_E down to 4. data_F.
The records that moved up into positions 1. and 2. after the deletion are never returned, so they are skipped. These are the drawbacks of UX pagination.
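The skipped records can be shown the same way. In this sketch only data_E and data_F come from the example above; the other record names are assumed for illustration, and `fetchPage` is a hypothetical helper.

```javascript
// Offset-based page fetch over an in-memory list.
function fetchPage(list, limit, offset) {
  return list.slice(offset, offset + limit);
}

let records = ["data_A", "data_B", "data_C", "data_D", "data_E", "data_F"];

// First page: positions 1 and 2.
const page1 = fetchPage(records, 2, 0);

// The two records from the first page are deleted before the next request.
records = records.filter((record) => !page1.includes(record));

// Second page: positions 3 and 4 of what is left.
const page2 = fetchPage(records, 2, 2);

// Records that still exist but were never delivered to the client.
const seen = new Set([...page1, ...page2]);
const skipped = records.filter((record) => !seen.has(record));
```

The two records that slid into the first two positions sit before the new offset, so no later page will ever return them.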
How do you count in GraphQL?
GraphQL does not natively support count. It is the GraphQL platform that provides support for the count functionality.
Hasura provides count functionality through aggregation queries on aggregate fields like sum, count, and avg. In Hasura, if we have blogPosts, Hasura will create an aggregate query, blogPosts_aggregate. From this aggregate we can get the sum, count, and avg functionality like this:
See that _aggregate is appended to the query name. The count field returns the number of records in blogPosts.
The result will be this:
We can then get the count by doing this: result.blogPosts_aggregate.count.
Aggregation can also be nested like this:
We can reference the count like this blogPosts.blogPosts_aggregate.count.
Let's see an example of how we can add a count functionality in GraphQL. We will be using the Express.js server as an example.
This app will return blogPosts in the graphql endpoint.
We will have to query blog posts and have the count functionality like this:
The count holds the number of blog post records in the database.
So we can have the result like this:
We set up an Express.js server like this:
The main point is in the resolvers and schema files. In the schema we will have to add our schema types.
See the type BlogPosts, which defines the type of the output. The nodes field will hold the blog posts, and the aggregate field will hold the count property, as we can see in the type Aggregate.
Let's flesh out our resolvers:
We are returning an object that has the same shape as the type BlogPosts. See the aggregate object with the count property: its value is the number of blog posts in the blogPosts array. We use Array#length to get the size and set it as the count property.
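A sketch of a resolver with that shape. The inline `blogPostsData` array is hypothetical mock data standing in for data.json, and the field names on each post are assumptions.

```javascript
// Hypothetical mock blog posts, standing in for the contents of data.json.
const blogPostsData = [
  { id: 1, title: "First post", body: "Body of the first post" },
  { id: 2, title: "Second post", body: "Body of the second post" },
  { id: 3, title: "Third post", body: "Body of the third post" },
];

const resolvers = {
  Query: {
    // Return the posts as nodes, plus an aggregate object holding the count.
    blogPosts: () => ({
      nodes: blogPostsData,
      aggregate: {
        count: blogPostsData.length, // Array#length gives the number of posts
      },
    }),
  },
};
```

Other aggregates such as sum or avg could be added next to count in the same object, computed from the same array.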
The blog posts come from the data.json:
Start your server and launch localhost:3000/graphql, in the graphql playground run the query:
You will see the result.
You see this is how we add count functionality in our queries. We can add sum, avg following how we added count.
What is the use of cursor parameter in GraphQL?
The cursor parameter is the pivot point in the database; it identifies a particular row. Cursors are how we tell the server which record the client fetched last.
In MongoDB, cursors are like using the where and limit clauses to fetch data.
The id is used as the pivot point, and the cursor is the id of the last fetched record. MongoDB gets all the blog posts that match the condition, where the userId equals the id and the id is greater than the id in the cursor.
Now, the records returned by this first query are used to get another dataset from the database. The last record in the previous dataset becomes the cursor in the next query.
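The same where-and-limit idea can be sketched over an in-memory array. This is not MongoDB's API, only a plain JavaScript equivalent of the condition described above; `fetchAfterCursor` and the sample posts are hypothetical.

```javascript
// Keep posts of one user whose id is greater than the cursor, then apply the limit.
function fetchAfterCursor(posts, userId, cursor, limit) {
  return posts
    .filter((post) => post.userId === userId && post.id > cursor)
    .sort((a, b) => a.id - b.id) // keep a stable order so the cursor stays meaningful
    .slice(0, limit);
}

// Hypothetical blog posts belonging to two different users.
const posts = [
  { id: 1, userId: 7 },
  { id: 2, userId: 8 },
  { id: 3, userId: 7 },
  { id: 4, userId: 7 },
  { id: 5, userId: 7 },
];

// First batch: no record fetched yet, so the cursor starts below every id.
const firstBatch = fetchAfterCursor(posts, 7, 0, 2);

// The last record of the previous batch becomes the cursor for the next one.
const cursor = firstBatch[firstBatch.length - 1].id;
const secondBatch = fetchAfterCursor(posts, 7, cursor, 2);
```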
What are GraphQL edges?
Edges in GraphQL represent the lines that connect two nodes in GraphQL. This represents some kind of relationship between the two nodes.
Edges are like join tables.
Edges are used in cursor-based pagination because they allow us to add additional info to each paginated record. This info is vital because it shows how the records relate to the parent query.
This query gets a team and its members. To get the relationship between each member of the team, we used the edges to add this info. The edges hold the info:
- joinedAt: This holds the info of when the team member joined the team.
- role: This holds the role of the team member in the team. It could be goalkeeper, forward, etc.
The result can look like this:
You get the point now, right? The edges are used to add extra info about the object.
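To make this concrete, here is a sketch of how such edges could be assembled. The `players` and `memberships` arrays, the names in them, and the `buildMemberEdges` helper are all hypothetical; only the joinedAt and role fields come from the example above.

```javascript
// Hypothetical data: players, plus one membership row per player on the team.
const players = [
  { id: 1, name: "Ada" },
  { id: 2, name: "Ben" },
];
const memberships = [
  { playerId: 1, joinedAt: "2019-08-01", role: "goalkeeper" },
  { playerId: 2, joinedAt: "2020-01-15", role: "forward" },
];

// Build one edge per membership: the node is the member itself, while
// joinedAt and role describe the member's relationship to the team.
function buildMemberEdges(allPlayers, rows) {
  return rows.map((row) => ({
    node: allPlayers.find((player) => player.id === row.playerId),
    joinedAt: row.joinedAt,
    role: row.role,
  }));
}

const team = {
  name: "Example FC",
  members: { edges: buildMemberEdges(players, memberships) },
};
```

The member objects stay free of team-specific fields; the information that only makes sense for this particular team lives on the edge, much like the extra columns of a join table.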
Pagination is a very powerful optimization technique in software development, and GraphQL made it very simple to use and implement.
In this post, we started by describing what Pagination is, its importance, and how it affects the load time of our web pages.
Next, we demonstrated how to use pagination in GraphQL with Strapi. Later on, we saw the types of pagination techniques and examples of how each of them works.
Further down, we learned about UX pagination with its usage and examples of how they work. Then, from there we answered basic FAQs with numerous examples to buttress our point.