GraphQL: The Next Level REST

RESTful API has always been the #1 choice for today’s client-server communication standard. Many of the existing public and private APIs are now built with RESTful API at their core. REST is chosen due to its simplicity of development, the concept of resource in mind, and of course its compatibility with different interfaces or devices.

As we iterate our product, the backend system with REST is becoming more complex and harder to maintain. Why is it?

Bloated REST == UnREST?

Developing REST API from the beginning is way easier than developing existing “big” REST API. You can always match your response to what your client needs ー like exactly what client needs. However, that’s the white box case of development. You know what the front-end application would be. You know exactly what the design and the data that is going to be displayed on the screen. But what about if you’re creating an open API? 

You don’t really know what your API users will use it for, maybe to build a screen. Therefore, you can’t really provide them with the data they need inside a single endpoint. Instead you will need lots of endpoints to provide different resources available via the API. Server has control over resources.

Maintaining REST endpoints for backwards compatibility is also important in REST architecture. You cannot simply modify responses from the existing APIs, because it may cause break on older versions of the application. Therefore, what you need to do is to version your API. This is a very good workaround for such problems. But what if this problem occurs often? You like to change your API response yet you want to keep your API behaves properly on current API clients. You would end up with many versions of your API.

Here’s another story of why using REST is sometimes stressful. Imagine you own a social media company with a system of following and followers. Your backend team provides REST API for user resources. Your frontend team wants to render a profile page screen illustrated below.

Social Media sample with 3 REST API Call

If we look at the screen, there are 3 “parts” of the screen based on its resources. The first part is the user details, the second part is the activities related to the user, and lastly the recent followers. Unfortunately, those resources are not available in a single endpoint. In order to get the data, you would need to call 3 different APIs (let’s say from api.sample.com)

  • User detail (api.sample.com/me)
  • User activities (api.sample.com/me/activities?page=1) 
  • User followers list (api.sample.com/me/followers?limit=3)

Having that kind of problem yields these problems:

  • Multiple API calls for a single screen
  • Underfetching, not having resources you need in one API (activities/followers)
  • Overfetching, having superfluous resources in one API (e.g. birth date on user detail)

Simply said, using REST isn’t a silver bullet in terms of client-server communication. Without maintainability in mind when building REST, maintaining REST API would be very-very exhausting. Building client applications with REST too, if not provided with well-designed API would be very frustrating.

The Solution

The solution to this problem will definitely be a tool which gives clients efficient data fetching. This means no more overfetching and underfetching problems. No more wasted bandwidth. If possible, maybe we need a tool which can combine multiple API calls to a single API call. Simply translated to the technical part, we need a flexible API, flexible that it gives the client the control over resources available in the server. So, introducing you to GraphQL.

TL; DR ― GraphQL

About GraphQL

GraphQL (which stands for Graph Query Language) is a query language for API originally created by Facebook in 2012. Facebook firstly used this to develop their uneasiness of using REST on their mobile applications [2]. However, instead of keeping it internally, they decided to make it as an open-source solution in 2015, making it available to anyone who wants to use that standard to the public. As per today’s Stackshare data, it has been used by 1175 companies around the world! Including GitHub, Lyft, Airbnb, and Facebook obviously. And some of them may not be listed there, so there should be more than 1175 companies.

GraphQL History (Timeline)
GraphQL History (Timeline)

Concepts and How It Works

GraphQL gives clients the power to fetch resources as sufficient as possible. Clients are able to specify which attributes of resources you would like to fetch. This is quite similar to query language we are probably already familiar with – SQL. In SQL, you can SELECT *` or you can `SELECT attribute_one, attribute_two on a table (or specific resource). But rather than this being done on the level of the database, GraphQL gives this capability to API level. Nevertheless, GraphQL is NOT a database technology. You can use any database technology with GraphQL since GraphQL works on API level instead of database level.

Clients fetch resources in two different forms – queries and mutations (Actually there is another form of operation called subscription). Both of the operations combined are equivalent to CRUD operations in REST API. Queries are operations that are equivalent to HTTP GET operations, you can get resources using queries. Mutations on the other hand are equivalent to the rest, which are HTTP POST, HTTP PUT/PATCH, HTTP DELETE (create, update, delete). These two operations are sent using a query language specified by GraphQL and it returns the data in the form of JSON.

QueryResponse
{
  getBookById(id:3) {
    id,
    title,
    year
  }
}
{
  "data": {
    "getBookById": {
      "id": 3,
      "title": "Book Three",
      "year": 2010
    }
  }
}
{
  book: getBookById(id:3) {
    id,
    title,
    authors {
      id,
      name
    },
    year
  }
}
{
  "data": {
    "book": {
      "id": 3,
      "title": "Book Three",
      "authors": [
        {
          "id": 2,
          "name": "Author Two"
        },
        {
          "id": 4,
          "name": "Author Four"
        }
      ],
      "year": 2010
    }
  }
}
GraphQL Query-Response Sample

GraphQL matches the response based on the query sent to the server. This works because GraphQL uses something called resolvers. Resolvers are no more than functions that take actions based on the query being sent to it. Every field in GraphQL is handled by a resolver. Generally, we can divide the resolvers into two kinds of them.

The first one is query resolvers. Query resolvers are resolvers that handle queries coming from the client. These kinds of resolvers should act as a root to any downline resolvers. This query resolver will trigger some functions like getting data from a database or even a third-party or private REST API, etc. After getting the data, usually the data is sent to the downline resolvers to build the overall response based on the query.

The downline resolvers that are mentioned previously are called field resolvers. These resolvers are the ones that build up the overall response based on the query. Field resolvers are used for each attribute of a resource. Each of the fields inside a resource has its own resolvers. This is why GraphQL can dynamically respond to queries based on the requests, because the fields that are not requested from the client won’t trigger the field resolver to get that attribute. Only the requested fields are being resolved by GraphQL.

{
  book: getBookById(id:3) {
    id,
    title,
    authors {
      id,
      name
    },
    year
  }
}
{
  query(getBookById)
    field(book.id),
    field(book.title),
    field(book.authors) {
      field(book.author[i].id),
      field(book.author[i].name)
    },
    field(book.year)
  }
}

The Difference

API Contract vs Schema

In REST API, API contract or API documentation is provided by the API developer, so that the users know which endpoint to call in order to get particular resources. Usually this consists of a lot of endpoints. A single endpoint has its own HTTP method too. Each of the endpoints has its own handler and does different actions. In addition, each of them has a different response, including different error handling. One of them may require payload in the request. Some of them may not. One endpoint may return a response in the form of a JSON array, one of them may return a JSON object. Some of them may just return an HTTP status code response. One example of an oversimplified API contract is illustrated in the table below.

GETapi.sample.com/booksGet list of books
POSTapi.sample.com/booksCreate a new book
PUTapi.sample.com/books/:idUpdate a book
PATCHapi.sample.com/books/:idPartially update a book
DELETEapi.sample.com/books/:idDelete a book

For backend developers, this API contract is actually not mandatory to make sure their APIs are working properly. This is a skippable process on API development. That’s why if not maintained properly, API contracts might be obsolete and need lots of revisions. 

In contrast, GraphQL utilizes only a single almighty endpoint (e.g. /graphql) to do all of them. What differs each request is the operation is the query or the mutation itself. Therefore API documentation needed is the endpoint itself and the list of operations available on GraphQL API. This list of operations is defined in a document called schema which is written using GraphQL SDL (Schema Definition Language). This document consists of all objects available in the backend server. Including all of the operations that are exposed to the “outer world”. One sample of a GraphQL schema would look something like below.

The schema above is one kind of schema. Two types of resources are defined there which are Book and Author including their fields that you can choose to fetch. One query operation that you can do with the GraphQL API is to get the book by book ID. Because you know that it returns a book, you can also choose which book’s fields you want to fetch! That’s just one sample of GraphQL schema, another example of a GraphQL schema is GitHub’s, check it out here!

What’s so good about this is – schema is a mandatory requirement for API developers this time. That’s why API documentation for GraphQL will never be obsolete (unless the API developer hasn’t updated the schema to front-end developers). Before the API developer adds a new feature or a new query/mutation, the API developers need to add the related fields first or the query first to make sure that the GraphQL API that they built can receive requests with such queries or mutations. This makes the process of maintaining GraphQL schema unskippable. So.. say goodbye to obsolete API documentations.

Error Handling

Errors on REST API are usually noticed by looking at the HTTP status code. HTTP status 200 means that it is returning an OK response. Status code 201 means that you successfully created a resource on the backend server. Meanwhile, 400s status code means that something is wrong on the request, maybe you’re unauthorized (401), forbidden to get the resource (403), the resource you’re looking for is not available (404), and some other 400s codes. Sometimes you get 500s codes too which means the server is busy, gateway timeout, or some other server-side errors.

On the other hand, GraphQL servers will ALWAYS return 200 OK responses. However, that does not mean that the response will always be successful. GraphQL responses will always be in a format of JSON containing at least one field. Either “data” or “errors” or both of them.

{
  getBookById(id:3) {
    id,
    title,
    year
  }
}
{
  "data": {
    "getBookById": {
      "id": 3,
      "title": "Book Three",
      "year": 2010
    }
  }
}

The example above is the sample query being sent to the server from front-end applications and on the right is the sample response. Success responses are denoted by the inexistence of “errors” field. And the data you requested is available in the “data” field.

{
  etBookById(id:3) {
    id,
    title,
    year
  }
}
{
  "errors": [
    {
      "message": "Cannot query field \"etBookById\" on type \"Query\". Did you mean \"getBookById\"?",
      "locations": [
        {
          "line": 2,
          "column": 2
        }
      ]
    }
  ]

The example above shows an error response. We are trying to query something which query does not exist on the schema. The response from the server is nothing but an “errors” field. Error responses are denoted by the existence of the “errors” field containing an array of error messages.

Pros and Cons

We’ve seen a lot of what GraphQL is and how it can help us. Actually, GraphQL can help us a lot more.

Pros #1 Smaller bandwidth consumption, smaller RTT

By only requesting the data the client really needs, the bandwidth consumption will be a lot lower. Time-wise, RTT will be faster since we can get every resource we need using a one time GraphQL API call. This will improve the overall user experience, as users will see faster loading screens especially in places with low Internet speed.

Pros #2 Versioning? Pfftt

Say no more to versioning, even though you can do it. Actually if you want to get a different response, just send a different query. Adding a new field? Just add it on the resource schema.

Pros #3 Multiple programming languages support (just like REST)

Since GraphQL is just a standard like REST. Both of them are supported in multiple programming languages. It was originally built using JavaScript. Fortunately, you can still use other programming languages including C# (.NET), Clojure, Go, Python, Ruby, Rust, Scala, and other programming languages. This is different compared to Falcor which is a concrete technology built using JavaScript.

Pros #4 No more obsolete API contracts

API exposure needs schema updates/creations, resulting in up-to-date schema.

Pros #5 Flexible deployments

You can simply put GraphQL that acts as an API gateway, calling multiple services seamlessly. Thinking of moving to GraphQL from your existing microservices REST API? You can consider that option too by creating GraphQL as an API gateway to multiple services! You can also put GraphQL as a standalone service which will call the database immediately returning the resources. Options are unlimited.

GraphQL seems like a very good solution so far. However, just like every other libraries, GraphQL also has limitations.

Cons #1 N+1 problem

As you dive a lot into the fields inside the resources. You will need to call the database a lot, resulting in a lot of internal RTTs (database calls). Consider the book-authors example. You want to get the detail of a book, including the detail of all authors (1-N relation). The database call would be 1 time to get book detail added with N time calls to get author details if there are N authors and this is just for one request. To overcome this problem, we can actually consider DataLoader for batching queries created by Facebook as an option. Unfortunately, this tool is a concrete technology built using JavaScript too.

Cons #2 No native file operations

GraphQL does not support native file uploads and downloads. You would need a third-party library for that or simply create a REST API for file operations.

Cons #3 Harder caching

Requests are very dynamic, caching responses might not be a good idea. However, one option for this is to create query hashes which acts as the key, which value is the response to that query.

Cons #4 Overkill for small applications

Small applications would only need very little amount of resources to make it work. Using GraphQL is an option yet an overkill for these applications. You will definitely need a lot more effort than just creating a simple REST API.

Cons #5 Steep learning curve

GraphQL is quite powerful, yet you will need time to learn all of its capabilities. Building a simple schema is easy. However, if you’re working on big domains where you have lots of entities, lots of interconnections between them, building a maintainable and usable schema is noticeably difficult because you would need to use “stuffs” which terms are not mentioned in this post, like fragments, interfaces, unions, scalars, etc. In addition, you obviously need to understand when to use which “stuff”. This is way much harder compared to REST where everything is much simpler.

Cons #6 Security concerns

Enforcing security is another crucial problem in GraphQL. However, enforcing this in GraphQL is also not an easy problem. Malicious clients can also send queries with lots of nested queries or small queries with a bunch of jobs happening in the backend server. This makes our server performance decrease. On the other hand our GraphQL clients might really need to make nested queries. So, we need to determine the right maximum depth for our GraphQL query or apply other policies to ensure that the GraphQL server works just fine [4]. In addition, we also need to do a lot more work just to enable field-level authorization as the resource we use is shared between different roles (admin or other specific roles). This might get more complicated as our schema gets more complex. 

Similar Technology

Falcor: One Model Everywhere
Falcor: One Model Everywhere

Netflix’s Falcor offers the same capabilities of GraphQL. It allows clients to fetch resources dynamically based on a given query. Falcor is easier to use yet not more powerful than GraphQL. Falcor also has no static typing of the schema. That’s why people choose GraphQL over Falcor due to its capability of being flexible yet robust because of its static typing in the schema definition. [3]

Key Takeaways

GraphQL is a very good fit when you’re having problems with REST architecture, especially on underfetching and overfetching problems. This can be used to optimize bandwidth since front-end applications will always get exactly what they need. These will definitely improve overall user experience on the front-end applications. Developer experience will be good as well as you won’t need to worry about API versioning and obsolete API documentation. Even though GraphQL is powerful enough to solve that kind of problems, it is not a silver bullet due to its limitation on native file operations (upload and download), you would probably need another third-party library for that or simply consider a REST API to work alongside GraphQL. It would also cost you more processing power since GraphQL requires a lot of RTT in order to solve complex queries. Furthermore, GraphQL does not always fit your problem. Even if it does, you will always need good engineering management to maintain the GraphQL API as maintaining the schema wouldn’t be that easy.

Read More About It!

  1. GraphQL
  2. GraphQL: A data query language – Facebook Engineering
  3. GraphQL vs. Falcor Comparison
  4. Security and GraphQL

Want to explore more about GraphQL or anything else around software development? Work with us! Start here.

0
Share

More Articles

let's talk illustration

Loving what you're seeing so far?

It doesn’t have to be a project. Questions or love letters are fine. Drop us a line

Let's talk arrow right