Ten Tips for Writing Your First GraphQL Schema

The other day a colleague of mine messaged me, asking whether I had any tips for those writing their first GraphQL schema. I thought back over everything I’d learnt about GraphQL schemas in the last three years. What were the mistakes I’d made? What did I wish I’d known in advance? What were the things that I had seen trip up other developers?

I replied to him with one suggestion, then another, and then another. Before I knew it I’d basically written a whole blog post’s worth of them. So without any further ado, here are my top ten tips for those writing their first GraphQL schema. Spoiler alert: I become quite emotional about date formats in tip number seven.

1. Make sure you’re using GraphQL for the right reasons

Like any new technology, learning and using GraphQL takes effort. So it’s important to make sure that you’re going to actually reap sufficient benefit from this effort.

In my opinion, Facebook designed GraphQL with one primary goal in mind: so that when a user performs some action on a client, the client can fetch exactly the data that is needed – no more, no less – over the internet in just one call. Furthermore, Facebook wanted to make it as easy as possible for clients to retain this characteristic, even as their data needs evolved over time.

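I mention this because GraphQL can also be used in other ways that are interesting, but not necessarily related to the original goal. For example, it can be used as an alternative to REST for communicating between microservices, or as a way to expose datasources directly to a client.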

However, I’m not convinced yet that GraphQL is necessarily the best way to achieve those outcomes. For example, if you want an alternative to REST for communicating between microservices, you might be better off looking at gRPC. Or if you use a GraphQL framework to expose datasources directly to a client, but need to get data from multiple sources in response to a single user interaction, the client can end up making multiple calls over the internet, which defeats the purpose of using GraphQL in the first place.

Consequently, before you get started, be clear about why you are using GraphQL. If the original, primary benefit of GraphQL – efficiently and flexibly fetching exactly the data a user needs at a point in time – isn’t actually the thing you are most interested in, then maybe GraphQL isn’t the right fit for you.

2. Understand the terminology

OK, so let’s assume that you’re sure that GraphQL is what you need. Now you actually need to understand it.

GraphQL schemas and queries constitute a small, specialised programming language whose goal is different to the general-purpose languages most developers are accustomed to using day-to-day. Some of the constructs may have names that sound familiar, but that doesn’t mean they’ll mean exactly what you think they should. And some constructs will represent concepts you’ve never seen before.

It can be tempting to avoid understanding these concepts in the interest of getting things done. This pragmatism is admirable, but only works up until a point. Eventually, as with any programming language, you’ll need to develop an understanding of the fundamentals, so that you have something to build upon when you get into the more advanced stuff.

For example, when writing a schema it’s important to understand what a type and a field are, and what an argument is. When writing a query, it’s important to understand what an operation and a variable are. Understand that fields on any type can take arguments, not just fields on the Query type. Also understand that Query and Mutation are just like any other type, the only exception being that a GraphQL server must be able to map the query and mutation operation types to them on incoming requests.
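To make those terms concrete, here is a small sketch. The Author and Post types, the posts field and its first argument are hypothetical names chosen purely for illustration. The schema side shows types, fields and an argument on a field that does not live on Query:

```graphql
type Post {
  id: ID!
  title: String!
}

type Author {
  id: ID!
  name: String!
  # A field on an ordinary type can take arguments too.
  posts(first: Int!): [Post!]!
}

type Query {
  author(id: ID!): Author!
}
```

And the query side shows an operation and its variables:

```graphql
# "query" is the operation type; GetAuthorPosts is the operation name.
# $authorId and $count are variables, supplied alongside the request.
query GetAuthorPosts($authorId: ID!, $count: Int!) {
  author(id: $authorId) {
    name
    posts(first: $count) {
      title
    }
  }
}
```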

To grok all this (and more) I recommend that anybody who works with GraphQL reads the documentation at some stage. You don’t have to do it all in one hit, but as you design your first schema and write your first queries, I recommend reading (and re-reading, if necessary) the relevant sections of the docs. And if you need to dig in even further, don’t be afraid to check out the GraphQL specification.

3. Don’t worry about versioning

The first rule of versioning in GraphQL is that there isn’t really any versioning.

Those of us who have used REST APIs may be scarred by experiences of enormous JSON payloads being returned to our clients, irrespective of whether the client actually needed everything in them or not. However, with GraphQL, the important thing to remember is this: if a client doesn’t need a particular piece of data, it doesn’t ask for it, and it doesn’t get it.

The end result of this is that the process of developing a GraphQL schema is largely additive. You just keep adding new fields. Furthermore, because client-side development is driven by schemas that originate from servers, in normal usage there aren’t any workflows where newer clients would be talking to an older server. At worst, an older client will be talking to a newer server, and – assuming you haven’t removed support for any fields from the server – the client will still be able to get everything it needs.

In short, it’s usually not necessary for the client and the server to be synced up via a version-numbering scheme (although it can be useful to track how a server’s schema has changed over time). And for those rare occasions where you feel that you made a genuine design misstep and really do want to remove a field, the @deprecated directive is there to help manage the process.
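As a minimal sketch of how that looks in a schema, assume a hypothetical User type whose fullName field has been superseded by name. The type and field names are made up for illustration; the @deprecated directive and its reason argument are part of the GraphQL specification:

```graphql
type User {
  id: ID!
  name: String!
  # Kept so that existing clients keep working, but flagged for removal.
  fullName: String! @deprecated(reason: "Use `name` instead.")
}
```

The field continues to resolve for clients that still request it, while tooling that reads the schema can warn developers away from it. Once nothing is asking for it any more, it can be removed.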

4. Let the client drive your schema

So now that you’ve learnt a bit about GraphQL at a high-level, it’s time to dive in and start designing what you want your schema to look like. This can be daunting, as it can be hard to know where to start.

The best advice I got was to let the client drive the design. In other words, design a schema that will meet the need of a particular client. Define types and fields that make sense in the context of that client’s user interface. If you have multiple clients, then pick one and do that first. Because you don’t really have to worry about versioning with GraphQL, you can always add new things later on to support new clients.

One thing you shouldn’t do is pre-emptively attempt to write a generalised schema in isolation. You may hear vendors proclaiming that GraphQL is a way to model the data for your whole organization. Maybe it is, maybe it isn’t. Either way, if you’re new to GraphQL, don’t try and do that. Opportunities for generalisation may appear later on, and if they do, the additive nature of GraphQL means they’ll usually be able to sit happily alongside the initial client-specific schema you came up with. But if you try and generalise it too early, you risk getting it wrong and making the client’s life harder, not easier.

Nor should you necessarily let backend systems dictate the names and relationships you define in your schema. Your GraphQL server’s job is to map between what your client needs and what your backend has. The schema is the client’s interface to the server, so the client’s needs should drive the design of this schema. The server should be the thing that can take care of the gory details, not the client.

Finally, don’t try and get it right all in one shot. Add a field and a type or two, implement them on the server, hook them up to the client, then take a step back. Review the documentation if necessary to consolidate your understanding of any new concepts you have encountered. Refactor any duplication that may have arisen. Then rinse and repeat.

5. Avoid foreign-key lookups

OK, now we can start to get into the nitty gritty. I’ll start with one mistake that I’ve seen lots of people make, including myself.

In short, if you find yourself adding fields to the Query type that do foreign-key lookups, you’re probably doing something wrong. For example, imagine we have a GraphQL server that implements the following schema:

type Parent {
  id: ID!
  name: String!
  children: [Child!]!
}

type Child { 
  id: ID!
  name: String!
}

type Query {
  parent(id: ID!): Parent!
}

If the user wanted to see the name of a particular parent, we’d send the following query to the server:

query GetParent($id: ID!) { 
  parent(id: $id) {
    name
  } 
}

But what if the user then wanted to see the names of that particular parent’s children? You might be tempted to think you’ve got to add a whole new field to the Query type:

children(parentId: ID!): [Child!]!  

and then query the new field like this:

query GetChildren($parentId: ID!) { 
  children(parentId: $parentId) {
    name
  } 
}

However, the new children field on the Query type is unnecessary, as the original schema already has everything we need. All we have to do is rework our query to get children via the existing parent field on Query:

query GetChildren($parentId: ID!) { 
  parent(id: $parentId) {
    children {
      name
    }
  } 
}

Some people get uneasy when they see this. If they’re looking up a Parent, shouldn’t they at least get its name or id? Well, no. When you look up an object, you don’t have to request any fields with scalar types. If all you want is a field that is a list, that’s all you have to ask for.

So as a general rule, before adding a new field to the Query type, I always recommend double-checking whether you can leverage an existing field to get what you need. This is especially the case with foreign-key relationships.

6. Keep IDs opaque to the client

This is optional, but I highly recommend it. An ID should mean nothing to the client other than that it’s some “thing” that can be used to look up an entity from the GraphQL server. This means that clients should definitely never attempt to parse values out of an ID. Nor should IDs ever need to be displayed to the user, unless the user needs to manually enter them into an external system.

I recommend this because, whilst an ID is serialised as a string, the internal representation of that string may change over time. For example, an entity may start by being stored in a DynamoDB table, with the ID mapping to a key. However, if it’s then relocated to a relational database, the format of that key might need to change. Alternatively, whilst you may be happy with simple IDs now, you might want to leave yourself open to the possibility of using a more sophisticated Global Object Identification pattern in future. I’ve even had scenarios where, due to the limitations of the back-end systems I’m working with, some of my IDs have had to be converted to composites of several different values, serialised as a JSON object.

In short, if the client is making any assumptions about the internal representation of the key, but you decide to change that representation, then the client will have to be updated. If it’s treating the representation as opaque, any changes won’t affect it.

So keep an eye out for fields like this:

type SomeType { 
  name: ID!
  ...
}

Whilst the case could be made that name will always be unique, in my experience a field like that doesn’t stay unique for very long. Consequently, I have no problem with you doing this instead:

type SomeType {
  id: ID!
  name: String!
}

…even if id and name actually have the same value. It’s a price worth paying to retain that flexibility.
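For reference, here is a rough sketch of what the Global Object Identification pattern mentioned above tends to look like, applied to SomeType. It follows the convention of a Node interface and a node root field; treat it as an illustration of where opaque IDs can lead rather than something you need on day one:

```graphql
interface Node {
  id: ID!
}

type SomeType implements Node {
  id: ID!
  name: String!
}

type Query {
  # Looks up any object by its ID alone; null if nothing matches.
  node(id: ID!): Node
}
```

This only works if every ID is unique across all types, which the server can arrange however it likes precisely because the client never looks inside the ID.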

7. Use a custom scalar type for dates

I’m going to get straight to the point on this one because I feel strongly about it. If you want to put dates in your GraphQL schema, use a string with the RFC3339/ISO8601 format.

I once worked with a core backend system where the time was represented as the number of seconds since midnight on the first of January, 1970. The only catch was that it was midnight in the Darwin timezone. Because it was just a number, there was nothing to indicate that this was the case. Everybody would just assume it was UTC, and wonder why the dates they displayed to end users were always nine and a half hours off. The only consolation was that Darwin doesn’t have daylight saving time, so at least you were nine and a half hours out all year round.

So do not use seconds or milliseconds since epoch, whatever the timezone. Do not use your own string format. Do not define your own GraphQL object type to represent a date. Instead, use an RFC3339/ISO8601 string. Unlike any of those other mechanisms, it is unambiguous. It is also human-readable. Finally, many standard libraries can parse and format it.

If you really want to get your tech on, use a custom scalar that serialises this format as a string. If you’re using a server based on graphql-js, there’s even a serializer/deserializer already available for doing this. But whatever you do, don’t use anything other than RFC3339/ISO8601. Life’s too short for debate and confusion about this. It’s a solved problem. Let’s move on. I don’t even know why we’re talking about this anymore. Whatever, I don’t care, just shut up.
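On the schema side, a custom scalar is a one-line declaration. In this sketch the scalar name DateTime and the Appointment type are assumptions made for illustration; the server still has to supply the code that serialises and parses the values:

```graphql
# Values travel as RFC 3339 strings, for example "2019-03-14T09:30:00+09:30".
scalar DateTime

type Appointment {
  id: ID!
  startsAt: DateTime!
}
```

Note that the offset is part of the value, so nobody has to guess whether a timestamp was meant to be UTC or Darwin time.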

8. Use the ! operator as much as possible

OK, now that I’ve gotten date formatting out of my system, let’s move on. This next tip is a controversial one, because it runs against the default behaviour of GraphQL.

You’ve probably noticed a lot of ! operators in the preceding schema examples. If you don’t already know, this operator specifies that a field or argument or variable cannot have a null value. I recommend you use it as much as possible.

Why? Because at the risk of stating the bleeding obvious, if something can be nullable, then you’ve got to deal with the scenario when it is null. At the very least, if you’ve indicated that something can be nullable, then your GraphQL server won’t check the values for you as they come in and out of the server at runtime – you need to do the check yourself.

Furthermore, if you use a code generator to generate static types, then a nullable field in a GraphQL type will result in a nullable field in the corresponding generated types. This means that, if you’re using a static type checker that is strict about nulls (for example, TypeScript with the strictNullChecks flag enabled, a flag that I always recommend you use), then the compiler will force you to deal with the possibility that a value is null at compile time. Which is great if it can be null, but a waste of time if it won’t ever actually be null.

Facebook argue that the typical scenario is for something to be nullable. I think this is because they want to use GraphQL to build clients that will continue to operate even when faced with missing data. However, you’re not Facebook, especially if you’re just starting out with GraphQL. So unless you’re facing the genuine possibility that a field can be null, don’t bother making it nullable. You’ll end up with lots of ! operators in your schema, but you’ll have a lot less to worry about both at runtime and if you statically check your code.
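Here is a small sketch of what that looks like in practice, reusing the Parent and Child types from tip 5. The nickname field is hypothetical, added only to show a field that is deliberately left nullable:

```graphql
type Child {
  id: ID!
  name: String!
}

type Parent {
  id: ID!
  name: String!
  # Nullable on purpose: a parent genuinely might not have a nickname.
  nickname: String
  # The list is never null and never contains null entries.
  # It can still be empty.
  children: [Child!]!
}
```

One thing to keep in mind: under the GraphQL specification, if a field declared with ! does resolve to null at runtime, the error propagates up to the nearest nullable parent field, which then becomes null in the response. So only mark a field as non-null if the server can actually keep that promise.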

9. Remember that you can query multiple root fields

This might seem obvious to GraphQL experts, but I’ve seen it trip up novices. If you need to get the values of two different fields from the Query type, remember that you can fetch multiple fields from this type in a single operation. Consider the following example schema:

type Cat {
  id: ID!
  name: String!
}

type Dog { 
  id: ID!
  name: String!
}

type Query {
  cat(id: ID!): Cat!
  dog(id: ID!): Dog!
}

If you wanted to get the name of a cat with a particular ID, you’d query the server with something like this:

query GetCat($id: ID!) { 
  cat(id: $id) {
    name
  }
}

But what if you wanted to get the name of a particular dog at the same time, without having to send two separate queries in their own network requests? I’ve seen people think that, to do this, they will need to add another field to the Query type in their schema, where the type of that field contains specifically the information they want. For example:

type Query {
  ...
  catAndDog(catId: ID!, dogId: ID!): CatAndDog!
}
...

type CatAndDog {
  cat: Cat!
  dog: Dog!
}

They’d then query it with something like:

query GetCatAndDog($catId: ID!, $dogId: ID!) { 
  catAndDog(catId: $catId, dogId: $dogId) {
      cat {
        name
      }
      dog {
        name
      }
  }
}

However, you don’t need a new type, or a new field. Instead, you can just request the original two root fields in a single query:

query GetCatAndDog($catId: ID!, $dogId: ID!) { 
  cat(id: $catId) {
    name
  }
  dog(id: $dogId) {
    name
  }
}

In doing so, you saved having to create a new type, and avoided adding a new field to the Query type. You’re welcome.
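A related case is wanting the same root field twice in one operation, for example two different cats. Two plain cat selections with different arguments would both try to use the response key cat, so each one needs an alias. In this sketch the aliases firstCat and secondCat and the variable names are arbitrary choices:

```graphql
query GetTwoCats($firstId: ID!, $secondId: ID!) {
  firstCat: cat(id: $firstId) {
    name
  }
  secondCat: cat(id: $secondId) {
    name
  }
}
```

The response data is then keyed by firstCat and secondCat rather than cat. Once again, no changes to the schema are required.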

10. Don’t be afraid to use an interface

This last one is something I’ve learnt from personal experience. It turns out that GraphQL interfaces can be a great way to model entities that share common fields. However, it can be tempting to postpone using them as they can seem a little intimidating at first. Nevertheless, it’s important to get in there and use them sooner rather than later, as they’ll often make your life easier in the long run.

For example, a common pattern when building an application is to show a list of entities, where all of the entities share one or more common properties. The user might then choose to look in more detail at one of those entities. However, the details of the entity may differ, depending on its type. To illustrate this, let’s extend our schema from the previous tip, adding a field to Dog called numberOfSticksFetched, and a field to Cat called numberOfMiceCaught:

type Dog {
  id: ID!
  name: String!
  numberOfSticksFetched: Int!
}

type Cat {
  id: ID!
  name: String!
  numberOfMiceCaught: Int!
}

Clearly these new fields are specific to dogs and cats. Dogs (generally) don’t catch mice, and we all know that no self-respecting cat is going to fetch a stick for you. However, they both still have an id and name field.

If we wanted to get a list of all cats and dogs, and only cared about showing their names in that list, we could add a couple of fields to the Query type:

type Query {
  ...
  cats: [Cat!]!
  dogs: [Dog!]!
}
...

and then query them like this:

query GetCatsAndDogs { 
  cats {
    id
    name
  }
  dogs {
    id
    name
  }
}

We’d then jam all the results together on the client-side and display them in a single list.

However, this is repetitive, as we had to ask for id and name in two separate places. Furthermore, if we add more types of animal with their own id and name fields to our schema in future, this duplication will grow.

At this point, you might start to think that we should instead add an Animal type. But what should it look like? I’ve seen some people start with something like this:

type Animal {
  id: ID!
  name: String!
  numberOfSticksFetched: Int
  numberOfMiceCaught: Int
}

type Query {
  ...
  animals: [Animal!]!
}

Where numberOfSticksFetched and numberOfMiceCaught are nullable, so that we can tell at runtime whether the animal is actually a cat or a dog. However, this modelling is problematic because it doesn’t prevent the possibility that they could both be null, or both non-null.

The good news is that, rather than having to model this sort of polymorphism ourselves, GraphQL has a built-in mechanism for doing it: interfaces.

interface Animal {
  id: ID!
  name: String!
}

type Dog implements Animal {
  id: ID!
  name: String!
  numberOfSticksFetched: Int!
}

type Cat implements Animal {
  id: ID!
  name: String!
  numberOfMiceCaught: Int!
}

type Query {
  ...
  animals: [Animal!]!
}

At this point it just might seem like a bunch of extra typing (I’m still unclear as to why GraphQL requires you to redeclare fields in types that implement an interface). However, having laid this groundwork, you can then get a list of all animals using:

query GetAnimals {
  animals {
    id
    name
    __typename
  }
}

Note the inclusion of the __typename field in the query. This is a built-in metafield that any GraphQL server must give you. It’ll be a string representation of the type of a particular result. If your user then wants to view the details of a particular animal, the client can use __typename to choose which query to run to get the details for that animal type. Even better, if we want to add a new type of animal, we just add a new type that implements the Animal interface, and don’t have to change anything else.
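If you would rather fetch the type-specific fields in the same request instead of running a follow-up query, GraphQL’s inline fragments let you do that against the same schema. This is a sketch of an alternative to the approach described above, not a replacement for it; the operation name is an arbitrary choice:

```graphql
query GetAnimalsWithDetails {
  animals {
    id
    name
    __typename
    # Only selected when the animal is a Dog.
    ... on Dog {
      numberOfSticksFetched
    }
    # Only selected when the animal is a Cat.
    ... on Cat {
      numberOfMiceCaught
    }
  }
}
```

Each result then carries the shared fields plus whichever extra field applies to its concrete type, and the client can still use __typename to decide how to render it.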

Whilst interfaces can be a bit of work to set up up-front, they’re worth the effort. I’ve designed numerous schemas where I wish I had just bitten the bullet earlier and used an interface.

Conclusion

So there you have it: my top ten tips for those writing their first GraphQL schema.

To recap, first ensure that you’re using GraphQL for the right reasons, and that you understand the key concepts. Next, rid yourself of preconceptions you might have from your days building versioned REST APIs. Then, take an incremental, client-driven approach, avoiding premature generalisation or undue influence from your backend systems.

If possible, keep your IDs opaque to give yourself a little wiggle room. And every time you find yourself adding yet another field to the Query type, first take a step back and ask: is there a way I can achieve the same outcome by just leveraging more of what GraphQL has to offer? Reading the documentation will help with this.

Finally, whilst I’ll let you make the final call as to how much you use the ! operator, remember that each time you don’t use an ISO string to represent a date, a GraphQL fairy dies. I mean seriously, who would do something like that? Do you really want that to happen? Good, I didn’t think so. Now go away and start working on that schema.

ben.teese@shinesolutions.com

I'm a Senior Consultant at Shine Solutions.
