Ten tips for designing a useful REST service API


I regularly build front-end UIs that talk to REST services. Over time I’ve found that some service APIs seem to have been built to accommodate change, whilst others seem to actively work against it.

In this post I’m going to talk about some of the patterns and approaches that I’ve found work best. These recommendations are intended for both service and front-end developers because – as you’ll soon see – both need to collaborate to design good service interfaces.

There’s a lot in this post, ranging from the very high-level to the very specific. You can read through the whole thing, or just jump straight to the sections that interest you.

The examples I present assume that backend service endpoints are returning JSON payloads. To describe the shapes of these payloads, I use TypeScript syntax. However, these guidelines are not TypeScript-specific.

Finally, to avoid distracting from the overall flow of this post, I have hidden some details and answers to common questions inside expandable headings. These headings look like this:

Click me for more details!

Yep, here are the gory details!

If they are not of immediate interest to you, feel free to ignore them or revisit them later.

1. Design an OpenAPI spec in collaboration with client developers

As a front-end developer, I am generally expected to build user interfaces from mockups and requirements that have been agreed upon with my users in advance. If I instead chose to just build whatever I wanted, and then threw it over the fence to my users, it’s unlikely that the result would be very useful to them.

The same approach should be taken by REST service developers. Except in this case, the users are frontend developers. And instead of using mockups to agree upon what should be built, you should instead use OpenAPI specifications.

Why OpenAPI? It’s a common and well-understood specification format for REST interfaces. There is also a tooling ecosystem built around it. It’s not perfect, but it does represent an ideal baseline for specifying a REST interface.

How should the collaboration process work? I recommend you pick a thin vertical slice of functionality, and then either the client or service developers draft up what they think the OpenAPI interface for that functionality should look like. Representatives of both the client and service teams should then meet and walk through how that vertical slice will map from the client requirements (for example, a set of user interface mockups) through to the OpenAPI spec.

If issues are found, adjust the spec and repeat, until everybody has signed-off on it. Then both sides build their implementation, review how it went, and repeat the process.
If each iteration seems to be taking too long, you may have picked a slice of functionality that is too wide, and should consider narrowing the scope of what you are trying to achieve.
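For example, the draft spec for a first slice covering customer lookup might be as small as this. (The paths, fields, and version numbers here are illustrative assumptions, not a prescribed starting point.)

```yaml
openapi: 3.0.3
info:
  title: Customer Service
  version: 0.1.0
paths:
  /customers/{customerId}:
    get:
      summary: Look up a single customer
      parameters:
        - name: customerId
          in: path
          required: true
          schema:
            type: string
      responses:
        "200":
          description: The customer with the given ID
          content:
            application/json:
              schema:
                type: object
                required: [id, firstName, lastName]
                properties:
                  id:
                    type: string
                  firstName:
                    type: string
                  lastName:
                    type: string
```

A slice this small is still enough for both teams to walk through how a mockup maps to the interface.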

2. Prioritise interface over implementation

You should prioritise defining the OpenAPI interface over building the implementation. Once an interface has been agreed upon, both the client and the service teams can then work concurrently.

The client team can (and should) test their code against a stubbed version of the server. It is their responsibility to build this stub. They should not require the backend service to be completed before they can do meaningful development and testing.

The service team can (and should) test their service with a test client of their own. It is their responsibility to build that client. They should not require the client to be ready before they can do any meaningful development or testing.
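To make the client team’s side of this concrete, here is a minimal sketch (in the TypeScript used elsewhere in this post) of the kind of in-process stub they might start with. The handler shape and canned data are illustrative assumptions, not part of any agreed spec.

```typescript
// A canned, in-process stand-in for GET /customers/:customerId.
// The client team can point their data-access layer at handlers like
// this (or serve them over HTTP) until the real service is ready.
type Customer = { id: string; firstName: string; lastName: string }

// Canned data, keyed by customer ID.
const stubCustomers: Record<string, Customer> = {
  c1: { id: "c1", firstName: "Ada", lastName: "Lovelace" },
}

// Mimics the agreed interface, including the not-found case.
function stubGetCustomer(customerId: string): { status: number; body: Customer | null } {
  const customer = stubCustomers[customerId]
  return customer ? { status: 200, body: customer } : { status: 404, body: null }
}
```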

This doesn’t mean that the real client and service should never be tested together. However, it does mean that 80%-90% of development can be done independently.

Nor does it mean that you can expect that the interface definition will be 100% correct on the first attempt. As everybody’s understanding of the problem develops, it’s likely that changes will need to be made.

If a service team chooses to use a so-called “code-first” solution to implement their service, that should not delay or impact the definition of the interface in advance. If the interface that is automagically generated from a “code-first” solution doesn’t match what was agreed upon, it’s a problem for the server team to fix, not the client team.

Of course, nobody can be forced to delay implementation until interfaces have been defined. Sometimes teams deem it necessary to start implementation before interfaces are finalised. That’s OK, as long as the team accepts that they risk missing something important and having to go back and rework both the interface and the implementation.

3. Design a domain service, not a point-solution

Service APIs should be defined as a domain service, rather than a set of highly specialised endpoints that fetch very specific aggregations of data.

A domain service identifies specific types of entity within a particular business domain, then provides endpoints for looking up and manipulating instances of those objects, either individually or as collections.

Importantly, a single domain service endpoint should only return information about a single domain entity type. It can return the IDs of other entities that are related to that type, but no additional information about those entities. Note that there is no problem with a particular entity type being returned by multiple endpoints.

A domain service should not be designed as a backend-for-frontend (BFF). I have argued elsewhere that every frontend should have a server, and that this server should still be built by frontend developers. Service developers are not front-end developers, and thus should not build the server-side of a frontend.

An example

To demonstrate, imagine a service that keeps track of customers, as well as orders made by customers for particular products.

Furthermore, imagine that the first thing that the frontend wants to do is get the first and last name of a particular customer from that service.

To do this, we might add a service endpoint like this:

Endpoint path

Shape of response

/customers/:customerId

type Customer = {
  id: string
  firstName: string
  lastName: string
}

But what if on the same screen, we wanted to show to the user the details of a customer and a summary of their orders?

At first it might be tempting to extend the shape of the /customers/:customerId response to be something like this:

type Customer = {
  id: string
  firstName: string
  lastName: string
  orders: [{
    id: string
    timestamp: string
    productId: string
  }]
}

We highly recommend against doing this, because we’re now returning information about multiple domain objects from a single endpoint. There are two problems with this:

  • Even if we only want to get the name of a customer, we always get their orders over the wire.
  • It makes it hard to deal with the fact that the number of orders for a customer is unbounded (which we will discuss shortly).

So for now, we’ll instead add a new endpoint for getting the orders for a customer:

Endpoint path

Shape of response

/customers/:customerId/orders

type Orders = [{
  id: string
  timestamp: string
  productId: string
}]

Now, to get the details of a customer and a summary of their orders, we would first call /customers/:customerId, then /customers/:customerId/orders.

Aren’t two calls worse than one?

Domain service developers might fret about the theoretical performance implications of a client calling two endpoints instead of one, but in practice it is usually not a problem. A well-written front-end will have its own BFF (AKA front-end server), and that server should only access backend services over a private, internal network that has low latency and high bandwidth. 

The slowest network over which a front-end will operate will be that between the front-end client and the front-end server. It is thus the front-end server’s responsibility to minimise the number of calls that the front-end client needs to make to it over that slow network. This is the optimisation that counts. Less concerning is whether the front-end server makes more than one request to upstream domain services whilst it is processing a single incoming request from the front-end client. 
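To illustrate, a front-end server might compose the two domain-service calls like this. (The function and fetcher names are illustrative; the fetchers are injected so the composition can be exercised without a network.)

```typescript
type Customer = { id: string; firstName: string; lastName: string }
type Order = { id: string; timestamp: string; productId: string }

// Composes the two domain-service calls into the single payload that the
// front-end client needs. The two calls are independent of each other, so
// they can be issued concurrently.
async function getCustomerWithOrders(
  customerId: string,
  fetchCustomer: (id: string) => Promise<Customer>,
  fetchOrders: (id: string) => Promise<Order[]>
): Promise<Customer & { orders: Order[] }> {
  const [customer, orders] = await Promise.all([
    fetchCustomer(customerId),
    fetchOrders(customerId),
  ])
  return { ...customer, orders }
}
```

Because the calls run concurrently over a low-latency internal network, the extra round trip adds little to the overall response time.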

Dealing with unbounded data sets

Next, imagine that the front-end wants to get more specific details about the orders made by a particular customer – specifically, the name and price of the product for each order.

According to our guidelines for writing a domain service, because this is a new entity, we should define a new endpoint:

Endpoint path

Shape of response

/products/:productId

type Product = {
  id: string
  name: string
  price: number
}

So to get a detailed list of the orders for a customer, first the front-end would need to call /customers/:customerId/orders, then for each order it gets back, it calls /products/:productId.

The problem is that – as we mentioned earlier – the number of orders for a customer is unbounded. Specifically, as the number of orders for a customer grows, so does the number of calls made to /products/:productId.

This is sometimes referred to as an N+1 problem. Even with a low-latency network, the number of different products that have been ordered by a customer could still get to a point where the overall response time of the frontend server is too high. This is not a good experience for the user, who will either experience a network timeout, or get sick of waiting and try to take alternative action.

What if we include the product details with the orders?

To get around this, you might once again be tempted to return to /customers/:customerId/orders and change the response type to include the product details:

Endpoint URL

Shape of response

/customers/:customerId/orders

type Orders = [{
  id: string
  timestamp: string
  product: {
    id: string
    name: string
    price: number
  }
}]

This will solve the problem of /products/:productId being called multiple times, but in addition to the endpoint fetching data that won’t necessarily always be needed, the size of the /customers/:customerId/orders response still remains unbounded. Eventually, a customer could have so many orders that the network still times out before the response can be streamed back to the front-end client.

To be clear, the problem here is not that the number of orders for a customer might exceed some arbitrary threshold. The problem is that the number of orders is unbounded. In other words, it has no upper limit. This makes it impossible to put an upper bound on how long it will take to load those orders. What should we do instead?

Use paging

I strongly recommend that most unbounded lists of data that are displayed to the user by a frontend are pageable. Furthermore, to be effective this pageability will need to extend down through the stack and into backend service APIs.

Why paging is especially applicable to UIs

A front-end is a user interface, meaning that there is a limit to the amount of information that it can (and should) display to the user on a single screen. This in-turn means that there is usually an upper-bound on the amount of data that the front-end needs to request from backend services in response to a single user interaction. Paging lets us exploit this fact.

To support paging, some additional information will need to be encoded in some of our service API endpoints. In our example, this means we’ll have to make some changes to the /customers/:customerId/orders endpoint.

There are lots of different ways to do paging, but a common technique is to introduce search parameters to the URL and extend the response shape slightly. For example, we could extend the /customers/:customerId/orders endpoint as follows:

Endpoint path

Shape of response

/customers/:customerId/orders?cursor=<string | null>&action=<"first"|"last"|"next"|"prev">&limit=<number>

type OrdersPage = {
  startCursor: string
  endCursor: string
  hasNext: boolean
  hasPrev: boolean
  orders: [{
    id: string
    timestamp: string
    productId: string
  }]
}

The key part is the limit URL search parameter. This lets the client control the maximum allowable size of the result set coming back to it. This means that in our example, there is now a fixed upper limit on the number of subsequent calls that the front-end server will make to /products/:productId whilst it services a single incoming request.

If it wants, the server can impose a maximum allowable value for limit to prevent a client from fetching more records than the server can expect to load without timing out.

What about cursor and action?

In addition to the limit search parameter, the paging interface introduces the concept of a cursor and action.

A cursor is a value that identifies a point in an overall result set. Cursors are like page numbers (sometimes they are even implemented internally using page numbers), but offer a bit more flexibility.

The action describes a paging action that we want to take, sometimes relative to a particular cursor. Only the next and prev actions require a cursor value.

So to go to the first page, the client will call the endpoint with action=first and store the values of startCursor and endCursor that it receives in the response. If the client then wants to go to the next page, it will call the endpoint with the params action=next&cursor=<storedEndCursor>, and then store the new values of startCursor and endCursor that it gets back. If the client wants to go back, it can call the endpoint with action=prev&cursor=<storedStartCursor>.
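To make the scheme concrete, here is a small in-memory sketch of such an endpoint. In this sketch a cursor is simply a stringified record index (a real service would use something opaque and stable), action=next takes the endCursor of the page the client last saw, and action=prev takes its startCursor.

```typescript
type Order = { id: string; timestamp: string; productId: string }
type OrdersPage = {
  startCursor: string
  endCursor: string
  hasNext: boolean
  hasPrev: boolean
  orders: Order[]
}

// Seven fake orders, so that pages of three don't divide evenly.
const allOrders: Order[] = Array.from({ length: 7 }, (_, i) => ({
  id: `o${i}`,
  timestamp: `2024-01-0${i + 1}T00:00:00Z`,
  productId: `p${i}`,
}))

function getOrdersPage(
  action: "first" | "last" | "next" | "prev",
  limit: number,
  cursor: string | null = null
): OrdersPage {
  let start = 0
  if (action === "last") start = Math.max(0, allOrders.length - limit)
  if (action === "next") start = Number(cursor) + 1 // first record after the stored endCursor
  if (action === "prev") start = Math.max(0, Number(cursor) - limit) // window ending before the stored startCursor
  const orders = allOrders.slice(start, start + limit)
  const end = start + orders.length - 1
  return {
    startCursor: String(start),
    endCursor: String(end),
    hasNext: end < allOrders.length - 1,
    hasPrev: start > 0,
    orders,
  }
}
```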

If the client needs additional information about the page, you can add it to the response. For example, you could include a pageNumber field in the response to specify the number of that page. You can even include additional cursors that the client can use to jump directly to a page number. However, at the very least you should always provide these base fields:

type PageInfo = {
  startCursor: string
  endCursor: string
  hasNext: boolean
  hasPrev: boolean
}

For example, imagine that a frontend client is displaying a list of order details to the user. It’s unlikely that the list needs to show more than 25 orders at a time – any more and it probably wouldn’t fit on the screen without the user having to scroll vertically.

So in the worst case the frontend server will make 26 calls to the backend service in response to a single request from the frontend client: one call to /customers/:customerId/orders, and 25 calls to /products/:productId. This sounds like a lot, but will not necessarily be a problem over a fast internal network. The key thing is that the number of calls has an upper bound, and that bound is known.

But what if it’s still too slow?

If making twenty-six calls to a backend service still leads to a perceptible and measurable performance problem, a dedicated batch-loading endpoint could be exposed by the server to look up a group of products in one hit. For example:

Endpoint path

Shape of response

/products?id=<someId>&id=<someOtherId>&id=<yetAnotherId>...

type Products = [{
  id: string
  name: string
  price: number
}]

This would bring us back to only making two calls to the backend service when gathering our detailed order information: one call to /customers/:customerId/orders and one call to /products. However, it’s important to remember that there is no need to add a batch-loading endpoint pre-emptively if there is no perceptible and measurable performance problem.
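On the front-end server side, assembling such a batch request is straightforward. A sketch (the base URL is an illustrative assumption):

```typescript
// Builds the repeated-id batch URL from the product IDs on a page of
// orders, deduplicating them first: several orders may reference the
// same product.
function buildBatchProductsUrl(baseUrl: string, productIds: string[]): string {
  const params = new URLSearchParams()
  for (const id of Array.from(new Set(productIds))) {
    params.append("id", id)
  }
  return `${baseUrl}/products?${params.toString()}`
}
```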

Can caching help with this?

Sometimes frontend servers can use simple caching mechanisms to delay the onset of the N+1 problem. However, they won’t avoid it completely.

For example, if a BFF is a GraphQL server built using Apollo Server, it’s common that whilst processing a single incoming GraphQL request, Apollo will build a map where the key is each upstream service URL that has been called so far, and the value is the response that was received from that URL.

If, whilst processing an incoming request, Apollo wants to make an outgoing request to a URL, it will first check this map. If the URL is already in the map, then it will instead use the stored response, rather than making a new call over the network to that URL.

This means that if, for example, several orders had the same product ID, then only one call would actually be made over the network to /products/:productId for that ID.

This is a useful optimisation, but if the number of orders is unbounded, there is still no limit on the number of unique products that might need to be looked up over the network.
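The behaviour described above can be sketched as a request-scoped, promise-caching fetch wrapper. (The wrapper name and shape here are illustrative; Apollo’s actual implementation differs in its details.)

```typescript
// Wraps a fetch-like function so that, within the lifetime of this
// wrapper (i.e. one incoming request), each URL is fetched at most once.
// Caching the promise (rather than the resolved value) also deduplicates
// calls that are still in flight.
function makeRequestScopedFetcher<T>(
  fetchUrl: (url: string) => Promise<T>
): (url: string) => Promise<T> {
  const cache = new Map<string, Promise<T>>()
  return (url) => {
    let pending = cache.get(url)
    if (!pending) {
      pending = fetchUrl(url)
      cache.set(url, pending)
    }
    return pending
  }
}
```

A fresh wrapper must be created for each incoming request, otherwise the cache would serve stale responses across requests.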

4. Make sure your identifiers are actually unique

If a client developer sees a field called id on an entity type in your OpenAPI spec, they will generally assume that the value of that field will be unique across all instances of that type. As the service developer, you should make sure that this assumption is correct.  Doing anything else will violate the principle of least surprise.

For example, if a client developer sees an endpoint like this:

Endpoint path

Shape of response

/customers/:customerId

type Customer = {
  id: string
  firstName: string
  lastName: string
}

It is not unreasonable for them to assume that the :customerId path parameter uniquely identifies a single customer, because only a single record is returned from the endpoint.

Similarly, if a developer sees an endpoint like this:

Endpoint path

Shape of response

/customers/:customerId/orders

type Orders = [{
  id: string
  timestamp: string
  productId: string
}]

they will generally assume that the id in the response is unique across all orders. You should ensure that is in fact the case.

5. Be clear and consistent when naming identifiers

As already discussed, each endpoint in a domain service should only return information about a specific type of domain entity, along with the IDs of any related entities. This reliance on IDs means that it is extra important to be clear and consistent about their naming.

For example, consider the following API endpoints:

Endpoint path

Shape of response

/customers/:customerId

type Customer = {
  id: string
  firstName: string
  lastName: string
}

/customers/:customerId/orders

type Orders = [{
  id: string
  customerId: string
  timestamp: string
  productId: string
}]

/products/:productId

type Product = {
  id: string
  name: string
  price: number
}

It’s reasonable for somebody who is looking at these endpoints to assume that:

  • The id field value returned by /customers/:customerId is a customer ID and will match the value of :customerId
  • The id field values returned by /customers/:customerId/orders are order IDs
  • The customerId field values returned by /customers/:customerId/orders are customer IDs and will match the value of :customerId
  • The productId field values returned by /customers/:customerId/orders are product IDs and can thus be passed to /products/:productId
  • The id field value returned by /products/:productId is a product ID that matches the value of :productId

Imagine instead if /customers/:customerId/orders returned the following:

type Orders = [{
  orderId: string
  id: string
  timestamp: string
  entityId: string
}]

In this case, it’s less clear what each ID is. If orderId is the ID of the order, then what is id? Is it the customer ID or the product ID? What is entityId? Is it the product ID…or even the customer ID?

In short, try and keep your ID field names in line with your entity type names, and consistent across your URLs and response payloads.

6. Use strings for identifiers

You’ll notice that in all of the examples, we’ve specified the type of ID fields as strings, not numbers.

Using numbers as unique identifiers can be problematic because in any particular process, there is usually a limit on how big numbers can get. Furthermore, this limit can differ from one process to another. For example, the maximum size of a number in a JavaScript runtime is different to the maximum size of an int in the JVM.

This can lead to sudden crashes when numeric ID values start exceeding the maximum allowed number value of a host process. Furthermore, fixes for such crashes can be complex and require changes to multiple systems.

Instead, we recommend you always use strings as unique identifiers, even if the underlying source is a number. Practically speaking there is no upper bound as to how big an identifier can get when it is encoded as a string, meaning it is extremely unlikely you’ll ever have to worry about overflows.
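To see why, consider what happens in a JavaScript runtime once an ID exceeds Number.MAX_SAFE_INTEGER (2^53 − 1): distinct values silently collapse into the same number, whilst the same IDs remain distinct as strings.

```typescript
// Two IDs that a backend using 64-bit integers would consider distinct...
const rawIdA = "9007199254740993" // 2^53 + 1
const rawIdB = "9007199254740992" // 2^53

// ...collapse to the same value when parsed as JSON numbers, because a
// JavaScript runtime cannot represent integers above 2^53 - 1 exactly.
const numbersCollide = JSON.parse(rawIdA) === JSON.parse(rawIdB)

// Encoded as JSON strings, the same IDs stay distinct.
const stringsCollide = JSON.parse(`"${rawIdA}"`) === JSON.parse(`"${rawIdB}"`)
```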

7. Use single values to identify entities

Sometimes it can be tempting to try and uniquely identify an entity using more than one field on that entity, rather than a single, dedicated id field.

The problem with doing this is that then every client has to know what combination of fields constitutes the unique identifier. Furthermore, those fields will always need to be passed up and down the stack if you want to retain the option of looking that particular entity up again in future. If a client gets either of these things wrong, then lookups will fail at runtime.

For example, imagine that you want to associate notes with an order. To look up the notes for an order, we might define a new endpoint:

Endpoint path

Shape of response

/order/:orderId/notes

type Notes = [{
  orderId: string
  createdAt: string
  text: string
}]

Note how we don’t put an ID on each note because, strictly speaking, it’s not necessary.

Now imagine that we have a requirement to look up a specific note. What should we use to uniquely identify it?

Without a dedicated id field in place, it can be tempting to try and use a combination of fields. For example, we might happen to know that no note can have the same combination of orderId and createdAt values. This would mean our endpoint for looking up a single note might look like this:

Endpoint path

Shape of response

/note?orderId=<string>&createdAt=<string>

type Note = {
  orderId: string
  createdAt: string
  text: string
}

However, now every client of your service has to know that a note is uniquely identified by an orderId and createdAt value. Furthermore, if any client wants to retain the option of looking up a particular note again in future, they need to store both the orderId and createdAt of that note.

Instead, it’s better for your service to give each note a unique identifier:

Endpoint path

Shape of response

/order/:orderId/notes

type Notes = [{
  id: string
  orderId: string
  createdAt: string
  text: string
}]

and a more familiar-looking endpoint for looking up a note:

Endpoint path

Shape of response

/notes/:noteId

type Note = {
  id: string
  orderId: string
  createdAt: string
  text: string
}

Now that we are being clear and consistent about what the unique identifier of a note is, it’s much easier to understand.

What if my entity doesn’t have a single unique identifier in my implementation?

Generally in this situation we recommend that the service transparently adopts a scheme for encoding and decoding the composite fields into a single value.

For example, the /order/:orderId/notes endpoint could construct a synthetic id value by joining together orderId and createdAt. /notes/:noteId would then need to split the incoming :noteId value back into these constituent parts and use them under the hood to look up the value.

As long as the service is consistent across all endpoints about how it both generates the outgoing IDs for a particular entity type and how it processes incoming IDs for that type, the details can remain entirely transparent to the client.
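A sketch of one such scheme, assuming a Node.js service. (The base64url encoding and JSON join are illustrative choices; any stable, reversible encoding will do.)

```typescript
// Joins the composite fields into a single opaque ID on the way out of
// the service. base64url keeps the result safe to use in a URL path.
function encodeNoteId(orderId: string, createdAt: string): string {
  return Buffer.from(JSON.stringify([orderId, createdAt])).toString("base64url")
}

// Splits an incoming :noteId back into its constituent parts so that the
// service can look the note up under the hood.
function decodeNoteId(noteId: string): { orderId: string; createdAt: string } {
  const [orderId, createdAt] = JSON.parse(Buffer.from(noteId, "base64url").toString("utf8"))
  return { orderId, createdAt }
}
```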

8. Encode datetimes as ISO strings

As with our GraphQL schemas, we highly recommend that you encode all timestamps as strings whose format matches the RFC 3339 profile of the ISO 8601 standard. This format is human-readable and unambiguous about timezones. OpenAPI even provides a predefined date-time format for string types that lets you specify this.

Do not use numbers, as that requires an implicit shared agreement as to whether they represent seconds or milliseconds since epoch (or even something more exotic), and what timezone the epoch is in.
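In a JavaScript client or BFF, no extra libraries are needed to produce or consume this format:

```typescript
// Date.prototype.toISOString() emits an RFC 3339-compliant timestamp,
// always anchored to UTC (note the trailing "Z").
const encoded = new Date(Date.UTC(2024, 0, 15, 9, 30, 0)).toISOString()
// e.g. "2024-01-15T09:30:00.000Z"

// The same string parses back to the same instant.
const roundTripsExactly = new Date(encoded).getTime() === Date.UTC(2024, 0, 15, 9, 30, 0)
```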

9. Use enums (most of the time)

If an endpoint receives data that should only match a predefined set of values, we recommend you constrain those values using an OpenAPI enum.

If an endpoint sends data that should only match a predefined set of values, and that data needs to be subject to any sort of processing by a frontend (as opposed to just being dumped directly to the screen), we also recommend you constrain those values to an enum.
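On the client side, an OpenAPI enum typically becomes a string-literal union plus a small runtime guard, so unexpected values are caught at the edge rather than deep inside rendering code. (The status values here are illustrative.)

```typescript
// The single source of truth for the allowed values, mirroring the
// enum in the OpenAPI spec.
const ORDER_STATUSES = ["pending", "shipped", "delivered"] as const
type OrderStatus = (typeof ORDER_STATUSES)[number]

// Narrows an untrusted string from a response payload to the union type.
function isOrderStatus(value: string): value is OrderStatus {
  return (ORDER_STATUSES as readonly string[]).includes(value)
}
```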

10. Be clear about naming identifiers across services

If your service is referring to entities in other services, it’s important that these references are clearly named to include the name of the service and, if the entity type name is different, the name of that type.

Sometimes the same entity can have different representations in different services. For example, one service might provide a view of one aspect of a customer, whilst another service might manage data related to some other aspect of the customer.

Furthermore, it might be the case that clients of your service need to be able to use your service as a starting point for navigating to other services. For example, they might use your service to look up a customer, then want to use your customer record to look up information about the customer in another service. These “other” services don’t even need to have been built from scratch by you – they could be third-party packaged software.

The good news is that all of these services don’t necessarily need to have a shared understanding of what constitutes a unique identifier for a customer. However, it’s important to be clear about which fields are identifiers to which services. Just because you think that something constitutes the unique identifier of an entity across the business, doesn’t mean that all other developers will think the same thing.

So if, for example, we were to extend our /customers/:customerId endpoint to return an ID that can be used to look up the customer in some other service, we should be explicit about what that ID is for:

type Customer = {
  id: string
  firstName: string
  lastName: string
  otherServiceCustomerId: string
}

Differing entity names

If, as is the case in the previous example, both your service and the other service share the same name for the entity type – ie, they both have the notion of a “customer” – it’s usually sufficient to prefix the field name with the name of the service.

If the other service uses a different terminology to describe this related entity, it should be reflected in the naming of the ID field in your service. For example, if the other service refers to customers as “users”, to minimise confusion it’s probably worth naming the field like this:

type Customer = {
  id: string
  firstName: string
  lastName: string
  otherServiceUserId: string
}

What the id field is for

Generally, we reserve the id field of any entity to be its ID in the context of the enclosing domain service. When referring to this ID from other services, we name those referring fields to include as much context as we can about the target service.

Continuing with our previous example:

Endpoint URL

Shape of response

http://myService.com/customers/:customerId

type Customer = {
  id: string
  firstName: string
  lastName: string
  otherServiceUserId: string
}

http://otherService.com/users/:userId

type User = {
id: string 
// ... other details
}

So in this case it is reasonable to expect that the otherServiceUserId value returned by http://myService.com/customers/:customerId can be used to look up a user via http://otherService.com/users/:userId, and that the id field in the record returned from that endpoint will have the same value.

Re-using IDs

It might be the case that two (or more) services have the same definition of what constitutes a unique identifier for an entity. In that case it can be tempting to skip being explicit when naming identifiers, and just rely on the id fields. We advise against doing this.

Extending upon our previous example, imagine that the values of :customerId, id, otherServiceUserId and :userId could all be the same for a particular entity, even across two different services. Consequently, it might be tempting to ditch the otherServiceUserId field completely and just rely on the id.

The problem with this is that the assumption that these values are the same is implicit, and won’t necessarily be obvious to other developers. Furthermore, it won’t necessarily be the case across all services.

Consequently, we recommend having explicit fields that identify entities in other services, even if their values are the same as id. This introduces some redundancy – id and otherServiceUserId would have the same values – but we consider it worthwhile in terms of clarity.

Conclusion

If there’s one message you should take away from this post, it’s this: service developers should consult client developers about what would be most useful to them. Every system has users of one kind or another, not just frontends. Together, service developers and client developers can then use the other recommendations I’ve made in this post to design an API that is most likely to adapt to accommodate new requirements over time.

Most significantly, you should aim to design a domain service where each endpoint doesn’t return information about more than one type of entity (other than related IDs). You shouldn’t worry if this means that a client has to make multiple calls to your service in response to a single user interaction, as long as the number of calls has an upper bound. However, if the number of calls is unbounded, you should introduce a paging scheme. Finally, you should be clear about what the unique identifier of every entity type is, especially if it refers to an entity in a different domain service.

I’ve found that if you follow these guidelines, you’ll build a service that is able to adapt and grow over time to changing requirements.

ben.teese@shinesolutions.com

I'm a Senior Consultant at Shine Solutions.
