@mayank-kansal15
Last active May 16, 2022 07:36
This gist explains REST best practices and how to write pragmatic REST APIs.

WHAT IS REST?

REST stands for Representational State Transfer. It is a software architectural style for implementing web services. Web services implemented using the REST architectural style are known as RESTful web services.

In REST we transfer a representation of a resource in its state at a particular point in time, hence the name. For example, GET /v1/articles/123 transfers a JSON representation of article 123 in the state it has when the API is hit.

REST uses HTTP as its communication protocol. HTTP is stateless, so REST APIs should be stateless too. Every API call should be treated as completely independent on the server: each call must be authenticated and authorized on its own, and nothing from one call should be kept in server memory for use by later calls.

RESTful APIs are written for consumers. The name and structure of URIs should convey meaning to those consumers. Although your internal data models may map neatly to resources, it isn't necessarily a one-to-one mapping. The key here is to not leak irrelevant implementation details out to your API! Your API resources need to make sense from the perspective of the API consumer. Design for your clients, not for your data.

Each resource in a service suite will have at least one URI identifying it. REST URIs should follow a predictable, hierarchical structure to enhance understandability. Predictable in the sense that they're consistent, hierarchical in the sense that data has structure—relationships.

Examples of REST APIs and how to read them:

  • POST /v1/users: create a user in the user collection.
  • GET /v1/users: get users from the user collection.
  • GET /v1/users/123: get user 123 from the user collection.
  • PUT /v1/users/123: override user 123 in the user collection.
  • PATCH /v1/users/123: modify user 123 in the user collection.
  • DELETE /v1/users/123: delete user 123 from the user collection.

Resources in REST:

In REST we transfer resource representations. Any information or functionality that a REST API server offers can be treated as a resource.

How to name resources?

  • Resource names should be nouns, not verbs.
  • Write resource names in lowercase.
  • In most cases use the plural for resource names, because the server offers a collection of resources; use an ID to identify a particular one in the collection. Examples: /users, /orders, /posts.
  • For multi-word resource names, use snake_case (words separated by underscores). snake_case is more readable than spinal-case (words separated by hyphens) and camelCase.
  • Use a singular resource name only when the resource really is a singleton. For example, if a customer has a single configuration, use /customers/{customer_id}/configuration.

Resource identifier:

  • For resource IDs, avoid database sequence numbers. A UUID, Mongo ObjectId, etc. is preferred.
  • When a resource can be identified by both an ID and a name and we want to allow both in the API, the route could be like POST /services/{service_id_or_name}.
  • For APIs specific to a user or customer, where the ID comes from a token, use mine in the path to convey that the path relates to the caller's own resource. Example: GET /users/mine/bank_accounts; here the user ID should come from the JWT, not from the path. For an admin API that lists a user's bank accounts, the path would be /users/{user_id}/bank_accounts. The difference between the two routes is authorization: /users/mine/bank_accounts is authorized for the customer role and /users/{user_id}/bank_accounts for the admin role.
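The mine convention can be sketched as a small helper that decides which user ID a request may act on. This is only an illustration: the function name resolve_user_id and the claim names sub and roles are assumptions, and the token is assumed to have been verified already.

```python
def resolve_user_id(path_user_id: str, token_claims: dict) -> str:
    """Return the user ID the request may act on, or raise PermissionError.

    token_claims is assumed to be the payload of an already-verified JWT;
    the claim names "sub" and "roles" are hypothetical.
    """
    if path_user_id == "mine":
        # Customer route: the ID always comes from the token, never the path.
        return token_claims["sub"]
    # Admin route: an explicit ID in the path requires the admin role.
    if "admin" not in token_claims.get("roles", []):
        raise PermissionError("admin role required")
    return path_user_id
```

With this split, a customer can never reach another user's bank accounts by editing the path, because their own ID is never read from it.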

Sub or nested resources:

Sub-resources represent a relationship from one resource to another. If cardinality is 1:1, then no additional information is required, otherwise, the sub-resource should provide a sub-resource ID for unique identification.

If a sub-resource can be treated independently in the system and offers a whole set of functionality, it should have its own path instead of living under a parent. Example: instead of /customers/mine/orders, have /orders in the service.

If a sub-resource can't exist without its parent, doesn't make sense without it, or doesn't offer functionality of its own, keep it as a sub-resource under the parent. Examples: /posts/{post_id}/comments, /posts/{post_id}/comments/{comment_id}, /users/mine/bank_accounts (when we don't want to treat bank accounts independently in the service and there is no functionality on bank accounts).

Avoid too many levels in the path: use no more than two levels of sub-resources. More than two levels usually means a sub-resource offers a good amount of functionality and can be treated independently.

Open question (needs clarification): POST /orders creates an order for a customer, with the customer details coming from the token. GET /orders gets all orders for a customer, but if an admin wants to see the orders of a couple of users, customerId should come from query params.

In the case of orders, two different sets of routes make sense. For customers: POST /customers/mine/orders and GET /customers/mine/orders.

For admin control: GET /orders and GET /orders/{order_id}.

Actions that don't fit into the world of CRUD operations?

This is where things can get fuzzy, so think it through carefully before adding such an action.

Try to model the action as a sub-resource and then create routes like the ones below:

  • To star a gist, POST /gists/{gist_id}/star, and to unstar it, DELETE /gists/{gist_id}/star
  • To activate a service, POST /services/{service_id}/activate, and to deactivate it, DELETE /services/{service_id}/activate

Mostly use POST for these actions.

API domain name:

We should have three domains for the production environment:

  • API: https://api.{company_name}.com
  • OAuth2: https://oauth2.{company_name}.com
  • Developer portal: https://developers.{company_name}.com

To provide a staging environment, as many companies do, add two more domains for the sandbox:

  • API: https://api.sandbox.{company_name}.com
  • OAuth2: https://oauth2.sandbox.{company_name}.com

For APIs offered by a service within a company, it is good to give them a namespace like https://api.{company_name}.com/{namespace}. Eg: api.google.com/translations, api.rupeek.com/cis, etc.

This namespace can also help the API gateway look for an API only in a specific service and return 404 if it is not found.

HTTP Methods:

We should restrict ourselves to GET, POST, PUT, PATCH and DELETE. With these five methods we should be able to write almost all REST APIs.

POST:

POST should be used in three cases:

  1. To create an independent resource. Eg: POST /posts, POST /transactions
  2. To create a new sub-resource and establish its relationship with the main resource. Eg: POST /customers/{customer_id}/orders
  3. To perform some operation on a resource. We might be able to map this operation to a sub-resource, but sometimes not. Eg: POST /gists/{gist_id}/star, POST /services/{service_id}/activate, POST /payments/{payment_id}/refund

On a successful POST request that actually creates a resource, return HTTP status 201 and a Location header containing a link to the new resource. In other cases, when no resource was created, return HTTP status 200.

HTTP does not define POST as idempotent, and opinions differ on whether an API should make it so. I think in most cases it shouldn't be idempotent, but we can make it idempotent where the requirement calls for it. Eg: POST /customers/{customer_id}/orders shouldn't be idempotent, because we might want to create two identical orders for a customer. Eg: POST /customers/{customer_id}/gold_loan should be idempotent, as a customer shouldn't have two gold loans with the same jewel details.

For POST requests, body fields are preferred over query params.

GET:

The HTTP GET method is used to read (or retrieve) a representation of a resource. GET is safe and idempotent, meaning that calling the same URI again and again shouldn't change the underlying resource.

In the "happy" (or non-error) path, GET returns a representation and an HTTP response code of 200 (OK). For a GET on a list of resources, return 200 with an empty array even if the collection is empty. When a specific resource, such as GET /posts/{post_id}, is not present, return 404.

PUT:

The HTTP verb PUT should be used to do a full update of a resource instance in the collection (all attributes are replaced, and those not included in the request are deleted). On a successful update, return 204, as there is no need to echo back the data the client sent. Eg: PUT /posts/{post_id}

PUT is not a safe operation, as it modifies the state of a resource instance on the server, but it should be idempotent: repeating a PUT with the same data should leave the resource in the same state.

PATCH:

PATCH is used to perform a partial update to a resource. The PATCH request body only needs to contain the changes to the resource, not the complete resource. Eg: PATCH /posts/{post_id}

PATCH is neither safe nor idempotent. Making the same PATCH call at a different time might leave the resource in a different state, because other update calls may have changed it in between.

On a successful PATCH return 204 status with no response body.

DELETE:

DELETE is used to delete a resource identified by a URI. On successful deletion, return status 204 with no response body, as in most cases the client doesn't need the deleted resource. Eg: DELETE /posts/{post_id}

DELETE operations should be idempotent, so even if the resource doesn't exist or was already deleted, just return status 204 with no response body.
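The status code conventions for POST, GET, PUT and DELETE described above can be sketched with a tiny in-memory store. This is a framework-free illustration, not a real server: the function names, the (status, headers, body) return shape and the /v1/posts path are assumptions, and returning 404 for a PUT on a missing post is one possible choice that the text above does not prescribe.

```python
import uuid

posts = {}  # in-memory stand-in for a real data store


def create_post(body: dict):
    """POST /v1/posts: 201 plus a Location header for the new resource."""
    post_id = str(uuid.uuid4())
    posts[post_id] = dict(body)
    return 201, {"Location": f"/v1/posts/{post_id}"}, posts[post_id]


def get_post(post_id: str):
    """GET /v1/posts/{post_id}: 200 with the representation, 404 if absent."""
    if post_id not in posts:
        return 404, {}, None
    return 200, {}, posts[post_id]


def put_post(post_id: str, body: dict):
    """PUT /v1/posts/{post_id}: full replacement, 204 with no body."""
    if post_id not in posts:
        return 404, {}, None  # assumption: PUT does not create resources here
    posts[post_id] = dict(body)  # attributes missing from body are dropped
    return 204, {}, None


def delete_post(post_id: str):
    """DELETE /v1/posts/{post_id}: idempotent, always 204."""
    posts.pop(post_id, None)  # deleting a missing post is not an error
    return 204, {}, None
```

Calling delete_post twice with the same ID returns 204 both times, which is the idempotent behaviour described for DELETE.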

Miscellaneous:

  • Some HTTP clients can only send simple GET and POST requests. For these clients we can accept an X-HTTP-Method-Override header carrying PUT, PATCH or DELETE on a POST API call.
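A minimal sketch of how a server might honour this header follows. The function name effective_method is hypothetical, and the rule that only POST requests may be overridden is an assumption made here to keep GET safe.

```python
OVERRIDABLE = {"PUT", "PATCH", "DELETE"}


def effective_method(method: str, headers: dict) -> str:
    """Return the HTTP method the request should be routed as."""
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    override = normalized.get("x-http-method-override", "").upper()
    # Only a POST may be upgraded; other methods ignore the header.
    if method.upper() == "POST" and override in OVERRIDABLE:
        return override
    return method.upper()
```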

HTTP response:

On a successful API call, return the resource representation when needed, with the proper HTTP status code.

Envelope response or not:

Don't envelope the response unnecessarily, like below: { status: 200, data: {} }

The HTTP status code should convey whether the call was successful. How to convey pagination and error scenarios is explained below.

Compress response:

To reduce response time and network bandwidth, gzip responses by default, as all major browsers and HTTP libraries decompress the content automatically.
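In practice the server usually checks the client's Accept-Encoding request header before compressing. Here is a small sketch of that check using Python's standard gzip module; the function name maybe_compress is hypothetical, and quality values such as gzip;q=0 are deliberately ignored to keep the example short.

```python
import gzip


def maybe_compress(body: bytes, request_headers: dict):
    """Return (body, response_headers), gzipped when the client accepts it."""
    accepted = request_headers.get("Accept-Encoding", "")
    # Strip any ";q=..." parameters and compare the bare encoding names.
    encodings = [item.split(";")[0].strip().lower() for item in accepted.split(",")]
    if "gzip" in encodings:
        headers = {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
        return gzip.compress(body), headers
    # Vary tells caches that the response depends on Accept-Encoding.
    return body, {"Vary": "Accept-Encoding"}
```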

Limiting resource fields in response:

The API consumer doesn't always need the full representation of a resource. To minimize network bandwidth, support a fields query parameter that takes a comma-separated list of fields to include. Eg: GET /tickets?fields=id,subject,updatedAt&status=open

Auto loading related resource representations:

There are many cases where an API consumer needs to load data related to (or referenced from) the resource being requested. Rather than requiring the consumer to hit the API repeatedly for this information, there would be a significant efficiency gain from allowing related data to be returned and loaded alongside the original resource on demand.

We can use an embed query param to request embedding of related resources. Eg: GET /tickets/12?embed=assignee

Field limiting and auto loading of related resources can be combined, like below. Eg: GET /tickets?fields=id,subject,updatedAt,assignee.id,assignee.name&status=open&embed=assignee
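As a rough sketch, the fields parameter could be applied to a representation as shown here. The function name select_fields is hypothetical, dotted names are assumed to reach only one level into a nested object, and the related resource (assignee) is assumed to have been embedded already.

```python
def select_fields(resource: dict, fields: str) -> dict:
    """Keep only the requested fields of a resource representation.

    fields is the raw value of the fields query parameter; a dotted name
    such as "assignee.id" selects one field of an embedded object.
    """
    result = {}
    for name in filter(None, (part.strip() for part in fields.split(","))):
        parent, _, child = name.partition(".")
        if parent not in resource:
            continue  # unknown fields are silently ignored in this sketch
        if child:
            nested = resource[parent]
            if isinstance(nested, dict) and child in nested:
                result.setdefault(parent, {})[child] = nested[child]
        else:
            result[parent] = resource[parent]
    return result
```

Usage with a ticket whose assignee has been embedded:

```python
ticket = {
    "id": 12,
    "subject": "Printer is offline",
    "status": "open",
    "assignee": {"id": 7, "name": "Asha", "email": "asha@example.com"},
}
trimmed = select_fields(ticket, "id,subject,assignee.id,assignee.name")
```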

HTTP status code:

The list below includes the status codes most commonly sent from an API server. For the complete list, refer to: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status

  • 200 OK: Response to a successful GET, PUT, PATCH or DELETE. Can also be used for a POST that doesn't result in a creation. 200 OK is the usual success code for most cases.

  • 201 CREATED: Successful creation occurred via POST. Set the Location header to contain a link to the newly-created resource. Response body content may or may not be present.

  • 202 ACCEPTED: Used for asynchronous method execution to specify the server has accepted the request and will execute it at a later time.

  • 204 NO CONTENT: Indicates success but nothing is in the response body, often used for DELETE and PUT operations.

  • 206 PARTIAL CONTENT: The content returned is incomplete. HTTP defines it for range requests; some APIs also return it for paginated GET responses.

  • 400 BAD REQUEST: To indicate a client side error in request formatting, request body is not in proper format, fields are missing or fields data types are different, etc.

  • 401 UNAUTHORIZED: When no or invalid authentication details are provided.

  • 403 FORBIDDEN: When authentication succeeded but authenticated user doesn't have access to the resource.

  • 404 NOT FOUND: Used when the requested resource is not found, whether it doesn't exist or if there was a 401 or 403 that, for security reasons, the service wants to mask.

  • 405 METHOD NOT ALLOWED: Used to indicate that the requested URL exists, but the requested HTTP method is not applicable. For example, POST /users/12345 where the API doesn't support creation of resources this way (with a provided ID). The Allow HTTP header must be set when returning a 405 to indicate the HTTP methods that are supported. In the previous case, the header would look like "Allow: GET, PUT, DELETE". Some APIs also use it when an HTTP method is requested that isn't allowed for the authenticated user.

  • 406 NOT ACCEPTABLE: Nothing matches the Accept-* headers of the request. As an example, the client asked for an XML formatted resource but it is only available as JSON.

  • 410 GONE: Indicates that the resource at this end point is no longer available. Useful as a blanket response for old API versions.

  • 415 UNSUPPORTED MEDIA TYPE: If incorrect content type was provided as part of the request.

  • 422 UNPROCESSABLE ENTITY: The request was well-formed but could not be followed due to semantic errors, or could not be executed due to factors outside of the request body.

  • 429 TOO MANY REQUESTS: When a request is rejected due to rate limiting.

  • 500 INTERNAL SERVER ERROR: The general catch-all error when the server side throws an exception. Use this only for errors that the consumer cannot address from their end. The error response should follow the same error resource representation. We SHOULD limit the amount of information to avoid exposing internal service implementation details to clients. Service developers should use logging and tracking utilities to provide additional information.

Result filtering, sorting & searching:

It is best to keep the base URL as simple as possible and use query params for filtering, etc. Keep query param names in lower snake_case for better readability.

Open question: what convention should param values follow, for example always upper case?

Filtering:

Use a unique query parameter for each field that implements filtering. Eg: GET /tickets?status=OPEN

For multiple values of a query param, the recommended way is GET /tickets?status=CLOSED&status=INVALID; here the query param name is singular. Most libraries support this out of the box.

If we are hitting the maximum URL length (commonly taken as about 2000 characters), we can use GET /tickets?statuses=CLOSED,INVALID. The query param name should be plural, and the API spec should document the separator and how to encode the , separator.
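Both styles of multi-value filter can be read with Python's standard urllib.parse.parse_qs, which collects repeated parameters into a list and decodes percent-encoded separators. The sketch below is illustrative; the function name parse_status_filter is hypothetical.

```python
from urllib.parse import parse_qs


def parse_status_filter(query_string: str) -> list:
    """Collect status values from status=A&status=B or statuses=A,B."""
    params = parse_qs(query_string)
    # Repeated singular parameter: parse_qs already returns a list.
    values = list(params.get("status", []))
    # Plural parameter: split each value on the documented "," separator.
    for joined in params.get("statuses", []):
        values.extend(value for value in joined.split(",") if value)
    return values
```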

Sorting:

Similar to filtering, a generic sort parameter can be used to describe sorting rules. Accommodate complex sorting requirements by letting the sort parameter take a comma-separated list of fields, each with an explicit sorting order after a pipe.

Eg: GET /tickets?sort=creation_date|asc,author|desc
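A sort value in this format could be parsed and applied as sketched below. The function names parse_sort and apply_sort are hypothetical, rows are assumed to be plain dictionaries, and a real service would normally push the ordering down to the database instead of sorting in memory.

```python
def parse_sort(sort: str) -> list:
    """Turn "creation_date|asc,author|desc" into (field, order) pairs."""
    rules = []
    for item in sort.split(","):
        field, separator, order = item.partition("|")
        if not field or not separator or order not in ("asc", "desc"):
            raise ValueError(f"invalid sort rule: {item!r}")
        rules.append((field, order))
    return rules


def apply_sort(rows: list, rules: list) -> list:
    """Sort rows by several keys, the first rule being the most significant."""
    # Python's sort is stable, so sorting by the least significant key
    # first and the most significant key last yields the combined order.
    for field, order in reversed(rules):
        rows = sorted(rows, key=lambda row: row[field], reverse=(order == "desc"))
    return rows
```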

Searching:

Sometimes basic filters aren't enough and you need the power of full text search. Perhaps you're already using ElasticSearch or another Lucene based search technology. When full text search is used as a mechanism for retrieving resource instances of a specific type of resource, it can be exposed as a /search sub-resource on the parent resource collection. The results will have a different format than the resources and the collection representation. This allows us to add suggestions, corrections and information related to the search.

Eg:

  • Searching for restaurants whose names start with “La”. Request: GET /restaurants/search?name=la* Response: might look something like { "count" : 5, "query" : "name=la*", "suggestions" : ["las"], "results" : [...] }

  • More complex search: GET /restaurants/search?name=napoli&type=chinese,japanese&zipcode=75*

If the search context applies to multiple resources, an independent search resource can be offered as an API. Eg: GET /search?q=abc

Errors:

For errors caused by the client or the server, the API should always return a sensible HTTP status code and a proper error resource representation. This error representation should be different from the API's other resource representations.

The error representation should have at least a unique error code (an integer or a string that can be looked up in the docs for more details) and a short, user-friendly message.

{
  "code" : 1024,
  "message" : "Some error happened on server, please try again."
}

If we want, we can also include a description or errorURI field with more details about the error.

Validation errors for PUT, PATCH and POST requests will need a field breakdown. This is best modeled by using a fixed top-level error code for validation failures and providing the detailed errors in an additional errors field, like so:

{
  "code" : 1024,
  "message" : "Validation Failed",
  "errors" : [
    {
      "code" : 5432,
      "field" : "firstName",
      "message" : "First name cannot have fancy characters"
    },
    {
      "code" : 5622,
      "field" : "password",
      "message" : "Password cannot be blank"
    }
  ]
}
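A validator that produces this shape might look like the sketch below. The function name validate_user and the two validation rules are hypothetical; the numeric codes simply mirror the example above.

```python
VALIDATION_FAILED = 1024  # fixed top-level code for all validation failures


def validate_user(payload: dict):
    """Return an error representation, or None when the payload is valid."""
    errors = []
    first_name = payload.get("firstName", "")
    # Hypothetical rule: only letters and spaces are allowed in a first name.
    if any(not (ch.isalpha() or ch == " ") for ch in first_name):
        errors.append({
            "code": 5432,
            "field": "firstName",
            "message": "First name cannot have fancy characters",
        })
    if not payload.get("password"):
        errors.append({
            "code": 5622,
            "field": "password",
            "message": "Password cannot be blank",
        })
    if not errors:
        return None
    return {
        "code": VALIDATION_FAILED,
        "message": "Validation Failed",
        "errors": errors,
    }
```

Collecting every field error before responding lets the client fix all problems in one round trip instead of discovering them one at a time.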

HTTP headers:

The purpose of HTTP headers is to provide metadata about the body or the sender of the message in a uniform, standardized, and isolated way. HTTP header names are case-insensitive. If available, HTTP standard headers MUST be used instead of creating a custom header.

Mostly used headers:

API developers should be aware of these commonly used headers.

  • Authentication headers: Authorization
  • Caching headers: Cache-Control, ETag, If-None-Match, Last-Modified, If-Modified-Since
  • Connection management headers: Connection, Keep-Alive
  • Content negotiation headers: Accept
  • CORS headers: Access-Control-Allow-Origin, Origin
  • Message body information headers: Content-Length, Content-Type, Content-Encoding
  • Request context headers: Host, User-Agent

Refer to this link for more information on headers: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers#content_negotiation

API Response Pagination:

If we foresee that a GET API might need pagination, we should build it in during the early design phase of the API, and paginate resources with default values when the calling client does not provide them, for example page=1&per_page=25.

Many APIs include pagination information in a response envelope, but the right way to include pagination details today is the Link header defined by RFC 8288.

An API that uses the Link header can return a set of ready-made links so the API consumer doesn't have to construct links themselves. This is especially important when pagination is cursor based. It is highly recommended to include the Link header in your responses; it lets you add, among others, navigation links such as next page, previous page, first and last page.

Here is an example of a Link header used properly, grabbed from GitHub's documentation. In this case we need to support two query params, page and per_page.

Link: <https://api.github.com/user/repos?page=5&per_page=100>; rel="next", <https://api.github.com/user/repos?page=3&per_page=100>; rel="previous", <https://api.github.com/user/repos?page=1&per_page=100>; rel="first", <https://api.github.com/user/repos?page=50&per_page=100>; rel="last"

In some cases where an API wants to send the total number of available results, this can be done with a custom HTTP response header like X-Total-Count.
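A Link header for page-based pagination could be generated as sketched below. The function name build_link_header and the base URL are hypothetical; the rel names follow the example above, and each target URL is wrapped in angle brackets as the Link header syntax requires.

```python
import math
from urllib.parse import urlencode


def build_link_header(base_url: str, page: int, per_page: int, total_count: int) -> str:
    """Build a Link header value with next/previous/first/last relations."""
    last_page = max(1, math.ceil(total_count / per_page))
    targets = {}
    if page < last_page:
        targets["next"] = page + 1
    if page > 1:
        targets["previous"] = page - 1
    targets["first"] = 1
    targets["last"] = last_page
    parts = []
    for rel, number in targets.items():
        query = urlencode({"page": number, "per_page": per_page})
        parts.append(f'<{base_url}?{query}>; rel="{rel}"')
    return ", ".join(parts)
```

The same total_count value can be sent in X-Total-Count, so the client gets both the navigation links and the overall size.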

API Versioning:

Always version your API. Versioning helps you iterate faster and make the transition smoother to new API functionality with a new API version.

API can have major and minor versions.

  • Major version: Like v1, v2. It might or might not be backward compatible. At most two major versions should be running: one live, like v3, and one deprecated, like v2. Only the major version should appear in the API path.

  • Minor version: Like v1.2, v2.3, etc. A minor version should always be backward compatible. Because of that, there is no need to mention it in the path, and only one minor version should be active per major version, with no deprecated or retired minor versions. Minor version changes, such as new request/response fields, headers, query params, etc., should be optional.

We should clearly communicate about major and minor versions and their related changes in API documentation. Deprecation and retirement timelines should also be communicated in API doc.

API security:

A RESTful API should be stateless. This means that request authentication should not depend on cookies or sessions. Instead, each request should come with some sort of authentication credentials.

We should always use SSL. No exceptions. Clients can unknowingly access APIs from an unsecured network, so enforcing SSL protects them. Don't redirect non-SSL HTTP traffic to HTTPS; return an error instead.

Find more information on API security here: https://github.com/shieldfy/API-Security-Checklist

Caching:

HTTP provides a built-in caching framework! All you have to do is include some additional outbound response headers and do a little validation when you receive some inbound request headers.

There are 2 approaches: ETag and Last-Modified

ETag:

When generating a response, include an ETag HTTP header containing a hash or checksum of the representation. This value should change whenever the output representation changes. Now, if an inbound HTTP request contains an If-None-Match header with a matching ETag value, the API should return a 304 Not Modified status code instead of the output representation of the resource.

Last-Modified:

This basically works like ETag, except that it uses timestamps. The response header Last-Modified contains a timestamp which is validated against If-Modified-Since.

The API server should also set the Cache-Control response header to control how browsers store and validate cached responses.
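The ETag approach can be sketched with the standard library alone. The names make_etag and conditional_get are hypothetical, the SHA-256 hash is just one possible checksum, and the comparison is simplified: a real If-None-Match header may carry several ETags or *, which this sketch does not handle.

```python
import hashlib
import json


def make_etag(representation: dict) -> str:
    """Derive a quoted ETag from a stable JSON encoding of the resource."""
    body = json.dumps(representation, sort_keys=True).encode("utf-8")
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def conditional_get(representation: dict, request_headers: dict):
    """Return (status, headers, body), using 304 when the client copy is fresh."""
    etag = make_etag(representation)
    # "no-cache" lets clients store the response but forces revalidation.
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, None
    return 200, headers, representation
```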

Rate limiting:

To prevent abuse, it is standard practice to add some sort of rate limiting to an API. RFC 6585 introduced a HTTP status code 429 Too Many Requests to accommodate this. However, it can be very useful to notify the consumer of their limits before they actually hit it. This is an area that currently lacks standards but has a number of popular conventions using HTTP response headers. At a minimum, include the following headers:

  • X-Rate-Limit-Limit - The number of allowed requests in the current period
  • X-Rate-Limit-Remaining - The number of remaining requests in the current period
  • X-Rate-Limit-Reset - The number of seconds left in the current period. Sending just the seconds will avoid confusion on date and timezone.
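One way to produce these headers is a fixed-window counter per client, sketched below. The fixed-window algorithm, the class name FixedWindowLimiter and the injectable clock are all choices made for this example; the text above does not prescribe a particular algorithm, and a production limiter would normally keep its counters in shared storage.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per client in each fixed time window."""

    def __init__(self, limit: int, window_seconds: int, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self.windows = {}  # client_id -> (window_start, requests_used)

    def check(self, client_id: str):
        """Return (status, headers) for one incoming request."""
        now = self.clock()
        start, used = self.windows.get(client_id, (now, 0))
        if now - start >= self.window_seconds:
            start, used = now, 0  # the previous window expired
        allowed = used < self.limit
        if allowed:
            used += 1
        self.windows[client_id] = (start, used)
        headers = {
            "X-Rate-Limit-Limit": str(self.limit),
            "X-Rate-Limit-Remaining": str(self.limit - used),
            # Seconds left in the window, avoiding date and timezone issues.
            "X-Rate-Limit-Reset": str(max(0, int(start + self.window_seconds - now))),
        }
        return (200 if allowed else 429), headers
```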

Cross-domain CORS:

CORS is needed when the application (JavaScript SPA) and the API are hosted on different domains, for example:

  • Application URL: https://some_company_name.com
  • API URL: https://api.some_company_name.com

In that case CORS has to be enabled on the web server (Nginx, Apache, NodeJS, ...).
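Whatever server is used, enabling CORS comes down to answering with the right response headers. The sketch below shows the core decision for simple requests; the function name cors_headers and the allow-list are hypothetical, and preflight (OPTIONS) handling is left out.

```python
ALLOWED_ORIGINS = {"https://some_company_name.com"}  # hypothetical allow-list


def cors_headers(origin: str) -> dict:
    """Return the CORS response headers for a request's Origin header."""
    if origin in ALLOWED_ORIGINS:
        return {
            # Echo the specific origin instead of using the "*" wildcard.
            "Access-Control-Allow-Origin": origin,
            # The response differs per origin, so caches must key on it.
            "Vary": "Origin",
        }
    return {}  # no CORS headers: the browser blocks the cross-origin read
```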

Handling Dates in REST:

Handling dates and timezones is always tedious and attracts bugs. Follow the rules below to simplify life:

  • Law #1: Use ISO-8601 for your dates.
  • Law #2: Accept any timezone with the date in ISO-8601 from the client.
  • Law #3: Store it in ISO-8601 UTC in the DB.
  • Law #4: Return it in ISO-8601 UTC in the response.
  • Law #5: Don't use time if you don't need it, in request, response and storage.
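The accept-any-offset, store-and-return-UTC rules can be sketched with Python's datetime module. The function name to_utc_iso is hypothetical, and rejecting timestamps without an offset is an assumption made here to avoid guessing the client's timezone.

```python
from datetime import datetime, timezone


def to_utc_iso(value: str) -> str:
    """Normalize an ISO-8601 timestamp with any offset to UTC ("Z" suffix)."""
    # Note: Python versions before 3.11 do not parse a trailing "Z" here.
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        raise ValueError("timestamp must include a UTC offset")
    return parsed.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")
```

Normalizing once at the API boundary means storage and responses only ever deal with UTC values.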

Consider Connectedness, HATEOAS:

Although the web generally works on HATEOAS type principles (where we go to a website's front page and follow links based on what we see on the page), I don't think we're ready for HATEOAS on APIs just yet. When browsing a website, decisions on what links will be clicked are made at run time. However, with an API, decisions as to what requests will be sent are made when the API integration code is written, not at run time. Could the decisions be deferred to run time? Sure, however, there isn't much to gain going down that route as code would still not be able to handle significant API changes without breaking. That said, I think HATEOAS is promising but not ready for prime time just yet. Some more effort has to be put in to define standards and tooling around these principles for its potential to be fully realized.

For now, it's best to assume the user has access to the documentation & include resource identifiers in the output representation which the API consumer will use when crafting links. There are a couple of advantages of sticking to identifiers - data flowing over the network is minimized and the data stored by API consumers is also minimized (as they are storing small identifiers as opposed to URLs that contain identifiers).

Also, given this post advocates version numbers in the URL, it makes more sense in the long term for the API consumer to store resource identifiers as opposed to URLs. After all, the identifier is stable across versions but the URL representing it is not!

API Documentation:

An API is only as good as its documentation. The docs should be easy to find and publicly accessible. Most developers will check out the docs before attempting any integration effort. When the docs are hidden inside a PDF file or require signing in, they're not only difficult to find but also not easy to search.

The docs should show examples of complete request/response cycles. Preferably, the requests should be pastable examples - either links that can be pasted into a browser or curl examples that can be pasted into a terminal. GitHub and Stripe do a great job with this.

Once you release a public API, you've committed to not breaking things without notice. The documentation must include any deprecation schedules and details surrounding externally visible API updates. Updates should be delivered via a blog (i.e. a changelog) or a mailing list (preferably both!).

There are various options available to define the API's contract interface (API specification or API description). Examples are: OpenAPI (formerly known as Swagger), Google Discovery Document, RAML, API Blueprint and so on. OpenAPI is a vendor-neutral API description format.

@mohitsenapaty

Should we have any provision for idempotency keys in the headers?

@satyendra-singh-talentica

satyendra-singh-talentica commented Mar 17, 2022

In case of GET list of resources, even if the collection is empty we should return 200 with empty array. When a specified resource GET /posts/{post_id} is not present then return 404.

404 Not found

Is this for route not found ?
or
Data not found ?

for GET /:id
I like to return 200 with nothing.

This is in line with the ORM behaviors. If we're trying to find something, and data is not there, DB will return 'null'.
+
The consumer of the api, doesn't have to do special handling for 404 inside error handler.
