Monday, October 31, 2011

A look into various REST APIs

I have recently been working on a web application for data mining. We were initially using a typical stack of server-side Java technologies. The web MVC part was written in Struts 2.

Our application is pretty UI-intensive and we wanted it to have a nice, modern look and feel. As development went on, we soon found ourselves issuing a good number of AJAX calls to the server. Without careful planning, this can soon become an unmanageable mixture of HTML, JSON and XML response formats, and dozens or hundreds of different GET and POST requests that expect various parameters.

Our architect soon tackled the issue and we decided to define a full-featured REST API for the application, which we then access from the client side. (Incidentally, we moved to Spring MVC, which we found nicer.)

REST methods

We have since defined dozens of URLs that handle our application's requests. Much has been said about how REST method signatures should look. Should resource names be singular or plural [1]? Is it mandatory to use the PUT and DELETE methods, and do they need to be idempotent [2]? What should list and search methods look like [3]? On top of that, there are some well-known implementations that provide default methods for most operations.

Besides, REST is not a protocol but an architectural style [4]. Without a strict definition, many REST principles are systematically violated in real-world applications. Few REST APIs use hyperlinks as the only way to reference their methods; rather, they publish API documentation and expect programmers to call the documented methods directly. Not all services use the four HTTP methods usually associated with REST (GET, POST, PUT, DELETE) as HTTP defines them. Not all of them have consistent naming rules for resources.

Seeing how loose REST is about what an API should look like, I have collected a few examples from some of the most famous services on the Internet, so we can get an overview of how well-known REST APIs look today.

Facebook (Graph API)

Facebook publishes resources as a graph of social objects [5]. Every object in the social graph has a unique ID. You can access the properties of an object by requesting:

https://graph.facebook.com/ID

For example, the official page for the Facebook Platform has id 19292868552, so you can fetch the object at: https://graph.facebook.com/19292868552.

Responses are JSON:

{
   "name": "Facebook Platform",
   "type": "page",
   "website": "http://developers.facebook.com",
   ...
   "id": 19292868552,
   "category": "Technology"
}
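
As a minimal sketch of how a client might consume this, here is the same lookup in Python (my choice for illustration, not something Facebook prescribes). The helper name graph_object_url is made up, and the snippet parses a hard-coded copy of the fields shown above instead of calling the network:

```python
import json

GRAPH_BASE = "https://graph.facebook.com"


def graph_object_url(object_id):
    """Return the URL of a Graph API object (hypothetical helper)."""
    return f"{GRAPH_BASE}/{object_id}"


# Hard-coded copy of the fields shown in the excerpt above.
# The fields elided in the excerpt are simply left out.
SAMPLE_RESPONSE = """
{
   "name": "Facebook Platform",
   "type": "page",
   "website": "http://developers.facebook.com",
   "id": 19292868552,
   "category": "Technology"
}
"""

page = json.loads(SAMPLE_RESPONSE)
url = graph_object_url(page["id"])
print(page["name"], "->", url)
```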

Relationships between objects are called connections and are accessed using the name of the connection:

https://graph.facebook.com/ID/CONNECTION_TYPE
https://graph.facebook.com/me/likes?access_token=...
https://graph.facebook.com/me/friends?access_token=...

And results are also JSON:

{
   "data": [
      {
         "name": "John John",
         "id": "201095685"
      },
      {
         "name": "Andy Andy",
         "id": "202921236"
      },
      ...
   ]
}
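
Here is a rough Python sketch of walking such a list. The helper connection_url is a name I made up, and the data is a copy of the two entries visible above. Since each entry carries an ID rather than a URL, the client has to derive the object URL itself:

```python
import json

GRAPH_BASE = "https://graph.facebook.com"


def connection_url(object_id, connection_type):
    """Build the URL for one connection of an object (hypothetical helper)."""
    return f"{GRAPH_BASE}/{object_id}/{connection_type}"


# Copy of the entries shown in the excerpt above; elided entries are left out.
SAMPLE_FRIENDS = """
{
   "data": [
      {"name": "John John", "id": "201095685"},
      {"name": "Andy Andy", "id": "202921236"}
   ]
}
"""

friends = json.loads(SAMPLE_FRIENDS)["data"]

# The response holds IDs, not links, so each object URL is built by the client.
friend_urls = {f["name"]: f"{GRAPH_BASE}/{f['id']}" for f in friends}

print(connection_url("me", "friends"))
print(friend_urls)
```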



You can render the current profile photo for any object by adding the suffix /picture to the object URL. The API has consistent support for paging, selecting fields, requesting multiple objects in one call, accepting and returning different date formats, and full-text search:

https://graph.facebook.com/bgolub?fields=id,name,picture
https://graph.facebook.com?ids=arjun,vernal
https://graph.facebook.com/search?q=mark&type=user
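
The three URLs above differ only in path and query string, so a small builder covers them all. This is a sketch under my own assumptions: graph_url is a hypothetical helper, and I pass safe="," to urlencode only so that the comma-separated lists come out looking like the examples.

```python
from urllib.parse import urlencode

GRAPH_BASE = "https://graph.facebook.com"


def graph_url(path="", **params):
    """Build a Graph API URL with an optional query string (hypothetical helper)."""
    url = f"{GRAPH_BASE}/{path}" if path else GRAPH_BASE
    if params:
        # safe="," keeps comma-separated lists unescaped, as in the URLs above.
        url += "?" + urlencode(params, safe=",")
    return url


fields_url = graph_url("bgolub", fields="id,name,picture")
ids_url = graph_url(ids="arjun,vernal")
search_url = graph_url("search", q="mark", type="user")
picture_url = graph_url("bgolub/picture")

for built in (fields_url, ids_url, search_url, picture_url):
    print(built)
```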

They also provide support for publishing (via POST) to the appropriate resource, using a few simple parameters. The following URL is used to add a comment to an object:

POST:      https://graph.facebook.com/OBJECT_ID/comments
arguments: message
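
In Python, preparing that POST could look roughly like the sketch below. The function name build_comment_request is mine, and the optional access_token form field is an assumption based on the /me examples earlier; the request is only built, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

GRAPH_BASE = "https://graph.facebook.com"


def build_comment_request(object_id, message, access_token=None):
    """Prepare (but do not send) a POST that adds a comment to an object."""
    form = {"message": message}
    if access_token is not None:
        # Assumption: publishing takes the same access_token parameter
        # that the /me examples pass in the query string.
        form["access_token"] = access_token
    body = urlencode(form).encode("utf-8")
    return Request(f"{GRAPH_BASE}/{object_id}/comments", data=body, method="POST")


request = build_comment_request("OBJECT_ID", "Nice post!")
print(request.get_method(), request.full_url, request.data)
# Actually sending it would be: urllib.request.urlopen(request)
```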

This API uses only GET and POST. They don't use hyperlinks to reference other objects (for example, lists are composed of object IDs, not URLs).

Twitter

The Twitter API [6] is also GET/POST based.

These methods get the user's home timeline, the users who retweeted a post, and the user's followers:

GET statuses/home_timeline
GET statuses/ID/retweeted_by
GET followers/ids

Modifications are done via POST, such as the method used to post a tweet:

POST statuses/update

These GET and POST methods accept a variety of parameters, which are very well documented. Parameters can restrict the time span of the objects we want to retrieve, control pagination and filtering, or decide whether extra metadata about the returned objects is included. The Twitter API can also return a number of response formats: JSON, XML, RSS and Atom.
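
To make the GET/POST split concrete, here is a small Python sketch that describes a call without sending it. Several things in it are my assumptions and not taken from the listing above: the placeholder host, the format suffix on the resource name, and the count and status parameters used in the usage lines. Check the documentation [6] for the real values.

```python
from urllib.parse import urlencode

# Assumption: placeholder host; the listing above gives only resource paths.
API_BASE = "https://api.example.invalid"


def twitter_request(method, resource, response_format="json", **params):
    """Describe a call as (HTTP method, URL, form body) without sending it."""
    # Assumption: the response format is selected with a suffix such as ".json".
    url = f"{API_BASE}/{resource}.{response_format}"
    query = urlencode(params)
    if method == "GET":
        # Reads carry their parameters in the query string.
        return method, (url + "?" + query if query else url), None
    # Writes carry their parameters in the POST body.
    return method, url, query.encode("utf-8")


# Illustrative parameters only.
timeline = twitter_request("GET", "statuses/home_timeline", count=20)
tweet = twitter_request("POST", "statuses/update", status="Hello REST")
print(timeline)
print(tweet)
```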

This is an example of a response list of tweets in JSON format (which is an array of tweet objects):

[
  {
    "coordinates": null,
    "favorited": false,
    "created_at": "Fri Jul 16 16:58:46 +0000 2010",
    "truncated": false,
    "entities": {
      "urls": [
        {
          "expanded_url": null,
          "url": "http://www.flickr.com/photos/cindyli/4799054041/",
          "indices": [
            75,
            123
          ]
        }
      ],
      "hashtags": [
      ],
      "user_mentions": [
    ...


Google Custom Search API

This is a very simple API [7]. You can retrieve results for a particular search by sending an HTTP GET request to its URI. The URI for a search has the following format:

https://www.googleapis.com/customsearch/v1?parameters

Parameters have to include the Google API key, the search query and other optional Google query parameters (response format, pagination, search options, filtering...). For example:

GET https://www.googleapis.com/customsearch/v1?key=INSERT-YOUR-KEY&cx=013036536707430787589:_pqjad5hr1a&q=flowers&alt=json
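
Building that URL by hand is error-prone, so here is a sketch of a helper in Python. The name custom_search_url is hypothetical; the parameter names key, cx, q and alt are the ones in the example request above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

SEARCH_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def custom_search_url(key, cx, query, alt="json", **optional):
    """Build a Custom Search request URL (hypothetical helper)."""
    params = {"key": key, "cx": cx, "q": query, "alt": alt}
    params.update(optional)  # any extra query parameters the caller wants
    return SEARCH_ENDPOINT + "?" + urlencode(params)


url = custom_search_url(
    "INSERT-YOUR-KEY", "013036536707430787589:_pqjad5hr1a", "flowers"
)

# urlencode percent-encodes the ":" inside cx; decoding restores the value.
decoded = parse_qs(urlsplit(url).query)
print(url)
print(decoded["cx"])
```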

Response formats can be JSON or Atom. A JSON response starts like this:

{
 "kind": "customsearch#search",
 "url": {
  "type": "application/json",
  "template": "https://www.googleapis.com/customsearch/v1?q\u003d{searchTerms}&num\u003d{count?}&start\u003d{startIndex?}&hr\u003d{language?}&safe\u003d{safe?}&cx\u003d{cx?}&cref\u003d{cref?}&sort\u003d{sort?}&alt\u003djson"
 },
 "queries": {
  "nextPage": [
   {
    "title": "Google Custom Search - flowers",
    "totalResults": 10300000,
    "searchTerms": "flowers",
    "count": 10,
    "startIndex": 11,
    "inputEncoding": "utf8",
    ...
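
The nextPage entry carries what a client needs to ask for the following page. Below is a sketch in Python, assuming the mapping suggested by the template above (q from searchTerms, num from count, start from startIndex). The function name next_page_params is mine, and the sample holds only the fields visible in the excerpt.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def next_page_params(response, key, cx):
    """Return query parameters for the next page, or None if there is none."""
    pages = response.get("queries", {}).get("nextPage", [])
    if not pages:
        return None
    page = pages[0]
    return {
        "key": key,
        "cx": cx,
        "q": page["searchTerms"],
        "num": page["count"],
        "start": page["startIndex"],
        "alt": "json",
    }


# Only the fields visible in the excerpt above; everything else is omitted.
sample = {
    "queries": {
        "nextPage": [
            {"searchTerms": "flowers", "count": 10, "startIndex": 11}
        ]
    }
}

params = next_page_params(
    sample, "INSERT-YOUR-KEY", "013036536707430787589:_pqjad5hr1a"
)
next_url = SEARCH_ENDPOINT + "?" + urlencode(params)
print(next_url)
```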


Summary


The big players take very different approaches to REST method signatures, but all of them share the simplicity and handiness of REST APIs. Being public APIs, most of them can also return results in several formats, which makes these web services easy to consume from a huge number of technologies.

As the real-world rules for REST web services seem so lax, my advice is to prioritize ease of use when designing REST services, and to leverage the support that your tools or framework provide for building them. The exact look of method signatures and result types matters less than consistent naming and typing across your method calls. Using HTTP methods properly is less critical unless full proxy-cache support is a requirement for your system, and in many cases you'll be safe just using POST for requests that change the state of the system.

I hope this helps to show the different approaches used out there.



[1] http://stackoverflow.com/questions/6845772/rest-uri-convention-singular-or-plural-name-of-resource-while-creating-it
[2] http://stackoverflow.com/questions/7016785/is-put-delete-idempotent-with-rest-automatic
[3] http://stackoverflow.com/questions/1418114/questions-on-proper-rest-design
[4] http://en.wikipedia.org/wiki/Representational_state_transfer
[5] http://developers.facebook.com/docs/reference/api/
[6] https://dev.twitter.com/docs/api
[7] http://code.google.com/apis/customsearch/v1/using_rest.html