Text Analysis API 2.0

April 12th, 2011 by Fredrik Hörte

Saplo API History

When Saplo started in 2008, many of the large web services like Facebook, Twitter and Digg had their own APIs, which made it possible to extend their services to other applications. For smaller services, though, an API was not as obvious a choice as it is today. Today almost every new startup designs its service with an API in mind, and many startups even have their API as the main product.

Thinking back on our first Text Analysis API, I get both nostalgic and scared at the same time. It was built for a single use case and for one customer only (our first customer). All it had to do was receive a text, extract the tags and send them back. We hacked it together as a simple PHP page that read raw POST data in an XML format we had agreed upon with our customer. It got the job done, but we figured it was not a long-term solution if we wanted to get more customers.

Build a solid and standardized API

As first-time API developers we began to read, read and read some more. What protocol should be used? How does it work? How do you handle users? How do you authenticate? How is a good API built?

After some reading we finally chose to build a JSON-RPC API. We felt it was easy to understand and to get started with, and since we did not have a URI-based service (unlike Facebook, Twitter and Digg) it felt more correct than REST. The language of choice became Java, mainly because it was scalable enough and we had some experience using it. We quickly found out that the tricky part of building an API is that different developers have different needs. Finding a balance between flexibility and usability is probably one of the most difficult parts. Thinking in terms of an end user or API developer was also tricky, since all of our in-house applications had, until this point, been built directly upon the underlying database. In late 2009 our API, built on the JSON-RPC 1.1 specification, was out and it worked pretty well. Of course, as in all software, bugs existed and were solved. One of the trickiest parts is getting the complete system to work together; everything from the tiniest little socket to our databases, task queue and calculation machines needs to play well together.

After having the API running for around a year we had a long list of improvements and changes that we wanted to make, both internally and externally.

Focus and Key Points

One of the key differences from common web 2.0 APIs (e.g. Facebook, Twitter, Digg) is that those APIs are mainly built for adding information to and getting information from their web service. In our case you actually put some information into the API, do some work on that information and then get the results back, which makes it harder for developers to use. It is built and optimized for being a text analysis platform rather than a web service that stores information. For a developer using the API, it is simply not that simple.

Knowing these things, we wanted to rebuild the API with a focus on some key points. We want it to be Understandable, Predictable and have an Easy Workflow. One of the new features in the 2.0 API is support for named parameters, which makes it easier to use default values. This gives more flexibility to advanced users without complicating simpler tasks. We will also support batch calls, which are needed when developing for platforms that limit outgoing requests. Behind the scenes we have also made a lot of system improvements which will hopefully improve speed and scalability.
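To make named parameters and batch calls a bit more concrete, here is a minimal Python sketch of what such requests could look like under the JSON-RPC 2.0 specification. The method name "text.tags", its parameters and the default values are made up for illustration and are not taken from the actual Saplo API; only the envelope ("jsonrpc", "method", "params", "id") and the array form of a batch come from the spec.

```python
import json

# Hypothetical defaults for a hypothetical "text.tags" method.
# These names and values are assumptions, not the real Saplo API.
DEFAULTS = {"text.tags": {"lang": "en", "limit": 10}}


def make_request(method, request_id, **params):
    """Build a JSON-RPC 2.0 request object with named (by-name) parameters."""
    return {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}


def apply_defaults(request):
    """Sketch of the server side: fill in any named parameter the caller omitted."""
    defaults = DEFAULTS.get(request["method"], {})
    return {**defaults, **request["params"]}


def make_batch(requests):
    """A JSON-RPC 2.0 batch is simply a JSON array of request objects."""
    return json.dumps(requests)


# A simple call relies on the defaults; an advanced call overrides one of them.
simple = make_request("text.tags", 1, text="Saplo started in 2008.")
advanced = make_request("text.tags", 2, text="Saplo started in 2008.", limit=3)

# Both calls travel in a single outgoing HTTP request body.
batch = make_batch([simple, advanced])
```

With positional parameters the advanced caller would have to pass every argument up to the one being changed. With named parameters the simple call stays short, and the batch keeps the number of outgoing requests down on platforms that limit them.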

As a startup it’s not always easy to fulfill all needs, and one thing we have unfortunately been neglecting for too long is our API documentation. Of course there won’t be many developers if the documentation is not good enough. For some time now we have been working on improving our developer site, both content-wise and design-wise. Our main goal for the new developer site is to provide good and simple documentation for all levels. It should be easy to find and follow, and of course have a nice and simple look and feel.

Some new features you can expect in the new API and developer site are:

  • Easier workflow
  • More understandable methods
  • Named parameters allowing default values
  • Batch calls according to JSON-RPC spec. v2.0
  • Comparing Context-to-Context
  • Well documented Getting Started Guide
  • Tutorials
  • Workflow charts
  • Example Code and Libraries

Before launching we have some more testing left to do, but stay tuned! For those who are already using our API, don’t worry: all old documented methods will continue to work just as before.

Text Analysis API Documentation 2.0 Example