When machine learning techniques scale

October 5th, 2011 by Anders Hall

As a startup in 2008, we found ourselves in a situation where we had the outlines of a heavy technology in our hands, though hardly the perfect palette of competences, or the needed breadth of skills, to cover every programming aspect, ranging from web development to low-level algorithm development (math).

In our tiny company, we were then forced to make many hard decisions, often after heated debates, which in retrospect we followed through on despite hardships. What I find most interesting is that many of the decisions we made are only recently starting to pay off.

Notable long-term decisions were:

  • to commit to a development process that is based on existing or homegrown machine learning techniques
  • to build software that is language independent
  • to automate everything
  • to use all technologies we could get our hands on, as long as they had elements of prediction technology in them (and were scalable, in our opinion at the time)
  • to build a scalable technology platform that can spawn many types of easily accessible technologies or high-level services
  • to make calculations and systems scale (large amounts of text == more hidden information)
  • to focus on text analysis (sound analysis, for example, was briefly on the table)

Somewhat later, but no less importantly, we decided to:

  • build a user friendly and scalable API

The task of building a fully language-independent technology platform is a painstakingly long effort, in which hundreds of assorted system and coding projects need to be completed. Thus, it is not entirely easy to defend that we went down that path, considering that we were primarily active in the Swedish market for the first two and a half years. Today, however, we see growing demand for and interest in language-independent technology. The long-term work on scalable systems, on the other hand, is easy to defend, as we now have the capability to provide stable services to larger organizations and can perform large-scale analysis of text.

However, the one effort that has consistently kept our heads above water, in a treacherous sea of marvelous competitors, is the constant striving to automate processes and technologies. The automation of system tests, deployment processes and system administration is to a great degree, but not entirely, based on knowledge from the ICT community. Thanks! We are not the first company to build this type of large-scale system with APIs. Still, we have fewer resources than many competitors and manage to keep a good pace with similar efforts, while being active in more text analysis markets than may seem healthy. Why?

My firm belief is that the decision to commit to predictive technologies, rather than building programs designed to solve a specific problem, is one major reason why we are still competitive. Machine learning (or genetic programming) is by no means a silver bullet. It requires (perhaps less so today) good theoretical knowledge of prediction methods (we get many ideas from the field of psychology, for example) and of how to combine them, has a rather steep learning curve, and requires many non-standard system solutions to work well and scale performance-wise.

The main benefit is that machine learning scales well in terms of how many developers a technology or service requires. Take the case of our tagging service, which we built from scratch at Saplo. Excluded below are all the API, GUI, support and market efforts required to launch a service or product, which at least triple all numbers:

  • (2 p team) firstly, we built a technology that could predict the category of entities in Swedish text
  • (2 p team) secondly, we built a technology that could predict entities in Swedish text
  • (0.5 p team) thirdly, we shaved off a few aspects that were not fully language independent and ran the same code on English text
  • (1 p team) fourthly, we refined the above techniques into a second-generation tagging technology
  • (0.2 p team) to scale to other languages (markets), or to support the code base, we no longer need much basic code work

A third generation of this service will probably merge in technology from the other three (soon four) technologies we provide, again with a small team of dedicated coders.

Traditional rule-based coding can also be implemented in smart ways. Still, in order to scale vertically in terms of languages, text formats, types of predictive tagging tasks, etc., the code base will likely contain many special rules and will probably be maintained by a larger team. The code will, in my opinion, also be harder to adapt to new complex problems and to combine with other technologies.

A second problem we solved by using machine learning is that we often found we were entirely wrong in our assumptions about what the best solution to a specific problem was. Today, we to a great degree let the technology find the optimal solution to any given problem by rapidly implementing ideas (models) and testing them. The test effort is seldom considered lost time, even if we prove that a model performs poorly. For instance, when analyzing sentiment predictions we can see that in some cases a particular technology is significant while in others it is not. It depends entirely on the purpose of the analysis task submitted to our API by the end user (or customer).
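To make the "implement ideas and test them" loop a bit more concrete, here is a minimal sketch in Python. It is not our production code: the two candidate models, the keyword sets, the held-out examples and the function names (`keyword_model`, `always_positive_model`, `select_best`) are all hypothetical and chosen only for illustration. The point is simply that each idea is scored on the same labeled data and the measurements, not our assumptions, pick the winner.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types: a labeled example is (text, expected label),
# and a model is any function from text to a label.
Example = Tuple[str, str]
Model = Callable[[str], str]

# Illustrative keyword sets, not a real sentiment lexicon.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "poor", "hate"}


def keyword_model(text: str) -> str:
    """Candidate idea 1: compare counts of positive and negative keywords."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "pos" if score >= 0 else "neg"


def always_positive_model(text: str) -> str:
    """Candidate idea 2: a trivial baseline that always answers 'pos'."""
    return "pos"


def accuracy(model: Model, data: List[Example]) -> float:
    """Fraction of examples for which the model returns the expected label."""
    if not data:
        raise ValueError("need at least one labeled example")
    correct = sum(model(text) == label for text, label in data)
    return correct / len(data)


def select_best(models: Dict[str, Model],
                data: List[Example]) -> Tuple[str, Dict[str, float]]:
    """Score every candidate on the same data and return the best name and all scores."""
    scores = {name: accuracy(model, data) for name, model in models.items()}
    best = max(scores, key=lambda name: scores[name])
    return best, scores


# Tiny made-up held-out set, for illustration only.
HELD_OUT: List[Example] = [
    ("great service love it", "pos"),
    ("bad support poor docs", "neg"),
    ("good value", "pos"),
    ("hate the new layout", "neg"),
]

MODELS: Dict[str, Model] = {
    "keyword": keyword_model,
    "always_positive": always_positive_model,
}

if __name__ == "__main__":
    best, scores = select_best(MODELS, HELD_OUT)
    print("best model:", best)
    print("scores:", scores)
```

Even the losing candidate is useful here: its score is the baseline that tells us how much the better idea is actually worth for this particular task.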

A third, perhaps hidden, benefit is that we let the technology predict or decide even minor options for running a service. This greatly reduces the need for complex decisions by the end user (e.g., when API methods are designed).

On a closing note: yesterday, while watching a documentary about IBM’s Watson, I was struck by the similarities between their team effort and ours, despite the apparent difference in size between our projects. I’m not suggesting we are their equals. What I interpreted as a similarity was this: the progress of technology based on machine learning can initially be slow, but in the long term (in their case, the fourth year) it will accelerate beyond anything rule-based programming can provide. In conclusion, I argue that machine learning greatly accelerates the development process in the long term.

Cheers,
Anders Hall,
coder