Tutorial: Similarity between Twitter users in 10 minutes

March 27th, 2012 by Joakim Stenberg

We are going to build an app that compares Twitter users by matching their tweets against each other. We will start by fetching tweets from a specific Twitter user through the Twitter API and adding those tweets to a Saplo Group through the Saplo API. Each Twitter user gets their own group, so comparing the groups against each other tells us how similar the users are.

For this tutorial we’ll be using PHP. Let’s dive right in!

Step 1: Generate Twitter API credentials

Visit https://dev.twitter.com and click the Sign in button in the top-right corner. Use your Twitter account to log in.

Click the Create an app link and you will be asked to fill in details about your app.

The only field that might cause some confusion is the Callback URL field, which Twitter uses to authenticate the user. If you plan to try this app out in your local development environment, Twitter probably won’t accept a URL like http://localhost/saplo-similar-twitter-users, but we can easily work around this problem by using a URL shortener, e.g. http://goo.gl/.

Accept the Developer Rules Of The Road and submit the form by clicking the Create your Twitter application button. You’ll be redirected to a page displaying information about your app, e.g. OAuth settings.

Scroll down to the bottom of the page and click the button labeled Create my access token. The page is reloaded and if you scroll down to the bottom again, you’ll see your access token.

Great, we have prepared everything on the Twitter side of our app. Time to set up a Saplo API account!

Step 2: Generate Saplo API credentials

Visit http://saplo.com/signup/free and fill out the required fields Name and E-mail. Optionally, you can provide us with some more information by filling in the other fields as well. Click the Submit button and you will be redirected to a page thanking you for signing up and asking you to keep an eye on your inbox for more information.

Check your inbox for an e-mail labeled Saplo Text Analysis API invitation. When it arrives, open it and click the registration link. This will open a page telling you that your API account has been created and that your API key will be sent to your e-mail address.

Great, let’s check our e-mail again. You should get a mail labeled Saplo API key. Open it and you’ll find your API credentials.

Step 3: Grab the code

In this tutorial we are not going to write a lot of code. Let’s just download the complete app from GitHub.

Put the downloaded content in a folder where Apache or some other web server running on your local environment can reach it. Open the app in your browser (e.g. http://localhost/saplo-similar-twitter-users) and make sure that you can see the Twitter username form.

Alright, we have successfully set up the app. Now it is time for some final preparations before we can run it.

Step 4: Configure the app

Go to the directory where you put the content you previously downloaded. Open the file named settings.php. Enter the Twitter and Saplo API settings, but leave COLLECTION_ID set to 0 for now. Save the file.


// Saplo API credentials.

// ID of Saplo Collection where we would like to store tweets.
define('COLLECTION_ID', 0);

// Twitter API credentials.

We need to create a Saplo Collection where tweets can be stored. Run the create collection helper script by accessing collection.php (e.g. http://localhost/saplo-similar-twitter-users/collection.php) in a web browser. The ID of your new Saplo Collection is printed on the screen. Copy it, open your settings.php file again, set COLLECTION_ID to the copied value, then save and close the file.
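It is easy to forget the second edit of settings.php, so a small guard can help. The sketch below is not part of the downloaded app: the function name isCollectionConfigured is hypothetical, and it assumes that a real Saplo Collection ID is a positive integer while 0 is only the placeholder.

```php
<?php
// Hypothetical helper, not part of the downloaded app.
// Assumption: a real Saplo Collection ID is a positive integer,
// and 0 is the placeholder left in settings.php.
function isCollectionConfigured($collectionId)
{
    return is_int($collectionId) && $collectionId > 0;
}

// In the app this value would come from the COLLECTION_ID constant.
$collectionId = 0;

if (!isCollectionConfigured($collectionId)) {
    echo "COLLECTION_ID is still 0; run collection.php and update settings.php.\n";
}
```

A check like this could sit at the top of the app's entry script so that a missing collection is reported before any API calls are spent.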



Step 5: Run the app

That’s it, you should now be able to try the app out! Open the app by accessing your project directory in a web browser and you should see the Twitter username form. Enter a Twitter username and click Go. This starts the whole process of fetching tweets and analyzing them, which might take a while, so be patient. Right now we don’t have any other users to compare against, so you will get an empty result. Try adding a second user, wait for the script to work its magic, and voilà: the similarity between the two Twitter users is displayed.
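To see why the first user yields an empty result and the second one does not, here is a minimal, self-contained sketch of the flow. It does not call the Saplo API. The similarity measure is a toy word-overlap score (Jaccard) used purely as a stand-in, not the analysis Saplo performs, and all function names and sample tweets are made up for illustration.

```php
<?php
// Toy stand-in for the Saplo comparison: Jaccard overlap of the words
// used in each user's tweets. This is NOT how Saplo measures similarity.

// Collects the distinct lowercase words across one user's tweets.
function wordSet(array $tweets)
{
    $words = array();
    foreach ($tweets as $tweet) {
        $tokens = preg_split('/[^a-z0-9#@]+/', strtolower($tweet), -1, PREG_SPLIT_NO_EMPTY);
        foreach ($tokens as $token) {
            $words[$token] = true;
        }
    }
    return array_keys($words);
}

// Jaccard similarity: shared words divided by all distinct words.
function jaccard(array $a, array $b)
{
    $union = count(array_unique(array_merge($a, $b)));
    if ($union === 0) {
        return 0.0;
    }
    return count(array_intersect($a, $b)) / $union;
}

// Compares a new user's tweets against every group stored so far.
function compareAgainstExisting(array $newTweets, array $groups)
{
    $newWords = wordSet($newTweets);
    $result = array();
    foreach ($groups as $user => $tweets) {
        $result[$user] = jaccard($newWords, wordSet($tweets));
    }
    return $result;
}

$groups = array(); // one "group" of tweets per Twitter user

// First user: nothing is stored yet, so the result is empty.
$alice = array('Learning PHP today', 'PHP and APIs are fun');
$firstResult = compareAgainstExisting($alice, $groups);
$groups['alice'] = $alice;

// Second user: there is now one group to compare against.
$bob = array('APIs are fun', 'Coffee first');
$secondResult = compareAgainstExisting($bob, $groups);
$groups['bob'] = $bob;

print_r($firstResult);
print_r($secondResult);
```

The real app delegates the comparison to Saplo, but the shape of the flow is the same: compare the new user against what is already stored, then store the new user.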


A free Saplo API account gives you 2,000 API calls per month with an hourly limit of 240 calls. Each time you add a Twitter user you’ll use approximately 25 calls (every tweet fetched from the Twitter API costs one call when it is added to the Saplo API). At that rate you can add only about nine Twitter users per hour, so select your Twitter users carefully. A free account is also limited to 20 groups, so you cannot add more than 20 Twitter users in any case.
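The limits above can be turned into a quick back-of-the-envelope calculation. The sketch below uses only the figures quoted in this post. The 25 calls per user is an approximation, and the variable and function names are made up for illustration.

```php
<?php
// Figures quoted for a free Saplo API account.
$callsPerMonth = 2000;
$callsPerHour  = 240;
$callsPerUser  = 25; // approximate cost of adding one Twitter user
$maxGroups     = 20; // one group per Twitter user

// How many whole users fit into a given call budget.
function usersWithinBudget($callBudget, $callsPerUser)
{
    return (int) floor($callBudget / $callsPerUser);
}

$usersPerHour  = usersWithinBudget($callsPerHour, $callsPerUser);
$usersPerMonth = usersWithinBudget($callsPerMonth, $callsPerUser);

// The group limit caps the total no matter how many calls remain.
$maxUsers = min($usersPerMonth, $maxGroups);

// Hourly windows needed to add that many users.
$hoursNeeded = (int) ceil($maxUsers / $usersPerHour);

echo "Users per hour:  $usersPerHour\n";  // 9
echo "Users per month: $usersPerMonth\n"; // 80
echo "Usable maximum:  $maxUsers\n";      // 20
echo "Hours needed:    $hoursNeeded\n";   // 3
```

In other words, on a free account the 20-group limit is reached long before the monthly call budget, and filling all 20 groups takes at least three hourly windows.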

A quick note on the quality of the similarity comparison: in this application only the last 20 tweets are fetched from Twitter per user. You could increase this limit to 30, 40 or more tweets in the settings.php file and that way improve the comparison. However, that would use up your API calls more quickly.
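To get a feel for that trade-off, the sketch below estimates how many users fit into the hourly limit of 240 calls for different tweet counts. It assumes one call per tweet plus a fixed overhead of 5 calls per user. That overhead is inferred from the "approximately 25 calls" for 20 tweets mentioned earlier and is not an official figure. The function name is hypothetical.

```php
<?php
// Assumption: calls per user = tweets fetched + a fixed overhead.
// The overhead of 5 is inferred from "approximately 25 calls" for
// 20 tweets; it is not a documented Saplo figure.
function usersPerHour($tweetsPerUser, $overheadCalls = 5, $hourlyLimit = 240)
{
    $callsPerUser = $tweetsPerUser + $overheadCalls;
    return (int) floor($hourlyLimit / $callsPerUser);
}

foreach (array(20, 30, 40) as $tweets) {
    echo $tweets . ' tweets per user: about ' . usersPerHour($tweets) . " users per hour\n";
}
// 20 tweets per user: about 9 users per hour
// 30 tweets per user: about 6 users per hour
// 40 tweets per user: about 5 users per hour
```

Doubling the number of tweets roughly halves the number of users you can add per hour, so pick the tweet count with your call budget in mind.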

Thank you for reading and good luck!