For the uninitiated, ProgrammableWeb is a repository which stores information on publicly available APIs and mashups of said APIs. The repository is rich: it currently comprises over 2000 APIs and over 5000 mashups, along with quite a lot of useful information pertaining to both.
Obviously, this is not an enormous data set, but even so, understanding what it looks like is non-trivial. I wanted to develop some simple means of exploring it to get a feel for what types of mashups people are creating. I asked the friendly guys at PW if they would give me an API key and they kindly obliged.
I had some ideas on what to do, but they changed somewhat as I got some experience working with the data. My initial objective was to look at the classifications that the PW guys have developed and see how mashups relate to classifications; in particular, to understand which combinations of classifications result in large numbers of mashups.
I got some basic animation up and running, for which I extracted the 10 classifications with the most APIs, as I thought these the most important. Clicking on one of these classifications gives information on the APIs that fall into that classification, ordered by the number of mashups that use each API. So, for example, the first four APIs returned when the ‘Social’ classification is clicked are Twitter, Facebook, Foursquare and LinkedIn. It’s then possible to click on each of these in turn to obtain a list of mashups which use that API, ordered by popularity.
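To make the ordering concrete, here is a minimal sketch of how the APIs in one classification might be ranked by mashup count. It is not the code behind the tool: the sample records, the field names and the rank_apis helper are all made up for illustration, and nothing here reflects the actual shape of the PW API responses.

```python
from collections import Counter

# Hypothetical sample data. The identifiers and field names are
# illustrative assumptions, not the real ProgrammableWeb schema.
apis = {
    "twitter": "Social",
    "facebook": "Social",
    "foursquare": "Social",
    "google-maps": "Mapping",
}
mashups = [
    {"name": "m1", "apis": ["twitter", "google-maps"]},
    {"name": "m2", "apis": ["twitter", "facebook"]},
    {"name": "m3", "apis": ["twitter"]},
    {"name": "m4", "apis": ["facebook", "foursquare"]},
]


def rank_apis(classification, apis, mashups):
    """Return the APIs in a classification, most-used first."""
    # Count each API at most once per mashup.
    usage = Counter(api for m in mashups for api in set(m["apis"]))
    in_class = [name for name, cls in apis.items() if cls == classification]
    # Sort by descending mashup count, breaking ties alphabetically.
    return sorted(in_class, key=lambda name: (-usage[name], name))


ranked = rank_apis("Social", apis, mashups)
print(ranked)
```

With this toy data, twitter appears in three mashups, facebook in two and foursquare in one, so that is the order returned for ‘Social’.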
(Note that the animation of the different classifications is not so important. Originally, I had designs on making these moving circles, with the selected classification moving into the centre and linking to the other classifications; the thickness of each link would reflect how many mashups use APIs that fall into both classifications.)
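The link thickness idea boils down to counting, for every pair of classifications, the mashups that draw on both. A rough sketch of that count follows, again on made-up data: the api_class mapping, the ‘Photos’ and ‘Mapping’ labels and the link_weights helper are assumptions for illustration only.

```python
from collections import Counter
from itertools import combinations

# Hypothetical mapping from API to classification (illustrative only).
api_class = {
    "twitter": "Social",
    "facebook": "Social",
    "flickr": "Photos",
    "google-maps": "Mapping",
}
# Each mashup is represented simply as the list of APIs it uses.
mashups = [
    ["twitter", "google-maps"],
    ["twitter", "facebook"],
    ["flickr", "google-maps"],
    ["twitter", "flickr", "google-maps"],
]


def link_weights(api_class, mashups):
    """Count mashups spanning each pair of classifications."""
    weights = Counter()
    for api_list in mashups:
        # Distinct classifications touched by this mashup, sorted so
        # that each pair always appears in the same order.
        classes = sorted({api_class[a] for a in api_list if a in api_class})
        for pair in combinations(classes, 2):
            weights[pair] += 1
    return weights


weights = link_weights(api_class, mashups)
for pair, count in sorted(weights.items()):
    print(pair, count)
```

Each count could then be mapped to a line width when drawing the link between two circles. A mashup that uses two APIs from the same classification contributes nothing, since it does not link two different classifications.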
Once I had developed this somewhat rudimentary visualization tool, I was able to navigate the PW data set a bit more easily. After playing with it a little, it became apparent that the data set is somewhat skewed: there are over 2000 mashups which use the Google Maps API alone, and YouTube, Flickr and Twitter all come in around the 500 mark. Further, there are some classifications that have very few mashups.
This had implications for the work. In particular, I decided not to progress with enhancing the tool to reflect the linkages between the classifications, as the data is significantly non-uniform and that non-uniformity is reasonably easy to detect.
There are still some interesting things which could be done with this and I may come back to it at some future date. For now, it’s parked here.
Happy to have any comments/feedback on the tool in the handy comment box below!