Just over a month ago I posted about all the assorted stuff Twitter had been up to. At that time they had recently unveiled an ad platform, bought a company that made a Twitter iPhone client, launched @anywhere for embedding Twitter on other pages, redesigned their own homepage, and announced that the Library of Congress would be storing the whole backlog of Tweets.
They’d also started to sketch the outlines of a new and exciting feature called annotations that would allow developers to attach arbitrary metadata to Tweets. It’s sort of like the addition of location data. Latitude and longitude can be associated with any Tweet as metadata—it’s part of the payload but doesn’t use up any of the precious 140 characters and it’s only visible through a client application that pulls it out and displays it.
In fact, there’s all sorts of metadata already included in each Tweet, much more than the actual 140 characters of primary “data”:
There’s information about the author (account identifier, avatar image, whether it’s a verified account, etc.), whether it’s a reply to something, whether it’s a retweet, when it was posted, where it was posted from…those little snippets are already given a whole lot of context.
That diagram is labeled April 18th, and it’s already quite out of date. Last week it was announced on the Twitter API Google Group that Tweets would start including metadata about embedded mentions, hashtags, and URLs so that developers wouldn’t have to parse the text of a Tweet to find them. Without looking at the text of a Tweet, a computer can pull out links or references as well as know the who, where, when, and just about everything else involved.
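To make that concrete, here is a minimal sketch of what reading those references from metadata rather than from the text might look like. The payload shape and field names (`entities`, `hashtags`, `user_mentions`, `urls`) are assumptions for illustration, not a copy of the announced format, and the account and URL are made up.

```python
# Hypothetical Tweet payload. The field names are assumptions for
# illustration, not the format announced on the API group.
tweet = {
    "text": "Reading about #annotations with @example_user http://example.com/a",
    "entities": {
        "hashtags": [{"text": "annotations"}],
        "user_mentions": [{"screen_name": "example_user"}],
        "urls": [{"url": "http://example.com/a"}],
    },
}


def extract_references(tweet):
    """Collect hashtags, mentions, and links from metadata only.

    The Tweet text is never inspected, so no regular expressions or
    other parsing of the 140 characters is needed.
    """
    entities = tweet.get("entities", {})
    return {
        "hashtags": [h["text"] for h in entities.get("hashtags", [])],
        "mentions": [m["screen_name"] for m in entities.get("user_mentions", [])],
        "urls": [u["url"] for u in entities.get("urls", [])],
    }


refs = extract_references(tweet)
```

The point of the sketch is only that the client reads a structured field and ignores the text entirely.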
Twitter has put every piece of information they already have into machine-readable data and attached it to every Tweet. So the only thing left is to start letting people put in more types of information. That’s what annotations are and they’re probably pretty important.
I posted at some length a while back about the idea that “checkin-esque” (in the foursquare sense of the word) “structured data status updating” was a key new model for social media. Not to cede too much control of language to Twitter’s API team, but “annotated status updating” might be a better way to put it.
Raffi Krikorian from Twitter recently gave a presentation about the new annotations feature, and I think it’s worth watching for anyone interested in how all these social tools are going to evolve:
He gives some really good examples of how annotations might be used. Notably, one is about a standardized way to describe television shows. Miso, one of my examples from the previous post about structured data status updating, is a whole service organized around sharing well-organized TV watching information. Now the exact same thing can be done through Twitter. It’s actually not a bad deal for Miso, who could morph their application into a Twitter client that bundled Miso data with each outgoing Tweet and displayed it for any incoming Tweets that included it—let Twitter handle the infrastructure (try not to think about the Fail Whale).
Another example from the presentation I like a lot is automatically including song track information in Tweets. The client application would pull information from iTunes (or another application) about what a user was listening to and include that as metadata in outbound messages. Another user could set their client to recognize that type of data and play any song annotated to incoming Tweets from a particular user. All of a sudden you have a radio station, of a sort.
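Here is a rough sketch of how a client might handle that, assuming annotations arrive as a list of typed records on each Tweet. Everything in it is hypothetical: the `annotations` field, the `music:track` type name, the `artist` and `title` keys, and the class and function names are invented for illustration, since the real format had not been finalized.

```python
class AnnotationAwareClient:
    """Toy client that routes incoming annotations to registered handlers."""

    def __init__(self):
        self.handlers = {}    # annotation type -> callback
        self.play_queue = []  # songs waiting to be played

    def on(self, annotation_type, handler):
        """Register a callback for one annotation type."""
        self.handlers[annotation_type] = handler

    def receive(self, tweet):
        """Dispatch each annotation the client understands; skip the rest."""
        for annotation in tweet.get("annotations", []):
            handler = self.handlers.get(annotation["type"])
            if handler is not None:
                handler(tweet, annotation["data"])


def make_radio(client, followed_user):
    """Queue every song annotated on Tweets from one particular user."""

    def queue_song(tweet, data):
        if tweet["user"] == followed_user:
            client.play_queue.append((data["artist"], data["title"]))

    # "music:track" is a made-up type name for this sketch.
    client.on("music:track", queue_song)


client = AnnotationAwareClient()
make_radio(client, followed_user="dj_example")

incoming = [
    {"user": "dj_example", "text": "Loving this one",
     "annotations": [{"type": "music:track",
                      "data": {"artist": "Example Artist", "title": "Example Song"}}]},
    {"user": "someone_else", "text": "Also listening",
     "annotations": [{"type": "music:track",
                      "data": {"artist": "Other Artist", "title": "Other Song"}}]},
    {"user": "dj_example", "text": "No song attached here"},
]
for tweet in incoming:
    client.receive(tweet)
```

Only the annotated Tweet from the followed account ends up in `play_queue`. Annotation types the client has no handler for are simply ignored, which is what lets different clients support different subsets.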
But it’s not too long before the obvious problem shows up: How many client applications are you going to need in order to take advantage of all this? And how would you pick?
This is by far my favorite part of the whole affair. At around minute 8:00 of the above video, Krikorian explains that the Twitter developer portal will provide information about “most used”, “trending”, and “most adopted” types of annotations (use refers to the number of Tweets including that type of markup, adoption to the number of applications using it). This way, the developer community can collaboratively work out how best to annotate Tweets for certain data types and everyone can know which are the hottest data types that should be processed by any hip iPhone application.
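The difference between “most used” and “most adopted” is easy to blur, so here is a small sketch of how the two counts could diverge. The log format, application names, and annotation type names are all made up; this says nothing about how Twitter would actually compute its rankings.

```python
from collections import Counter, defaultdict


def rank_annotation_types(log):
    """Return (usage, adoption) for a log of (application, annotation_type) pairs.

    usage counts Tweets carrying each type; adoption counts the number
    of distinct applications that emitted each type.
    """
    usage = Counter()
    apps_by_type = defaultdict(set)
    for app, annotation_type in log:
        usage[annotation_type] += 1
        apps_by_type[annotation_type].add(app)
    adoption = {t: len(apps) for t, apps in apps_by_type.items()}
    return usage, adoption


# One entry per annotated Tweet: which app sent it and which type it carried.
log = [
    ("app_a", "tv:episode"),
    ("app_a", "tv:episode"),
    ("app_b", "music:track"),
    ("app_c", "music:track"),
    ("app_a", "tv:episode"),
]

usage, adoption = rank_annotation_types(log)
most_used = usage.most_common(1)[0][0]
most_adopted = max(adoption, key=adoption.get)
```

In this toy log a single chatty application makes `tv:episode` the most used type, while `music:track` is the most adopted because two different applications emit it.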
To be clear, this does mean that someone will be able to claim the mantle of “parses the most trendingest annotation namespaces of any Twitter application” for their project. And that will be the application that lets users do the TV watching checkin thing and the streaming music selection thing and the instant purchase of embedded product thing, and a whole bunch of other things that no one has thought of yet. There will be an incentive to include the most annotation types and there will be a hyper-transparent (basically leaderboard-driven) market of sorts for working out which are the best ones. Twitter is crowd-sourcing the project of making its little social service into a universal information exchange pipeline.
Why should that role fall to Twitter? Well, at around minute 19:00 of the video that gets addressed. An audience member asks why Twitter is doing all this data markup stuff when semantic web people have been on this for years with standards like RDF and microformats.
Krikorian explains that at Twitter they “really like” all those efforts and were “inspired by that”. They believe the annotations feature “use[s] that research and mindshare” but that the existing tools didn’t fit their use case and fell short on two key points. Interestingly, they are the two types of data that have been with Twitter from the beginning and are arguably constitutive of the service: when and who.
Twitter timestamps all data, no matter what. That function isn’t open to developers to play around with; information is indexed in rigid chronological order. We know how old any Tweet is.
The source of the Tweet is always included. As Krikorian notes, now that Twitter is switching to exclusively OAuth for application authentication, that means we know exactly where a Tweet came from.
So that is what Twitter is doing: Trying to build the most reliable and well-contextualized transmission system on the Web. Or at least, that’s what the API team seems to be up to.
