Wednesday, April 15, 2009

#hashtags and @replies....

Recently I was intrigued by some Twitter posts by @garrickvanburen about whether #hashtags and @reply syntax are still useful on Twitter now that Twitter search exists. The topic of #hashtags has been discussed by Robert Scoble on FriendFeed as well, where Scoble claimed that "hashtags are dead".

I disagree with the sentiment that #hashtags serve no purpose anymore, and I think part of the reason people consider them dead is a lack of understanding about how search algorithms work (or should work). The key to #hashtags being useful is that they represent a significant act by the user who added them to their Tweet. The act of adding a #hashtag is akin to saying "this post is about #thisTopic", and that is powerful fuel for a good Information Retrieval algorithm that takes advantage of it. Now I realize that not all #hashtags really describe what the post is about, but the majority do, and this data should be used.

During my Twitter exchange about this topic I made the following point:
A Tweet with a url and a #hashtag is directly translatable to a Delicious entry.
What I meant by this is that Delicious lets you tag your bookmarks into various categories and then search/sort along those dimensions. So if I write a Tweet that essentially shares a url (a very common use case), and then put a #hashtag on it, I have done the exact same thing. If this is powerful for Delicious it is certainly powerful for Twitter. In fact I would posit it is more powerful for Twitter, because Tweets are only 140 characters, which is not a lot of data from which a search algorithm can determine similarity and relevance.
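To make the Delicious comparison concrete, here is a minimal sketch of pulling a bookmark-style record out of a Tweet. The function name `tweet_to_bookmark`, the record fields, and the regular expressions are my own illustrative assumptions, not anything Twitter or Delicious actually provides.

```python
import re

# Deliberately simple patterns; real-world URL and hashtag rules are messier.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#(\w+)")

def tweet_to_bookmark(tweet):
    """Split a Tweet into a hypothetical Delicious-style record."""
    urls = URL_RE.findall(tweet)
    # Remove URLs first so a '#fragment' inside a URL is not read as a tag.
    text = URL_RE.sub("", tweet)
    tags = [tag.lower() for tag in HASHTAG_RE.findall(text)]
    # Whatever is left after removing URLs and hashtags becomes the note.
    note = " ".join(HASHTAG_RE.sub("", text).split())
    return {"url": urls[0] if urls else None, "tags": tags, "note": note}

bookmark = tweet_to_bookmark(
    "Great intro to ranking http://example.com/ir#top #search #IR"
)
```

For that example Tweet the record holds the url `http://example.com/ir#top`, the tags `search` and `ir`, and the note "Great intro to ranking", which is essentially what a tagged bookmark stores.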

So how does a search algorithm take advantage of this data? Well, a very simple way is to "boost" the relevance score for that field. Google does this based on HTML tags (well, I think they do, or at least did at some point), giving more weight to the text inside a <title> tag than, say, inside an <h4> tag. There are more complex ways of utilizing this data as well... but that is another post.
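Here is a rough sketch of what that kind of boost could look like for Tweets. It is a toy term-matching scorer, not how Twitter search or any real engine works, and the `score` function and the `hashtag_boost` value of 2.0 are assumptions I picked for illustration.

```python
import re

def tokenize(text):
    """Lowercase tokens, keeping a leading '#' or '@' attached."""
    return re.findall(r"[#@]?\w+", text.lower())

def score(query, tweet, hashtag_boost=2.0):
    """Count query-term matches, weighting #hashtag matches more heavily."""
    query_terms = {token.lstrip("#@") for token in tokenize(query)}
    total = 0.0
    for token in tokenize(tweet):
        if token.lstrip("#@") in query_terms:
            # A match inside a hashtag counts more than a plain-text match.
            total += hashtag_boost if token.startswith("#") else 1.0
    return total

tweets = ["learning python today", "learning #python today"]
ranked = sorted(tweets, key=lambda t: score("python", t), reverse=True)
```

With these settings the Tweet that tags #python scores 2.0 against 1.0 for the untagged one, so it sorts first even though the two are otherwise identical.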

Okay, so maybe I convinced you that #hashtags are not dead, but you are probably asking about @replies now. Well, the argument is the same, if a bit more convoluted. The @reply tags are also meta-data about the post: they indicate who the intended audience for the post is. Knowing the intended audience means that you can then use what you know about that user's aggregate activity to "boost" the Tweet up or down. One example of this would be using TunkRank to re-weight a Tweet based on who it is addressed to. This is essentially giving extra weight to a Tweet going to someone who is an expert, or a power-user.
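A small sketch of that re-weighting idea follows. It does not compute TunkRank; it assumes you already have some per-user influence score (from TunkRank or anything else) in a plain dictionary. The names `audience_boost` and `reweight`, and the "1 + influence" formula, are hypothetical choices of mine.

```python
import re

# '@name' not preceded by a word character, so email addresses are skipped.
MENTION_RE = re.compile(r"(?<!\w)@(\w+)")

def audience_boost(tweet, influence):
    """Return a multiplier based on the most influential @recipient."""
    recipients = [name.lower() for name in MENTION_RE.findall(tweet)]
    if not recipients:
        return 1.0  # no stated audience, leave the score alone
    # Unknown users get an influence of 0.0, i.e. no boost.
    return 1.0 + max(influence.get(name, 0.0) for name in recipients)

def reweight(base_score, tweet, influence):
    """Scale a relevance score by the audience multiplier."""
    return base_score * audience_boost(tweet, influence)

# Assumed, precomputed influence scores in the range 0..1.
influence = {"expert": 0.5, "newbie": 0.25}
boosted = reweight(2.0, "@expert what do you think of #lucene?", influence)
```

Here a Tweet with a base score of 2.0 addressed to @expert ends up at 3.0, while the same Tweet with no @reply, or addressed to someone unknown, would stay at 2.0.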

@replies and #hashtags are important pieces of meta-data (some of the only pieces of meta-data we get in Tweets); they should and will be considered by advanced search/indexing algorithms in the ranking of Tweet search results. So my plea is: don't let @replies and #hashtags die. We would lose a useful system for collaborative filtering and intelligence, and make data-mining Twitter all the more difficult.
