Siena's Twitter Information Retrieval System: The 2012 Microblog Track
SIENA COLLEGE LOUDONVILLE NY
Since 1992, the National Institute of Standards and Technology (NIST) has hosted the annual Text Retrieval Conference (TREC). One of its newest tracks, started in 2011, is the Microblog Track, which uses the well-known social network site Twitter as its source of microblog data. Twitter allows its users to post tweets of up to 140 characters to share messages with their followers, post personal updates, and share major media stories from around the world. To evaluate information retrieval on microblog data, groups were provided with a file of about 16 million tweet IDs from January 24th to February 8th, 2011. This allowed us to download the tweet content for each ID, for a total of 16,141,812 tweets. Participating teams were given a set of topics to test their retrieval process, and their programs would return tweets relevant to each topic. The Siena College Institute of Artificial Intelligence expanded on STIRS, Siena's Twitter Information Retrieval System. The results for our ad hoc run showed STIRS's best run to be at 18.08 precision, while the average of the median from all participating teams was 14.86.
- Information Science