Accession Number:

ADA618605

Title:

Siena's Twitter Information Retrieval System: The 2014 Microblog Track

Descriptive Note:

Conference paper

Corporate Author:

SIENA COLLEGE LOUDONVILLE NY

Report Date:

2014-11-01

Pagination or Media Count:

9

Abstract:

As the internet changes dramatically each year, microblogs, such as Facebook and Twitter, are being used more often as a source of information exchange. Twitter users are learning about current events earlier than they would from their news feeds, as companies and celebrities continue to use Twitter to spread information. Information Retrieval, a topic for which NIST (National Institute of Standards and Technology) holds a conference every year, involves using online environments such as microblogs to gather as much information as possible from these sources and determine whether it can be put towards a purpose. The Microblog Track was introduced to TREC (Text REtrieval Conference) in 2011, and selected Twitter as its microblog resource. Twitter allows its users to share short posts of up to 140 characters with their followers, and is often used to share anything from fashion trends to the latest terrorist attacks. Because tweets are short, users often find other ways to share more information, such as including links or images with their tweets, which affects whether a tweet contains relevant information. Participating groups for the track were given access to a Twitter API, provided by TREC, containing a corpus of 243 million tweets scraped from February 1st to March 31st, 2013. Each group was given a set of test topics on which to test its system, which returns results for the Adhoc and/or Tweet Timeline Generation (TTG) Task. In this paper, we describe five Query Expansion modules and three Relevance modules designed for the microblog track, built within STIRS (Siena's Twitter Information Retrieval System). Our results show that STIRS averaged 61.91 precision on our adhoc run and 85.38 precision on our TTG run.

Subject Categories:

  • Radio Communications

Distribution Statement:

APPROVED FOR PUBLIC RELEASE