Accession Number:

ADA579307

Title:

TREC Microblog 2012 Track: Real-Time Algorithm for Microblog Ranking Systems

Descriptive Note:

Conference paper

Corporate Author:

ROMA TRE UNIV (ITALY)

Personal Author(s):

Report Date:

2012-11-01

Pagination or Media Count:

6

Abstract:

Twitter is becoming a major new source of big data, owing to the sharp increase in its number of users and its growing popularity. Moreover, the huge number of user profiles and the volume of raw text data continuously pose new research challenges. This paper reports our contribution to, and results in, the TREC 2012 Microblog Track. In this challenge, each participant is required to perform a real-time retrieval task that, given a query topic, seeks the most recent and relevant tweets. We devised an effective real-time ranking algorithm that avoids heavy computational requirements. Our contribution is multifold: (1) adapting an existing ranking method, BM25, to microblogging; (2) enhancing traditional content-based features with knowledge extracted from Wikipedia; (3) employing pseudo-relevance feedback techniques for query expansion; and (4) performing text analysis, such as ad hoc text normalization and POS tagging, to limit noisy data and better represent useful information.

Subject Categories:

  • Information Science

Distribution Statement:

APPROVED FOR PUBLIC RELEASE