CNIPA, FUB and University of Rome "Tor Vergata" at TREC 2008 Legal Track
FONDAZIONE UGO BORDONI ROME (ITALY)
The TREC Legal track was introduced in TREC 2006 with the stated purpose of evaluating the efficacy of automated support for review and production of electronic records in the context of litigation, regulation and legislation. The TREC 2008 Legal track ran three tasks: (1) an automatic ad hoc task, (2) an automatic relevance feedback task, and (3) an interactive task. We took part only in the automatic ad hoc task and focused on the following issues:
1. Indexing. The CDIP test collection is characterized by a large number of unique terms due to OCR mistakes. We have defined a term selection strategy to reduce the number of terms, as described in Section 2.
2. Querying. The analysis of past TREC results for the Legal track showed that the best retrieval strategy basically returned a ranked list of the documents retrieved by the Boolean query. As a consequence, we have defined a strategy aimed at boosting the score of documents satisfying the final negotiated Boolean query. Furthermore, we defined a method for the automatic construction of a weighted query from the request text, as reported in Section 3.
3. Estimation of the K value. We have used a query performance prediction approach to try to estimate K values. The query weighting model that we have adopted is described in Section 4.
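As an illustration of the querying issue above, the following minimal Python sketch shows one way a ranked list could favor documents that satisfy the Boolean query. It is not the method used in the runs, which this abstract does not specify: the function name `boost_boolean_matches`, the multiplicative form of the boost, the default factor of 2.0, and the sample scores are all assumptions made for illustration, and the sketch assumes non-negative retrieval scores.

```python
from typing import Dict, List, Set, Tuple


def boost_boolean_matches(
    scores: Dict[str, float],
    boolean_matches: Set[str],
    boost: float = 2.0,
) -> List[Tuple[str, float]]:
    """Re-rank documents, favoring those that satisfy the Boolean query.

    Hypothetical helper: the multiplicative boost and its default value
    are assumptions, not the scheme described in the report.
    Scores are assumed to be non-negative.
    """
    if boost < 1.0:
        raise ValueError("boost must be >= 1.0")
    # Multiply the score of every Boolean-matching document by `boost`;
    # leave all other scores unchanged.
    adjusted = {
        doc_id: score * boost if doc_id in boolean_matches else score
        for doc_id, score in scores.items()
    }
    # Sort by descending score; break ties by document id for determinism.
    return sorted(adjusted.items(), key=lambda item: (-item[1], item[0]))


# Made-up scores for three documents; d2 and d3 satisfy the Boolean query.
ranked = boost_boolean_matches(
    scores={"d1": 0.9, "d2": 0.6, "d3": 0.5},
    boolean_matches={"d2", "d3"},
)
for doc_id, score in ranked:
    print(doc_id, round(score, 3))
```

With a sufficiently large boost, the Boolean-matching documents move ahead of the rest while keeping their relative order, which mirrors the observation that ranking the Boolean result set performed well in past evaluations.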
- Information Science
- Computer Programming and Software