Accession Number:

ADA618620

Title:

UCLA at TREC 2014 Clinical Decision Support Track: Exploring Language Models, Query Expansion, and Boosting

Descriptive Note:

Conference paper

Corporate Author:

CALIFORNIA UNIV LOS ANGELES

Report Date:

2014-11-01

Pagination or Media Count:

11

Abstract:

For the TREC 2014 Clinical Decision Support track, participants were given a set of 30 patient cases, each in the form of a short natural language description, and a data set of over 700,000 full-text articles from PubMed Central. The task was to retrieve articles relevant to the patient cases and to one of three types of clinical question: diagnosis, test, and treatment. This paper describes the retrieval system developed by the Medical Imaging Informatics group at the University of California, Los Angeles. One manual run and four automatic runs were submitted. For the automatic runs, a variety of retrieval strategies were explored. Two retrieval methods were compared: the vector space model with TF-IDF similarity, and a unigram language model with Jelinek-Mercer smoothing. The performance of retrieving on abstracts alone was compared to that of retrieving on full text. Finally, a simple set of rules for query expansion and term boosting was developed based on recommendations from domain experts. Submissions from 26 groups were pooled and evaluated by a team of medical librarians and physicians at the National Institute of Standards and Technology. The results showed that (1) the language model outperformed the vector space model for automatically constructed queries, (2) searching full text was more effective than searching abstracts alone, and (3) boosting improved the ranking of retrieved documents for test topics, but not for diagnosis topics. Our best automatic run used the language model, full-text search, query expansion, and no boosting.
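
As an illustration of the unigram language model with Jelinek-Mercer smoothing named in the abstract, the following is a minimal sketch of query-likelihood scoring. It is not the authors' implementation, which this record does not describe. The function name `jm_score`, the toy documents, and the smoothing weight `lam=0.5` are all assumptions chosen for the example.

```python
import math
from collections import Counter


def jm_score(query, doc, collection_tf, collection_len, lam=0.5):
    """Log query likelihood of `doc` under a Jelinek-Mercer smoothed unigram model.

    P(w | d) = (1 - lam) * tf(w, d) / |d| + lam * cf(w) / |C|
    `lam` is a hypothetical smoothing weight, not a value taken from the paper.
    """
    doc_tf = Counter(doc)
    doc_len = len(doc)
    score = 0.0
    for term in query:
        p_doc = doc_tf[term] / doc_len if doc_len else 0.0
        p_coll = collection_tf.get(term, 0) / collection_len
        p = (1 - lam) * p_doc + lam * p_coll
        if p == 0.0:
            # Term never occurs in the collection; skip it rather than take log(0).
            continue
        score += math.log(p)
    return score


# Toy collection (illustrative only, not PubMed Central data).
docs = {
    "a": "chest pain and shortness of breath".split(),
    "b": "knee pain after running".split(),
}
collection_tf = Counter(t for d in docs.values() for t in d)
collection_len = sum(collection_tf.values())

query = "chest pain".split()
ranked = sorted(
    docs,
    key=lambda k: jm_score(query, docs[k], collection_tf, collection_len),
    reverse=True,
)
```

In this toy case document "a" ranks first because it contains both query terms, while document "b" receives only the collection-level probability for "chest".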

Subject Categories:

  • Medicine and Medical Research

Distribution Statement:

APPROVED FOR PUBLIC RELEASE