Exploring Evidence Aggregation Methods and External Expansion Sources for Medical Record Search
DELAWARE UNIV NEWARK DEPT OF COMPUTER AND INFORMATION SCIENCES
This paper describes and analyzes the experiments we performed for the Medical Records track of the 2012 Text REtrieval Conference (TREC). We mainly investigated three research problems. (1) Evidence Aggregation: In last year's track there were, in general, two different methods for obtaining a visit ranking from reports, which are smaller document units: (A) using reports as the indexing and retrieval units and then converting the report ranking into a visit ranking, and (B) using visits as the indexing and retrieval units by concatenating reports at the very first stage and then obtaining a visit ranking directly. Method A avoids the potential problem of varying visit document length, while Method B naturally aggregates evidence scattered over multiple reports and easily yields a visit ranking. The results reported so far do not make clear which method is better. We therefore compared the two approaches, tried various score aggregation methods for Method A, and combined both approaches in a way that further improved system performance. (2) Expansion Sources: We tested a variety of external collections for query expansion, ranging from general web datasets to domain-specific thesauri and from megabyte-scale to terabyte-scale datasets, compared their effectiveness, and obtained useful insights into the data. (3) Retrieval Models: We tested several statistical IR models that have proven effective on news and web collections on this medical collection, and explored ways to combine these models to address different aspects of the task. For instance, we used the MRF model to model term proximity, since most medical concepts are phrases. We also used a mixture of relevance models to obtain relevant expansion terms from each of the different expansion collections, which is expected to significantly alleviate the vocabulary mismatch between medical terminologies.
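As an illustration of the score aggregation step in Method A, the following is a minimal sketch of how report-level scores might be turned into a visit ranking. It is not the authors' implementation: the function name `aggregate_visit_scores`, the three aggregators (max, sum, mean), and the toy scores are assumptions chosen for the example, and the abstract does not say which aggregation methods were actually tried.

```python
from collections import defaultdict


def aggregate_visit_scores(report_scores, report_to_visit, method="max"):
    """Convert a report ranking into a visit ranking (hypothetical helper).

    report_scores:   dict mapping report ID -> retrieval score
    report_to_visit: dict mapping report ID -> visit ID
    method:          "max", "sum", or "mean" (illustrative choices only)
    """
    # Group the scores of all retrieved reports under their parent visit.
    grouped = defaultdict(list)
    for report_id, score in report_scores.items():
        grouped[report_to_visit[report_id]].append(score)

    aggregators = {
        "max": max,
        "sum": sum,
        "mean": lambda scores: sum(scores) / len(scores),
    }
    aggregate = aggregators[method]

    visit_scores = {visit: aggregate(scores) for visit, scores in grouped.items()}
    # Sort by descending score; break ties by visit ID for a stable order.
    return sorted(visit_scores.items(), key=lambda item: (-item[1], item[0]))


# Toy data (made up): visit v1 has one strong and one weak report,
# visit v2 has two moderately scored reports.
report_scores = {"r1": 0.9, "r2": 0.1, "r3": 0.6, "r4": 0.6}
report_to_visit = {"r1": "v1", "r2": "v1", "r3": "v2", "r4": "v2"}

for method in ("max", "sum", "mean"):
    ranking = aggregate_visit_scores(report_scores, report_to_visit, method)
    print(method, [visit for visit, _ in ranking])
```

On this toy input, max ranks v1 first because of its single strong report, while sum and mean rank v2 first. This shows why the choice of aggregation method can change the visit ranking.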
- Information Science