Accession Number:



Identifying Patients for Clinical Studies from Electronic Health Records: TREC 2012 Medical Records Track at OHSU

Descriptive Note:

Conference paper

Corporate Author:


Report Date:


Pagination or Media Count:



The goal of the TREC 2012 Medical Records Track was to search medical record documents to identify patients as possible candidates for clinical studies based on diagnosis, age, and other attributes. For TREC 2012, the Oregon Health Science University OHSU group experimented with both manual and automated techniques. We used a derivative of Lucene to build an interactive retrieval system that can process queries in one of two ways. Users can manually specify Boolean queries whose terms may include words as well as ICD-9 codes. Alternatively, the system features an automated query parser that transforms free-text queries into structured Boolean queries. The query parser is built on top of MetaMap and the UMLS Metathesaurus. We submitted both automatic runs which relied solely on the automated query parser as well as manual runs consisting of queries built by an expert clinician. Overall, our automated query parser performed below the mean of other groups, although there were individual topics for which it performed very well. This irregular performance was in part due to our parsers tendency to over-specify queries, leading to reduced recall. There were, however, several topics for which our parser performed very well, suggesting that our fundamental approach has merit. In contrast, our manual runs performed very well, scoring second-best among official manual runs. With further modification of the manual queries, we were able to achieve even better performance. Query of electronic health records for the use case of identifying patients as candidates for clinical studies still requires manual query development, at least until better automated methods can be developed that outperform them.

Subject Categories:

  • Information Science
  • Medicine and Medical Research

Distribution Statement: