Accession Number : ADA581326

Title : Identifying Patients for Clinical Studies from Electronic Health Records: TREC 2012 Medical Records Track at OHSU

Descriptive Note : Conference paper


Personal Author(s) : Bedrick, Steven ; Edinger, Tracy ; Cohen, Aaron ; Hersh, William

Report Date : Nov 2012

Pagination or Media Count : 19

Abstract : The goal of the TREC 2012 Medical Records Track was to search medical record documents to identify patients as possible candidates for clinical studies based on diagnosis, age, and other attributes. For TREC 2012, the Oregon Health & Science University (OHSU) group experimented with both manual and automated techniques. We used a derivative of Lucene to build an interactive retrieval system that can process queries in one of two ways. Users can manually specify Boolean queries whose terms may include words as well as ICD-9 codes. Alternatively, the system features an automated query parser, built on top of MetaMap and the UMLS Metathesaurus, that transforms free-text queries into structured Boolean queries. We submitted both automatic runs, which relied solely on the automated query parser, and manual runs consisting of queries built by an expert clinician. Overall, our automated query parser performed below the mean of other groups, although there were individual topics for which it performed very well. This irregular performance was due in part to the parser's tendency to over-specify queries, which reduced recall; the strong results on several topics nonetheless suggest that our fundamental approach has merit. In contrast, our manual runs performed very well, scoring second-best among official manual runs. With further modification of the manual queries, we were able to achieve even better performance. Querying electronic health records to identify patients as candidates for clinical studies still requires manual query development, at least until automated methods are developed that outperform manually built queries.


Subject Categories : Information Science
      Medicine and Medical Research

Distribution Statement : APPROVED FOR PUBLIC RELEASE