Developing Natural Language Processing Algorithms to Medically Code the Clinical Notes in the Theater Data Medical Store

Zouris, James; D'souza, Edwin; Elkins, Trevor; Olson, Andrew

Active / Technical Report | Accession Number: AD1162371

Abstract:

Outside the Department of Defense, natural language processing (NLP) strategies have been used with electronic health records (EHR) to increase information extraction from free-text notes and structured fields, allowing access to much larger cohorts than previously possible. Current operational medical data is held in the Theater Medical Data Store (TMDS). Most of the medical information in TMDS is contained in unstructured text fields. The objective is to automate the coding of these data into injury diagnostic code groups, which are derived from the International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM) codes. There are over 8 million records in the TMDS, and as many as 50% of the ICD-9-CM codes may be incompletely or inaccurately coded. The accuracy of the data in the TMDS has never been quantified, largely because most of it has been captured without any medical billing concerns. The study has developed a set of programming rules using NLP and machine learning (ML) (i.e., algorithms generated by automated learning from manually coded data), with eventual output that will represent human interpretation as closely as possible. The coding algorithm models have been developed using pre-existing coded medical records from the Expeditionary Medical Encounter Dataset (EMED) housed at the Naval Health Research Center (NHRC). Experienced nurse staff are responsible for coding and validating all the EMED medical encounter records. The model will be trained on a subset of the EMED data and then tested on TMDS data that has been matched to the remaining EMED data.
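
The train-then-test workflow described in the abstract can be illustrated with a minimal sketch. It is not the report's method: the abstract does not name a model, so a bag-of-words multinomial Naive Bayes classifier is swapped in purely for illustration. The notes, the code-group labels (fracture, burn, open_wound), and the names tokenize and NaiveBayesCoder are all hypothetical, and no real EMED or TMDS data is used.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(note):
    """Lowercase a free-text note and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", note.lower())


class NaiveBayesCoder:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, notes, labels):
        # Count how often each label occurs and which words appear under it.
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for note, label in zip(notes, labels):
            self.word_counts[label].update(tokenize(note))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, note):
        # Ignore words never seen in training; score each label in log space.
        tokens = [t for t in tokenize(note) if t in self.vocab]
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)
            score += sum(math.log((counts[t] + 1) / denom) for t in tokens)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical manually coded notes standing in for the training subset.
train = [
    ("gunshot wound to left thigh with femur fracture", "fracture"),
    ("closed fracture of right tibia after fall", "fracture"),
    ("blast exposure with second degree burn to forearm", "burn"),
    ("burn to hand from fuel fire", "burn"),
    ("laceration of scalp requiring sutures", "open_wound"),
    ("deep laceration to left hand sutures placed", "open_wound"),
]
# Hypothetical held-out notes standing in for the matched test records.
held_out = [
    ("fracture of left femur", "fracture"),
    ("burn to right forearm", "burn"),
    ("scalp laceration sutures", "open_wound"),
]

model = NaiveBayesCoder().fit(*zip(*train))
predictions = [model.predict(note) for note, _ in held_out]
accuracy = sum(p == y for p, (_, y) in zip(predictions, held_out)) / len(held_out)
print(predictions, accuracy)
```

Keeping the held-out notes separate from the training notes mirrors the abstract's design, in which agreement with the nurse-validated codes is measured only on records the model has not seen.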

Author(s):

Zouris, James; D'souza, Edwin; Elkins, Trevor; Olson, Andrew

Author Organization(s):

NAVAL HEALTH RESEARCH CENTER SAN DIEGO CA

Funding Organization(s):

ARMY MEDICAL RESEARCH AND DEVELOPMENT COMMAND, FORT DETRICK, MD

Document Type:

Technical Report/Annual Report

Publication Date:

2021 Sep 01

Pagination:

9

Security Markings

DOCUMENT & CONTEXTUAL SUMMARY

Distribution Code:

A - Approved For Public Release

Distribution Statement: Public Release

RECORD

Collection: TRECMS

Identifying Numbers

Grant Number(s):

DM190529

Subject Terms

Modernization Areas:

Autonomy

Communities of Interest:

Biomedical

Descriptor(s):

natural language processing, machine learning, natural languages, biomedical research, language, computer programming, project management, accuracy, algorithms, classification, contractors, deep learning, department of defense, learning, maryland, patent applications, professional development

Keyword(s):

clinical notes

Subject Categories:

Biological and Medical Sciences

Creation Date:

2022 Mar 07

Update Date:

2022 Mar 29