Natural language processing helps identify patients with chronic cough

Researchers from Regenstrief Institute, Indiana University School of Medicine and Merck & Co. created and validated a natural language processing (NLP) algorithm to identify patients with chronic cough. The validation paper, published in the journal Chest, is the first to use NLP to identify and examine this condition, and the study assembled the largest cohort of chronic cough patients to date.

Chronic cough is defined as a cough that lasts eight weeks or more. It affects 10 percent of the population but has no diagnostic code, which makes it hard to identify people with the condition through electronic health records (EHRs). Identifying these patients is important for characterizing treatment and unmet needs.

Regenstrief research scientist Michael Weiner, M.D., MPH, and his team created an NLP algorithm to analyze unstructured data in medical records, such as free-text clinical notes. That method was instrumental in identifying 74 percent of people with chronic cough who did not have structured evidence of the condition, and it helped close the gap in the ability to characterize the disease burden.

This method can be used to create larger and more robust cohorts for studies related to treatment of chronic cough.

“Identifying and characterizing a chronic cough cohort through electronic health records” was published online ahead of print in Chest. Funding for this research came from Merck & Co., Inc. The study is part of a partnership between Regenstrief Institute and Merck to collaborate on projects that use clinical data to inform the delivery of healthcare.