Efficiently and automatically abstract cancer medical records

Publication Date:

Authors:

Jillian Nelson, Monica Agrawal, Den E Bloodworth, Divya Gopinath, James Mullenbach, Peter J Briggs, Dominique Connolly, David Sontag, Alpa V Patel

Journal or conference name:

Cancer Research, Vol. 85 — American Association for Cancer Research

Abstract:

Abstraction of clinical data from unstructured medical records is a labor-intensive process. LLM-based systems can accelerate the availability of data for research use. The American Cancer Society's Cancer Prevention Study-3 is a large, nationwide prospective cohort study of ~300,000 cancer-free participants enrolled between 2006 and 2013. Participants who self-reported a diagnosis of breast, colorectal, ovarian, or any blood cancer on the 2021 follow-up survey were consented for the retrieval of medical records from diagnostic and treatment facilities. Guidelines including ontologies and decision trees were generated to precisely define abstraction tasks, and relevant research data were annotated via human abstraction. This project then used 300 breast cancer-related medical records to develop and test LLM-based abstraction pipelines.

hello@layerhealth.com

Enterprise-grade security and compliance.

© Layer Health 2025
