Natural language processing for automated medical report generation from clinical notes
1 AIIMS New Delhi, India
2 Tsinghua University, China
3 University of Lagos, Nigeria
Abstract
We present an NLP pipeline that automatically generates structured medical reports from unstructured clinical notes written by physicians. The system combines a fine-tuned BioBERT model for named entity recognition with a T5-based text generation module. Evaluated on 8,500 clinical notes from a tertiary hospital, our approach achieves a ROUGE-L score of 0.73 and reduces physician documentation time by approximately 40%.
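For readers unfamiliar with the metric, ROUGE-L scores a generated report against a reference report by the length of their longest common subsequence (LCS) of tokens. The sketch below is a minimal illustration, not the authors' evaluation code. Whitespace tokenization, lowercasing, the `beta=1.0` F-measure weighting, and the function names `lcs_length` and `rouge_l` are all assumptions, since the source does not state which ROUGE-L variant or tooling was used.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    # Dynamic programme that keeps only the previous row of the LCS table.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate, beta=1.0):
    """Sentence-level ROUGE-L F-measure (assumed variant, beta=1 gives F1)."""
    ref = reference.lower().split()   # assumption: whitespace tokenization
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    recall = lcs / len(ref)
    precision = lcs / len(cand)
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)


# Made-up sentences for illustration only; they are not taken from the study data.
reference = "patient started on metoprolol 25 mg daily"
candidate = "patient was started on metoprolol daily"
print(round(rouge_l(reference, candidate), 3))
```

In this made-up pair the reference has 7 tokens, the candidate has 6, and they share a 5-token subsequence, so recall is 5/7, precision is 5/6, and the F-measure is about 0.77. A corpus-level figure such as the reported 0.73 would typically be an average of per-document scores, although the source does not describe the aggregation.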
Full Text Preview
The pipeline consists of three stages: (1) BioBERT-based NER extracts medical entities (diagnoses, medications, procedures, lab values) from free-text clinical notes; (2) a relation extraction module identifies entity relationships; (3) a T5-large model generates structured reports following HL7 FHIR templates. The system was trained on 8,500 de-identified clinical notes from the cardiology and pulmonology departments of a 2,000-bed tertiary hospital.
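The data flow through the three stages can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation. A hypothetical keyword lexicon and regular expression stand in for the fine-tuned BioBERT tagger, a same-sentence co-occurrence rule stands in for the relation extraction module, and a plain template fill stands in for T5-large generation. The output is a simplified dictionary loosely modelled on a FHIR Composition, not a validated FHIR resource, and every name in the code (`LEXICON`, `Entity`, `extract_entities`, `extract_relations`, `generate_report`, `build_report`) is made up for this example.

```python
import re
from bisect import bisect_right
from dataclasses import dataclass

# Hypothetical toy lexicon; the described system uses a fine-tuned BioBERT model here.
LEXICON = {
    "diagnosis": ["atrial fibrillation", "copd", "pneumonia"],
    "medication": ["metoprolol", "warfarin", "salbutamol"],
    "procedure": ["echocardiogram", "spirometry"],
}
# Hypothetical pattern for a few lab values followed by a number.
LAB_PATTERN = re.compile(
    r"\b(INR|FEV1|troponin)\s+(?:of\s+)?(\d+(?:\.\d+)?)", re.IGNORECASE
)


@dataclass(frozen=True)
class Entity:
    label: str
    text: str
    start: int  # character offset in the note


def extract_entities(note):
    """Stage 1 stand-in: lexicon and regex lookup instead of BioBERT NER."""
    found = []
    for label, terms in LEXICON.items():
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            for m in re.finditer(pattern, note, re.IGNORECASE):
                found.append(Entity(label, m.group(0), m.start()))
    for m in LAB_PATTERN.finditer(note):
        found.append(Entity("lab_value", m.group(0), m.start()))
    return sorted(found, key=lambda e: e.start)


def extract_relations(note, entities):
    """Stage 2 stand-in: link a medication to diagnoses in the same sentence."""
    # A sentence boundary is punctuation followed by whitespace, so "2.4" is not split.
    boundaries = [m.end() for m in re.finditer(r"[.!?]\s+", note)]
    sentence_of = {e: bisect_right(boundaries, e.start) for e in entities}
    relations = []
    for med in entities:
        if med.label != "medication":
            continue
        for dx in entities:
            if dx.label == "diagnosis" and sentence_of[dx] == sentence_of[med]:
                relations.append((med.text, "associated_with", dx.text))
    return relations


def generate_report(entities, relations):
    """Stage 3 stand-in: fill a fixed template instead of generating with T5."""
    sections = {}
    for e in entities:
        sections.setdefault(e.label, []).append(e.text)
    return {
        "resourceType": "Composition",  # loosely FHIR-shaped, not validated
        "status": "preliminary",
        "section": [{"title": k, "entries": v} for k, v in sections.items()],
        "relations": relations,  # illustrative field, not part of FHIR
    }


def build_report(note):
    entities = extract_entities(note)
    return generate_report(entities, extract_relations(note, entities))


# Made-up note for illustration only.
NOTE = ("Patient with atrial fibrillation was started on metoprolol. "
        "INR 2.4 on warfarin. Echocardiogram scheduled.")
print(build_report(NOTE))
```

Keeping the stages as separate functions mirrors the staged design described above: each stand-in could be swapped for a learned model without changing the interfaces between stages, provided the replacement returns the same entity and relation structures.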
Published Through
DAPC Publishing
SCOPUS Indexed
Peer Reviewed