DAPC Conference Proceedings

Volume 3224, Issue 1020059

Conference Paper

Natural language processing for automated medical report generation from clinical notes

Sharma Ritu1, Chen Xiaoming2, Okonkwo Emeka3

1AIIMS New Delhi, India

2Tsinghua University, China

3University of Lagos, Nigeria

Published Online

February 25, 2026

ISSN

1551-7616

Publisher

DAPC Publishing

Abstract

We present an NLP pipeline that automatically generates structured medical reports from unstructured clinical notes written by physicians. The system combines a fine-tuned BioBERT model for named entity recognition with a T5-based text generation module. Evaluated on 8,500 clinical notes from a tertiary hospital, our approach achieves a ROUGE-L score of 0.73 and reduces physician documentation time by approximately 40%.
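For readers unfamiliar with the metric, ROUGE-L scores a generated report against a reference by the length of their longest common subsequence (LCS) of tokens. The following is a minimal sketch, not the authors' evaluation code: the abstract does not state the tokenization or the F-measure weighting, so lowercased whitespace tokenization and `beta=1.0` are assumptions, and the function names are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate, beta=1.0):
    """ROUGE-L F-measure between a reference text and a candidate text.

    Assumption: lowercased whitespace tokenization; beta=1.0 weights
    precision and recall equally.
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)
```

With these assumptions, `rouge_l("patient has chest pain", "patient reports chest pain")` has an LCS of three tokens out of four on each side, so precision, recall, and the F-measure are all 0.75.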

Topics

NLP, Healthcare, Text Generation, BioBERT, Clinical AI

Full Text Preview

The pipeline consists of three stages: (1) BioBERT-based NER extracts medical entities (diagnoses, medications, procedures, lab values) from free-text clinical notes; (2) a relation extraction module identifies entity relationships; (3) a T5-large model generates structured reports following HL7 FHIR templates. The system was trained on 8,500 de-identified clinical notes from the cardiology and pulmonology departments of a 2,000-bed tertiary hospital.
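The data flow between the three stages can be sketched as follows. This is an illustration under loud assumptions, not the authors' implementation: a toy dictionary lookup stands in for the fine-tuned BioBERT NER model, a simple pairing rule stands in for the relation extraction module, and a plain dictionary stands in for the T5-large output. The report dictionary is not a valid HL7 FHIR resource, and every name here (`Entity`, `Relation`, `TOY_LEXICON`, `extract_entities`, `extract_relations`, `build_report`) is hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str
    label: str  # DIAGNOSIS, MEDICATION, PROCEDURE, or LAB_VALUE
    start: int
    end: int


@dataclass(frozen=True)
class Relation:
    head: Entity
    tail: Entity
    kind: str


# Hypothetical toy lexicon; the paper uses a fine-tuned BioBERT model here.
TOY_LEXICON = {
    "heart failure": "DIAGNOSIS",
    "furosemide": "MEDICATION",
    "echocardiogram": "PROCEDURE",
}


def extract_entities(note):
    """Stage 1 stand-in: case-insensitive dictionary lookup, not BioBERT."""
    entities = []
    for term, label in TOY_LEXICON.items():
        for m in re.finditer(re.escape(term), note, flags=re.IGNORECASE):
            entities.append(Entity(m.group(0), label, m.start(), m.end()))
    return sorted(entities, key=lambda e: e.start)


def extract_relations(entities):
    """Stage 2 stand-in: pair every medication with every diagnosis."""
    diagnoses = [e for e in entities if e.label == "DIAGNOSIS"]
    medications = [e for e in entities if e.label == "MEDICATION"]
    return [Relation(m, d, "TREATS") for m in medications for d in diagnoses]


def build_report(entities, relations):
    """Stage 3 stand-in: group entities by label instead of calling T5.

    The result is a simplified dictionary, not a conformant FHIR resource.
    """
    sections = {}
    for e in entities:
        sections.setdefault(e.label, []).append(e.text)
    return {
        "sections": sections,
        "relations": [
            {"head": r.head.text, "kind": r.kind, "tail": r.tail.text}
            for r in relations
        ],
    }
```

Calling the three functions in order on a note such as "Echocardiogram confirmed heart failure. Started furosemide 40 mg daily." produces one entity per lexicon term and a single TREATS relation from the medication to the diagnosis. In a real system each stand-in would be replaced by the corresponding trained model while the intermediate types stay the same.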
