A Hybrid Text Classification and Language Generation Model for Automated Summarization of Dutch Breast Cancer Radiology Reports

Elisa Nguyen*, Daphne Theodorakopoulos, Shreyasi Pathak, Jeroen Geerdink, Onno Vijlbrief, Maurice van Keulen, Christin Seifert

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-review

Abstract

Breast cancer diagnosis is based on radiology reports describing observations made from medical imagery, such as X-rays obtained during mammography. The reports are written by radiologists and contain a conclusion summarizing the observations. Manually summarizing the reports is time-consuming and leads to high text variability. This paper investigates the automated summarization of Dutch radiology reports. We propose a hybrid model consisting of a language model (encoder-decoder with attention) and a separate BI-RADS score classifier. The summarization model achieved a ROUGE-L F1 score of 51.5% on the Dutch reports, which is comparable to results in other languages and other domains. For the BI-RADS classification, the language model (accuracy 79.1%) was outperformed by a separate classifier (accuracy 83.3%), leading us to propose a hybrid approach for radiology report summarization. Our qualitative evaluation with experts found the generated conclusions to be comprehensible and to cover mostly relevant content; the main focus for improvement should be their factual correctness. While the current model is not accurate enough to be employed in clinical practice, our results indicate that hybrid models might be a worthwhile direction for future research.
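For readers unfamiliar with the ROUGE-L F1 metric reported above, the following is a minimal sketch of how such a score can be computed from the longest common subsequence (LCS) of tokens shared by a reference conclusion and a generated one. It is not the authors' evaluation code: the function names `lcs_length` and `rouge_l_f1`, the lowercase whitespace tokenization, and the example strings are all assumptions made for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    # Row-by-row dynamic programming; prev holds the previous row of the table.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """Sentence-level ROUGE-L F1 with equal weight on precision and recall.

    Assumption: tokens are lowercase, whitespace-separated words. Published
    evaluations may tokenize or stem differently.
    """
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0
    lcs = lcs_length(ref_tokens, cand_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand_tokens)  # share of generated tokens in the LCS
    recall = lcs / len(ref_tokens)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


# Invented example strings, not taken from the paper's data.
reference = "geen aanwijzingen voor maligniteit birads 1"
candidate = "geen maligniteit birads 2"
score = rouge_l_f1(reference, candidate)
```

In this invented example the LCS is three tokens long, so precision is 3/4, recall is 3/6, and the F1 score is 0.6. The same example shows why the abstract separates summary overlap from BI-RADS accuracy: a generated conclusion can overlap substantially with the reference while still stating the wrong score.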
Original language: English
Title of host publication: 2020 IEEE Second International Conference on Cognitive Machine Intelligence (CogMI)
Publisher: IEEE
Pages: 72-81
ISBN (Electronic): 978-1-7281-4144-2
Publication status: Published - 20 Jan 2021
Event: 2nd IEEE International Conference on Cognitive Machine Intelligence, CogMI 2020: The 6th IEEE International Conference on Collaboration and Internet Computing (IEEE CIC 2020), The 2nd IEEE International Conference on Trust, Privacy and Security in Intelligent Systems, and Applications (IEEE TPS 2020), The 2nd IEEE International Conference on Cognitive Machine Intelligence (IEEE CogMI 2020) - Virtual Conference, Virtual, Atlanta, United States
Duration: 1 Dec 2020 to 3 Dec 2020
Conference number: 2

Conference

Conference: 2nd IEEE International Conference on Cognitive Machine Intelligence, CogMI 2020
Abbreviated title: CogMI 2020
Country: United States
City: Virtual, Atlanta
Period: 1/12/20 to 3/12/20
