Pre-trained language models to extract information from radiological reports

Pilar López-Úbeda, Manuel Carlos Díaz-Galiano, Luis Alfonso Ureña López, María Teresa Martín-Valdivia
2021 Conference and Labs of the Evaluation Forum  
This paper describes the participation of the SINAI team in the SpRadIE challenge (Information Extraction from Spanish radiology reports), which consists of identifying biomedical entities related to the radiological domain. Many tasks have focused on extracting relevant information from clinical texts; however, no previous task has centered on radiology with Spanish as the main language. Detecting relevant information automatically in biomedical texts is a crucial task because ... current health information systems are not prepared to analyze and extract this knowledge due to the time and cost involved in processing it manually. To accomplish this task, we propose two approaches based on pretrained models using the BERT architecture. Specifically, we use a multi-class classification model, a binary classification model and a pipeline model for entity identification. The results are encouraging: our binary system obtained an F1-score of 73.7%, above the average of the participants.
dblp:conf/clef/Lopez-UbedaDLM21 fatcat:cghdctwklvezrp6hli4acugtji