Bigram feature extraction and conditional random fields model to improve text classification clinical trial document

Jasmir Jasmir, Siti Nurmaini, Reza Firsandaya Malik, Bambang Tutuko
2021 TELKOMNIKA (Telecommunication Computing Electronics and Control)  
Clinical trials are an important part of health and medicine: they study the safest ways to treat patients. Clinical trial protocols are usually written as unstructured free text, which a computer must interpret before the content can be used. The aim of this paper is to classify text from cancer clinical trial documents, which consist of unstructured free text taken from cancer clinical trial protocols. The proposed approach combines conditional random fields with bigram features. The new classification model for cancer clinical trial document text is compared with other methods in terms of precision, recall, and F1 score. Its results are better than previously reported ones: 88.07 precision, 88.05 recall, and an F1 score of 88.06.
doi:10.12928/telkomnika.v19i3.18357 fatcat:ojfau7nuxrhl3j3ugpjqn6nyny
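
The abstract names bigram features as the input to a conditional random fields model but does not give the feature template. The following is a minimal Python sketch of what word-bigram features for one token could look like. The function names, the feature keys, the `<BOS>`/`<EOS>` padding markers, and the example sentence are all illustrative assumptions, not the authors' implementation. The last lines check that the reported F1 score agrees with the harmonic mean of the reported precision and recall.

```python
def bigram_features(tokens, i):
    """Features for token i: the word plus its left and right word bigrams.

    Hypothetical feature template; the paper's actual template is not
    given in the abstract.
    """
    word = tokens[i].lower()
    # Pad with boundary markers at the sentence edges (an assumption).
    prev_word = tokens[i - 1].lower() if i > 0 else "<BOS>"
    next_word = tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>"
    return {
        "word": word,
        "bigram_prev": prev_word + " " + word,
        "bigram_next": word + " " + next_word,
    }


def sentence_features(tokens):
    """One feature dict per token, the shape typically fed to a linear-chain CRF."""
    return [bigram_features(tokens, i) for i in range(len(tokens))]


def f1(precision, recall):
    """F1 score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Made-up example sentence, not taken from the paper's data.
tokens = "Patients must have measurable disease".split()
features = sentence_features(tokens)

# Consistency check on the figures reported in the abstract.
reported_f1 = round(f1(88.07, 88.05), 2)
```

Each token is described by its own word and its two neighbouring bigrams, so the model sees local word order rather than isolated words. A real system would likely add more features and pass the dicts to a CRF toolkit for training.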