Basic Building Blocks for Clinical Text Processing [chapter]

Hercules Dalianis
2018 Clinical Text Mining  
This chapter describes the basics of text processing and gives an overview of standard methods and techniques: preprocessing of texts, such as tokenisation and text segmentation; word-level processing, such as morphological processing, lemmatisation, stemming, compound splitting, and abbreviation detection and expansion; and sentence-based methods, such as part-of-speech tagging, syntactic analysis or parsing, and semantic analysis such as named entity recognition, negation detection, relation extraction, ... ral processing and anaphora resolution. Generally, the same building blocks used for regular texts can also be used for clinical text processing. However, clinical texts contain more noise in the form of incomplete sentences, misspelled words and non-standard abbreviations, which can make natural language processing cumbersome. For more details on the concepts in this section, see the following comprehensible textbooks in computational linguistics: Mitkov (2005), Jurafsky and Martin (2014) and Clark et al. (2013).
doi:10.1007/978-3-319-78503-5_7 fatcat:pqwyc2vm2zh2vlky7bzo5da3pa