Natural Language Processing: A Machine Learning Perspective

Julia Ive
Computational Linguistics, 2021
Natural Language Processing (NLP) is a discipline at the crossroads of Artificial Intelligence (with Machine Learning (ML) as its part), Linguistics, Cognitive Science, and Computer Science that enables machines to analyse and generate natural language data. The multidisciplinary nature of NLP attracts specialists from various backgrounds, mostly with knowledge of Linguistics and ML. As the discipline is largely practice-oriented, NLP textbooks traditionally focus on concrete tasks and tend to elaborate on the linguistic peculiarities of ML approaches to NLP. They also often introduce predominantly either traditional ML or deep learning methods.

This textbook introduces NLP from the ML standpoint, elaborating on fundamental approaches and algorithms used in the field, such as statistical and deep learning models, generative and discriminative models, and supervised and unsupervised models. Despite the density of the material, the book is very easy to follow. The complexity of the introduced topics is built up gradually, with references to previously introduced concepts, and relies on a carefully observed unified notation system. The textbook aims to prepare final-year undergraduate as well as graduate students of relevant disciplines for an NLP course and to stimulate related research activities. Given the comprehensiveness of the topics covered in an accessible way, the textbook is also suitable for NLP engineers, non-ML specialists, and a broad range of readers interested in the topic.

The book comprises 18 chapters organised in three parts. Part I, "Basics", discusses the fundamental ML and NLP concepts necessary for further comprehension, using the example of classification tasks. Part II, "Structures", covers the principles of mathematical modelling for structured prediction, namely for such structures as sequences and trees. Part III, "Deep Learning", describes the basics of deep learning modelling for classification and structured prediction tasks; the part concludes with the basics of sequence-to-sequence modelling. The textbook thus emphasises the close connection and inheritance between traditional and deep learning methods. Following clear logic, generative models are introduced before discriminative ones (e.g., Chapter 7 from Part II introduces generative sequence labelling and Chapter 8 introduces discriminative sequence labelling), while modelling with hidden variables is presented at the end.
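The generative/discriminative distinction the book organises itself around can be illustrated with a toy sketch (not taken from the book; the dataset, variable names, and hyperparameters below are illustrative assumptions): a generative multinomial Naive Bayes classifier models p(y) and p(x|y) and classifies via Bayes' rule, while a discriminative logistic regression models p(y|x) directly.

```python
# Hedged sketch of the generative vs. discriminative contrast on a toy
# bag-of-words binary classification task. All data here is made up.
import numpy as np

# Rows are documents, columns are word counts; y gives the class labels.
X = np.array([[2, 0, 1],
              [3, 1, 0],
              [0, 2, 2],
              [1, 3, 1]], dtype=float)
y = np.array([0, 0, 1, 1])

def naive_bayes_log_posterior(X, y, x_new, alpha=1.0):
    """Generative: model p(y) and p(x|y), score classes by Bayes' rule."""
    scores = []
    for c in (0, 1):
        Xc = X[y == c]
        log_prior = np.log(len(Xc) / len(X))
        # Laplace-smoothed per-class word probabilities.
        word_probs = (Xc.sum(axis=0) + alpha) / (Xc.sum() + alpha * X.shape[1])
        scores.append(log_prior + x_new @ np.log(word_probs))
    return np.array(scores)

def logistic_regression(X, y, lr=0.1, steps=500):
    """Discriminative: model p(y|x) directly via gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * X.T @ (p - y) / len(y)
    return w

x_new = np.array([0.0, 2.0, 1.0])            # a class-1-like document
nb_scores = naive_bayes_log_posterior(X, y, x_new)
w = logistic_regression(X, y)
p1 = 1.0 / (1.0 + np.exp(-(x_new @ w)))
# Both models prefer class 1 for this document.
assert nb_scores[1] > nb_scores[0] and p1 > 0.5
```

The same contrast carries over to the book's structured settings: generative sequence labelling (Chapter 7) plays the role of Naive Bayes over sequences, and discriminative sequence labelling (Chapter 8) the role of logistic regression.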
Within each chapter, model descriptions are followed by their training and inference details. Finally, chapters conclude with a summary, chapter notes, and exercises. The exercises are carefully designed not only to support and deepen comprehension but also to stimulate further independent investigation of the topic. Example questions include the advantages of variational dropout over naïve dropout, or how sequence-to-sequence models could be used for sequence labelling. In the following paragraphs I will cover the content of each chapter in more detail.
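The dropout exercise mentioned above rests on a simple distinction that a minimal sketch makes concrete (this is an illustration, not material from the book; shapes, names, and the dropout rate are assumptions): naïve dropout samples a fresh mask at every time step of a sequence, while variational dropout samples one mask per sequence and reuses it across steps.

```python
# Hedged sketch of naive vs. variational dropout over a sequence of
# hidden states, using NumPy. Shapes and names are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def naive_dropout(states, p, rng):
    """Sample an independent dropout mask at every time step."""
    masks = rng.random(states.shape) >= p     # one mask per (step, unit)
    return states * masks / (1.0 - p)         # inverted-dropout scaling

def variational_dropout(states, p, rng):
    """Sample one mask per sequence and reuse it at every time step."""
    mask = rng.random(states.shape[1]) >= p   # one mask per unit only
    return states * mask / (1.0 - p)          # same units dropped each step

states = np.ones((4, 6))                      # 4 time steps, 6 hidden units
out = variational_dropout(states, p=0.5, rng=rng)
# With a shared mask, each unit's column is either all zeros or all
# scaled values, so recurrent connections see a consistent pattern.
assert all((col == 0).all() or (col == 2.0).all() for col in out.T)
```

The shared mask is what makes variational dropout suitable for recurrent networks: a unit is either dropped for the whole sequence or kept for the whole sequence, avoiding the noise accumulation that per-step masks cause across time.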
doi:10.1162/coli_r_00423