
Supervised Morphological Segmentation in a Low-Resource Learning Setting using Conditional Random Fields

Teemu Ruokolainen, Oskar Kohonen, Sami Virpioja, Mikko Kurimo
2013 Conference on Computational Natural Language Learning  
Specifically, we employ conditional random fields, a popular discriminative log-linear model for segmentation. We present experiments on two data sets comprising five diverse languages.  ...  We discuss data-driven morphological segmentation, in which word forms are segmented into morphs, the surface forms of morphemes.  ...  Acknowledgements This work was financially supported by Langnet (Finnish doctoral programme in language studies) and the Academy of Finland under the Finnish Centre of Excellence Program 2012-2017 (grant  ... 
dblp:conf/conll/RuokolainenKVK13 fatcat:ultpiswy3fgpzayluf7bddpyyu

Painless Semi-Supervised Morphological Segmentation using Conditional Random Fields

Teemu Ruokolainen, Oskar Kohonen, Sami Virpioja, Mikko Kurimo
2014 Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers  
We extend a recent segmentation approach based on conditional random fields from purely supervised to semi-supervised learning by exploiting available unsupervised segmentation techniques.  ...  We integrate the unsupervised techniques into the conditional random field model via feature set augmentation.  ...  We study morphological segmentation using conditional random fields (CRFs), a discriminative model for sequential tagging and segmentation (Lafferty et al., 2001).  ... 
doi:10.3115/v1/e14-4017 dblp:conf/eacl/RuokolainenKVK14 fatcat:rugytfcsknfwpbani4eknkwpiu

Canonical and Surface Morphological Segmentation for Nguni Languages [article]

Tumi Moeng, Sheldon Reay, Aaron Daniels, Jan Buys
2021 arXiv   pre-print
In the unsupervised setting, an entropy-based approach using a character-level LSTM language model fails to outperform a Morfessor baseline, while on some of the languages neither approach performs much  ...  In this paper, we investigate supervised and unsupervised models for two variants of morphological segmentation: canonical and surface segmentation.  ...  Acknowledgments This work is based on research supported in part by the National Research Foundation of South Africa (Grant Number: 129850) and the South African Centre for High Performance Computing.  ... 
arXiv:2104.00767v1 fatcat:wop3gn3geva3pp7erv5kwo5zjm

Deep-sea Nodule Mineral Image Segmentation Algorithm Based on Pix2PixHD

Wei Song, Haolin Wang, Xinping Zhang, Jianxin Xia, Tongmu Liu, Yuxi Shi
2022 Computers Materials & Continua  
It is important for expanding the application of deep learning techniques in the field of deep-sea exploration and mining.  ...  Deep-sea mineral image segmentation plays an important role in deep-sea mining and underwater mineral resource monitoring and evaluation.  ...  Acknowledgement: Thanks to other teachers and students in the Media Computing Laboratory of the Minzu University of China and anonymous reviewers for their valuable comments and contributions to this research  ... 
doi:10.32604/cmc.2022.027213 fatcat:tjnkzgrzmnabxjiku6vflwfsz4

Integrating Automated Segmentation and Glossing into Documentary and Descriptive Linguistics

Sarah Moeller, Mans Hulden
2021 Proceedings of the Workshop on Computational Methods for Endangered Languages  
In one experiment, a model learns segmentation and glossing as a joint step and another model learns the tasks into two sequential steps.  ...  In a second experiment, one model is trained on surface segmented data, where strings of texts have been simply divided at morpheme boundaries.  ...  McMillan-Major (2020) trained conditional random field (CRF) systems to produce a gloss line for several high-resource languages and three low-resource languages.  ... 
doi:10.33011/computel.v1i.965 fatcat:ffm3ucuepvfudl34nhcoiclnx4

Fortification of Neural Morphological Segmentation Models for Polysynthetic Minimal-Resource Languages [article]

Katharina Kann, Manuel Mager, Ivan Meza-Ruiz, Hinrich Schütze
2018 arXiv   pre-print
Since neural sequence-to-sequence (seq2seq) models define the state of the art for morphological segmentation in high-resource settings and for (mostly) European languages, we first show that they also  ...  obtain competitive performance for Mexican polysynthetic languages in minimal-resource settings.  ...  We further compare to a conditional random fields (CRF) (Lafferty et al., 2001) model, in particular a strong discriminative model for segmentation by Ruokolainen et al. (2014).  ... 
arXiv:1804.06024v1 fatcat:hjmbmiyo3rdtxmbc4chtzlkrta

A Comparative Study of Minimally Supervised Morphological Segmentation

Teemu Ruokolainen, Oskar Kohonen, Kairit Sirts, Stig-Arne Grönroos, Mikko Kurimo, Sami Virpioja
2016 Computational Linguistics  
In the minimally supervised data-driven learning setting, segmentation models are learned from a small number of manually annotated word forms and a large set of unannotated word forms.  ...  This article presents a comparative study of a subfield of morphology learning referred to as minimally supervised morphological segmentation.  ...  Acknowledgments This work was financially supported by Langnet (Finnish doctoral programme in language studies), the Academy of Finland under the Finnish Centre of Excellence Program 2012-2017 (grant no  ... 
doi:10.1162/coli_a_00243 fatcat:trm4ofybira6xez5higlt3266y

Fortification of Neural Morphological Segmentation Models for Polysynthetic Minimal-Resource Languages

Katharina Kann, Jesus Manuel Mager Hois, Ivan Vladimir Meza Ruiz, Hinrich Schütze
2018 Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)  
Since neural sequence-to-sequence (seq2seq) models define the state of the art for morphological segmentation in high-resource settings and for (mostly) European languages, we first show that they also  ...  obtain competitive performance for Mexican polysynthetic languages in minimal-resource settings.  ...  We further compare to a conditional random fields (CRF) (Lafferty et al., 2001) model, in particular a strong discriminative model for segmentation by Ruokolainen et al. (2014).  ... 
doi:10.18653/v1/n18-1005 dblp:conf/naacl/KannMRS18 fatcat:wgz7plttyzbdnlsfxoosvyuhpm

Morphological Processing of Low-Resource Languages: Where We Are and What's Next [article]

Adam Wiemerslage and Miikka Silfverberg and Changbing Yang and Arya D. McCarthy and Garrett Nicolai and Eliana Colunga and Katharina Kann
2022 arXiv   pre-print
First, we survey recent developments in computational morphology with a focus on low-resource languages.  ...  Second, we argue that the field is ready to tackle the logical next challenge: understanding a language's morphology from raw text alone.  ...  Systems A leading non-neural morphological tagger is MARMOT (Mueller et al., 2013), a higher-order conditional random field (CRF; Lafferty et al., 2001) tagger.  ... 
arXiv:2203.08909v1 fatcat:w64bmzjl6nbrniykqjj5diae2e

Enhancing MR Image Segmentation with Realistic Adversarial Data Augmentation [article]

Chen Chen, Chen Qin, Cheng Ouyang, Zeju Li, Shuo Wang, Huaqi Qiu, Liang Chen, Giacomo Tarroni, Wenjia Bai, Daniel Rueckert
2022 arXiv   pre-print
It is computationally efficient and applicable for both low-shot supervised and semi-supervised learning.  ...  The proposed adversarial data augmentation does not rely on generative networks and can be used as a plug-in module in general segmentation networks.  ...  Specifically, we evaluated one-shot learning (N=1) and three-shot learning (N=3) in both supervised (using the labeled set only) and semi-supervised (using both labeled and unlabeled sets) settings.  ... 
arXiv:2108.03429v3 fatcat:yeshh2io7bbixliqzp2ih4cvxi

Tackling the Low-resource Challenge for Canonical Segmentation [article]

Manuel Mager, Özlem Çetinoğlu, Katharina Kann
2020 arXiv   pre-print
We compare model performance in a simulated low-resource setting for the high-resource languages German, English, and Indonesian to experiments on new datasets for the truly low-resource languages Popoluca  ...  Thus, we conclude that canonical segmentation is still a challenging task for low-resource languages.  ...  Over the last years, supervised methods have attracted more attention: Ruokolainen et al. (2013) cast the task as a sequence labeling problem using conditional random fields (CRFs; Lafferty et al.,  ... 
arXiv:2010.02804v1 fatcat:27ljwcaqujb4dnck4udcvmecre

Computational Morphology with Neural Network Approaches [article]

Ling Liu
2021 arXiv   pre-print
This paper starts with a brief introduction to computational morphology, followed by a review of recent work on computational morphology with neural network approaches, to provide an overview of the area  ...  In the end, we will analyze the advantages and problems of neural network approaches to computational morphology, and point out some directions to be explored by future research and study.  ...  Their tagger is a neural conditional random field and the morphological inflector is a neural encoder-decoder model with hard attention (Aharoni and Goldberg, 2017) .  ... 
arXiv:2105.09404v1 fatcat:6w4n7yjaevh6fpnauntzlfe64u

Automatic rule learning exploiting morphological features for named entity recognition in Turkish

Serhan Tatar, Ilyas Cicekli
2011 Journal of information science  
The paper also provides a comprehensive overview of the field by reviewing the NER research literature.  ...  In this paper, we describe an automatic rule learning method that exploits different features of the input text to identify the named entities located in the natural language texts.  ...  Although the use of rule exception sets explained in Section 3.5 helps to reduce false positives, handling the information regarding the rule exceptions in a more formal way and generalizing them into  ... 
doi:10.1177/0165551511398573 fatcat:jsppv6n6wvegnjvxqj4a6t7pte

Bayesian Modeling of Lexical Resources for Low-Resource Settings

Nicholas Andrews, Mark Dredze, Benjamin Van Durme, Jason Eisner
2017 Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)  
We evaluate the proposed approach in two settings: part-of-speech induction and low-resource named-entity recognition.  ...  In this paper, we investigate a more robust approach: we stipulate that the lexicon is the result of an assumed generative process.  ...  Our model may be useful in the context of active learning where efficient re-estimation and performance in low-data conditions are important.  ... 
doi:10.18653/v1/p17-1095 dblp:conf/acl/AndrewsDDE17 fatcat:zpg6zlqkjfel5mxnr3t6nlgayq

Labeled Morphological Segmentation with Semi-Markov Models

Ryan Cotterell, Thomas Müller, Alexander Fraser, Hinrich Schütze
2015 Proceedings of the Nineteenth Conference on Computational Natural Language Learning  
For morphological segmentation our method shows absolute improvements of 2-6 points F1 over a strong baseline.  ...  We introduce a new hierarchy of morphotactic tagsets and CHIPMUNK, a discriminative morphological segmentation system that, contrary to previous work, explicitly models morphotactics.  ...  This material is based upon work supported by a Fulbright fellowship awarded to the first author by the German-American Fulbright Commission and the National  ... 
doi:10.18653/v1/k15-1017 dblp:conf/conll/Cotterell0FS15 fatcat:wycuxrgvofgohb3kaljljk3syq