Linguistica 5: Unsupervised Learning of Linguistic Structure
2016
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations
While Linguistica 5 inherits its predecessors' strength in unsupervised learning of natural language morphology, it incorporates significant improvements in multiple ways. ...
This paper introduces Linguistica 5, a software for unsupervised learning of linguistic structure. It is a descendant of Goldsmith's (2001, 2006) Linguistica. ...
Acknowledgments This work was completed in part with resources provided by the University of Chicago Research Computing Center and Big Ideas Generator. ...
doi:10.18653/v1/n16-3005
dblp:conf/naacl/LeeG16
fatcat:ndfwpyknp5arzolzbo77kqrzcm
Page 29 of Computational Linguistics Vol. 27, Issue 1
[page]
2001
Computational Linguistics
In phase two, the objective is to learn a contextual model of each PN category, augmented with syntactic and semantic features. ...
Second, through an unsupervised corpus-based technique, typical PN syntactic and semantic contexts are learned. ...
Investigating Sub-Word Embedding Strategies for the Morphologically Rich and Free Phrase-Order Hungarian
2019
Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)
Therefore, we explore and evaluate several sub-word unit based embedding strategies: character n-grams, lemmatization provided by an NLP-pipeline, and segments obtained in unsupervised learning (morfessor ...
For the highly agglutinative Hungarian, semantic accuracy of word embeddings measured on word analogy tasks drops by 50-75% compared to English. ...
Results showed that using the lemmas instead of the words was by far the most effective approach by maximizing semantic accuracy of the embeddings. ...
doi:10.18653/v1/w19-4321
dblp:conf/rep4nlp/DobrossyMTS19
fatcat:72tbnxawfrgbnoornhfb3eqzwe
Joint PoS Tagging and Stemming for Agglutinative Languages
[article]
2017
arXiv
pre-print
In this paper, we present an unsupervised Bayesian model using Hidden Markov Models (HMMs) for joint PoS tagging and stemming for agglutinative languages. ...
Part-of-speech tagging (PoS tagging) is one of these tasks that often suffers from sparsity. ...
Acknowledgments This research is supported by the Scientific and Technological Research Council of Turkey (TUBITAK) with the project number EEEAG-E . ...
arXiv:1705.08942v1
fatcat:hauvrqaoobfgbly32rkw6ukuza
A Computational Model for Child Inferences of Word Meanings via Syntactic Categories for Different Ages and Languages
2018
IEEE Transactions on Cognitive and Developmental Systems
We hypothesize that using this model with different numbers of categories can replicate the manner in which children of different ages learn words. ...
In addition, cross-linguistic differences originating from the acquisition of language-specific syntactic categories are identified, i.e., the syntactic categories learned from English and Chinese corpora ...
ACKNOWLEDGMENT The authors gratefully acknowledge the advice of Prof. M. Imai of Keio University. ...
doi:10.1109/tcds.2018.2883048
fatcat:pde5glonyjhrpbnv63jojv2pbq
Building Morphological Chains for Agglutinative Languages
[article]
2017
arXiv
pre-print
In this paper, we build morphological chains for agglutinative languages by using a log-linear model for the morphological segmentation task. ...
The results indicate that candidate generation plays an important role in such an unsupervised log-linear model that is learned using contrastive estimation with negative samples. ...
Acknowledgments This research is supported by the Scientific and Technological Research Council of Turkey (TUBITAK) with the project number EEEAG-115E464 and we are grateful to TUBITAK for their financial ...
arXiv:1705.02314v1
fatcat:djigjptfpfdufo6o6un2k6bige
A Review of Open Information Extraction Techniques
2019
IJCI. International Journal of Computers and Information
To make use of such huge amounts of textual data, there is a need to detect, extract, and structure the information conveyed through this data in a fast and scalable manner. ...
This can be performed using Information Extraction Techniques. ...
of their morphologies limits the implementation of different categories. ...
doi:10.21608/ijci.2019.35099
fatcat:ff3ssqyvzvelzp3lklvmpkj2kq
SEMANTIC ANALYZER FOR MARATHI TEXT
2014
International Journal of Research in Engineering and Technology
We describe our system as the one which analyzes the text by comparing it with the meaning of the words given in the WordNet. ...
This paper represents a Semantic Analyzer for checking the semantic correctness of the given input text. ...
Syntactic Analyzer: Morphological Analyzer's output is used by the Syntactic Analyzer to detect whether the output is syntactically correct or not. ...
doi:10.15623/ijret.2014.0303098
fatcat:xiuko7p2ezbmtc4vvgxroiw5we
Automatic adaptation of proper noun dictionaries through cooperation of machine learning and probabilistic methods
2000
Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '00
recognition of about 90% of the remaining 50%. ...
However the high performance of most existing PN classifiers heavily depends upon the availability of large dictionaries of domain-specific Proper Nouns, and a certain amount of manual work for rule writing ...
The method combined two learning approaches: supervised learning of decision-tree classifiers and unsupervised probabilistic learning of syntactic and semantic context. ...
doi:10.1145/345508.345563
dblp:conf/sigir/PetasisCVPKS00
fatcat:jmfriwwgdrclnocwt4xeq23u74
Unsupervised Named Entity Recognition Using Syntactic and Semantic Contextual Evidence
2001
Computational Linguistics
the effectiveness of using more fine-grained evidence--namely, syntactic and semantic contextual knowledge--in classifying NEs. ...
Proper nouns form an open class, making the incompleteness of manually or automatically learned classification rules an obvious problem. ...
Second, through an unsupervised corpus-based technique, typical PN syntactic and semantic contexts are learned. ...
doi:10.1162/089120101300346822
fatcat:hikpb332h5dqtftpu262qnyeg4
A resource-light approach to morpho-syntactic tagging. Anna Feldman and Jirka Hana
2010
Literary and Linguistic Computing
This volume deals with the problem of morpho-syntactic tagging, i.e. ...
This volume describes an alternative approach to morpho-syntactic tagging by porting the relevant information from one language to a related language, avoiding thus the labour-intensive creation of an annotated ...
doi:10.1093/llc/fqq012
fatcat:53fribvvlzbi5gintrvhijro74
Using eigenvectors of the bigram graph to infer morpheme identity
2002
Proceedings of the ACL-02 workshop on Morphological and phonological learning
In particular, we look at the suffixes derived from a corpus by unsupervised learning of morphology, and we ask which of these suffixes have a consistent syntactic function (e.g., in English, -ed is primarily ...
We exploit this technique for extending the value of automatic learning of morphology. ...
other, independent heuristics (such as presence of suffixes determined by unsupervised learning of morphology) are syntactically homogenous. ...
doi:10.3115/1118647.1118652
dblp:conf/sigmorphon/BelkinG02
fatcat:mhwbsk4yr5anlh67z53w7d3bsy
Combining Hand-crafted Rules and Unsupervised Learning in Constraint-based Morphological Disambiguation
[article]
1996
arXiv
pre-print
The unsupervised learning process produces two sets of rules: (i) choose rules which choose morphological parses of a lexical item satisfying constraint effectively discarding other parses, and (ii) delete ...
Our approach also uses a novel approach to unknown word processing by employing a secondary morphological processor which recovers any relevant inflectional and derivational information from a lexical ...
Acknowledgments We would like to thank Xerox Advanced Document Systems, and Lauri Karttunen of Xerox Parc and of Rank Xerox Research Centre (Grenoble) for providing us with the two-level transducer development ...
arXiv:cmp-lg/9604001v2
fatcat:jasty72cezhkdmc4gztn2ieope
Computational Models of Language Acquisition
[chapter]
2010
Lecture Notes in Computer Science
can be highly informative of syntactic category. This information can be extracted by some psychologically plausible mechanisms. Using distributional information concerning syntactic categories involves three stages: 1 Measuring ...
uses these labels to parse. Evaluation on WSJ10 yields an F-score of almost 0.76 when parsing begins from plain text. Note: algorithms for inducing part of speech categories from raw data (i.e., unsupervised ...
doi:10.1007/978-3-642-12116-6_8
fatcat:yeyxymemrnhchlpngncalixo3y
Unsupervised Disambiguation of Syncretism in Inflected Lexicons
[article]
2020
arXiv
pre-print
One can, however, use unsupervised learning (as in EM) to fit a model that probabilistically disambiguates word forms. ...
Lexical ambiguity makes it difficult to compute various useful statistics of a corpus. A given word form might represent any of several morphological feature bundles. ...
Here, the goal is to map sequences of forms into coarse-grained syntactic categories. Christodoulopoulos et al. (2010) provide a useful overview of previous work. ...
arXiv:1806.03740v2
fatcat:gxsxqf5lpvcx3d2tlygafbehxu