1,062 Hits in 8.5 sec

CLGVSM: Adapting Generalized Vector Space Model to Cross-lingual Document Clustering

Guoyu Tang, Yunqing Xia, Min Zhang, Haizhou Li, Fang Zheng
2011 International Joint Conference on Natural Language Processing  
Experimental results on benchmarking data set show that (1) the proposed CLGVSM is very effective for cross-document clustering, outperforming the two strong baselines vector space model (VSM) and latent  ...  semantic analysis (LSA) significantly; and (2) the new feature selection method can further improve CLGVSM.  ...  Acknowledgment This work is partially supported by NSFC (60703051) and MOST (2009DFA12970). We thank the reviewers for the valuable comments.  ... 
dblp:conf/ijcnlp/TangXZLZ11 fatcat:gxmp2fy4lje6xft3kpzzlk2roi

Watset: Local-Global Graph Clustering with Applications in Sense and Frame Induction [article]

Dmitry Ustalov and Alexander Panchenko and Chris Biemann and Simone Paolo Ponzetto
2019 arXiv   pre-print
We present a detailed theoretical and computational analysis of the Watset meta-algorithm for fuzzy graph clustering, which has been found to be widely applicable in a variety of domains.  ...  Then, it uses hard clustering to discover clusters in this "disambiguated" intermediate graph.  ...  Foundation for Basic Research (RFBR) under the project no. 16-37-00354 мол_а.  ... 
arXiv:1808.06696v3 fatcat:jdd5cnkhffhaxlti72oskgleye

Watset: Local-Global Graph Clustering with Applications in Sense and Frame Induction

Dmitry Ustalov, Alexander Panchenko, Chris Biemann, Simone Paolo Ponzetto
2019 Computational Linguistics  
We present a detailed theoretical and computational analysis of the Watset meta-algorithm for fuzzy graph clustering, which has been found to be widely applicable in a variety of domains.  ...  Then, it uses hard clustering to discover clusters in this "disambiguated" intermediate graph.  ...  We thank Bonaventura Coppolla for discussions and preliminary work on graph-based frame induction and Andrei Kutuzov, who conducted experiments with the HOSG-based baseline related to the frame induction  ... 
doi:10.1162/coli_a_00354 fatcat:b5dr23gh6var3fnzjdgztzdjni

Morphosyntactic Linguistic Wavelets for Knowledge Management [chapter]

Daniela López De Luise
2012 Intelligent Systems  
Gelernter's perspective on reasoning Section 3.2.3. defines that the clustering algorithms must be used first hard clusterings and afterwards fuzzy. It is not a trivial restriction.  ...  That description is analogous to the filtering restriction: define sharp clustering first and leave fuzzy clustering approaches for the final steps.  ...  and human support in the healthcare environment have also been made easier.  ... 
doi:10.5772/35438 fatcat:u446haayojbutoy6cvz6quitdu

Tracking linguistic primitives [chapter]

Niklas Johansson
2017 Iconicity in Language and Literature  
Significant semantic groupings and relations based solely on phonological contrasts were found for most investigated concepts, including the semantic domains; Small, Intense Vision-Touch, Large, Organic  ...  The most notable relations found were; MOTHER/I vs.  ...  Several binary oppositional relations were found between the Small and Intense Vision-Touch, and Large-Organic clusters (e.g. SOFT-HARD.  ... 
doi:10.1075/ill.15.03joh fatcat:u6dlhusmwvf6jpfjlgifqntye4

Mental representation and cognitive consequences of Chinese individual classifiers

Ming Y. Gao, Barbara C. Malt
2009 Language and Cognitive Processes  
(See the Chinese classifier dictionaries for additional examples.)  ...  speakers, and they also produced the greatest amount of clustering for both Chinese and English speakers.  ... 
doi:10.1080/01690960802018323 fatcat:x34dtjiw7bfyrl3lezn4fn42za

Multilingual Metaphor Processing: Experiments with Semi-Supervised and Unsupervised Learning

Ekaterina Shutova, Lin Sun, Elkin Darío Gutiérrez, Patricia Lichtenstein, Srini Narayanan
2017 Computational Linguistics  
Our aim is to identify the optimal type of supervision for a learning algorithm that discovers patterns of metaphorical association from text.  ...  , unconstrained and constrained clustering settings.  ...  Acknowledgments We would like to thank our anonymous reviewers for their most insightful comments. Ekaterina Shutova's research is supported by the Leverhulme Trust Early Career Fellowship.  ... 
doi:10.1162/coli_a_00275 fatcat:ojrv5y4e4zaifafj6femg7n354

Introduction to information retrieval

2009 ChoiceReviews  
Tomasic and Garcia-Molina (1993) and Jeong and Omiecinski (1995) are key early papers evaluating term partitioning versus document partitioning for distributed indexes.  ...  The scheme discussed in this chapter, currently believed to be the best published scheme (achieving as few as 3 bits per link for encoding), is described in a series of papers by Boldi and Vigna (2004b  ...  A document about Chinese cars may get soft assignments of 0.5 to each of the two clusters China and automobiles, reflecting the fact that both topics are pertinent.  ... 
doi:10.5860/choice.46-2715 fatcat:ruwoe46pgzcupjygnwbnit4z3u

Lexikos 30

Lexikos Lexikos
2020 Lexikos  
Bibliography Dictionaries Acknowledgements This research is supported in part by (a) the South African Centre for Digital Language Resources (SADiLaR) and (b) the National Research Foundation of South  ...  Acknowledgements This is a substantially expanded, reorganized and rewritten text of the talk entitled "Teaching Lexicography to EFL Acknowledgements This research is supported by the South African  ...  In a particular sentence, the logical-conceptual relations are transformed into syntactic relations.  ... 
doi:10.5788/30-1-1610 fatcat:6eksuj2d6fef7ijdrvj64vb5zq

Has Computational Linguistics Become More Applied? [chapter]

Kenneth Church
2009 Lecture Notes in Computer Science  
We approach the problem of related term identification by constructing low-dimensional embeddings where related terms are clustered together, and such clusters are spatially arranged according to the semantic  ...  In this work, we demonstrate the proposed methodology for a specific part-of-speech (verbs) of the Spanish language, by using dictionary-based definitions.  ...  Related Work Semi-supervised Clustering The semi-supervised clustering methods can be classified into constraint-based and distance-based.  ... 
doi:10.1007/978-3-642-00382-0_1 fatcat:oddvfzds4nfwjam2ccqeaxe2y4

DWIE: An entity-centric dataset for multi-task document-level information extraction

Klim Zaporojets, Johannes Deleu, Chris Develder, Thomas Demeester
2021 Information Processing & Management  
DWIE is conceived as an entity-centric dataset that describes interactions and properties of conceptual entities on the level of the complete document.  ...  Recognition (NER), (ii) Coreference Resolution, (iii) Relation Extraction (RE), and (iv) Entity Linking.  ...  Acknowledgements Part of the research leading to these results has received funding from (i) the European Union's Horizon 2020 research and innovation programme under grant agreement no. 761488 for the  ... 
doi:10.1016/j.ipm.2021.102563 fatcat:s2imreyj7rep7i56fv7cuy2iia

DWIE: an entity-centric dataset for multi-task document-level information extraction [article]

Klim Zaporojets, Johannes Deleu, Chris Develder, Thomas Demeester
2021 arXiv   pre-print
DWIE is conceived as an entity-centric dataset that describes interactions and properties of conceptual entities on the level of the complete document.  ...  Recognition (NER), (ii) Coreference Resolution, (iii) Relation Extraction (RE), and (iv) Entity Linking.  ...  Acknowledgements Part of the research leading to these results has received funding from (i) the European Union's Horizon 2020 research and innovation programme under grant agreement no. 761488 for the  ... 
arXiv:2009.12626v2 fatcat:2ht56fk3l5bipgev2uttsnagvu

Strudel: A Corpus-Based Semantic Model Based on Properties and Types

Marco Baroni, Brian Murphy, Eduard Barbu, Massimo Poesio
2010 Cognitive Science  
(clustering into superordinates), suggesting the empirical validity of the property-based approach. naturally occurring data, mostly in the form of linguistic corpora, that is, large and typically mixed  ...  when acquiring language and conceptual knowledge.  ...  Acknowledgments We thank Raffaella Bernardi, Katrin Erk, Alessandro Lenci, and the Cognitive Science editor and reviewers for very useful feedback, as well as the developers of the tools and resources  ... 
doi:10.1111/j.1551-6709.2009.01068.x pmid:21564211 fatcat:mrzvpwblmrfu3gncmvce7gok54

Message from the general chair

Benjamin C. Lee
2015 2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)  
To maximize the utility of the injected knowledge, we deploy a learning-based multi-sieve approach and develop novel entity-based features.  ...  We propose a joint learning model which combines pairwise classification and mention clustering with Markov logic.  ...  Given the exponential size of the mapping space, we propose a novel method for optimizing over soft mappings, and use entropy regularization to drive those towards hard mappings.  ... 
doi:10.1109/ispass.2015.7095776 dblp:conf/ispass/Lee15 fatcat:ehbed6nl6barfgs6pzwcvwxria

Sentiment Analysis and Opinion Mining [chapter]

Lei Zhang, Bing Liu
2017 Encyclopedia of Machine Learning and Data Mining  
The constraints can also be relaxed, i.e., they are treated as soft (rather than hard) constraints and may not be satisfied.  ...  In other words, it is hard to use the dictionary-based approach to find domain or context dependent orientations of sentiment words.  ... 
doi:10.1007/978-1-4899-7687-1_907 fatcat:iy5ty44cyzbrtodxfo7osy3iu4