Using search engine to construct a scalable corpus for Vietnamese lexical development for word segmentation

Doan Nguyen
2009 Proceedings of the 7th Workshop on Asian Language Resources - ALR7   unpublished
In this paper we introduce a semi-supervised approach in building a general scalable web corpus for Vietnamese using search engine to facilitate the word segmentation process.  ...  The recent deployment of a Vietnamese translation tool on a well-known search engine justifies its importance in gaining popularity with the World Wide Web.  ...  A Special thank to Mr. Thuy Vu who contributed to an assessment of our approach and the JVnSegmenter.  ... 
doi:10.3115/1690299.1690324 fatcat:icbhb6feprdjxbj2ji5rhqcdk4

A semi-automatic approach to construct Vietnamese ontology from online text

Bao-An Nguyen, Don-Lin Yang
2012 International Review of Research in Open and Distance Learning  
Therefore, we present a support system for Vietnamese ontology construction using pattern-based mechanisms to discover Vietnamese concepts and conceptual relations from Vietnamese text documents.  ...  However, due to the lack of system support for ontology construction, it is difficult to construct self-instructional materials for Vietnamese people.  ...  To make the Vietnamese text documents suitable for concept extraction with GATE, we developed a convertor to convert them into annotated documents that can be used by GATE.  ... 
doi:10.19173/irrodl.v13i5.1250 fatcat:dgsofvyjbzejhla5covvernvii

IRRODL Volume 13, Number 5

Various Authors
2012 International Review of Research in Open and Distance Learning  
This is similar to the approach used by teachers when they see suspicious wording in a student's essay; they will search for the particular sentence(s) with conventional search engines such as Google and  ...  Yang integrate statistics, data mining, and natural language processing techniques to construct a concept tree for the course Introduction to C Programming from text documents in Vietnamese.  ...  Acknowledgments We would like to thank three anonymous reviewers for their constructive comments and the editors for their time and valuable remarks.  ... 
doi:10.19173/irrodl.v13i5.1424 fatcat:lpe2ose6d5dlpea4xfhtxrbldi

Getting Past the Language Gap: Innovations in Machine Translation [chapter]

Rodolfo Delmonte
2012 Mobile Speech and Advanced Natural Language Solutions  
translating patents: the PaTrans and SpaTrans systems developed for LingTech A/S to translate English patents into Danish ...  ...  model language Trust Rating % In addition to phrase translation models, also word translation models are used, based on lexical level translation in conjunction with PBSTM.  ...  search engine.  ... 
doi:10.1007/978-1-4614-6018-3_6 fatcat:2njkc6meabhaxosl4wircumfjm

A Survey on Deep Learning for Named Entity Recognition [article]

Jing Li, Aixin Sun, Jianglei Han, Chenliang Li
2020 arXiv   pre-print
Early NER systems got a huge success in achieving good performance with the cost of human engineering in designing domain-specific features and rules.  ...  Then, we systematically categorize existing works based on a taxonomy along three axes: distributed representations for input, context encoder, and tag decoder.  ...  The key idea is that lexical resources, lexical patterns, and statistics computed on a large corpus can be used to infer mentions of named entities. Collins et al.  ... 
arXiv:1812.09449v3 fatcat:36tnstbyo5h4xizjpqn4cevgui

Shallow features as indicators of English–German contrasts in lexical cohesion

Kerstin Kunz, Ekaterina Lapshinova-Koltunski, José Manuel Martínez Martínez, Katrin Menzel, Erich Steiner
2017 Languages in Contrast: International Journal for Contrastive Linguistics  
This paper contrasts lexical cohesion between English and German spoken and written registers, reporting findings from a quantitative lexical analysis.  ...  The shallow features analysed are: highly frequent words in texts, lexical density, standardized type-token-ratio, top-frequent content words of the language within individual registers and texts, and  ...  Ferrucci, D. and Lally, A. (2004). UIMA: an architectural approach to unstructured information processing in the corporate research environment. Natural Language Engineering, 10(3-4):327-348.  ... 
doi:10.1075/lic.16005.kun fatcat:p7pa2crm6rgkrfmsdf7m5265ea

Computational intelligence in processing of speech acoustics: a survey

Amitoj Singh, Navkiran Kaur, Vinay Kukreja, Virender Kadyan, Munish Kumar
2022 Complex & Intelligent Systems  
However, a limited number of automatic speech recognition systems are available for commercial use.  ...  Combination of MFCC and DNN–HMM classifier is most commonly used system for developing ASR minority languages, whereas in some of the majority languages, researchers are using much advance algorithms of  ...  [16] used 1080 words in the speech corpus for the experiment. The SVM classifier was used for speech recognition. Uncertainties in the lexical tones were identified.  ... 
doi:10.1007/s40747-022-00665-1 fatcat:6pu2xccbq5as7bn2y2tav2fdwa

Extracting Temporal and Causal Relations between Events [article]

Paramita Mirza
2016 arXiv   pre-print
We first develop a robust extraction component for each type of relations, i.e. temporal order and causality.  ...  Structured information resulting from temporal information processing is crucial for a variety of natural language processing tasks, for instance to generate timeline summarization of events from news  ...  compared to 50K word corpus used in TempEval-1 and TempEval-2.  ... 
arXiv:1604.08120v1 fatcat:fmd7z6hwyjhgphbrnc3mgpifde

Toward a traceable, explainable, and fairJD/Resume recommendation system [article]

Amine Barrak, Bram Adams, Amal Zouaq
2022 arXiv   pre-print
Typically, pre-trained language models use transfer-based machine learning models to be fine-tuned for a specific field.  ...  For example, providing a detailed matching explanation for the targeted stakeholders is needed to ensure a transparent recommendation.  ...  Instead of applying a manual search, we perform an automated search using Engineering Village 6 to search for papers related to the matching of resumes and job description.  ... 
arXiv:2202.08960v1 fatcat:52mjmfklefhsfh7fhflmkgrime

Content Analysis of Textbooks via Natural Language Processing: Findings on Gender, Race, and Ethnicity in Texas U.S. History Textbooks

Li Lucy, Dorottya Demszky, Patricia Bromley, Dan Jurafsky
2020 AERA Open  
We apply techniques from natural language processing (lexicons, word embeddings, topic models) to 15 U.S. history textbooks widely used in Texas between 2015 and 2017, studying their depiction of historically  ...  Building on a rich tradition of textbook analysis, we release our computational toolkit to support new research directions.  ...  Acknowledgments We would like to thank the following individuals for helpful conversations, feedback, and ideas: Noah Smith, Sebastian Munoz-Najar Galvez, Lily  ... 
doi:10.1177/2332858420940312 fatcat:l5antrdnc5d5dbi4goob6k5lou

Parameter survey of a rib stiffened wooden floor using sinus modes model

Lars‐Göran Sjökvist, Jonas Brunskog, Finn Jacobsen
2008 Journal of the Acoustical Society of America  
The poor quality of clarity was assessed for churches with longest EDT for the performance of music involving polyphony.  ...  A grouping of the churches by typology was found by analysing the reverberation times; 3.  ...  Searching the best design for a large microphone array, a simple genetic algorithm was developed.  ... 
doi:10.1121/1.2933934 fatcat:fmeouf2iynctnculdxzvoqb4iu

Measurement of total sound energy density in enclosures at low frequencies

Finn Jacobsen
2008 Journal of the Acoustical Society of America  
doi:10.1121/1.2934233 fatcat:vuavezpbsnabxmhi4jlbvbwgly

Spherical near field acoustic holography with microphones on a rigid sphere

Finn Jacobsen, Jørgen Hald, Efrén Fernandez, Guillermo Moreno
2008 Journal of the Acoustical Society of America  
doi:10.1121/1.2934035 fatcat:57w2rziyfvcmnnyzsxlgg43qxa

4. A Protocol for Scholarly Digital Editions? The Italian Point of View [chapter]

Marina Buzzoni
2016 Digital Scholarly Editing: Theories and Practices  
In collaboration with Unglue.it we have set up a survey (only ten questions!) to learn more about how open access ebooks are discovered and used.  ...  you forgo such features as manuscript images, use of collation software, inclusion of a text search engine or of image-related tools.  ...  Ranking by relevance could arguably be methodologically useful for textual scholars who must peruse a large corpus for occurrences of themes, words and motifs.  ... 
doi:10.11647/obp.0095.04 fatcat:7so2wnx6u5bvxcw5smu5ghomb4

6. Exogenetic Digital Editing and Enactive Cognition [chapter]

Dirk Van Hulle
2016 Digital Scholarly Editing: Theories and Practices  
doi:10.11647/obp.0095.06 fatcat:3jy6aza4xnhrvkngpjgowpdcmq