
Speeding Up Multilingual Grammar Development by Exploiting Linked Data to Generate Pre-terminal Rules [chapter]

Sebastian Walter, Christina Unger, Philipp Cimiano
2014 Lecture Notes in Computer Science  
In order to automatize and speed up the generation of multilingual terminal lists, we present a tool that uses linked data sources such as DBpedia in order to retrieve all entities that satisfy a relevant  ...  The development of grammars, e.g. for spoken dialog systems, is a time- and effort-intensive process. Especially the crafting of rules that list all relevant instances of a non-terminal, e.g.  ...  Conclusion In this paper we presented a first approach of an easy-to-use tool that has the potential to significantly speed up the development of multilingual grammars by exploiting linked data for the  ... 
doi:10.1007/978-3-319-07983-7_34 fatcat:z3hdmm76jbdghcpodhqdv7cerq

TectoMT - a deep linguistic core of the combined Cimera MT system

Martin Popel, Roman Sudarikov, Ondrej Bojar, Rudolf Rosa, Jan Hajic
2016 European Association for Machine Translation Conferences/Workshops  
The development is currently supported by the QTLeap 7th FP project (  ...  The additional table is then used in a weighted combination with a large Moses translation table to produce pre-final output.  ...  While much of the current research in GF is focused on scaling it up to general-purpose translation, the creation of domain-specific systems has become a routine task, which is commercially exploited by  ... 
dblp:conf/eamt/PopelSBRH16 fatcat:jiuyhfskw5cqfbhdkvwqs2taru

Language Technology 2020: The Meta-Net Priority Research Themes [chapter]

Georg Rehm, Hans Uszkoreit
2013 META-NET Strategic Research Agenda for Multilingual Europe 2020  
methods that exploit linked open data for improved disambiguation. Develop a new generation of information extraction tools that are able to reliably extract from texts all semantic relations defined in  ...  Nevertheless, our technology area has to follow developments in other key engineering disciplines and speed up technology evolution by massive collaboration based on competitive division of labour and  ... 
doi:10.1007/978-3-642-36349-8_6 fatcat:jezmk52phre4lguizqtyh7gule

Experiences of integration and performance testing of multilingual OCR for printed Indian scripts

Deepak Arya, C. V. Jawahar, Chakravorty Bhagvati, Tushar Patnaik, B. B. Chaudhuri, G. S. Lehal, Santanu Chaudhury, A. G. Ramakrishna
2011 Proceedings of the 2011 Joint Workshop on Multilingual OCR and Analytics for Noisy Unstructured Text Data - MOCR_AND '11  
This paper presents an integration and testing scheme for managing a large Multilingual OCR Project. The project is an attempt to implement an integrated platform for OCR of different Indian languages.  ...  Key challenges involved in the development of the integrated OCR platform were the following: Developing specification scheme for each functional module so that modules can be independently developed,  ...  In this paper, we describe an integrated OCR which exploits common characteristics and common solutions across scripts to generate a robust system.  ... 
doi:10.1145/2034617.2034628 fatcat:z4gwgq3chnbz3n5qd4dnhm2qt4

Probabilistic Modelling of Morphologically Rich Languages [article]

Jan A. Botha
2015 arXiv   pre-print
that learns vector representations of morphemes and leverages them to link together morphologically related words.  ...  and help overcome data sparsity that arises from morphological processes.  ...  We added adaptation to these categories, also introducing intermediate dummy rules to avoid ill-defined recursive adaptor grammar rules like PresPres Pre.  ... 
arXiv:1508.04271v1 fatcat:6qhsfdbvt5emfiaumtwh2pzs7m


1978 Journal of Documentation  
and the development of techniques for access to multilingual data bases.  ...  data bases of multilingual terminology.  ... 
doi:10.1108/eb026657 fatcat:3qdsjriq5nfgzoxohhhou4lpda

Message from the general chair

Benjamin C. Lee
2015 2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)  
To inject knowledge, we use a state-of-the-art system which cross-links (or "grounds") expressions in free text to Wikipedia.  ...  To maximize the utility of the injected knowledge, we deploy a learning-based multi-sieve approach and develop novel entity-based features.  ...  First, by partitioning the factors, it speeds up parsing exponentially over the unfactored approach.  ... 
doi:10.1109/ispass.2015.7095776 dblp:conf/ispass/Lee15 fatcat:ehbed6nl6barfgs6pzwcvwxria

D3.2 Research Challenge Report v2

Christian Fäth, Christian Chiarcos, Jorge Gracia, Julia Bosque-Gil, Bernardo Stearns, John P. McCrae, Fernando Bobillo, Philipp Cimiano, Thierry Declerck, Mohammad Fazleh Elahi, Basil Ell, Julian Grosse (+4 others)
2020 Zenodo  
Methodologies are developed for the transformation of language resources and language data into LLOD representations. ● Task 3.2, Prêt-à-LLOD Link addresses the challenge of "Linking conceptual and lexical  ...  Novel (semi-)automatic methods are studied that aim at establishing links across multilingual LLOD datasets and models. ● Task 3.3, Prêt-à-LLOD Workflows addresses the challenge to create "Workflows  ...  nor of the RDF data model or how to set up a Linked Data server.  ... 
doi:10.5281/zenodo.5744508 fatcat:zukppmtuebcrhevwpzz2u3gdoy

Syntax and Parsing of Semitic Languages [chapter]

Reut Tsarfaty
2014 Natural Language Processing of Semitic Languages  
Therefore, general-purpose statistical parsers are not always equally successful when applied to Semitic data.  ...  We then survey the different components of a generative probabilistic parsing system and show how they can be designed and implemented in order to effectively cope with Semitic data.  ...  The writing of this chapter was partially funded by the Swedish research council, for which we are grateful.  ... 
doi:10.1007/978-3-642-45358-8_3 dblp:series/tanlp/Tsarfaty14 fatcat:ntl4l44okvft5c2i72iz4tcyva

Proceedings of the 8th Workshop on Linked Data in Linguistics (LDL-2022) [article]

Thierry Declerck, John P. McCrae, Elena Montiel, Christian Chiarcos, Max Ionov
2022 Zenodo  
The LDL workshop series has contributed greatly to the development of the Linguistic Linked Open Data (LLOD) cloud and the development of best practices for publishing and accessing language resources  ...  Firstly, the Prêt-à-LLOD project, which is making linguistic linked open data ready-to-use, and, secondly, the ELEXIS project on building a lexicographic infrastructure.  ...  Ready-to-use Multilingual Linked Language Data for Knowledge Services across Sectors (2019-2022, grant agreement 825182).  ... 
doi:10.5281/zenodo.6778018 fatcat:agcacvsxyjfgpeox3jn2y7cika

Question answering from structured knowledge sources

Anette Frank, Hans-Ulrich Krieger, Feiyu Xu, Hans Uszkoreit, Berthold Crysmann, Brigitte Jörg, Ulrich Schäfer
2007 Journal of Applied Logic  
Our approach naturally extends to multilingual question answering, and has been developed as a prototype system for two application domains: the domain of Nobel prize winners, and the domain of Language  ...  It therefore makes perfect sense to exploit the reversibility of the German and English HPSG grammars for multilingual answer generation.  ...  The elimination of a single property variable by a concrete property speeds up query performance by more than a factor of 100.  ... 
doi:10.1016/j.jal.2005.12.006 fatcat:rdztzmniw5eghl5zm5w2aqxyce

Linguistic Resources and Technologies for Romanian Language

Dan Cristea, Corina Forascu
2006 Computer Science Journal of Moldova  
This paper revises notions related to Language Resources and Technologies (LRT), including a brief overview of some resources developed worldwide and with a special focus on Romanian language.  ...  It then describes a joint Romanian, Moldavian, English initiative aimed at developing electronically coded resources for Romanian language, tools for their maintenance and usage, as well as for the creation  ...  language models, sets of grammar rules, name entity lists, etc.).  ... 
doaj:c3be4ea014bd449da61c7f881c8a2e33 fatcat:mnd4cjfi3bcezk425v5wfkghgm

Machine translation for everyone: Empowering users in the age of artificial intelligence [article]

Dorothy Kenny
2022 Zenodo  
Both have been affected by the increasing availability of machine translation (MT): language learners now make use of free online MT to help them both understand and produce texts in a second language,  ...  Language learning and translation have always been complementary pillars of multilingualism in the European Union.  ...  and suggestions on how to improve this text.  ... 
doi:10.5281/zenodo.6653405 fatcat:fqpka6rnvrcqbpmchjt4u7wiwq

Semantic Frame-Based Spoken Language Understanding [chapter]

Ye-Yi Wang, Li Deng, Alex Acero
2011 Spoken Language Understanding  
The structure of the semantic space can be represented by a set of templates called semantic frames, each of which contains some important component variables that are often referred to as slots.  ...  A frame-based SLU system is often limited to a specific domain, which has a well-defined, relatively small semantic space.  ...  Grammar authoring is difficult to scale up.  ... 
doi:10.1002/9781119992691.ch3 fatcat:3cwoww7tnjez3dhhpvnyfzpmry

Dependency Parsing

Sandra Kübler, Ryan McDonald, Joakim Nivre
2009 Synthesis Lectures on Human Language Technologies  
After an introduction to dependency grammar and dependency parsing, followed by a formal characterization of the dependency parsing problem, the book surveys the three major classes of parsing models that  ...  are in current use: transition-based, graph-based, and grammar-based models.  ...  The parser proceeds by setting up a link between two words w_i and w_j based on the linking requirements of w_i and w_j; then it attempts to link all the words between w_i and w_j.  ... 
doi:10.2200/s00169ed1v01y200901hlt002 fatcat:e6bea7jadrbjjf2mweetf6a6mu