An information extraction core system for real world German text processing

Günter Neumann, Rolf Backofen, Judith Baur, Markus Becker, Christian Braun
1997 Proceedings of the fifth conference on Applied natural language processing
This paper describes SMES, an information extraction core system for real world German text processing.  ...  for processing different tasks in a flexible manner.  ...  We would like to thank the following people for fruitful discussions: Hans Uszkoreit, Gregor Erbach, and Luca Dini.  ... 
doi:10.3115/974557.974588 dblp:conf/anlp/NeumannBBBB97 fatcat:mxptfcos6fhi3eka3qb5kjef4i

Online-Monitoring of Security-Related Events

Martin Atkinson, Jakub Piskorski, Bruno Pouliquen, Ralf Steinberger, Hristo Tanev, Vanni Zavarella
2008 International Conference on Computational Linguistics  
This paper presents a fully operational real-time event extraction system which is capable of accurately and efficiently extracting violent and natural disaster events from vast amount of online news articles  ...  The event extraction results can be viewed on a publicly accessible website.  ...  The results of the core event extraction system are integrated into a real-world global monitoring system.  ... 
dblp:conf/coling/AtkinsonPPSTZ08 fatcat:mzxndrqzjvbjbg7apvgmafoma4

A Structured Review of the Validity of BLEU

Ehud Reiter
2018 Computational Linguistics  
texts, or for scientific hypothesis testing.  ...  I present a structured review of the evidence on whether BLEU is a valid evaluation technique; in other words, whether BLEU scores correlate with real-world utility and user-satisfaction of NLP systems;  ...  Acknowledgments Many thanks to the anonymous reviewers and my colleagues at Aberdeen for their very helpful comments.  ... 
doi:10.1162/coli_a_00322 fatcat:rfvawsrtxzboxozqwscte5yire

Transfer of Clinical Drug Data to a Research Infrastructure on OMOP – A FAIR Concept [chapter]

Ines Reinecke, Michéle Zoch, Markus Wilhelm, Martin Sedlmayr, Franziska Bathelt
2021 Studies in Health Technology and Informatics  
Generating evidence based on real-world data is gaining importance in research not least since the COVID-19 pandemic.  ...  Although the transfer of German claim data to OMOP is already implemented, drug data is an open issue.  ...  All authors approved the submitted manuscript and take responsibility for its scientific integrity.  ... 
doi:10.3233/shti210815 pmid:34795082 fatcat:ps3yrphxxjb7nnwjudfrlruyiu

A Shallow Text Processing Core Engine

Günter Neumann, Jakub Piskorski
2002 Computational intelligence  
In this paper we present sppc, a high-performance system for intelligent extraction of structured data from free text documents. sppc consists of a set of domain-adaptive shallow core components that  ...  The whole approach proved to be very useful for processing free word order languages like German. sppc has a good performance (more than 6000 words per second on standard PC environments) and achieves  ...  Acknowledgements The research underlying this paper was supported by a research grant from the German  ... 
doi:10.1111/0824-7935.00197 fatcat:ggrveuikyfb6no6tbprmji2vcu

Real-Time Discovery and Geospatial Visualization of Mobility and Industry Events from Large-Scale, Heterogeneous Data Streams

Leonhard Hennig, Philippe Thomas, Renlong Ai, Johannes Kirschnick, He Wang, Jakob Pannier, Nora Zimmermann, Sven Schmeier, Feiyu Xu, Jan Ostwald, Hans Uszkoreit
2016 Proceedings of ACL-2016 System Demonstrations  
We present Spree, a scalable system for real-time, automatic event extraction from social media, news and domain-specific RSS feeds.  ...  Our system is tailored to a range of mobility- and industry-related events, and processes German texts within a distributed linguistic analysis pipeline implemented in Apache Flink.  ...  Acknowledgments This research was partially supported by the German Federal Ministry of Economics and Energy (BMWi) through the projects SDW (01MD15010A) and SD4M (01MD15007B), and by the German Federal  ... 
doi:10.18653/v1/p16-4007 dblp:conf/acl/HennigTAKWPZSXO16 fatcat:epflmdi2uraatcy2alv5ykhylq

Information Extraction: Past, Present and Future [chapter]

Jakub Piskorski, Roman Yangarber
2012 Multi-source, Multilingual Information Extraction and Summarization  
In this chapter we present a brief overview of Information Extraction, which is an area of natural language processing that deals with finding factual information in free text.  ...  Such a record may capture a real-world entity with its attributes mentioned in text, or a real-world event, occurrence, or state, with its arguments or actors: who did what to whom, where and when.  ...  contributed to the spread of deployment of IE techniques in real-world applications for processing of vast amount of textual data.  ... 
doi:10.1007/978-3-642-28569-1_2 dblp:series/tanlp/PiskorskiY13 fatcat:aoc7stoinzf6jc2dengl5ltwte

Streaming Text Analytics for Real-Time Event Recognition

Philippe Thomas, Johannes Kirschnick, Leonhard Hennig, Renlong Ai, Sven Schmeier, Holmer Hemsen, Feiyu Xu, Hans Uszkoreit
2017 RANLP 2017 - Recent Advances in Natural Language Processing Meet Deep Learning  
Real-time information extraction from such high velocity, high volume text streams requires scalable, distributed natural language processing pipelines.  ...  We also present promising experimental results for the event extraction component of our system, which recognizes a novel set of event types.  ...  Acknowledgments This research was partially supported by the German Federal Ministry of Economics and Energy (BMWi) through the projects SDW (01MD15010A) and SD4M (01MD15007B), and by the German Federal  ... 
doi:10.26615/978-954-452-049-6_096 dblp:conf/ranlp/ThomasKHASHXU17 fatcat:ctsfdcoqubfj5ew5qx53ek3vsy

Application of the Dublin Core format for automatic metadata generation and extraction

Ernesto Giralt Hernández, Joan Marc Piulachs
2005 International Conference on Dublin Core and Metadata Applications  
This article describes a set of services and tools to be used by information systems to obtain metadata collections in an automated fashion from online content or other electronic repositories.  ...  Through several algorithms, it is capable of generating and extracting metadata elements from documents, whether explicitly declared or as a result of the document's content analysis.  ...  Any planning of a new information system, oriented to extract and to process content from the chaotic repository that the Internet represents today, runs into the problem of dealing with non-catalogued contents  ... 
dblp:conf/dc/HernandezP05 fatcat:mm2n7rsnavguffsiqgd27arnn4

Biographical Data Exploration as a Test-bed for a Multi-view, Multi-method Approach in the Digital Humanities

André Blessing, Andrea Glaser, Jonas Kuhn
2015 Conference on Biographical Data in a Digital World  
The present paper has two purposes: the main point is to report on the transfer and extension of an NLP-based biographical data exploration system that was developed for Wikipedia data and is now applied  ...  Hence, we view the project context as an interesting test-bed for some methodological considerations.  ...  This work is supported by CLARIN-D (Common Language Resources and Technology Infrastructure, http://de.clarin.eu/), funded by the German Federal Ministry for Education and Research (BMBF) and by a Nuance  ... 
dblp:conf/bd/BlessingGK15 fatcat:vtthrmjimzfgrdedqgos377ezi

IDA: A System for Automated Sorting, Indexing, and Classification of Documents

Gerd Maderlechner, Thomas Brückner, Peter Suda
1996 IAPR International Workshop on Machine Vision Applications  
The system has been applied to a variety of tasks: Presorting of forms, reports and letters, index extraction for archiving and retrieval, text column analysis in real estate register documents, in-house  ...  This paper presents an overview of the architecture and applications of the system.  ...  The forms application was part of a system solution for conversion of more than five million real estate register pages into an optical archive 3).  ... 
dblp:conf/mva/MaderlechnerBS96 fatcat:esg5lbs4ind73lcvroiz3mpzfm

Test-Driven Development of Complex Information Extraction Systems using TextMarker

Peter Klügl, Martin Atzmüller, Frank Puppe
2008 Deutsche Jahrestagung für Künstliche Intelligenz  
This paper presents an approach for the test-driven development of complex information extraction systems.  ...  TextMarker and the test-driven approach are demonstrated by two real-world case studies in technical and medical domains.  ...  Acknowledgements This work has been partially supported by the German Research Council (DFG) under grant Pu 129/8-2.  ... 
dblp:conf/ki/KluglAP08 fatcat:lnofhdtvrneyjjoncdpxrenace

Towards a Platform for Curation Technologies: Enriching Text Collections with a Semantic-Web Layer [chapter]

Peter Bourgonje, Julian Moreno-Schneider, Jan Nehring, Georg Rehm, Felix Sasaki, Ankit Srivastava
2016 Lecture Notes in Computer Science  
In an attempt to put a Semantic Web-layer that provides linguistic analysis and discourse information on top of digital content, we develop a platform for digital curation technologies.  ...  The platform offers language-, knowledge- and data-aware services as a flexible set of workflows and pipelines for the efficient processing of various types of digital content.  ...  Information extraction: We use Lucene to create an index for our document collection that enables text-based IR.  ... 
doi:10.1007/978-3-319-47602-5_14 fatcat:ez53ffo7zzgw7b37flh6rtdutm

4.-8. März 2019

Cornelia Kiefer, Peter Reimann, Bernhard Mitschang
2019 Datenbanksysteme für Business, Technologie und Web  
Our main contributions comprise the description of the concept of hybrid information extraction as well as a prototypical implementation and an evaluation with two real-world data sets from aftersales  ...  Our solution exploits results of analyzing structured data within the text mining process, i.e., structured information guides and improves the information extraction process on textual data.  ...  Acknowledgements The authors would like to thank the German Research Foundation (DFG) for financial support of this project as part of the Graduate School of Excellence advanced Manufacturing Engineering  ... 
doi:10.18420/btw2019-10 dblp:conf/btw/Kiefer0M19 fatcat:id5onbjv7fditgepdfkpbbue6q

Supporting teachers as content authors in intelligent educational systems

Peter Brusilovsky, Judith Knapp, Johann Gamper
2006 International Journal of Knowledge and Learning  
As a contribution to solving this problem, we present our recent work on authoring support for an adaptive vocabulary acquisition system, ELDIT.  ...  In this paper we discuss a new approach for authoring practical IESs where core authoring is done by professional design teams, while the educational content is mainly developed by teachers who use the  ...  The problem is that evaluating an authoring system developed for teachers is a real challenge.  ... 
doi:10.1504/ijkl.2006.010992 fatcat:q2pphrgv35devdc3lpw3fetedq