OntoGene web services for biomedical text mining

Fabio Rinaldi, Simon Clematide, Hernani Marques, Tilia Ellendorff, Martin Romacker, Raul Rodriguez-Esteban
2014 BMC Bioinformatics  
Text mining services are rapidly becoming a crucial component of various knowledge management pipelines, for example in the process of database curation, or for exploration and enrichment of biomedical data within the pharmaceutical industry. Traditional architectures, based on monolithic applications, do not offer sufficient flexibility for a wide range of use case scenarios, and therefore open architectures, as provided by web services, are attracting increased interest. We present an ... towards providing advanced text mining capabilities through web services, using a recently proposed standard for textual data interchange (BioC). The web services leverage a state-of-the-art platform for text mining (OntoGene) which has been tested in several community-organized evaluation challenges, with top ranked results in several of them.
doi:10.1186/1471-2105-15-s14-s6 pmid:25472638 pmcid:PMC4255746 fatcat:oufaw2qdcfhjraterrruhy7l4q

BioCreative V track 4: a shared task for the extraction of causal network information using the Biological Expression Language

Fabio Rinaldi, Tilia Renate Ellendorff, Sumit Madan, Simon Clematide, Adrian van der Lek, Theo Mevissen, Juliane Fluck
2016 Database: The Journal of Biological Databases and Curation  
Automatic extraction of biological network information is one of the most desired and most complex tasks in biological and medical text mining. Track 4 at BioCreative V attempts to approach this complexity using fragments of large-scale manually curated biological networks, represented in Biological Expression Language (BEL), as training and test data. BEL is an advanced knowledge representation format which has been designed to be both human readable and machine processable. The specific goal of track 4 was to evaluate text mining systems capable of automatically constructing BEL statements from given evidence text, and of retrieving evidence text for given BEL statements. Given the complexity of the task, we designed an evaluation methodology which gives credit to partially correct statements. We identified various levels of information expressed by BEL statements, such as entities, functions, relations, and introduced an evaluation framework which rewards systems capable of delivering useful BEL fragments at each of these levels. The aim of this evaluation method is to help identify the characteristics of the systems which, if combined, would be most useful for achieving the overall goal of automatically constructing causal biological networks from text.
doi:10.1093/database/baw067 pmid:27402677 pmcid:PMC4940434 fatcat:nra6lgn7fndppg6o2nzi5lovz4

UZH in BioNLP 2013

Gerold Schneider, Simon Clematide, Tilia Ellendorff, Don Tuggener, Fabio Rinaldi, Gintare Grigonyte
2013
We describe a biological event detection method implemented for the Genia Event Extraction task of BioNLP 2013. The method relies on syntactic dependency relations provided by a general NLP pipeline, supported by statistics derived from Maximum Entropy models for candidate trigger words, for potential arguments, and for argument frames.
doi:10.5167/uzh-91884 fatcat:idyyvqaumza2fg2ezlzzonzcnm

OntoGene: CTD entity and action term recognition

Fabio Rinaldi, Simon Clematide, Tilia Renate Ellendorff, Hernani Marques
2013
doi:10.5167/uzh-91880 fatcat:djsyrvulybbphnllykfrnofbhq

Biomedical Text Mining for Etiological Factor Identification in Mental Health Publications

Tilia Ellendorff
2021
The first version of the resulting corpus has been presented at the 10th Language Resources and Evaluation Conference (LREC) (Ellendorff et al., 2016). ...
doi:10.5167/uzh-205675 fatcat:o52fsxkqj5ejbjkjvb4e47i7qu

Ontogene Term and Relation Recognition for CDR

Tilia Renate Ellendorff, Simon Clematide, Adrian Van Der Lek, Lenz Furrer, Fabio Rinaldi
2015
For our participation in the CDR task of BioCreative 5, we have adapted the Ontogene System and optimized it for disease recognition (DNER Task) and identification of chemical-disease relationships (CID Task). For the DNER Task we have experimented with different changes to the term matching system. We describe the effects of an abbreviation detection tool as well as a selection of rules for term normalization.
doi:10.5167/uzh-116468 fatcat:vcpijkx6v5hadhyfsa7jtigbae

ODIN: a customizable literature curation tool

Fabio Rinaldi, Allan Peter Davis, Christopher Southan, Simon Clematide, Tilia Renate Ellendorff, Gerold Schneider
2013
doi:10.5167/uzh-91888 fatcat:srmq4okelvfdrldpkevm36jkde

Using Large Biomedical Databases as Gold Annotations for Automatic Relation Extraction

Tilia Ellendorff, Fabio Rinaldi, Simon Clematide
2014
We show how to use large biomedical databases in order to obtain a gold standard for training a machine learning system over a corpus of biomedical text. As an example we use the Comparative Toxicogenomics Database (CTD) and describe by means of a short case study how the obtained data can be applied. We explain how we exploit the structure of the database for compiling training material and a test set. Using a Naive Bayes document classification approach based on words, stem bigrams and MeSH descriptors, we achieve a macro-average F-score of 61% on a subset of 8 action terms. This outperforms a baseline system based on a lookup of stemmed keywords by more than 20%. Furthermore, we present directions of future work, taking the described system as a vantage point. Future work will be aiming towards a weakly supervised system capable of discovering complete biomedical interactions and events.
doi:10.5167/uzh-104487 fatcat:5iuhddwvxzd47cyicnyhlhstyq

A Combined Resource of Biomedical Terminology and its Statistics

Tilia Renate Ellendorff, Adrian Van Der Lek, Lenz Furrer, Fabio Rinaldi
2015
In this paper, we present a large biomedical term resource automatically compiled from the terminology of a selection of biomedical databases. The resource has a very simple and intuitive format and therefore can be easily embedded into a system for biomedical text mining and used as a linguistic resource. It is continuously updated and a user interface makes it possible to compile a new term resource according to individual requirements by selecting specific databases to be included. We ... statistics for each included biomedical entity type separately as well as in the context of the combined terminology.
doi:10.5167/uzh-114510 fatcat:47eoo4xy5ffqdlxkcb75cazyri

The PsyMine Corpus - A Corpus annotated with Psychiatric Disorders and their Etiological Factors

Tilia Renate Ellendorff, Simon Foster, Fabio Rinaldi
2016
We present the first version of a corpus annotated for psychiatric disorders and their etiological factors. The paper describes the choice of text, annotated entities and events/relations as well as the annotation scheme and procedure applied. The corpus features a selection of focus psychiatric disorders including depressive disorder, anxiety disorder, obsessive-compulsive disorder, phobic disorders and panic disorder. Etiological factors for these focus disorders are widespread and include genetic, physiological, sociological and environmental factors among others. Etiological events, including annotated evidence text, represent the interactions between their focus disorders and their etiological factors. In addition to these core events, symptomatic and treatment events have been annotated. The current version of the corpus includes 175 scientific abstracts. All entities and events/relations have been manually annotated by domain experts and scores of inter-annotator agreement are presented. The aim of the corpus is to provide a first gold standard to support the development of biomedical text mining applications for the specific area of mental disorders, which are among the main contributors to the contemporary burden of disease.
doi:10.5167/uzh-127488 fatcat:avdkj5kganauzamhedsnsvldam

Using a Hybrid Approach for Entity Recognition in the Biomedical Domain

Marco Basaldella, Lenz Furrer, Nicola Colic, Tilia Renate Ellendorff, Carlo Tasso, Fabio Rinaldi
2016
For term matching, we compiled a dictionary resource using the Bio Term Hub (Ellendorff et al., 2015). ...
doi:10.5167/uzh-125712 fatcat:ak75ku4m6bdkngwm7hapjw5dtq

Track 4 Overview: Extraction of Causal Network Information in Biological Expression Language (BEL)

Juliane Fluck, Sumit Madan, Tilia Renate Ellendorff, Theo Mevissen, Simon Clematide, Adrian Van Der Lek, Fabio Rinaldi
2015
Automatic extraction of biological network information is one of the most desired and most complex tasks in biological text mining. The BioCreative track 4 provides training data and an evaluation environment for the extraction of causal relationships in Biological Expression Language (BEL). BEL is a modeling language that is easily editable by humans or by automatic systems and can express causal relationships of different levels of granularity. Protein-protein relations can be expressed in BEL ... as well as relations between biological processes and disease stages. To extract BEL information automatically, named entity recognition and normalization to defined name spaces are necessary. Furthermore, relations extracted from text have to be transformed into correct BEL syntax. The track provided training and evaluation for two complementary tasks: given a sentence, extract all BEL statements; and given a BEL statement, propose up to 10 evidence sentences from the literature.
doi:10.5167/uzh-116469 fatcat:7g7qxe4idbei3fca3ek3cafpoa

Approaching SMM4H with Merged Models and Multi-task Learning

Tilia Ellendorff, Lenz Furrer, Nicola Colic, Noëmi Aepli, Fabio Rinaldi
2019 Proceedings of the Fourth Social Media Mining for Health Applications (#SMM4H) Workshop & Shared Task   unpublished
For the last run (MTL+BERT), we combined predictions from all 20 BERT systems with the first system and a second MTL configuration which uses different word embeddings (Ellendorff et al., 2018) and omits  ... 
doi:10.18653/v1/w19-3208 fatcat:5hq64jomrjcphaf4sgzirrdlum

UZH@SMM4H: System Descriptions

Tilia Ellendorff, Joseph Cornelius, Heath Gordon, Nicola Colic, Fabio Rinaldi
2018 Proceedings of the 2018 EMNLP Workshop SMM4H: The 3rd Social Media Mining for Health Applications Workshop & Shared Task   unpublished
Our team at the University of Zurich participated in the first 3 of the 4 sub-tasks at the Social Media Mining for Health Applications (SMM4H) shared task. We experimented with different approaches for text classification, namely traditional feature-based classifiers (Logistic Regression and Support Vector Machines), shallow neural networks, RCNNs, and CNNs. This system description paper provides details regarding the different system architectures and the achieved results.
doi:10.18653/v1/w18-5916 fatcat:n6cg5wot6bdahozhibbqyirsei

BioCreative V track 4: a shared task for the extraction of causal network information using the Biological Expression Language

Fabio Rinaldi, Tilia Renate Ellendorff, Sumit Madan, Simon Clematide, Adrian Van Der Lek, Theo Mevissen, Juliane Fluck
2016
Automatic extraction of biological network information is one of the most desired and most complex tasks in biological and medical text mining. Track 4 at BioCreative V attempts to approach this complexity using fragments of large-scale manually curated biological networks, represented in Biological Expression Language (BEL), as training and test data. BEL is an advanced knowledge representation format which has been designed to be both human readable and machine processable. The specific goal of track 4 was to evaluate text mining systems capable of automatically constructing BEL statements from given evidence text, and of retrieving evidence text for given BEL statements. Given the complexity of the task, we designed an evaluation methodology which gives credit to partially correct statements. We identified various levels of information expressed by BEL statements, such as entities, functions, relations, and introduced an evaluation framework which rewards systems capable of delivering useful BEL fragments at each of these levels. The aim of this evaluation method is to help identify the characteristics of the systems which, if combined, would be most useful for achieving the overall goal of automatically constructing causal biological networks from text.
doi:10.5167/uzh-129620 fatcat:at4ypkjqbrgq5iehck3r7rhuqq