OntoGene web services for biomedical text mining
2014
BMC Bioinformatics
Text mining services are rapidly becoming a crucial component of various knowledge management pipelines, for example in the process of database curation, or for exploration and enrichment of biomedical data within the pharmaceutical industry. Traditional architectures, based on monolithic applications, do not offer sufficient flexibility for a wide range of use case scenarios, and therefore open architectures, as provided by web services, are attracting increased interest. We present an ... towards providing advanced text mining capabilities through web services, using a recently proposed standard for textual data interchange (BioC). The web services leverage a state-of-the-art platform for text mining (OntoGene) which has been tested in several community-organized evaluation challenges, with top ranked results in several of them.
doi:10.1186/1471-2105-15-s14-s6
pmid:25472638
pmcid:PMC4255746
fatcat:oufaw2qdcfhjraterrruhy7l4q
BioCreative V track 4: a shared task for the extraction of causal network information using the Biological Expression Language
2016
Database: The Journal of Biological Databases and Curation
Automatic extraction of biological network information is one of the most desired and most complex tasks in biological and medical text mining. Track 4 at BioCreative V attempts to approach this complexity using fragments of large-scale manually curated biological networks, represented in Biological Expression Language (BEL), as training and test data. BEL is an advanced knowledge representation format which has been designed to be both human readable and machine processable. The specific goal of track 4 was to evaluate text mining systems capable of automatically constructing BEL statements from given evidence text, and of retrieving evidence text for given BEL statements. Given the complexity of the task, we designed an evaluation methodology which gives credit to partially correct statements. We identified various levels of information expressed by BEL statements, such as entities, functions, relations, and introduced an evaluation framework which rewards systems capable of delivering useful BEL fragments at each of these levels. The aim of this evaluation method is to help identify the characteristics of the systems which, if combined, would be most useful for achieving the overall goal of automatically constructing causal biological networks from text.
doi:10.1093/database/baw067
pmid:27402677
pmcid:PMC4940434
fatcat:nra6lgn7fndppg6o2nzi5lovz4
UZH in BioNLP 2013
2013
We describe a biological event detection method implemented for the Genia Event Extraction task of BioNLP 2013. The method relies on syntactic dependency relations provided by a general NLP pipeline, supported by statistics derived from Maximum Entropy models for candidate trigger words, for potential arguments, and for argument frames.
doi:10.5167/uzh-91884
fatcat:idyyvqaumza2fg2ezlzzonzcnm
OntoGene: CTD entity and action term recognition
2013
Biomedical Text Mining for Etiological Factor Identification in Mental Health Publications
2021
The first version of the resulting corpus has been presented at the 10th Language Resources and Evaluation Conference (LREC) (Ellendorff et al., 2016). ...
doi:10.5167/uzh-205675
fatcat:o52fsxkqj5ejbjkjvb4e47i7qu
Ontogene Term and Relation Recognition for CDR
2015
For our participation in the CDR task of BioCreative 5, we have adapted the Ontogene System and optimized it for disease recognition (DNER Task) and identification of chemical-disease relationships (CID Task). For the DNER Task we have experimented with different changes to the term matching system. We describe the effects of an abbreviation detection tool as well as a selection of rules for term normalization.
doi:10.5167/uzh-116468
fatcat:vcpijkx6v5hadhyfsa7jtigbae
ODIN: a customizable literature curation tool
2013
Using Large Biomedical Databases as Gold Annotations for Automatic Relation Extraction
2014
We show how to use large biomedical databases in order to obtain a gold standard for training a machine learning system over a corpus of biomedical text. As an example we use the Comparative Toxicogenomics Database (CTD) and describe by means of a short case study how the obtained data can be applied. We explain how we exploit the structure of the database for compiling training material and a test set. Using a Naive Bayes document classification approach based on words, stem bigrams and MeSH descriptors, we achieve a macro-average F-score of 61% on a subset of 8 action terms. This outperforms a baseline system based on a lookup of stemmed keywords by more than 20%. Furthermore, we present directions for future work, taking the described system as a vantage point. Future work will aim towards a weakly supervised system capable of discovering complete biomedical interactions and events.
doi:10.5167/uzh-104487
fatcat:5iuhddwvxzd47cyicnyhlhstyq
A Combined Resource of Biomedical Terminology and its Statistics
2015
In this paper, we present a large biomedical term resource automatically compiled from the terminology of a selection of biomedical databases. The resource has a very simple and intuitive format and therefore can be easily embedded into a system for biomedical text mining and used as a linguistic resource. It is continuously updated and a user interface makes it possible to compile a new term resource according to individual requirements by selecting specific databases to be included. We ... statistics for each included biomedical entity type separately as well as in the context of the combined terminology.
doi:10.5167/uzh-114510
fatcat:47eoo4xy5ffqdlxkcb75cazyri
The PsyMine Corpus - A Corpus annotated with Psychiatric Disorders and their Etiological Factors
2016
We present the first version of a corpus annotated for psychiatric disorders and their etiological factors. The paper describes the choice of text, annotated entities and events/relations as well as the annotation scheme and procedure applied. The corpus features a selection of focus psychiatric disorders including depressive disorder, anxiety disorder, obsessive-compulsive disorder, phobic disorders and panic disorder. Etiological factors for these focus disorders are widespread and include genetic, physiological, sociological and environmental factors among others. Etiological events, including annotated evidence text, represent the interactions between the focus disorders and their etiological factors. In addition to these core events, symptomatic and treatment events have been annotated. The current version of the corpus includes 175 scientific abstracts. All entities and events/relations have been manually annotated by domain experts, and scores of inter-annotator agreement are presented. The aim of the corpus is to provide a first gold standard to support the development of biomedical text mining applications for the specific area of mental disorders, which are among the main contributors to the contemporary burden of disease.
doi:10.5167/uzh-127488
fatcat:avdkj5kganauzamhedsnsvldam
Using a Hybrid Approach for Entity Recognition in the Biomedical Domain
2016
For term matching, we compiled a dictionary resource using the Bio Term Hub (Ellendorff et al., 2015). ...
doi:10.5167/uzh-125712
fatcat:ak75ku4m6bdkngwm7hapjw5dtq
Track 4 Overview: Extraction of Causal Network Information in Biological Expression Language (BEL)
2015
Automatic extraction of biological network information is one of the most desired and most complex tasks in biological text mining. The BioCreative track 4 provides training data and an evaluation environment for the extraction of causal relationships in Biological Expression Language (BEL). BEL is a modeling language that is easily editable by humans or by automatic systems and can express causal relationships of different levels of granularity. Protein-protein relations can be expressed in BEL ... as well as relations between biological processes and disease stages. To extract BEL information automatically, named entity recognition and normalization to defined name spaces are necessary. Furthermore, relations extracted from text have to be transformed into correct BEL syntax. The track provided training and evaluation for two complementary tasks: given a sentence, extract all BEL statements; and given a BEL statement, propose up to 10 evidence sentences from the literature.
doi:10.5167/uzh-116469
fatcat:7g7qxe4idbei3fca3ek3cafpoa
Approaching SMM4H with Merged Models and Multi-task Learning
2019
Proceedings of the Fourth Social Media Mining for Health Applications (#SMM4H) Workshop & Shared Task
unpublished
For the last run (MTL+BERT), we combined predictions from all 20 BERT systems with the first system and a second MTL configuration which uses different word embeddings (Ellendorff et al., 2018) and omits ...
doi:10.18653/v1/w19-3208
fatcat:5hq64jomrjcphaf4sgzirrdlum
UZH@SMM4H: System Descriptions
2018
Proceedings of the 2018 EMNLP Workshop SMM4H: The 3rd Social Media Mining for Health Applications Workshop & Shared Task
unpublished
Our team at the University of Zurich participated in the first 3 of the 4 sub-tasks at the Social Media Mining for Health Applications (SMM4H) shared task. We experimented with different approaches for text classification, namely traditional feature-based classifiers (Logistic Regression and Support Vector Machines), shallow neural networks, RCNNs, and CNNs. This system description paper provides details regarding the different system architectures and the achieved results.
doi:10.18653/v1/w18-5916
fatcat:n6cg5wot6bdahozhibbqyirsei
BioCreative V track 4: a shared task for the extraction of causal network information using the Biological Expression Language
2016
Automatic extraction of biological network information is one of the most desired and most complex tasks in biological and medical text mining. Track 4 at BioCreative V attempts to approach this complexity using fragments of large-scale manually curated biological networks, represented in Biological Expression Language (BEL), as training and test data. BEL is an advanced knowledge representation format which has been designed to be both human readable and machine processable. The specific goal of track 4 was to evaluate text mining systems capable of automatically constructing BEL statements from given evidence text, and of retrieving evidence text for given BEL statements. Given the complexity of the task, we designed an evaluation methodology which gives credit to partially correct statements. We identified various levels of information expressed by BEL statements, such as entities, functions, relations, and introduced an evaluation framework which rewards systems capable of delivering useful BEL fragments at each of these levels. The aim of this evaluation method is to help identify the characteristics of the systems which, if combined, would be most useful for achieving the overall goal of automatically constructing causal biological networks from text.
doi:10.5167/uzh-129620
fatcat:at4ypkjqbrgq5iehck3r7rhuqq