GeneE: Gene and protein query expansion with disambiguation

M. J. Schuemie, N. Kang, M. L. Hekkelman, J. A. Kors
2009 Bioinformatics  
In order to accurately retrieve as many relevant documents as possible, we have developed GeneE, a web application that expands a gene query to include all known synonyms, and adds disambiguation information  ...  The query expansion algorithm is also available as a web service. Availability: http://biosemantics.org/geneE Contact: m.schuemie@erasmusmc.nl  ...  To our knowledge, GeneE is the first publicly available gene query expansion tool, and the first to use disambiguation information extracted from the dictionary itself.  ... 
doi:10.1093/bioinformatics/btp597 pmid:19837720 fatcat:nsekyujdtffplmn5n2jcpugoom

Finding Gene Function using LitMiner

Berry de Bruijn, Joel D. Martin
2003 Text Retrieval Conference  
They returned an average of 196 documents per query across the 50 queries, with a median value of only 100 documents.  ...  For the second submission, reranking was done based on the outcome of an information extraction module, tuned towards the task of identifying gene function papers.  ...  ), for their input during our joint meetings and on-line.  ... 
dblp:conf/trec/BruijnM03 fatcat:v4mkf6axcnb6ldwaiavgu4xzki

pGenN, a Gene Normalization Tool for Plant Genes and Proteins in Scientific Literature

Ruoyao Ding, Cecilia N. Arighi, Jung-Youn Lee, Cathy H. Wu, K. Vijay-Shanker, Willy John Wilbur
2015 PLoS ONE  
Automatically detecting gene/protein names in the literature and connecting them to databases records, also known as gene normalization, provides a means to structure the information buried in free-text  ...  The gene normalization results are stored in a local database for direct query from the pGenN web interface (proteininformationresource.org/pgenn/).  ...  Ross for the editorial assistance for the manuscript, and Mengxi Lv for participating in the evaluation of pGenN. Author Contributions  ... 
doi:10.1371/journal.pone.0135305 pmid:26258475 pmcid:PMC4530884 fatcat:6gphkqhqnzfgbkzbzshp2ge4ha

Integrating Various Resources for Gene Name Normalization

Yuncui Hu, Yanpeng Li, Hongfei Lin, Zhihao Yang, Liangxi Cheng, Darren R. Flower
2012 PLoS ONE  
The system consists of four major components: gene name recognition, entity mapping, disambiguation and filtering.  ...  For the gene names that map to more than one database identifiers, we develop a disambiguation method based on semantic similarity derived from the Gene Ontology and MEDLINE abstracts.  ...  expansion of matching also brings noise and has a detrimental effect upon precision.  ... 
doi:10.1371/journal.pone.0043558 pmid:22984434 pmcid:PMC3440407 fatcat:uwpdne7gxrbpbacqxms6gg57h4

Gene and protein nomenclature in public databases

Katrin Fundel, Ralf Zimmer
2006 BMC Bioinformatics  
Various organism-specific or general public databases aim at organizing knowledge about genes and proteins. These databases can be used for deriving gene and protein name dictionaries.  ...  Frequently, several alternative names are in use for biological objects such as genes and proteins.  ...  This work was funded by projects BEX (Sanofi-Aventis, Frankfurt) and BOA (German ministry for research and education, grant 01GG9824).  ... 
doi:10.1186/1471-2105-7-372 pmid:16899134 pmcid:PMC1560172 fatcat:n56wp6k6ajdspchk2toalqbsyq

Retrieval with gene queries

Aditya K Sehgal, Padmini Srinivasan
2006 BMC Bioinformatics  
For most genes the best ranking query is one that is built from the LocusLink (now Entrez Gene) summary and product information along with the gene names and aliases.  ...  For others, the gene names and aliases suffice. We also present an approach that successfully predicts, for a given gene, which of these two ranking queries is more appropriate.  ...  Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.  ... 
doi:10.1186/1471-2105-7-220 pmid:16630348 pmcid:PMC1482725 fatcat:4uwz3ly3gnhcdlel6b5wdwpq3y

Ambiguity of human gene symbols in LocusLink and MEDLINE: creating an inventory and a disambiguation test collection

Marc Weeber, Bob J Schijvenaars, Erik M Van Mulligen, Barend Mons, Rob Jelier, Christian C Van Der Eijk, Jan A Kors
2003 AMIA Annual Symposium Proceedings  
Human genes are often named with a gene symbol and a longer, more descriptive term; the short form is very often an abbreviation of the long form.  ...  ., one gene symbol often refers to more than one gene. Using an existing abbreviation expansion algorithm, we explore MEDLINE for the use of human gene symbols derived from LocusLink.  ...  However, users will most likely not query a database with a UID; instead, they will search with a gene name or with other natural language terms indicating, for instance, a gene's function.  ... 
pmid:14728264 pmcid:PMC1480234 fatcat:arcdr64acna63nwqgbmdzfw2ui

eGIFT: Mining Gene Information from the Literature

Catalina O Tudor, Carl J Schmidt, K Vijay-Shanker
2010 BMC Bioinformatics  
Since many of the gene names can be ambiguous, eGIFT applies a disambiguation step to remove matches that do not correspond to this gene.  ...  Results: In this paper, we present eGIFT (http://biotm.cis.udel.edu/eGIFT), a web-based tool that associates informative terms, called iTerms, and sentences containing them, with genes.  ...  and for numerous discussions and insightful suggestions on this project.  ... 
doi:10.1186/1471-2105-11-418 pmid:20696046 pmcid:PMC2929241 fatcat:6zojf5xxk5eqzpov3fw53l35zm

Exploring Extensions to Machine-learning based Gene Normalisation

Benjamin Goudey, Nicola Stokes, David Martínez
2007 Australasian Language Technology Association Workshop  
To address this issue, we experiment with a variety of synonym list expansion and filtering methods including: • Lexical Variations - the creation of gene name variations with different hyphenation and  ...  Introduction One of the foundational text-mining tasks in the biomedical domain is the identification of genes and protein names in journal papers.  ... 
dblp:conf/acl-alta/GoudeySM07 fatcat:jadqw7c3mncuxcrxhokkkzp5ky

A Multistage Gene Normalization System Integrating Multiple Effective Methods

Lishuang Li, Shanshan Liu, Lihua Li, Wenting Fan, Degen Huang, Huiwei Zhou, Lars Kaderali
2013 PLoS ONE  
Gene/protein recognition and normalization is an important preliminary step for many biological text mining tasks.  ...  In the stage of dictionary matching, the exact matching and approximate matching between gene names and the EntrezGene lexicon have been combined.  ...  with PubMed ID and some gene names can be disambiguated with it.  ... 
doi:10.1371/journal.pone.0081956 pmid:24349160 pmcid:PMC3861319 fatcat:dkfblnbuufaq5kz5kxmryy4eoe

Overview of BioCreative II gene normalization

Alexander A Morgan, Zhiyong Lu, Xinglong Wang, Aaron M Cohen, Juliane Fluck, Patrick Ruch, Anna Divoli, Katrin Fundel, Robert Leaman, Jörg Hakenberg, Chengjie Sun, Heng-hui Liu (+8 others)
2008 Genome Biology  
We selected abstracts associated with articles previously curated for human genes.  ...  We provided 281 expert-annotated abstracts containing 684 gene identifiers for training, and a blind test set of 262 documents containing 785 identifiers, with a gold standard created by expert annotators  ...  ; and we required that every human gene or protein mentioned in the abstract be associated with an Entrez Gene identifier (and only human genes or proteins).  ... 
doi:10.1186/gb-2008-9-s2-s3 pmid:18834494 pmcid:PMC2559987 fatcat:qxd4defmunckvn7pheotofs4aq

Automated curation of gene name normalization results using the Konstanz information miner

Matthias Zwick
2015 Journal of Biomedical Informatics  
Background: Gene name recognition and normalization is, together with detection of other named entities, a crucial step in biomedical text mining and the underlying basis for development of more advanced  ...  along with publicly available sources.  ...  Katrin Fundel-Clemens for compiling parts of the blacklist of expanded abbreviations and Dr. Thorsten Schweikardt for very valuable discussions during workflow development.  ... 
doi:10.1016/j.jbi.2014.08.016 pmid:25218035 fatcat:ppdaiouahzayfhillvho4c3sue

A literature based method for identifying gene-disease connections

Lada A Adamic, Dennis Wilkinson, Bernardo A Huberman, Eytan Adar
2002 Proceedings. IEEE Computer Society Bioinformatics Conference  
It offers a comprehensive way to treat alias symbols, a statistical method for computing the relevance of the gene to the query, and a novel way to disambiguate gene symbols from other abbreviations.  ...  We present a statistical method that can swiftly identify, from the literature, sets of genes known to be associated with given diseases.  ...  It offers a comprehensive way to treat alias symbols, a statistical method for computing the relevance of a gene or a group of genes to the query, and a novel way to disambiguate gene symbols from other  ... 
pmid:15838128 fatcat:cly4giqxlnedbdbzhhjas54fai

A method of precise mRNA/DNA homology-based gene structure prediction

Alexander Churbanov, Mark Pauley, Daniel Quest, Hesham Ali
2005 BMC Bioinformatics  
, Galahad and BLAT, including when genes contained micro-exons and non-canonical splice sites.  ...  Accurate and automatic gene finding and structural prediction is a common problem in bioinformatics, and applications need to be capable of handling non-canonical splice sites, micro-exons and partial  ...  Acknowledgements We would like to thank members of the Bioinformatics Group at the University of Nebraska at Omaha who provided useful feedback on our progress and program.  ... 
doi:10.1186/1471-2105-6-261 pmid:16242044 pmcid:PMC1274302 fatcat:scumgpawfjfzvifrkruiwnw4kq

Multi-stage gene normalization for full-text articles with context-based species filtering for dynamic dictionary entry selection

Richard Tsai, Po-Ting Lai
2011 BMC Bioinformatics  
Gene normalization (GN) is the task of identifying the unique database IDs of genes and proteins in literature.  ...  BioCreative III GN uses Threshold Average Precision at a median of k errors per query (TAP-k), a new measure closely related to the well-known average precision, but also reflecting the reliability of  ...  We especially thank the BioCreative organizers and BMC reviewers for their valuable comments, which helped us improve the quality of the paper. This article has been published as part of BMC Bioinformatics  ... 
doi:10.1186/1471-2105-12-s8-s7 pmid:22151087 pmcid:PMC3269942 fatcat:dnxamnfmwjdjzbqshi4gixscpy