SciLite: a platform for displaying text-mined annotations as a means to link research articles with biological data

Aravind Venkatesan, Jee-Hyub Kim, Francesco Talo, Michele Ide-Smith, Julien Gobeill, Jacob Carter, Riza Batista-Navarro, Sophia Ananiadou, Patrick Ruch, Johanna McEntyre
2017 Wellcome Open Research  
The tremendous growth in biological data has resulted in an increase in the number of research papers being published. This presents a great challenge for scientists in searching and assimilating the facts described in those papers. In particular, biological databases depend on curators to add highly precise and useful information, which is usually extracted by reading research articles. Therefore, there is an urgent need to find ways to improve the linking of literature to the underlying data, thereby ... ising the effort in browsing content and identifying key biological concepts. As part of the development of Europe PMC, we have developed a new platform, SciLite, which integrates text-mined annotations from different sources and overlays those outputs on research articles. The aim is to aid researchers and curators using Europe PMC in finding key concepts more easily and to provide links to related resources or tools, bridging the gap between literature and biological data.
doi:10.12688/wellcomeopenres.10210.2 pmid:28948232 pmcid:PMC5527546 fatcat:ytjpdiofp5g7tnaobq4todlnwu