Lisen&Curate: A platform to facilitate knowledge tools for curation of regulation of transcription initiation in bacteria [article]

Carlos-Francisco Méndez-Cruz, Martín Díaz-Rodríguez, Francisco Guadarrama-García, Oscar William Lithgow-Serrano, Socorro Gama-Castro, Hilda Solano-Lira, Fabio Rinaldi, Julio Collado-Vides
2020 bioRxiv pre-print
The number of papers published in biomedical research makes it nearly impossible for a researcher to keep up to date. This is where machine processing of scientific publications could help facilitate access to knowledge. How to make use of text mining capabilities and still preserve the high quality of manual curation is the challenge we focused on. Here we present the Lisen&Curate system, designed to enable current and future NLP capabilities within a curation environment used to curate literature on the regulation of transcription initiation in bacteria. The current version extracts regulatory interactions together with the corresponding sentences for curators to confirm or reject, accelerating their curation. It also uses an embedded sentence-similarity metric, offering the curator an alternative mechanism for navigating through semantically similar sentences within a given paper as well as across papers of a pre-defined corpus of publications pertinent to the task. We show results of using the system to curate literature on E. coli as well as literature on Salmonella. A major advantage of the system is that it saves, as part of the curation work, the precise link between every curated piece of knowledge and the specific sentence(s) in the curated publication that support it. We discuss future directions for this type of curation infrastructure.
doi:10.1101/2020.04.28.065243 fatcat:7bzv27eievhujmnjnai57unt5u