Automating curation using a natural language processing pipeline

Beatrice Alex, Claire Grover, Barry Haddow, Mijail Kabadjov, Ewan Klein, Michael Matthews, Richard Tobin, Xinglong Wang
2008 Genome Biology  
as detection and normalization of interacting protein pairs, are still challenging for NLP systems.  ...  The tasks in BioCreative II were designed to approximate some of the laborious work involved in curating biomedical research papers.  ...  Acknowledgements: The TXM pipeline on which this system is based was developed as part of a joint project with Cognia EU [33], supported by the Text Mining Programme of ITI Life Sciences Scotland [34  ... 
doi:10.1186/gb-2008-9-s2-s10 pmid:18834488 pmcid:PMC2559981 fatcat:asmlwci6kze55j3m64gpicdzui

AutoML to Date and Beyond: Challenges and Opportunities [article]

Shubhra Kanti Karmaker Santu, Md. Mahadi Hassan, Micah J. Smith, Lei Xu, ChengXiang Zhai, Kalyan Veeramachaneni
2021 arXiv   pre-print
We begin by describing what an end-to-end machine learning pipeline actually looks like, and which subtasks of the machine learning pipeline have been automated so far.  ...  Finally, we lay out a roadmap for the future, pinpointing the research required to further automate the end-to-end machine learning pipeline and discussing important challenges that stand in the way of  ...  Models and software developed so far have enabled or made significant progress towards the automation of data visualization, cleaning and curation (DCC), machine learning (ML), feature engineering (FE)  ... 
arXiv:2010.10777v4 fatcat:arixmky6erdvhnmboe2sfgbb7a

Automating Generative Deep Learning for Artistic Purposes: Challenges and Opportunities [article]

Sebastian Berns, Terence Broad, Christian Guckelsberger, Simon Colton
2021 arXiv   pre-print
For the definition of targets, we adopt core concepts from automated machine learning and an analysis of generative deep learning pipelines, both in standard and artistic settings.  ...  We understand automation as the challenge of granting a generative system more creative autonomy, by framing the interaction between the user and the system as a co-creative process.  ...  Christian Guckelsberger is supported by the Academy of Finland Flagship programme "Finnish Center for Artificial Intelligence" (FCAI).  ... 
arXiv:2107.01858v1 fatcat:53sdqqb35ndyjinejj657xqe5q

AI Song Contest: Human-AI Co-Creation in Songwriting [article]

Cheng-Zhi Anna Huang, Hendrik Vincent Koops, Ed Newton-Rex, Monica Dinculescu, Carrie J. Cai
2020 arXiv   pre-print
Ultimately, teams not only had to manage the "flare and focus" aspects of the creative process, but also juggle them with a parallel process of exploring and curating multiple ML models and outputs.  ...  As ML models are not easily steerable, teams also generated massive numbers of samples and curated them post-hoc, or used a range of strategies to direct the generation, or algorithmically ranked the samples  ...  We also thank Tim Cooijmans for creating early versions of the figures and Michael Terry for feedback on this manuscript.  ... 
arXiv:2010.05388v1 fatcat:tfbs6zhgrzbfnnajg6y6x2nnyi

AI song contest: Human-AI co-creation in songwriting

Cheng-Zhi Anna Huang, Hendrik Vincent Koops, Ed Newton-Rex, Monica Dinculescu, Carrie Cai
2020 Zenodo  
Ultimately, teams not only had to manage the "flare and focus" aspects of the creative process, but also juggle that with a parallel process of exploring and curating multiple ML models and outputs.  ...  As ML models are not easily steerable, teams also generated massive numbers of samples and curated them post-hoc, or used a range of strategies to direct the generation or algorithmically ranked the samples  ...  Generate then curate: A common approach was to generate a large quantity of musical samples, followed by automatically or manually curating them post-hoc.  ... 
doi:10.5281/zenodo.4245530 fatcat:dftkglgf4jgxfffuedyosmi55u

Threshy: Supporting Safe Usage of Intelligent Web Services [article]

Alex Cummaudo, Scott Barnett, Rajesh Vasa, John Grundy
2020 arXiv   pre-print
Threshy is designed for tuning the confidence scores returned by intelligent web services and does not deal with hyper-parameter optimisation used in ML models.  ...  Increased popularity of 'intelligent' web services provides end-users with machine-learnt functionality at little effort to developers.  ...  Threshy provides a visually interactive interface for developers to fine-tune thresholds and explore trade-offs of prediction hits/misses.  ... 
arXiv:2008.08252v1 fatcat:plw6vy3cyvespbjxrdbiw7jogm

The Treasury Chest of Text Mining: Piling Available Resources for Powerful Biomedical Text Mining

Nícia Rosário-Ferreira, Catarina Marques-Pereira, Manuel Pires, Daniel Ramalhão, Nádia Pereira, Victor Guimarães, Vítor Santos Costa, Irina S. Moreira
2021 BioChem  
Text mining (TM) is a semi-automatized, multi-step process, able to turn unstructured into structured data.  ...  TM relevance has increased upon machine learning (ML) and deep learning (DL) algorithms' application in its various steps.  ...  Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/biochem1020007 fatcat:qve6xgoxuvbwzpqz6q45s7wlly


Sofie Van Landeghem, Bernard De Baets, Yves Van de Peer, Yvan Saeys
2011 Computational Intelligence  
text mining tools for real-life tasks such as hypothesis generation, database curation and knowledge discovery.  ...  We have developed a machine learning framework to accurately extract complex genetic interactions from text.  ...  In the previous sections, we have described extensions and fine-tuning of our ML system.  ... 
doi:10.1111/j.1467-8640.2011.00403.x fatcat:wmvblcmuvbaznjivph2by27ebm

Hyperparameter Tuning and Pipeline Optimization via Grid Search Method and Tree-Based AutoML in Breast Cancer Prediction

Siti Fairuz Mat Radzi, Muhammad Khalis Abdul Karim, M Iqbal Saripan, Mohd Amiruddin Abdul Rahman, Iza Nurzawani Che Isa, Mohammad Johari Ibahim
2021 Journal of Personalized Medicine  
Breast cancer data were used in a comparative analysis of the TPOT-generated ML pipelines with the selected ML classifiers, optimized by a grid search approach.  ...  Features of radiomics have been presented as a guide for the ML pipeline selection from the breast cancer data set based on TPOT.  ...  Conflicts of Interest: The authors declare no competing interests.  ... 
doi:10.3390/jpm11100978 pmid:34683118 fatcat:kqqqjqggmrd2ne6nij5cpo53ky

Classifying protein-protein interaction articles using word and syntactic features

Sun Kim, W Wilbur
2011 BMC Bioinformatics  
Identifying protein-protein interactions (PPIs) from literature is an important step in mining the function of individual proteins as well as their biological network.  ...  The proposed method automatically identifies gene names by a Priority Model, then extracts grammar relations using a dependency parser.  ...  However, several constraints such as the problems of manual curation of a database, the rapid growth of the biomedical literature, and of newly discovered proteins, make it difficult for database curators  ... 
doi:10.1186/1471-2105-12-s8-s9 pmid:22151252 pmcid:PMC3269944 fatcat:4gjqfbaz7nfejey7wyscvr7w44

End-To-End Computer Vision Framework: An Open-Source Platform for Research and Education

Ciprian Orhei, Silviu Vert, Muguras Mocofan, Radu Vasiu
2021 Sensors  
Even if the main focus of the framework is on the Computer Vision processing pipeline, the framework offers solutions to incorporate even more complex activities, such as training Machine Learning models  ...  To better serve this purpose, research on the architecture and design of such systems is also important.  ...  Conflicts of Interest: The authors declare no conflicts of interest.  ... 
doi:10.3390/s21113691 pmid:34073282 fatcat:mgtmjuisunhfrkohnsnonkkhuu

RED-ML: a novel, effective RNA editing detection method based on machine learning

Heng Xiong, Dongbing Liu, Qiye Li, Mengyue Lei, Liqin Xu, Liang Wu, Zongji Wang, Shancheng Ren, Wangsheng Li, Min Xia, Lihua Lu, Haorong Lu (+9 others)
2017 GigaScience  
With the availability of RED-ML, it  ...  novel RNA editing sites without relying on curated RNA editing databases.  ...  We are indebted to colleagues of Beijing Genomics Institute who contributed to this work but were not included in the author list. We also thank Dr. Laurie Goodman and Dr.  ... 
doi:10.1093/gigascience/gix012 pmid:28328004 pmcid:PMC5467039 fatcat:nzwwqj2xe5b2nmh6b6k2gzfjwa

Putting Data Science Pipelines on the Edge [article]

Ali Akoglu, Genoveva Vargas-Solar
2021 arXiv   pre-print
Thereby, pipelines utilize a set of flexible building blocks that can be dynamically and automatically assembled and re-assembled to meet the dynamic changes in the workload's SLOs.  ...  This paper proposes a composable "Just in Time Architecture" for Data Science (DS) Pipelines named JITA-4DS and associated resource management techniques for configuring disaggregated data centers (DCs  ...  In contrast, data labs' objective is to provide tools for managing and curating data collections, automatically generating qualitative and quantitative meta-data.  ... 
arXiv:2103.07978v1 fatcat:pmcnfjtypzgunowv2bcebgwaga

S2CE: A Hybrid Cloud and Edge Orchestrator for Mining Exascale Distributed Streams [article]

Nicolas Kourtellis, Herodotos Herodotou, Maciej Grzenda, Piotr Wawrzyniak, Albert Bifet
2020 arXiv   pre-print
The explosive increase in volume, velocity, variety, and veracity of data generated by distributed and heterogeneous nodes such as IoT and other devices, continuously challenge the state of art in big  ...  Consequently, it reveals an urgent need to address the ever-growing gap between this expected exascale data generation and the extraction of insights from these data.  ...  Others focus on automatically optimizing and tuning workloads using various techniques such as cost-based (e.g., [43, 50, 87]) or ML-based (e.g., [16, 89, 90]).  ... 
arXiv:2007.01260v1 fatcat:hfavtqtpmnd2xo5uh7tzcomm4u

Data-driven materials research enabled by natural language processing and information extraction

Elsa A. Olivetti, Jacqueline M. Cole, Edward Kim, Olga Kononova, Gerbrand Ceder, Thomas Yong-Jin Han, Anna M. Hiszpanski
2020 Applied Physics Reviews  
The next level of depth within the materials examples that have leveraged NLP are those that perform some degree of ML on the data toward the pursuit of fundamental insights.  ...  fine-tuned on a variety of specific materials questions for which there are fewer data.  ... 
doi:10.1063/5.0021106 fatcat:75aap3lkjvhprleptl3bbp6w64