
An Empirical Study of Multi-Task Learning on BERT for Biomedical Text Mining [article]

Yifan Peng, Qingyu Chen, Zhiyong Lu
2020 arXiv   pre-print
In this work, we study a multi-task learning model with multiple decoders on a variety of biomedical and clinical natural language processing tasks such as text similarity, relation extraction, named entity  ...  Our empirical results demonstrate that the MTL fine-tuned models outperform state-of-the-art transformer models (e.g., BERT and its variants) by 2.0% and 1.3% in the biomedical and clinical domains, respectively  ...  This work was also supported by the National Library of Medicine of the National Institutes of Health under award number K99LM013001.  ... 
arXiv:2005.02799v1 fatcat:nr7e2axopnbvva6kocz6wkojfm
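
The shared-encoder, multiple-decoder design the abstract describes can be sketched in a few lines of PyTorch. This is a minimal illustration assuming the Hugging Face transformers package; the task names and label counts are invented for the example and are not taken from the paper:

    import torch.nn as nn
    from transformers import AutoModel

    class MultiTaskBert(nn.Module):
        """Shared BERT encoder with one lightweight decoder head per task."""
        def __init__(self, encoder_name, task_labels):
            super().__init__()
            self.encoder = AutoModel.from_pretrained(encoder_name)
            hidden = self.encoder.config.hidden_size
            self.heads = nn.ModuleDict(
                {task: nn.Linear(hidden, n) for task, n in task_labels.items()}
            )

        def forward(self, task, input_ids, attention_mask):
            out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
            cls = out.last_hidden_state[:, 0]  # [CLS] vector
            return self.heads[task](cls)       # task-specific logits

    # Hypothetical task set; a sequence-labeling task like NER would instead
    # attach its head to the token-level hidden states.
    model = MultiTaskBert("bert-base-uncased", {"similarity": 1, "relation": 5})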

An Empirical Study of Multi-Task Learning on BERT for Biomedical Text Mining

Yifan Peng, Qingyu Chen, Zhiyong Lu
2020 Proceedings of the 19th SIGBioMed Workshop on Biomedical Language Processing   unpublished
In this work, we study a multi-task learning model with multiple decoders on a variety of biomedical and clinical natural language processing tasks such as text similarity, relation extraction, named entity  ...  Our empirical results demonstrate that the MTL fine-tuned models outperform state-of-the-art transformer models (e.g., BERT and its variants) by 2.0% and 1.3% in the biomedical and clinical domains, respectively  ...  This work was also supported by the National Library of Medicine of the National Institutes of Health under award number K99LM013001.  ... 
doi:10.18653/v1/2020.bionlp-1.22 fatcat:gvwttldrhvfdhiz3ph55j3upu4

Pre-trained Language Models in Biomedical Domain: A Systematic Survey [article]

Benyou Wang, Qianqian Xie, Jiahuan Pei, Prayag Tiwari, Zhao Li, Jie Fu
2021 arXiv   pre-print
health records, protein, and DNA sequences for various biomedical tasks.  ...  This also benefits the biomedical domain: researchers from informatics, medicine, and computer science (CS) communities propose various PLMs trained on biomedical datasets, e.g., biomedical text, electronic  ...  For advancing multi-document summarization of biomedical literature, DeYoung et al. [46] released a novel dataset of multi-document summarization on medical studies called MS^2, which contains over 470k  ... 
arXiv:2110.05006v2 fatcat:aykwfhgi4jgmfovissgdvknny4

Identification of Semantically Similar Sentences in Clinical Notes: Iterative Intermediate Training using Multi-Task Learning (Preprint)

Diwakar Mahajan, Ananya Poddar, Jennifer J Liang, Yen-Ting Lin, John M Prager, Parthasarathy Suryanarayanan, Preethi Raghavan, Ching-Huei Tsou
2020 JMIR Medical Informatics  
We incrementally ensembled the output from applying IIT-MTL on ClinicalBERT with the output of other language models (bidirectional encoder representations from transformers for biomedical text mining  ...  We developed an iterative intermediate training approach using multi-task learning (IIT-MTL), a multi-task training approach that employs iterative data set selection.  ...  Acknowledgments The authors wish to thank Dr Bharath Dandala and Venkata Joopudi for providing valuable feedback on the manuscript.  ... 
doi:10.2196/22508 pmid:33245284 fatcat:vmlnogpwgjhi5psw37dmbhfpge
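
The incremental ensembling step lends itself to a tiny sketch. The simple averaging rule and the scores below are hypothetical stand-ins for illustration, not the paper's exact scheme:

    def ensemble(per_model_scores):
        # Average sentence-pair similarity scores across models.
        return [sum(s) / len(s) for s in zip(*per_model_scores)]

    iit_mtl_scores = [4.2, 1.1, 3.8]   # hypothetical IIT-MTL outputs
    biobert_scores = [4.0, 1.5, 3.5]   # hypothetical BioBERT outputs
    print(ensemble([iit_mtl_scores, biobert_scores]))  # roughly [4.1, 1.3, 3.65]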

AMMU : A Survey of Transformer-based Biomedical Pretrained Language Models [article]

Katikapalli Subramanyam Kalyan, Ajit Rajasekharan, Sivanesan Sangeetha
2021 arXiv   pre-print
We discuss core concepts of transformer-based PLMs like pretraining methods, pretraining tasks, fine-tuning methods, and various embedding types specific to the biomedical domain.  ...  We strongly believe there is a need for a survey paper that can provide a comprehensive survey of various transformer-based biomedical pretrained language models (BPLMs).  ...  [184] proposed a novel multi-task model based on BioBERT for biomedical question answering.  ... 
arXiv:2105.00827v2 fatcat:yzsr4tg7lrexzinrn5psw5r5q4

LinkBERT: Pretraining Language Models with Document Links [article]

Michihiro Yasunaga, Jure Leskovec, Percy Liang
2022 arXiv   pre-print
LinkBERT is especially effective for multi-hop reasoning and few-shot QA (+5% absolute improvement on HotpotQA and TriviaQA), and our biomedical LinkBERT sets new states of the art on various BioNLP tasks  ...  Language model (LM) pretraining can learn various knowledge from text corpora, helping downstream tasks.  ...  Acknowledgment We thank Siddharth Karamcheti, members of the Stanford P-Lambda, SNAP and NLP groups, as well as our anonymous reviewers for valuable feedback.  ... 
arXiv:2203.15827v1 fatcat:xo6alwunz5chvcrdpeahhs4oaa
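
A rough sketch of the pretraining-data idea behind LinkBERT, as the title suggests: a text segment is paired with a contiguous, random, or hyperlinked segment, and the pair is labeled so the model can learn the document relation alongside masked language modeling. The corpus and link structures here are hypothetical illustrations, not the released pipeline:

    import random

    def make_pair(doc_id, seg_idx, corpus, links):
        # corpus: {doc_id: [segment, ...]}, links: {doc_id: [linked_doc_id, ...]}
        anchor = corpus[doc_id][seg_idx]
        relation = random.choice(["contiguous", "random", "linked"])
        if relation == "contiguous" and seg_idx + 1 < len(corpus[doc_id]):
            other = corpus[doc_id][seg_idx + 1]
        elif relation == "linked" and links.get(doc_id):
            other = random.choice(corpus[random.choice(links[doc_id])])
        else:
            relation = "random"
            other = random.choice(corpus[random.choice(list(corpus))])
        # The (anchor, other, relation) triple feeds MLM plus a
        # document-relation-prediction objective.
        return anchor, other, relation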

Fast, Effective, and Self-Supervised: Transforming Masked Language Models into Universal Lexical and Sentence Encoders [article]

Fangyu Liu, Ivan Vulić, Anna Korhonen, Nigel Collier
2021 arXiv   pre-print
We propose an extremely simple, fast and effective contrastive learning technique, termed Mirror-BERT, which converts MLMs (e.g., BERT and RoBERTa) into such encoders in 20-30 seconds without any additional  ...  Notably, in the standard sentence semantic similarity (STS) tasks, our self-supervised Mirror-BERT model even matches the performance of the task-tuned Sentence-BERT models from prior work.  ...  Acknowledgements We thank the reviewers and the AC for their considerate comments. We also thank the LTL members and Xun Wang for insightful feedback.  ... 
arXiv:2104.08027v2 fatcat:aoeddhiep5cjvemqioa3dlgvee
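
The core trick is compact enough to sketch: each string is encoded twice with dropout active, and the two views of the same string are treated as positives under an InfoNCE contrastive loss. A minimal sketch assuming the transformers package; the checkpoint and the 0.04 temperature are illustrative choices:

    import torch
    import torch.nn.functional as F
    from transformers import AutoModel, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    enc = AutoModel.from_pretrained("bert-base-uncased")
    enc.train()  # keep dropout on so two passes give two different "views"

    texts = ["aspirin", "acetylsalicylic acid", "myocardial infarction"]
    batch = tok(texts, padding=True, return_tensors="pt")

    z1 = enc(**batch).last_hidden_state[:, 0]  # first dropout view ([CLS])
    z2 = enc(**batch).last_hidden_state[:, 0]  # second dropout view
    sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1) / 0.04
    loss = F.cross_entropy(sim, torch.arange(len(texts)))  # matched views are positives
    loss.backward()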

Automated Mining of Leaderboards for Empirical AI Research [article]

Salomon Kabongo, Jennifer D'Souza, Sören Auer
2021 arXiv   pre-print
In this regard, the Leaderboards facet of information organization provides an overview of the state of the art by aggregating empirical results from various studies addressing the same research challenge  ...  The construction of Leaderboards could be greatly expedited with automated text mining.  ...  Acknowledgements This work was co-funded by the Federal Ministry of Education and Research (BMBF) of Germany for the project LeibnizKILabor (grant no. 01DD20003) and by the European Research Council for  ... 
arXiv:2109.13089v1 fatcat:7wc55fqho5cwjblnoxsg46i4wq

Self-Alignment Pretraining for Biomedical Entity Representations [article]

Fangyu Liu, Ehsan Shareghi, Zaiqiao Meng, Marco Basaldella, Nigel Collier
2021 arXiv   pre-print
In contrast with previous pipeline-based hybrid systems, SapBERT offers an elegant one-model-for-all solution to the problem of medical entity linking (MEL), achieving a new state-of-the-art (SOTA) on  ...  This is of paramount importance for entity-level tasks such as entity linking where the ability to model entity relations (especially synonymy) is pivotal.  ...  Transfer learning in biomedical natural language processing: An evaluation of bert and elmo on ten benchmarking datasets.  ... 
arXiv:2010.11784v2 fatcat:ln7msidxuvdxfeudcf5opyxlqa
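
The self-alignment objective pulls together surface forms that share a UMLS concept ID (CUI). SapBERT itself mines hard pairs online with a multi-similarity loss; the triplet loss below is a simplified stand-in, with names and CUIs chosen purely for illustration:

    import torch.nn.functional as F
    from transformers import AutoModel, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    enc = AutoModel.from_pretrained("bert-base-uncased")

    # "myocardial infarction" and "heart attack" share CUI C0027051;
    # "type 2 diabetes" (C0011860) serves as the negative.
    names = ["myocardial infarction", "heart attack", "type 2 diabetes"]
    emb = enc(**tok(names, padding=True, return_tensors="pt")).last_hidden_state[:, 0]

    loss = F.triplet_margin_loss(emb[0:1], emb[1:2], emb[2:3], margin=0.2)
    loss.backward()  # synonyms move closer, non-synonyms move apart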

Special Issue on "Natural Language Processing: Emerging Neural Approaches and Applications"

Massimo Esposito, Giovanni Luca Masala, Aniello Minutolo, Marco Pota
2021 Applied Sciences  
Nowadays, systems based on artificial intelligence are being developed, leading to impressive achievements in a variety of complex cognitive tasks, matching or even beating humans [...]  ...  Acknowledgments: We would like to thank all the authors, the dedicated referees, the editor team of applied sciences for their valuable contributions, making this special issue a success.  ...  Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/app11156717 fatcat:za7ue7daefc3bchbndvs6guo2q

Discovering Thematically Coherent Biomedical Documents Using Contextualized Bidirectional Encoder Representations from Transformers-Based Clustering

Khishigsuren Davagdorj, Ling Wang, Meijing Li, Van-Huy Pham, Keun Ho Ryu, Nippon Theera-Umpon
2022 International Journal of Environmental Research and Public Health  
Second, representative vectors are extracted from a pre-trained BioBERT language model for biomedical text mining.  ...  (Bidirectional Encoder Representations from Transformers for Biomedical Text Mining) BioBERT domain-specific language representations to enhance the clustering accuracy.  ...  Acknowledgments: The authors would like to thank the reviewers for their essential suggestions to improve the manuscript. Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/ijerph19105893 pmid:35627429 pmcid:PMC9141535 fatcat:z6tzrvefizburae6tpkbnragke
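
The pipeline the abstract outlines — encode each document with pre-trained BioBERT, then cluster the resulting vectors — can be sketched as follows. The checkpoint name refers to the public dmis-lab release; the documents and cluster count are toy examples:

    import torch
    from sklearn.cluster import KMeans
    from transformers import AutoModel, AutoTokenizer

    name = "dmis-lab/biobert-base-cased-v1.1"
    tok, enc = AutoTokenizer.from_pretrained(name), AutoModel.from_pretrained(name)

    docs = ["BRCA1 mutations increase breast cancer risk.",
            "Statins lower LDL cholesterol levels.",
            "BRCA2 variants are linked to ovarian cancer."]

    with torch.no_grad():
        batch = tok(docs, padding=True, truncation=True, return_tensors="pt")
        vecs = enc(**batch).last_hidden_state[:, 0].numpy()  # [CLS] vectors

    print(KMeans(n_clusters=2, n_init=10).fit_predict(vecs))  # cluster labels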

Does constituency analysis enhance domain-specific pre-trained BERT models for relation extraction? [article]

Anfu Tang, Claire Nédellec
2021 arXiv   pre-print
Recently, many studies have been conducted on the topic of relation extraction.  ...  genes are studied.  ...  The pre-trained models provide reliable sources for biomedical text mining.  ... 
arXiv:2112.02955v1 fatcat:xjdbtgfr2fg77fkcfi7m2gymea

Building Chinese Biomedical Language Models via Multi-Level Text Discrimination [article]

Quan Wang and Songtai Dai and Benfeng Xu and Yajuan Lyu and Yong Zhu and Hua Wu and Haifeng Wang
2022 arXiv   pre-print
Extensive experiments on 11 Chinese biomedical language understanding tasks of various forms verify the effectiveness and superiority of our approach.  ...  Pre-trained language models (PLMs), such as BERT and GPT, have revolutionized the field of NLP, not only in the general domain but also in the biomedical domain.  ...  Biomedicine and healthcare, as a field with a large, rapidly growing volume of free text and a continually increasing demand for text mining, has received massive attention and achieved rapid progress.  ... 
arXiv:2110.07244v2 fatcat:gporge2qlrbg7lrbb2dejiaefm

Fast and Effective Biomedical Entity Linking Using a Dual Encoder [article]

Rajarshi Bhowmik and Karl Stratos and Gerard de Melo
2021 arXiv   pre-print
Biomedical entity linking is the task of identifying mentions of biomedical concepts in text documents and mapping them to canonical entities in a target thesaurus.  ...  We show that our proposed model is multiple times faster than existing BERT-based models while being competitive in accuracy for biomedical entity linking.  ...  We thank Diffbot and the Google Cloud Platform for granting us access to computing infrastructure used to run some of the experiments reported in this paper.  ... 
arXiv:2103.05028v1 fatcat:i5x3nmpxz5eaxjdzjrvnujgoci
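
The speed advantage comes from the dual-encoder factorization: entity vectors are computed once offline, so linking a mention reduces to a fast nearest-neighbor lookup. A minimal sketch sharing one encoder for both sides, with hypothetical entity names standing in for a real thesaurus:

    import torch
    from transformers import AutoModel, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    enc = AutoModel.from_pretrained("bert-base-uncased")

    def embed(texts):
        with torch.no_grad():
            batch = tok(texts, padding=True, return_tensors="pt")
            return enc(**batch).last_hidden_state[:, 0]

    entities = ["myocardial infarction", "diabetes mellitus", "hypertension"]
    entity_vecs = embed(entities)              # precomputed once, offline

    mention_vec = embed(["the patient had a heart attack"])
    scores = mention_vec @ entity_vecs.T       # dot-product similarity
    print(entities[scores.argmax().item()])    # predicted canonical entity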

Relation Classification for Bleeding Events from Electronic Health Records: Exploration of Deep Learning Systems (Preprint)

Avijit Mitra, Bhanu Pratap Singh Rawat, David D McManus, Hong Yu
2021 JMIR Medical Informatics  
BioBERT: a pre-trained biomedical language representation model for biomedical text mining.  ...  [28] used BERT with entity information for relation classification on the SemEval-2010 Task 8 data set [29] and  ...  A majority of previous studies on clinical text have primarily  ... 
doi:10.2196/27527 fatcat:re3lu3csonazxa2w3pfegpanym