4,529 Hits in 6.9 sec

A Pipeline for Optimizing F1-Measure in Multi-label Text Classification

Bingyu Wang, Cheng Li, Virgil Pavlu, Jay Aslam
2018 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA)  
Many classic multi-label learning algorithms focus on incorporating label dependencies in the model training phase and optimize for the strict set-accuracy measure.  ...  Multi-label text classification is the machine learning task wherein each document is tagged with multiple labels, and this task is uniquely challenging due to high dimensional features and correlated  ...  CONCLUSION In this paper our main goal is to develop a pipeline in order to reuse classic multi-label models to achieve high F1 scores on multi-label text data.  ... 
doi:10.1109/icmla.2018.00148 dblp:conf/icmla/WangLPA18 fatcat:blhslqaezfb7rbxu3yhb5avlwi

LitMC-BERT: transformer-based multi-label classification of biomedical literature with an application on COVID-19 literature curation [article]

Qingyu Chen, Jingcheng Du, Alexis Allot, Zhiyong Lu
2022 arXiv   pre-print
This study proposes LITMC-BERT, a transformer-based multi-label classification method in biomedical literature.  ...  It uses a shared transformer backbone for all the labels while also capturing label-specific features and the correlations between label pairs.  ...  Robert Leaman for proofreading the manuscript. This research is supported by the NIH Intramural Research Program, National Library of Medicine.  ... 
arXiv:2204.08649v1 fatcat:jwtcrvtyk5hgnm7xunpr6t552e

Hybrid Tagger – An Industry-driven Solution for Extreme Multi-label Text Classification

Kristiina Vaik, Marit Asula, Raul Sirel
2020 Zenodo  
This paper presents an industry-driven solution for extreme multi-label classification with a massive label collection.  ...  The proposed approach incorporates a large number of binary classification models with label pre-filtering and employs methods and technologies shown to be applicable in industrial scenarios where high-end  ...  In the industry multi-label text classification can be used for many different applications, such as describing the subject of a news article by assigning a general topic (e.g. politics, history, sports  ... 
doi:10.5281/zenodo.4306169 fatcat:q5nutroftndw3fbgxt2rgdtymu

Improving Classification of Crisis-Related Social Media Content via Text Augmentation and Image Analysis

Shivam Sharma, Cody Buntain
2020 Text Retrieval Conference  
For the information type labels we did a union of the labels predicted by the synonym-augmentation pipeline.  ...  for every text in the testing data.  ... 
dblp:conf/trec/SharmaB20 fatcat:yilkvq6dbvgybnip2gsf43wcru

Pre-trained language models to extract information from radiological reports

Pilar López-Úbeda, Manuel Carlos Díaz-Galiano, Luis Alfonso Ureña López, María Teresa Martín-Valdivia
2021 Conference and Labs of the Evaluation Forum  
Specifically, we use a multi-class classification model, a binary classification model and a pipeline model for entity identification.  ...  Detecting relevant information automatically in biomedical texts is a crucial task because current health information systems are not prepared to analyze and extract this knowledge due to the time and  ...  have up to ten different entity types or the label non-entity (O), i.e., this approach can be considered as a multi-class classification for each token.  ... 
dblp:conf/clef/Lopez-UbedaDLM21 fatcat:cghdctwklvezrp6hli4acugtji

Regularizing Model Complexity and Label Structure for Multi-Label Text Classification [article]

Bingyu Wang, Cheng Li, Virgil Pavlu, Javed Aslam
2017 arXiv   pre-print
Multi-label text classification is a popular machine learning task where each document is assigned multiple relevant labels.  ...  At prediction time, we apply support inference to restrict the search space to label sets encountered in the training set, and F-optimizer GFM to make optimal predictions for the F1 metric.  ...  F1 Metric and Optimal F Predictions The most widely used evaluation measure for multi-label text classification is the F1 metric [22], which assigns partial credit to "almost correct" answers and handles  ... 
arXiv:1705.00740v1 fatcat:374ssbwjmvb25duau7wkwko5dy

BLUE at Memotion 2.0 2022: You have my Image, my Text and my Transformer [article]

Ana-Maria Bucur, Adrian Cosma, Ioan-Bogdan Iordache
2022 arXiv   pre-print
We showcase two approaches for meme classification (i.e. sentiment, humour, offensive, sarcasm and motivation levels) using a text-only method using BERT, and a Multi-Modal-Multi-Task transformer network  ...  Through our efforts, we obtain first place in task A, second place in task B and third place in task C. In addition, our team obtained the highest average score for all three tasks.  ...  We described two solutions for meme classification: i) text-only approach through fine-tuning a BERT model and ii) a Multi-Modal-Multi-Task transformer network that operates on both images and text.  ... 
arXiv:2202.07543v3 fatcat:hsi4cdc7zfhptle45ttbn64tg4

DeepADEMiner: A Deep Learning Pharmacovigilance Pipeline for Extraction and Normalization of Adverse Drug Effect Mentions on Twitter [article]

Arjun Magge, Elena Tutubalina, Zulfat Miftahutdinov, Ilseyar Alimova, Anne Dirkson, Suzan Verberne, Davy Weissenbacher, Graciela Gonzalez-Hernandez
2020 medRxiv   pre-print
Conclusion: Mining ADEs from Twitter posts using a pipeline architecture requires the different components to be trained and tuned based on input data imbalance in order to ensure optimal performance on  ...  Results: The system presented achieved a classification performance of F1 = 0.63, span detection performance of F1 = 0.44 and an end-to-end entity resolution performance of F1 = 0.34 on the presented dataset  ...  of Technical Requirements for Pharmaceuticals for Human Use (ICH).  ... 
doi:10.1101/2020.12.15.20248229 fatcat:vxuishs5gzetfecml5acw3edsy

Deep Learning Model for Sentiment Analysis on Short Informal Texts

Sam Farisa Chaerul Haviana, Bagus Satrio Waluyo Poetro
2022 Indonesian Journal of Electrical Engineering and Informatics (IJEEI)  
This paper proposes a classification model to classify short informal texts.  ...  To evaluate the model in real usage, an application was built. The results were very convincing, reaching 0.979 in accuracy and 0.63 in F1-Score.  ...  Also, many thanks to Fakultas Teknologi Industri (FTI) members for helping to collect the data used in this study.  ... 
doi:10.52549/ijeei.v10i1.3181 fatcat:p4znpeekcrb2pjfxu2b75cbjhu

Multi-Keyword Classification: A Case Study in Finnish Social Sciences Data Archive

Erjon Skenderi, Jukka Huhtamäki, Kostas Stefanidis
2021 Information  
Our selection of multi-label classification methods includes a Naive approach, Multi-label k Nearest Neighbours (ML-kNN), Multi-Label Random Forest (ML-RF), X-BERT and Parabel.  ...  We measured the classification accuracy of the combinations using Precision, Recall and F1 metrics.  ...  We consider the task of Multi-label text classification in the domain of social sciences by using the FSD data.  ... 
doi:10.3390/info12120491 fatcat:6tkiypwyazaizbauzsyeuxgbym

Multi-label classification of COVID-19-related articles with an autoML approach

Ilija Tavchioski, Boshko Koloski, Blaž Škrlj, Senja Pollak
2022 Zenodo  
to the shared task titled LitCovid track Multi-label topic classification for COVID-19 literature annotation.  ...  classification of COVID-19-related texts.  ...  This paper is supported by European Union's Horizon 2020 research and innovation programme under grant agreement No 825153, project EMBEDDIA (Cross-Lingual Embeddings for Less-Represented Languages in  ... 
doi:10.5281/zenodo.5854553 fatcat:l5mnk6rncvawrbydmbj3hjca2y

An Active Learning Based Emoji Prediction Method in Turkish

Emrah Inan
2020 International Journal of Intelligent Systems and Applications in Engineering  
In this paper, we present an active learning method to evaluate the emoji prediction of a tweet with a limited amount of labelled Turkish emoji data.  ...  Emoji usage has become a standard in social media platforms since it can condense feelings beyond short textual information.  ...  To label texts with emojis there exist lexicons for multi-class and multi-label classification tasks. Novak et al.  ... 
doi:10.18201/ijisae.2020158882 fatcat:nx2ev6qbwfhajkluleiedjka3m

BERT-Based Multi-Head Selection for Joint Entity-Relation Extraction [article]

Weipeng Huang, Xingyi Cheng, Taifeng Wang, Wei Chu
2019 arXiv   pre-print
First, BERT is adopted as a feature extraction layer at the bottom of the multi-head selection framework. We further optimize BERT by introducing a semantic-enhanced task during BERT pre-training.  ...  In this paper, we report our method for the Information Extraction task in 2019 Language and Intelligence Challenge.  ...  The hidden state is then fed into a multi-sigmoid layer for classification.  ... 
arXiv:1908.05908v2 fatcat:qgzypr26hjev7j35rk7yucs2zm

Improving Machine Reading Comprehension with Multi-Task Learning and Self-Training

Jianquan Ouyang, Mengen Fu
2022 Mathematics  
In the training phase, since our model requires a large amount of labeled training data, which is often expensive to obtain or unavailable in many tasks, we additionally use self-training to generate pseudo-labeled  ...  Therefore, to meet the comprehensive requirements in such application situations, we construct a multi-task fusion training reading comprehension model based on the BERT pre-training model.  ...  The pipeline approach uses a BERT model for training the extraction MRC task, turns non-extracted MRC tasks into a classification task, and learns the classification task directly for the sentences in  ... 
doi:10.3390/math10030310 fatcat:4gc3cwac5bd3tck57pnicdbfse

Multi-label classification for biomedical literature: an overview of the BioCreative VII LitCovid Track for COVID-19 literature topic annotations [article]

Qingyu Chen, Alexis Allot, Robert Leaman, Rezarta Islamaj Doğan, Jingcheng Du, Li Fang, Kai Wang, Shuo Xu, Yuefu Zhang, Parsa Bagherzadeh, Sabine Bergler, Aakash Bhatnagar (+27 others)
2022 arXiv   pre-print
The highest performing submissions achieved 0.8875, 0.9181, and 0.9394 for macro F1-score, micro F1-score, and instance-based F1-score, respectively.  ...  ., Diagnosis and Treatment) to the articles in LitCovid. Despite the continuing advances in biomedical text mining methods, few have been dedicated to topic annotations in COVID-19 literature.  ...  optimal set of labels for each document.  ... 
arXiv:2204.09781v3 fatcat:us5rqgxjsbchbmaiypjbp4r7nq
Showing results 1 to 15 out of 4,529 results