8 Hits in 0.61 sec

Limitations of Assessing Active Learning Performance at Runtime [article]

Daniel Kottke, Jim Schellinger, Denis Huseljic, Bernhard Sick
2019 arXiv   pre-print
Classification algorithms aim to predict an unknown label (e.g., a quality class) for a new instance (e.g., a product). Therefore, training samples (instances and labels) are used to deduce classification hypotheses. Often, it is relatively easy to capture instances, but the acquisition of the corresponding labels remains difficult or expensive. Active learning algorithms select the most beneficial instances to be labeled in order to reduce this cost. In research, this labeling procedure is simulated and therefore a ground truth is available. During deployment, however, active learning is a one-shot problem and an evaluation set is not available. Hence, it is not possible to reliably estimate the performance of the classification system during learning, and it is difficult to decide when the system fulfills the quality requirements (stopping criteria). In this article, we formalize the task and review existing strategies to assess the performance of an actively trained classifier during training. Furthermore, we identify three major challenges: (1) to derive a performance distribution, (2) to preserve the representativeness of the labeled subset, and (3) to correct for the sampling bias induced by an intelligent selection strategy. In a qualitative analysis, we evaluate different existing approaches and show that none of them reliably estimates active learning performance, which remains a major challenge for future research on such systems. All plots and experiments are provided in a Jupyter notebook that is available for download.
arXiv:1901.10338v1 fatcat:izn3mkks5fakxjnjwiu2h5vnvu
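
To make the estimation problem concrete, here is a minimal, hypothetical sketch (not the paper's method, and not a reliable estimator): accuracy is estimated on the labeled subset only, with simple inverse-propensity weights as one attempt to correct the sampling bias of a stochastic uncertainty-based selection strategy. All function names, parameters, and the synthetic data are illustrative assumptions.

```python
# Hypothetical sketch: runtime performance estimation in pool-based active
# learning without a test set, using inverse-propensity weights to counter
# the sampling bias of the selection strategy. Illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

def weighted_accuracy(clf, X_labeled, y_labeled, selection_probs):
    """Importance-weighted accuracy on the labeled pool; selection_probs[i]
    is the probability with which instance i was chosen by the strategy."""
    w = 1.0 / np.clip(selection_probs, 1e-6, None)
    correct = (clf.predict(X_labeled) == y_labeled).astype(float)
    return float(np.sum(w * correct) / np.sum(w))

def selection_distribution(clf, X_unlabeled, temperature=0.1):
    """A stochastic uncertainty-sampling strategy: softmax over uncertainty."""
    margin = np.abs(clf.predict_proba(X_unlabeled)[:, 1] - 0.5)
    scores = np.exp(-margin / temperature)
    return scores / scores.sum()

# Simulated loop on synthetic data (ground truth only mimics the oracle).
rng = np.random.default_rng(0)
X_pool = rng.normal(size=(500, 2))
y_pool = (X_pool[:, 0] + 0.3 * rng.normal(size=500) > 0).astype(int)

labeled = [int(i) for i in rng.choice(500, size=10, replace=False)]
sel_prob = {i: 1.0 / 500 for i in labeled}          # initial random draws

for _ in range(20):
    clf = LogisticRegression().fit(X_pool[labeled], y_pool[labeled])
    unlabeled = [i for i in range(500) if i not in sel_prob]
    probs = selection_distribution(clf, X_pool[unlabeled])
    idx = int(rng.choice(len(unlabeled), p=probs))
    pick = unlabeled[idx]
    sel_prob[pick] = float(probs[idx])               # record acquisition probability
    labeled.append(pick)

clf = LogisticRegression().fit(X_pool[labeled], y_pool[labeled])
est = weighted_accuracy(clf, X_pool[labeled], y_pool[labeled],
                        np.array([sel_prob[i] for i in labeled]))
print(f"importance-weighted runtime estimate of accuracy: {est:.3f}")
```

According to the article's qualitative analysis, estimates of this kind are not reliable in general; the sketch only illustrates challenges (1) and (3) in code form.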

Toward Optimal Probabilistic Active Learning Using a Bayesian Approach [article]

Daniel Kottke, Marek Herde, Christoph Sandrock, Denis Huseljic, Georg Krempl, Bernhard Sick
2020 arXiv   pre-print
Gathering labeled data to train well-performing machine learning models is one of the critical challenges in many applications. Active learning aims at reducing the labeling costs by an efficient and effective allocation of costly labeling resources. In this article, we propose a decision-theoretic selection strategy that (1) directly optimizes the gain in misclassification error, and (2) uses a Bayesian approach by introducing a conjugate prior distribution to determine the class posterior to deal with uncertainties. By reformulating existing selection strategies within our proposed model, we can explain which aspects are not covered in the current state of the art and why this leads to the superior performance of our approach. Extensive experiments on a large variety of datasets and different kernels validate our claims.
arXiv:2006.01732v1 fatcat:qjuo3lbvtvcp3pl6ispfhe33uu
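
The abstract above outlines a decision-theoretic criterion with a conjugate prior over the class posterior. The following is a hedged, simplified sketch of that general idea (kernel-weighted label counts with a Dirichlet prior and an expected local reduction in misclassification error); it is not the paper's exact formulation, and the kernel choice, prior, and helper names are assumptions.

```python
# Simplified sketch of a decision-theoretic, Bayesian selection criterion:
# Dirichlet prior over class probabilities, kernel-weighted label counts,
# and the expected gain in misclassification error if a candidate were labeled.
import numpy as np

def kernel_label_counts(x, X_labeled, y_labeled, n_classes, gamma=1.0):
    """Kernel-weighted class frequencies around candidate x (RBF kernel)."""
    if len(X_labeled) == 0:
        return np.zeros(n_classes)
    w = np.exp(-gamma * np.sum((X_labeled - x) ** 2, axis=1))
    return np.array([w[y_labeled == c].sum() for c in range(n_classes)])

def expected_error(counts, alpha):
    """Expected misclassification probability under Dirichlet(alpha + counts)."""
    mean = (alpha + counts) / (alpha + counts).sum()
    return 1.0 - mean.max()

def expected_gain(x, X_labeled, y_labeled, n_classes, alpha=None, gamma=1.0):
    """Expected local error reduction if x were labeled, averaged over the
    posterior predictive distribution of its (unknown) label."""
    alpha = np.ones(n_classes) if alpha is None else alpha
    k = kernel_label_counts(x, X_labeled, y_labeled, n_classes, gamma)
    pred = (alpha + k) / (alpha + k).sum()       # posterior predictive of the label
    risk_now = expected_error(k, alpha)
    risk_after = sum(pred[c] * expected_error(k + np.eye(n_classes)[c], alpha)
                     for c in range(n_classes))
    return risk_now - risk_after

# Usage (hypothetical arrays): pick the candidate with the highest expected gain.
# best = max(range(len(X_cand)), key=lambda i: expected_gain(X_cand[i], X_lab, y_lab, 2))
```

In a full strategy, such a local gain would typically also account for the candidate's influence on the rest of the data (e.g., via a density weight); that part is omitted in this sketch.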

Toward optimal probabilistic active learning using a Bayesian approach

Daniel Kottke, Marek Herde, Christoph Sandrock, Denis Huseljic, Georg Krempl, Bernhard Sick
2021 Machine Learning  
Gathering labeled data to train well-performing machine learning models is one of the critical challenges in many applications. Active learning aims at reducing the labeling costs by an efficient and effective allocation of costly labeling resources. In this article, we propose a decision-theoretic selection strategy that (1) directly optimizes the gain in misclassification error, and (2) uses a Bayesian approach by introducing a conjugate prior distribution to determine the class posterior to deal with uncertainties. By reformulating existing selection strategies within our proposed model, we can explain which aspects are not covered in the current state of the art and why this leads to the superior performance of our approach. Extensive experiments on a large variety of datasets and different kernels validate our claims.
doi:10.1007/s10994-021-05986-9 fatcat:uihdkyzrdrgb3pqbmqfuag5bdu

A Survey on Cost Types, Interaction Schemes, and Annotator Performance Models in Selection Algorithms for Active Learning in Classification [article]

Marek Herde, Denis Huseljic, Bernhard Sick, Adrian Calma
2021 arXiv   pre-print
Denis Huseljic received his B.Sc. and M.Sc. degrees in computer science from the University of Kassel, Germany, where he is currently pursuing his Ph.D. degree in computer science. ... Appendices of A Survey on Cost Types, Interaction Schemes, and Annotator Performance Models in Selection Algorithms for Active Learning in Classification. Marek Herde, Denis Huseljic, Bernhard Sick, ...
arXiv:2109.11301v1 fatcat:kmrh7p6x4fekrhz2rcakmzqkia

A Survey on Cost Types, Interaction Schemes, and Annotator Performance Models in Selection Algorithms for Active Learning in Classification

Marek Herde, Denis Huseljic, Bernhard Sick, Adrian Calma
2021 IEEE Access  
Pool-based active learning (AL) aims to optimize the annotation process (i.e., labeling) as the acquisition of annotations is often time-consuming and therefore expensive. For this purpose, an AL strategy queries annotations intelligently from annotators to train a high-performance classification model at a low annotation cost. Traditional AL strategies operate in an idealized framework. They assume a single, omniscient annotator who never gets tired and charges uniformly regardless of query difficulty. However, in real-world applications, we often face human annotators, e.g., crowd or in-house workers, who make annotation mistakes and can be reluctant to respond if tired or faced with complex queries. Recently, a wide range of novel AL strategies has been proposed to address these issues. They differ in at least one of the following three central aspects from traditional AL: (1) They explicitly consider (multiple) human annotators whose performances can be affected by various factors, such as missing expertise. (2) They generalize the interaction with human annotators by considering different query and annotation types, such as asking an annotator for feedback on an inferred classification rule. (3) They take more complex cost schemes regarding annotations and misclassifications into account. This survey provides an overview of these AL strategies and refers to them as real-world AL. Therefore, we introduce a general real-world AL strategy as part of a learning cycle and use its elements, e.g., the query and annotator selection algorithm, to categorize about 60 real-world AL strategies. Finally, we outline possible directions for future research in the field of AL.
doi:10.1109/access.2021.3135514 fatcat:o224fxssvrcetoashuxu3oc4ry
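
As a rough illustration of the learning cycle described in the abstract above, the sketch below pairs a margin-based query selection with a simple annotator performance model (a Beta posterior over each annotator's labeling accuracy) and a cost-aware annotator choice. The interfaces, the cost model, and all names are hypothetical and not taken from the survey.

```python
# Illustrative sketch of one element of a "real-world AL" cycle: query
# selection plus annotator selection under an annotator performance model.
import numpy as np

class AnnotatorModel:
    """Beta(a, b) belief about an annotator's probability of a correct label."""
    def __init__(self, a=1.0, b=1.0, cost_per_query=1.0):
        self.a, self.b, self.cost = a, b, cost_per_query

    def update(self, was_correct):
        # Update the belief after comparing the annotation with a resolved label.
        if was_correct:
            self.a += 1
        else:
            self.b += 1

    def expected_accuracy(self):
        return self.a / (self.a + self.b)

def select_query(probas):
    """Pick the pool instance with the smallest margin (most uncertain)."""
    sorted_p = np.sort(probas, axis=1)
    margin = sorted_p[:, -1] - sorted_p[:, -2]
    return int(np.argmin(margin))

def select_annotator(annotators):
    """Pick the annotator with the best expected accuracy per unit cost."""
    return max(annotators,
               key=lambda k: annotators[k].expected_accuracy() / annotators[k].cost)

# Example: two annotators with different reliability/cost trade-offs.
annotators = {"crowd": AnnotatorModel(a=6, b=4, cost_per_query=0.2),
              "expert": AnnotatorModel(a=19, b=1, cost_per_query=1.0)}
probas = np.array([[0.9, 0.1], [0.55, 0.45], [0.7, 0.3]])
print(select_query(probas), select_annotator(annotators))
```

The accuracy-per-cost rule is only one possible trade-off; a real strategy might instead weigh expected error reduction against the annotation and misclassification cost scheme discussed in the survey.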

Challenges of Reliable, Realistic and Comparable Active Learning Evaluation

Daniel Kottke, Adrian Calma, Denis Huseljic, Georg Krempl, Bernhard Sick
2017 European Conference on Principles of Data Mining and Knowledge Discovery  
We do not recommend the abbreviation AUC because it can be mixed up with AUROC ...
dblp:conf/pkdd/KottkeCHKS17 fatcat:gb33hjl625gjdg6wa6rivfwhwy
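
The snippet above refers to aggregating active learning results into a single score, i.e., the area under the accuracy-vs-labels learning curve, whose abbreviation "AUC" the authors advise against because it is easily confused with AUROC. A minimal sketch of that aggregation, using the common trapezoidal convention (an assumption, not taken from the paper):

```python
# Area under the learning curve as an aggregate active learning metric.
import numpy as np

def area_under_learning_curve(n_labels, scores):
    """Normalized area under a learning curve (accuracy vs. number of labels)."""
    n_labels, scores = np.asarray(n_labels, float), np.asarray(scores, float)
    area = np.trapz(scores, n_labels)
    return area / (n_labels[-1] - n_labels[0])   # stays in [0, 1] for accuracy curves

# Example: two strategies evaluated at the same label budgets.
budgets = [10, 20, 40, 80]
print(area_under_learning_curve(budgets, [0.62, 0.71, 0.78, 0.84]))
print(area_under_learning_curve(budgets, [0.58, 0.66, 0.74, 0.83]))
```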

A Survey on Cost Types, Interaction Schemes, and Annotator Performance Models in Selection Algorithms for Active Learning in Classification

Marek Herde, Denis Huseljic, Bernhard Sick, Adrian Calma, Universität Kassel
2022
DENIS HUSELJIC received the B.Sc. and M.Sc. degrees in computer science from the University of Kassel, Germany, where he is currently pursuing the Ph.D. degree in computer science.  ... 
doi:10.17170/kobra-202205036117 fatcat:2acodly4orbvtdrq54p3ehhwte

Learning Neural Textual Representations for Citation Recommendation

Binh Thanh Kieu, Inigo Jauregi Unanue, Son Bao Pham, Hieu Xuan Phan, Massimo Piccardi
2021 2020 25th International Conference on Pattern Recognition (ICPR)  
Conference program snippets: DAY 2 - Jan 13, 2021 ... Huseljic ... DAY 4 - Jan 15, 2021 ... Herde, Marek; Kottke, Daniel; Huseljic ...
doi:10.1109/icpr48806.2021.9412725 fatcat:3vge2tpd2zf7jcv5btcixnaikm