Large-scale text categorization by batch mode active learning

Steven C. H. Hoi, Rong Jin, Michael R. Lyu
2006 Proceedings of the 15th international conference on World Wide Web - WWW '06  
Large-scale text categorization is an important research topic for Web data mining.  ...  The key of the batch mode active learning is how to reduce the redundancy among the selected examples such that each example provides unique information for model updating.  ...  Paul Komarek for sharing the text dataset and the logistic regression package, and comments from anonymous reviewers.  ... 
doi:10.1145/1135777.1135870 dblp:conf/www/HoiJL06 fatcat:y2nvxj23c5b35fkq6t47imq4ju

Batch Mode Active Learning with Applications to Text Categorization and Image Retrieval

S.C.H. Hoi, Rong Jin, M.R. Lyu
2009 IEEE Transactions on Knowledge and Data Engineering  
We apply our batch mode active learning framework to both text categorization and image retrieval.  ...  In this paper, we present a framework for batch mode active learning, which selects a number of informative examples for manual labeling in each iteration.  ...  ACKNOWLEDGMENTS The authors would like to thank Dr Paul Komarek for sharing some text data sets. The work was supported in part by the US National Science  ... 
doi:10.1109/tkde.2009.60 fatcat:a3k2amnstndc5lfde4xq5epft4

Batch-mode active learning for technology-assisted review

Tanay Kumar Saha, Mohammad Al Hasan, Chandler Burgess, Md Ahsan Habib, Jeff Johnson
2015 IEEE International Conference on Big Data (Big Data)
Recent research demonstrates that Support Vector Machines (SVM) perform very well in finding a compact, yet effective, training dataset in an iterative fashion using batch-mode active learning.  ...  This is fueled largely by dramatic growth in data volumes that may be associated with many matters and investigations.  ...  It is also one of the best learning algorithms for large-scale text categorization.  ... 
doi:10.1109/bigdata.2015.7363867 dblp:conf/bigdataconf/SahaHBHJ15 fatcat:6rqn6nunuvhblaplhtdchlwmje

Batch Mode Active Learning for Networked Data

Lixin Shi, Yuhang Zhao, Jie Tang
2012 ACM Transactions on Intelligent Systems and Technology  
We study a novel problem of batch mode active learning for networked data.  ...  To scale to real large networks, we develop a parallel implementation of the algorithm.  ...  This method has been applied to large scale text categorization [Hoi et al. 2006a], medical image classification [Hoi et al. 2006b] and image retrieval [Steven et al. 2009].  ...
doi:10.1145/2089094.2089109 fatcat:u7hrjucd6bfh5hv7ebznfzjrze

Analysing Predictive Coding Algorithms for Document Review

Aditi Wikhe
2021 International Journal for Research in Applied Science and Engineering Technology  
Keywords: Technology-assisted-review, predictive coding, machine learning, text classification, deep learning, CNN, Unscented Kalman Filter, Logistic Regression, SVM  ...  Attorneys now have been using machine learning techniques like text classification to identify responsive information.  ...  The superiority of their solution over existing methods (Brinker and SVMactive) for the experiment is supported by findings on a series of large-scale real-life legal document collections.  ...
doi:10.22214/ijraset.2021.39076 fatcat:3yfyleh6trehjpfo3nakvn7sq4

Parallel MCMC Without Embarrassing Failures

Daniel Augusto de Souza, Diego Mesquita, Samuel Kaski, Luigi Acerbi
2022 arXiv pre-print
Embarrassingly parallel Markov Chain Monte Carlo (MCMC) exploits parallel computing to scale Bayesian inference to large datasets by using a two-step approach.  ...  Our strategy, Parallel Active Inference (PAI), leverages Gaussian Process (GP) surrogate modeling and active learning.  ...  We also acknowledge the computational resources provided by the Aalto Science-IT Project from Computer Science IT.  ... 
arXiv:2202.11154v2 fatcat:ssc5p6ox7fb4piq4talslnfuxa

A Survey of Active Learning for Text Classification using Deep Neural Networks

Christopher Schröder, Andreas Niekler
2020 arXiv pre-print
For active learning (AL) purposes, NNs are, however, less commonly used -- despite their current popularity.  ...  By using the superior text classification performance of NNs for AL, we can either increase a model's performance using the same amount of data or reduce the data and therefore the required annotation  ...  This research was partially funded by the Development Bank of Saxony (SAB) under project number 100335729.  ... 
arXiv:2008.07267v1 fatcat:joainuwblzbaplbls54tq4do3u

Combining link and content for collective active learning

Lixin Shi, Yuhang Zhao, Jie Tang
2010 Proceedings of the 19th ACM international conference on Information and knowledge management - CIKM '10  
In this paper, we study a novel problem Collective Active Learning, in which we aim to select a batch set of "informative" instances from a networking data set to query the user in order to improve the  ...  accuracy of the learned classification model.  ...  Experimental results show that our approach clearly outperforms (+6%) the baseline methods of single mode active learning and batch mode active learning on linked data sets.  ... 
doi:10.1145/1871437.1871740 dblp:conf/cikm/ShiZT10 fatcat:botdwfoozffttov2eleay4kfam

Batch mode Adaptive Multiple Instance Learning for computer vision tasks

Wen Li, Lixin Duan, I. W. Tsang, Dong Xu
2012 IEEE Conference on Computer Vision and Pattern Recognition
Such batch mode framework significantly accelerates the traditional MIL methods for large scale applications and can be also used in dynamic environments such as object tracking.  ...  In this paper, we propose a novel batch mode framework, namely Batch mode Adaptive Multiple Instance Learning (BAMIL), to accelerate the instance-level MIL methods.  ...  Acknowledgement This research is supported by the Singapore National Research Foundation under its Interactive & Digital Media (IDM) Public Sector R&D Funding Initiative and administered by the IDM Programme  ... 
doi:10.1109/cvpr.2012.6247949 dblp:conf/cvpr/LiDTX12 fatcat:5lfkmrwcxvee5pyylpofxi36be

Batch Mode Sparse Active Learning

Lixin Shi, Yuhang Zhao
2010 IEEE International Conference on Data Mining Workshops
BMSAL (Batch Mode Sparse Active Learning).  ...  Keywords: batch mode sparse active learning; sparse classification; active learning; submodularity  ...  We use two most popular batch mode active learning methods as our baseline: SVM active learning method is a batch mode active learning method.  ...
doi:10.1109/icdmw.2010.175 dblp:conf/icdm/ShiZ10 fatcat:fuqc45ykgvd37bkptyf6ydomse

Possibilistic Fuzzy Clustering for Categorical Data Arrays Based on Frequency Prototypes and Dissimilarity Measures

Zhengbing Hu, Yevgeniy V. Bodyanskiy, Oleksii K. Tyshchenko, Viktoriia O. Samitova
2017 International Journal of Intelligent Systems and Applications  
Fuzzy clustering procedures for categorical data are proposed in the paper.  ...  A detailed description of a possibilistic fuzzy clustering method based on frequency-based cluster prototypes and dissimilarity measures for categorical data is given.  ...  to their batch-mode analogues, the most meaningful characteristic for experimental researching was considered a system's self-learning speed.  ... 
doi:10.5815/ijisa.2017.05.07 fatcat:wqnf674aofef7dn7lt6aafdix4

Batch Mode Active Learning for Multimedia Pattern Recognition

Shayok Chakraborty, Vineeth Balasubramanian, Sethuraman Panchanathan
2012 IEEE International Symposium on Multimedia
This has expanded the possibility of solving real world problems using computational learning frameworks.  ...  However, while gathering a large amount of data is cheap and easy, annotating them with class labels is an expensive process in terms of time, labor and human expertise.  ...  ACKNOWLEDGEMENTS My tenure at Arizona State University has been influenced and guided by a number of people to whom I am deeply indebted.  ... 
doi:10.1109/ism.2012.101 dblp:conf/ism/ChakrabortyBP12 fatcat:kvr4sjlulrcv5cdwtrapadskm4

An enhanced short text categorization model with deep abundant representation

Yanhui Gu, Min Gu, Yi Long, Guandong Xu, Zhenglu Yang, Junsheng Zhou, Weiguang Qu
2018 World wide web (Bussum)  
Many researches focus on data sparsity and ambiguity issues in short text categorization.  ...  Short text categorization is a crucial issue to many applications, e.g., Information Retrieval, Question-Answering System, MRI Database Construction and so forth.  ...  With large-scale embedding representation and topic model, we can extract useful latent semantic information for short text categorization.  ... 
doi:10.1007/s11280-018-0542-9 fatcat:4oyqoapdlfgbnmrbb7zrrxowkm

Malware Binary Image Classification Using Convolutional Neural Networks

John Kiger, Shen-Shyang Ho, Vahid Heydari
2022 International Conference on Cyber Warfare and Security (ICIW)  
Furthermore, the proliferation of malicious files and new malware signatures increases year by year.  ...  One of these cybersecurity tasks where machine learning may prove advantageous is malware analysis and classification.  ...  Acknowledgement This material is based upon work supported by the National Science Foundation under Grant No. 1753900.  ... 
doi:10.34190/iccws.17.1.59 fatcat:3xgqmm3syfe3lninuqg5dxeumu

TARexp: A Python Framework for Technology-Assisted Review Experiments

Eugene Yang, David D. Lewis
2022 arXiv pre-print
Technology-assisted review (TAR) is an important industrial application of information retrieval (IR) and machine learning (ML).  ...  Large scale, deterministically reproducible experiments are supported.  ...  Expensive large scale experiments are therefore necessary to derive meaningful generalizations.  ... 
arXiv:2202.11827v1 fatcat:yzhxk6porfhhzo5drj4bwxiajy