A utility-theoretic ranking method for semi-automated text classification

Giacomo Berardi, Andrea Esuli, Fabrizio Sebastiani
2012 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval - SIGIR '12  
In Semi-Automated Text Classification (SATC) an automatic classifier Φ labels a set of unlabelled documents D, following which a human annotator inspects (and corrects when appropriate) the labels attributed  ...  We develop a new utility-theoretic ranking method based on the notion of inspection gain, defined as the improvement in classification effectiveness that would derive by inspecting and correcting a given  ...  We call this scenario semi-automated text classification (SATC).  ... 
doi:10.1145/2348283.2348411 dblp:conf/sigir/BerardiES12 fatcat:6g2slzcggzczdf2huu2hr3lcv4

Utility-Theoretic Ranking for Semiautomated Text Classification

Giacomo Berardi, Andrea Esuli, Fabrizio Sebastiani
2015 ACM Transactions on Knowledge Discovery from Data  
Semi-Automated Text Classification (SATC) may be defined as the task of ranking a set D of automatically labelled textual documents in such a way that, if a human annotator validates (i.e., inspects and  ...  We develop new utility-theoretic ranking methods based on the notion of validation gain, defined as the improvement in classification effectiveness that would derive by validating a given automatically  ...  ACKNOWLEDGMENTS We would like to thank David Lewis and Diego Marcheggiani for many interesting discussions on the topics of this article.  ... 
doi:10.1145/2742548 fatcat:banga6436jdh5pfohgpleliztq

Semi-Automated Text Classification for Sensitivity Identification

Giacomo Berardi, Andrea Esuli, Craig Macdonald, Iadh Ounis, Fabrizio Sebastiani
2015 Proceedings of the 24th ACM International on Conference on Information and Knowledge Management - CIKM '15  
We look at sensitivity identification in terms of semi-automated text classification (SATC), the task of ranking automatically classified documents so as to optimize the cost-effectiveness of human post-checking  ...  ., for personal or organizational privacy reasons.  ...  Hence, in this paper, our application of the utility-theoretic approach for semi-automated text classification to the recall-oriented task of sensitivity identification, brings two contributions: (i) it  ... 
doi:10.1145/2806416.2806597 dblp:conf/cikm/BerardiEMO015 fatcat:sdeeh5jatrckrk35aq33jgifka

Content-specific network analysis of peer-to-peer communication in an online community for smoking cessation

Sahiti Myneni, Nathan K Cobb, Trevor Cohen
2017 AMIA Annual Symposium Proceedings  
In this paper, we present a methodology for high-throughput semantic and network analysis of large social media datasets, combining semi-automated text categorization with social network analytics.  ...  Implications for socio-behavioral health and wellness platforms are also discussed.  ...  SVD implementation we used for LSA.  ... 
pmid:28269890 pmcid:PMC5333292 fatcat:6zraiatuavenfngq64otipkcya

Unsupervised Key-phrase Extraction and Clustering for Classification Scheme in Scientific Publications [article]

Xiajing Li, Marios Daoutis
2021 arXiv   pre-print
Several methods have been explored for automating parts of Systematic Mapping (SM) and Systematic Review (SR) methodologies.  ...  Finally Semantic key-phrase clustering at term-level can group similar terms together that can be suitable for classification scheme.  ...  Recent text-mining algorithms and NLP techniques can become particularly useful for automating (parts of) this manual work within the systematic mapping studies procedure.  ... 
arXiv:2101.09990v2 fatcat:n6vnkvwxlbc4bogzx5ootfyyya

Active Learning Strategies for Technology Assisted Sensitivity Review [chapter]

Graham McDonald, Craig Macdonald, Iadh Ounis
2018 Lecture Notes in Computer Science  
Therefore, there is a need for new technology assisted review protocols to integrate automatic sensitivity classification into the sensitivity review process.  ...  In this work, we present a thorough evaluation of active learning strategies for sensitivity review.  ...  In that work, the authors evaluated the effectiveness of a utility-theoretic [10] semi-automated text classification (SATC) approach, for sensitivity classification. The approach of Berardi et al.  ... 
doi:10.1007/978-3-319-76941-7_33 fatcat:kroy3sx4czef5g5fuo7k45q6vq

Data-Driven Requirements Elicitation: A Systematic Literature Review

Sachiko Lim, Aron Henriksson, Jelena Zdravkovic
2021 SN Computer Science  
The outcomes of automated requirements elicitation often result in mere identification and classification of requirements-related information or identification of features, without eliciting requirements  ...  The results reveal that the existing automated requirements elicitation primarily focuses on utilizing human-sourced data, especially online reviews, as requirements sources, and supervised machine learning  ...  Semi-automation refers to having a human-in-the-loop for automating requirements elicitation, thus requirements are directed by human interactions.  ... 
doi:10.1007/s42979-020-00416-4 fatcat:g4g7nb4mwbhuhgmmfxi5vir5ly

KDD2008 workshop report DMMT'08

Chris Ding, Tao Li, Shenghuo Zhu
2008 SIGKDD Explorations  
matrix computations • Feature selection and extraction • Graph-based learning (classification, semi-supervised learning and unsupervised learning) • Matrix factorization for classification Application  ...  and Sparsification • Sparse PCA and SVD • Randomized algorithms for matrix computation • Web search and ranking algorithms • Tensor analysis, 2DSVD and high order SVD • GSVD for classification • Latent  ... 
doi:10.1145/1540276.1540293 fatcat:724tpyimyzfx7hzx2g6t3lkjrq

From Theories to Queries

Burr Settles
2011 Journal of Machine Learning Research  
This article surveys recent work in active learning aimed at making it more practical for real-world use.  ...  An active learner might iteratively select informative query instances to be labeled by an oracle, for example.  ...  sentiment classification in blogs using the pooling multinomials naïve Bayes approach, and consider a similar query setting for a semi-supervised graph/kernel-based text classifier.  ... 
dblp:journals/jmlr/Settles11 fatcat:ezffijkj35g4bgktfm3k2z5aoi

Graph Data Augmentation for Graph Machine Learning: A Survey [article]

Tong Zhao, Gang Liu, Stephan Günnemann, Meng Jiang
2022 arXiv   pre-print
We conclude by outlining currently unsolved challenges as well as directions for future research.  ...  Although several automated augmentation solutions exist for graph contrastive learning, automated augmentation methods for (semi-)supervised graph learning are still needed.  ...  Mixup Several Mixup methods were also proposed for graph classification. For example, the aforementioned Graph Mixup [Wang et al., 2021b ] also works for graph classification.  ... 
arXiv:2202.08871v1 fatcat:gjf7mgihkfbqdg6cqscflcw6ga

An approach based on Combination of Features for automatic news retrieval [article]

Mohammad Moradi, Elham Ghanbari, Mehrdad Maeen, Sasan Harifi
2020 arXiv   pre-print
Then, using the proposed approach, techniques of text categorization, evaluation criteria and ranking algorithms, the data were analyzed and examined.  ...  Understanding this information is very important for providing the best set of information resources for users.  ...  In their paper the multiword considered as an alternative index terms in vector space model for text representation with theoretical support.  ... 
arXiv:2004.11699v1 fatcat:vppybnsh4rdixiqz3vrdsf2iva

Network Text Analysis in Computer-Intensive Rapid Ethnography Retrieval: An Example from Political Networks of Sudan

Laurent Tambayong, Kathleen M. Carley
2019 Journal of Social Structure  
descriptions from texts and fusing the results from varied sources.  ...  Advances in text analysis, particularly the ability to extract network based information from texts, is enabling researchers to conduct detailed socio-cultural ethnographies rapidly by retrieving characteristic  ...  Hence, the potential exists to use semi-automated text mining and assessment techniques to rapidly process vast quantities of text-based information.  ... 
doi:10.21307/joss-2019-028 fatcat:3dyucijqbvcdvpkeswkwsx26xu

Unsupervised Approaches for Textual Semantic Annotation, A Survey

Xiaofeng Liao, Zhiming Zhao
2019 ACM Computing Surveys  
ACKNOWLEDGMENTS The authors thank the anonymous reviewers for their helpful comments, in addition to Cees de Laat, Paul Martin, Jayachander Surbiryala, and ZeShun Shi for useful discussions.  ...  The candidate proper nouns are then categorized into the highest ranked concepts as the annotation for that piece of text.  ... 
doi:10.1145/3324473 fatcat:fg5ucwtloze6ljdlh4hqjkqxfe

Classification of Consumer Belief Statements From Social Media [article]

Gerhard Hagerer, Wenbin Le, Hannah Danner, Georg Groh
2021 arXiv   pre-print
on text classification tasks.  ...  It is not yet fully understood how this can be leveraged successfully for classification.  ...  We perceive a great potential in semi-automated class abstraction, especially for large-scale content analysis.  ... 
arXiv:2106.15498v1 fatcat:2zctkylvsbehfdxcvzpyki2cme

Optimising human inspection work in automated verbatim coding

Giacomo Berardi, Andrea Esuli, Fabrizio Sebastiani
2014 International Journal of Market Research  
However, in some of these contexts the accuracy standards imposed by the customer may be too high for today's automated verbatim coding technology; this means that human coders may need to devote some  ...  ) maximises the reduction in the overall error achieved for an available amount of human inspection work. † The order in which the authors are listed is purely alphabetical; each author has given an equally  ...  Thanks also to Ivano Luberti for many interesting discussions, and to two anonymous reviewers for providing many stimulating comments.  ... 
doi:10.2501/ijmr-2014-032 fatcat:z443gs5idfhtnmd36jwvmduj5u