Bootstrapping for hierarchical document classification
2003
Proceedings of the twelfth international conference on Information and knowledge management - CIKM '03
For this reason we propose a method for the bootstrapping process that makes a first hypothesis of categorization for a set of unlabeled documents, with respect to a given empty hierarchy of concepts ...
Within this process, bootstrapping a taxonomy with examples represents a critical factor for the effective exploitation of any supervised learning model. ...
This is very important for a subsequent supervised classification, since it is the premise to obtain a homogeneous assignment of the documents to the nodes, and consequently, a highly accurate hierarchical ...
doi:10.1145/956863.956920
dblp:conf/cikm/AdamiAS03
fatcat:okxiwqm5sjbpbod2fntsewvoki
Bootstrapping for hierarchical document classification
2003
Proceedings of the twelfth international conference on Information and knowledge management - CIKM '03
For this reason we propose a method for the bootstrapping process that makes a first hypothesis of categorization for a set of unlabeled documents, with respect to a given empty hierarchy of concepts ...
Within this process, bootstrapping a taxonomy with examples represents a critical factor for the effective exploitation of any supervised learning model. ...
This is very important for a subsequent supervised classification, since it is the premise to obtain a homogeneous assignment of the documents to the nodes, and consequently, a highly accurate hierarchical ...
doi:10.1145/956919.956920
fatcat:hfsgutcmiffnppqc5usuuwc4xa
On Dataless Hierarchical Text Classification
2014
Proceedings of the AAAI Conference on Artificial Intelligence
In this paper, we systematically study the problem of dataless hierarchical text classification. ...
Our results show that bootstrapped dataless classification is competitive with supervised classification with thousands of labeled examples. ...
Dataless + Bootstrapping Inspired by the dataless flat classification paper (Chang et al. 2008) , we also propose a bootstrapping procedure for dataless hierarchical classification. ...
doi:10.1609/aaai.v28i1.8938
fatcat:sayc3vwx3vdtzkha54egcrb5du
Recurrent Neural Networks with Mixed Hierarchical Structures and EM Algorithm for Natural Language Processing
[article]
2022
arXiv
pre-print
Simulation studies and real data applications demonstrate that the EM-HRNN model with bootstrap training outperforms other RNN-based models in document classification tasks. ...
Furthermore, we develop two bootstrap strategies to effectively and efficiently train the EM-HRNN model on long text documents. ...
Document Classification In the paper, we focus on the task of document classification. ...
arXiv:2201.08919v1
fatcat:jubyuozwrjcydalzpcz26upo54
Clustering documents into a web directory for bootstrapping a supervised classification
2005
Data & Knowledge Engineering
The management of hierarchically organized data is starting to play a key role in the knowledge management community due to the proliferation of topic hierarchies for text documents. ...
This paper proposes some solutions for the bootstrapping problem, that implicitly or explicitly use taxonomy definition: a baseline approach that classifies documents according to the class terms, and ...
Specifically, for any taxonomy, 90% of the documents were used to bootstrap the taxonomy. ...
doi:10.1016/j.datak.2004.11.003
fatcat:76buyyrlhzbpjfmzeyazymhuei
Clustering documents in a web directory
2003
Proceedings of the fifth ACM international workshop on Web information and data management - WIDM '03
Hierarchical categorization of documents is a task receiving growing interest due to the widespread proliferation of topic hierarchies for text documents. ...
In this paper, we propose some solutions for the bootstrapping problem, implicitly or explicitly using a taxonomy definition: a baseline approach where documents are classified according to class labels ...
Nevertheless, a first sample of classified documents is always required. Moreover, these models are devised for non-hierarchical sets of classes. ...
doi:10.1145/956714.956715
fatcat:yspvbfs6y5dqjkemql3kpjvq64
Clustering documents in a web directory
2003
Proceedings of the fifth ACM international workshop on Web information and data management - WIDM '03
Hierarchical categorization of documents is a task receiving growing interest due to the widespread proliferation of topic hierarchies for text documents. ...
In this paper, we propose some solutions for the bootstrapping problem, implicitly or explicitly using a taxonomy definition: a baseline approach where documents are classified according to class labels ...
Nevertheless, a first sample of classified documents is always required. Moreover, these models are devised for non-hierarchical sets of classes. ...
doi:10.1145/956699.956715
dblp:conf/widm/AdamiAS03
fatcat:4zebjctjpndbpl3n3pb43igla4
Learning to Integrate Web Taxonomies
2004
Social Science Research Network
The second technique, Co-Bootstrapping, tries to facilitate the exploitation of inter-taxonomy relationships by providing category indicator functions as additional features for the objects. ...
We investigate machine learning methods for automatically integrating objects from different taxonomies into a master taxonomy. ...
For each $x \in S_{ik} \subset S_i$, one reasonable way to achieve hierarchical CS is as follows: first compute ... (1). To extend Co-Bootstrapping, it is useful to consider hierarchies as trees. ...
doi:10.2139/ssrn.3199170
fatcat:bvvssera2fdqlntmgwespr62mq
Convex Point Estimation using Undirected Bayesian Transfer Hierarchies
[article]
2012
arXiv
pre-print
We show that our framework is effective for learning models that are part of transfer hierarchies for two real-life tasks: object shape modeling using Gaussian density estimation and document classification ...
When related learning tasks are naturally arranged in a hierarchy, an appealing approach for coping with scarcity of instances is that of transfer learning using a hierarchical Bayes framework. ...
We consider the task of density estimation for multivariate Gaussian shape models as well as a document classification task. ...
arXiv:1206.3252v1
fatcat:zym5jswsznc3hdmt7lgarohtpy
Bootstrapping Wikipedia to answer ambiguous person name queries
2014
2014 IEEE 30th International Conference on Data Engineering Workshops
While such features yield satisfactory results for a wide range of queries, they aggravate the problem of search for ambiguous entities: Searching for a person yields satisfactory results only if the person ...
We show that when searching with ambiguous person names the information from Wikipedia can be bootstrapped to group the results according to the individuals occurring in them. ...
WEB PAGE CLASSIFICATION Our goal is to transform the clustering of search results to ambiguous person name queries into a classification task, by bootstrapping knowledge base entities about people with ...
doi:10.1109/icdew.2014.6818303
dblp:conf/icde/GrutzeKZN14
fatcat:wxwuwip2hbfzbcejg7m5jeqxwe
Weakly-Supervised Hierarchical Text Classification
2019
Proceedings of the AAAI Conference on Artificial Intelligence
In this paper, we propose a weakly-supervised neural method for hierarchical text classification. ...
Hierarchical text classification, which aims to classify text documents into a given hierarchy, is an important task in many real-world applications. ...
We thank anonymous reviewers for valuable and insightful feedback. ...
doi:10.1609/aaai.v33i01.33016826
fatcat:imnwyi4h5zh2zp74gi44dvqky4
Weakly-Supervised Hierarchical Text Classification
[article]
2018
arXiv
pre-print
In this paper, we propose a weakly-supervised neural method for hierarchical text classification. ...
Hierarchical text classification, which aims to classify text documents into a given hierarchy, is an important task in many real-world applications. ...
We thank anonymous reviewers for valuable and insightful feedback. ...
arXiv:1812.11270v1
fatcat:ka6tzyjmjzccbp4fts6cppovwe
Coarse2Fine: Fine-grained Text Classification on Coarsely-grained Annotated Data
[article]
2021
arXiv
pre-print
Our framework uses the fine-tuned generative models to sample pseudo-training data for training the classifier, and bootstraps on real unlabeled data for model refinement. ...
To accommodate such requirements, we introduce a new problem called coarse-to-fine grained classification, which aims to perform fine-grained classification on coarsely annotated data. ...
Acknowledgements We thank anonymous reviewers and program chairs for their valuable and insightful feedback. ...
arXiv:2109.10856v1
fatcat:xcsqk4vzgvhxpdfeprqaiozbxi
Limitations of Transformers on Clinical Text Classification
2021
IEEE Journal of Biomedical and Health Informatics
In this work, we introduce four methods to scale BERT, which by default can only handle input sequences up to approximately 400 words long, to perform document classification on clinical texts several ...
classification on long clinical texts is limited. ...
doi:10.1109/jbhi.2021.3062322
pmid:33635801
pmcid:PMC8387496
fatcat:yofi4nzvybehdpidxpirrmscm4
Classification of a COVID-19 dataset by using labels created from clustering algorithms
2021
Indonesian Journal of Electrical Engineering and Computer Science
In this paper, the hierarchical and k-means clustering techniques are used to create a tool for identifying similar articles on COVID-19 and filtering them based on their titles. ...
By using this tool, specialists can limit the number of articles they need to study and pre-process these articles via data framing, tokenisation, normalisation and term frequency-inverse document frequency ...
ACKNOWLEDGEMENTS This work was funded by the Allen Institute for AI, which prepared the CORD-19 dataset in partnership with leading research groups, and Kaggle, which hosted the COVID-19 Open Research ...
doi:10.11591/ijeecs.v21.i1.pp164-173
fatcat:qrl3526k25b6lfvx22viw6swui