Hierarchical Metadata-Aware Document Categorization under Weak Supervision
[article]
2020
Hence, this paper studies how to integrate the label hierarchy, metadata, and text signals for document categorization under weak supervision. ...
Categorizing documents into a given label hierarchy is intuitively appealing due to the ubiquity of hierarchical topic structures in massive text corpora. ...
Conclusions: We present HiMeCat, an embedding-based generative framework for hierarchical metadata-aware document categorization under weak supervision. ...
doi:10.48550/arxiv.2010.13556
fatcat:ecsgzyce3bbs7flpwk73viszga
MotifClass: Weakly Supervised Text Classification with Higher-order Metadata Information
[article]
2022
arXiv
pre-print
In this paper, we explore the potential of using metadata to help weakly supervised text classification. ...
To be specific, we model the relationships between documents and metadata via a heterogeneous information network. ...
[39, 40, 41, 44] use a small set of labeled documents or keywords as supervision to categorize text with metadata. ...
arXiv:2111.04022v3
fatcat:cuupcigyzfhj7hjiciwonbpjzq
Metadata-Induced Contrastive Learning for Zero-Shot Multi-Label Text Classification
[article]
2022
arXiv
pre-print
metadata-aware LMTC method trained on 10K-200K labeled documents; and (3) MICoL tends to predict more infrequent labels than supervised methods, thus alleviating the deteriorated performance on long-tailed ...
In this paper, we study LMTC under the zero-shot setting, which does not require any annotated documents with labels and only relies on label surface names and descriptions. ...
[19] propose a generic approach to add categorical metadata into neural text classifiers. Zhang et al. [68] further study metadata-aware LMTC. ...
arXiv:2202.05932v2
fatcat:ii74jumzsfd3nlxvcshmn2xlca
Automated Educational Course Metadata Generation Based on Semantics Discovery
[chapter]
2009
Lecture Notes in Computer Science
In this paper we present a method for automated metadata generation addressing the educational knowledge discovery problem. ...
The metadata are created automatically under the adaptive course author's (i.e., teacher's) supervision. Thus, his effort in the authoring process is reduced. ...
Learning objects were organized hierarchically and represented using the DocBook language. ...
doi:10.1007/978-3-642-04636-0_11
fatcat:62w4uyrp5jcmdgs7rjoixlqc3a
Modularization and multi-granularity reuse of learning resources
2009
ACM SIGMultimedia Records
6.2.4 Choosing Effectiveness Measures for Hierarchical Categorization
hierarchical classification [SLN03]. ...
(6.6) Categorization of a document with the kNN method is based on the similarity of documents. ...
The adaptation tool is designed to handle different document formats. ...
doi:10.1145/1662529.1662533
fatcat:cykmsfw3p5dvrfmnw4dzgvca3y
Automatic image semantic interpretation using social action and tagging data
2010
Multimedia tools and applications
Applications are categorized into four types: concept semantics, person identification, location semantics and event semantics. ...
Tang et al. presented a sparse graph-based semi-supervised learning approach for removing weak pairing between image features and tags [180]. ...
Functions are categorized into 'organization' and 'communication', whereas sociality is categorized into 'self' and 'social'. ...
doi:10.1007/s11042-010-0650-8
fatcat:kqu6kyess5f3re554jsueuuzem
Embedding-based Detection and Extraction of Research Topics from Academic Documents Using Deep Clustering
2021
Journal of Data and Information Science
Document embedding approaches are utilized to transform documents into vector-based representations. ...
Design/methodology/approach: To achieve the objectives, we propose a modified deep clustering method to detect research trends from the abstracts and titles of academic documents. ...
However, the weak performance of BERT models may come as a surprise. ...
doi:10.2478/jdis-2021-0024
fatcat:ww67w2ezhvdrnedr47vwfddima
NTARC: A Data Model for the Systematic Review of Network Traffic Analysis Research
2020
Applied Sciences
Although the goals and methodologies are commonly similar, we lack initiatives to categorize the data, methods, and findings systematically. ...
The success of data repositories partially lies in creating metadata structures able to categorize and identify datasets and research objects effectively. ...
The database is made available under a Creative Commons Attribution 4.0 license. ...
doi:10.3390/app10124307
fatcat:cvgagi6qyjd3tf5orsp6nymdki
Video Summarization Using Deep Neural Networks: A Survey
[article]
2021
arXiv
pre-print
Based on the outcomes of these comparisons, as well as some documented considerations about the amount of annotated data and the suitability of evaluation protocols, we indicate potential future research ...
Instead of not using any ground-truth data, they use less-expensive weak labels (such as video-level metadata for video categorization and category-driven summarization, or ground-truth annotations for ...
This approach uses video-level metadata (e.g., the video title "A man is cooking") to define a categorization of videos. ...
arXiv:2101.06072v2
fatcat:7mozntfhdrf3lkw6pwcr5v2rpu
Big-Data-Augmented Approach to Emerging Technologies Identification: Case of Agriculture and Food Sector
2017
Social Science Research Network
of both structured (publication, patent metadata) and unstructured (full text reports, declarations and other documents) formats, sentence segmentation and word tokenization, word lemmatization and ...
This makes it possible to map even emerging fields that haven't yet been categorized for the purposes of official statistics.
Figure 2. ...
doi:10.2139/ssrn.3078499
fatcat:o6ver5pxordcnmyoehokv5vuqa
Expert recommendation in community question answering: a review and future direction
2019
International Journal of Crowd Science
Findings: This study proposes a comprehensive framework to categorize extant studies into three broad areas of CQA expert recommendation research: understanding profile modeling, recommendation approaches ...
(Riahi et al., 2012) proposed a segmented topic model (STM) that can discover the hierarchical structure of topics, and thus, instead of grouping all users' questions under one topic, allows each question ...
(Li and King, 2010) combined expertise-aware QLL with the Jelinek-Mercer smoothing model that leveraged multiple metadata features such as answer length, question-answer length, the number of answers ...
doi:10.1108/ijcs-03-2019-0011
fatcat:5waemn4e3zfu5b4n55f6qxafbu
Domino: Discovering Systematic Errors with Cross-Modal Embeddings
[article]
2022
arXiv
pre-print
Then, motivated by the recent development of powerful cross-modal representation learning approaches, we present Domino, an SDM that leverages cross-modal embeddings and a novel error-aware mixture model ...
Aortic Valve Malformation Classification (Noisy Label Slice): Weak supervision is commonly used in medical machine learning practice to label clinical datasets. ...
We begin with a base dataset D_base that has either a hierarchical label structure (e.g., ImageNet) or rich metadata accompanying each example (e.g., CelebA). ...
arXiv:2203.14960v3
fatcat:zci7on55mrft5i4lsvhoealira
TAN-NTM: Topic Attention Networks for Neural Topic Modeling
[article]
2021
arXiv
pre-print
To this end, we develop a framework, TAN-NTM, which processes a document as a sequence of tokens through an LSTM whose contextual outputs are attended in a topic-aware manner. ...
Further, we show that our method learns better latent document-topic features compared to existing topic models through improvement on two downstream tasks: document classification and topic guided keyphrase ...
Card et al. (2018) leverage document metadata, but without metadata their method is the same as ProdLDA, which is our baseline. ...
arXiv:2012.01524v2
fatcat:qzupfp6zzvcejdhqhswx7whniu
Analyzing Sentiments in One Go: A Supervised Joint Topic Modeling Approach
2017
IEEE Transactions on Knowledge and Data Engineering
We propose a novel probabilistic supervised joint aspect and sentiment model (SJASM) to deal with the problems in one go under a unified framework. ...
SJASM represents each review document in the form of opinion pairs, and can simultaneously model aspect terms and corresponding opinion words of the review for hidden aspect and sentiment detection. ...
supported in part by a grant awarded by a Singapore MOE AcRF Tier 2 Grant (ARC30/12), a Singapore MOE AcRF Tier 1 Grant (RG 66/12), and a National Research Foundation, Prime Minister's Office, Singapore under ...
doi:10.1109/tkde.2017.2669027
fatcat:24uviz42ffhutc5ayjuvsudo4m
A Survey of Techniques for Event Detection in Twitter
2013
Computational intelligence
For instance, Twitter has changed the way people and businesses perform, seek advice, and create "ambient awareness" (a sort of virtual omnipresence) and reinforced the weak and strong tie of friendship ...
Nevertheless, they are also categorized according to the detection methods that involve supervised, unsupervised, and hybrid approaches. ...
doi:10.1111/coin.12017
fatcat:wr3wcvxmavbarityeu2szfcuw4