Seed-Guided Deep Document Clustering
[chapter]
2020
Lecture Notes in Computer Science
In this paper, we jointly learn deep representations and bias the clustering results through the seed words, leading to a Seed-guided Deep Document Clustering approach. ...
This seed-guided constrained document clustering problem was recently addressed through topic modeling approaches. ...
Deep clustering consists in jointly performing clustering and deep representation learning in an unsupervised fashion (e.g., with an auto-encoder). ...
doi:10.1007/978-3-030-45439-5_1
fatcat:cug7brgy6bdxzcrwynaiarcz6y
CoRel: Seed-Guided Topical Taxonomy Construction by Concept Learning and Relation Transferring
[article]
2020
arXiv
pre-print
In this paper, we propose a method for seed-guided topical taxonomy construction, which takes a corpus and a seed taxonomy described by concept names as input, and constructs a more complete taxonomy based on user's interest, wherein each node is represented by a cluster of coherent terms. ...
Seed-Guided Taxonomy Construction. ...
arXiv:2010.06714v1
fatcat:u7b7y444dnfxzalqmx4jhqfcqq
Urdu Documents Clustering with Unsupervised and Semi-Supervised Probabilistic Topic Modeling
2020
Information
Document clustering groups documents according to certain semantic features. Topic models have a richer semantic structure and considerable potential for helping users understand document corpora. ...
Therefore, document clustering has become a challenging task in the Urdu language, which has its own morphology, syntax and semantics. ...
Seed topics: Seeded-ULDA allows a user to guide the topic discovery process. The user can give sets of seed words that are representative of the given dataset. ...
doi:10.3390/info11110518
fatcat:4t3pre3d2vegzf2kojgeqhflsu
Identification of Vine Weeds in Florida Citrus
1969
EDIS
A combination of leaf, stem, fruit, and/or seed characteristics will aid in the identification process. ...
A useful guide to the characteristics used to identify broadleaf plants is included at the end of this article. ...
Flowers: yellow, 5 to 8 cm long, funnel-shaped, single or in small clusters.
Fruit: flat, bean-like, up to 20 inches long, contains oblong, winged seeds. ...
doi:10.32473/edis-hs185-2003
fatcat:vochtehfw5brtp6w6isa6dlw7i
Discovering Topic Representative Terms for Short Text Clustering
2019
IEEE Access
Index terms: short text, clustering, topic representative terms. ...
., supported by a cluster of short texts), and we also call them topic representative terms. ...
Short texts belonging to the same latent topic are grouped as a cluster. STC 2 [4] is a deep learning based clustering framework for short texts. ...
doi:10.1109/access.2019.2927345
fatcat:7jltjkmohzae5ha2wk5rlvblzi
CrowdTSC: Crowd-based Neural Networks for Text Sentiment Classification
[article]
2020
arXiv
pre-print
Sampling and clustering are utilized to reduce the cost of crowdsourcing. ...
Also, we present an attention-based neural network and a hybrid neural network, which incorporate the collected keywords as human being's guidance into deep neural networks. ...
To reduce the monetary cost of hiring the crowd workers, we design a cluster-based crowdsourcing method to collect keywords in the given text datasets. ...
arXiv:2004.12389v1
fatcat:wu23ccmilnad5bvuwiy7diqx7i
Scaling up Analogy with Crowdsourcing and Machine Learning
2016
International Conference on Case-Based Reasoning
We demonstrate our approach with a crowdsourced analogy identification task, whose results are used to train deep learning algorithms. ...
In this paper, we propose to leverage crowdsourcing techniques to construct a dataset with rich "analogy-tuning" signals, used to guide machine learning models towards matches based on relations rather ...
However, these methods are not aimed at finding analogical clusters, which requires supporting deep relational similarity rather than surface similarity. ...
dblp:conf/iccbr/ChanHSK16
fatcat:33ye36x4mzg6xig6txr7ik676i
A Survey on Automatically Mining Facets for Web Queries
2017
International Journal of Electrical and Computer Engineering (IJECE)
From these top seed sites facets are extracted by document parsing, weighting, clustering and ranking of the extracted facets. ...
to my guide and Head of the Department of Computer Engineering, RMDSSOE, Prof. ...
doi:10.11591/ijece.v7i6.pp3700-3704
fatcat:ahqoepjrfbb75c4xfjcdqnfidm
Feature space learning model
2018
Journal of Ambient Intelligence and Humanized Computing
To avoid the complex training processes in deep learning models which project original feature space into low-dimensional ones, we propose a novel feature space learning (FSL) model. ...
FSL algorithms are proposed with the feature space updating procedure; (3) FSL can provide a better data understanding and learn descriptive and compact feature spaces without the tough training for deep ...
This model combines prior information and an assumption of consistency, which could not only embed the labeled information in similarity measurements, but also guide the clustering procedures. ...
doi:10.1007/s12652-018-0805-4
pmid:31068980
pmcid:PMC6502470
fatcat:ny7qwf3axrbklk25sy2cn2voum
Improving Seeded k-Means Clustering with Deviation- and Entropy-Based Term Weightings
2020
IEICE transactions on information and systems
The outcome of document clustering depends on the scheme used to assign a weight to each term in a document. ...
In addition, their potential combinations are investigated to find optimal solutions in guiding the clustering process. ...
... controlling/guiding the process of clustering documents. ...
doi:10.1587/transinf.2019iip0017
fatcat:nsngvz7ewnfobhizk44utc4v24
Seeded Hierarchical Clustering for Expert-Crafted Taxonomies
[article]
2022
arXiv
pre-print
In this work, we study Seeded Hierarchical Clustering (SHC): the task of automatically fitting unlabeled data to such taxonomies using only a small set of labeled examples. ...
HierSeed assigns documents to topics by weighing document density against topic hierarchical structure. ...
Problem formulation: Given an unlabeled corpus D (the fitting set), a hierarchy of topics T 1 of height N and a seed documents set S for each topic, the aim of Seeded Hierarchical Clustering ...
arXiv:2205.11602v1
fatcat:jmdwans4jnejlp5rxpwlyd3dbe
W2VLDA: Almost Unsupervised System for Aspect Based Sentiment Analysis
[article]
2017
arXiv
pre-print
Assessing the impact of the seed words: Since the proposed approach relies heavily on the seed words (i.e. seed words are the only source of supervision to guide the algorithm to the desired goal), it is interesting ...
In the case of modelling the polarity of the documents, it usually means using a carefully selected set of seed words. ...
arXiv:1705.07687v2
fatcat:iuxbvind6rcz7flr2432hdlwu4
Multi-Label Annotation and Classification of Arabic Texts Based on Extracted Seed Keyphrases and Bi-Gram Alphabet Feed Forward Neural Networks Model
2022
ACM Transactions on Asian and Low-Resource Language Information Processing
In this phase, review data instances were automatically annotated as multi-label based on the extracted seed keyphrases clusters. ...
These keyphrases are referred to as seed keyphrases. Extracted seed keyphrases are divided into several clusters based on their topics. Each cluster is assigned a suitable label. ...
To identify cluster labels, the obtained seed keyphrases are divided into groups (clusters) based on their relevance to a specific topic (their context). ...
doi:10.1145/3539607
fatcat:ld4u43gdsrdqtm22vs27r2zuye
Service Class Driven Dynamic Data Source Discovery with DynaBot
2007
International Journal of Web Services Research
Third, DYNABOT incorporates methods and algorithms for efficient probing of the Deep Web and for discovering and clustering Deep Web sources and services through SCD-based service matching analysis. ...
To address these challenges, we present DYNABOT, a service-centric crawler for discovering and clustering Deep Web sources offering dynamic content. DYNABOT has three unique characteristics. ...
Another optimization that can be used to guide the analysis process is document text analysis. ...
doi:10.4018/jwsr.2007070102
fatcat:hzqbi4umnfbllpr3y43tldhwmq
Topic Embeddings - A New Approach to Classify Very Short Documents Based on Predefined Topics
2019
International Conference on Wirtschaftsinformatik
We develop a new unsupervised method based on word embeddings to classify documents into predefined topics. We evaluate the predictive performance of this novel approach and compare it to seeded LDA. ...
We use a real-world dataset from online advertising, which is comprised of markedly short documents. ...
In addition to the topic-word level, seeded LDA guides the probability distributions in the document-topic layer. ...
dblp:conf/wirtschaftsinformatik/LommelRFJ19
fatcat:7ex3th6dn5hyhnyp5gz4yueabe