254,040 Hits in 11.6 sec

Term Similarity and Weighting Framework for Text Representation [chapter]

Sadiq Sani, Nirmalie Wiratunga, Stewart Massie, Robert Lothian
2011 Lecture Notes in Computer Science  
In this paper we present a framework that combines term-similarity and weighting for text representation.  ...  Term-similarity measures are often used to improve representation by capturing semantic relationships between terms. Another consideration for representation involves the importance of terms.  ...  Text Representation Framework The first step in our framework is to obtain all pairwise term similarity values to populate a term-term similarity matrix T where the row and column dimensions of T represent  ... 
doi:10.1007/978-3-642-23291-6_23 fatcat:jhyxkunon5czzo7tefik2jl6di
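
The snippet above names a term-term similarity matrix T and a separate notion of term importance, but the excerpt is cut off before it says how the two are combined. The following is a minimal sketch under stated assumptions, not the authors' method: term similarity is taken to be the cosine between term columns of a document-term matrix, and the combination is assumed to be "weight each term, then spread it over similar terms through T". The function names, the toy data, and the `weights` values are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def term_similarity_matrix(doc_term):
    """Build T, where T[i][j] is the similarity between term i and term j.

    Assumption: each term is described by its column in the document-term matrix.
    """
    n_terms = len(doc_term[0])
    columns = [[row[j] for row in doc_term] for j in range(n_terms)]
    return [[cosine(columns[i], columns[j]) for j in range(n_terms)]
            for i in range(n_terms)]

def represent(doc_term, T, weights):
    """Assumed combination: scale each term by its weight, then multiply by T."""
    n_terms = len(T)
    out = []
    for row in doc_term:
        weighted = [x * w for x, w in zip(row, weights)]
        out.append([sum(weighted[i] * T[i][j] for i in range(n_terms))
                    for j in range(n_terms)])
    return out

# Toy document-term counts: rows are documents, columns are terms.
doc_term = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
T = term_similarity_matrix(doc_term)
weights = [1.0, 1.0, 2.0]  # hypothetical term-importance weights
new_repr = represent(doc_term, T, weights)
```

In this toy data the first two terms always co-occur, so T treats them as near-identical, while the third term shares no documents with them and gets zero similarity.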

Relevance feedback methods for logo and trademark image retrieval on the web

Euripides G. M. Petrakis, Klaydios Kontis, Epimenidis Voutsakis, Evangelos E. Milios
2006 Proceedings of the 2006 ACM symposium on Applied computing - SAC '06  
This evaluation demonstrates that term re-weighting based on text and image content is the most effective approach.  ...  This work extends the existing framework of image retrieval with relevance feedback on the Web by incorporating text and image content into the search and feedback process.  ...  The text similarity between a query Q and an image I is computed as $S_{text}(Q, I) = \sum_{i \in representation} w^{text}_i \, S^{text}_i(Q, I)$ (1), where $w^{text}_i$ are weights (inner weights) denoting the relative significance  ... 
doi:10.1145/1141277.1141532 dblp:conf/sac/PetrakisKVM06 fatcat:6sehsnlbcvahtfiqrqzh42hbwe
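
The text similarity in the snippet above is a weighted sum: one similarity score per text representation of the image, each multiplied by its inner weight. A minimal sketch of that sum follows. The representation names (`title`, `alt_text`, `caption`) and all numeric values are hypothetical, chosen only for illustration, and the per-representation scores are assumed to be computed elsewhere.

```python
def combined_text_similarity(weights, similarities):
    """Weighted sum of per-representation similarity scores.

    weights:      maps a representation name to its inner weight
    similarities: maps the same names to the query-image similarity
                  computed on that representation (assumed given)
    """
    return sum(weights[name] * similarities[name] for name in weights)

# Hypothetical representations and values, for illustration only.
weights = {"title": 0.5, "alt_text": 0.3, "caption": 0.2}
similarities = {"title": 0.8, "alt_text": 0.5, "caption": 0.0}

score = combined_text_similarity(weights, similarities)
```

Term re-weighting under relevance feedback would then amount to updating the values in `weights` between rounds; how the paper performs that update is not shown in the excerpt.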

Video + CLIP Baseline for Ego4D Long-term Action Anticipation [article]

Srijan Das, Michael S. Ryoo
2022 arXiv   pre-print
Our Video + CLIP framework makes use of a large-scale pre-trained paired image-text model, CLIP, and a video encoder, the SlowFast network.  ...  In this report, we introduce our adaptation of image-text models for long-term action anticipation.  ...  Acknowledgment We thank Jinghuan Shang and Xiang Li for their technical help in setting up the Ego4D dataset. We are grateful for valuable discussions with members of the Robotics Lab at Stony Brook University.  ... 
arXiv:2207.00579v1 fatcat:csnakcptpba35n6oryyho5ffji

Combining gene sequence similarity and textual information for gene function annotation in the literature

Luo Si, Danni Yu, Daisuke Kihara, Yi Fang
2008 Information retrieval (Boston)  
A supervised learning method is utilized to obtain the weights for combining the three types of evidence to assign appropriate Gene Ontology terms for target genes.  ...  This paper proposes a framework to improve the gene function annotation in the literature by considering both the textual information in the literature and the functions of genes with sequences similar  ...  We extract the text from the name field and the exact synonym field as multiple text representations for each GO term. Therefore, a specific GO term G may have multiple text representations as Tg !  ... 
doi:10.1007/s10791-008-9053-0 fatcat:vxlvdeza2rbjfmcx2hvuosc7py

Text-Based Ontology Enrichment Using Hierarchical Self-organizing Maps

Emil St. Chifu, Ioan Alfred Letia
2008 International Semantic Web Conference  
histograms (DCH) and the document frequency times inverse term frequency (DF-ITF) weighting scheme.  ...  In this paper we describe an unsupervised framework for domain ontology enrichment based on mining domain text corpora.  ...  The candidates for labels of newly inserted concepts are terms collected by mining a text corpus.  ... 
dblp:conf/semweb/ChifuL08 fatcat:74kckd5levg7tk5g2kiv7itici

Building Contextual Anchor Text Representation using Graph Regularization

Na Dai
Anchor texts are useful complementary description for target pages, widely applied to improve search relevance.  ...  The constraints draw from the estimation of anchor-anchor, anchor-page, and page-page similarity.  ...  Davison for the guidance on her thesis work. This work was supported by a grant from the National Science Foundation under award IIS-0803605.  ... 
doi:10.1609/aaai.v26i1.8123 fatcat:uzvbngcrabffzkds2dsvbnx2ei

Knowledge-driven graph similarity for text classification

Niloofer Shanavas, Hui Wang, Zhiwei Lin, Glenn Hawe
2020 International Journal of Machine Learning and Cybernetics  
In this paper, we present a graph kernel-based text classification framework which utilises the structural information in text effectively through the weighting and enrichment of a graph-based representation  ...  We introduce weighted co-occurrence graphs to represent text documents, which weight the terms and their dependencies based on their relevance to text classification.  ...  The proposed text classification framework can be adapted for different domains by designing the similarity matrix based on the domain.  ... 
doi:10.1007/s13042-020-01221-4 fatcat:wfzom7w3xjc5pfq4vxdwi6t2em

A novel semantic level text classification by combining NLP and Thesaurus concepts

R. Nagaraj
2014 IOSR Journal of Computer Engineering  
The semantic weights of terms related to concepts from Wikipedia and WordNet are used to represent semantic information.  ...  The semantic vector space model of terms, built by combining WordNet and Wikipedia, further improves the accuracy of text classification.  ...  The weight of each term in a document is usually measured via two schemes: Binary (1 for term appearing in the document, 0 for not) and Term Frequency-Inverse Document Frequency (TF-IDF).  ... 
doi:10.9790/0661-16461426 fatcat:2diyndiwdnfutaoswlj3bqklpq
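
The two weighting schemes named in the snippet above can be sketched as follows. The binary scheme follows the snippet directly. For TF-IDF, the excerpt does not say which variant the paper uses, so this sketch assumes the common form tf × log(N / df), with raw counts and no smoothing. The function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter

def binary_weights(doc_tokens, vocabulary):
    """1 if the term appears in the document, 0 otherwise."""
    present = set(doc_tokens)
    return {term: 1 if term in present else 0 for term in vocabulary}

def tfidf_weights(docs, vocabulary):
    """TF-IDF per document, assuming tf * log(N / df) with raw term counts."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    result = []
    for doc in docs:
        tf = Counter(doc)
        result.append({
            term: tf[term] * math.log(n_docs / df[term]) if df[term] else 0.0
            for term in vocabulary
        })
    return result

# Toy tokenized corpus, for illustration only.
docs = [
    ["text", "classification", "text"],
    ["semantic", "text"],
    ["semantic", "concept"],
]
vocabulary = sorted({term for doc in docs for term in doc})

binary = binary_weights(docs[0], vocabulary)
tfidf = tfidf_weights(docs, vocabulary)
```

Unlike the binary scheme, TF-IDF gives a term that appears in most documents a low weight even when it is frequent within one document.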

Text Classification based on Word Subspace with Term-Frequency [article]

Erica K. Shimomoto, Lincon S. Souza, Bernardo B. Gatto, Kazuhiro Fukui
2018 arXiv   pre-print
To incorporate the word frequency directly in the subspace model, we further extend the word subspace to the term-frequency (TF) weighted word subspace.  ...  Based on these new concepts, text classification can be performed under the mutual subspace method (MSM) framework.  ...  ACKNOWLEDGMENT This work is supported by JSPS KAKENHI Grant Number JP16H02842 and the Japanese Ministry of Education, Culture, Sports, Science, and Technology (MEXT) scholarship.  ... 
arXiv:1806.03125v1 fatcat:k2ixentftnhhrpee5gpxdlikma

Network Creation: Overview [chapter]

Christian Borgelt
2012 Lecture Notes in Computer Science  
We rather find the data we want to fuse, connect, analyze and thus exploit for creative discoveries, stored in flat files, (relational) databases, text document collections and the like.  ...  As a consequence, we need, as an initial step, methods that construct a network representation by analyzing tabular and textual data, in order to identify entities that can serve as nodes and to extract  ...  [5] discusses a representation and reasoning framework for graphs with probabilistically weighted edges that relies on the ProbLog language.  ... 
doi:10.1007/978-3-642-31830-6_4 fatcat:j5xzanveuzazdjtiz4l3pfkc7a

A Term Weighted Neural Language Model and Stacked Bidirectional LSTM Based Framework for Sarcasm Identification

Aytug Onan, Mansur Alp Tocoglu
2021 IEEE Access  
., word2vec, fastText and GloVe), two unsupervised term weighting functions (i.e., term-frequency, and TF-IDF) and eight supervised term weighting functions (i.e., odds ratio, relevance frequency, balanced  ...  For the evaluation task, the presented framework has been evaluated on three sarcasm identification corpora.  ...  In the term-weighted tri-gram text representation scheme, a weighted text representation has been obtained for each n-gram (namely, for n = 1, n = 2, and n = 3).  ... 
doi:10.1109/access.2021.3049734 fatcat:kzobtjmgp5b2bcsufcmmuoufza

A Hybrid Similarity Concept for Browsing Semi-structured Product Items [chapter]

Markus Zanker, Sergiu Gordea, Markus Jessenitschnig, Michael Schnabl
2006 Lecture Notes in Computer Science  
Similarity is an important underlying concept for the above techniques.  ...  Furthermore, we implemented our hybrid similarity concept in a service component and give evaluation results for the e-tourism domain.  ...  , where text similarity was weighted 30% and all structured features like category, price or facilities 70% -alone text similarity with positive and negative preferences alg pos/neg -pure similarity of  ... 
doi:10.1007/11823865_3 fatcat:h3fwjytuzbeppcvf66653oibge

Rules and fuzzy rules in text: concept, extraction and usage

D.H. Kraft, M.J. Martı́n-Bautista, J. Chen, D. Sánchez
2003 International Journal of Approximate Reasoning  
In this paper, we focus on the concept of rule and the management of uncertainty in text applications.  ...  The different structures considered for the construction of the rules, the extraction of the knowledge base and the applications and usage of these rules are detailed.  ...  This representation of documents, terms and weights is similar to the one of the vector space model [38].  ... 
doi:10.1016/j.ijar.2003.07.005 fatcat:36ypsda6mvh4jkty5runl4hrqm

Enhancing text clustering by leveraging Wikipedia semantics

Jian Hu, Lujun Fang, Yang Cao, Hua-Jun Zeng, Hua Li, Qiang Yang, Zheng Chen
2008 Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '08  
Then, we develop a unified framework to leverage these semantic relations in order to enhance traditional content similarity measure for text clustering.  ...  In addition, with the optimized weights for hypernym, synonym, and associative concepts that are tuned with the help of a small amount of user-provided labeled data, the clustering performance can be further improved  ...  After stopwords are removed and the remaining terms are stemmed with a stemmer such as the Porter stemmer [8], the stemmed terms form a vector representation for each text document.  ... 
doi:10.1145/1390334.1390367 dblp:conf/sigir/HuFCZLYC08 fatcat:fdtcfrunzbb57jfmkhmjzugerq

Self-Taught convolutional neural networks for short text clustering

Jiaming Xu, Bo Xu, Peng Wang, Suncong Zheng, Guanhua Tian, Jun Zhao, Bo Xu
2017 Neural Networks  
Here we propose a flexible Self-Taught Convolutional neural network framework for Short Text Clustering (dubbed STC^2), which can flexibly and successfully incorporate more useful semantic features and  ...  Short text clustering is a challenging problem due to its sparseness of text representation.  ...  Acknowledgments We would like to thank reviewers for their comments, and acknowledge Kag-  ... 
doi:10.1016/j.neunet.2016.12.008 pmid:28157556 fatcat:ytmml7ddcnc6rhk5gsuhrvty7a