An encoding technique based on word importance for the clustering of Web documents

J. Zakos, B. Verma
2002 Proceedings of the 9th International Conference on Neural Information Processing, 2002. ICONIP '02.  
We use a two level self-organizing map architecture to generate clusters of words and documents.  ...  A web document retrieval system is presented to demonstrate how this approach could be integrated into web search.  ...  It uses a two level Kohonen's self-organizing map approach to group words and documents of similar contextual similarity.  ... 
doi:10.1109/iconip.2002.1201885 fatcat:yodgeufqlzhcvikwjaiijqa3jq

Using a Connectionist Approach for Enhancing Domain Ontologies: Self-Organizing Word Category Maps Revisited [chapter]

Michael Dittenbach, Dieter Merkl, Helmut Berger
2003 Lecture Notes in Computer Science  
The terms, which are extracted from domain-specific text documents, are mapped onto a two-dimensional map to provide an intuitive interface displaying semantically similar words in spatially similar regions  ...  In this paper, we present an approach based on neural networks for organizing words of a specific domain according to their semantic relations.  ...  In other words, the average contexts of words at displacements −1 and +1 constitute the contextual description: $x_i = \begin{bmatrix} x_i^{(-1)} \\ x_i^{(1)} \end{bmatrix}$ (2). Self-Organizing Map Algorithm: The self-organizing map (SOM)  ... 
doi:10.1007/978-3-540-45228-7_27 fatcat:tumnovrn25dnhebcndhzbc3txy

Integrating contextual information to enhance SOM-based text document clustering

Daniel Pullwitt
2002 Neural Networks  
Exploration of text corpora using Self-Organizing Maps has shown promising results in recent years.  ...  Topographic map approaches usually use the original vector space model known from Information Retrieval for text document representation.  ...  The Self-Organizing Map (SOM) (Kohonen, 1995), an unsupervised algorithm for clustering and topographic mapping, has been repeatedly used for this task, examples being some basic work in the ET-Map  ... 
doi:10.1016/s0893-6080(02)00082-5 pmid:12416697 fatcat:7j74aplxebbnxpwbdnvgctzwj4

Self-organizing Maps in Web Mining and Semantic Web [chapter]

Emil St., Ioan Alfred
2010 Self-Organizing Maps  
Web Mining with Self-organizing Maps: Applying SOM on natural language data means doing data mining on text data, for instance Web documents (Lagus, 2000).  ...  After training a SOM on all the words in a collection of documents, where the vectorial coding of words represents the contextual usage, the resulting self-organizing map groups the words in semantic categories  ...  Self-organizing Maps in Web Mining and Semantic Web, Self-Organizing Maps, George K Matsopoulos (Ed.), ISBN: 978-953-307-074-2, InTech, Available from:  ... 
doi:10.5772/9172 fatcat:vqyekr43lnafloejqisnfbzv5m

Skim-Attention: Learning to Focus via Document Layout [article]

Laura Nguyen, Thomas Scialom, Jacopo Staiano, Benjamin Piwowarski
2021 arXiv   pre-print
Skim-Attention can be further combined with long-range Transformers to efficiently process long documents.  ...  Transformer-based pre-training techniques of text and layout have proven effective in a number of document understanding tasks.  ...  Acknowledgments JRR wants to thank the organizers for a fantastic conference in the Canadian wilderness. Many thanks who contributed the content of this discussion, among them are J.  ... 
arXiv:2109.01078v1 fatcat:43jhyddsjfhzjlcyjla6ybrsra

Towards a Linguistic Stylometric Model for the Authorship Detection in Cybercrime Investigations

Abdulfattah Omar, Aldawsari Bader Deraan
2019 International Journal of English Linguistics  
It is also clear that the use of a self-organizing map (SOM) led to better clustering performance because of its capacity to integrate two different linguistic levels for each author profile.  ...  This study proposes an integrated framework that considers letter-pair frequencies/combinations along with the lexical features of documents as a means to identifying the authorship of short texts posted  ...  It is also clear that the use of a self-organizing map (SOM) leads to better clustering performance with its capacity to integrate two different linguistic levels (i.e., both morphological and lexical  ... 
doi:10.5539/ijel.v9n5p182 fatcat:2n5omcn7vfhwvh4js3thporcs4

Cybercrime and Authorship Detection in Very Short Texts A Quantitative Morpho-lexical Approach

Abdulfattah Omar
2019 مجلة البحث العلمی فی الآداب  
It is also clear that the use of the self-organizing map (SOM) led to better clustering performance for its capacity to integrate two different linguistic levels of each author profile together.  ...  The present study proposes an integrated framework that considers letter-pair frequencies/combinations along with the lexical features of documents.  ...  For classification purposes, the self-organizing maps (SOM) model is used.  ... 
doi:10.21608/jssa.2019.38725 fatcat:etmpfy6u4bekxonvo4vp4bnzb4

Helping Knowledge Cross Boundaries: Using Knowledge Visualization to Support Cross-Community Sensemaking

Jasminko Novak
2007 2007 40th Annual Hawaii International Conference on System Sciences (HICSS'07)  
The developed method enables the visualization of implicit structures of personal and community knowledge and their use for multi-perspective access to community information spaces.  ...  Eliciting personal points of view To construct such maps based on users personal points of view we combine statistical text-analysis and self-organized clustering with methods for supervised learning of  ...  The Structuring View lets the user organize his information seeking results into personal maps (adding and grouping documents, naming clusters etc.)  ... 
doi:10.1109/hicss.2007.245 dblp:conf/hicss/Novak07 fatcat:evi7lpyx7rgh5awpjpxg6yifoe

Semi-Automated Extraction of Targeted Data from Web Pages

F. Estievenart, J. Meurisse, J. Hainaut, P. Thiran
2006 22nd International Conference on Data Engineering Workshops (ICDEW'06)  
Such rules mainly record a semantic interpretation of recurring types of information in a cluster of similar Web documents and their location in those documents.  ...  The World Wide Web can be considered an infinite source of information for both individuals and organizations.  ...  The paper is organized as follows: in Section 2, the main concepts of our approach are defined: page cluster, page component and mapping rule.  ... 
doi:10.1109/icdew.2006.135 dblp:conf/icde/EstievenartMHT06 fatcat:vwhfskctujcr7egcynig6mbire

A multimedia interactive search engine based on graph-based and non-linear multimodal fusion

Anastasia Moumtzidou, Ilias Gialampoukidis, Theodoros Mironidis, Dimitris Liparas, Stefanos Vrochidis, Ioannis Kompatsiaris
2016 2016 14th International Workshop on Content-Based Multimedia Indexing (CBMI)  
This paper presents an interactive multimedia search engine, which is capable of searching into multimedia collections by fusing textual and visual information.  ...  Apart from multimedia search, the engine is able to perform text search and image retrieval independently using both high-level and low-level information.  ...  Using Self Organizing Maps [12], all images are clustered, hence, all images of the collection are organized by color.  ... 
doi:10.1109/cbmi.2016.7500276 dblp:conf/cbmi/MoumtzidouGMLVK16 fatcat:zx3tgnl6dbcljaaezmakgmc72y

Dragon Toolkit: Incorporating Auto-Learned Semantic Knowledge into Large-Scale Text Retrieval and Mining

Xiaohua Zhou, Xiaodan Zhang, Xiaohua Hu
2007 19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2007)  
Our method extracts explicit topic signatures from documents and then statistically maps them into single-word features.  ...  The dragon toolkit reflects our method and its effectiveness is demonstrated by three tasks, text retrieval, text classification, and text clustering.  ...  Examples of semantic mapping are shown in Figure 2. If topic signatures such as multiword phrases and ontological concepts self-contain contextual information, the mapping is context-sensitive.  ... 
doi:10.1109/ictai.2007.117 dblp:conf/ictai/ZhouZH07 fatcat:n7oe6nms3vhphpa4f26zzoo2d4

A Detailed Survey on Topic Modeling for Document and Short Text Data

S. Likhitha, B. S., H. M.
2019 International Journal of Computer Applications  
Text mining is one of the most significant fields in the digital era due to the rapid growth of textual information. Topic models have been gaining popularity in the last few years.  ...  These methods gained popularity in extracting hidden themes from the document (corpus).  ...  It finds more meaningful contextual structure. Kandemir et al. [77] worked on integrating LDA and sparse Gaussian processes.  ... 
doi:10.5120/ijca2019919265 fatcat:jmti3vkmufa3xkywpo3pebravi

Projection: A Mixed-Initiative Research Process [article]

Austin Silveria
2022 arXiv   pre-print
The interface supports adding context to searches and visualizing information in multiple dimensions with techniques such as hierarchical clustering and spatial projections.  ...  Communication of dense information between humans and machines is relatively low bandwidth.  ...  The tool clusters documents on a map based on their similarity (as seen in Figure 1), the goal being that similar documents show up on the map next to each other and can be grouped into hierarchical  ... 
arXiv:2201.03107v1 fatcat:xvv7nxfrszgc7phrbl5z5retry

Semantic Research for Digital Libraries

Hsinchun Chen
1999 D-Lib Magazine  
Automatic Categorization: A category map is the result of performing a neural network-based clustering (self-organizing) of similar documents and automatic category labeling.  ...  of heterogeneous repositories with disparate semantics, clustering and automatic hierarchical organization of information, and algorithms for automatic rating, ranking, and evaluation of information quality  ... 
doi:10.1045/october99-chen fatcat:gxlv5znrxza2xmpcak7h4noway

Text Analysis for Constructing Design Representations [chapter]

Andy Dong, Alice M. Agogino
1996 Artificial Intelligence in Design '96  
We integrate the design document learning system with an agent-based collaborative design system for fetching design information based on the "smart drawings" paradigm.  ...  Along with the benefits of this concurrency comes the complexity of sharing and accessing design information.  ...  Acknowledgements The authors would like to acknowledge William H Wood III for his valuable comments and converting the scanned document images into ASCII text, and John Wiley and Sons, Inc. for their permission  ... 
doi:10.1007/978-94-009-0279-4_2 fatcat:c2h7iwaxr5dnrnhibo564hrnsi