Supervised Learning for Automatic Classification of Documents using Self-Organizing Maps

Dina Goren-Bar, Tsvi Kuflik, Dror Lev
2000 DELOS Workshops / Conferences  
This study presents the application of SOM and LVQ to automatic document classification, based on a predefined set of clusters. A set of documents, manually clustered by a domain expert, was used.  ...  Automatic Document Classification that corresponds with user-predefined classes is a challenging and widely researched area.  ...  SOM and LVQ may get as input a training set of documents, train the specific ANN for that set and, later on, cluster an incoming stream of new documents to the automatically generated categories.  ... 
dblp:conf/delos/Goren-BarKL00 fatcat:65g6nhxjsrbczlom6wscv2z6ni

Toward File Consolidation by Document Categorization [chapter]

Abdel Belaïd, André Alusse
2006 Lecture Notes in Computer Science  
An efficient adaptive document classification and categorization approach is proposed for personal file creation corresponding to user's specific needs and profile.  ...  This kind of approach is needed because the search engines are often too general to offer a precise answer to the user request.  ...  File Constitution In order to discover sets of similar documents and highlight categories, the categorization, automatic clustering and summarization of documents are possible issues to help the user to  ... 
doi:10.1007/11669487_39 fatcat:l4koj5ocvrbzhmsfeugqphxoyi

A Survey: Techniques of an Efficient Search Annotation based on Web Content Mining

Sobana. E, Muthusankar. D
2014 International Journal of Computer Applications  
This paper focuses on how to extract information effectively based on classification and clustering, and on detecting phishing websites.  ...  Due to the overload of information on the web, information extraction does not effectively meet user needs.  ...  The main goal of XML document classification is to build a classifier model that can automatically assign XML documents to some existing categories.  ... 
doi:10.5120/18181-9072 fatcat:6vppuxoy3zb35do7mawyeabvni

Adaptive Content Mapping for Internet Navigation [chapter]

R.W. Brause, M. Ueberall
2003 Internet-Based Intelligent Information Processing Systems  
• Load balancing For large document collections, the interaction speed and therefore the user acceptance of the system depends on the ability to automatically distribute the workload within a cluster  ...  Automatic classification by clustering The similarity measures defined so far can be used to group documents into clusters. These semantic clusters represent a natural classification.  ... 
doi:10.1142/9789812795342_0002 fatcat:fdtxjjqmjvddzjhigxgavbuf7a

Reduction of Search Space in Restful Service Discovery

G. Venugopal, P. Radhika Raju, A. Ananda Rao
2019 International Journal of Scientific Research in Computer Science Engineering and Information Technology  
The TASSIC approach searches the semantic characteristics of search and match interface terms in the service document.  ...  A new approach is proposed for reducing the search space in RESTful service discovery by developing a k-Nearest Neighbor classification algorithm. It provides candidate services with ranking based on  ...  Service Classification Service classification is the process of automatically classifying RESTful web services/APIs into one or more predefined classes based on their feature vectors and similarity between  ... 
doi:10.32628/cseit195430 fatcat:3rb2fxtygzbt5ngcrx5uv7fi3u

Text documents clustering using data mining techniques

Ahmed Adeeb Jalal, Basheer Husham Ali
2021 International Journal of Electrical and Computer Engineering (IJECE)  
The proposed approach uses the title, abstract, and keywords of the paper, in addition to the category topics, to perform the classification process.  ...  Subsequently, documents are classified and clustered into the primary categories based on the highest measure of cosine similarity between category weights and document weights.  ...  Clustering will help the user to get all relevant documents in one category and the search can be limited to some important documents of his choice.  ... 
doi:10.11591/ijece.v11i1.pp664-670 fatcat:7jftmohobndavdgt6jg3iclvke
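
The assignment rule described in this snippet (pick the category whose weight vector has the highest cosine similarity with the document's weight vector) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the raw term-frequency weighting, the function names, and the toy category descriptions are all assumptions, since the snippet does not specify the weighting scheme.

```python
import math
from collections import Counter


def term_weights(text):
    """Raw term counts as weights (assumed; the paper's scheme is not given)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse weight vectors (dict-like)."""
    dot = sum(w * b.get(term, 0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def assign_category(doc_text, categories):
    """Return the category name with the highest cosine similarity."""
    doc = term_weights(doc_text)
    return max(
        categories,
        key=lambda name: cosine(doc, term_weights(categories[name])),
    )


# Hypothetical category descriptions, for illustration only.
categories = {
    "clustering": "document clustering similarity cluster",
    "electronics": "power converter drive motor voltage",
}
best = assign_category("cosine similarity for document clustering", categories)
```

In practice the raw counts would usually be replaced by a weighting such as TF-IDF; the assignment step itself stays the same.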

Concept Extraction and Clustering for Topic Digital Library Construction

Chengzhi Zhang, Dan Wu
2008 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology  
Firstly, documents in a special domain are automatically produced by a document classification approach. Then, the keywords of each document are extracted using a machine learning approach.  ...  This paper introduces a new approach to building a topic digital library using concept extraction and document clustering.  ...  Firstly, a document subset of a special domain is produced by an automatic document classification approach.  ... 
doi:10.1109/wiiat.2008.81 dblp:conf/iat/ZhangW08 fatcat:gmeidqchqrd6xdw6ais2ka2vny

A novel method to Automatically Categorizing Search Results using Web Search Goals

Rohini B.Mothe, V. S. Deshmukh
2014 International Journal of Computer Applications  
Each pseudo-document contains a set of keywords that represent different aspects of the query. Clustering is then performed on these pseudo-documents using fuzzy k-means clustering.  ...  Based on this feedback, text processing is performed, each URL is enriched with a combination of its title and snippet, and these data are mapped to pseudo-documents.  ...  In this approach, our aim is to discover different user search goals for a query and to depict each search goal automatically with some keywords, which are used as cluster labels.  ... 
doi:10.5120/17318-3899 fatcat:733lbus3qjc4nb6a54civsjt5a

Automated subject classification of textual web documents

Koraljka Golub
2006 Journal of Documentation  
Findings - Provides major similarities and differences between the three approaches: document pre-processing and utilization of web-specific document characteristics is common to all the approaches; major  ...  Design/methodology/approach - A range of works dealing with automated classification of full-text web documents are discussed.  ...  In this approach the clusters and, to a limited degree, relationships between clusters are derived automatically from the documents to be clustered, and the documents are subsequently assigned to those  ... 
doi:10.1108/00220410610666501 fatcat:cdlhrejd7jfmzlxn7q646xk3ya

Hierarchical Document Classification Using Automatically Generated Hierarchy [chapter]

Tao Li, Shenghuo Zhu
2005 Proceedings of the 2005 SIAM International Conference on Data Mining  
The linear discriminant projection approach first transforms all documents onto a low-dimensional space and then clusters the categories into hierarchies accordingly.  ...  Although considerable research has been conducted in the field of hierarchical document categorization, little has been done on automatic generation of topic hierarchies.  ...  Acknowledgment The authors would like to thank the anonymous reviewers for their invaluable comments.  ... 
doi:10.1137/1.9781611972757.53 dblp:conf/sdm/Li05 fatcat:p642regjbrhmjiw5avw7cufwmq

Hierarchical document classification using automatically generated hierarchy

Tao Li, Shenghuo Zhu, Mitsunori Ogihara
2007 Journal of Intelligent Information Systems  
The linear discriminant projection approach first transforms all documents onto a low-dimensional space and then clusters the categories into hierarchies accordingly.  ...  Although considerable research has been conducted in the field of hierarchical document categorization, little has been done on automatic generation of topic hierarchies.  ...  Acknowledgment The authors would like to thank the anonymous reviewers for their invaluable comments.  ... 
doi:10.1007/s10844-006-0019-7 fatcat:oidcaew5sjambjhdbtpgo3mjci

Applications of text mining within systematic reviews

James Thomas, John McNaught, Sophia Ananiadou
2011 Research Synthesis Methods  
In this paper, we describe the application of four text mining technologies, namely, automatic term recognition, document clustering, classification and summarization, which support the identification  ...  Text mining technologies offer one possible way forward in reducing the amount of time systematic reviews take to conduct.  ...  Acknowledgements The authors thank Mark Newman, Josephine Kavanagh, and the editors and anonymous peer reviewers for their helpful comments on earlier drafts of this paper.  ... 
doi:10.1002/jrsm.27 pmid:26061596 fatcat:a4o5uptnzjalvndqiilashqmjm

PEx-WEB: Content-based Visualization of Web Search Results

Fernando V. Paulovich, Roberto Pinho, Charl P. Botha, Anton Heijs, Rosane Minghim
2008 2008 12th International Conference Information Visualisation  
The second technique is capable of identifying, labeling and displaying topics within sub-groups of documents on the map.  ...  This paper presents a system that adapts two techniques to map and explore web results visually in order to find relevant patterns and relationships amongst the resulting documents.  ...  We wish to acknowledge the work of our undergraduate and research students as well as research colleagues in processing some data and discussing various issues of the work.  ... 
doi:10.1109/iv.2008.94 dblp:conf/iv/PaulovichPBHM08 fatcat:2jz4x7mw7vaa5jdkwkaq6pzvvq

Supporting the education evidence portal via text mining

S. Ananiadou, P. Thompson, J. Thomas, T. Mu, S. Oliver, M. Rickinson, Y. Sasaki, D. Weissenbacher, J. McNaught
2010 Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences  
New features include automatic classification of documents according to a taxonomy, automatic clustering of search results according to similar document content, and automatic identification and highlighting  ...  As users often have limited time, they would benefit from enhanced methods of performing searches and viewing results, allowing them to drill down to information of interest more efficiently, without having  ...  We would like to thank Brian Rea and Bill Black (NaCTeM), Claire Stansfield and Ruth Stewart (EPPI-Centre), Julia Reed (Department for Children, Schools and Families) and members of the eep Development  ... 
doi:10.1098/rsta.2010.0152 pmid:20643679 pmcid:PMC2981997 fatcat:ep4fjbzjurew3glrhksf6fxymm

Automatic Subject Indexing of Text

Koraljka Golub
2019 Knowledge organization  
Document clustering automatically both creates groups of related documents and extracts names of subjects depicting the group at hand.  ...  The following major approaches are discussed, in terms of their similarities and differences, advantages and disadvantages for automatic assigned indexing from KOSs: "text categorization," "document clustering  ...  In hard clustering, one document may be a member of one cluster only, while in fuzzy clustering, any document may belong to any number of clusters.  ... 
doi:10.5771/0943-7444-2019-2-104 fatcat:bpauojjk7ndtngmi6nxjimce6e
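
The distinction between hard and fuzzy clustering drawn in this snippet can be illustrated with a short sketch. It assumes the distances from one document to each cluster centroid are already known, and it uses the standard fuzzy c-means membership formula with fuzzifier `m`; the snippet itself does not name a specific algorithm, so that choice and all function names are assumptions.

```python
def hard_assignment(distances):
    """Hard clustering: the document belongs to exactly one cluster,
    here the one with the smallest distance."""
    return min(range(len(distances)), key=lambda i: distances[i])


def fuzzy_memberships(distances, m=2.0):
    """Fuzzy clustering: the document gets a membership degree in every
    cluster, computed with the fuzzy c-means formula (assumed here).
    Degrees are non-negative and sum to 1."""
    zeros = [i for i, d in enumerate(distances) if d == 0]
    if zeros:
        # The document coincides with one or more centroids:
        # share full membership among them.
        return [
            1.0 / len(zeros) if i in zeros else 0.0
            for i in range(len(distances))
        ]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_j) ** exponent for d_j in distances)
        for d_i in distances
    ]


# Hypothetical distances from one document to three cluster centroids.
distances = [1.0, 2.0, 4.0]
cluster = hard_assignment(distances)
memberships = fuzzy_memberships(distances)
```

With these distances the hard assignment selects only the nearest cluster, while the fuzzy memberships give the document a graded share in all three.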