A Supervised Requirement-oriented Patent Classification Scheme Based on the Combination of Metadata and Citation Information

Fujin Zhu, Xuefeng Wang, Donghua Zhu, Yuqin Liu
2015 International Journal of Computational Intelligence Systems  
as the document representation for the new method since it can obtain relatively high classification accuracy with a dramatically simplified document preprocessing process.  ...  The former ones, such as the IPC and UPC, are normally developed by different patent offices in the world mainly for the purpose of patentability examination and patent retrieval, while the latter is for  ...  In our new patent classification scheme, instead of using only the narrative text, such as the title and abstract, as the document representation, the combination of metadata and citation information of  ... 
doi:10.1080/18756891.2015.1023588 fatcat:gj7ofy6yxvadnejhqlpgnpqfwi

A conception-based approach to automatic subject term assignment for scientific journal articles

EunKyung Chung, Samantha K. Hastings
2007 Proceedings of the American Society for Information Science and Technology  
This study proposes a conception-based approach to automatic subject term assignment when using Text Classification (TC) techniques.  ...  Based on the identification of semantic sources and conception-based approaches, the experiment explores the significance of individual semantic sources and conception-based approaches for the effectiveness  ...  Shawne Miksa for her insightful comments for the earlier versions of this study. We also thank Amy Eklund for editing this paper and the anonymous reviewers for their comments.  ... 
doi:10.1002/meet.1450430149 fatcat:nupc3mc3sna73bvtgnlgedsxte

Tracking the Evolution of Clustering, Machine Learning, Automatic Indexing and Automatic Classification in Knowledge Organization

Richard P. Smiraglia, Xin Cai
2017 Knowledge organization  
He has explored domain analysis for evolution of knowledge organization, epistemological analysis of the role of authorship in bibliographic tradition, the evolution of knowledge and its representation  ...  is a professor and member of the Knowledge Organization Research Group in the iSchool at the University of Wisconsin-Milwaukee.  ...  By that we mean both the problems of defining and structuring appropriate KOSs for specific domains and the problems of subsequently using those KOSs to index documents.  ... 
doi:10.5771/0943-7444-2017-3-215 fatcat:xqte4jwljzcsbei7poxceamkya

Venue Classification of Research Papers in Scholarly Digital Libraries [chapter]

Cornelia Caragea, Corina Florescu
2018 Lecture Notes in Computer Science  
The metadata of the crawled papers, e.g., title, authors, and references, are automatically extracted before the papers are indexed in a digital library.  ...  We explore a supervised learning approach to automatically classifying the venue of a research paper using information solely available from the content of the paper and show experimentally on a dataset  ...  Any opinions, findings, and conclusions expressed here are those of the author and do not necessarily reflect the views of NSF.  ... 
doi:10.1007/978-3-030-00066-0_11 fatcat:kigj6xmdjfcfthupc6msrt7poa

A framework of automatic subject term assignment for text categorization: An indexing conception-based approach

EunKyung Chung, Shawne Miksa, Samantha K. Hastings
2010 Journal of the American Society for Information Science and Technology  
Using F-measure, the experiment results showed that cited works, source title, and title were as effective as the full text while a keyword was found more effective than the full text.  ...  In other words, in the context of a typical scientific journal article dataset, the objective contents and authors' intentions were more desirable for automatic subject term assignment via TC than the  ...  Third, for the domain-oriented indexing conception, source title and title of cited works are considered.  ... 
doi:10.1002/asi.21272 fatcat:huhvzrantnfpxml3uqznutscla

Experiments with cited titles for automatic document indexing and similarity measure in a probabilistic context

K. L. Kwok
1985 Proceedings of the 8th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '85  
EXPERIMENTAL METHODS AND RESULTS. Unlike titles and abstracts, use of cited titles for indexing is uncommon, and databases of documents containing these items are not readily available.  ...  Journal of the ASIS. 27:129-146; 1976. SCHIMINOVICH, S. "Automatic classification and retrieval of documents by means of a bibliographic pattern discovery algorithm."  ... 
doi:10.1145/253495.253522 dblp:conf/sigir/Kwok85 fatcat:aqyjkki5l5fstjfhdkoj5dge6u

Representing and extracting lung cancer study metadata: Study objective and study design

Jean I. Garcia-Gathright, Andrea Oh, Phillip A. Abarca, Mary Han, William Sago, Marshall L. Spiegel, Brian Wolf, Edward B. Garon, Alex A.T. Bui, Denise R. Aberle
2015 Computers in Biology and Medicine  
The classification results and the top features determined by the classifiers suggest that this scheme would be generalizable to other mutations in lung cancer, as well as studies on driver mutations in  ...  Casama's support vector machine (SVM) automatically classified the abstracts by study objective with as much as 129% higher F-scores compared to PubMed's built-in filters.  ...  Acknowledgments This work was supported by NLM T15-LM007356, NIH/NLM R01-LM009961, NIH K23CA149079, and the UCLA Department of Radiological Sciences.  ... 
doi:10.1016/j.compbiomed.2015.01.004 pmid:25618216 pmcid:PMC4331232 fatcat:bp2cxgduqrh2zhesnc42y5xdsy

Trainable Citation-enhanced Summarization of Scientific Articles

Horacio Saggion, Ahmed AbuRa'ed, Francesco Ronzano
2016 ACM/IEEE Joint Conference on Digital Libraries  
In order to cope with the growing number of relevant scientific publications to consider at a given time, automatic text summarization is a useful technique.  ...  In recent years a number of evaluation challenges have been proposed to address the problem of summarizing a scientific paper taking advantage of its citation network (i.e., the papers that cite the given  ...  Acknowledgements This work is (partly) supported by the Spanish Ministry of Economy and Competitiveness under the Maria de Maeztu Units of Excellence Programme (MDM-2015-0502), the TUNER project (TIN2015  ... 
dblp:conf/jcdl/SaggionAR16 fatcat:5dfo4n6ezvdktdh3yqicya4jni

Semi-automatic System for Title Construction [article]

Swagata Duari, Vasudha Bhatnagar
2019 arXiv pre-print
We evaluate the proposed system by computing the overlap between extracted keywords and the list of title-words for documents, and we observe a macro-averaged precision of 82%.  ...  The system extracts and recommends impactful words from the text, which the author can creatively use to construct an appropriate title for the manuscript.  ...  We constructed the title for the manuscript using these suggested keywords. Fig. 1. Diagrammatic representation of the semi-automatic system for title construction.  ... 
arXiv:1905.00470v1 fatcat:n4k3vkbspnf3bevarjqavfdm44

Automatic Subject Indexing of Text

Koraljka Golub
2019 Knowledge organization  
Document clustering automatically both creates groups of related documents and extracts names of subjects depicting the group at hand.  ...  Document classification re-uses the intellectual effort invested into creating a KOS for subject indexing and even simple string-matching algorithms have been reported to achieve good results, because  ...  or keywords as the content representation for each document.  ... 
doi:10.5771/0943-7444-2019-2-104 fatcat:bpauojjk7ndtngmi6nxjimce6e

Importance of HTML Structural Elements and Metadata in Automated Subject Classification [chapter]

Koraljka Golub, Anders Ardö
2005 Lecture Notes in Computer Science  
The aim of the study was to determine how significance indicators assigned to different Web page elements (internal metadata, title, headings, and main text) influence automated classification.  ...  It was shown that for best results all the elements have to be included in the classification process.  ...  Acknowledgements The research was funded by ALVIS, an EU Sixth Framework Programme, Information Society Technologies (IST-1-002068-STP), and The Swedish Agency for Innovation Systems (P22504-1 A).  ... 
doi:10.1007/11551362_33 fatcat:5qsrxly4jbc4borbvyib22rf3u

Evaluating semantometrics from computer science publications

Christin Katharina Kreutz, Premtim Sahitaj, Ralf Schenkel
2020 Scientometrics  
Afterwards we broach the issues of numbers of references and citations as well as years of publication for the three classes.  ...  The utilisation of one-vector representations for the ternary classification task resulted in an accuracy of .949 which is +.1475 compared to the binary SOTA.  ...  Doc2Vec, BERT and LDA are going to be used as document vector representations for semantometrics as described in "Methodology" section.  ... 
doi:10.1007/s11192-020-03409-5 fatcat:472gwkgosbbdpfk6cfwbf3bxma

Exploring Significant Characteristics and Models for Classification of Structure Function of Academic Documents

Bowen Ma, Chengzhi Zhang, Yuzhuo Wang
2020 Data and Information Management  
Based on the chapter titles and the in-chapter texts, traditional machine learning and deep learning models are both used for classifier training.  ...  In this study, the proceedings of the Association for Computational Linguistics (ACL) conferences are used as the primitive corpus, and the training corpus of chapter category is obtained by manual annotation  ...  We also thank Zhao Y for checking the corpus, whose earnest and diligent efforts are the foundation for the research.  ... 
doi:10.2478/dim-2020-0031 fatcat:qaqq26ezvjalte2crqyvef6hk4

Deep Context: A Neural Language Model for Large-scale Networked Documents

Hao Wu, Kristina Lerman
2017 Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence  
Our model, Deep Context Vector, takes advantage of distributed representations to exploit the word order in document sentences, as well as the semantic connections among linked documents in a document  ...  We demonstrate its effectiveness and efficiency on document classification and link prediction tasks.  ...  Acknowledgements This work was supported in part by the National Science Foundation under grant SMA-1360058 and in part by Leidos. We thank anonymous reviewers for their helpful comments.  ... 
doi:10.24963/ijcai.2017/431 dblp:conf/ijcai/WuL17 fatcat:23vc5g7jgvgkvheeq2buyeevma

An unsupervised approach to automatic classification of scientific literature utilizing bibliographic metadata

Arash Joorabchi, Abdulhussain E. Mahdi
2011 Journal of information science  
The method is based on identifying all the references cited in the document to be classified and, using the subject classification metadata of extracted references as catalogued in existing conventional  ...  Motivated by the ever-increasing number of e-documents and the high cost of manual classification, Automatic Text Classification/Categorisation (ATC), the automatic assignment of natural language text  ...  With the assumption that the majority of materials, such as books and journals, cited in a scientific document belong to the same or closely relevant classification category(ies) as that of the citing  ... 
doi:10.1177/0165551511417785 fatcat:63fpqqdzije4tbfgwaevwdhtze