
Making Sense of Massive Amounts of Scientific Publications: the Scientific Knowledge Miner Project

Francesco Ronzano, Ana Freire, Diego Sáez-Trumper, Horacio Saggion
2016 ACM/IEEE Joint Conference on Digital Libraries  
The World Wide Web has become the hugest repository ever for scientific publications and it continues to increase at an unprecedented rate.  ...  We present the Scientific Knowledge Miner (SKM) Project, that aims to investigate new approaches and frameworks to facilitate the extraction of knowledge from scientific publications across different disciplines  ...  Inventor (FP7-ICT-2013.8.1 -Grant: 611383), the Catalonia Trade and Investment Agency (Agència per la competitivitat de l'empresa, ACCIÓ) and the TUNER project (TIN2015-65308-C5-5-R, MINECO/FEDER, UE).  ... 
dblp:conf/jcdl/RonzanoFSS16 fatcat:ykxart2w3nd45bwvexcz5c7z6u

Unsupervised Extraction of Popular Product Attributes from E-Commerce Web Sites by Considering Customer Reviews

Lidong Bing, Tak-Lam Wong, Wai Lam
2016 ACM Transactions on Internet Technology  
We develop an unsupervised learning framework for extracting popular product attributes from product description pages originated from different E-commerce Web sites.  ...  As an unsupervised model, our framework can be easily applied to a variety of new domains and Web sites without the need of labeling training samples.  ...  In this article, we develop an unsupervised learning framework for extracting popular product attributes from product description pages originated from different Web sites.  ... 
doi:10.1145/2857054 fatcat:ffy4i5fve5hrdcfvfre57x3whm

A COMPARATIVE ANALYSIS OF WEB INFORMATION EXTRACTION TECHNIQUES DEEP LEARNING vs. NAÏVE BAYES vs. BACK PROPAGATION NEURAL NETWORKS IN WEB DOCUMENT EXTRACTION

Sharmila J., Subramani A.
2016 ICTACT Journal on Soft Computing  
Web utilization is expanding in an uncontrolled way. A particular framework is required for controlling such extensive measure of information in the web space.  ...  The main objective of this investigation is web document extraction utilizing different grouping algorithm and investigation. This work extricates the data from the web URL.  ...  In future, we can incorporate different machine learning technique like hybrid algorithms for capable information extraction from the web sites.  ... 
doi:10.21917/ijsc.2016.0156 fatcat:4atmism5urdntnxghqafmq7o4a

Information mining in remote sensing image archives: system concepts

M. Datcu, H. Daschiel, A. Pelizzari, M. Quartulli, A. Galoppo, A. Colapicchioni, M. Pastori, K. Seidel, P.G. Marchetti, S. D'Elia
2003 IEEE Transactions on Geoscience and Remote Sensing  
The offline part aims at the extraction of primitive image features, their compression, and data reduction, the generation of a completely unsupervised image content-index, and the ingestion of the catalogue  ...  Index Terms: Content-based image retrieval (CBIR), image information mining, information extraction, statistical learning.  ...  , additional relevant information, and participating with the evaluation of the KIM system.  ... 
doi:10.1109/tgrs.2003.817197 fatcat:oi7telilqjb4vpxnf3wtpwagoi

Searchable web sites recommendation

Yang Song, Nam Nguyen, Li-wei He, Scott Imig, Robert Rounthwaite
2011 Proceedings of the fourth ACM international conference on Web search and data mining - WSDM '11  
The language models for queries and searchable sites are built using information mined from client-side browsing logs.  ...  The static rank for each searchable site leverages features extracted from these client-side logs such as number of queries that are submitted to this site, and features extracted from general search engines  ...  Secondly, we use a completely unsupervised approach for searchable site recommendation. Our method does not involve hand-labeling of any data or features.  ... 
doi:10.1145/1935826.1935890 dblp:conf/wsdm/SongNHIR11 fatcat:247yxjfhenhr5lxmp7mcfzjyea

Web mining: Machine learning for web applications

Hsinchun Chen, Michael Chau
2005 Annual Review of Information Science and Technology  
day, collects gigabytes of clickstream data across different Web sites.  ...  It is also difficult to collect Web usage data across different sites because most server log data and the data collected by companies such as Doubleclick are proprietary.  ... 
doi:10.1002/aris.1440380107 fatcat:wdqwbszj7valbnyjfysbb4ap4y

Clinical Decision Support Systems in Orthodontics: A Narrative Review of Data Science Approaches

Najla N. Al Turkestani, Jonas Bianchi, Romain Deleat‐Besson, Celia Le, Li Tengfei, Juan Carlos Prieto, Marcela Gurgel, Antonio C.O. Ruellas, Camila Massaro, Aron Aliaga Del Castillo, Karine Evangelista, Marilia Yatabe (+10 others)
2021 Orthodontics & craniofacial research  
Then, we introduce a web-based data management platform, the Data Storage for Computation and Integration, for temporomandibular joint and dental clinical decision support systems.  ...  Proper management and analysis of these data via high-end computing solutions, artificial intelligence and machine learning approaches can assist in extracting meaningful information that enhances population  ...  The DSCI uses Amazon Web Services which enable distributed computing across multi-site clinical centres.  ... 
doi:10.1111/ocr.12492 pmid:33973362 pmcid:PMC8988880 fatcat:shlbd4su2nc6xcjfsojeiio7dy

IRIS: our prototype rule generation system

Lisa Singh, Peter Scheuermann, Bin Chen, Belur V. Dasarathy
1999 Data Mining and Knowledge Discovery: Theory, Tools, and Technology  
One of the unique features of IRIS is that it generates rules using the more structured component of the HTML documents, as well as the conceptual knowledge extracted from the unstructured blocks of text  ...  To date, two of our major contributions have been the design of a system architecture that facilitates the discovery of rules from HTML documents and the development of an efficient association rule algorithm  ...  In contrast, because different sites maintain different HTML documents, complete uniformity does not exist across sites.  ... 
doi:10.1117/12.339992 dblp:conf/dmkdttt/SinghSC99 fatcat:xofzbvvvhrdpbe2xlkzpmfx6ee

Cross Domain Mean Approximation for Unsupervised Domain Adaptation

Shao-Fei Zang, Yu-Hu Cheng, Xue-Song Wang, Qiang Yu, Guo-Sen Xie
2020 IEEE Access  
Thirdly, we construct a classifier utilizing CDMA metric and neighbor information. Finally, the proposed feature extraction approach and classifier are combined to realize transfer learning.  ...  Secondly, Joint Distribution Adaptation based on Cross Domain Mean Approximation (JDA-CDMA) is developed on the basis of CDMA to extract shared feature and simultaneously reduce the marginal and conditional  ...  Applying CDMA to deep learning for feature extraction is an interesting topic to explore in future.  ... 
doi:10.1109/access.2020.3012152 fatcat:kzknkk26jndbpdxdbko4k4xbfa

Sentiment Analysis Using Text Mining: A Review

Swati Redhu
2018 International Journal on Data Science and Technology  
This paper provides an overview of different methods used in text mining and sentiment analysis elaborating on all subtasks.  ...  There is a growing need for developing different methodologies and models for efficiently processing the texts and extracting apt information.  ...  For text mining and sentiment analysis, the major steps required are data acquirement, data conversion, feature representation, feature extraction and different machine learning algorithms.  ... 
doi:10.11648/j.ijdst.20180402.12 fatcat:eeweilxmenev7oltuzqf3dxwoi

Knock it off

Matthew F. Der, Lawrence K. Saul, Stefan Savage, Geoffrey M. Voelker
2014 Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '14  
We describe an automated system for the large-scale monitoring of Web sites that serve as online storefronts for spam-advertised goods.  ...  Using these features, which are mined from a small initial seed of labeled data, we are able to profile the Web sites of forty-four distinct affiliate programs that account, collectively, for hundreds  ...  from Google, Microsoft, Yahoo, and the UCSD Center for Networked Systems (CNS).  ... 
doi:10.1145/2623330.2623354 dblp:conf/kdd/DerSSV14 fatcat:e2lztj4c4bghnamq44xgc4xkui

Heterogeneous Network Embedding via Deep Architectures

Shiyu Chang, Wei Han, Jiliang Tang, Guo-Jun Qi, Charu C. Aggarwal, Thomas S. Huang
2015 Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD '15  
In particular, we demonstrate that the rich content and linkage information in a heterogeneous network can be captured by such an approach, so that similarities among cross-modal data can be measured directly  ...  In such cases, both the content and linkage structure provide important cues for creating a unified feature representation of the underlying network.  ...  Since our proposed method is an end-to-end learning framework, it does not require feature extraction for image inputs.  ... 
doi:10.1145/2783258.2783296 dblp:conf/kdd/ChangHTQAH15 fatcat:2n2g4tirsfbbrebkqyzl4gonaa

Machine transliteration and transliterated text retrieval: a survey

Dinesh Kumar Prabhakar, Sukomal Pal
2018 Sadhana (Bangalore)  
translation and information retrieval.  ...  Users of the WWW across the globe are increasing rapidly.  ...  On the other hand, transliteration mining is the process of extracting (mining) transliteration pairs from different resources, either parallel or comparable corpora, or the Web between A and B.  ... 
doi:10.1007/s12046-018-0828-8 fatcat:dg3gwugmqrfevnzu3deuk5w67i

Unsupervised Approaches for Textual Semantic Annotation, A Survey

Xiaofeng Liao, Zhiming Zhao
2019 ACM Computing Surveys  
 ...  They also propose an unsupervised generative probabilistic method and utilize text and knowledge joint representations to perform entity disambiguation.  ... 
doi:10.1145/3324473 fatcat:fg5ucwtloze6ljdlh4hqjkqxfe

Modeling Social Networks using Data Mining Approaches-Review

Fatima Hassan, Suhad Faisal Behadili
2022 Iraqi Journal of Science  
Simply, social media delivers an available podium for employers for sharing information.  ...  Data Mining has ability to present applicable designs that can be useful for employers, commercial, and customers.  ...  This information underlines the significance methods of mining of data in mining opinion stated on site of social network.  ... 
doi:10.24996/ijs.2022.63.3.35 fatcat:dnpkdkzssjhvbarxa22hc4cu2i