
A Scalable Approach for Efficiently Generating Structured Dataset Topic Profiles [chapter]

Besnik Fetahu, Stefan Dietze, Bernardo Pereira Nunes, Marco Antonio Casanova, Davide Taibi, Wolfgang Nejdl
2014 Lecture Notes in Computer Science  
To address this issue, we propose an approach for creating linked dataset profiles. A profile consists of structured dataset metadata describing topics and their relevance.  ...  To enable a good trade-off between scalability and accuracy of generated profiles, appropriate parameters are determined experimentally.  ...  Our main contributions consist of (i) a scalable method for efficiently generating structured dataset profiles, combining and configuring suitable methods for NER, topic extraction and ranking as part  ... 
doi:10.1007/978-3-319-07443-6_35 fatcat:j6jbceyqszgshan6fnd4kphs5i

Retrieval, Crawling and Fusion of Entity-centric Data on the Web [chapter]

Stefan Dietze
2017 Lecture Notes in Computer Science  
On the one hand, recommendation, linking, profiling and retrieval can provide efficient means to enable discovery and search of entity-centric data, specifically when dealing with traditional knowledge  ...  To this end, markup data lends itself as a data source for aiding tasks such as knowledge base augmentation, where data fusion techniques are required to address the inherent characteristics of markup  ...  While all discussed works are joint research with numerous colleagues, friends and collaborators from a number of research institutions, the author would like to thank all involved researchers for the  ... 
doi:10.1007/978-3-319-53640-8_1 fatcat:bilhfrwhgvgwnfm3wub4g55lr4

Real World Applications of Machine Learning Techniques over Large Mobile Subscriber Datasets [article]

Jobin Wilson, Chitharanj Kachappilly, Rakesh Mohan, Prateek Kapadia, Arun Soman, Santanu Chaudhury
2015 arXiv   pre-print
Communication Service Providers (CSPs) are in a unique position to utilize their vast transactional data assets generated from interactions of subscribers with network elements as well as with other subscribers  ...  CSPs could leverage its data assets for a gamut of applications such as service personalization, predictive offer management, loyalty management, revenue forecasting, network capacity planning, product  ...  Our approach involves transforming subscribers and contents into a single latent topic space to generate recommendations.  ... 
arXiv:1502.02215v1 fatcat:zgjpne4cu5hn7pecrgvmq437bi

Predicting Friendship Links in Social Networks Using a Topic Modeling Approach [chapter]

Rohit Parimi, Doina Caragea
2011 Lecture Notes in Computer Science  
As an alternative, we propose a topic modeling approach to the problem of predicting new friendships based on interests and existing friendships.  ...  We construct features for the link prediction problem based on the resulting topic distributions.  ...  A topic model, in general, is a generative model, i.e. it specifies a probabilistic way in which documents can be generated.  ... 
doi:10.1007/978-3-642-20847-8_7 fatcat:pr45l4gz5jgz7nduw6yveuumae

Personalized Concept-Based Search and Exploration on the Web of Data Using Results Categorization [chapter]

Melike Sah, Vincent Wade
2013 Lecture Notes in Computer Science  
When the user selects a concept lens for exploration, results are immediately personalized.  ...  Our personalization approach is non-intrusive, privacy preserving and scalable since it does not require login and implemented at the client-side.  ...  Generally user profiles are utilized for results re-ranking. In contrast to the general approach, our personalized search approach is driven by results categorization.  ... 
doi:10.1007/978-3-642-38288-8_36 fatcat:2kl6pruezrhmnplxx2qignmczy

2020 Index IEEE Transactions on Knowledge and Data Engineering Vol. 32

2021 IEEE Transactions on Knowledge and Data Engineering  
., +, TKDE April 2020 803-808 Complication Risk Profiling in Diabetes Care: A Bayesian Multi-Task and Feature Relationship Learning Approach.  ...  ., +, TKDE Nov. 2020 2060-2074 A Scalable Multi-Data Sources Based Recursive Approximation Approach for Fast Error Recovery in Big Sensing Data on Cloud.  ... 
doi:10.1109/tkde.2020.3038549 fatcat:75f5fmdrpjcwrasjylewyivtmu

Monitoring Network Evolution using MDL

Jure Ferlez, Christos Faloutsos, Jure Leskovec, Dunja Mladenic, Marko Grobelnik
2008 2008 IEEE 24th International Conference on Data Engineering  
We illustrate our algorithm on synthetic and large real datasets, and we show that the results of the TimeFall agree with human intuition.  ...  Given publication titles and authors, what can we say about the evolution of scientific topics and communities over time? Which communities shrunk, which emerged, and which split, over time?  ...  In Figure 1 each box represents a community of words - a user profile topic. Each topic is thus described by a set of keywords that are characteristic for the corresponding topic.  ... 
doi:10.1109/icde.2008.4497545 dblp:conf/icde/FerlezFLMG08 fatcat:rngivrbobzej5n3v7xmalvgqaa

Implementing a Volunteer Notification System into a Scalable, Analytical Realtime Data Processing Environment [chapter]

Jesko Elsner, Tomas Sivicki, Philipp Meisen, Tobias Meisen, Sabina Jeschke
2016 Automation, Communication and Cybernetics in Science and Engineering 2015/2016  
This paper will focus on a basic concept for implementing a VNS approach into a scalable, fault-tolerant environment that uses state-of-the-art analytical tools to process information streams in real-time  ...  With respect to volunteer notification systems (VNS), the resulting vast amounts of data can be utilized for profiling and predicting the whereabouts of people that, combined with machine learning algorithms  ...  As clients generally subscribe to specific topics in order to achieve push-like notifications, horizontal scaling will result in brokers having different information and topic structures.  ... 
doi:10.1007/978-3-319-42620-4_64 fatcat:pd2nu6qe5ffb3m76hoctaimfse


Ahmad Assaf, Aline Senart, Raphaël Troncy
2015 Proceedings of the 24th International Conference on World Wide Web - WWW '15 Companion  
In this paper, we propose Roomba, a scalable automatic approach for extracting, validating, correcting and generating descriptive linked dataset profiles.  ...  While Roomba is generic, we target CKAN-based data portals and we validate our approach against a set of open data portals including the Linked Open Data (LOD) cloud as viewed on the DataHub.  ... 
doi:10.1145/2740908.2742827 dblp:conf/www/AssafST15 fatcat:qbypp6pzxjhhtgu2adtc4ajt5e

Towards robust and scalable peer-to-peer social networks

Alexandra Olteanu, Guillaume Pierre
2012 Proceedings of the Fifth Workshop on Social Network Systems - SNS '12  
We evaluate our algorithms using real large-scale datasets, and show that they can disseminate information efficiently while controlling node degrees, even in the presence of high churn.  ...  Acknowledgments We would like to thank the anonymous reviewers for their feedback and suggestions. We thank Stefan Bucur and Cristian Zamfir for their help and advice.  ...  A simple approach would be to replicate a user's profile in nodes owned by its friends, or the friends of its friends.  ... 
doi:10.1145/2181176.2181186 dblp:conf/sns/OlteanuP12 fatcat:yasfkto34nhbrojdoezikhsofy

DeGPar: Large Scale Topic Detection Using Node-Cut Partitioning on Dense Weighted Graphs

Kambiz Ghoorchian, Sarunas Girdzijauskas, Fatemeh Rahimian
2017 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS)  
In this paper we address the scalability problem by introducing an efficient and scalable graph based algorithm for TD on short texts, leveraging dimensionality reduction and clustering techniques.  ...  The results on two widely used benchmark datasets show that our algorithm not only maintains a similar or better accuracy, but also performs by an order of magnitude faster than the state-of-the-art approaches  ...  For example, it does not allow for overlapping topic structures, since the model emphasizes on an orthogonal basis for topic representation.  ... 
doi:10.1109/icdcs.2017.19 dblp:conf/icdcs/GhoorchianGR17 fatcat:boncjoix5vgudmz3dm3c6p7siq

EZLDA: Efficient and Scalable LDA on GPUs [article]

Shilong Wang
2020 arXiv   pre-print
LDA is a statistical approach for topic modeling with a wide range of applications.  ...  To this end, we introduce EZLDA which achieves efficient and scalable LDA training on GPUs with the following three contributions: First, EZLDA introduces three-branch sampling method which takes advantage  ...  INTRODUCTION Topic modeling is a type of statistical approach that reveals the latent (i.e., unobserved) topics for a collection of documents (also referred to as corpus).  ... 
arXiv:2007.08725v1 fatcat:zb4zolzb7jfmrkpywshx55xwbu

A NoSQL Approach for Aspect Mining of Cultural Heritage Streaming Data

Gerasimos Vonitsanos, Andreas Kanavos, Alaa Mohasseb, Dimitrios Tsolis
2019 2019 10th International Conference on Information, Intelligence, Systems and Applications (IISA)  
In this paper, we present a NoSQL database approach for aspect mining of a cultural heritage scenario by taking advantage of Apache Spark streaming architecture.  ...  Naturally representing and efficiently processing a large number of opinions can be implemented with the use of streaming technologies.  ...  This framework will be oriented towards efficient content delivery, and it is shown how it can be applied to a dataset consisting of different topics in order for meaningful opinions about historical sites  ... 
doi:10.1109/iisa.2019.8900770 dblp:conf/iisa/VonitsanosKMT19 fatcat:vqrlkkrxtbdwjd7ebchbreypgq

Hierarchical Categorization of Open Source Software by Online Profiles

Tao WANG, Huaimin WANG, Gang YIN, Cheng YANG, Xiang LI, Peng ZOU
2014 IEICE transactions on information and systems  
In this paper, we propose a novel approach to hierarchically categorize software projects based on their online profiles.  ...  We design a SVM-based categorization framework and adopt a weighted combination strategy to aggregate different types of profile attributes from multiple repositories.  ...  We would like to thank Xiao Li, Yue Yu and Li Fang for their collaboration. We would also thank the anonymous reviewers for providing us constructive comments and suggestions.  ... 
doi:10.1587/transinf.2014edp7007 fatcat:boh4ea5gjbdbzpfmuf7fnf5ihq

Using DMoz for constructing ontology from data stream

M. Grobelnik, J. Brank, D. Mladenic, B. Novak, B. Fortuna
2006 28th International Conference on Information Technology Interfaces, 2006.  
In general, concepts and relations can be formed into an ontological structure either by clustering or by classification into an existing topic hierarchy.  ...  We propose the latter using DMoz as an existing topic hierarchy. The approach is efficient and can scale to large data sets.  ...  The approach proposed has a great emphasis on efficiency, scalability, and being able to process large quantities of data (hundreds of thousands of documents).  ... 
doi:10.1109/iti.2006.1708521 fatcat:nxsceopeibfdxahkpi3xdx3g5u