359,798 Hits in 2.6 sec

Stream Similarity Mining [chapter]

Erik Vee
2016 Encyclopedia of Database Systems  
Cross-References: Approximation and Data Reduction Techniques; Stream Data Management; Stream Mining  ...  median{ |Z^(1)|, ..., |Z^(k)| }; For every subset A of [n], the min-hash for A (with respect  ...  Data Mining Often individual entities are represented by massive streams of data (e.g., phone calls from a large company, or IP addresses of users visiting a given web site, or items bought at a grocery  ... 
doi:10.1007/978-1-4899-7993-3_373-2 fatcat:ck43a7vqsfanhlwdx2xd37hf3e

Opinion Stream Mining [chapter]

Myra Spiliopoulou, Eirini Ntoutsi, Max Zimmermann
2016 Encyclopedia of Machine Learning and Data Mining  
Opinion stream mining aims at learning and adaptation of a polarity model over a stream of opinionated documents, i.e., documents associated with a polarity.  ...  In this chapter, we overview methods for polarity learning in a stream environment focusing especially on how these methods deal with the challenges imposed by the stream nature of the data, namely the  ...  Synonyms Mining a Stream of Opinionated Documents; Polarity Learning on a Stream Definition Opinion stream mining is a variant of stream mining, of text mining and of opinion mining.  ... 
doi:10.1007/978-1-4899-7502-7_905-1 fatcat:d2fgtvgjzbhz7pkmt44amrfpmy

Mining Text Streams [chapter]

Charu C. Aggarwal
2012 Mining Text Data  
In this chapter, we review text stream mining algorithms for a wide variety of problems in data mining such as clustering, classification and topic modeling.  ...  Such text streams provide unprecedented challenges to data mining algorithms from an efficiency perspective.  ...  Conclusions This chapter studies the problem of mining text streams.  ... 
doi:10.1007/978-1-4614-3223-4_9 fatcat:x3q4nx36zzea7pn2y5w7xmli34

Mining data streams

Mohamed Medhat Gaber, Arkady Zaslavsky, Shonali Krishnaswamy
2005 SIGMOD record  
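Histograms, quantiles, frequency moments - do not capture every feature of the data set (approximate solutions) • Aggregation - computation of statistical measures • Means and variety - stream summarization, use in mining aggregated data • Drawback - performance degrades when the data distribution fluctuates sharply 2.2 Task-based solutions • Mainly three - approximation algorithms - sliding windows - output granularity 2.2.1 Approximation algorithms • Origin of approximation algorithms - Data Streams  ... 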
doi:10.1145/1083784.1083789 fatcat:ohpvea5v4ranboovqfwyyjbi7m

Mining Data Streams [chapter]

Charu C. Aggarwal
2015 Data Mining  
conventional decision trees • Assumption: third best split attribute significantly worse than the best two ones (may not be realistic) • Can the same approach be applied to other hierarchical data mining  ...  Records arrive at a rapid rate Data stream is sequence of records r_1, ..., r_n Introduction Data Streams • Computation Model • Requirements - Data Stream Stream Processing Engine  ... 
doi:10.1007/978-3-319-14142-8_12 fatcat:ndpytwhxnfdjzkipzj43tdg3u4

Data Stream Mining [chapter]

Mohamed Medhat Gaber, Arkady Zaslavsky, Shonali Krishnaswamy
2009 Data Mining and Knowledge Discovery Handbook  
Massive Online Analysis (MOA) is a software environment for implementing algorithms and running experiments for online learning from evolving data streams.  ...  This text explains the theoretical and practical foundations of the methods and streams available in MOA. The moa and the weka are both birds native to New Zealand.  ...  A majority of concept drift research in data streams mining is done using traditional data mining frameworks such as WEKA [158] .  ... 
doi:10.1007/978-0-387-09823-4_39 fatcat:3n27k753zzhidbkjynx7iwmt7e

Mining Positional Data Streams [chapter]

Jens Haase, Ulf Brefeld
2015 Lecture Notes in Computer Science  
We study frequent pattern mining from positional data streams. Existing approaches require discretised data to identify atomic events and are not applicable in our continuous setting.  ...  We propose an efficient trajectory-based preprocessing to identify similar movements and a distributed pattern mining algorithm to identify frequent trajectories.  ...  [1] propose the first approach to mine unrestricted episodes. Our approach generalises [1] to mining positional data streams.  ... 
doi:10.1007/978-3-319-17876-9_7 fatcat:ewwm6iaxrffvrhmhvniq55c6ka

Mountain mining damages streams

Natasha Gilbert
2010 Nature  
West Virginia's mountains contain valuable low-sulphur coal. "Even at very low levels of mining we found a dramatic impact on water quality and stream composition.  ...  The team also noted "sharp declines" in some stream invertebrates in areas where as little as 1% of the watershed had been mined.  ... 
doi:10.1038/466806a pmid:20703278 fatcat:24qgvwirjjb7dpkyshry5quqk4

Graph Mining on Streams [chapter]

Linda L. Hill, Mehmet M. Dalkiliç, Brahim Medjahed, Mourad Ouzzani, Ahmed K. Elmagarmid, Joseph M. Hellerstein, Colin R. Reeves, Christopher B. Jones, Ross S. Purves, Michael F. Goodchild, Jayant Sharma, John Herring (+24 others)
2009 Encyclopedia of Database Systems  
Such a stream naturally defines an undirected, unweighted graph  ...  Graph mining on streams is concerned with estimating properties of G, or finding patterns within G, given the usual constraints of the data-stream  ...  SYNONYMS Graph Streams; Semi-Streaming Model DEFINITION Consider a data stream A = a_1, a_2, ..., a_m where each data item a_k ∈ [n] × [n].  ...  Multi-Pass Models: It is common in graph mining to consider algorithms that may take more than one pass over the stream.  ... 
doi:10.1007/978-0-387-39940-9_184 fatcat:7lcaeiin7zehvef7ltistncg4m

Indexing and mining streams

Christos Faloutsos
2004 Proceedings of the 2004 ACM SIGMOD international conference on Management of data - SIGMOD '04  
DESCRIPTION - OBJECTIVES How can we find patterns in a sequence of sensor measurements (e.g., a sequence of temperatures, or water-pollutant measurements)? How can we compress it? What are the major tools for forecasting and outlier detection? The objective of this tutorial is to provide a concise and intuitive overview of the most important tools that can help us find patterns in sensor sequences. Sensor data analysis is becoming increasingly important, thanks to the decreasing cost of hardware and the increasing on-sensor processing abilities. We review the state of the art in three related fields: (a) fast similarity search for time sequences, (b) linear forecasting with the traditional AR (autoregressive) and ARIMA methodologies and (c) non-linear forecasting, for chaotic/self-similar time sequences, using lag-plots and fractals. The emphasis of the tutorial is to give the intuition behind these powerful tools, which is usually lost in the technical literature, as well as to give case studies that illustrate their practical use. NOTICE: At SIGMOD, Prof. Dennis Shasha will be delivering a related but complementary tutorial, which focuses on anomaly and burst detection in financial and scientific time series. OUTLINE Similarity Search. We shall cover the need for similarity search; the most popular distance functions (Euclidean, LP norms, time-warping); the most successful indexing methods (R-trees [4], M-trees [2]); and the most popular feature extraction methods from signal processing (DFT, Wavelets, SVD), as well as Multidimensional Scaling and FastMap [3]. Linear Forecasting. We will cover the main idea behind linear forecasting, the popular AR methodology [1] and the  ... 
doi:10.1145/1007568.1007728 dblp:conf/sigmod/Faloutsos04 fatcat:p36y6szokne2fhnrrgwduqejtq

Graph Mining on Streams [chapter]

Andrew McGregor
2016 Encyclopedia of Database Systems  
Multi-Pass Models: It is common in graph mining to consider algorithms that may take more than one pass over the stream.  ...  Such a stream naturally defines an undirected, unweighted graph G = (V, E) where V = {v_1, ..., v_n} and  ...  Graph mining on streams is concerned with estimating properties of G, or finding patterns  ...  Massive graphs also arise in structured data mining, where the relationships among the data items in the data set are represented as graphs, and social networks.  ... 
doi:10.1007/978-1-4899-7993-3_184-2 fatcat:mcru25yfabh2lj77xhzkw5aoyi

A Data Stream Mining System

Hetal Thakkar, Barzan Mozafari, Carlo Zaniolo
2008 2008 IEEE International Conference on Data Mining Workshops  
On-line data stream mining has attracted much research interest, but systems that can be used as a workbench for online mining have not been researched, since they pose many difficult research challenges  ...  algorithms that are fast & light enough to be effective on data streams, and (iii) support for Mining Model Definition Language (MMDL) that allows users to define new mining algorithms as a set of tasks  ...  Each mining flow has an input stream and an output stream, INSTREAM and OUTSTREAM, respectively. The analyst specifies how the tasks of a mining model interconnect via intermediate streams, e.g.  ... 
doi:10.1109/icdmw.2008.133 dblp:conf/icdm/ThakkarMZ08 fatcat:ehspz3x7k5h4hd6sbja4z54ine

Querying and mining data streams

Minos Garofalakis, Johannes Gehrke, Rajeev Rastogi
2002 Proceedings of the 2002 ACM SIGMOD international conference on Management of data - SIGMOD '02  
for approximate-query processing. § Mining High-speed Data Streams: Single-pass algorithms for association-rule discovery, clustering, and decision-tree construction over data streams. § Advanced Topics  ...  of streaming XML documents.  ... 
doi:10.1145/564793.564794 fatcat:ps7a7vvegndj7bmrhclmzl42ry

Mining developer communication data streams [article]

Andy M. Connor, Jacqui Finlay, Russel Pears
2014 arXiv   pre-print
This paper presents the application of data stream mining techniques to identify the most useful metrics for predicting build outcomes.  ...  In terms of the Jazz repository used in this research, one aspect of that stream of data would be developer communication.  ...  in order to apply data stream mining techniques [17] to facilitate the mining of project data on the fly to provide rapid just-in-time information to guide software decisions.  ... 
arXiv:1407.6104v1 fatcat:utxl3cltsnb2niije2bvospuqi

Active Mining of Data Streams [chapter]

Wei Fan, Yi-an Huang, Haixun Wang, Philip S. Yu
2004 Proceedings of the 2004 SIAM International Conference on Data Mining  
Most previously proposed mining methods on data streams make an unrealistic assumption that "labelled" data stream is readily available and can be mined at anytime.  ...  In this paper, we propose a new concept of demand-driven active data mining. It estimates the error of the model on the new data stream without knowing the true class labels.  ...  Demand-driven Active Mining of Data Streams We are proposing a demand-driven active stream data mining process that solves the problems of passive stream data mining.  ... 
doi:10.1137/1.9781611972740.46 dblp:conf/sdm/FanHWY04 fatcat:rfe3kgzdubgtlf6xeabi6h3u2u
Showing results 1-15 out of 359,798 results