Outlier Detection for Text Data [chapter]

Ramakrishnan Kannan, Hyenkyun Woo, Charu C. Aggarwal, Haesun Park
2017 Proceedings of the 2017 SIAM International Conference on Data Mining  
Our approach has significant advantages over traditional methods for text outlier detection.  ...  The problem of outlier detection is extremely challenging in many domains such as text, in which the attribute values are typically non-negative, and most values are zero.  ...  Given the relevant literature for NMF and text outliers, we propose the first approach to detect outliers in text data using non-negative matrix factorization.  ... 
doi:10.1137/1.9781611974973.55 dblp:conf/sdm/KannanWAP17 fatcat:khzjklon2jhinjkqwatnnzziye

Outlier Detection for Text Data : An Extended Version [article]

Ramakrishnan Kannan, Hyenkyun Woo, Charu C. Aggarwal, Haesun Park
2017 arXiv   pre-print
Our approach has significant advantages over traditional methods for text outlier detection.  ...  The problem of outlier detection is extremely challenging in many domains such as text, in which the attribute values are typically non-negative, and most values are zero.  ... 
arXiv:1701.01325v1 fatcat:ujl4qkjqdjasdosoprxbuhztv4

Anomaly-Injected Deep Support Vector Data Description for Text Outlier Detection [article]

Zeyu You, Yichu Zhou, Tao Yang, Wei Fan
2021 arXiv   pre-print
Anomaly detection or outlier detection is a common task in various domains, which has attracted significant research efforts in recent years.  ...  Existing works mainly focus on structured data such as numerical or categorical data; however, anomaly detection on unstructured textual data is less attended.  ... 
arXiv:2110.14729v1 fatcat:iw42w6wr4vhark663chove646m

Outlier Reduction using Hybrid Approach in Data Mining

Nancy Lekhi, Manish Mahajan
2015 International Journal of Modern Education and Computer Science  
There are many outlier detection methods in data mining. The proposed method uses Integrating Semantic Knowledge (SOF) for outlier detection.  ...  The proposed method uses two data mining techniques for cluster formulation, i.e. weighted k-means and a neural network, where weighted k-means is the clustering technique that can be applied to text and date data  ...  with distance based method for outlier detection.  ... 
doi:10.5815/ijmecs.2015.05.06 fatcat:s6gz25wovfh4nb323mz5kdclyy

Efficient Outlier Detection in Text Corpus Using Rare Frequency and Ranking

Wathsala Anupama Mohotti, Richi Nayak
2020 ACM Transactions on Knowledge Discovery from Data  
in [2] (numerical data); subspace outlier detection in [36] (numerical data); projection-based: random projection based outlier detection [6] (text data), NMF based outlier detection [31] (text data); angle-based  ...  Poor scalability of a proximity approach for the high dimensional data further makes it less effective for text outlier detection.  ... 
doi:10.1145/3399712 fatcat:r3i4rg2tlzgfhmd3utp4ydy7vi

Supporting the search for cross-context links by outlier detection methods

Borut Sluban, Nada Lavrač
2010 BMC Bioinformatics  
Several outlier detection methods have been developed in data and text mining, mainly used for noise filtering and error detection purposes.  ...  Except for [1], outlier detection in text mining has not yet been used for exploratory purposes.  ... 
doi:10.1186/1471-2105-11-s5-p2 pmcid:PMC2956395 fatcat:svqswtm2gfc6lczutsll3f2dy4

Use of Stylometry and Outlier Detection Algorithm in Online Writing Sample to Detect Outliers

Sonia Sharma
2014 IOSR Journal of Engineering  
Outlier detection is nowadays one of the emerging technologies used in data mining. The data objects which deviate from the other data objects in the data set are considered to be outliers.  ...  Outlier detection comprises a broad spectrum of techniques. [1] Here, we propose an algorithm which detects outliers (unmatched samples) in online writing samples.  ...  Outlier detection is one of the important techniques used in data mining.  ... 
doi:10.9790/3021-04422730 fatcat:fyju6xlbc5as7gof7yd6tbthgq

An Efficient Model for Mining Outlier Opinions

Neama Hassan, Laila A., Yehia K.
2020 International Journal of Advanced Computer Science and Applications  
The proposed MOoM model, which stands for Mining Outlier Opinion Model, offers for the first time the ability to mine outlier opinions from a product's free-text reviews.  ...  The proposed model consists of three modules: a data preprocessing module, an opinion mining module, and an outlier opinions detection module.  ... 
doi:10.14569/ijacsa.2020.0110522 fatcat:vaswykcav5conf6c5via4dqzbi

Approximate Document Outlier Detection Using Random Spectral Projection [chapter]

Mazin Aouf, Laurence A. F. Park
2012 Lecture Notes in Computer Science  
Outlier detection is an important process for text document collections, but as the collection grows, the detection process becomes a computationally expensive task.  ...  Random projection has been shown to provide a good, fast approximation of sparse data, such as document vectors, for outlier detection.  ...  We also showed that the DCT projection provides similar accuracy and computation time to random projection, but it requires much less storage (60% less for the large document set).  ... 
doi:10.1007/978-3-642-35101-3_49 fatcat:zwfcviexyjczljdmm3xlc7ua3e

Noise Reduction and Content Retrieval from Web Pages

Surabhi Lingwal
2013 International Journal of Computer Applications  
Web contents of different fields which can offer important information to users are available on the Web, such as multimedia data and structured, semi-structured and unstructured data.  ...  Data on web pages contain formatting code, advertisements, navigation links, etc.  ...  Algorithm for Outlier Detection. Filter Outliers: This operation takes a DataSet as input and returns a new DataSet including only the data that fulfill a condition.  ... 
doi:10.5120/12729-9573 fatcat:ozlsoy2uhjek3jv4tivbipmabm

Outlier Detection and Trend Detection: Two Sides of the Same Coin

Erich Schubert, Michael Weiler, Arthur Zimek
2015 2015 IEEE International Conference on Data Mining Workshop (ICDMW)  
Many established outlier detection methods are designed to search for low-density objects in a static data set of vectors in Euclidean space.  ...  Outlier detection is commonly defined as the process of finding unusual, rare observations in a large data set, without prior knowledge of which objects to look for.  ...  Outlier Detection in Data Streams Recently, Sadik and Gruenwald [62] gave an overview on research issues for outlier detection in data streams.  ... 
doi:10.1109/icdmw.2015.79 dblp:conf/icdm/SchubertWZ15 fatcat:xntevut4pbhw5e4c3pcpenhcdy

Classification of Concept Drifting Data Streams Using Adaptive Novel-Class Detection

Aparna Yeshwantrao Ladekar, M. Y. Joshi
2016 International Journal of Computer Engineering in Research Trends  
In practice, it is not possible to store and use all data for training whenever required, due to the infinite length of data streams. Feature evolution frequently occurs in many text streams.  ...  In text streams, new features such as words or phrases may occur as the stream progresses. New classes evolving in the data stream give rise to concept-evolution.  ...  Parsing and pre-processing are done on text documents, i.e. Twitter messages. The text is divided into words, i.e. tokens. The extracted features form a feature space for a particular text.  ... 
doi:10.22362/ijcert/2016/v3/i9/48901 fatcat:2kvggmvrbbhcleur5lrvmqqtge


Pavel Stefanovič, Olga Kurasova
2018 Neural Network World  
To detect outliers in SOM, a genetic algorithm-based travelling salesman approach has been applied. After outliers are detected and removed, the SOM quality has to be estimated.  ...  SOM is used for data clustering as well as dimensionality reduction and the results obtained are presented in a special graphical form.  ...  Conclusions The research deals with a capability of self-organizing maps to detect data outliers. A new algorithm for detection of outliers in SOM has been proposed.  ... 
doi:10.14311/nnw.2018.28.006 fatcat:wq7wz7u5yngjrnueuwpovgeljq

Detecting Anomalies in Sequences of Short Text Using Iterative Language Models

Cynthia Freeman, Ian Beaver, Abdullah Mueen
2021 Proceedings of the ... International Florida Artificial Intelligence Research Society Conference  
In addition, literature for detecting anomalies in time ordered sequences of short text is shallow considering the abundance of such data sets in online settings.  ...  We introduce a technique for detecting anomalies in sequences of short textual data by adaptively and iteratively learning low perplexity language models.  ...  We use as a baseline a non-negative matrix factorization method adjusted for detecting outliers in text called Text Outliers using Non-Negative Matrix Factorization (TONMF) developed in (Kannan et al.  ... 
doi:10.32473/flairs.v34i1.128551 fatcat:dw3zqegorveudj5fyg4jbdripe

Spatial data mining for finding nearest neighbor and outlier detection

2016 International Journal of Latest Trends in Engineering and Technology  
In GIS there can be many problems that include finding the nearest neighbor or finding the outliers in a data set.  ...  Spatial databases support spatial indexing, efficient algorithms for processing spatial operations, and domain specific rules for query optimization.  ...  Outlier detection plays a vital role in astronomy for finding abnormal activity, and also in weather prediction.  ... 
doi:10.21172/1.81.004 fatcat:i3zv2lin5navvh4safdbkr6gey