Significance and Recovery of Block Structures in Binary Matrices with Noise
[chapter]
2006
Lecture Notes in Computer Science
Frequent itemset mining (FIM) is one of the core problems in the field of Data Mining and occupies a central place in its literature. ...
We begin by establishing several results concerning the extremal behavior of submatrices of ones in a binary matrix with random entries. ...
[28] assessed the significance of bi-clusters in a real-valued matrix using likelihood-based weights, a normal approximation and a standard Bonferroni bound to account for the multiplicity of submatrices ...
doi:10.1007/11776420_11
fatcat:gu4qi73knjhrxdbkoqkc7x6diy
Greedy Search-Binary PSO Hybrid for Biclustering Gene Expression Data
2010
International Journal of Computer Applications
As a useful data mining technique, biclustering identifies local patterns in gene expression data. ...
A bicluster of a gene expression dataset is a subset of genes which exhibit similar expression patterns along a subset of conditions. ...
Moreover, clustering partitions the genes into disjoint sets, i.e., each gene is associated with a single biological function, which in fact contradicts the biological system [1]. ...
doi:10.5120/651-908
fatcat:7klls6iavzevvpfd3rsho3gzva
Partitioning a matrix with non-guillotine cuts to minimize the maximum cost
2002
Discrete Applied Mathematics
We consider the problem of partitioning a matrix of m rows and n columns of non-negative integers into M smaller submatrices. ...
With each submatrix is associated a cost equal to the sum of its elements. The objective is to minimize the cost of the submatrix of maximum cost. ...
The second application deals with the balanced subdivision of a rectangular mining area among M mining companies. ...
doi:10.1016/s0166-218x(00)00286-9
fatcat:mxj2avhc2zebfosyyskvh5k7mu
On the maximal size of Large-Average and ANOVA-fit Submatrices in a Gaussian Random Matrix
[article]
2010
arXiv
pre-print
We investigate the maximal size of distinguished submatrices of a Gaussian random matrix. ...
Of interest are submatrices whose entries have average greater than or equal to a positive constant, and submatrices whose entries are well-fit by a two-way ANOVA model. ...
We would also like to thank John Hartigan for pointing out the use of the Gaussian comparison principle as an alternative way of obtaining the bounds of Proposition 1. ...
arXiv:1009.0562v1
fatcat:d6hrdp3reja7rc63ro35sh52ae
Partition of a Binary Matrix into k (k ≥ 3) Exclusive Row and Column Submatrices Is Difficult
2014
Mathematical Problems in Engineering
Biclustering in matrices with binary entries ("0"/"1") can be simplified into the problem of finding submatrices with entries of "1." ...
Biclustering aims at finding a bicluster—a subset of objects that exhibit similar behavior across a subset of attributes, or vice versa. ...
Moreover, the complexity of some variants of finding bicliques in bipartite graphs is open, for example, the maximum ±1 edge weight biclique problem [15]. ...
doi:10.1155/2014/934630
fatcat:s6ok6rn3frdzvoyshpcu77lxx4
Aggregated 2D range queries on clustered points
2016
Information Systems
Efficient processing of aggregated range queries on two-dimensional grids is a common requirement in information retrieval and data mining systems, for example in Geographic Information Systems and OLAP ...
Our experimental evaluation shows that this technique can speed up aggregated queries up to more than an order of magnitude, with a small space overhead. ...
To do this, we traverse the tree as in a top-k range query, but we only output weights whose value is in [w1, w2]. Moreover, we discard submatrices whose maximum weight is below w2. ...
doi:10.1016/j.is.2016.03.004
fatcat:4jrb2sthlbd5znzqmx4uq4ikri
SiBIC: A Tool for Generating a Network of Biclusters Captured by Maximal Frequent Itemset Mining
[chapter]
2018
Msphere
Acknowledgements Part of this research has been supported by MEXT KAKENHI #16H02868 and #17H01783, ACCEL and PRESTO of JST and FiDiPro of Tekes. ...
Click 'NODES' and then click 'W.DEG' (weighted degree) to sort the table. 3. Click a cell in the row of the node with the maximum weighted degree. 4. ...
Gene Set Networks To visualize the biclusters, we use gene set networks, each being a weighted graph, where a node corresponds to a coexpressed gene set and an edge indicates the difference of experimental ...
doi:10.1007/978-1-4939-8561-6_8
pmid:30030806
fatcat:jcwzay6gffcz3mmdeosessfh7y
On the maximal size of large-average and ANOVA-fit submatrices in a Gaussian random matrix
2013
Bernoulli
Running title: Maximal submatrices of a Gaussian random matrix. Keywords: analysis of variance, data mining, Gaussian random matrix, large average submatrix, random matrix theory, second moment method ...
We investigate the maximal size of distinguished submatrices of a Gaussian random matrix. ...
In particular, the vertex set V of G is the disjoint union of two sets V1 and V2, with |V1| = m and |V2| = n, corresponding to the rows and columns of X, respectively. ...
doi:10.3150/11-bej394
pmid:24194673
pmcid:PMC3816128
fatcat:wmzzsjshnzfdxg5q2ywjcbss3a
Mining discrete patterns via binary matrix factorization
2009
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '09
A best approximation on such data has a minimum set of inconsistent entries, i.e., mismatches between the given binary data and the approximate matrix. ...
Mining discrete patterns in binary data is important for subsampling, compression, and clustering. ...
., two disjoint submatrices. By this, two child nodes of the root are constructed. ...
doi:10.1145/1557019.1557103
dblp:conf/kdd/ShenJY09
fatcat:hntbxwti7fguvfg2o5ow7rdf4y
Using SVD on Clusters to Improve Precision of Interdocument Similarity Measure
2016
Computational Intelligence and Neuroscience
Thirdly, two corpora, a Chinese corpus and an English corpus, are used to evaluate the performances of the proposed methods. ...
Firstly, we make a survey of existing linear algebra methods for LSI, including both SVD based methods and non-SVD based methods. ...
Classic weighting schemes [20, 21] are proposed on the basis of information about the frequency distribution of index terms within the whole collection or within the relevant and nonrelevant sets of ...
doi:10.1155/2016/1096271
pmid:27579031
pmcid:PMC4992544
fatcat:uhgrhvkr25fk3nvslbadbdtcaq
Homology Computation of Large Point Clouds using Quantum Annealing
[article]
2016
arXiv
pre-print
In this paper, we present a quantum annealing pipeline for computation of homology of large point clouds. The pipeline takes as input a graph approximating the given point cloud. ...
It uses quantum annealing to compute a clique covering of the graph and then uses this cover to construct a Mayer-Vietoris complex. ...
It consists of partitioning the vertex set of G into k non-empty and fixed-sized subsets so that the total weight of edges connecting distinct subsets is minimized. ...
arXiv:1512.09328v3
fatcat:pvx6jnwtkvds3lbrv4d6iif5le
Data Ranking and Clustering via Normalized Graph Cut Based on Asymmetric Affinity
[chapter]
2013
Lecture Notes in Computer Science
The first method requires a priori known class-labeled data that can be utilized, e.g., for a calibration phase of a brain-computer interface (BCI). ...
In this paper, we present an extension of the state-of-the-art normalized graph cut method based on asymmetry of the affinity matrix. ...
Generally speaking, nCut is the maximum a posteriori estimation because its value depends on the number of entries in submatrices TL, TR, BL, BR; see Figure 1. ...
doi:10.1007/978-3-642-41184-7_57
fatcat:ynrmzn335ves5n6jaqxkpmaeuy
Finding Biclusters by Random Projections
[chapter]
2004
Lecture Notes in Computer Science
Given a matrix X composed of symbols, a bicluster is a submatrix of X obtained by removing some of the rows and some of the columns of X in such a way that each row of what is left reads the same string ...
A detailed probabilistic analysis of the algorithm and an asymptotic study of the statistical significance of the solutions are given. We report results of extensive simulations on synthetic data. ...
This problem has a variety of applications ranging from computational biology to data mining. ...
doi:10.1007/978-3-540-27801-6_8
fatcat:3xq7vcfvcnfsnffuusb5khcjc4
Finding biclusters by random projections
2006
Theoretical Computer Science
Given a matrix X composed of symbols, a bicluster is a submatrix of X obtained by removing some of the rows and some of the columns of X in such a way that each row of what is left reads the same string ...
A detailed probabilistic analysis of the algorithm and an asymptotic study of the statistical significance of the solutions are given. We report results of extensive simulations on synthetic data. ...
This problem has a variety of applications ranging from computational biology to data mining. ...
doi:10.1016/j.tcs.2006.09.023
fatcat:moai2q3rk5gblodxq6qyglioqu
Robust Calibration for Localization in Clustered Wireless Sensor Networks
2010
IEEE Transactions on Automation Science and Engineering
To use the FAST-LTS, one needs to input a trimming parameter, which is a function of the sensor redundancy in a network. ...
Applying the robust estimators available from robust statistics research to a wireless sensor network, however, faces a number of computational challenges. ...
(a) Disconnected clusters. (b) Connected clusters. The design matrix consists of disjoint submatrices. ...
doi:10.1109/tase.2009.2013475
fatcat:wjtkmor3mbduhg7mn45j6jk6ja