Significance and Recovery of Block Structures in Binary Matrices with Noise
[chapter]

2006
*Lecture Notes in Computer Science*

Frequent itemset mining (FIM) is one of the core problems in the field of Data Mining and occupies a central place in its literature. ... We begin by establishing several results concerning the extremal behavior of submatrices of ones in a binary matrix with random entries. ... [28] assessed the significance of bi-clusters in a real-valued matrix using likelihood-based weights, a normal approximation and a standard Bonferroni bound to account for the multiplicity of submatrices ...

doi:10.1007/11776420_11
fatcat:gu4qi73knjhrxdbkoqkc7x6diy
###
Greedy Search-Binary PSO Hybrid for Biclustering Gene Expression Data

2010
*International Journal of Computer Applications*

As a useful data mining technique biclustering identifies local patterns from gene expression data. ... A bicluster of a gene expression dataset is a subset of genes which exhibit similar expression patterns along a subset of conditions. ... Moreover clustering happens to partition the genes into disjoint sets i.e. each gene is associated with a single biological function, which in fact is in contradiction to the biological system [1]. ...

doi:10.5120/651-908
fatcat:7klls6iavzevvpfd3rsho3gzva
###
Partitioning a matrix with non-guillotine cuts to minimize the maximum cost

2002
*Discrete Applied Mathematics*

We consider the problem of partitioning a matrix of m rows and n columns of non-negative integers into M smaller submatrices. ... With each submatrix is associated a cost equal to the sum of its elements. The objective is to minimize the cost of the submatrix of maximum cost. ... The second application deals with the balanced subdivision of a rectangular mining area among M mining companies. ...

doi:10.1016/s0166-218x(00)00286-9
fatcat:mxj2avhc2zebfosyyskvh5k7mu
###
On the maximal size of Large-Average and ANOVA-fit Submatrices in a Gaussian Random Matrix
[article]

2010
*arXiv*
pre-print

We investigate the maximal size of distinguished submatrices of a Gaussian random matrix. ... Of interest are submatrices whose entries have average greater than or equal to a positive constant, and submatrices whose entries are well-fit by a two-way ANOVA model. ... We would also like to thank John Hartigan for pointing out the use of the Gaussian comparison principle as an alternative way of obtaining the bounds of Proposition 1. ...

arXiv:1009.0562v1
fatcat:d6hrdp3reja7rc63ro35sh52ae
###
Partition of a Binary Matrix into k (k ≥ 3) Exclusive Row and Column Submatrices Is Difficult

2014
*Mathematical Problems in Engineering*

Biclustering in matrices with binary entries ("0"/"1") can be simplified into the problem of finding submatrices with entries of "1." ... Biclustering aims at finding a bicluster, a subset of objects that exhibit similar behavior across a subset of attributes, or vice versa. ... Moreover, the complexity of some variants of finding bicliques in bipartite graphs is open, for example, the maximum ±1 edge weight biclique problem [15]. ...

doi:10.1155/2014/934630
fatcat:s6ok6rn3frdzvoyshpcu77lxx4
###
Aggregated 2D range queries on clustered points

2016
*Information Systems*

Efficient processing of aggregated range queries on two-dimensional grids is a common requirement in information retrieval and data mining systems, for example in Geographic Information Systems and OLAP ... Our experimental evaluation shows that this technique can speed up aggregated queries up to more than an order of magnitude, with a small space overhead. ... To do this, we traverse the tree as in a top-k range query, but we only output weights whose value is in [w1, w2]. Moreover, we discard submatrices whose maximum weight is below w2. ...

doi:10.1016/j.is.2016.03.004
fatcat:4jrb2sthlbd5znzqmx4uq4ikri
###
SiBIC: A Tool for Generating a Network of Biclusters Captured by Maximal Frequent Itemset Mining
[chapter]

2018
*Msphere*

Acknowledgements: Part of this research has been supported by MEXT KAKENHI #16H02868 and #17H01783, ACCEL and PRESTO of JST and FiDiPro of Tekes. ... Click 'NODES' and then click 'W.DEG' (weighted degree) to sort the table. 3. Click a cell in the row of the node with the maximum weighted degree. 4. ... Gene Set Networks: To visualize the biclusters, we use gene set networks, each being a weighted graph, where a node corresponds to a coexpressed gene set and an edge indicates the difference of experimental ...

doi:10.1007/978-1-4939-8561-6_8
pmid:30030806
fatcat:jcwzay6gffcz3mmdeosessfh7y
###
On the maximal size of large-average and ANOVA-fit submatrices in a Gaussian random matrix

2013
*Bernoulli*

Running title: Maximal submatrices of a Gaussian random matrix. Keywords: analysis of variance, data mining, Gaussian random matrix, large average submatrix, random matrix theory, second moment method ... We investigate the maximal size of distinguished submatrices of a Gaussian random matrix. ... In particular, the vertex set V of G is the disjoint union of two sets V1 and V2, with |V1| = m and |V2| = n, corresponding to the rows and columns of X, respectively. ...

doi:10.3150/11-bej394
pmid:24194673
pmcid:PMC3816128
fatcat:wmzzsjshnzfdxg5q2ywjcbss3a
###
Mining discrete patterns via binary matrix factorization

2009
*Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '09*

A best approximation on such data has a minimum set of inconsistent entries, i.e., mismatches between the given binary data and the approximate matrix. ... Mining discrete patterns in binary data is important for subsampling, compression, and clustering. ... ., two disjoint submatrices. By this, two child nodes of the root are constructed. ...
###
Using SVD on Clusters to Improve Precision of Interdocument Similarity Measure

2016
*Computational Intelligence and Neuroscience*

Thirdly, two corpora, a Chinese corpus and an English corpus, are used to evaluate the performances of the proposed methods. ... Firstly, we make a survey of existing linear algebra methods for LSI, including both SVD based methods and non-SVD based methods. ... Classic weighting schemes [20, 21] are proposed on the basis of information about the frequency distribution of index terms within the whole collection or within the relevant and nonrelevant sets of ...

doi:10.1155/2016/1096271
pmid:27579031
pmcid:PMC4992544
fatcat:uhgrhvkr25fk3nvslbadbdtcaq
###
Homology Computation of Large Point Clouds using Quantum Annealing
[article]

2016
*arXiv*
pre-print

In this paper, we present a quantum annealing pipeline for computation of homology of large point clouds. The pipeline takes as input a graph approximating the given point cloud. ... It uses quantum annealing to compute a clique covering of the graph and then uses this cover to construct a Mayer-Vietoris complex. ... It consists of partitioning the vertex set of G into k non-empty and fixed-sized subsets so that the total weight of edges connecting distinct subsets is minimized. ...

arXiv:1512.09328v3
fatcat:pvx6jnwtkvds3lbrv4d6iif5le
###
Data Ranking and Clustering via Normalized Graph Cut Based on Asymmetric Affinity
[chapter]

2013
*Lecture Notes in Computer Science*

The first method requires a priori known class labeled data that can be utilized, e.g., for a calibration phase of a brain-computer interface (BCI). ... In this paper, we present an extension of the state-of-the-art normalized graph cut method based on asymmetry of the affinity matrix. ... Generally speaking, nCut is the maximum a posteriori estimation because its value depends on the number of entries in submatrices TL, TR, BL, BR, see Figure 1. ...

doi:10.1007/978-3-642-41184-7_57
fatcat:ynrmzn335ves5n6jaqxkpmaeuy
###
Finding Biclusters by Random Projections
[chapter]

2004
*Lecture Notes in Computer Science*

Given a matrix X composed of symbols, a bicluster is a submatrix of X obtained by removing some of the rows and some of the columns of X in such a way that each row of what is left reads the same string ... A detailed probabilistic analysis of the algorithm and an asymptotic study of the statistical significance of the solutions are given. We report results of extensive simulations on synthetic data. ... This problem has a variety of applications ranging from computational biology to data mining. ...

doi:10.1007/978-3-540-27801-6_8
fatcat:3xq7vcfvcnfsnffuusb5khcjc4
###
Finding biclusters by random projections

2006
*Theoretical Computer Science*

Given a matrix X composed of symbols, a bicluster is a submatrix of X obtained by removing some of the rows and some of the columns of X in such a way that each row of what is left reads the same string ... A detailed probabilistic analysis of the algorithm and an asymptotic study of the statistical significance of the solutions are given. We report results of extensive simulations on synthetic data. ... This problem has a variety of applications ranging from computational biology to data mining. ...

doi:10.1016/j.tcs.2006.09.023
fatcat:moai2q3rk5gblodxq6qyglioqu
###
Robust Calibration for Localization in Clustered Wireless Sensor Networks

2010
*IEEE Transactions on Automation Science and Engineering*

To use the FAST-LTS, one needs to input a trimming parameter, which is a function of the sensor redundancy in a network. ... Applying the robust estimators available from robust statistics research to a wireless sensor network, however, faces a number of computational challenges. ... (a) Disconnected clusters. (b) Connected clusters. The design matrix consists of disjoint submatrices. ...

doi:10.1109/tase.2009.2013475
fatcat:wjtkmor3mbduhg7mn45j6jk6ja
