Learning Markov Chain in Unordered Dataset [article]

Yao-Hung Hubert Tsai, Han Zhao, Ruslan Salakhutdinov, Nebojsa Jojic
2019 arXiv   pre-print
Nevertheless, datasets often exhibit rich structure in practice, and we argue that there exists some unknown order within the data instances.  ...  By assuming that the instances are sampled from a Markov chain, our goal is to learn the transitional operator of the underlying Markov chain, as well as the order by maximizing the generation probability  ...  We elaborate the design of the transition operator in Fig. 8. In our design, U can be seen as a gating mechanism between input X_t and the learned update X.  ... 
arXiv:1711.03167v3 fatcat:6hk6pj635fcjpopmgf4fknusj4

A Survey of Graph Mining Techniques for Biological Datasets [chapter]

S. Parthasarathy, S. Tatikonda, D. Ucar
2010 Managing and Mining Graph Data  
The field of bioinformatics has emerged as an important application area in this context. Examples abound, ranging from the analysis of protein interaction networks to the analysis of phylogenetic data.  ...  In this article we survey the principal results in the field, examining them both from the algorithmic contributions and applicability in the domain in question.  ...  There has been a tremendous amount of work done in developing fast algorithms to compute tree edit distance for both ordered and unordered trees.  ... 
doi:10.1007/978-1-4419-6045-0_18 dblp:series/ads/ParthasarathyTU10 fatcat:aeu53r3dbzd67d5whkjypv64uq

Local Learning for Mining Outlier Subgraphs from Network Datasets [chapter]

Manish Gupta, Arun Mallya, Subhro Roy, Jason H. D. Cho, Jiawei Han
2014 Proceedings of the 2014 SIAM International Conference on Data Mining  
Experimental results on several synthetic and real datasets show the effectiveness of the proposed approach in computing interesting outliers.  ...  For example for a co-authorship network, given a subgraph containing three authors, one expects all three authors to be say data mining authors.  ...  Acknowledgements The work was supported in part by the U. We would also like to thank the Institute for Genomic Biology at University of Illinois, Urbana Champaign for their equipment.  ... 
doi:10.1137/1.9781611973440.9 dblp:conf/sdm/GuptaMRCH14 fatcat:lrptpcgoabg4fczrohtvmapjyi

Discovering contemporaneous and lagged causal relations in autocorrelated nonlinear time series datasets [article]

Jakob Runge
2022 arXiv   pre-print
The method is order-independent and consistent in the oracle case.  ...  PCMCI^+ can be of considerable use in many real world application scenarios where often time resolutions are too coarse to resolve time delays and strong autocorrelation is present.  ...  Algorithm 2 tests all (unordered lagged and ordered contemporaneous) adjacent links (X^i_{t−τ}, X^j_t) and iterates through contemporaneous conditions S ⊆ A_t(X^j_t), but in addition each CI test is  ... 
arXiv:2003.03685v2 fatcat:sgrlwcow45gcjjazohioqcaek4

Clustering of Biological Datasets in the Era of Big Data

Richard Röttger
2016 Journal of Integrative Bioinformatics  
Summary: Clustering is a long-standing problem in computer science and is applied in virtually any scientific field for exploring the inherent structure of datasets.  ...  This manuscript provides an overview of the crucial steps and the most common techniques involved in conducting a state-of-the-art cluster analysis of biomedical datasets.  ...  For instance, density based clustering tools might be unable to discover any significant change of densities in such datasets.  ... 
doi:10.1515/jib-2016-300 fatcat:luooildgdrgfvd6bakmuiy7c54

SuperMIC: Analyzing Large Biological Datasets in Bioinformatics with Maximal Information Coefficient

Chao Wang, Dong Dai, Xi Li, Aili Wang, Xuehai Zhou
2017 IEEE/ACM Transactions on Computational Biology & Bioinformatics  
The maximal information coefficient (MIC) has been proposed to discover relationships and associations between pairs of variables.  ...  In this paper we explore a parallel approach which uses MapReduce framework to improve the computing efficiency and throughput of the MIC computation.  ...  Given a server number of k, we divide the input dataset in HDFS into k ordered parts with equal range according to the points' x value.  ... 
doi:10.1109/tcbb.2016.2550430 pmid:27076457 fatcat:7jtxobwxbba2dhj3wtuz3zsepe

A Survey on Session-based Recommender Systems [article]

Shoujin Wang, Longbing Cao, Yan Wang, Quan Z. Sheng, Mehmet Orgun, Defu Lian
2021 arXiv   pre-print
We propose a general problem statement of SBRSs, summarize the diversified data characteristics and challenges of SBRSs, and define a taxonomy to categorize the representative SBRS research.  ...  Finally, we discuss new research opportunities in this exciting and vibrant area.  ...  An ordered (unordered) session refers to a session in which the interactions are (not) chronologically ordered.  ... 
arXiv:1902.04864v3 fatcat:oka5bvibzzbk5oreltrupehaey

Phylogeny and the inference of evolutionary trajectories

Lillian Hancock, Erika J. Edwards
2014 Journal of Experimental Botany  
This study simulated ordered and unordered character evolution across a diverse set of phylogenetic trees to understand how tree size, models of evolution, and sampling efforts influence the ability to  ...  The simulations show that small trees (15 taxa) do not contain enough information to correctly infer either an ordered or unordered trajectory, although inference improves as tree size and sampling increases  ...  Acknowledgements This work was funded in part by the National Science Foundation (grant DEB-1252901 to E.J.E.).  ... 
doi:10.1093/jxb/eru118 pmid:24755279 pmcid:PMC4085962 fatcat:kn7hyvh27zfbfp5io5bl5txs7q

Structured Priors for Structure Learning [article]

Vikash Mansinghka, Charles Kemp, Thomas Griffiths, Joshua Tenenbaum
2012 arXiv   pre-print
For several realistic, sparse datasets, we show that the bias towards systematicity of connections provided by our model yields more accurate learned networks than a traditional, uniform prior approach  ...  Here we capture this form of prior knowledge in a hierarchical Bayesian framework, and exploit it to enable structure learning and type discovery from small datasets.  ...  Generate an ordering o of the K^+ classes in the partition z uniformly at random: P(o | z) = 1 / K^+! (2). Then o_a contains the order of class a. 3.  ... 
arXiv:1206.6852v1 fatcat:u5zozclwc5evpnuo2xqor7jlge

Network inference through synergistic subnetwork evolution

Lipi Acharya, Robert Reynolds, Dongxiao Zhu
2015 EURASIP Journal on Bioinformatics and Systems Biology  
Gene sets represent the sets of genes participating in active paths without prior knowledge of the order in which genes occur within each path.  ...  in the network.  ...  discovered at Generation Index 1, . . . , 1000.  ... 
doi:10.1186/s13637-015-0027-4 pmid:26640480 pmcid:PMC4662719 fatcat:gufrtuq7bzb2fgrkiorw2ascea

Learning frequent behaviours of the users in Intelligent Environments

Asier Aztiria
2010 Journal of Ambient Intelligence and Smart Environments  
In order to provide personalized and adapted services, it is necessary to know the preferences and habits of users.  ...  In the MavPad environment, LFPUBS was tested with different confidence levels using data collected in three different trials, whereas in the WSU Smart Apartment environment LFPUBS was able to discover a predefined  ...  Probabilistic methods such as Bayesian logic networks and Markov logic networks have been used to model activities [15] [16]. Other techniques have also been used.  ... 
doi:10.3233/ais-2010-0084 fatcat:i4ikkqf7orhntjurjzebofxrxq

The evolution of logic circuits for the purpose of protein contact map prediction

Samuel D. Chapman, Christoph Adami, Claus O. Wilke, Dukka B KC
2017 PeerJ  
and the selection of relevant features in a dataset.  ...  We show that such a method is feasible, and in addition that evolution allows the logic circuits to be trained on the dataset in an unbiased manner so that it can be used in both contact map prediction  ...  Knoester for developing the Markov network codebase used here and for help adapting it to this work.  ... 
doi:10.7717/peerj.3139 pmid:28439455 pmcid:PMC5398280 fatcat:dtdr6f67tvdizbqy5nt6pqw24q

Estimating Graphlet Statistics via Lifting [article]

Kirill Paramonov, James Sharpnack
2018 arXiv   pre-print
We outline three variants of lifted graphlet counts: the ordered, unordered, and shotgun estimators.  ...  This work introduces a framework for estimating the graphlet count, the number of occurrences of a small subgraph motif (e.g. a wedge or a triangle) in the network.  ...  INTRODUCTION In 1970, [9] discovered that transitivity (the tendency of friends of friends to be friends themselves) is a prevalent feature in social networks.  ... 
arXiv:1802.08736v1 fatcat:lcv4h7mei5ecnh7bcuuvh5xzh4

Markov Boundary Discovery with Ridge Regularized Linear Models [article]

Eric V. Strobl, Shyam Visweswaran
2015 arXiv   pre-print
Experimental results show that the modified RRLMs are competitive against state-of-the-art algorithms in discovering part of the Markov boundary from gene expression data.  ...  Our approach combines ideas in Markov boundary and sufficient dimension reduction theory.  ...  Expert-Designed Models We evaluated CRP on datasets generated from four expert-designed discrete Bayesian networks (brief descriptions are given in Table 1).  ... 
arXiv:1509.03935v1 fatcat:26xjxzfxtzgvtpiaommo4vfwtq

Time Series Motifs Statistical Significance [chapter]

Nuno Castro, Paulo J. Azevedo
2011 Proceedings of the 2011 SIAM International Conference on Data Mining  
This is unfeasible even for moderately sized datasets, since the number of discovered motifs tends to be prohibitively large.  ...  In this work we present an approach to calculate the statistical significance of time series motifs.  ...  Since a random process generated the database, all discovered motifs are meaningless. In fact, this example is depicted in Fig. 1.  ... 
doi:10.1137/1.9781611972818.59 dblp:conf/sdm/CastroA11 fatcat:l6bgcxtmi5bqfo2sqfxbu4mrfm