Learning Sparse Log-Ratios for High-Throughput Sequencing Data
[article]
2021
bioRxiv
pre-print
In the context of high-throughput genetic sequencing data, and Compositional Data more generally, an important class of features are the log-ratios between subsets of the input variables. ...
Building on recent literature on continuous relaxations of discrete latent variables, we design a novel learning algorithm that identifies sparse log-ratios several orders of magnitude faster than competing ...
Log-ratios are an important class of features for analyzing high-throughput sequencing (HTS) metagenomic data (Wooley et al., 2010; Gloor & Reid, 2016; Gloor et al., 2017; Quinn et al., 2018). ...
doi:10.1101/2021.02.11.430695
fatcat:hepgni7uabbpnl6so32j5hybfq
Learning Sparse Log-Ratios for High-Throughput Sequencing Data
2021
Bioinformatics
In the context of high-throughput sequencing (HTS) data, and compositional data (CoDa) more generally, an important class of biomarkers are the log-ratios between the input variables. ...
intractable for existing sparse log-ratio selection methods. ...
Thus, learning sparse log-ratios is a central problem in CoDa. ...
doi:10.1093/bioinformatics/btab645
pmid:34498030
pmcid:PMC8696089
fatcat:46vfdmfbnzduljfzj4qyc4pfli
AtlantECO deliverable 5.1- Reference catalogue of network reconstruction methods
2022
Zenodo
We built a computational pipeline for the inference of species ecological networks from heterogeneous data types, combining statistical and ecological metrics, as well as probabilistic and machine learning ...
This workflow will be used in Task 5.2 to build an all-Atlantic plankton ecological network from omics data compiled within WP2. ...
In addition, due to technical factors, sequencing count data are sparse, that is, they contain many zeros. ...
doi:10.5281/zenodo.6405186
fatcat:vbiu3fupafeidlamqmo3l45xdy
Multichannel ALOHA with Exploration Phase
[article]
2020
arXiv
pre-print
In this paper, we consider exploration for multichannel ALOHA by transmitting preambles before transmitting data packets and show that the maximum throughput can be improved by a factor of 2 - exp(-1) ...
In order to see whether or not there are other active users, suppose that each active user can transmit a preamble sequence before data packet transmission, which can be seen as the exploration to learn ...
To avoid it, as suggested in [30], sparse preamble sequences can be used. (b) with L = ⌈ λ 2 2δM 2 ⌉. ...
arXiv:2001.11115v1
fatcat:s76uq7xkmbbpzm5cievudcwwjq
Smart Contract Vulnerability Detection using Graph Neural Network
2020
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Then, we propose a degree-free graph convolutional neural network (DR-GCN) and a novel temporal message propagation network (TMP) to learn from the normalized graphs for vulnerability detection. ...
In this paper, we explore using graph neural networks (GNNs) for smart contract vulnerability detection. ...
A standard self-attention layer has complexity of O(n² · d), which is too high for long sequences. ...
doi:10.24963/ijcai.2020/450
dblp:conf/ijcai/ZhaoLZZ20
fatcat:uvbo74cs4ffyllmdquako6q6pq
A Critique of Differential Abundance Analysis, and Advocacy for an Alternative
[article]
2021
arXiv
pre-print
Beyond replacing DAA, they can also be used for many other bespoke analyses, including dimension reduction and multi-omics data integration. ...
It is largely taken for granted that differential abundance analysis is, by default, the best first step when analyzing genomic data. We argue that this is not necessarily the case. ...
To understand normalization, we first need to understand the biases found in high-throughput sequencing data. Dillies et al. ...
arXiv:2104.07266v2
fatcat:rc3qpqspwrb2rfrfk6mygb4bv4
Learning Compositional Representations of Interacting Systems with Restricted Boltzmann Machines: Comparative Study of Lattice Proteins
[article]
2019
arXiv
pre-print
As such, RBM were recently proposed for characterizing the patterns of coevolution between amino acids in protein sequences and for designing new sequences. ...
and their sparse variants. ...
J.T. acknowledges funding from the Safra Center for Bioinformatics, Tel Aviv University. ...
arXiv:1902.06495v1
fatcat:wfugdutgc5dlpni3g4wgcd2g6m
Machine learning at the limit
2015
2015 IEEE International Conference on Big Data (Big Data)
We have shown that Kylix approaches the practical network throughput limit for allreduce, a basic primitive for distributed machine learning. ...
Many systems have been developed for machine learning at scale. ...
This is very significant for machine learning on typical data (text, social networks, web data, server logs,..). ...
doi:10.1109/bigdata.2015.7363760
dblp:conf/bigdataconf/CannyZJCM15
fatcat:wrhujzxjcnhkjphusw5lv46rca
CAMAMED: a pipeline for composition-aware mapping-based analysis of metagenomic data
2021
NAR Genomics and Bioinformatics
Due to the highly compositional nature of metagenomic data, the cumulative sum-scaling method is used at both taxa and gene levels for compositional data analysis in our pipeline. ...
When appropriate gene catalogs are available, mapping-based methods are preferred over assembly based approaches, especially for analyzing the data at the functional level. ...
handling composition bias and zero-inflation in high-throughput sequencing data Software Packages/tools developed to handle sparsity in metagenome data. ...
doi:10.1093/nargab/lqaa107
pmid:33575649
pmcid:PMC7787360
fatcat:egjnspcwqbeedhp5vzkjzpvxye
Deep generative models of genetic variation capture mutation effects
[article]
2017
arXiv
pre-print
The model, learned in an unsupervised manner solely from sequence information, is grounded with biologically motivated priors, reveals latent organization of sequence families, and can be used to extrapolate ...
independent or pairwise models that are based on the same evolutionary data. ...
While in progress Sinai et al. also reported on use of variational autoencoders for protein sequences [84]. A.J.R. is supported by DOE CSGF fellowship DE-FG02-97ER25308. ...
arXiv:1712.06527v1
fatcat:tnzxib67fnbzbcela3a6qqf2zu
Tree-aggregated predictive modeling of microbiome data
2021
Scientific Reports
Modern high-throughput sequencing technologies provide low-cost microbiome survey data across all habitats of life at unprecedented scale. ...
By contrast, our framework, which we call trac (tree-aggregation of compositional data), learns data-adaptive taxon aggregation levels for predictive modeling, greatly reducing the need for user-defined aggregation ...
The framework leverages the hierarchical nature of microbial sequencing data to learn parsimonious log-ratios of microbial compositions along the taxonomic or phylogenetic tree that best predict continuous ...
doi:10.1038/s41598-021-93645-3
pmid:34267244
pmcid:PMC8282688
fatcat:mqurup26czhmfh6hftw5eanixy
Investigation of Adiposity Measures and Operational Taxonomic unit (OTU) Data Transformation Procedures in Stool Samples from a German Cohort Study Using Machine Learning Algorithms
2020
Microorganisms
We analyzed around 2000 stool samples from the KORA (Cooperative Health Research in the Region of Augsburg) cohort using high-throughput 16S rRNA gene amplicon sequencing representing a total microbial ...
transformation approaches (i.e., no transformation, relative abundance without and with log-transformation, as well as centered and isometric log-ratio transformations); and (iii) predictions from nine ...
Microbiota Profiling by 16S rRNA Amplicon Sequencing: The microbiome measurement procedures via high-throughput 16S rRNA gene sequencing have been described in detail elsewhere [19]. ...
doi:10.3390/microorganisms8040547
pmid:32290101
fatcat:roswnsfuxbfw7od3gb37sxpaua
Disentangling microbial associations from hidden environmental and technical factors via latent graphical models
[article]
2019
bioRxiv
pre-print
Our method comes with theoretical performance guarantees and is available within the SParse InversE Covariance estimation for Ecological ASsociation Inference (SPIEC-EASI) framework (SpiecEasi R-package ...
to jointly identify compositional biases, latent factors that correlate with observed technical covariates, and robust statistical microbial associations that replicate across different gut microbial data ...
High-throughput amplicon and metagenomic sequencing techniques are transforming our understanding of microbial ecosystems. ...
doi:10.1101/2019.12.21.885889
fatcat:vk4zyf7bw5ghfic42m325mcexe
JIZHI: A Fast and Cost-Effective Model-As-A-Service System for Web-Scale Online Inference at Baidu
[article]
2021
arXiv
pre-print
In modern internet industries, deep learning based recommender systems have become an indispensable building block for a wide spectrum of applications, such as search engine, news feed, and short video ...
Moreover, an intelligent resource manager has been deployed to maximize the throughput of JIZHI over the shared infrastructure by searching the optimal resource allocation plan from historical logs and ...
To improve the recommendation performance, the industrial DNNs are usually trained with massive data samples, where each data sample is typically extremely high-dimensional and sparse (i.e., hundreds of billions ...
arXiv:2106.01674v1
fatcat:zogjjhrfezc3pkboekw56ebf5y
iAMCTD: Improved Adaptive Mobility of Courier Nodes in Threshold-Optimized DBR Protocol for Underwater Wireless Sensor Networks
2014
International Journal of Distributed Sensor Networks
Unlike existing depth-based acoustic protocols, the proposed protocol exploits network density for time-critical applications. ...
In order to tackle flooding, path loss, and propagation latency, we calculate optimal holding time ( ) and use routing metrics: localization-free signal-to-noise ratio (LSNR), signal quality index (SQI ...
Figure 1: Routing of on-demand data. $10 \log N_t(f) = 17 - 30 \log f$, $10 \log N_s(f) = 40 + 20(s - 0.5) + 26 \log f$ ..., $10 \log N_w(f) = 50 + 7.5 w^{1/2} + 20 \log f - 40 \log(f + 0.4)$, $10 \log N_{th}(f) = -$ ...
doi:10.1155/2014/213012
fatcat:gadvl3g2sffdzmzw3pi2hnspcq