Learning Sparse Log-Ratios for High-Throughput Sequencing Data [article]

Elliott Gordon-Rodriguez, Thomas P Quinn, John P Cunningham
2021 bioRxiv   pre-print
In the context of high-throughput genetic sequencing data, and Compositional Data more generally, an important class of features are the log-ratios between subsets of the input variables.  ...  Building on recent literature on continuous relaxations of discrete latent variables, we design a novel learning algorithm that identifies sparse log-ratios several orders of magnitude faster than competing  ...  Log-ratios are an important class of features for analyzing high-throughput sequencing (HTS) metagenomic data (Wooley et al., 2010; Gloor & Reid, 2016; Gloor et al., 2017; Quinn et al., 2018).  ... 
doi:10.1101/2021.02.11.430695 fatcat:hepgni7uabbpnl6so32j5hybfq

Learning Sparse Log-Ratios for High-Throughput Sequencing Data

Elliott Gordon-Rodriguez, Thomas P Quinn, John P Cunningham, Pier Luigi Martelli
2021 Bioinformatics  
In the context of high-throughput sequencing (HTS) data, and compositional data (CoDa) more generally, an important class of biomarkers are the log-ratios between the input variables.  ...  intractable for existing sparse log-ratio selection methods.  ...  Thus, learning sparse log-ratios is a central problem in CoDa.  ... 
doi:10.1093/bioinformatics/btab645 pmid:34498030 pmcid:PMC8696089 fatcat:46vfdmfbnzduljfzj4qyc4pfli

AtlantECO deliverable 5.1- Reference catalogue of network reconstruction methods

Budinich Marko, Eveillard Damien, Chaffron Samuel
2022 Zenodo  
We built a computational pipeline for the inference of species ecological networks from heterogeneous data types, combining statistical and ecological metrics, as well as probabilistic and machine learning  ...  This workflow will be used in Task 5.2 to build an all-Atlantic plankton ecological network from omics data compiled within WP2.  ...  In addition, due to technical factors, sequencing count data are sparse, that is contain many zeros.  ... 
doi:10.5281/zenodo.6405186 fatcat:vbiu3fupafeidlamqmo3l45xdy

Multichannel ALOHA with Exploration Phase [article]

Jinho Choi
2020 arXiv   pre-print
In this paper, we consider exploration for multichannel ALOHA by transmitting preambles before transmitting data packets and show that the maximum throughput can be improved by a factor of 2 - exp(-1)  ...  In order to see whether or not there are other active users, suppose that each active user can transmit a preamble sequence before data packet transmission, which can be seen as the exploration to learn  ...  To avoid it, as suggested in [30], sparse preamble sequences can be used. (b) with L = ⌈ λ 2 2δM 2 ⌉.  ... 
arXiv:2001.11115v1 fatcat:s76uq7xkmbbpzm5cievudcwwjq

Smart Contract Vulnerability Detection using Graph Neural Network

Yuan Zhuang, Zhenguang Liu, Peng Qian, Qi Liu, Xiang Wang, Qinming He
2020 Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence  
Then, we propose a degree-free graph convolutional neural network (DR-GCN) and a novel temporal message propagation network (TMP) to learn from the normalized graphs for vulnerability detection.  ...  In this paper, we explore using graph neural networks (GNNs) for smart contract vulnerability detection.  ...  A standard self-attention layer has complexity of O(n² · d), which is too high for long sequences.  ... 
doi:10.24963/ijcai.2020/450 dblp:conf/ijcai/ZhaoLZZ20 fatcat:uvbo74cs4ffyllmdquako6q6pq

A Critique of Differential Abundance Analysis, and Advocacy for an Alternative [article]

Thomas P Quinn, Elliott Gordon-Rodriguez, Ionas Erb
2021 arXiv   pre-print
Beyond replacing DAA, they can also be used for many other bespoke analyses, including dimension reduction and multi-omics data integration.  ...  It is largely taken for granted that differential abundance analysis is, by default, the best first step when analyzing genomic data. We argue that this is not necessarily the case.  ...  To understand normalization, we first need to understand the biases found in high-throughput sequencing data. Dillies et al.  ... 
arXiv:2104.07266v2 fatcat:rc3qpqspwrb2rfrfk6mygb4bv4

Learning Compositional Representations of Interacting Systems with Restricted Boltzmann Machines: Comparative Study of Lattice Proteins [article]

Jérôme Tubiana, Simona Cocco, Rémi Monasson
2019 arXiv   pre-print
As such, RBM were recently proposed for characterizing the patterns of coevolution between amino acids in protein sequences and for designing new sequences.  ...  and their sparse variants.  ...  J.T. acknowledges funding from the Safra Center for Bioinformatics, Tel Aviv University.  ... 
arXiv:1902.06495v1 fatcat:wfugdutgc5dlpni3g4wgcd2g6m

Machine learning at the limit

John Canny, Huasha Zhao, Bobby Jaros, Ye Chen, Jiangchang Mao
2015 2015 IEEE International Conference on Big Data (Big Data)  
We have shown that Kylix approaches the practical network throughput limit for allreduce, a basic primitive for distributed machine learning.  ...  Many systems have been developed for machine learning at scale.  ...  This is very significant for machine learning on typical data (text, social networks, web data, server logs,..).  ... 
doi:10.1109/bigdata.2015.7363760 dblp:conf/bigdataconf/CannyZJCM15 fatcat:wrhujzxjcnhkjphusw5lv46rca

CAMAMED: a pipeline for composition-aware mapping-based analysis of metagenomic data

Mohammad H Norouzi-Beirami, Sayed-Amir Marashi, Ali M Banaei-Moghaddam, Kaveh Kavousi
2021 NAR Genomics and Bioinformatics  
Due to the highly compositional nature of metagenomic data, the cumulative sum-scaling method is used at both taxa and gene levels for compositional data analysis in our pipeline.  ...  When appropriate gene catalogs are available, mapping-based methods are preferred over assembly based approaches, especially for analyzing the data at the functional level.  ...  handling composition bias and zero-inflation in high-throughput sequencing data Software Packages/tools developed to handle sparsity in metagenome data.  ... 
doi:10.1093/nargab/lqaa107 pmid:33575649 pmcid:PMC7787360 fatcat:egjnspcwqbeedhp5vzkjzpvxye

Deep generative models of genetic variation capture mutation effects [article]

Adam J. Riesselman, John B. Ingraham, Debora S. Marks
2017 arXiv   pre-print
The model, learned in an unsupervised manner solely from sequence information, is grounded with biologically motivated priors, reveals latent organization of sequence families, and can be used to extrapolate  ...  independent or pairwise models that are based on the same evolutionary data.  ...  While in progress Sinai et al also reported on use of variational autoencoders for protein sequences [84] . A.J.R. is supported by DOE CSGF fellowship DE-FG02-97ER25308.  ... 
arXiv:1712.06527v1 fatcat:tnzxib67fnbzbcela3a6qqf2zu

Tree-aggregated predictive modeling of microbiome data

Jacob Bien, Xiaohan Yan, Léo Simpson, Christian L. Müller
2021 Scientific Reports  
Modern high-throughput sequencing technologies provide low-cost microbiome survey data across all habitats of life at unprecedented scale.  ...  By contrast, our framework, which we call trac (tree-aggregation of compositional data), learns data-adaptive taxon aggregation levels for predictive modeling, greatly reducing the need for user-defined aggregation  ...  The framework leverages the hierarchical nature of microbial sequencing data to learn parsimonious log-ratios of microbial compositions along the taxonomic or phylogenetic tree that best predict continuous  ... 
doi:10.1038/s41598-021-93645-3 pmid:34267244 pmcid:PMC8282688 fatcat:mqurup26czhmfh6hftw5eanixy

Investigation of Adiposity Measures and Operational Taxonomic unit (OTU) Data Transformation Procedures in Stool Samples from a German Cohort Study Using Machine Learning Algorithms

Martina Troll, Stefan Brandmaier, Sandra Reitmeier, Jonathan Adam, Sapna Sharma, Alice Sommer, Marie-Abèle Bind, Klaus Neuhaus, Thomas Clavel, Jerzy Adamski, Dirk Haller, Annette Peters (+1 others)
2020 Microorganisms  
We analyzed around 2000 stool samples from the KORA (Cooperative Health Research in the Region of Augsburg) cohort using high-throughput 16S rRNA gene amplicon sequencing representing a total microbial  ...  transformation approaches (i.e., no transformation, relative abundance without and with log-transformation, as well as centered and isometric log-ratio transformations); and (iii) predictions from nine  ...  Microbiota Profiling by 16S rRNA Amplicon Sequencing The microbiome measurement procedures via high-throughput 16S rRNA gene sequencing have been described in detail elsewhere [19].  ... 
doi:10.3390/microorganisms8040547 pmid:32290101 fatcat:roswnsfuxbfw7od3gb37sxpaua

Disentangling microbial associations from hidden environmental and technical factors via latent graphical models [article]

Zachary D Kurtz, Richard Bonneau, Christian L Müller
2019 bioRxiv   pre-print
Our method comes with theoretical performance guarantees and is available within the SParse InversE Covariance estimation for Ecological ASsociation Inference (SPIEC-EASI) framework (SpiecEasi R-package  ...  to jointly identify compositional biases, latent factors that correlate with observed technical covariates, and robust statistical microbial associations that replicate across different gut microbial data  ...  High-throughput amplicon and metagenomic sequencing techniques are transforming our understanding of microbial ecosystems.  ... 
doi:10.1101/2019.12.21.885889 fatcat:vk4zyf7bw5ghfic42m325mcexe

JIZHI: A Fast and Cost-Effective Model-As-A-Service System for Web-Scale Online Inference at Baidu [article]

Hao Liu, Qian Gao, Jiang Li, Xiaochao Liao, Hao Xiong, Guangxing Chen, Wenlin Wang, Guobao Yang, Zhiwei Zha, Daxiang Dong, Dejing Dou, Haoyi Xiong
2021 arXiv   pre-print
In modern internet industries, deep learning based recommender systems have become an indispensable building block for a wide spectrum of applications, such as search engine, news feed, and short video  ...  Moreover, an intelligent resource manager has been deployed to maximize the throughput of JIZHI over the shared infrastructure by searching the optimal resource allocation plan from historical logs and  ...  To improve the recommendation performance, the industrial DNNs are usually trained with massive data samples, where each data sample is typically extremely high-dimensional and sparse (i.e., hundreds billions  ... 
arXiv:2106.01674v1 fatcat:zogjjhrfezc3pkboekw56ebf5y

iAMCTD: Improved Adaptive Mobility of Courier Nodes in Threshold-Optimized DBR Protocol for Underwater Wireless Sensor Networks

N. Javaid, M. R. Jafri, Z. A. Khan, U. Qasim, T. A. Alghamdi, M. Ali
2014 International Journal of Distributed Sensor Networks  
Unlike existing depth-based acoustic protocols, the proposed protocol exploits network density for time-critical applications.  ...  In order to tackle flooding, path loss, and propagation latency, we calculate optimal holding time and use routing metrics: localization-free signal-to-noise ratio (LSNR), signal quality index (SQI  ...  Figure 1: Routing of on-demand data. 10 log N_t(f) = 17 − 30 log f, 10 log N_s(f) = 40 + 20(s − 0.5) + 26 log f ..., 10 log N_w(f) = 50 + 7.5 w^{1/2} + 20 log f − 40 log(f + 0.4), 10 log N_th(f) = −  ... 
doi:10.1155/2014/213012 fatcat:gadvl3g2sffdzmzw3pi2hnspcq