BGT: efficient and flexible genotype query across many samples

Heng Li
2015 Bioinformatics  
Summary: BGT is a compact format, a fast command line tool and a simple web application for efficient and convenient query of whole-genome genotypes and frequencies across tens to hundreds of thousands  ...  On real data, it encodes the haplotypes of 32,488 samples across 39.2 million SNPs into a 7.4GB database and decodes a couple of hundred million genotypes per CPU second.  ... 
doi:10.1093/bioinformatics/btv613 pmid:26500154 pmcid:PMC5963361 fatcat:fzsumximy5frrlv7npbssjzpsi

SeqArray—a storage-efficient high-performance data format for WGS variant calls

Xiuwen Zheng, Stephanie M Gogarten, Michael Lawrence, Adrienne Stilp, Matthew P Conomos, Bruce S Weir, Cathy Laurie, David Levine, Inanc Birol
2017 Bioinformatics  
Variant call format (VCF) is a general text-based format developed to store variant genotypes and their annotations. However, VCF files are large and data retrieval is relatively slow.  ...  Results: Benchmarks using 1000 Genomes Phase 3 data show file sizes are 14.0 Gb (VCF), 12.3 Gb (BCF, binary VCF), 3.5 Gb (BGT) and 2.6 Gb (SeqArray) respectively.  ...  Acknowledgements We thank Roy Kuraisa and other members of the Genetic Analysis Center in the Department of Biostatistics at University of Washington for using and testing the utilities in the SeqArray  ... 
doi:10.1093/bioinformatics/btx145 pmid:28334390 pmcid:PMC5860110 fatcat:ocdm6zt3ljbvvnarlrxzrghunm

VCFShark: how to squeeze a VCF file [article]

Sebastian Deorowicz, Agnieszka Danek
2020 bioRxiv   pre-print
We propose VCFShark squeezing them up to an order of magnitude better than the de facto standards (gzipped VCF and BCF). Availability and Implementation: .  ...  Funding The work was supported by National Science Centre, Poland, project DEC-2017/25/B/ST6/01525 and by POIG.02.03.01-24-099/13 grant: "GeCONiI-Upper Silesian Center for Computational Science and Engineering  ...  (b) Compression ratios for 1000GPp3 and HRC datasets. Li,H. (2015) BGT: efficient and flexible genotype query across many samples. Bioinformatics 32, 590-592.  ... 
doi:10.1101/2020.12.18.423437 fatcat:xvapvrprsrdennkqeiroz2f7b4

BGEN: a binary file format for imputed genotype and haplotype data [article]

Gavin Band, Jonathan Marchini
2018 bioRxiv   pre-print
imputed genotypes per second.  ...  Here we present a binary file format (the BGEN format) that can store both directly-typed and statistically imputed genotype data, and achieves substantial space savings by data compression and the use  ...  BGT: efficient and flexible genotype query across many samples. Bioinformatics 32, 590-592, Band G. and Marchini J.  ... 
doi:10.1101/308296 fatcat:rrollgg3kbfgdipxnblgir7k2u

Vcfanno: fast, flexible annotation of genetic variants

Brent S. Pedersen, Ryan M. Layer, Aaron R. Quinlan
2016 Genome Biology  
Here we describe vcfanno, which flexibly extracts and summarizes attributes from multiple annotation files and integrates the annotations within the INFO column of the original VCF file.  ...  too many homozygous alternative genotypes to be plausible for a rare, recessive disorder.  ...  When annotating the 1000 Genomes VCF that includes 2504 sample genotypes, vcfanno requires 42 minutes using 16 cores, versus 17 minutes without genotypes.  ... 
doi:10.1186/s13059-016-0973-5 pmid:27250555 pmcid:PMC4888505 fatcat:3dxgarc47zdmhpqrsoksmhleia

The Next Generation Precision Medical Record - A Framework for Integrating Genomes and Wearable Sensors with Medical Records [article]

Daryl Waggott, Anja Bog, Enakshi Singh, Prag Batra, Mark Wright, Euan Ashley
2016 bioRxiv   pre-print
The current phase is being rolled out to over 1500 patients in clinics across the hospital system.  ...  Core functionality included patient timelines with integrated text analytics, personalized genomic curation and wearable alerts.  ...  BGT developed by Heng Li is a promising format which aims to be more compact in size, more efficient to process, and more flexible on query than conventional BCF2 format.  ... 
doi:10.1101/039651 fatcat:67n67bemcbbopfo4v5uilizrl4

Robust and rapid algorithms facilitate large-scale whole genome sequencing downstream analysis in an integrative framework

Miaoxin Li, Jiang Li, Mulin Jun Li, Zhicheng Pan, Jacob Shujui Hsu, Dajiang J. Liu, Xiaowei Zhan, Junwen Wang, Youqiang Song, Pak Chung Sham
2017 Nucleic Acids Research  
KGGSeq's bit-block genotype format used 1.5% or less space to flexibly represent phased or unphased genotypes with multiple alleles and achieved a speed of over 1000 times faster to calculate genotypic  ...  ANNOVAR and SNPEff).  ...  ACKNOWLEDGEMENTS The authors acknowledge a number of institutes and projects for their free data sets used in this study: 1000 Genomes Project, Online Mendelian Inheritance in Man, dbNSFP, etc.  ... 
doi:10.1093/nar/gkx019 pmid:28115622 pmcid:PMC5435951 fatcat:7aohk5mghnabbep6cvvim73owa

Proceedings_of_Measuring_Behavior_2014.pdf [article]

Andrew Spink, Egon Van Den Broek, Leanne Loijens, Marta Woloszynowska-Fraser, Lucas P. J. J. Noldus
2020 Figshare  
Proceedings of the 9th International Conference on Methods and Techniques in Behavioral Research  ...  For this study the gull with the PTT id 41781 (sampling interval 1 hour) was used to test the effect of animal movement with different environmental variables at different temporal scales.  ...  Acknowledgement The study was supported by grants No. APVV-0753-10 and APVV-0539-12.  ... 
doi:10.6084/m9.figshare.11708187 fatcat:62o74kbgczf2tp2dk64jeszxr4

Abstracts Presented at the Thirtieth Annual International Neuropsychological Society Conference, February 13–16, 2002 Toronto, Canada

2002 Journal of the International Neuropsychological Society  
The criteria set encompasses a broader and more heterogeneous range of impaired individuals, is not limited to amnesic deficits, is comparable across samples, has the possibility for uniform application  ...  Recognition Discrimination and Response Bias Across the Life-Span.  ...  Education and gender effects remained consistent across samples.  ... 
doi:10.1017/s1355617702822019 fatcat:3oxum6t47jhxdkyttu2aqc2mwi

Final Program, Abstracts Presented at the Thirty–Fifth Annual International Neuropsychological Society Conference, February 7–10, Portland, Oregon, USA

2007 Journal of the International Neuropsychological Society  
Care (BDI-PC), Chicago Multiscale Depression Inventory (CMDI) combined mood and evaluative scales, and the Depression Proneness Rating Scale (DPRS).  ...  Objective: To examine the differences in disease variables, age, depression proneness, and reported depression between relapsing remitting and secondary progressive MS patients.  ...  for genotype (all ps < .07).  ... 
doi:10.1017/s1355617707079969 fatcat:hyqc3ut5ybg3rnk6shatuylham

Ecological Genomics of Marine Picocyanobacteria

D. J. Scanlan, M. Ostrowski, S. Mazard, A. Dufresne, L. Garczarek, W. R. Hess, A. F. Post, M. Hagemann, I. Paulsen, F. Partensky
2009 Microbiology and Molecular Biology Reviews  
ACKNOWLEDGMENTS We thank Penny Chisholm, Nathan Ahlgren, and Gabrielle Rocap for providing access to submitted papers. The work presented in this review was supported by the European  ...  Two Cu-transporting P1-type ATPases have been described in some freshwater cyanobacteria, presumed to be one (CtaA) for transport across the cytoplasmic membrane and one (PacS) for transport across the  ...  g., clusters 1309, 1934, 1966, 2444, 2503, 6742, 6856, 7044 to -6, and 8072 to -3) or that the bgt and N-II systems described above are more promiscuous in their uptake capacity in these marine strains  ... 
doi:10.1128/mmbr.00035-08 pmid:19487728 pmcid:PMC2698417 fatcat:u67qa4qdergvnh6dyao7ehtjrm

A reference sequence for Blumeria graminis f. sp. tritici (wheat powdery mildew) and its application for comparative and evolutionary genomics

Simone Oberhänsli
Powdery mildew of wheat and barley are caused by B.g. tritici and B.g. hordei, respectively.  ...  highly diverse and massively populated with transposable elements (TEs).  ...  We thank the Blugen consortium (www.blugen.org) and especially Dr.  ... 
doi:10.5167/uzh-89957 fatcat:op7i4kre7zgehlyeiuajs22gwu

On the Impact of Transposon Activity on Genome Evolution

Stefan Roffler
Bennett MD and Smith JB: Nuclear DNA amounts in angiosperms. Philos Trans R Soc Lond B Biol Sci 1976, 274:227-274.  ...  Dissertation submitted for the degree of Doctor of Natural Sciences (Dr. sc. nat.) to the Faculty of Mathematics and Natural Sciences of the Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation  ...  Wing for granting early access to the O. glaberrima data and G. Treier for support in the statistical analysis. This study was supported by the Swiss National Foundation grant # 31003A_138505/1.  ... 
doi:10.5167/uzh-129345 fatcat:r57o43ql4ngtjiak6vayjqmjpm