Tabix: fast retrieval of sequence features from generic TAB-delimited files
2011
Bioinformatics
Tabix is the first generic tool that indexes position sorted files in TAB-delimited formats such as GFF, BED, PSL, SAM and SQL export, and quickly retrieves features overlapping specified regions. ...
Tabix features include few seek function calls per query, data compression with gzip compatibility and direct FTP/HTTP access. ...
of direct FTP/HTTP access and Jim Kent, James Bonfield and Richard Durbin for their helpful discussions on general indexing techniques. ...
doi:10.1093/bioinformatics/btq671
pmid:21208982
pmcid:PMC3042176
fatcat:5pshpfozwnb75piffwhkpd7agq
The Biological Reference Repository (BioR): a rapid and flexible system for genomics annotation
2014
Computer applications in the biosciences : CABIOS
The BioR toolkit provides the functionality to combine and retrieve annotation from these catalogs via the command-line interface. ...
Commands from the toolkit can be combined with other UNIX commands for advanced annotation processing. We also provide instructions for the development of custom annotation pipelines. ...
ACKNOWLEDGEMENT The authors thank the Center for Individualized Medicine at Mayo Clinic for funding the development of BioR. Conflict of Interest: none declared. ...
doi:10.1093/bioinformatics/btu137
pmid:24618464
pmcid:PMC4071205
fatcat:x6dcfudyynfa5pqqnj4ba3sle4
The variant call format and VCFtools
2011
Bioinformatics
VCF is usually stored in a compressed manner and can be indexed for fast data retrieval of variants from a range of positions on the reference genome. ...
VCFtools is a software suite that implements various utilities for processing VCF files, including validation, merging, comparing and also provides a general Perl API. Availability: ...
Conflict of Interest: none declared. ...
doi:10.1093/bioinformatics/btr330
pmid:21653522
pmcid:PMC3137218
fatcat:bu6imoalw5hypbfua45gzlsnpy
GORpipe: a query tool for working with sequence data based on a Genomic Ordered Relational (GOR) architecture
2016
Bioinformatics
Motivation: Our aim was to create a general-purpose relational data format and analysis tools to provide an efficient and coherent framework for working with large volumes of DNA sequence data. ...
The system can for instance be used to annotate sequence variants, find genomic spatial overlap between various types of genomic features, filter and aggregate them in various ways. ...
In the rest of this paper we introduce the GOR architecture, briefly describe our tab-delimited storage format and explain how other genomic ordered tabular formats can be used with our system. ...
doi:10.1093/bioinformatics/btw199
pmid:27339714
pmcid:PMC5048061
fatcat:rxxgwarcorgidb2fo3go5chzly
ClinVar data parsing
2017
Wellcome Open Research
Li H: Tabix: fast retrieval of sequence features from generic TAB-delimited files. Bioinformatics. 2011; 27(5): 718-9. ...
This software repository provides a pipeline for converting raw ClinVar data files into analysis-friendly tab-delimited tables, and also provides these tables for the most recent ClinVar release. ...
Join the TXT file to aggregate the clinical significances from multiple submitters and generate VCF files. • Join with ExAC or gnomAD data and generate table files. ...
doi:10.12688/wellcomeopenres.11640.1
pmid:28630944
pmcid:PMC5473414
fatcat:ahkz2duzebdtnoeaksockxplc4
HTSlib: C library for reading/writing high-throughput sequencing data
2021
GigaScience
Considerable improvements have been made to the original code plus many new features including newer access protocols, the addition of the CRAM file format, better indexing and iterators, and better use ...
Since the original publication of the VCF and SAM formats, an explosion of software tools have been created to process these data files. ...
HTSlib includes 2 standalone programs that work with BGZF; bgzip is a general-purpose compression tool while tabix works on tab-delimited genome coordinate files (e.g., BED and GFF) and provides indexing ...
doi:10.1093/gigascience/giab007
pmid:33594436
pmcid:PMC7931820
fatcat:sxbk4myxvzajbl2zg27vf3jqzu
HTSlib - C library for reading/writing high-throughput sequencing data
[article]
2020
bioRxiv
pre-print
Considerable improvements have been made to the original code plus many new features including newer access protocols, the addition of the CRAM file format, better indexing and iterators, and better use ...
Since the original publication of the VCF and SAM formats, an explosion of software tools have been created to process these data files. ...
HTSlib includes two standalone programs that work with BGZF; bgzip is a general purpose compression tool while tabix works on tab delimited genome coordinate files (e.g. ...
doi:10.1101/2020.12.16.423064
fatcat:6e6lbp36zral7kpezgkhqrc3l4
FEATnotator: A tool for integrated annotation of sequence features and variation, facilitating interpretation in genomics experiments
2015
Methods
Association of genomic positional information, such as results from an expansive variety of next-generation sequencing experiments, with annotated reference features such as genes or predicted protein ...
When the experimental system includes polymorphic genomic inputs, rapid calculation of gene structural and protein translational effects of sequence variation from the reference can be invaluable. ...
inputs for FEATnotator are tab delimited text files in which each row represents a single locus. ...
doi:10.1016/j.ymeth.2015.04.028
pmid:25934264
fatcat:5bdztvdijjaufpgbxwc5nj5xge
The Pancreatic Islet Regulome Browser
2017
Frontiers in Genetics
We herein present the Islet Regulome Browser, a tool that allows fast access and exploration of pancreatic islet epigenomic and transcriptomic data produced by different labs worldwide. ...
of the non-coding genome. ...
We would also like to thank Iñaki Martinez, System Administrator at the Program for Predictive and Personalized Medicine of Cancer at the Institute Germans Trias i Pujol (IGPT) Bioinformatics Core, for ...
doi:10.3389/fgene.2017.00013
pmid:28261261
pmcid:PMC5306130
fatcat:ramx2joutrco3kumparlxzkgea
VAS: a convenient web portal for efficient integration of genomic features with millions of genetic variants
2014
BMC Genomics
Conclusions: VAS is specially designed to handle annotation tasks with long lists of genetic variants and large numbers of annotating features efficiently. ...
High-throughput experimental methods have fostered the systematic detection of millions of genetic variants from any human genome. ...
The integration results are stored in a tab-delimited file. The user will then be shown a summary page of the integration results. ...
doi:10.1186/1471-2164-15-886
pmid:25306238
pmcid:PMC4210471
fatcat:wflio4bi6nhdpnfwu7aqxeycle
VIVA (VIsualization of VAriants): A VCF file visualization tool
[article]
2019
bioRxiv
pre-print
The volume and pace of data accumulation from high-throughput sequencing studies have been amplified by recent rapid technological advances in biological sciences. ...
Visualization of genomic data is essential for quality control, exploration, and interpretation. ...
Since next generation sequencing is becoming increasingly accessible to researchers and clinicians, the ability to easily retrieve and visualize genomic data from VCF files is needed. ...
doi:10.1101/589879
fatcat:2amqyvoq6ra3ffoklwelfzkvye
FASTAFS: file system virtualisation of random access compressed FASTA files
[article]
2020
bioRxiv
pre-print
The relatively large files require additional files beyond the scope of the original format, to identify sequences and provide random access. ...
This guarantees in-sync virtualised metadata files and offers fast random-access decompression using Zstandard (zstd). ...
Li, "Tabix: fast retrieval of sequence features from generic TAB-delimited files," Bioinformatics, vol. 27, no. 5, pp. 718-719, 2011, doi: 10.1093/bioinformatics/btq671. [20] C. ...
doi:10.1101/2020.11.11.377689
fatcat:4es44ocbanhhnko3nt4ll4o4ya
Vcfanno: fast, flexible annotation of genetic variants
[article]
2016
bioRxiv
pre-print
Vcfanno can extract and summarize multiple attributes from one or more annotation files and append the resulting annotations to the INFO field of the original VCF file. ...
However, comprehensive variant annotation with diverse file formats is difficult with existing methods. Results: We have developed vcfanno as a flexible toolset that simplifies the annotation of genetic ...
A simple configuration file is used to specify both the source files and the set of attributes (in the case of VCF) or columns (in the case of BED or other tab-delimited) that should be added to the query ...
doi:10.1101/041863
fatcat:aqnglywf4rhctkwhk4efy2uj7y
Security Provisioning and Compression of Diverse Genomic Data based on Advanced Encryption Standard (AES) Algorithm
2021
International Journal of Biology and Biomedical Engineering
The paper discusses sequenced DNA, which may take the form of raw data obtained from sequencing. ...
One of the main issues faced by genomic laboratories is the 'cost of storage' due to the large data file of the human genome (ranging from 30 GB to 200 GB). ...
The SAM Format is a text format used in a series of ASCII columns delimited by tab to store the sequence data. ...
doi:10.46300/91011.2021.15.14
fatcat:qwaxau5ia5bsnns53unacgc7wq
Vcfanno: fast, flexible annotation of genetic variants
2016
Genome Biology
Here we describe vcfanno, which flexibly extracts and summarizes attributes from multiple annotation files and integrates the annotations within the INFO column of the original VCF file. ...
The integration of genome annotations is critical to the identification of genetic variants that are relevant to studies of disease or other traits. ...
A simple configuration file is used to specify both the source files and the set of attributes (in the case of VCF) or columns (in the case of BED or other tab-delimited formats) that should be added to ...
doi:10.1186/s13059-016-0973-5
pmid:27250555
pmcid:PMC4888505
fatcat:3dxgarc47zdmhpqrsoksmhleia