GEMINI: Integrative Exploration of Genetic Variation and Genome Annotations

Umadevi Paila, Brad A. Chapman, Rory Kirchner, Aaron R. Quinlan, Paul P. Gardner
2013 PLoS Computational Biology  
Modern DNA sequencing technologies enable geneticists to rapidly identify genetic variation among many human genomes. However, isolating the minority of variants underlying disease remains an important, yet formidable challenge for medical genetics. We have developed GEMINI (GEnome MINIng), a flexible software package for exploring all forms of human genetic variation. Unlike existing tools, GEMINI integrates genetic variation with a diverse and flexible set of genome annotations (e.g., dbSNP, ENCODE, UCSC, ClinVar, KEGG) into a unified database to facilitate interpretation and data exploration. Whereas other methods provide an inflexible set of variant filters or variant prioritization methods, GEMINI allows researchers to compose complex queries based on sample genotypes, inheritance patterns, and both pre-installed and custom genome annotations. GEMINI also provides methods for ad hoc queries and data exploration, a simple programming interface for custom analyses that leverage the underlying database, and both command line and graphical tools for common analyses. We demonstrate the utility of GEMINI for exploring variation in personal genomes and family-based genetic studies, and illustrate its ability to scale to studies involving thousands of human samples. GEMINI is designed for reproducibility and flexibility, and our goal is to provide researchers with a standard framework for medical genomics.
doi:10.1371/journal.pcbi.1003153 pmid:23874191 pmcid:PMC3715403 fatcat:lae4evkav5c4ren6y7cttlekye