Peer Review #2 of "Zbrowse: an interactive GWAS results browser (v0.1)"
[peer_review]
2015
unpublished
The growing number of genotyped populations, the advent of high-throughput phenotyping techniques and the development of GWAS analysis software have rapidly increased the number of GWAS experimental results. Candidate gene discovery from these results files is often tedious, involving many manual steps searching for genes in windows around a significant SNP. This problem rapidly becomes more complex when an analyst wishes to compare multiple GWAS studies for pleiotropic or environment-specific effects. To this end, we have developed a fast and intuitive interactive browser for viewing GWAS results, with a focus on comparing results across multiple traits or experiments. The software can easily be run on a desktop computer with software that bioinformaticians are likely already familiar with. Additionally, the software can be hosted or embedded on a server for easy access by anyone with a modern web browser.

PeerJ Comp Sci reviewing PDF | (CS-2015:03:4323:1:0:CHECK 8 May 2015) Reviewing Manuscript

Title: Zbrowse: an interactive GWAS results browser

Abstract

The growing number of genotyped populations, the advent of high-throughput phenotyping techniques and the development of GWAS analysis software have rapidly increased the number of GWAS experiments. Candidate gene discovery from these datasets is often tedious, involving many manual steps searching for genes in windows around a significant SNP. This problem rapidly becomes more complex when trying to compare multiple GWAS studies to identify pleiotropic or treatment/environment-specific effects. To address this problem, we have developed a fast and intuitive interactive browser for viewing GWAS results, with a focus on comparing results across multiple traits or experiments. The software can easily be run on a desktop computer

Currently, maize, soybean, Arabidopsis and sorghum are included in the browser source package. We have developed an application to quickly add organisms to the local installation of ZBrowse from annotations downloaded from the Plant Genomics Portal (Phytozome). Additionally, we will be formatting requested and popular organisms and releasing the files on GitHub. These will be easy to download and incorporate into your existing browser installation.
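When an organism is added by hand, the annotation CSV is expected to carry nine columns: name, chromosome, transcript_start, transcript_end, strand, ID, defLine, bestArabhitDefline and bestRiceHitDefline. The following is a minimal sketch of checking a CSV header for those columns before the file is placed in the installation directory. It is written in Python for illustration only and is not part of ZBrowse, which is written in R; the helper name `missing_annotation_columns` and the sample rows are hypothetical.

```python
import csv
import io

# Column names as listed in the manuscript for the annotation file.
REQUIRED_COLUMNS = [
    "name", "chromosome", "transcript_start", "transcript_end", "strand",
    "ID", "defLine", "bestArabhitDefline", "bestRiceHitDefline",
]


def missing_annotation_columns(csv_text):
    """Return the required columns absent from the CSV header (hypothetical helper)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    present = {column.strip() for column in header}
    return [column for column in REQUIRED_COLUMNS if column not in present]


# Made-up example input: one complete header, one lacking the last two columns.
complete = ",".join(REQUIRED_COLUMNS) + "\ngene1,1,100,900,+,id1,desc,arab,rice\n"
incomplete = "name,chromosome,transcript_start,transcript_end,strand,ID,defLine\n"

print(missing_annotation_columns(complete))
print(missing_annotation_columns(incomplete))
```

A check of this kind only confirms that the column names are present; it makes no assumption about how the three-line organism description file is laid out.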
Adding a new organism manually requires two additional files to be created and placed in the ZBrowse installation directory. One is a flat text file with three lines. The first line gives the display name of the organism. The second line gives the names and sizes of each genome feature (e.g., chromosomes, scaffolds). The third line is the path to a CSV file containing the annotation information. The annotation file needs to have the following columns: name, chromosome, transcript_start, transcript_end, strand, ID, defLine, bestArabhitDefline and bestRiceHitDefline.

Technical Foundation

The GWAS browser is written in the R programming language using packages that provide wrappers around popular JavaScript web applications, including shiny (RStudio Inc., 2013) and rCharts (Vaidyanathan, 2013). Because of this, the browser can be run locally with only R and any modern web browser. Internal data processing makes use of the plyr package (Wickham, 2011). The JavaScript plots are drawn using Highcharts (highcharts.com), which is available for use under the Creative Commons Attribution-NonCommercial 3.0 License. Tables are generated using the JavaScript library DataTables (datatables.net) and xtable (Dahl, 2013). All of the tools and software used are either free or open source. Building the web application in R makes it easier for bioinformaticians to extend than if it were written in pure JavaScript. Many GWAS programs are written in R (Kang et al., 2008; Segura et al., 2012; Lipka et al., 2012), so many scientists performing GWAS will already have some familiarity with R constructs, even if they are not computational biologists. This familiarity will hopefully make it easier for the community using the browser to extend and modify it for their purposes.

Limitations

The browser takes a fundamentally different approach from current state-of-the-art browsers. It is focused on the ability to quickly plot a variety of GWAS experiments on a single Manhattan plot. A caveat to this ability, however, is that it cannot plot every SNP in a genotype dataset. Due to memory, time, and plotting constraints, the current browser is limited to approximately 5000 data points per trait, which is significantly fewer than most genotype datasets contain. Of course, only the most strongly associated SNPs are typically of interest, so this problem can be easily mitigated by trimming the input file to contain only significant associations (e.g., p < 0.05). Currently, the browser will automatically trim the number of points being plotted to display only the top 5000 points based on the y-axis column. Future improvements to the browser could support the plotting of more information by binning points when zoomed out to a level where overplotting is an issue, and loading individual data points asynchronously only when the zoom level is sufficient to see them.

The generality of the browser allows it to be used with any SNP dataset. Only chromosome number and base pair information need to be provided for each SNP. However, this means that
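The automatic trimming described under Limitations (keep only the top 5000 points per trait, ranked by the y-axis column) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the browser's R implementation: the record layout, the field names `trait` and `y`, and the function `trim_top_points` are hypothetical, and it assumes that larger y values (for example, -log10 of the p-value) indicate stronger association.

```python
import heapq
from collections import defaultdict


def trim_top_points(points, y_key="y", trait_key="trait", limit=5000):
    """Keep at most `limit` points per trait, choosing those with the largest y.

    Assumes larger y means stronger association (e.g. -log10 of the p-value).
    """
    by_trait = defaultdict(list)
    for point in points:
        by_trait[point[trait_key]].append(point)

    trimmed = {}
    for trait, trait_points in by_trait.items():
        # nlargest returns the kept points ordered from highest to lowest y.
        trimmed[trait] = heapq.nlargest(limit, trait_points, key=lambda p: p[y_key])
    return trimmed


# Made-up data: two traits, trimmed here to the top 2 points each.
points = [
    {"trait": "height", "chr": 1, "bp": 1200, "y": 3.1},
    {"trait": "height", "chr": 2, "bp": 5300, "y": 7.4},
    {"trait": "height", "chr": 3, "bp": 880, "y": 1.2},
    {"trait": "yield", "chr": 1, "bp": 1200, "y": 5.0},
    {"trait": "yield", "chr": 4, "bp": 9100, "y": 2.2},
]

result = trim_top_points(points, limit=2)
for trait, kept in result.items():
    print(trait, [p["y"] for p in kept])
```

Trimming per trait rather than over the whole file keeps a weakly associated trait from being crowded out by a strongly associated one when several experiments share a plot.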
doi:10.7287/peerj-cs.3v0.1/reviews/2
fatcat:wqkkamhdpjaelpwui5l43bkyue