Reproducible probe-level analysis of the Affymetrix Exon 1.0 ST array with R/Bioconductor
Briefings in Bioinformatics
The presence of different transcripts of a gene across samples can be analysed by whole-transcriptome microarrays. Reproducing results from published microarray data represents a challenge owing to the vast amounts of data and the large variety of preprocessing and filtering steps used before the actual analysis is carried out. To guarantee a firm basis for methodological development where results with new methods are compared with previous results, it is crucial to ensure that all analyses are completely reproducible for other researchers. We here give a detailed workflow on how to perform reproducible analysis of the GeneChip® Human Exon 1.0 ST Array at probe and probeset level solely in R/Bioconductor, choosing packages based on their simplicity of use. To exemplify the use of the proposed workflow, we analyse differential splicing and differential gene expression in a publicly available dataset using various statistical methods. We believe this study will provide other researchers with an easy way of accessing gene expression data at different annotation levels and with the sufficient details needed for developing their own tools for reproducible analysis of the GeneChip® Human Exon 1.0 ST Array.
Maria Rodrigo-Domingo is a PhD student in biostatistics at the Department of Mathematical Sciences of Aalborg University. Her project focuses on the improvement of the statistical methods for the detection of differential splicing using Affymetrix's exon array.
Rasmus Waagepetersen is a professor in statistics at the Department of Mathematical Sciences of Aalborg University with research interests in spatial statistics, simulation based inference, generalized linear mixed models and quantitative genetics.
Julie Støve Bødker holds a PhD in molecular biology and she is a post-doc researcher at the research laboratory of the Department of Haematology, Aalborg University Hospital. Her work includes analysing microarray platforms from Affymetrix, with a focus on DLBCL.
Steffen Falgreen is a PhD student in biostatistics at the research laboratory of the Department of Haematology, Aalborg University Hospital. In his project, he is developing new statistical models for the prediction of chemotherapy outcome on DLBCL.
Malene Krag Kjeldsen holds a PhD in medicine and she is a post-doc researcher at the research laboratory of the Department of Haematology, Aalborg University Hospital. She works with transcription factors known to be involved in B-cell differentiation and DLBCL.
Hans Erik Johnsen, MD and professor in clinical haematology, is responsible for the research activity and infrastructure of the Department of Haematology, Aalborg University Hospital. His main interest is studies of pathogenesis of haematological malignancies to generate new predictive strategies in clinical practice.
Karen Dybkær is a molecular biologist and senior scientist. She is an associate professor at Aalborg University Hospital, and is responsible for the functional laboratory of the Department of Haematology including cell culturing facilities.
Martin Bøgsted is a senior biostatistician and associate professor and responsible for biostatistics at the Department of Haematology, Aalborg University Hospital. His research deals with the methodology of clinical bioinformatics and statistics, with special focus on applications in experimental oncology.