Incorporating computational resources in a cancer research program

Nicholas T. Woods, Ankita Jhuraney, Alvaro N. A. Monteiro
2014 Human Genetics  
Recent technological advances have transformed cancer genetics research. These advances have served as the basis for the generation of a number of richly annotated datasets relevant to the cancer geneticist. In addition, many of these technologies are now within reach of smaller laboratories to answer specific biological questions. Thus, one of the most pressing issues facing an experimental cancer biology research program in genetics is incorporating data from multiple sources to annotate, visualize, and analyze the system under study. Fortunately, there are several computational resources to aid in this process. However, a significant effort is required to adapt a molecular biology-based research program to take advantage of these datasets. Here, we discuss the lessons learned in our laboratory and share several recommendations to make this transition effectively. This article is not meant to be a comprehensive evaluation of all the available resources, but rather to highlight those that we have incorporated into our laboratory and how to choose the most appropriate ones for your research program.

Abbreviations
AP-MS: affinity purification coupled to mass spectrometry
ENCODE: Encyclopedia of DNA Elements
FANTOM: Functional Annotation of the Mammalian Genome
HCIP: high confidence interacting proteins

Everyday Practice

Although not every laboratory will re-tool to generate large datasets using high-throughput methods, it is expected that in general most will be dealing with comparatively larger datasets (e.g. a handful of mutants versus several hundred mutants; one or two cell lines versus a panel of tens of cell lines, etc.) and should be aware of common pitfalls.

Batch effects

Not normally encountered in small-scale molecular biology experiments, batch effects are the systematic errors introduced when a large (or sequentially collected) number of samples is processed in different batches. This is a common issue in microarray data where numerous samples have been collected and run by different labs and a joint analysis is attempted (Benito et al. 2004). The variability in microarray results is impacted by nonbiological factors, such as reagents from different lots, setup by different lab workers, and ...
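To make the idea of a batch effect concrete, the following is a minimal sketch, not the method of Benito et al. (2004) or any procedure described by the authors. It assumes a hypothetical expression matrix with one row per sample and one column per gene, and a hypothetical batch label per sample (here `lab_A` and `lab_B`). It simulates a constant offset in one batch, measures the gap between batch means, and removes it by simple per-batch mean-centering. The function names `center_batches` and `batch_mean_gap` are illustrative only.

```python
import numpy as np


def center_batches(expr, batches):
    """Subtract each batch's per-gene mean, then restore the overall per-gene mean.

    expr    : (n_samples, n_genes) array of expression values (hypothetical input)
    batches : one batch label per sample
    """
    expr = np.asarray(expr, dtype=float)
    batches = np.asarray(batches)
    if expr.shape[0] != batches.shape[0]:
        raise ValueError("one batch label is required per sample")
    overall_mean = expr.mean(axis=0)
    corrected = expr.copy()
    for label in np.unique(batches):
        mask = batches == label
        # Remove the batch-specific offset for every gene.
        corrected[mask] -= expr[mask].mean(axis=0)
    return corrected + overall_mean


def batch_mean_gap(expr, batches):
    """Per gene, the largest difference between any two batch means."""
    expr = np.asarray(expr, dtype=float)
    batches = np.asarray(batches)
    means = np.stack([expr[batches == b].mean(axis=0) for b in np.unique(batches)])
    return means.max(axis=0) - means.min(axis=0)


# Simulated data: the same underlying distribution measured in two labs,
# with an assumed constant offset of 1.5 added to every gene in lab_B.
rng = np.random.default_rng(0)
n_per_batch, n_genes = 20, 5
signal = rng.normal(loc=8.0, scale=1.0, size=(2 * n_per_batch, n_genes))
batches = np.array(["lab_A"] * n_per_batch + ["lab_B"] * n_per_batch)
offset = np.where(batches[:, None] == "lab_B", 1.5, 0.0)
observed = signal + offset

corrected = center_batches(observed, batches)
print("gap between batch means before:", batch_mean_gap(observed, batches).round(2))
print("gap between batch means after: ", batch_mean_gap(corrected, batches).round(2))
```

Mean-centering of this kind is only appropriate when the batches are expected to have the same biological composition. If batch and biology are confounded (for example, all tumor samples run in one lab and all controls in another), centering would also remove the real signal, so in practice a dedicated batch-correction method and a balanced experimental design are likely to be preferable.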
doi:10.1007/s00439-014-1496-3 pmid:25324189 pmcid:PMC4401625 fatcat:gkrgo3kbsnag3hadp67q3pzqcq