FUNCTIONAL GENOMICS TO AID GENOMIC PREDICTION MODELS IN CASSAVA
Genomic Prediction (GP) is commonly performed using tens of thousands, or even millions, of single-nucleotide polymorphism (SNP) markers. Associations between the phenotypes and genotypes of a training population are used to predict the performance of un-phenotyped target populations. Traditionally, the markers used in these studies are treated alike, irrespective of their position in the genome, their proximity to regulatory elements, or whether they reside within biologically-relevant genes.
... content of this dissertation aims both to identify "functional" regions in the cassava (Manihot esculenta) genome and to test whether the incorporation of prior biological knowledge enhances the accuracy of GP models in this crop. In 2013, at the onset of this research, very few genomic resources were available within the cassava research community; a draft of the genome sequence had recently been released and a solid platform for low-coverage genotyping using genotyping-by-sequencing (GBS) was online. As a means of generating more genomic resources, we began by identifying nucleotide-binding site leucine-rich repeat (NBS-LRR) genes associated with biotic resistance across the cassava genome. In a second study, we leveraged the NBS-LRR information, together with genomic annotations and a transcriptomics experiment, to identify genes involved in the interaction of cassava with Cassava Brown Streak Virus (CBSV). We then used biologically-informed GP methods to compare models with and without biologically-relevant information. Until the final phase of our research, our efforts had focused on identifying functional elements within the coding fraction of the genome. In an effort to build upon several genome-wide association (GWA) studies illustrating the importance of regulatory regions outside genes, the third study explored cassava's nascent transcriptome. In doing so, we were able to identify key components of plant transcriptional regulation and candidate enhancer regions that had not been previously described. Moreover, we showed that these candidate enhancer regions contributed disproportionately to the SNP heritability of several complex traits. The research presented herein provides holistic insight into cassava's genomic resources, and we hope that it will be useful to future research and breeding endeavors within this staple crop species.
BIOGRAPHICAL SKETCH
Roberto Lozano was born April 6, 1987 in Lima, Peru, the son of Elsa and Pedro, and grew up happily playing soccer. Roberto began his university career at "Universidad Peruana Cayetano Heredia" studying biology. It was during his studies that he discovered his interest in molecular biology and bioinformatics. Later, under the supervision of Gisella Orjeda, he started working in potato genetics and genomics (even though he had promised himself that he would never work with plants). In the fall of 2013, Roberto began graduate work in the Plant Breeding and Genetics Section at Cornell University. Under the supervision of Jean-Luc Jannink, he focused his research on using functional genomics to aid genomic predictions in cassava and learned more quantitative statistics than he would have ever imagined. This project gave him the opportunity to travel around the world, explore new cultures, establish professional connections on different continents, and, most importantly, make some amazing friends.
To the memory of Martha Hamblin
ACKNOWLEDGMENTS
I would like to give special acknowledgments to Martha Hamblin and Jean-Luc Jannink, who took the risk of accepting me on their research team and gave me the space and freedom to pursue the scientific questions of interest to me. Martha helped me survive the first year; between the heavy coursework, the new country, and impostor syndrome, her friendship, patience, and kind words put me back on track. I benefited greatly from Jean-Luc's scientific advice and quantitative genetics wizardry, but, most importantly, he showed me the importance of a well-balanced family and work life. I would also like to thank all the members of the Jannink-Sorrells lab for maintaining a friendly environment full of scientific discussions. I would like to thank all the members of the NextGen Cassava Breeding Project, both domestically and internationally, as they were a great team to work with.
Thank you to the Section of Plant Breeding and Genetics within the School of Integrative Plant Science; Cornell University is not only a great place to pursue scientific endeavors but also a wonderful place to meet amazing people. A particular "thank you" is due to Dunia Pino del Carpio, who helped point my efforts in the right direction. Thanks to all the great friends I made here, and to Hannah, my incredible and supportive girlfriend. Finally, I would like to thank my supportive family, especially my mom. She always let me pursue my dreams, even when it meant I was thousands of kilometers from home.