Aiming off the target: studying repetitive DNA using target capture sequencing reads [article]

Lucas Costa, Andre Marques, Chris Buddenhagen, William Wayt Thomas, Bruno Huettel, Veit Schubert, Steven Dodsworth, Andreas Houben, Gustavo Souza, Andrea Pedrosa-Harand
2020 bioRxiv pre-print
With the advance of high-throughput sequencing (HTS), reduced-representation methods such as target capture sequencing (TCS) emerged as cost-efficient ways of gathering genomic information. As the off-target reads from such sequencing are expected to be similar to genome skims (GS), we assessed the quality of repeat characterization using these data. For this, the repeat composition from TCS datasets of five Rhynchospora (Cyperaceae) species was compared with GS data from the same taxa. All the major repetitive DNA families were identified in TCS, including repeats that showed abundances as low as 0.01% in the GS data. Rank correlations between GS and TCS repeat abundances were moderately high (r = 0.58-0.85), increasing after filtering out the targeted loci from the raw TCS reads (r = 0.66-0.92). Repeat data obtained by TCS were also reliable enough to develop a cytogenetic probe and to resolve phylogenetic relationships of Rhynchospora species with high support. In light of our results, TCS data can be effectively used for cyto- and phylogenomic investigations of repetitive DNA. Given the growing availability of HTS reads, driven by global phylogenomic projects, our strategy represents a way to recycle genomic data and contribute to a better characterization of plant biodiversity.
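The rank correlation reported in the abstract can be illustrated with a minimal sketch. It assumes a Spearman-type coefficient (the abstract only says "rank correlation"), and the abundance values and the function names `average_ranks` and `spearman` are hypothetical, chosen for illustration rather than taken from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical genome proportions (%) of five repeat families,
# estimated once from genome skims (GS) and once from TCS reads.
gs = [12.0, 5.5, 3.1, 0.8, 0.01]
tcs = [9.0, 6.2, 1.0, 1.5, 0.02]

rho = spearman(gs, tcs)
print(round(rho, 2))  # 0.9 for these made-up values
```

Because only the ordering of the families matters, a coefficient of this kind stays high even when TCS systematically over- or underestimates absolute abundances, as long as abundant repeats remain abundant and rare repeats remain rare.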
doi:10.1101/2020.12.10.419515 fatcat:grdemoxubfa5fasdp7iqpboe3m