TY - JOUR
T1 - Aiming off the target
T2 - recycling target capture sequencing reads for investigating repetitive DNA
AU - Costa, Lucas
AU - Marques, André
AU - Buddenhagen, Chris
AU - Thomas, William Wayt
AU - Huettel, Bruno
AU - Schubert, Veit
AU - Dodsworth, Steven
AU - Houben, Andreas
AU - Souza, Gustavo
AU - Pedrosa-Harand, Andrea
N1 - Publisher Copyright:
© 2021 The Author(s) 2021. Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
PY - 2021/11/9
Y1 - 2021/11/9
N2 - Background and Aims: With the advance of high-throughput sequencing, reduced-representation methods such as target capture sequencing (TCS) emerged as cost-efficient ways of gathering genomic information, particularly from coding regions. As the off-target reads from such sequencing are expected to be similar to genome skimming (GS), we assessed the quality of repeat characterization in plant genomes using these data. Methods: Repeat composition obtained from TCS datasets of five Rhynchospora (Cyperaceae) species were compared with GS data from the same taxa. In addition, a FISH probe was designed based on the most abundant satellite found in the TCS dataset of Rhynchospora cephalotes. Finally, repeat-based phylogenies of the five Rhynchospora species were constructed based on the GS and TCS datasets and the topologies were compared with a gene-alignment-based phylogenetic tree. Key Results: All the major repetitive DNA families were identified in TCS, including repeats that showed abundances as low as 0.01 % in the GS data. Rank correlations between GS and TCS repeat abundances were moderately high (r=0.58-0.85), increasing after filtering out the targeted loci from the raw TCS reads (r=0.66-0.92). Repeat data obtained by TCS were also reliable in developing a cytogenetic probe of a new variant of the holocentromeric satellite Tyba. Repeat-based phylogenies from TCS data were congruent with those obtained from GS data and the gene-alignment tree. Conclusions: Our results show that off-target TCS reads can be recycled to identify repeats for cyto- and phylogenomic investigations. Given the growing availability of TCS reads, driven by global phylogenomic projects, our strategy represents a way to recycle genomic data and contribute to a better characterization of plant biodiversity.
AB - Background and Aims: With the advance of high-throughput sequencing, reduced-representation methods such as target capture sequencing (TCS) emerged as cost-efficient ways of gathering genomic information, particularly from coding regions. As the off-target reads from such sequencing are expected to be similar to genome skimming (GS), we assessed the quality of repeat characterization in plant genomes using these data. Methods: Repeat composition obtained from TCS datasets of five Rhynchospora (Cyperaceae) species were compared with GS data from the same taxa. In addition, a FISH probe was designed based on the most abundant satellite found in the TCS dataset of Rhynchospora cephalotes. Finally, repeat-based phylogenies of the five Rhynchospora species were constructed based on the GS and TCS datasets and the topologies were compared with a gene-alignment-based phylogenetic tree. Key Results: All the major repetitive DNA families were identified in TCS, including repeats that showed abundances as low as 0.01 % in the GS data. Rank correlations between GS and TCS repeat abundances were moderately high (r=0.58-0.85), increasing after filtering out the targeted loci from the raw TCS reads (r=0.66-0.92). Repeat data obtained by TCS were also reliable in developing a cytogenetic probe of a new variant of the holocentromeric satellite Tyba. Repeat-based phylogenies from TCS data were congruent with those obtained from GS data and the gene-alignment tree. Conclusions: Our results show that off-target TCS reads can be recycled to identify repeats for cyto- and phylogenomic investigations. Given the growing availability of TCS reads, driven by global phylogenomic projects, our strategy represents a way to recycle genomic data and contribute to a better characterization of plant biodiversity.
KW - Genome skimming
KW - holocentric
KW - reduced-representation sequencing
KW - RepeatExplorer
KW - Rhynchospora
KW - satellite DNA
KW - transposable elements
UR - http://www.scopus.com/inward/record.url?scp=85121128121&partnerID=8YFLogxK
UR - https://uobrep.openrepository.com/handle/10547/625011
U2 - 10.1093/aob/mcab063
DO - 10.1093/aob/mcab063
M3 - Article
C2 - 34050647
AN - SCOPUS:85121128121
SN - 0305-7364
VL - 128
SP - 835
EP - 848
JO - Annals of Botany
JF - Annals of Botany
IS - 7
ER -