Evolutionary analysis of long terminal repeat retrotransposons in fish genomes

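  • Summary:
    Objective To systematically investigate the evolution of long terminal repeat retrotransposons (LTR-retrotransposons) in fish genomes.
    Methods Thirty representative fish species from different orders were selected, together with 11 non-fish vertebrates (covering amphibians, reptiles, birds, and mammals) as comparative reference species. Full-length LTR-retrotransposons in these genomes were systematically identified and subjected to evolutionary analysis.
    Results The number of LTR-retrotransposons differed significantly among species, with Salmoniformes and Esociformes fishes carrying the most copies. Insertion times were estimated from the sequence divergence between the two LTR arms, which showed that large-scale expansion of LTR-retrotransposons in these species began within the last 2.5 million years. The expansion rate was negatively correlated with the onset time of expansion, suggesting that these elements may have enhanced host adaptability during periods of drastic environmental change such as the Quaternary glaciation. Family analysis showed that Ty3/Gypsy-type LTR-retrotransposons dominate in the great majority of species. In fish, the V-clade of Ty3/Gypsy is the most active and youngest subfamily.
    Conclusion Phylogenetic analysis revealed that different fish lineages (the Salmon-Esocid clade, the Cyprinid-Clupeid-Sturgeon clade, and the Percid-Gadid clade) share some ancient subfamilies, while lineage-specific subfamilies also exist (for example, Osvaldo in the Salmon-Esocid clade). Younger LTR-retrotransposons show higher replicative activity in the host genome. Analysis of the conserved domains of reverse transcriptase, a key protein of LTR-retrotransposons, showed that it contains five core motifs, of which the DD domain of motif 4 (containing conserved Asp-Asp residues) is essential for maintaining replicative activity. Sequence variation in this domain is closely associated with the replication and evolution of different subfamilies.
    Significance This study provides the first systematic description of the evolutionary history of LTR-retrotransposons in fish, and offers key evidence for understanding how fish genomes achieve rapid adaptive evolution by accepting and regulating the insertion of these exogenous DNA elements.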

     

    Abstract:
    Transposable elements are DNA sequences that can move or copy themselves from one location to another within a genome. Depending on whether transposition involves an RNA intermediate, transposable elements are classified into DNA transposons and retrotransposons. Retrotransposons are RNA-mediated and generate new copies at new genomic locations via a copy-and-paste mechanism; they comprise long terminal repeat retrotransposons (LTR-RTs) and non-LTR retrotransposons. LTR-RTs are characterized by directly repeated LTR arms flanking an internal coding region. A full-length LTR-RT typically consists of the terminal repeat sequences, a gag gene encoding the capsid protein, and a polyprotein associated with reverse transcription and integration. In metazoans, LTR-RTs can carry an envelope protein together with a polyprotein that includes aspartic protease, integrase, reverse transcriptase, and ribonuclease H. Among these, reverse transcriptase and ribonuclease H are involved in the replication and transposition of LTR-RTs, while integrase inserts the DNA form of the retrotransposon into the host genome. Based on the arrangement of the reverse transcriptase, integrase, and ribonuclease H sequences, LTR-RTs are classified into two types: Ty1/Copia and Ty3/Gypsy. The LTR arms themselves do not encode any proteins. However, the time at which an LTR-RT inserted into the host genome can be inferred from the sequence divergence between its pair of LTR arms. In fish, LTR-RTs exhibit their highest transcriptional activity immediately after zygotic genome activation. The insertion of LTR-RTs into host genomes plays a critical role in host gene evolution.
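The classification by domain arrangement described above can be illustrated with a minimal sketch. The abstract states only that the arrangement of the reverse transcriptase, integrase, and ribonuclease H sequences distinguishes the two types. The specific rule used here (integrase upstream of reverse transcriptase in Ty1/Copia, downstream of it in Ty3/Gypsy) is the commonly described arrangement and is an assumption relative to this text. The domain labels `PR`, `RT`, `RH`, and `IN` and the function name are hypothetical and chosen for illustration only.

```python
from typing import Sequence


def classify_ltr_rt(domain_order: Sequence[str]) -> str:
    """Classify an LTR-RT from the 5'-to-3' order of its polyprotein domains.

    Assumed rule (not stated in the abstract): integrase (IN) upstream of
    reverse transcriptase (RT) indicates Ty1/Copia; IN downstream of RT
    indicates Ty3/Gypsy. Labels such as "PR", "RT", "RH", "IN" are
    hypothetical abbreviations used only in this sketch.
    """
    domains = [d.upper() for d in domain_order]
    if "IN" not in domains or "RT" not in domains:
        # Without both landmark domains the order cannot be compared.
        return "unclassified"
    if domains.index("IN") < domains.index("RT"):
        return "Ty1/Copia"
    return "Ty3/Gypsy"


if __name__ == "__main__":
    print(classify_ltr_rt(["PR", "IN", "RT", "RH"]))
    print(classify_ltr_rt(["PR", "RT", "RH", "IN"]))
```

In practice the domain order would come from a protein-domain annotation of each predicted element; truncated elements that lack one of the two landmark domains are left unclassified rather than guessed.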
    Objective This study systematically investigated the evolutionary dynamics of LTR-RTs within fish genomes.
    Methods We selected 30 representative fish species across various orders, along with 11 non-fish vertebrate species (covering amphibians, reptiles, birds, and mammals) as comparative references. Full-length LTR-RTs within these genomes were systematically identified and their evolutionary patterns analyzed.
    Results The abundance of LTR-RTs varied significantly among species, with Salmoniformes and Esociformes genomes carrying the highest copy numbers. Insertion ages were estimated from the sequence divergence between the two LTR arms, which revealed that large-scale expansion of LTR-RTs in these species began within the last 2.5 million years. The expansion rate was negatively correlated with the onset time of expansion, suggesting that these elements may have enhanced host adaptability during periods of environmental upheaval, such as the Quaternary glaciation. Family analysis showed that Ty3/Gypsy-type LTR-RTs predominated in most species. Among the fish species, the V-clade of Ty3/Gypsy emerged as the most active and youngest subfamily.
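The age estimate from LTR-arm divergence can be sketched as follows. The abstract does not say which substitution model or substitution rate was used, so both choices here are assumptions: the sketch applies the Jukes-Cantor correction K = -(3/4) ln(1 - 4p/3) to the observed proportion of differing sites p, and the widely used relation T = K / (2r), where r is a substitution rate per site per year supplied by the caller. The function names and the example rate are hypothetical.

```python
import math


def p_distance(ltr5: str, ltr3: str) -> float:
    """Proportion of differing sites between two pre-aligned LTR arms.

    Columns containing gaps or ambiguous bases are skipped.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("LTR arms must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    mismatches = 0
    for a, b in zip(ltr5.upper(), ltr3.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites in the alignment")
    return mismatches / compared


def jc69_distance(p: float) -> float:
    """Jukes-Cantor corrected divergence K (assumed model, see lead-in)."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def insertion_time_years(ltr5: str, ltr3: str, rate_per_site_per_year: float) -> float:
    """Estimate insertion age as T = K / (2r).

    rate_per_site_per_year is an assumption supplied by the caller;
    the abstract does not report the value used in the study.
    """
    k = jc69_distance(p_distance(ltr5, ltr3))
    return k / (2.0 * rate_per_site_per_year)


if __name__ == "__main__":
    arm5 = "ACGT" * 25           # toy 100-bp arm, not real data
    arm3 = "T" + arm5[1:]        # same arm with one substitution
    # 1e-8 substitutions per site per year is a placeholder rate.
    print(round(insertion_time_years(arm5, arm3, 1e-8)))
```

The factor of 2 reflects that both arms are identical at insertion and accumulate substitutions independently afterwards, so the divergence between them grows at twice the per-lineage rate.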
    Conclusion Phylogenetic analysis demonstrated that while different fish lineages (Salmon-Esocid Clade, Cyprinid-Clupeid-Sturgeon Clade, and Percid-Gadid Clade) shared several ancient subfamilies, lineage-specific subfamilies also existed (e.g., Osvaldo in the Salmon-Esocid Clade). More recently inserted LTR-RTs exhibited higher transpositional activity in the host genome. Analysis of the reverse transcriptase in representative species identified five core conserved functional motifs, of which the DD domain of Motif 4 (containing conserved Asp-Asp residues) was essential for maintaining replicative activity. Sequence variation within this domain is closely associated with the replication and evolution of different subfamilies.
    Significance This study provides the first systematic depiction of the evolutionary trajectory of LTR-RTs in fish. These findings offer crucial evidence for understanding how fish genomes achieve rapid adaptive evolution through the integration and regulation of these exogenous DNA elements.

     
