Abstract
Pangasius sutchi, an economically important freshwater fish in Southeast Asia, is characterized by rapid growth, ease of cultivation, rich nutritional content, and the absence of small intermuscular bones. P. sutchi was first introduced to China from Thailand in 1978; a breakthrough in its artificial breeding was achieved in 1997, and it has since been extensively promoted in Guangdong, Guangxi, and Hainan provinces. Research on P. sutchi has so far focused primarily on breeding models, nutritional feed development, disease control, and fish product processing technology, with less emphasis on basic biology, particularly molecular biology. To elucidate the genetic basis of P. sutchi and support molecular biology research, this study sequenced the full-length transcriptome of the brain, gills, heart, liver, spleen, head kidney, stomach, intestines, gonads, and muscles of sexually mature P. sutchi using Single Molecule Real-Time (SMRT) sequencing on the PacBio Sequel platform. A total of 1 487 336 high-quality reads were obtained, averaging 83 592 bp in length with an N50 of 162 901 bp. After self-correction, 1 005 955 circular consensus sequences (CCS) were derived, and after filtering, 667 973 polyA-containing full-length non-concatenated (FLNC) reads were identified, averaging 2 057 bp in length with an N50 of 2 359 bp. Of these, 614 078 (91.93%) FLNC reads were used for gene and transcript annotation, identifying 19 835 known genes and 9 348 novel genes. In addition, 50 311 open reading frames (ORFs), 79 922 alternative splicing events, 18 fusion genes, and 20 215 alternative polyadenylation sites were predicted. Of the 9 348 novel genes, 3 912, 2 385, 2 167, 81, and 1 520 were annotated in the NR (non-redundant protein sequences), GO (Gene Ontology), KEGG (Kyoto Encyclopedia of Genes and Genomes), KOG (eukaryotic orthologous groups), and SwissProt databases, respectively.
GO enrichment analysis revealed that 1 309, 1 351, and 1 524 novel genes were enriched in the cellular process, cellular anatomical entity, and binding terms, respectively. KEGG enrichment indicated that the novel genes were primarily enriched in cellular community - eukaryotes (106); signal transduction (276); folding, sorting and degradation (79); amino acid metabolism (63); and endocrine system (197). A total of 4 624 lncRNAs were identified in P. sutchi, and these were predicted to regulate 32 283 target mRNAs. GO enrichment results showed that the target mRNAs were mainly enriched in cellular process (12 084), cellular anatomical entity (18 034), and binding (12 772). KEGG analysis indicated that the target mRNAs were predominantly enriched in the transport and catabolism (1 437); signal transduction (4 165); folding, sorting and degradation (643); carbohydrate metabolism (584); and immune system (2 135) pathways. The full-length transcriptome sequencing data and functional annotation obtained in this study enrich the genetic resources of P. sutchi and provide a basis for further research on its biological characteristics and gene function.
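As a reading aid for the summary statistics reported above, the following minimal sketch shows how an N50 is conventionally computed and checks the reported 91.93% share of FLNC reads used for annotation. The function name `n50` and the toy read lengths are illustrative assumptions, not part of the study's pipeline; only the two FLNC counts are taken from the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    The N50 is the length L such that sequences of length >= L
    together contain at least half of all bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy read lengths in bp (NOT data from this study).
toy_lengths = [500, 1200, 1800, 2400, 3100]
toy_n50 = n50(toy_lengths)
print(f"toy N50: {toy_n50} bp")

# Counts reported in the abstract: FLNC reads identified and FLNC reads
# used for gene and transcript annotation.
flnc_total = 667_973
flnc_used = 614_078
used_pct = 100 * flnc_used / flnc_total
print(f"FLNC used for annotation: {used_pct:.2f}%")
```

With the toy lengths, the two longest reads (3 100 and 2 400 bp) already cover more than half of the 9 000 total bases, so the N50 is 2 400 bp. An N50 larger than the mean, as in the FLNC statistics above (2 359 bp versus 2 057 bp), is expected because longer reads contribute more bases.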