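Abstract: |
Comastoma polycladum is one of the source plants of the Tibetan medicine “Zangyinchen” and is rich in medicinal components. To better characterize the transcriptome of C. polycladum and enrich its genetic information, including gene annotation and metabolic pathways, this study performed full-length transcriptome sequencing of C. polycladum leaves on the PacBio sequencing platform. The results were as follows: (1) Full-length transcriptome sequencing yielded 17 Gb of high-quality data; clustering and redundancy removal of 795 698 CCS sequences produced 87 814 high-quality full-length transcripts. (2) After comparison against seven databases, 277 451 transcripts were successfully annotated; the NR database had the most annotated transcripts, with 39 214. A total of 26 396 transcripts were annotated in the KOG database, covering 26 subcategories. A total of 39 104 transcripts were annotated in the KEGG database, involving six major pathways and 40 sub-pathways. A total of 39 102 transcripts were annotated in the GO database and were classified into three major categories: molecular function, biological process and cellular component. (3) SSR analysis identified 22 861 SSRs, among which mononucleotide repeats were the most abundant; 1 874 transcription factors and 15 166 long non-coding RNAs (LncRNAs) were detected, and the transcription factor family with the most annotated transcripts was C3H. (4) Fifty-five transcripts related to the synthesis of monoterpenoids and flavonoids were screened out. These results enrich the transcriptome information of C. polycladum and provide important genetic resources for further screening of key genes related to the synthesis of its medicinal components. |
Key words: Comastoma polycladum, full-length transcriptome, metabolic pathway, transcription factor, long non-coding RNA |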
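DOI: 10.11931/guihaia.gxzw202207050 |
CLC number: Q943 |
Article ID: 1000-3142(2023)07-1335-12 |
Fund project: The Second Tibetan Plateau Scientific Expedition and Research (STEP) Program (2019QZKK05020102); Qinghai Provincial Science and Technology International Cooperation Project (2021-HZ-807). |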
|
Full-length transcriptome information of Comastoma polycladum, a source plant of the Tibetan medicine “Zangyinchen”
HAN Shuang1,2, XU Hao1,2, YU Jingya1,2, HAN Yun1,2, ZHANG Faqi1*
|
1. Key Laboratory of Adaptation and Evolution of Plateau Biota, Northwest Institute of Plateau Biology, Chinese Academy of Sciences, Xining 810001, China;
2. College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100039, China
|
Abstract: |
Comastoma polycladum is one of the source plants of the Tibetan medicine “Zangyinchen” and contains abundant medicinal components. To better characterize the transcriptome of C. polycladum and enrich its genetic information on gene annotation and metabolic pathways, the PacBio sequencing platform was used to perform full-length transcriptome sequencing of C. polycladum leaves. The results were as follows: (1) A total of 17 Gb of high-quality sequencing data was obtained, and 87 814 high-quality full-length transcripts were obtained by clustering and redundancy removal of 795 698 CCS sequences. (2) After comparison against seven databases, 277 451 transcripts were annotated successfully, and the NR database had the most annotated transcripts (39 214). A total of 26 396 transcripts were annotated in the KOG database, covering 26 subcategories, and 39 104 transcripts were annotated in the KEGG database, involving six major pathways and 40 secondary pathways. A total of 39 102 transcripts were annotated in the GO database and were divided into three major categories: molecular function, biological process and cellular component. (3) SSR analysis identified 22 861 SSRs, with mononucleotide repeats being the most abundant. A total of 1 874 transcription factors and 15 166 long non-coding RNAs (LncRNAs) were identified, and the C3H transcription factor family had the most annotated transcripts. (4) A total of 55 transcripts involved in the synthesis of monoterpenes and flavonoids were screened out. These results enrich the transcriptome information of C. polycladum and provide important genetic resources for further screening of key genes related to the synthesis of its medicinal components. |
Key words: Comastoma polycladum, full-length transcriptome, metabolic pathway, transcription factor, LncRNA |
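
The SSR result in (3) rests on scanning each transcript for short tandem repeats. The abstract names neither the software nor the thresholds used, so the following is only a minimal sketch of how perfect SSRs can be detected and grouped by motif length. The function names (`find_ssrs`, `count_by_motif_length`) and the minimum repeat counts in `MIN_REPEATS` are hypothetical choices for illustration, not the settings of this study.

```python
from collections import Counter

# Hypothetical minimum repeat counts per motif length (1 = mononucleotide, ...).
# These are illustrative assumptions; the abstract does not give the real values.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, threshold in min_repeats.items():
        i = 0
        while i + k <= len(seq):
            motif = seq[i:i + k]
            # Skip ambiguous bases and motifs such as "AA" or "ATAT",
            # which are already counted under a shorter motif length.
            if set(motif) - set("ACGT") or not is_primitive(motif):
                i += 1
                continue
            # Extend the run while the next k bases repeat the motif.
            j = i + k
            while seq[j:j + k] == motif:
                j += k
            repeats = (j - i) // k
            if repeats >= threshold:
                hits.append((i, motif, repeats))
                i = j  # continue after the reported repeat
            else:
                i += 1
    return sorted(hits)


def count_by_motif_length(hits):
    """Count SSRs per motif length, e.g. to see which repeat class dominates."""
    return Counter(len(motif) for _, motif, _ in hits)


if __name__ == "__main__":
    # Toy sequence, not data from the study.
    toy = "GGC" + "A" * 12 + "CGT" + "AG" * 7 + "TTC" + "ACG" * 5 + "T"
    ssrs = find_ssrs(toy)
    print(ssrs)
    print(count_by_motif_length(ssrs))
```

Applied to every transcript, the per-length counts from `count_by_motif_length` would give the kind of repeat-class summary reported above. Compound and imperfect SSRs, which dedicated tools usually handle, are outside this sketch.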