Cite this article:
唐 其, 马小军, 莫长明, 潘丽梅, 韦荣昌, 赵 欢. 罗汉果全基因组Survey分析[J]. 广西植物, 2015, 35(6): 786-791.
TANG Qi, MA Xiao-Jun, MO Chang-Ming, PAN Li-Mei, WEI Rong-Chang, ZHAO Huan. Genome survey analysis in Siraitia grosvenorii[J]. Guihaia, 2015, 35(6): 786-791.
罗汉果全基因组Survey分析
唐 其2,3, 马小军1*, 莫长明3, 潘丽梅3, 韦荣昌3, 赵 欢1
1. 中国医学科学院 药用植物研究所, 北京 100193; 2. 湖南农业大学 园艺园林学院, 长沙 410128; 3. 广西药用植物园, 南宁 530023
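Abstract (translated from the Chinese):
Siraitia grosvenorii (Luohanguo) is a medicinal and sweetener plant endemic to Guangxi. Mogroside V, one of its main constituents, is a natural non-sugar sweetener with broad development prospects. However, Luohanguo is currently obtained entirely from cultivation; its suitable growing area is narrow, continuous-cropping obstacles are severe, and the low mogroside V content keeps production costs high, which severely limits its application. To avoid undirected effort, a low-coverage genome survey is carried out before large-scale deep whole-genome sequencing to evaluate the size and complexity of the genome and so determine a suitable whole-genome sequencing strategy for the plant. This study used second-generation high-throughput sequencing (Illumina HiSeq™ 2000) to determine the genome size of S. grosvenorii for the first time, and used bioinformatics methods to estimate the heterozygosity rate, repetitive sequences, GC content, and other genome information. The results showed that: (1) 18.1 Gb of genome sequencing data were obtained; the genome size was estimated at about 344.95 Mb and the sequencing depth at 52×; (2) the K-mer distribution curve showed an obvious heterozygous peak, with a heterozygosity rate of 1.5%. The high heterozygosity made the contig N50 and scaffold N50 of the assembly much shorter than expected and also caused a clearly abnormal distribution of average GC depth and content, with a low-depth region. A weak repeat peak behind the main peak indicated that the genome contains many repetitive sequences; (3) because of the high heterozygosity and abundant repeats, the whole-genome shotgun (WGS) strategy alone is not suitable for this genome; for better sequence assembly, combining it with a Fosmid-to-Fosmid or BAC-to-BAC strategy could be tried. These results are important for revealing the molecular mechanisms of yield, active-ingredient content, development, and disease and pest resistance in S. grosvenorii, and for raising mogroside V content and lowering production costs through molecular breeding; they also provide a basis for choosing a whole-genome sequencing strategy.
Key words: Siraitia grosvenorii; genome sequencing; heterozygosity rate; GC content; shotgun sequencing strategy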
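The reported sequencing depth can be cross-checked from the two other figures in the abstract, since average depth is simply the total number of sequenced bases divided by the estimated genome size. The following is a minimal sketch of that arithmetic; the function name `sequencing_depth` is introduced here for illustration and is not taken from the article.

```python
def sequencing_depth(total_bases: float, genome_size: float) -> float:
    """Average depth = total sequenced bases / estimated genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Figures reported in the abstract.
TOTAL_BASES = 18.1e9    # 18.1 Gb of sequencing data
GENOME_SIZE = 344.95e6  # about 344.95 Mb estimated genome size

depth = sequencing_depth(TOTAL_BASES, GENOME_SIZE)
print(f"approximate depth: {depth:.1f}x")
```

The result falls between 52 and 53, which is consistent with the 52× depth stated in the abstract.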
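DOI: 10.11931/guihaia.gxzw201404041
Received: 2014-05-28; revised: 2014-07-22
Fund project: National Natural Science Foundation of China (81373914, 31400275); National Science and Technology Support Program (2011BAI01B03); Guangxi Agricultural Science and Technology Achievement Transformation Project (桂科转1123013-12); Guangxi Natural Science Foundation (2013GXNSFBA019170); Key Project of the Hunan Provincial Science and Technology Plan (2014SK2005); Special Project for Traditional Chinese Medicine Science and Technology of the Guangxi Department of Health (GZPT1235).
About the authors: TANG Qi (born 1981), male, from Zhuzhou County, Hunan; PhD, assistant researcher; research area: molecular biology of medicinal plants; (E-mail) tangqi423@sina.com. *Corresponding author: MA Xiao-Jun, PhD, professor, working on medicinal plant biotechnology; (E-mail) mayixuan10@163.com.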
Genome survey analysis in Siraitia grosvenorii
TANG Qi2,3, MA Xiao-Jun1*, MO Chang-Ming3, PAN Li-Mei3, WEI Rong-Chang3, ZHAO Huan1
1. Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences, Beijing 100193, China; 2. Horticulture and Landscape College, Hunan Agricultural University, Changsha 410128, China; 3. Guangxi Botanical Garden of Medicinal Plants, Nanning 530023, China
Abstract:
Siraitia grosvenorii (Luohanguo) is a herbaceous perennial medicinal and sweetener plant native to Guangxi, China. It has long been used in traditional Chinese medicine as a natural sweetener and as a folk remedy for lung congestion, colds, and sore throats. Many cucurbitane-type triterpene glycosides have been isolated and characterized from S. grosvenorii. The components responsible for the sweetness are the mogrosides, which belong to the triterpene glycoside family. Mogroside V, which is nearly 425 times sweeter than sucrose, has considerable promise as a natural, low-calorie sweetener. S. grosvenorii currently depends entirely on cultivation in China. Its application is limited by its narrow distribution, severe continuous-cropping obstacles, and the low content and high extraction cost of mogroside V. To avoid undirected research and to determine an appropriate sequencing strategy, a genome survey is needed before large-scale genome sequencing. Such a survey provides information about the size and complexity of the S. grosvenorii genome. Next-generation sequencing, which has emerged as a cost-effective approach to high-throughput sequence determination, has dramatically improved the efficiency and speed of gene discovery and genome research. Sequencing the S. grosvenorii genome is important for revealing the molecular mechanisms of yield, active-ingredient content, growth, and pest and disease resistance, and it offers an efficient route to raising mogroside V content and reducing its cost through molecular breeding. In this study, the genome size of S. grosvenorii was determined by next-generation sequencing (NGS, Illumina HiSeq™ 2000). The heterozygosity rate, repeat content, and GC depth were also estimated by bioinformatics analysis. The results were as follows: (1) Two DNA libraries of 170 bp and 500 bp were constructed. After cleaning and quality checks, more than 18.1 Gb of high-quality genome data were generated and assembled into 943 296 contigs and 433 325 scaffolds with SOAPdenovo. The numbers of contigs and scaffolds longer than 2 kb were 17 855 and 27 993, respectively. The longest contig and scaffold were 29 kb and 268 kb, and the contig and scaffold N50 lengths were 484 bp and 2 331 bp. The estimated genome size and sequencing depth of S. grosvenorii were about 344.95 Mb and 52×, respectively; (2) K-mer analysis showed an obvious heterozygous peak in the S. grosvenorii genome, with a heterozygosity rate as high as 1.5%. The contig N50 and scaffold N50 of the assembly were much shorter than expected. The high heterozygosity also produced a clearly abnormal relationship between average depth and GC content, including a low-depth region. A weak repeat peak behind the main peak indicated that S. grosvenorii has many repetitive sequences; (3) Whole-genome shotgun sequencing (WGS) should not be used alone for sequencing the S. grosvenorii genome; combining it with a Fosmid-to-Fosmid or BAC-to-BAC strategy could give better results. This study not only provides basic genomic resources but also offers a theoretical basis and target genes for transgenic breeding and genetic engineering of S. grosvenorii.
Key words: Siraitia grosvenorii (Luohanguo); genome sequencing; heterozygosity rate; GC depth; whole-genome shotgun sequencing
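The contig and scaffold N50 values quoted in the abstract summarize assembly contiguity: N50 is the length L such that sequences of length L or longer together cover at least half of the total assembled length. The sketch below shows one common way to compute it. The function name `n50` and the toy contig lengths are hypothetical illustrations and are not data from the article, whose values came from the SOAPdenovo assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest and their lengths
    accumulated; the N50 is the length of the sequence at which the
    running total first reaches half of the total assembled length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy contig lengths in bp (not from the article).
toy_contigs = [100, 200, 300, 400, 500]
print(n50(toy_contigs))  # 500 + 400 = 900 >= 750, so this prints 400
```

A highly fragmented assembly has many short sequences, so the running total reaches the halfway point only at a short length. This is why the short N50 values in the abstract are read as a sign that heterozygosity and repeats hampered the assembly.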