Phylogenetic relationships of angiosperms based on 5 993 nuclear genes |
金 鑫1,2,程 书1,2, 杨 拓3, 余 慷1,2, 段肖霞1,4,
倪雪梅1,4, 李世明1,2, 张耕耘1,4*
|
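1. BGI-Shenzhen, Shenzhen 518083, Guangdong, China; 2. BGI Institute of Applied Agriculture, Shenzhen 518120, Guangdong, China; 3. China National GeneBank, Shenzhen 518120, Guangdong, China; 4. Key Laboratory of Genomics, Ministry of Agriculture, Shenzhen 518083, Guangdong, China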
|
|
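Abstract (translated from the Chinese): |
Reconstructing phylogenetic relationships is essential for the classification and evolutionary study of angiosperms. Studies of angiosperm phylogeny have long relied mostly on plastid genes, mitochondrial genes, or a small number of conserved single-copy nuclear genes. In this study, nuclear gene sets of 88 angiosperm species (covering 58 orders) were collected from annotated genomes or transcriptomes. Homologous gene clustering followed by removal of paralogs yielded 5 993 one-to-one orthologous gene families (each family contains at most one sequence per species and covers at least 50 species). Using DNA or amino acid sequences from gene subsets of various sizes, 20 phylogenetic trees were built with the concatenation and coalescence methods. A comparison of these trees shows that most results support the relationships among the major angiosperm clades described in APG IV, [(eudicots, monocots), magnoliids], but the relationships among the orders within eudicots differ markedly from APG IV in one respect: Santalales and Caryophyllales are recovered as sister groups to the rosids. Based on these trees, the divergence times of the angiosperm orders were estimated; the results indicate that angiosperms originated 237.78 million years ago (95% confidence interval: 202.6-278.08), consistent with the mainstream view of 225 to 240 million years ago. These results provide a feasible strategy for tree building that allows more genes to be used with faster computation. |
Key words: phylogenetic relationships, angiosperms, nuclear genes, homologous gene clustering, concatenation, coalescence, divergence time |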
DOI:10.11931/guihaia.gxzw201905048 |
CLC number: Q949.4 |
Article ID: 1000-3142(2020)01-0044-16 |
Fund project: Supported by Key Technology R & D Program of China (2015BAD02B01-7); Key Laboratory of Crop Core Resources Development and Application Enterprises of Guangdong (2011A091000047); Science and Technology Program of Shenzhen (JCYJ20150831201123287); Molecular Design and Polymerization Breeding Engineering Laboratory of Shenzhen ([2015]946). |
|
Reconstruction of angiosperm phylogeny based on 5 993 nuclear genes |
JIN Xin1,2, CHENG Shu1,2, YANG Tuo3, YU Kang1,2, DUAN Xiaoxia1,4,
NI Xuemei1,4, LI Shiming1,2, ZHANG Gengyun1,4*
|
1. BGI-Shenzhen, Shenzhen 518083, Guangdong, China; 2. BGI Institute of Applied Agriculture, Shenzhen 518120, Guangdong, China; 3. China National GeneBank, Shenzhen 518120, Guangdong, China; 4. Key Laboratory of Genomics, Ministry of Agriculture, BGI-Shenzhen, Shenzhen 518083, Guangdong, China
|
Abstract: |
Reconstructing phylogeny is important for the classification and evolutionary study of angiosperms. Angiosperm phylogeny has long been analysed using plastid genes, mitochondrial genes, or a few conserved single-copy nuclear genes. We collected nuclear gene sets of 88 angiosperm species (representing 58 orders) from annotated genomes or transcriptomes. Using a combined homology- and phylogenetic tree-based approach, we obtained 5 993 one-to-one ortholog groups (at most one sequence per species in each group), each represented by at least 50 species. We then reconstructed 20 species trees from gene data sets with different gene occupancy values, combining two reconstruction methods (concatenation-based and coalescence-based) with two sequence types (nucleotide or amino acid). Most of the resulting topologies support the relationships among the major angiosperm clades described in APG IV, but they show different deep relationships among the major clades within eudicots, such as the placement of Santalales and Caryophyllales as sisters to rosids. We estimated the divergence times of the major angiosperm clades and conclude that angiosperms originated about 237.78 million years ago (95% confidence interval: 202.6-278.08), in accordance with the previously accepted range of 225 to 240 million years ago. This study provides an efficient strategy for building phylogenetic trees that allows thousands of genes to be used while keeping computation fast. |
Key words: phylogeny, angiosperms, nuclear genes, ortholog inference, concatenation, coalescence, divergence time |
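The abstract describes selecting ortholog groups by species coverage and then building trees by concatenation. The following is a minimal Python sketch of those two steps on toy data; it is not the authors' pipeline. The function names (`filter_by_occupancy`, `concatenate`), the toy sequences, and the threshold of 2 species are all assumptions made for illustration (the study itself required at least 50 species per group), and the sequences within each group are assumed to be already aligned.

```python
from typing import Dict, List

# One ortholog group: species name -> aligned sequence (at most one per species).
Alignment = Dict[str, str]


def filter_by_occupancy(groups: Dict[str, Alignment], min_species: int) -> Dict[str, Alignment]:
    """Keep only ortholog groups that contain at least `min_species` species."""
    return {name: aln for name, aln in groups.items() if len(aln) >= min_species}


def concatenate(groups: Dict[str, Alignment], species: List[str], gap: str = "-") -> Dict[str, str]:
    """Join per-group alignments into one supermatrix row per species.

    A species missing from a group is padded with gap characters so that
    every row of the supermatrix has the same length.
    """
    parts: Dict[str, List[str]] = {sp: [] for sp in species}
    for name in sorted(groups):  # fixed order so that columns line up across species
        aln = groups[name]
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"group {name} is empty or not aligned")
        width = lengths.pop()
        for sp in species:
            parts[sp].append(aln.get(sp, gap * width))
    return {sp: "".join(chunks) for sp, chunks in parts.items()}


# Hypothetical toy data: three ortholog groups over three species.
toy_groups = {
    "OG1": {"A": "ATG", "B": "ATA", "C": "ACG"},
    "OG2": {"A": "GG", "B": "GC"},
    "OG3": {"A": "TTTT"},
}
kept = filter_by_occupancy(toy_groups, min_species=2)
matrix = concatenate(kept, species=["A", "B", "C"])
for sp, row in matrix.items():
    print(sp, row)
```

Raising `min_species` yields smaller but more complete gene sets, which corresponds to the "different gene occupancy values" mentioned in the abstract. A coalescence-based analysis would instead infer one tree per retained group and summarize those gene trees, which this sketch does not attempt.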