Should genes with missing data be excluded from phylogenetic analyses? | |
Jiang, Wei (1,2,3); Chen, Si-Yun (2); Wang, Hong (1,2,3); Li, De-Zhu (1,2,3); Wiens, John J. (4) | |
2014-11-01 | |
Journal | MOLECULAR PHYLOGENETICS AND EVOLUTION |
ISSN | 1055-7903 |
Volume | 80; Issue: 18B; Pages: 308-318 |
Abstract | Phylogeneticists often design their studies to maximize the number of genes included but minimize the overall amount of missing data. However, few studies have addressed the costs and benefits of adding characters with missing data, especially for likelihood analyses of multiple loci. In this paper, we address this topic using two empirical data sets (in yeast and plants) with well-resolved phylogenies. We introduce varying amounts of missing data into varying numbers of genes and test whether the benefits of excluding genes with missing data outweigh the costs of excluding the non-missing data that are associated with them. We also test if there is a proportion of missing data in the incomplete genes at which they cease to be beneficial or harmful, and whether missing data consistently bias branch length estimates. Our results indicate that adding incomplete genes generally increases the accuracy of phylogenetic analyses relative to excluding them, especially when there is a high proportion of incomplete genes in the overall dataset (and thus few complete genes). Detailed analyses suggest that adding incomplete genes is especially helpful for resolving poorly supported nodes. Given that we find that excluding genes with missing data often decreases accuracy relative to including these genes (and that decreases are generally of greater magnitude than increases), there is little basis for assuming that excluding these genes is necessarily the safer or more conservative approach. We also find no evidence that missing data consistently bias branch length estimates. © 2014 Elsevier Inc. All rights reserved. |
Keywords | Accuracy; Maximum Likelihood; Missing Data; Phylogeny |
Subject areas | Biochemistry & Molecular Biology; Evolutionary Biology; Genetics & Heredity |
DOI | 10.1016/j.ympev.2014.08.006 |
Indexed in | SCI |
Language | English |
WOS accession number | WOS:000343742200027 |
Document type | Journal article |
Item identifier | http://ir.kib.ac.cn/handle/151853/18514 |
Collection | Germplasm Bank of Wild Species in Southwest China |
Corresponding author | Li, DZ (reprint author), Chinese Acad Sci, Kunming Inst Bot, Key Lab Plant Divers & Biogeog East Asia, Kunming 650201, Yunnan, Peoples R China; jiangwei@mail.kib.ac.cn; chensiyun@mail.kib.ac.cn; wanghong@mail.kib.ac.cn; dzl@mail.kib.ac.cn; wiensj@email.arizona.edu |
Author affiliations | 1. Chinese Acad Sci, Kunming Inst Bot, Key Lab Plant Divers & Biogeog East Asia, Kunming 650201, Yunnan, Peoples R China; 2. Chinese Acad Sci, Kunming Inst Bot, Germplasm Bank Wild Species, Plant Germplasm & Genom Ctr, Kunming 650201, Yunnan, Peoples R China; 3. Univ Chinese Acad Sci, Kunming Coll Life Sci, Kunming 650201, Yunnan, Peoples R China; 4. Univ Arizona, Dept Ecol & Evolutionary Biol, Tucson, AZ 85721 USA |
Recommended citation (GB/T 7714) | Jiang, Wei, Chen, Si-Yun, Wang, Hong, et al. Should genes with missing data be excluded from phylogenetic analyses?[J]. MOLECULAR PHYLOGENETICS AND EVOLUTION, 2014, 80(18B): 308-318. |
APA | Jiang, Wei, Chen, Si-Yun, Wang, Hong, Li, De-Zhu, & Wiens, John J. (2014). Should genes with missing data be excluded from phylogenetic analyses? MOLECULAR PHYLOGENETICS AND EVOLUTION, 80(18B), 308-318. |
MLA | Jiang, Wei, et al. "Should genes with missing data be excluded from phylogenetic analyses?" MOLECULAR PHYLOGENETICS AND EVOLUTION 80.18B (2014): 308-318. |
Files in this item | |
File name/size | Document type | Version type | Access type | License |
Jiang-2014-Should ge(1085KB) | | | Open access | CC BY-NC-ND |