1.
PLoS Comput Biol ; 19(3): e1010154, 2023 03.
Article in English | MEDLINE | ID: mdl-36947561

ABSTRACT

Missing observations in trait datasets pose an obstacle for analyses in myriad biological disciplines. Considering the mixed results of imputation, the wide variety of available methods, and the varied structure of real trait datasets, a framework for selecting a suitable imputation method is advantageous. We invoked a real data-driven simulation strategy to select an imputation method for a given mixed-type (categorical, count, continuous) target dataset. Candidate methods included mean/mode imputation, k-nearest neighbour, random forests, and multivariate imputation by chained equations (MICE). Using a trait dataset of squamates (lizards and amphisbaenians; order: Squamata) as a target dataset, a complete-case dataset consisting of species with nearly complete information was formed for the imputation method selection. Missing data were induced by removing values from this dataset under different missingness mechanisms: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). For each method, combinations with and without phylogenetic information from single gene (nuclear and mitochondrial) or multigene trees were used to impute the missing values for five numerical and two categorical traits. The performances of the methods were evaluated under each missing mechanism by determining the mean squared error and proportion falsely classified rates for numerical and categorical traits, respectively. A random forest method supplemented with a nuclear-derived phylogeny resulted in the lowest error rates for the majority of traits, and this method was used to impute missing values in the original dataset. Data with imputed values better reflected the characteristics and distributions of the original data compared to complete-case data. However, caution should be taken when imputing trait data as phylogeny did not always improve performance for every trait and in every scenario. 
Ultimately, these results support the use of a real data-driven simulation strategy for selecting a suitable imputation method for a given mixed-type trait dataset.
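
The simulation strategy described above (start from complete data, remove values, impute, compare against the known truth) can be illustrated with a minimal sketch. It covers only the MCAR mechanism and the mean/mode baseline, not the k-nearest neighbour, random forest, MICE, or phylogeny-informed variants. The trait names, toy values, and the 20% missingness rate are hypothetical and not taken from the study.

```python
import random
import statistics

def induce_mcar(values, rate, rng):
    """Return a copy of `values` with a random `rate` share set to None (MCAR)."""
    n_missing = max(1, round(rate * len(values)))
    missing = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing else v for i, v in enumerate(values)]

def impute_mean(column):
    """Fill missing numerical entries with the mean of the observed ones."""
    fill = statistics.fmean(v for v in column if v is not None)
    return [fill if v is None else v for v in column]

def impute_mode(column):
    """Fill missing categorical entries with the most common observed level."""
    fill = statistics.mode(v for v in column if v is not None)
    return [fill if v is None else v for v in column]

def mse(truth, masked, imputed):
    """Mean squared error over the entries that were removed."""
    errs = [(t - i) ** 2 for t, m, i in zip(truth, masked, imputed) if m is None]
    return statistics.fmean(errs) if errs else 0.0

def pfc(truth, masked, imputed):
    """Proportion falsely classified over the entries that were removed."""
    wrong = [t != i for t, m, i in zip(truth, masked, imputed) if m is None]
    return sum(wrong) / len(wrong) if wrong else 0.0

# Hypothetical complete-case data: one numerical and one categorical trait.
rng = random.Random(0)
body_length = [rng.gauss(50.0, 10.0) for _ in range(200)]
diet = rng.choices(["carnivore", "herbivore", "omnivore"], weights=[6, 1, 3], k=200)

masked_length = induce_mcar(body_length, 0.2, rng)
masked_diet = induce_mcar(diet, 0.2, rng)

mse_mean = mse(body_length, masked_length, impute_mean(masked_length))
pfc_mode = pfc(diet, masked_diet, impute_mode(masked_diet))
print(f"mean imputation MSE: {mse_mean:.2f}")
print(f"mode imputation PFC: {pfc_mode:.2f}")
```

Repeating this loop for each candidate method and each missingness mechanism, then ranking methods by error per trait, would give the kind of comparison the abstract reports.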


Subjects
Research Design, Phylogeny, Computer Simulation, Phenotype, Cluster Analysis
2.
J Mol Evol ; 88(8-9): 689-702, 2020 11.
Article in English | MEDLINE | ID: mdl-33009923

ABSTRACT

Myriad environmental and biological traits have been investigated for their roles in influencing the rate of molecular evolution across various taxonomic groups. However, most studies have focused on a single trait, while controlling for additional factors in an informal way, generally by excluding taxa. This study utilized a dataset of cytochrome c oxidase subunit I (COI) barcode sequences from over 7000 ray-finned fish species to test the effects of 27 traits on molecular evolutionary rates. Environmental traits such as temperature were considered, as were traits associated with effective population size including body size and age at maturity. It was hypothesized that these traits would demonstrate significant correlations with substitution rate in a multivariable analysis due to their associations with mutation and fixation rates, respectively. A bioinformatics pipeline was developed to assemble and analyze sequence data retrieved from the Barcode of Life Data System (BOLD) and trait data obtained from FishBase. For use in phylogenetic regression analyses, a maximum likelihood tree was constructed from the COI sequence data using a multi-gene backbone constraint tree covering 71% of the species. A variable selection method that included both single- and multivariable analyses was used to identify traits that contribute to rate heterogeneity estimated from different codon positions. Our analyses revealed that molecular rates associated most significantly with latitude, body size, and habitat type. Overall, this study presents a novel and systematic approach for integrative data assembly and variable selection methodology in a phylogenetic framework.


Subjects
Taxonomic DNA Barcoding, Molecular Evolution, Fishes, Animals, Environment, Fishes/classification, Fishes/genetics, Phenotype, Phylogeny
3.
Heredity (Edinb) ; 122(5): 513-524, 2019 05.
Article in English | MEDLINE | ID: mdl-30202084

ABSTRACT

The evolutionary speed hypothesis (ESH) suggests that molecular evolutionary rates are higher among species inhabiting warmer environments. Previously, the ESH has been investigated using small numbers of latitudinally-separated sister lineages; in animals, these studies typically focused on subsets of Chordata and yielded mixed support for the ESH. This study analyzed public DNA barcode sequences from the cytochrome c oxidase subunit I (COI) gene for six of the largest animal phyla (Arthropoda, Chordata, Mollusca, Annelida, Echinodermata, and Cnidaria) and paired latitudinally-separated taxa together informatically. Of 8037 lineage pairs, just over half (51.6%) displayed a higher molecular rate in the lineage inhabiting latitudes closer to the equator, while the remainder (48.4%) displayed a higher rate in the higher-latitude lineage. To date, this study represents the most comprehensive analysis of latitude-related molecular rate differences across animals. While a statistically-significant pattern was detected from our large sample size, our findings suggest that the ESH may not serve as a strong universal mechanism underlying the latitudinal diversity gradient and that COI molecular clocks may generally be applied across latitudes. This study also highlights the merits of using automation to analyze large DNA barcode datasets.
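
To see how a 51.6% versus 48.4% split can be statistically significant yet biologically weak, the following sketch applies a two-sided sign test (normal approximation) to the figures in the abstract. The abstract does not state which test the authors used, so this is an illustration rather than their method, and the pair count of 4147 is an assumption reconstructed from the rounded percentage.

```python
import math

def sign_test_normal_approx(successes, n):
    """Two-sided sign test of H0: p = 0.5, using the normal approximation."""
    z = (successes - n / 2) / math.sqrt(n / 4)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value

n_pairs = 8037                    # lineage pairs, from the abstract
# ASSUMPTION: count of pairs with the faster lineage nearer the equator,
# reconstructed from the rounded 51.6% and not reported in the abstract.
n_lower = round(0.516 * n_pairs)

z, p = sign_test_normal_approx(n_lower, n_pairs)
print(f"pairs with faster low-latitude lineage: {n_lower} of {n_pairs}")
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

With a sample this large, a deviation of under two percentage points from an even split is enough to reject the null hypothesis, which is consistent with the abstract's point that a significant pattern need not imply a strong effect.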


Subjects
Molecular Evolution, Tropical Climate, Animals, Biodiversity, Taxonomic DNA Barcoding, Mitochondrial DNA/genetics, Genetic Databases, Electron Transport Complex IV/genetics, Geography, Invertebrates/classification, Invertebrates/genetics, Linear Models, Phylogeny