1.
Genetics ; 2024 May 28.
Article in English | MEDLINE | ID: mdl-38805070

ABSTRACT

Detecting introgression between closely related populations or species is a fundamental objective in evolutionary biology. Existing methods for detecting migration and inferring migration rates from population genetic data often assume a neutral model of evolution. Growing evidence of the pervasive impact of selection on large portions of the genome across diverse taxa suggests that this assumption is unrealistic in most empirical systems. Further, ignoring selection has previously been shown to negatively impact demographic inferences (e.g., of population size histories). However, the impacts of biologically realistic selection on inferences of migration remain poorly explored. Here, we simulate data under models of background selection, selective sweeps, balancing selection, and adaptive introgression. We show that ignoring selection sometimes leads to false inferences of migration in popularly used methods that rely on the site frequency spectrum (SFS). Specifically, balancing selection and some models of background selection result in the rejection of isolation-only models in favor of isolation-with-migration models and lead to elevated estimates of migration rates. BPP, a method that analyzes sequence data directly, showed false positives for all conditions at recent divergence times, but balancing selection also led to false positives at medium divergence times. Our results suggest that such methods may be unreliable in some empirical systems, such that new methods that are robust to selection need to be developed.
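
For readers unfamiliar with the site frequency spectrum (SFS) mentioned in the abstract, the following is a minimal sketch of how a folded, single-population SFS can be tallied from haploid allele calls. The function name `folded_sfs` and the toy data are illustrative assumptions; the methods evaluated in the study typically work with a joint SFS across populations, which this sketch does not build.

```python
from collections import Counter


def folded_sfs(genotypes):
    """Return the folded SFS for a list of biallelic sites.

    genotypes: list of sites; each site is a list of 0/1 allele codes,
    one per haploid sample (hypothetical input layout).
    Entry k-1 of the result counts sites whose minor allele occurs k times.
    """
    n = len(genotypes[0])
    counts = Counter()
    for site in genotypes:
        derived = sum(site)
        minor = min(derived, n - derived)  # fold: ignore which allele is ancestral
        if minor > 0:                      # skip monomorphic sites
            counts[minor] += 1
    return [counts[k] for k in range(1, n // 2 + 1)]


# Toy data: 4 sites scored in 4 haploid samples (made up for illustration).
sites = [
    [0, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
print(folded_sfs(sites))  # [2, 1]
```

Selection changes the shape of this spectrum (for example, balancing selection inflates intermediate-frequency classes), which is one way it can be mistaken for a demographic signal such as migration.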

2.
Sci Rep ; 14(1): 10514, 2024 05 07.
Article in English | MEDLINE | ID: mdl-38714721

ABSTRACT

Adverse pregnancy outcomes (APOs) affect a large proportion of pregnancies and represent an important cause of morbidity and mortality worldwide. Yet the pathophysiology of APOs is poorly understood, limiting our ability to prevent and treat these conditions. To search for genetic markers of maternal risk for four APOs, we performed multi-ancestry genome-wide association studies (GWAS) for pregnancy loss, gestational length, gestational diabetes, and preeclampsia. We clustered participants by their genetic ancestry and focused our analyses on three sub-cohorts with the largest sample sizes: European, African, and Admixed American. Association tests were carried out separately for each sub-cohort and then meta-analyzed together. Two novel loci were significantly associated with an increased risk of pregnancy loss: a cluster of SNPs located downstream of the TRMU gene (top SNP: rs142795512), and the SNP rs62021480 near RGMA. In the GWAS of gestational length we identified two new variants, rs2550487 and rs58548906 near WFDC1 and AC005052.1, respectively. Lastly, three new loci were significantly associated with gestational diabetes (top SNPs: rs72956265, rs10890563, rs79596863), located on or near ZBTB20, GUCY1A2, and RPL7P20, respectively. Fourteen loci previously correlated with preterm birth, gestational diabetes, and preeclampsia were found to be associated with these outcomes as well.
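
The abstract says that per-ancestry association results were meta-analyzed but does not name the method. One common choice is a fixed-effect, inverse-variance-weighted combination of per-cohort effect sizes; the sketch below shows that calculation under this assumption. The function name `fixed_effect_meta` and the example numbers are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect meta-analysis.

    betas: per-cohort effect estimates (e.g., log odds ratios).
    ses:   per-cohort standard errors.
    Returns the pooled effect and its standard error.
    """
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical effect sizes for one SNP in two sub-cohorts.
beta, se = fixed_effect_meta([0.10, 0.20], [0.1, 0.1])
print(round(beta, 4), round(se, 4))  # 0.15 0.0707
```

Cohorts with smaller standard errors (usually the larger ones) receive more weight, which is why the pooled estimate is more precise than either input.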


Subjects
Gestational Diabetes, Genome-Wide Association Study, Single Nucleotide Polymorphism, Pregnancy Outcome, Humans, Pregnancy, Female, Pregnancy Outcome/genetics, Gestational Diabetes/genetics, Adult, Pre-Eclampsia/genetics, Genetic Predisposition to Disease, Parity/genetics
3.
Mol Phylogenet Evol ; 196: 108066, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38565358

ABSTRACT

Machine learning has increasingly been applied to a wide range of questions in phylogenetic inference. Supervised machine learning approaches that rely on simulated training data have been used to infer tree topologies and branch lengths, to select substitution models, and to perform downstream inferences of introgression and diversification. Here, we review how researchers have used several promising machine learning approaches to make phylogenetic inferences. Despite the promise of these methods, several barriers prevent supervised machine learning from reaching its full potential in phylogenetics. We discuss these barriers and potential paths forward. In the future, we expect that the application of careful network designs and data encodings will allow supervised machine learning to accommodate the complex processes that continue to confound traditional phylogenetic methods.


Subjects
Machine Learning, Phylogeny, Supervised Machine Learning, Genetic Models
4.
Syst Biol ; 2024 Feb 29.
Article in English | MEDLINE | ID: mdl-38421146

ABSTRACT

Hundreds or thousands of loci are now routinely used in modern phylogenomic studies. Concatenation approaches to tree inference assume that there is a single topology for the entire dataset, but different loci may have different evolutionary histories due to incomplete lineage sorting, introgression, and/or horizontal gene transfer; even single loci may not be treelike due to recombination. To overcome this shortcoming, we introduce an implementation of a multi-tree mixture model that we call MAST. This model extends a prior implementation by Boussau et al. (2009) by allowing users to estimate the weight of each of a set of pre-specified bifurcating trees in a single alignment. The MAST model allows each tree to have its own weight, topology, branch lengths, substitution model, nucleotide or amino acid frequencies, and model of rate heterogeneity across sites. We implemented the MAST model in a maximum-likelihood framework in the popular phylogenetic software, IQ-TREE. Simulations show that we can accurately recover the true model parameters, including branch lengths and tree weights for a given set of tree topologies, under a wide range of biologically realistic scenarios. We also show that we can use standard statistical inference approaches to reject a single-tree model when data are simulated under multiple trees (and vice versa). We applied the MAST model to multiple primate datasets and found that it can recover the signal of incomplete lineage sorting in the Great Apes, as well as the asymmetry in minor trees caused by introgression among several macaque species. When applied to a dataset of four Platyrrhine species for which standard concatenated maximum likelihood and gene tree approaches disagree, we observe that MAST gives the highest weight (i.e. the largest proportion of sites) to the tree also supported by gene tree approaches. These results suggest that the MAST model is able to analyse a concatenated alignment using maximum likelihood, while avoiding some of the biases that come with assuming there is only a single tree. We discuss how the MAST model can be extended in the future.
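
The idea of weighting a fixed set of trees can be illustrated with a generic mixture likelihood: each site's likelihood is the weighted sum of its likelihoods under the individual trees, and the alignment log-likelihood is the sum of the logs. The sketch below assumes the per-tree site likelihoods have already been computed elsewhere; the function name and numbers are hypothetical, and this is not the IQ-TREE implementation.

```python
import math


def mixture_log_likelihood(site_likelihoods, weights):
    """Log-likelihood of an alignment under a mixture of trees.

    site_likelihoods[j][i]: likelihood of site i under tree j (assumed given).
    weights[j]: weight of tree j; the weights must sum to 1.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("tree weights must sum to 1")
    n_sites = len(site_likelihoods[0])
    total = 0.0
    for i in range(n_sites):
        # Mixture: weighted sum over trees for this site.
        site_l = sum(w * tree[i] for w, tree in zip(weights, site_likelihoods))
        total += math.log(site_l)
    return total


# Two trees, two sites, equal weights (made-up likelihood values).
lnl = mixture_log_likelihood([[0.2, 0.1], [0.4, 0.3]], [0.5, 0.5])
print(round(lnl, 4))  # -2.8134
```

In a maximum-likelihood setting the weights would be optimized together with the other parameters; here they are simply supplied.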

5.
Curr Biol ; 33(22): R1166-R1172, 2023 11 20.
Article in English | MEDLINE | ID: mdl-37989088

ABSTRACT

Biological differences between males and females lead to many differences in physiology, disease, and overall health. One of the most prominent disparities is in the number of germline mutations passed to offspring: human males transmit three times as many mutations as do females. While the classic explanation for this pattern invokes differences in post-puberty germline replication between the sexes, recent whole-genome evidence in humans and other mammals has cast doubt on this mechanism. Here, we review recent work that is inconsistent with a replication-driven model of male-biased mutation, and propose an alternative, 'faulty male' hypothesis. This model proposes that males are less able to repair and/or protect DNA from damage compared to females. Importantly, we suggest that this new model for male-biased mutation may also help to explain several pronounced differences between the sexes in cancer, aging, and DNA repair. Although the detailed contributions of genetic, epigenetic, and hormonal influences of biological sex on mutation remain to be fully understood, a reconsideration of the mechanisms underlying these differences will lead to a deeper understanding of evolution and disease.


Subjects
Genome, Germ Cells, Female, Animals, Male, Humans, Mutation, Mammals/genetics, Aging
6.
Bioinformatics ; 39(9), 2023 09 02.
Article in English | MEDLINE | ID: mdl-37669126

ABSTRACT

MOTIVATION: The application of machine learning approaches in phylogenetics has been impeded by the vast model space associated with inference. Supervised machine learning approaches require data from across this space to train models. Because of this, previous approaches have typically been limited to inferring relationships among unrooted quartets of taxa, where there are only three possible topologies. Here, we explore the potential of generative adversarial networks (GANs) to address this limitation. GANs consist of a generator and a discriminator: at each step, the generator aims to create data that is similar to real data, while the discriminator attempts to distinguish generated and real data. By using an evolutionary model as the generator, we use GANs to make evolutionary inferences. Since a new model can be considered at each iteration, heuristic searches of complex model spaces are possible. Thus, GANs offer a potential solution to the challenges of applying machine learning in phylogenetics. RESULTS: We developed phyloGAN, a GAN that infers phylogenetic relationships among species. phyloGAN takes as input a concatenated alignment, or a set of gene alignments, and infers a phylogenetic tree either considering or ignoring gene tree heterogeneity. We explored the performance of phyloGAN for up to 15 taxa in the concatenation case and 6 taxa when considering gene tree heterogeneity. Error rates are relatively low in these simple cases. However, run times are slow and performance metrics suggest issues during training. Future work should explore novel architectures that may result in more stable and efficient GANs for phylogenetics. AVAILABILITY AND IMPLEMENTATION: phyloGAN is available on github: https://github.com/meganlsmith/phyloGAN/.


Subjects
Benchmarking, Biological Evolution, Phylogeny, Genetic Heterogeneity, Machine Learning
7.
Genetics ; 225(2), 2023 10 04.
Article in English | MEDLINE | ID: mdl-37602697

ABSTRACT

Adverse pregnancy outcomes (APOs) are major risk factors for women's health during pregnancy and even in the years after pregnancy. Due to the heterogeneity of APOs, only a few genetic associations have been identified. In this report, we conducted genome-wide association studies (GWASs) of 479 traits that are possibly related to APOs using a large and racially diverse study, Nulliparous Pregnancy Outcomes Study: Monitoring Mothers-to-Be (nuMoM2b). To display extensive results, we developed a web-based tool GnuMoM2b (https://gnumom2b.cumcobgyn.org/) for searching, visualizing, and sharing results from a GWAS of 479 pregnancy traits as well as phenome-wide association studies of more than 17 million single nucleotide polymorphisms. The genetic results from 3 ancestries (Europeans, Africans, and Admixed Americans) and meta-analyses are populated in GnuMoM2b. In conclusion, GnuMoM2b is a valuable resource for extraction of pregnancy-related genetic results and shows the potential to facilitate meaningful discoveries.


Subjects
Genome-Wide Association Study, Phenomics, Pregnancy, Female, Humans, Genome-Wide Association Study/methods, Phenotype, Risk Factors, Single Nucleotide Polymorphism
8.
medRxiv ; 2023 Jun 05.
Article in English | MEDLINE | ID: mdl-37333377

ABSTRACT

Adverse pregnancy outcomes (APOs) are major risk factors for women's health during pregnancy and even in the years after pregnancy. Due to the heterogeneity of APOs, only a few genetic associations have been identified. In this report, we conducted genome-wide association studies (GWAS) of 479 traits that are possibly related to APOs using a large and racially diverse study, Nulliparous Pregnancy Outcomes Study: Monitoring Mothers-to-Be (nuMoM2b). To display the extensive results, we developed a web-based tool GnuMoM2b (https://gnumom2b.cumcobgyn.org/) for searching, visualizing, and sharing results from GWAS of 479 pregnancy traits as well as phenome-wide association studies (PheWAS) of more than 17 million single nucleotide polymorphisms (SNPs). The genetic results from three ancestries (Europeans, Africans, and Admixed Americans) and meta-analyses are populated in GnuMoM2b. In conclusion, GnuMoM2b is a valuable resource for extraction of pregnancy-related genetic results and shows the potential to facilitate meaningful discoveries.

9.
Thorax ; 78(11): 1118-1125, 2023 11.
Article in English | MEDLINE | ID: mdl-37280096

ABSTRACT

BACKGROUND: Although 1 billion people live in informal (slum) settlements, the consequences for respiratory health of living in these settlements remain largely unknown. This study investigated whether children living in an informal settlement in Nairobi, Kenya are at increased risk of asthma symptoms. METHODS: Children attending schools in Mukuru (an informal settlement in Nairobi) and a more affluent area (Buruburu) were compared. Questionnaires quantified respiratory symptoms and environmental exposures; spirometry was performed; personal exposure to particulate matter (PM2.5) was estimated. RESULTS: 2373 children participated, 1277 in Mukuru (median age, IQR 11, 9-13 years, 53% girls), and 1096 in Buruburu (10, 8-12 years, 52% girls). Mukuru schoolchildren were from less affluent homes, had greater exposure to pollution sources and PM2.5. When compared with Buruburu schoolchildren, Mukuru schoolchildren had a greater prevalence of symptoms, 'current wheeze' (9.5% vs 6.4%, p=0.007) and 'trouble breathing' (16.3% vs 12.6%, p=0.01), and these symptoms were more severe and problematic. Diagnosed asthma was more common in Buruburu (2.8% vs 1.2%, p=0.004). Spirometry did not differ between Mukuru and Buruburu. Regardless of community, significant adverse associations were observed with self-reported exposure to 'vapours, dusts, gases, fumes', mosquito coil burning, adult smoker(s) in the home, refuse burning near homes and residential proximity to roads. CONCLUSION: Children living in informal settlements are more likely to develop wheezing symptoms consistent with asthma that are more severe but less likely to be diagnosed as asthma. Self-reported but not objectively measured air pollution exposure was associated with increased risk of asthma symptoms.


Subjects
Air Pollutants, Air Pollution, Asthma, Child, Adult, Female, Animals, Humans, Male, Air Pollutants/analysis, Kenya/epidemiology, Air Pollution/analysis, Asthma/diagnosis, Asthma/epidemiology, Asthma/etiology, Particulate Matter/adverse effects, Particulate Matter/analysis, Environmental Exposure/adverse effects, Environmental Exposure/analysis, Respiratory Sounds, Gases, Spirometry
10.
Sustain Sci ; 18(3): 1429-1444, 2023.
Article in English | MEDLINE | ID: mdl-37124120

ABSTRACT

Transdisciplinary research (TDR) approaches have been cited as essential for overcoming the intractable sustainability challenges that the world is currently facing, including air pollution, water management and climate change. However, such approaches can be difficult to undertake in practice and can consequently fail to add value. Therefore, examples of what works in practice (and what does not) are helpful to guide future research. In this study, we used a conceptual TDR framework as the basis to examine and evaluate the strengths and weaknesses of our approach in a project exploring air pollution in an informal settlement in Nairobi, Kenya. Reflection diaries exploring experiences of participation in the project were undertaken by the project team (comprising academic and community partners) at multiple time points throughout the project. These reflection diaries played an important role in evaluation and in providing space for team learning. Diaries were thematically coded according to the TDR framework to explore aspects of the project that worked well, and areas which presented challenges. We draw upon our reflections, and the extant literature, to make practical recommendations for researchers undertaking TDR projects in future. Recommendations focus on three key project stages (pre-funding, funded period, post-funding) and include: building the team in a way that includes all key stakeholders in relevant and appropriate roles, giving everyone sufficient time to work on the project, and ensuring regular and open communication. Building these recommendations into the design and delivery of transdisciplinary sustainability science projects will support progress towards achieving the Sustainable Development Goals (SDGs). Supplementary Information: The online version contains supplementary material available at 10.1007/s11625-023-01317-0.

11.
Mol Biol Evol ; 40(5), 2023 05 02.
Article in English | MEDLINE | ID: mdl-37158385

ABSTRACT

Despite the increasing abundance of whole transcriptome data, few methods are available to analyze global gene expression across phylogenies. Here, we present a new software package (Computational Analysis of Gene Expression Evolution [CAGEE]) for inferring patterns of increases and decreases in gene expression across a phylogenetic tree, as well as the rate at which these changes occur. In contrast to previous methods that treat each gene independently, CAGEE can calculate genome-wide rates of gene expression, along with ancestral states for each gene. The statistical approach developed here makes it possible to infer lineage-specific shifts in rates of evolution across the genome, in addition to possible differences in rates among multiple tissues sampled from the same species. We demonstrate the accuracy and robustness of our method on simulated data and apply it to a data set of ovule gene expression collected from multiple self-compatible and self-incompatible species in the genus Solanum to test hypotheses about the evolutionary forces acting during mating system shifts. These comparisons allow us to highlight the power of CAGEE, demonstrating its utility for use in any empirical system and for the analysis of most morphological traits. Our software is available at https://github.com/hahnlab/CAGEE/.


Subjects
Gene Expression Profiling, Phylogeny, Software, Solanum, Solanum/classification, Solanum/genetics, Biological Evolution
12.
Proc Natl Acad Sci U S A ; 120(22): e2220389120, 2023 05 30.
Article in English | MEDLINE | ID: mdl-37216509

ABSTRACT

Phylogenetic comparative methods have long been a mainstay of evolutionary biology, allowing for the study of trait evolution across species while accounting for their common ancestry. These analyses typically assume a single, bifurcating phylogenetic tree describing the shared history among species. However, modern phylogenomic analyses have shown that genomes are often composed of mosaic histories that can disagree both with the species tree and with each other (so-called discordant gene trees). These gene trees describe shared histories that are not captured by the species tree, and therefore that are unaccounted for in classic comparative approaches. The application of standard comparative methods to species histories containing discordance leads to incorrect inferences about the timing, direction, and rate of evolution. Here, we develop two approaches for incorporating gene tree histories into comparative methods: one that constructs an updated phylogenetic variance-covariance matrix from gene trees, and another that applies Felsenstein's pruning algorithm over a set of gene trees to calculate trait histories and likelihoods. Using simulation, we demonstrate that our approaches generate much more accurate estimates of tree-wide rates of trait evolution than standard methods. We apply our methods to two clades of the wild tomato genus Solanum with varying rates of discordance, demonstrating the contribution of gene tree discordance to variation in a set of floral traits. Our approaches have the potential to be applied to a broad range of classic inference problems in phylogenetics, including ancestral state reconstruction and the inference of lineage-specific rate shifts.


Subjects
Algorithms, Software, Phylogeny, Computer Simulation, Probability, Genetic Models
13.
Sci Adv ; 9(1): eabm7047, 2023 Jan 06.
Article in English | MEDLINE | ID: mdl-36608127

ABSTRACT

The generation times of our recent ancestors can tell us about both the biology and social organization of prehistoric humans, placing human evolution on an absolute time scale. We present a method for predicting historical male and female generation times based on changes in the mutation spectrum. Our analyses of whole-genome data reveal an average generation time of 26.9 years across the past 250,000 years, with fathers consistently older (30.7 years) than mothers (23.2 years). Shifts in sex-averaged generation times have been driven primarily by changes to the age of paternity, although we report a substantial increase in female generation times in the recent past. We also find a large difference in generation times among populations, reaching back to a time when all humans occupied Africa.
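
As a quick consistency check on the figures above, and assuming that the sex-averaged generation time is simply the mean of the paternal and maternal values (the abstract does not spell out the averaging), the arithmetic can be reproduced as follows.

```python
# Values quoted in the abstract (years).
paternal = 30.7
maternal = 23.2

# Assumption: sex-averaged generation time is the unweighted mean of the two.
sex_averaged = (paternal + maternal) / 2
print(round(sex_averaged, 2))  # 26.95
```

The result agrees with the reported 26.9 years once rounding of the two inputs is taken into account.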

14.
Bioinformatics ; 39(1), 2023 01 01.
Article in English | MEDLINE | ID: mdl-36383168

ABSTRACT

MOTIVATION: Site concordance factors (sCFs) have become a widely used way to summarize discordance in phylogenomic datasets. However, the original version of sCFs was calculated by sampling a quartet of tip taxa and then applying parsimony-based criteria for discordance. This approach has the potential to be strongly affected by multiple hits at a site (homoplasy), especially when substitution rates are high or taxa are not closely related. RESULTS: Here, we introduce a new method for calculating sCFs. The updated version uses likelihood to generate probability distributions of ancestral states at internal nodes of the phylogeny. By sampling from the states at internal nodes adjacent to a given branch, this approach substantially reduces, but does not abolish, the effects of homoplasy and taxon sampling. AVAILABILITY AND IMPLEMENTATION: Updated sCFs are implemented in IQ-TREE 2.2.2. The software is freely available at https://github.com/iqtree/iqtree2/releases. SUPPLEMENTARY INFORMATION: Supplementary information is available at Bioinformatics online.
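
To make the original, parsimony-based calculation concrete, here is a minimal sketch for a single quartet, assuming the focal branch separates taxa (A, B) from (C, D). A site counts as decisive when it shows exactly two states, each in two taxa, and as concordant when A and B share one state and C and D the other. The function name and sequences are hypothetical; the real procedure averages over many sampled quartets, and the updated likelihood-based version described above is not shown.

```python
def quartet_scf(a, b, c, d):
    """Parsimony-style site concordance factor (%) for the split AB|CD.

    a, b, c, d: aligned sequences (equal-length strings) for the four tips.
    Sites containing a gap ('-') are skipped.
    """
    concordant = 0
    decisive = 0
    for w, x, y, z in zip(a, b, c, d):
        if "-" in (w, x, y, z):
            continue
        if w == x and y == z and w != y:
            concordant += 1          # supports AB|CD
            decisive += 1
        elif w != x and ((w == y and x == z) or (w == z and x == y)):
            decisive += 1            # supports AC|BD or AD|BC
    if decisive == 0:
        return float("nan")
    return 100.0 * concordant / decisive


# Toy alignment: three sites support AB|CD, one supports AC|BD, one is invariant.
print(quartet_scf("AACGT", "AACTT", "GGCGA", "GGCTA"))  # 75.0
```

Because a repeated substitution can make a discordant pattern appear by chance, this count is sensitive to homoplasy, which is the weakness the abstract identifies.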


Subjects
Software, Phylogeny, Probability
15.
Genome Biol Evol ; 14(10), 2022 10 07.
Article in English | MEDLINE | ID: mdl-36173788

ABSTRACT

A male mutation bias is observed across vertebrates, and, where data are available, this bias is accompanied by increased per-generation mutation rates with parental age. While continuing mitotic cell division in the male germline post puberty has been proposed as the major cellular mechanism underlying both patterns, little direct evidence for this role has been found. Understanding the evolution of the per-generation mutation rate among species requires that we identify the molecular mechanisms that change between species. Here, we study the per-generation mutation rate in an extended pedigree of the brown (grizzly) bear, Ursus arctos horribilis. Brown bears hibernate for one-third of the year, a period during which spermatogenesis slows or stops altogether. The reduction of spermatogenesis is predicted to lessen the male mutation bias and to lower the per-generation mutation rate in this species. However, using whole-genome sequencing, we find that both male bias and per-generation mutation rates are highly similar to those expected for a non-hibernating species. We also carry out a phylogenetic comparison of substitution rates along the lineage leading to brown bear and panda (a non-hibernating species) and find no slowing of the substitution rate in the hibernator. Our results contribute to accumulating evidence that suggests that male germline cell division is not the major determinant of mutation rates and mutation biases. The results also provide a quantitative basis for improved estimates of the timing of carnivore evolution.


Subjects
Hibernation, Ursidae, Animals, Male, Ursidae/genetics, Hibernation/genetics, Mutation Rate, Phylogeny, Germ-Line Mutation, Germ Cells
16.
JAMA Netw Open ; 5(8): e2229158, 2022 08 01.
Article in English | MEDLINE | ID: mdl-36040739

ABSTRACT

Importance: Polygenic risk scores (PRS) for type 2 diabetes (T2D) can improve risk prediction for gestational diabetes (GD), yet the strength of the association between genetic and lifestyle risk factors has not been quantified. Objective: To assess the association of PRS and physical activity in existing GD risk models and identify patient subgroups who may receive the most benefits from a PRS or physical activity intervention. Design, Settings, and Participants: The Nulliparous Pregnancy Outcomes Study: Monitoring Mothers-to-Be cohort was established to study individuals without previous pregnancy lasting at least 20 weeks (nulliparous) and to elucidate factors associated with adverse pregnancy outcomes. A subcohort of 3533 participants with European ancestry was used for risk assessment and performance evaluation. Participants were enrolled from October 5, 2010, to December 3, 2013, and underwent genotyping between February 19, 2019, and February 28, 2020. Data were analyzed from September 15, 2020, to November 10, 2021. Exposures: Self-reported total physical activity in early pregnancy was quantified as metabolic equivalents of task (METs). Polygenic risk scores were calculated for T2D using contributions of 84 single nucleotide variants, weighted by their association in the Diabetes Genetics Replication and Meta-analysis Consortium data. Main Outcomes and Measures: Estimation of the development of GD from clinical, genetic, and environmental variables collected in early pregnancy, assessed using measures of model discrimination. Odds ratios and positive likelihood ratios were used to evaluate the association of PRS and physical activity with GD risk. Results: A total of 3533 women were included in this analysis (mean [SD] age, 28.6 [4.9] years). In high-risk population subgroups (body mass index ≥25 or aged ≥35 years), individuals with high PRS (top 25th percentile) or low activity levels (METs <450) had increased odds of a GD diagnosis of 25% to 75%. Compared with the general population, participants with both high PRS and low activity levels had higher odds of a GD diagnosis (odds ratio, 3.4 [95% CI, 2.3-5.3]), whereas participants with low PRS and high METs had significantly reduced risk of a GD diagnosis (odds ratio, 0.5 [95% CI, 0.3-0.9]; P = .01). Conclusions and Relevance: In this cohort study, the addition of PRS was associated with the stratified risk of GD diagnosis among high-risk patient subgroups, suggesting the benefits of targeted PRS ascertainment to encourage early intervention.


Subjects
Type 2 Diabetes Mellitus, Gestational Diabetes, Adult, Cohort Studies, Type 2 Diabetes Mellitus/epidemiology, Type 2 Diabetes Mellitus/genetics, Gestational Diabetes/epidemiology, Gestational Diabetes/genetics, Exercise, Female, Genetic Predisposition to Disease, Humans, Pregnancy
17.
Mol Biol Evol ; 39(7), 2022 07 02.
Article in English | MEDLINE | ID: mdl-35771663

ABSTRACT

The mutation rate is a fundamental evolutionary parameter with direct and appreciable effects on the health and function of individuals. Here, we examine this important parameter in the domestic cat, a beloved companion animal as well as a valuable biomedical model. We estimate a mutation rate of 0.86 × 10⁻⁸ per bp per generation for the domestic cat (at an average parental age of 3.8 years). We find evidence for a significant paternal age effect, with more mutations transmitted by older sires. Our analyses suggest that the cat and the human have accrued similar numbers of mutations in the germline before reaching sexual maturity. The per-generation mutation rate in the cat is 28% lower than what has been observed in humans, but is consistent with the shorter generation time in the cat. Using a model of reproductive longevity, which takes into account differences in the reproductive age and time to sexual maturity, we are able to explain much of the difference in per-generation rates between species. We further apply our reproductive longevity model in a novel analysis of mutation spectra and find that the spectrum for the cat resembles the human mutation spectrum at a younger age of reproduction. Together, these results implicate changes in life-history as a driver of mutation rate evolution between species. As the first direct observation of the paternal age effect outside of rodents and primates, our results also suggest a phenomenon that may be universal among mammals.
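
The two numbers in the abstract can be combined to back out the human per-generation rate that the 28% comparison implies. This is only a back-calculation from the quoted figures, assuming "28% lower" means the cat rate equals 72% of the human rate; the abstract itself does not state the human value.

```python
cat_rate = 0.86e-8        # per bp per generation, from the abstract
relative_reduction = 0.28  # cat rate is 28% lower than the human rate

# Assumption: cat_rate = human_rate * (1 - relative_reduction)
implied_human_rate = cat_rate / (1 - relative_reduction)
print(f"{implied_human_rate:.3e}")  # 1.194e-08
```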


Subjects
Longevity, Mutation Rate, Animals, Cats/genetics, Preschool Child, Humans, Longevity/genetics, Mammals, Mutation, Paternal Age, Reproduction/genetics
18.
Mol Biol Evol ; 39(6), 2022 06 02.
Article in English | MEDLINE | ID: mdl-35642314

ABSTRACT

Traditionally, single-copy orthologs have been the gold standard in phylogenomics. Most phylogenomic studies identify putative single-copy orthologs using clustering approaches and retain families with a single sequence per species. This limits the amount of data available by excluding larger families. Recent advances have suggested several ways to include data from larger families. For instance, tree-based decomposition methods facilitate the extraction of orthologs from large families. Additionally, several methods for species tree inference are robust to the inclusion of paralogs and could use all of the data from larger families. Here, we explore the effects of using all families for phylogenetic inference by examining relationships among 26 primate species in detail and by analyzing five additional data sets. We compare single-copy families, orthologs extracted using tree-based decomposition approaches, and all families with all data. We explore several species tree inference methods, finding that identical trees are returned across nearly all subsets of the data and methods for primates. The relationships among Platyrrhini remain contentious; however, the species tree inference method matters more than the subset of data used. Using data from larger gene families drastically increases the number of genes available and leads to consistent estimates of branch lengths, nodal certainty and concordance, and inferences of introgression in primates. For the other data sets, topological inferences are consistent whether single-copy families or orthologs extracted using decomposition approaches are analyzed. Using larger gene families is a promising approach to include more data in phylogenomics without sacrificing accuracy, at least when high-quality genomes are available.


Subjects
Genome, Animals, Cluster Analysis, Phylogeny
20.
Elife ; 11, 2022 01 12.
Article in English | MEDLINE | ID: mdl-35018888

ABSTRACT

In the past decade, several studies have estimated the human per-generation germline mutation rate using large pedigrees. More recently, estimates for various nonhuman species have been published. However, methodological differences among studies in detecting germline mutations and estimating mutation rates make direct comparisons difficult. Here, we describe the many different steps involved in estimating pedigree-based mutation rates, including sampling, sequencing, mapping, variant calling, filtering, and appropriately accounting for false-positive and false-negative rates. For each step, we review the different methods and parameter choices that have been used in the recent literature. Additionally, we present the results from a 'Mutationathon,' a competition organized among five research labs to compare germline mutation rate estimates for a single pedigree of rhesus macaques. We report almost a twofold variation in the final estimated rate among groups using different post-alignment processing, calling, and filtering criteria, and provide details into the sources of variation across studies. Though the difference among estimates is not statistically significant, this discrepancy emphasizes the need for standardized methods in mutation rate estimations and the difficulty in comparing rates from different studies. Finally, this work aims to provide guidelines for computational and statistical benchmarks for future studies interested in identifying germline mutations from pedigrees.
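
The final step listed in the abstract, accounting for false-positive and false-negative rates, is often expressed as a single correction to the raw count of candidate mutations in a trio. The sketch below shows one widely used form of that estimator under stated assumptions: a single trio, a diploid genome (hence the factor of 2), and error rates estimated separately. The function name and all input numbers are hypothetical.

```python
def per_generation_rate(n_candidates, callable_sites, fdr=0.0, fnr=0.0):
    """Estimate a per-bp, per-generation mutation rate for one trio.

    n_candidates:   candidate de novo mutations that passed filtering.
    callable_sites: haploid genome positions where a mutation could be called.
    fdr:            estimated false discovery rate among the candidates.
    fnr:            estimated false negative rate of the calling pipeline.
    """
    if callable_sites <= 0:
        raise ValueError("callable_sites must be positive")
    if not (0 <= fdr < 1 and 0 <= fnr < 1):
        raise ValueError("fdr and fnr must lie in [0, 1)")
    true_mutations = n_candidates * (1 - fdr)            # remove expected false positives
    effective_sites = 2 * callable_sites * (1 - fnr)     # diploid; shrink for missed calls
    return true_mutations / effective_sites


# Hypothetical trio: 30 candidates over 2 Gb of callable sequence.
rate = per_generation_rate(30, 2.0e9, fdr=0.10, fnr=0.05)
print(f"{rate:.3e}")  # 7.105e-09
```

Because every term depends on filtering choices, groups applying different criteria to the same pedigree can reach noticeably different estimates, as the abstract reports.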


Subjects
Genetic Techniques, Germ-Line Mutation, Macaca mulatta/genetics, Mutation Rate, Animals, Genetic Techniques/instrumentation, Germ Cells, Laboratories, Pedigree, Reference Standards