Results 1 - 12 of 12
1.
Elife ; 12, 2023 Dec 18.
Article in English | MEDLINE | ID: mdl-38108819

ABSTRACT

Gene flow between species, although usually deleterious, is an important evolutionary process that can facilitate adaptation and lead to species diversification. It also makes estimation of species relationships difficult. Here, we use the full-likelihood multispecies coalescent (MSC) approach to estimate species phylogeny and major introgression events in Heliconius butterflies from whole-genome sequence data. We obtain a robust estimate of species branching order among major clades in the genus, including the 'melpomene-silvaniform' group, which shows extensive historical and ongoing gene flow. We obtain chromosome-level estimates of key parameters in the species phylogeny, including species divergence times, present-day and ancestral population sizes, as well as the direction, timing, and intensity of gene flow. Our analysis leads to a phylogeny with introgression events that differ from those obtained in previous studies. We find that Heliconius aoede most likely represents the earliest-branching lineage of the genus and that 'silvaniform' species are paraphyletic within the melpomene-silvaniform group. Our phylogeny provides new, parsimonious histories for the origins of key traits in Heliconius, including pollen feeding and an inversion involved in wing pattern mimicry. Our results demonstrate the power and feasibility of the full-likelihood MSC approach for estimating species phylogeny and key population parameters despite extensive gene flow. The methods used here should be useful for analysis of other difficult species groups with high rates of introgression.


Subject(s)
Butterflies , Animals , Butterflies/genetics , Biological Evolution , Chromosome Inversion , Gene Flow , Phenotype
2.
Mol Biol Evol ; 40(8), 2023 08 03.
Article in English | MEDLINE | ID: mdl-37552932

ABSTRACT

Genomic data are informative about the history of species divergence and interspecific gene flow, including the direction, timing, and strength of gene flow. However, gene flow in opposite directions generates similar patterns in multilocus sequence data, such as reduced sequence divergence between the hybridizing species. As a result, inference of the direction of gene flow is challenging. Here, we investigate the information about the direction of gene flow present in genomic sequence data using likelihood-based methods under the multispecies-coalescent-with-introgression model. We analyze the case of two species, and use simulation to examine cases with three or four species. We find that it is easier to infer gene flow from a small population to a large one than in the opposite direction, and easier to infer inflow (gene flow from outgroup species to an ingroup species) than outflow (gene flow from an ingroup species to an outgroup species). It is also easier to infer gene flow if there is a longer time of separate evolution between the initial divergence and subsequent introgression. When introgression is assumed to occur in the wrong direction, the time of introgression tends to be correctly estimated and the Bayesian test of gene flow is often significant, while estimates of introgression probability can be even greater than the true probability. We analyze genomic sequences from Heliconius butterflies to demonstrate that typical genomic datasets are informative about the direction of interspecific gene flow, as well as its timing and strength.


Subject(s)
Butterflies , Animals , Likelihood Functions , Bayes Theorem , Butterflies/genetics , Genome , Genomics , Gene Flow , Phylogeny , Hybridization, Genetic
3.
Mol Biol Evol ; 39(12), 2022 12 05.
Article in English | MEDLINE | ID: mdl-36317198

ABSTRACT

Genomic sequence data provide a rich source of information about the history of species divergence and interspecific hybridization or introgression. Despite recent advances in genomics and statistical methods, it remains challenging to infer gene flow, and as a result, one may have to estimate introgression rates and times under misspecified models. Here we use mathematical analysis and computer simulation to examine estimation bias and issues of interpretation when the model of gene flow is misspecified in analysis of genomic datasets, for example, if introgression is assigned to the wrong lineages. In the case of two species, we establish a correspondence between the migration rate in the continuous migration model and the introgression probability in the introgression model. When gene flow occurs continuously through time but in the analysis is assumed to occur at a fixed time point, common evolutionary parameters such as species divergence times are surprisingly well estimated. However, the time of introgression tends to be estimated towards the recent end of the period of continuous gene flow. When introgression events are assigned incorrectly to the parental or daughter lineages, introgression times tend to collapse onto species divergence times, with introgression probabilities underestimated. Overall, our analyses suggest that the simple introgression model is useful for extracting information concerning between-specific gene flow and divergence even when the model may be misspecified. However, for reliable inference of gene flow it is important to include multiple samples per species, in particular, from hybridizing species.


Subject(s)
Gene Flow , Genomics , Computer Simulation
4.
Sci Rep ; 12(1): 4185, 2022 03 09.
Article in English | MEDLINE | ID: mdl-35264716

ABSTRACT

Streptococcus agalactiae, also known as Lancefield Group B Streptococcus (GBS), is typically regarded as a neonatal pathogen; however, several studies have shown that the bacteria are capable of causing invasive diseases in non-pregnant adults as well. The majority of documented cases were from Southeast Asian countries, and the most common genotype found was ST283, which is also known to be able to infect fish. This study sequenced 12 GBS ST283 samples collected from adult patients in Thailand. Together with publicly available sequences, we performed temporo-spatial analysis and estimated population dynamics of the bacteria. Putative drug resistance genes were also identified and characterized, and the drug resistance phenotypes were validated experimentally. The results, together with historical records, draw a detailed picture of the past transmission history of GBS ST283 in Southeast Asia.


Subject(s)
Streptococcal Infections , Streptococcus agalactiae , Animals , Asia, Southeastern/epidemiology , Genomics , Humans , Phylogeny , Streptococcal Infections/epidemiology , Streptococcal Infections/microbiology , Streptococcus agalactiae/genetics
5.
Syst Biol ; 71(5): 1159-1177, 2022 08 10.
Article in English | MEDLINE | ID: mdl-35169847

ABSTRACT

Introgressive hybridization plays a key role in adaptive evolution and species diversification in many groups of species. However, frequent hybridization and gene flow between species make estimation of the species phylogeny and key population parameters challenging. Here, we show that by accounting for phasing and using full-likelihood methods, introgression histories and population parameters can be estimated reliably from whole-genome sequence data. We employ the multispecies coalescent (MSC) model with and without gene flow to infer the species phylogeny and cross-species introgression events using genomic data from six members of the erato-sara clade of Heliconius butterflies. The methods naturally accommodate random fluctuations in genealogical history across the genome due to deep coalescence. To avoid heterozygote phasing errors in haploid sequences commonly produced by genome assembly methods, we process and compile unphased diploid sequence alignments and use analytical methods to average over uncertainties in heterozygote phase resolution. There is robust evidence for introgression across the genome, both among distantly related species deep in the phylogeny and between sister species in shallow parts of the tree. We obtain chromosome-specific estimates of key population parameters such as introgression directions, times and probabilities, as well as species divergence times and population sizes for modern and ancestral species. We confirm ancestral gene flow between the sara clade and an ancestral population of Heliconius telesiphe, a likely hybrid speciation origin for Heliconius hecalesia, and gene flow between the sister species Heliconius erato and Heliconius himera. Inferred introgression among ancestral species also explains the history of two chromosomal inversions deep in the phylogeny of the group. 
This study illustrates how a full-likelihood approach based on the MSC makes it possible to extract rich historical information of species divergence and gene flow from genomic data. [3s; bpp; gene flow; Heliconius; hybrid speciation; introgression; inversion; multispecies coalescent].


Subject(s)
Butterflies , Animals , Butterflies/genetics , Genomics , Hybridization, Genetic , Likelihood Functions , Phylogeny
6.
Sci Rep ; 12(1): 1565, 2022 01 28.
Article in English | MEDLINE | ID: mdl-35091638

ABSTRACT

Mycobacterium tuberculosis (Mtb) lineage 1 (L1) contributes considerably to the disease morbidity. While whole genome sequencing (WGS) is increasingly used for studying Mtb, our understanding of genetic diversity of L1 remains limited. Using phylogenetic analysis of WGS data from endemic range in Asia and Africa, we provide an improved genotyping scheme for L1. Mapping deletion patterns of the 68 direct variable repeats (DVRs) in the CRISPR region of the genome onto the phylogeny provided supporting evidence that the CRISPR region evolves primarily by deletion, and hinted at a possible Southeast Asian origin of L1. Both phylogeny and DVR patterns clarified some relationships between different spoligotypes, and highlighted the limited resolution of spoligotyping. We identified a diverse repertoire of drug resistance mutations. Altogether, this study demonstrates the usefulness of WGS data for understanding the genetic diversity of L1, with implications for public health surveillance and TB control. It also highlights the need for more WGS studies in high-burden but underexplored regions.


Subject(s)
Mycobacterium tuberculosis
7.
Microb Genom ; 7(11), 2021 11.
Article in English | MEDLINE | ID: mdl-34787541

ABSTRACT

Mycobacterium tuberculosis (Mtb) lineage 2 (L2) strains are present globally, contributing to a widespread tuberculosis (TB) burden, particularly in Asia where both prevalence of TB and numbers of drug resistant TB are highest. The increasing availability of whole-genome sequencing (WGS) data worldwide provides an opportunity to improve our understanding of the global genetic diversity of Mtb L2 and its association with the disease epidemiology and pathogenesis. However, existing L2 sublineage classification schemes leave >20 % of the Modern Beijing isolates unclassified. Here, we present a revised SNP-based classification scheme of L2 in a genomic framework based on phylogenetic analysis of >4000 L2 isolates from 34 countries in Asia, Eastern Europe, Oceania and Africa. Our scheme consists of over 30 genotypes, many of which have not been described before. In particular, we propose six main genotypes of Modern Beijing strains, denoted L2.2.M1-L2.2.M6. We also provide SNP markers for genotyping L2 strains from WGS data. This fine-scale genotyping scheme, which can classify >98 % of the studied isolates, serves as a basis for more effective monitoring and reporting of transmission and outbreaks, as well as improving genotype-phenotype associations such as disease severity and drug resistance. This article contains data hosted by Microreact.


Subject(s)
Mycobacterium tuberculosis , Tuberculosis, Multidrug-Resistant , Tuberculosis , Genotype , Humans , Phylogeny , Tuberculosis/microbiology , Tuberculosis, Multidrug-Resistant/epidemiology
8.
Infect Genet Evol ; 91: 104802, 2021 07.
Article in English | MEDLINE | ID: mdl-33684570

ABSTRACT

Tuberculosis is still problematic as it affects large numbers of people globally. Mycobacterium tuberculosis Lineage 1 (L1) or Indo Oceanic Lineage, one of widespread major lineages, has a specific geographic distribution and high mortality. It is highly diverse and endemic in several high burden countries. However, studies on the global burden of L1 and its sublineages remain limited. This may lead to the underestimation of the importance of its variance in developing and applying tuberculosis control measures. This study aimed to estimate the number of patients infected with M. tuberculosis L1 and its sublineages worldwide. The proportion of L1 among tuberculosis patients was searched in published reports from countries around the world and the number of patients was calculated based on a WHO report on country incidences and populations. The numbers of patients infected with the five major sublineages, namely L1.1.1, L1.1.2, L1.1.3, L1.2.1, and L1.2.2 were estimated where information was available. It was found that L1 accounted for 28% of global tuberculosis cases in 2012 and 2018. Over 80% of the L1 global burden was in India, the Philippines, Indonesia and Bangladesh, which are also among the countries with highest absolute numbers of tuberculosis patients in the world. Globally, the estimated number of patients infected with M. tuberculosis L1.2.1 and L1.1.2 was over 1.1 million and of patients infected with L1.1.1 was about 200,000. This study demonstrated that L1 contributes significantly to the global burden of tuberculosis. To achieve the End TB Strategy, more attention needs to be paid to the responses of M. tuberculosis L1 to various control measures.


Subject(s)
Global Burden of Disease , Mycobacterium tuberculosis/physiology , Tuberculosis/epidemiology , Humans , Incidence , Mycobacterium tuberculosis/classification , Tuberculosis/microbiology
9.
Transbound Emerg Dis ; 68(2): 435-444, 2021 Mar.
Article in English | MEDLINE | ID: mdl-32578388

ABSTRACT

Tilapia lake virus (TiLV) is an emerging virus that is rapidly spreading across the world. Over the past 6 years (2014-2020), TiLV outbreaks had been reported in at least 16 countries, spanning three continents, including Asia, Africa, and America. Despite its enormous economic impact, its origin, evolution and epidemiology are still largely poorly characterized. Here, we report eight TiLV whole-genome sequences from Thailand sampled between 2014 and 2019. Together with publicly available sequences from various regions of the world, we estimated the origin of TiLV to be between 2003 and 2009, 5-10 years before the first report of the virus in Israel in 2014. Our analyses consistently showed that TiLV started to spread in 2000s, and reached its peak in 2014-2016, matching well with the timing of its first report. From 2016 onwards, the global TiLV population declined steadily. This could be a result of herd immunity building up in the fish population, and/or a reflection of a better awareness of the virus coupled with a better and more cautious protocol of Tilapia importation. Despite the fact that we included all publicly available sequences, our analyses revealed long unsampled histories of TiLVs in many countries, especially towards its basal diversification. This result highlights the lack and the need for systematic surveillance of TiLV in fish.


Subject(s)
Fish Diseases/virology , Orthomyxoviridae Infections/veterinary , Orthomyxoviridae/genetics , Tilapia/virology , Animals , Fish Diseases/epidemiology , Genome, Viral , Genomics , Lakes , Orthomyxoviridae Infections/virology
10.
Sci Rep ; 9(1): 13718, 2019 09 23.
Article in English | MEDLINE | ID: mdl-31548561

ABSTRACT

Global Mycobacterium tuberculosis population comprises 7 major lineages. The Beijing strains, particularly the ones classified as Modern groups, have been found worldwide, frequently associated with drug resistance, younger ages, outbreaks and appear to be expanding. Here, we report analysis of whole genome sequences of 1170 M. tuberculosis isolates together with their patient profiles. Our samples belonged to Lineage 1-4 (L1-L4) with those of L1 and L2 being equally dominant. Phylogenetic analysis revealed several new or rare sublineages. Differential associations between sublineages of M. tuberculosis and patient profiles, including ages, ethnicity, HIV (human immunodeficiency virus) infection and drug resistance were demonstrated. The Ancestral Beijing strains and some sublineages of L4 were associated with ethnic minorities while L1 was more common in Thais. L2.2.1.Ancestral 4 surprisingly had a mutation that is typical of the Modern Beijing sublineages and was common in Akha and Lahu tribes who have migrated from Southern China in the last century. This may indicate that the evolutionary transition from the Ancestral to Modern Beijing sublineages might be gradual and occur in Southern China, where the presence of multiple ethnic groups might have allowed for the circulations of various co-evolving sublineages which ultimately lead to the emergence of the Modern Beijing strains.


Subject(s)
Biological Evolution , Mycobacterium tuberculosis/genetics , Phylogeny , Tuberculosis, Multidrug-Resistant/microbiology , Tuberculosis, Pulmonary/microbiology , Adult , Aged , Beijing , China , Drug Resistance, Multiple, Bacterial/genetics , Female , Humans , Male , Middle Aged , Mycobacterium tuberculosis/isolation & purification , Whole Genome Sequencing , Young Adult
11.
Mol Biol Evol ; 35(10): 2512-2527, 2018 10 01.
Article in English | MEDLINE | ID: mdl-30102363

ABSTRACT

Deep coalescence and introgression make it challenging to infer phylogenetic relationships among closely related species that arose through radiative speciation events. Despite numerous phylogenetic analyses and the availability of whole genomes, the phylogeny in the Anopheles gambiae species complex has not been confidently resolved. Here we extract over 80,000 coding and noncoding short segments (called loci) from the genomes of six members of the species complex and use a Bayesian method under the multispecies coalescent model to infer the species tree, which takes into account genealogical heterogeneity across the genome and uncertainty in the gene trees. We obtained a robust estimate of the species tree from the distal region of the X chromosome: (A. merus, ((A. melas, (A. arabiensis, A. quadriannulatus)), (A. gambiae, A. coluzzii))), with A. merus to be the earliest branching species. This species tree agrees with the chromosome inversion phylogeny and provides a parsimonious interpretation of inversion and introgression events. Simulation informed by the real data suggest that the coalescent approach is reliable while the sliding-window analysis used in a previous phylogenomic study generates artifactual species trees. Likelihood ratio test of gene flow revealed strong evidence of autosomal introgression from A. arabiensis into A. gambiae (at the average rate of ∼0.2 migrants per generation), but not in the opposite direction, and introgression of the 3L chromosomal region from A. merus into A. quadriannulatus. Our results highlight the importance of accommodating incomplete lineage sorting and introgression in phylogenomic analyses of species that arose through recent radiative speciation events.


Subject(s)
Anopheles/genetics , Phylogeny , Animal Migration , Animals , Evolution, Molecular , Genome, Insect , Hybridization, Genetic , X Chromosome
12.
Curr Biol ; 25(22): 2939-50, 2015 Nov 16.
Article in English | MEDLINE | ID: mdl-26603774

ABSTRACT

The timing of divergences among metazoan lineages is integral to understanding the processes of animal evolution, placing the biological events of species divergences into the correct geological timeframe. Recent fossil discoveries and molecular clock dating studies have suggested a divergence of bilaterian phyla >100 million years before the Cambrian, when the first definite crown-bilaterian fossils occur. Most previous molecular clock dating studies, however, have suffered from limited data and biases in methodologies, and virtually all have failed to acknowledge the large uncertainties associated with the fossil record of early animals, leading to inconsistent estimates among studies. Here we use an unprecedented amount of molecular data, combined with four fossil calibration strategies (reflecting disparate and controversial interpretations of the metazoan fossil record) to obtain Bayesian estimates of metazoan divergence times. Our results indicate that the uncertain nature of ancient fossils and violations of the molecular clock impose a limit on the precision that can be achieved in estimates of ancient molecular timescales. For example, although we can assert that crown Metazoa originated during the Cryogenian (with most crown-bilaterian phyla diversifying during the Ediacaran), it is not possible with current data to pinpoint the divergence events with sufficient accuracy to test for correlations between geological and biological events in the history of animals. Although a Cryogenian origin of crown Metazoa agrees with current geological interpretations, the divergence dates of the bilaterians remain controversial. Thus, attempts to build evolutionary narratives of early animal evolution based on molecular clock timescales appear to be premature.


Subject(s)
Biological Evolution , Evolution, Molecular , Sequence Analysis, DNA/methods , Animals , Bayes Theorem , Fossils , Humans , Models, Genetic , Phylogeny , Uncertainty