1.
Nature ; 602(7896): 263-267, 2022 02.
Article in English | MEDLINE | ID: mdl-34937052

ABSTRACT

High-throughput sequencing projects generate genome-scale sequence data for species-level phylogenies [1-3]. However, state-of-the-art Bayesian methods for inferring timetrees are computationally limited to small datasets and cannot exploit the growing number of available genomes [4]. In the case of mammals, molecular-clock analyses of limited datasets have produced conflicting estimates of clade ages with large uncertainties [5,6], and thus the timescale of placental mammal evolution remains contentious [7-10]. Here we develop a Bayesian molecular-clock dating approach to estimate a timetree of 4,705 mammal species integrating information from 72 mammal genomes. We show that increasingly larger phylogenomic datasets produce diversification time estimates with progressively smaller uncertainties, facilitating precise tests of macroevolutionary hypotheses. For example, we confidently reject an explosive model of placental mammal origination in the Palaeogene [8] and show that crown Placentalia originated in the Late Cretaceous with unambiguous ordinal diversification in the Palaeocene/Eocene. Our Bayesian methodology facilitates analysis of complete genomes and thousands of species within an integrated framework, making it possible to address hitherto intractable research questions on species diversifications. This approach can be used to address other contentious cases of animal and plant diversifications that require analysis of species-level phylogenomic datasets.


Subject(s)
Molecular Evolution, Mammals, Phylogeny, Animals, Bayes Theorem, Eutheria/classification, Eutheria/genetics, Female, Mammals/classification, Mammals/genetics, Placenta, Pregnancy, Species Specificity
2.
Mol Biol Evol ; 39(1)2022 01 07.
Article in English | MEDLINE | ID: mdl-34694387

ABSTRACT

We use first principles of population genetics to model the evolution of proteins under persistent positive selection (PPS). PPS may occur when organisms are subjected to persistent environmental change, during adaptive radiations, or in host-pathogen interactions. Our mutation-selection model indicates protein evolution under PPS is an irreversible Markov process, and thus proteins under PPS show a strongly asymmetrical distribution of selection coefficients among amino acid substitutions. Our model shows the criterion ω > 1 (where ω is the ratio of nonsynonymous over synonymous codon substitution rates) to detect positive selection is conservative and indeed arbitrary, because in real proteins many mutations are highly deleterious and are removed by selection even at positively selected sites. We use a penalized-likelihood implementation of the PPS model to successfully detect PPS in plant RuBisCO and influenza HA proteins. By directly estimating selection coefficients at protein sites, our inference procedure bypasses the need for using ω as a surrogate measure of selection and improves our ability to detect molecular adaptation in proteins.
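
The remark that ω > 1 is a conservative criterion can be illustrated with a minimal sketch. This is not the authors' PPS model: it assumes the standard weak-mutation relative fixation rate h(S) = S / (1 - e^(-S)) for a mutation with scaled selection coefficient S, treats synonymous mutations as neutral (relative rate 1), and takes ω at a site as the mean of h(S) over equally likely nonsynonymous mutations. The function names and the example coefficients are hypothetical.

```python
import math


def relative_fixation_rate(S: float) -> float:
    """Fixation rate relative to a neutral mutation: h(S) = S / (1 - exp(-S))."""
    if abs(S) < 1e-8:
        return 1.0  # neutral limit of h(S) as S -> 0
    return S / (1.0 - math.exp(-S))


def site_omega(selection_coefficients):
    """Mean relative rate over nonsynonymous mutations, assumed equally likely."""
    rates = [relative_fixation_rate(S) for S in selection_coefficients]
    return sum(rates) / len(rates)


# Hypothetical site: 2 advantageous mutations and 17 strongly deleterious ones.
coefficients = [3.0, 3.0] + [-10.0] * 17
omega = site_omega(coefficients)
print(f"omega = {omega:.3f}")
```

In this toy site the two advantageous mutations fix faster than neutral ones, yet the site-wide ω stays below 1 because the deleterious mutations dominate the average. A test based only on ω > 1 would therefore miss the site.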


Subject(s)
Genetic Models, Genetic Selection, Amino Acid Substitution, Codon, Molecular Evolution, Mutation
3.
Proc Natl Acad Sci U S A ; 116(12): 5693-5698, 2019 03 19.
Article in English | MEDLINE | ID: mdl-30819890

ABSTRACT

Recent sequencing efforts have led to estimates of human cytomegalovirus (HCMV) genome-wide intrahost diversity that rival those of persistent RNA viruses [Renzette N, Bhattacharjee B, Jensen JD, Gibson L, Kowalik TF (2011) PLoS Pathog 7:e1001344]. Here, we deep sequence HCMV genomes recovered from single and longitudinally collected blood samples from immunocompromised children to show that the observations of high within-host HCMV nucleotide diversity are explained by the frequent occurrence of mixed infections caused by genetically distant strains. To confirm this finding, we reconstructed within-host viral haplotypes from short-read sequence data. We verify that within-host HCMV nucleotide diversity in unmixed infections is no greater than that of other DNA viruses analyzed by the same sequencing and bioinformatic methods and considerably less than that of human immunodeficiency and hepatitis C viruses. By resolving individual viral haplotypes within patients, we reconstruct the timing, likely origins, and natural history of superinfecting strains. We uncover evidence for within-host recombination between genetically distinct HCMV strains, observing the loss of the parental virus containing the nonrecombinant fragment. The data suggest selection for strains containing the recombinant fragment, generating testable hypotheses about HCMV evolution and pathogenesis. These results highlight that high HCMV diversity present in some samples is caused by coinfection with multiple distinct strains and provide reassurance that within-host diversity for single-strain HCMV infections is no greater than for other herpesviruses.


Subject(s)
Cytomegalovirus/genetics, Genetic Recombination/genetics, Superinfection/genetics, Base Sequence/genetics, Child, Preschool Child, Cytomegalovirus Infections/virology, Viral DNA/genetics, Female, Genetic Variation/genetics, Human Genome/genetics, Viral Genome, Haplotypes/genetics, High-Throughput Nucleotide Sequencing/methods, Humans, Immunocompromised Host/genetics, Infant, Newborn Infant, Male, DNA Sequence Analysis/methods
4.
BMC Bioinformatics ; 22(1): 285, 2021 May 28.
Article in English | MEDLINE | ID: mdl-34049487

ABSTRACT

BACKGROUND: Many important applications in bioinformatics, including sequence alignment and protein family profiling, employ sequence weighting schemes to mitigate the effects of non-independence of homologous sequences and under- or over-representation of certain taxa in a dataset. These schemes aim to assign high weights to sequences that are 'novel' compared to the others in the same dataset, and low weights to sequences that are over-represented. RESULTS: We formalise this principle by rigorously defining the evolutionary 'novelty' of a sequence within an alignment. This results in new sequence weights that we call 'phylogenetic novelty scores'. These scores have various desirable properties, and we showcase their use by considering, as an example application, the inference of character frequencies at an alignment column, which is important, for example, in protein family profiling. We give computationally efficient algorithms for calculating our scores and, using simulations, show that they are versatile and can improve the accuracy of character frequency estimation compared to existing sequence weighting schemes. CONCLUSIONS: Our phylogenetic novelty scores can be useful when an evolutionarily meaningful system for adjusting for uneven taxon sampling is desired. They have numerous possible applications, including estimation of evolutionary conservation scores and sequence logos, identification of targets in conservation biology, and improving and measuring sequence alignment accuracy.
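
To make the idea of sequence weighting concrete, here is a minimal sketch of one classical existing scheme, the position-based weights of Henikoff and Henikoff. It is not the phylogenetic novelty score described in this abstract, which is defined on a phylogeny. The toy alignment and function names are hypothetical, and gap characters are treated like any other character for simplicity.

```python
from collections import Counter


def henikoff_weights(alignment):
    """Position-based sequence weights, normalised to sum to 1.

    For each column, a sequence receives 1 / (k * n), where k is the number
    of distinct characters in the column and n is the number of sequences
    sharing this sequence's character.
    """
    weights = [0.0] * len(alignment)
    for column in zip(*alignment):
        counts = Counter(column)
        k = len(counts)
        for i, char in enumerate(column):
            weights[i] += 1.0 / (k * counts[char])
    total = sum(weights)
    return [w / total for w in weights]


def weighted_frequencies(alignment, column_index, weights):
    """Weighted character frequencies at one alignment column."""
    freqs = {}
    for seq, w in zip(alignment, weights):
        char = seq[column_index]
        freqs[char] = freqs.get(char, 0.0) + w
    return freqs


# Toy alignment: three identical (over-represented) sequences and one divergent one.
alignment = ["ACGT", "ACGT", "ACGT", "TCGA"]
weights = henikoff_weights(alignment)
freqs = weighted_frequencies(alignment, 0, weights)
print(weights)
print(freqs)
```

The divergent sequence receives a larger weight than each of the three identical ones, so the estimated frequency of its character at the first column rises above the unweighted value of 0.25.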


Subject(s)
Algorithms, Computational Biology, Phylogeny, Sequence Alignment
5.
Nature ; 513(7518): 422-425, 2014 Sep 18.
Article in English | MEDLINE | ID: mdl-25043003

ABSTRACT

The somatic mutations present in the genome of a cell accumulate over the lifetime of a multicellular organism. These mutations can provide insights into the developmental lineage tree, the number of divisions that each cell has undergone and the mutational processes that have been operative. Here we describe whole genomes of clonal lines derived from multiple tissues of healthy mice. Using somatic base substitutions, we reconstructed the early cell divisions of each animal, demonstrating the contributions of embryonic cells to adult tissues. Differences were observed between tissues in the numbers and types of mutations accumulated by each cell, which likely reflect differences in the number of cell divisions they have undergone and varying contributions of different mutational processes. If somatic mutation rates are similar to those in mice, the results indicate that precise insights into development and mutagenesis of normal human cells will be possible.


Subject(s)
Cell Lineage/genetics, Clone Cells/cytology, Clone Cells/metabolism, Genome/genetics, Mutagenesis/genetics, Mutation/genetics, Animals, Biological Clocks/genetics, Cell Division, Cultured Cells, Mammalian Embryo/cytology, Humans, Male, Mice, Inbred C57BL Mice, Mutation Rate, Organoids/cytology, Organoids/metabolism, Phylogeny, DNA Sequence Analysis, Tail/cytology
6.
Mol Biol Evol ; 35(7): 1783-1797, 2018 07 01.
Article in English | MEDLINE | ID: mdl-29618097

ABSTRACT

Accurate reconstruction of ancestral states is a critical evolutionary analysis when studying ancient proteins and comparing biochemical properties between parental or extinct species and their extant relatives. It relies on multiple sequence alignment (MSA) which may introduce biases, and it remains unknown how MSA methodological approaches impact ancestral sequence reconstruction (ASR). Here, we investigate how MSA methodology modulates ASR using a simulation study of various evolutionary scenarios. We evaluate the accuracy of ancestral protein sequence reconstruction for simulated data and compare reconstruction outcomes using different alignment methods. Our results reveal biases introduced not only by aligner algorithms and assumptions, but also tree topology and the rate of insertions and deletions. Under many conditions we find no substantial differences between the MSAs. However, increasing the difficulty for the aligners can significantly impact ASR. The MAFFT consistency aligners and PRANK variants exhibit the best performance, whereas FSA displays limited performance. We also discover a bias towards reconstructed sequences longer than the true ancestors, deriving from a preference for inferring insertions, in almost all MSA methodological approaches. In addition, we find measures of MSA quality generally correlate highly with reconstruction accuracy. Thus, we show MSA methodological differences can affect the quality of reconstructions and propose MSA methods should be selected with care to accurately determine ancestral states with confidence.


Subject(s)
Genetic Techniques, Sequence Alignment
7.
Proc Biol Sci ; 286(1898): 20182418, 2019 03 13.
Article in English | MEDLINE | ID: mdl-30836875

ABSTRACT

Resolving the timing and pattern of early placental mammal evolution has been confounded by conflict among divergence date estimates from interpretation of the fossil record and from molecular-clock dating studies. Despite both fossil occurrences and molecular sequences favouring a Cretaceous origin for Placentalia, no unambiguous Cretaceous placental mammal has been discovered. Investigating the differing patterns of evolution in morphological and molecular data reveals a possible explanation for this conflict. Here, we quantified the relationship between morphological and molecular rates of evolution. We show that, independent of divergence dates, morphological rates of evolution were slow relative to molecular evolution during the initial divergence of Placentalia, but substantially increased during the origination of the extant orders. The rapid radiation of placentals into a highly morphologically disparate Cenozoic fauna is thus not associated with the origin of Placentalia, but post-dates superordinal origins. These findings predict that early members of major placental groups may not be easily distinguishable from one another or from stem eutherians on the basis of skeleto-dental morphology. This result supports a Late Cretaceous origin of crown placentals with an ordinal-level adaptive radiation in the early Paleocene, with the high relative rate permitting rapid anatomical change without requiring unreasonably fast molecular evolutionary rates. The lack of definitive Cretaceous placental mammals may be a result of morphological similarity among stem and early crown eutherians, providing an avenue for reconciling the fossil record with molecular divergence estimates for Placentalia.


Subject(s)
Biological Evolution, Eutheria/anatomy & histology, Phylogeny, Animals, Eutheria/classification, Molecular Evolution
8.
Nucleic Acids Res ; 44(8): e77, 2016 05 05.
Article in English | MEDLINE | ID: mdl-26819408

ABSTRACT

Sequence logos and their variants are the most commonly used methods for visualization of multiple sequence alignments (MSAs) and sequence motifs. They provide consensus-based summaries of the sequences in the alignment. Consequently, individual sequences cannot be identified in the visualization and covariant sites are not easily discernible. We recently proposed Sequence Bundles, a motif visualization technique that maintains a one-to-one relationship between sequences and their graphical representation and visualizes covariant sites. We here present Alvis, an open-source platform for the joint explorative analysis of MSAs and phylogenetic trees, employing Sequence Bundles as its main visualization method. Alvis combines the power of the visualization method with an interactive toolkit allowing detection of covariant sites, annotation of trees with synapomorphies and homoplasies, and motif detection. It also offers numerical analysis functionality, such as dimension reduction and classification. Alvis is user-friendly, highly customizable and can export results in publication-quality figures. It is available as a full-featured standalone version (http://www.bitbucket.org/rfs/alvis) and its Sequence Bundles visualization module is further available as a web application (http://science-practice.com/projects/sequence-bundles).


Subject(s)
Base Sequence/genetics, Computational Biology/methods, Sequence Alignment/methods, DNA Sequence Analysis/methods
9.
Lancet Glob Health ; 12(6): e1027-e1037, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38762283

ABSTRACT

BACKGROUND: Medical consumable stock-outs negatively affect health outcomes not only by impeding or delaying the effective delivery of services but also by discouraging patients from seeking care. Consequently, supply chain strengthening is being adopted as a key component of national health strategies. However, evidence on the factors associated with increased consumable availability is limited. METHODS: In this study, we used the 2018-19 Harmonised Health Facility Assessment data from Malawi to identify the factors associated with the availability of consumables in level 1 facilities, ie, rural hospitals or health centres with a small number of beds and a sparsely equipped operating room for minor procedures. We estimate a multilevel logistic regression model with a binary outcome variable representing consumable availability (of 130 consumables across 940 facilities) and explanatory variables chosen based on current evidence. Further subgroup analyses are carried out to assess the presence of effect modification by level of care, facility ownership, and a categorisation of consumables by public health or disease programme, Malawi's Essential Medicine List classification, whether the consumable is a drug or not, and level of average national availability. 
FINDINGS: Our results suggest that the following characteristics had a positive association with consumable availability: level 1b facilities or community hospitals had 64% (odds ratio [OR] 1·64, 95% CI 1·37-1·97) higher odds of consumable availability than level 1a facilities or health centres, Christian Health Association of Malawi and private-for-profit ownership had 63% (1·63, 1·40-1·89) and 49% (1·49, 1·24-1·80) higher odds respectively than government-owned facilities, the availability of a computer had 46% (1·46, 1·32-1·62) higher odds than in its absence, pharmacists managing drug orders had 85% (1·85, 1·40-2·44) higher odds than a drug store clerk, proximity to the corresponding regional administrative office (facilities greater than 75 km away had 21% lower odds [0·79, 0·63-0·98] than facilities within 10 km of the district health office), and having three drug order fulfilments in the 3 months before the survey had 14% (1·14, 1·02-1·27) higher odds than one fulfilment in 3 months. Further, consumables categorised as vital in Malawi's Essential Medicine List performed considerably better with 235% (OR 3·35, 95% CI 1·60-7·05) higher odds than other essential or non-essential consumables and drugs performed worse with 79% (0·21, 0·08-0·51) lower odds than other medical consumables in terms of availability across facilities. INTERPRETATION: Our results provide evidence on the areas of intervention with potential to improve consumable availability. Further exploration of the health and resource consequences of the strategies discussed will be useful in guiding investments into supply chain strengthening. FUNDING: UK Research and Innovation as part of the Global Challenges Research Fund (Thanzi La Onse; reference MR/P028004/1), the Wellcome Trust (Thanzi La Mawa; reference 223120/Z/21/Z), the UK Medical Research Council, the UK Department for International Development, and the EU (reference MR/R015600/1).
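
The percentages quoted in the findings follow from the odds ratios by simple arithmetic: an odds ratio of 1·64 corresponds to (1·64 - 1) × 100 = 64% higher odds, and 0·21 to 79% lower odds. The sketch below reproduces that conversion and also shows how an odds ratio maps onto a probability. The 50% baseline availability is a hypothetical value chosen only for illustration and is not taken from the study; the function names are likewise illustrative.

```python
def percent_change_in_odds(odds_ratio: float) -> float:
    """Percentage change in odds implied by an odds ratio."""
    return (odds_ratio - 1.0) * 100.0


def apply_odds_ratio(baseline_probability: float, odds_ratio: float) -> float:
    """Probability obtained after scaling the baseline odds by the odds ratio."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Odds ratios quoted in the abstract.
reported_odds_ratios = {
    "level 1b vs level 1a facility": 1.64,
    "pharmacist vs drug store clerk": 1.85,
    "vital vs other consumables": 3.35,
    "drugs vs other consumables": 0.21,
}

for label, odds_ratio in reported_odds_ratios.items():
    print(f"{label}: {percent_change_in_odds(odds_ratio):+.0f}% odds")

# Hypothetical baseline availability of 50%, for illustration only.
print(f"{apply_odds_ratio(0.5, 1.64):.3f}")
```

Higher odds are not the same as an equal rise in probability: with the assumed 50% baseline, 64% higher odds move the probability to roughly 62%, not to 82%.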


Subject(s)
Health Facilities, Malawi, Humans, Health Facilities/statistics & numerical data, Health Facilities/supply & distribution, Health Services Accessibility/statistics & numerical data, Equipment and Supplies/supply & distribution, Censuses
10.
Mol Biol Evol ; 28(6): 1755-67, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21109586

ABSTRACT

Four influenza pandemics have struck the human population during the last 100 years causing substantial morbidity and mortality. The pandemics were caused by the introduction of a new virus into the human population from an avian or swine host or through the mixing of virus segments from an animal host with a human virus to create a new reassortant subtype virus. Understanding which changes have contributed to the adaptation of the virus to the human host is essential in assessing the pandemic potential of current and future animal viruses. Here, we develop a measure of the level of adaptation of a given virus strain to a particular host. We show that adaptation to the human host has been gradual with a timescale of decades and that none of the virus proteins have yet achieved full adaptation to the selective constraints. When the measure is applied to historical data, our results indicate that the 1918 influenza virus had undergone a period of preadaptation prior to the 1918 pandemic. Yet, ancestral reconstruction of the avian virus that founded the classical swine and 1918 human influenza lineages shows no evidence that this virus was exceptionally preadapted to humans. These results indicate that adaptation to humans occurred following the initial host shift from birds to mammals, including a significant amount prior to 1918. The 2009 pandemic virus seems to have undergone preadaptation to human-like selective constraints during its period of circulation in swine. Ancestral reconstruction along the human virus tree indicates that mutations that have increased the adaptation of the virus have occurred preferentially along the trunk of the tree. The method should be helpful in assessing the potential of current viruses to found future epidemics or pandemics.


Subject(s)
Biological Adaptation, Host-Pathogen Interactions, Orthomyxoviridae Infections/immunology, Orthomyxoviridae/immunology, Algorithms, Animals, Birds, Genetic Databases, Dogs, Genetic Fitness, Host-Pathogen Interactions/immunology, Humans, Biological Models, Orthomyxoviridae Infections/epidemiology, Pandemics, Phylogeny, Viral Matrix Proteins/chemistry, Viral Matrix Proteins/genetics
11.
Virus Evol ; 8(2): veac093, 2022.
Article in English | MEDLINE | ID: mdl-36478783

ABSTRACT

Longitudinal deep sequencing of viruses can provide detailed information about intra-host evolutionary dynamics including how viruses interact with and transmit between hosts. Many analyses require haplotype reconstruction, identifying which variants are co-located on the same genomic element. Most current methods to perform this reconstruction are based on a high density of variants and cannot perform this reconstruction for slowly evolving viruses. We present a new approach, HaROLD (HAplotype Reconstruction Of Longitudinal Deep sequencing data), which performs this reconstruction based on identifying co-varying variant frequencies using a probabilistic framework. We illustrate HaROLD on both RNA and DNA viruses with synthetic Illumina paired read data created from mixed human cytomegalovirus (HCMV) and norovirus genomes, and clinical datasets of HCMV and norovirus samples, demonstrating high accuracy, especially when longitudinal samples are available.

12.
Bioinformatics ; 26(9): 1260-1, 2010 May 01.
Article in English | MEDLINE | ID: mdl-20299327

ABSTRACT

ArchSchema is a Java Web Start application that generates a dynamic 2D network of related Pfam domain architectures. Each node corresponds to a different architecture (shown as a sequence of coloured boxes) and indicates whether any 3D structural information is available in the PDB. Satellite nodes can show either the UniProt codes or the PDB codes of proteins having the given architecture. Search options allow search by UniProt code or Pfam domain identifier, and results can be filtered by domain, organism, or by selecting only proteins in the PDB. AVAILABILITY: ArchSchema can be freely accessed at http://www.ebi.ac.uk/Tools/archschema.


Subject(s)
Computational Biology/methods, Computer Graphics, Algorithms, Protein Databases, Humans, Markov Chains, Tertiary Protein Structure, Proto-Oncogene Proteins c-cbl/genetics, Signal Transduction, User-Computer Interface
13.
Wellcome Open Res ; 6: 261, 2021.
Article in English | MEDLINE | ID: mdl-35299708

ABSTRACT

Hundreds of different mathematical models have been proposed for describing electrophysiology of various cell types. These models are quite complex (nonlinear systems of typically tens of ODEs and sometimes hundreds of parameters) and software packages such as the Cancer, Heart and Soft Tissue Environment (Chaste) C++ library have been designed to run simulations with these models in isolation or coupled to form a tissue simulation. The complexity of many of these models makes sharing and translating them to new simulation environments difficult. CellML is an XML format that offers a widely-adopted solution to this problem. This paper specifically describes the capabilities of two new Python tools: the cellmlmanip library for reading and manipulating CellML models; and chaste_codegen, a CellML to C++ converter. These tools provide a Python 3 replacement for a previous Python 2 tool (called PyCML) and they also provide additional new features that this paper describes. Most notably, they can generate analytic Jacobians without the use of proprietary software, and also find singularities occurring in equations and automatically generate and apply linear approximations to prevent numerical problems at these points.

14.
PLoS Comput Biol ; 5(11): e1000564, 2009 Nov.
Article in English | MEDLINE | ID: mdl-19911053

ABSTRACT

The natural reservoir of Influenza A is waterfowl. Normally, waterfowl viruses are not adapted to infect and spread in the human population. Sometimes, through reassortment or through whole host shift events, genetic material from waterfowl viruses is introduced into the human population causing worldwide pandemics. Identifying which mutations allow viruses from avian origin to spread successfully in the human population is of great importance in predicting and controlling influenza pandemics. Here we describe a novel approach to identify such mutations. We use a sitewise non-homogeneous phylogenetic model that explicitly takes into account differences in the equilibrium frequencies of amino acids in different hosts and locations. We identify 172 amino acid sites with strong support and 518 sites with moderate support of different selection constraints in human and avian viruses. The sites that we identify provide an invaluable resource to experimental virologists studying adaptation of avian flu viruses to the human host. Identification of the sequence changes necessary for host shifts would help us predict the pandemic potential of various strains. The method is of broad applicability to investigating changes in selective constraints when the timing of the changes is known.


Subject(s)
Computational Biology/methods, Host-Pathogen Interactions/genetics, Influenza A Virus, Genetic Models, Genetic Selection, Animals, Anseriformes, Genetic Drift, Humans, Influenza A Virus/genetics, Influenza A Virus/pathogenicity, Phylogeny, Protein Sequence Analysis
15.
Genetics ; 197(1): 257-71, 2014 May.
Article in English | MEDLINE | ID: mdl-24532780

ABSTRACT

We develop a maximum penalized-likelihood (MPL) method to estimate the fitnesses of amino acids and the distribution of selection coefficients (S = 2Ns) in protein-coding genes from phylogenetic data. This improves on a previous maximum-likelihood method. Various penalty functions are used to penalize extreme estimates of the fitnesses, thus correcting overfitting by the previous method. Using a combination of computer simulation and real data analysis, we evaluate the effect of the various penalties on the estimation of the fitnesses and the distribution of S. We show the new method regularizes the estimates of the fitnesses for small, relatively uninformative data sets, but it can still recover the large proportion of deleterious mutations when present in simulated data. Computer simulations indicate that as the number of taxa in the phylogeny or the level of sequence divergence increases, the distribution of S can be more accurately estimated. Furthermore, the strength of the penalty can be varied to study how informative a particular data set is about the distribution of S. We analyze three protein-coding genes (the chloroplast rubisco protein, mammal mitochondrial proteins, and an influenza virus polymerase) and show the new method recovers a large proportion of deleterious mutations in these data, even under strong penalties, confirming the distribution of S is bimodal in these real data. We recommend the use of the new MPL approach for the estimation of the distribution of S in species phylogenies of protein-coding genes.


Subject(s)
Computer Simulation, Molecular Evolution, Phylogeny, Animals, Base Sequence, Genetic Fitness, Humans, Likelihood Functions, Mutation, Genetic Selection
16.
Nat Genet ; 45(5): 542-545, 2013 May.
Article in English | MEDLINE | ID: mdl-23563608

ABSTRACT

The blood group Vel was discovered 60 years ago, but the underlying gene is unknown. Individuals negative for the Vel antigen are rare and are required for the safe transfusion of patients with antibodies to Vel. To identify the responsible gene, we sequenced the exomes of five individuals negative for the Vel antigen and found that four were homozygous and one was heterozygous for a low-frequency 17-nucleotide frameshift deletion in the gene encoding the 78-amino-acid transmembrane protein SMIM1. A follow-up study showing that 59 of 64 Vel-negative individuals were homozygous for the same deletion and expression of the Vel antigen on SMIM1-transfected cells confirm SMIM1 as the gene underlying the Vel blood group. An expression quantitative trait locus (eQTL), the common SNP rs1175550, contributes to variable expression of the Vel antigen (P = 0.003) and influences the mean hemoglobin concentration of red blood cells (RBCs; P = 8.6 × 10⁻¹⁵). In vivo, zebrafish with smim1 knockdown showed a mild reduction in the number of RBCs, identifying SMIM1 as a new regulator of RBC formation. Our findings are of immediate relevance, as the homozygous presence of the deletion allows the unequivocal identification of Vel-negative blood donors.


Subject(s)
Blood Group Antigens/genetics, Erythrocyte Membrane/metabolism, Erythrocytes/immunology, Gene Deletion, Homozygote, Membrane Proteins/genetics, Quantitative Trait Loci, Alleles, Animals, Biomarkers/metabolism, Blood Group Antigens/immunology, Blood Group Antigens/metabolism, Electrophoretic Mobility Shift Assay, Erythrocytes/metabolism, Erythrocytes/pathology, Exome/genetics, Female, Gene Expression Profiling, Gene Regulatory Networks, Humans, Isoantibodies/immunology, Membrane Proteins/immunology, Membrane Proteins/metabolism, Molecular Sequence Data, Oligonucleotide Array Sequence Analysis, Pregnancy, Zebrafish/genetics
17.
Genetics ; 190(3): 1101-15, 2012 Mar.
Article in English | MEDLINE | ID: mdl-22209901

ABSTRACT

Estimation of the distribution of selection coefficients of mutations is a long-standing issue in molecular evolution. In addition to population-based methods, the distribution can be estimated from DNA sequence data by phylogenetic-based models. Previous models have generally found unimodal distributions where the probability mass is concentrated between mildly deleterious and nearly neutral mutations. Here we use a sitewise mutation-selection phylogenetic model to estimate the distribution of selection coefficients among novel and fixed mutations (substitutions) in a data set of 244 mammalian mitochondrial genomes and a set of 401 PB2 proteins from influenza. We find a bimodal distribution of selection coefficients for novel mutations in both the mitochondrial data set and for the influenza protein evolving in its natural reservoir, birds. Most of the mutations are strongly deleterious with the rest of the probability mass concentrated around mildly deleterious to neutral mutations. The distribution of the coefficients among substitutions is unimodal and symmetrical around nearly neutral substitutions for both data sets at adaptive equilibrium. About 0.5% of the nonsynonymous mutations and 14% of the nonsynonymous substitutions in the mitochondrial proteins are advantageous, with 0.5% and 24% observed for the influenza protein. Following a host shift of influenza from birds to humans, however, we find among novel mutations in PB2 a trimodal distribution with a small mode of advantageous mutations.
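
The distinction between selection coefficients among novel mutations and among substitutions can be illustrated with a minimal sketch. It is not the authors' sitewise phylogenetic model: it assumes a single site at equilibrium, equal mutation rates between all states, scaled selection coefficients S = F_j - F_i, equilibrium frequencies proportional to exp(F_i), and the relative fixation rate h(S) = S / (1 - e^(-S)). The fitness values and function names are hypothetical.

```python
import math


def relative_fixation_rate(S: float) -> float:
    """Fixation rate relative to a neutral mutation: h(S) = S / (1 - exp(-S))."""
    if abs(S) < 1e-8:
        return 1.0  # neutral limit
    return S / (1.0 - math.exp(-S))


def selection_distributions(fitnesses):
    """Return weighted (S, weight) pairs for novel mutations and for substitutions."""
    z = sum(math.exp(f) for f in fitnesses)
    pi = [math.exp(f) / z for f in fitnesses]  # equilibrium state frequencies
    mutations, substitutions = [], []
    for i, f_i in enumerate(fitnesses):
        for j, f_j in enumerate(fitnesses):
            if i == j:
                continue
            S = f_j - f_i
            mutations.append((S, pi[i]))  # mutations arise in proportion to pi
            substitutions.append((S, pi[i] * relative_fixation_rate(S)))
    return mutations, substitutions


def proportion(pairs, predicate):
    """Weighted proportion of entries whose S satisfies the predicate."""
    total = sum(w for _, w in pairs)
    return sum(w for S, w in pairs if predicate(S)) / total


# Hypothetical scaled fitnesses: two well-tolerated states and three poor ones.
fitnesses = [0.0, -0.5, -8.0, -9.0, -10.0]
mutations, substitutions = selection_distributions(fitnesses)

adv_mut = proportion(mutations, lambda S: S > 0)
del_mut = proportion(mutations, lambda S: S < -5)
adv_sub = proportion(substitutions, lambda S: S > 0)
print(adv_mut, del_mut, adv_sub)
```

Under these assumptions most novel mutations are strongly deleterious, while the substitution distribution is symmetric about S = 0: every substitution from state i to j is balanced at equilibrium by the reverse substitution. The sketch is qualitative only and does not reproduce the percentages reported in the abstract, which depend on the authors' fitted model and their definition of advantageous.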


Subject(s)
Genetic Models, Mutation, Phylogeny, Genetic Selection, Algorithms, Animals, Computer Simulation, Molecular Evolution, Genetic Drift, Humans, Reproducibility of Results