ABSTRACT
The increase in brain neuron number relative to brain size is currently considered to be the major evolutionary path to high cognitive power in amniotes. However, how changes in neuron density contributed to the evolution of the information-processing capacity of the brain remains unanswered. High neuron densities are seen as the main reason why the fovea, located at the visual center of the retina, is responsible for sharp vision in birds and primates. The emergence of foveal vision is considered a breakthrough innovation in visual system evolution. We found that neuron densities in the largest visual center of the midbrain, i.e., the optic tectum, are two to four times higher in modern birds with one or two foveae than in birds lacking this specialization. Interspecies comparisons enabled us to identify elements of a hitherto unknown developmental process used by foveate birds to increase neuron density in the upper layers of their optic tectum. The late progenitor cells that generate these neurons proliferate in a ventricular zone that can expand only radially. In this particular context, the number of cells in ontogenetic columns increases, thereby setting the conditions for higher cell densities in the upper layers once the neurons have migrated.
Subject(s)
Columbidae, Retina, Animals, Retina/physiology, Neurons, Superior Colliculi, Morphogenesis
ABSTRACT
Patterns of molecular coevolution can reveal structural and functional constraints within or among organic molecules. These patterns are better understood when considering the underlying evolutionary process, which enables us to disentangle the signal of the dependent evolution of sites (coevolution) from the effects of shared ancestry of genes. Conversely, disregarding the dependent evolution of sites when studying the history of genes negatively impacts the accuracy of the inferred phylogenetic trees. Although molecular coevolution and phylogenetic history are interdependent, analyses of the two processes are conducted separately, a choice dictated by computational convenience, but at the expense of accuracy. We present a Bayesian method and associated software to infer how many and which sites of an alignment evolve according to an independent or a pairwise dependent evolutionary process, and to simultaneously estimate the phylogenetic relationships among sequences. We validate our method on synthetic datasets and challenge our predictions of coevolution on the 16S rRNA molecule by comparing them with its known molecular structure. Finally, we assess the accuracy of phylogenetic trees inferred under the assumption of independence among sites using synthetic datasets, the 16S rRNA molecule and 10 additional alignments of protein-coding genes of eukaryotes. Our results demonstrate that inferring phylogenetic trees while accounting for dependent site evolution significantly impacts the estimates of the phylogeny and the evolutionary process.
Subject(s)
Molecular Evolution, Phylogeny, Bayes Theorem, Genetic Models, 16S Ribosomal RNA/chemistry, 16S Ribosomal RNA/genetics, Reproducibility of Results, Software
ABSTRACT
The study of molecular coevolution, due to its potential to identify gene regions under functional or structural constraints, has recently been the subject of numerous scientific inquiries. Particular effort has been devoted to developing methods that predict the presence of coevolution in molecular sequences. Among these methods, a few aim to model the underlying evolutionary process of coevolution, which enables them to distinguish the shared history of genes from coevolution and thus improves their accuracy. However, such methods remain rarely used because of their high computational cost and the lack of resources alleviating this issue. Here we present CoevDB (http://phylodb.unil.ch/CoevDB), a database containing the results of a large-scale analysis of intramolecular coevolution of 8201 protein-coding genes of bony vertebrates. The web interface of CoevDB gives access to the results of 800 million statistical tests corresponding to all the pairs of sites analyzed. Several types of queries enable users to explore the database, either by targeting specific genes or by discovering genes with promising estimates of coevolution.
Subject(s)
Computational Biology/methods, Genetic Databases, Molecular Evolution, Open Reading Frames, Vertebrates/genetics, Animals, Phylogeny, Software, User-Computer Interface, Vertebrates/classification
ABSTRACT
Amino-acid coevolution refers to compensatory mutational patterns that preserve the function of a protein. Viral envelope glycoproteins, which mediate entry of enveloped viruses into their host cells, are shaped by coevolution signals that give viruses the plasticity to evade neutralizing antibodies without altering viral entry mechanisms. The functions and structures of the two envelope glycoproteins of the Hepatitis C Virus (HCV), E1 and E2, are poorly described. In particular, how these two proteins mediate the HCV fusion process between the viral and the cell membrane remains elusive. Here, as a proof of concept, we aimed to take advantage of a recently developed coevolution method to shed light on the HCV fusion mechanism. When first applied to the well-characterized Dengue Virus (DENV) envelope glycoproteins, coevolution analysis was able to predict important structural features and rearrangements of these viral protein complexes. When applied to HCV E1E2, computational coevolution analysis predicted that E1 and E2 refold interdependently during fusion through rearrangements of the E2 Back Layer (BL). Consistently, a soluble BL-derived polypeptide inhibited HCV infection of hepatoma cell lines, primary human hepatocytes and humanized liver mice. We showed that this polypeptide specifically inhibited HCV fusogenic rearrangements, hence supporting the critical role of this domain during HCV fusion. By combining coevolution analysis and in vitro assays, we also uncovered functionally significant coevolving signals between E1 and E2 BL/Stem regions that govern HCV fusion, demonstrating the accuracy of our coevolution predictions. Altogether, our work sheds light on important structural features of the HCV fusion mechanism and advances our functional understanding of this process.
This study also provides an important proof of concept that coevolution can be employed to explore viral protein-mediated processes, and can guide the development of innovative translational strategies against challenging human-tropic viruses.
Subject(s)
Molecular Evolution, Hepacivirus/physiology, Viral Envelope Proteins/metabolism, Virus Internalization, Animals, Hepatocellular Carcinoma/metabolism, Hepatocellular Carcinoma/pathology, Hepatocellular Carcinoma/virology, Hepatitis C/metabolism, Hepatitis C/pathology, Hepatitis C/virology, Humans, Liver Neoplasms/metabolism, Liver Neoplasms/pathology, Liver Neoplasms/virology, Mice, Inbred C57BL Mice, Protein Binding, Cultured Tumor Cells, Viral Envelope Proteins/chemistry, Viral Envelope Proteins/genetics, Virus Replication
ABSTRACT
Major histocompatibility complex class I (MHC-I) molecules are critical to adaptive immune defence mechanisms in vertebrate species and are encoded by highly polymorphic genes. Polymorphic sites are located close to the ligand-binding groove and give rise to MHC-I alleles with distinct binding specificities. Some efforts have been made to investigate the relationship between polymorphism and protein stability. However, less is known about the relationship between polymorphism and MHC-I co-evolutionary constraints. Using Direct Coupling Analysis (DCA), we found that co-evolution analysis accurately pinpoints structural contacts, although the protein family is restricted to vertebrates and comprises fewer than five hundred species, and that the co-evolutionary signal is mainly driven by inter-species changes rather than intra-species polymorphism. Moreover, we show that polymorphic sites in humans preferentially avoid co-evolving residues, as well as residues involved in protein stability. These results suggest that sites displaying high polymorphism may have been selected during vertebrate evolution to avoid co-evolutionary constraints and thereby maximize their mutability.
Subject(s)
Binding Sites/genetics, Molecular Evolution, Histocompatibility Antigens Class I, Genetic Polymorphism/genetics, Animals, Histocompatibility Antigens Class I/chemistry, Histocompatibility Antigens Class I/genetics, Histocompatibility Antigens Class I/metabolism, Humans, Molecular Models, Phylogeny, Protein Stability, Vertebrates/genetics
ABSTRACT
Modular genetic systems and networks have complex evolutionary histories shaped by selection acting on single genes as well as on their integrated function within the network. However, uncovering molecular coevolution requires the detection of coevolving sites in sequences. Detailed knowledge of the functions of each gene in the system is also necessary to identify the selective agents driving coevolution. Using recently developed computational tools, we investigated the effect of positive selection on the coevolution of ten major genes in the melanocortin system, which is responsible for multiple physiological functions and implicated in human diseases. Substitutions driven by positive selection at the melanocortin-1-receptor (MC1R) induced more coevolutionary changes in the system than positive selection on other genes in the system. In contrast, selection on the highly pleiotropic POMC gene, which orchestrates the activation of the different melanocortin receptors, had the lowest coevolutionary influence. MC1R and possibly its main function, melanin pigmentation, seem to have influenced the evolution of the melanocortin system more than functions regulated by MC2-5Rs, such as energy homeostasis, glucocorticoid-dependent stress and anti-inflammatory responses. Although replication in other regulatory systems is needed, this suggests that single functional aspects of a genetic network or system can be of higher importance than others in shaping coevolution among the genes that compose it.
Subject(s)
Melanocortins/metabolism, Melanocortin Type 1 Receptor/metabolism, Animals, Molecular Evolution, Gene Regulatory Networks/physiology, Melanocortins/genetics, Phylogeny, Melanocortin Type 1 Receptor/genetics, Genetic Selection/genetics
ABSTRACT
BACKGROUND: Available methods to simulate nucleotide or amino acid data typically use Markov models to simulate each position independently. These approaches are not appropriate for assessing the performance of combinatorial and probabilistic methods that look for coevolving positions in nucleotide or amino acid sequences. RESULTS: We have developed a web-based platform that gives user-friendly access to two phylogeny-based methods implementing the Coev model: the evaluation of coevolving scores and the simulation of coevolving positions. We have also extended the capabilities of the Coev model to allow for the generalization of the alphabet used in the Markov model, which can now analyse both nucleotide and amino acid data sets. The simulation of coevolving positions is novel and builds upon the developments of the Coev model. It allows users to simulate pairs of dependent nucleotide or amino acid positions. CONCLUSIONS: The main focus of our paper is the new simulation method we present for coevolving positions. The implementation of this method is embedded within the web platform Coev-web, which is freely accessible at http://coev.vital-it.ch/ and was tested in most modern web browsers.
Subject(s)
Amino Acids/metabolism, Computational Biology/methods, Molecular Evolution, Internet, Phylogeny, DNA Sequence Analysis/methods, Software, Algorithms, Bayes Theorem, Humans
ABSTRACT
Models of codon evolution have attracted particular interest because of their unique capabilities to detect selection forces and their high fit when applied to sequence evolution. We describe here a novel approach for modeling codon evolution, which is based on the Kronecker product of matrices. The 61 × 61 codon substitution rate matrix is created using the Kronecker product of three 4 × 4 nucleotide substitution matrices, the equilibrium frequency of codons, and the selection rate parameter. The entries of the nucleotide substitution matrices and the selection rate are considered as parameters of the model, which are optimized by maximum likelihood. Our fully mechanistic model allows the instantaneous substitution matrix between codons to be fully estimated with only 19 parameters instead of 3,721, by using the biological interdependence existing between positions within codons. We illustrate the properties of our model using computer simulations and assess its relevance by comparing the AICc measures of our model and other models of codon evolution on simulations and a large range of empirical data sets. We show that our model fits most biological data better than current codon models. Furthermore, the parameters in our model can be interpreted in a similar way to the exchangeability rates found in empirical codon models.
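The dimensions quoted in this abstract can be illustrated with a minimal sketch. It only shows how a Kronecker product of three 4 × 4 matrices yields a 64 × 64 matrix and how removing the three stop codons of the standard genetic code leaves 61 × 61 = 3,721 entries. The matrix values are random placeholders, and the names `kron3`, `position_matrices` and `codon_matrix` are illustrative assumptions; the authors' actual parameterization (codon equilibrium frequencies, selection rate parameter, rate-matrix normalization) is not reproduced here.

```python
import itertools

import numpy as np

NUCLEOTIDES = "TCAG"
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code


def kron3(a, b, c):
    """Kronecker product of three matrices, first factor varying slowest."""
    return np.kron(np.kron(a, b), c)


# All 64 codons, ordered to match the row/column order of kron3().
codons = ["".join(t) for t in itertools.product(NUCLEOTIDES, repeat=3)]
sense_idx = [i for i, c in enumerate(codons) if c not in STOP_CODONS]

# Placeholder 4 x 4 matrices, one per codon position (not fitted values).
rng = np.random.default_rng(0)
position_matrices = [rng.random((4, 4)) for _ in range(3)]

full = kron3(*position_matrices)                 # 64 x 64
codon_matrix = full[np.ix_(sense_idx, sense_idx)]  # 61 x 61 after dropping stops

print(full.shape, codon_matrix.shape, codon_matrix.size)
```

Each entry of `full` is the product of one entry from each position matrix, which is why a small number of nucleotide-level parameters can populate every cell of the codon-level matrix.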
Subject(s)
Codon/genetics, Genetic Models, Algorithms, Amino Acid Substitution, Computer Simulation, Molecular Evolution, Likelihood Functions, Markov Chains, Mutation Rate
ABSTRACT
MOTIVATION: The analysis of molecular coevolution provides information on the potential functional and structural implication of positions along DNA sequences, and several methods are available to identify coevolving positions using probabilistic or combinatorial approaches. The specific nucleotide or amino acid profile associated with the coevolution process is, however, not estimated; instead, only known profiles, such as the Watson-Crick constraint, are usually considered a priori in current measures of coevolution. RESULTS: Here, we propose a new probabilistic model, Coev, to identify coevolving positions and their associated profile in DNA sequences while incorporating the underlying phylogenetic relationships. The process of coevolution is modeled by a 16 × 16 instantaneous rate matrix that includes rates of transition as well as a profile of coevolution. We used simulated, empirical and illustrative data to evaluate our model and to compare it with a model of 'independent' evolution using the Akaike Information Criterion. We showed that the Coev model is able to discriminate between coevolving and non-coevolving positions and provides better sensitivity and specificity than other available approaches. We further demonstrate that the identification of the profile of coevolution can shed new light on the process of dependent substitution during lineage evolution.
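As a rough illustration of two ideas in this abstract, the sketch below enumerates the 16 ordered nucleotide pairs that would index a 16 × 16 rate matrix and computes the standard Akaike Information Criterion, AIC = 2k − 2 ln L, to compare a dependent model against an independent one. The profile shown is the Watson-Crick example the abstract mentions, written for DNA; the function names, log-likelihoods and parameter counts are hypothetical and do not come from the Coev software.

```python
import itertools

NUCLEOTIDES = "ACGT"

# A pairwise process tracks two positions at once, so each state is an
# ordered pair of nucleotides: 4 * 4 = 16 states.
PAIR_STATES = ["".join(p) for p in itertools.product(NUCLEOTIDES, repeat=2)]

# Illustrative profile only: the Watson-Crick pairs for DNA.
WATSON_CRICK_PROFILE = {"AT", "TA", "CG", "GC"}


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def prefers_dependent(lnl_dep, k_dep, lnl_indep, k_indep):
    """True when the dependent model has the lower AIC."""
    return aic(lnl_dep, k_dep) < aic(lnl_indep, k_indep)


# Hypothetical log-likelihoods and parameter counts for one pair of sites.
print(len(PAIR_STATES))
print(prefers_dependent(-100.0, 5, -120.0, 4))
```

A real analysis would obtain the log-likelihoods by fitting both models on the phylogeny; the comparison step itself is as simple as shown.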
Subject(s)
Molecular Evolution, DNA Sequence Analysis/methods, Base Sequence, Bayes Theorem, Probability, 16S Ribosomal RNA/genetics, Ribulose-Bisphosphate Carboxylase/genetics, Software
ABSTRACT
BACKGROUND: Searching for similarities in a set of biological data is intrinsically difficult because some data points should not be clustered, or should group within several clusters. Under these hypotheses, hierarchical agglomerative clustering is not appropriate. Moreover, if the dataset is not sufficiently well characterized, as is often the case, supervised classification is not appropriate either. RESULTS: CLAG (for CLusters AGgregation) is an unsupervised non-hierarchical clustering algorithm designed to cluster a large variety of biological data and to provide a clustered matrix and numerical values indicating cluster strength. CLAG clusters correlation matrices for residues in protein families, gene-expression and miRNA data related to various cancer types, sets of species described by multidimensional vectors of characters, and binary matrices. It does not require all data points to be clustered, and it converges, yielding the same result at each run. Its simplicity and speed allow it to run on reasonably large datasets. CONCLUSIONS: CLAG can be used to investigate the cluster structure present in biological datasets and to identify its underlying graph. It proved to be more informative and accurate than several known clustering methods, such as hierarchical agglomerative clustering, k-means, fuzzy c-means, model-based clustering and affinity propagation clustering, and it does not suffer from the convergence problem of the latter.
Subject(s)
Algorithms, Gene Expression, Neoplasms/genetics, Brain Neoplasms/genetics, Breast Neoplasms/genetics, Cluster Analysis, Female, Humans, MicroRNAs/genetics, Oligonucleotide Array Sequence Analysis
ABSTRACT
Epilepsy is a widespread neurological disease characterized by abnormal neuronal activity resulting in recurrent seizures. There is mounting evidence that disruption of the circadian system, involving clock genes and their downstream transcriptional regulators, is associated with epilepsy. In this study, we characterized the hippocampal expression of clock genes and PAR bZIP transcription factors (TFs) in a mouse model of temporal lobe epilepsy induced by intrahippocampal injection of kainic acid (KA). The expression of PAR bZIP TFs was significantly altered following KA injection as well as in other rodent models of acquired epilepsy. Although the PAR bZIP TFs are regulated by proinflammatory cytokines in peripheral tissues, we discovered that the regulation of their expression is inflammation-independent in hippocampal tissue and is instead mediated by clock genes and hyperexcitability. Furthermore, we report that hepatic leukemia factor (Hlf), a member of the PAR bZIP TF family, is invariably downregulated in animal models of acquired epilepsy, that it regulates neuronal activity in vitro, and that its overexpression in dentate gyrus neurons in vivo leads to altered expression of genes associated with seizures and epilepsy. Overall, our study provides further evidence of PAR bZIP TF involvement in epileptogenesis and points to Hlf as the key player.
Subject(s)
Basic-Leucine Zipper Transcription Factors/metabolism, Dentate Gyrus/metabolism, Epilepsy/metabolism, Gene Expression Regulation, Animals, Dentate Gyrus/pathology, Animal Disease Models, Epilepsy/chemically induced, Kainic Acid/adverse effects, Kainic Acid/pharmacology, Male, Mice
ABSTRACT
Small protein fragments, and not just residues, can be used as basic building blocks to reconstruct networks of coevolved amino acids in proteins. Fragments often come into physical contact with one another and play a major biological role in the protein. These interactions can be of several kinds and extend beyond binding specificity, allosteric regulation and folding constraints. Indeed, coevolving fragments are indicators of important information explaining folding intermediates, peptide assembly, key mutations with known roles in genetic diseases, distinguished subfamily-dependent motifs and differentiated evolutionary pressures on protein regions. Coevolution analysis detects networks of fragment interactions and highlights a higher-order organization of fragments, demonstrating the importance of studying this structure at a deeper level. We demonstrate that it can be applied to protein families that are highly conserved or represented by few sequences, thereby enlarging the class of proteins on which coevolution analysis can be performed and making large-scale coevolution studies a feasible goal.