Results 1 - 20 of 26
1.
Nature ; 2023 Jun 05.
Article in English | MEDLINE | ID: mdl-37277473
2.
Biometals ; 36(1): 227-237, 2023 02.
Article in English | MEDLINE | ID: mdl-36454509

ABSTRACT

Zinc is the second most prevalent metal element present in living organisms, and control of its concentration is pivotal to physiology. The amount of zinc available to the cell cytoplasm is regulated by the activity of members of the SLC39 family, the ZIP proteins. Selectivity of ZIP transporters has been the focus of earlier studies, which provided a biochemical and structural basis for the selectivity for zinc over other metals such as copper, iron, and manganese. However, several previous studies have shown that certain ZIP proteins exhibit higher selectivity for metal elements other than zinc. Sequence similarities suggest an evolutionary basis for the elemental selectivity within the ZIP family. Here, by engineering HEK293 cells to overexpress ZIP proteins, we have studied the selectivity of two phylogenetic clades of ZIP proteins, namely ZIP8/ZIP14 (previously known to be iron and manganese transporters) and ZIP5/ZIP10. By incubating ZIP-overexpressing cells in the presence of several divalent metals, we found that ZIP5 and ZIP10 are high-affinity copper transporters with greater selectivity for copper over other elements, revealing a novel substrate signature for the ZIP5/ZIP10 clade.


Subject(s)
Copper , Manganese , Humans , Copper/metabolism , HEK293 Cells , Iron/metabolism , Manganese/metabolism , Membrane Transport Proteins , Metals/metabolism , Phylogeny , Zinc/metabolism
3.
J Mol Evol ; 89(6): 357-369, 2021 07.
Article in English | MEDLINE | ID: mdl-33934169

ABSTRACT

We use large-scale mutagenesis data and computer simulations to quantify the mutational robustness of protein-coding genes by taking into account constraints arising from protein function and the genetic code. Analyses of the distribution of amino acid substitutions from 18 mutagenesis studies revealed an average of 45% of neutral variants, whereas mutagenesis data for 12 proteins artificially designed under no constraint other than stability reach an average of 60%. Simulations using a lattice protein model allow us to contrast these estimates with the expected mutational robustness of protein families by generating unbiased samples of foldable sequences, which we find to have 30% of neutral variants. In agreement with mutagenesis data of designed proteins, the model shows that maximally robust protein families might access up to twice the number of neutral variants observed in the unbiased samples (i.e. 60%). A biophysical model of protein-ligand binding suggests that constraints associated with molecular function have only a moderate impact on robustness of approximately 5 to 10% of neutral variants, and that the direction of this effect depends on the relation between functional performance and thermodynamic stability. Although the genetic code constrains the access of a gene's nucleotide sequence to only 30% of the full distribution of amino acid mutations, it provides an extra 15 to 20% of neutral variants to the estimations above, such that the expected, observed, and maximal robustness of protein-coding genes are approximately 50, 65, and 75%, respectively. We discuss our results in the light of three main hypotheses put forward to explain the existence of mutationally robust genes.
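The arithmetic in the abstract above can be checked with a short sketch. It assumes, as a simplification not stated in the abstract, that the genetic-code contribution is simply added to each protein-level estimate; the variable and function names are illustrative only.

```python
# Protein-level fractions of neutral variants quoted in the abstract (percent).
baseline = {"expected": 30, "observed": 45, "maximal": 60}

# Extra neutral variants attributed to the genetic code (percent, low and high end).
code_bonus = (15, 20)

# Gene-level robustness values reported in the abstract (percent).
reported = {"expected": 50, "observed": 65, "maximal": 75}


def gene_level_range(protein_level, bonus=code_bonus):
    """Assumed additive model: protein-level estimate plus the genetic-code bonus."""
    return (protein_level + bonus[0], protein_level + bonus[1])


ranges = {name: gene_level_range(value) for name, value in baseline.items()}

# A reported value is "consistent" if it falls inside the additive range.
consistent = {
    name: ranges[name][0] <= reported[name] <= ranges[name][1]
    for name in baseline
}

for name in baseline:
    print(name, ranges[name], reported[name], consistent[name])
```

Under this additive reading, each reported gene-level value falls inside the range obtained from the corresponding protein-level estimate.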


Subject(s)
Genetic Code , Proteins , Humans , Models, Genetic , Mutagenesis , Mutation , Proteins/genetics , Thermodynamics
5.
J Biol Chem ; 292(45): 18518-18529, 2017 11 10.
Article in English | MEDLINE | ID: mdl-28939764

ABSTRACT

Stringent regulation of tyrosine kinase activity is essential for normal cellular function. In humans, the tyrosine kinase Src is inhibited via phosphorylation of its C-terminal tail by another kinase, C-terminal Src kinase (Csk). Although Src and Csk orthologs are present across holozoan organisms, including animals and protists, the Csk-Src negative regulatory mechanism appears to have evolved gradually. For example, in choanoflagellates, Src and Csk are both active, but the negative regulatory mechanism is reportedly absent. In filastereans, a protist clade closely related to choanoflagellates, Src is active, but Csk is apparently inactive. In this study, we use a combination of bioinformatics, in vitro kinase assays, and yeast-based growth assays to characterize holozoan Src and Csk orthologs. We show that, despite appreciable differences in domain architecture, Csk from Corallochytrium limacisporum, a highly diverged holozoan marine protist, is active and can inhibit Src. However, in comparison with other Csk orthologs, Corallochytrium Csk displays broad substrate specificity and inhibits Src in an activity-independent manner. Furthermore, in contrast to previous studies, we show that Csk from the filasterean Capsaspora owczarzaki is active and that the Csk-Src negative regulatory mechanism is present in Csk and Src proteins from C. owczarzaki and the choanoflagellate Monosiga brevicollis. Our results suggest that negative regulation of Src by Csk is more ancient than previously thought and that it might be conserved across all holozoan species.


Subject(s)
Aquatic Organisms/enzymology , Choanoflagellata/enzymology , Protozoan Proteins/metabolism , src-Family Kinases/antagonists & inhibitors , Amino Acid Sequence , Amino Acid Substitution , CSK Tyrosine-Protein Kinase , Computational Biology , Conserved Sequence , Kinetics , Mutation , Phylogeny , Protein Interaction Domains and Motifs , Protozoan Proteins/antagonists & inhibitors , Protozoan Proteins/chemistry , Protozoan Proteins/genetics , Recombinant Proteins/chemistry , Recombinant Proteins/metabolism , Sequence Alignment , Sequence Homology, Amino Acid , Species Specificity , Structural Homology, Protein , Substrate Specificity , Two-Hybrid System Techniques , src-Family Kinases/chemistry , src-Family Kinases/genetics , src-Family Kinases/metabolism
6.
Nature ; 474(7349): 92-5, 2011 Jun 02.
Article in English | MEDLINE | ID: mdl-21637259

ABSTRACT

Cryptic variation is caused by the robustness of phenotypes to mutations. Cryptic variation has no effect on phenotypes in a given genetic or environmental background, but it can have effects after mutations or environmental change. Because evolutionary adaptation by natural selection requires phenotypic variation, phenotypically revealed cryptic genetic variation may facilitate evolutionary adaptation. This is possible if the cryptic variation happens to be pre-adapted, or "exapted", to a new environment, and is thus advantageous once revealed. However, this facilitating role for cryptic variation has not been proven, partly because most pertinent work focuses on complex phenotypes of whole organisms whose genetic basis is incompletely understood. Here we show that populations of RNA enzymes with accumulated cryptic variation adapt more rapidly to a new substrate than a population without cryptic variation. A detailed analysis of our evolving RNA populations in genotype space shows that cryptic variation allows a population to explore new genotypes that become adaptive only in a new environment. Our observations show that cryptic variation contains new genotypes pre-adapted to a changed environment. Our results highlight the positive role that robustness and epistasis can have in adaptive evolution.


Subject(s)
Adaptation, Physiological/genetics , Evolution, Molecular , Genetic Variation , RNA, Catalytic/genetics , RNA, Catalytic/metabolism , Azoarcus/enzymology , Azoarcus/genetics , Mutagenesis , Phenotype , Selection, Genetic/genetics
7.
PLoS Comput Biol ; 10(12): e1003946, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25473967

ABSTRACT

The correspondence between protein sequences and structures, or sequence-structure map, relates to fundamental aspects of structural, evolutionary and synthetic biology. The specifics of the mapping, such as the fraction of accessible sequences and structures, or the sequences' ability to fold fast, are dictated by the type of interactions between the monomers that compose the sequences. The set of possible interactions between monomers is encapsulated by the potential energy function. In this study, I explore the impact of the relative forces of the potential on the architecture of the sequence-structure map. My observations rely on simple exact models of proteins and random samples of the space of potential energy functions of binary alphabets. I adopt a graph perspective and study the distribution of viable sequences and the structures they produce, as networks of sequences connected by point mutations. I observe that the relative proportion of attractive, neutral and repulsive forces defines types of potentials that induce sequence-structure maps of vastly different architectures. I characterize the properties underlying these differences and relate them to the structure of the potential. Among these properties are the expected number and relative distribution of sequences associated with specific structures and the diversity of structures as a function of sequence divergence. I study the types of binary potentials observed in natural amino acids and show that there is a strong bias towards only some types of potentials, a bias that seems to characterize the folding code of natural proteins. I discuss implications of these observations for the architecture of the sequence-structure map of natural proteins, the construction of random libraries of peptides, and the early evolution of the natural amino acid alphabet.
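The graph perspective described in this abstract, viable sequences grouped by structure and connected by point mutations, can be illustrated with a minimal sketch. The phenotype function below is a placeholder invented for illustration (position of the first adjacent "HH" pair in a binary H/P sequence); it is not the exact lattice model or potential used in the study, and all names are hypothetical.

```python
from collections import defaultdict
from itertools import product

ALPHABET = "HP"  # binary monomer alphabet (assumed hydrophobic/polar labels)


def toy_structure(seq):
    """Placeholder phenotype: index of the first 'HH' pair, or None if non-viable."""
    index = seq.find("HH")
    return index if index >= 0 else None


def point_mutants(seq):
    """Yield every sequence that differs from seq at exactly one position."""
    for i, old in enumerate(seq):
        for new in ALPHABET:
            if new != old:
                yield seq[:i] + new + seq[i + 1:]


def neutral_networks(length):
    """Group viable sequences by structure and link same-structure point mutants."""
    by_structure = defaultdict(set)
    for letters in product(ALPHABET, repeat=length):
        seq = "".join(letters)
        structure = toy_structure(seq)
        if structure is not None:
            by_structure[structure].add(seq)

    edges = {}
    for structure, seqs in by_structure.items():
        # Keep each undirected edge once by requiring a < b.
        edges[structure] = {
            (a, b) for a in seqs for b in point_mutants(a) if b in seqs and a < b
        }
    return by_structure, edges


by_structure, edges = neutral_networks(4)
for structure in sorted(by_structure):
    print(structure, len(by_structure[structure]), len(edges[structure]))
```

Swapping `toy_structure` for a real folding model would leave the network construction unchanged, which is why the sketch keeps the two concerns separate.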


Subject(s)
Amino Acid Sequence , Amino Acids , Protein Conformation , Proteins , Amino Acids/chemistry , Amino Acids/genetics , Cluster Analysis , Computational Biology , Genotype , Models, Biological , Models, Molecular , Phenotype , Protein Folding , Proteins/chemistry , Proteins/genetics , Sequence Analysis, Protein
8.
J Mol Evol ; 78(2): 101-8, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24309994

ABSTRACT

The distribution of variation in a quantitative trait and its underlying distribution of genotypic diversity can both be shaped by stabilizing and directional selection. Understanding either distribution is important, because it determines a population's response to natural selection. Unfortunately, existing theory makes conflicting predictions about how selection shapes these distributions, and very little pertinent experimental evidence exists. Here we study a simple genetic system, an evolving RNA enzyme (ribozyme), in which a combination of high-throughput genotyping and measurement of a biochemical phenotype allows us to address this question. We show that directional selection, compared to stabilizing selection, increases the genotypic diversity of an evolving ribozyme population. In contrast, it leaves the variance in the phenotypic trait unchanged.


Subject(s)
Genetic Variation , Genotype , Phenotype , RNA, Catalytic/genetics , RNA, Catalytic/metabolism , Selection, Genetic , Azoarcus/genetics , Azoarcus/metabolism , Base Sequence , Evolution, Molecular , Molecular Sequence Data , Nucleic Acid Conformation , RNA, Catalytic/chemistry
9.
Plant Sci ; 339: 111931, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38030036

ABSTRACT

Iron is an essential micronutrient for life. During the development of the seed, iron accumulates during embryo maturation. In Arabidopsis thaliana, iron mainly accumulates in the vacuoles of only one cell type, the cell layer that surrounds the provasculature in hypocotyl and cotyledons. The iron accumulation pattern in Arabidopsis is an exception in plant phylogeny: most dicot embryos accumulate iron in several cell layers, including cortex and, in some cases, even protodermis. It remains unknown how iron reaches the internal cell layers of the embryo and, in particular, which molecular mechanisms are responsible for this process. Here, we use transgenic approaches to modify the iron accumulation pattern in an Arabidopsis model. Using the SDH2-3 embryo-specific promoter, we were able to express VIT1 ectopically in both a wild-type background and a mutant vit1 background lacking expression of this vacuolar iron transporter. These manipulations modify the iron distribution pattern in Arabidopsis from one cell layer to several cell layers, including protodermis, cortex cells, and the endodermis. Interestingly, total seed iron content was not modified compared with the wild type, suggesting that iron distribution in embryos is not involved in the control of the total iron amount accumulated in seeds. This experimental model can be used to study the processes involved in iron distribution patterning during embryo maturation and its evolution in dicot plants.


Subject(s)
Arabidopsis Proteins , Arabidopsis , Arabidopsis/metabolism , Iron/metabolism , Arabidopsis Proteins/genetics , Arabidopsis Proteins/metabolism , Promoter Regions, Genetic/genetics , Seeds/metabolism , Gene Expression Regulation, Plant
10.
J Mol Biol ; 436(2): 168383, 2024 01 15.
Article in English | MEDLINE | ID: mdl-38070861

ABSTRACT

Creatine is an essential metabolite for the storage and rapid supply of energy in muscle and nerve cells. In humans, impaired metabolism, transport, and distribution of creatine throughout tissues can cause varying forms of mental disability, also known as creatine deficiency syndrome (CDS). So far, 80 mutations in the creatine transporter (SLC6A8) have been associated with CDS. To better understand the effect of human genetic variants on the physiology of SLC6A8 and their possible impact on CDS, we studied 30 missense variants including 15 variants of unknown significance, two of which are reported here for the first time. We expressed these variants in HEK293 cells and explored their subcellular localization and transport activity. We also applied computational methods to predict variant effect and estimate site-specific changes in thermodynamic stability. To explore variants that might have a differential effect on the transporter's conformers along the transport cycle, we constructed homology models of the inward-facing and outward-facing conformations. In addition, we used mass spectrometry to study proteins that interact with wild-type SLC6A8 and five selected variants in HEK293 cells. In silico models of the protein complexes revealed how two variants impact the interaction interface of SLC6A8 with other proteins and how pathogenic variants lead to an enrichment of ER protein partners. Overall, our integrated analysis disambiguates the pathogenicity of 15 variants of unknown significance, revealing diverse mechanisms of pathogenicity, including two previously unreported variants obtained from patients suffering from the creatine deficiency syndrome.


Subject(s)
Brain Diseases, Metabolic, Inborn , Creatine , X-Linked Intellectual Disability , Nerve Tissue Proteins , Plasma Membrane Neurotransmitter Transport Proteins , Humans , Creatine/deficiency , HEK293 Cells , X-Linked Intellectual Disability/genetics , Nerve Tissue Proteins/deficiency , Nerve Tissue Proteins/genetics , Plasma Membrane Neurotransmitter Transport Proteins/deficiency , Plasma Membrane Neurotransmitter Transport Proteins/genetics , Brain Diseases, Metabolic, Inborn/genetics , DNA Mutational Analysis/methods , Mutation, Missense , Computational Biology/methods
11.
RSC Adv ; 14(19): 13083-13094, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38655474

ABSTRACT

The solute carrier transporter family 6 (SLC6) is of key interest for its critical role in the transport of small amino acids or amino acid-like molecules. Dysfunction of its members is strongly associated with human diseases including schizophrenia, depression, and Parkinson's disease. Linking single point mutations to disease may support insights into the structure-function relationship of these transporters. This work aimed to develop a computational model for predicting the potential pathogenic effect of single point mutations in the SLC6 family. Missense mutation data were retrieved from UniProt, LitVar, and ClinVar, covering multiple protein-coding transcripts. As an encoding approach, amino acid descriptors were used to calculate the average sequence properties for both original and mutated sequences. In addition to the full-sequence calculation, the sequences were cut into twelve domains. The domains are defined according to the transmembrane domains of the SLC6 transporters to analyse the regions' contributions to the pathogenicity prediction. Subsequently, several classification models, namely Support Vector Machine (SVM), Logistic Regression (LR), Random Forest (RF), and Extreme Gradient Boosting (XGBoost), were built, with the hyperparameters optimized through grid search. For estimation of model performance, repeated stratified k-fold cross-validation was used. The accuracy values of the generated models are in the range of 0.72 to 0.80. Analysis of feature importance indicates that mutations in distinct regions of SLC6 transporters are associated with an increased risk for pathogenicity. When the model was applied to an independent validation set, the accuracy dropped to about 0.6 on average, with high precision but low sensitivity scores.

12.
Science ; 384(6694): eadk5864, 2024 Apr 26.
Article in English | MEDLINE | ID: mdl-38662832

ABSTRACT

Chemical modulation of proteins enables a mechanistic understanding of biology and represents the foundation of most therapeutics. However, despite decades of research, 80% of the human proteome lacks functional ligands. Chemical proteomics has advanced fragment-based ligand discovery toward cellular systems, but throughput limitations have stymied the scalable identification of fragment-protein interactions. We report proteome-wide maps of protein-binding propensity for 407 structurally diverse small-molecule fragments. We verified that identified interactions can be advanced to active chemical probes of E3 ubiquitin ligases, transporters, and kinases. Integrating machine learning binary classifiers further enabled interpretable predictions of fragment behavior in cells. The resulting resource of fragment-protein interactions and predictive models will help to elucidate principles of molecular recognition and expedite ligand discovery efforts for hitherto undrugged proteins.


Subject(s)
Drug Discovery , Machine Learning , Proteomics , Small Molecule Libraries , Humans , Ligands , Protein Binding , Proteome/metabolism , Proteomics/methods , Small Molecule Libraries/chemistry , Ubiquitin-Protein Ligases/metabolism
13.
ACS Chem Biol ; 18(12): 2464-2473, 2023 12 15.
Article in English | MEDLINE | ID: mdl-38098458

ABSTRACT

Molecular glue degraders (MGDs) are small molecules that degrade proteins of interest via the ubiquitin-proteasome system. While MGDs were historically discovered serendipitously, approaches for MGD discovery now include cell-viability-based drug screens or data mining of public transcriptomics and drug response datasets. These approaches, however, have target spaces restricted to the essential proteins. Here we develop a high-throughput workflow for MGD discovery that also reaches the nonessential proteome. This workflow begins with the rapid synthesis of a compound library by sulfur(VI) fluoride exchange chemistry coupled to a morphological profiling assay in isogenic cell lines that vary in levels of the E3 ligase CRBN. By comparing the morphological changes induced by compound treatment across the isogenic cell lines, we were able to identify FL2-14 as a CRBN-dependent MGD targeting the nonessential protein GSPT2. We envision that this workflow would contribute to the discovery and characterization of MGDs that target a wider range of proteins.


Subject(s)
Proteasome Endopeptidase Complex , Ubiquitin-Protein Ligases , Proteolysis , Proteasome Endopeptidase Complex/metabolism , Ubiquitin-Protein Ligases/metabolism , Proteins/metabolism , Ubiquitin/metabolism
14.
Biophys J ; 102(8): 1916-25, 2012 Apr 18.
Article in English | MEDLINE | ID: mdl-22768948

ABSTRACT

The relationship between the genotype (sequence) and the phenotype (structure) of macromolecules affects their ability to evolve new structures and functions. We here compare the genotype space organization of proteins and RNA molecules to identify differences that may affect this ability. To this end, we computationally study the genotype-phenotype relationship for short RNA and lattice proteins of a reduced monomer alphabet size, to make exhaustive analysis and direct comparison of their genotype spaces feasible. We find that many fewer protein molecules than RNA molecules fold, but they fold into many more structures than RNA. In consequence, protein phenotypes have smaller genotype networks whose member genotypes tend to be more similar than for RNA phenotypes. Neighborhoods in sequence space of a given radius around an RNA molecule contain more novel structures than for protein molecules. We compare this property to evidence from natural RNA and protein molecules, and conclude that RNA genotype space may be more conducive to the evolution of new structure phenotypes.


Subject(s)
Computational Biology , Genotype , Phenotype , Proteins/chemistry , Proteins/metabolism , RNA/chemistry , RNA/genetics , Nucleic Acid Conformation , Protein Folding
15.
iScience ; 25(10): 105096, 2022 Oct 21.
Article in English | MEDLINE | ID: mdl-36164651

ABSTRACT

Solute carriers are an operationally defined diverse family of membrane proteins involved in the transport of nutrients, metabolites, xenobiotics, and drugs. Here, we provide an integrative classification of solute carriers by combining evolutionary information with proteome-wide structure models recently made available through the AlphaFold resource. Analyses of orthologous relations among 455 protein-coding genes currently classified as human solute carriers, over the fully sequenced genomes of 2,100 species, suggest no more than approximately 180 independent evolutionary origins. Structural comparative analyses provided further insight revealing a total of 24 structurally distinct transmembrane folds, increasing by approximately 40% the number of previously described SLC structural folds. In addition, a structural comparative analysis identified a new human solute carrier member and revealed details of noncanonical ones. Our analyses uncover new ancestral relations between solute carrier genes, provide insights into the evolution of remote homologs and a platform to test hypotheses of functional deorphanization.

16.
Evol Bioinform Online ; 15: 1176934319870485, 2019.
Article in English | MEDLINE | ID: mdl-31452598

ABSTRACT

In order to preserve structure and function, proteins tend to preferentially conserve amino acids at particular sites along the sequence. Because mutations can affect structure and function, the question arises whether the preference of a protein site for a particular amino acid varies between protein homologs, and to what extent that variation depends on sequence divergence. Answering these questions can help in the development of models of sequence evolution, as well as provide insights on the dependence of the fitness effects of mutations on the genetic background of sequences, a phenomenon known as epistasis. Here, I comment on recent computational work providing a systematic analysis of the extent to which the amino acid preferences of proteins depend on the background mutations of protein homologs.

17.
Genome Biol Evol ; 11(1): 121-135, 2019 01 01.
Article in English | MEDLINE | ID: mdl-30496400

ABSTRACT

The propensity of protein sites to be occupied by any of the 20 amino acids is known as site-specific amino acid preferences (SSAP). Under the assumption that SSAP are conserved among homologs, they can be used to parameterize evolutionary models for the reconstruction of accurate phylogenetic trees. However, simulations and experimental studies have not been able to fully assess the relative conservation of SSAP as a function of sequence divergence between protein homologs. Here, we implement a computational procedure to predict the SSAP of proteins based on the effect of changes in thermodynamic stability upon mutation. An advantage of this computational approach is that it allows us to interrogate a large and unbiased sample of homologous proteins, over the entire spectrum of sequence divergence, and under selection for the same molecular trait. We show that computational predictions have reproducibilities that resemble those obtained in experimental replicates, and can largely recapitulate the SSAP observed in a large-scale mutagenesis experiment. Our results support recent experimental reports on the conservation of SSAP of related homologs, with a slowly increasing fraction of up to 15% of different sites at sequence distances lower than 40%. However, even under the sole contribution of thermodynamic stability, our conservative approach identifies up to 30% of significantly different sites between divergent homologs. We show that this relation holds for homologs of diverse sizes and structural classes. Analyses of residue contact networks suggest that an important determinant of these differences is the increasing accumulation of structural deviations that results from sequence divergence.


Subject(s)
Amino Acid Substitution , Models, Genetic , Protein Stability , Sequence Homology, Amino Acid
18.
Genes (Basel) ; 10(5)2019 04 27.
Article in English | MEDLINE | ID: mdl-31035578

ABSTRACT

More than a decade ago, a new mitochondrial Open Reading Frame (mtORF) was discovered in corals of the family Pocilloporidae and has been used since then as an effective barcode for these corals. Recently, mtORF sequencing revealed the existence of two differentiated Stylophora lineages occurring in sympatry along the environmental gradient of the Red Sea (18.5°C to 33.9°C). In the endemic Red Sea lineage RS_LinB, the mtORF and the heat shock protein gene hsp70 uncovered similar phylogeographic patterns strongly correlated with environmental variations. This suggests that the mtORF too might be involved in thermal adaptation. Here, we used computational analyses to explore the features and putative function of this mtORF. In particular, we tested the likelihood that this gene encodes a functional protein and whether it may play a role in adaptation. Analyses of full mitogenomes showed that the mtORF originated in the common ancestor of Madracis and other pocilloporids, and that it encodes a transmembrane protein differing in length and domain architecture among genera. Homology-based annotation and the relative conservation of metal-binding sites revealed traces of an ancient hydrolase catalytic activity. Furthermore, signals of pervasive purifying selection, lack of stop codons in 1830 sequences analyzed, and a codon-usage bias similar to that of other mitochondrial genes indicate that the protein is functional, i.e., not a pseudogene. Other features, such as intrinsically disordered regions, tandem repeats, and signals of positive selection, particularly in Stylophora RS_LinB populations, are consistent with a role of the mtORF in adaptive responses to environmental changes.


Subject(s)
Anthozoa/genetics , Computational Biology , DNA, Mitochondrial/genetics , Mitochondria/genetics , Animals , Ecosystem , Indian Ocean , Open Reading Frames/genetics , Phylogeny , Phylogeography , Protein Conformation , Tandem Repeat Sequences/genetics
19.
BMC Bioinformatics ; 9: 265, 2008 Jun 05.
Article in English | MEDLINE | ID: mdl-18534022

ABSTRACT

BACKGROUND: As in many different areas of science and technology, most important problems in bioinformatics rely on the proper development and assessment of binary classifiers. A generalized assessment of the performance of binary classifiers is typically carried out through the analysis of their receiver operating characteristic (ROC) curves. The area under the ROC curve (AUC) constitutes a popular indicator of the performance of a binary classifier. However, the assessment of the statistical significance of the difference between any two classifiers based on this measure is not a straightforward task, since not many freely available tools exist. Most existing software is either not free, difficult to use or not easy to automate when a comparative assessment of the performance of many binary classifiers is intended. This constitutes the typical scenario for the optimization of parameters when developing new classifiers and also for their performance validation through the comparison to previous art. RESULTS: In this work we describe and release new software to assess the statistical significance of the observed difference between the AUCs of any two classifiers for a common task estimated from paired data or unpaired balanced data. The software is able to perform a pairwise comparison of many classifiers in a single run, without requiring any expert or advanced knowledge to use it. The software relies on a non-parametric test for the difference of the AUCs that accounts for the correlation of the ROC curves. The results are displayed graphically and can be easily customized by the user. A human-readable report is generated and the complete data resulting from the analysis are also available for download, which can be used for further analysis with other software. The software is released as a web server that can be used in any client platform and also as a standalone application for the Linux operating system. 
CONCLUSION: New software for the statistical comparison of ROC curves is released here as a web server and also as standalone software for the Linux operating system.
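To make the quantity being compared concrete, here is a minimal sketch of computing the AUC of two classifiers scored on the same (paired) cases, together with a rough interval for their AUC difference. It does not reproduce the released software or its non-parametric test: the stratified paired bootstrap is a substitute technique chosen for brevity, and the function names, scores, and labels are invented for illustration.

```python
import random


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def paired_bootstrap_auc_diff(scores_a, scores_b, labels, n_boot=2000, seed=0):
    """Approximate 95% interval for AUC(a) - AUC(b) on paired data.

    Cases are resampled within each class, and both classifiers are scored on the
    same resampled cases, so the correlation between their ROC curves is kept.
    """
    rng = random.Random(seed)
    pos_idx = [i for i, y in enumerate(labels) if y == 1]
    neg_idx = [i for i, y in enumerate(labels) if y == 0]
    diffs = []
    for _ in range(n_boot):
        idx = [rng.choice(pos_idx) for _ in pos_idx]
        idx += [rng.choice(neg_idx) for _ in neg_idx]
        lab = [labels[i] for i in idx]
        a = [scores_a[i] for i in idx]
        b = [scores_b[i] for i in idx]
        diffs.append(auc(a, lab) - auc(b, lab))
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]


# Invented example: six cases scored by two hypothetical classifiers.
labels = [1, 1, 1, 0, 0, 0]
clf_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
clf_b = [0.9, 0.6, 0.3, 0.7, 0.4, 0.1]

auc_a = auc(clf_a, labels)
auc_b = auc(clf_b, labels)
low, high = paired_bootstrap_auc_diff(clf_a, clf_b, labels)
print(auc_a, auc_b, (low, high))
```

With so few cases the interval is very wide; the example is only meant to show where the pairing enters the calculation.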


Subject(s)
Algorithms , Data Interpretation, Statistical , Diagnosis, Computer-Assisted/methods , ROC Curve , Software
20.
Proc Biol Sci ; 275(1643): 1595-602, 2008 Jul 22.
Article in English | MEDLINE | ID: mdl-18430649

ABSTRACT

Recent laboratory experiments suggest that a molecule's ability to evolve neutrally is important for its ability to generate evolutionary innovations. In contrast to laboratory experiments, life unfolds on time-scales of billions of years. Here, we ask whether a molecule's ability to evolve neutrally, a measure of its robustness, facilitates evolutionary innovation also on these large time-scales. To this end, we use protein designability, the number of sequences that can adopt a given protein structure, as an estimate of the structure's ability to evolve neutrally. Based on two complementary measures of functional diversity (catalytic diversity and molecular functional diversity in gene ontology), we show that more robust proteins have a greater capacity to produce functional innovations. Significant associations among structural designability, folding rate and intrinsic disorder also exist, underlining the complex relationship of the structural factors that affect protein evolution.


Subject(s)
Evolution, Molecular , Protein Structure, Tertiary , Amino Acid Sequence , Databases, Protein , Enzymes/chemistry , Enzymes/physiology , Protein Folding