Results 1 - 14 of 14
1.
Cell ; 158(4): 916-928, 2014 Aug 14.
Article in English | MEDLINE | ID: mdl-25126794

ABSTRACT

A central problem in biology is to identify gene function. One approach is to infer function in large supergenomic networks of interactions and ancestral relationships among genes; however, their analysis can be computationally prohibitive. We show here that these biological networks are compressible. They can be shrunk dramatically by eliminating redundant evolutionary relationships, and this process is efficient because in these networks the number of compressible elements rises linearly rather than exponentially as in other complex networks. Compression enables global network analysis to computationally harness hundreds of interconnected genomes and to produce functional predictions. As a demonstration, we show that the essential, but functionally uncharacterized Plasmodium falciparum antigen EXP1 is a membrane glutathione S-transferase. EXP1 efficiently degrades cytotoxic hematin, is potently inhibited by artesunate, and is associated with artesunate metabolism and susceptibility in drug-pressured malaria parasites. These data implicate EXP1 in the mode of action of a frontline antimalarial drug.


Subjects
Antigens, Protozoan/isolation & purification; Data Compression; Genomics/methods; Plasmodium falciparum/enzymology; Antigens, Protozoan/chemistry; Antigens, Protozoan/genetics; Antigens, Protozoan/metabolism; Antimalarials/pharmacology; Artemisinins/pharmacology; Artesunate; Catalytic Domain; Hemin/metabolism; Models, Genetic; Plasmodium falciparum/genetics
2.
BMC Bioinformatics ; 14 Suppl 3: S6, 2013.
Article in English | MEDLINE | ID: mdl-23514548

ABSTRACT

BACKGROUND: Annotating protein function with both high accuracy and sensitivity remains a major challenge in structural genomics. One proven computational strategy has been to group a few key functional amino acids into templates and search for these templates in other protein structures, so as to transfer function when a match is found. To this end, we previously developed Evolutionary Trace Annotation (ETA) and showed that diffusing known annotations over a network of template matches on a structural genomic scale improved predictions of function. In order to further increase sensitivity, we now let each protein contribute multiple templates rather than just one, and also let the template size vary. RESULTS: Retrospective benchmarks in 605 Structural Genomics enzymes showed that multiple templates increased sensitivity by up to 14% when combined with single template predictions even as they maintained the accuracy over 91%. Diffusing function globally on networks of single and multiple template matches marginally increased the area under the ROC curve over 0.97, but in a subset of proteins that could not be annotated by ETA, the network approach recovered annotations for the most confident 20-23 of 91 cases with 100% accuracy. CONCLUSIONS: We improve the accuracy and sensitivity of predictions by using multiple templates per protein structure when constructing networks of ETA matches and diffusing annotations.


Subjects
Protein Conformation; Proteins/physiology; Algorithms; Computational Biology; Databases, Protein; Enzymes/chemistry; Evolution, Molecular; Genomics; Molecular Sequence Annotation; Proteins/chemistry; Proteins/genetics
3.
Int J Infect Dis ; 106: 169-170, 2021 May.
Article in English | MEDLINE | ID: mdl-33746095

ABSTRACT

Recently released interim numbers from advanced vaccine candidate clinical trials suggest that a COVID-19 vaccine effectiveness (VE) of >90% is achievable. However, SARS-CoV-2 transmission dynamics are highly heterogeneous and exhibit localized bursts of transmission, which may lead to sharp localized peaks in the number of new cases, often followed by longer periods of low incidence. Here we show that, for interim estimates of VE, these characteristic bursts in SARS-CoV-2 infection may introduce a strong positive bias in VE. Specifically, we generate null models of vaccine effectiveness, i.e., random models with bursts that over longer periods converge to zero VE but that for interim periods frequently produce apparent VE near 100%. As an example, by following the relevant clinical trial protocol, we can reproduce recently reported interim outcomes from an ongoing phase 3 clinical trial of an RNA-based vaccine candidate. Thus, to avoid potential random biases in VE, it is suggested that interim estimates on COVID-19 VE should control for the intrinsic inhomogeneity in both SARS-CoV-2 infection dynamics and reported cases.
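
As a rough illustration of the bias described above, the following Python sketch simulates a toy null model in which the vaccine has no effect and cases arrive in bursts that fall on one trial arm at a time. The burst mechanism, the function names, and all parameter values are assumptions made for illustration; the abstract does not specify the authors' null models or the trial protocol they followed.

```python
import random


def vaccine_effectiveness(cases_vaccine, cases_placebo):
    """Point estimate VE = 1 - (vaccine-arm cases / placebo-arm cases).

    Assumes two arms of equal size and equal follow-up time.
    """
    if cases_placebo == 0:
        raise ValueError("VE is undefined when the placebo arm has no cases")
    return 1.0 - cases_vaccine / cases_placebo


def simulate_null_trial(n_bursts, burst_size, rng):
    """Toy null model (an assumption, not the authors' model).

    Each burst adds `burst_size` cases to one arm picked with probability 1/2,
    so the true VE is zero by construction.
    """
    cases = [0, 0]  # [vaccine arm, placebo arm]
    for _ in range(n_bursts):
        cases[rng.randrange(2)] += burst_size
    return cases[0], cases[1]


if __name__ == "__main__":
    rng = random.Random(1)
    for n_bursts in (3, 30, 3000):
        v, p = simulate_null_trial(n_bursts, burst_size=10, rng=rng)
        ve = vaccine_effectiveness(v, p) if p else float("nan")
        print(f"{n_bursts:5d} bursts: vaccine={v}, placebo={p}, VE={ve:.2f}")
```

With only a few bursts, one arm can collect most or all of the cases by chance, so the interim estimate can sit far from zero; with many bursts the estimate tends toward the true value of zero.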


Subjects
COVID-19 Vaccines; COVID-19/prevention & control; Clinical Trials, Phase III as Topic; SARS-CoV-2/immunology; Bias; COVID-19/epidemiology; Humans; Models, Statistical; Statistical Distributions
4.
Physica A ; 389(16): 3250-3253, 2010 Aug 15.
Article in English | MEDLINE | ID: mdl-20625477

ABSTRACT

Recurrent international financial crises inflict significant damage on societies and stress the need for mechanisms or strategies to control risk and temper market uncertainties. Unfortunately, the complex network of market interactions often confounds rational approaches to optimize financial risks. Here we show that investors can overcome this complexity and globally minimize risk in portfolio models for any given expected return, provided the relative margin requirement remains below a critical, empirically measurable value. In practice, for markets with centrally regulated margin requirements, a rational stabilization strategy would be to keep margins small enough. This result follows from ground states of the random field spin glass Ising model that can be calculated exactly through convex optimization when relative spin coupling is limited by the norm of the network's Laplacian matrix. In that regime, this novel approach is robust to noise in empirical data and may also be broadly relevant to complex networks with frustrated interactions that are studied throughout scientific fields.

5.
BMC Bioinformatics ; 9: 17, 2008 Jan 11.
Article in English | MEDLINE | ID: mdl-18190718

ABSTRACT

BACKGROUND: Structural genomics projects such as the Protein Structure Initiative (PSI) yield many new structures, but often these have no known molecular functions. One approach to recover this information is to use 3D templates - structure-function motifs that consist of a few functionally critical amino acids and may suggest functional similarity when geometrically matched to other structures. Since experimentally determined functional sites are not common enough to define 3D templates on a large scale, this work tests a computational strategy to select relevant residues for 3D templates. RESULTS: Based on evolutionary information and heuristics, an Evolutionary Trace Annotation (ETA) pipeline built templates for 98 enzymes, half taken from the PSI, and sought matches in a non-redundant structure database. On average each template matched 2.7 distinct proteins, of which 2.0 share the first three Enzyme Commission digits as the template's enzyme of origin. In many cases (61%) a single most likely function could be predicted as the annotation with the most matches, and in these cases such a plurality vote identified the correct function with 87% accuracy. ETA was also found to be complementary to sequence homology-based annotations. When matches are required to both geometrically match the 3D template and to be sequence homologs found by BLAST or PSI-BLAST, the annotation accuracy is greater than either method alone, especially in the region of lower sequence identity where homology-based annotations are least reliable. CONCLUSION: These data suggest that knowledge of evolutionarily important residues improves functional annotation among distant enzyme homologs. Since, unlike other 3D template approaches, the ETA method bypasses the need for experimental knowledge of the catalytic mechanism, it should prove a useful, large scale, and general adjunct to combine with other methods to decipher protein function in the structural proteome.
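
The plurality vote mentioned in the abstract can be sketched in a few lines of Python: annotations of the matched proteins are truncated to their first three Enzyme Commission digits, and a function is predicted only when one annotation has strictly more matches than any other. The function names, the tie-handling rule, and the example EC numbers are assumptions for illustration, not the ETA pipeline itself.

```python
from collections import Counter


def ec_prefix(ec_number, digits=3):
    """Truncate an EC number such as '3.4.21.4' to its first `digits` levels."""
    return ".".join(ec_number.split(".")[:digits])


def plurality_vote(match_annotations, digits=3):
    """Return the annotation carried by the most matches, or None.

    None is returned when there are no matches or when the top two
    annotations tie (treating ties as 'no prediction' is an assumption).
    """
    counts = Counter(ec_prefix(ec, digits) for ec in match_annotations)
    ranked = counts.most_common(2)
    if not ranked:
        return None
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


if __name__ == "__main__":
    # Hypothetical matches for one template: two serine proteases, one dehydrogenase.
    print(plurality_vote(["3.4.21.4", "3.4.21.5", "1.1.1.1"]))
```

Returning no prediction on a tie keeps the vote conservative, in line with the abstract's observation that a single most likely function could be assigned in only part of the cases.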


Subjects
Amino Acid Motifs/genetics; Enzymes; Evolution, Molecular; Artificial Intelligence; Databases, Protein; Enzymes/chemistry; Enzymes/genetics; Enzymes/metabolism; Likelihood Functions; Models, Biological; Pattern Recognition, Automated; Protein Conformation; Proteome; Sequence Alignment; Sequence Homology, Amino Acid; Structural Homology, Protein; Structure-Activity Relationship
6.
Bioinformatics ; 23(23): 3217-24, 2007 Dec 01.
Article in English | MEDLINE | ID: mdl-17977886

ABSTRACT

MOTIVATION: Predicting protein function is a central problem in bioinformatics, and many approaches use partially or fully automated methods based on various combinations of sequence, structure and other information on proteins or genes. Such information establishes relationships between proteins that can be modelled most naturally as edges in graphs. A priori, however, it is often unclear which edges from which graph may contribute most to accurate predictions. For that reason, one established strategy is to integrate all available sources, or graphs as in graph integration, in the hope that the positive signals will add to each other. However, in the problem of functional prediction, noise, i.e. the presence of inaccurate or false edges, can still be large enough that integration alone has little effect on prediction accuracy. In order to reduce noise levels and to improve integration efficiency, we present here a recent method in graph-based learning, graph sharpening, which provides a theoretically firm yet intuitive and practical approach for disconnecting undesirable edges from protein similarity graphs. This approach has several attractive features: it is quick, scalable in the number of proteins, robust with respect to errors and tolerant of very diverse types of protein similarity measures. RESULTS: We tested the classification accuracy in a test set of 599 proteins with remote sequence homology spread over 20 Gene Ontology (GO) functional classes. When compared to integration alone, graph sharpening plus integration of four vastly different molecular similarity measures improved the overall classification by nearly 30% [0.17 average increase in the area under the ROC curve (AUC)]. Moreover, and partially through the increased sparsity of the graphs induced by sharpening, this gain in accuracy came at negligible computational cost: sharpening and integration took on average 4.66 (±4.44) CPU seconds.
AVAILABILITY: Software and Supplementary data will be available on http://mammoth.bcm.tmc.edu/


Subjects
Algorithms; Database Management Systems; Databases, Protein; Information Storage and Retrieval/methods; Proteins/chemistry; Proteins/classification; Sequence Analysis, Protein/methods; Amino Acid Sequence; Molecular Sequence Data; Proteins/metabolism; Structure-Activity Relationship; Systems Integration
7.
Nucleic Acids Res ; 34(22): e152, 2006.
Article in English | MEDLINE | ID: mdl-17130161

ABSTRACT

The characterization of biological function among newly determined protein structures is a central challenge in structural genomics. One class of computational solutions to this problem is based on the similarity of protein structure. Here, we implement a simple yet efficient measure of protein structure similarity, the contact metric. Even though its computation avoids structural alignments and is therefore nearly instantaneous, we find that small values correlate with geometrical root mean square deviations obtained from structural alignments. To test whether the contact metric detects functional similarity, as defined by Gene Ontology (GO) terms, it was compared in large-scale computational experiments to four other measures of structural similarity, including alignment algorithms as well as alignment-independent approaches. The contact metric was the fastest method and its sensitivity, at any given specificity level, was a close second only to Fast Alignment and Search Tool, a structural alignment method that is slower by three orders of magnitude. Critically, nearly 40% of correct functional inferences by the contact metric were not identified by any other approach, which shows that the contact metric is complementary and computationally efficient in detecting functional relationships between proteins. A public 'Contact Metric Internet Server' is provided.


Subjects
Computational Biology/methods; Structural Homology, Protein; Algorithms; Databases, Protein; Genes; Internet; Models, Molecular; Molecular Structure; Protein Folding; Proteins/chemistry; Proteins/genetics; Proteins/physiology; Software; Vocabulary, Controlled
8.
Int J Parasitol Drugs Drug Resist ; 8(1): 31-35, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29324251

ABSTRACT

In the human malaria parasite Plasmodium falciparum, membrane glutathione S-transferases (GST) have recently emerged as potential cellular detoxifying units and as drug target candidates with the artemisinin (ART) class of antimalarials inhibiting their activity at single-digit nanomolar potency when activated by iron sources such as cytotoxic hematin. Here we put forward the hypothesis that the membrane GST Plasmodium falciparum exported protein 1 (PfEXP1, PF3D7_1121600) might be directly involved in the mode of action of the unrelated antimalarial 4-aminoquinoline drug chloroquine (CQ). Along this line we report potent biochemical inhibition of membrane glutathione S-transferase activity in recombinant PfEXP1 through CQ at half maximal inhibitory CQ concentrations of 9.02 nM and 19.33 nM when using hematin and the iron deficient 1-chloro-2,4-dinitrobenzene (CDNB) as substrate, respectively. Thus, in contrast to ART, CQ may not require activation through an iron source such as hematin for a potent inhibition of membrane GST activity. Arguably, these data represent the first instance of low nanomolar inhibition of an essential Plasmodium falciparum enzyme through a 4-aminoquinoline and might encourage further investigation of PfEXP1 as a potential CQ target candidate.
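
The half maximal inhibitory concentration (IC50) reported above is the inhibitor concentration at which enzyme activity falls to 50% of its uninhibited value. The Python sketch below uses the standard Hill dose-response equation to make that definition concrete. The unit Hill slope and the function name are assumptions; the abstract does not state how the inhibition curves were fitted.

```python
def fractional_activity(concentration_nm, ic50_nm, hill_slope=1.0):
    """Remaining enzyme activity (0..1) at a given inhibitor concentration.

    Standard Hill equation: activity = 1 / (1 + (c / IC50) ** h).
    A Hill slope of 1 is assumed here for illustration only.
    """
    if concentration_nm < 0 or ic50_nm <= 0:
        raise ValueError("concentration must be >= 0 and IC50 must be > 0")
    return 1.0 / (1.0 + (concentration_nm / ic50_nm) ** hill_slope)


if __name__ == "__main__":
    # IC50 values quoted in the abstract for CQ with hematin and CDNB as substrate.
    for ic50 in (9.02, 19.33):
        print(ic50, fractional_activity(ic50, ic50))
```

By construction, the function returns exactly one half when the concentration equals the IC50, whatever the Hill slope.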


Subjects
Antigens, Protozoan/drug effects; Antimalarials/pharmacology; Chloroquine/pharmacology; Plasmodium falciparum/drug effects; Plasmodium falciparum/genetics; Antigens, Protozoan/genetics; Drug Delivery Systems; Drug Resistance; Glutathione Transferase/drug effects; Humans; Inhibitory Concentration 50; Malaria, Falciparum/parasitology; Plasmodium falciparum/enzymology
9.
Protein Sci ; 15(6): 1530-6, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16672239

ABSTRACT

The annotation of protein function has not kept pace with the exponential growth of raw sequence and structure data. An emerging solution to this problem is to identify 3D motifs or templates in protein structures that are necessary and sufficient determinants of function. Here, we demonstrate the recurrent use of evolutionary trace information to construct such 3D templates for enzymes, search for them in other structures, and distinguish true from spurious matches. Serine protease templates built from evolutionarily important residues distinguish between proteases and other proteins nearly as well as the classic Ser-His-Asp catalytic triad. In 53 enzymes spanning 33 distinct functions, an automated pipeline identifies functionally related proteins with an average positive predictive power of 62%, including correct matches to proteins with the same function but with low sequence identity (the average identity for some templates is only 17%). Although these template building, searching, and match classification strategies are not yet optimized, their sequential implementation demonstrates a functional annotation pipeline which does not require experimental information, but only local molecular mimicry among a small number of evolutionarily important residues.


Subjects
Evolution, Molecular; Models, Biological; Proteins/chemistry; Proteins/metabolism; Algorithms; Computational Biology/methods; Databases, Protein; Enzymes/chemistry; Enzymes/metabolism
11.
Curr Opin Struct Biol ; 21(2): 180-8, 2011 Apr.
Article in English | MEDLINE | ID: mdl-21353529

ABSTRACT

Genomic centers discover increasingly many protein sequences and structures, but not necessarily their full biological functions. Thus, currently, less than one percent of proteins have experimentally verified biochemical activities. To fill this gap, function prediction algorithms apply metrics of similarity between proteins on the premise that those sufficiently alike in sequence, or structure, will perform identical functions. Although high sensitivity is elusive, network analyses that integrate these metrics together hold the promise of rapid gains in function prediction specificity.


Subjects
Computer Simulation; Proteins/chemistry; Proteins/metabolism; Databases, Protein; Molecular Sequence Annotation/methods; Software
12.
PLoS One ; 5(12): e14286, 2010 Dec 13.
Article in English | MEDLINE | ID: mdl-21179190

ABSTRACT

High-throughput Structural Genomics yields many new protein structures without known molecular function. This study aims to uncover these missing annotations by globally comparing select functional residues across the structural proteome. First, Evolutionary Trace Annotation, or ETA, identifies which proteins have local evolutionary and structural features in common; next, these proteins are linked together into a proteomic network of ETA similarities; then, starting from proteins with known functions, competing functional labels diffuse link-by-link over the entire network. Every node is thus assigned a likelihood z-score for every function, and the most significant one at each node wins and defines its annotation. In high-throughput controls, this competitive diffusion process recovered enzyme activity annotations with 99% and 97% accuracy at half-coverage for the third and fourth Enzyme Commission (EC) levels, respectively. This corresponds to false positive rates 4-fold lower than nearest-neighbor and 5-fold lower than sequence-based annotations. In practice, experimental validation of the predicted carboxylesterase activity in a protein from Staphylococcus aureus illustrated the effectiveness of this approach in the context of an increasingly drug-resistant microbe. This study further links molecular function to a small number of evolutionarily important residues recognizable by Evolutionary Tracing and it points to the specificity and sensitivity of functional annotation by competitive global network diffusion. A web server is at http://mammoth.bcm.tmc.edu/networks.
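
The link-by-link diffusion of competing labels can be illustrated with a simplified label-propagation sketch in Python. Nodes with known functions keep their labels, every other node repeatedly averages the scores of its neighbors, and the highest-scoring label wins at each node. This is an assumption-laden stand-in: it uses plain neighbor averaging rather than the likelihood z-scores described in the abstract, and the node names and labels are hypothetical.

```python
def diffuse_labels(edges, known, n_iter=100):
    """Spread known labels over an undirected graph and pick a winner per node.

    edges: iterable of (node, node) pairs; known: dict node -> label.
    Labeled nodes are held fixed; every other node takes the mean score of
    its neighbors at each iteration. Returns dict node -> winning label,
    or None where no label reached the node. Ties go to the label that
    sorts first (an arbitrary choice).
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    for node in known:
        neighbors.setdefault(node, set())

    labels = sorted(set(known.values()))
    scores = {
        node: {lab: 1.0 if known.get(node) == lab else 0.0 for lab in labels}
        for node in neighbors
    }

    for _ in range(n_iter):
        updated = {}
        for node, nbrs in neighbors.items():
            if node in known or not nbrs:
                updated[node] = scores[node]  # clamp known or isolated nodes
            else:
                updated[node] = {
                    lab: sum(scores[m][lab] for m in nbrs) / len(nbrs)
                    for lab in labels
                }
        scores = updated

    result = {}
    for node, node_scores in scores.items():
        best = max(labels, key=lambda lab: node_scores[lab]) if labels else None
        reached = best is not None and node_scores[best] > 0
        result[node] = best if reached else None
    return result


if __name__ == "__main__":
    # Hypothetical chain: A - x - y - B, with A and B annotated.
    edges = [("A", "x"), ("x", "y"), ("y", "B")]
    known = {"A": "carboxylesterase", "B": "kinase"}
    print(diffuse_labels(edges, known))
```

In the chain example, each unlabeled node ends up with the label of the annotated node it sits closest to, which is the intuition behind letting labels compete as they spread.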


Subjects
Proteins/chemistry; Staphylococcus aureus/genetics; Algorithms; Biochemistry/methods; Cloning, Molecular; Cluster Analysis; Cohort Studies; Databases, Protein; Evolution, Molecular; Genomics; Likelihood Functions; Models, Genetic; Models, Statistical; Proteomics/methods; Reproducibility of Results; Staphylococcus aureus/metabolism
13.
PLoS One ; 3(9): e3110, 2008 Sep 01.
Article in English | MEDLINE | ID: mdl-18769673

ABSTRACT

The transmission of genomic information from coding sequence to protein structure during protein synthesis is subject to stochastic errors. To analyze transmission limits in the presence of spurious errors, Shannon's noisy channel theorem is applied to a communication channel between amino acid sequences and their structures established from a large-scale statistical analysis of protein atomic coordinates. While Shannon's theorem confirms that in close to native conformations information is transmitted with limited error probability, additional random errors in sequence (amino acid substitutions) and in structure (structural defects) trigger a decrease in communication capacity toward a Shannon limit at 0.010 bits per amino acid symbol at which communication breaks down. In several controls, simulated error rates above a critical threshold and models of unfolded structures always produce capacities below this limiting value. Thus an essential biological system can be realistically modeled as a digital communication channel that is (a) sensitive to random errors and (b) restricted by a Shannon error limit. This forms a novel basis for predictions consistent with observed rates of defective ribosomal products during protein synthesis, and with the estimated excess of mutual information in protein contact potentials.
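
Shannon's noisy channel theorem bounds reliable transmission by the channel capacity, which is the maximum, over input distributions, of the mutual information between channel input and output. The Python sketch below computes mutual information in bits from a joint probability table. The two-symbol tables are toy assumptions chosen for clarity and have no connection to the protein statistics used in the study.

```python
import math


def mutual_information(joint):
    """Mutual information I(X;Y) in bits from a joint probability table.

    joint[i][j] is P(X = i, Y = j); entries must be non-negative and sum to 1.
    """
    total = sum(sum(row) for row in joint)
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError("joint probabilities must sum to 1")
    p_x = [sum(row) for row in joint]          # marginal of the input
    p_y = [sum(col) for col in zip(*joint)]    # marginal of the output
    info = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:
                info += p * math.log2(p / (p_x[i] * p_y[j]))
    return info


if __name__ == "__main__":
    noiseless = [[0.5, 0.0], [0.0, 0.5]]       # output always equals input
    useless = [[0.25, 0.25], [0.25, 0.25]]     # output independent of input
    print(mutual_information(noiseless), mutual_information(useless))
```

The noiseless table carries one full bit per symbol and the independent table carries none; a capacity of 0.010 bits per amino acid symbol, as quoted in the abstract, sits very close to the second extreme.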


Subjects
Amino Acids/chemistry; Mutation; Proteins/chemistry; Computational Biology/methods; Databases, Protein; Entropy; Humans; Models, Statistical; Models, Theoretical; Probability; Protein Conformation; Protein Denaturation; Protein Folding; Protein Structure, Tertiary; Stochastic Processes
14.
PLoS One ; 3(5): e2136, 2008 May 07.
Article in English | MEDLINE | ID: mdl-18461181

ABSTRACT

Function prediction frequently relies on comparing genes or gene products to search for relevant similarities. Because the number of protein structures with unknown function is mushrooming, however, we asked here whether such comparisons could be improved by focusing narrowly on the key functional features of protein structures, as defined by the Evolutionary Trace (ET). Therefore a series of algorithms was built to (a) extract local motifs (3D templates) from protein structures based on ET ranking of residue importance; (b) assess their geometric and evolutionary similarity to other structures; and (c) transfer enzyme annotation whenever a plurality was reached across matches. Whereas a prototype had only been 80% accurate and was not scalable, here a speedy new matching algorithm enabled large-scale searches for reciprocal matches and thus raised annotation specificity to 100% in both positive and negative controls of 49 enzymes and 50 non-enzymes, respectively (in one case even identifying an annotation error), while maintaining sensitivity (approximately 60%). Critically, this Evolutionary Trace Annotation (ETA) pipeline requires no prior knowledge of functional mechanisms. It could thus be applied in a large-scale retrospective study of 1218 structural genomics enzymes and reached 92% accuracy. Likewise, it was applied to all 2935 unannotated structural genomics proteins and predicted enzymatic functions in 320 cases: 258 on first pass and 62 more on second pass. Controls and initial analyses suggest that these predictions are reliable. Thus the large-scale evolutionary integration of sequence-structure-function data, here through reciprocal identification of local, functionally important structural features, may contribute significantly to de-orphaning the structural proteome.
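
The reciprocal-match requirement can be sketched as a simple filter in Python: a pair of proteins is kept only when the template of each one matches the other. The input layout, function name, and protein identifiers are hypothetical, and the geometric and evolutionary scoring that decides what counts as a match is not modeled here.

```python
def reciprocal_matches(hits):
    """Return the set of protein pairs whose templates match each other.

    hits: dict mapping a protein to the set of proteins its template matched.
    Each pair is returned once, as a sorted tuple; self-matches are ignored.
    """
    pairs = set()
    for source, targets in hits.items():
        for target in targets:
            if target != source and source in hits.get(target, set()):
                pairs.add(tuple(sorted((source, target))))
    return pairs


if __name__ == "__main__":
    # Hypothetical one-way and two-way template matches.
    hits = {"P1": {"P2", "P3"}, "P2": {"P1"}, "P3": set()}
    print(reciprocal_matches(hits))
```

Requiring agreement in both directions discards one-way hits such as P1 to P3 in the example, which is one plausible reading of how reciprocity raises specificity at some cost in sensitivity.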


Subjects
Evolution, Molecular; Proteome/genetics; Algorithms; Enzymes/metabolism; Proteins/genetics; Proteins/metabolism; Proteome/chemistry; Templates, Genetic