Results 1 - 20 of 33
2.
Nat Commun ; 14(1): 6719, 2023 10 23.
Article in English | MEDLINE | ID: mdl-37872166

ABSTRACT

Immune checkpoint inhibitors (CPIs) are a relatively newly licensed cancer treatment that makes a previously untreatable disease amenable to a potential cure. Combination regimens of anti-CTLA4 and anti-PD-1 show enhanced efficacy but are prone to off-target immune-mediated tissue injury, particularly at the barrier surfaces. To probe the impact of immune checkpoints on intestinal homoeostasis, mice are challenged with anti-CTLA4 and anti-PD-1 immunotherapy and manipulation of the intestinal microbiota. The immune profile of the colon of these mice with CPI-colitis is analysed using bulk RNA sequencing, single-cell RNA sequencing and flow cytometry. CPI-colitis in mice is dependent on the composition of the intestinal microbiota and on the induction of lymphocytes expressing interferon-γ (IFNγ), cytotoxicity molecules and other pro-inflammatory cytokines/chemokines. This pre-clinical model of CPI-colitis could be attenuated following blockade of the IL23/IFNγ axis. Therapeutic targeting of IFNγ-producing lymphocytes or regulatory networks may hold the key to reversing CPI-colitis.


Subjects
Colitis, Interferon-gamma, Animals, Mice, Colitis/chemically induced, Cytokines, Immune Checkpoint Inhibitors, Interferon-gamma/genetics, Lymphocytes
4.
Nat Immunol ; 24(9): 1540-1551, 2023 09.
Article in English | MEDLINE | ID: mdl-37563310

ABSTRACT

Circulating proteins have important functions in inflammation and a broad range of diseases. To identify genetic influences on inflammation-related proteins, we conducted a genome-wide protein quantitative trait locus (pQTL) study of 91 plasma proteins measured using the Olink Target platform in 14,824 participants. We identified 180 pQTLs (59 cis, 121 trans). Integration of pQTL data with eQTL and disease genome-wide association studies provided insight into pathogenesis, implicating lymphotoxin-α in multiple sclerosis. Using Mendelian randomization (MR) to assess causality in disease etiology, we identified both shared and distinct effects of specific proteins across immune-mediated diseases, including directionally discordant effects of CD40 on risk of rheumatoid arthritis versus multiple sclerosis and inflammatory bowel disease. MR implicated CXCL5 in the etiology of ulcerative colitis (UC) and we show elevated gut CXCL5 transcript expression in patients with UC. These results identify targets of existing drugs and provide a powerful resource to facilitate future drug target prioritization.


Subjects
Ulcerative Colitis, Inflammatory Bowel Diseases, Multiple Sclerosis, Humans, Genome-Wide Association Study, Inflammatory Bowel Diseases/genetics, Quantitative Trait Loci, Ulcerative Colitis/drug therapy, Ulcerative Colitis/genetics, Inflammation/genetics, Multiple Sclerosis/genetics, Single Nucleotide Polymorphism
5.
J Clin Invest ; 133(13)2023 07 03.
Article in English | MEDLINE | ID: mdl-37219943

ABSTRACT

Recent transcriptomic-based analysis of diffuse large B cell lymphoma (DLBCL) has highlighted the clinical relevance of LN fibroblast and tumor-infiltrating lymphocyte (TIL) signatures within the tumor microenvironment (TME). However, the immunomodulatory role of fibroblasts in lymphoma remains unclear. Here, by studying human and mouse DLBCL-LNs, we identified the presence of an aberrantly remodeled fibroblastic reticular cell (FRC) network expressing elevated fibroblast activation protein (FAP). RNA-Seq analyses revealed that exposure to DLBCL reprogrammed key immunoregulatory pathways in FRCs, including a switch from homeostatic to inflammatory chemokine expression and elevated antigen-presentation molecules. Functional assays showed that DLBCL-activated FRCs (DLBCL-FRCs) hindered optimal TIL and chimeric antigen receptor (CAR) T cell migration. Moreover, DLBCL-FRCs inhibited CD8+ TIL cytotoxicity in an antigen-specific manner. Notably, the interrogation of patient LNs with imaging mass cytometry identified distinct environments differing in their CD8+ TIL-FRC composition and spatial organization that associated with survival outcomes. We further demonstrated the potential to target inhibitory FRCs to rejuvenate interacting TILs. Cotreating organotypic cultures with FAP-targeted immunostimulatory drugs and a bispecific antibody (glofitamab) augmented antilymphoma TIL cytotoxicity. Our study reveals an immunosuppressive role of FRCs in DLBCL, with implications for immune evasion, disease pathogenesis, and optimizing immunotherapy for patients.


Subjects
Diffuse Large B-Cell Lymphoma, T-Lymphocytes, Humans, Mice, Animals, Diffuse Large B-Cell Lymphoma/pathology, Fibroblasts/metabolism, Lymph Nodes, Tumor Microenvironment
6.
Nat Commun ; 13(1): 5820, 2022 10 03.
Article in English | MEDLINE | ID: mdl-36192482

ABSTRACT

The function of interleukin-22 (IL-22) in intestinal barrier homeostasis remains controversial. Here, we map the transcriptional landscape regulated by IL-22 in human colonic epithelial organoids and evaluate the biological, functional and clinical significance of the IL-22 mediated pathways in ulcerative colitis (UC). We show that IL-22 regulated pro-inflammatory pathways are involved in microbial recognition, cancer and immune cell chemotaxis; most prominently those involving CXCR2+ neutrophils. IL-22-mediated transcriptional regulation of CXC-family neutrophil-active chemokine expression is highly conserved across species, is dependent on STAT3 signaling, and is functionally and pathologically important in the recruitment of CXCR2+ neutrophils into colonic tissue. In UC patients, the magnitude of enrichment of the IL-22 regulated transcripts in colonic biopsies correlates with colonic neutrophil infiltration and is enriched in non-responders to ustekinumab therapy. Our data provide further insights into the biology of IL-22 in human disease and highlight its function in the regulation of pathogenic immune pathways, including neutrophil chemotaxis. The transcriptional networks regulated by IL-22 are functionally and clinically important in UC, impacting patient trajectories and responsiveness to biological intervention.


Subjects
Ulcerative Colitis, CXC Chemokines/metabolism, Ulcerative Colitis/drug therapy, Ulcerative Colitis/genetics, Humans, Interleukin-8/metabolism, Interleukins, Neutrophil Infiltration, Neutrophils/metabolism, Interleukin-8B Receptors/metabolism, Ustekinumab/pharmacology, Ustekinumab/therapeutic use, Interleukin-22
7.
Cell Rep ; 40(13): 111439, 2022 09 27.
Article in English | MEDLINE | ID: mdl-36170836

ABSTRACT

Interactions between the epithelium and the immune system are critical in the pathogenesis of inflammatory bowel disease (IBD). In this study, we mapped the transcriptional landscape of human colonic epithelial organoids in response to different cytokines responsible for mediating canonical mucosal immune responses. By profiling the transcriptome of human colonic organoids treated with the canonical cytokines interferon gamma, interleukin-13, -17A, and tumor necrosis factor alpha with next-generation sequencing, we unveil shared and distinct regulation patterns of epithelial function by different cytokines. An integrative analysis of cytokine responses in diseased tissue from patients with IBD (n = 1,009) reveals a molecular classification of mucosal inflammation defined by gradients of cytokine-responsive transcriptional signatures. Our systems biology approach detected signaling bottlenecks in cytokine-responsive networks and highlighted their translational potential as theragnostic targets in intestinal inflammation.


Subjects
Inflammatory Bowel Diseases, Organoids, Colon/pathology, Cytokines, Humans, Inflammation/pathology, Inflammatory Bowel Diseases/pathology, Interferon-gamma/pharmacology, Interleukin-13, Intestinal Mucosa/pathology, Organoids/pathology, Tumor Necrosis Factor-alpha
8.
Cell Rep Med ; 2(12): 100473, 2021 12 21.
Article in English | MEDLINE | ID: mdl-35028614

ABSTRACT

Despite its role in cancer surveillance, adoptive immunotherapy using γδ T cells has achieved limited efficacy. To enhance trafficking to bone marrow, circulating Vγ9Vδ2 T cells are expanded in serum-free medium containing TGF-β1 and IL-2 (γδ[T2] cells) or medium containing IL-2 alone (γδ[2] cells, as the control). Unexpectedly, the yield and viability of γδ[T2] cells are also increased by TGF-β1, when compared to γδ[2] controls. γδ[T2] cells are less differentiated and yet display increased cytolytic activity, cytokine release, and antitumor activity in several leukemic and solid tumor models. Efficacy is further enhanced by cancer cell sensitization using aminobisphosphonates or Ara-C. A number of contributory effects of TGF-β are described, including prostaglandin E2 receptor downmodulation, TGF-β insensitivity, and upregulated integrin activity. Biological relevance is supported by the identification of a favorable γδ[T2] signature in acute myeloid leukemia (AML). Given their enhanced therapeutic activity and compatibility with allogeneic use, γδ[T2] cells warrant evaluation in cancer immunotherapy.


Subjects
Adoptive Immunotherapy, Acute Myeloid Leukemia/immunology, Acute Myeloid Leukemia/therapy, gamma-delta T-Cell Antigen Receptors/metabolism, Transforming Growth Factor beta1/metabolism, Animals, Bone Marrow Cells/pathology, Tumor Cell Line, Cell Movement, Cell Proliferation, Serum-Free Culture Media/pharmacology, Gene Expression Profiling, Leukemic Gene Expression Regulation, Humans, Immunophenotyping, Acute Myeloid Leukemia/genetics, Acute Myeloid Leukemia/pathology, Lymphocyte Activation, SCID Mice, Prognosis
9.
Commun Biol ; 3(1): 644, 2020 11 04.
Article in English | MEDLINE | ID: mdl-33149188

ABSTRACT

The tumour microenvironment plays a crucial role in the growth and progression of cancer, and the presence of tumour-associated macrophages (TAMs) is associated with poor prognosis. Recent studies have demonstrated that TAMs display transcriptomic, phenotypic, functional and geographical diversity. Here we show that a sialylated tumour-associated glycoform of the mucin MUC1, MUC1-ST, through the engagement of Siglec-9 can specifically and independently induce the differentiation of monocytes into TAMs with a unique phenotype that to the best of our knowledge has not previously been described. These TAMs can recruit and prolong the lifespan of neutrophils, inhibit the function of T cells, degrade basement membrane allowing for invasion, are inefficient at phagocytosis, and can induce plasma clotting. This macrophage phenotype is enriched in the stroma at the edge of breast cancer nests and their presence is associated with poor prognosis in breast cancer patients.


Subjects
Macrophages/physiology, Monocytes/physiology, Mucin-1/metabolism, Breast Neoplasms/genetics, Breast Neoplasms/metabolism, Cell Differentiation, Tumor Cell Line, Female, Neoplastic Gene Expression Regulation, Humans, Mucin-1/genetics
10.
Gut ; 69(3): 578-590, 2020 03.
Article in English | MEDLINE | ID: mdl-31792136

ABSTRACT

OBJECTIVE: The functional role of interleukin-22 (IL22) in chronic inflammation is controversial, and mechanistic insights into how it regulates target tissue are lacking. In this study, we evaluated the functional role of IL22 in chronic colitis and probed mechanisms of IL22-mediated regulation of colonic epithelial cells. DESIGN: To investigate the functional role of IL22 in chronic colitis and how it regulates colonic epithelial cells, we employed a three-dimensional mini-gut epithelial organoid system, in vivo disease models and transcriptomic datasets in human IBD. RESULTS: As well as inducing transcriptional modules implicated in antimicrobial responses, IL22 also coordinated an endoplasmic reticulum (ER) stress response transcriptional programme in colonic epithelial cells. In the colon of patients with active colonic Crohn's disease (CD), there was enrichment of IL22-responsive transcriptional modules and ER stress response modules. Strikingly, in an IL22-dependent model of chronic colitis, targeting IL22 alleviated colonic epithelial ER stress and attenuated colitis. Pharmacological modulation of the ER stress response similarly impacted the severity of colitis. In patients with colonic CD, antibody blockade of IL12p40, which simultaneously blocks IL12 and IL23, the key upstream regulator of IL22 production, alleviated the colonic epithelial ER stress response. CONCLUSIONS: Our data challenge perceptions of IL22 as a predominantly beneficial cytokine in IBD and provide novel insights into the molecular mechanisms of IL22-mediated pathogenicity in chronic colitis. Targeting IL22-regulated pathways and alleviating colonic epithelial ER stress may represent promising therapeutic strategies in patients with colitis. TRIAL REGISTRATION NUMBER: NCT02749630.


Subjects
Colitis/genetics, Crohn Disease/physiopathology, Endoplasmic Reticulum Stress/genetics, Epithelial Cells/physiology, Interleukins/pharmacology, Genetic Transcription, Animals, Anti-Bacterial Agents/pharmacology, Apoptosis/drug effects, Apoptosis/genetics, Cell Survival/drug effects, Chronic Disease, Colitis/blood, Colitis/drug therapy, Colitis/pathology, Colon/pathology, Crohn Disease/pathology, Animal Disease Models, Endoplasmic Reticulum Stress/drug effects, Gastrointestinal Agents/pharmacology, Gastrointestinal Agents/therapeutic use, Humans, Interleukin-17/pharmacology, Interleukin-23/antagonists & inhibitors, Interleukins/blood, Interleukins/genetics, Intestinal Mucosa/pathology, Mice, Organoids, Patient Acuity, Phenylbutyrates/pharmacology, Recombinant Proteins/pharmacology, Genetic Transcription/drug effects, Tunicamycin/pharmacology, Unfolded Protein Response, Ustekinumab/pharmacology, Ustekinumab/therapeutic use, Interleukin-22
11.
PLoS One ; 14(7): e0209958, 2019.
Article in English | MEDLINE | ID: mdl-31335894

ABSTRACT

Protein-protein interaction network data provides valuable information that infers direct links between genes and their biological roles. This information brings a fundamental hypothesis for protein function prediction that interacting proteins tend to have similar functions. With the help of recently-developed network embedding feature generation methods and deep maxout neural networks, it is possible to extract functional representations that encode direct links between protein-protein interactions information and protein function. Our novel method, STRING2GO, successfully adopts deep maxout neural networks to learn functional representations simultaneously encoding both protein-protein interactions and functional predictive information. The experimental results show that STRING2GO outperforms other protein-protein interaction network-based prediction methods and one benchmark method adopted in a recent large scale protein function prediction competition.


Subjects
Computational Biology, Computer Neural Networks, Protein Interaction Mapping, Protein Interaction Maps, Proteins, Humans, Proteins/genetics, Proteins/metabolism
12.
PLoS One ; 13(6): e0198216, 2018.
Article in English | MEDLINE | ID: mdl-29889900

ABSTRACT

Machine learning methods for protein function prediction are urgently needed, especially now that a substantial fraction of known sequences remains unannotated despite the extensive use of functional assignments based on sequence similarity. One major bottleneck supervised learning faces in protein function prediction is the structured, multi-label nature of the problem, because biological roles are represented by lists of terms from hierarchically organised controlled vocabularies such as the Gene Ontology. In this work, we build on recent developments in the area of deep learning and investigate the usefulness of multi-task deep neural networks (MTDNN), which consist of upstream shared layers upon which are stacked in parallel as many independent modules (additional hidden layers with their own output units) as the number of output GO terms (the tasks). MTDNN learns individual tasks partially using shared representations and partially from task-specific characteristics. When no close homologues with experimentally validated functions can be identified, MTDNN gives more accurate predictions than baseline methods based on annotation frequencies in public databases or homology transfers. More importantly, the results show that MTDNN binary classification accuracy is higher than that of alternative machine learning-based methods that do not exploit commonalities and differences among prediction tasks. Interestingly, compared with a single-task predictor, the performance improvement is not linearly correlated with the number of tasks in MTDNN, but medium size models provide more improvement in our case. One of the advantages of MTDNN is that, given a set of features, it does not require a bootstrap feature selection procedure as traditional machine learning algorithms do. Overall, the results indicate that the proposed MTDNN algorithm improves the performance of protein function prediction. On the other hand, there is still considerable room for deep learning techniques to further enhance prediction ability.


Subjects
Protein Databases, Machine Learning, Computer Neural Networks, Proteins, Humans, Proteins/chemistry, Proteins/genetics, Proteins/metabolism
13.
Methods Mol Biol ; 1446: 55-67, 2017.
Article in English | MEDLINE | ID: mdl-27812935

ABSTRACT

Surveys of public sequence resources show that experimentally supported functional information is still completely missing for a considerable fraction of known proteins and is clearly incomplete for an even larger portion. Bioinformatics methods have long made use of very diverse data sources alone or in combination to predict protein function, with the understanding that different data types help elucidate complementary biological roles. This chapter focuses on methods accepting amino acid sequences as input and producing GO term assignments directly as outputs; the relevant biological and computational concepts are presented along with the advantages and limitations of individual approaches.


Subjects
Computational Biology/methods, Gene Ontology, Molecular Sequence Annotation/methods, Animals, Protein Databases, Humans, Phylogeny, Proteins/genetics, Proteins/metabolism
14.
Genome Biol ; 17(1): 184, 2016 09 07.
Article in English | MEDLINE | ID: mdl-27604469

ABSTRACT

BACKGROUND: A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging. RESULTS: We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regards to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2. CONCLUSIONS: The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology specific, that different performance metrics can be used to probe the nature of accurate predictions, and the relative diversity of predictions in the biological process and human phenotype ontologies. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent.


Subjects
Computational Biology, Proteins/chemistry, Software, Structure-Activity Relationship, Algorithms, Protein Databases, Gene Ontology, Humans, Molecular Sequence Annotation, Proteins/genetics
15.
Sci Rep ; 6: 31865, 2016 08 26.
Article in English | MEDLINE | ID: mdl-27561554

ABSTRACT

Predicting protein function has been a major goal of bioinformatics for several decades, and it has gained fresh momentum thanks to recent community-wide blind tests aimed at benchmarking available tools on a genomic scale. Sequence-based predictors, especially those performing homology-based transfers, remain the most popular but increasing understanding of their limitations has stimulated the development of complementary approaches, which mostly exploit machine learning. Here we present FFPred 3, which is intended for assigning Gene Ontology terms to human protein chains, when homology with characterized proteins can provide little aid. Predictions are made by scanning the input sequences against an array of Support Vector Machines (SVMs), each examining the relationship between protein function and biophysical attributes describing secondary structure, transmembrane helices, intrinsically disordered regions, signal peptides and other motifs. This update features a larger SVM library that extends its coverage to the cellular component sub-ontology for the first time, prompted by the establishment of a dedicated evaluation category within the Critical Assessment of Functional Annotation. The effectiveness of this approach is demonstrated through benchmarking experiments, and its usefulness is illustrated by analysing the potential functional consequences of alternative splicing in human and their relationship to patterns of biological features.


Subjects
Computational Biology/methods, Gene Ontology, Humans
16.
Bioinformatics ; 31(6): 857-63, 2015 Mar 15.
Article in English | MEDLINE | ID: mdl-25391399

ABSTRACT

MOTIVATION: A sizeable fraction of eukaryotic proteins contain intrinsically disordered regions (IDRs), which act in unfolded states or by undergoing transitions between structured and unstructured conformations. Over time, sequence-based classifiers of IDRs have become fairly accurate and currently a major challenge is linking IDRs to their biological roles from the molecular to the systems level. RESULTS: We describe DISOPRED3, which extends its predecessor with new modules to predict IDRs and protein-binding sites within them. Based on recent CASP evaluation results, DISOPRED3 can be regarded as state of the art in the identification of IDRs, and our self-assessment shows that it significantly improves over DISOPRED2 because its predictions are more specific across the whole board and more sensitive to IDRs longer than 20 amino acids. Predicted IDRs are annotated as protein binding through a novel SVM based classifier, which uses profile data and additional sequence-derived features. Based on benchmarking experiments with full cross-validation, we show that this predictor generates precise assignments of disordered protein binding regions and that it compares well with other publicly available tools.


Subjects
Computational Biology, Intrinsically Disordered Proteins/chemistry, Intrinsically Disordered Proteins/metabolism, Molecular Sequence Annotation, Software, Binding Sites, Factual Databases, Computer Neural Networks, Protein Binding, Protein Interaction Domains and Motifs
17.
Nucleic Acids Res ; 43(Database issue): D382-6, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25348407

ABSTRACT

Genome3D (http://www.genome3d.eu) is a collaborative resource that provides predicted domain annotations and structural models for key sequences. Since introducing Genome3D in a previous NAR paper, we have substantially extended and improved the resource. We have annotated representatives from Pfam families to improve coverage of diverse sequences and added a fast sequence search to the website to allow users to find Genome3D-annotated sequences similar to their own. We have improved and extended the Genome3D data, enlarging the source data set from three model organisms to 10, and adding VIVACE, a resource new to Genome3D. We have analysed and updated Genome3D's SCOP/CATH mapping. Finally, we have improved the superposition tools, which now give users a more powerful interface for investigating similarities and differences between structural models.


Subjects
Protein Databases, Molecular Sequence Annotation, Tertiary Protein Structure, Algorithms, Genomics, Internet, Molecular Models, Tertiary Protein Structure/genetics, Protein Sequence Analysis
18.
Proteins ; 82 Suppl 2: 98-111, 2014 Feb.
Article in English | MEDLINE | ID: mdl-23900810

ABSTRACT

Here we report on the assessment results of the third experiment to evaluate the state of the art in protein model refinement, where participants were invited to improve the accuracy of initial protein models for 27 targets. Using an array of complementary evaluation measures, we find that five groups performed better than the naïve (null) method, a marked improvement over CASP9, although only three were significantly better. The leading groups also demonstrated the ability to consistently improve both backbone and side chain positioning, while other groups reliably enhanced other aspects of protein physicality. The top-ranked group succeeded in improving the backbone conformation in almost 90% of targets, suggesting a strategy that for the first time in CASP refinement is successful in a clear majority of cases. A number of issues remain unsolved: the majority of groups still fail to improve the quality of the starting models; even successful groups are only able to make modest improvements; and no prediction is more similar to the native structure than to the starting model. Successful refinement attempts also often go unrecognized, as suggested by the relatively larger improvements when predictions not submitted as model 1 are also considered.


Subjects
Computational Biology, Molecular Models, Protein Conformation, Proteins/chemistry, Algorithms, Computational Biology/methods, Computational Biology/standards, Statistical Models, Sequence Alignment
19.
PLoS One ; 8(5): e63754, 2013.
Article in English | MEDLINE | ID: mdl-23717476

ABSTRACT

To understand fully cell behaviour, biologists are making progress towards cataloguing the functional elements in the human genome and characterising their roles across a variety of tissues and conditions. Yet, functional information - either experimentally validated or computationally inferred by similarity - remains completely missing for approximately 30% of human proteins. FFPred was initially developed to bridge this gap by targeting sequences with distant or no homologues of known function and by exploiting clear patterns of intrinsic disorder associated with particular molecular activities and biological processes. Here, we present an updated and improved version, which builds on larger datasets of protein sequences and annotations, and uses updated component feature predictors as well as revised training procedures. FFPred 2.0 includes support vector regression models for the prediction of 442 Gene Ontology (GO) terms, which largely expand the coverage of the ontology and of the biological process category in particular. The GO term list mainly revolves around macromolecular interactions and their role in regulatory, signalling, developmental and metabolic processes. Benchmarking experiments on newly annotated proteins show that FFPred 2.0 provides more accurate functional assignments than its predecessor and the ProtFun server do; also, its assignments can complement information obtained using BLAST-based transfer of annotations, improving especially prediction in the biological process category. Furthermore, FFPred 2.0 can be used to annotate proteins belonging to several eukaryotic organisms with a limited decrease in prediction quality. We illustrate all these points through the use of both precision-recall plots and of the COGIC scores, which we recently proposed as an alternative numerical evaluation measure of function prediction accuracy.


Subjects
Amino Acid Sequence/genetics, Biological Phenomena/genetics, Human Genome/genetics, Molecular Sequence Annotation/methods, Proteins/genetics, Amino Acid Sequence Homology, Computational Biology/methods, Gene Ontology, Humans, Proteome/genetics, Software
20.
BMC Bioinformatics ; 14 Suppl 3: S1, 2013.
Article in English | MEDLINE | ID: mdl-23514099

ABSTRACT

BACKGROUND: Accurate protein function annotation is a severe bottleneck when utilizing the deluge of high-throughput, next generation sequencing data. Keeping database annotations up-to-date has become a major scientific challenge that requires the development of reliable automatic predictors of protein function. The CAFA experiment provided a unique opportunity to undertake comprehensive 'blind testing' of many diverse approaches for automated function prediction. We report on the methodology we used for this challenge and on the lessons we learnt. METHODS: Our method integrates into a single framework a wide variety of biological information sources, encompassing sequence, gene expression and protein-protein interaction data, as well as annotations in UniProt entries. The methodology transfers functional categories based on the results from complementary homology-based and feature-based analyses. We generated the final molecular function and biological process assignments by combining the initial predictions in a probabilistic manner, which takes into account the Gene Ontology hierarchical structure. RESULTS: We propose a novel scoring function called COmbined Graph-Information Content similarity (COGIC) score for the comparison of predicted functional categories and benchmark data. We demonstrate that our integrative approach provides increased scope and accuracy over both the component methods and the naïve predictors. In line with previous studies, we find that molecular function predictions are more accurate than biological process assignments. CONCLUSIONS: Overall, the results indicate that there is considerable room for improvement in the field. It still remains for the community to invest a great deal of effort to make automated function prediction a useful and routine component in the toolbox of life scientists. As already witnessed in other areas, community-wide blind testing experiments will be pivotal in establishing standards for the evaluation of prediction accuracy, in fostering advancements and new ideas, and ultimately in recording progress.


Subjects
Proteins/physiology, Computational Biology/methods, Protein Databases, Molecular Evolution, Gene Expression, Molecular Sequence Annotation, Protein Interaction Mapping, Proteins/chemistry, Proteins/genetics, Sequence Analysis