1.
Brief Bioinform ; 22(6), 2021 Nov 05.
Article in English | MEDLINE | ID: mdl-34415019

ABSTRACT

Over the past few years, meta-analysis has become popular among biomedical researchers for detecting biomarkers across multiple cohort studies with increased predictive power. Combining datasets from different sources increases sample size, thus overcoming the issue related to limited sample size from each individual study and boosting the predictive power. This leads to an increased likelihood of more accurately predicting differentially expressed genes/proteins or significant biomarkers underlying the biological condition of interest. Currently, several meta-analysis methods and tools exist, each having its own strengths and limitations. In this paper, we survey existing meta-analysis methods, and assess the performance of different methods based on results from different datasets as well as assessment from prior knowledge of each method. This provides a reference summary of meta-analysis models and tools, which helps to guide end-users on the choice of appropriate models or tools for given types of datasets and enables developers to consider current advances when planning the development of new meta-analysis models and more practical integrative tools.


Subjects
Algorithms , Data Analysis , Meta-Analysis as Topic , Software , Decision Trees , Humans , Workflow
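
To illustrate why pooling evidence across cohorts can increase power, here is a minimal sketch of Fisher's method, one classical way to combine p-values. The abstract does not say which methods the survey covers, so this is only an assumed example; the function name and the p-values are hypothetical.

```python
import math

def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null, where k = len(pvalues).
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # this is X / 2
    # Closed-form chi-square survival function for even degrees of freedom:
    # P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total

# Hypothetical p-values for one gene in three independent cohorts.
study_pvalues = [0.04, 0.06, 0.10]
combined = fisher_combined_pvalue(study_pvalues)
```

With these toy inputs the combined p-value is smaller than any single-study p-value, which is the effect the abstract describes as boosted power. Fisher's method assumes the studies are independent.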
2.
Brief Bioinform ; 22(4), 2021 Jul 20.
Article in English | MEDLINE | ID: mdl-33129201

ABSTRACT

Advances in high-throughput sequencing technologies have resulted in an exponential growth of publicly accessible biological datasets. In the 'big data' driven 'post-genomic' context, much work is being done to explore human protein-protein interactions (PPIs) for a systems-level analysis to uncover useful signals and gain more insights to advance current knowledge and answer specific biological and health questions. These PPIs are experimentally or computationally predicted and stored in different online databases, and some PPI resources are updated regularly. As with many biological datasets, such regular updates continuously render older PPI datasets potentially outdated. Moreover, while many of these interactions are shared between these online resources, each resource includes its own identified PPIs and none of these databases exhaustively contains all existing human PPI maps. In this context, it is essential to enable the integration of interaction datasets from different resources to generate a PPI map with increased coverage and confidence. To allow researchers to produce integrated human PPI datasets in real time, we introduce the integrated human protein-protein interaction network generator (IHP-PING) tool. IHP-PING is a flexible Python package which generates a human PPI network from freely available online resources. This tool extracts and integrates heterogeneous PPI datasets to generate a unified PPI network, which is stored locally for further applications.


Subjects
Databases, Protein , Programming Languages , Protein Interaction Mapping , Protein Interaction Maps , Humans
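
The integration step described in this abstract can be pictured as a union of undirected edge lists that records which resources support each edge. The sketch below is an assumption-laden illustration and not the IHP-PING API: the function name, resource names and protein identifiers are all hypothetical placeholders.

```python
from collections import defaultdict

def integrate_ppi(sources):
    """Merge PPI edge lists from several resources into one network.

    `sources` maps a resource name to an iterable of (protein_a, protein_b)
    pairs. Interactions are undirected, so (A, B) and (B, A) are the same
    edge. Returns {edge: set of resource names reporting that edge}.
    """
    network = defaultdict(set)
    for resource, pairs in sources.items():
        for a, b in pairs:
            if a == b:
                continue  # self-interactions are skipped in this sketch
            edge = tuple(sorted((a, b)))  # canonical order for undirected edges
            network[edge].add(resource)
    return dict(network)

# Hypothetical toy inputs; names are placeholders, not real database records.
toy_sources = {
    "resource_1": [("A", "B"), ("C", "D")],
    "resource_2": [("B", "A"), ("A", "C")],
}
merged = integrate_ppi(toy_sources)
# Edges reported by more than one resource could be given higher confidence.
shared = {edge for edge, support in merged.items() if len(support) > 1}
```

Keeping the set of supporting resources per edge is one simple way to express both goals named in the abstract: the union gives coverage, and the support count gives a crude confidence signal.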
3.
Brief Bioinform ; 22(4), 2021 Jul 20.
Article in English | MEDLINE | ID: mdl-33341897

ABSTRACT

Current variant calling (VC) approaches have been designed to leverage populations of long-range haplotypes and were benchmarked using populations of European descent, whereas most genetic diversity is found in non-European populations, such as African populations. Working with these genetically diverse populations, VC tools may produce false positive and false negative results, which may produce misleading conclusions in prioritization of mutations, clinical relevancy and actionability of genes. The most prominent question is which tool or pipeline has a high rate of sensitivity and precision when analysing African data with either low or high sequence coverage, given the high genetic diversity and heterogeneity of this data. Here, a total of 100 synthetic Whole Genome Sequencing (WGS) samples, mimicking the genetic profile of African and European subjects for different specific coverage levels (high/low), have been generated to assess the performance of nine different VC tools on these contrasting datasets. The performances of these tools were assessed in terms of false positive and false negative call rates by comparing the simulated golden variants to the variants identified by each VC tool. Combining our results on sensitivity and positive predictive value (PPV), VarDict [PPV = 0.999 and Matthews correlation coefficient (MCC) = 0.832] and BCFtools (PPV = 0.999 and MCC = 0.813) perform best when using African population data on high and low coverage data. Overall, current VC tools produce high false positive and false negative rates when analysing African compared with European data. This highlights the need for development of VC approaches with high sensitivity and precision tailored for populations characterized by high genetic variation and low linkage disequilibrium.


Subjects
Black People/genetics , Databases, Nucleic Acid , Genetic Variation , Genome, Human , White People/genetics , Whole Genome Sequencing , Humans , Linkage Disequilibrium
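
The abstract reports sensitivity, PPV and MCC. As a reminder of how these are derived from confusion-matrix counts, here is a minimal sketch using the standard definitions. The counts are hypothetical, and how true negatives are counted (for example, over non-variant sites in the simulated truth set) is an assumption the abstract does not spell out.

```python
import math

def calling_metrics(tp, fp, fn, tn):
    """Return (sensitivity, PPV, MCC) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal is zero; 0.0 is returned by convention.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, ppv, mcc

# Hypothetical counts for one caller compared against a simulated truth set.
sens, ppv, mcc = calling_metrics(tp=900, fp=1, fn=100, tn=9000)
```

In this toy case PPV is close to 1 while sensitivity is only 0.9, and MCC sits between them. This shows how a caller can have a PPV near 0.999 and a noticeably lower MCC, as in the figures quoted in the abstract.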
4.
Brief Bioinform ; 21(1): 144-155, 2020 Jan 17.
Article in English | MEDLINE | ID: mdl-30462157

ABSTRACT

Advances in human sequencing technologies, coupled with statistical and computational tools, have fostered the development of methods for dating admixture events. These methods have merits and drawbacks in estimating admixture events in multi-way admixed populations. Here, we first provide a comprehensive review and comparison of current methods pertinent to dating admixture events. Second, we assess various admixture dating tools through simulations. Third, we apply the top two assessed methods to real data of a uniquely admixed population from South Africa. Results reveal that current admixture dating models are not sufficiently equipped to estimate ancient admixture events and to identify multi-faceted admixture events in complex multi-way admixed populations. We conclude with a discussion of research areas where further work on admixture dating methods is needed.

5.
Brief Bioinform ; 21(5): 1663-1675, 2020 Sep 25.
Article in English | MEDLINE | ID: mdl-31711157

ABSTRACT

Drug-like compounds are frequently denied approval and use owing to unexpected clinical side effects and cross-reactivity observed during clinical trials. These unexpected outcomes, which significantly increase the attrition rate, centre on the selected drug targets. These targets may be disease candidate proteins or genes, biological pathways, disease-associated microRNAs, disease-related biomarkers, abnormal molecular phenotypes, crucial nodes of biological networks or molecular functions. Such failure is generally linked to several factors, including incomplete knowledge of the drug targets and unpredicted pharmacokinetic expressions upon target interaction or off-target effects. A method to identify targets, especially for polygenic diseases, is essential, and target identification constitutes a major bottleneck in drug development, with the fundamental stage being the identification and validation of drug targets of interest for further downstream processes. Thus, various computational methods have been developed to complement experimental approaches in drug discovery. Here, we present an overview of various computational methods and tools applied in predicting or validating drug targets and drug-like molecules. We provide an overview of their advantages and compare these methods to identify effective methods which are likely to lead to optimal results. We also explore major sources of drug failure, considering the challenges and opportunities involved. This review might guide researchers in selecting the most efficient approach or technique during the computational drug discovery process.


Subjects
Computational Biology/methods , Drug Delivery Systems , Biomarkers/metabolism , Computer Simulation , Drug Discovery , Machine Learning , Molecular Docking Simulation
6.
Brief Bioinform ; 20(2): 690-700, 2019 Mar 25.
Article in English | MEDLINE | ID: mdl-29701762

ABSTRACT

Thousands of genetic associations with diseases have been identified by genome-wide association studies (GWASs), which are conceptually a single-marker-based approach. There are potentially many uses of these identified variants, including a better understanding of the pathogenesis of diseases, new leads for studying underlying risk prediction and clinical prediction of treatment. However, because of inadequate power, GWAS might miss disease genes and/or pathways with weak genetic or strong epistatic effects. Driven by the need to extract useful information from GWAS summary statistics, post-GWAS approaches (PGAs) were introduced. Here, we dissect and discuss advances made in pathway/network-based PGAs, with a particular focus on protein-protein interaction networks that leverage GWAS summary statistics by combining effects of multiple loci, subnetworks or pathways to detect genetic signals associated with complex diseases. We conclude with a discussion of research areas where further work on summary statistic-based methods is needed.


Subjects
Computational Biology/methods , Genome-Wide Association Study , Epistasis, Genetic , Humans , Protein Interaction Maps
7.
Brief Bioinform ; 20(5): 1709-1724, 2019 Sep 27.
Article in English | MEDLINE | ID: mdl-30010715

ABSTRACT

Over the past decade, studies of admixed populations have increasingly gained interest in both medical and population genetics. These studies have so far shed light on the patterns of genetic variation throughout modern human evolution and have improved our understanding of the demographics and adaptive processes of human populations. To date, there exist about 20 methods or tools to deconvolve local ancestry. These methods have merits and drawbacks in estimating local ancestry in multiway admixed populations. In this article, we survey existing ancestry deconvolution methods, with special emphasis on multiway admixture, and compare these methods based on simulation results reported by different studies, computational approaches used, including mathematical and statistical models, and biological challenges related to each method. This should orient users on the choice of an appropriate method or tool for given population admixture characteristics and update researchers on current advances, challenges and opportunities behind existing ancestry deconvolution methods.


Subjects
Evolution, Molecular , Genome, Human , Models, Genetic , Humans
8.
Malar J ; 20(1): 421, 2021 Oct 26.
Article in English | MEDLINE | ID: mdl-34702263

ABSTRACT

BACKGROUND: The emergence and spread of malaria drug resistance have resulted in the need to understand disease mechanisms and, importantly, identify essential targets and potential drug candidates. Malaria infection involves a complex interaction between the host and pathogen; thus, understanding functional interactions between human and Plasmodium falciparum is essential to obtain a holistic view of the genetic architecture of malaria. Several functional interaction studies have extended the understanding of malaria disease, and integrating such datasets would provide further insights towards understanding drug resistance and/or genetic resistance/susceptibility, disease pathogenesis, and drug discovery. METHODS: This study curated and analysed data including pathogen and host selective genes, host and pathogen protein sequence data, protein-protein interaction datasets, and drug data from literature and databases to perform human-host and P. falciparum network-based analysis. An integrative computational framework was developed and found to be reasonably accurate based on various evaluations, applications, and experimental evidence of outputs produced from data-driven analysis. RESULTS: This approach revealed 8 hub protein targets essential for parasite and human host-directed malaria drug therapy. A semantic similarity approach identified 26 potential repurposable drugs, involved in regulating host immune response to inflammatory-driven disorders and/or inhibiting residual malaria infection, that can be appropriated for malaria treatment. Further analysis of host-pathogen network shortest paths enabled the prediction of immune-related biological processes and pathways subverted by P. falciparum to increase its within-host survival. CONCLUSIONS: Host-pathogen network analysis reveals potential drug targets and biological processes and pathways subverted by P. falciparum to enhance its within-host survival. The results presented have implications for drug discovery and will inform experimental studies.


Subjects
Drug Discovery , Drug Resistance/genetics , Malaria, Falciparum/prevention & control , Plasmodium falciparum/genetics , Protein Interaction Mapping , Protozoan Proteins/genetics , Antimalarials/therapeutic use , Computer Simulation , Humans , Plasmodium falciparum/drug effects
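
The abstract mentions two network operations: finding hub proteins and analysing shortest paths in a host-pathogen network. The sketch below shows generic versions of both on a toy graph. It is an assumption-based illustration and not the study's framework: the degree threshold, function names and node names are hypothetical.

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Breadth-first search for one shortest path in an unweighted graph."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk back to the start node
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None  # goal is not reachable from start

def hubs(graph, min_degree):
    """Nodes with at least `min_degree` neighbours (a simple hub criterion)."""
    return {node for node, nbrs in graph.items() if len(nbrs) >= min_degree}

# Hypothetical toy network; node names are placeholders, not real proteins.
edges = [("parasite_1", "host_1"), ("host_1", "host_2"),
         ("host_1", "host_3"), ("host_2", "host_4"), ("host_3", "host_4")]
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

path = shortest_path(graph, "parasite_1", "host_4")
hub_nodes = hubs(graph, min_degree=3)
```

Here every shortest path from the parasite node to `host_4` passes through `host_1`, which is also the only hub. That is the kind of node such an analysis would flag as a candidate target.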
9.
Brief Bioinform ; 19(6): 1141-1152, 2018 Nov 27.
Article in English | MEDLINE | ID: mdl-28520909

ABSTRACT

Populations worldwide currently face several public health challenges, including growing prevalence of infections and the emergence of new pathogenic organisms. The cost and risk associated with drug development make the development of new drugs for several diseases, especially orphan or rare diseases, unappealing to the pharmaceutical industry. Proof of drug safety and efficacy is required before market approval, and rigorous testing makes the drug development process slow and expensive, and it frequently results in failure. This failure is often because of the use of irrelevant targets identified in the early steps of the drug discovery process, suggesting that target identification and validation are cornerstones for the success of drug discovery and development. Here, we present a large-scale data-driven integrative computational framework to extract essential targets and processes from an existing disease-associated data set and enhance target selection by leveraging drug-target-disease association at the systems level. We applied this framework to tuberculosis and Ebola virus diseases, combining heterogeneous data from multiple sources, including protein-protein functional interaction, functional annotation and pharmaceutical data sets. Results obtained demonstrate the effectiveness of the pipeline, leading to the extraction of essential drug targets and to the rational use of existing approved drugs. This provides an opportunity to move toward optimal target-based strategies for screening available drugs and for drug discovery. There is potential for this model to bridge the gap in the production of orphan disease therapies, offering a systematic approach to predict new uses for existing drugs, thereby harnessing their full therapeutic potential.


Subjects
Datasets as Topic , Antitubercular Agents/chemistry , Antitubercular Agents/pharmacology , Antiviral Agents/chemistry , Antiviral Agents/pharmacology , Drug Development , Ebolavirus/drug effects , Hemorrhagic Fever, Ebola/genetics , Host-Pathogen Interactions , Humans , Molecular Sequence Annotation , Mycobacterium tuberculosis/drug effects , Reproducibility of Results , Tuberculosis/genetics
10.
Brief Bioinform ; 18(5): 886-901, 2017 Sep 01.
Article in English | MEDLINE | ID: mdl-27473066

ABSTRACT

Gene Ontology (GO) semantic similarity tools enable retrieval of semantic similarity scores, which incorporate biological knowledge embedded in the GO structure for comparing or classifying different proteins or lists of proteins based on their GO annotations. This facilitates a better understanding of biological phenomena underlying the corresponding experiment and enables the identification of processes pertinent to different biological conditions. Currently, about 14 tools are available, which may play an important role in improving protein analyses at the functional level using different GO semantic similarity measures. Here we survey these tools to provide a comprehensive view of the challenges and advances made in this area to avoid redundant effort in developing features that already exist, or implementing ideas already proven to be obsolete in the context of GO. This helps researchers, tool developers, as well as end users, understand the underlying semantic similarity measures implemented through knowledge of pertinent features of, and issues related to, a particular tool. This should empower users to make appropriate choices for their biological applications and ensure effective knowledge discovery based on GO annotations.


Subjects
Gene Ontology , Humans , Molecular Sequence Annotation , Semantics , Surveys and Questionnaires
11.
Bioinformatics ; 33(19): 2995-3002, 2017 Oct 01.
Article in English | MEDLINE | ID: mdl-28957497

ABSTRACT

MOTIVATION: Recent technological advances in high-throughput sequencing and genotyping have facilitated an improved understanding of genomic structure and disease-associated genetic factors. In this context, simulation models can play a critical role in revealing various evolutionary and demographic effects on genomic variation, enabling researchers to assess existing and design novel analytical approaches. Although various simulation frameworks have been suggested, they do not account for natural selection in admixture processes. Most are tailored to a single chromosome or a genomic region, very few capture large-scale genomic data, and most are not accessible for genomic communities. RESULTS: Here we develop a multi-scenario genome-wide medical population genetics simulation framework called 'FractalSIM'. FractalSIM has the capability to accurately mimic and generate genome-wide data under various genetic models on genetic diversity, genomic variation affecting diseases and DNA sequence patterns of admixed and/or homogeneous populations. Moreover, the framework accounts for natural selection in both homogeneous and admixture processes. The outputs of FractalSIM have been assessed using popular tools, and the results demonstrated its capability to accurately mimic real scenarios. They can be used to evaluate the performance of a range of genomic tools from ancestry inference to genome-wide association studies. AVAILABILITY AND IMPLEMENTATION: The FractalSIM package is available at http://www.cbio.uct.ac.za/FractalSIM. CONTACT: emile.chimusa@uct.ac.za. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genetics, Population/methods , Genomics/methods , Genetic Variation , Genome , Genome-Wide Association Study , High-Throughput Nucleotide Sequencing , Humans , Polymorphism, Single Nucleotide , Selection, Genetic , Sequence Analysis, DNA , Software
12.
BMC Plant Biol ; 17(1): 218, 2017 Nov 23.
Article in English | MEDLINE | ID: mdl-29169324

ABSTRACT

BACKGROUND: Advances in forward and reverse genetic techniques have enabled the discovery and identification of several plant defence genes based on quantifiable disease phenotypes in mutant populations. Existing models for testing the effect of gene inactivation or genes causing these phenotypes do not take into account eventual uncertainty of these datasets and potential noise inherent in the biological experiment used, which may mask downstream analysis and limit the use of these datasets. Moreover, elucidating biological mechanisms driving the induced disease resistance and influencing these observable disease phenotypes has never been systematically tackled, eliciting the need for an efficient model to characterize completely the gene target under consideration. RESULTS: We developed a post-gene silencing bioinformatics (post-GSB) protocol which accounts for potential biases related to the disease phenotype datasets in assessing the contribution of the gene target to the plant defence response. The post-GSB protocol uses Gene Ontology semantic similarity and pathway datasets to generate an enriched process regulatory network based on the functional degeneracy of the plant proteome to help understand the induced plant defence response. We applied this protocol to investigate the effect of NPR1 gene silencing on changes in Arabidopsis thaliana plants following Pseudomonas syringae pathovar tomato strain DC3000 infection. Results indicated that the presence of a functionally active NPR1 reduced the plant's susceptibility to the infection, with about 99% of variability in Pseudomonas spore growth between npr1 mutant and wild-type samples. Moreover, the post-GSB protocol has revealed the coordinate action of target-associated genes and pathways through an enriched process regulatory network, summarizing the potential target-based induced disease resistance mechanism. CONCLUSIONS: This protocol can improve the characterization of the gene target and, potentially, elucidate induced defence response by more effectively utilizing available phenotype information and plant proteome functional knowledge.


Subjects
Arabidopsis Proteins/genetics , Arabidopsis/genetics , Computational Biology/methods , Plant Diseases/genetics , Arabidopsis/microbiology , Arabidopsis Proteins/physiology , Datasets as Topic , Gene Silencing , Models, Genetic , Mutation , Phenotype , Plant Diseases/microbiology , Pseudomonas syringae/physiology
13.
Bioinformatics ; 32(3): 477-9, 2016 Feb 01.
Article in English | MEDLINE | ID: mdl-26476781

ABSTRACT

SUMMARY: Gene Ontology (GO) semantic similarity measures are being used for biological knowledge discovery based on GO annotations by integrating biological information contained in the GO structure into data analyses. To empower users to quickly compute, manipulate and explore these measures, we introduce A-DaGO-Fun (ADaptable Gene Ontology semantic similarity-based Functional analysis). It is a portable software package integrating all known GO information content-based semantic similarity measures and relevant biological applications associated with these measures. A-DaGO-Fun has the advantage not only of handling datasets from the current high-throughput genome-wide applications, but also of allowing users to choose the most relevant semantic similarity approach for their biological applications and to adapt a given module to their needs. AVAILABILITY AND IMPLEMENTATION: A-DaGO-Fun is freely available to the research community at http://web.cbio.uct.ac.za/ITGOM/adagofun. It is implemented in Linux using Python under free software (GNU General Public Licence). CONTACT: gmazandu@cbio.uct.ac.za or Nicola.Mulder@uct.ac.za. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/methods , Gene Ontology , Genes , Molecular Sequence Annotation/methods , Proteins/genetics , Semantics , Software , Databases, Factual , Humans
14.
Bioinformatics ; 32(4): 549-56, 2016 Feb 15.
Article in English | MEDLINE | ID: mdl-26508762

ABSTRACT

MOTIVATION: Despite numerous successful Genome-wide Association Studies (GWAS), detecting variants that have low disease risk still poses a challenge. GWAS may miss disease genes with weak genetic effects or strong epistatic effects due to the single-marker testing approach commonly used. GWAS may thus generate false negative or inconclusive results, suggesting the need for novel methods to combine effects of single nucleotide polymorphisms within a gene to increase the likelihood of fully characterizing the susceptibility gene. RESULTS: We developed ancGWAS, an algebraic graph-based centrality measure that accounts for linkage disequilibrium in identifying significant disease sub-networks by integrating the association signal from GWAS data sets into the human protein-protein interaction (PPI) network. We validated ancGWAS using an association study result from a breast cancer data set, a simulation of interactive disease loci in a simulated complex admixed population, and a pathway-based GWAS simulation. This new approach holds promise for deconvoluting the interactions between genes underlying the pathogenesis of complex diseases. Results obtained yield a novel central breast cancer sub-network of the human interactome implicated in the proteoglycan syndecan-mediated signaling events pathway, which is known to play a major role in mesenchymal tumor cell proliferation, thus providing further insights into breast cancer pathogenesis. AVAILABILITY AND IMPLEMENTATION: The ancGWAS package and documents are available at http://www.cbio.uct.ac.za/~emile/software.html.


Subjects
Breast Neoplasms/pathology , Genetics, Population , Genome-Wide Association Study/methods , Polymorphism, Single Nucleotide/genetics , Protein Interaction Mapping/methods , Signal Transduction , Software , Breast Neoplasms/epidemiology , Breast Neoplasms/genetics , Female , Gene Regulatory Networks , Genetic Predisposition to Disease , Humans , Linkage Disequilibrium
15.
BMC Bioinformatics ; 15: 129, 2014 May 06.
Article in English | MEDLINE | ID: mdl-24885165

ABSTRACT

BACKGROUND: Interaction between proteins is one of the most important mechanisms in the execution of cellular functions. The study of these interactions has provided insight into the functioning of an organism's processes. As of October 2013, Homo sapiens had over 170,000 protein-protein interactions (PPIs) registered in the Interologous Interaction Database, which is only one of the many public resources where protein interactions can be accessed. These numbers exemplify the volume of data that research on the topic has generated. Visualization of large data sets is a well-known strategy to make sense of information, and protein interaction data is no exception. There are several tools that allow the exploration of this data, providing different methods to visualize protein network interactions. However, there is still no native web tool that allows this data to be explored interactively online. RESULTS: Given the advances that web technologies have made recently, it is time to bring these interactive views to the web to provide an easily accessible forum to visualize PPIs. We have created a Web-based Protein Interaction Network Visualizer: PINV, an open source, native web application that facilitates the visualization of protein interactions (http://biosual.cbio.uct.ac.za/pinv.html). We developed PINV as a set of components that follow the protocol defined in BioJS and use the D3 library to create the graphic layouts. We demonstrate the use of PINV with multi-organism interaction networks for a predicted target from Mycobacterium tuberculosis, its interacting partners and its orthologs. CONCLUSIONS: The resultant tool provides an attractive view of complex, fully interactive networks with components that allow the querying, filtering and manipulation of the visible subset. Moreover, as a web resource, PINV simplifies sharing and publishing, activities which are vital in today's collaborative research environments. The source code is freely available for download at https://github.com/4ndr01d3/biosual.


Subjects
Protein Interaction Maps , Software , Computer Graphics , Humans , Internet , Protein Interaction Mapping
16.
BMC Bioinformatics ; 14: 284, 2013 Sep 25.
Article in English | MEDLINE | ID: mdl-24067102

ABSTRACT

BACKGROUND: The use of Gene Ontology (GO) data in protein analyses has largely contributed to the improved outcomes of these analyses. Several GO semantic similarity measures have been proposed in recent years and provide tools that allow the integration of biological knowledge embedded in the GO structure into different biological analyses. There is a need for a unified tool that provides the scientific community with the opportunity to explore these different GO similarity measure approaches and their biological applications. RESULTS: We have developed DaGO-Fun, an online tool available at http://web.cbio.uct.ac.za/ITGOM, which incorporates many different GO similarity measures for exploring, analyzing and comparing GO terms and proteins within the context of GO. It uses GO data and UniProt proteins with their GO annotations as provided by the Gene Ontology Annotation (GOA) project to precompute GO term information content (IC), enabling rapid response to user queries. CONCLUSIONS: The DaGO-Fun online tool presents the advantage of integrating all the relevant IC-based GO similarity measures, including topology- and annotation-based approaches to facilitate effective exploration of these measures, thus enabling users to choose the most relevant approach for their application. Furthermore, this tool includes several biological applications related to GO semantic similarity scores, including the retrieval of genes based on their GO annotations, the clustering of functionally related genes within a set, and term enrichment analysis.


Subjects
Computational Biology/methods , Gene Ontology , Molecular Sequence Annotation/methods , Software , Cluster Analysis , Databases, Genetic , Genes/genetics , Proteins/genetics , Semantics
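
To make the idea of precomputed information content (IC) concrete, here is a minimal sketch of annotation-based IC and the Resnik similarity on a toy ontology. This is not DaGO-Fun's code. The ontology, the annotation counts and all names are hypothetical, and Resnik's measure is used only as a well-known example of an IC-based measure.

```python
import math

# Hypothetical toy ontology: term -> set of parent terms.
PARENTS = {
    "root": set(),
    "t1": {"root"},
    "t2": {"root"},
    "t3": {"t1"},
    "t4": {"t1", "t2"},
}
# Hypothetical number of annotations made directly to each term.
DIRECT_COUNTS = {"root": 0, "t1": 2, "t2": 4, "t3": 6, "t4": 8}

def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen

def information_content():
    """Annotation-based IC: -ln of the fraction of annotations at or below a term.

    Assumes every term ends up with a non-zero cumulative count.
    """
    cumulative = dict.fromkeys(PARENTS, 0)
    for term, count in DIRECT_COUNTS.items():
        for anc in ancestors(term):
            cumulative[anc] += count
    total = cumulative["root"]
    return {t: -math.log(c / total) for t, c in cumulative.items()}

IC = information_content()  # computed once, then reused for every query

def resnik(term_a, term_b):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(IC[t] for t in common)
```

Because `IC` is built once up front, each `resnik` query reduces to a set intersection and a lookup. This matches the abstract's rationale for precomputing IC to answer user queries quickly.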
17.
Int J Mol Sci ; 13(6): 7283-7302, 2012.
Article in English | MEDLINE | ID: mdl-22837694

ABSTRACT

High-throughput biology technologies have yielded complete genome sequences and functional genomics data for several organisms, including crucial microbial pathogens of humans, animals and plants. However, up to 50% of genes within a genome are often labeled "unknown", "uncharacterized" or "hypothetical", limiting our understanding of virulence and pathogenicity of these organisms. Even though biological functions of proteins encoded by these genes are not known, many of them have been predicted to be involved in key processes in these organisms. In particular, for Mycobacterium tuberculosis, some of these "hypothetical" proteins, for example those belonging to the Pro-Glu or Pro-Pro-Glu (PE/PPE) family, have been suspected to play a crucial role in the intracellular lifestyle of this pathogen, and may contribute to its survival in different environments. We have generated a functional interaction network for Mycobacterium tuberculosis proteins and used this to predict functions for many of its hypothetical proteins. Here we performed functional enrichment analysis of these proteins based on their predicted biological functions to identify annotations that are statistically relevant, and analysed and compared network properties of hypothetical proteins to the known proteins. From the statistically significant annotations and network information, we have tried to derive biologically meaningful annotations related to infection and disease. This quantitative analysis provides an overview of the functional contributions of Mycobacterium tuberculosis "hypothetical" proteins to many basic cellular functions, including its adaptability in the host system and its ability to evade the host immune response.


Subjects
Bacterial Proteins/metabolism , Genome, Bacterial/physiology , Models, Biological , Molecular Sequence Annotation , Mycobacterium tuberculosis/metabolism , Bacterial Proteins/genetics , Mycobacterium tuberculosis/genetics
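
Functional enrichment analysis of the kind mentioned in this abstract is often carried out with a one-sided hypergeometric test. The abstract does not name the test the authors used, so the sketch below is an assumed, generic example with hypothetical counts.

```python
from math import comb

def enrichment_pvalue(population, annotated, sample, hits):
    """One-sided hypergeometric p-value P(X >= hits).

    population: number of proteins in the background
    annotated:  background proteins carrying the annotation
    sample:     number of proteins in the set of interest
    hits:       proteins in the set of interest carrying the annotation
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass for every outcome at least as extreme as `hits`.
    tail = sum(comb(annotated, k) * comb(population - annotated, sample - k)
               for k in range(hits, upper + 1))
    return tail / total

# Hypothetical counts, chosen only to illustrate the calculation.
p_value = enrichment_pvalue(population=4000, annotated=100, sample=50, hits=6)
```

With these toy numbers about 1.25 annotated proteins would be expected by chance, so six hits yields a small p-value. In practice such p-values are usually corrected for testing many annotations at once.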
18.
Database (Oxford) ; 2022, 2022 Apr 01.
Article in English | MEDLINE | ID: mdl-35363306

ABSTRACT

The Sickle Cell Disease (SCD) Ontology (SCDO, https://scdontology.h3abionet.org/) provides a comprehensive knowledge base of SCD management, systems and standardized human and machine-readable resources that unambiguously describe terminology and concepts about SCD for researchers, patients and clinicians. The SCDO was launched in 2016 and is continuously updated in quantity, as well as in quality, to effectively support the curation of SCD research, patient databasing and clinical informatics applications. SCD knowledge from the scientific literature is used to update existing SCDO terms and create new terms where necessary. Here, we report major updates to the SCDO, from December 2019 until April 2021, for promoting interoperability and facilitating SCD data harmonization, sharing and integration across different studies and for retrospective multi-site research collaborations. SCDO developers continue to collaborate with the SCD community, clinicians and researchers to improve specific ontology areas and expand standardized descriptions to conditions influencing SCD phenotypic expressions and clinical manifestations of the sickling process, e.g. thalassemias. Database URL: https://scdontology.h3abionet.org/.


Subjects
Anemia, Sickle Cell ; Anemia, Sickle Cell/genetics ; Databases, Factual ; Humans ; Knowledge Bases ; Phenotype ; Retrospective Studies
19.
Front Mol Biosci ; 9: 967205, 2022.
Article in English | MEDLINE | ID: mdl-36452456

ABSTRACT

Advances in omics technologies allow for holistic studies of biological systems. These studies rely on integrative data analysis techniques to obtain a comprehensive view of the dynamics of cellular processes and molecular mechanisms. Network-based integrative approaches have revolutionized multi-omics analysis by providing the framework to represent interactions between multiple different omics layers in a graph, which may faithfully reflect the molecular wiring in a cell. Here we review network-based multi-omics/multi-modal integrative analytical approaches. We classify these approaches according to the type of omics data supported, the methods and/or algorithms implemented, their node and/or edge weighting components, and their ability to identify key nodes and subnetworks. We show how these approaches can be used to identify biomarkers, disease subtypes, crosstalk, causality, and molecular drivers of physiological and pathological mechanisms. We provide insight into the most appropriate methods and tools for research questions, showcased around the aetiology and treatment of COVID-19, that can be informed by multi-omics data integration. We conclude with an overview of challenges associated with multi-omics network-based analysis, such as reproducibility, heterogeneity and (biological) interpretability of the results, and we highlight some future directions for network-based integration.

20.
Front Genet ; 12: 595702, 2021.
Article in English | MEDLINE | ID: mdl-33790942

ABSTRACT

BACKGROUND: Renal dysfunctions are associated with increased morbidity and mortality in sickle cell disease (SCD). Early detection and subsequent management of SCD patients at risk for renal failure and dysfunctions are essential; however, predictors that can identify patients at risk of developing renal dysfunction are not fully understood. METHODS: In this study, we investigated the association of 31 known kidney dysfunction-related variants, detected in African Americans from multi-ethnic genome-wide association study (GWAS) meta-analysis, with kidney dysfunctions in a group of 413 Cameroonian patients with SCD. Systems level bioinformatics analyses were performed, employing protein-protein interaction networks to further interrogate the putative associations. RESULTS: Up to 61% of these patients had micro-albuminuria, 2.4% proteinuria, 71% glomerular hyperfiltration, and 5.9% had renal failure. Six variants are significantly associated with the two quantifiable phenotypes of kidney dysfunction (eGFR and crude-albuminuria): A1CF-rs10994860 (P = 0.02020), SYPL2-rs12136063 (P = 0.04208), and APOL1 (G1)-rs73885319 (P = 0.04610) are associated with eGFR; and WNT7A-rs6795744 (P = 0.03730), TMEM60-rs6465825 (P = 0.02340), and APOL1 (G2)-rs71785313 (P = 0.03803) were observed to be protective against micro-albuminuria. We identified a protein-protein interaction sub-network containing three of these gene variants: APOL1, SYPL2, and WNT7A, connected to the Nuclear factor NF-kappa-B p105 subunit (NFKB1), which was revealed to be essential and might indirectly influence extreme phenotypes. Interestingly, clinical variables, including body mass index (BMI), systolic blood pressure, vaso-occlusive crisis (VOC), and haemoglobin (Hb), better explain the kidney phenotypic variations in this SCD population.
CONCLUSION: This study highlights a strong contribution of haematological indices (Hb level), anthropometric variables (BMI, blood pressure), and clinical events (i.e., vaso-occlusive crisis) to kidney dysfunctions in SCD, rather than known genetic factors. Only 6/31 characterised gene variants are associated with kidney dysfunction phenotypes in SCD samples from Cameroon. The data reveal and emphasise the urgent need to extend GWAS in populations of African ancestries living in Africa, particularly for kidney dysfunctions in SCD.
