1.
Brief Bioinform ; 20(2): 609-623, 2019 03 25.
Article in English | MEDLINE | ID: mdl-29684165

ABSTRACT

Large amounts of data emerging from experiments in molecular medicine are leading to the identification of molecular signatures associated with disease subtypes. The contextualization of these patterns is important for obtaining mechanistic insight into the aberrant processes associated with a disease, and this typically involves the integration of multiple heterogeneous types of data. In this review, we discuss knowledge representations that can be useful to explore the biological context of molecular signatures, in particular three main approaches, namely, pathway mapping approaches, molecular network centric approaches and approaches that represent biological statements as knowledge graphs. We discuss the utility of each of these paradigms, illustrate how they can be leveraged with selected practical examples and identify ongoing challenges for this field of research.


Subject(s)
Computational Biology , Molecular Medicine , Humans , Precision Medicine
3.
Bioinformatics ; 33(7): 1096-1098, 2017 04 01.
Article in English | MEDLINE | ID: mdl-27993779

ABSTRACT

Summary: The goal of this work is to offer a computational framework for exploring data from the Recon2 human metabolic reconstruction model. Advanced user access features have been developed using the Neo4j graph database technology and this paper describes key features such as efficient management of the network data, examples of the network querying for addressing particular tasks, and how query results are converted back to the Systems Biology Markup Language (SBML) standard format. The Neo4j-based metabolic framework facilitates exploration of highly connected and comprehensive human metabolic data and identification of metabolic subnetworks of interest. A Java-based parser component has been developed to convert query results (available in the JSON format) into SBML and SIF formats in order to facilitate further results exploration, enhancement or network sharing. Availability and Implementation: The Neo4j-based metabolic framework is freely available from: https://diseaseknowledgebase.etriks.org/metabolic/browser/ . The Java code files developed for this work are available from the following URL: https://github.com/ibalaur/MetabolicFramework . Contact: ibalaur@eisbm.org. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Metabolic Networks and Pathways , Software , Computer Graphics , Database Management Systems , Databases, Factual , Genome , Humans , Metabolic Networks and Pathways/genetics , Models, Biological
4.
BMC Bioinformatics ; 17(1): 494, 2016 Dec 05.
Article in English | MEDLINE | ID: mdl-27919219

ABSTRACT

BACKGROUND: When modeling in Systems Biology and Systems Medicine, the data is often extensive, complex and heterogeneous. Graphs are a natural way of representing biological networks. Graph databases enable efficient storage and processing of the encoded biological relationships. They furthermore support queries on the structure of biological networks. RESULTS: We present the Java-based framework STON (SBGN TO Neo4j). STON imports and translates metabolic, signalling and gene regulatory pathways represented in the Systems Biology Graphical Notation into a graph-oriented format compatible with the Neo4j graph database. CONCLUSION: STON exploits the power of graph databases to store and query complex biological pathways. This advances the possibility of: i) identifying subnetworks in a given pathway; ii) linking networks across different levels of granularity to address difficulties related to incomplete knowledge representation at single level; and iii) identifying common patterns between pathways in the database.


Subject(s)
Gene Regulatory Networks , Metabolic Networks and Pathways , Signal Transduction , Software , Systems Biology/methods , Databases, Factual , Humans
5.
Plant Physiol ; 167(3): 1158-85, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25596183

ABSTRACT

The hemibiotrophic fungus Zymoseptoria tritici causes Septoria tritici blotch disease of wheat (Triticum aestivum). Pathogen reproduction on wheat occurs without cell penetration, suggesting that dynamic and intimate intercellular communication occurs between fungus and plant throughout the disease cycle. We used deep RNA sequencing and metabolomics to investigate the physiology of plant and pathogen throughout an asexual reproductive cycle of Z. tritici on wheat leaves. Over 3,000 pathogen genes, more than 7,000 wheat genes, and more than 300 metabolites were differentially regulated. Intriguingly, individual fungal chromosomes contributed unequally to the overall gene expression changes. Early transcriptional down-regulation of putative host defense genes was detected in inoculated leaves. There was little evidence for fungal nutrient acquisition from the plant throughout symptomless colonization by Z. tritici, which may instead be utilizing lipid and fatty acid stores for growth. However, the fungus then subsequently manipulated specific plant carbohydrates, including fructan metabolites, during the switch to necrotrophic growth and reproduction. This switch coincided with increased expression of jasmonic acid biosynthesis genes and large-scale activation of other plant defense responses. Fungal genes encoding putative secondary metabolite clusters and secreted effector proteins were identified with distinct infection phase-specific expression patterns, although functional analysis suggested that many have overlapping/redundant functions in virulence. The pathogenic lifestyle of Z. tritici on wheat revealed through this study, involving initial defense suppression by a slow-growing extracellular and nutritionally limited pathogen followed by defense (hyper) activation during reproduction, reveals a subtle modification of the conceptual definition of hemibiotrophic plant infection.


Subject(s)
Ascomycota/metabolism , Chromosomes, Fungal/genetics , Metabolome/genetics , Plant Immunity , Transcriptome/genetics , Triticum/immunology , Triticum/microbiology , Ascomycota/genetics , Ascomycota/growth & development , Disease Progression , Fructans/metabolism , Gene Expression Profiling , Gene Expression Regulation, Fungal , Genes, Fungal , Hexoses/metabolism , Multigene Family , Nitrates/metabolism , Plant Diseases/immunology , Plant Diseases/microbiology , Plant Leaves/microbiology , Reproduction, Asexual , Salicylic Acid/metabolism , Sequence Analysis, RNA , Time Factors
7.
Transl Psychiatry ; 13(1): 108, 2023 04 03.
Article in English | MEDLINE | ID: mdl-37012252

ABSTRACT

Very preterm birth (VPT; ≤32 weeks' gestation) is associated with altered brain development and cognitive and behavioral difficulties across the lifespan. However, heterogeneity in outcomes among individuals born VPT makes it challenging to identify those most vulnerable to neurodevelopmental sequelae. Here, we aimed to stratify VPT children into distinct behavioral subgroups and explore between-subgroup differences in neonatal brain structure and function. 198 VPT children (98 females) previously enrolled in the Evaluation of Preterm Imaging Study (EudraCT 2009-011602-42) underwent Magnetic Resonance Imaging at term-equivalent age and neuropsychological assessments at 4-7 years. Using an integrative clustering approach, we combined neonatal socio-demographic, clinical factors and childhood socio-emotional and executive function outcomes, to identify distinct subgroups of children based on their similarity profiles in a multidimensional space. We characterized resultant subgroups using domain-specific outcomes (temperament, psychopathology, IQ and cognitively stimulating home environment) and explored between-subgroup differences in neonatal brain volumes (voxel-wise Tensor-Based-Morphometry), functional connectivity (voxel-wise degree centrality) and structural connectivity (Tract-Based-Spatial-Statistics). Results showed two- and three-cluster data-driven solutions. The two-cluster solution comprised a 'resilient' subgroup (lower psychopathology and higher IQ, executive function and socio-emotional scores) and an 'at-risk' subgroup (poorer behavioral and cognitive outcomes). No neuroimaging differences between the resilient and at-risk subgroups were found. The three-cluster solution showed an additional third 'intermediate' subgroup, displaying behavioral and cognitive outcomes intermediate between the resilient and at-risk subgroups. 
The resilient subgroup had the most cognitively stimulating home environment and the at-risk subgroup showed the highest neonatal clinical risk, while the intermediate subgroup showed the lowest clinical, but the highest socio-demographic risk. Compared to the intermediate subgroup, the resilient subgroup displayed larger neonatal insular and orbitofrontal volumes and stronger orbitofrontal functional connectivity, while the at-risk group showed widespread white matter microstructural alterations. These findings suggest that risk stratification following VPT birth is feasible and could be used translationally to guide personalized interventions aimed at promoting children's resilience.


Subject(s)
Infant, Extremely Premature , Premature Birth , Female , Humans , Infant, Newborn , Child , Premature Birth/diagnostic imaging , Premature Birth/pathology , Brain/pathology , Magnetic Resonance Imaging/methods , Gestational Age
8.
J Clin Invest ; 133(13), 2023 07 03.
Article in English | MEDLINE | ID: mdl-37219943

ABSTRACT

Recent transcriptomic-based analysis of diffuse large B cell lymphoma (DLBCL) has highlighted the clinical relevance of LN fibroblast and tumor-infiltrating lymphocyte (TIL) signatures within the tumor microenvironment (TME). However, the immunomodulatory role of fibroblasts in lymphoma remains unclear. Here, by studying human and mouse DLBCL-LNs, we identified the presence of an aberrantly remodeled fibroblastic reticular cell (FRC) network expressing elevated fibroblast-activated protein (FAP). RNA-Seq analyses revealed that exposure to DLBCL reprogrammed key immunoregulatory pathways in FRCs, including a switch from homeostatic to inflammatory chemokine expression and elevated antigen-presentation molecules. Functional assays showed that DLBCL-activated FRCs (DLBCL-FRCs) hindered optimal TIL and chimeric antigen receptor (CAR) T cell migration. Moreover, DLBCL-FRCs inhibited CD8+ TIL cytotoxicity in an antigen-specific manner. Notably, the interrogation of patient LNs with imaging mass cytometry identified distinct environments differing in their CD8+ TIL-FRC composition and spatial organization that associated with survival outcomes. We further demonstrated the potential to target inhibitory FRCs to rejuvenate interacting TILs. Cotreating organotypic cultures with FAP-targeted immunostimulatory drugs and a bispecific antibody (glofitamab) augmented antilymphoma TIL cytotoxicity. Our study reveals an immunosuppressive role of FRCs in DLBCL, with implications for immune evasion, disease pathogenesis, and optimizing immunotherapy for patients.


Subject(s)
Lymphoma, Large B-Cell, Diffuse , T-Lymphocytes , Humans , Mice , Animals , Lymphoma, Large B-Cell, Diffuse/pathology , Fibroblasts/metabolism , Lymph Nodes , Tumor Microenvironment
9.
Front Microbiol ; 13: 904451, 2022.
Article in English | MEDLINE | ID: mdl-35774454

ABSTRACT

The cervicovaginal environment in pregnancy is proposed to influence risk of spontaneous preterm birth. The environment is shaped both by the resident microbiota and local inflammation driven by the host response (epithelia, immune cells and mucous). The contributions of the microbiota, metabolome and host defence peptides have been investigated, but less is known about the immune cell populations and how they may respond to the vaginal environment. Here we investigated the maternal immune cell populations at the cervicovaginal interface in early to mid-pregnancy (10-24 weeks of gestation, samples from N = 46 women). We confirmed neutrophils as the predominant cell type and characterised associations between the cervical neutrophil transcriptome and the cervicovaginal metagenome (N = 9 women). In this exploratory study, the neutrophil cell proportion was affected by gestation at sampling but not by birth outcome or ethnicity. Following RNA sequencing (RNA-seq) of a subset of neutrophil enriched cells, principal component analysis of the transcriptome profiles indicated that cells from seven women clustered closely together; these women had a less diverse cervicovaginal microbiota than the remaining three women. Expression of genes involved in neutrophil mediated immunity, activation, degranulation, and other immune functions correlated negatively with Gardnerella vaginalis abundance and positively with Lactobacillus iners abundance, microbes previously associated with birth outcome. The finding that neutrophils are the dominant immune cell type in the cervix during pregnancy and that the cervical neutrophil transcriptome of pregnant women may be modified in response to the microbial cervicovaginal environment, or vice versa, establishes the rationale for investigating associations between the innate immune response, cervical shortening and spontaneous preterm birth and the underlying mechanisms.

10.
Cell Rep ; 40(13): 111439, 2022 09 27.
Article in English | MEDLINE | ID: mdl-36170836

ABSTRACT

Interactions between the epithelium and the immune system are critical in the pathogenesis of inflammatory bowel disease (IBD). In this study, we mapped the transcriptional landscape of human colonic epithelial organoids in response to different cytokines responsible for mediating canonical mucosal immune responses. By profiling the transcriptome of human colonic organoids treated with the canonical cytokines interferon gamma, interleukin-13, -17A, and tumor necrosis factor alpha with next-generation sequencing, we unveil shared and distinct regulation patterns of epithelial function by different cytokines. An integrative analysis of cytokine responses in diseased tissue from patients with IBD (n = 1,009) reveals a molecular classification of mucosal inflammation defined by gradients of cytokine-responsive transcriptional signatures. Our systems biology approach detected signaling bottlenecks in cytokine-responsive networks and highlighted their translational potential as theragnostic targets in intestinal inflammation.


Subject(s)
Inflammatory Bowel Diseases , Organoids , Colon/pathology , Cytokines , Humans , Inflammation/pathology , Inflammatory Bowel Diseases/pathology , Interferon-gamma/pharmacology , Interleukin-13 , Intestinal Mucosa/pathology , Organoids/pathology , Tumor Necrosis Factor-alpha
11.
J Intensive Care Soc ; 23(3): 318-324, 2022 Aug.
Article in English | MEDLINE | ID: mdl-36033245

ABSTRACT

Sepsis is a common illness. Immune responses are considered major drivers of sepsis illness and outcomes. However, there are no proven immunomodulator therapies in sepsis. We hypothesised that in-depth characterisation of sepsis-specific immune trajectory may inform immunomodulation in sepsis-related critical illness. We describe the protocol of the IMMERSE study to address this hypothesis. We include critically ill sepsis patients without documented immune comorbidity and age-sex matched cardiac surgical patients as controls. We plan to perform an in-depth biological characterisation of innate and adaptive immune systems, platelet function, humoral components and transcriptional determinants of the immune system responses in sepsis. This will be done at pre-specified time points during their critical illness to generate an illness trajectory. The sample size for each biological assessment is different and is described in detail. In summary, the overall aim of the IMMERSE study is to increase the granularity of the longitudinal immunology model of sepsis to inform future immunomodulation trials.

12.
Nat Commun ; 13(1): 5820, 2022 10 03.
Article in English | MEDLINE | ID: mdl-36192482

ABSTRACT

The function of interleukin-22 (IL-22) in intestinal barrier homeostasis remains controversial. Here, we map the transcriptional landscape regulated by IL-22 in human colonic epithelial organoids and evaluate the biological, functional and clinical significance of the IL-22 mediated pathways in ulcerative colitis (UC). We show that IL-22 regulated pro-inflammatory pathways are involved in microbial recognition, cancer and immune cell chemotaxis; most prominently those involving CXCR2+ neutrophils. IL-22-mediated transcriptional regulation of CXC-family neutrophil-active chemokine expression is highly conserved across species, is dependent on STAT3 signaling, and is functionally and pathologically important in the recruitment of CXCR2+ neutrophils into colonic tissue. In UC patients, the magnitude of enrichment of the IL-22 regulated transcripts in colonic biopsies correlates with colonic neutrophil infiltration and is enriched in non-responders to ustekinumab therapy. Our data provide further insights into the biology of IL-22 in human disease and highlight its function in the regulation of pathogenic immune pathways, including neutrophil chemotaxis. The transcriptional networks regulated by IL-22 are functionally and clinically important in UC, impacting patient trajectories and responsiveness to biological intervention.


Subject(s)
Colitis, Ulcerative , Chemokines, CXC/metabolism , Colitis, Ulcerative/drug therapy , Colitis, Ulcerative/genetics , Humans , Interleukin-8/metabolism , Interleukins , Neutrophil Infiltration , Neutrophils/metabolism , Receptors, Interleukin-8B/metabolism , Ustekinumab/pharmacology , Ustekinumab/therapeutic use , Interleukin-22
13.
BMC Bioinformatics ; 12: 203, 2011 May 25.
Article in English | MEDLINE | ID: mdl-21612636

ABSTRACT

BACKGROUND: Combining multiple evidence-types from different information sources has the potential to reveal new relationships in biological systems. The integrated information can be represented as a relationship network, and clustering the network can suggest possible functional modules. The value of such modules for gaining insight into the underlying biological processes depends on their functional coherence. The challenges that we wish to address are to define and quantify the functional coherence of modules in relationship networks, so that they can be used to infer function of as yet unannotated proteins, to discover previously unknown roles of proteins in diseases as well as for better understanding of the regulation and interrelationship between different elements of complex biological systems. RESULTS: We have defined the functional coherence of modules with respect to the Gene Ontology (GO) by considering two complementary aspects: (i) the fragmentation of the GO functional categories into the different modules and (ii) the most representative functions of the modules. We have proposed a set of metrics to evaluate these two aspects and demonstrated their utility in Arabidopsis thaliana. We selected 2355 proteins for which experimentally established protein-protein interaction (PPI) data were available. From these we have constructed five relationship networks, four based on single types of data: PPI, co-expression, co-occurrence of protein names in scientific literature abstracts and sequence similarity and a fifth one combining these four evidence types. The ability of these networks to suggest biologically meaningful grouping of proteins was explored by applying Markov clustering and then by measuring the functional coherence of the clusters. CONCLUSIONS: Relationship networks integrating multiple evidence-types are biologically informative and allow more proteins to be assigned to a putative functional module. 
Using additional evidence types concentrates the functional annotations in a smaller number of modules without unduly compromising their consistency. These results indicate that integration of more data sources improves the ability to uncover functional association between proteins, both by allowing more proteins to be linked and producing a network where modular structure more closely reflects the hierarchy in the gene ontology.
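
The Markov clustering step mentioned in this abstract can be illustrated with a minimal sketch. This is a generic textbook version of the algorithm (alternating expansion and inflation on a column-stochastic matrix), not the authors' code or parameters. The function name `markov_cluster`, the inflation value and the toy networks are assumptions made for illustration only.

```python
import numpy as np


def markov_cluster(adjacency, inflation=2.0, max_iter=100, tol=1e-6):
    """Generic Markov clustering sketch on a symmetric, non-negative adjacency matrix."""
    m = np.asarray(adjacency, dtype=float)
    m = m + np.eye(m.shape[0])                # add self-loops so no column is empty
    m = m / m.sum(axis=0, keepdims=True)      # make every column sum to 1
    for _ in range(max_iter):
        previous = m
        m = m @ m                             # expansion: two-step random walks
        m = np.power(m, inflation)            # inflation: favour strong flows
        m = m / m.sum(axis=0, keepdims=True)  # renormalise the columns
        if np.abs(m - previous).max() < tol:
            break
    # Each row that still carries flow acts as an attractor; the columns it
    # reaches form one cluster. Identical clusters are reported only once.
    clusters = []
    for row in m:
        members = frozenset(np.flatnonzero(row > 1e-6).tolist())
        if members and members not in clusters:
            clusters.append(members)
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy network: two triangles of proteins with no links between them.
triangle = np.ones((3, 3)) - np.eye(3)
empty = np.zeros((3, 3))
disjoint = np.block([[triangle, empty], [empty, triangle]])
print(markov_cluster(disjoint))

# Same network with one weak link added between the two triangles.
bridged = disjoint.copy()
bridged[2, 3] = bridged[3, 2] = 0.1
print(markov_cluster(bridged))
```

The inflation exponent controls granularity: larger values tend to produce more, smaller clusters, so any comparison of functional coherence across networks depends on that choice.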


Subject(s)
Arabidopsis Proteins/metabolism , Arabidopsis/genetics , Arabidopsis/metabolism , Metabolomics/methods , Algorithms , Arabidopsis Proteins/genetics , Cluster Analysis , Databases, Genetic , Markov Chains , Metabolic Networks and Pathways
14.
BMC Bioinformatics ; 12: 431, 2011 Nov 03.
Article in English | MEDLINE | ID: mdl-22054122

ABSTRACT

BACKGROUND: In response to the rapid growth of available genome sequences, efforts have been made to develop automatic inference methods to functionally characterize them. Pipelines that infer functional annotation are now routinely used to produce new annotations at a genome scale and for a broad variety of species. These pipelines differ widely in their inference algorithms, confidence thresholds and data sources for reasoning. This heterogeneity makes a comparison of the relative merits of each approach extremely complex. The evaluation of the quality of the resultant annotations is also challenging given there is often no existing gold-standard against which to evaluate precision and recall. RESULTS: In this paper, we present a pragmatic approach to the study of functional annotations. An ensemble of 12 metrics, describing various aspects of functional annotations, is defined and implemented in a unified framework, which facilitates their systematic analysis and inter-comparison. The use of this framework is demonstrated on three illustrative examples: analysing the outputs of state-of-the-art inference pipelines, comparing electronic versus manual annotation methods, and monitoring the evolution of publicly available functional annotations. The framework is part of the AIGO library (http://code.google.com/p/aigo) for the Analysis and the Inter-comparison of the products of Gene Ontology (GO) annotation pipelines. The AIGO library also provides functionalities to easily load, analyse, manipulate and compare functional annotations and also to plot and export the results of the analysis in various formats. CONCLUSIONS: This work is a step toward developing a unified framework for the systematic study of GO functional annotations. This framework has been designed so that new metrics on GO functional annotations can be added in a very straightforward way.


Subject(s)
Cattle/genetics , Genomics/methods , Molecular Sequence Annotation , Vocabulary, Controlled , Algorithms , Animals , Chromosome Mapping , Databases, Genetic , Genome , Humans
15.
Brief Bioinform ; 10(6): 676-93, 2009 Nov.
Article in English | MEDLINE | ID: mdl-19933213

ABSTRACT

The development of a systems based approach to problems in plant sciences requires integration of existing information resources. However, the available information is currently often incomplete and dispersed across many sources and the syntactic and semantic heterogeneity of the data is a challenge for integration. In this article, we discuss strategies for data integration and we use a graph based integration method (Ondex) to illustrate some of these challenges with reference to two example problems concerning integration of (i) metabolic pathway and (ii) protein interaction data for Arabidopsis thaliana. We quantify the degree of overlap for three commonly used pathway and protein interaction information sources. For pathways, we find that the AraCyc database contains the widest coverage of enzyme reactions and for protein interactions we find that the IntAct database provides the largest unique contribution to the integrated dataset. For both examples, however, we observe a relatively small amount of data common to all three sources. Analysis and visual exploration of the integrated networks was used to identify a number of practical issues relating to the interpretation of these datasets. We demonstrate the utility of these approaches to the analysis of groups of coexpressed genes from an individual microarray experiment, in the context of pathway information and for the combination of coexpression data with an integrated protein interaction network.


Subject(s)
Arabidopsis Proteins/genetics , Arabidopsis/genetics , Chromosome Mapping/methods , Database Management Systems , Databases, Genetic , Genome, Plant/genetics , Information Storage and Retrieval/methods , Protein Interaction Mapping/methods , Systems Integration
16.
Nat Commun ; 12(1): 3406, 2021 06 07.
Article in English | MEDLINE | ID: mdl-34099652

ABSTRACT

Prognostic characteristics inform risk stratification in intensive care unit (ICU) patients with coronavirus disease 2019 (COVID-19). We obtained blood samples (n = 474) from hospitalized COVID-19 patients (n = 123), non-COVID-19 ICU sepsis patients (n = 25) and healthy controls (n = 30). Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) RNA was detected in plasma or serum (RNAemia) of COVID-19 ICU patients when neutralizing antibody response was low. RNAemia is associated with higher 28-day ICU mortality (hazard ratio [HR], 1.84 [95% CI, 1.22-2.77] adjusted for age and sex). RNAemia is comparable in performance to the best protein predictors. Mannose binding lectin 2 and pentraxin-3 (PTX3), two activators of the complement pathway of the innate immune system, are positively associated with mortality. Machine learning identified 'Age, RNAemia' and 'Age, PTX3' as the best binary signatures associated with 28-day ICU mortality. In longitudinal comparisons, COVID-19 ICU patients have a distinct proteomic trajectory associated with mortality, with recovery of many liver-derived proteins indicating survival. Finally, proteins of the complement system and galectin-3-binding protein (LGALS3BP) are identified as interaction partners of SARS-CoV-2 spike glycoprotein. LGALS3BP overexpression inhibits spike-pseudoparticle uptake and spike-induced cell-cell fusion in vitro.
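
As a reading aid for the hazard ratio reported above, the sketch below recovers the implied standard error on the log scale from the published interval. It assumes the 95% CI was built symmetrically around the log hazard ratio (a standard Wald interval), which the abstract does not state. The function name `log_scale_summary` is hypothetical.

```python
import math


def log_scale_summary(lower, upper, z=1.96):
    """Return the geometric midpoint of a CI and the implied SE of the log estimate.

    Assumes the interval is exp(log_estimate +/- z * se), i.e. symmetric on the log scale.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    midpoint = math.exp((math.log(lower) + math.log(upper)) / 2)
    return midpoint, se


# Interval reported in the abstract for RNAemia and 28-day ICU mortality.
midpoint, se = log_scale_summary(1.22, 2.77)
print(f"geometric midpoint = {midpoint:.2f}, implied SE of log(HR) = {se:.2f}")
```

Under that assumption the geometric midpoint of the interval should sit close to the reported point estimate of 1.84, which is a quick consistency check on the reported numbers.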


Subject(s)
COVID-19/prevention & control , Critical Care/statistics & numerical data , Proteomics/methods , RNA, Viral/genetics , SARS-CoV-2/genetics , Adult , Animals , Antibodies, Neutralizing/immunology , Antigens, Neoplasm/metabolism , Biomarkers, Tumor/metabolism , C-Reactive Protein/metabolism , COVID-19/metabolism , COVID-19/virology , Female , HEK293 Cells , Humans , Kaplan-Meier Estimate , Male , Middle Aged , RNA, Viral/blood , SARS-CoV-2/metabolism , SARS-CoV-2/physiology , Serum Amyloid P-Component/metabolism , Spike Glycoprotein, Coronavirus/immunology , Spike Glycoprotein, Coronavirus/metabolism , Viral Load/immunology
17.
Sci Data ; 6(1): 149, 2019 08 13.
Article in English | MEDLINE | ID: mdl-31409798

ABSTRACT

Biomedical informatics has traditionally adopted a linear view of the informatics process (collect, store and analyse) in translational medicine (TM) studies; focusing primarily on the challenges in data integration and analysis. However, a data management challenge presents itself with the new lifecycle view of data emphasized by the recent calls for data re-use, long term data preservation, and data sharing. There is currently a lack of dedicated infrastructure focused on the 'manageability' of the data lifecycle in TM research between data collection and analysis. Current community efforts towards establishing a culture for open science prompt the creation of a data custodianship environment for management of TM data assets to support data reuse and reproducibility of research results. Here we present the development of a lifecycle-based methodology to create a metadata management framework based on community driven standards for standardisation, consolidation and integration of TM research data. Based on this framework, we also present the development of a new platform (PlatformTM) focused on managing the lifecycle for translational research data assets.


Subject(s)
Information Dissemination , Medical Informatics , Translational Research, Biomedical , Humans , Metadata , User-Computer Interface
18.
Nucleic Acids Res ; 34(5): 1571-80, 2006.
Article in English | MEDLINE | ID: mdl-16547200

ABSTRACT

An important problem in genomics is automatically clustering homologous proteins when only sequence information is available. Most methods for clustering proteins are local, and are based on simply thresholding a measure related to sequence distance. We first show how locality limits the performance of such methods by analysing the distribution of distances between protein sequences. We then present a global method based on spectral clustering and provide theoretical justification of why it will have a remarkable improvement over local methods. We extensively tested our method and compared its performance with other local methods on several subsets of the SCOP (Structural Classification of Proteins) database, a gold standard for protein structure classification. We consistently observed that the number of clusters that we obtain for a given set of proteins is close to the number of superfamilies in that set; there are fewer singletons; and the method correctly groups most remote homologs. In our experiments, the quality of the clusters as quantified by a measure that combines sensitivity and specificity was consistently better [on average, improvements were 84% over hierarchical clustering, 34% over Connected Component Analysis (CCA) (similar to GeneRAGE) and 72% over another global method, TribeMCL].
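
The contrast between local thresholding and a global spectral method can be sketched on a toy similarity matrix. This is a generic illustration, not the method of the paper: the global step here is a simple two-way split on the sign of the second eigenvector of the normalised graph Laplacian. The function names, the similarity values and the cutoff are all assumptions.

```python
import numpy as np


def threshold_components(similarity, cutoff):
    """Local method: link pairs at or above the cutoff, then label connected components."""
    n = similarity.shape[0]
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and similarity[i, j] >= cutoff:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def spectral_bipartition(similarity):
    """Global method: split on the sign of the second eigenvector of the normalised Laplacian."""
    degree = similarity.sum(axis=1)
    scale = 1.0 / np.sqrt(degree)
    laplacian = np.eye(len(degree)) - scale[:, None] * similarity * scale[None, :]
    _, eigenvectors = np.linalg.eigh(laplacian)  # eigenvalues come back in ascending order
    return (eigenvectors[:, 1] > 0).astype(int).tolist()


# Hypothetical similarities: two families (0-2 and 3-5) with strong within-family
# scores, weak between-family scores, and one spuriously strong pair (2, 3).
similarity = np.full((6, 6), 0.1)
similarity[:3, :3] = 0.8
similarity[3:, 3:] = 0.8
similarity[2, 3] = similarity[3, 2] = 0.7
np.fill_diagonal(similarity, 0.0)

print(threshold_components(similarity, cutoff=0.5))  # the single strong pair chains both families together
print(spectral_bipartition(similarity))              # the split uses all pairwise scores at once
```

In this toy case the thresholded graph becomes one connected component because of a single strong cross-family pair, while the spectral split still separates the two families. Which of the two families receives label 0 or 1 depends on the arbitrary sign of the eigenvector.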


Subject(s)
Sequence Analysis, Protein/methods , Sequence Homology, Amino Acid , Algorithms , Cluster Analysis , Proteins/classification
19.
NPJ Syst Biol Appl ; 4: 21, 2018.
Article in English | MEDLINE | ID: mdl-29872544

ABSTRACT

The development of computational approaches in systems biology has reached a state of maturity that allows their transition to systems medicine. Despite this progress, intuitive visualisation and context-dependent knowledge representation still present a major bottleneck. In this paper, we describe the Disease Maps Project, an effort towards a community-driven computationally readable comprehensive representation of disease mechanisms. We outline the key principles and the framework required for the success of this initiative, including use of best practices, standards and protocols. We apply a modular approach to ensure efficient sharing and reuse of resources for projects dedicated to specific diseases. Community-wide use of disease maps will accelerate the conduct of biomedical research and lead to new disease ontologies defined from mechanism-based disease endotypes rather than phenotypes.

20.
BMC Syst Biol ; 12(1): 60, 2018 05 29.
Article in English | MEDLINE | ID: mdl-29843806

ABSTRACT

BACKGROUND: Multilevel data integration is becoming a major area of research in systems biology. Within this area, multi-'omics datasets on complex diseases are becoming more readily available and there is a need to set standards and good practices for integrated analysis of biological, clinical and environmental data. We present a framework to plan and generate single and multi-'omics signatures of disease states. METHODS: The framework is divided into four major steps: dataset subsetting, feature filtering, 'omics-based clustering and biomarker identification. RESULTS: We illustrate the usefulness of this framework by identifying potential patient clusters based on integrated multi-'omics signatures in a publicly available ovarian cystadenocarcinoma dataset. The analysis generated a higher number of stable and clinically relevant clusters than previously reported, and enabled the generation of predictive models of patient outcomes. CONCLUSIONS: This framework will help health researchers plan and perform multi-'omics big data analyses to generate hypotheses and make sense of their rich, diverse and ever growing datasets, to enable implementation of translational P4 medicine.


Subject(s)
Disease/genetics , Systems Biology/methods , Biomarkers/metabolism , Cluster Analysis , False Positive Reactions , Machine Learning , Quality Control