Results 1 - 20 of 39
1.
Brief Bioinform ; 23(4)2022 07 18.
Article in English | MEDLINE | ID: mdl-35649389

ABSTRACT

Rational vaccine design, especially vaccine antigen identification and optimization, is critical to successful and efficient vaccine development against various infectious diseases including coronavirus disease 2019 (COVID-19). In general, computational vaccine design includes three major stages: (i) identification and annotation of experimentally verified gold standard protective antigens through literature mining, (ii) rational vaccine design using reverse vaccinology (RV) and structural vaccinology (SV) and (iii) post-licensure vaccine success and adverse event surveillance and its usage for vaccine design. Protegen is a database of experimentally verified protective antigens, which can be used as gold standard data for rational vaccine design. RV predicts protective antigen targets primarily from genome sequence analysis. SV refines antigens through structural engineering. Recently, RV and SV approaches, with the support of various machine learning methods, have been applied to COVID-19 vaccine design. The analysis of post-licensure vaccine adverse event report data also provides valuable results in terms of vaccine safety and how vaccines should be used or paused. Ontology standardizes and incorporates heterogeneous data and knowledge in a human- and computer-interpretable manner, further supporting machine learning and vaccine design. Future directions on rational vaccine design are discussed.


Subject(s)
COVID-19 , Vaccines , COVID-19/prevention & control , COVID-19 Vaccines , Data Mining , Humans , Machine Learning , Vaccines/chemistry , Vaccines/genetics , Vaccinology/methods
2.
Nucleic Acids Res ; 49(W1): W671-W678, 2021 07 02.
Article in English | MEDLINE | ID: mdl-34009334

ABSTRACT

Vaccination is one of the most significant inventions in medicine. Reverse vaccinology (RV) is a state-of-the-art technique to predict vaccine candidates from a pathogen's genome(s). To promote vaccine development, we updated Vaxign2, the first web-based vaccine design program using reverse vaccinology with machine learning. Vaxign2 is a comprehensive web server for rational vaccine design, consisting of predictive and computational workflow components. The predictive part includes the original Vaxign filtering-based method and a new machine learning-based method, Vaxign-ML. The benchmarking results using a validation dataset showed that Vaxign-ML had superior prediction performance compared to other RV tools. Besides the prediction component, Vaxign2 implemented various post-prediction analyses to significantly enhance users' capability to refine the prediction results based on different vaccine design rationales and considerably reduce user time to analyze the Vaxign/Vaxign-ML prediction results. Users provide proteome sequences as input data, select candidates based on Vaxign outputs and Vaxign-ML scores, and perform post-prediction analysis. Vaxign2 also includes precomputed results from approximately 1 million proteins in 398 proteomes of 36 pathogens. As a demonstration, Vaxign2 was used to effectively analyse SARS-CoV-2, the coronavirus causing COVID-19. The comprehensive framework of Vaxign2 can support better and more rational vaccine design. Vaxign2 is publicly accessible at http://www.violinet.org/vaxign2.


Subject(s)
Drug Design , Internet , Machine Learning , Software , Vaccines , Vaccinology/methods , Antigens, Viral/chemistry , Antigens, Viral/immunology , COVID-19/virology , COVID-19 Vaccines/chemistry , COVID-19 Vaccines/immunology , Epitopes/chemistry , Epitopes/immunology , Humans , Proteome , SARS-CoV-2/chemistry , SARS-CoV-2/immunology , SARS-CoV-2/metabolism , Spike Glycoprotein, Coronavirus/chemistry , Spike Glycoprotein, Coronavirus/immunology , Vaccines/chemistry , Vaccines/immunology , Workflow
3.
Bioinformatics ; 36(10): 3185-3191, 2020 05 01.
Article in English | MEDLINE | ID: mdl-32096826

ABSTRACT

MOTIVATION: Reverse vaccinology (RV) is a milestone in rational vaccine design, and machine learning (ML) has been applied to enhance the accuracy of RV prediction. However, ML-based RV still faces challenges in prediction accuracy and program accessibility. RESULTS: This study presents Vaxign-ML, a supervised ML classification method to predict bacterial protective antigens (BPAgs). To identify the best ML method with optimized conditions, five ML methods were tested with biological and physicochemical features extracted from well-defined training data. Nested 5-fold cross-validation and leave-one-pathogen-out validation were used to ensure unbiased performance assessment and the capability to predict vaccine candidates against a new emerging pathogen. The best performing model (eXtreme Gradient Boosting) was compared to three publicly available programs (Vaxign, VaxiJen, and Antigenic), one SVM-based method, and one epitope-based method using a high-quality benchmark dataset. Vaxign-ML showed superior performance in predicting BPAgs. Vaxign-ML is hosted on a publicly accessible web server and a standalone version is also available. AVAILABILITY AND IMPLEMENTATION: Vaxign-ML website at http://www.violinet.org/vaxign/vaxign-ml, Docker standalone Vaxign-ML available at https://hub.docker.com/r/e4ong1031/vaxign-ml and source code is available at https://github.com/VIOLINet/Vaxign-ML-docker. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
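
The leave-one-pathogen-out validation named in this abstract can be sketched as follows. This is a minimal illustration only, not the Vaxign-ML implementation: the record layout `(pathogen, features, label)`, the function name `leave_one_pathogen_out`, and the toy data are all assumptions made for this example. The idea is that every protein from one pathogen is held out together, so the test set imitates a pathogen the model has never seen.

```python
from collections import defaultdict


def leave_one_pathogen_out(records):
    """Yield (held_out_pathogen, train, test) splits.

    `records` is a list of (pathogen, features, label) tuples.
    This tuple layout is an assumption made for the sketch.
    """
    by_pathogen = defaultdict(list)
    for rec in records:
        by_pathogen[rec[0]].append(rec)

    # Hold out each pathogen in turn; train on all the others.
    for held_out in sorted(by_pathogen):
        test = by_pathogen[held_out]
        train = [
            rec
            for pathogen, recs in by_pathogen.items()
            if pathogen != held_out
            for rec in recs
        ]
        yield held_out, train, test


# Hypothetical toy data: pathogen names, feature values and labels are invented.
toy = [
    ("pathogen_A", [0.1, 0.9], 1),
    ("pathogen_A", [0.4, 0.2], 0),
    ("pathogen_B", [0.8, 0.7], 1),
    ("pathogen_C", [0.3, 0.1], 0),
]
splits = list(leave_one_pathogen_out(toy))
for held_out, train, test in splits:
    print(held_out, len(train), len(test))
```

A classifier would be fitted on `train` and scored on `test` for each split; the model and the scoring metric are deliberately left out because the abstract does not specify them at this level.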


Subject(s)
Antigens, Bacterial , Vaccinology , Computational Biology , Machine Learning , Software , Supervised Machine Learning
4.
Nucleic Acids Res ; 47(D1): D693-D700, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30365026

ABSTRACT

Virulence factors (VFs) are molecules that allow microbial pathogens to overcome host defense mechanisms and cause disease in a host. It is critical to study VFs to better understand microbial pathogenesis and host defense mechanisms. Victors (http://www.phidias.us/victors) is a novel, manually curated, web-based integrative knowledge base and analysis resource for VFs of pathogens that cause infectious diseases in humans and animals. Currently, Victors contains 5296 VFs obtained via manual annotation from peer-reviewed publications, with 4648, 179, 105 and 364 VFs originating from 51 bacterial, 54 viral, 13 parasitic and 8 fungal species, respectively. Our data analysis identified many VF-specific patterns. Within the global VF pool, cytoplasmic proteins were more common, while adhesins were less common compared to findings on protective vaccine antigens. Many VFs showed homology with host proteins and the human proteins interacting with VFs represented the hubs of human-pathogen interactions. All Victors data are queryable with a user-friendly web interface. The VFs can also be searched by a customized BLAST sequence similarity searching program. These VFs and their interactions with the host are represented in a machine-readable Ontology of Host-Pathogen Interactions. Victors supports 'One Health' research as a vital source of VFs in human and animal pathogens.


Subject(s)
Communicable Diseases/microbiology , Genome, Bacterial , Genome, Fungal , Genome, Viral , Knowledge Bases , Software , Virulence Factors/genetics , Animals , Communicable Diseases/veterinary , Communicable Diseases/virology , Databases, Genetic , Genomics/methods , Genomics/standards , Host-Pathogen Interactions , Humans
5.
BMC Bioinformatics ; 20(Suppl 21): 704, 2019 Dec 23.
Article in English | MEDLINE | ID: mdl-31865910

ABSTRACT

BACKGROUND: Different human responses to the same vaccine were frequently observed. For example, independent studies identified overlapping but different transcriptomic gene expression profiles in Yellow Fever vaccine 17D (YF-17D) immunized human subjects. Different experimental and analysis conditions likely contributed to the observed differences. To investigate this issue, we developed a Vaccine Investigation Ontology (VIO), and applied VIO to classify the different variables and relations among these variables systematically. We then evaluated whether the ontological VIO modeling and VIO-based statistical analysis would contribute to enhanced vaccine investigation studies and a better understanding of vaccine response mechanisms. RESULTS: Our VIO modeling identified many variables related to data processing and analysis such as normalization method, cut-off criteria, software settings including software version. The datasets from two previous studies on human responses to YF-17D vaccine, reported by Gaucher et al. (2008) and Querec et al. (2009), were re-analyzed. We first applied the same LIMMA statistical method to re-analyze the Gaucher data set and identified a large difference in terms of significantly differentiated gene lists compared to the original study. The different results were likely due to the LIMMA version and software package differences. Our second study re-analyzed both Gaucher and Querec data sets but with the same data processing and analysis pipeline. Significant differences in differential gene lists were also identified. In both studies, we found that Gene Ontology (GO) enrichment results overlapped more than the gene lists and enriched pathway lists. The visualization of the identified GO hierarchical structures among the enriched GO terms and their associated ancestor terms using GOfox allowed us to find more associations among enriched but often different GO terms, demonstrating that the use of GO hierarchical relations enhances data analysis. CONCLUSIONS: The ontology-based analysis framework supports standardized representation, integration, and analysis of heterogeneous data of host responses to vaccines. Our study also showed that differences in specific variables might explain different results drawn from similar studies.
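
The comparison described in this abstract, where enriched GO terms overlap more than the underlying gene lists, can be made concrete with a simple set-overlap measure. The sketch below uses the Jaccard index, which is a choice made for this illustration and is not stated in the abstract; the gene symbols and term labels are invented placeholders, not data from the Gaucher or Querec studies.

```python
def jaccard(a, b):
    """Return |a ∩ b| / |a ∪ b| for two collections, or 0.0 if both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical differential gene lists from two analyses of the same vaccine.
study_1_genes = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
study_2_genes = {"GENE_C", "GENE_D", "GENE_E", "GENE_F"}

# Hypothetical enriched term lists derived from those gene lists.
study_1_terms = {"term_1", "term_2", "term_3"}
study_2_terms = {"term_1", "term_2", "term_4"}

gene_overlap = jaccard(study_1_genes, study_2_genes)   # 2 shared of 6 total
term_overlap = jaccard(study_1_terms, study_2_terms)   # 2 shared of 4 total

print(f"gene list overlap: {gene_overlap:.2f}")
print(f"term list overlap: {term_overlap:.2f}")
```

In this toy case the term lists overlap more than the gene lists, which is the kind of pattern the abstract reports; real analyses would substitute the actual differential gene lists and enrichment results.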


Subject(s)
Vaccines , Biological Ontologies , Humans , Software
6.
BMC Bioinformatics ; 20(Suppl 7): 199, 2019 May 01.
Article in English | MEDLINE | ID: mdl-31074377

ABSTRACT

BACKGROUND: Drug adverse events (AEs), also called adverse drug events (ADEs), rank among the leading causes of mortality. The Ontology of Adverse Events (OAE) has been widely used for adverse event (AE) representation, standardization, and analysis. OAE-based ADE-specific ontologies, including ODNAE for drug-associated neuropathy-inducing AEs and OCVDAE for cardiovascular drug AEs, have also been developed and used. However, these ADE-specific ontologies do not consider the effects of other factors (e.g., age and drug-treated disease) on the outcomes of ADEs. With more ontological studies of ADEs, it is also critical to develop a general purpose ontology for representing ADEs for various types of drugs. RESULTS: Our survey of FDA drug package insert documents and other resources for 224 neuropathy-inducing drugs discovered that many drugs (e.g., sirolimus and linezolid) cause different AEs given patients' age or the diseases treated by the drugs. To logically represent the complex relations among drug, drug ingredient and mechanism of action, AE, age, disease, and other related factors, an ontology design pattern was developed and applied to generate a community-driven open-source Ontology of Drug Adverse Events (ODAE). The ODAE development follows the OBO Foundry ontology development principles (e.g., openness and collaboration). Built on a generalizable ODAE design pattern and extending the OAE and NDF-RT ontology, ODAE has represented various AEs associated with over 200 neuropathy-inducing drugs given different age and disease conditions. ODAE is now deposited in Ontobee for browsing and queries. As a demonstration of usage, a SPARQL query of the ODAE knowledge base was developed to identify all the drugs having the mechanisms of ion channel interactions, the diseases treated with the drugs, and AEs after the treatment in adult patients. AE-specific drug class effects were also explored using ODAE and SPARQL. CONCLUSION: ODAE provides a general representation of ADEs given different conditions and can be used for querying scientific questions. ODAE is also a robust knowledge base and platform for semantic and logic representation and study of ADEs of more drugs in the future.


Subject(s)
Adverse Drug Reaction Reporting Systems/statistics & numerical data , Drug-Related Side Effects and Adverse Reactions/etiology , Linezolid/adverse effects , Nervous System Diseases/chemically induced , Pharmaceutical Preparations/administration & dosage , Sirolimus/adverse effects , Software , Adult , Age Factors , Anti-Bacterial Agents/adverse effects , Antibiotics, Antineoplastic/adverse effects , Drug-Related Side Effects and Adverse Reactions/pathology , Humans , Pharmaceutical Preparations/analysis
7.
BMC Bioinformatics ; 20(Suppl 5): 180, 2019 Apr 25.
Article in English | MEDLINE | ID: mdl-31272389

ABSTRACT

BACKGROUND: Stem cells and stem cell lines are widely used in biomedical research. The Cell Ontology (CL) and Cell Line Ontology (CLO) are two community-based OBO Foundry ontologies in the domains of in vivo cells and in vitro cell line cells, respectively. RESULTS: To support standardized stem cell investigations, we have developed an Ontology for Stem Cell Investigations (OSCI). OSCI imports stem cell and cell line terms from CL and CLO, and investigation-related terms from existing ontologies. A novel focus of OSCI is its application in representing metadata types associated with various stem cell investigations. We also applied OSCI to systematically categorize experimental variables in an induced pluripotent stem cell line cell study related to bipolar disorder. In addition, we used a semi-automated literature mining approach to identify over 200 stem cell gene markers. The relations between these genes and stem cells are modeled and represented in OSCI. CONCLUSIONS: OSCI standardizes stem cells found in vivo and in vitro and in various stem cell investigation processes and entities. The presented use cases demonstrate the utility of OSCI in iPSC studies and literature mining related to bipolar disorder.


Subject(s)
Biological Ontologies , Biomedical Research/standards , Animals , Humans , Stem Cells
8.
Nucleic Acids Res ; 45(D1): D347-D352, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27733503

ABSTRACT

Linked Data (LD) aims to achieve interconnected data by representing entities using Uniform Resource Identifiers (URIs), and sharing information using the Resource Description Framework (RDF) and HTTP. Ontologies, which logically represent entities and relations in specific domains, are the basis of LD. Ontobee (http://www.ontobee.org/) is a linked ontology data server that stores ontology information using RDF triple store technology and supports query, visualization and linkage of ontology terms. Ontobee is also the default linked data server for publishing and browsing biomedical ontologies in the Open Biological Ontology (OBO) Foundry (http://obofoundry.org) library. Ontobee currently hosts more than 180 ontologies (including 131 OBO Foundry Library ontologies) with over four million terms. Ontobee provides a user-friendly web interface for querying and visualizing the details and hierarchy of a specific ontology term. Using the eXtensible Stylesheet Language Transformation (XSLT) technology, Ontobee is able to dereference a single ontology term URI, and then output RDF/eXtensible Markup Language (XML) for computer processing or display the HTML information on a web browser for human users. Statistics and detailed information are generated and displayed for each ontology listed in Ontobee. In addition, a SPARQL web interface is provided for custom advanced SPARQL queries of one or multiple ontologies.
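
The dual output described in this abstract (RDF/XML for programs, HTML for people) is the usual Linked Data pattern of dereferencing one URI in more than one representation. A minimal sketch of how a client might ask for one or the other is shown below. It only builds the HTTP request and sends nothing over the network. The term URI is a made-up placeholder, and the assumption that the server chooses the representation from the `Accept` header is ours; the abstract states only that both outputs exist.

```python
from urllib.request import Request


def build_term_request(term_uri, want_rdf=True):
    """Build (but do not send) a GET request for an ontology term URI.

    The Accept header asks for RDF/XML or HTML. Whether a given server
    honours it is an assumption of this sketch.
    """
    accept = "application/rdf+xml" if want_rdf else "text/html"
    return Request(term_uri, headers={"Accept": accept})


# Hypothetical term URI, used only to show the shape of the request.
EXAMPLE_TERM = "http://purl.obolibrary.org/obo/EXAMPLE_0000001"

machine_request = build_term_request(EXAMPLE_TERM, want_rdf=True)
human_request = build_term_request(EXAMPLE_TERM, want_rdf=False)

print(machine_request.full_url, machine_request.get_header("Accept"))
print(human_request.full_url, human_request.get_header("Accept"))
```

Passing either request to `urllib.request.urlopen` would perform the actual fetch; that step is omitted so the example runs offline.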


Subject(s)
Biological Ontologies , Databases, Factual , Software , Web Browser
9.
BMC Bioinformatics ; 18(Suppl 17): 557, 2017 12 21.
Article in English | MEDLINE | ID: mdl-29322915

ABSTRACT

BACKGROUND: The Experimental Factor Ontology (EFO) is an application ontology driven by experimental variables including cell lines to organize and describe the diverse experimental variables and data residing in EMBL-EBI resources. The Cell Line Ontology (CLO) is an OBO community-based ontology that contains information on immortalized cell lines and relevant experimental components. EFO integrates and extends ontologies from the bio-ontology community to drive a number of practical applications. It is desirable that the community shares design patterns and therefore that EFO reuses the cell line representation from the Cell Line Ontology (CLO). There are, however, challenges to be addressed when developing a common ontology design pattern for representing cell lines in both EFO and CLO. RESULTS: In this study, we developed a strategy to compare and map cell line terms between EFO and CLO. We examined Cellosaurus resources for EFO-CLO cross-references. Text labels of cell lines from both ontologies were verified by biological information axiomatized in each source. The study resulted in the identification of 873 EFO-CLO aligned and 344 EFO unique immortalized permanent cell lines. All of these cell lines were updated to CLO and the cell line related information was merged. A design pattern that integrates EFO and CLO was also developed. CONCLUSION: Our study compared, aligned, and synchronized the cell line information between CLO and EFO. The final updated CLO will be examined as the candidate ontology to import and replace eligible EFO cell line classes thereby supporting the interoperability in the bio-ontology domain. Our mapping pipeline illustrates the use of ontology in aiding biological data standardization and integration through the biological and semantics content of cell lines.


Subject(s)
Algorithms , Biological Ontologies , Cell Physiological Phenomena , Computational Biology/methods , Databases, Factual , Gene Expression Profiling , Cell Line , Data Mining , Humans , Semantics
10.
BMC Bioinformatics ; 18(Suppl 17): 556, 2017 12 21.
Article in English | MEDLINE | ID: mdl-29322930

ABSTRACT

BACKGROUND: Aiming to understand cellular responses to different perturbations, the NIH Common Fund Library of Integrated Network-based Cellular Signatures (LINCS) program involves many institutes and laboratories working on over a thousand cell lines. The community-based Cell Line Ontology (CLO) is selected as the default ontology for LINCS cell line representation and integration. RESULTS: CLO has consistently represented all 1097 LINCS cell lines and included information extracted from the LINCS Data Portal and ChEMBL. Using MCF 10A cell line cells as an example, we demonstrated how to ontologically model LINCS cellular signatures such as their non-tumorigenic epithelial cell type, three-dimensional growth, latrunculin-A-induced actin depolymerization and apoptosis, and cell line transfection. A CLO subset view of LINCS cell lines, named LINCS-CLOview, was generated to support systematic LINCS cell line analysis and queries. In summary, LINCS cell lines are currently associated with 43 cell types, 131 tissues and organs, and 121 cancer types. The LINCS-CLOview information can be queried using SPARQL scripts. CONCLUSIONS: CLO was used to support ontological representation, integration, and analysis of over a thousand LINCS cell line cells and their cellular responses.


Subject(s)
Breast/metabolism , Computational Biology/methods , Gene Expression Regulation , High-Throughput Screening Assays , Neoplasms/genetics , Apoptosis/drug effects , Breast/cytology , Breast/drug effects , Cell Line , Cells, Cultured , Female , Gene Expression Profiling , Humans , Macrolides/pharmacology , Neoplasms/drug therapy , Neoplasms/pathology , Thiazolidines/pharmacology
11.
Int J Mol Sci ; 18(2)2017 Feb 21.
Article in English | MEDLINE | ID: mdl-28230771

ABSTRACT

As one of the most influential and troublesome human pathogens, Acinetobacter baumannii (A. baumannii) has emerged with many multidrug-resistant strains. After collecting 33 complete A. baumannii genomes and 84 representative antibiotic resistance determinants, we used the Vaxign reverse vaccinology approach to predict classical type vaccine candidates against A. baumannii infections and new type vaccine candidates against antibiotic resistance. Our genome analysis identified 35 outer membrane or extracellular adhesins that are conserved among all 33 genomes, have no human protein homology, and have less than 2 transmembrane helices. These 35 antigens include 11 TonB dependent receptors, 8 porins, 7 efflux pump proteins, and 2 fimbrial proteins (FilF and CAM87009.1). CAM86003.1 was predicted to be an adhesin outer membrane protein absent from 3 antibiotic-sensitive strains and conserved in 21 antibiotic-resistant strains. Feasible anti-resistance vaccine candidates also include one extracellular protein (QnrA), 3 RND type outer membrane efflux pump proteins, and 3 CTX-M type β-lactamases. Among 39 β-lactamases, A. baumannii CTX-M-2, -5, and -43 enzymes are predicted as adhesins and better vaccine candidates than other β-lactamases to induce preventive immunity and enhance antibiotic treatments. This report represents the first reverse vaccinology study to systematically predict vaccine antigen candidates against antibiotic resistance for a microbial pathogen.
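
The selection criteria listed in this abstract (outer membrane or extracellular localization, predicted adhesin, conserved in all 33 genomes, no human homology, fewer than 2 transmembrane helices) amount to a conjunction of filters over per-protein predictions. The sketch below shows that logic only. It is not the Vaxign code: the `ProteinCall` record, its field names, and the example proteins are hypothetical, and in practice each field would come from a separate prediction tool.

```python
from dataclasses import dataclass


@dataclass
class ProteinCall:
    """Per-protein predictions; field names are assumptions of this sketch."""
    protein_id: str
    localization: str
    is_adhesin: bool
    genomes_with_homolog: int
    has_human_homolog: bool
    transmembrane_helices: int


def passes_filters(p, total_genomes=33):
    """Apply the criteria from the abstract as a single conjunction."""
    return (
        p.localization in {"outer membrane", "extracellular"}
        and p.is_adhesin
        and p.genomes_with_homolog == total_genomes  # conserved in every genome
        and not p.has_human_homolog
        and p.transmembrane_helices < 2
    )


# Hypothetical proteins; IDs and values are invented for illustration.
calls = [
    ProteinCall("prot_001", "outer membrane", True, 33, False, 0),
    ProteinCall("prot_002", "cytoplasmic", True, 33, False, 0),
    ProteinCall("prot_003", "extracellular", True, 30, False, 1),
    ProteinCall("prot_004", "extracellular", True, 33, False, 3),
]
candidates = [p.protein_id for p in calls if passes_filters(p)]
print(candidates)
```

Writing the criteria as one predicate keeps each threshold visible and easy to relax, for example when looking for proteins conserved only among resistant strains.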


Subject(s)
Acinetobacter baumannii/drug effects , Acinetobacter baumannii/immunology , Bacterial Vaccines/immunology , Drug Resistance, Bacterial/immunology , Epitopes/immunology , Acinetobacter Infections/immunology , Acinetobacter Infections/prevention & control , Acinetobacter baumannii/genetics , Amino Acid Sequence , Anti-Bacterial Agents/pharmacology , Computational Biology/methods , Conserved Sequence , Epitopes/chemistry , Epitopes/genetics , Genes, Bacterial , Genome, Bacterial , Genomics/methods , Humans , Microbial Sensitivity Tests
12.
Comput Biol Med ; 171: 108114, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38401450

ABSTRACT

BACKGROUND: Bacteria can have beneficial effects on our health and environment; however, many are responsible for serious infectious diseases, warranting the need for vaccines against such pathogens. Bioinformatic and experimental technologies are crucial for the development of vaccines. The vaccine design pipeline requires identification of bacteria-specific antigens that can be recognized and can induce a response by the immune system upon infection. Immune system recognition is influenced by the location of a protein. Methods have been developed to determine the subcellular localization (SCL) of proteins in prokaryotes and eukaryotes. Bioinformatic tools such as PSORTb can be employed to determine SCL of proteins, which would be tedious to perform experimentally. Unfortunately, PSORTb often predicts many proteins as having an "Unknown" SCL, reducing the number of antigens to evaluate as potential vaccine targets. METHOD: We present a new pipeline called subCellular lOcalization prediction for BacteRiAl Proteins (mtx-COBRA). mtx-COBRA uses Meta's protein language model, Evolutionary Scale Modeling, combined with an Extreme Gradient Boosting machine learning model to identify SCL of bacterial proteins based on amino acid sequence. This pipeline is trained on a curated dataset that combines data from UniProt and the publicly available ePSORTdb dataset. RESULTS: Using benchmarking analyses, nested 5-fold cross-validation, and leave-one-pathogen-out methods, followed by testing on the held-out dataset, we show that our pipeline predicts the SCL of bacterial proteins more accurately than PSORTb. CONCLUSIONS: mtx-COBRA provides an accessible pipeline that can more efficiently classify bacterial proteins with currently "Unknown" SCLs than existing bioinformatic and experimental methods.


Subject(s)
Bacterial Proteins , Vaccines , Bacterial Proteins/chemistry , Software , Bacteria , Amino Acid Sequence , Computational Biology/methods
13.
NAR Cancer ; 6(1): zcad060, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38204924

ABSTRACT

Cancer vaccines have been increasingly studied and developed to prevent or treat various types of cancers. To systematically survey and analyze different reported cancer vaccines, we developed CanVaxKB (https://violinet.org/canvaxkb), the first web-based cancer vaccine knowledgebase that compiles over 670 therapeutic or preventive cancer vaccines that have been experimentally verified to be effective at various stages. Vaccine construction and host response data are also included. These cancer vaccines are developed against various cancer types such as melanoma, hematological cancer, and prostate cancer. CanVaxKB has stored 263 genes or proteins that serve as cancer vaccine antigen genes, which we have collectively termed 'canvaxgens'. The three most frequently used canvaxgens are PMEL, MLANA and CTAG1B, often targeting multiple cancer types. A total of 193 canvaxgens are also reported in cancer-related ONGene, Network of Cancer Genes and/or Sanger Cancer Gene Consensus databases. Enriched functional annotations and clusters of canvaxgens were identified and analyzed. User-friendly web interfaces are searchable for querying and comparing cancer vaccines. CanVaxKB cancer vaccines are also semantically represented by the community-based Vaccine Ontology to support data exchange. Overall, CanVaxKB is a timely and vital cancer vaccine source that facilitates efficient collection and analysis, further helping researchers and physicians to better understand cancer mechanisms.

14.
Sci Transl Med ; 15(710): eadg6050, 2023 08 23.
Article in English | MEDLINE | ID: mdl-37611082

ABSTRACT

The RSVPreF3-AS01 vaccine, containing the respiratory syncytial virus (RSV) prefusion F protein and the AS01 adjuvant, was previously shown to boost neutralization responses against historical RSV strains and to be efficacious in preventing RSV-associated lower respiratory tract diseases in older adults. Although RSV F is highly conserved, variation does exist between strains. Here, we characterized variations in the major viral antigenic sites among contemporary RSV sequences when compared with RSVPreF3 and showed that, in older adults, RSVPreF3-AS01 broadly boosts neutralization responses against currently dominant and antigenically distant RSV strains. RSV-neutralizing responses are thought to play a central role in preventing RSV infection. Therefore, the breadth of RSVPreF3-AS01-elicited neutralization responses may contribute to vaccine efficacy against contemporary RSV strains and those that may emerge in the future.


Subject(s)
Respiratory Syncytial Virus Infections , Vaccines , Humans , Aged , Respiratory Syncytial Viruses , Respiratory Syncytial Virus Infections/prevention & control , Antigens, Viral
15.
Methods Mol Biol ; 2414: 1-16, 2022.
Article in English | MEDLINE | ID: mdl-34784028

ABSTRACT

Reverse vaccinology (RV) is the state-of-the-art vaccine development strategy that starts with predicting vaccine antigens by bioinformatics analysis of the whole genome of a pathogen of interest. Vaxign is the first web-based RV vaccine prediction method based on calculating and filtering different criteria of proteins. Building on advances in new technologies and the accumulation of protective antigen data, Vaxign-ML is a new Vaxign machine learning (ML) method that predicts vaccine antigens using extreme gradient boosting. Using a benchmark dataset, Vaxign-ML showed superior performance in comparison to existing open-source RV tools. Vaxign-ML is also implemented within the web-based Vaxign platform to support easy and intuitive access. Vaxign-ML is also available as a command-based software package for more advanced and customizable vaccine antigen prediction. Both Vaxign and Vaxign-ML have been applied to predict SARS-CoV-2 (cause of COVID-19) and Brucella vaccine antigens to demonstrate the integrative approach to analyze and select vaccine candidates using the Vaxign platform.


Subject(s)
Machine Learning , Vaccines , Vaccinology , Brucella Vaccine , COVID-19 , COVID-19 Vaccines , Computational Biology , Humans
16.
Front Immunol ; 13: 1066733, 2022.
Article in English | MEDLINE | ID: mdl-36591248

ABSTRACT

COVID-19 often manifests with different outcomes in different patients, highlighting the complexity of the host-pathogen interactions involved in manifestations of the disease at the molecular and cellular levels. In this paper, we propose a set of postulates and a framework for systematically understanding complex molecular host-pathogen interaction networks. Specifically, we first propose four host-pathogen interaction (HPI) postulates as the basis for understanding molecular and cellular host-pathogen interactions and their relations to disease outcomes. These four postulates cover the evolutionary dispositions involved in HPIs, the dynamic nature of HPI outcomes, roles that HPI components may occupy leading to such outcomes, and HPI checkpoints that are critical for specific disease outcomes. Based on these postulates, an HPI Postulate and Ontology (HPIPO) framework is proposed to apply interoperable ontologies to systematically model and represent various granular details and knowledge within the scope of the HPI postulates, in a way that will support AI-ready data standardization, sharing, integration, and analysis. As a demonstration, the HPI postulates and the HPIPO framework were applied to study COVID-19 with the Coronavirus Infectious Disease Ontology (CIDO), leading to a novel approach to rational design of drug/vaccine cocktails aimed at interrupting processes occurring at critical host-coronavirus interaction checkpoints. Furthermore, the host-coronavirus protein-protein interactions (PPIs) relevant to COVID-19 were predicted and evaluated based on prior knowledge of curated PPIs and domain-domain interactions, and how such studies can be further explored with the HPI postulates and the HPIPO framework is discussed.


Subject(s)
COVID-19 , Humans , Host-Pathogen Interactions
17.
J Biomed Semantics ; 13(1): 25, 2022 10 21.
Article in English | MEDLINE | ID: mdl-36271389

ABSTRACT

BACKGROUND: The current COVID-19 pandemic and the previous SARS/MERS outbreaks of 2003 and 2012 have resulted in a series of major global public health crises. We argue that in the interest of developing effective and safe vaccines and drugs and to better understand coronaviruses and associated disease mechanisms it is necessary to integrate the large and exponentially growing body of heterogeneous coronavirus data. Ontologies play an important role in standard-based knowledge and data representation, integration, sharing, and analysis. Accordingly, we initiated the development of the community-based Coronavirus Infectious Disease Ontology (CIDO) in early 2020. RESULTS: As an Open Biomedical Ontology (OBO) library ontology, CIDO is open source and interoperable with other existing OBO ontologies. CIDO is aligned with the Basic Formal Ontology and Viral Infectious Disease Ontology. CIDO has imported terms from over 30 OBO ontologies. For example, CIDO imports all SARS-CoV-2 protein terms from the Protein Ontology, COVID-19-related phenotype terms from the Human Phenotype Ontology, and over 100 COVID-19 terms for vaccines (both authorized and in clinical trial) from the Vaccine Ontology. CIDO systematically represents variants of SARS-CoV-2 viruses and over 300 amino acid substitutions therein, along with over 300 diagnostic kits and methods. CIDO also describes hundreds of host-coronavirus protein-protein interactions (PPIs) and the drugs that target proteins in these PPIs. CIDO has been used to model COVID-19 related phenomena in areas such as epidemiology. The scope of CIDO was evaluated by visual analysis supported by a summarization network method. CIDO has been used in various applications such as term standardization, inference, natural language processing (NLP) and clinical data integration. We have applied the amino acid variant knowledge present in CIDO to analyze differences between SARS-CoV-2 Delta and Omicron variants. CIDO's integrative host-coronavirus PPIs and drug-target knowledge has also been used to support drug repurposing for COVID-19 treatment. CONCLUSION: CIDO represents entities and relations in the domain of coronavirus diseases with a special focus on COVID-19. It supports shared knowledge representation, data and metadata standardization and integration, and has been used in a range of applications.


Subject(s)
COVID-19 , Communicable Diseases , Coronavirus , Vaccines , Humans , SARS-CoV-2 , Pandemics , Amino Acids , COVID-19 Drug Treatment
18.
Vaccines (Basel) ; 9(10)2021 Sep 28.
Article in English | MEDLINE | ID: mdl-34696207

ABSTRACT

Tuberculosis (TB) is the leading cause of death from any single infectious agent, having led to 1.4 million deaths in 2019 alone. Moreover, an estimated one-quarter of the global population is latently infected with Mycobacterium tuberculosis (MTB), presenting a huge pool of potential future disease. Nonetheless, the only currently licensed TB vaccine fails to prevent the activation of latent TB infections (LTBI). These facts together illustrate the desperate need for a more effective TB vaccine strategy that can prevent both primary infection and the activation of LTBI. In this study, we employed a machine learning-based reverse vaccinology approach to predict the likelihood that each protein within the proteome of MTB laboratory reference strain H37Rv would be a protective antigen (PAg). The proteins predicted most likely to be a PAg were assessed for their membership in a protein family of previously established PAgs, the relevance of their biological processes to MTB virulence and latency, and finally the immunogenic potential that they may provide in terms of the number of promiscuous epitopes within each. This study led to the identification of 16 proteins with the greatest vaccine potential for further in vitro and in vivo studies. It also demonstrates the value of computational methods in vaccine development.

19.
Comput Struct Biotechnol J ; 19: 518-529, 2021.
Article in English | MEDLINE | ID: mdl-33398234

ABSTRACT

The development of effective and safe vaccines is the ultimate way to efficiently stop the ongoing COVID-19 pandemic, which is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Built on the fact that SARS-CoV-2 utilizes the association of its Spike (S) protein with the human angiotensin-converting enzyme 2 (ACE2) receptor to invade host cells, we computationally redesigned the S protein sequence to improve its immunogenicity and antigenicity. Toward this purpose, we extended an evolutionary protein design algorithm, EvoDesign, to create thousands of stable S protein variants that perturb the core protein sequence but keep the surface conformation and B cell epitopes. The T cell epitope content and similarity scores of the perturbed sequences were calculated and evaluated. Out of 22,914 designs with favorable stability energy, 301 candidates contained at least two pre-existing immunity-related epitopes and had promising immunogenic potential. The benchmark tests showed that, although the epitope restraints were not included in the scoring function of EvoDesign, the top S protein design successfully recovered 31 out of the 32 major histocompatibility complex (MHC)-II T cell promiscuous epitopes in the native S protein, where two epitopes were present in all seven human coronaviruses. Moreover, the newly designed S protein introduced nine new MHC-II T cell promiscuous epitopes that do not exist in the wildtype SARS-CoV-2. These results demonstrated a new and effective avenue to enhance a target protein's immunogenicity using rational protein design, which could be applied for new vaccine design against COVID-19 and other pathogens.

20.
Front Microbiol ; 12: 633732, 2021.
Article in English | MEDLINE | ID: mdl-33717026

ABSTRACT

Alterations in the gut microbiome have been associated with various human diseases. Most existing gut microbiome studies stopped at the stage of identifying microbial alterations between diseased and healthy conditions. As inspired by reverse vaccinology (RV), we developed a new strategy called Reverse Microbiomics (RM) that turns this process around: based on the identified microbial alterations, reverse-predicting the molecular mechanisms underlying the disease and microbial alterations. Our RM methodology starts by identifying significantly altered microbiota profiles, performing bioinformatics analysis on the proteomes of the microbiota identified, and finally predicting potential virulence or protective factors relevant to a microbiome-associated disease. As a use case study, this reverse methodology was applied to study the molecular pathogenesis of rheumatoid arthritis (RA), a common autoimmune and inflammatory disease. Those bacteria differentially associated with RA were first identified and annotated from published data and then modeled and classified using the Ontology of Host-Microbiome Interactions (OHMI). Our study identified 14 species increased and 9 species depleted in the gut microbiota of RA patients. Vaxign was used to comparatively analyze 15 genome sequences of the two pairs of species: Gram-negative Prevotella copri (increased) and Prevotella histicola (depleted), as well as Gram-positive Bifidobacterium dentium (increased) and Bifidobacterium bifidum (depleted). In total, 21 auto-antigens were predicted to be related to RA, and five of them were previously reported to be associated with RA with experimental evidence. Furthermore, we identified 94 potential adhesive virulence factors including 24 microbial ABC transporters. While eukaryotic ABC transporters are key RA diagnosis markers and drug targets, we identified, for the first time, RA-associated microbial ABC transporters and provided a novel hypothesis of RA pathogenesis. Our study showed that RM, by broadening the scope of RV, is a novel and effective strategy to study factors from the bacterial level to the molecular level and gain further insight into how these factors possibly contribute to the development of microbial alterations under specific diseases.
