Results 1 - 8 of 8
2.
Sci Rep ; 12(1): 2427, 2022 02 14.
Article in English | MEDLINE | ID: mdl-35165358

ABSTRACT

Effective and timely antibiotic treatment depends on accurate and rapid in silico antimicrobial resistance (AMR) predictions. Existing statistical rule-based Mycobacterium tuberculosis (MTB) drug resistance prediction methods using bacterial genomic sequencing data often achieve varying results: high accuracy on some antibiotics but relatively low accuracy on others. Traditional machine learning (ML) approaches have been applied to classify drug resistance for MTB and have shown more stable performance. However, no study has applied a deep learning architecture such as a convolutional neural network (CNN) to a large and diverse cohort of MTB samples for AMR prediction. We developed 24 binary classifiers of MTB drug resistance status across eight anti-MTB drugs and three different ML algorithms (logistic regression, random forest and 1D CNN) using a training dataset of 10,575 MTB isolates collected from 16 countries across six continents, where an extended pan-genome reference was used for detecting genetic features. Our 1D CNN architecture was designed to integrate both sequential and non-sequential features. In terms of F1-scores, the 1D CNN models are our best classifiers and are also more accurate and stable than the state-of-the-art rule-based tool Mykrobe predictor (81.1 to 93.8%, 93.7 to 96.2%, 93.1 to 94.8%, 95.9 to 97.2% and 97.1 to 98.2% for ethambutol, rifampicin, pyrazinamide, isoniazid and ofloxacin, respectively). We applied filter-based feature selection to find AMR-relevant features. All selected variant features are listed as AMR-related in the CARD database, and 78.8% of them are also in the catalogue of MTB mutations recently identified by the WHO as associated with drug resistance. To facilitate ML model development for AMR prediction, we packaged every step into an automated pipeline and shared the source code at https://github.com/KuangXY3/MTB-AMR-classification-CNN.
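The abstract above compares classifiers by F1-score. As a minimal sketch of how that metric is computed for one binary resistant/susceptible classifier, the following uses made-up labels (1 = resistant, 0 = susceptible); the function name and the toy data are assumptions for illustration and are not taken from the study or its pipeline.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1-score (harmonic mean of precision and recall)."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive calls: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical phenotypes and predictions for eight isolates.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

Because F1 ignores true negatives, it is less flattering than plain accuracy when resistant isolates are rare, which may be one reason to report it for drugs with few resistant samples.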


Subject(s)
Antitubercular Agents/pharmacology , Antitubercular Agents/therapeutic use , Data Accuracy , Deep Learning , Drug Resistance, Multiple, Bacterial/genetics , Genome, Bacterial/drug effects , Mycobacterium tuberculosis/genetics , Tuberculosis, Multidrug-Resistant/drug therapy , Whole Genome Sequencing/methods , Cohort Studies , Humans , Microbial Sensitivity Tests , Mutation , Mycobacterium tuberculosis/isolation & purification , Phenotype , Phylogeny , Prognosis , Tuberculosis, Multidrug-Resistant/microbiology
3.
Breast Cancer Res ; 22(1): 41, 2020 05 05.
Article in English | MEDLINE | ID: mdl-32370801

ABSTRACT

BACKGROUND: In utero endocrine disruption is linked to increased risk of breast cancer later in life. Despite numerous studies establishing this linkage, the long-term molecular changes that predispose mammary cells to carcinogenic transformation are unknown. Herein, we investigated how endocrine disrupting compounds (EDCs) drive changes within the stroma that can contribute to breast cancer susceptibility. METHODS: We utilized bisphenol A (BPA) as a model of estrogenic endocrine disruption to analyze the long-term consequences in the stroma. Deregulated genes were identified by RNA-seq transcriptional profiling of adult primary fibroblasts, isolated from female mice exposed to in utero BPA. Collagen staining, collagen imaging techniques, and permeability assays were used to characterize changes to the extracellular matrix. Finally, gland stiffness tests were performed on exposed and control mammary glands. RESULTS: We identified significant transcriptional deregulation of adult fibroblasts exposed to in utero BPA. Deregulated genes were associated with cancer pathways and specifically extracellular matrix composition. Multiple collagen genes were more highly expressed in the BPA-exposed fibroblasts resulting in increased collagen deposition in the adult mammary gland. This transcriptional reprogramming of BPA-exposed fibroblasts generates a less permeable extracellular matrix and a stiffer mammary gland. These phenotypes were only observed in adult 12-week-old, but not 4-week-old, mice. Additionally, diethylstilbestrol, known to increase breast cancer risk in humans, also increases gland stiffness similar to BPA, while bisphenol S does not. CONCLUSIONS: As breast stiffness, extracellular matrix density, and collagen deposition have been directly linked to breast cancer risk, these data mechanistically connect EDC exposures to molecular alterations associated with increased disease susceptibility. These alterations develop over time and thus contribute to cancer risk in adulthood.


Subject(s)
Endocrine Disruptors/toxicity , Extracellular Matrix/pathology , Mammary Glands, Animal/pathology , Prenatal Exposure Delayed Effects/pathology , Stromal Cells/pathology , Animals , Benzhydryl Compounds/toxicity , Estrogens, Non-Steroidal/pharmacology , Extracellular Matrix/drug effects , Extracellular Matrix/immunology , Female , Fibroblasts/immunology , Fibroblasts/pathology , Mammary Glands, Animal/drug effects , Mammary Glands, Animal/immunology , Mammary Glands, Animal/metabolism , Mice , Phenols/toxicity , Pregnancy , Prenatal Exposure Delayed Effects/chemically induced , Prenatal Exposure Delayed Effects/metabolism , Stromal Cells/drug effects , Stromal Cells/immunology , Transcriptome
4.
Article in English | MEDLINE | ID: mdl-26827237

ABSTRACT

Macromolecular interactions are formed between proteins, DNA and RNA molecules. Being a principal building block in macromolecular assemblies and pathways, the interactions underlie most cellular functions. Malfunctioning of macromolecular interactions is also linked to a number of diseases. Structural knowledge of the macromolecular interaction allows one to understand the interaction's mechanism, determine its functional implications and characterize the effects of genetic variations, such as single nucleotide polymorphisms, on the interaction. Unfortunately, the interactions mediated by different types of macromolecules, e.g. protein-protein interactions or protein-DNA interactions, have until now been collected into individual and unrelated structural databases. This presents a significant obstacle in the analysis of macromolecular interactions. For instance, the homogeneous structural interaction databases prevent scientists from studying structural interactions of different types but occurring in the same macromolecular complex. Here, we introduce DOMMINO 2.0, a structural Database Of Macro-Molecular INteractiOns. Compared to DOMMINO 1.0, a comprehensive database on protein-protein interactions, DOMMINO 2.0 includes the interactions between all three basic types of macromolecules extracted from PDB files. DOMMINO 2.0 is automatically updated on a weekly basis. It currently includes ∼1,040,000 interactions between two polypeptide subunits (e.g. domains, peptides, termini and interdomain linkers), ∼43,000 RNA-mediated interactions, and ∼12,000 DNA-mediated interactions. All protein structures in the database are annotated using SCOP and SUPERFAMILY family annotation. As a result, protein-mediated interactions involving protein domains, interdomain linkers, C- and N-termini, and peptides are identified. Our database provides an intuitive web interface, allowing one to investigate interactions at three different resolution levels: whole subunit network, binary interaction and interaction interface. Database URL: http://dommino.org.


Subject(s)
Computational Biology/methods , DNA/chemistry , Macromolecular Substances/chemistry , Protein Interaction Mapping , RNA/chemistry , Computer Simulation , Databases, Protein , Genetic Variation , Humans , Internet , Polymorphism, Single Nucleotide , Protein Structure, Tertiary , Proteins/chemistry , Software , Viruses
5.
Article in English | MEDLINE | ID: mdl-26357218

ABSTRACT

Accurate alignment of protein-protein binding sites can aid in protein docking studies and constructing templates for predicting structure of protein complexes, along with in-depth understanding of evolutionary and functional relationships. However, over the past three decades, structural alignment algorithms have focused predominantly on global alignments with little effort on the alignment of local interfaces. In this paper, we introduce the PBSalign (Protein-protein Binding Site alignment) method, which integrates techniques in graph theory, 3D localized shape analysis, geometric scoring, and utilization of physicochemical and geometrical properties. Computational results demonstrate that PBSalign is capable of identifying similar homologous and analogous binding sites accurately and performing alignments with better geometric match measures than existing protein-protein interface comparison tools. The proportion of better alignment quality generated by PBSalign is 46, 56, and 70 percent more than iAlign as judged by the average match index (MI), similarity index (SI), and structural alignment score (SAS), respectively. PBSalign provides the life science community an efficient and accurate solution to binding-site alignment while striking the balance between topological details and computational complexity.


Subject(s)
Binding Sites , Computational Biology/methods , Models, Molecular , Proteins/chemistry , Sequence Alignment/methods , Databases, Protein , Structural Homology, Protein
6.
Nat Commun ; 5: 3650, 2014 Apr 11.
Article in English | MEDLINE | ID: mdl-24722188

ABSTRACT

Increased risk for autism spectrum disorders (ASD) is attributed to hundreds of genetic loci. The convergence of ASD variants has been investigated using various approaches, including protein interactions extracted from the published literature. However, these datasets are frequently incomplete, carry biases and are limited to interactions of a single splicing isoform, which may not be expressed in the disease-relevant tissue. Here we introduce a new interactome mapping approach by experimentally identifying interactions between brain-expressed alternatively spliced variants of ASD risk factors. The Autism Spliceform Interaction Network reveals that almost half of the detected interactions and about 30% of the newly identified interacting partners represent contribution from splicing variants, emphasizing the importance of isoform networks. Isoform interactions greatly contribute to establishing direct physical connections between proteins from the de novo autism CNVs. Our findings demonstrate the critical role of spliceform networks for translating genetic knowledge into a better understanding of human diseases.


Subject(s)
Autistic Disorder/metabolism , Alternative Splicing/genetics , Alternative Splicing/physiology , Autistic Disorder/genetics , Genetic Predisposition to Disease/genetics , Humans , Molecular Sequence Data , Protein Interaction Maps/genetics , Protein Interaction Maps/physiology , Protein Isoforms/genetics , Protein Isoforms/metabolism , Risk Factors
7.
Nucleic Acids Res ; 40(Web Server issue): W428-34, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22689645

ABSTRACT

PBSword is a web server designed for efficient and accurate comparisons and searches of geometrically similar protein-protein binding sites from a large-scale database. The basic idea of PBSword is that each protein binding site is first represented by a high-dimensional vector of 'visual words', which characterizes both the global and local shape features of the binding site. It then uses a scalable indexing technique to search for those binding sites whose visual-word representations are similar to that of the query binding site. Our system is able to return ranked results of binding sites in a short time from a database of 194 322 domain-domain binding sites. PBSword supports query by protein ID and by new structures uploaded by users. PBSword is a useful tool to investigate functional connections among proteins based on the local structures of binding sites and has potential applications to protein-protein docking and drug discovery. The system is hosted at http://pbs.rnet.missouri.edu.
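The abstract above describes representing each binding site as a vector of 'visual words' and returning a ranked list of similar sites. The sketch below illustrates only that general idea, under stated assumptions: the abstract does not name the similarity measure or the indexing technique, so cosine similarity and a brute-force scan are stand-ins chosen here, and the site identifiers and count vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def rank_sites(query, database):
    """Return (site_id, similarity) pairs sorted from most to least similar.

    Brute-force scan: an assumption standing in for the scalable
    index mentioned in the abstract, which is not described there.
    """
    scored = [(site_id, cosine_similarity(query, vec))
              for site_id, vec in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical visual-word histograms (counts per word) for three sites.
database = {
    "site_A": [3, 0, 1, 2],
    "site_B": [0, 4, 0, 1],
    "site_C": [2, 1, 1, 2],
}
query = [3, 0, 1, 2]
ranking = rank_sites(query, database)
for site_id, similarity in ranking:
    print(site_id, round(similarity, 3))
```

A linear scan like this is only practical for small collections; a database of the size quoted in the abstract would need an index of some kind, which is why the scan is flagged as a stand-in.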


Subject(s)
Protein Interaction Domains and Motifs , Protein Interaction Mapping , Software , Binding Sites , Databases, Protein , Internet , Proteins/chemistry , User-Computer Interface
8.
Nucleic Acids Res ; 40(Database issue): D501-6, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22135305

ABSTRACT

With the growing number of experimentally resolved structures of macromolecular complexes, it becomes clear that the interactions that involve protein structures are mediated not only by the protein domains, but also by various non-structured regions, such as interdomain linkers, or terminal sequences. Here, we present DOMMINO (http://dommino.org), a comprehensive database of macromolecular interactions that includes the interactions between protein domains, interdomain linkers, N- and C-terminal regions and protein peptides. The database complements SCOP domain annotations with domain predictions by SUPERFAMILY and is automatically updated every week. The database interface is designed to provide the user with a three-stage pipeline to study macromolecular interactions: (i) a flexible search that can include a PDB ID, type of interaction, SCOP family of interacting proteins, organism name, interaction keyword and a minimal threshold on the number of contact pairs; (ii) visualization of subunit interaction network, where the user can investigate the types of interactions within a macromolecular assembly; and (iii) visualization of an interface structure between any pair of the interacting subunits, where the user can highlight several different types of residues within the interfaces as well as study the structure of the corresponding binary complex of subunits.


Subject(s)
Databases, Protein , Protein Interaction Domains and Motifs , Protein Interaction Mapping , Molecular Sequence Annotation , Peptides/chemistry , Proteins/chemistry , User-Computer Interface