Results 1 - 20 of 64
1.
Proc Natl Acad Sci U S A ; 120(11): e2219523120, 2023 03 14.
Article in English | MEDLINE | ID: mdl-36893269

ABSTRACT

The continuous evolution of SARS-CoV-2 variants complicates efforts to combat the ongoing pandemic, underscoring the need for a dynamic platform for the rapid development of pan-viral variant therapeutics. Oligonucleotide therapeutics are enhancing the treatment of numerous diseases with unprecedented potency, duration of effect, and safety. Through the systematic screening of hundreds of oligonucleotide sequences, we identified fully chemically stabilized siRNAs and ASOs that target regions of the SARS-CoV-2 genome conserved in all variants of concern, including delta and omicron. We successively evaluated candidates in cellular reporter assays, followed by viral inhibition in cell culture, with eventual testing of leads for in vivo antiviral activity in the lung. Previous attempts to deliver therapeutic oligonucleotides to the lung have met with only modest success. Here, we report the development of a platform for identifying and generating potent, chemically modified multimeric siRNAs bioavailable in the lung after local intranasal and intratracheal delivery. The optimized divalent siRNAs showed robust antiviral activity in human cells and mouse models of SARS-CoV-2 infection and represent a new paradigm for antiviral therapeutic development for current and future pandemics.


Subjects
COVID-19 , Humans , Animals , Mice , Small Interfering RNA/genetics , COVID-19/therapy , SARS-CoV-2/genetics , Antiviral Agents/pharmacology , Antiviral Agents/therapeutic use , Oligonucleotides , Lung
2.
Behav Brain Sci ; 46: e249, 2023 10 02.
Article in English | MEDLINE | ID: mdl-37779279

ABSTRACT

One novel example and/or perspective in support of "Why the learning account fails" is the impressive ability of humans to recognize and memorize facial features and accurately and reliably connect those to related identities. Furthermore, neuroimaging analysis presents an example in support of the crucial role of standardization in the lack of adoption of ideography.


Subjects
Facial Recognition , Humans , Psychological Recognition , Learning , Neuroimaging , Facial Expression
3.
RNA ; 26(10): 1303-1319, 2020 10.
Article in English | MEDLINE | ID: mdl-32532794

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) is a recent technology that enables fine-grained discovery of cellular subtypes and specific cell states. Analysis of scRNA-seq data routinely involves machine learning methods, such as feature learning, clustering, and classification, to assist in uncovering novel information from scRNA-seq data. However, current methods are not well suited to deal with the substantial amount of noise that is created by the experiments or the variation that occurs due to differences in the cells of the same type. To address this, we developed a new hybrid approach, deep unsupervised single-cell clustering (DUSC), which integrates feature generation based on a deep learning architecture by using a new technique to estimate the number of latent features, with a model-based clustering algorithm, to find a compact and informative representation of the single-cell transcriptomic data generating robust clusters. We also include a technique to estimate an efficient number of latent features in the deep learning model. Our method outperforms both classical and state-of-the-art feature learning and clustering methods, approaching the accuracy of supervised learning. We applied DUSC to a single-cell transcriptomics data set obtained from a triple-negative breast cancer tumor to identify potential cancer subclones accentuated by copy-number variation and investigate the role of clonal heterogeneity. Our method is freely available to the community and will hopefully facilitate our understanding of the cellular atlas of living organisms as well as provide the means to improve patient diagnostics and treatment.


Subjects
Gene Expression Profiling/methods , RNA-Seq/methods , Single-Cell Analysis/methods , Algorithms , Animals , Cluster Analysis , Computational Biology , Humans , Machine Learning , RNA Sequence Analysis/methods , Transcriptome/genetics
4.
New Phytol ; 229(1): 563-574, 2021 01.
Article in English | MEDLINE | ID: mdl-32569394

ABSTRACT

Cyst nematodes induce a multicellular feeding site within roots called a syncytium. It remains unknown how root cells are primed for incorporation into the developing syncytium. Furthermore, it is unclear how CLAVATA3/EMBRYO SURROUNDING REGION (CLE) peptide effectors secreted into the cytoplasm of the initial feeding cell could have an effect on plant cells so distant from where the nematode is feeding as the syncytium expands. Here we describe a novel translocation signal within nematode CLE effectors that is recognized by plant cell secretory machinery to redirect these peptides from the cytoplasm to the apoplast of plant cells. We show that the translocation signal is functionally conserved across CLE effectors identified in nematode species spanning three genera and multiple plant species, operative across plant cell types, and can traffic other unrelated small peptides from the cytoplasm to the apoplast of host cells via a previously unknown post-translational mechanism of endoplasmic reticulum (ER) translocation. Our results uncover a mechanism of effector trafficking that is unprecedented in any plant pathogen to date, and they illustrate how phytonematodes can deliver effector proteins into host cells and then hijack plant cellular processes for their export back out of the cell to function as external signaling molecules to distant cells.


Subjects
Nematoda , Tylenchoidea , Animals , Endoplasmic Reticulum , Helminth Proteins/genetics , Host-Parasite Interactions , Peptides , Plant Diseases , Plant Roots
5.
RNA ; 24(9): 1119-1132, 2018 09.
Article in English | MEDLINE | ID: mdl-29941426

ABSTRACT

RNA sequencing (RNA-seq) is becoming a prevalent approach to quantify gene expression and is expected to provide better insights into a number of biological and biomedical questions compared to DNA microarrays. Most importantly, RNA-seq allows us to quantify expression at the gene or transcript levels. However, leveraging the RNA-seq data requires development of new data mining and analytics methods. Supervised learning methods are commonly used approaches for biological data analysis that have recently gained attention for their applications to RNA-seq data. Here, we assess the utility of supervised learning methods trained on RNA-seq data for a diverse range of biological classification tasks. We hypothesize that the transcript-level expression data are more informative for biological classification tasks than the gene-level expression data. Our large-scale assessment utilizes multiple data sets, organisms, lab groups, and RNA-seq analysis pipelines. Overall, we performed and assessed 61 biological classification problems that leverage three independent RNA-seq data sets and include over 2000 samples that come from multiple organisms, lab groups, and RNA-seq analyses. These 61 problems include predictions of the tissue type, sex, or age of the sample, healthy or cancerous phenotypes, and pathological tumor stages for the samples from the cancerous tissue. For each problem, the performance of three normalization techniques and six machine learning classifiers was explored. We find that for every single classification problem, the transcript-based classifiers outperform or are comparable with gene expression-based methods. The top-performing techniques reached a near perfect classification accuracy, demonstrating the utility of supervised learning for RNA-seq based data analysis.


Subjects
Alternative Splicing , Gene Expression Profiling/methods , RNA Sequence Analysis/methods , Supervised Machine Learning , Animals , Data Mining , Humans , Organ Specificity , RNA/genetics
6.
Bioinformatics ; 35(24): 5374-5378, 2019 12 15.
Article in English | MEDLINE | ID: mdl-31350874

ABSTRACT

MOTIVATION: The complexity of protein-protein interactions (PPIs) is further compounded by the fact that an average protein consists of two or more domains, structurally and evolutionarily independent subunits. Experimental studies have demonstrated that an interaction between a pair of proteins is not carried out by all domains constituting each protein, but rather by a select subset. However, determining which domains from each protein mediate the corresponding PPI is a challenging task. RESULTS: Here, we present domain interaction statistical potential (DISPOT), a simple knowledge-based statistical potential that estimates the propensity of an interaction between a pair of protein domains, given their Structural Classification of Proteins (SCOP) family annotations. The statistical potential is derived based on the analysis of >352 000 structurally resolved PPIs obtained from DOMMINO, a comprehensive database of structurally resolved macromolecular interactions. AVAILABILITY AND IMPLEMENTATION: DISPOT is implemented in Python 2.7 and packaged as an open-source tool. DISPOT is implemented in two modes, basic and auto-extraction. The source code for both modes is available on GitHub: https://github.com/korkinlab/dispot and standalone docker images on DockerHub: https://hub.docker.com/r/korkinlab/dispot. The web server is freely available at http://dispot.korkinlab.org/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
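
The abstract does not give the DISPOT formula. As a rough illustration of what a knowledge-based statistical potential over domain-family pairs can look like, the sketch below scores each pair by the negative log ratio of its observed frequency to the frequency expected if families paired at random. The function name, the input format, and the scoring formula are assumptions made for illustration and are not taken from the DISPOT source code.

```python
import math
from collections import Counter


def pair_potentials(interactions):
    """Score each unordered family pair as -ln(observed / expected frequency).

    `interactions` is a list of (family_a, family_b) tuples, one per observed
    domain-domain interaction. This formula is a generic log-odds potential
    assumed for illustration, not the published DISPOT definition.
    """
    pair_counts = Counter(tuple(sorted(pair)) for pair in interactions)
    family_counts = Counter(fam for pair in interactions for fam in pair)
    n_pairs = len(interactions)
    n_slots = 2 * n_pairs  # each interaction contributes two family slots

    potentials = {}
    for (a, b), observed in pair_counts.items():
        p_a = family_counts[a] / n_slots
        p_b = family_counts[b] / n_slots
        # Random pairing: p_a * p_b for a homo-pair, 2 * p_a * p_b otherwise.
        expected = p_a * p_b if a == b else 2 * p_a * p_b
        potentials[(a, b)] = -math.log((observed / n_pairs) / expected)
    return potentials


# Hypothetical family labels; real input would use SCOP family identifiers.
example = [("A", "B"), ("A", "B"), ("A", "A"), ("B", "C")]
scores = pair_potentials(example)
```

Under this convention a negative score marks a pair seen more often than chance would predict, and a score near zero marks a pair seen about as often as expected.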


Subjects
Software , Macromolecular Substances , Molecular Sequence Annotation , Protein Domains , Proteins
7.
BMC Genomics ; 20(1): 119, 2019 Feb 07.
Article in English | MEDLINE | ID: mdl-30732586

ABSTRACT

BACKGROUND: Heterodera glycines, commonly referred to as the soybean cyst nematode (SCN), is an obligatory and sedentary plant parasite that causes over a billion-dollar yield loss to soybean production annually. Although there are genetic determinants that render soybean plants resistant to certain nematode genotypes, resistant soybean cultivars are increasingly ineffective because their multi-year usage has selected for virulent H. glycines populations. The parasitic success of H. glycines relies on the comprehensive re-engineering of an infection site into a syncytium, as well as the long-term suppression of host defense to ensure syncytial viability. At the forefront of these complex molecular interactions are effectors, the proteins secreted by H. glycines into host root tissues. The mechanisms of effector acquisition, diversification, and selection need to be understood before effective control strategies can be developed, but the lack of an annotated genome has been a major roadblock. RESULTS: Here, we use PacBio long-read technology to assemble a H. glycines genome of 738 contigs into 123 Mb with annotations for 29,769 genes. The genome contains significant numbers of repeats (34%), tandem duplicates (18.7 Mb), and horizontal gene transfer events (151 genes). A large number of putative effectors (431 genes) were identified in the genome, many of which were found in transposons. CONCLUSIONS: This advance provides a glimpse into the host and parasite interplay by revealing a diversity of mechanisms that give rise to virulence genes in the soybean cyst nematode, including: tandem duplications containing over a fifth of the total gene count, virulence genes hitchhiking in transposons, and 107 horizontal gene transfers not reported in other plant parasitic nematodes thus far. Through extensive characterization of the H. glycines genome, we provide new insights into H. glycines biology and shed light onto the mystery underlying complex host-parasite interactions. 
This genome sequence is an important prerequisite to enable work towards generating new resistance or control measures against H. glycines.


Subjects
Molecular Evolution , Gene Duplication , Genomics , Glycine max/parasitology , Tylenchoidea/genetics , Tylenchoidea/physiology , Animals , Genotype , Host-Parasite Interactions , Molecular Sequence Annotation , Plant Diseases/parasitology , Single Nucleotide Polymorphism , DNA Sequence Analysis
10.
Plant Physiol ; 175(3): 1370-1380, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28912378

ABSTRACT

Rhg4 is a major genetic locus that contributes to soybean cyst nematode (SCN) resistance in the Peking-type resistance of soybean (Glycine max), which also requires the rhg1 gene. By map-based cloning and functional genomic approaches, we previously showed that the Rhg4 gene encodes a predicted cytosolic serine hydroxymethyltransferase (GmSHMT08); however, the novel gain of function of GmSHMT08 in SCN resistance remains to be characterized. Using a forward genetic screen, we identified an allelic series of GmSHMT08 mutants that shed new light on the mechanistic aspects of GmSHMT08-mediated resistance. The new mutants provide compelling genetic evidence that Peking-type rhg1 resistance in cv Forrest is fully dependent on the GmSHMT08 gene and demonstrate that this resistance is mechanistically different from the PI 88788-type of resistance that only requires rhg1. We also demonstrated that rhg1-a from cv Forrest, although required, does not exert selection pressure on the nematode to shift from HG type 7, which further validates the bigenic nature of this resistance. Mapping of the identified mutations onto the SHMT structural model uncovered key residues for structural stability, ligand binding, enzyme activity, and protein interactions, suggesting that GmSHMT08 has additional functions aside from its main enzymatic role in SCN resistance. Lastly, we demonstrate the functionality of the GmSHMT08 SCN resistance gene in a transgenic soybean plant.


Subjects
Disease Resistance , Glycine Hydroxymethyltransferase/genetics , Glycine max/enzymology , Glycine max/parasitology , Mutagenesis/genetics , Plant Diseases/immunology , Plant Diseases/parasitology , Tylenchoidea/physiology , Animals , Genetic Complementation Test , Genetic Testing , Glycine Hydroxymethyltransferase/chemistry , Molecular Models , Mutation/genetics , Genetically Modified Plants , Glycine max/immunology , Tylenchoidea/pathogenicity , Virulence
11.
Nature ; 492(7428): 256-60, 2012 Dec 13.
Article in English | MEDLINE | ID: mdl-23235880

ABSTRACT

Soybean (Glycine max (L.) Merr.) is an important crop that provides a sustainable source of protein and oil worldwide. Soybean cyst nematode (Heterodera glycines Ichinohe) is a microscopic roundworm that feeds on the roots of soybean and is a major constraint to soybean production. This nematode causes more than US$1 billion in yield losses annually in the United States alone, making it the most economically important pathogen on soybean. Although planting of resistant cultivars forms the core management strategy for this pathogen, nothing is known about the nature of resistance. Moreover, the increase in virulent populations of this parasite on most known resistance sources necessitates the development of novel approaches for control. Here we report the map-based cloning of a gene at the Rhg4 (for resistance to Heterodera glycines 4) locus, a major quantitative trait locus contributing to resistance to this pathogen. Mutation analysis, gene silencing and transgenic complementation confirm that the gene confers resistance. The gene encodes a serine hydroxymethyltransferase, an enzyme that is ubiquitous in nature and structurally conserved across kingdoms. The enzyme is responsible for interconversion of serine and glycine and is essential for cellular one-carbon metabolism. Alleles of Rhg4 conferring resistance or susceptibility differ by two genetic polymorphisms that alter a key regulatory property of the enzyme. Our discovery reveals an unprecedented plant resistance mechanism against a pathogen. The mechanistic knowledge of the resistance gene can be readily exploited to improve nematode resistance of soybean, an increasingly important global crop.


Subjects
Glycine max/genetics , Glycine max/parasitology , Host-Parasite Interactions , Nematoda/physiology , Plant Proteins/genetics , Plant Proteins/metabolism , Amino Acid Sequence , Animals , DNA Mutational Analysis , Gene Order , Gene Silencing , Genetic Complementation Test , Glycine Hydroxymethyltransferase/genetics , Glycine Hydroxymethyltransferase/metabolism , Haplotypes , Molecular Models , Molecular Sequence Data , Plant Proteins/chemistry , Genetic Polymorphism/genetics , Tertiary Protein Structure , Quantitative Trait Loci/genetics , Glycine max/enzymology
12.
Bioinformatics ; 32(17): i685-i692, 2016 09 01.
Article in English | MEDLINE | ID: mdl-27587690

ABSTRACT

MOTIVATION: Due to their high genomic variability, RNA viruses and retroviruses present a unique opportunity for detailed study of molecular evolution. Lentiviruses, with HIV being a notable example, are one of the best studied viral groups: hundreds of thousands of sequences are available together with experimentally resolved three-dimensional structures for most viral proteins. In this work, we use these data to study specific patterns of evolution of the viral proteins, and their relationship to protein interactions and immunogenicity. RESULTS: We propose a method for identification of two types of surface residue clusters with abnormal conservation: extremely conserved and extremely variable clusters. We identify them on the surface of proteins from HIV and other animal immunodeficiency viruses. Both types of clusters are overrepresented on the interaction interfaces of viral proteins with other proteins, nucleic acids or low molecular-weight ligands, both in the viral particle and between the virus and its host. In the immunodeficiency viruses, the interaction interfaces are not more conserved than the corresponding proteins on average, and we show that extremely conserved clusters coincide with protein-protein interaction hotspots, predicted as the residues with the largest energetic contribution to the interaction. Extremely variable clusters have been identified here for the first time. In the HIV-1 envelope protein gp120, they overlap with known antigenic sites. These antigenic sites also contain many residues from extremely conserved clusters, hence representing a unique interacting interface enriched both in extremely conserved and in extremely variable clusters of residues. This observation may have important implications for antiretroviral vaccine development.
AVAILABILITY AND IMPLEMENTATION: A Python package is available at https://bioinf.mpi-inf.mpg.de/publications/viral-ppi-pred/ CONTACT: voitenko@mpi-inf.mpg.de or kalinina@mpi-inf.mpg.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Amino Acids , Conserved Sequence , Molecular Evolution , Immunogenetics , Viral Proteins , Amino Acid Sequence , Animals , HIV-1 , Humans , Viruses
13.
J Bacteriol ; 198(7): 1149-59, 2016 Feb 01.
Article in English | MEDLINE | ID: mdl-26833409

ABSTRACT

UNLABELLED: The dimorphic alphaproteobacterium Prosthecomicrobium hirschii has both short-stalked and long-stalked morphotypes. Notably, these morphologies do not arise from transitions in a cell cycle. Instead, the maternal cell morphology is typically reproduced in daughter cells, which results in microcolonies of a single cell type. In this work, we further characterized the short-stalked cells and found that these cells have a Caulobacter-like life cycle in which cell division leads to the generation of two morphologically distinct daughter cells. Using a microfluidic device and total internal reflection fluorescence (TIRF) microscopy, we observed that motile short-stalked cells attach to a surface by means of a polar adhesin. Cells attached at their poles elongate and ultimately release motile daughter cells. Robust biofilm growth occurs in the microfluidic device, enabling the collection of synchronous motile cells and downstream analysis of cell growth and attachment. Analysis of a draft P. hirschii genome sequence indicates the presence of CtrA-dependent cell cycle regulation. This characterization of P. hirschii will enable future studies on the mechanisms underlying complex morphologies and polymorphic cell cycles. IMPORTANCE: Bacterial cell shape plays a critical role in regulating important behaviors, such as attachment to surfaces, motility, predation, and cellular differentiation; however, most studies on these behaviors focus on bacteria with relatively simple morphologies, such as rods and spheres. Notably, complex morphologies abound throughout the bacteria, with striking examples, such as P. hirschii, found within the stalked Alphaproteobacteria. P. hirschii is an outstanding candidate for studies of complex morphology generation and polymorphic cell cycles. Here, the cell cycle and genome of P. hirschii are characterized. This work sets the stage for future studies of the impact of complex cell shapes on bacterial behaviors.


Subjects
Alphaproteobacteria/cytology , Alphaproteobacteria/physiology , Cell Cycle/physiology , Bacteriological Techniques , Biofilms/growth & development
14.
Methods ; 79-80: 18-31, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25944472

ABSTRACT

Tremendous advances in Next Generation Sequencing (NGS) and high-throughput omics methods have brought us one step closer towards mechanistic understanding of the complex disease at the molecular level. In this review, we discuss four basic regulatory mechanisms implicated in complex genetic diseases, such as cancer, neurological disorders, heart disease, diabetes, and many others. The mechanisms, including genetic variations, copy-number variations, posttranscriptional variations, and epigenetic variations, can be detected using a variety of NGS methods. We propose that malfunctions detected in these mechanisms are not necessarily independent, since these malfunctions are often found associated with the same disease and targeting the same gene, group of genes, or functional pathway. As an example, we discuss possible rewiring effects of the cancer-associated genetic, structural, and posttranscriptional variations on the protein-protein interaction (PPI) network centered around P53 protein. The review highlights multi-layered complexity of common genetic disorders and suggests that integration of NGS and omics data is a critical step in developing new computational methods capable of deciphering this complexity.


Subjects
Genetic Predisposition to Disease , Genetic Variation , DNA Sequence Analysis/methods , Computational Biology , DNA Copy Number Variations , Epigenomics/methods , Genome-Wide Association Study , Genomics/methods , High-Throughput Nucleotide Sequencing , Humans , Molecular Sequence Annotation , Sequence Alignment
15.
PLoS Comput Biol ; 10(5): e1003592, 2014 May.
Article in English | MEDLINE | ID: mdl-24784581

ABSTRACT

Single nucleotide polymorphisms (SNPs) are among the most common types of genetic variation in complex genetic disorders. A growing number of studies link the functional role of SNPs with the networks and pathways mediated by the disease-associated genes. For example, many non-synonymous missense SNPs (nsSNPs) have been found near or inside the protein-protein interaction (PPI) interfaces. Determining whether such nsSNP will disrupt or preserve a PPI is a challenging task to address, both experimentally and computationally. Here, we present this task as three related classification problems, and develop a new computational method, called the SNP-IN tool (non-synonymous SNP INteraction effect predictor). Our method predicts the effects of nsSNPs on PPIs, given the interaction's structure. It leverages supervised and semi-supervised feature-based classifiers, including our new Random Forest self-learning protocol. The classifiers are trained based on a dataset of comprehensive mutagenesis studies for 151 PPI complexes, with experimentally determined binding affinities of the mutant and wild-type interactions. Three classification problems were considered: (1) a 2-class problem (strengthening/weakening PPI mutations), (2) another 2-class problem (mutations that disrupt/preserve a PPI), and (3) a 3-class classification (detrimental/neutral/beneficial mutation effects). In total, 11 different supervised and semi-supervised classifiers were trained and assessed resulting in a promising performance, with the weighted f-measure ranging from 0.87 for Problem 1 to 0.70 for the most challenging Problem 3. By integrating prediction results of the 2-class classifiers into the 3-class classifier, we further improved its performance for Problem 3. To demonstrate the utility of SNP-IN tool, it was applied to study the nsSNP-induced rewiring of two disease-centered networks. 
The accurate and balanced performance of SNP-IN tool makes it readily available to study the rewiring of large-scale protein-protein interaction networks, and can be useful for functional annotation of disease-associated SNPs. SNP-IN tool is freely accessible as a web-server at http://korkinlab.org/snpintool/.
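
The abstract reports classifier quality as a weighted f-measure. For readers unfamiliar with the metric, the sketch below computes it in its usual form: the per-class F1 score averaged with weights proportional to each class's share of the true labels. The function name and the toy labels are illustrative assumptions; the abstract does not state which implementation the authors used.

```python
from collections import Counter


def weighted_f_measure(y_true, y_pred):
    """Average per-class F1, weighting each class by its true-label count."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = true_pos / n_pred if n_pred else 0.0
        recall = true_pos / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += f1 * n_true / total
    return score


# Toy 2-class example with hypothetical labels, in the spirit of Problem 2.
truth = ["disrupt", "disrupt", "preserve", "preserve"]
guess = ["disrupt", "preserve", "preserve", "preserve"]
result = weighted_f_measure(truth, guess)
```

Weighting by class size keeps a large class from being under-counted, which matters when disruptive and neutral mutations are unevenly represented in the training data.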


Subjects
Artificial Intelligence , Breast Neoplasms/genetics , Diabetes Mellitus/genetics , Genetic Predisposition to Disease/genetics , Single Nucleotide Polymorphism/genetics , Protein Interaction Mapping/methods , Proteome/genetics , Algorithms , Genetic Association Studies , Humans , Automated Pattern Recognition/methods
16.
Proc Natl Acad Sci U S A ; 109(19): E1183-91, 2012 May 08.
Article in English | MEDLINE | ID: mdl-22496592

ABSTRACT

Ultraconserved elements (UCEs) are DNA sequences that are 100% identical (no base substitutions, insertions, or deletions) and located in syntenic positions in at least two genomes. Although hundreds of UCEs have been found in animal genomes, little is known about the incidence of ultraconservation in plant genomes. Using an alignment-free information-retrieval approach, we have comprehensively identified all long identical multispecies elements (LIMEs), which include both syntenic and nonsyntenic regions, of at least 100 identical base pairs shared by at least two genomes. Among six animal genomes, we found the previously known syntenic UCEs as well as previously undescribed nonsyntenic elements. In contrast, among six plant genomes, we only found nonsyntenic LIMEs. LIMEs can also be classified as either simple (repetitive) or complex (nonrepetitive), they may occur in multiple copies in a genome, and they are often spread across multiple chromosomes. Although complex LIMEs were found in both animal and plant genomes, they differed significantly in their composition and copy number. Further analyses of plant LIMEs revealed their functional diversity, encompassing elements found near rRNA and enzyme-coding genes, as well as those found in transposons and noncoding DNA. We conclude that despite the common presence of LIMEs in both animal and plant lineages, the evolutionary processes involved in the creation and maintenance of these elements differ in the two groups and are likely attributable to several mechanisms, including transfer of genetic material from organellar to nuclear genomes, de novo sequence manufacturing, and purifying selection.
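
The definition of a LIME (an identical stretch of at least 100 base pairs shared by at least two genomes) can be made concrete with a small sketch. The code below is a naive seed-and-extend search over two sequences held in memory; it is not the alignment-free information-retrieval approach the authors describe, whose details the abstract does not give, and the function name is an assumption.

```python
def shared_identical_elements(seq_a, seq_b, min_len=100):
    """Return maximal substrings of length >= min_len present in both sequences.

    Naive illustration: index every min_len-mer of seq_a, look up each
    min_len-mer of seq_b, and extend exact matches to the right. Matches that
    could be extended to the left are skipped so only maximal hits are kept.
    """
    index = {}
    for i in range(len(seq_a) - min_len + 1):
        index.setdefault(seq_a[i:i + min_len], []).append(i)

    found = set()
    for j in range(len(seq_b) - min_len + 1):
        for i in index.get(seq_b[j:j + min_len], ()):
            if i > 0 and j > 0 and seq_a[i - 1] == seq_b[j - 1]:
                continue  # not left-maximal; the longer hit is reported elsewhere
            end = min_len
            while (i + end < len(seq_a) and j + end < len(seq_b)
                   and seq_a[i + end] == seq_b[j + end]):
                end += 1
            found.add(seq_a[i:i + end])
    return found


# Toy sequences with a short threshold so the example stays readable.
hits = shared_identical_elements("TTTACGTACGTAAGG", "CCACGTACGTAACT", min_len=8)
```

This sketch ignores reverse complements and synteny, and its memory use grows with genome size, so it only illustrates the definition rather than a practical genome-scale method.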


Subjects
Conserved Sequence/genetics , Molecular Evolution , Plant Genome/genetics , Genome/genetics , Amino Acid Sequence , Animals , Arabidopsis/genetics , Base Sequence , Cell Nucleus/genetics , Chromosome Mapping , Mammalian Chromosomes/genetics , Plant Chromosomes/genetics , Gene Regulatory Networks , Mitochondrial Genome/genetics , Humans , Mice , Genetic Models , Molecular Sequence Data , Rats , Species Specificity , Synteny
17.
J Biomed Inform ; 52: 394-405, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25150201

ABSTRACT

OBJECTIVES: We developed Resource Description Framework (RDF)-induced InfluGrams (RIIG) - an informatics formalism to uncover complex relationships among biomarker proteins and biological pathways using the biomedical knowledge bases. We demonstrate an application of RIIG in morphoproteomics, a theranostic technique aimed at comprehensive analysis of protein circuitries to design effective therapeutic strategies in a personalized medicine setting. METHODS: RIIG uses an RDF "mashup" knowledge base that integrates publicly available pathway and protein data with ontologies. To mine for RDF-induced Influence Links, RIIG introduces notions of RDF relevancy and RDF collider, which mimic conditional independence and "explaining away" mechanism in probabilistic systems. Using these notions and constraint-based structure learning algorithms, the formalism generates the morphoproteomic diagrams, which we call InfluGrams, for further analysis by experts. RESULTS: RIIG was able to recover up to 90% of predefined influence links in a simulated environment using synthetic data and outperformed a naïve Monte Carlo sampling of random links. In clinical cases of Acute Lymphoblastic Leukemia (ALL) and Mesenchymal Chondrosarcoma, a significant level of concordance between the RIIG-generated and expert-built morphoproteomic diagrams was observed. In a clinical case of Squamous Cell Carcinoma, RIIG allowed selection of alternative therapeutic targets, the validity of which was supported by a systematic literature review. We have also illustrated an ability of RIIG to discover novel influence links in the general case of ALL. CONCLUSIONS: Applications of the RIIG formalism demonstrated its potential to uncover patient-specific complex relationships among biological entities to find effective drug targets in a personalized medicine setting.
We conclude that RIIG provides an effective means not only to streamline morphoproteomic studies, but also to bridge curated biomedical knowledge and causal reasoning with the clinical data in general.


Subjects
Knowledge Bases , Precision Medicine/methods , Proteomics/methods , Algorithms , Biomarkers/analysis , Humans , Protein Interaction Maps/physiology , Signal Transduction/physiology
18.
Nucleic Acids Res ; 40(Web Server issue): W428-34, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22689645

ABSTRACT

PBSword is a web server designed for efficient and accurate comparisons and searches of geometrically similar protein-protein binding sites from a large-scale database. The basic idea of PBSword is that each protein binding site is first represented by a high-dimensional vector of 'visual words', which characterizes both the global and local shape features of the binding site. It then uses a scalable indexing technique to search for those binding sites whose visual-word representations are similar to that of the query binding site. Our system is able to return ranked results of binding sites in a short time from a database of 194 322 domain-domain binding sites. PBSword supports query by protein ID and by new structures uploaded by users. PBSword is a useful tool to investigate functional connections among proteins based on the local structures of binding sites and has potential applications to protein-protein docking and drug discovery. The system is hosted at http://pbs.rnet.missouri.edu.
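
The retrieval idea described above (represent each binding site as a vector of visual-word counts, then rank database entries by similarity to the query) can be sketched in a few lines. The example uses cosine similarity and a brute-force scan; both are assumptions chosen for brevity. The abstract does not name the similarity measure, and the real server relies on a scalable index rather than a linear scan. Site identifiers and vectors are made up.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length count vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_binding_sites(query, database, top_k=3):
    """Return up to top_k (site_id, similarity) pairs, most similar first."""
    scored = [(cosine(query, vector), site_id) for site_id, vector in database.items()]
    scored.sort(key=lambda item: (-item[0], item[1]))  # ties broken by site_id
    return [(site_id, score) for score, site_id in scored[:top_k]]


# Hypothetical visual-word histograms for three binding sites.
sites = {"s1": [1, 0, 2], "s2": [0, 3, 0], "s3": [2, 1, 4]}
ranking = rank_binding_sites([1, 0, 2], sites)
```

A linear scan like this costs one comparison per stored site, which is why a database of this size would need an index for interactive response times.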


Subjects
Protein Interaction Domains and Motifs , Protein Interaction Mapping , Software , Binding Sites , Protein Databases , Internet , Proteins/chemistry , User-Computer Interface
19.
Nucleic Acids Res ; 40(Database issue): D501-6, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22135305

ABSTRACT

With the growing number of experimentally resolved structures of macromolecular complexes, it becomes clear that the interactions that involve protein structures are mediated not only by the protein domains, but also by various non-structured regions, such as interdomain linkers, or terminal sequences. Here, we present DOMMINO (http://dommino.org), a comprehensive database of macromolecular interactions that includes the interactions between protein domains, interdomain linkers, N- and C-terminal regions and protein peptides. The database complements SCOP domain annotations with domain predictions by SUPERFAMILY and is automatically updated every week. The database interface is designed to provide the user with a three-stage pipeline to study macromolecular interactions: (i) a flexible search that can include a PDB ID, type of interaction, SCOP family of interacting proteins, organism name, interaction keyword and a minimal threshold on the number of contact pairs; (ii) visualization of subunit interaction network, where the user can investigate the types of interactions within a macromolecular assembly; and (iii) visualization of an interface structure between any pair of the interacting subunits, where the user can highlight several different types of residues within the interfaces as well as study the structure of the corresponding binary complex of subunits.


Subjects
Protein Databases , Protein Interaction Domains and Motifs , Protein Interaction Mapping , Molecular Sequence Annotation , Peptides/chemistry , Proteins/chemistry , User-Computer Interface
20.
Res Sq ; 2024 Feb 23.
Article in English | MEDLINE | ID: mdl-38464300

ABSTRACT

The prediction of RNA secondary structures is essential for understanding its underlying principles and applications in diverse fields, including molecular diagnostics and RNA-based therapeutic strategies. However, the complexity of the search space presents a challenge. This work proposes a Graph Convolutional Network (GCNfold) for predicting the RNA secondary structure. GCNfold considers an RNA sequence as graph-structured data and predicts posterior base-pairing probabilities given the prior base-pairing probabilities, calculated using McCaskill's partition function. The performance of GCNfold surpasses that of the state-of-the-art folding algorithms, as we have incorporated minimum free energy information into the richly parameterized network, enhancing its robustness in predicting non-homologous RNA secondary structures. A Symmetric Argmax Post-processing algorithm ensures that GCNfold formulates valid structures. To validate our algorithm, we applied it to the SARS-CoV-2 E gene and determined the secondary structure of the E-gene across the Betacoronavirus subgenera.
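
The abstract names a Symmetric Argmax Post-processing step but does not describe it. One plausible reading, sketched below purely as an assumption, is to symmetrize the predicted base-pairing probability matrix and keep a pair (i, j) only when i and j are each other's most probable partner and the probability clears a threshold. This guarantees that every base pairs at most once. The function name, the averaging step, and the threshold are illustrative and may differ from GCNfold.

```python
def symmetric_argmax(probs, threshold=0.5):
    """Turn an n x n base-pairing probability matrix into a list of (i, j) pairs.

    Assumed procedure: average P[i][j] and P[j][i], pick each position's best
    partner, and keep only mutual best partners at or above `threshold`.
    """
    n = len(probs)
    if n < 2:
        return []
    sym = [[(probs[i][j] + probs[j][i]) / 2 for j in range(n)] for i in range(n)]
    best = [max((j for j in range(n) if j != i), key=lambda j: sym[i][j])
            for i in range(n)]
    pairs = []
    for i in range(n):
        j = best[i]
        if i < j and best[j] == i and sym[i][j] >= threshold:
            pairs.append((i, j))
    return pairs


# Toy 4-nucleotide matrix with made-up probabilities.
matrix = [
    [0.0, 0.1, 0.2, 0.9],
    [0.1, 0.0, 0.8, 0.1],
    [0.2, 0.6, 0.0, 0.1],
    [0.7, 0.1, 0.1, 0.0],
]
structure = symmetric_argmax(matrix)
```

The sketch does not check base complementarity or minimum loop length, which a real folding pipeline would also need to enforce.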
