1.
Nucleic Acids Res ; 52(D1): D98-D106, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37953349

ABSTRACT

Long noncoding RNAs (lncRNAs) have emerged as crucial regulators across diverse biological processes and diseases. While high-throughput sequencing has enabled lncRNA discovery, functional characterization remains limited. The EVLncRNAs database is the first and exclusive repository for all experimentally validated functional lncRNAs from various species. After previous releases in 2018 and 2021, this update marks a major expansion through exhaustive manual curation of nearly 25 000 publications from 15 May 2020 to 15 May 2023. It incorporates substantial growth across all categories: a 154% increase in functional lncRNAs, 160% in associated diseases, 186% in lncRNA-disease associations, 235% in interactions, 138% in structures, 234% in circular RNAs, 235% in resistant lncRNAs and 4724% in exosomal lncRNAs. More importantly, it incorporates additional information, including functional classifications, detailed interaction pathways, homologous lncRNAs, lncRNA locations, and COVID-19-, phase-separation- and organoid-related lncRNAs. The web interface was substantially improved for browsing, visualization, and searching. ChatGPT was tested for information extraction and functional overview, with its limitations noted. EVLncRNAs 3.0 represents the most extensive curated resource of experimentally validated functional lncRNAs and will serve as an indispensable platform for unravelling emerging lncRNA functions. The updated database is freely available at https://www.sdklab-biophysics-dzu.net/EVLncRNAs3/.


Subjects
Nucleic Acid Databases, Long Noncoding RNA, Data Management, Information Storage and Retrieval, Long Noncoding RNA/genetics
2.
Nucleic Acids Res ; 52(1): e3, 2024 Jan 11.
Article in English | MEDLINE | ID: mdl-37941140

ABSTRACT

Compared with proteins, DNA and RNA are more difficult languages to interpret because four-letter coded DNA/RNA sequences have less information content than 20-letter coded protein sequences. While BERT (Bidirectional Encoder Representations from Transformers)-like language models have been developed for RNA, they are ineffective at capturing the evolutionary information from homologous sequences because unlike proteins, RNA sequences are less conserved. Here, we have developed an unsupervised multiple sequence alignment-based RNA language model (RNA-MSM) by utilizing homologous sequences from an automatic pipeline, RNAcmap, as it can provide significantly more homologous sequences than manually annotated Rfam. We demonstrate that the resulting unsupervised, two-dimensional attention maps and one-dimensional embeddings from RNA-MSM contain structural information. In fact, they can be directly mapped with high accuracy to 2D base pairing probabilities and 1D solvent accessibilities, respectively. Further fine-tuning led to significantly improved performance on these two downstream tasks compared with existing state-of-the-art techniques including SPOT-RNA2 and RNAsnap2. By comparison, RNA-FM, a BERT-based RNA language model, performs worse than one-hot encoding with its embedding in base pair and solvent-accessible surface area prediction. We anticipate that the pre-trained RNA-MSM model can be fine-tuned on many other tasks related to RNA structure and function.


Subjects
Machine Learning, RNA, Sequence Alignment, DNA/chemistry, Proteins, RNA/chemistry, Solvents
3.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37204193

ABSTRACT

Determining intrinsically disordered regions of proteins is essential for elucidating protein biological functions and the mechanisms of their associated diseases. As the gap between the number of experimentally determined protein structures and the number of protein sequences continues to grow exponentially, there is a need for developing an accurate and computationally efficient disorder predictor. However, current single-sequence-based methods are of low accuracy, while evolutionary profile-based methods are computationally intensive. Here, we proposed a fast and accurate protein disorder predictor, LMDisorder, that employed embeddings generated by unsupervised pretrained language models as features. We showed that LMDisorder performs best among all single-sequence-based methods and is comparable to or better than another language-model-based technique on four independent test sets. Furthermore, LMDisorder showed equivalent or even better performance than the state-of-the-art profile-based technique SPOT-Disorder2. In addition, the high computational efficiency of LMDisorder enabled proteome-scale analysis of the human proteome, showing that proteins with high predicted disorder content were associated with specific biological functions. The datasets, the source code, and the trained model are available at https://github.com/biomed-AI/LMDisorder.


Subjects
Proteome, Software, Humans, Amino Acid Sequence, Biological Evolution
4.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36573492

ABSTRACT

Long non-coding RNAs (lncRNAs) play essential roles in nearly every biological process and disease. Many algorithms have been developed to distinguish lncRNAs from mRNAs in transcriptomic data and have facilitated the discovery of more than 600 000 lncRNAs. However, only a tiny fraction (<1%) of lncRNA transcripts (~4000) have been further validated by low-throughput experiments (EVlncRNAs). Given the cost and labor-intensive nature of experimental validation, it is necessary to develop computational tools to prioritize potentially functional lncRNAs, because many lncRNAs from high-throughput sequencing (HTlncRNAs) could result from transcriptional noise. Here, we employed deep learning algorithms to separate EVlncRNAs from HTlncRNAs and mRNAs. To overcome the challenge of small datasets, we employed a three-layer deep-learning neural network (DNN) with a K-mer feature as the input and a small convolutional neural network (CNN) with one-hot encoding as the input. Three separate models were trained for human (h), mouse (m) and plant (p), respectively. The final concatenated models (EVlncRNA-Dpred (h), EVlncRNA-Dpred (m) and EVlncRNA-Dpred (p)) provided substantial improvement over a previous model based on support-vector machines (EVlncRNA-pred). For example, EVlncRNA-Dpred (h) achieved 0.896 for the area under the receiver-operating characteristic curve, compared with 0.582 given by the sequence-based EVlncRNA-pred model. The models developed here should be useful for screening lncRNA transcripts for experimental validation. EVlncRNA-Dpred is available as a web server at https://www.sdklab-biophysics-dzu.net/EVlncRNA-Dpred/index.html, and the data and source code are freely available along with the web server.
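
The two input representations named in this abstract, a K-mer feature and one-hot encoding, can be sketched in a few lines of Python. This is only an illustration of the general techniques, not the EVlncRNA-Dpred code: the function names, the choice of k = 3, the mapping of T to U, and the handling of ambiguous bases are all assumptions made here.

```python
from itertools import product

ALPHABET = "ACGU"  # assumed RNA alphabet; T is mapped to U below


def kmer_frequencies(seq: str, k: int = 3) -> list[float]:
    """Return normalized counts of every possible k-mer, ordered as product(ALPHABET)."""
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
    total = sum(counts.values())
    return [counts[m] / total if total else 0.0 for m in kmers]


def one_hot(seq: str) -> list[list[int]]:
    """Encode each nucleotide as a 4-element indicator vector (all zeros if unknown)."""
    seq = seq.upper().replace("T", "U")
    return [[int(base == letter) for letter in ALPHABET] for base in seq]
```

The K-mer vector has a fixed length (4^k) regardless of transcript length, which suits a fully connected network, while the one-hot matrix keeps positional order, which is what a convolutional network consumes.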


Subjects
Deep Learning, Long Noncoding RNA, Humans, Animals, Mice, Long Noncoding RNA/genetics, Computational Biology/methods, Software, Algorithms, Messenger RNA/genetics
5.
Gastroenterology ; 165(3): 629-646, 2023 09.
Article in English | MEDLINE | ID: mdl-37247644

ABSTRACT

BACKGROUND & AIMS: Hyperactivation of ribosome biogenesis leads to hepatocyte transformation and plays pivotal roles in hepatocellular carcinoma (HCC) development. We aimed to identify critical ribosome biogenesis proteins that are overexpressed and crucial in HCC progression. METHODS: HEAT repeat containing 1 (HEATR1) expression and clinical correlations were analyzed using The Cancer Genome Atlas and Gene Expression Omnibus databases and further evaluated by immunohistochemical analysis of an HCC tissue microarray. Gene expression was knocked down by small interfering RNA. HEATR1-knockdown cells were subjected to viability, cell cycle, and apoptosis assays and used to establish subcutaneous and orthotopic tumor models. Chromatin immunoprecipitation and quantitative polymerase chain reaction were performed to detect the association of candidate proteins with specific DNA sequences. Endogenous coimmunoprecipitation combined with mass spectrometry was used to identify protein interactions. We performed immunoblot and immunofluorescence assays to detect and localize proteins in cells. The nucleolus ultrastructure was detected by transmission electron microscopy. Click-iT (Thermo Fisher Scientific) RNA imaging and puromycin incorporation assays were used to measure nascent ribosomal RNA and protein synthesis, respectively. Proteasome activity, 20S proteasome foci formation, and protein stability were evaluated in HEATR1-knockdown HCC cells. RESULTS: HEATR1 was the most up-regulated gene in a set of ribosome biogenesis mediators in HCC samples. High expression of HEATR1 was associated with poor survival and malignant clinicopathologic features in patients with HCC and contributed to HCC growth in vitro and in vivo. HEATR1 expression was regulated by the transcription factor specificity protein 1, which can be activated by insulin-like growth factor 1-mammalian target of rapamycin complex 1 signaling in HCC cells. HEATR1 localized predominantly in the nucleolus, bound to ribosomal DNA, and was associated with RNA polymerase I transcription/processing factors. Knockdown of HEATR1 disrupted ribosomal RNA biogenesis and impaired nascent protein synthesis, leading to reduced cytoplasmic proteasome activity and inhibitory-κB/nuclear factor-κB signaling. Moreover, HEATR1 knockdown induced nucleolar stress with increased nuclear proteasome activity and inactivation of the nucleophosmin 1-MYC axis. CONCLUSIONS: Our study revealed that HEATR1 is up-regulated by insulin-like growth factor 1-mammalian target of rapamycin complex 1-specificity protein 1 signaling in HCC and functions as a crucial regulator of ribosome biogenesis and proteome homeostasis to promote HCC development.


Subjects
Hepatocellular Carcinoma, Liver Neoplasms, Humans, Hepatocellular Carcinoma/pathology, Tumor Cell Line, Cell Proliferation/genetics, Neoplastic Gene Expression Regulation, Homeostasis, Hot Temperature, Insulin-Like Growth Factor I/genetics, Liver Neoplasms/pathology, Mechanistic Target of Rapamycin Complex 1/metabolism, Proteasome Endopeptidase Complex/genetics, Proteome/metabolism, Ribosomes/metabolism, Ribosomes/pathology, Ribosomal RNA/genetics, Ribosomal RNA/metabolism
6.
Biochem Biophys Res Commun ; 724: 150224, 2024 Sep 10.
Article in English | MEDLINE | ID: mdl-38851139

ABSTRACT

Despite intensive search over the past decades, only a few small-molecule DNA fluorescent dyes were found with large Stokes shifts. These molecules, however, are often too toxic for widespread usage. Here, we designed DNA-specific fluorescent dyes rooted in benzimidazole architectures with a hitherto unexplored molecular framework based on thiazole-benzimidazole scaffolding. We further incorporated a pyrazole ring with an extended sidechain to prevent cell penetration. These novel benzimidazole derivatives were predicted by quantum calculations and subsequently validated to have large Stokes shifts ranging from 135 to 143 nm, with their emission colors changed from capri blue for the Hoechst reference compound to iguana green. These readily-synthesized compounds, which displayed improved DNA staining intensity and detection limits along with a complete loss of capability for cellular membrane permeation and negligible mutagenic effects as designed, offer a safer alternative to the existing high-performance small-molecule DNA fluorescent dyes.


Subjects
Benzimidazoles, DNA, Fluorescent Dyes, Fluorescent Dyes/chemistry, Fluorescent Dyes/chemical synthesis, DNA/chemistry, Benzimidazoles/chemistry, Humans, Drug Design, Mutagens/chemistry, Mutagens/toxicity, DNA Damage
7.
Brief Bioinform ; 23(3)2022 05 13.
Article in English | MEDLINE | ID: mdl-35348613

ABSTRACT

Characterization of RNA structures and functions has mostly focused on 2D (secondary) and 3D (tertiary) structures. Recent advances in experimental and computational techniques for probing or predicting RNA solvent accessibility make this 1D representation of tertiary structures an increasingly attractive feature to explore. Here, we provide a survey of these recent developments, which indicate the emergence of solvent accessibility as a simple 1D property that complements secondary and tertiary structures in investigating the complex structure-function relations of RNAs.


Subjects
RNA, Nucleic Acid Conformation, RNA/chemistry, Solvents/chemistry
8.
PLoS Comput Biol ; 19(12): e1011330, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38060617

ABSTRACT

Recent advances in deep learning have significantly improved the ability to infer protein sequences directly from protein structures for fixed-backbone design. The methods have evolved from the early use of multi-layer perceptrons to convolutional neural networks, transformers, and graph neural networks (GNN). However, the conventional approach of constructing a K-nearest-neighbors (KNN) graph for a GNN has limited the utilization of edge information, which plays a critical role in network performance. Here we introduced SPIN-CGNN, based on protein contact maps for nearest neighbors. Together with auxiliary edge updates and selective kernels, we found that SPIN-CGNN provided a performance comparable to the current state-of-the-art techniques in refolding ability by AlphaFold2 but a significant improvement over them in terms of sequence recovery, perplexity, deviation from amino-acid compositions of native sequences, conservation of hydrophobic positions, and low complexity regions, according to tests on unseen structures, "hallucinated" structures and diffusion models. The results suggest that low complexity regions in the sequences designed by deep learning, for generated structures in particular, remain to be improved when compared to the native sequences.
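
The contrast drawn in this abstract between a K-nearest-neighbors graph and a contact-map graph can be illustrated with a small sketch. It is not the SPIN-CGNN implementation: representing each residue by one 3D point, defining a contact by a distance cutoff, and the 8 Å default are assumptions chosen here for illustration.

```python
import math

Coord = tuple[float, float, float]


def pairwise_distances(coords: list[Coord]) -> list[list[float]]:
    """Full Euclidean distance matrix between residue positions."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def knn_edges(coords: list[Coord], k: int) -> set[tuple[int, int]]:
    """Directed edges from each residue to its k nearest neighbors."""
    dist = pairwise_distances(coords)
    edges = set()
    for i, row in enumerate(dist):
        others = sorted((j for j in range(len(coords)) if j != i), key=lambda j: row[j])
        edges.update((i, j) for j in others[:k])
    return edges


def contact_edges(coords: list[Coord], cutoff: float = 8.0) -> set[tuple[int, int]]:
    """Directed edges between residues closer than `cutoff` (assumed value, in angstroms)."""
    dist = pairwise_distances(coords)
    n = len(coords)
    return {(i, j) for i in range(n) for j in range(n) if i != j and dist[i][j] < cutoff}
```

With a KNN graph every residue has exactly k outgoing edges, even an isolated one; with a distance cutoff the number of edges follows local packing density, so an isolated residue simply has no neighbors.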


Subjects
Amino Acids, Neural Networks (Computer), Amino Acid Sequence, Cluster Analysis, Diffusion
9.
Proteins ; 91(12): 1771-1778, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37638558

ABSTRACT

We describe the modeling method for RNA tertiary structures employed by team AIchemy_RNA2 in the 15th Critical Assessment of Structure Prediction (CASP15). The method consists of the following steps. First, secondary structure information was derived from various manually verified sources. With this information, the full-length RNA was fragmented into structural modules. The structure of each module was predicted, and the modules were then assembled into the full structure. To reduce the conformational search space, the RNA structure was organized into an optimal base folding tree. To further improve the sampling efficiency, the energy surface was smoothed at high temperatures during the Monte Carlo sampling to make it easier to cross energy barriers. The statistical potential energy function BRiQ was employed during Monte Carlo energy optimization.


Subjects
Algorithms, RNA, RNA/chemistry, Protein Conformation, Monte Carlo Method
10.
Bioinformatics ; 38(16): 3900-3910, 2022 08 10.
Article in English | MEDLINE | ID: mdl-35751593

ABSTRACT

MOTIVATION: Recently, AlphaFold2 achieved high experimental accuracy for the majority of proteins in Critical Assessment of Structure Prediction (CASP 14). This raises the hope that one day, we may achieve the same feat for RNA structure prediction for those structured RNAs, which is as fundamentally and practically important as protein structure prediction. One major factor in the recent advancement of protein structure prediction is the highly accurate prediction of distance-based contact maps of proteins. RESULTS: Here, we showed that by integrating deep learning with physics-inferred secondary structures, co-evolutionary information and multiple sequence-alignment sampling, we can achieve RNA contact-map prediction at a level of accuracy similar to that in protein contact-map prediction. More importantly, highly accurate prediction for top L long-range contacts can be assured for those RNAs with a high effective number of homologous sequences (Neff > 50). The initial use of the predicted contact map as distance-based restraints confirmed its usefulness in 3D structure prediction. AVAILABILITY AND IMPLEMENTATION: SPOT-RNA-2D is available as a web server at https://sparks-lab.org/server/spot-rna-2d/ and as a standalone program at https://github.com/jaswindersingh2/SPOT-RNA-2D. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
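
The "top L long-range contacts" measure mentioned above is commonly computed as the precision of the L highest-scoring predicted pairs, where L is the sequence length. The sketch below shows that general idea only. The abstract does not define "long-range", so the `min_separation` default of 24 and the function name are assumptions, not values taken from SPOT-RNA-2D.

```python
def top_l_long_range_precision(
    scores: dict[tuple[int, int], float],
    true_contacts: set[tuple[int, int]],
    length: int,
    min_separation: int = 24,  # assumed threshold for "long-range"
) -> float:
    """Fraction of the top-`length` long-range predictions that are true contacts."""
    # Keep only pairs whose sequence separation qualifies as long-range.
    candidates = [
        (prob, pair)
        for pair, prob in scores.items()
        if abs(pair[0] - pair[1]) >= min_separation
    ]
    candidates.sort(key=lambda item: item[0], reverse=True)
    top = [pair for _, pair in candidates[:length]]
    if not top:
        return 0.0
    # Treat contacts as unordered pairs.
    truth = {tuple(sorted(pair)) for pair in true_contacts}
    hits = sum(1 for pair in top if tuple(sorted(pair)) in truth)
    return hits / len(top)
```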


Subjects
Computational Biology, Deep Learning, Neural Networks (Computer), RNA, Proteins/chemistry, Physics
11.
Bioinformatics ; 38(7): 1888-1894, 2022 03 28.
Article in English | MEDLINE | ID: mdl-35104320

ABSTRACT

MOTIVATION: Accurate prediction of protein contact maps is essential for accurate protein structure and function prediction. As a result, many methods have been developed for protein contact map prediction. However, most methods rely on protein-sequence-evolutionary information, which may not exist for many proteins due to a lack of naturally occurring homologous sequences. Moreover, generating evolutionary profiles is computationally intensive. Here, we developed a contact-map predictor utilizing the output of a pre-trained language model ESM-1b as an input along with a large training set and an ensemble of residual neural networks. RESULTS: We showed that the proposed method makes a significant improvement over a single-sequence-based predictor SSCpred with 15% improvement in the F1-score for the independent CASP14-FM test set. It also outperforms evolutionary-profile-based methods trRosetta and SPOT-Contact with 48.7% and 48.5% respective improvement in the F1-score on the proteins without homologs (Neff = 1) in the independent SPOT-2018 set. The new method provides a much faster and reasonably accurate alternative to evolution-based methods, useful for large-scale prediction. AVAILABILITY AND IMPLEMENTATION: A stand-alone version of SPOT-Contact-LM is available at https://github.com/jas-preet/SPOT-Contact-Single. Direct prediction can also be made at https://sparks-lab.org/server/spot-contact-single. The datasets used in this research can also be downloaded from the GitHub repository. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
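
For readers unfamiliar with the F1-score quoted above, it is the harmonic mean of precision and recall. A minimal sketch for contact maps represented as sets of residue pairs follows. The set representation and the function name are assumptions made here, and the abstract does not say how predictions were thresholded before scoring.

```python
def contact_f1(predicted: set[tuple[int, int]], actual: set[tuple[int, int]]) -> float:
    """F1-score of a predicted contact set against the reference contact set."""
    true_pos = len(predicted & actual)
    false_pos = len(predicted - actual)
    false_neg = len(actual - predicted)
    if true_pos == 0:
        return 0.0  # avoids division by zero when nothing is correctly predicted
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)
```

For example, a prediction with two correct pairs, one spurious pair and two missed pairs has precision 2/3 and recall 1/2, giving F1 = 4/7.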


Subjects
Computational Biology, Language, Computational Biology/methods, Proteins/chemistry, Neural Networks (Computer), Amino Acid Sequence
12.
Acta Pharmacol Sin ; 44(7): 1487-1499, 2023 Jul.
Article in English | MEDLINE | ID: mdl-36759643

ABSTRACT

Ebola virus (EBOV) causes hemorrhagic fever in humans with high morbidity and fatality. Although over 45 years have passed since the first EBOV outbreak, small molecule drugs are not yet available. Ebola viral protein VP30 is a unique RNA synthesis cofactor, and the VP30/NP interaction plays a critical role in initiating the transcription and propagation of EBOV. Here, we designed a high-throughput screening technique based on a competitive binding assay in which a chemical compound competes with an NP-derived peptide for binding to VP30. By screening a library of 8004 compounds, we obtained two lead compounds, Embelin and Kobe2602. The binding of these compounds to the VP30-NP interface was validated by dose-dependent competitive binding assay, surface plasmon resonance, and thermal shift assay. Moreover, the compounds were confirmed to inhibit the transcription and replication of the Ebola genome by a minigenome assay. Similar results were obtained for their two respective analogs (8-gingerol and Kobe0065). Interestingly, these two structurally different molecules exhibit synergistic binding to the VP30/NP interface. The antiviral potency (EC50) improved from 1 µM for Kobe0065 alone to 351 nM when Kobe0065 and Embelin were combined in a 4:1 ratio. The synergistic anti-EBOV effect provides a strong incentive for further developing these lead compounds in future studies.


Subjects
Ebolavirus, Ebola Virus Disease, Humans, Ebolavirus/genetics, Ebolavirus/metabolism, Ebola Virus Disease/drug therapy, Nucleoproteins/genetics, Nucleoproteins/metabolism, Viral RNA/genetics, Viral RNA/metabolism, Transcription Factors/metabolism, Genetic Transcription, Virus Replication
13.
Mol Ther ; 30(10): 3284-3299, 2022 10 05.
Article in English | MEDLINE | ID: mdl-35765243

ABSTRACT

Existing evidence indicates that gut fungal dysbiosis might play a key role in the pathogenesis of colorectal cancer (CRC). We sought to explore whether reversing the fungal dysbiosis by terbinafine, an approved antifungal drug, might inhibit the development of CRC. A population-based study from Sweden identified a total of 185 patients who received terbinafine after their CRC diagnosis and found that they had a decreased risk of death (hazard ratio = 0.50) and metastasis (hazard ratio = 0.44) compared with patients without terbinafine administration. In multiple mouse models of CRC, administration of terbinafine decreased the fungal load, the fungus-induced myeloid-derived suppressor cell (MDSC) expansion, and the tumor burden. Fecal microbiota transplantation from mice without terbinafine treatment reversed MDSC infiltration and partially restored tumor proliferation. Mechanistically, terbinafine directly impaired tumor cell proliferation by reducing the ratio of nicotinamide adenine dinucleotide phosphate (NADP+) to reduced form of nicotinamide adenine dinucleotide phosphate (NADPH), suppressing the activity of glucose-6-phosphate dehydrogenase (G6PD), resulting in nucleotide synthesis disruption, deoxyribonucleotide (dNTP) starvation, and cell-cycle arrest. Collectively, terbinafine can inhibit CRC by reversing fungal dysbiosis, suppressing tumor cell proliferation, inhibiting fungus-induced MDSC infiltration, and restoring antitumor immune response.


Subjects
Colorectal Neoplasms, Terbinafine, Animals, Antifungal Agents, Colorectal Neoplasms/drug therapy, Colorectal Neoplasms/pathology, Deoxyribonucleotides, Dysbiosis, Glucosephosphate Dehydrogenase, Mice, NADP, Terbinafine/pharmacology
14.
Nucleic Acids Res ; 49(D1): D298-D308, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33119734

ABSTRACT

We present DescribePROT, the database of predicted amino acid-level descriptors of structure and function of proteins. DescribePROT delivers a comprehensive collection of 13 complementary descriptors predicted using 10 popular and accurate algorithms for 83 complete proteomes that cover key model organisms. The current version includes 7.8 billion predictions for close to 600 million amino acids in 1.4 million proteins. The descriptors encompass sequence conservation, position specific scoring matrix, secondary structure, solvent accessibility, intrinsic disorder, disordered linkers, signal peptides, MoRFs and interactions with proteins, DNA and RNAs. Users can search DescribePROT by the amino acid sequence and the UniProt accession number and entry name. The pre-computed results are made available instantaneously. The predictions can be accessed via an interactive graphical interface that allows simultaneous analysis of multiple descriptors and can also be downloaded in structured formats at the protein, proteome and whole database scale. The putative annotations included in DescribePROT are useful for a broad range of studies, including investigations of protein function, applied projects focusing on therapeutics and diseases, and the development of predictors for other protein sequence descriptors. Future releases will expand the coverage of DescribePROT. DescribePROT can be accessed at http://biomine.cs.vcu.edu/servers/DESCRIBEPROT/.


Subjects
Amino Acids/chemistry, Protein Databases, Genome, Proteins/genetics, Proteome/genetics, Software, Amino Acid Sequence, Amino Acids/metabolism, Animals, Archaea/genetics, Archaea/metabolism, Bacteria/genetics, Bacteria/metabolism, Binding Sites, Conserved Sequence, Fungi/genetics, Fungi/metabolism, Humans, Internet, Plants/genetics, Plants/metabolism, Prokaryotic Cells/metabolism, Protein Binding, Protein Secondary Structure, Proteins/chemistry, Proteins/classification, Proteins/metabolism, Proteome/chemistry, Proteome/metabolism, Protein Sequence Analysis, Viruses/genetics, Viruses/metabolism
15.
Nucleic Acids Res ; 49(D1): D86-D91, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33221906

ABSTRACT

Long non-coding RNAs (lncRNAs) play important functional roles in many diverse biological processes. However, not all expressed lncRNAs are functional. Thus, it is necessary to manually collect all experimentally validated functional lncRNAs (EVlncRNAs) with their sequences, structures, and functions annotated in a central database. The first release of such a database (EVLncRNAs) was made using the literature prior to 1 May 2016. Since then (till 15 May 2020), 19 245 articles related to lncRNAs have been published. In EVLncRNAs 2.0, these articles were manually examined for a major expansion of the data collected. Specifically, the number of annotated EVlncRNAs, associated diseases, lncRNA-disease associations, and interaction records were increased by 260%, 320%, 484% and 537%, respectively. Moreover, the database has added several new categories: 8 lncRNA structures, 33 exosomal lncRNAs, 188 circular RNAs, and 1079 drug-resistant, chemoresistant, and stress-resistant lncRNAs. All records have been checked against known retracted and fake articles. This release also comes with a highly interactive visual interaction network that helps users track the underlying relations among lncRNAs, miRNAs, proteins, genes and other functional elements. Furthermore, it provides links to four new bioinformatics tools with improved data browsing and searching functionality. EVLncRNAs 2.0 is freely available at https://www.sdklab-biophysics-dzu.net/EVLncRNAs2/.


Subjects
Computational Biology/methods, Nucleic Acid Databases/organization & administration, Circular RNA/genetics, Long Noncoding RNA/genetics, Software, Animals, Bibliometrics, Antineoplastic Drug Resistance/genetics, Exosomes/chemistry, Exosomes/genetics, Humans, Internet, Plants/genetics, Circular RNA/classification, Circular RNA/metabolism, Long Noncoding RNA/classification, Long Noncoding RNA/metabolism, Physiological Stress
16.
Antimicrob Agents Chemother ; 66(1): e0154221, 2022 01 18.
Article in English | MEDLINE | ID: mdl-34633841

ABSTRACT

Neisseria gonorrhoeae is an increasing public health threat due to its rapidly rising incidence and antibiotic resistance. There are an estimated 106 million cases per year worldwide, there is no vaccine available to prevent infection, and N. gonorrhoeae strains that are resistant to all antibiotics routinely used to treat the infection have emerged. In many strains, antibiotic resistance is mediated by overexpression of the MtrCDE efflux pump, which enables the bacteria to transport toxic antibiotics out of the cell. Genetic mutations that inactivate MtrCDE have previously been shown to render resistant strains susceptible to certain antibiotics. Here, we show that peptides rationally designed to target and disrupt the activity of each of the three protein components of MtrCDE were able to increase the susceptibility of N. gonorrhoeae strains to antibiotics in a dose-dependent manner and with no toxicity to human cells. Cotreatment of bacteria with subinhibitory concentrations of the peptide led to 2- to 64-fold increases in susceptibility to erythromycin, azithromycin, ciprofloxacin, and/or ceftriaxone in N. gonorrhoeae strains FA1090, WHO K, WHO P, and WHO X. The cotreatment experiments with peptides P-MtrC1 and P-MtrE1 resulted in increased susceptibilities of WHO P and WHO X to azithromycin, ciprofloxacin, and ceftriaxone that were of the same magnitude seen in MtrCDE mutants. P-MtrE1 was able to change the azithromycin resistance profile of WHO P from resistant to susceptible. Data presented here demonstrate that these peptides may be developed for use as a dual treatment with existing antibiotics to treat multidrug-resistant gonococcal infections.


Subjects
Gonorrhea, Neisseria gonorrhoeae, Anti-Bacterial Agents/metabolism, Anti-Bacterial Agents/pharmacology, Azithromycin/pharmacology, Bacterial Proteins/genetics, Bacterial Proteins/metabolism, Bacterial Drug Resistance/genetics, Gonorrhea/drug therapy, Gonorrhea/microbiology, Humans, Microbial Sensitivity Tests, Neisseria gonorrhoeae/genetics, Neisseria gonorrhoeae/metabolism, Peptides/metabolism, Peptides/pharmacology, Repressor Proteins/genetics
17.
Bioinformatics ; 38(1): 86-93, 2021 12 22.
Article in English | MEDLINE | ID: mdl-34406339

ABSTRACT

MOTIVATION: Despite many successes, de novo protein design is not yet a solved problem as its success rate remains low. The low success rate is largely because we do not yet have an accurate energy function for describing the solvent-mediated interaction between amino acid residues in a protein chain. Previous studies showed that an energy function based on series expansions with its parameters optimized for side-chain and loop conformations can lead to one of the most accurate methods for side chain (OSCAR) and loop prediction (LEAP). Following the same strategy, we developed an energy function based on series expansions with the parameters optimized in four separate stages (recovering single-residue types without and with orientation dependence, selecting loop decoys and maintaining the composition of amino acids). We tested the energy function for de novo design by using Monte Carlo simulated annealing. RESULTS: The method for protein design (OSCAR-Design) is found to be as accurate as OSCAR and LEAP for side-chain and loop prediction, respectively. In de novo design, it can recover native residue types ranging from 38% to 43% depending on test sets, conserve hydrophobic/hydrophilic residues at ∼75%, and yield the overall similarity in amino acid compositions at more than 90%. These performance measures are all statistically significantly better than several protein design programs compared. Moreover, the largest hydrophobic patch areas in designed proteins are near or smaller than those in native proteins. Thus, an energy function based on series expansion can be made useful for protein design. AVAILABILITY AND IMPLEMENTATION: The Linux executable version is freely available for academic users at http://zhouyq-lab.szbl.ac.cn/resources/.
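
The Monte Carlo simulated annealing used for de novo design in this abstract follows a well-known general scheme: propose a random mutation, accept it with the Metropolis criterion, and lower the temperature over time. The sketch below shows that scheme on a toy problem. It is not OSCAR-Design: the two-letter H/P alphabet, the `BURIED` pattern, the mismatch-counting `toy_energy`, and the geometric cooling schedule are all hypothetical stand-ins for the series-expansion energy function described above.

```python
import math
import random

BURIED = "BBEEBEBBEE"  # hypothetical burial pattern: B = buried site, E = exposed site


def toy_energy(seq) -> int:
    """Toy energy: number of sites where hydrophobicity (H) does not match burial (B)."""
    return sum(1 for res, site in zip(seq, BURIED) if (res == "H") != (site == "B"))


def simulated_annealing(energy, initial, alphabet, steps=2000,
                        t_start=2.0, t_end=0.05, seed=0):
    """Metropolis Monte Carlo over sequences with a geometric cooling schedule."""
    rng = random.Random(seed)
    current = list(initial)
    current_e = energy(current)
    best, best_e = current[:], current_e
    for step in range(steps):
        # Temperature decays geometrically from t_start to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        pos = rng.randrange(len(current))
        old = current[pos]
        current[pos] = rng.choice(alphabet)  # propose a single-site mutation
        new_e = energy(current)
        delta = new_e - current_e
        # Metropolis criterion: always accept downhill, sometimes accept uphill.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current_e = new_e
            if current_e < best_e:
                best, best_e = current[:], current_e
        else:
            current[pos] = old  # reject the move and restore the residue
    return "".join(best), best_e
```

Calling `simulated_annealing(toy_energy, "P" * 10, "HP")` returns the lowest-energy sequence visited and its energy. Accepting some uphill moves at high temperature is what lets the search escape local minima before the schedule cools.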


Subjects
Amino Acids, Proteins, Proteins/chemistry, Solvents, Protein Conformation
18.
Bioinformatics ; 38(1): 125-132, 2021 12 22.
Article in English | MEDLINE | ID: mdl-34498061

ABSTRACT

MOTIVATION: Protein-protein interactions (PPI) play crucial roles in many biological processes, and identifying PPI sites is an important step for mechanistic understanding of diseases and design of novel drugs. Since experimental approaches for PPI site identification are expensive and time-consuming, many computational methods have been developed as screening tools. However, these methods are mostly based on features of neighboring residues in sequence and are thus limited in capturing spatial information. RESULTS: We propose a deep graph-based framework, deep Graph convolutional network for Protein-Protein-Interacting Site prediction (GraphPPIS), in which the PPI site prediction problem was converted into a graph node classification task and solved by deep learning using the initial residual and identity mapping techniques. We showed that a deeper architecture (up to eight layers) allows significant performance improvement over other sequence-based and structure-based methods by more than 12.5% and 10.5% on AUPRC and MCC, respectively. Further analyses indicated that the predicted interacting sites by GraphPPIS are more spatially clustered and closer to the native ones even when false-positive predictions are made. The results highlight the importance of capturing spatially neighboring residues for interacting site prediction. AVAILABILITY AND IMPLEMENTATION: The datasets, the pre-computed features, and the source code along with the pre-trained models of GraphPPIS are available at https://github.com/biomed-AI/GraphPPIS. The GraphPPIS web server is freely available at https://biomed.nscc-gz.cn/apps/GraphPPIS. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Neural Networks, Computer , Proteins
19.
Bioinformatics ; 37(20): 3494-3500, 2021 Oct 25.
Article in English | MEDLINE | ID: mdl-34021744

ABSTRACT

MOTIVATION: The accuracy of RNA secondary and tertiary structure prediction can be significantly improved by using structural restraints derived from evolutionary coupling or direct coupling analysis. Currently, these coupling analyses rely on manually curated multiple sequence alignments collected in the Rfam database, which contains 3016 families. By comparison, millions of non-coding RNA sequences are known. Here, we established RNAcmap, a fully automatic pipeline that enables evolutionary coupling analysis for any RNA sequence. The homology search was based on the covariance model built by INFERNAL according to one of two secondary structure predictors: the folding-based algorithm RNAfold and the latest deep-learning method SPOT-RNA. RESULTS: We showed that the performance of RNAcmap depends less on the specific evolutionary coupling tool than on the accuracy of the secondary structure predictor, with the best performance given by RNAcmap (SPOT-RNA). The performance of RNAcmap (SPOT-RNA) is comparable to that based on Rfam-supplied alignments and is consistent for sequences that are not in Rfam collections. Further improvement can be made with a simple meta predictor, RNAcmap (SPOT-RNA/RNAfold), which selects whichever secondary structure predictor finds more homologous sequences. Reliable base-pairing information generated from RNAcmap, in particular for RNAs with a high number of effective homologous sequences, will be useful for aiding RNA structure prediction. AVAILABILITY AND IMPLEMENTATION: RNAcmap is available as a web server at https://sparks-lab.org/server/rnacmap/ and as a standalone application along with the datasets at https://github.com/sparks-lab-org/RNAcmap_standalone. A platform-independent and fully configured docker image of RNAcmap is also provided at https://hub.docker.com/r/jaswindersingh2/rnacmap. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

20.
Bioinformatics ; 37(20): 3464-3472, 2021 Oct 25.
Article in English | MEDLINE | ID: mdl-33983382

ABSTRACT

MOTIVATION: Knowing protein secondary structure and other one-dimensional structural properties is essential for accurate protein structure and function prediction. As a result, many methods have been developed for predicting these one-dimensional structural properties. However, most methods rely on evolutionary information that may not exist for many proteins due to a lack of sequence homologs. Moreover, obtaining evolutionary information is computationally intensive as the library of protein sequences continues to expand exponentially. Here, we developed a new single-sequence method called SPOT-1D-Single based on a large training dataset of 39 120 proteins deposited prior to 2016 and an ensemble of hybrid long short-term memory bidirectional neural network and convolutional neural network. RESULTS: We showed that SPOT-1D-Single consistently improves over SPIDER3-Single and ProteinUnet for secondary structure, solvent accessibility, contact number and backbone angle prediction for all seven independent test sets (TEST2018, SPOT-2016, SPOT-2016-HQ, SPOT-2018, SPOT-2018-HQ, CASP12 and CASP13 free-modeling targets). For example, the accuracy of predicted three-state secondary structure ranges from 72.12% to 74.28% for SPOT-1D-Single, compared to 69.1-72.6% for SPIDER3-Single and 70.6-73% for ProteinUnet. SPOT-1D-Single also predicts SS3 and SS8 with 6.24% and 6.98% better accuracy, respectively, than SPOT-1D on SPOT-2018 proteins with no homologs (Neff = 1). The new method's improvement over existing techniques is due to a larger training set combined with ensemble learning. AVAILABILITY AND IMPLEMENTATION: A standalone version of SPOT-1D-Single is available at https://github.com/jas-preet/SPOT-1D-Single. Direct prediction can also be made at https://sparks-lab.org/server/spot-1d-single. The datasets used in this research can also be downloaded from GitHub. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
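
The three-state (SS3) and eight-state (SS8) accuracies quoted above are per-residue agreement rates between predicted and reference secondary structure strings. A minimal sketch of that calculation follows, assuming a commonly used eight-to-three state reduction (H, G, I to helix; E, B to strand; everything else to coil); the abstract does not state which reduction the authors used, and the example strings are hypothetical.

```python
# Assumption: a widely used DSSP-style reduction; any state not listed
# here (for example T, S, C or '-') is treated as coil.
EIGHT_TO_THREE = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def reduce_to_three_states(states):
    """Map an eight-state string to three states (H, E, C)."""
    return "".join(EIGHT_TO_THREE.get(state, "C") for state in states)


def per_residue_accuracy(reference, predicted):
    """Fraction of positions where the predicted state matches."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have equal length")
    if not reference:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for r, p in zip(reference, predicted) if r == p)
    return matches / len(reference)


# Hypothetical eight-state strings for a 10-residue segment.
reference_ss8 = "HHHGEEBTSC"
predicted_ss8 = "HHHHEEETTE"

q8 = per_residue_accuracy(reference_ss8, predicted_ss8)
q3 = per_residue_accuracy(
    reduce_to_three_states(reference_ss8),
    reduce_to_three_states(predicted_ss8),
)
print(q8, q3)
```

Three-state accuracy is never lower than eight-state accuracy for the same pair of strings, since any eight-state match remains a match after reduction.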
