Results 1 - 13 of 13
1.
bioRxiv ; 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-37162909

ABSTRACT

Human genome sequencing studies have identified numerous loci associated with complex diseases. However, translating human genetic and genomic findings to disease pathobiology and therapeutic discovery remains a major challenge at multiscale interactome network levels. Here, we present a deep-learning-based ensemble framework, termed PIONEER (Protein-protein InteractiOn iNtErfacE pRediction), that accurately predicts protein binding partner-specific interfaces for all known protein interactions in humans and seven other common model organisms, generating comprehensive structurally-informed protein interactomes. We demonstrate that PIONEER outperforms existing state-of-the-art methods. We further systematically validated PIONEER predictions experimentally through generating 2,395 mutations and testing their impact on 6,754 mutation-interaction pairs, confirming the high quality and validity of PIONEER predictions. We show that disease-associated mutations are enriched in PIONEER-predicted protein-protein interfaces after mapping mutations from ~60,000 germline exomes and ~36,000 somatic genomes. We identify 586 significant protein-protein interactions (PPIs) enriched with PIONEER-predicted interface somatic mutations (termed oncoPPIs) from pan-cancer analysis of ~11,000 tumor whole-exomes across 33 cancer types. We show that PIONEER-predicted oncoPPIs are significantly associated with patient survival and drug responses from both cancer cell lines and patient-derived xenograft mouse models. We identify a landscape of PPI-perturbing tumor alleles upon ubiquitination by E3 ligases, and we experimentally validate the tumorigenic KEAP1-NRF2 interface mutation p.Thr80Lys in non-small cell lung cancer. 
We show that PIONEER-predicted PPI-perturbing alleles alter protein abundance and correlate with drug responses and patient survival in colon and uterine cancers, as demonstrated by proteogenomic data from the National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium. PIONEER, implemented as both a web server platform and a software package, identifies functional consequences of disease-associated alleles and offers a deep learning tool for precision medicine at multiscale interactome network levels.

2.
IEEE J Biomed Health Inform ; 27(9): 4421-4432, 2023 09.
Article in English | MEDLINE | ID: mdl-37310830

ABSTRACT

Breast ultrasound (BUS) image segmentation is a critical procedure in the diagnosis and quantitative analysis of breast cancer. Most existing methods for BUS image segmentation do not effectively utilize the prior information extracted from the images. In addition, breast tumors have very blurred boundaries, various sizes and irregular shapes, and the images have a lot of noise. Thus, tumor segmentation remains a challenge. In this article, we propose a BUS image segmentation method using a boundary-guided and region-aware network with global scale-adaptive (BGRA-GSA). Specifically, we first design a global scale-adaptive module (GSAM) to extract features of tumors of different sizes from multiple perspectives. GSAM encodes the features at the top of the network in both channel and spatial dimensions, which can effectively extract multi-scale context and provide global prior information. Moreover, we develop a boundary-guided module (BGM) for fully mining boundary information. BGM guides the decoder to learn the boundary context by explicitly enhancing the extracted boundary features. Simultaneously, we design a region-aware module (RAM) for realizing the cross-fusion of diverse layers of breast tumor diversity features, which can facilitate the network to improve the learning ability of contextual features of tumor regions. These modules enable our BGRA-GSA to capture and integrate rich global multi-scale context, multi-level fine-grained details, and semantic information to facilitate accurate breast tumor segmentation. Finally, the experimental results on three publicly available datasets show that our model achieves highly effective segmentation of breast tumors even with blurred boundaries, various sizes and shapes, and low contrast.


Subjects
Breast Neoplasms , Mammary Ultrasonography , Humans , Female , Ultrasonography , Semantics , Computer-Assisted Image Processing
3.
Eur J Med Chem ; 257: 115500, 2023 Sep 05.
Article in English | MEDLINE | ID: mdl-37262996

ABSTRACT

Small molecules have been providing medical breakthroughs for human diseases for more than a century. Recently, identifying small molecule inhibitors that target microRNAs (miRNAs) has gained importance, despite the challenges posed by labour-intensive screening experiments and the significant efforts required for medicinal chemistry optimization. Numerous experimentally verified cases have demonstrated the potential of miRNA-targeted small molecule inhibitors for disease treatment. This new approach is grounded in the posttranscriptional regulation by miRNAs of the expression of disease-associated genes. Reversing dysregulated gene expression using this mechanism may help control dysfunctional pathways. Furthermore, the ongoing improvement of algorithms has allowed for the integration of computational strategies built on top of laboratory-based data, facilitating a more precise and rational design and discovery of lead compounds. To complement the use of extensive pharmacogenomics data in prioritising potential drugs, our previous work introduced a computational approach based only on molecular sequences. Moreover, various computational tools for predicting molecular interactions in biological networks using similarity-based inference techniques have accumulated in established studies. However, there are a limited number of comprehensive reviews covering both computational and experimental drug discovery processes. In this review, we provide a cohesive overview of both biological and computational applications in miRNA-targeted drug discovery, along with their disease implications and clinical significance. Finally, utilizing drug-target interaction (DTI) data from DrugBank, we showcase the effectiveness of deep learning for obtaining the physicochemical characterization of DTIs.


Subjects
MicroRNAs , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , Gene Expression Regulation , Algorithms , Molecular Structure , Drug Discovery
4.
Curr Opin Struct Biol ; 73: 102329, 2022 04.
Article in English | MEDLINE | ID: mdl-35139457

ABSTRACT

Bolstered by recent methodological and hardware advances, deep learning has increasingly been applied to biological problems and structural proteomics. Such approaches have achieved remarkable improvements over traditional machine learning methods in tasks ranging from protein contact map prediction to protein folding, prediction of protein-protein interaction interfaces, and characterization of protein-drug binding pockets. In particular, the emergence of ab initio protein structure prediction methods, including AlphaFold2, has revolutionized protein structural modeling. From a protein function perspective, numerous deep learning methods have facilitated deconvolution of the exact amino acid residues and protein surface regions responsible for binding other proteins or small molecule drugs. In this review, we provide a comprehensive overview of recent deep learning methods applied in structural proteomics.


Subjects
Deep Learning , Proteome , Computational Biology/methods , Protein Conformation , Protein Folding
5.
Curr Opin Struct Biol ; 72: 219-225, 2022 02.
Article in English | MEDLINE | ID: mdl-34959033

ABSTRACT

Protein-protein interfaces have been attracting great attention owing to their critical roles in protein-protein interactions and the fact that human disease-related mutations are generally enriched in them. Recently, substantial research progress has been made in this field, which has significantly promoted the understanding and treatment of various human diseases. For example, many studies have discovered the properties of disease-related mutations. Besides, as more large-scale experimental data become available, various computational approaches have been proposed to advance our understanding of disease mutations from the data. Here, we overview recent advances in characteristics of disease-related mutations at protein-protein interfaces, mutation effects on protein interactions, and investigation of mutations on specific diseases.


Subjects
Proteins , Humans , Mutation , Proteins/genetics , Proteins/metabolism
6.
Proteomics ; 21(23-24): e2100145, 2021 12.
Article in English | MEDLINE | ID: mdl-34647422

ABSTRACT

Deciphering the interaction networks and structural dynamics of proteins is pivotal to better understanding their biological functions. Cross-linking mass spectrometry (XL-MS) is a powerful and increasingly popular technology that provides information about protein-protein interactions and their structural constraints for individual proteins and multiprotein complexes on a proteome-scale. In this review, we first assess the coverage and depth of the XL-MS technique by utilizing publicly available datasets. We then delve into the progress in XL-MS experimental and computational methodologies and examine different quality-control strategies reported in the literature. Finally, we discuss the progress in XL-MS applications along with the scope for future improvements.


Subjects
Proteome , Proteomics , Cross-Linking Reagents , Mass Spectrometry , Multiprotein Complexes
7.
BMC Bioinformatics ; 21(1): 377, 2020 Sep 03.
Article in English | MEDLINE | ID: mdl-32883200

ABSTRACT

BACKGROUND: A large number of experimental studies show that the mutation and regulation of long non-coding RNAs (lncRNAs) are associated with various human diseases. Accurate prediction of lncRNA-disease associations can provide a new perspective for the diagnosis and treatment of diseases. The main function of many lncRNAs is still unclear and using traditional experiments to detect lncRNA-disease associations is time-consuming. RESULTS: In this paper, we develop a novel and effective method for the prediction of lncRNA-disease associations using network feature similarity and gradient boosting (LDNFSGB). In LDNFSGB, we first construct a comprehensive feature vector to effectively extract the global and local information of lncRNAs and diseases through considering the disease semantic similarity (DISSS), the lncRNA function similarity (LNCFS), the lncRNA Gaussian interaction profile kernel similarity (LNCGS), the disease Gaussian interaction profile kernel similarity (DISGS), and the lncRNA-disease interaction (LNCDIS). Particularly, two methods are used to calculate the DISSS (LNCFS) for considering the local and global information of disease semantics (lncRNA functions) respectively. An autoencoder is then used to reduce the dimensionality of the feature vector to obtain the optimal feature parameter from the original feature set. Furthermore, we employ the gradient boosting algorithm to obtain the lncRNA-disease association prediction. CONCLUSIONS: In this study, hold-out, leave-one-out cross-validation, and ten-fold cross-validation methods are implemented on three publicly available datasets to evaluate the performance of LDNFSGB. Extensive experiments show that LDNFSGB dramatically outperforms other state-of-the-art methods. The case studies on six diseases, including cancers and non-cancers, further demonstrate the effectiveness of our method in real-world applications.


Subjects
Algorithms , Neoplasms/pathology , Long Noncoding RNA/metabolism , Alzheimer Disease/genetics , Alzheimer Disease/pathology , Area Under Curve , Heart Failure/genetics , Heart Failure/pathology , Humans , Neoplasms/genetics , Long Noncoding RNA/genetics , ROC Curve
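
The Gaussian interaction profile kernel similarity named in the abstract above (LNCGS, DISGS) can be illustrated with a short sketch. It assumes the commonly used definition, K(i, j) = exp(-gamma * ||IP(i) - IP(j)||^2) with gamma equal to a bandwidth parameter divided by the mean squared norm of the profiles; the abstract does not state which exact variant LDNFSGB uses, and the function name, parameter names, and toy matrix below are hypothetical.

```python
import math

def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel over the rows of an association matrix.

    Assumed definition (not confirmed by the abstract):
    K[i][j] = exp(-gamma * ||row_i - row_j||^2),
    gamma = gamma_prime / mean(||row||^2).
    """
    n = len(profiles)
    # Squared Euclidean norm of each interaction profile.
    sq_norms = [sum(v * v for v in row) for row in profiles]
    mean_sq_norm = sum(sq_norms) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist_sq = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * dist_sq)
    return kernel

# Hypothetical toy data: 3 lncRNAs (rows) x 4 diseases (columns), 1 = known association.
associations = [
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]
lnc_similarity = gip_kernel(associations)
```

Passing the transposed matrix would give the corresponding disease-side similarity under the same assumptions.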
8.
PLoS One ; 14(1): e0211805, 2019.
Article in English | MEDLINE | ID: mdl-30703165

ABSTRACT

[This corrects the article DOI: 10.1371/journal.pone.0043126.].

9.
Proteins ; 85(12): 2162-2169, 2017 Dec.
Article in English | MEDLINE | ID: mdl-28833538

ABSTRACT

Helix-helix interactions are crucial in the structure assembly, stability and function of helix-rich proteins, including many membrane proteins. In spite of remarkable progress over the past decades, the accuracy of predicting protein structures from their amino acid sequences is still far from satisfactory. In this work, we focused on a simpler problem, the prediction of helix-helix interactions, the results of which could facilitate practical protein structure prediction by constraining the sampling space. Specifically, we started from the noisy 2D residue contact maps derived from correlated residue mutations, and utilized ridge detection to identify the characteristic residue contact patterns for helix-helix interactions. The ridge information as well as a few additional features were then fed into a machine learning model, HHConPred, to predict interactions between helix pairs. In an independent test, our method achieved an F-measure of ∼60% for predicting helix-helix interactions. Moreover, although the model was trained mainly using soluble proteins, it could be extended to membrane proteins with at least comparable performance relative to previous approaches that were developed purely using membrane proteins. All data and source code are available at http://166.111.152.91/Downloads.html or https://github.com/dpxiong/HHConPred.


Subjects
Computational Biology/methods , Machine Learning , Membrane Proteins/chemistry , Amino Acid Sequence , Binding Sites , Protein Binding , Alpha-Helical Protein Conformation , Protein Interaction Domains and Motifs
10.
Bioinformatics ; 33(17): 2675-2683, 2017 Sep 01.
Article in English | MEDLINE | ID: mdl-28472263

ABSTRACT

MOTIVATION: Residue-residue contacts are of great value for protein structure prediction, since contact information, especially from long-range residue pairs, can significantly reduce the complexity of conformational sampling for protein structure prediction in practice. Despite progress in the past decade on protein targets with abundant homologous sequences, accurate contact prediction for proteins with limited sequence information is still far from satisfactory. Methodologies for these hard targets still need further improvement. RESULTS: We presented a computational program, DeepConPred, which includes a pipeline of two novel deep-learning-based methods (DeepCCon and DeepRCon) as well as a contact refinement step, to improve the prediction of long-range residue contacts from primary sequences. Compared with previous prediction approaches, our framework employed an effective scheme to identify optimal and important features for contact prediction, and was trained only with coevolutionary information derived from a limited number of homologous sequences to ensure robustness and usefulness for hard targets. Independent tests showed that 59.33%/49.97%, 64.39%/54.01% and 70.00%/59.81% of the top L/5, top L/10 and top 5 predictions were correct for CASP10/CASP11 proteins, respectively. In general, our algorithm ranked as one of the best methods for CASP targets. AVAILABILITY AND IMPLEMENTATION: All source data and code are available at http://166.111.152.91/Downloads.html . CONTACT: hgong@tsinghua.edu.cn or zengjy321@tsinghua.edu.cn. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/methods , Machine Learning , Molecular Models , Protein Conformation , Software , Protein Databases
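
The "top L/5" and "top L/10" figures in the abstract above refer to the precision of the highest-scoring contact predictions, where L is the sequence length. The sketch below shows how such a metric is typically computed. It assumes the usual CASP convention that long-range pairs are separated by at least 24 residues; the abstract does not state the threshold DeepConPred used, and the function name and toy data are hypothetical.

```python
def top_k_precision(scores, true_contacts, seq_len, divisor=5, min_separation=24):
    """Precision of the top (seq_len // divisor) long-range contact predictions.

    scores: dict mapping residue pairs (i, j), i < j, to predicted contact scores.
    true_contacts: set of (i, j) pairs that are contacts in the reference structure.
    min_separation: assumed long-range threshold in sequence positions.
    """
    # Keep only long-range pairs, then rank them by descending score.
    long_range = [(pair, s) for pair, s in scores.items()
                  if pair[1] - pair[0] >= min_separation]
    long_range.sort(key=lambda item: item[1], reverse=True)
    k = max(1, seq_len // divisor)
    top = long_range[:k]
    if not top:
        return 0.0
    hits = sum(1 for pair, _ in top if pair in true_contacts)
    return hits / len(top)

# Hypothetical toy example: a 30-residue protein, evaluated at top L/10 (k = 3).
predicted = {
    (0, 25): 0.9,
    (1, 26): 0.8,
    (2, 28): 0.7,
    (3, 29): 0.2,
    (5, 8): 0.95,  # short-range pair, excluded by the separation filter
}
native = {(0, 25), (2, 28), (3, 29)}
precision = top_k_precision(predicted, native, seq_len=30, divisor=10)
```

In this toy case two of the three top-ranked long-range pairs are true contacts, so the precision is 2/3.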
11.
Proteins ; 83(6): 1068-77, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25846271

ABSTRACT

Rapid and correct identification of RNA-binding residues based on protein primary sequences is of great importance. In most prevalent machine-learning-based identification methods, however, either some features are inefficiently represented, or the redundancy between features is not effectively removed. Both problems may weaken the performance of a classifier system and raise its computational complexity. Here, we addressed these problems and developed a better classifier (RBRIdent) to identify RNA-binding residues. In an independent benchmark test, RBRIdent achieved an accuracy of 76.79%, a Matthews correlation coefficient of 0.3819 and an F-measure of 75.58%, remarkably outperforming all prevalent methods. These results suggest the necessity of proper feature description and the essential role of feature selection in this project. All source data and code are freely available at http://166.111.152.91/RBRIdent.


Subjects
Algorithms , Computational Biology/methods , RNA-Binding Proteins/chemistry , RNA-Binding Proteins/metabolism , Protein Sequence Analysis/methods , Software , Binding Sites , Protein Databases , Machine Learning , Molecular Models
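
The three metrics reported in the abstract above (accuracy, Matthews correlation coefficient, F-measure) can all be derived from a binary confusion matrix. The sketch below uses the standard textbook definitions; the abstract does not say whether RBRIdent's F-measure is class-weighted or computed for the binding class only, and the function name and counts are hypothetical, not taken from the paper.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Accuracy, F-measure and Matthews correlation coefficient from confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure as the harmonic mean of precision and recall (positive class).
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    # MCC is defined as 0 here when any marginal count is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "f_measure": f_measure, "mcc": mcc}

# Hypothetical counts for residues labelled binding (positive) or non-binding (negative).
metrics = binary_metrics(tp=40, fp=10, tn=40, fn=10)
```

Unlike accuracy, the MCC stays informative when binding residues are much rarer than non-binding ones, which is one reason it is often reported alongside accuracy for this kind of task.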
12.
IEEE Trans Nanobioscience ; 13(4): 374-83, 2014 Dec.
Article in English | MEDLINE | ID: mdl-24919203

ABSTRACT

Core promoters play significant and extensive roles in the initiation and regulation of DNA transcription. The identification of core promoters remains one of the most challenging problems. Due to the diverse nature of core promoters, the results obtained through existing computational approaches are not satisfactory. None of them considered how interference between neighboring transcription start sites (TSSs) in TSS clusters might affect predictive performance. In this paper, we took this factor into account and proposed an approach to locate potential TSS clusters according to the correlation between regional profiles of DNA and TSS clusters. On this basis, we further presented a novel computational approach (ProMT) for promoter prediction using a Markov chain model and predicted TSS clusters based on structural properties of DNA. Extensive experiments demonstrated that ProMT can significantly improve predictive performance. Therefore, considering interference between neighboring TSSs is essential for a wider range of promoter prediction.


Subjects
Algorithms , DNA/genetics , Markov Chains , Statistical Models , Genetic Promoter Regions/genetics , DNA Sequence Analysis/methods , Software , Base Sequence , Computer Simulation , Humans , Genetic Models , Molecular Sequence Data , Multigene Family/genetics
13.
PLoS One ; 7(8): e43126, 2012.
Article in English | MEDLINE | ID: mdl-22905214

ABSTRACT

BACKGROUND: Horizontal gene transfer (HGT) is one of the major mechanisms contributing to microbial genome diversification. A number of computational methods for finding horizontally transferred genes have been proposed in the past decades; however, none of them has yet provided a reliable detector. In existing parametric approaches, only a single compositional property participates in the detection process, or the results obtained through each single property are simply combined. Different properties carry different information, so a single property cannot sufficiently capture the information encoded by gene sequences. In addition, the class imbalance problem in the datasets, which also causes large errors in gene detection, has not been considered by the published methods. Here we developed an effective classifier system (Hgtident) that uses a support vector machine (SVM) and effectively combines unusual properties for HGT detection. RESULTS: Our approach, Hgtident, includes the introduction of more representative datasets, optimization of the SVM model, feature selection, handling of the class imbalance problem in the datasets and extensive performance evaluation via systematic cross-validation methods. Through feature selection, we found that JS-DN and JS-CB have higher discriminating power for HGT detection, while GC1-GC3 and k-mer (k = 1, 2, …, 7) make the least contribution. Extensive experiments indicated that the new classifier could reduce Mean error dramatically and also improve Recall to a certain degree. For the testing genomes, compared with the existing popular multiple-threshold approach, on average, Recall was improved by 2.81% and Mean error was reduced by 26.32%, which means that numerous false positives were identified correctly. CONCLUSIONS: Hgtident, introduced here, is an effective approach for better detecting HGT. Combining multiple features of HGT is also essential for detecting a wider range of HGT events.


Subjects
Horizontal Gene Transfer , Computational Biology/methods , Bacterial DNA/genetics , Genetic Databases , False Negative Reactions , False Positive Reactions , Bacterial Genome , Genomics/methods , Genetic Models , Statistical Models , Phylogeny , Reproducibility of Results , Software , Support Vector Machine