Search | BVS Regional Portal

1.

Multi-layer sequential network analysis improves protein 3D structural classification.

Newaz, Khalique; Piland, Jacob; Clark, Patricia L; Emrich, Scott J; Li, Jun; Milenkovic, Tijana.

Proteins ; 90(9): 1721-1731, 2022 Sep.

Article in English | MEDLINE | ID: mdl-35441395

ABSTRACT

Protein structural classification (PSC) is a supervised problem of assigning proteins into pre-defined structural (e.g., CATH or SCOPe) classes based on the proteins' sequence or 3D structural features. We recently proposed PSC approaches that model protein 3D structures as protein structure networks (PSNs) and analyze PSN-based protein features, which performed better than or comparable to state-of-the-art sequence or other 3D structure-based PSC approaches. However, existing PSN-based PSC approaches model the whole 3D structure of a protein as a static (i.e., single-layer) PSN. Because folding of a protein is a dynamic process, where some parts (i.e., sub-structures) of a protein fold before others, modeling the 3D structure of a protein as a PSN that captures the sub-structures might further help improve the existing PSC performance. Here, we propose to model 3D structures of proteins as multi-layer sequential PSNs that approximate 3D sub-structures of proteins, with the hypothesis that this will improve upon the current state-of-the-art PSC approaches that are based on single-layer PSNs (and thus upon the existing state-of-the-art sequence and other 3D structural approaches). Indeed, we confirm this on 72 datasets spanning ~44 000 CATH and SCOPe protein domains.

Subjects

Proteins , Amino Acid Sequence , Proteins/chemistry , Sequence Alignment

2.

Inference of a Dynamic Aging-related Biological Subnetwork via Network Propagation.

Newaz, Khalique; Milenkovic, Tijana.

IEEE/ACM Trans Comput Biol Bioinform ; 19(2): 974-988, 2022.

Article in English | MEDLINE | ID: mdl-32897864

ABSTRACT

Gene expression (GE) data capture valuable condition-specific information ("condition" can mean a biological process, disease stage, age, patient, etc.). However, GE analyses ignore physical interactions between gene products, i.e., proteins. Because proteins function by interacting with each other, and because biological networks (BNs) capture these interactions, BN analyses are promising. However, current BN data fail to capture condition-specific information. Recently, GE and BN data have been integrated using network propagation (NP) to infer condition-specific BNs. However, existing NP-based studies result in a static condition-specific subnetwork, even though cellular processes are dynamic. A dynamic process of our interest is human aging. We use prominent existing NP methods in a new task of inferring a dynamic rather than static condition-specific (aging-related) subnetwork. Then, we study the evolution of network structure with age: we identify proteins whose network positions significantly change with age and predict them as new aging-related candidates. We validate the predictions via, e.g., functional enrichment analyses and literature search. Dynamic network inference via NP yields higher prediction quality than the only existing method for inferring a dynamic aging-related BN, which does not use NP. Our data and code are available at https://nd.edu/~cone/dynetinf.
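As a rough illustration of what network propagation does, the sketch below spreads scores from seed nodes across a small network using random walk with restart, one common form of propagation. The abstract does not say which NP methods were used, so the choice of random walk with restart, the function name `random_walk_with_restart`, the restart probability, and the toy network are all assumptions made for illustration. The sketch also assumes every node has at least one neighbor.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, iterations=100):
    """Propagate seed scores over an undirected network.

    adjacency: dict mapping each node to the set of its neighbors
               (assumed: no isolated nodes).
    seeds:     nodes that carry the initial, condition-specific signal.
    """
    nodes = sorted(adjacency)
    # Initial distribution: uniform over the seed nodes.
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(start)
    for _ in range(iterations):
        # Restart step: a fraction of the score returns to the seeds.
        updated = {n: restart * start[n] for n in nodes}
        # Walk step: each node splits the rest equally among its neighbors.
        for n in nodes:
            share = (1.0 - restart) * scores[n] / len(adjacency[n])
            for neighbor in adjacency[n]:
                updated[neighbor] += share
        scores = updated
    return scores


# Hypothetical toy network: a path a - b - c - d, with "a" as the only seed.
toy_network = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
scores = random_walk_with_restart(toy_network, seeds={"a"})
```

In this toy case the propagated score decreases with distance from the seed, which is the property that lets propagation rank network neighborhoods by relevance to a condition.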

Subjects

Aging , Proteins , Aging/genetics , Humans , Proteins/genetics

3.

Towards future directions in data-integrative supervised prediction of human aging-related genes.

Li, Qi; Newaz, Khalique; Milenkovic, Tijana.

Bioinform Adv ; 2(1): vbac081, 2022.

Article in English | MEDLINE | ID: mdl-36699345

ABSTRACT

Motivation: Identification of human genes involved in the aging process is critical due to the incidence of many diseases with age. A state-of-the-art approach for this purpose infers a weighted dynamic aging-specific subnetwork by mapping gene expression (GE) levels at different ages onto the protein-protein interaction network (PPIN). Then, it analyzes this subnetwork in a supervised manner by training a predictive model to learn how network topologies of known aging- versus non-aging-related genes change across ages. Finally, it uses the trained model to predict novel aging-related gene candidates. However, the best current subnetwork resulting from this approach still yields suboptimal prediction accuracy. This could be because it was inferred using outdated GE and PPIN data. Here, we evaluate whether analyzing a weighted dynamic aging-specific subnetwork inferred from newer GE and PPIN data improves prediction accuracy upon analyzing the best current subnetwork inferred from outdated data. Results: Unexpectedly, we find that not to be the case. To understand this, we perform aging-related pathway and Gene Ontology term enrichment analyses. We find that the suboptimal prediction accuracy, regardless of which GE or PPIN data is used, may be caused by the current knowledge about which genes are aging-related being incomplete, or by the current methods for inferring or analyzing an aging-specific subnetwork being unable to capture all of the aging-related knowledge. These findings can potentially guide future directions towards improving supervised prediction of aging-related genes via -omics data integration. Availability and implementation: All data and code are available at Zenodo, DOI: 10.5281/zenodo.6995045. Supplementary information: Supplementary data are available at Bioinformatics Advances online.

4.

Improved supervised prediction of aging-related genes via weighted dynamic network analysis.

Li, Qi; Newaz, Khalique; Milenkovic, Tijana.

BMC Bioinformatics ; 22(1): 520, 2021 Oct 25.

Article in English | MEDLINE | ID: mdl-34696741

ABSTRACT

BACKGROUND: This study focuses on the task of supervised prediction of aging-related genes from -omics data. Unlike gene expression methods for this task that capture aging-specific information but ignore interactions between genes (i.e., their protein products), or protein-protein interaction (PPI) network methods for this task that account for PPIs but the PPIs are context-unspecific, we recently integrated the two data types into an aging-specific PPI subnetwork, which yielded more accurate aging-related gene predictions. However, a dynamic aging-specific subnetwork did not improve prediction performance compared to a static aging-specific subnetwork, despite the aging process being dynamic. This could be because the dynamic subnetwork was inferred using a naive Induced subgraph approach. Instead, we recently inferred a dynamic aging-specific subnetwork using a methodologically more advanced notion of network propagation (NP), which improved upon the Induced dynamic aging-specific subnetwork in a different task, that of unsupervised analyses of the aging process. RESULTS: Here, we evaluate whether our existing NP-based dynamic subnetwork will improve upon the dynamic as well as static subnetwork constructed by the Induced approach in the considered task of supervised prediction of aging-related genes. The existing NP-based subnetwork is unweighted, i.e., it gives equal importance to each of the aging-specific PPIs. Because accounting for aging-specific edge weights might be important, we additionally propose a weighted NP-based dynamic aging-specific subnetwork. We demonstrate that a predictive machine learning model trained and tested on the weighted subnetwork yields higher accuracy when predicting aging-related genes than predictive models run on the existing unweighted dynamic or static subnetworks, regardless of whether the existing subnetworks were inferred using NP or the Induced approach.
CONCLUSIONS: Our proposed weighted dynamic aging-specific subnetwork and its corresponding predictive model could guide with higher confidence than the existing data and models the discovery of novel aging-related gene candidates for future wet lab validation.

Subjects

Protein Interaction Maps , Proteins , Gene Expression

5.

Author Correction: GRAFENE: Graphlet-based alignment-free network approach integrates 3D structural and sequence (residue order) data to improve protein structural comparison.

Faisal, Fazle E; Newaz, Khalique; Chaney, Julie L; Li, Jun; Emrich, Scott J; Clark, Patricia L; Milenkovic, Tijana.

Sci Rep ; 10(1): 13455, 2020 Aug 10.

Article in English | MEDLINE | ID: mdl-32778675

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

6.

Network-based protein structural classification.

Newaz, Khalique; Ghalehnovi, Mahboobeh; Rahnama, Arash; Antsaklis, Panos J; Milenkovic, Tijana.

R Soc Open Sci ; 7(6): 191461, 2020 Jun.

Article in English | MEDLINE | ID: mdl-32742675

ABSTRACT

Experimental determination of protein function is resource-consuming. As an alternative, computational prediction of protein function has received attention. In this context, protein structural classification (PSC) can help, by allowing for determining structural classes of currently unclassified proteins based on their features, and then relying on the fact that proteins with similar structures have similar functions. Existing PSC approaches rely on sequence-based or direct three-dimensional (3D) structure-based protein features. By contrast, we first model 3D structures of proteins as protein structure networks (PSNs). Then, we use network-based features for PSC. We propose the use of graphlets, state-of-the-art features in many research areas of network science, in the task of PSC. Moreover, because graphlets can deal only with unweighted PSNs, and because accounting for edge weights when constructing PSNs could improve PSC accuracy, we also propose a deep learning framework that automatically learns network features from weighted PSNs. When evaluated on a large set of approximately 9400 CATH and approximately 12 800 SCOP protein domains (spanning 36 PSN sets), the best of our proposed approaches are superior to existing PSC approaches in terms of accuracy, with comparable running times. Our data and code are available at https://doi.org/10.5281/zenodo.3787922.
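To make the PSN idea in this abstract concrete, the sketch below builds an unweighted network in which nodes are residues and an edge joins two residues whose representative atoms lie within a distance cutoff. The coordinates, the 6.0 Å cutoff, and the function name `build_psn` are assumptions chosen for illustration, not details taken from the paper.

```python
import math
from itertools import combinations


def build_psn(coords, cutoff=6.0):
    """Build a contact network from one 3D coordinate per residue.

    coords: list of (x, y, z) tuples, one per residue, in sequence order.
    cutoff: distance threshold in angstroms (an assumed, illustrative value).
    Returns a dict mapping each residue index to the set of its neighbors.
    """
    adjacency = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        # Connect two residues when they are spatially close.
        if math.dist(coords[i], coords[j]) <= cutoff:
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency


# Hypothetical toy "protein" with four residues.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (3.8, 3.8, 0.0)]
psn = build_psn(toy_coords)
```

Here residues 0 and 2 are 7.6 Å apart and so remain unconnected, while residue 3 contacts all three others even though it is last in the sequence, which is the kind of information a sequence-only feature would miss.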

7.

Network analysis of synonymous codon usage.

Newaz, Khalique; Wright, Gabriel; Piland, Jacob; Li, Jun; Clark, Patricia L; Emrich, Scott J; Milenkovic, Tijana.

Bioinformatics ; 36(19): 4876-4884, 2020 Dec 08.

Article in English | MEDLINE | ID: mdl-32609328

ABSTRACT

MOTIVATION: Most amino acids are encoded by multiple synonymous codons, some of which are used more rarely than others. Analyses of positions of such rare codons in protein sequences revealed that rare codons can impact co-translational protein folding and that positions of some rare codons are evolutionarily conserved. Analyses of their positions in protein 3-dimensional structures, which are richer in biochemical information than sequences alone, might further explain the role of rare codons in protein folding. RESULTS: We model protein structures as networks and use network centrality to measure the structural position of an amino acid. We first validate that amino acids buried within the structural core are network-central, and those on the surface are not. Then, we study potential differences between network centralities and thus structural positions of amino acids encoded by conserved rare, non-conserved rare and commonly used codons. We find that in 84% of proteins, the three codon categories occupy significantly different structural positions. We examine protein groups showing different codon centrality trends, i.e. different relationships between structural positions of the three codon categories. We see several cases of all proteins from our data with some structural or functional property being in the same group. Also, we see a case of all proteins in some group having the same property. Our work shows that codon usage is linked to the final protein structure and thus possibly to co-translational protein folding. AVAILABILITY AND IMPLEMENTATION: https://nd.edu/~cone/CodonUsage/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
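The link between network centrality and structural position can be illustrated with a small sketch. The abstract does not name the centrality measures used, so closeness centrality is an assumed stand-in here, and the function name `closeness_centrality` and the toy network (one buried "core" residue contacted by four "surface" residues) are hypothetical.

```python
from collections import deque


def closeness_centrality(adjacency, source):
    """Closeness of `source`: reachable nodes divided by total path length.

    adjacency: dict mapping each node to the set of its neighbors.
    Shortest-path lengths are found with breadth-first search.
    """
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    total = sum(distance.values())
    return (len(distance) - 1) / total if total > 0 else 0.0


# Hypothetical toy network: a core residue surrounded by surface residues.
toy_network = {
    "core": {"s1", "s2", "s3", "s4"},
    "s1": {"core"},
    "s2": {"core"},
    "s3": {"core"},
    "s4": {"core"},
}
centrality = {n: closeness_centrality(toy_network, n) for n in toy_network}
```

In this toy network the core node gets the highest closeness and every surface node gets a lower, equal value, mirroring the validation step described in the abstract.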

Subjects

Codon Usage , Protein Folding , Amino Acid Sequence , Codon/genetics , Proteins/genetics

8.

GRAFENE: Graphlet-based alignment-free network approach integrates 3D structural and sequence (residue order) data to improve protein structural comparison.

Faisal, Fazle E; Newaz, Khalique; Chaney, Julie L; Li, Jun; Emrich, Scott J; Clark, Patricia L; Milenkovic, Tijana.

Sci Rep ; 7(1): 14890, 2017 Nov 02.

Article in English | MEDLINE | ID: mdl-29097661

ABSTRACT

Initial protein structural comparisons were sequence-based. Since amino acids that are distant in the sequence can be close in the 3-dimensional (3D) structure, 3D contact approaches can complement sequence approaches. Traditional 3D contact approaches study 3D structures directly and are alignment-based. Instead, 3D structures can be modeled as protein structure networks (PSNs). Then, network approaches can compare proteins by comparing their PSNs. These can be alignment-based or alignment-free. We focus on the latter. Existing network alignment-free approaches have drawbacks: 1) They rely on naive measures of network topology. 2) They are not robust to PSN size. They cannot integrate 3) multiple PSN measures or 4) PSN data with sequence data, although this could improve comparison because the different data types capture complementary aspects of the protein structure. We address this by: 1) exploiting well-established graphlet measures via a new network alignment-free approach, 2) introducing normalized graphlet measures to remove the bias of PSN size, 3) allowing for integrating multiple PSN measures, and 4) using ordered graphlets to combine the complementary PSN data and sequence (specifically, residue order) data. We compare synthetic networks and real-world PSNs more accurately and faster than existing network (alignment-free and alignment-based), 3D contact, or sequence approaches.

Subjects

Proteins/chemistry , Software , Algorithms , Amino Acids/chemistry , Computer Graphics , Protein Databases , Biological Models , Protein Conformation

9.

Identification of Major Signaling Pathways in Prion Disease Progression Using Network Analysis.

Newaz, Khalique; Sriram, K; Bera, Debajyoti.

PLoS One ; 10(12): e0144389, 2015.

Article in English | MEDLINE | ID: mdl-26646948

ABSTRACT

Prion diseases are transmissible neurodegenerative diseases that arise due to conformational change of normal, cellular prion protein (PrPC) to protease-resistant isoform (rPrPSc). Deposition of misfolded PrPSc proteins leads to an alteration of many signaling pathways, including immunological and apoptotic pathways. This culminates in the dysfunction and death of neuronal cells. Earlier transcriptomic studies have revealed some affected pathways, but it is not clear which is (are) the prime network pathway(s) that change during the disease progression and how these pathways are involved in crosstalks with each other from the time of incubation to clinical death. We perform network analysis on large-scale transcriptomic data of differentially expressed genes obtained from whole brain in six different mouse strain-prion strain combination models to determine the pathways involved in prion diseases, and to understand the role of crosstalks in disease propagation. We employ a notion of differential network centrality measures on protein interaction networks to identify the potential biological pathways involved. We also propose a crosstalk ranking method based on dynamic protein interaction networks to identify the core network elements involved in crosstalk with different pathways. We identify 148 DEGs (differentially expressed genes) potentially related to the prion disease progression. Functional association of the identified genes implicates a strong involvement of immunological pathways. We extract a bow-tie structure that is potentially dysregulated in prion disease. We also propose an ODE model for the bow-tie network. Predictions related to the diseased condition suggest the downregulation of the core signaling elements (PI3Ks and AKTs) of the bow-tie network. In this work, we show using transcriptomic data that the neuronal dysfunction in prion disease is strongly related to the immunological pathways.
We conclude that these immunological pathways occupy influential positions in the PFNs (protein functional networks) that are related to prion disease. Importantly, this functional network involvement is prevalent in all the five different mouse strain-prion strain combinations that we studied. We also conclude that the dysregulation of the core elements of the bow-tie structure, which belongs to the PI3K-Akt signaling pathway, leads to dysregulation of the downstream components corresponding to other biological pathways.

Subjects

Prion Diseases/pathology , Signal Transduction , Disease Progression , Humans , Prion Diseases/genetics
