1.
Am J Hum Genet ; 111(5): 990-995, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38636510

ABSTRACT

Since genotype imputation was introduced, researchers have been relying on the estimated imputation quality from imputation software to perform post-imputation quality control (QC). However, this quality estimate (denoted as Rsq) performs less well for lower-frequency variants. We recently published MagicalRsq, a machine-learning-based imputation quality calibration, which leverages additional typed markers from the same cohort and outperforms Rsq as a QC metric. In this work, we extended the original MagicalRsq to allow cross-cohort model training and named the new model MagicalRsq-X. We removed the cohort-specific estimated minor allele frequency and included linkage disequilibrium scores and recombination rates as additional features. Leveraging whole-genome sequencing data from TOPMed, specifically participants in the BioMe, JHS, WHI, and MESA studies, we performed comprehensive cross-cohort evaluations for predominantly European and African ancestral individuals based on their inferred global ancestry with the 1000 Genomes and Human Genome Diversity Project data as reference. Our results suggest MagicalRsq-X outperforms Rsq in almost every setting, with 7.3%-14.4% improvement in squared Pearson correlation with true R², corresponding to 85-218K variant gains. We further developed a metric to quantify the genetic distances of a target cohort relative to a reference cohort and showed that this metric largely explained the performance of MagicalRsq-X models. Finally, we found MagicalRsq-X saved up to 53 known genome-wide significant variants in one of the largest blood cell trait GWASs that would be missed using the original Rsq for QC. In conclusion, MagicalRsq-X shows superiority for post-imputation QC and benefits genetic studies by distinguishing well-imputed from poorly imputed lower-frequency variants.


Subjects
Gene Frequency , Genotype , Single Nucleotide Polymorphism , Software , Humans , Cohort Studies , Linkage Disequilibrium , Genome-Wide Association Study/methods , Human Genome , Quality Control , Machine Learning , Whole Genome Sequencing/standards , Whole Genome Sequencing/methods
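
Note on record 1: the abstract evaluates a quality metric by its squared Pearson correlation with the true imputation quality. The sketch below shows how such a squared correlation could be computed. It is only an illustration: the function name `squared_pearson` and the toy values are assumptions and do not come from the MagicalRsq-X software.

```python
def squared_pearson(x, y):
    """Return the squared Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov * cov / (var_x * var_y)


# Hypothetical per-variant values, for illustration only.
true_quality = [0.10, 0.35, 0.60, 0.80, 0.95]
estimated_quality = [0.30, 0.40, 0.55, 0.70, 0.90]
print(round(squared_pearson(estimated_quality, true_quality), 3))
```

A metric that tracks the true quality more closely yields a value nearer to 1, which is how two candidate QC metrics could be compared on the same variants.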
2.
Am J Hum Genet ; 109(11): 1986-1997, 2022 11 03.
Article in English | MEDLINE | ID: mdl-36198314

ABSTRACT

Whole-genome sequencing (WGS) is the gold standard for fully characterizing genetic variation but is still prohibitively expensive for large samples. To reduce costs, many studies sequence only a subset of individuals or genomic regions, and genotype imputation is used to infer genotypes for the remaining individuals or regions without sequencing data. However, not all variants can be well imputed, and the current state-of-the-art imputation quality metric, denoted as standard Rsq, is poorly calibrated for lower-frequency variants. Here, we propose MagicalRsq, a machine-learning-based method that integrates variant-level imputation and population genetics statistics, to provide a better calibrated imputation quality metric. Leveraging WGS data from the Cystic Fibrosis Genome Project (CFGP), and whole-exome sequence data from UK BioBank (UKB), we performed comprehensive experiments to evaluate the performance of MagicalRsq compared to standard Rsq for partially sequenced studies. We found that MagicalRsq aligns better with true R² than standard Rsq in almost every situation evaluated, for both European and African ancestry samples. For example, when applying models trained from 1,992 CFGP sequenced samples to an independent 3,103 samples with no sequencing but TOPMed imputation from array genotypes, MagicalRsq, compared to standard Rsq, achieved net gains of 1.4 million rare, 117k low-frequency, and 18k common variants, where the net gain is the number of additional variants correctly distinguished by MagicalRsq over standard Rsq. MagicalRsq can serve as an improved post-imputation quality metric and will benefit downstream analysis by better distinguishing well-imputed variants from those poorly imputed. MagicalRsq is freely available on GitHub.


Subjects
Genome-Wide Association Study , Single Nucleotide Polymorphism , Humans , Genome-Wide Association Study/methods , Single Nucleotide Polymorphism/genetics , Calibration , Genotype , Machine Learning
3.
Am J Hum Genet ; 108(5): 942-950, 2021 05 06.
Article in English | MEDLINE | ID: mdl-33891857

ABSTRACT

Cerebral cavernous malformations (CCMs) are vascular disorders that affect up to 0.5% of the total population. About 20% of CCMs are inherited because of familial mutations in CCM genes, including CCM1/KRIT1, CCM2/MGC4607, and CCM3/PDCD10, whereas the etiology of a majority of simplex CCM-affected individuals remains unclear. Here, we report somatic mutations of MAP3K3, PIK3CA, MAP2K7, and CCM genes in CCM lesions. In particular, somatic hotspot mutations of PIK3CA are found in 11 of 38 individuals with CCMs, and a MAP3K3 somatic mutation (c.1323C>G [p.Ile441Met]) is detected in 37.0% (34 of 92) of the simplex CCM-affected individuals. Strikingly, the MAP3K3 c.1323C>G mutation presents in 95.7% (22 of 23) of the popcorn-like lesions but only 2.5% (1 of 40) of the subacute-bleeding or multifocal lesions that are predominantly attributed to mutations in the CCM1/2/3 signaling complex. Leveraging mini-bulk sequencing, we demonstrate the enrichment of MAP3K3 c.1323C>G mutation in CCM endothelium. Mechanistically, beyond the activation of CCM1/2/3-inhibited ERK5 signaling, MEKK3 p.Ile441Met (MAP3K3 encodes MEKK3) also activates ERK1/2, JNK, and p38 pathways because of mutation-induced MEKK3 kinase activity enhancement. Collectively, we identified several somatic activating mutations in CCM endothelium, and the MAP3K3 c.1323C>G mutation defines a primary CCM subtype with distinct characteristics in signaling activation and magnetic resonance imaging appearance.


Subjects
Central Nervous System Cavernous Hemangioma/genetics , MAP Kinase Kinase Kinase 3/genetics , Mutation , Amino Acid Sequence , Class I Phosphatidylinositol 3-Kinases/genetics , Endothelial Cells/metabolism , Germ-Line Mutation , Central Nervous System Cavernous Hemangioma/pathology , Human Umbilical Vein Endothelial Cells , Humans , MAP Kinase Kinase Kinase 3/metabolism , MAP Kinase Signaling System , Molecular Models
4.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34882196

ABSTRACT

Multiple statistical methods for aggregate association testing have been developed for whole-genome sequencing (WGS) data. Many aggregate variants in a given genomic window and ignore existing knowledge to define test regions, resulting in many identified regions not clearly linked to genes, and thus, limiting biological understanding. Functional information from new technologies (such as Hi-C and its derivatives), which can help link enhancers to their effector genes, can be leveraged to predefine variant sets for aggregate testing in WGS data. Here, we propose the eSCAN (scan the enhancers) method for genome-wide assessment of enhancer regions in sequencing studies, combining the advantages of dynamic window selection in SCANG (SCAN the Genome), a previously developed method, with the advantages of incorporating putative regulatory regions from annotation. eSCAN, by searching in putative enhancers, increases statistical power and aids mechanistic interpretation, as demonstrated by extensive simulation studies. We also apply eSCAN for blood cell traits using NHLBI Trans-Omics for Precision Medicine WGS data. Results from real data analysis show that eSCAN is able to capture more significant signals, and these signals are of shorter length (indicating higher resolution fine-mapping capability) and drive association of larger regions detected by other methods.


Subjects
Genome-Wide Association Study , Genome , Genome-Wide Association Study/methods , Genomics , Nucleic Acid Regulatory Sequences , Whole Genome Sequencing/methods
5.
J Biomed Sci ; 31(1): 51, 2024 May 13.
Article in English | MEDLINE | ID: mdl-38741091

ABSTRACT

BACKGROUND: The fusiform aneurysm is a nonsaccular dilatation affecting the entire vessel wall over a short distance. Although PDGFRB somatic variants have been identified in fusiform intracranial aneurysms, the molecular and cellular mechanisms driving fusiform intracranial aneurysms due to PDGFRB somatic variants remain poorly understood. METHODS: In this study, single-cell sequencing and immunofluorescence were employed to investigate the phenotypic changes in smooth muscle cells within fusiform intracranial aneurysms. Whole-exome sequencing revealed the presence of PDGFRB gene mutations in fusiform intracranial aneurysms. Subsequent immunoprecipitation experiments further explored the functional alterations of these mutated PDGFRB proteins. For the common c.1684 mutation site of PDGFRβ, we established mutant smooth muscle cell lines and zebrafish models. These models allowed us to simulate the effects of PDGFRB mutations. We explored the major downstream cellular pathways affected by PDGFRB Y562D mutations and evaluated the potential therapeutic effects of Ruxolitinib. RESULTS: Single-cell sequencing of two fusiform intracranial aneurysm samples revealed downregulated smooth muscle cell markers and overexpression of inflammation-related markers in vascular smooth muscle cells, which was validated by immunofluorescence staining, indicating that smooth muscle cell phenotype modulation is involved in fusiform aneurysm. Whole-exome sequencing was performed on seven intracranial aneurysms (six fusiform and one saccular) and PDGFRB somatic mutations were detected in four fusiform aneurysms. Laser microdissection and Sanger sequencing results indicated that the PDGFRB mutations were present in the smooth muscle layer. For the frequently reported c.1684 (chr5: 149505131) site mutation, further cell experiments showed that PDGFRB Y562D mutations promoted an inflammation-related vascular smooth muscle cell phenotype and that the JAK-STAT pathway played a crucial role in the process. Notably, transfection of PDGFRB Y562D in zebrafish embryos resulted in cerebral vascular anomalies. Ruxolitinib, a JAK inhibitor, could reverse the smooth muscle cell phenotype modulation in vitro and inhibit the vascular anomalies in zebrafish induced by PDGFRB mutation. CONCLUSION: Our findings suggested that PDGFRB somatic variants played a role in regulating smooth muscle cell phenotype modulation in fusiform aneurysms and offered a potential therapeutic option for fusiform aneurysms.


Subjects
Intracranial Aneurysm , Smooth Muscle Myocytes , Phenotype , Platelet-Derived Growth Factor Receptor beta , Intracranial Aneurysm/genetics , Intracranial Aneurysm/metabolism , Humans , Platelet-Derived Growth Factor Receptor beta/genetics , Platelet-Derived Growth Factor Receptor beta/metabolism , Smooth Muscle Myocytes/metabolism , Zebrafish/genetics , Animals , Male , Mutation , Female , Adult , Middle Aged
6.
Zhongguo Zhong Yao Za Zhi ; 49(2): 325-333, 2024 Jan.
Article in Chinese | MEDLINE | ID: mdl-38403308

ABSTRACT

Neutrophil extracellular traps (NETs) are fibrous networks formed by neutrophils through a process called NETosis, with the function of capturing and killing pathogens. NETs are widely involved in the pathological processes of major diseases such as immune system diseases, respiratory diseases, metabolic diseases, cancers, and reperfusion injury. Therefore, regulating NETs has become one of the important ways to prevent and treat these diseases. As part of China's traditional culture, traditional Chinese medicine has made outstanding contributions to the treatment of diseases. In recent years, studies have discovered that a variety of active components in traditional Chinese medicines, Chinese medicine compound prescriptions, and single traditional Chinese medicines can alleviate symptoms by regulating NETs in the pathological process of major diseases. This article reviews the research progress in the regulation of NETs by the active components of traditional Chinese medicines, Chinese medicine compound prescriptions, and single traditional Chinese medicines in the last five years, aiming to serve as a reference for related research.


Subjects
Extracellular Traps , Extracellular Traps/metabolism , Traditional Chinese Medicine , Neutrophils , China
7.
Angiogenesis ; 26(2): 295-312, 2023 05.
Article in English | MEDLINE | ID: mdl-36719480

ABSTRACT

Cerebral cavernous malformations (CCMs) refer to a common vascular abnormality that affects up to 0.5% of the population. A somatic gain-of-function mutation in MAP3K3 (p.I441M) was recently reported in sporadic CCMs, frequently accompanied by somatic activating PIK3CA mutations in diseased endothelium. However, the molecular mechanisms of these driver genes remain elusive. In this study, we performed whole-exome sequencing and droplet digital polymerase chain reaction to analyze CCM lesions and the matched blood from sporadic patients. Forty-four of 94 cases harbored mutations in KRIT1/CCM2 or MAP3K3, of which 75% were accompanied by PIK3CA mutations (P = 0.006). AAV-BR1-mediated brain endothelial-specific MAP3K3 I441M overexpression induced CCM-like lesions throughout the brain and spinal cord in adolescent mice. Interestingly, over half of lesions disappeared at adulthood. Single-cell RNA sequencing found significant enrichment of the apoptosis pathway in a subset of brain endothelial cells in MAP3K3 I441M mice compared to controls. We then demonstrated that MAP3K3 I441M overexpression activated p38 signaling that is associated with the apoptosis of endothelial cells in vitro and in vivo. In contrast, the mice simultaneously overexpressing PIK3CA and MAP3K3 mutations had an increased number of CCM-like lesions and maintained these lesions for a longer time compared to those with only MAP3K3 I441M. Further in vitro and in vivo experiments showed that activating PI3K signaling increased proliferation and alleviated apoptosis of endothelial cells. By using AAV-BR1, we found that the MAP3K3 I441M mutation can provoke CCM-like lesions in mice and the activation of PI3K signaling significantly enhances and maintains these lesions, providing a preclinical model for the further mechanistic and therapeutic study of CCMs.


Subjects
Class I Phosphatidylinositol 3-Kinases , Central Nervous System Cavernous Hemangioma , MAP Kinase Kinase Kinase 3 , Animals , Mice , Endothelial Cells/metabolism , Endothelium/metabolism , Central Nervous System Cavernous Hemangioma/genetics , Central Nervous System Cavernous Hemangioma/pathology , Mutation/genetics , Phosphatidylinositol 3-Kinases/genetics , Phosphatidylinositol 3-Kinases/metabolism , Proto-Oncogene Proteins/genetics , MAP Kinase Kinase Kinase 3/genetics , MAP Kinase Kinase Kinase 3/metabolism , Class I Phosphatidylinositol 3-Kinases/genetics , Class I Phosphatidylinositol 3-Kinases/metabolism
8.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33822890

ABSTRACT

Recent pharmacogenomic studies that generate sequencing data coupled with pharmacological characteristics for patient-derived cancer cell lines led to large amounts of multi-omics data for precision cancer medicine. Among various obstacles hindering clinical translation, lacking effective methods for multimodal and multisource data integration is becoming a bottleneck. Here we proposed DeepDRK, a machine learning framework for deciphering drug response through kernel-based data integration. To transfer information among different drugs and cancer types, we trained deep neural networks on more than 20 000 pan-cancer cell line-anticancer drug pairs. These pairs were characterized by kernel-based similarity matrices integrating multisource and multi-omics data including genomics, transcriptomics, epigenomics, chemical properties of compounds and known drug-target interactions. Applied to benchmark cancer cell line datasets, our model surpassed previous approaches with higher accuracy and better robustness. Then we applied our model on newly established patient-derived cancer cell lines and achieved satisfactory performance with AUC of 0.84 and AUPRC of 0.77. Moreover, DeepDRK was used to predict clinical response of cancer patients. Notably, the prediction of DeepDRK correlated well with clinical outcome of patients and revealed multiple drug repurposing candidates. In sum, DeepDRK provided a computational method to predict drug response of cancer cells from integrating pharmacogenomic datasets, offering an alternative way to prioritize repurposing drugs in precision cancer treatment. The DeepDRK is freely available via https://github.com/wangyc82/DeepDRK.


Subjects
Antineoplastic Agents/therapeutic use , Computational Biology/methods , Deep Learning , Drug Repositioning/methods , Neoplasms/drug therapy , Software , Antineoplastic Agents/chemistry , Tumor Cell Line , Datasets as Topic , Humans , Neoplasms/genetics , Neoplasms/metabolism , Neoplasms/pathology , Pharmacogenetics/methods , Precision Medicine/methods , Prognosis , Transcriptome
9.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33839756

ABSTRACT

Batch effect correction is an essential step in the integrative analysis of multiple single-cell RNA-sequencing (scRNA-seq) data. One state-of-the-art strategy for batch effect correction is via unsupervised or supervised detection of mutual nearest neighbors (MNNs). However, both types of methods only detect MNNs across batches of uncorrected data, where the large batch effects may affect the MNN search. To address this issue, we presented a batch effect correction approach via iterative supervised MNN (iSMNN) refinement across data after correction. Our benchmarking on both simulation and real datasets showed the advantages of the iterative refinement of MNNs on the performance of correction. Compared to popular alternative methods, our iSMNN is able to better mix the cells of the same cell type across batches. In addition, iSMNN can also facilitate the identification of differentially expressed genes (DEGs) that are relevant to the biological function of certain cell types. These results indicated that iSMNN will be a valuable method for integrating multiple scRNA-seq datasets that can facilitate biological and medical studies at single-cell level.


Subjects
Algorithms , Computational Biology/methods , Gene Expression Profiling/methods , RNA Sequence Analysis/methods , Single-Cell Analysis/methods , Animals , Benchmarking/methods , Cultured Cells , Humans , Mice , Reproducibility of Results
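
Note on record 9: the abstract builds on the detection of mutual nearest neighbors (MNNs) across batches. The sketch below illustrates only the basic MNN idea on small coordinate tuples: a pair is kept when each cell appears in the other's k-nearest-neighbor list. The function names, the squared Euclidean distance, and the toy inputs are assumptions for illustration. This is not the iSMNN implementation, and it omits the supervised and iterative steps the abstract describes.

```python
def k_nearest(query, reference, k):
    """For each query point, return the set of indices of its k nearest reference points."""
    result = []
    for q in query:
        # Squared Euclidean distance from q to every reference point.
        dists = [sum((a - b) ** 2 for a, b in zip(q, r)) for r in reference]
        order = sorted(range(len(reference)), key=lambda j: dists[j])
        result.append(set(order[:k]))
    return result


def mutual_nearest_neighbors(batch_a, batch_b, k):
    """Return sorted (i, j) pairs where a_i and b_j are in each other's k-NN lists."""
    a_to_b = k_nearest(batch_a, batch_b, k)
    b_to_a = k_nearest(batch_b, batch_a, k)
    return sorted(
        (i, j)
        for i, neighbors in enumerate(a_to_b)
        for j in neighbors
        if i in b_to_a[j]
    )


# Hypothetical two-dimensional "cells" from two batches.
batch_a = [(0.0, 0.0), (10.0, 10.0)]
batch_b = [(0.5, 0.0), (10.0, 10.5), (5.0, 5.0)]
print(mutual_nearest_neighbors(batch_a, batch_b, k=1))
```

The third cell in `batch_b` has a nearest neighbor in `batch_a`, but the relationship is not mutual, so it forms no pair.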
11.
Exp Aging Res ; 48(4): 387-399, 2022.
Article in English | MEDLINE | ID: mdl-34969355

ABSTRACT

OBJECTIVES: The objective of this study was to understand how sleep duration could affect depression among the elderly in China. METHOD: A total of 7103 individuals aged 60 and older were selected from the China Health and Retirement Longitudinal Study. A generalized linear mixed-effects model was used to estimate the relationship between sleep duration and depression, and we performed stratified analyses by age: young-old elderly, old-old elderly and oldest-old elderly. RESULTS: Short sleep duration significantly increased CES-D10 depression scores. In addition, the participants with middle sleep duration had higher CES-D10 scores compared to the participants with long sleep duration among young-old elderly, and we found that middle sleep duration was not significantly different from CES-D10 scores after adjustment for demographics, frequencies of activities and chronic diseases. CONCLUSIONS: These findings suggested that there was a complex association between depression and sleep duration among the elderly in China. In contrast to previous research results on the middle or normal sleep time of the elderly, the middle sleep duration may not be the optimal sleep duration in this study. Investigation of sleep extension to prevent depression may be warranted among the elderly.


Subjects
Aging , Depression , Aged , Aged 80 and over , China/epidemiology , Depression/epidemiology , Humans , Longitudinal Studies , Middle Aged , Sleep
12.
BMC Bioinformatics ; 22(1): 171, 2021 Mar 31.
Article in English | MEDLINE | ID: mdl-33789579

ABSTRACT

BACKGROUND: Protein post-translational modification (PTM) is key to investigating the mechanisms of protein function. With the rapid development of proteomics technology, a large amount of protein sequence data has been generated, which highlights the importance of the in-depth study and analysis of PTMs in proteins. METHOD: We proposed a new multi-classification machine learning pipeline, MultiLyGAN, to identify seven types of lysine modified sites. Using eight different sequential and five structural construction methods, 1497 valid features remained after filtering by Pearson correlation coefficient. To solve the data imbalance problem, Conditional Generative Adversarial Network (CGAN) and Conditional Wasserstein Generative Adversarial Network (CWGAN), two influential deep generative methods, were leveraged and compared to generate new samples for the types with fewer samples. Finally, the random forest algorithm was utilized to predict seven categories. RESULTS: In the tenfold cross-validation, accuracy (Acc) and Matthews correlation coefficient (MCC) were 0.8589 and 0.8376, respectively. In the independent test, Acc and MCC were 0.8549 and 0.8330, respectively. The results indicated that CWGAN better solved the existing data imbalance and stabilized the training error. In addition, an accumulated feature importance analysis reported that CKSAAP, PWM and structural features were the three most important feature-encoding schemes. MultiLyGAN can be found at https://github.com/Lab-Xu/MultiLyGAN. CONCLUSIONS: The CWGAN greatly improved the predictive performance in all experiments. Features derived from CKSAAP, PWM and structure schemes are the most informative and had the greatest contribution to the prediction of PTM.


Subjects
Lysine , Post-Translational Protein Processing , Proteins , Algorithms , Lysine/metabolism , Machine Learning , Proteins/genetics
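
Note on record 12: the abstract names CKSAAP as one of the most informative feature-encoding schemes. CKSAAP is commonly expanded as the composition of k-spaced amino acid pairs: for each gap k, the frequency of every ordered residue pair separated by k positions. The sketch below follows that common definition. The function name, the default `k_max`, and the handling of non-standard residues are assumptions and are not taken from the MultiLyGAN code.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, in alphabetical order


def cksaap(fragment, k_max=2):
    """Encode a peptide as frequencies of residue pairs separated by 0..k_max residues."""
    features = []
    for k in range(k_max + 1):
        n_pairs = len(fragment) - k - 1  # number of k-spaced pairs in the fragment
        counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
        for i in range(max(n_pairs, 0)):
            pair = fragment[i] + fragment[i + k + 1]
            if pair in counts:  # skip pairs containing padding or non-standard residues
                counts[pair] += 1
        denom = n_pairs if n_pairs > 0 else 1
        # 400 frequencies per gap value, in alphabetical pair order.
        features.extend(counts[p] / denom for p in sorted(counts))
    return features


# Hypothetical short fragment, for illustration only.
vector = cksaap("AKAKA", k_max=1)
print(len(vector))
```

Each gap value contributes 400 features, so the vector length is 400 × (k_max + 1).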
13.
Pharmacol Res ; 151: 104552, 2020 01.
Article in English | MEDLINE | ID: mdl-31747557

ABSTRACT

In recent years, although the concept and means of modern treatment of chronic heart failure (CHF) are continually improving, the readmission rate and mortality rate are still high. At present, there is evidence that there is a link between gut microbiota and heart failure, so the intervention of gut microbiota and its metabolites is expected to become a potential new therapeutic target in heart failure. Traditional Chinese medicine (TCM) has apparent advantages in stabilizing the disease, improving heart function, and improving the quality of life. It can exert its effect by acting on the gut microbiota and is an ideal intestinal micro-ecological regulator. Therefore, this article will mainly discuss the advantages of traditional Chinese medicine in treating CHF, the relationship between traditional Chinese medicine and gut microbiota, the relationship between CHF and gut microbiota, and the ways of regulating gut microbiota by traditional Chinese medicine to prevent and treat CHF. It will specify the target and mechanism of traditional Chinese medicine treating heart failure by acting on gut microbiota and provide ideas for the treatment of heart failure.


Subjects
Cardiotonic Agents/therapeutic use , Chinese Herbal Drugs/therapeutic use , Gastrointestinal Microbiome/drug effects , Heart Failure/drug therapy , Animals , Cardiotonic Agents/pharmacology , Chronic Disease , Chinese Herbal Drugs/pharmacology , Heart Failure/prevention & control , Humans , Traditional Chinese Medicine
14.
BMC Bioinformatics ; 20(1): 49, 2019 Jan 23.
Article in English | MEDLINE | ID: mdl-30674277

ABSTRACT

BACKGROUND: Lysine acetylation in protein is one of the most important post-translational modifications (PTMs). It plays an important role in essential biological processes and is related to various diseases. To obtain a comprehensive understanding of the regulatory mechanism of lysine acetylation, the key is to identify lysine acetylation sites. Previously, several shallow machine learning algorithms had been applied to predict lysine modification sites in proteins. However, shallow machine learning has some disadvantages. For instance, it is not as effective as deep learning for processing big data. RESULTS: In this work, a novel predictor named DeepAcet was developed to predict acetylation sites. Six encoding schemes were adopted to represent the modified residues: one-hot encoding, the BLOSUM62 matrix, the composition of k-spaced amino acid pairs, information gain, physicochemical properties, and a position-specific scoring matrix. A multilayer perceptron (MLP) was utilized to construct a model to predict lysine acetylation sites in proteins with many different features. We also integrated all features and implemented the feature selection method to select a feature set that contained 2199 features. As a result, the best prediction achieved 84.95% accuracy, 83.45% specificity, 86.44% sensitivity, 0.8540 AUC, and 0.6993 MCC in a 10-fold cross-validation. For an independent test set, the prediction achieved 84.87% accuracy, 83.46% specificity, 86.28% sensitivity, 0.8407 AUC, and 0.6977 MCC. CONCLUSION: The predictive performance of our DeepAcet is better than that of other existing methods. DeepAcet can be freely downloaded from https://github.com/Sunmile/DeepAcet.


Subjects
Deep Learning/standards , Lysine/chemistry , Post-Translational Protein Processing/genetics , Acetylation
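
Note on record 14: the abstract reports accuracy, specificity, sensitivity and the Matthews correlation coefficient (MCC). The sketch below shows how these four metrics follow from the counts of a binary confusion matrix, using their standard definitions. The function name and the example counts are hypothetical and are not the DeepAcet results.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute accuracy, sensitivity, specificity and MCC from confusion-matrix counts."""

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical counts, for illustration only.
print(binary_metrics(tp=40, tn=30, fp=20, fn=10))
```

MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy when classes may be imbalanced.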
15.
BMC Bioinformatics ; 20(1): 86, 2019 Feb 18.
Article in English | MEDLINE | ID: mdl-30777029

ABSTRACT

BACKGROUND: Protein ubiquitination occurs when the ubiquitin protein binds to a target protein residue of lysine (K), and it is an important regulator of many cellular functions, such as signal transduction, cell division, and immune reactions, in eukaryotes. Experimental and clinical studies have shown that ubiquitination plays a key role in several human diseases, and recent advances in proteomic technology have spurred interest in identifying ubiquitination sites. However, most current computing tools for predicting target sites are based on small-scale data and shallow machine learning algorithms. RESULTS: As more experimentally validated ubiquitination sites emerge, we need to design a predictor that can identify lysine ubiquitination sites in large-scale proteome data. In this work, we propose a deep learning predictor, DeepUbi, based on convolutional neural networks. Four different features are adopted from the sequences and physicochemical properties. In a 10-fold cross validation, DeepUbi obtains an AUC (area under the Receiver Operating Characteristic curve) of 0.9, and the accuracy, sensitivity and specificity exceed 85%. The more comprehensive indicator, MCC, reaches 0.78. We also develop a software package that can be freely downloaded from https://github.com/Sunmile/DeepUbi. CONCLUSION: Our results show that DeepUbi has excellent performance in predicting ubiquitination based on large data.


Subjects
Deep Learning , Proteomics/methods , Ubiquitinated Proteins/chemistry , Ubiquitination , Humans , Lysine/metabolism , Computer Neural Networks , Proteome/metabolism , Software
16.
Curr Genomics ; 20(5): 362-370, 2019 Aug.
Article in English | MEDLINE | ID: mdl-32476993

ABSTRACT

BACKGROUND: Lysine lipoylation, which is a rare and highly conserved post-translational modification of proteins, has been considered one of the most important processes in the biological field. To obtain a comprehensive understanding of the regulatory mechanism of lysine lipoylation, the key is to identify lysine lipoylated sites. Because experimental methods are expensive and laborious, computational ways to predict lipoylation sites are urgently needed. METHODOLOGY: In this work, a predictor named LipoSVM is developed to accurately predict lipoylation sites. To overcome the problem of an unbalanced sample, the synthetic minority over-sampling technique (SMOTE) is utilized to balance negative and positive samples. Furthermore, different ratios of positive and negative samples are chosen as training sets. RESULTS: By comparing five different encoding schemes and five classification algorithms, LipoSVM is finally constructed by using a training set with a positive and negative sample ratio of 1:1, combined with a position-specific scoring matrix and a support vector machine. The best performance achieves an accuracy of 99.98% and an AUC of 0.9996 in 10-fold cross-validation. The AUC of the independent test set reaches 0.9997, which demonstrates the robustness of LipoSVM. The analysis between lysine lipoylation and non-lipoylation fragments shows significant statistical differences. CONCLUSION: A good predictor for lysine lipoylation is built based on a position-specific scoring matrix and a support vector machine. Meanwhile, an online webserver LipoSVM can be freely downloaded from https://github.com/stars20180811/LipoSVM.
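
Note on record 16: the abstract balances classes with the synthetic minority over-sampling technique (SMOTE). The sketch below illustrates the core SMOTE step only: a synthetic sample is placed at a random position on the segment between a minority sample and one of its k nearest minority neighbors. The function name, parameters, and toy points are assumptions for illustration, and this is not the LipoSVM code.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority-class neighbors."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbor)))
    return synthetic


# Hypothetical minority-class feature vectors.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(smote(minority, n_new=3, k=2, seed=1))
```

In practice a maintained implementation would normally be preferred over a hand-written one. The sketch only makes the interpolation idea concrete.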

17.
Curr Genomics ; 20(8): 581-591, 2019 Dec.
Article in English | MEDLINE | ID: mdl-32581646

ABSTRACT

BACKGROUND: With the rapid development of biological research, microRNAs (miRNAs) have increasingly attracted worldwide attention. A growing number of biological studies and scientific experiments have shown that miRNAs are related to the occurrence and development of a large number of key biological processes which cause complex human diseases. Thus, identifying the associations between miRNAs and diseases is helpful for diagnosing the diseases. Although some studies have found considerable associations between miRNAs and diseases, there are still a lot of associations that need to be identified. Experimental methods to uncover miRNA-disease associations are time-consuming and expensive. Therefore, effective computational methods are urgently needed to predict new associations. METHODOLOGY: In this work, we propose an integrated method for predicting potential associations between miRNAs and diseases (IMPMD). The enhanced similarity for miRNAs is obtained by combining functional similarity, Gaussian similarity and Jaccard similarity. For diseases, it is obtained by combining semantic similarity, Gaussian similarity and Jaccard similarity. Then, we use these two enhanced similarities to construct the features and calculate a cumulative score to choose robust features. Finally, general linear regression is applied to assign weights for the Support Vector Machine, K-Nearest Neighbor and Logistic Regression algorithms. RESULTS: IMPMD obtains an AUC of 0.9386 in 10-fold cross-validation, which is better than most of the previous models. To further evaluate our model, we implement IMPMD on two types of case studies for lung cancer and breast cancer. 49 (Lung Cancer) and 50 (Breast Cancer) out of the top 50 related miRNAs are validated by experimental discoveries. CONCLUSION: We built a software named IMPMD which can be freely downloaded from https://github.com/Sunmile/IMPMD.
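
Note on record 17: the abstract combines Gaussian and Jaccard similarities computed for miRNAs and diseases. The sketch below shows one plausible reading, applied to binary association profiles (one row per miRNA, one column per disease). The Gaussian part assumes the Gaussian interaction profile kernel that is common in this literature, with the bandwidth normalized by the mean squared profile norm. The abstract does not state the exact formulas, so the function names and this normalization are assumptions rather than the IMPMD implementation.

```python
import math


def jaccard_similarity(profile_a, profile_b):
    """Jaccard index of two binary association profiles."""
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    union = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return shared / union if union else 0.0


def gaussian_similarity(profiles, bandwidth=1.0):
    """Gaussian kernel matrix over profiles, with gamma scaled by the mean squared norm."""
    norms = [sum(v * v for v in p) for p in profiles]
    mean_norm = sum(norms) / len(norms)
    gamma = bandwidth / mean_norm if mean_norm else 0.0
    n = len(profiles)
    return [
        [
            math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j])))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical association profiles: rows are miRNAs, columns are diseases.
profiles = [[1, 1, 0], [1, 0, 0], [0, 0, 1]]
print(jaccard_similarity(profiles[0], profiles[1]))
print(gaussian_similarity(profiles)[0][1])
```

Both measures depend only on the known association matrix, so they can be computed for every miRNA or disease even when functional or semantic annotations are missing.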

18.
Stroke Vasc Neurol ; 8(6): 453-462, 2023 12 29.
Article in English | MEDLINE | ID: mdl-37072338

ABSTRACT

OBJECTIVE: Extra-axial cavernous hemangiomas (ECHs) are sporadic and rare intracranial space-occupying lesions that usually occur within the cavernous sinus. The aetiology of ECHs remains unknown. METHODS: Whole-exome sequencing was performed on ECH lesions from 12 patients (discovery cohort) and droplet digital polymerase-chain-reaction (ddPCR) was used to confirm the identified mutation in 46 additional cases (validation cohort). Laser capture microdissection (LCM) was carried out to capture and characterise subgroups of tissue cells. Mechanistic and functional investigations were carried out in human umbilical vein endothelial cells and a newly established mouse model. RESULTS: We detected a somatic GJA4 mutation (c.121G>T, p.G41C) in 5/12 patients with ECH in the discovery cohort and confirmed the finding in the validation cohort (16/46). LCM followed by ddPCR revealed that the mutation was enriched in lesional endothelium. In vitro experiments in endothelial cells demonstrated that the GJA4 mutation activated SGK1 signalling that in turn upregulated key genes involved in cell hyperproliferation and the loss of arterial specification. Compared with wild-type littermates, mice overexpressing the GJA4 mutation developed ECH-like pathological morphological characteristics (dilated venous lumen and elevated vascular density) in the retinal superficial vascular plexus at postnatal week 3, which were reversed by an SGK1 inhibitor, EMD638683. CONCLUSIONS: We identified a somatic GJA4 mutation that presents in over one-third of ECH lesions and proposed that ECHs are vascular malformations due to GJA4-induced activation of the SGK1 signalling pathway in brain endothelial cells.


Subjects
Cavernous Hemangioma of the Central Nervous System , Cavernous Hemangioma , Humans , Animals , Mice , Endothelial Cells/metabolism , Cavernous Hemangioma of the Central Nervous System/diagnostic imaging , Cavernous Hemangioma of the Central Nervous System/genetics , Cavernous Hemangioma of the Central Nervous System/metabolism , Cavernous Hemangioma/metabolism , Cavernous Hemangioma/pathology , Mutation , Signal Transduction
19.
Genome Med ; 15(1): 16, 2023 03 13.
Article in English | MEDLINE | ID: mdl-36915208

ABSTRACT

BACKGROUND: Although temozolomide (TMZ) has been used as a standard adjuvant chemotherapeutic agent for primary glioblastoma (GBM), treating isocitrate dehydrogenase wild-type (IDH-wt) cases remains challenging due to intrinsic and acquired drug resistance. Therefore, elucidation of the molecular mechanisms of TMZ resistance is critical for its precision application. METHODS: We stratified 69 primary IDH-wt GBM patients into TMZ-resistant (n = 29) and TMZ-sensitive (n = 40) groups, using TMZ screening of the corresponding patient-derived glioma stem-like cells (GSCs). Genomic and transcriptomic features were then examined to identify TMZ-associated molecular alterations. Subsequently, we developed a machine learning (ML) model to predict TMZ response from combined signatures. Moreover, TMZ response in multisector samples (52 tumor sectors from 18 cases) was evaluated to validate findings and investigate the impact of intra-tumoral heterogeneity on TMZ efficacy. RESULTS: In vitro TMZ sensitivity of patient-derived GSCs classified patients into groups with different survival outcomes (P = 1.12e-4 for progression-free survival (PFS) and 3.63e-4 for overall survival (OS)). Moreover, we found that elevated gene expression of EGR4, PAPPA, LRRC3, and ANXA3 was associated with intrinsic TMZ resistance. In addition, other features such as 5-aminolevulinic acid negativity, mesenchymal/proneural expression subtypes, and hypermutation tended to promote TMZ resistance. In contrast, concurrent copy-number alterations in PTEN, EGFR, and CDKN2A/B were more frequent in TMZ-sensitive samples (Fisher's exact P = 0.0102), a finding subsequently corroborated by multisector sequencing analyses. Integrating all features, we trained an ML tool to segregate TMZ-resistant and TMZ-sensitive groups. Notably, our method segregated IDH-wt GBM patients from The Cancer Genome Atlas (TCGA) into two groups with divergent survival outcomes (P = 4.58e-4 for PFS and 3.66e-4 for OS). Furthermore, we showed a highly heterogeneous TMZ-response pattern within each GBM patient using in vitro TMZ screening and genomic characterization of multisector GSCs. Lastly, the prediction model that evaluates TMZ efficacy for primary IDH-wt GBMs was developed into a webserver for public use (http://www.wang-lab-hkust.com:3838/TMZEP). CONCLUSIONS: We identified molecular characteristics associated with TMZ sensitivity and illustrated the potential clinical value of an ML model trained on pharmacogenomic profiling of patient-derived GSCs for IDH-wt GBMs.


Subjects
Brain Neoplasms , Glioblastoma , Glioma , Humans , Glioblastoma/drug therapy , Glioblastoma/genetics , Glioblastoma/metabolism , Pharmacogenetics , Brain Neoplasms/drug therapy , Brain Neoplasms/genetics , Brain Neoplasms/metabolism , Temozolomide/pharmacology , Temozolomide/therapeutic use , Glioma/genetics , Antineoplastic Drug Resistance/genetics , Early Growth Response Transcription Factors
20.
Sci Transl Med ; 15(716): eadh4181, 2023 10 04.
Article in English | MEDLINE | ID: mdl-37792958

ABSTRACT

Clonal evolution drives cancer progression and therapeutic resistance. Recent studies have revealed divergent longitudinal trajectories in gliomas, but early molecular features steering posttreatment cancer evolution remain unclear. Here, we collected sequencing and clinical data of initial-recurrent tumor pairs from 544 adult diffuse gliomas and performed multivariate analysis to identify early molecular predictors of tumor evolution in three diffuse glioma subtypes. We found that CDKN2A deletion at initial diagnosis preceded tumor necrosis and microvascular proliferation that occur at later stages of IDH-mutant glioma. Ki67 expression at diagnosis was positively correlated with acquiring hypermutation at recurrence in IDH-wild-type glioma. In all glioma subtypes, MYC gain or MYC-target activation at diagnosis was associated with treatment-induced hypermutation at recurrence. To predict glioma evolution, we constructed CELLO2 (Cancer EvoLution for LOngitudinal data version 2), a machine learning model integrating features at diagnosis to forecast hypermutation and progression after treatment. CELLO2 successfully stratified patients into subgroups with distinct prognoses and identified, within the low-grade IDH-mutant-noncodel subtype, a high-risk patient group characterized by MYC gain and worse post-progression survival. We then performed chronic temozolomide-induction experiments in glioma cell lines and isogenic patient-derived gliomaspheres and demonstrated that MYC drives temozolomide resistance by promoting hypermutation. Mechanistically, we demonstrated that, by binding to open chromatin and transcriptionally active genomic regions, c-MYC increases the vulnerability of key mismatch repair genes to treatment-induced mutagenesis, thus triggering hypermutation. This study reveals early predictors of cancer evolution under therapy and provides a resource for precision oncology targeting cancer dynamics in diffuse gliomas.


Subjects
Brain Neoplasms , Glioma , Adult , Humans , Brain Neoplasms/therapy , Temozolomide/pharmacology , Temozolomide/therapeutic use , Mutation/genetics , Precision Medicine , Local Neoplasm Recurrence/drug therapy , Glioma/drug therapy