1.
NAR Genom Bioinform ; 4(3): lqac053, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35899080

ABSTRACT

Despite the tremendous increase in omics data generated by modern sequencing technologies, their analysis can be tricky and often requires substantial expertise in bioinformatics. To address this concern, we have developed a user-friendly pipeline to analyze (cancer) genomic data that takes in raw sequencing data (FASTQ format) as input and outputs insightful statistics. Our iCOMIC toolkit pipeline featuring many independent workflows is embedded in the popular Snakemake workflow management system. It can analyze whole-genome and transcriptome data and is characterized by a user-friendly GUI that offers several advantages, including minimal execution steps and eliminating the need for complex command-line arguments. Notably, we have integrated algorithms developed in-house to predict pathogenicity among cancer-causing mutations and differentiate between tumor suppressor genes and oncogenes from somatic mutation data. We benchmarked our tool against Genome In A Bottle benchmark dataset (NA12878) and got the highest F1 score of 0.971 and 0.988 for indels and SNPs, respectively, using the BWA MEM-GATK HC DNA-Seq pipeline. Similarly, we achieved a correlation coefficient of r = 0.85 using the HISAT2-StringTie-ballgown and STAR-StringTie-ballgown RNA-Seq pipelines on the human monocyte dataset (SRP082682). Overall, our tool enables easy analyses of omics datasets, significantly ameliorating complex data analysis pipelines.
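As an illustration of the benchmarking metric mentioned above, the F1 score of a variant call set against a truth set can be sketched as follows. This is a minimal sketch, not the iCOMIC implementation: the variant keys, the `f1_score` helper, and the toy call sets are hypothetical, and real benchmarking against Genome In A Bottle data typically also handles variant normalization and confident regions.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from counts of true positives, false positives and false negatives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)  # fraction of called variants that are correct
    recall = tp / (tp + fn)     # fraction of true variants that were called
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy call sets (not data from the paper), keyed as chrom:pos:ref>alt.
called = {"chr1:1000:A>G", "chr1:2000:C>T", "chr2:500:G>A"}
truth = {"chr1:1000:A>G", "chr2:500:G>A", "chr3:42:T>C"}

tp = len(called & truth)   # variants present in both sets
fp = len(called - truth)   # called but absent from the truth set
fn = len(truth - called)   # in the truth set but missed by the caller
score = f1_score(tp, fp, fn)
print(f"TP={tp} FP={fp} FN={fn} F1={score:.3f}")
```

Because F1 is the harmonic mean of precision and recall, a pipeline scores highly only when it makes few false calls and misses few true variants.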

2.
Front Genet ; 13: 854190, 2022.
Article in English | MEDLINE | ID: mdl-35620468

ABSTRACT

The progression of tumorigenesis starts with a few mutational and structural driver events in the cell. Various cohort-based computational tools exist to identify driver genes but require multiple samples to identify less frequently mutated driver genes. Many studies use different methods to identify driver mutations/genes from mutations that have no impact on tumor progression; however, a small fraction of patients show no mutational events in any known driver genes. Current unsupervised methods map somatic and expression data onto a network to identify personalized driver genes based on changes in expression. Our method is the first machine learning model to classify genes as tumor suppressor gene (TSG), oncogene (OG), or neutral, thus assigning the functional impact of the gene in the patient. In this study, we develop a multi-omic approach, PIVOT (Personalized Identification of driVer OGs and TSGs), to train on experimentally or computationally validated mutational and structural driver events. Given the lack of any gold standards for the identification of personalized driver genes, we label the data using four strategies and, based on classification metrics, show gene-based labeling strategies perform best. We build different models using SNV, RNA, and multi-omic features to be used based on the data available. Our models trained on multi-omic data improved predictions compared with mutation and expression data, achieving an accuracy ≥ 0.99 for BRCA, LUAD, and COAD datasets. We show network and expression-based features contribute the most to PIVOT. Our predictions on BRCA, COAD, and LUAD cancer types reveal commonly altered genes such as TP53 and PIK3CA, which are predicted drivers for multiple cancer types. Along with known driver genes, our models also identify new driver genes such as PRKCA, SOX9, and PSMD4. Our multi-omic model labels both CNV and mutations with a more considerable contribution by CNV alterations. 
While predicting labels for genes mutated in multiple samples, we also label rare driver events occurring in as few as one sample. We also identify genes with dual roles within the same cancer type. Overall, PIVOT labels personalized driver genes as TSGs and OGs and also identifies rare driver genes.

3.
Sci Rep ; 12(1): 5, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34997044

ABSTRACT

An emergent area of cancer genomics is the identification of driver genes. Driver genes confer a selective growth advantage to the cell. While several driver genes have been discovered, many remain undiscovered, especially those mutated at a low frequency across samples. This study defines new features and builds a pan-cancer model, cTaG, to identify new driver genes. The features capture the functional impact of the mutations as well as their recurrence across samples, which helps build a model unbiased to genes with low frequency. The model classifies genes into the functional categories of driver genes, tumour suppressor genes (TSGs) and oncogenes (OGs), having distinct mutation type profiles. We overcome overfitting and show that certain mutation types, such as nonsense mutations, are more important for classification. Further, cTaG was employed to identify tissue-specific driver genes. Some known cancer driver genes predicted by cTaG as TSGs with high probability are ARID1A, TP53, and RB1. In addition to these known genes, potential driver genes predicted are CD36, ZNF750 and ARHGAP35 as TSGs and TAB3 as an oncogene. Overall, our approach surmounts the issue of low recall and bias towards genes with high mutation rates and predicts potential new driver genes for further experimental screening. cTaG is available at https://github.com/RamanLab/cTaG .
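To make the idea of per-gene "mutation type profiles" and recurrence across samples concrete, the sketch below derives both from a list of mutation records. It is only an illustration under stated assumptions: the record layout `(sample, gene, mutation_type)`, the helper names, and the gene labels are hypothetical, and the actual cTaG features are not specified in the abstract (see the repository linked above for the authors' code).

```python
from collections import Counter, defaultdict


def mutation_type_profile(mutations):
    """Return {gene: {mutation_type: fraction}} from (sample, gene, type) records."""
    counts = defaultdict(Counter)
    for _sample, gene, mut_type in mutations:
        counts[gene][mut_type] += 1
    profiles = {}
    for gene, type_counts in counts.items():
        total = sum(type_counts.values())
        # Fractions rather than raw counts, so rarely mutated genes stay comparable.
        profiles[gene] = {t: n / total for t, n in type_counts.items()}
    return profiles


def recurrence(mutations):
    """Return {gene: number of distinct samples in which the gene is mutated}."""
    samples = defaultdict(set)
    for sample, gene, _mut_type in mutations:
        samples[gene].add(sample)
    return {gene: len(s) for gene, s in samples.items()}


# Hypothetical toy records, not data from the paper.
records = [
    ("s1", "GENE_A", "nonsense"),
    ("s2", "GENE_A", "nonsense"),
    ("s3", "GENE_A", "missense"),
    ("s1", "GENE_B", "missense"),
    ("s1", "GENE_B", "missense"),
]
profiles = mutation_type_profile(records)
rec = recurrence(records)
print(profiles["GENE_A"], rec)
```

Expressing mutation types as fractions is one simple way to keep a feature from scaling with how often a gene is mutated, which is the kind of bias the abstract describes wanting to avoid.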


Subjects
Neoplasms/genetics , Oncogene Proteins/genetics , Tumor Suppressor Genes , Genomics , Humans , Mutation , Oncogenes , Tumor Suppressor Proteins/genetics
4.
Metabolites ; 10(6), 2020 Jun 13.
Article in English | MEDLINE | ID: mdl-32545768

ABSTRACT

The metabolome of an organism depends on environmental factors and intracellular regulation and provides information about the physiological conditions. Metabolomics helps to understand disease progression in clinical settings or estimate metabolite overproduction for metabolic engineering. The most popular analytical metabolomics platform is mass spectrometry (MS). However, MS metabolome data analysis is complicated, since metabolites interact nonlinearly, and the data structures themselves are complex. Machine learning methods have become immensely popular for statistical analysis due to the inherent nonlinear data representation and the ability to process large and heterogeneous data rapidly. In this review, we address recent developments in using machine learning for processing MS spectra and show how machine learning generates new biological insights. In particular, supervised machine learning has great potential in metabolomics research because of the ability to supply quantitative predictions. We review here commonly used tools, such as random forest, support vector machines, artificial neural networks, and genetic algorithms. During processing steps, the supervised machine learning methods help peak picking, normalization, and missing data imputation. For knowledge-driven analysis, machine learning contributes to biomarker detection, classification and regression, biochemical pathway identification, and carbon flux determination. Of important relevance is the combination of different omics data to identify the contributions of the various regulatory levels. Our overview of the recent publications also highlights that data quality determines analysis quality, but also adds to the challenge of choosing the right model for the data. Machine learning methods applied to MS-based metabolomics ease data analysis and can support clinical decisions, guide metabolic engineering, and stimulate fundamental biological discoveries.
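As one concrete example of the missing-data imputation step mentioned above, the following sketch fills missing intensities with a k-nearest-neighbours average. It is an assumption-laden illustration rather than a method taken from the review: the `knn_impute` helper, the distance definition, and the toy intensity matrix are hypothetical, and production pipelines would normally rely on an established library.

```python
import math


def knn_impute(matrix, k=2):
    """Fill None entries with the mean of that feature in the k nearest samples.

    matrix: list of samples, each a list of intensities (None = missing).
    Distance is the root-mean-square difference over features observed in both samples.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [row[:] for row in matrix]  # copy so the input stays untouched
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other samples in which feature j was measured.
            donors = [(distance(row, other), other[j])
                      for m, other in enumerate(matrix)
                      if m != i and other[j] is not None]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed


# Hypothetical toy intensity matrix (rows = samples, columns = metabolite features).
samples = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [1.0, 2.0, 3.2],
    [9.0, 9.0, 9.0],
]
filled = knn_impute(samples, k=2)
print(filled[1])
```

The missing value in the second sample is taken from the two samples with the most similar observed features, so the outlying fourth sample does not influence it.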

5.
J Biol Chem ; 295(27): 9192-9210, 2020 07 03.
Article in English | MEDLINE | ID: mdl-32424041

ABSTRACT

Intracellular pathogens commonly manipulate the host lysosomal system for their survival. However, whether this pathogen-induced alteration affects the organization and functioning of the lysosomal system itself is not known. Here, using in vitro and in vivo infections and quantitative image analysis, we show that the lysosomal content and activity are globally elevated in Mycobacterium tuberculosis (Mtb)-infected macrophages. We observed that this enhanced lysosomal state is sustained over time and defines an adaptive homeostasis in the infected macrophage. Lysosomal alterations are caused by mycobacterial surface components, notably the cell wall-associated lipid sulfolipid-1 (SL-1), which functions through the mTOR complex 1 (mTORC1)-transcription factor EB (TFEB) axis in the host cells. An Mtb mutant lacking SL-1, MtbΔpks2, shows attenuated lysosomal rewiring compared with the WT Mtb in both in vitro and in vivo infections. Exposing macrophages to purified SL-1 enhanced the trafficking of phagocytic cargo to lysosomes. Correspondingly, MtbΔpks2 exhibited a further reduction in lysosomal delivery compared with the WT. Reduced trafficking of this mutant Mtb strain to lysosomes correlated with enhanced intracellular bacterial survival. Our results reveal that global alteration of the host lysosomal system is a defining feature of Mtb-infected macrophages and suggest that this altered lysosomal state protects host cell integrity and contributes to the containment of the pathogen.


Subjects
Lipid Metabolism/physiology , Lysosomes/metabolism , Mycobacterium tuberculosis/metabolism , Cell Movement , Cell Wall , Host-Pathogen Interactions/physiology , Humans , Lipids/physiology , Lysosomes/physiology , Macrophages/metabolism , Macrophages/microbiology , Mycobacterium tuberculosis/physiology , Protein Transport , THP-1 Cells , Tuberculosis/microbiology