Results 1 - 14 of 14
1.
BMC Bioinformatics ; 25(1): 130, 2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38532317

ABSTRACT

BACKGROUND: Recent improvements in sequencing technologies have enabled detailed profiling of genomic features. These technologies mostly rely on short reads, which are merged and compared to a reference genome for variant identification. These operations must be done computationally because of the size and complexity of the data. The need for analysis software has resulted in many programs for the mapping, variant calling and annotation steps. Currently, most programs are either expensive enterprise software with proprietary code, which makes access and verification very difficult, or open-access programs that are mostly based on command-line operations without user interfaces and extensive documentation. Moreover, multiple studies have observed a high level of disagreement among popular mapping and variant calling algorithms, which makes relying on a single algorithm unreliable. User-friendly open-source software tools that offer comparative analysis are an important need considering the growth of sequencing technologies. RESULTS: Here, we propose Comparative Sequencing Analysis Platform (COSAP), an open-source platform that provides popular sequencing algorithms for SNV, indel, structural variant calling, copy number variation, microsatellite instability and fusion analysis and their annotations. COSAP is packed with a fully functional user-friendly web interface and a backend server which allows full independent deployment for both individual and institutional scales. COSAP is developed as a workflow management system and designed to enhance cooperation among scientists with different backgrounds. It is publicly available at https://cosap.bio and https://github.com/MBaysanLab/cosap/ . The source code of the backend and frontend services can be found at https://github.com/MBaysanLab/cosap-webapi/ and https://github.com/MBaysanLab/cosap_frontend/ respectively. All services are packed as Docker containers as well. 
Pipelines that combine algorithms can be customized and new algorithms can be added with minimal coding through modular structure. CONCLUSIONS: COSAP simplifies and speeds up the process of DNA sequencing analyses providing commonly used algorithms for SNV, indel, structural variant calling, copy number variation, microsatellite instability and fusion analysis as well as their annotations. COSAP is packed with a fully functional user-friendly web interface and a backend server which allows full independent deployment for both individual and institutional scales. Standardized implementations of popular algorithms in a modular platform make comparisons much easier to assess the impact of alternative pipelines which is crucial in establishing reproducibility of sequencing analyses.


Subjects
DNA Copy Number Variations , Microsatellite Instability , Humans , Reproducibility of Results , High-Throughput Nucleotide Sequencing/methods , Software
2.
BMC Bioinformatics ; 25(1): 290, 2024 Sep 03.
Article in English | MEDLINE | ID: mdl-39227760

ABSTRACT

BACKGROUND: Advancements over the past decade in DNA sequencing technology and computing power have created the potential to revolutionize medicine. There has been a marked increase in genetic data available, allowing for the advancement of areas such as personalized medicine. A crucial type of data in this context is genetic variant data which is stored in variant call format (VCF) files. However, the rapid growth in genomics has presented challenges in analyzing and comparing VCF files. RESULTS: In response to the limitations of existing tools, this paper introduces a novel web application that provides a user-friendly solution for VCF file analyses and comparisons. The software tool enables researchers and clinicians to perform high-level analysis with ease and enhances productivity. The application's interface allows users to conveniently upload, analyze, and visualize their VCF files using simple drag-and-drop and point-and-click operations. Essential visualizations such as Venn diagrams, clustergrams, and precision-recall plots are provided to users. A key feature of the application is its support for metadata-based file grouping, accomplished through flexible data matrix uploads, streamlining organization and analysis of user-defined categories. Additionally, the application facilitates standardized benchmarking of VCF files by integrating user-provided ground truth regions and variant lists. CONCLUSIONS: By providing a user-friendly interface and supporting essential visualizations, this software enhances the accessibility of VCF file analysis and assists researchers and clinicians in their scientific inquiries.
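The comparison and benchmarking steps described in this abstract can be illustrated with a minimal sketch. It is not the application's code: the function names `variant_keys` and `precision_recall`, the choice of (chromosome, position, REF, ALT) as the variant identity, and the two tiny VCF snippets are all assumptions made for illustration.

```python
def variant_keys(vcf_text):
    """Collect (chrom, pos, ref, alt) keys from the data lines of VCF text."""
    keys = set()
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and the '##' / '#CHROM' header lines
        chrom, pos, _id, ref, alts = line.split("\t")[:5]
        for alt in alts.split(","):  # multi-allelic records list ALT alleles comma-separated
            keys.add((chrom, int(pos), ref, alt))
    return keys


def precision_recall(called, truth):
    """Precision and recall of a call set against a ground-truth variant set."""
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Made-up example records (hypothetical data, not taken from the paper).
HEADER = "##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\n"
TRUTH_VCF = HEADER + "chr1\t100\t.\tA\tG\nchr1\t200\t.\tC\tT\nchr2\t50\t.\tG\tA\n"
CALLS_VCF = HEADER + "chr1\t100\t.\tA\tG\nchr1\t200\t.\tC\tT,A\nchr2\t75\t.\tT\tC\n"

truth = variant_keys(TRUTH_VCF)
called = variant_keys(CALLS_VCF)

# Region sizes of a two-set Venn diagram.
venn = {
    "shared": len(called & truth),
    "calls_only": len(called - truth),
    "truth_only": len(truth - called),
}
precision, recall = precision_recall(called, truth)
print(venn, precision, recall)
```

Treating each ALT allele as its own variant keeps multi-allelic records comparable across files; a real benchmark would typically also normalize indel representation and restrict the comparison to the user-provided ground truth regions, which this sketch omits.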


Subjects
Software , Genomics/methods , User-Computer Interface , Humans , Genetic Variation
3.
BMC Bioinformatics ; 25(1): 124, 2024 Mar 22.
Article in English | MEDLINE | ID: mdl-38519906

ABSTRACT

BACKGROUND: Next-generation sequencing (NGS) technologies offer fast and inexpensive identification of DNA sequences. Somatic sequencing is among the primary applications of NGS, where acquired (non-inherited) variants are identified by comparing diseased and healthy tissues from the same individual. Somatic mutations in genetic diseases such as cancer are tightly associated with genomic instability. Genomic instability increases heterogeneity, further complicating sequencing efforts, a task already challenged by the presence of short reads and repetitions in human DNA. This leads to low concordance among studies and limits reproducibility. This limitation is a significant problem since identified mutations in somatic sequencing are major biomarkers for diagnosis and the primary input of targeted therapies. Benchmarking studies were conducted to assess the error rates and increase reproducibility. Unfortunately, the number of somatic benchmarking sets is very limited due to difficulties in validating true somatic variants. Moreover, most NGS benchmarking studies are based on relatively simpler germline (inherited) sequencing. Recently, a comprehensive somatic sequencing benchmarking set was published by Sequencing Quality Control Phase 2 (SEQC2). We chose this dataset for our experiments because it is a well-validated, cancer-focused dataset that includes many tumor/normal biological replicates. Our study has two primary goals. The first goal is to determine how replicate-based consensus approaches can improve the accuracy of somatic variant detection systems. The second goal is to develop highly predictive machine learning (ML) models by employing replicate-based consensus variants as labels during the training phase. RESULTS: Ensemble approaches that combine alternative algorithms are relatively common; here, as an alternative, we study the performance enhancement potential of biological replicates. 
We first developed replicate-based consensus approaches that utilize the biological replicates available in this study to improve variant calling performance. Subsequently, we trained ML models using these biological replicates and achieved performance comparable to optimal ML models, those trained using high-confidence variants identified in advance. CONCLUSIONS: Our replicate-based consensus approach can be used to improve variant calling performance and develop efficient ML models. Given the relative ease of obtaining biological replicates, this strategy allows for the development of efficient ML models tailored to specific datasets or scenarios.
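A replicate-based consensus can be sketched as a simple voting rule. The abstract does not state the exact rule used, so the threshold below (a variant must appear in at least `min_support` replicates), the function name `replicate_consensus`, and the toy variant identifiers are assumptions for illustration only.

```python
from collections import Counter


def replicate_consensus(replicate_calls, min_support):
    """Keep variants that are called in at least `min_support` replicates."""
    counts = Counter()
    for calls in replicate_calls:
        counts.update(set(calls))  # each replicate votes at most once per variant
    return {variant for variant, n in counts.items() if n >= min_support}


# Hypothetical call sets from three biological replicates of one tumor/normal pair.
replicates = [
    {"chr1:100:A>G", "chr1:200:C>T", "chr3:10:G>A"},
    {"chr1:100:A>G", "chr1:200:C>T"},
    {"chr1:100:A>G", "chr7:55:T>C"},
]

consensus = replicate_consensus(replicates, min_support=2)

# Consensus membership can then serve as a training label for each candidate variant.
candidates = sorted(set().union(*replicates))
labels = {variant: int(variant in consensus) for variant in candidates}
print(consensus, labels)
```

Raising `min_support` trades sensitivity for precision: variants seen in a single replicate are dropped, which removes replicate-specific artifacts but can also remove low-frequency true variants.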


Subjects
Algorithms , Neoplasms , Humans , Reproducibility of Results , Exome Sequencing , Neoplasms/genetics , Genomic Instability , High-Throughput Nucleotide Sequencing
5.
Int J Cancer ; 141(10): 2002-2013, 2017 Nov 15.
Article in English | MEDLINE | ID: mdl-28710771

ABSTRACT

Intratumoral heterogeneity at the genetic, epigenetic, transcriptomic, and morphologic levels is a commonly observed phenomenon in many aggressive cancer types. Clonal evolution during tumor formation and in response to therapeutic intervention can be predicted utilizing reverse engineering approaches on detailed genomic snapshots of heterogeneous patient tumor samples. In this study, we developed an extensive dataset for a GBM case via the generation of polyclonal and monoclonal glioma stem cell lines from initial diagnosis, and from multiple sections of distant tumor locations of the deceased patient's brain following tumor recurrence. Our analyses revealed the tissue-wide expansion of a new clone in the recurrent tumor and chromosome 7 gain and chromosome 10 loss as repeated genomic events in primary and recurrent disease. Moreover, chromosome 7 gain and chromosome 10 loss produced similar alterations in mRNA expression profiles in primary and recurrent tumors despite possessing other highly heterogeneous and divergent genomic alterations between the tumors. We identified ETV1 and CDK6 as putative candidate genes, and NFKB (complex), IL1B, IL6, Akt and VEGF as potential signaling regulators, as potentially central downstream effectors of chr7 gain and chr10 loss. Finally, the differences caused by the transcriptomic shift following gain of chromosome 7 and loss of chromosome 10 were consistent with those generally seen in GBM samples compared to normal brain in large-scale patient-tumor data sets.


Subjects
Biomarkers, Tumor/genetics , Brain Neoplasms/genetics , Chromosomes, Human, Pair 10/genetics , Chromosomes, Human, Pair 7/genetics , Glioma/genetics , Neoplasm Recurrence, Local/genetics , Neoplastic Stem Cells/metabolism , Animals , Brain Neoplasms/pathology , Cell Line, Tumor , Chromosome Aberrations , Gene Expression Profiling , Genomics/methods , Glioma/pathology , Heterografts , High-Throughput Nucleotide Sequencing , Humans , Mice , Neoplasm Recurrence, Local/pathology , Neoplastic Stem Cells/pathology , Prognosis
6.
Med Biol Eng Comput ; 61(1): 243-258, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36357628

ABSTRACT

This study explores the machine learning-based assessment of predisposition to colorectal cancer based on single nucleotide polymorphisms (SNPs). Such a computational approach may be used as a risk indicator and an auxiliary diagnosis method that complements traditional methods such as biopsy and CT scan. Moreover, it may be used to develop a low-cost screening test for the early detection of colorectal cancers to improve public health. We employ several supervised classification algorithms. In addition, we apply data imputation to fill in the missing genotype values. The employed dataset includes SNPs observed in particular colorectal cancer-associated genomic loci that are located within DNA regions of 11 selected genes obtained from 115 individuals. We make the following observations: (i) A random forest-based classifier using one-hot encoding and K-nearest neighbor (KNN)-based imputation performs the best among the studied classifiers with an F1 score of 89% and an area under the curve (AUC) score of 0.96. (ii) One-hot encoding together with K-nearest neighbor-based data imputation increases the F1 scores by around 26% in comparison to the baseline approach which does not employ them. (iii) The proposed model outperforms a commonly employed state-of-the-art approach, ColonFlag, under all evaluated settings by up to 24% in terms of the AUC score. Based on the high accuracy of the constructed predictive models, the studied 11 genes may be considered a gene panel candidate for colon cancer risk screening.
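The two preprocessing steps named in this abstract, KNN-based imputation of missing genotypes and one-hot encoding, can be sketched in plain Python. This is an illustrative sketch, not the study's implementation: the distance measure (mismatch fraction over loci observed in both samples), the majority vote among neighbors, `k=2`, and the toy genotype matrix are all assumptions.

```python
from collections import Counter


def mismatch_fraction(a, b):
    """Fraction of differing genotypes over loci observed in both samples."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return 1.0  # nothing to compare: treat as maximally distant
    return sum(x != y for x, y in shared) / len(shared)


def knn_impute(samples, k=2):
    """Fill each missing genotype with the majority genotype of the k nearest donors."""
    imputed = [list(row) for row in samples]
    for i, row in enumerate(samples):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors are the other samples with an observed genotype at locus j.
            donors = [other for n, other in enumerate(samples)
                      if n != i and other[j] is not None]
            donors.sort(key=lambda other: mismatch_fraction(row, other))
            votes = Counter(other[j] for other in donors[:k])
            if votes:
                imputed[i][j] = votes.most_common(1)[0][0]
    return imputed


def one_hot(samples):
    """Encode each locus as one binary column per observed genotype category."""
    n_loci = len(samples[0])
    categories = [sorted({row[j] for row in samples if row[j] is not None})
                  for j in range(n_loci)]
    return [[int(row[j] == c) for j, cats in enumerate(categories) for c in cats]
            for row in samples]


# Hypothetical genotypes: rows are individuals, columns are SNP loci, None is missing.
genotypes = [
    ["AA", "CT", "GG"],
    ["AA", "CT", "GG"],
    ["AG", "TT", "GA"],
    ["AA", None, "GG"],
]

imputed = knn_impute(genotypes, k=2)
encoded = one_hot(imputed)
print(imputed[3], encoded[0], encoded[2])
```

One-hot encoding avoids imposing an artificial order on genotype categories, which an integer coding would do. The encoded rows could then be passed to a classifier such as a random forest; that step is omitted here.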


Subjects
Algorithms , Colonic Neoplasms , Humans , Genotype , Phenotype , Supervised Machine Learning
7.
Turk J Biol ; 45(2): 114-126, 2021.
Article in English | MEDLINE | ID: mdl-33907494

ABSTRACT

The importance of next generation sequencing (NGS) is rising in cancer research as access to this key technology becomes easier for researchers. The sequence data created by NGS technologies must be processed by various bioinformatics algorithms within a pipeline in order to convert raw data to meaningful information. Mapping and variant calling are the two main steps of these analysis pipelines, and many algorithms are available for these steps. Therefore, detailed benchmarking of these algorithms in different scenarios is crucial for the efficient utilization of sequencing technologies. In this study, we compared the performance of twelve pipelines (three mapping and four variant discovery algorithms) with recommended settings to capture single nucleotide variants. We observed significant discrepancy in variant calls among tested pipelines for different heterogeneity levels in real and simulated samples, with overall high specificity and low sensitivity. In addition to the individual evaluation of pipelines, we also constructed and tested the performance of pipeline combinations. In these analyses, we observed that certain pipelines complement each other much better than others and outperform individual pipelines. This suggests that adhering to a single pipeline is not optimal for cancer sequencing analysis and that sample heterogeneity should be considered in algorithm optimization.
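The design of twelve pipelines (three mappers crossed with four variant callers) and the idea of complementary pipeline combinations can be illustrated with a short sketch. The abstract does not name the algorithms or the combination rule, so the placeholder names, the use of a union of call sets, the F1 score as the ranking metric, and the toy call sets are all assumptions.

```python
from itertools import combinations, product

# Placeholder names: the abstract does not list the algorithms.
MAPPERS = ["mapper_A", "mapper_B", "mapper_C"]
CALLERS = ["caller_1", "caller_2", "caller_3", "caller_4"]
PIPELINES = list(product(MAPPERS, CALLERS))  # every mapper paired with every caller


def f1_score(called, truth):
    """Harmonic mean of precision and recall for a call set against a truth set."""
    tp = len(called & truth)
    if tp == 0:
        return 0.0
    precision = tp / len(called)
    recall = tp / len(truth)
    return 2 * precision * recall / (precision + recall)


def best_union_pair(calls_by_pipeline, truth):
    """Return the pipeline pair whose merged (union) call set scores the highest F1."""
    scored = [
        ((a, b), f1_score(calls_by_pipeline[a] | calls_by_pipeline[b], truth))
        for a, b in combinations(calls_by_pipeline, 2)
    ]
    return max(scored, key=lambda item: item[1])


# Hypothetical variant positions for three of the twelve pipelines.
truth = {1, 2, 3, 4}
calls = {
    ("mapper_A", "caller_1"): {1, 2},
    ("mapper_A", "caller_2"): {1, 2, 9},
    ("mapper_B", "caller_1"): {3, 4, 8},
}

individual_f1 = {name: f1_score(found, truth) for name, found in calls.items()}
best_pair, best_f1 = best_union_pair(calls, truth)
print(len(PIPELINES), best_pair, round(best_f1, 3))
```

In this toy data each pipeline alone recovers only half of the true variants, while the best pair covers all of them at the cost of one false positive, which is the kind of complementarity the abstract describes.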

8.
Gene ; 588(1): 38-46, 2016 Aug 15.
Article in English | MEDLINE | ID: mdl-27125224

ABSTRACT

Multiple sclerosis (MS) is an inflammatory disease of the central nervous system caused by genetic and environmental factors that remain largely unknown. Autophagy is the process of degradation and recycling of damaged cytoplasmic organelles, macromolecular aggregates, and long-lived proteins. Malfunction of autophagy contributes to the pathogenesis of neurological diseases, and autophagy genes may modulate T cell survival. We aimed to examine the expression levels of autophagy-related genes. Blood samples of 95 unrelated patients (aged 17-65 years, 37 male, 58 female) diagnosed with MS and 95 healthy controls were used to extract the RNA samples. After conversion to single-stranded cDNA using polyT priming, the targeted genes were pre-amplified, and 96×78 (samples×primers) qRT-PCR reactions were performed for each primer pair on each sample on a 96.96 array of Fluidigm BioMark™. Compared to age- and sex-matched controls, gene expression levels of ATG16L2, ATG9A, BCL2, FAS, GAA, HGS, PIK3R1, RAB24, RGS19, ULK1, FOXO1, HTT were significantly altered (false discovery rate<0.05). Thus, altered expression levels of several autophagy-related genes may affect protein levels, which in turn would influence the activity of autophagy, or, most probably, those genes might be acting independently of autophagy and contributing to MS pathogenesis as risk factors. The indeterminate genetic causes leading to alterations in gene expression require further analysis.
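The false discovery rate threshold mentioned in this abstract can be illustrated with the Benjamini-Hochberg procedure. The abstract does not say which FDR method was used, so the choice of Benjamini-Hochberg is an assumption, and the gene labels and p-values below are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p-value
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes (not data from the study).
genes = ["gene_1", "gene_2", "gene_3", "gene_4"]
raw_p = [0.001, 0.02, 0.03, 0.2]

q_values = benjamini_hochberg(raw_p)
significant = [g for g, q in zip(genes, q_values) if q < 0.05]
print(q_values, significant)
```

With 78 primer pairs tested per sample, an uncorrected 0.05 cutoff would be expected to produce several false positives by chance alone, which is why a multiple-testing correction of this kind is applied before reporting altered genes.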


Subjects
Autophagy , Multiple Sclerosis/genetics , Transcriptome , Adolescent , Adult , Aged , Autophagosomes , Female , Humans , Male , Middle Aged , Multiple Sclerosis/immunology , Multiple Sclerosis/pathology , Young Adult
9.
Cancer Inform ; 13(Suppl 3): 91, 2014.
Article in English | MEDLINE | ID: mdl-25573794

ABSTRACT

[This corrects the article on p. 33 in vol. 13, PMID: 25368508.].

10.
Cancer Inform ; 13(Suppl 3): 33-44, 2014.
Article in English | MEDLINE | ID: mdl-25368508

ABSTRACT

Glioblastoma multiforme (GBM) is the most common malignant brain tumor. GBM samples are classified into subtypes based on their transcriptomic and epigenetic profiles. Despite numerous studies to better characterize GBM biology, a comprehensive study to identify GBM subtype-specific master regulators, gene regulatory networks, and pathways is missing. Here, we used FastMEDUSA to compute master regulators and gene regulatory networks for each GBM subtype. We also ran Gene Set Enrichment Analysis and Ingenuity Pathway Analysis on the GBM expression dataset from The Cancer Genome Atlas Project to compute GBM- and GBM subtype-specific pathways. Our analysis was able to recover some of the known master regulators and pathways in GBM as well as some putative novel regulators and pathways, which will aid our understanding of the unique biology of GBM subtypes.

11.
PLoS One ; 9(4): e94045, 2014.
Article in English | MEDLINE | ID: mdl-24728236

ABSTRACT

In vitro and in vivo models are widely used in cancer research. Characterizing the similarities and differences between a patient's tumor and corresponding in vitro and in vivo models is important for understanding the potential clinical relevance of experimental data generated with these models. Towards this aim, we analyzed the genomic aberrations, DNA methylation and transcriptome profiles of five parental tumors and their matched in vitro isolated glioma stem cell (GSC) lines and xenografts generated from these same GSCs using high-resolution platforms. We observed that the methylation and transcriptome profiles of in vitro GSCs were significantly different from their corresponding xenografts, which were actually more similar to their original parental tumors. This points to the potentially critical role of the brain microenvironment in influencing methylation and transcriptional patterns of GSCs. Consistent with this possibility, ex vivo cultured GSCs isolated from xenografts showed a tendency to return to their initial in vitro states even after a short time in culture, supporting a rapid dynamic adaptation to the in vitro microenvironment. These results show that methylation and transcriptome profiles are highly dependent on the microenvironment and that growth in orthotopic sites partially reverses the changes caused by in vitro culturing.


Subjects
Glioma/genetics , Neoplastic Stem Cells/metabolism , Animals , DNA Methylation/genetics , DNA Methylation/physiology , Female , Humans , In Vitro Techniques , Mice , Mice, SCID , Polymorphism, Single Nucleotide/genetics , Principal Component Analysis , Prospective Studies , Tumor Cells, Cultured
12.
PLoS One ; 8(4): e62982, 2013.
Article in English | MEDLINE | ID: mdl-23658659

ABSTRACT

Age is a powerful predictor of survival in glioblastoma multiforme (GBM) yet the biological basis for the difference in clinical outcome is mostly unknown. Discovering genes and pathways that would explain age-specific survival difference could generate opportunities for novel therapeutics for GBM. Here we have integrated gene expression, exon expression, microRNA expression, copy number alteration, SNP, whole exome sequence, and DNA methylation data sets of a cohort of GBM patients in The Cancer Genome Atlas (TCGA) project to discover age-specific signatures at the transcriptional, genetic, and epigenetic levels and validated our findings on the REMBRANDT data set. We found major age-specific signatures at all levels including age-specific hypermethylation in polycomb group protein target genes and the upregulation of angiogenesis-related genes in older GBMs. These age-specific differences in GBM, which are independent of molecular subtypes, may in part explain the preferential effects of anti-angiogenic agents in older GBM and pave the way to a better understanding of the unique biology and clinical behavior of older versus younger GBMs.


Subjects
Aging/genetics , Brain Neoplasms/genetics , Epigenesis, Genetic , Gene Expression Regulation, Neoplastic , Glioblastoma/genetics , Adult , Age Factors , Aged , Aging/pathology , Angiogenesis Inhibitors/therapeutic use , Brain Neoplasms/blood supply , Brain Neoplasms/drug therapy , Brain Neoplasms/mortality , DNA Copy Number Variations , DNA Methylation , Exons , Female , Glioblastoma/blood supply , Glioblastoma/drug therapy , Glioblastoma/mortality , Humans , Male , MicroRNAs , Middle Aged , Neovascularization, Pathologic , Polycomb-Group Proteins/genetics , Polycomb-Group Proteins/metabolism , Polymorphism, Single Nucleotide , Survival Analysis
13.
PLoS One ; 7(11): e47839, 2012.
Article in English | MEDLINE | ID: mdl-23139755

ABSTRACT

Glioblastoma multiforme (GBM) is a tumor with high mortality and no known cure. The dramatic molecular and clinical heterogeneity seen in this tumor has led to attempts to define genetically similar subgroups of GBM with the hope of developing tumor-specific therapies targeted to the unique biology within each of these subgroups. Recently, a subset of GBMs with relatively favorable prognosis has been identified. These glioma CpG island methylator phenotype (G-CIMP) tumors have distinct genomic copy number aberrations, DNA methylation patterns, and mRNA expression profiles compared to other GBMs. While the standard method for identifying G-CIMP tumors is based on genome-wide DNA methylation data, such data are often unavailable, whereas gene expression data are more widely available. In this study, we have developed and evaluated a method to predict the G-CIMP status of GBM samples based solely on gene expression data.


Subjects
Brain Neoplasms/genetics , CpG Islands/genetics , DNA Methylation/genetics , Databases, Genetic , Gene Expression Regulation, Neoplastic , Glioblastoma/genetics , Cluster Analysis , Humans , Kaplan-Meier Estimate , Models, Genetic , Principal Component Analysis , RNA, Messenger/genetics , Reproducibility of Results
14.
PLoS One ; 7(12): e51407, 2012.
Article in English | MEDLINE | ID: mdl-23236496

ABSTRACT

Histone methylation regulates normal stem cell fate decisions through a coordinated interplay between histone methyltransferases and demethylases at lineage-specific genes. Malignant transformation is associated with aberrant accumulation of repressive histone modifications, such as polycomb-mediated histone 3 lysine 27 trimethylation (H3K27me3), resulting in a histone methylation-mediated block to differentiation. The relevance of histone demethylases in cancer, however, remains less clear. We report that JMJD3, an H3K27me3 demethylase, is induced during differentiation of glioblastoma stem cells (GSCs), where it promotes a differentiation-like phenotype via chromatin-dependent (INK4A/ARF locus activation) and chromatin-independent (nuclear p53 protein stabilization) mechanisms. Our findings indicate that deregulation of JMJD3 may contribute to gliomagenesis via inhibition of the p53 pathway, resulting in a block to terminal differentiation.


Subjects
Cell Differentiation/physiology , Cell Transformation, Neoplastic/metabolism , Glioblastoma/physiopathology , Jumonji Domain-Containing Histone Demethylases/metabolism , Neoplastic Stem Cells/physiology , Tumor Suppressor Protein p53/metabolism , Animals , Blotting, Western , DNA Primers/genetics , Histones/metabolism , Humans , Immunohistochemistry , Immunoprecipitation , Luciferases , Mass Spectrometry , Mice , Mice, SCID , Protein Stability , Real-Time Polymerase Chain Reaction