Results 1 - 20 of 80
1.
Sci Rep ; 14(1): 11794, 2024 05 23.
Article in English | MEDLINE | ID: mdl-38782963

ABSTRACT

We present the Manatee variational autoencoder model to predict transcription factor (TF) perturbation-induced transcriptomes. We demonstrate that the Manatee in silico perturbation analysis recapitulates target transcriptomic phenotypes in diverse cellular lineage transitions. We further propose the Manatee in silico screening analysis for prioritizing TF combinations targeting desired transcriptomic phenotypes.


Subjects
Transcription Factors; Transcriptome; Transcription Factors/metabolism; Transcription Factors/genetics; Humans; Gene Expression Profiling; Computer Simulation; Computational Biology/methods; Algorithms
3.
Sci Rep ; 12(1): 1329, 2022 01 25.
Article in English | MEDLINE | ID: mdl-35079083

ABSTRACT

The SARS-CoV-2 pandemic has challenged humankind's ability to quickly determine the cascade of health effects caused by a novel infection. Even with the unprecedented speed at which vaccines were developed and introduced into society, identifying therapeutic interventions and drug targets for patients infected with the virus remains important as new strains of the virus evolve or future coronaviruses emerge that are resistant to current vaccines. The application of transcriptomic RNA sequencing of infected samples may shed new light on the pathways involved in viral mechanisms and host responses. We describe the application of the previously developed "dual RNA-seq" approach to investigate, for the first time, the co-regulation between the human and SARS-CoV-2 transcriptomes. Together with differential expression analysis, we describe the tissue specificity of SARS-CoV-2 expression, an inferred lipopolysaccharide response, and co-regulation of CXCLs, SPRRs, and S100s with SARS-CoV-2 expression. Lipopolysaccharide response pathways in particular offer promise for future therapeutic research and the prospect of subgrouping patients based on chemokine expression, which may help explain the vastly different reactions patients have to infection. Taken together, these findings highlight unappreciated SARS-CoV-2 expression signatures and emphasize new considerations and mechanisms for SARS-CoV-2 therapeutic intervention.


Subjects
COVID-19; Gene Expression Regulation, Viral; RNA-Seq; SARS-CoV-2; Transcriptome; A549 Cells; COVID-19/genetics; COVID-19/metabolism; Humans; SARS-CoV-2/genetics; SARS-CoV-2/metabolism
4.
Cell Syst ; 12(8): 827-838.e5, 2021 08 18.
Article in English | MEDLINE | ID: mdl-34146471

ABSTRACT

The accurate identification and quantitation of RNA isoforms present in the cancer transcriptome is key for analyses ranging from the inference of the impacts of somatic variants to pathway analysis to biomarker development and subtype discovery. The ICGC-TCGA DREAM Somatic Mutation Calling in RNA (SMC-RNA) challenge was a crowd-sourced effort to benchmark methods for RNA isoform quantification and fusion detection from bulk cancer RNA sequencing (RNA-seq) data. It concluded in 2018 with a comparison of 77 fusion detection entries and 65 isoform quantification entries on 51 synthetic tumors and 32 cell lines with spiked-in fusion constructs. We report the entries used to build this benchmark, the leaderboard results, and the experimental features associated with the accurate prediction of RNA species. This challenge required submissions to be in the form of containerized workflows, meaning each of the entries described is easily reusable through CWL and Docker containers at https://github.com/SMC-RNA-challenge. A record of this paper's transparent peer review process is included in the supplemental information.


Subjects
Neoplasms; Humans; Neoplasms/genetics; Protein Isoforms/genetics; RNA/genetics; RNA-Seq; Sequence Analysis, RNA
5.
PLoS Comput Biol ; 17(4): e1008878, 2021 04.
Article in English | MEDLINE | ID: mdl-33861732

ABSTRACT

Advancements in sequencing have led to the proliferation of multi-omic profiles of human cells under different conditions and perturbations. In addition, many databases have amassed information about pathways and gene "signatures" (patterns of gene expression associated with specific cellular and phenotypic contexts). An important current challenge in systems biology is to leverage such knowledge about gene coordination to maximize the predictive power and generalization of models applied to high-throughput datasets. However, few such integrative approaches exist that also provide interpretable results quantifying the importance of individual genes and pathways to model accuracy. We introduce AKLIMATE, a first kernel-based stacked learner that seamlessly incorporates multi-omics feature data with prior information in the form of pathways for either regression or classification tasks. AKLIMATE uses a novel multiple-kernel learning framework where individual kernels capture the prediction propensities recorded in random forests, each built from a specific pathway gene set that integrates all omics data for its member genes. AKLIMATE has comparable or improved performance relative to state-of-the-art methods on diverse phenotype learning tasks, including predicting microsatellite instability in endometrial and colorectal cancer, survival in breast cancer, and cell line response to gene knockdowns. We show how AKLIMATE is able to connect feature data across data platforms through their common pathways to identify examples of several known and novel contributors to cancer and synthetic lethality.


Subjects
Genomics; Machine Learning; Neoplasms/classification; Neoplasms/genetics; Cell Line, Tumor; Gene Knockdown Techniques; Humans; Phenotype; RNA, Small Interfering/genetics; Survival Analysis
6.
iScience ; 24(1): 102017, 2021 Jan 22.
Article in English | MEDLINE | ID: mdl-33490923

ABSTRACT

Biological states are controlled by orchestrated transcription factors (TFs) within gene regulatory networks. Here we show that TFs responsible for the dynamic changes of biological states can be prioritized with temporal PageRank. We further show that such TF prioritization can be extended by integrating gene regulatory networks reverse engineered from multi-omics profiles, e.g., gene expression, chromatin accessibility, and chromosome conformation assays, using multiplex PageRank.
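
For orientation only, the ranking idea behind this abstract can be sketched with plain PageRank computed by power iteration. This is not the temporal or multiplex variant the authors describe, whose details the abstract does not give. The function name, the toy edge list, the edge direction (regulator to target), and the damping value of 0.85 are all assumptions made for illustration.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Plain PageRank by power iteration over a directed edge list.

    Illustrative sketch only: standard PageRank, not the temporal or
    multiplex variants named in the abstract.
    """
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1.0 - damping + damping * dangling) / n for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical toy network: TF_A regulates TF_B and TF_C; TF_B regulates TF_C.
toy_edges = [("TF_A", "TF_B"), ("TF_A", "TF_C"), ("TF_B", "TF_C")]
scores = pagerank(toy_edges)
```

With edges pointing from regulator to target, rank accumulates on heavily regulated nodes. Reversing the edges would instead favor upstream regulators. Which direction suits TF prioritization is a modeling choice that the abstract does not settle.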

7.
Prostate Cancer Prostatic Dis ; 24(1): 81-87, 2021 03.
Article in English | MEDLINE | ID: mdl-32286548

ABSTRACT

BACKGROUND: Metastatic disease burden out of proportion to serum PSA has been used as a marker of aggressive phenotype prostate cancer but is not well defined as a distinct subgroup. We sought to prospectively characterize the molecular features and clinical outcomes of Low PSA Secretors. METHODS: Eligible metastatic castration-resistant prostate cancer (mCRPC) patients without prior small cell histology underwent metastatic tumor biopsy with molecular characterization. Low PSA secretion was defined as serum PSA < 2, 5, or 10 ng/mL plus >5 metastases with radiographic progression at study entry. Clinical and molecular features were compared between Low PSA Secretors and normal secretors in a post-hoc fashion. RESULTS: 183 patients were enrolled, including 15 (8%) identified as Low PSA Secretors using the optimal PSA cut point of 5 ng/mL. Biopsies from Low PSA Secretors demonstrated higher t-SCNC and RB1 loss and lower AR transcriptional signature scores compared with normal secretors. Genomic loss of RB1 and/or TP53 was more common in Low PSA Secretors (80% vs. 41%). Overall survival (OS) was shorter in Low PSA Secretors (median OS = 26.7 vs. 46.0 months; hazard ratio = 2.465, 95% CI: 0.982-6.183). Progression-free survival (PFS) on post-biopsy treatment with AR-targeted therapy was shorter than with chemotherapy (median PFS 6.2 vs. 4.1 months). CONCLUSIONS: Low PSA secretion in relation to metastatic tumor burden may be a readily available clinical selection tool for de-differentiated mCRPC with molecular features consistent with t-SCNC. Prospective validation is warranted.


Subjects
Adenocarcinoma/blood; Neoplasm Staging; Prostatic Neoplasms, Castration-Resistant/blood; Retinoblastoma Binding Proteins/genetics; Tumor Suppressor Protein p53/genetics; Ubiquitin-Protein Ligases/genetics; Adenocarcinoma/genetics; Adenocarcinoma/secondary; Aged; Aged, 80 and over; Biomarkers, Tumor/blood; Biopsy; DNA, Neoplasm/genetics; Disease-Free Survival; Female; Follow-Up Studies; Genomics; Humans; Male; Middle Aged; Neoplasm Metastasis; Prostate-Specific Antigen/blood; Prostatic Neoplasms, Castration-Resistant/genetics; Prostatic Neoplasms, Castration-Resistant/pathology; Retinoblastoma Binding Proteins/metabolism; Retrospective Studies; Tumor Suppressor Protein p53/metabolism; Ubiquitin-Protein Ligases/metabolism
8.
Dev Cell ; 56(3): 292-309.e9, 2021 02 08.
Article in English | MEDLINE | ID: mdl-33321106

ABSTRACT

Haploinsufficiency of transcriptional regulators causes human congenital heart disease (CHD); however, the underlying CHD gene regulatory network (GRN) imbalances are unknown. Here, we define transcriptional consequences of reduced dosage of the CHD transcription factor, TBX5, in individual cells during cardiomyocyte differentiation from human induced pluripotent stem cells (iPSCs). We discovered highly sensitive dysregulation of TBX5-dependent pathways (including lineage decisions and genes associated with heart development, cardiomyocyte function, and CHD genetics) in discrete subpopulations of cardiomyocytes. Spatial transcriptomic mapping revealed chamber-restricted expression for many TBX5-sensitive transcripts. GRN analysis indicated that cardiac network stability, including vulnerable CHD-linked nodes, is sensitive to TBX5 dosage. A GRN-predicted genetic interaction between Tbx5 and Mef2c, manifesting as ventricular septation defects, was validated in mice. These results demonstrate exquisite and diverse sensitivity to TBX5 dosage in heterogeneous subsets of iPSC-derived cardiomyocytes and predict candidate GRNs for human CHDs, with implications for quantitative transcriptional regulation in disease.


Subjects
Gene Regulatory Networks; Haploinsufficiency/genetics; Heart Defects, Congenital/genetics; Models, Biological; T-Box Domain Proteins/genetics; Animals; Body Patterning/genetics; Cell Differentiation; Gene Dosage; Heart Ventricles/pathology; Humans; MEF2 Transcription Factors/metabolism; Mice; Mutation/genetics; Myocytes, Cardiac/metabolism; Transcription, Genetic
9.
Urol Oncol ; 38(12): 931.e9-931.e16, 2020 12.
Article in English | MEDLINE | ID: mdl-32624423

ABSTRACT

OBJECTIVES: The net oncogenic effect of β2-adrenergic receptor ADRB2, whose downstream elements induce neuroendocrine differentiation and whose expression is regulated by EZH2, is unclear. ADRB2 expression and associated clinical outcomes in metastatic castration-resistant prostate cancer (mCRPC) are unknown. METHODS AND MATERIALS: This was a retrospective analysis of a multi-center, prospectively enrolled cohort of mCRPC patients. Metastatic biopsies were obtained at progression, and specimens underwent laser capture microdissection and RNA-seq. ADRB2 expression was stratified by histology and clustering based on unsupervised hierarchical transcriptome analysis and correlated with EZH2 expression; an external dataset was used for validation. The association between ADRB2 expression and overall survival (OS) was assessed by log-rank test and a multivariable Cox proportional hazard model. RESULTS: One hundred and twenty-seven patients with progressive mCRPC had sufficient metastatic tumor for RNA-seq. ADRB2 expression was lowest in the small cell-enriched transcriptional cluster (P < 0.01) and correlated inversely with EZH2 expression (r = -0.28, P < 0.01). These findings were validated in an external cohort enriched for neuroendocrine differentiation. Patients with tumors harboring low ADRB2 expression (lowest quartile) had a shorter median OS than those with higher (9.5 vs. 20.5 months, P = 0.02). In multivariable analysis, low ADRB2 expression was associated with a trend toward shorter OS (HR for death = 1.54, 95%CI 0.98-2.44). Conversely, higher expression of upstream transcriptional regulator EZH2 was associated with shortened OS (HR for death = 3.01, 95%CI 1.12-8.09). CONCLUSIONS: Low ADRB2 expression is associated with neuroendocrine differentiation and is associated with shortened survival. EZH2 is a potential therapeutic target for preventing neuroendocrine transdifferentiation and improving outcomes in mCRPC. Further studies of agents targeting β-adrenergic signaling are warranted.


Subjects
Carcinoma, Neuroendocrine/genetics; Carcinoma, Small Cell/genetics; Gene Expression Regulation, Neoplastic; Prostatic Neoplasms, Castration-Resistant/genetics; Aged; Aged, 80 and over; Carcinoma, Neuroendocrine/mortality; Carcinoma, Small Cell/mortality; Down-Regulation; Humans; Male; Middle Aged; Prostatic Neoplasms, Castration-Resistant/mortality; Receptors, Adrenergic, beta-2; Retrospective Studies; Survival Rate
10.
Clin Cancer Res ; 26(17): 4616-4624, 2020 09 01.
Article in English | MEDLINE | ID: mdl-32727885

ABSTRACT

PURPOSE: The purpose of this study was to measure genomic changes that emerge with enzalutamide treatment using analyses of whole-genome sequencing and RNA sequencing. EXPERIMENTAL DESIGN: One hundred and one tumors from men with metastatic castration-resistant prostate cancer (mCRPC) who had not been treated with enzalutamide (n = 64) or who had enzalutamide-resistant mCRPC (n = 37) underwent whole-genome sequencing. Ninety-nine of these tumors also underwent RNA sequencing. We analyzed the genomes and transcriptomes of these mCRPC tumors. RESULTS: Copy number loss was more common than gain in enzalutamide-resistant tumors. Specifically, we identified 124 protein-coding genes that were more commonly lost in enzalutamide-resistant samples. These 124 genes included eight putative tumor suppressors located at nine distinct genomic regions. We demonstrated that focal deletion of the 17q22 locus that includes RNF43 and SRSF1 was not present in any patient with enzalutamide-naïve mCRPC but was present in 16% (6/37) of patients with enzalutamide-resistant mCRPC. 17q22 loss was associated with lower RNF43 and SRSF1 expression and poor overall survival from time of biopsy [median overall survival of 19.3 months in 17q22 intact vs. 8.9 months in 17q22 loss; HR, 3.44; 95% confidence interval (CI), 1.338-8.867; log-rank P = 0.006]. Finally, 17q22 loss was linked with activation of several targetable factors, including CDK1/2, Akt, and PLK1, demonstrating the potential therapeutic relevance of 17q22 loss in mCRPC. CONCLUSIONS: Copy number loss is common in enzalutamide-resistant tumors. Focal deletion of chromosome 17q22 defines a previously unappreciated molecular subset of enzalutamide-resistant mCRPC associated with poor clinical outcome.


Subjects
Benzamides/pharmacology; Biomarkers, Tumor/genetics; Chromosomes, Human, Pair 17/genetics; Drug Resistance, Neoplasm/genetics; Nitriles/pharmacology; Phenylthiohydantoin/pharmacology; Prostatic Neoplasms, Castration-Resistant/genetics; Benzamides/therapeutic use; Biopsy; DNA Copy Number Variations; Disease-Free Survival; Humans; Male; Nitriles/therapeutic use; Phenylthiohydantoin/therapeutic use; Prostate/pathology; Prostatic Neoplasms, Castration-Resistant/drug therapy; Prostatic Neoplasms, Castration-Resistant/mortality; Prostatic Neoplasms, Castration-Resistant/pathology; RNA-Seq; Survival Analysis
11.
Proc Natl Acad Sci U S A ; 117(22): 12315-12323, 2020 06 02.
Article in English | MEDLINE | ID: mdl-32424106

ABSTRACT

The androgen receptor (AR) antagonist enzalutamide is one of the principal treatments for men with castration-resistant prostate cancer (CRPC). However, not all patients respond, and resistance mechanisms are largely unknown. We hypothesized that genomic and transcriptional features from metastatic CRPC biopsies prior to treatment would be predictive of de novo treatment resistance. To this end, we conducted a phase II trial of enzalutamide treatment (160 mg/d) in 36 men with metastatic CRPC. Thirty-four patients were evaluable for the primary end point of a prostate-specific antigen (PSA)50 response (PSA decline ≥50% at 12 wk vs. baseline). Nine patients were classified as nonresponders (PSA decline <50%), and 25 patients were classified as responders (PSA decline ≥50%). Failure to achieve a PSA50 was associated with shorter progression-free survival, time on treatment, and overall survival, demonstrating PSA50's utility. Targeted DNA-sequencing was performed on 26 of 36 biopsies, and RNA-sequencing was performed on 25 of 36 biopsies that contained sufficient material. Using computational methods, we measured AR transcriptional function and performed gene set enrichment analysis (GSEA) to identify pathways whose activity state correlated with de novo resistance. TP53 gene alterations were more common in nonresponders, although this did not reach statistical significance (P = 0.055). AR gene alterations and AR expression were similar between groups. Importantly, however, transcriptional measurements demonstrated that specific gene sets, including those linked to low AR transcriptional activity and a stemness program, were activated in nonresponders. Our results suggest that patients whose tumors harbor this program should be considered for clinical trials testing rational agents to overcome de novo enzalutamide resistance.


Subjects
Antineoplastic Agents/administration & dosage; Drug Resistance, Neoplasm; Phenylthiohydantoin/analogs & derivatives; Prostatic Neoplasms, Castration-Resistant/genetics; Receptors, Androgen/administration & dosage; Receptors, Androgen/genetics; Aged; Aged, 80 and over; Benzamides; Gene Expression Profiling; Humans; Male; Middle Aged; Nitriles; Phenylthiohydantoin/administration & dosage; Prostate-Specific Antigen/metabolism; Prostatic Neoplasms, Castration-Resistant/drug therapy; Prostatic Neoplasms, Castration-Resistant/metabolism; Receptors, Androgen/metabolism
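
The PSA50 end point defined in the abstract above (PSA decline of at least 50% at 12 weeks versus baseline) reduces to simple arithmetic, sketched here. The function names and the example values are hypothetical and are not taken from the study.

```python
def psa_decline_fraction(baseline_psa, week12_psa):
    """Fractional PSA decline from baseline; negative if PSA rose."""
    if baseline_psa <= 0:
        raise ValueError("baseline PSA must be positive")
    return (baseline_psa - week12_psa) / baseline_psa


def is_psa50_responder(baseline_psa, week12_psa):
    """True when PSA fell by at least 50% relative to baseline."""
    return psa_decline_fraction(baseline_psa, week12_psa) >= 0.5


# Hypothetical values in ng/mL, for illustration only.
responder = is_psa50_responder(40.0, 20.0)      # decline of exactly 50%
nonresponder = is_psa50_responder(40.0, 25.0)   # decline of 37.5%
```

The threshold is inclusive, matching the "≥50%" wording in the abstract.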
12.
JCO Clin Cancer Inform ; 4: 147-159, 2020 02.
Article in English | MEDLINE | ID: mdl-32097025

ABSTRACT

PURPOSE: The analysis of cancer biology data involves extremely heterogeneous data sets, including information from RNA sequencing, genome-wide copy number, DNA methylation data reporting on epigenetic regulation, somatic mutations from whole-exome or whole-genome analyses, pathology estimates from imaging sections or subtyping, drug response or other treatment outcomes, and various other clinical and phenotypic measurements. Bringing these different resources into a common framework, with a data model that allows for complex relationships as well as dense vectors of features, will unlock integrated data set analysis. METHODS: We introduce the BioMedical Evidence Graph (BMEG), a graph database and query engine for discovery and analysis of cancer biology. The BMEG differs from other biologic data graphs in that sample-level molecular and clinical information is connected to reference knowledge bases. It combines gene expression and mutation data with drug-response experiments, pathway information databases, and literature-derived associations. RESULTS: The construction of the BMEG has resulted in a graph containing > 41 million vertices and 57 million edges. The BMEG system provides a graph query-based application programming interface to enable analysis, with client code available for Python, JavaScript, and R, and a server online at bmeg.io. Using this system, we have demonstrated several forms of cross-data set analysis to show the utility of the system. CONCLUSION: The BMEG is an evolving resource dedicated to enabling integrative analysis. We have demonstrated queries on the system that illustrate mutation significance analysis, drug-response machine learning, patient-level knowledge-base queries, and pathway-level analysis. We have compared the resulting graph to other available integrated graph systems and demonstrated that the BMEG is unique in the scale of the graph and the type of data it makes available.


Subjects
Antineoplastic Agents/therapeutic use; Biomarkers, Tumor/genetics; Computational Biology/methods; Gene Expression Regulation, Neoplastic/drug effects; Medical Informatics; Neoplasms/diagnosis; Neoplasms/drug therapy; Computer Graphics; Databases, Factual; Gene Regulatory Networks; Humans; Neoplasms/genetics; Signal Transduction
13.
Nat Commun ; 11(1): 729, 2020 02 05.
Article in English | MEDLINE | ID: mdl-32024854

ABSTRACT

The catalog of cancer driver mutations in protein-coding genes has greatly expanded in the past decade. However, non-coding cancer driver mutations are less well-characterized and only a handful of recurrent non-coding mutations, most notably TERT promoter mutations, have been reported. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we perform multi-faceted pathway and network analyses of non-coding mutations across 2583 whole cancer genomes from 27 tumor types compiled by the ICGC/TCGA PCAWG project. This work was motivated by the success of pathway and network analyses in prioritizing rare mutations in protein-coding genes. While few non-coding genomic elements are recurrently mutated in this cohort, we identify 93 genes harboring non-coding mutations that cluster into several modules of interacting proteins. Among these are promoter mutations associated with reduced mRNA expression in TP53, TLE4, and TCF4. We find that biological processes had variable proportions of coding and non-coding mutations, with chromatin remodeling and proliferation pathways altered primarily by coding mutations, while developmental pathways, including Wnt and Notch, are altered by both coding and non-coding mutations. RNA splicing is primarily altered by non-coding mutations in this cohort, and samples containing non-coding mutations in well-known RNA splicing factors exhibit gene expression signatures similar to those of samples with coding mutations in these genes. These analyses contribute a new repertoire of possible cancer genes and mechanisms that are altered by non-coding mutations and offer insights into additional cancer vulnerabilities that can be investigated for potential therapeutic treatments.


Subjects
Gene Expression Regulation, Neoplastic; Mutation; Neoplasms/genetics; RNA Splicing; Chromatin Assembly and Disassembly; Computational Biology/methods; Databases, Genetic; Genome, Human; Humans; Metabolic Networks and Pathways/genetics; Neoplasms/metabolism; Promoter Regions, Genetic
14.
Nat Biotechnol ; 38(1): 97-107, 2020 01.
Article in English | MEDLINE | ID: mdl-31919445

ABSTRACT

Tumor DNA sequencing data can be interpreted by computational methods that analyze genomic heterogeneity to infer evolutionary dynamics. A growing number of studies have used these approaches to link cancer evolution with clinical progression and response to therapy. Although the inference of tumor phylogenies is rapidly becoming standard practice in cancer genome analyses, standards for evaluating them are lacking. To address this need, we systematically assess methods for reconstructing tumor subclonality. First, we elucidate the main algorithmic problems in subclonal reconstruction and develop quantitative metrics for evaluating them. Then we simulate realistic tumor genomes that harbor all known clonal and subclonal mutation types and processes. Finally, we benchmark 580 tumor reconstructions, varying tumor read depth, tumor type and somatic variant detection. Our analysis provides a baseline for the establishment of gold-standard methods to analyze tumor heterogeneity.


Subjects
Algorithms; Neoplasms/pathology; Clone Cells; Computer Simulation; DNA Copy Number Variations/genetics; Gene Dosage; Genome; Humans; Mutation/genetics; Neoplasms/genetics; Polymorphism, Single Nucleotide/genetics; Reference Standards
15.
Pac Symp Biocomput ; 25: 343-354, 2020.
Article in English | MEDLINE | ID: mdl-31797609

ABSTRACT

Cancer genome projects have produced multidimensional datasets on thousands of samples. Yet, depending on the tumor type, 5-50% of samples have no known driving event. We introduce a semi-supervised method called Learning UnRealized Events (LURE) that uses a progressive label learning framework and minimum spanning analysis to predict cancer drivers based on their altered samples sharing a gene expression signature with the samples of a known event. We demonstrate the utility of the method on the TCGA Pan-Cancer Atlas dataset for which it produced a high-confidence result relating 59 new connections to 18 known mutation events including alterations in the same gene, family, and pathway. We give examples of predicted drivers involved in TP53, telomere maintenance, and MAPK/RTK signaling pathways. LURE identifies connections between genes with no known prior relationship, some of which may offer clues for targeting specific forms of cancer. Code and Supplemental Material are available on the LURE website: https://sysbiowiki.soe.ucsc.edu/lure.


Subjects
Computational Biology; Neoplasms; Humans; Mutation; Neoplasms/genetics
16.
Nat Commun ; 10(1): 4899, 2019 10 25.
Article in English | MEDLINE | ID: mdl-31653878

ABSTRACT

The maintenance and transition of cellular states are controlled by biological processes. Here we present a gene set-based transformation of single cell RNA-Seq data into biological process activities that provides a robust description of cellular states. Moreover, as these activities represent species-independent descriptors, they facilitate the alignment of single cell states across different organisms.


Subjects
Computational Biology/methods; Gene Expression Profiling/methods; Sequence Analysis, RNA/methods; Animals; Gene Expression; Gene Expression Regulation, Developmental/genetics; Humans; Leukocytes, Mononuclear/metabolism; Mice; Mouse Embryonic Stem Cells/metabolism; Signal-To-Noise Ratio; Single-Cell Analysis/methods; Systems Biology; Zebrafish/genetics
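
As a rough illustration of a gene set-based transformation like the one this abstract describes, the sketch below averages a cell's expression values over the members of each gene set. The abstract does not state the authors' actual scoring scheme. The simple mean, the function name, and the toy gene and process names are assumptions made for illustration.

```python
def gene_set_activities(expression, gene_sets):
    """Map one cell's gene-level expression to per-gene-set activity scores.

    Assumption: activity is the mean expression of the set's measured genes;
    the published method may score gene sets differently.
    """
    activities = {}
    for name, genes in gene_sets.items():
        values = [expression[gene] for gene in genes if gene in expression]
        if values:  # skip gene sets with no measured member genes
            activities[name] = sum(values) / len(values)
    return activities


# Hypothetical single-cell profile and gene sets, for illustration only.
cell = {"GENE1": 2.0, "GENE2": 4.0, "GENE3": 0.0}
sets = {"process_x": ["GENE1", "GENE2"], "process_y": ["GENE3", "GENE9"]}
result = gene_set_activities(cell, sets)
```

Because the output is indexed by process name rather than by gene, profiles from different species can be compared once each species' genes are assigned to the same processes. This is the property the abstract highlights.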
17.
Genome Biol ; 20(1): 195, 2019 09 10.
Article in English | MEDLINE | ID: mdl-31506093

ABSTRACT

Challenges are achieving broad acceptance for addressing many biomedical questions and enabling tool assessment. But ensuring that the methods evaluated are reproducible and reusable is complicated by the diversity of software architectures, input and output file formats, and computing environments. To mitigate these problems, some challenges have leveraged new virtualization and compute methods, requiring participants to submit cloud-ready software packages. We review recent data challenges with innovative approaches to model reproducibility and data sharing, and outline key lessons for improving quantitative biomedical data analysis through crowd-sourced benchmarking challenges.


Subjects
Algorithms; Benchmarking; Information Dissemination; Models, Biological; Reproducibility of Results
18.
Pac Symp Biocomput ; 24: 136-147, 2019.
Article in English | MEDLINE | ID: mdl-30864317

ABSTRACT

Cancer is a complex collection of diseases that are to some degree unique to each patient. Precision oncology aims to identify the best drug treatment regime using molecular data on tumor samples. While omics-level data is becoming more widely available for tumor specimens, the datasets upon which computational learning methods can be trained vary in coverage from sample to sample and from data type to data type. Methods that can 'connect the dots' to leverage more of the information provided by these studies could offer major advantages for maximizing predictive potential. We introduce a multi-view machine-learning strategy called PLATYPUS that builds 'views' from multiple data sources that are all used as features for predicting patient outcomes. We show that a learning strategy that finds agreement across the views on unlabeled data increases the performance of the learning methods over any single view. We illustrate the power of the approach by deriving signatures for drug sensitivity in a large cancer cell line database. Code and additional information are available from the PLATYPUS website https://sysbiowiki.soe.ucsc.edu/platypus.


Subjects
Drug Resistance, Neoplasm; Machine Learning; Neoplasms/drug therapy; Antineoplastic Agents/therapeutic use; Cell Line, Tumor; Computational Biology/methods; Databases, Factual; Drug Resistance, Neoplasm/genetics; Humans; Information Storage and Retrieval; Machine Learning/statistics & numerical data; Neoplasms/genetics; Patient-Specific Modeling; Pharmacogenomic Variants; Precision Medicine; Software; Supervised Machine Learning/statistics & numerical data
20.
Genome Biol ; 19(1): 188, 2018 11 06.
Article in English | MEDLINE | ID: mdl-30400818

ABSTRACT

BACKGROUND: The phenotypes of cancer cells are driven in part by somatic structural variants. Structural variants can initiate tumors, enhance their aggressiveness, and provide unique therapeutic opportunities. Whole-genome sequencing of tumors can allow exhaustive identification of the specific structural variants present in an individual cancer, facilitating both clinical diagnostics and the discovery of novel mutagenic mechanisms. A plethora of somatic structural variant detection algorithms have been created to enable these discoveries; however, there are no systematic benchmarks of them. Rigorous performance evaluation of somatic structural variant detection methods has been challenged by the lack of gold standards, extensive resource requirements, and difficulties arising from the need to share personal genomic information. RESULTS: To facilitate structural variant detection algorithm evaluations, we create a robust simulation framework for somatic structural variants by extending the BAMSurgeon algorithm. We then organize and enable a crowdsourced benchmarking within the ICGC-TCGA DREAM Somatic Mutation Calling Challenge (SMC-DNA). We report here the results of structural variant benchmarking on three different tumors, comprising 204 submissions from 15 teams. In addition to ranking methods, we identify characteristic error profiles of individual algorithms and general trends across them. Surprisingly, we find that ensembles of analysis pipelines do not always outperform the best individual method, indicating a need for new ways to aggregate somatic structural variant detection approaches. CONCLUSIONS: The synthetic tumors and somatic structural variant detection leaderboards remain available as a community benchmarking resource, and BAMSurgeon is available at https://github.com/adamewing/bamsurgeon.


Subjects
Benchmarking; Computer Simulation; Crowdsourcing; Genetic Variation; Genome, Human; Genomics/methods; Neoplasms/genetics; Algorithms; Databases, Genetic; High-Throughput Nucleotide Sequencing; Humans; Software