1.
Immunity ; 48(4): 812-830.e14, 2018 04 17.
Article in English | MEDLINE | ID: mdl-29628290

ABSTRACT

We performed an extensive immunogenomic analysis of more than 10,000 tumors comprising 33 diverse cancer types by utilizing data compiled by TCGA. Across cancer types, we identified six immune subtypes (wound healing, IFN-γ dominant, inflammatory, lymphocyte depleted, immunologically quiet, and TGF-β dominant) characterized by differences in macrophage or lymphocyte signatures, Th1:Th2 cell ratio, extent of intratumoral heterogeneity, aneuploidy, extent of neoantigen load, overall cell proliferation, expression of immunomodulatory genes, and prognosis. Specific driver mutations correlated with lower (CTNNB1, NRAS, or IDH1) or higher (BRAF, TP53, or CASP8) leukocyte levels across all cancers. Multiple control modalities of the intracellular and extracellular networks (transcription, microRNAs, copy number, and epigenetic processes) were involved in tumor-immune cell interactions, both across and within immune subtypes. Our immunogenomics pipeline to characterize these heterogeneous tumors and the resulting data are intended to serve as a resource for future targeted studies to further advance the field.


Subject(s)
Genomics/methods , Neoplasms , Adolescent , Adult , Aged , Aged, 80 and over , Child , Female , Humans , Interferon-gamma/genetics , Interferon-gamma/immunology , Macrophages/immunology , Male , Middle Aged , Neoplasms/classification , Neoplasms/genetics , Neoplasms/immunology , Prognosis , Th1-Th2 Balance/physiology , Transforming Growth Factor beta/genetics , Transforming Growth Factor beta/immunology , Wound Healing/genetics , Wound Healing/immunology , Young Adult
2.
Article in English | MEDLINE | ID: mdl-38466528

ABSTRACT

We identified a progenitor cell population highly enriched in samples from invasive and chemo-resistant carcinomas, characterized by a well-defined multigene signature including APOD, DCN, and LUM. This cell population has previously been labeled as consisting of inflammatory cancer-associated fibroblasts (iCAFs). The same signature characterizes naturally occurring fibro-adipogenic progenitors (FAPs) as well as stromal cells abundant in normal adipose tissue. Our analysis of human gene expression databases provides evidence that adipose stromal cells (ASCs) are recruited by tumors and undergo differentiation into CAFs during cancer progression to invasive and chemotherapy-resistant stages.

3.
Bioinformatics ; 40(5)2024 May 02.
Article in English | MEDLINE | ID: mdl-38662553

ABSTRACT

SUMMARY: Existing clustering methods for characterizing cell populations from single-cell RNA sequencing are constrained by several limitations stemming from the fact that clusters often cannot be homogeneous, particularly for transitioning populations. On the other hand, dominant cell populations within samples can be identified independently by their strong gene co-expression signatures using methods unrelated to partitioning. Here, we introduce a clustering method, CASCC (co-expression-assisted single-cell clustering), designed to improve biological accuracy using gene co-expression features identified using an unsupervised adaptive attractor algorithm. CASCC outperformed other methods as evidenced by multiple evaluation metrics, and our results suggest that CASCC can improve the analysis of single-cell transcriptomics, enabling potential new discoveries related to underlying biological mechanisms. AVAILABILITY AND IMPLEMENTATION: The CASCC R package is publicly available at https://github.com/LingyiC/CASCC and https://zenodo.org/doi/10.5281/zenodo.10648327.


Subject(s)
Algorithms , RNA-Seq , Single-Cell Analysis , Software , Single-Cell Analysis/methods , Cluster Analysis , RNA-Seq/methods , Humans , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Single-Cell Gene Expression Analysis
5.
PLoS Comput Biol ; 17(7): e1009228, 2021 07.
Article in English | MEDLINE | ID: mdl-34283835

ABSTRACT

During the last ten years, many studies have referred to a particular type of cancer-associated fibroblasts associated with poor prognosis, invasiveness, metastasis and resistance to therapy in multiple cancer types, characterized by a gene expression signature with prominent presence of genes COL11A1, THBS2 and INHBA. Identifying the underlying biological mechanisms responsible for their creation may facilitate the discovery of targets for potential pan-cancer therapeutics. Using a novel computational approach for single-cell gene expression data analysis identifying the dominant cell populations in a sequence of samples from patients at various stages, we conclude that these fibroblasts are produced by a pan-cancer cellular transition originating from a particular type of adipose-derived stromal cells naturally present in the stromal vascular fraction of normal adipose tissue, having a characteristic gene expression signature. Focusing on a rich pancreatic cancer dataset, we provide a detailed description of the continuous modification of the gene expression profiles of cells as they transition from APOD-expressing adipose-derived stromal cells to COL11A1-expressing cancer-associated fibroblasts, identifying the key genes that participate in this transition. These results also provide an explanation for the well-known fact that the adipose microenvironment contributes to cancer progression.


Subject(s)
Biomarkers, Tumor/genetics , Cancer-Associated Fibroblasts/metabolism , Collagen Type XI/genetics , Neoplasm Invasiveness/genetics , Adipose Tissue/metabolism , Adipose Tissue/pathology , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Cancer-Associated Fibroblasts/pathology , Carcinoma, Pancreatic Ductal/genetics , Carcinoma, Pancreatic Ductal/pathology , Computational Biology , Databases, Factual , Databases, Genetic , Disease Progression , Female , Gene Expression Regulation, Neoplastic , Head and Neck Neoplasms/genetics , Head and Neck Neoplasms/pathology , Humans , Lung Neoplasms/genetics , Lung Neoplasms/pathology , Mesenchymal Stem Cells/metabolism , Mesenchymal Stem Cells/pathology , Neoplasm Invasiveness/pathology , Neoplasm Invasiveness/prevention & control , Ovarian Neoplasms/genetics , Ovarian Neoplasms/pathology , Pancreatic Neoplasms/genetics , Pancreatic Neoplasms/pathology , Single-Cell Analysis , Stromal Cells/metabolism , Stromal Cells/pathology , Transcriptome , Tumor Microenvironment/genetics
6.
Bioinformatics ; 36(11): 3588-3589, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32108864

ABSTRACT

SUMMARY: We developed 2DImpute, an imputation method for correcting false zeros (known as dropouts) in single-cell RNA-sequencing (scRNA-seq) data. It features preventing excessive correction by predicting the false zeros and imputing their values by making use of the interrelationships between both genes and cells in the expression matrix. We showed that 2DImpute outperforms several leading imputation methods by applying it on datasets from various scRNA-seq protocols. AVAILABILITY AND IMPLEMENTATION: The R package of 2DImpute is freely available at GitHub (https://github.com/zky0708/2DImpute). CONTACT: d.anastassiou@columbia.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
RNA-Seq , Software , Sequence Analysis, RNA , Single-Cell Analysis , Exome Sequencing
7.
PLoS Comput Biol ; 9(2): e1002920, 2013.
Article in English | MEDLINE | ID: mdl-23468608

ABSTRACT

Mining gene expression profiles has proven valuable for identifying signatures serving as surrogates of cancer phenotypes. However, the similarities of such signatures across different cancer types have not been strong enough to conclude that they represent a universal biological mechanism shared among multiple cancer types. Here we present a computational method for generating signatures using an iterative process that converges to one of several precise attractors defining signatures representing biomolecular events, such as cell transdifferentiation or the presence of an amplicon. By analyzing rich gene expression datasets from different cancer types, we identified several such biomolecular events, some of which are universally present in all tested cancer types in nearly identical form. Although the method is unsupervised, we show that it often leads to attractors with strong phenotypic associations. We present several such multi-cancer attractors, focusing on three that are prominent and sharply defined in all cases: a mesenchymal transition attractor strongly associated with tumor stage, a mitotic chromosomal instability attractor strongly associated with tumor grade, and a lymphocyte-specific attractor.


Subject(s)
Computational Biology/methods , Models, Biological , Neoplasms/genetics , Algorithms , Data Mining , Databases, Genetic , Epithelial-Mesenchymal Transition , Gene Expression Profiling/methods , Genome/genetics , Humans , Kaplan-Meier Estimate , Kinetochores , Mitosis/genetics , Neoplasms/metabolism , Neoplasms/pathology , Oncogenes , Phenotype , Prognosis
8.
Cancer Res ; 84(5): 648-649, 2024 03 04.
Article in English | MEDLINE | ID: mdl-38437636

ABSTRACT

Cancer aggressiveness has been linked with obesity, and studies have shown that adipose tissue can enhance cancer progression. In this issue of Cancer Research, Hosni and colleagues discover a paracrine mechanism mediated by adipocyte precursor cells through which urothelial carcinomas become resistant to erdafitinib, a recently approved therapy inhibiting fibroblast growth factor receptors (FGFR). They identified neuregulin 1 (NRG1) secreted by adipocyte precursor cells as an activator of HER3 signaling that enables resistance. The NRG1-mediated FGFR inhibitor resistance was amenable to intervention with pertuzumab, an antibody blocking the NRG1/HER3 axis. To investigate the nature of the resistance-associated NRG1-expressing cells in human patients, the authors analyzed published single-cell RNA sequencing data and observed that such cells appear in a cluster assigned as inflammatory cancer-associated fibroblasts (iCAF). Notably, the gene signature corresponding to these CAFs is highly similar to that shared by adipose stromal cells (ASC) in fat tissue and fibro-adipogenic progenitors (FAP) in skeletal muscle of cancer-free individuals. Because fibroblasts with the ASC/FAP signature are enriched in various carcinomas, it is possible that the paracrine signaling conferred by NRG1 is a pan-cancer mechanism of FGFR inhibitor resistance and tumor aggressiveness. See related article by Hosni et al., p. 725.


Subject(s)
Cancer-Associated Fibroblasts , Carcinoma, Transitional Cell , Humans , Adipocytes , Adipose Tissue , Stromal Cells
9.
Nat Biotechnol ; 2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38862616

ABSTRACT

Subclonal reconstruction algorithms use bulk DNA sequencing data to quantify parameters of tumor evolution, allowing an assessment of how cancers initiate, progress and respond to selective pressures. We launched the ICGC-TCGA (International Cancer Genome Consortium-The Cancer Genome Atlas) DREAM Somatic Mutation Calling Tumor Heterogeneity and Evolution Challenge to benchmark existing subclonal reconstruction algorithms. This 7-year community effort used cloud computing to benchmark 31 subclonal reconstruction algorithms on 51 simulated tumors. Algorithms were scored on seven independent tasks, leading to 12,061 total runs. Algorithm choice influenced performance substantially more than tumor features but purity-adjusted read depth, copy-number state and read mappability were associated with the performance of most algorithms on most tasks. No single algorithm was a top performer for all seven tasks and existing ensemble strategies were unable to outperform the best individual methods, highlighting a key research need. All containerized methods, evaluation code and datasets are available to support further assessment of the determinants of subclonal reconstruction accuracy and development of improved methods to understand tumor evolution.

10.
BMC Bioinformatics ; 14: 270, 2013 Sep 08.
Article in English | MEDLINE | ID: mdl-24010487

ABSTRACT

BACKGROUND: DNA pooling constitutes a cost-effective alternative in genome-wide association studies. In DNA pooling, equimolar amounts of DNA from different individuals are mixed into one sample and the frequency of each allele in each position is observed in a single genotype experiment. The identification of haplotype frequencies from pooled data in addition to single locus analysis is of separate interest within these studies as haplotypes could increase statistical power and provide additional insight. RESULTS: We developed a method for maximum-parsimony haplotype frequency estimation from pooled DNA data based on the sparse representation of the DNA pools in a dictionary of haplotypes. Extensions to scenarios where data are noisy or even missing are also presented. The resulting method is first applied to simulated data based on the haplotypes and their associated frequencies of the AGT gene. We further evaluate our methodology on datasets consisting of SNPs from the first 7 Mb of the HapMap CEU population. Noise and missing data were further introduced in the datasets in order to test the extensions of the proposed method. Both HIPPO and HAPLOPOOL were also applied to these datasets to compare performances. CONCLUSIONS: We evaluate our methodology on scenarios where pooling is more efficient relative to individual genotyping; that is, in datasets that contain pools with a small number of individuals. We show that in such scenarios our methodology outperforms state-of-the-art methods such as HIPPO and HAPLOPOOL.


Subject(s)
DNA/chemistry , Gene Frequency/genetics , Genomics/methods , Haplotypes/genetics , Algorithms , DNA/genetics , Databases, Genetic , HapMap Project , Humans , Polymorphism, Single Nucleotide/genetics
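
The pooled-DNA abstract above describes a pool as a sparse combination of entries from a haplotype dictionary. The sketch below illustrates only the forward direction of that idea: given a few haplotypes and their frequencies in a pool, it computes the per-SNP allele frequencies that a pooled genotyping experiment would observe. It is not the authors' estimation method; the function name `pooled_allele_frequencies`, the 0/1 allele coding, and the example values are assumptions made for illustration.

```python
def pooled_allele_frequencies(haplotypes, frequencies):
    """Per-SNP frequency of allele 1 in a pool.

    haplotypes  -- list of equal-length tuples of 0/1 alleles (the dictionary
                   entries actually present in the pool; hypothetical coding)
    frequencies -- matching list of haplotype frequencies that sum to 1
    """
    n_snps = len(haplotypes[0])
    return [
        sum(freq * hap[snp] for hap, freq in zip(haplotypes, frequencies))
        for snp in range(n_snps)
    ]


# Hypothetical pool of 2 individuals (4 haplotypes) over 3 SNPs:
# (0,0,1) appears twice, (1,0,1) once, (1,1,0) once.
HAPLOTYPES = [(0, 0, 1), (1, 0, 1), (1, 1, 0)]
FREQUENCIES = [0.5, 0.25, 0.25]

observed = pooled_allele_frequencies(HAPLOTYPES, FREQUENCIES)
```

Estimating haplotype frequencies is the inverse problem: finding a small set of dictionary entries and frequencies that reproduces the observed allele frequencies.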
11.
Ann Hum Genet ; 76(4): 312-25, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22607042

ABSTRACT

Many large genome-wide association studies include nuclear families with more than one child (trio families), allowing for analysis of differences between siblings (sib pair analysis). Statistical power can be increased when haplotypes are used instead of genotypes. Currently, haplotype inference in families with more than one child can be performed either using the familial information or statistical information derived from the population samples but not both. Building on our recently proposed tree-based deterministic framework (TDS) for trio families, we augment its applicability to general nuclear families. We impose a minimum recombinant approach locally and independently on each multiple children family, while resorting to the population-derived information to solve the remaining ambiguities. Thus our framework incorporates all available information (familial and population) in a given study. We demonstrate that using all the constraints in our approach we can have gains in the accuracy as opposed to breaking the multiple children families to separate trios and resorting to a trio inference algorithm or phasing each family in isolation. We believe that our proposed framework could be the method of choice for haplotype inference in studies that include nuclear families with multiple children. Our software (tds2.0) is downloadable from www.ee.columbia.edu/~anastas/tds.


Subject(s)
Haplotypes , Models, Genetic , Nuclear Family , Algorithms , Humans , Monte Carlo Method , Pedigree , Siblings
12.
BMC Genet ; 13: 94, 2012 Oct 30.
Article in English | MEDLINE | ID: mdl-23110720

ABSTRACT

BACKGROUND: Typically, the first phase of a genome-wide association study (GWAS) includes genotyping across hundreds of individuals and validation of the most significant SNPs. Allelotyping of pooled genomic DNA is a common approach to reduce the overall cost of the study. Knowledge of haplotype structure can provide additional information to single locus analyses. Several methods have been proposed for estimating haplotype frequencies in a population from pooled DNA data. RESULTS: We introduce a technique for haplotype frequency estimation in a population from pooled DNA samples focusing on datasets containing a small number of individuals per pool (2 or 3 individuals) and a large number of markers. We compare our method with the publicly available state-of-the-art algorithms HIPPO and HAPLOPOOL on datasets of varying number of pools and marker sizes. We demonstrate that our algorithm provides improvements in terms of accuracy and computational time over competing methods for large number of markers while demonstrating comparable performance for smaller marker sizes. Our method is implemented in the "Tree-Based Deterministic Sampling Pool" (TDSPool) package which is available for download at http://www.ee.columbia.edu/~anastas/tdspool. CONCLUSIONS: Using a tree-based deterministic sampling technique we present an algorithm for haplotype frequency estimation from pooled data. Our method demonstrates superior performance in datasets with large number of markers and could be the method of choice for haplotype frequency estimation in such datasets.


Subject(s)
Algorithms , Gene Frequency , Haplotypes , DNA , Databases, Genetic , Gene Pool , Genetic Markers , Genome-Wide Association Study , Humans , Models, Genetic
13.
Hum Genet ; 129(2): 161-76, 2011 Feb.
Article in English | MEDLINE | ID: mdl-21076979

ABSTRACT

The human leukocyte antigen (HLA) class II genes HLA-DRB1, -DQA1 and -DQB1 are the strongest genetic factors for type 1 diabetes (T1D). Additional loci in the major histocompatibility complex (MHC) are difficult to identify due to the region's high gene density and complex linkage disequilibrium (LD). To facilitate the association analysis, two novel algorithms were implemented in this study: one for phasing the multi-allelic HLA genotypes in trio families, and one for partitioning the HLA strata in conditional testing. Screening and replication were performed on two large and independent datasets: the Wellcome Trust Case-Control Consortium (WTCCC) dataset of 2,000 cases and 1,504 controls, and the T1D Genetics Consortium (T1DGC) dataset of 2,300 nuclear families. After imputation, the two datasets have 1,941 common SNPs in the MHC, of which 22 were successfully tested and replicated based on the statistical testing stratifying on the detailed DRB1 and DQB1 genotypes. Further conditional tests using the combined dataset confirmed eight novel SNP associations around 31.3 Mb on chromosome 6 (rs3094663, p = 1.66 × 10^-11 and rs2523619, p = 2.77 × 10^-10 conditional on the DR/DQ genotypes). A subsequent LD analysis established TCF19, POU5F1, CCHCR1 and PSORS1C1 as potential causal genes for the observed association.


Subject(s)
Diabetes Mellitus, Type 1/genetics , Polymorphism, Single Nucleotide , Transcription Factors/genetics , Case-Control Studies , Female , Humans , Intracellular Signaling Peptides and Proteins/genetics , Male , Octamer Transcription Factor-3/genetics , Proteins/genetics
14.
BMC Cancer ; 11: 529, 2011 Dec 30.
Article in English | MEDLINE | ID: mdl-22208948

ABSTRACT

BACKGROUND: The biological mechanisms underlying cancer cell motility and invasiveness remain unclear, although it has been hypothesized that they involve some type of epithelial-mesenchymal transition (EMT). METHODS: We used xenograft models of human cancer cells in immunocompromised mice, profiling the harvested tumors separately with species-specific probes and computationally analyzing the results. RESULTS: Here we show that human cancer cells express in vivo a precise multi-cancer invasion-associated gene expression signature that prominently includes many EMT markers, among them the transcription factor Slug, fibronectin, and α-SMA. We found that human, but not mouse, cells express the signature and Slug is the only upregulated EMT-inducing transcription factor. The signature is also present in samples from many publicly available cancer gene expression datasets, suggesting that it is produced by the cancer cells themselves in multiple cancer types, including nonepithelial cancers such as neuroblastoma. Furthermore, we found that the presence of the signature in human xenografted cells was associated with a downregulation of adipocyte markers in the mouse tissue adjacent to the invasive tumor, suggesting that the signature is triggered by contextual microenvironmental interactions when the cancer cells encounter adipocytes, as previously reported. CONCLUSIONS: The known, precise and consistent gene composition of this cancer mesenchymal transition signature, particularly when combined with simultaneous analysis of the adjacent microenvironment, provides unique opportunities for shedding light on the underlying mechanisms of cancer invasiveness as well as identifying potential diagnostic markers and targets for metastasis-inhibiting therapeutics.


Subject(s)
Epithelial-Mesenchymal Transition/genetics , Neoplasms/metabolism , Transcription Factors/metabolism , Animals , Cell Line, Tumor , Collagen Type XI/metabolism , Gene Expression Profiling , Humans , Mice , Microarray Analysis , Neoplasm Invasiveness/genetics , Neoplasms/genetics , Reverse Transcriptase Polymerase Chain Reaction/methods , Snail Family Transcription Factors , Species Specificity
15.
IEEE/ACM Trans Comput Biol Bioinform ; 18(6): 2271-2280, 2021.
Article in English | MEDLINE | ID: mdl-32070995

ABSTRACT

Bulk samples of the same patient are heterogeneous in nature, comprising different subpopulations (subclones) of cancer cells. Cells in a tumor subclone are characterized by a unique mutational genotype profile. Resolving tumor heterogeneity by estimating the genotypes, cellular proportions and the number of subclones present in the tumor can help in understanding cancer progression and treatment. We present a novel method, ChaClone2, to efficiently deconvolve the observed variant allele fractions (VAFs), with consideration for possible effects from copy number aberrations at the mutation loci. Our method describes a state-space formulation of the feature allocation model, deconvolving the observed VAFs from samples of the same patient into three matrices: subclonal total and variant copy numbers for mutated genes, and proportions of subclones in each sample. We describe an efficient sequential Monte Carlo (SMC) algorithm to estimate these matrices. Extensive simulation shows that ChaClone2 yields better accuracy when compared with other state-of-the-art methods for addressing similar problems and that it offers scalability to large datasets. In addition, ChaClone2 allows the model parameter estimates to be refined whenever new mutation data from freshly sequenced genomic locations become available. MATLAB code and datasets are available to download at: https://github.com/moyanre/method2.


Subject(s)
Computational Biology/methods , DNA Copy Number Variations/genetics , Mutation/genetics , Neoplasms/genetics , Algorithms , Bayes Theorem , Genetic Heterogeneity , Humans , Monte Carlo Method , Stochastic Processes
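
The ChaClone2 abstract above names three quantities: subclone proportions, variant copy numbers, and total copy numbers. To make their relationship to a variant allele fraction concrete, the sketch below computes the VAF one would expect at a single locus under a commonly used mixture assumption (proportion-weighted variant copies divided by proportion-weighted total copies). This formula and the function name `expected_vaf` are illustrative assumptions, not taken from the paper, and the sketch does not perform any deconvolution or sequential Monte Carlo inference.

```python
def expected_vaf(proportions, variant_copies, total_copies):
    """Expected variant allele fraction at one locus in one sample.

    proportions    -- fraction of cells in each subclone (normal cells can be
                      listed as a subclone with zero variant copies)
    variant_copies -- copies of the mutated allele per cell in each subclone
    total_copies   -- total copies of the locus per cell in each subclone
    """
    variant = sum(w * z for w, z in zip(proportions, variant_copies))
    total = sum(w * c for w, c in zip(proportions, total_copies))
    return variant / total


# Hypothetical sample: 40% normal cells and 60% of one diploid subclone
# carrying a heterozygous mutation.
vaf = expected_vaf([0.4, 0.6], [0, 1], [2, 2])
```

Under this assumption a heterozygous mutation in a diploid subclone making up 60% of the sample gives a VAF of 0.3; deconvolution methods work backwards from many such observed values.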
16.
Bioinformatics ; 25(11): 1445-6, 2009 Jun 01.
Article in English | MEDLINE | ID: mdl-19297347

ABSTRACT

SUMMARY: We present a visualization tool applied on genome-wide association data, revealing disease-associated haplotypes, epistatically interacting loci, as well as providing visual signatures of multivariate correlations of genetic markers with respect to a phenotype. AVAILABILITY: Freely available on the web at: (http://www.ee.columbia.edu/~anastas/sdplots).


Subject(s)
Computational Biology/methods , Computer Graphics , Phenotype , Polymorphism, Single Nucleotide/genetics , Computer Graphics/standards , Genome-Wide Association Study , Haplotypes , Software , User-Computer Interface
17.
BMC Genet ; 11: 78, 2010 Aug 23.
Article in English | MEDLINE | ID: mdl-20727218

ABSTRACT

BACKGROUND: In genome-wide association studies, thousands of individuals are genotyped in hundreds of thousands of single nucleotide polymorphisms (SNPs). Statistical power can be increased when haplotypes, rather than three-valued genotypes, are used in analysis, so the problem of haplotype phase inference (phasing) is particularly relevant. Several phasing algorithms have been developed for data from unrelated individuals, based on different models, some of which have been extended to father-mother-child "trio" data. RESULTS: We introduce a technique for phasing trio datasets using a tree-based deterministic sampling scheme. We have compared our method with publicly available algorithms PHASE v2.1, BEAGLE v3.0.2 and 2SNP v1.7 on datasets of varying number of markers and trios. We have found that the computational complexity of PHASE makes it prohibitive for routine use; on the other hand 2SNP, though the fastest method for small datasets, was significantly inaccurate. We have shown that our method outperforms BEAGLE in terms of speed and accuracy for small to intermediate dataset sizes in terms of number of trios for all marker sizes examined. Our method is implemented in the "Tree-Based Deterministic Sampling" (TDS) package, available for download at http://www.ee.columbia.edu/~anastas/tds . CONCLUSIONS: Using a tree-based deterministic sampling technique, we present an intuitive and conceptually simple phasing algorithm for trio data. The trade-off between speed and accuracy achieved by our algorithm makes it a strong candidate for routine use on trio datasets.


Subject(s)
Algorithms , Genome-Wide Association Study/methods , Haplotypes , Humans , Models, Genetic , Polymorphism, Single Nucleotide
18.
Sci Rep ; 10(1): 17199, 2020 10 14.
Article in English | MEDLINE | ID: mdl-33057153

ABSTRACT

Analysis of large gene expression datasets from biopsies of cancer patients can identify co-expression signatures representing particular biomolecular events in cancer. Some of these signatures involve genomically co-localized genes resulting from the presence of copy number alterations (CNAs), for which analysis of the expression of the underlying genes provides valuable information about their combined role as oncogenes or tumor suppressor genes. Here we focus on the discovery and interpretation of such signatures that are present in multiple cancer types due to driver amplifications and deletions in particular regions of the genome after doing a comprehensive analysis combining both gene expression and CNA data from The Cancer Genome Atlas.


Subject(s)
DNA Copy Number Variations/genetics , Neoplasms/genetics , Oncogenes/genetics , Data Analysis , Gene Dosage/genetics , Gene Expression/genetics , Genomics/methods , Humans
19.
Nat Biotechnol ; 38(1): 97-107, 2020 01.
Article in English | MEDLINE | ID: mdl-31919445

ABSTRACT

Tumor DNA sequencing data can be interpreted by computational methods that analyze genomic heterogeneity to infer evolutionary dynamics. A growing number of studies have used these approaches to link cancer evolution with clinical progression and response to therapy. Although the inference of tumor phylogenies is rapidly becoming standard practice in cancer genome analyses, standards for evaluating them are lacking. To address this need, we systematically assess methods for reconstructing tumor subclonality. First, we elucidate the main algorithmic problems in subclonal reconstruction and develop quantitative metrics for evaluating them. Then we simulate realistic tumor genomes that harbor all known clonal and subclonal mutation types and processes. Finally, we benchmark 580 tumor reconstructions, varying tumor read depth, tumor type and somatic variant detection. Our analysis provides a baseline for the establishment of gold-standard methods to analyze tumor heterogeneity.


Subject(s)
Algorithms , Neoplasms/pathology , Clone Cells , Computer Simulation , DNA Copy Number Variations/genetics , Gene Dosage , Genome , Humans , Mutation/genetics , Neoplasms/genetics , Polymorphism, Single Nucleotide/genetics , Reference Standards
20.
Bioinformatics ; 24(1): 46-55, 2008 Jan 01.
Article in English | MEDLINE | ID: mdl-18024972

ABSTRACT

MOTIVATION: Conserved motifs often represent biological significance, providing insight on biological aspects such as gene transcription regulation, biomolecular secondary structure, presence of non-coding RNAs and evolution history. With the increasing number of sequenced genomic data, faster and more accurate tools are needed to automate the process of motif discovery. RESULTS: We propose a deterministic sequential Monte Carlo (DSMC) motif discovery technique based on the position weight matrix (PWM) model to locate conserved motifs in a given set of nucleotide sequences, and extend our model to search for instances of the motif with insertions/deletions. We show that the proposed method can be used to align the motif where there are insertions and deletions found in different instances of the motif, which cannot be satisfactorily done using other multiple alignment and motif discovery algorithms. AVAILABILITY: MATLAB code is available at http://www.ee.columbia.edu/~kcliang


Subject(s)
Algorithms , DNA/genetics , Pattern Recognition, Automated/methods , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Base Sequence , Computer Simulation , Models, Genetic , Models, Statistical , Molecular Sequence Data , Monte Carlo Method
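
The motif-discovery abstract above is built on the position weight matrix (PWM) model. As a rough illustration of that model only, the sketch below scores every window of a nucleotide sequence against a small PWM using log-odds and returns the best-scoring start position. It is not the DSMC method of the paper and does not handle insertions or deletions; the matrix values, the uniform background of 0.25, and the function names are assumptions made for the example.

```python
import math

BASES = "ACGT"


def log_odds(pwm, background=0.25):
    """Convert per-position base probabilities into log-odds scores."""
    return [{b: math.log(col[b] / background) for b in BASES} for col in pwm]


def best_pwm_hit(sequence, pwm):
    """Return (start, score) of the highest-scoring window, or None
    if the sequence is shorter than the motif."""
    scores = log_odds(pwm)
    width = len(scores)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(scores[i][base] for i, base in enumerate(window))
        if best is None or score > best[1]:
            best = (start, score)
    return best


# Hypothetical 3-column PWM favouring the motif "TAG".
PWM = [
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]

hit = best_pwm_hit("CCTAGCC", PWM)
```

Motif discovery is the harder problem of learning the matrix itself from unaligned sequences; scanning with a known matrix, as here, is only the scoring step.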