1.
Bioinformatics ; 37(11): 1535-1543, 2021 Jul 12.
Article in English | MEDLINE | ID: mdl-30768159

ABSTRACT

MOTIVATION: Intra-tumor heterogeneity is one of the key confounding factors in deciphering tumor evolution. Malignant cells exhibit variations in their gene expression, copy numbers and mutations even when originating from a single progenitor cell. Single cell sequencing of tumor cells has recently emerged as a viable option for unmasking the underlying tumor heterogeneity. However, extracting features from single cell genomic data in order to infer their evolutionary trajectory remains computationally challenging due to the extremely noisy and sparse nature of the data. RESULTS: Here we describe 'Dhaka', a variational autoencoder method which transforms single cell genomic data to a reduced dimension feature space that is more efficient in differentiating between (hidden) tumor subpopulations. Our method is general and can be applied to several different types of genomic data including copy number variation from scDNA-Seq and gene expression from scRNA-Seq experiments. We tested the method on synthetic and six single cell cancer datasets where the number of cells ranges from 250 to 6000 for each sample. Analysis of the resulting feature space revealed subpopulations of cells and their marker genes. The features are also able to infer the lineage and/or differentiation trajectory between cells, greatly improving upon prior methods suggested for feature extraction and dimensionality reduction of such data. AVAILABILITY AND IMPLEMENTATION: All the datasets used in the paper are publicly available, and the developed software package and supporting information are available on GitHub at https://github.com/MicrosoftGenomics/Dhaka. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
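
The abstract above names a variational autoencoder but does not give its architecture or loss. As a rough illustration only, the two ingredients common to any variational autoencoder (the reparameterized latent sample and the KL regularizer toward a standard normal prior) can be sketched in plain Python. The function names and the squared-error reconstruction term are assumptions for illustration, not details taken from Dhaka.

```python
import math
import random


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps drawn from N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def negative_elbo(x, x_reconstructed, mu, log_var):
    """Reconstruction error plus KL term.

    Squared error is an assumed stand-in for the reconstruction likelihood;
    the abstract does not say which one the authors use.
    """
    reconstruction = sum((a - b) ** 2 for a, b in zip(x, x_reconstructed))
    return reconstruction + kl_to_standard_normal(mu, log_var)
```

In a full model, an encoder network would produce `mu` and `log_var` for each cell and a decoder would produce `x_reconstructed` from the sampled latent vector; the low-dimensional `mu` is what would serve as the reduced feature space.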

2.
Proc Natl Acad Sci U S A ; 114(36): E7554-E7563, 2017 09 05.
Article in English | MEDLINE | ID: mdl-28784769

ABSTRACT

Translating the genetic and epigenetic heterogeneity underlying human cancers into therapeutic strategies is an ongoing challenge. Large-scale sequencing efforts have uncovered a spectrum of mutations in many hematologic malignancies, including acute myeloid leukemia (AML), suggesting that combinations of agents will be required to treat these diseases effectively. Combinatorial approaches will also be critical for combating the emergence of genetically heterogeneous subclones, rescue signals in the microenvironment, and tumor-intrinsic feedback pathways that all contribute to disease relapse. To identify novel and effective drug combinations, we performed ex vivo sensitivity profiling of 122 primary patient samples from a variety of hematologic malignancies against a panel of 48 drug combinations. The combinations were designed as drug pairs that target nonoverlapping biological pathways and comprise drugs from different classes, preferably with Food and Drug Administration approval. A combination ratio (CR) was derived for each drug pair, and CRs were evaluated with respect to diagnostic categories as well as against genetic, cytogenetic, and cellular phenotypes of specimens from the two largest disease categories: AML and chronic lymphocytic leukemia (CLL). Nearly all tested combinations involving a BCL2 inhibitor showed additional benefit in patients with myeloid malignancies, whereas select combinations involving PI3K, CSF1R, or bromodomain inhibitors showed preferential benefit in lymphoid malignancies. Expanded analyses of patients with AML and CLL revealed specific patterns of ex vivo drug combination efficacy that were associated with select genetic, cytogenetic, and phenotypic disease subsets, warranting further evaluation. These findings highlight the heuristic value of an integrated functional genomic approach to the identification of novel treatment strategies for hematologic malignancies.


Subject(s)
Antineoplastic Agents/therapeutic use , Hematologic Neoplasms/drug therapy , Leukemia, Lymphocytic, Chronic, B-Cell/drug therapy , Leukemia, Myeloid, Acute/drug therapy , Drug Combinations , Hematologic Neoplasms/metabolism , Humans , Leukemia, Lymphocytic, Chronic, B-Cell/metabolism , Leukemia, Myeloid, Acute/metabolism , Mutation/drug effects , Phosphatidylinositol 3-Kinases/metabolism , Proto-Oncogene Proteins c-bcl-2/metabolism , Receptors, Granulocyte-Macrophage Colony-Stimulating Factor/metabolism
3.
Cell Genom ; 1(2), 2021 Nov 10.
Article in English | MEDLINE | ID: mdl-34820659

ABSTRACT

Human biomedical datasets that are critical for research and clinical studies to benefit human health also often contain sensitive or potentially identifying information of individual participants. Thus, care must be taken when they are processed and made available to comply with ethical and regulatory frameworks and informed consent data conditions. To enable and streamline data access for these biomedical datasets, the Global Alliance for Genomics and Health (GA4GH) Data Use and Researcher Identities (DURI) work stream developed and approved the Data Use Ontology (DUO) standard. DUO is a hierarchical vocabulary of human and machine-readable data use terms that consistently and unambiguously represents a dataset's allowable data uses. DUO has been implemented by major international stakeholders such as the Broad and Sanger Institutes and is currently used in annotation of over 200,000 datasets worldwide. Using DUO in data management and access facilitates researchers' discovery and access of relevant datasets. DUO annotations increase the FAIRness of datasets and support data linkages using common data use profiles when integrating the data for secondary analyses. DUO is implemented in the Web Ontology Language (OWL) and, to increase community awareness and engagement, hosted in an open, centralized GitHub repository. DUO, together with the GA4GH Passport standard, offers a new, efficient, and streamlined data authorization and access framework that has enabled increased sharing of biomedical datasets worldwide.
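
To make the idea of a hierarchical, machine-readable data use vocabulary concrete, here is a minimal sketch of how a hierarchy lets software decide whether a proposed use falls within a dataset's permitted use. The three terms and their parent links are a toy hierarchy assumed for illustration; it is not the actual DUO OWL file, and real matching (for example, disease-specific terms and modifiers) involves more than this.

```python
# Toy parent links: narrower term -> broader term (illustrative assumption only).
PARENT = {
    "disease-specific research": "health/medical/biomedical research",
    "health/medical/biomedical research": "general research use",
}


def ancestors_and_self(term):
    """Return the term followed by each progressively broader term above it."""
    chain = [term]
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


def use_is_permitted(dataset_permission, proposed_use):
    """A proposed use is allowed if it equals, or is narrower than, the permitted term."""
    return dataset_permission in ancestors_and_self(proposed_use)
```

Under this toy hierarchy, a dataset annotated "general research use" would admit a disease-specific project, while a dataset annotated "disease-specific research" would not admit a general research request.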

4.
medRxiv ; 2020 Sep 17.
Article in English | MEDLINE | ID: mdl-32793919

ABSTRACT

T cells are involved in the early identification and clearance of viral infections and also support the development of antibodies by B cells. This central role for T cells makes them a desirable target for assessing the immune response to SARS-CoV-2 infection. Here, we combined two high-throughput immune profiling methods to create a quantitative picture of the T-cell response to SARS-CoV-2. First, at the individual level, we deeply characterized 3 acutely infected and 58 recovered COVID-19 subjects by experimentally mapping their CD8 T-cell response through antigen stimulation to 545 Human Leukocyte Antigen (HLA) class I presented viral peptides (class II data in a forthcoming study). Then, at the population level, we performed T-cell repertoire sequencing on 1,815 samples (from 1,521 COVID-19 subjects) as well as 3,500 controls to identify shared "public" T-cell receptors (TCRs) associated with SARS-CoV-2 infection from both CD8 and CD4 T cells. Collectively, our data reveal that CD8 T-cell responses are often driven by a few immunodominant, HLA-restricted epitopes. As expected, the T-cell response to SARS-CoV-2 peaks about one to two weeks after infection and is detectable for at least several months after recovery. As an application of these data, we trained a classifier to diagnose SARS-CoV-2 infection based solely on TCR sequencing from blood samples, and observed, at 99.8% specificity, high early sensitivity soon after diagnosis (Day 3-7 = 85.1% [95% CI = 79.9-89.7]; Day 8-14 = 94.8% [90.7-98.4]) as well as lasting sensitivity after recovery (Day 29+/convalescent = 95.4% [92.1-98.3]). These results demonstrate an approach to reliably assess the adaptive immune response both soon after viral antigenic exposure (before antibodies are typically detectable) as well as at later time points. This blood-based molecular approach to characterizing the cellular immune response has applications in clinical diagnostics as well as in vaccine development and monitoring.
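
Reporting sensitivity "at 99.8% specificity" means the decision cutoff is fixed from the control scores first, and sensitivity is then measured on the cases at that cutoff. The sketch below illustrates that procedure in general terms; the function names are hypothetical and it is not the classifier or the evaluation code from the study.

```python
import math


def threshold_for_specificity(control_scores, target_specificity):
    """Smallest cutoff such that at least the target fraction of controls score at or below it."""
    ordered = sorted(control_scores)
    # Number of controls that must be called negative to reach the target.
    k = math.ceil(target_specificity * len(ordered))
    return ordered[max(k, 1) - 1]


def specificity(control_scores, cutoff):
    """Fraction of controls called negative (score at or below the cutoff)."""
    return sum(s <= cutoff for s in control_scores) / len(control_scores)


def sensitivity(case_scores, cutoff):
    """Fraction of true cases called positive (score strictly above the cutoff)."""
    return sum(s > cutoff for s in case_scores) / len(case_scores)
```

With classifier scores for controls and for cases from a given time window (for example, days 3 to 7 after diagnosis), one would call `threshold_for_specificity` once on the controls and then `sensitivity` on each window's cases using the same cutoff.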

5.
Gigascience ; 8(12), 2019 12 01.
Article in English | MEDLINE | ID: mdl-31782791

ABSTRACT

BACKGROUND: Sugarcane cultivars are polyploid interspecific hybrids of giant genomes, typically with 10-13 sets of chromosomes from 2 Saccharum species. The ploidy, hybridity, and size of the genome, estimated to have >10 Gb, pose a challenge for sequencing. RESULTS: Here we present a gene space assembly of SP80-3280, including 373,869 putative genes and their potential regulatory regions. The alignment of single-copy genes in diploid grasses to the putative genes indicates that we could resolve 2-6 (up to 15) putative homo(eo)logs that are 99.1% identical within their coding sequences. Dissimilarities increase in their regulatory regions, and gene promoter analysis shows differences in regulatory elements within gene families that are expressed in a species-specific manner. We exemplify these differences for sucrose synthase (SuSy) and phenylalanine ammonia-lyase (PAL), 2 gene families central to carbon partitioning. SP80-3280 has particular regulatory elements involved in sucrose synthesis not found in the ancestor Saccharum spontaneum. PAL regulatory elements are found in co-expressed genes related to fiber synthesis within gene networks defined during plant growth and maturation. Comparison with sorghum reveals predominantly bi-allelic variations in sugarcane, consistent with the formation of 2 "subgenomes" after their divergence ∼3.8-4.6 million years ago and reveals single-nucleotide variants that may underlie their differences. CONCLUSIONS: This assembly represents a large step towards a whole-genome assembly of a commercial sugarcane cultivar. It includes a rich diversity of genes and homo(eo)logous resolution for a representative fraction of the gene space, relevant to improve biomass and food production.


Subject(s)
Contig Mapping/methods , Glucosyltransferases/genetics , Phenylalanine Ammonia-Lyase/genetics , Saccharum/growth & development , Biomass , Crops, Agricultural/genetics , Crops, Agricultural/growth & development , Genetic Variation , Genome Size , Genome, Plant , Multigene Family , Plant Proteins/genetics , Polyploidy , Promoter Regions, Genetic , Saccharum/genetics
6.
Sci Data ; 5: 180039, 2018 03 14.
Article in English | MEDLINE | ID: mdl-29537396

ABSTRACT

The volume of genomics and health data is growing rapidly, driven by sequencing for both research and clinical use. However, under current practices, the data is fragmented into many distinct datasets, and researchers must go through a separate application process for each dataset. This is time-consuming both for the researchers and the data stewards, and it reduces the velocity of research and new discoveries that could improve human health. We propose to simplify this process, by introducing a standard Library Card that identifies and authenticates researchers across all participating datasets. Each researcher would only need to apply once to establish their bona fides as a qualified researcher, and could then use the Library Card to access a wide range of datasets that use a compatible data access policy and authentication protocol.

7.
Eur J Hum Genet ; 26(12): 1721-1731, 2018 12.
Article in English | MEDLINE | ID: mdl-30069064

ABSTRACT

The Global Alliance for Genomics and Health (GA4GH) proposes a data access policy model, "registered access", to increase and improve access to data requiring an agreement to basic terms and conditions, such as the use of DNA sequence and health data in research. A registered access policy would enable a range of categories of users to gain access, starting with researchers and clinical care professionals. It would also facilitate general use and reuse of data but within the bounds of consent restrictions and other ethical obligations. In piloting registered access with the Scientific Demonstration data sharing projects of GA4GH, we provide additional ethics, policy and technical guidance to facilitate the implementation of this access model in an international setting.


Subject(s)
Access to Information , Genetics, Medical/standards , Genomics/standards , Information Dissemination , Genetics, Medical/ethics , Genetics, Medical/legislation & jurisprudence , Genomics/ethics , Genomics/legislation & jurisprudence , Humans , Licensure , Practice Guidelines as Topic
8.
Gigascience ; 8(12): 1–18, 2019.
Article in English | SES-SP, SES SP - Instituto Butantan, SES-SP | ID: but-ib17279


9.
Gigascience, v. 8, n. 12, p. 1-18, Nov. 2019
Article in English | SES-SP, SES SP - Instituto Butantan, SES-SP | ID: bud-2873

