Search | BVS Search Portal

Dhaka: variational autoencoder for unmasking tumor heterogeneity from single cell genomic data.

Rashid, Sabrina; Shah, Sohrab; Bar-Joseph, Ziv; Pandya, Ravi.

Bioinformatics ; 37(11): 1535-1543, 2021 Jul 12.

Article in English | MEDLINE | ID: mdl-30768159

ABSTRACT

MOTIVATION: Intra-tumor heterogeneity is one of the key confounding factors in deciphering tumor evolution. Malignant cells exhibit variations in their gene expression, copy numbers and mutations even when originating from a single progenitor cell. Single cell sequencing of tumor cells has recently emerged as a viable option for unmasking the underlying tumor heterogeneity. However, extracting features from single cell genomic data in order to infer their evolutionary trajectory remains computationally challenging due to the extremely noisy and sparse nature of the data. RESULTS: Here we describe 'Dhaka', a variational autoencoder method which transforms single cell genomic data to a reduced dimension feature space that is more efficient in differentiating between (hidden) tumor subpopulations. Our method is general and can be applied to several different types of genomic data including copy number variation from scDNA-Seq and gene expression from scRNA-Seq experiments. We tested the method on synthetic data and six single cell cancer datasets where the number of cells ranges from 250 to 6000 for each sample. Analysis of the resulting feature space revealed subpopulations of cells and their marker genes. The features are also able to infer the lineage and/or differentiation trajectory between cells, greatly improving upon prior methods suggested for feature extraction and dimensionality reduction of such data. AVAILABILITY AND IMPLEMENTATION: All the datasets used in the paper are publicly available, and the developed software package and supporting information are available on GitHub at https://github.com/MicrosoftGenomics/Dhaka. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Molecularly targeted drug combinations demonstrate selective effectiveness for myeloid- and lymphoid-derived hematologic malignancies.

Kurtz, Stephen E; Eide, Christopher A; Kaempf, Andy; Khanna, Vishesh; Savage, Samantha L; Rofelty, Angela; English, Isabel; Ho, Hibery; Pandya, Ravi; Bolosky, William J; Poon, Hoifung; Deininger, Michael W; Collins, Robert; Swords, Ronan T; Watts, Justin; Pollyea, Daniel A; Medeiros, Bruno C; Traer, Elie; Tognon, Cristina E; Mori, Motomi; Druker, Brian J; Tyner, Jeffrey W.

Proc Natl Acad Sci U S A ; 114(36): E7554-E7563, 2017 09 05.

Article in English | MEDLINE | ID: mdl-28784769

ABSTRACT

Translating the genetic and epigenetic heterogeneity underlying human cancers into therapeutic strategies is an ongoing challenge. Large-scale sequencing efforts have uncovered a spectrum of mutations in many hematologic malignancies, including acute myeloid leukemia (AML), suggesting that combinations of agents will be required to treat these diseases effectively. Combinatorial approaches will also be critical for combating the emergence of genetically heterogeneous subclones, rescue signals in the microenvironment, and tumor-intrinsic feedback pathways that all contribute to disease relapse. To identify novel and effective drug combinations, we performed ex vivo sensitivity profiling of 122 primary patient samples from a variety of hematologic malignancies against a panel of 48 drug combinations. The combinations were designed as drug pairs that target nonoverlapping biological pathways and comprise drugs from different classes, preferably with Food and Drug Administration approval. A combination ratio (CR) was derived for each drug pair, and CRs were evaluated with respect to diagnostic categories as well as against genetic, cytogenetic, and cellular phenotypes of specimens from the two largest disease categories: AML and chronic lymphocytic leukemia (CLL). Nearly all tested combinations involving a BCL2 inhibitor showed additional benefit in patients with myeloid malignancies, whereas select combinations involving PI3K, CSF1R, or bromodomain inhibitors showed preferential benefit in lymphoid malignancies. Expanded analyses of patients with AML and CLL revealed specific patterns of ex vivo drug combination efficacy that were associated with select genetic, cytogenetic, and phenotypic disease subsets, warranting further evaluation. These findings highlight the heuristic value of an integrated functional genomic approach to the identification of novel treatment strategies for hematologic malignancies.

Subjects

Antineoplastic Agents/therapeutic use, Hematologic Neoplasms/drug therapy, Chronic B-Cell Lymphocytic Leukemia/drug therapy, Acute Myeloid Leukemia/drug therapy, Drug Combinations, Hematologic Neoplasms/metabolism, Humans, Chronic B-Cell Lymphocytic Leukemia/metabolism, Acute Myeloid Leukemia/metabolism, Mutation/drug effects, Phosphatidylinositol 3-Kinases/metabolism, Proto-Oncogene Proteins c-bcl-2/metabolism, Granulocyte-Macrophage Colony-Stimulating Factor Receptors/metabolism

The Data Use Ontology to streamline responsible access to human biomedical datasets.

Lawson, Jonathan; Cabili, Moran N; Kerry, Giselle; Boughtwood, Tiffany; Thorogood, Adrian; Alper, Pinar; Bowers, Sarion R; Boyles, Rebecca R; Brookes, Anthony J; Brush, Matthew; Burdett, Tony; Clissold, Hayley; Donnelly, Stacey; Dyke, Stephanie O M; Freeberg, Mallory A; Haendel, Melissa A; Hata, Chihiro; Holub, Petr; Jeanson, Francis; Jene, Aina; Kawashima, Minae; Kawashima, Shuichi; Konopko, Melissa; Kyomugisha, Irene; Li, Haoyuan; Linden, Mikael; Rodriguez, Laura Lyman; Morita, Mizuki; Mulder, Nicola; Muller, Jean; Nagaie, Satoshi; Nasir, Jamal; Ogishima, Soichi; Ota Wang, Vivian; Paglione, Laura D; Pandya, Ravi N; Parkinson, Helen; Philippakis, Anthony A; Prasser, Fabian; Rambla, Jordi; Reinold, Kathy; Rushton, Gregory A; Saltzman, Andrea; Saunders, Gary; Sofia, Heidi J; Spalding, John D; Swertz, Morris A; Tulchinsky, Ilia; van Enckevort, Esther J; Varma, Susheel.

Cell Genom ; 1(2), 2021 Nov 10.

Article in English | MEDLINE | ID: mdl-34820659

ABSTRACT

Human biomedical datasets that are critical for research and clinical studies to benefit human health also often contain sensitive or potentially identifying information of individual participants. Thus, care must be taken when they are processed and made available to comply with ethical and regulatory frameworks and informed consent data conditions. To enable and streamline data access for these biomedical datasets, the Global Alliance for Genomics and Health (GA4GH) Data Use and Researcher Identities (DURI) work stream developed and approved the Data Use Ontology (DUO) standard. DUO is a hierarchical vocabulary of human and machine-readable data use terms that consistently and unambiguously represents a dataset's allowable data uses. DUO has been implemented by major international stakeholders such as the Broad and Sanger Institutes and is currently used in annotation of over 200,000 datasets worldwide. Using DUO in data management and access facilitates researchers' discovery and access of relevant datasets. DUO annotations increase the FAIRness of datasets and support data linkages using common data use profiles when integrating the data for secondary analyses. DUO is implemented in the Web Ontology Language (OWL) and, to increase community awareness and engagement, hosted in an open, centralized GitHub repository. DUO, together with the GA4GH Passport standard, offers a new, efficient, and streamlined data authorization and access framework that has enabled increased sharing of biomedical datasets worldwide.

Magnitude and Dynamics of the T-Cell Response to SARS-CoV-2 Infection at Both Individual and Population Levels.

Snyder, Thomas M; Gittelman, Rachel M; Klinger, Mark; May, Damon H; Osborne, Edward J; Taniguchi, Ruth; Zahid, H Jabran; Kaplan, Ian M; Dines, Jennifer N; Noakes, Matthew T; Pandya, Ravi; Chen, Xiaoyu; Elasady, Summer; Svejnoha, Emily; Ebert, Peter; Pesesky, Mitchell W; De Almeida, Patricia; O'Donnell, Hope; DeGottardi, Quinn; Keitany, Gladys; Lu, Jennifer; Vong, Allen; Elyanow, Rebecca; Fields, Paul; Greissl, Julia; Baldo, Lance; Semprini, Simona; Cerchione, Claudio; Nicolini, Fabio; Mazza, Massimiliano; Delmonte, Ottavia M; Dobbs, Kerry; Laguna-Goya, Rocio; Carreño-Tarragona, Gonzalo; Barrio, Santiago; Imberti, Luisa; Sottini, Alessandra; Quiros-Roldan, Eugenia; Rossi, Camillo; Biondi, Andrea; Bettini, Laura Rachele; D'Angio, Mariella; Bonfanti, Paolo; Tompkins, Miranda F; Alba, Camille; Dalgard, Clifton; Sambri, Vittorio; Martinelli, Giovanni; Goldman, Jason D; Heath, James R.

medRxiv ; 2020 Sep 17.

Article in English | MEDLINE | ID: mdl-32793919

ABSTRACT

T cells are involved in the early identification and clearance of viral infections and also support the development of antibodies by B cells. This central role for T cells makes them a desirable target for assessing the immune response to SARS-CoV-2 infection. Here, we combined two high-throughput immune profiling methods to create a quantitative picture of the T-cell response to SARS-CoV-2. First, at the individual level, we deeply characterized 3 acutely infected and 58 recovered COVID-19 subjects by experimentally mapping their CD8 T-cell response through antigen stimulation to 545 Human Leukocyte Antigen (HLA) class I presented viral peptides (class II data in a forthcoming study). Then, at the population level, we performed T-cell repertoire sequencing on 1,815 samples (from 1,521 COVID-19 subjects) as well as 3,500 controls to identify shared "public" T-cell receptors (TCRs) associated with SARS-CoV-2 infection from both CD8 and CD4 T cells. Collectively, our data reveal that CD8 T-cell responses are often driven by a few immunodominant, HLA-restricted epitopes. As expected, the T-cell response to SARS-CoV-2 peaks about one to two weeks after infection and is detectable for at least several months after recovery. As an application of these data, we trained a classifier to diagnose SARS-CoV-2 infection based solely on TCR sequencing from blood samples, and observed, at 99.8% specificity, high early sensitivity soon after diagnosis (Day 3-7 = 85.1% [95% CI = 79.9-89.7]; Day 8-14 = 94.8% [90.7-98.4]) as well as lasting sensitivity after recovery (Day 29+/convalescent = 95.4% [92.1-98.3]). These results demonstrate an approach to reliably assess the adaptive immune response both soon after viral antigenic exposure (before antibodies are typically detectable) as well as at later time points. This blood-based molecular approach to characterizing the cellular immune response has applications in clinical diagnostics as well as in vaccine development and monitoring.

Assembly of the 373k gene space of the polyploid sugarcane genome reveals reservoirs of functional diversity in the world's leading biomass crop.

Souza, Glaucia Mendes; Van Sluys, Marie-Anne; Lembke, Carolina Gimiliani; Lee, Hayan; Margarido, Gabriel Rodrigues Alves; Hotta, Carlos Takeshi; Gaiarsa, Jonas Weissmann; Diniz, Augusto Lima; Oliveira, Mauro de Medeiros; Ferreira, Sávio de Siqueira; Nishiyama, Milton Yutaka; Ten-Caten, Felipe; Ragagnin, Geovani Tolfo; Andrade, Pablo de Morais; de Souza, Robson Francisco; Nicastro, Gianlucca Gonçalves; Pandya, Ravi; Kim, Changsoo; Guo, Hui; Durham, Alan Mitchell; Carneiro, Monalisa Sampaio; Zhang, Jisen; Zhang, Xingtan; Zhang, Qing; Ming, Ray; Schatz, Michael C; Davidson, Bob; Paterson, Andrew H; Heckerman, David.

Gigascience ; 8(12), 2019 12 01.

Article in English | MEDLINE | ID: mdl-31782791

ABSTRACT

BACKGROUND: Sugarcane cultivars are polyploid interspecific hybrids of giant genomes, typically with 10-13 sets of chromosomes from 2 Saccharum species. The ploidy, hybridity, and size of the genome, estimated to have >10 Gb, pose a challenge for sequencing. RESULTS: Here we present a gene space assembly of SP80-3280, including 373,869 putative genes and their potential regulatory regions. The alignment of single-copy genes in diploid grasses to the putative genes indicates that we could resolve 2-6 (up to 15) putative homo(eo)logs that are 99.1% identical within their coding sequences. Dissimilarities increase in their regulatory regions, and gene promoter analysis shows differences in regulatory elements within gene families that are expressed in a species-specific manner. We exemplify these differences for sucrose synthase (SuSy) and phenylalanine ammonia-lyase (PAL), 2 gene families central to carbon partitioning. SP80-3280 has particular regulatory elements involved in sucrose synthesis not found in the ancestor Saccharum spontaneum. PAL regulatory elements are found in co-expressed genes related to fiber synthesis within gene networks defined during plant growth and maturation. Comparison with sorghum reveals predominantly bi-allelic variations in sugarcane, consistent with the formation of 2 "subgenomes" after their divergence ~3.8-4.6 million years ago and reveals single-nucleotide variants that may underlie their differences. CONCLUSIONS: This assembly represents a large step towards a whole-genome assembly of a commercial sugarcane cultivar. It includes a rich diversity of genes and homo(eo)logous resolution for a representative fraction of the gene space, relevant to improve biomass and food production.

Subjects

Contig Mapping/methods, Glucosyltransferases/genetics, Phenylalanine Ammonia-Lyase/genetics, Saccharum/growth & development, Biomass, Agricultural Crops/genetics, Agricultural Crops/growth & development, Genetic Variation, Genome Size, Plant Genome, Multigene Family, Plant Proteins/genetics, Polyploidy, Genetic Promoter Regions, Saccharum/genetics

Simplifying research access to genomics and health data with Library Cards.

Cabili, Moran N; Carey, Knox; Dyke, Stephanie O M; Brookes, Anthony J; Fiume, Marc; Jeanson, Francis; Kerry, Giselle; Lash, Alex; Sofia, Heidi; Spalding, Dylan; Tasse, Anne-Marie; Varma, Susheel; Pandya, Ravi.

Sci Data ; 5: 180039, 2018 03 14.

Article in English | MEDLINE | ID: mdl-29537396

ABSTRACT

The volume of genomics and health data is growing rapidly, driven by sequencing for both research and clinical use. However, under current practices, the data is fragmented into many distinct datasets, and researchers must go through a separate application process for each dataset. This is time-consuming both for the researchers and the data stewards, and it reduces the velocity of research and new discoveries that could improve human health. We propose to simplify this process, by introducing a standard Library Card that identifies and authenticates researchers across all participating datasets. Each researcher would only need to apply once to establish their bona fides as a qualified researcher, and could then use the Library Card to access a wide range of datasets that use a compatible data access policy and authentication protocol.

Registered access: authorizing data access.

Dyke, Stephanie O M; Linden, Mikael; Lappalainen, Ilkka; De Argila, Jordi Rambla; Carey, Knox; Lloyd, David; Spalding, J Dylan; Cabili, Moran N; Kerry, Giselle; Foreman, Julia; Cutts, Tim; Shabani, Mahsa; Rodriguez, Laura L; Haeussler, Maximilian; Walsh, Brian; Jiang, Xiaoqian; Wang, Shuang; Perrett, Daniel; Boughtwood, Tiffany; Matern, Andreas; Brookes, Anthony J; Cupak, Miro; Fiume, Marc; Pandya, Ravi; Tulchinsky, Ilia; Scollen, Serena; Törnroos, Juha; Das, Samir; Evans, Alan C; Malin, Bradley A; Beck, Stephan; Brenner, Steven E; Nyrönen, Tommi; Blomberg, Niklas; Firth, Helen V; Hurles, Matthew; Philippakis, Anthony A; Rätsch, Gunnar; Brudno, Michael; Boycott, Kym M; Rehm, Heidi L; Baudis, Michael; Sherry, Stephen T; Kato, Kazuto; Knoppers, Bartha M; Baker, Dixie; Flicek, Paul.

Eur J Hum Genet ; 26(12): 1721-1731, 2018 12.

Article in English | MEDLINE | ID: mdl-30069064

ABSTRACT

The Global Alliance for Genomics and Health (GA4GH) proposes a data access policy model, "registered access", to increase and improve access to data requiring an agreement to basic terms and conditions, such as the use of DNA sequence and health data in research. A registered access policy would enable a range of categories of users to gain access, starting with researchers and clinical care professionals. It would also facilitate general use and reuse of data but within the bounds of consent restrictions and other ethical obligations. In piloting registered access with the Scientific Demonstration data sharing projects of GA4GH, we provide additional ethics, policy and technical guidance to facilitate the implementation of this access model in an international setting.

Subjects

Access to Information, Medical Genetics/standards, Genomics/standards, Information Dissemination, Medical Genetics/ethics, Medical Genetics/legislation & jurisprudence, Genomics/ethics, Genomics/legislation & jurisprudence, Humans, Licensure, Practice Guidelines as Topic
