Search | VHL Regional Portal

Demonstrating paths for unlocking the value of cloud genomics through cross cohort analysis.

Deflaux, Nicole; Selvaraj, Margaret Sunitha; Condon, Henry Robert; Mayo, Kelsey; Haidermota, Sara; Basford, Melissa A; Lunt, Chris; Philippakis, Anthony A; Roden, Dan M; Denny, Joshua C; Musick, Anjene; Collins, Rory; Allen, Naomi; Effingham, Mark; Glazer, David; Natarajan, Pradeep; Bick, Alexander G.

Nat Commun ; 14(1): 5419, 2023 09 05.

Article in English | MEDLINE | ID: mdl-37669985

ABSTRACT

Recently, large scale genomic projects such as All of Us and the UK Biobank have introduced a new research paradigm where data are stored centrally in cloud-based Trusted Research Environments (TREs). To characterize the advantages and drawbacks of different TRE attributes in facilitating cross-cohort analysis, we conduct a Genome-Wide Association Study of standard lipid measures using two approaches: meta-analysis and pooled analysis. Comparison of full summary data from both approaches with an external study shows strong correlation of known loci with lipid levels (R² ~ 83-97%). Importantly, 90 variants meet the significance threshold only in the meta-analysis and 64 variants are significant only in pooled analysis, with approximately 20% of variants in each of those groups being most prevalent in non-European, non-Asian ancestry individuals. These findings have important implications, as technical and policy choices lead to cross-cohort analyses generating similar, but not identical results, particularly for non-European ancestral populations.

Subject(s)

Genome-Wide Association Study, Population Health, Humans, Genomics, Policy, Lipids

Genomic architecture of autism from comprehensive whole-genome sequence annotation.

Trost, Brett; Thiruvahindrapuram, Bhooma; Chan, Ada J S; Engchuan, Worrawat; Higginbotham, Edward J; Howe, Jennifer L; Loureiro, Livia O; Reuter, Miriam S; Roshandel, Delnaz; Whitney, Joe; Zarrei, Mehdi; Bookman, Matthew; Somerville, Cherith; Shaath, Rulan; Abdi, Mona; Aliyev, Elbay; Patel, Rohan V; Nalpathamkalam, Thomas; Pellecchia, Giovanna; Hamdan, Omar; Kaur, Gaganjot; Wang, Zhuozhi; MacDonald, Jeffrey R; Wei, John; Sung, Wilson W L; Lamoureux, Sylvia; Hoang, Ny; Selvanayagam, Thanuja; Deflaux, Nicole; Geng, Melissa; Ghaffari, Siavash; Bates, John; Young, Edwin J; Ding, Qiliang; Shum, Carole; D'Abate, Lia; Bradley, Clarrisa A; Rutherford, Annabel; Aguda, Vernie; Apresto, Beverly; Chen, Nan; Desai, Sachin; Du, Xiaoyan; Fong, Matthew L Y; Pullenayegum, Sanjeev; Samler, Kozue; Wang, Ting; Ho, Karen; Paton, Tara; Pereira, Sergio L.

Cell ; 185(23): 4409-4427.e18, 2022 11 10.

Article in English | MEDLINE | ID: mdl-36368308

ABSTRACT

Fully understanding autism spectrum disorder (ASD) genetics requires whole-genome sequencing (WGS). We present the latest release of the Autism Speaks MSSNG resource, which includes WGS data from 5,100 individuals with ASD and 6,212 non-ASD parents and siblings (total n = 11,312). Examining a wide variety of genetic variants in MSSNG and the Simons Simplex Collection (SSC; n = 9,205), we identified ASD-associated rare variants in 718/5,100 individuals with ASD from MSSNG (14.1%) and 350/2,419 from SSC (14.5%). Considering genomic architecture, 52% were nuclear sequence-level variants, 46% were nuclear structural variants (including copy-number variants, inversions, large insertions, uniparental isodisomies, and tandem repeat expansions), and 2% were mitochondrial variants. Our study provides a guidebook for exploring genotype-phenotype correlations in families who carry ASD-associated rare variants and serves as an entry point to the expanded studies required to dissect the etiology in the ~85% of the ASD population that remain idiopathic.

Subject(s)

Autism Spectrum Disorder, Autistic Disorder, Humans, Autism Spectrum Disorder/genetics, Genetic Predisposition to Disease, DNA Copy Number Variations/genetics, Genomics

The All of Us Research Program: Data quality, utility, and diversity.

Ramirez, Andrea H; Sulieman, Lina; Schlueter, David J; Halvorson, Alese; Qian, Jun; Ratsimbazafy, Francis; Loperena, Roxana; Mayo, Kelsey; Basford, Melissa; Deflaux, Nicole; Muthuraman, Karthik N; Natarajan, Karthik; Kho, Abel; Xu, Hua; Wilkins, Consuelo; Anton-Culver, Hoda; Boerwinkle, Eric; Cicek, Mine; Clark, Cheryl R; Cohn, Elizabeth; Ohno-Machado, Lucila; Schully, Sheri D; Ahmedani, Brian K; Argos, Maria; Cronin, Robert M; O'Donnell, Christopher; Fouad, Mona; Goldstein, David B; Greenland, Philip; Hebbring, Scott J; Karlson, Elizabeth W; Khatri, Parinda; Korf, Bruce; Smoller, Jordan W; Sodeke, Stephen; Wilbanks, John; Hentges, Justin; Mockrin, Stephen; Lunt, Christopher; Devaney, Stephanie A; Gebo, Kelly; Denny, Joshua C; Carroll, Robert J; Glazer, David; Harris, Paul A; Hripcsak, George; Philippakis, Anthony; Roden, Dan M.

Patterns (N Y) ; 3(8): 100570, 2022 Aug 12.

Article in English | MEDLINE | ID: mdl-36033590

ABSTRACT

The All of Us Research Program seeks to engage at least one million diverse participants to advance precision medicine and improve human health. We describe here the cloud-based Researcher Workbench that uses a data passport model to democratize access to analytical tools and participant information including survey, physical measurement, and electronic health record (EHR) data. We also present validation study findings for several common complex diseases to demonstrate use of this novel platform in 315,000 participants, 78% of whom are from groups historically underrepresented in biomedical research, including 49% self-reporting non-White races. Replication findings include medication usage pattern differences by race in depression and type 2 diabetes, validation of known cancer associations with smoking, and calculation of cardiovascular risk scores by reported race effects. The cloud-based Researcher Workbench represents an important advance in enabling secure access for a broad range of researchers to this large resource and analytical tools.

The ISB Cancer Genomics Cloud: A Flexible Cloud-Based Platform for Cancer Genomics Research.

Reynolds, Sheila M; Miller, Michael; Lee, Phyliss; Leinonen, Kalle; Paquette, Suzanne M; Rodebaugh, Zack; Hahn, Abigail; Gibbs, David L; Slagel, Joseph; Longabaugh, William J; Dhankani, Varsha; Reyes, Madelyn; Pihl, Todd; Backus, Mark; Bookman, Matthew; Deflaux, Nicole; Bingham, Jonathan; Pot, David; Shmulevich, Ilya.

Cancer Res ; 77(21): e7-e10, 2017 11 01.

Article in English | MEDLINE | ID: mdl-29092928

ABSTRACT

The ISB Cancer Genomics Cloud (ISB-CGC) is one of three pilot projects funded by the National Cancer Institute to explore new approaches to computing on large cancer datasets in a cloud environment. With a focus on Data as a Service, the ISB-CGC offers multiple avenues for accessing and analyzing The Cancer Genome Atlas, TARGET, and other important references such as GENCODE and COSMIC using the Google Cloud Platform. The open approach allows researchers to choose approaches best suited to the task at hand: from analyzing terabytes of data using complex workflows to developing new analysis methods in common languages such as Python, R, and SQL; to using an interactive web application to create synthetic patient cohorts and to explore the wealth of available genomic data. Links to resources and documentation can be found at www.isb-cgc.org. Cancer Res; 77(21); e7-10. ©2017 AACR.

Subject(s)

Cloud Computing, Computational Biology, Genomics, Neoplasms/genetics, Datasets as Topic, Human Genome, Humans, Internet, National Cancer Institute (U.S.), Research/trends, Software, United States

Cloud-based interactive analytics for terabytes of genomic variants data.

Pan, Cuiping; McInnes, Gregory; Deflaux, Nicole; Snyder, Michael; Bingham, Jonathan; Datta, Somalee; Tsao, Philip S.

Bioinformatics ; 33(23): 3709-3715, 2017 Dec 01.

Article in English | MEDLINE | ID: mdl-28961771

ABSTRACT

MOTIVATION: Large scale genomic sequencing is now widely used to decipher questions in diverse realms such as biological function, human diseases, evolution, ecosystems, and agriculture. With the quantity and diversity these data harbor, a robust and scalable data handling and analysis solution is desired. RESULTS: We present interactive analytics using a cloud-based columnar database built on Dremel to perform information compression, comprehensive quality controls, and biological information retrieval in large volumes of genomic data. We demonstrate such Big Data computing paradigms can provide orders of magnitude faster turnaround for common genomic analyses, transforming long-running batch jobs submitted via a Linux shell into questions that can be asked from a web browser in seconds. Using this method, we assessed a study population of 475 deeply sequenced human genomes for genomic call rate, genotype and allele frequency distribution, variant density across the genome, and pharmacogenomic information. AVAILABILITY AND IMPLEMENTATION: Our analysis framework is implemented in Google Cloud Platform and BigQuery. Codes are available at https://github.com/StanfordBioinformatics/mvp_aaa_codelabs. CONTACT: cuiping@stanford.edu or ptsao@stanford.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Genetic Variation, Genomics/methods, Data Compression, Nucleic Acid Databases, Gene Frequency, Human Genome, Genotype, Humans, Software, Web Browser

Whole genome sequencing resource identifies 18 new candidate genes for autism spectrum disorder.

Yuen, Ryan K C; Merico, Daniele; Bookman, Matt; Howe, Jennifer L; Thiruvahindrapuram, Bhooma; Patel, Rohan V; Whitney, Joe; Deflaux, Nicole; Bingham, Jonathan; Wang, Zhuozhi; Pellecchia, Giovanna; Buchanan, Janet A; Walker, Susan; Marshall, Christian R; Uddin, Mohammed; Zarrei, Mehdi; Deneault, Eric; D'Abate, Lia; Chan, Ada J S; Koyanagi, Stephanie; Paton, Tara; Pereira, Sergio L; Hoang, Ny; Engchuan, Worrawat; Higginbotham, Edward J; Ho, Karen; Lamoureux, Sylvia; Li, Weili; MacDonald, Jeffrey R; Nalpathamkalam, Thomas; Sung, Wilson W L; Tsoi, Fiona J; Wei, John; Xu, Lizhen; Tasse, Anne-Marie; Kirby, Emily; Van Etten, William; Twigger, Simon; Roberts, Wendy; Drmic, Irene; Jilderda, Sanne; Modi, Bonnie MacKinnon; Kellam, Barbara; Szego, Michael; Cytrynbaum, Cheryl; Weksberg, Rosanna; Zwaigenbaum, Lonnie; Woodbury-Smith, Marc; Brian, Jessica; Senman, Lili.

Nat Neurosci ; 20(4): 602-611, 2017 Apr.

Article in English | MEDLINE | ID: mdl-28263302

ABSTRACT

We are performing whole-genome sequencing of families with autism spectrum disorder (ASD) to build a resource (MSSNG) for subcategorizing the phenotypes and underlying genetic factors involved. Here we report sequencing of 5,205 samples from families with ASD, accompanied by clinical information, creating a database accessible on a cloud platform and through a controlled-access internet portal. We found an average of 73.8 de novo single nucleotide variants and 12.6 de novo insertions and deletions or copy number variations per ASD subject. We identified 18 new candidate ASD-risk genes and found that participants bearing mutations in susceptibility genes had significantly lower adaptive ability (P = 6 × 10⁻⁴). In 294 of 2,620 (11.2%) of ASD cases, a molecular basis could be determined and 7.2% of these carried copy number variations and/or chromosomal abnormalities, emphasizing the importance of detecting all forms of genetic variation as diagnostic and therapeutic targets in ASD.

Subject(s)

Autism Spectrum Disorder/genetics, Genetic Databases, Genetic Predisposition to Disease/genetics, Genome-Wide Association Study/methods, Chromosome Aberrations, DNA Copy Number Variations, Humans, Insertional Mutagenesis/genetics, Phenotype, Single Nucleotide Polymorphism/genetics, Sequence Deletion/genetics

Analysis of protein-coding genetic variation in 60,706 humans.

Lek, Monkol; Karczewski, Konrad J; Minikel, Eric V; Samocha, Kaitlin E; Banks, Eric; Fennell, Timothy; O'Donnell-Luria, Anne H; Ware, James S; Hill, Andrew J; Cummings, Beryl B; Tukiainen, Taru; Birnbaum, Daniel P; Kosmicki, Jack A; Duncan, Laramie E; Estrada, Karol; Zhao, Fengmei; Zou, James; Pierce-Hoffman, Emma; Berghout, Joanne; Cooper, David N; Deflaux, Nicole; DePristo, Mark; Do, Ron; Flannick, Jason; Fromer, Menachem; Gauthier, Laura; Goldstein, Jackie; Gupta, Namrata; Howrigan, Daniel; Kiezun, Adam; Kurki, Mitja I; Moonshine, Ami Levy; Natarajan, Pradeep; Orozco, Lorena; Peloso, Gina M; Poplin, Ryan; Rivas, Manuel A; Ruano-Rubio, Valentin; Rose, Samuel A; Ruderfer, Douglas M; Shakir, Khalid; Stenson, Peter D; Stevens, Christine; Thomas, Brett P; Tiao, Grace; Tusie-Luna, Maria T; Weisburd, Ben; Won, Hong-Hee; Yu, Dongmei; Altshuler, David M.

Nature ; 536(7616): 285-91, 2016 08 18.

Article in English | MEDLINE | ID: mdl-27535533

ABSTRACT

Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.

Subject(s)

Exome/genetics, Genetic Variation/genetics, DNA Mutational Analysis, Datasets as Topic, Humans, Phenotype, Proteome/genetics, Rare Diseases/genetics, Sample Size

Systematic analysis of challenge-driven improvements in molecular prognostic models for breast cancer.

Margolin, Adam A; Bilal, Erhan; Huang, Erich; Norman, Thea C; Ottestad, Lars; Mecham, Brigham H; Sauerwine, Ben; Kellen, Michael R; Mangravite, Lara M; Furia, Matthew D; Vollan, Hans Kristian Moen; Rueda, Oscar M; Guinney, Justin; Deflaux, Nicole A; Hoff, Bruce; Schildwachter, Xavier; Russnes, Hege G; Park, Daehoon; Vang, Veronica O; Pirtle, Tyler; Youseff, Lamia; Citro, Craig; Curtis, Christina; Kristensen, Vessela N; Hellerstein, Joseph; Friend, Stephen H; Stolovitzky, Gustavo; Aparicio, Samuel; Caldas, Carlos; Børresen-Dale, Anne-Lise.

Sci Transl Med ; 5(181): 181re1, 2013 Apr 17.

Article in English | MEDLINE | ID: mdl-23596205

ABSTRACT

Although molecular prognostics in breast cancer are among the most successful examples of translating genomic analysis to clinical applications, optimal approaches to breast cancer clinical risk prediction remain controversial. The Sage Bionetworks-DREAM Breast Cancer Prognosis Challenge (BCC) is a crowdsourced research study for breast cancer prognostic modeling using genome-scale data. The BCC provided a community of data analysts with a common platform for data access and blinded evaluation of model accuracy in predicting breast cancer survival on the basis of gene expression data, copy number data, and clinical covariates. This approach offered the opportunity to assess whether a crowdsourced community Challenge would generate models of breast cancer prognosis commensurate with or exceeding current best-in-class approaches. The BCC comprised multiple rounds of blinded evaluations on held-out portions of data on 1981 patients, resulting in more than 1400 models submitted as open source code. Participants then retrained their models on the full data set of 1981 samples and submitted up to five models for validation in a newly generated data set of 184 breast cancer patients. Analysis of the BCC results suggests that the best-performing modeling strategy outperformed previously reported methods in blinded evaluations; model performance was consistent across several independent evaluations; and aggregating community-developed models achieved performance on par with the best-performing individual models.

Subject(s)

Breast Neoplasms/diagnosis, Breast Neoplasms/genetics, Biological Models, Genetic Databases, Female, Humans, Middle Aged, Prognosis, Survival Analysis, Time Factors
