Search | Virtual Health Library (Biblioteca Virtual em Saúde)

1.

Dynamic landscape of immune cell-specific gene regulation in immune-mediated diseases.

Ota, Mineto; Nagafuchi, Yasuo; Hatano, Hiroaki; Ishigaki, Kazuyoshi; Terao, Chikashi; Takeshima, Yusuke; Yanaoka, Haruyuki; Kobayashi, Satomi; Okubo, Mai; Shirai, Harumi; Sugimori, Yusuke; Maeda, Junko; Nakano, Masahiro; Yamada, Saeko; Yoshida, Ryochi; Tsuchiya, Haruka; Tsuchida, Yumi; Akizuki, Shuji; Yoshifuji, Hajime; Ohmura, Koichiro; Mimori, Tsuneyo; Yoshida, Ken; Kurosaka, Daitaro; Okada, Masato; Setoguchi, Keigo; Kaneko, Hiroshi; Ban, Nobuhiro; Yabuki, Nami; Matsuki, Kosuke; Mutoh, Hironori; Oyama, Sohei; Okazaki, Makoto; Tsunoda, Hiroyuki; Iwasaki, Yukiko; Sumitomo, Shuji; Shoda, Hirofumi; Kochi, Yuta; Okada, Yukinori; Yamamoto, Kazuhiko; Okamura, Tomohisa; Fujio, Keishi.

Cell; 184(11): 3006-3021.e17, 2021 May 27.

Article in English | MEDLINE | ID: mdl-33930287

ABSTRACT

Genetic studies have revealed many variant loci that are associated with immune-mediated diseases. To elucidate the disease pathogenesis, it is essential to understand the function of these variants, especially under disease-associated conditions. Here, we performed a large-scale immune cell gene-expression analysis, together with whole-genome sequence analysis. Our dataset consists of 28 distinct immune cell subsets from 337 patients diagnosed with 10 categories of immune-mediated diseases and 79 healthy volunteers. Our dataset captured distinctive gene-expression profiles across immune cell types and diseases. Expression quantitative trait loci (eQTL) analysis revealed dynamic variations of eQTL effects in the context of immunological conditions, as well as cell types. These cell-type-specific and context-dependent eQTLs showed significant enrichment in immune disease-associated genetic variants, and they implicated the disease-relevant cell types, genes, and environment. This atlas deepens our understanding of the immunogenetic functions of disease-associated variants under in vivo disease conditions.

Subjects

Gene Expression Regulation/genetics; Gene Expression/immunology; Immune System Diseases/genetics; Adult; Female; Gene Expression/genetics; Gene Expression Regulation/immunology; Genetic Predisposition to Disease/genetics; Genome-Wide Association Study/methods; Humans; Immune System/cytology; Immune System/metabolism; Immune System Diseases/metabolism; Immune System Diseases/physiopathology; Male; Middle Aged; Polymorphism, Single Nucleotide/genetics; Quantitative Trait Loci/genetics; Quantitative Trait Loci/immunology; Transcriptome/genetics; Whole Genome Sequencing/methods

2.

Transmission, infectivity, and neutralization of a spike L452R SARS-CoV-2 variant.

Deng, Xianding; Garcia-Knight, Miguel A; Khalid, Mir M; Servellita, Venice; Wang, Candace; Morris, Mary Kate; Sotomayor-González, Alicia; Glasner, Dustin R; Reyes, Kevin R; Gliwa, Amelia S; Reddy, Nikitha P; Sanchez San Martin, Claudia; Federman, Scot; Cheng, Jing; Balcerek, Joanna; Taylor, Jordan; Streithorst, Jessica A; Miller, Steve; Sreekumar, Bharath; Chen, Pei-Yi; Schulze-Gahmen, Ursula; Taha, Taha Y; Hayashi, Jennifer M; Simoneau, Camille R; Kumar, G Renuka; McMahon, Sarah; Lidsky, Peter V; Xiao, Yinghong; Hemarajata, Peera; Green, Nicole M; Espinosa, Alex; Kath, Chantha; Haw, Monica; Bell, John; Hacker, Jill K; Hanson, Carl; Wadford, Debra A; Anaya, Carlos; Ferguson, Donna; Frankino, Phillip A; Shivram, Haridha; Lareau, Liana F; Wyman, Stacia K; Ott, Melanie; Andino, Raul; Chiu, Charles Y.

Cell; 184(13): 3426-3437.e8, 2021 Jun 24.

Article in English | MEDLINE | ID: mdl-33991487

ABSTRACT

We identified an emerging severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variant by viral whole-genome sequencing of 2,172 nasal/nasopharyngeal swab samples from 44 counties in California, a state in the western United States. Named B.1.427/B.1.429 to denote its two lineages, the variant emerged in May 2020 and increased from 0% to >50% of sequenced cases from September 2020 to January 2021, showing 18.6%-24% increased transmissibility relative to wild-type circulating strains. The variant carries three mutations in the spike protein, including an L452R substitution. We found 2-fold increased B.1.427/B.1.429 viral shedding in vivo and increased L452R pseudovirus infection of cell cultures and lung organoids, albeit decreased relative to pseudoviruses carrying the N501Y mutation common to variants B.1.1.7, B.1.351, and P.1. Antibody neutralization assays revealed 4.0- to 6.7-fold and 2.0-fold decreases in neutralizing titers from convalescent patients and vaccine recipients, respectively. The increased prevalence of a more transmissible variant in California exhibiting decreased antibody neutralization warrants further investigation.

Subjects

Antibodies, Neutralizing/immunology; COVID-19/immunology; COVID-19/transmission; SARS-CoV-2/genetics; Spike Glycoprotein, Coronavirus/immunology; Antibodies, Monoclonal/immunology; Antibodies, Viral/immunology; Humans; Mutation/genetics; Whole Genome Sequencing/methods

3.

Generation and transmission of interlineage recombinants in the SARS-CoV-2 pandemic.

Jackson, Ben; Boni, Maciej F; Bull, Matthew J; Colleran, Amy; Colquhoun, Rachel M; Darby, Alistair C; Haldenby, Sam; Hill, Verity; Lucaci, Anita; McCrone, John T; Nicholls, Samuel M; O'Toole, Áine; Pacchiarini, Nicole; Poplawski, Radoslaw; Scher, Emily; Todd, Flora; Webster, Hermione J; Whitehead, Mark; Wierzbicki, Claudia; Loman, Nicholas J; Connor, Thomas R; Robertson, David L; Pybus, Oliver G; Rambaut, Andrew.

Cell; 184(20): 5179-5188.e8, 2021 Sep 30.

Article in English | MEDLINE | ID: mdl-34499854

ABSTRACT

We present evidence for multiple independent origins of recombinant SARS-CoV-2 viruses sampled from late 2020 and early 2021 in the United Kingdom. Their genomes carry single-nucleotide polymorphisms and deletions that are characteristic of the B.1.1.7 variant of concern but lack the full complement of lineage-defining mutations. Instead, the remainder of their genomes share contiguous genetic variation with non-B.1.1.7 viruses circulating in the same geographic area at the same time as the recombinants. In four instances, there was evidence for onward transmission of a recombinant-origin virus, including one transmission cluster of 45 sequenced cases over the course of 2 months. The inferred genomic locations of recombination breakpoints suggest that every community-transmitted recombinant virus inherited its spike region from a B.1.1.7 parental virus, consistent with a transmission advantage for B.1.1.7's set of mutations.

Subjects

COVID-19/epidemiology; COVID-19/transmission; Pandemics; Recombination, Genetic; SARS-CoV-2/genetics; Base Sequence/genetics; COVID-19/virology; Computational Biology/methods; Gene Frequency; Genome, Viral; Genotype; Humans; Mutation; Phylogeny; Polymorphism, Single Nucleotide; United Kingdom/epidemiology; Whole Genome Sequencing/methods

4.

Distinct Classes of Complex Structural Variation Uncovered across Thousands of Cancer Genome Graphs.

Hadi, Kevin; Yao, Xiaotong; Behr, Julie M; Deshpande, Aditya; Xanthopoulakis, Charalampos; Tian, Huasong; Kudman, Sarah; Rosiene, Joel; Darmofal, Madison; DeRose, Joseph; Mortensen, Rick; Adney, Emily M; Shaiber, Alon; Gajic, Zoran; Sigouros, Michael; Eng, Kenneth; Wala, Jeremiah A; Wrzeszczynski, Kazimierz O; Arora, Kanika; Shah, Minita; Emde, Anne-Katrin; Felice, Vanessa; Frank, Mayu O; Darnell, Robert B; Ghandi, Mahmoud; Huang, Franklin; Dewhurst, Sally; Maciejowski, John; de Lange, Titia; Setton, Jeremy; Riaz, Nadeem; Reis-Filho, Jorge S; Powell, Simon; Knowles, David A; Reznik, Ed; Mishra, Bud; Beroukhim, Rameen; Zody, Michael C; Robine, Nicolas; Oman, Kenji M; Sanchez, Carissa A; Kuhner, Mary K; Smith, Lucian P; Galipeau, Patricia C; Paulson, Thomas G; Reid, Brian J; Li, Xiaohong; Wilkes, David; Sboner, Andrea; Mosquera, Juan Miguel.

Cell; 183(1): 197-210.e32, 2020 Oct 01.

Article in English | MEDLINE | ID: mdl-33007263

ABSTRACT

Cancer genomes often harbor hundreds of somatic DNA rearrangement junctions, many of which cannot be easily classified into simple (e.g., deletion) or complex (e.g., chromothripsis) structural variant classes. Applying a novel genome graph computational paradigm to analyze the topology of junction copy number (JCN) across 2,778 tumor whole-genome sequences, we uncovered three novel complex rearrangement phenomena: pyrgo, rigma, and tyfonas. Pyrgo are "towers" of low-JCN duplications associated with early-replicating regions, superenhancers, and breast or ovarian cancers. Rigma comprise "chasms" of low-JCN deletions enriched in late-replicating fragile sites and gastrointestinal carcinomas. Tyfonas are "typhoons" of high-JCN junctions and fold-back inversions associated with expressed protein-coding fusions, breakend hypermutation, and acral, but not cutaneous, melanomas. Clustering of tumors according to genome graph-derived features identified subgroups associated with DNA repair defects and poor prognosis.

Subjects

Genomic Structural Variation/genetics; Genomics/methods; Neoplasms/genetics; Chromosome Inversion/genetics; Chromothripsis; DNA Copy Number Variations/genetics; Gene Rearrangement/genetics; Genome, Human/genetics; Humans; Mutation/genetics; Whole Genome Sequencing/methods

5.

A Compendium of Mutational Signatures of Environmental Agents.

Kucab, Jill E; Zou, Xueqing; Morganella, Sandro; Joel, Madeleine; Nanda, A Scott; Nagy, Eszter; Gomez, Celine; Degasperi, Andrea; Harris, Rebecca; Jackson, Stephen P; Arlt, Volker M; Phillips, David H; Nik-Zainal, Serena.

Cell; 177(4): 821-836.e16, 2019 May 02.

Article in English | MEDLINE | ID: mdl-30982602

ABSTRACT

Whole-genome-sequencing (WGS) of human tumors has revealed distinct mutation patterns that hint at the causative origins of cancer. We examined mutational signatures in 324 WGS human-induced pluripotent stem cells exposed to 79 known or suspected environmental carcinogens. Forty-one yielded characteristic substitution mutational signatures. Some were similar to signatures found in human tumors. Additionally, six agents produced double-substitution signatures and eight produced indel signatures. Investigating mutation asymmetries across genome topography revealed fully functional mismatch and transcription-coupled repair pathways. DNA damage induced by environmental mutagens can be resolved by disparate repair and/or replicative pathways, resulting in an assortment of signature outcomes even for a single agent. This compendium of experimentally induced mutational signatures permits further exploration of roles of environmental agents in cancer etiology and underscores how human stem cell DNA is directly vulnerable to environmental agents.

Subjects

Carcinogens, Environmental/classification; Neoplasms/genetics; Carcinogens, Environmental/adverse effects; DNA Damage/genetics; DNA Mutational Analysis/methods; DNA Repair/genetics; DNA Replication; Genetic Profile; Genome, Human/genetics; Humans; INDEL Mutation/genetics; Mutagenesis; Mutation/genetics; Pluripotent Stem Cells/metabolism; Whole Genome Sequencing/methods

6.

Genomic Analysis in the Age of Human Genome Sequencing.

Lappalainen, Tuuli; Scott, Alexandra J; Brandt, Margot; Hall, Ira M.

Cell; 177(1): 70-84, 2019 Mar 21.

Article in English | MEDLINE | ID: mdl-30901550

ABSTRACT

Affordable genome sequencing technologies promise to revolutionize the field of human genetics by enabling comprehensive studies that interrogate all classes of genome variation, genome-wide, across the entire allele frequency spectrum. Ongoing projects worldwide are sequencing many thousands, and soon millions, of human genomes as part of various gene mapping studies, biobanking efforts, and clinical programs. However, while genome sequencing data production has become routine, genome analysis and interpretation remain challenging endeavors with many limitations and caveats. Here, we review the current state of technologies for genetic variant discovery, genotyping, and functional interpretation and discuss the prospects for future advances. We focus on germline variants discovered by whole-genome sequencing, genome-wide functional genomic approaches for predicting and measuring variant functional effects, and implications for studies of common and rare human disease.

Subjects

Genetic Variation/genetics; Genome, Human/genetics; Sequence Analysis, DNA/trends; Biological Specimen Banks; Chromosome Mapping/methods; Genetic Predisposition to Disease/genetics; Genetic Testing/trends; Genome-Wide Association Study; Genomics/methods; Genomics/trends; High-Throughput Nucleotide Sequencing/methods; Human Genome Project; Humans; Polymorphism, Single Nucleotide/genetics; Sequence Analysis, DNA/methods; Whole Genome Sequencing/methods; Whole Genome Sequencing/trends

7.

Genomic Hallmarks and Structural Variation in Metastatic Prostate Cancer.

Quigley, David A; Dang, Ha X; Zhao, Shuang G; Lloyd, Paul; Aggarwal, Rahul; Alumkal, Joshi J; Foye, Adam; Kothari, Vishal; Perry, Marc D; Bailey, Adina M; Playdle, Denise; Barnard, Travis J; Zhang, Li; Zhang, Jin; Youngren, Jack F; Cieslik, Marcin P; Parolia, Abhijit; Beer, Tomasz M; Thomas, George; Chi, Kim N; Gleave, Martin; Lack, Nathan A; Zoubeidi, Amina; Reiter, Robert E; Rettig, Matthew B; Witte, Owen; Ryan, Charles J; Fong, Lawrence; Kim, Won; Friedlander, Terence; Chou, Jonathan; Li, Haolong; Das, Rajdeep; Li, Hui; Moussavi-Baygi, Ruhollah; Goodarzi, Hani; Gilbert, Luke A; Lara, Primo N; Evans, Christopher P; Goldstein, Theodore C; Stuart, Joshua M; Tomlins, Scott A; Spratt, Daniel E; Cheetham, R Keira; Cheng, Donavan T; Farh, Kyle; Gehring, Julian S; Hakenberg, Jörg; Liao, Arnold; Febbo, Philip G.

Cell; 174(3): 758-769.e9, 2018 Jul 26.

Article in English | MEDLINE | ID: mdl-30033370

ABSTRACT

While mutations affecting protein-coding regions have been examined across many cancers, structural variants at the genome-wide level are still poorly defined. Through integrative deep whole-genome and -transcriptome analysis of 101 castration-resistant prostate cancer metastases (109X tumor/38X normal coverage), we identified structural variants altering critical regulators of tumorigenesis and progression not detectable by exome approaches. Notably, we observed amplification of an intergenic enhancer region 624 kb upstream of the androgen receptor (AR) in 81% of patients, correlating with increased AR expression. Tandem duplication hotspots also occur near MYC, in lncRNAs associated with post-translational MYC regulation. Classes of structural variations were linked to distinct DNA repair deficiencies, suggesting their etiology, including associations of CDK12 mutation with tandem duplications, TP53 inactivation with inverted rearrangements and chromothripsis, and BRCA2 inactivation with deletions. Together, these observations provide a comprehensive view of how structural variations affect critical regulators in metastatic prostate cancer.

Subjects

Genomic Structural Variation/genetics; Prostatic Neoplasms/genetics; Aged; Aged, 80 and over; BRCA2 Protein/metabolism; Cyclin-Dependent Kinases/metabolism; DNA Copy Number Variations; Exome; Gene Expression Profiling/methods; Genomics/methods; Humans; Male; Middle Aged; Mutation; Neoplasm Metastasis/genetics; Proto-Oncogene Proteins c-myc/genetics; Proto-Oncogene Proteins c-myc/metabolism; Receptors, Androgen/genetics; Receptors, Androgen/metabolism; Tandem Repeat Sequences/genetics; Tumor Suppressor Protein p53/metabolism; Whole Genome Sequencing/methods

8.

The genetics of obesity: from discovery to biology.

Loos, Ruth J F; Yeo, Giles S H.

Nat Rev Genet; 23(2): 120-133, 2022 Feb.

Article in English | MEDLINE | ID: mdl-34556834

ABSTRACT

The prevalence of obesity has tripled over the past four decades, imposing an enormous burden on people's health. Polygenic (or common) obesity and rare, severe, early-onset monogenic obesity are often polarized as distinct diseases. However, gene discovery studies for both forms of obesity show that they have shared genetic and biological underpinnings, pointing to a key role for the brain in the control of body weight. Genome-wide association studies (GWAS) with increasing sample sizes and advances in sequencing technology are the main drivers behind a recent flurry of new discoveries. However, it is the post-GWAS, cross-disciplinary collaborations, which combine new omics technologies and analytical approaches, that have started to facilitate translation of genetic loci into meaningful biology and new avenues for treatment.

Subjects

Genetic Predisposition to Disease/genetics; Genetic Variation; Genome, Human/genetics; Genome-Wide Association Study/methods; Obesity/genetics; Whole Genome Sequencing/methods; Animals; Eating/genetics; Gene-Environment Interaction; Humans; Multifactorial Inheritance/genetics; Overweight/genetics

9.

A Highly Scalable Method for Joint Whole-Genome Sequencing and Gene-Expression Profiling of Single Cells.

Zachariadis, Vasilios; Cheng, Huaitao; Andrews, Nathanael; Enge, Martin.

Mol Cell; 80(3): 541-553.e5, 2020 Nov 05.

Article in English | MEDLINE | ID: mdl-33068522

ABSTRACT

To address how genetic variation alters gene expression in complex cell mixtures, we developed direct nuclear tagmentation and RNA sequencing (DNTR-seq), which enables whole-genome and mRNA sequencing jointly in single cells. DNTR-seq readily identified minor subclones within leukemia patients. In a large-scale DNA damage screen, DNTR-seq was used to detect regions under purifying selection and identified genes where mRNA abundance was resistant to copy-number alteration, suggesting strong genetic compensation. mRNA sequencing (mRNA-seq) quality equals RNA-only methods, and the low positional bias of genomic libraries allowed detection of sub-megabase aberrations at ultra-low coverage. Each cell library is individually addressable and can be re-sequenced at increased depth, allowing multi-tiered study designs. Additionally, the direct tagmentation protocol enables coverage-independent estimation of ploidy, which can be used to identify cell singlets. Thus, DNTR-seq directly links each cell's state to its corresponding genome at scale, enabling routine analysis of heterogeneous tumors and other complex tissues.

Subjects

Gene Expression Profiling/methods; Single-Cell Analysis/methods; Whole Genome Sequencing/methods; Animals; Base Sequence/genetics; Cell Line, Tumor; Gene Library; High-Throughput Nucleotide Sequencing/methods; Humans; RNA/genetics; RNA, Messenger/genetics; Sequence Analysis, DNA/methods

10.

MagicalRsq-X: A cross-cohort transferable genotype imputation quality metric.

Sun, Quan; Yang, Yingxi; Rosen, Jonathan D; Chen, Jiawen; Li, Xihao; Guan, Wyliena; Jiang, Min-Zhi; Wen, Jia; Pace, Rhonda G; Blackman, Scott M; Bamshad, Michael J; Gibson, Ronald L; Cutting, Garry R; O'Neal, Wanda K; Knowles, Michael R; Kooperberg, Charles; Reiner, Alexander P; Raffield, Laura M; Carson, April P; Rich, Stephen S; Rotter, Jerome I; Loos, Ruth J F; Kenny, Eimear; Jaeger, Byron C; Min, Yuan-I; Fuchsberger, Christian; Li, Yun.

Am J Hum Genet; 111(5): 990-995, 2024 May 02.

Article in English | MEDLINE | ID: mdl-38636510

ABSTRACT

Since genotype imputation was introduced, researchers have been relying on the estimated imputation quality from imputation software to perform post-imputation quality control (QC). However, this quality estimate (denoted as Rsq) performs less well for lower-frequency variants. We recently published MagicalRsq, a machine-learning-based imputation quality calibration, which leverages additional typed markers from the same cohort and outperforms Rsq as a QC metric. In this work, we extended the original MagicalRsq to allow cross-cohort model training and named the new model MagicalRsq-X. We removed the cohort-specific estimated minor allele frequency and included linkage disequilibrium scores and recombination rates as additional features. Leveraging whole-genome sequencing data from TOPMed, specifically participants in the BioMe, JHS, WHI, and MESA studies, we performed comprehensive cross-cohort evaluations for predominantly European and African ancestral individuals based on their inferred global ancestry with the 1000 Genomes and Human Genome Diversity Project data as reference. Our results suggest MagicalRsq-X outperforms Rsq in almost every setting, with 7.3%-14.4% improvement in squared Pearson correlation with true R², corresponding to 85-218 K variant gains. We further developed a metric to quantify the genetic distances of a target cohort relative to a reference cohort and showed that this metric largely explained the performance of MagicalRsq-X models. Finally, we found MagicalRsq-X saved up to 53 known genome-wide significant variants in one of the largest blood cell trait GWASs that would be missed using the original Rsq for QC. In conclusion, MagicalRsq-X shows superiority for post-imputation QC and benefits genetic studies by distinguishing well and poorly imputed lower-frequency variants.

Subjects

Gene Frequency; Genotype; Polymorphism, Single Nucleotide; Software; Humans; Cohort Studies; Linkage Disequilibrium; Genome-Wide Association Study/methods; Genome, Human; Quality Control; Machine Learning; Whole Genome Sequencing/standards; Whole Genome Sequencing/methods

11.

Large-scale genomic analysis of the domestic dog informs biological discovery.

Buckley, Reuben M; Ostrander, Elaine A.

Genome Res; 34(6): 811-821, 2024 Jul 23.

Article in English | MEDLINE | ID: mdl-38955465

ABSTRACT

Recent advances in genomics, coupled with a unique population structure and remarkable levels of variation, have propelled the domestic dog to new levels as a system for understanding fundamental principles in mammalian biology. Central to this advance are more than 350 recognized breeds, each a closed population that has undergone selection for unique features. Genetic variation in the domestic dog is particularly well characterized compared with other domestic mammals, with almost 3000 high-coverage genomes publicly available. Importantly, as the number of sequenced genomes increases, new avenues for analysis are becoming available. Herein, we discuss recent discoveries in canine genomics regarding behavior, morphology, and disease susceptibility. We explore the limitations of current data sets for variant interpretation, tradeoffs between sequencing strategies, and the burgeoning role of long-read genomes for capturing structural variants. In addition, we consider how large-scale collections of whole-genome sequence data drive rare variant discovery and assess the geographic distribution of canine diversity, which identifies Asia as a major source of missing variation. Finally, we review recent comparative genomic analyses that will facilitate annotation of the noncoding genome in dogs.

Subjects

Genome; Genomics; Dogs/genetics; Animals; Genomics/methods; Genetic Variation; Whole Genome Sequencing/methods

12.

Accelerated somatic mutation calling for whole-genome and whole-exome sequencing data from heterogenous tumor samples.

Ji, Shuangxi; Zhu, Tong; Sethia, Ankit; Wang, Wenyi.

Genome Res; 34(4): 633-641, 2024 May 15.

Article in English | MEDLINE | ID: mdl-38589250

ABSTRACT

Accurate detection of somatic mutations in DNA sequencing data is a fundamental prerequisite for cancer research. Previous analytical challenges were overcome by consensus mutation calling from four to five popular callers. This, however, increases the already nontrivial computing time from individual callers. Here, we launch MuSE 2, powered by multistep parallelization and efficient memory allocation, to resolve the computing time bottleneck. MuSE 2 runs 50 times faster than MuSE 1 and 8 to 80 times faster than other popular callers. Our benchmark study suggests that combining MuSE 2 and the recently accelerated Strelka2 achieves high efficiency and accuracy in analyzing large cancer genomic data sets.

Subjects

Exome Sequencing; Mutation; Neoplasms; Whole Genome Sequencing; Humans; Neoplasms/genetics; Exome Sequencing/methods; Whole Genome Sequencing/methods; Software; Genome, Human; Genomics/methods; Algorithms; DNA Mutational Analysis/methods

13.

Comparative genomics of Cryptosporidium parvum reveals the emergence of an outbreak-associated population in Europe and its spread to the United States.

Bellinzona, Greta; Nardi, Tiago; Castelli, Michele; Batisti Biffignandi, Gherard; Adjou, Karim; Betson, Martha; Blanchard, Yannick; Bujila, Ioana; Chalmers, Rachel; Davidson, Rebecca; D'Avino, Nicoletta; Enbom, Tuulia; Gomes, Jacinto; Karadjian, Gregory; Klotz, Christian; Östlund, Emma; Plutzer, Judith; Rimhanen-Finne, Ruska; Robinson, Guy; Sannella, Anna Rosa; Sroka, Jacek; Stensvold, Christen Rune; Troell, Karin; Vatta, Paolo; Zalewska, Barbora; Bandi, Claudio; Sassera, Davide; Cacciò, Simone M.

Genome Res; 34(6): 877-887, 2024 Jul 23.

Article in English | MEDLINE | ID: mdl-38977307

ABSTRACT

The zoonotic parasite Cryptosporidium parvum is a global cause of gastrointestinal disease in humans and ruminants. Sequence analysis of the highly polymorphic gp60 gene enabled the classification of C. parvum isolates into multiple groups (e.g., IIa, IIc, Id) and a large number of subtypes. In Europe, subtype IIaA15G2R1 is largely predominant and has been associated with many water- and food-borne outbreaks. In this study, we generated new whole-genome sequence (WGS) data from 123 human- and ruminant-derived isolates collected in 13 European countries and included other available WGS data from Europe, Egypt, China, and the United States (n = 72) in the largest comparative genomics study to date. We applied rigorous filters to exclude mixed infections and analyzed a data set from 141 isolates from the zoonotic groups IIa (n = 119) and IId (n = 22). Based on 28,047 high-quality, biallelic genomic SNPs, we identified three distinct and strongly supported populations: Isolates from China (IId) and Egypt (IIa and IId) formed population 1; a minority of European isolates (IIa and IId) formed population 2; and the majority of European (IIa, including all IIaA15G2R1 isolates) and all isolates from the United States (IIa) clustered in population 3. Based on analyses of the population structure, population genetics, and recombination, we show that population 3 has recently emerged and expanded throughout Europe to then, possibly from the United Kingdom, reach the United States, where it also expanded. The reason(s) for the successful spread of population 3 remain elusive, although genes under selective pressure uniquely in this population were identified.

Subjects

Cryptosporidiosis; Cryptosporidium parvum; Disease Outbreaks; Cryptosporidium parvum/genetics; United States/epidemiology; Europe/epidemiology; Humans; Cryptosporidiosis/parasitology; Cryptosporidiosis/epidemiology; Animals; Genomics/methods; Polymorphism, Single Nucleotide; Phylogeny; Whole Genome Sequencing/methods; Genome, Protozoan; China/epidemiology; Egypt/epidemiology

14.

Improving population scale statistical phasing with whole-genome sequencing data.

Wertenbroek, Rick; Hofmeister, Robin J; Xenarios, Ioannis; Thoma, Yann; Delaneau, Olivier.

PLoS Genet; 20(7): e1011092, 2024 Jul.

Article in English | MEDLINE | ID: mdl-38959269

ABSTRACT

Haplotype estimation, or phasing, has gained significant traction in large-scale projects due to its valuable contributions to population genetics, variant analysis, and the creation of reference panels for imputation and phasing of new samples. To scale with the growing number of samples, haplotype estimation methods designed for population scale rely on highly optimized statistical models to phase genotype data, and usually ignore read-level information. Statistical methods excel in resolving common variants; however, they still struggle with rare variants due to the lack of statistical information. In this study we introduce SAPPHIRE, a new method that leverages whole-genome sequencing data to enhance the precision of haplotype calls produced by statistical phasing. SAPPHIRE achieves this by refining haplotype estimates through the realignment of sequencing reads, particularly targeting low-confidence phase calls. Our findings demonstrate that SAPPHIRE significantly enhances the accuracy of haplotypes obtained from state-of-the-art methods and also provides the subset of phase calls that are validated by sequencing reads. Finally, we show that our method scales to large data sets by its successful application to the extensive 3.6 Petabytes of sequencing data of the last UK Biobank 200,031 sample release.

Subjects

Genetics, Population; Haplotypes; Whole Genome Sequencing; Whole Genome Sequencing/methods; Humans; Genetics, Population/methods; Genome, Human; Polymorphism, Single Nucleotide/genetics; Genome-Wide Association Study/methods; Algorithms

15.

A machine learning enhanced EMS mutagenesis probability map for efficient identification of causal mutations in Caenorhabditis elegans.

Guo, Zhengyang; Wang, Shimin; Wang, Yang; Wang, Zi; Ou, Guangshuo.

PLoS Genet; 20(8): e1011377, 2024 Aug.

Article in English | MEDLINE | ID: mdl-39186782

ABSTRACT

Chemical mutagenesis-driven forward genetic screens are pivotal in unveiling gene functions, yet identifying causal mutations behind phenotypes remains laborious, hindering their high-throughput application. Here, we reveal a non-uniform mutation rate caused by Ethyl Methane Sulfonate (EMS) mutagenesis in the C. elegans genome, indicating that mutation frequency is influenced by proximate sequence context and chromatin status. Leveraging these factors, we developed a machine learning enhanced pipeline to create a comprehensive EMS mutagenesis probability map for the C. elegans genome. This map operates on the principle that causative mutations are enriched in genetic screens targeting specific phenotypes among random mutations. Applying this map to Whole Genome Sequencing (WGS) data of genetic suppressors that rescue a C. elegans ciliary kinesin mutant, we successfully pinpointed causal mutations without generating recombinant inbred lines. This method can be adapted in other species, offering a scalable approach for identifying causal genes and revitalizing the effectiveness of forward genetic screens.

Subjects

Caenorhabditis elegans; Ethyl Methanesulfonate; Machine Learning; Mutagenesis; Mutation; Caenorhabditis elegans/genetics; Animals; Phenotype; Whole Genome Sequencing/methods; Kinesins/genetics; Mutation Rate; Caenorhabditis elegans Proteins/genetics; Chromosome Mapping/methods

16.

Large-scale genome sequencing of giant pandas improves the understanding of population structure and future conservation initiatives.

Lan, Tianming; Yang, Shangchen; Li, Haimeng; Zhang, Yi; Li, Rengui; Sahu, Sunil Kumar; Deng, Wenwen; Liu, Boyang; Shi, Minhui; Wang, Shiqing; Du, Hanyu; Huang, Xiaoyu; Lu, Haorong; Liu, Shanlin; Deng, Tao; Chen, Jin; Wang, Qing; Han, Lei; Zhou, Yajie; Li, Qiye; Li, Desheng; Kristiansen, Karsten; Wan, Qiu-Hong; Liu, Huan; Fang, Sheng-Guo.

Proc Natl Acad Sci U S A ; 121(36): e2406343121, 2024 Sep 03.

Article in English | MEDLINE | ID: mdl-39186654

ABSTRACT

The extinction risk of the giant panda has been downgraded from "endangered" to "vulnerable" on the International Union for Conservation of Nature Red List, but its habitat is more fragmented than ever before, resulting in 33 isolated giant panda populations according to the fourth national survey released by the Chinese government. Further comprehensive investigations of the genetic background and in-depth assessments of the conservation status of wild populations are still urgently needed. Here, we sequenced the genomes of 612 giant pandas with an average depth of ~26× and generated a high-resolution map of genomic variation with more than 20 million variants covering wild individuals from six mountain ranges and captive representatives in China. We identified distinct genetic clusters within the Minshan population by performing a fine-grained genetic structure analysis. The estimation of inbreeding and genetic load associated with historical population dynamics suggested that future conservation efforts should pay special attention to the Qinling and Liangshan populations. Releasing captive individuals with a genetic background similar to the recipient population appears to be an advantageous genetic rescue strategy for recovering the wild giant panda populations, as this approach introduces fewer deleterious mutations into the wild population than mating with differentiated lineages. These findings emphasize the superiority of large-scale population genomics in providing precise guidelines for future conservation of the giant panda.

Subjects

Conservation of Natural Resources, Genome, Ursidae, Ursidae/genetics, Animals, Conservation of Natural Resources/methods, Genome/genetics, China, Endangered Species, Genetic Variation, Population Genetics/methods, Population Dynamics, Whole Genome Sequencing/methods

17.

Whole genome sequencing based analysis of inflammation biomarkers in the Trans-Omics for Precision Medicine (TOPMed) consortium.

Jiang, Min-Zhi; Gaynor, Sheila M; Li, Xihao; Van Buren, Eric; Stilp, Adrienne; Buth, Erin; Wang, Fei Fei; Manansala, Regina; Gogarten, Stephanie M; Li, Zilin; Polfus, Linda M; Salimi, Shabnam; Bis, Joshua C; Pankratz, Nathan; Yanek, Lisa R; Durda, Peter; Tracy, Russell P; Rich, Stephen S; Rotter, Jerome I; Mitchell, Braxton D; Lewis, Joshua P; Psaty, Bruce M; Pratte, Katherine A; Silverman, Edwin K; Kaplan, Robert C; Avery, Christy; North, Kari E; Mathias, Rasika A; Faraday, Nauder; Lin, Honghuang; Wang, Biqi; Carson, April P; Norwood, Arnita F; Gibbs, Richard A; Kooperberg, Charles; Lundin, Jessica; Peters, Ulrike; Dupuis, Josée; Hou, Lifang; Fornage, Myriam; Benjamin, Emelia J; Reiner, Alexander P; Bowler, Russell P; Lin, Xihong; Auer, Paul L; Raffield, Laura M.

Hum Mol Genet ; 33(16): 1429-1441, 2024 Aug 06.

Article in English | MEDLINE | ID: mdl-38747556

ABSTRACT

Inflammation biomarkers can provide valuable insight into the role of inflammatory processes in many diseases and conditions. Sequencing-based analyses of such biomarkers can also serve as an exemplar of the genetic architecture of quantitative traits. To evaluate the biological insight that can be provided by a multi-ancestry, whole-genome-based association study, we performed a comprehensive analysis of 21 inflammation biomarkers from up to 38,465 individuals with whole-genome sequencing from the Trans-Omics for Precision Medicine (TOPMed) program (with varying sample size by trait, where the minimum sample size was n = 737 for MMP-1). We identified 22 distinct single-variant associations across 6 traits (E-selectin, intercellular adhesion molecule 1, interleukin-6, lipoprotein-associated phospholipase A2 activity and mass, and P-selectin) that remained significant after conditioning on previously identified associations for these inflammatory biomarkers. We further expanded upon known biomarker associations by pairing the single-variant analysis with a rare variant set-based analysis, which identified 19 significant rare variant set-based associations with 5 traits. These signals were distinct from both significant single-variant association signals within TOPMed and genetic signals observed in prior studies, demonstrating the complementary value of performing both single and rare variant analyses when analyzing quantitative traits. We also confirm several previously reported signals from semi-quantitative proteomics platforms. Many of these signals demonstrate the extensive allelic heterogeneity and ancestry-differentiated variant-trait associations common for inflammation biomarkers, a characteristic we hypothesize will be increasingly observed with well-powered, large-scale analyses of complex traits.

Subjects

Biomarkers, Genome-Wide Association Study, Inflammation, Precision Medicine, Whole Genome Sequencing, Humans, Precision Medicine/methods, Inflammation/genetics, Genome-Wide Association Study/methods, Whole Genome Sequencing/methods, Single Nucleotide Polymorphism, Quantitative Trait Loci, Genetic Predisposition to Disease, Female, Interleukin-6/genetics

18.

Return of Results in Genomic Research Using Large-Scale or Whole Genome Sequencing: Toward a New Normal.

Wolf, Susan M; Green, Robert C.

Annu Rev Genomics Hum Genet ; 24: 393-414, 2023 Aug 25.

Article in English | MEDLINE | ID: mdl-36913714

ABSTRACT

Genome sequencing is increasingly used in research and integrated into clinical care. In the research domain, large-scale analyses, including whole genome sequencing with variant interpretation and curation, virtually guarantee identification of variants that are pathogenic or likely pathogenic and actionable. Multiple guidelines recommend that findings associated with actionable conditions be offered to research participants in order to demonstrate respect for autonomy, reciprocity, and participant interests in health and privacy. Some recommendations go further and support offering a wider range of findings, including those that are not immediately actionable. In addition, entities covered by the US Health Insurance Portability and Accountability Act (HIPAA) may be required to provide a participant's raw genomic data on request. Despite these widely endorsed guidelines and requirements, the implementation of return of genomic results and data by researchers remains uneven. This article analyzes the ethical and legal foundations for researcher duties to offer adult participants their interpreted results and raw data as the new normal in genomic research.

Subjects

Genomics, Whole Genome Sequencing, Genomics/methods, Whole Genome Sequencing/methods, Humans, United States Food and Drug Administration, United States, Information Storage and Retrieval, Health Insurance Portability and Accountability Act

19.

Rare variants in long non-coding RNAs are associated with blood lipid levels in the TOPMed whole-genome sequencing study.

Wang, Yuxuan; Selvaraj, Margaret Sunitha; Li, Xihao; Li, Zilin; Holdcraft, Jacob A; Arnett, Donna K; Bis, Joshua C; Blangero, John; Boerwinkle, Eric; Bowden, Donald W; Cade, Brian E; Carlson, Jenna C; Carson, April P; Chen, Yii-Der Ida; Curran, Joanne E; de Vries, Paul S; Dutcher, Susan K; Ellinor, Patrick T; Floyd, James S; Fornage, Myriam; Freedman, Barry I; Gabriel, Stacey; Germer, Soren; Gibbs, Richard A; Guo, Xiuqing; He, Jiang; Heard-Costa, Nancy; Hidalgo, Bertha; Hou, Lifang; Irvin, Marguerite R; Joehanes, Roby; Kaplan, Robert C; Kardia, Sharon LR; Kelly, Tanika N; Kim, Ryan; Kooperberg, Charles; Kral, Brian G; Levy, Daniel; Li, Changwei; Liu, Chunyu; Lloyd-Jones, Don; Loos, Ruth JF; Mahaney, Michael C; Martin, Lisa W; Mathias, Rasika A; Minster, Ryan L; Mitchell, Braxton D; Montasser, May E; Morrison, Alanna C; Murabito, Joanne M.

Am J Hum Genet ; 110(10): 1704-1717, 2023 Oct 05.

Article in English | MEDLINE | ID: mdl-37802043

ABSTRACT

Long non-coding RNAs (lncRNAs) are known to perform important regulatory functions in lipid metabolism. Large-scale whole-genome sequencing (WGS) studies and new statistical methods for variant set tests now provide an opportunity to assess more associations between rare variants in lncRNA genes and complex traits across the genome. In this study, we used high-coverage WGS from 66,329 participants of diverse ancestries with measurement of blood lipids and lipoproteins (LDL-C, HDL-C, TC, and TG) in the National Heart, Lung, and Blood Institute (NHLBI) Trans-Omics for Precision Medicine (TOPMed) program to investigate the role of lncRNAs in lipid variability. We aggregated rare variants for 165,375 lncRNA genes based on their genomic locations and conducted rare-variant aggregate association tests using the STAAR (variant-set test for association using annotation information) framework. We performed STAAR conditional analysis adjusting for common variants in known lipid GWAS loci and rare coding variants in nearby protein-coding genes. Our analyses revealed 83 rare lncRNA variant sets significantly associated with blood lipid levels, all of which were located in known lipid GWAS loci (in a ±500-kb window of a Global Lipids Genetics Consortium index variant). Notably, 61 out of 83 signals (73%) were conditionally independent of common regulatory variation and rare protein-coding variation at the same loci. We replicated 34 out of 61 (56%) conditionally independent associations using the independent UK Biobank WGS data. Our results expand the genetic architecture of blood lipids to rare variants in lncRNAs.

Subjects

Long Noncoding RNA, Humans, Long Noncoding RNA/genetics, Genome-Wide Association Study, Precision Medicine, Whole Genome Sequencing/methods, Lipids/genetics, Single Nucleotide Polymorphism/genetics

20.

Evaluating the efficacy of MEANGS for mitochondrial genome assembly of cartilaginous and ray-finned fish species.

Xu, Sheng-Yong; Cai, Shan-Shan; Han, Zhi-Qiang.

Brief Bioinform ; 25(2), 2024 Jan 22.

Article in English | MEDLINE | ID: mdl-38349058

ABSTRACT

The assembly of complete and circularized mitochondrial genomes (mitogenomes) is essential for population genetics, phylogenetics and evolutionary studies. Recently, Song et al. developed a seed-free tool called MEANGS for de novo mitochondrial assembly from whole genome sequencing (WGS) data in animals, achieving highly accurate and intact assemblies. However, the suitability of this tool for marine fish remains unexplored. Additionally, we have concerns regarding the overlap sequences in their original results, which may impact downstream analyses. In this Letter to the Editor, we assessed the effectiveness of MEANGS in assembling mitogenomes of cartilaginous and ray-finned fish species. We also discussed the appropriate use of MEANGS in mitogenome assembly, including the data-cut function and circular detection module. Our observations indicated that, with these modules, MEANGS efficiently assembled complete and circularized mitogenomes, even when handling large WGS datasets. Therefore, we strongly recommend that users employ the data-cut function and circular detection module when using MEANGS, as the former significantly reduces runtime and the latter aids in the removal of overlapped sequences for improved circularization. Furthermore, our findings suggested that approximately 2× coverage of clean WGS data was sufficient for MEANGS to assemble mitogenomes in marine fish species. Finally, due to its seed-free nature, MEANGS can be deemed one of the most efficient software tools for assembling mitogenomes from animal WGS data, particularly in studies with limited species or genetic background information.

Subjects

Mitochondrial Genome, Animals, Whole Genome Sequencing/methods, Software, Phylogeny

