Results 1 - 20 of 2,095
1.
Cell ; 184(11): 3006-3021.e17, 2021 05 27.
Article in English | MEDLINE | ID: mdl-33930287

ABSTRACT

Genetic studies have revealed many variant loci that are associated with immune-mediated diseases. To elucidate the disease pathogenesis, it is essential to understand the function of these variants, especially under disease-associated conditions. Here, we performed a large-scale immune cell gene-expression analysis, together with whole-genome sequence analysis. Our dataset consists of 28 distinct immune cell subsets from 337 patients diagnosed with 10 categories of immune-mediated diseases and 79 healthy volunteers. Our dataset captured distinctive gene-expression profiles across immune cell types and diseases. Expression quantitative trait loci (eQTL) analysis revealed dynamic variations of eQTL effects in the context of immunological conditions, as well as cell types. These cell-type-specific and context-dependent eQTLs showed significant enrichment in immune disease-associated genetic variants, and they implicated the disease-relevant cell types, genes, and environment. This atlas deepens our understanding of the immunogenetic functions of disease-associated variants under in vivo disease conditions.


Subjects
Gene Expression Regulation/genetics , Gene Expression/immunology , Immune System Diseases/genetics , Adult , Female , Gene Expression/genetics , Gene Expression Regulation/immunology , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study/methods , Humans , Immune System/cytology , Immune System/metabolism , Immune System Diseases/metabolism , Immune System Diseases/physiopathology , Male , Middle Aged , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Quantitative Trait Loci/immunology , Transcriptome/genetics , Whole Genome Sequencing/methods
2.
Cell ; 184(13): 3426-3437.e8, 2021 06 24.
Article in English | MEDLINE | ID: mdl-33991487

ABSTRACT

We identified an emerging severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variant by viral whole-genome sequencing of 2,172 nasal/nasopharyngeal swab samples from 44 counties in California, a state in the western United States. Named B.1.427/B.1.429 to denote its two lineages, the variant emerged in May 2020 and increased from 0% to >50% of sequenced cases from September 2020 to January 2021, showing 18.6%-24% increased transmissibility relative to wild-type circulating strains. The variant carries three mutations in the spike protein, including an L452R substitution. We found 2-fold increased B.1.427/B.1.429 viral shedding in vivo and increased L452R pseudovirus infection of cell cultures and lung organoids, albeit decreased relative to pseudoviruses carrying the N501Y mutation common to variants B.1.1.7, B.1.351, and P.1. Antibody neutralization assays revealed 4.0- to 6.7-fold and 2.0-fold decreases in neutralizing titers from convalescent patients and vaccine recipients, respectively. The increased prevalence of a more transmissible variant in California exhibiting decreased antibody neutralization warrants further investigation.


Subjects
Antibodies, Neutralizing/immunology , COVID-19/immunology , COVID-19/transmission , SARS-CoV-2/genetics , Spike Glycoprotein, Coronavirus/immunology , Antibodies, Monoclonal/immunology , Antibodies, Viral/immunology , Humans , Mutation/genetics , Whole Genome Sequencing/methods
3.
Cell ; 184(20): 5179-5188.e8, 2021 09 30.
Article in English | MEDLINE | ID: mdl-34499854

ABSTRACT

We present evidence for multiple independent origins of recombinant SARS-CoV-2 viruses sampled from late 2020 and early 2021 in the United Kingdom. Their genomes carry single-nucleotide polymorphisms and deletions that are characteristic of the B.1.1.7 variant of concern but lack the full complement of lineage-defining mutations. Instead, the remainder of their genomes share contiguous genetic variation with non-B.1.1.7 viruses circulating in the same geographic area at the same time as the recombinants. In four instances, there was evidence for onward transmission of a recombinant-origin virus, including one transmission cluster of 45 sequenced cases over the course of 2 months. The inferred genomic locations of recombination breakpoints suggest that every community-transmitted recombinant virus inherited its spike region from a B.1.1.7 parental virus, consistent with a transmission advantage for B.1.1.7's set of mutations.


Subjects
COVID-19/epidemiology , COVID-19/transmission , Pandemics , Recombination, Genetic , SARS-CoV-2/genetics , Base Sequence/genetics , COVID-19/virology , Computational Biology/methods , Gene Frequency , Genome, Viral , Genotype , Humans , Mutation , Phylogeny , Polymorphism, Single Nucleotide , United Kingdom/epidemiology , Whole Genome Sequencing/methods
4.
Cell ; 183(1): 197-210.e32, 2020 10 01.
Article in English | MEDLINE | ID: mdl-33007263

ABSTRACT

Cancer genomes often harbor hundreds of somatic DNA rearrangement junctions, many of which cannot be easily classified into simple (e.g., deletion) or complex (e.g., chromothripsis) structural variant classes. Applying a novel genome graph computational paradigm to analyze the topology of junction copy number (JCN) across 2,778 tumor whole-genome sequences, we uncovered three novel complex rearrangement phenomena: pyrgo, rigma, and tyfonas. Pyrgo are "towers" of low-JCN duplications associated with early-replicating regions, superenhancers, and breast or ovarian cancers. Rigma comprise "chasms" of low-JCN deletions enriched in late-replicating fragile sites and gastrointestinal carcinomas. Tyfonas are "typhoons" of high-JCN junctions and fold-back inversions associated with expressed protein-coding fusions, breakend hypermutation, and acral, but not cutaneous, melanomas. Clustering of tumors according to genome graph-derived features identified subgroups associated with DNA repair defects and poor prognosis.


Subjects
Genomic Structural Variation/genetics , Genomics/methods , Neoplasms/genetics , Chromosome Inversion/genetics , Chromothripsis , DNA Copy Number Variations/genetics , Gene Rearrangement/genetics , Genome, Human/genetics , Humans , Mutation/genetics , Whole Genome Sequencing/methods
5.
Cell ; 177(4): 821-836.e16, 2019 05 02.
Article in English | MEDLINE | ID: mdl-30982602

ABSTRACT

Whole-genome-sequencing (WGS) of human tumors has revealed distinct mutation patterns that hint at the causative origins of cancer. We examined mutational signatures in 324 WGS human-induced pluripotent stem cells exposed to 79 known or suspected environmental carcinogens. Forty-one yielded characteristic substitution mutational signatures. Some were similar to signatures found in human tumors. Additionally, six agents produced double-substitution signatures and eight produced indel signatures. Investigating mutation asymmetries across genome topography revealed fully functional mismatch and transcription-coupled repair pathways. DNA damage induced by environmental mutagens can be resolved by disparate repair and/or replicative pathways, resulting in an assortment of signature outcomes even for a single agent. This compendium of experimentally induced mutational signatures permits further exploration of roles of environmental agents in cancer etiology and underscores how human stem cell DNA is directly vulnerable to environmental agents.


Subjects
Carcinogens, Environmental/classification , Neoplasms/genetics , Carcinogens, Environmental/adverse effects , DNA Damage/genetics , DNA Mutational Analysis/methods , DNA Repair/genetics , DNA Replication , Genetic Profile , Genome, Human/genetics , Humans , INDEL Mutation/genetics , Mutagenesis , Mutation/genetics , Pluripotent Stem Cells/metabolism , Whole Genome Sequencing/methods
6.
Cell ; 177(1): 70-84, 2019 03 21.
Article in English | MEDLINE | ID: mdl-30901550

ABSTRACT

Affordable genome sequencing technologies promise to revolutionize the field of human genetics by enabling comprehensive studies that interrogate all classes of genome variation, genome-wide, across the entire allele frequency spectrum. Ongoing projects worldwide are sequencing many thousands, and soon millions, of human genomes as part of various gene mapping studies, biobanking efforts, and clinical programs. However, while genome sequencing data production has become routine, genome analysis and interpretation remain challenging endeavors with many limitations and caveats. Here, we review the current state of technologies for genetic variant discovery, genotyping, and functional interpretation and discuss the prospects for future advances. We focus on germline variants discovered by whole-genome sequencing, genome-wide functional genomic approaches for predicting and measuring variant functional effects, and implications for studies of common and rare human disease.


Subjects
Genetic Variation/genetics , Genome, Human/genetics , Sequence Analysis, DNA/trends , Biological Specimen Banks , Chromosome Mapping/methods , Genetic Predisposition to Disease/genetics , Genetic Testing/trends , Genome-Wide Association Study , Genomics/methods , Genomics/trends , High-Throughput Nucleotide Sequencing/methods , Human Genome Project , Humans , Polymorphism, Single Nucleotide/genetics , Sequence Analysis, DNA/methods , Whole Genome Sequencing/methods , Whole Genome Sequencing/trends
7.
Cell ; 174(3): 758-769.e9, 2018 07 26.
Article in English | MEDLINE | ID: mdl-30033370

ABSTRACT

While mutations affecting protein-coding regions have been examined across many cancers, structural variants at the genome-wide level are still poorly defined. Through integrative deep whole-genome and -transcriptome analysis of 101 castration-resistant prostate cancer metastases (109X tumor/38X normal coverage), we identified structural variants altering critical regulators of tumorigenesis and progression not detectable by exome approaches. Notably, we observed amplification of an intergenic enhancer region 624 kb upstream of the androgen receptor (AR) in 81% of patients, correlating with increased AR expression. Tandem duplication hotspots also occur near MYC, in lncRNAs associated with post-translational MYC regulation. Classes of structural variations were linked to distinct DNA repair deficiencies, suggesting their etiology, including associations of CDK12 mutation with tandem duplications, TP53 inactivation with inverted rearrangements and chromothripsis, and BRCA2 inactivation with deletions. Together, these observations provide a comprehensive view of how structural variations affect critical regulators in metastatic prostate cancer.


Subjects
Genomic Structural Variation/genetics , Prostatic Neoplasms/genetics , Aged , Aged, 80 and over , BRCA2 Protein/metabolism , Cyclin-Dependent Kinases/metabolism , DNA Copy Number Variations , Exome , Gene Expression Profiling/methods , Genomics/methods , Humans , Male , Middle Aged , Mutation , Neoplasm Metastasis/genetics , Proto-Oncogene Proteins c-myc/genetics , Proto-Oncogene Proteins c-myc/metabolism , Receptors, Androgen/genetics , Receptors, Androgen/metabolism , Tandem Repeat Sequences/genetics , Tumor Suppressor Protein p53/metabolism , Whole Genome Sequencing/methods
8.
Nat Rev Genet ; 23(2): 120-133, 2022 02.
Article in English | MEDLINE | ID: mdl-34556834

ABSTRACT

The prevalence of obesity has tripled over the past four decades, imposing an enormous burden on people's health. Polygenic (or common) obesity and rare, severe, early-onset monogenic obesity are often polarized as distinct diseases. However, gene discovery studies for both forms of obesity show that they have shared genetic and biological underpinnings, pointing to a key role for the brain in the control of body weight. Genome-wide association studies (GWAS) with increasing sample sizes and advances in sequencing technology are the main drivers behind a recent flurry of new discoveries. However, it is the post-GWAS, cross-disciplinary collaborations, which combine new omics technologies and analytical approaches, that have started to facilitate translation of genetic loci into meaningful biology and new avenues for treatment.


Subjects
Genetic Predisposition to Disease/genetics , Genetic Variation , Genome, Human/genetics , Genome-Wide Association Study/methods , Obesity/genetics , Whole Genome Sequencing/methods , Animals , Eating/genetics , Gene-Environment Interaction , Humans , Multifactorial Inheritance/genetics , Overweight/genetics
9.
Mol Cell ; 80(3): 541-553.e5, 2020 11 05.
Article in English | MEDLINE | ID: mdl-33068522

ABSTRACT

To address how genetic variation alters gene expression in complex cell mixtures, we developed direct nuclear tagmentation and RNA sequencing (DNTR-seq), which enables whole-genome and mRNA sequencing jointly in single cells. DNTR-seq readily identified minor subclones within leukemia patients. In a large-scale DNA damage screen, DNTR-seq was used to detect regions under purifying selection and identified genes where mRNA abundance was resistant to copy-number alteration, suggesting strong genetic compensation. mRNA sequencing (mRNA-seq) quality equals RNA-only methods, and the low positional bias of genomic libraries allowed detection of sub-megabase aberrations at ultra-low coverage. Each cell library is individually addressable and can be re-sequenced at increased depth, allowing multi-tiered study designs. Additionally, the direct tagmentation protocol enables coverage-independent estimation of ploidy, which can be used to identify cell singlets. Thus, DNTR-seq directly links each cell's state to its corresponding genome at scale, enabling routine analysis of heterogeneous tumors and other complex tissues.


Subjects
Gene Expression Profiling/methods , Single-Cell Analysis/methods , Whole Genome Sequencing/methods , Animals , Base Sequence/genetics , Cell Line, Tumor , Gene Library , High-Throughput Nucleotide Sequencing/methods , Humans , RNA/genetics , RNA, Messenger/genetics , Sequence Analysis, DNA/methods
10.
Am J Hum Genet ; 111(5): 990-995, 2024 05 02.
Article in English | MEDLINE | ID: mdl-38636510

ABSTRACT

Since genotype imputation was introduced, researchers have been relying on the estimated imputation quality from imputation software to perform post-imputation quality control (QC). However, this quality estimate (denoted as Rsq) performs less well for lower-frequency variants. We recently published MagicalRsq, a machine-learning-based imputation quality calibration, which leverages additional typed markers from the same cohort and outperforms Rsq as a QC metric. In this work, we extended the original MagicalRsq to allow cross-cohort model training and named the new model MagicalRsq-X. We removed the cohort-specific estimated minor allele frequency and included linkage disequilibrium scores and recombination rates as additional features. Leveraging whole-genome sequencing data from TOPMed, specifically participants in the BioMe, JHS, WHI, and MESA studies, we performed comprehensive cross-cohort evaluations for predominantly European and African ancestral individuals based on their inferred global ancestry with the 1000 Genomes and Human Genome Diversity Project data as reference. Our results suggest MagicalRsq-X outperforms Rsq in almost every setting, with 7.3%-14.4% improvement in squared Pearson correlation with true R2, corresponding to 85-218 K variant gains. We further developed a metric to quantify the genetic distances of a target cohort relative to a reference cohort and showed that such metric largely explained the performance of MagicalRsq-X models. Finally, we found MagicalRsq-X saved up to 53 known genome-wide significant variants in one of the largest blood cell trait GWASs that would be missed using the original Rsq for QC. In conclusion, MagicalRsq-X shows superiority for post-imputation QC and benefits genetic studies by distinguishing well and poorly imputed lower-frequency variants.
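
The "true R2" used as the benchmark in this abstract is conventionally the squared Pearson correlation between imputed dosages and sequenced genotypes at a variant. The following is a minimal sketch of that quantity only, not of MagicalRsq-X itself; the function name and the toy dosage and genotype values are hypothetical and chosen for illustration.

```python
def squared_pearson(x, y):
    """Squared Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Centered cross-product and sums of squares
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A monomorphic variant has no variance, so the correlation is undefined
        return float("nan")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical toy data for one variant across six samples:
# imputed allele dosages (0..2) and sequenced genotypes (0, 1, or 2)
imputed_dosage = [0.1, 0.9, 1.8, 0.2, 1.1, 0.0]
sequenced_genotype = [0, 1, 2, 0, 1, 0]
true_r2 = squared_pearson(imputed_dosage, sequenced_genotype)
```

A value near 1 indicates a well-imputed variant; for rare variants the few non-reference carriers dominate the estimate, which is one reason quality metrics are harder to calibrate at low frequency.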


Subjects
Gene Frequency , Genotype , Polymorphism, Single Nucleotide , Software , Humans , Cohort Studies , Linkage Disequilibrium , Genome-Wide Association Study/methods , Genome, Human , Quality Control , Machine Learning , Whole Genome Sequencing/standards , Whole Genome Sequencing/methods
11.
Genome Res ; 34(6): 811-821, 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-38955465

ABSTRACT

Recent advances in genomics, coupled with a unique population structure and remarkable levels of variation, have propelled the domestic dog to new levels as a system for understanding fundamental principles in mammalian biology. Central to this advance are more than 350 recognized breeds, each a closed population that has undergone selection for unique features. Genetic variation in the domestic dog is particularly well characterized compared with other domestic mammals, with almost 3000 high-coverage genomes publicly available. Importantly, as the number of sequenced genomes increases, new avenues for analysis are becoming available. Herein, we discuss recent discoveries in canine genomics regarding behavior, morphology, and disease susceptibility. We explore the limitations of current data sets for variant interpretation, tradeoffs between sequencing strategies, and the burgeoning role of long-read genomes for capturing structural variants. In addition, we consider how large-scale collections of whole-genome sequence data drive rare variant discovery and assess the geographic distribution of canine diversity, which identifies Asia as a major source of missing variation. Finally, we review recent comparative genomic analyses that will facilitate annotation of the noncoding genome in dogs.


Subjects
Genome , Genomics , Dogs/genetics , Animals , Genomics/methods , Genetic Variation , Whole Genome Sequencing/methods
12.
Genome Res ; 34(4): 633-641, 2024 05 15.
Article in English | MEDLINE | ID: mdl-38589250

ABSTRACT

Accurate detection of somatic mutations in DNA sequencing data is a fundamental prerequisite for cancer research. Previous analytical challenges were overcome by consensus mutation calling from four to five popular callers. This, however, increases the already nontrivial computing time from individual callers. Here, we launch MuSE 2, powered by multistep parallelization and efficient memory allocation, to resolve the computing time bottleneck. MuSE 2 runs 50 times faster than MuSE 1 and eight to 80 times faster than other popular callers. Our benchmark study suggests that combining MuSE 2 and the recently accelerated Strelka2 achieves high efficiency and accuracy in analyzing large cancer genomic data sets.


Subjects
Exome Sequencing , Mutation , Neoplasms , Whole Genome Sequencing , Humans , Neoplasms/genetics , Exome Sequencing/methods , Whole Genome Sequencing/methods , Software , Genome, Human , Genomics/methods , Algorithms , DNA Mutational Analysis/methods
13.
Genome Res ; 34(6): 877-887, 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-38977307

ABSTRACT

The zoonotic parasite Cryptosporidium parvum is a global cause of gastrointestinal disease in humans and ruminants. Sequence analysis of the highly polymorphic gp60 gene enabled the classification of C. parvum isolates into multiple groups (e.g., IIa, IIc, Id) and a large number of subtypes. In Europe, subtype IIaA15G2R1 is largely predominant and has been associated with many water- and food-borne outbreaks. In this study, we generated new whole-genome sequence (WGS) data from 123 human- and ruminant-derived isolates collected in 13 European countries and included other available WGS data from Europe, Egypt, China, and the United States (n = 72) in the largest comparative genomics study to date. We applied rigorous filters to exclude mixed infections and analyzed a data set from 141 isolates from the zoonotic groups IIa (n = 119) and IId (n = 22). Based on 28,047 high-quality, biallelic genomic SNPs, we identified three distinct and strongly supported populations: Isolates from China (IId) and Egypt (IIa and IId) formed population 1; a minority of European isolates (IIa and IId) formed population 2; and the majority of European (IIa, including all IIaA15G2R1 isolates) and all isolates from the United States (IIa) clustered in population 3. Based on analyses of the population structure, population genetics, and recombination, we show that population 3 has recently emerged and expanded throughout Europe to then, possibly from the United Kingdom, reach the United States, where it also expanded. The reason(s) for the successful spread of population 3 remain elusive, although genes under selective pressure uniquely in this population were identified.


Subjects
Cryptosporidiosis , Cryptosporidium parvum , Disease Outbreaks , Cryptosporidium parvum/genetics , United States/epidemiology , Europe/epidemiology , Humans , Cryptosporidiosis/parasitology , Cryptosporidiosis/epidemiology , Animals , Genomics/methods , Polymorphism, Single Nucleotide , Phylogeny , Whole Genome Sequencing/methods , Genome, Protozoan , China/epidemiology , Egypt/epidemiology
14.
PLoS Genet ; 20(7): e1011092, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38959269

ABSTRACT

Haplotype estimation, or phasing, has gained significant traction in large-scale projects due to its valuable contributions to population genetics, variant analysis, and the creation of reference panels for imputation and phasing of new samples. To scale with the growing number of samples, haplotype estimation methods designed for population scale rely on highly optimized statistical models to phase genotype data, and usually ignore read-level information. Statistical methods excel in resolving common variants; however, they still struggle with rare variants due to the lack of statistical information. In this study we introduce SAPPHIRE, a new method that leverages whole-genome sequencing data to enhance the precision of haplotype calls produced by statistical phasing. SAPPHIRE achieves this by refining haplotype estimates through the realignment of sequencing reads, particularly targeting low-confidence phase calls. Our findings demonstrate that SAPPHIRE significantly enhances the accuracy of haplotypes obtained from state-of-the-art methods and also provides the subset of phase calls that are validated by sequencing reads. Finally, we show that our method scales to large data sets by its successful application to the extensive 3.6 petabytes of sequencing data of the last UK Biobank 200,031 sample release.


Subjects
Genetics, Population , Haplotypes , Whole Genome Sequencing , Whole Genome Sequencing/methods , Humans , Genetics, Population/methods , Genome, Human , Polymorphism, Single Nucleotide/genetics , Genome-Wide Association Study/methods , Algorithms
15.
Hum Mol Genet ; 33(16): 1429-1441, 2024 Aug 06.
Article in English | MEDLINE | ID: mdl-38747556

ABSTRACT

Inflammation biomarkers can provide valuable insight into the role of inflammatory processes in many diseases and conditions. Sequencing-based analyses of such biomarkers can also serve as an exemplar of the genetic architecture of quantitative traits. To evaluate the biological insight that can be provided by a multi-ancestry, whole-genome-based association study, we performed a comprehensive analysis of 21 inflammation biomarkers from up to 38,465 individuals with whole-genome sequencing from the Trans-Omics for Precision Medicine (TOPMed) program (with varying sample size by trait, where the minimum sample size was n = 737 for MMP-1). We identified 22 distinct single-variant associations across 6 traits (E-selectin, intercellular adhesion molecule 1, interleukin-6, lipoprotein-associated phospholipase A2 activity and mass, and P-selectin) that remained significant after conditioning on previously identified associations for these inflammatory biomarkers. We further expanded upon known biomarker associations by pairing the single-variant analysis with a rare variant set-based analysis that further identified 19 significant rare variant set-based associations with 5 traits. These signals were distinct from both significant single-variant association signals within TOPMed and genetic signals observed in prior studies, demonstrating the complementary value of performing both single and rare variant analyses when analyzing quantitative traits. We also confirm several previously reported signals from semi-quantitative proteomics platforms. Many of these signals demonstrate the extensive allelic heterogeneity and ancestry-differentiated variant-trait associations common for inflammation biomarkers, a characteristic we hypothesize will be increasingly observed with well-powered, large-scale analyses of complex traits.


Subjects
Biomarkers , Genome-Wide Association Study , Inflammation , Precision Medicine , Whole Genome Sequencing , Humans , Precision Medicine/methods , Inflammation/genetics , Genome-Wide Association Study/methods , Whole Genome Sequencing/methods , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Genetic Predisposition to Disease , Female , Interleukin-6/genetics
16.
Annu Rev Genomics Hum Genet ; 24: 393-414, 2023 08 25.
Article in English | MEDLINE | ID: mdl-36913714

ABSTRACT

Genome sequencing is increasingly used in research and integrated into clinical care. In the research domain, large-scale analyses, including whole genome sequencing with variant interpretation and curation, virtually guarantee identification of variants that are pathogenic or likely pathogenic and actionable. Multiple guidelines recommend that findings associated with actionable conditions be offered to research participants in order to demonstrate respect for autonomy, reciprocity, and participant interests in health and privacy. Some recommendations go further and support offering a wider range of findings, including those that are not immediately actionable. In addition, entities covered by the US Health Insurance Portability and Accountability Act (HIPAA) may be required to provide a participant's raw genomic data on request. Despite these widely endorsed guidelines and requirements, the implementation of return of genomic results and data by researchers remains uneven. This article analyzes the ethical and legal foundations for researcher duties to offer adult participants their interpreted results and raw data as the new normal in genomic research.


Subjects
Genomics , Whole Genome Sequencing , Genomics/methods , Whole Genome Sequencing/methods , Humans , United States Food and Drug Administration , United States , Information Storage and Retrieval , Health Insurance Portability and Accountability Act
17.
Am J Hum Genet ; 110(10): 1704-1717, 2023 10 05.
Article in English | MEDLINE | ID: mdl-37802043

ABSTRACT

Long non-coding RNAs (lncRNAs) are known to perform important regulatory functions in lipid metabolism. Large-scale whole-genome sequencing (WGS) studies and new statistical methods for variant set tests now provide an opportunity to assess more associations between rare variants in lncRNA genes and complex traits across the genome. In this study, we used high-coverage WGS from 66,329 participants of diverse ancestries with measurement of blood lipids and lipoproteins (LDL-C, HDL-C, TC, and TG) in the National Heart, Lung, and Blood Institute (NHLBI) Trans-Omics for Precision Medicine (TOPMed) program to investigate the role of lncRNAs in lipid variability. We aggregated rare variants for 165,375 lncRNA genes based on their genomic locations and conducted rare-variant aggregate association tests using the STAAR (variant-set test for association using annotation information) framework. We performed STAAR conditional analysis adjusting for common variants in known lipid GWAS loci and rare-coding variants in nearby protein-coding genes. Our analyses revealed 83 rare lncRNA variant sets significantly associated with blood lipid levels, all of which were located in known lipid GWAS loci (in a ±500-kb window of a Global Lipids Genetics Consortium index variant). Notably, 61 out of 83 signals (73%) were conditionally independent of common regulatory variation and rare protein-coding variation at the same loci. We replicated 34 out of 61 (56%) conditionally independent associations using the independent UK Biobank WGS data. Our results expand the genetic architecture of blood lipids to rare variants in lncRNAs.


Subjects
RNA, Long Noncoding , Humans , RNA, Long Noncoding/genetics , Genome-Wide Association Study , Precision Medicine , Whole Genome Sequencing/methods , Lipids/genetics , Polymorphism, Single Nucleotide/genetics
18.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38349058

ABSTRACT

The assembly of complete and circularized mitochondrial genomes (mitogenomes) is essential for population genetics, phylogenetics and evolution studies. Recently, Song et al. developed a seed-free tool called MEANGS for de novo mitochondrial assembly from whole genome sequencing (WGS) data in animals, achieving highly accurate and intact assemblies. However, the suitability of this tool for marine fish remains unexplored. Additionally, we have concerns regarding the overlap sequences in their original results, which may impact downstream analyses. In this Letter to the Editor, the effectiveness of MEANGS in assembling mitogenomes of cartilaginous and ray-finned fish species was assessed. Moreover, we also discussed the appropriate utilization of MEANGS in mitogenome assembly, including the implementation of the data-cut function and circular detection module. Our observations indicated that with the utilization of these modules, MEANGS efficiently assembled complete and circularized mitogenomes, even when handling large WGS datasets. Therefore, we strongly recommend users employ the data-cut function and circular detection module when using MEANGS, as the former significantly reduces runtime and the latter aids in the removal of overlapped sequences for improved circularization. Furthermore, our findings suggested that approximately 2× coverage of clean WGS data was sufficient for MEANGS to assemble mitogenomes in marine fish species. Moreover, due to its seed-free nature, MEANGS can be deemed one of the most efficient software tools for assembling mitogenomes from animal WGS data, particularly in studies with limited species or genetic background information.
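
To make the "approximately 2× coverage" figure concrete, sequencing depth is usually estimated as total sequenced bases divided by genome size. The sketch below turns a target depth into a read-pair count. The function name, the 1 Gb genome size and the 150 bp paired-end read length are illustrative assumptions, not values from the Letter.

```python
import math


def read_pairs_for_coverage(genome_size_bp, target_coverage, read_length_bp, paired=True):
    """Estimate how many reads (or read pairs) give the target average depth.

    Uses the standard approximation: coverage = total bases / genome size.
    """
    bases_needed = genome_size_bp * target_coverage
    # A paired-end fragment contributes two reads of read_length_bp each
    bases_per_fragment = read_length_bp * (2 if paired else 1)
    return math.ceil(bases_needed / bases_per_fragment)


# Hypothetical example: a 1 Gb fish genome, 2x depth, 150 bp paired-end reads
pairs_needed = read_pairs_for_coverage(1_000_000_000, 2, 150)
```

Because mitochondrial DNA is present in many copies per cell, a low nuclear depth can still yield a much higher depth over the mitogenome, which is consistent with assembly succeeding from a small subset of the reads.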


Subjects
Genome, Mitochondrial , Animals , Whole Genome Sequencing/methods , Software , Phylogeny
19.
Brief Bioinform ; 25(Supplement_1)2024 Jul 23.
Article in English | MEDLINE | ID: mdl-39041913

ABSTRACT

This study describes the development of a resource module that is part of a learning platform named 'NIGMS Sandbox for Cloud-based Learning' https://github.com/NIGMS/NIGMS-Sandbox. The overall genesis of the Sandbox is described in the editorial NIGMS Sandbox at the beginning of this Supplement. This module is designed to facilitate interactive learning of whole-genome bisulfite sequencing (WGBS) data analysis utilizing cloud-based tools in Google Cloud Platform, such as Cloud Storage, Vertex AI notebooks and Google Batch. WGBS is a powerful technique that can provide comprehensive insights into DNA methylation patterns at single cytosine resolution, essential for understanding epigenetic regulation across the genome. The designed learning module first provides step-by-step tutorials that guide learners through two main stages of WGBS data analysis: preprocessing and the identification of differentially methylated regions. It then provides a streamlined workflow and demonstrates how to use it effectively for large datasets given the power of cloud infrastructure. The integration of these interconnected submodules progressively deepens the user's understanding of the WGBS analysis process along with the use of cloud resources. Through this module, we can enhance the accessibility and adoption of cloud computing in epigenomic research, speeding up the advancements in the related field and beyond.


Subjects
Cloud Computing , DNA Methylation , Software , Whole Genome Sequencing , Whole Genome Sequencing/methods , Sulfites/chemistry , Humans , Epigenesis, Genetic , Computational Biology/methods
20.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38966948

ABSTRACT

Variants in cis-regulatory elements link the noncoding genome to human pathology; however, detailed analytic tools for understanding the association between cell-level brain pathology and noncoding variants are lacking. CWAS-Plus, adapted from a Python package for category-wide association testing (CWAS), enhances noncoding variant analysis by integrating both whole-genome sequencing (WGS) and user-provided functional data. With simplified parameter settings and an efficient multiple testing correction method, CWAS-Plus conducts the CWAS workflow 50 times faster than CWAS, making it more accessible and user-friendly for researchers. Here, we used a single-nuclei assay for transposase-accessible chromatin with sequencing to facilitate CWAS-guided noncoding variant analysis at cell-type-specific enhancers and promoters. Examining autism spectrum disorder WGS data (n = 7280), CWAS-Plus identified noncoding de novo variant associations in transcription factor binding sites within conserved loci. Independently, in Alzheimer's disease WGS data (n = 1087), CWAS-Plus detected rare noncoding variant associations in microglia-specific regulatory elements. These findings highlight CWAS-Plus's utility in genomic disorders and scalability for processing large-scale WGS data and in multiple-testing corrections. CWAS-Plus and its user manual are available at https://github.com/joonan-lab/cwas/ and https://cwas-plus.readthedocs.io/en/latest/, respectively.


Subjects
Whole Genome Sequencing , Humans , Whole Genome Sequencing/methods , Alzheimer Disease/genetics , Genome-Wide Association Study/methods , Autism Spectrum Disorder/genetics , Genetic Variation , Software , Chromatin/genetics , Chromatin/metabolism , Genome, Human