Search | VHL Search Portal

Genotyping sequence-resolved copy-number variation using pangenomes reveals paralog-specific global diversity and expression divergence of duplicated genes.

Ma, Walfred; Chaisson, Mark Jp.

bioRxiv ; 2024 Aug 29.

Article in English | MEDLINE | ID: mdl-39149335

ABSTRACT

Human pangenomes contain assemblies of non-reference copy-number variable (CNV) genes. We developed a new method, ctyper, that uses NGS datasets to identify the copy number of specific alleles of CNV genes cataloged in pangenomes. Applying ctyper to the 1000 Genomes samples revealed population stratification of paralogs and two classes of CNVs: recent CNVs due to ongoing duplications, and polymorphic CNVs from non-reference ancient paralogs. Expression quantitative trait locus analysis determined allele-specific expression within gene families, revealing that 7.94% of paralogs and 3.28% of orthologs had significantly divergent expression. Case studies of individual genes include finding lower expression of SMN-1 copies that arose from conversion from SMN-2, and increased expression of a form of AMY2B that has undergone a translocation. Moreover, 4.7% of paralogs and 1.2% of orthologs had different most-expressed tissues. Furthermore, the genotypes explain more expression variance than known eQTL variants. Overall, ctyper enables biobank-scale genotyping of sequence-resolved CNVs.

Structural variation across 138,134 samples in the TOPMed consortium.

Jun, Goo; English, Adam C; Metcalf, Ginger A; Yang, Jianzhi; Chaisson, Mark Jp; Pankratz, Nathan; Menon, Vipin K; Salerno, William J; Krasheninina, Olga; Smith, Albert V; Lane, John A; Blackwell, Tom; Kang, Hyun Min; Salvi, Sejal; Meng, Qingchang; Shen, Hua; Pasham, Divya; Bhamidipati, Sravya; Kottapalli, Kavya; Arnett, Donna K; Ashley-Koch, Allison; Auer, Paul L; Beutel, Kathleen M; Bis, Joshua C; Blangero, John; Bowden, Donald W; Brody, Jennifer A; Cade, Brian E; Chen, Yii-Der Ida; Cho, Michael H; Curran, Joanne E; Fornage, Myriam; Freedman, Barry I; Fingerlin, Tasha; Gelb, Bruce D; Hou, Lifang; Hung, Yi-Jen; Kane, John P; Kaplan, Robert; Kim, Wonji; Loos, Ruth J F; Marcus, Gregory M; Mathias, Rasika A; McGarvey, Stephen T; Montgomery, Courtney; Naseri, Take; Nouraie, S Mehdi; Preuss, Michael H; Palmer, Nicholette D; Peyser, Patricia A.

bioRxiv ; 2023 Jan 25.

Article in English | MEDLINE | ID: mdl-36747810

ABSTRACT

Ever-larger structural variant (SV) catalogs highlighting the diversity within and between populations help researchers better understand the links between SVs and disease. The identification of SVs from DNA sequence data is non-trivial and requires a balance between comprehensiveness and precision. Here we present a catalog of 355,667 SVs (59.34% novel) across autosomes and the X chromosome (50 bp+) from 138,134 individuals in the diverse TOPMed consortium. We describe our methodologies for SV inference resulting in high variant quality and >90% allele concordance compared to long-read de novo assemblies of well-characterized control samples. We demonstrate utility through significant associations between SVs and various important cardio-metabolic and hematologic traits. We have identified 690 SV hotspots and deserts and those that potentially impact the regulation of medically relevant genes. This catalog characterizes SVs across multiple populations and will serve as a valuable tool to understand the impact of SV on disease development and progression.

Structural variation across 138,134 samples in the TOPMed consortium.

Res Sq ; 2023 Feb 03.

Article in English | MEDLINE | ID: mdl-36778386

ABSTRACT

Ever-larger structural variant (SV) catalogs highlighting the diversity within and between populations help researchers better understand the links between SVs and disease. The identification of SVs from DNA sequence data is non-trivial and requires a balance between comprehensiveness and precision. Here we present a catalog of 355,667 SVs (59.34% novel) across autosomes and the X chromosome (50 bp+) from 138,134 individuals in the diverse TOPMed consortium. We describe our methodologies for SV inference resulting in high variant quality and >90% allele concordance compared to long-read de novo assemblies of well-characterized control samples. We demonstrate utility through significant associations between SVs and various important cardio-metabolic and hematologic traits. We have identified 690 SV hotspots and deserts and those that potentially impact the regulation of medically relevant genes. This catalog characterizes SVs across multiple populations and will serve as a valuable tool to understand the impact of SV on disease development and progression.
