ABSTRACT
Human genetic variation has enabled the identification of several key regulators of fetal-to-adult hemoglobin switching, including BCL11A, resulting in therapeutic advances. However, despite this progress, few further insights have emerged to provide a fuller accounting of how genetic variation contributes to the global mechanisms of fetal hemoglobin (HbF) gene regulation. Here, we have conducted a multi-ancestry genome-wide association study of 28,279 individuals from several cohorts spanning 5 continents to define the architecture of human genetic variation impacting HbF. We have identified a total of 178 conditionally independent genome-wide significant or suggestive variants across 14 genomic windows. Importantly, these new data enable us to better define the mechanisms by which HbF switching occurs in vivo. We conduct targeted perturbations to define BACH2 as a new genetically nominated regulator of hemoglobin switching. We define putative causal variants and underlying mechanisms at the well-studied BCL11A and HBS1L-MYB loci, illuminating the complex variant-driven regulation present at these loci. We additionally show how rare large-effect deletions in the HBB locus can interact with polygenic variation to influence HbF levels. Our study paves the way for the next generation of therapies to more effectively induce HbF in sickle cell disease and β-thalassemia.
ABSTRACT
Background: Causal variants underlying rare disorders may remain elusive even after expansive gene panels or exome sequencing (ES). Clinicians and researchers may then turn to genome sequencing (GS), though the added value of this technique and its optimal use remain poorly defined. We therefore investigated the advantages of GS within a phenotypically diverse cohort. Methods: GS was performed for 744 individuals with rare disease who were genetically undiagnosed. Analysis included review of single nucleotide, indel, structural, and mitochondrial variants. Results: We successfully solved 218/744 (29.3%) cases using GS, with most solves involving established disease genes (157/218, 72.0%). Of all solved cases, 148 (67.9%) had previously had non-diagnostic ES. We systematically evaluated the 218 causal variants for features that require GS for identification; 61/218 (28.0%) met these criteria, representing 8.2% of the entire cohort. These included small structural variants (13), copy-neutral inversions and complex rearrangements (8), tandem repeat expansions (6), deep intronic variants (15), and coding variants that may be more easily found using GS because of its uniformity of coverage (19). Conclusion: We describe the diagnostic yield of GS in a large and diverse cohort, illustrating several types of pathogenic variation eluding ES or other techniques. Our results reveal a higher diagnostic yield of GS, supporting the utility of a genome-first approach, with consideration of GS as a secondary or tertiary test when higher-resolution structural variant analysis is needed or there is a strong clinical suspicion for a condition and prior targeted genetic testing has been negative.