Genome-wide identification of dominant polyadenylation hexamers for use in variant classification.
Shiferaw, Henoke K; Hong, Celine S; Cooper, David N; Johnston, Jennifer J; Biesecker, Leslie G.
Affiliation
  • Shiferaw HK; Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, 50 South Drive, Bethesda, MD 20892, United States.
  • Hong CS; Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, 50 South Drive, Bethesda, MD 20892, United States.
  • Cooper DN; Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff CF14 4XN, United Kingdom.
  • Johnston JJ; Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, 50 South Drive, Bethesda, MD 20892, United States.
  • NISC; NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, United States.
  • Biesecker LG; Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, 50 South Drive, Bethesda, MD 20892, United States.
Hum Mol Genet ; 32(23): 3211-3224, 2023 Nov 17.
Article in En | MEDLINE | ID: mdl-37606238
ABSTRACT
Polyadenylation is an essential process for the stabilization and export of mRNAs to the cytoplasm, and the polyadenylation signal hexamer (herein referred to as hexamer) plays a key role in this process. Yet only 14 Mendelian disorders have been associated with hexamer variants. This is likely an under-ascertainment, as hexamers are not well defined and not routinely examined in molecular analysis. To facilitate the interrogation of putatively pathogenic hexamer variants, we set out to define functionally important hexamers genome-wide as a resource for research and clinical testing. We identified predominant polyA sites (herein referred to as pPAS) and putative predominant hexamers across protein-coding genes (PAS usage >50% per gene). As a measure of the validity of these sites, the population constraint of 4532 predominant hexamers was measured. The predominant hexamers had fewer observed variants than non-predominant hexamers and trimer controls, and CADD scores for variants in these hexamers were significantly higher than those for controls. Exome data for 1477 individuals were interrogated for hexamer variants, and transcriptome data were generated for 76 individuals with 65 variants in predominant hexamers. 3' RNA-seq data showed that these variants resulted in alternate polyadenylation events (38%) and in elongated mRNA transcripts (12%). Our lists of pPAS and predominant hexamers are available in the UCSC Genome Browser and on GitHub. We suggest that this list of predominant hexamers can be used to interrogate exome and genome data. Variants in these predominant hexamers should be considered candidates for pathogenic variation in human disease, and to that end we suggest pathogenicity criteria for classifying hexamer variants.
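The "PAS usage >50% per gene" criterion in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `predominant_sites`, the input layout (gene, site identifier, read count), and the toy gene and site labels are all hypothetical, and usage is assumed here to mean a site's share of the reads supporting any polyA site in the same gene.

```python
from collections import defaultdict


def predominant_sites(site_counts, threshold=0.5):
    """Return {gene: (site_id, usage)} for sites whose usage exceeds threshold.

    site_counts: iterable of (gene, site_id, read_count) tuples (assumed layout).
    Usage is computed as read_count / total reads over all sites of the gene.
    With threshold >= 0.5, at most one site per gene can qualify.
    """
    rows = list(site_counts)

    # Sum supporting reads per gene across all of its polyA sites.
    totals = defaultdict(int)
    for gene, _site, count in rows:
        totals[gene] += count

    result = {}
    for gene, site, count in rows:
        total = totals[gene]
        # Strictly greater than the threshold, mirroring ">50%".
        if total > 0 and count / total > threshold:
            result[gene] = (site, count / total)
    return result


# Toy data (hypothetical): GENE_A has one dominant site, GENE_B is split evenly.
example = [
    ("GENE_A", "chr1:1000", 80),
    ("GENE_A", "chr1:1500", 20),
    ("GENE_B", "chr2:500", 50),
    ("GENE_B", "chr2:900", 50),
]
pps = predominant_sites(example)
```

In this toy input only GENE_A yields a predominant site; GENE_B is skipped because neither of its sites strictly exceeds half of the gene's reads.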

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Genome / Polyadenylation Type of study: Diagnostic_studies / Prognostic_studies Limits: Humans Language: En Journal: Hum Mol Genet Journal subject: BIOLOGIA MOLECULAR / GENETICA MEDICA Year: 2023 Document type: Article Affiliation country: United States
