1.
Nature ; 2024 May 20.
Article in English | MEDLINE | ID: mdl-38768635

ABSTRACT

Rare coding variants that significantly impact function provide insights into the biology of a gene [1-3]. However, ascertaining their frequency requires large sample sizes [4-8]. Here, we present a catalogue of human protein-coding variation, derived from exome sequencing of 983,578 individuals across diverse populations. 23% of the Regeneron Genetics Center Million Exome data (RGC-ME) comes from non-European individuals of African, East Asian, Indigenous American, Middle Eastern, and South Asian ancestry. This catalogue includes over 10.4 million missense and 1.1 million predicted loss-of-function (pLOF) variants. We identify individuals with rare biallelic pLOF variants in 4,848 genes, 1,751 of which have not been previously reported. From precise quantitative estimates of selection against heterozygous loss-of-function, we identify 3,988 loss-of-function intolerant genes, including 86 that were previously assessed as tolerant and 1,153 lacking established disease annotation. We also define regions of missense depletion at high resolution. Notably, 1,482 genes have regions depleted of missense variants despite being tolerant to pLOF variants. Finally, we estimate that 3% of individuals have a clinically actionable genetic variant, and that 11,773 variants reported in ClinVar with unknown significance are likely to be deleterious cryptic splice sites. To facilitate variant interpretation and genetics-informed precision medicine, we make this important resource of coding variation from the RGC-ME accessible via a public variant allele frequency browser.

2.
bioRxiv ; 2023 Nov 02.
Article in English | MEDLINE | ID: mdl-37214792

ABSTRACT

Coding variants that have significant impact on function can provide insights into the biology of a gene but are typically rare in the population. Identifying and ascertaining the frequency of such rare variants requires very large sample sizes. Here, we present the largest catalog of human protein-coding variation to date, derived from exome sequencing of 985,830 individuals of diverse ancestry to serve as a rich resource for studying rare coding variants. Individuals of African, Admixed American, East Asian, Middle Eastern, and South Asian ancestry account for 20% of this Exome dataset. Our catalog of variants includes approximately 10.5 million missense (54% novel) and 1.1 million predicted loss-of-function (pLOF) variants (65% novel, 53% observed only once). We identified individuals with rare homozygous pLOF variants in 4,874 genes, and for 1,838 of these this work is the first to document at least one pLOF homozygote. Additional insights from the RGC-ME dataset include 1) improved estimates of selection against heterozygous loss-of-function and identification of 3,459 genes intolerant to loss-of-function, 83 of which were previously assessed as tolerant to loss-of-function and 1,241 that lack disease annotations; 2) identification of regions depleted of missense variation in 457 genes that are tolerant to loss-of-function; 3) functional interpretation for 10,708 variants of unknown or conflicting significance reported in ClinVar as cryptic splice sites using splicing score thresholds based on empirical variant deleteriousness scores derived from RGC-ME; and 4) an observation that approximately 3% of sequenced individuals carry a clinically actionable genetic variant in the ACMG SF 3.1 list of genes. We make this important resource of coding variation available to the public through a variant allele frequency browser. 
We anticipate that this report and the RGC-ME dataset will serve as a valuable reference for understanding rare coding variation and help advance precision medicine efforts.

3.
Rev Sci Instrum ; 89(9): 095111, 2018 Sep.
Article in English | MEDLINE | ID: mdl-30278750

ABSTRACT

We present an inexpensive, generalizable approach for modifying visible wavelength fluorescence microplate readers to detect emission in the near-infrared (NIR) I (650-950 nm) and NIR II (1000-1350 nm) tissue imaging windows. These wavelength ranges are promising for high sensitivity fluorescence-based cell assays and biological imaging, but the inaccessibility of NIR microplate readers is limiting development of the requisite, biocompatible fluorescent probes. Our modifications enable rapid screening of NIR candidate probes, using short pulses of UV light to provide excitation of diverse systems including dye molecules, semiconductor quantum dots, and metal clusters. To confirm the utility of our approach for rapid discovery of new NIR probes, we examine the silver cluster synthesis products formed on 375 candidate DNA strands that were originally designed to produce green-emitting, DNA-stabilized silver clusters. The fast, sensitive system developed here discovered DNA strands that unexpectedly stabilize NIR-emitting silver clusters.

4.
ACS Nano ; 12(8): 8240-8247, 2018 Aug 28.
Article in English | MEDLINE | ID: mdl-30059609

ABSTRACT

DNA nucleobase sequence controls the size of DNA-stabilized silver clusters, leading to their well-known yet little understood sequence-tuned colors. The enormous space of possible DNA sequences for templating clusters has challenged the understanding of how sequence selects cluster properties and has limited the design of applications that employ these clusters. We investigate the genomic role of DNA sequence for fluorescent silver clusters using a data-driven approach. Employing rapid parallel silver cluster synthesis and fluorimetry, we determine the fluorescence spectra of silver cluster products stabilized by 1432 distinct DNA oligomers. By applying pattern recognition algorithms to this large experimental data set, we discover certain DNA base patterns, or "motifs," that correlate to silver clusters with similar fluorescence spectra. These motifs are employed in machine learning classifiers to predictively design DNA template sequences for specific fluorescence color bands. Our method improves selectivity of templates by 330% for silver clusters with peak emission wavelengths beyond 660 nm. The discovered base motifs also provide physical insights into how DNA sequence controls silver cluster size and color. This predictive design approach for color of DNA-stabilized silver clusters exhibits the potential of machine learning and data mining to increase the precision and efficiency of nanomaterials design, even for a soft-matter-inorganic hybrid system characterized by an extremely large parameter space.


Subject(s)
Color, Coloring Agents/chemistry, DNA/genetics, Fluorescence, Silver/chemistry, Base Sequence

5.
Nanoscale ; 10(42): 19701-19705, 2018 Nov 01.
Article in English | MEDLINE | ID: mdl-30350832

ABSTRACT

We use high-throughput near-infrared (NIR) screening technology to discover abundant new DNA-stabilized silver clusters, Ag_N-DNA, that fluoresce in the NIR. These include the longest wavelength Ag_N-DNA fluorophores identified to date, with peak emission beyond 950 nm that extends into the NIR II tissue transparency window, and the highest silver content.


Subject(s)
DNA/chemistry, Silver/chemistry, Spectroscopy, Near-Infrared, Chromatography, High Pressure Liquid, Mass Spectrometry