Results 1 - 20 of 97
1.
Cell ; 187(10): 2411-2427.e25, 2024 May 09.
Article in English | MEDLINE | ID: mdl-38608704

ABSTRACT

We set out to exhaustively characterize the impact of the cis-chromatin environment on prime editing, a precise genome engineering tool. Using a highly sensitive method for mapping the genomic locations of randomly integrated reporters, we discover massive position effects, exemplified by editing efficiencies ranging from ∼0% to 94% for an identical target site and edit. Position effects on prime editing efficiency are well predicted by chromatin marks, e.g., positively by H3K79me2 and negatively by H3K9me3. Next, we developed a multiplex perturbational framework to assess the interaction of trans-acting factors with the cis-chromatin environment on editing outcomes. Applying this framework to DNA repair factors, we identify HLTF as a context-dependent repressor of prime editing. Finally, several lines of evidence suggest that active transcriptional elongation enhances prime editing. Consistent with this, we show we can robustly decrease or increase the efficiency of prime editing by preceding it with CRISPR-mediated silencing or activation, respectively.


Subject(s)
CRISPR-Cas Systems , Chromatin , Epigenesis, Genetic , Gene Editing , Humans , Chromatin/metabolism , Chromatin/genetics , CRISPR-Cas Systems/genetics , Gene Editing/methods , Histones/metabolism , Transcription Factors/metabolism , Histone Code
2.
Cell ; 176(1-2): 377-390.e19, 2019 01 10.
Article in English | MEDLINE | ID: mdl-30612741

ABSTRACT

Over one million candidate regulatory elements have been identified across the human genome, but nearly all are unvalidated and their target genes uncertain. Approaches based on human genetics are limited in scope to common variants and in resolution by linkage disequilibrium. We present a multiplex, expression quantitative trait locus (eQTL)-inspired framework for mapping enhancer-gene pairs by introducing random combinations of CRISPR/Cas9-mediated perturbations to each of many cells, followed by single-cell RNA sequencing (RNA-seq). Across two experiments, we used dCas9-KRAB to perturb 5,920 candidate enhancers with no strong a priori hypothesis as to their target gene(s), measuring effects by profiling 254,974 single-cell transcriptomes. We identified 664 (470 high-confidence) cis enhancer-gene pairs, which were enriched for specific transcription factors, non-housekeeping status, and genomic and 3D conformational proximity to their target genes. This framework will facilitate the large-scale mapping of enhancer-gene regulatory interactions, a critical yet largely uncharted component of the cis-regulatory landscape of the human genome.


Subject(s)
Chromosome Mapping/methods , Enhancer Elements, Genetic/genetics , Gene Expression Regulation/genetics , CRISPR-Cas Systems/genetics , Clustered Regularly Interspaced Short Palindromic Repeats/genetics , Gene Expression Profiling , Gene Regulatory Networks/genetics , Genome, Human , Genome-Wide Association Study , Genomics , Humans , Quantitative Trait Loci , Transcription Factors/genetics
3.
Nature ; 626(8001): 1084-1093, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38355799

ABSTRACT

The house mouse (Mus musculus) is an exceptional model system, combining genetic tractability with close evolutionary affinity to humans [1,2]. Mouse gestation lasts only 3 weeks, during which the genome orchestrates the astonishing transformation of a single-cell zygote into a free-living pup composed of more than 500 million cells. Here, to establish a global framework for exploring mammalian development, we applied optimized single-cell combinatorial indexing [3] to profile the transcriptional states of 12.4 million nuclei from 83 embryos, precisely staged at 2- to 6-hour intervals spanning late gastrulation (embryonic day 8) to birth (postnatal day 0). From these data, we annotate hundreds of cell types and explore the ontogenesis of the posterior embryo during somitogenesis and of kidney, mesenchyme, retina and early neurons. We leverage the temporal resolution and sampling depth of these whole-embryo snapshots, together with published data [4-8] from earlier timepoints, to construct a rooted tree of cell-type relationships that spans the entirety of prenatal development, from zygote to birth. Throughout this tree, we systematically nominate genes encoding transcription factors and other proteins as candidate drivers of the in vivo differentiation of hundreds of cell types. Remarkably, the most marked temporal shifts in cell states are observed within one hour of birth and presumably underlie the massive physiological adaptations that must accompany the successful transition of a mammalian fetus to life outside the womb.


Subject(s)
Animals, Newborn , Embryo, Mammalian , Embryonic Development , Gastrula , Single-Cell Analysis , Time-Lapse Imaging , Animals , Female , Mice , Pregnancy , Animals, Newborn/embryology , Animals, Newborn/genetics , Cell Differentiation/genetics , Embryo, Mammalian/cytology , Embryo, Mammalian/embryology , Embryonic Development/genetics , Gastrula/cytology , Gastrula/embryology , Gastrulation/genetics , Kidney/cytology , Kidney/embryology , Mesoderm/cytology , Mesoderm/enzymology , Neurons/cytology , Neurons/metabolism , Retina/cytology , Retina/embryology , Somites/cytology , Somites/embryology , Time Factors , Transcription Factors/genetics , Transcription, Genetic , Organ Specificity/genetics
5.
Nature ; 608(7921): 98-107, 2022 08.
Article in English | MEDLINE | ID: mdl-35794474

ABSTRACT

DNA is naturally well suited to serve as a digital medium for in vivo molecular recording. However, contemporary DNA-based memory devices are constrained in terms of the number of distinct 'symbols' that can be concurrently recorded and/or by a failure to capture the order in which events occur [1]. Here we describe DNA Typewriter, a general system for in vivo molecular recording that overcomes these and other limitations. For DNA Typewriter, the blank recording medium ('DNA Tape') consists of a tandem array of partial CRISPR-Cas9 target sites, with all but the first site truncated at their 5' ends and therefore inactive. Short insertional edits serve as symbols that record the identity of the prime editing guide RNA [2] mediating the edit while also shifting the position of the 'type guide' by one unit along the DNA Tape, that is, sequential genome editing. In this proof of concept of DNA Typewriter, we demonstrate recording and decoding of thousands of symbols, complex event histories and short text messages; evaluate the performance of dozens of orthogonal tapes; and construct 'long tape' potentially capable of recording as many as 20 serial events. Finally, we leverage DNA Typewriter in conjunction with single-cell RNA-seq to reconstruct a monophyletic lineage of 3,257 cells and find that the Poisson-like accumulation of sequential edits to multicopy DNA tape can be maintained across at least 20 generations and 25 days of in vitro clonal expansion.


Subject(s)
DNA , Gene Editing , Genome , CRISPR-Cas Systems/genetics , DNA/genetics , Gene Editing/methods , Genome/genetics , RNA, Guide, Kinetoplastida/genetics , RNA-Seq , Single-Cell Analysis , Time Factors
6.
Nature ; 610(7930): 143-153, 2022 10.
Article in English | MEDLINE | ID: mdl-36007540

ABSTRACT

Embryonic stem (ES) cells can undergo many aspects of mammalian embryogenesis in vitro [1-5], but their developmental potential is substantially extended by interactions with extraembryonic stem cells, including trophoblast stem (TS) cells, extraembryonic endoderm stem (XEN) cells and inducible XEN (iXEN) cells [6-11]. Here we assembled stem cell-derived embryos in vitro from mouse ES cells, TS cells and iXEN cells and showed that they recapitulate the development of whole natural mouse embryo in utero up to day 8.5 post-fertilization. Our embryo model displays headfolds with defined forebrain and midbrain regions and develops a beating heart-like structure, a trunk comprising a neural tube and somites, a tail bud containing neuromesodermal progenitors, a gut tube, and primordial germ cells. This complete embryo model develops within an extraembryonic yolk sac that initiates blood island development. Notably, we demonstrate that the neurulating embryo model assembled from Pax6-knockout ES cells aggregated with wild-type TS cells and iXEN cells recapitulates the ventral domain expansion of the neural tube that occurs in natural, ubiquitous Pax6-knockout embryos. Thus, these complete embryoids are a powerful in vitro model for dissecting the roles of diverse cell lineages and genes in development. Our results demonstrate the self-organization ability of ES cells and two types of extraembryonic stem cells to reconstitute mammalian development through and beyond gastrulation to neurulation and early organogenesis.


Subject(s)
Embryo, Mammalian , Gastrulation , Models, Biological , Neurulation , Organogenesis , Animals , Cell Lineage , Embryo, Mammalian/cytology , Embryo, Mammalian/embryology , Embryonic Stem Cells/cytology , Endoderm/cytology , Endoderm/embryology , Heart/embryology , Mesencephalon/embryology , Mice , Neural Tube/embryology , PAX6 Transcription Factor/deficiency , PAX6 Transcription Factor/genetics , Prosencephalon/embryology , Somites/embryology
7.
Nat Methods ; 21(6): 983-993, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38724692

ABSTRACT

The inability to scalably and precisely measure the activity of developmental cis-regulatory elements (CREs) in multicellular systems is a bottleneck in genomics. Here we develop a dual RNA cassette that decouples the detection and quantification tasks inherent to multiplex single-cell reporter assays. The resulting measurement of reporter expression is accurate over multiple orders of magnitude, with a precision approaching the limit set by Poisson counting noise. Together with RNA barcode stabilization via circularization, these scalable single-cell quantitative expression reporters provide high-contrast readouts, analogous to classic in situ assays but entirely from sequencing. Screening >200 regions of accessible chromatin in a multicellular in vitro model of early mammalian development, we identify 13 (8 previously uncharacterized) autonomous and cell-type-specific developmental CREs. We further demonstrate that chimeric CRE pairs generate cognate two-cell-type activity profiles and assess gain- and loss-of-function multicellular expression phenotypes from CRE variants with perturbed transcription factor binding sites. Single-cell quantitative expression reporters can be applied in developmental and multicellular systems to quantitatively characterize native, perturbed and synthetic CREs at scale, with high sensitivity and at single-cell resolution.


Subject(s)
Gene Expression Regulation, Developmental , Single-Cell Analysis , Single-Cell Analysis/methods , Animals , Mice , Genes, Reporter , Regulatory Sequences, Nucleic Acid , Humans , Transcription Factors/genetics , Transcription Factors/metabolism , Chromatin/genetics , Chromatin/metabolism , Regulatory Elements, Transcriptional , Gene Expression Profiling/methods
8.
Genome Res ; 31(5): 866-876, 2021 05.
Article in English | MEDLINE | ID: mdl-33879525

ABSTRACT

Massively parallel reporter assays (MPRAs) are useful tools to characterize regulatory elements in human genomes. An aspect of MPRAs that is not typically the focus of analysis is their intrinsic ability to differentiate activity levels for a given sequence element when placed in both of its possible orientations relative to the reporter construct. Here, we describe pervasive strand asymmetry of MPRA signals in data sets from multiple reporter configurations in both published and newly reported data. These effects are reproducible across different cell types and in different treatments within a cell type and are observed both within and outside of annotated regulatory elements. From elements in gene bodies, MPRA strand asymmetry favors the sense strand, suggesting that function related to endogenous transcription is driving the phenomenon. Similarly, we find that within Alu mobile element insertions, strand asymmetry favors the transcribed strand of the ancestral retrotransposon. The effect is consistent across the multiplicity of Alu elements in human genomes and is more pronounced in less diverged Alu elements. We find sequence features driving MPRA strand asymmetry and show its prediction from sequence alone. We see some evidence for RNA stabilization and transcriptional activation mechanisms and hypothesize that the effect is driven by natural selection favoring efficient transcription. Our results indicate that strand asymmetry is a pervasive and reproducible feature in MPRA data. More importantly, the fact that MPRA asymmetry favors naturally transcribed strands suggests that it stems from preserved biological functions that have a substantial, global impact on gene and genome evolution.


Subject(s)
Genome, Human , Regulatory Sequences, Nucleic Acid , Gene Expression Regulation , Genes, Reporter , Humans
9.
Mol Syst Biol ; 19(6): e11517, 2023 06 12.
Article in English | MEDLINE | ID: mdl-37154091

ABSTRACT

Recent advances in multiplexed single-cell transcriptomics experiments facilitate the high-throughput study of drug and genetic perturbations. However, an exhaustive exploration of the combinatorial perturbation space is experimentally unfeasible. Therefore, computational methods are needed to predict, interpret, and prioritize perturbations. Here, we present the compositional perturbation autoencoder (CPA), which combines the interpretability of linear models with the flexibility of deep-learning approaches for single-cell response modeling. CPA learns to in silico predict transcriptional perturbation response at the single-cell level for unseen dosages, cell types, time points, and species. Using newly generated single-cell drug combination data, we validate that CPA can predict unseen drug combinations while outperforming baseline models. Additionally, the architecture's modularity enables incorporating the chemical representation of the drugs, allowing the prediction of cellular response to completely unseen drugs. Furthermore, CPA is also applicable to genetic combinatorial screens. We demonstrate this by imputing in silico 5,329 missing combinations (97.6% of all possibilities) in a single-cell Perturb-seq experiment with diverse genetic interactions. We envision CPA will facilitate efficient experimental design and hypothesis generation by enabling in silico response prediction at the single-cell level and thus accelerate therapeutic applications using single-cell technologies.


Subject(s)
Computational Biology , Gene Expression Profiling , High-Throughput Screening Assays , Single-Cell Gene Expression Analysis
10.
Nature ; 562(7726): 217-222, 2018 10.
Article in English | MEDLINE | ID: mdl-30209399

ABSTRACT

Variants of uncertain significance fundamentally limit the clinical utility of genetic information. The challenge they pose is epitomized by BRCA1, a tumour suppressor gene in which germline loss-of-function variants predispose women to breast and ovarian cancer. Although BRCA1 has been sequenced in millions of women, the risk associated with most newly observed variants cannot be definitively assigned. Here we use saturation genome editing to assay 96.5% of all possible single-nucleotide variants (SNVs) in 13 exons that encode functionally critical domains of BRCA1. Functional effects for nearly 4,000 SNVs are bimodally distributed and almost perfectly concordant with established assessments of pathogenicity. Over 400 non-functional missense SNVs are identified, as well as around 300 SNVs that disrupt expression. We predict that these results will be immediately useful for the clinical interpretation of BRCA1 variants, and that this approach can be extended to overcome the challenge of variants of uncertain significance in additional clinically actionable genes.


Subject(s)
BRCA1 Protein/genetics , Gene Editing , Genetic Predisposition to Disease/classification , Genetic Variation/genetics , Genome, Human/genetics , Hereditary Breast and Ovarian Cancer Syndrome/genetics , Cell Line , Exons/genetics , Female , Genes, Essential/genetics , Humans , Loss of Function Mutation/genetics , Models, Molecular , Prognosis , RNA, Messenger/genetics , RNA, Messenger/metabolism , Recombinational DNA Repair/genetics
11.
Nat Methods ; 17(11): 1083-1091, 2020 11.
Article in English | MEDLINE | ID: mdl-33046894

ABSTRACT

Massively parallel reporter assays (MPRAs) functionally screen thousands of sequences for regulatory activity in parallel. To date, there are limited studies that systematically compare differences in MPRA design. Here, we screen a library of 2,440 candidate liver enhancers and controls for regulatory activity in HepG2 cells using nine different MPRA designs. We identify subtle but significant differences that correlate with epigenetic and sequence-level features, as well as differences in dynamic range and reproducibility. We also validate that enhancer activity is largely independent of orientation, at least for our library and designs. Finally, we assemble and test the same enhancers as 192-mers, 354-mers and 678-mers and observe sizable differences. This work provides a framework for the experimental design of high-throughput reporter assays, suggesting that the extended sequence context of tested elements and to a lesser degree the precise assay, influence MPRA results.


Subject(s)
Gene Library , Genes, Reporter , High-Throughput Nucleotide Sequencing/methods , Regulatory Sequences, Nucleic Acid , Sequence Analysis, DNA/methods , Enhancer Elements, Genetic , Hep G2 Cells , Humans , Reproducibility of Results , Transcription Factors/genetics
12.
J Fam Issues ; 44(3): 766-784, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36798515

ABSTRACT

International human rights conventions, Canadian law and academic research all support the right to family life. Internationally and domestically, multiple definitions of family are recognized, acknowledging that long-term interpersonal commitments can be based on biological relationships as well as co-residential, legal, and emotional ties. Yet, the Canadian immigration system's limited and exclusionary understanding of parent-child relationships complicates migrant family reunification. Drawing on qualitative interview and survey data from separated families and key informants who support them, we analyze national status and class assumptions embedded in Canadian immigration standards. We argue that Canadian immigration policies disproportionately deny the right to family life to transnational Canadians and their children who hail from the Global South and/or who are socio-economically disadvantaged. Immigration policies neither recognize the globally accepted "best interests of the child" welfare standard nor the human right to family life. We offer suggestions for addressing these inequities in practice and policy.

13.
J Am Pharm Assoc (2003) ; 62(1): 167-175.e1, 2022.
Article in English | MEDLINE | ID: mdl-34503908

ABSTRACT

BACKGROUND: Over-the-counter (OTC) medication use is associated with risks of adverse drug reactions (ADRs), particularly among older adults. The Drug Facts Label (DFL) is supposed to provide consumers with information that would avoid ADRs, yet research suggests that consumers frequently fail to interact with this critical information. We postulate that emphasizing critical information by placing it on the front of the package may increase its usage. Before doing so, the most critical information from the DFL needs to be identified. OBJECTIVES: This study aimed to determine which information from the DFL is most critical in reducing ADRs at the time of purchase or use by older adults. METHODS: A national survey of practicing pharmacists knowledgeable about OTC medication use by older adults asked participants to rank order the importance of the DFL sections to reduce ADRs in older adults. Open-ended questions focused on identifying ways of improving OTC medication labeling. Quantitative rankings were used to calculate the content validity ratio and analyzed using Wilcoxon signed rank tests. Qualitative results were categorized into themes. RESULTS: A total of 318 responses (12% response rate) were analyzed. There was high consensus that uses and purpose, active ingredient, warnings, and directions for use were the most important sections of the DFL. Within the warning section, 2 specific warnings, "Do not use" and "Ask a doctor or pharmacist," were deemed most important. Similarly, qualitative themes focused on seeking health care provider assistance or were specific to age-related precautions. CONCLUSIONS: Prioritizing warnings that highlight the importance of possible drug-drug and drug-disease precautions and the need to seek medical advice before taking OTC medications were deemed most critical. Moving this type of information to the front of the package may help reduce ADRs among older adults.


Subject(s)
Drug-Related Side Effects and Adverse Reactions , Pharmacists , Aged , Consumer Behavior , Counseling , Drug-Related Side Effects and Adverse Reactions/prevention & control , Humans , Nonprescription Drugs/adverse effects
14.
Clin Chem ; 68(1): 143-152, 2021 12 30.
Article in English | MEDLINE | ID: mdl-34286830

ABSTRACT

BACKGROUND: The urgent need for massively scaled clinical testing for SARS-CoV-2, along with global shortages of critical reagents and supplies, has necessitated development of streamlined laboratory testing protocols. Conventional nucleic acid testing for SARS-CoV-2 involves collection of a clinical specimen with a nasopharyngeal swab in transport medium, nucleic acid extraction, and quantitative reverse-transcription PCR (RT-qPCR). As testing has scaled across the world, the global supply chain has buckled, rendering testing reagents and materials scarce. To address shortages, we developed SwabExpress, an end-to-end protocol developed to employ mass produced anterior nares swabs and bypass the requirement for transport media and nucleic acid extraction. METHODS: We evaluated anterior nares swabs, transported dry and eluted in low-TE buffer as a direct-to-RT-qPCR alternative to extraction-dependent viral transport media. We validated our protocol of using heat treatment for viral inactivation and added a proteinase K digestion step to reduce amplification interference. We tested this protocol across archived and prospectively collected swab specimens to fine-tune test performance. RESULTS: After optimization, SwabExpress has a low limit of detection at 2-4 molecules/µL, 100% sensitivity, and 99.4% specificity when compared side by side with a traditional RT-qPCR protocol employing extraction. On real-world specimens, SwabExpress outperforms an automated extraction system while simultaneously reducing cost and hands-on time. CONCLUSION: SwabExpress is a simplified workflow that facilitates scaled testing for COVID-19 without sacrificing test performance. It may serve as a template for the simplification of PCR-based clinical laboratory tests, particularly in times of critical shortages during pandemics.


Subject(s)
COVID-19 Nucleic Acid Testing/methods , COVID-19 , COVID-19/diagnosis , Clinical Laboratory Techniques , Humans , RNA, Viral/isolation & purification , Real-Time Polymerase Chain Reaction , SARS-CoV-2/isolation & purification , Sensitivity and Specificity , Specimen Handling
15.
Blood ; 143(3): 187-188, 2024 Jan 18.
Article in English | MEDLINE | ID: mdl-38236612
16.
PLoS Comput Biol ; 16(6): e1007956, 2020 06.
Article in English | MEDLINE | ID: mdl-32497118

ABSTRACT

Targeted sequencing of genomic regions is a cost- and time-efficient approach for screening patient cohorts. We present a fast and efficient workflow to analyze highly imbalanced, targeted next-generation sequencing data generated using molecular inversion probe (MIP) capture. Our Snakemake pipeline performs sample demultiplexing, overlap paired-end merging, alignment, MIP-arm trimming, variant calling, coverage analysis and report generation. Further, we support the analysis of probes specifically designed to capture certain structural variants and can assign sex using Y-chromosome-unique probes. In a user-friendly HTML report, we summarize all these results including covered, incomplete or missing regions, called variants and their predicted effects. We developed and tested our pipeline using the hemophilia A & B MIP design from the "My Life, Our Future" initiative. HemoMIPs is available as an open-source tool on GitHub at: https://github.com/kircherlab/hemoMIPs.


Subject(s)
Automation , Chromosomes, Human, Y , Genetic Testing/methods , High-Throughput Nucleotide Sequencing/methods , Cohort Studies , Humans , Male , Programming Languages
17.
J Am Pharm Assoc (2003) ; 61(6): e71-e75, 2021.
Article in English | MEDLINE | ID: mdl-34456146

ABSTRACT

BACKGROUND: In today's culture, cannabis and its cannabinoids are used for both recreational and medicinal purposes. Patients are able to obtain medical and commercial cannabis products. Pharmacists should feel comfortable counseling their patients, given the increased interest, access, and use of these products. OBJECTIVES: The objective of this survey was to assess the familiarity, attitudes, and knowledge of Wisconsin pharmacists regarding products containing cannabinoids. METHODS: An anonymous, Web-based survey was administered to 511 Wisconsin pharmacists using the Pharmacy Practice Enhancement and Action Research Link. The survey was adapted from a nationally developed survey with established validity evidence. Survey items evaluated pharmacists' knowledge of the legality and the pharmacokinetic and pharmacodynamic properties of cannabis. The survey included knowledge (22 items), familiarity (14 items), and attitude (8 items) scales as well as pharmacist demographics and workplace type. Descriptive statistics, Fisher exact test, and Cronbach's alpha were calculated. RESULTS: The survey had a response rate of 19.3%. Nearly 75% of respondents were unfamiliar with the testing practices and pesticide regulations on cannabis production. Pharmacists were also unfamiliar with doses related to commercially available cannabinoid products. A quarter reported that they counsel at least monthly on cannabinoid therapies, but results showed that the majority are uncomfortable with the pharmacology and pharmacotherapy of these compounds. Over two-thirds reported that they need further education on cannabinoids and ranked continuing pharmacy education credits and webinars as their preferred method of learning. Over two-thirds at least somewhat agreed that they would feel comfortable recommending a Food and Drug Administration (FDA)-approved treatment, but a similar proportion reported that they would not recommend non-FDA approved cannabinoid treatments. CONCLUSION: Wisconsin pharmacists require more education to fill knowledge gaps regarding the therapeutic uses of cannabinoid products.


Subject(s)
Cannabinoids , Community Pharmacy Services , Attitude of Health Personnel , Health Knowledge, Attitudes, Practice , Humans , Pharmacists , Surveys and Questionnaires , Wisconsin
18.
Genome Res ; 27(1): 38-52, 2017 01.
Article in English | MEDLINE | ID: mdl-27831498

ABSTRACT

Candidate enhancers can be identified on the basis of chromatin modifications, the binding of chromatin modifiers and transcription factors and cofactors, or chromatin accessibility. However, validating such candidates as bona fide enhancers requires functional characterization, typically achieved through reporter assays that test whether a sequence can increase expression of a transcriptional reporter via a minimal promoter. A longstanding concern is that reporter assays are mainly implemented on episomes, which are thought to lack physiological chromatin. However, the magnitude and determinants of differences in cis-regulation for regulatory sequences residing in episomes versus chromosomes remain almost completely unknown. To address this systematically, we developed and applied a novel lentivirus-based massively parallel reporter assay (lentiMPRA) to directly compare the functional activities of 2236 candidate liver enhancers in an episomal versus a chromosomally integrated context. We find that the activities of chromosomally integrated sequences are substantially different from the activities of the identical sequences assayed on episomes, and furthermore are correlated with different subsets of ENCODE annotations. The results of chromosomally based reporter assays are also more reproducible and more strongly predictable by both ENCODE annotations and sequence-based models. With a linear model that combines chromatin annotations and sequence information, we achieve a Pearson's R² of 0.362 for predicting the results of chromosomally integrated reporter assays. This level of prediction is better than with either chromatin annotations or sequence information alone and also outperforms predictive models of episomal assays. Our results have broad implications for how cis-regulatory elements are identified, prioritized and functionally validated.


Subject(s)
Chromatin/genetics , Enhancer Elements, Genetic/genetics , Gene Expression Regulation/genetics , Plasmids/genetics , Chromatin Assembly and Disassembly/genetics , Chromosomes/genetics , Genes, Reporter , High-Throughput Nucleotide Sequencing , Humans , Promoter Regions, Genetic , Regulatory Sequences, Nucleic Acid/genetics , Transcription Factors
19.
Hum Mutat ; 40(9): 1280-1291, 2019 09.
Article in English | MEDLINE | ID: mdl-31106481

ABSTRACT

The integrative analysis of high-throughput reporter assays, machine learning, and profiles of epigenomic chromatin state in a broad array of cells and tissues has the potential to significantly improve our understanding of noncoding regulatory element function and its contribution to human disease. Here, we report results from the CAGI 5 regulation saturation challenge where participants were asked to predict the impact of nucleotide substitution at every base pair within five disease-associated human enhancers and nine disease-associated promoters. A library of mutations covering all bases was generated by saturation mutagenesis and altered activity was assessed in a massively parallel reporter assay (MPRA) in relevant cell lines. Reporter expression was measured relative to plasmid DNA to determine the impact of variants. The challenge was to predict the functional effects of variants on reporter expression. Comparative analysis of the full range of submitted prediction results identifies the most successful models of transcription factor binding sites, machine learning algorithms, and ways to choose among or incorporate diverse datatypes and cell-types for training computational models. These results have the potential to improve the design of future studies on more diverse sets of regulatory elements and aid the interpretation of disease-associated genetic variation.


Subject(s)
DNA/chemistry , Epigenomics/methods , Point Mutation , Binding Sites , Cell Line , Chromatin/genetics , DNA/metabolism , Enhancer Elements, Genetic , Genetic Predisposition to Disease , Humans , Machine Learning , Promoter Regions, Genetic , Transcription Factors/metabolism
20.
Nature ; 500(7461): 207-11, 2013 Aug 08.
Article in English | MEDLINE | ID: mdl-23925245

ABSTRACT

The HeLa cell line was established in 1951 from cervical cancer cells taken from a patient, Henrietta Lacks. This was the first successful attempt to immortalize human-derived cells in vitro. The robust growth and unrestricted distribution of HeLa cells resulted in its broad adoption--both intentionally and through widespread cross-contamination--and for the past 60 years it has served a role analogous to that of a model organism. The cumulative impact of the HeLa cell line on research is demonstrated by its occurrence in more than 74,000 PubMed abstracts (approximately 0.3%). The genomic architecture of HeLa remains largely unexplored beyond its karyotype, partly because like many cancers, its extensive aneuploidy renders such analyses challenging. We carried out haplotype-resolved whole-genome sequencing of the HeLa CCL-2 strain, examined point- and indel-mutation variations, mapped copy-number variations and loss of heterozygosity regions, and phased variants across full chromosome arms. We also investigated variation and copy-number profiles for HeLa S3 and eight additional strains. We find that HeLa is relatively stable in terms of point variation, with few new mutations accumulating after early passaging. Haplotype resolution facilitated reconstruction of an amplified, highly rearranged region of chromosome 8q24.21 at which integration of the human papilloma virus type 18 (HPV-18) genome occurred and that is likely to be the event that initiated tumorigenesis. We combined these maps with RNA-seq and ENCODE Project data sets to phase the HeLa epigenome. This revealed strong, haplotype-specific activation of the proto-oncogene MYC by the integrated HPV-18 genome approximately 500 kilobases upstream, and enabled global analyses of the relationship between gene dosage and expression. These data provide an extensively phased, high-quality reference genome for past and future experiments relying on HeLa, and demonstrate the value of haplotype resolution for characterizing cancer genomes and epigenomes.


Subject(s)
Epigenomics , Genome, Human/genetics , Aneuploidy , DNA Copy Number Variations , Female , Genes, myc/genetics , Haplotypes , HeLa Cells , Human papillomavirus 18/genetics , Human papillomavirus 18/physiology , Humans , Molecular Sequence Data , Mutation , Proto-Oncogene Mas , Sequence Analysis, DNA , Transcriptional Activation/genetics , Uterine Cervical Neoplasms/genetics , Uterine Cervical Neoplasms/pathology , Uterine Cervical Neoplasms/virology