Search | BVS Regional Portal

1.

Dissecting the cis-regulatory syntax of transcription initiation with deep learning.

Cochran, Kelly; Yin, Melody; Mantripragada, Anika; Schreiber, Jacob; Marinov, Georgi K; Kundaje, Anshul.

bioRxiv ; 2024 May 31.

Article in English | MEDLINE | ID: mdl-38853896

ABSTRACT

Despite extensive characterization of mammalian Pol II transcription, the DNA sequence determinants of transcription initiation at a third of human promoters and most enhancers remain poorly understood. Hence, we trained and interpreted a neural network called ProCapNet that accurately models base-resolution initiation profiles from PRO-cap experiments using local DNA sequence. ProCapNet learns sequence motifs with distinct effects on initiation rates and TSS positioning and uncovers context-specific cryptic initiator elements intertwined within other TF motifs. ProCapNet annotates predictive motifs in nearly all actively transcribed regulatory elements across multiple cell lines, revealing a shared cis-regulatory logic across promoters and enhancers mediated by a highly epistatic sequence syntax of cooperative and competitive motif interactions. ProCapNet models of RAMPAGE profiles measuring steady-state RNA abundance at TSSs distill initiation signals on par with models trained directly on PRO-cap profiles. ProCapNet learns a largely cell-type-agnostic cis-regulatory code of initiation complementing sequence drivers of cell-type-specific chromatin state critical for accurate prediction of cell-type-specific transcription initiation.

2.

Transfer learning reveals sequence determinants of the quantitative response to transcription factor dosage.

Naqvi, Sahin; Kim, Seungsoo; Tabatabaee, Saman; Pampari, Anusri; Kundaje, Anshul; Pritchard, Jonathan K; Wysocka, Joanna.

bioRxiv ; 2024 May 29.

Article in English | MEDLINE | ID: mdl-38853998

ABSTRACT

Deep learning approaches have made significant advances in predicting cell type-specific chromatin patterns from the identity and arrangement of transcription factor (TF) binding motifs. However, most models have been applied in unperturbed contexts, precluding a predictive understanding of how chromatin state responds to TF perturbation. Here, we used transfer learning to train and interpret deep learning models that use DNA sequence to predict, with accuracy approaching experimental reproducibility, how the concentration of two dosage-sensitive TFs (TWIST1, SOX9) affects regulatory element (RE) chromatin accessibility in facial progenitor cells. High-affinity motifs that allow for heterotypic TF co-binding and are concentrated at the center of REs buffer against quantitative changes in TF dosage and strongly predict unperturbed accessibility. In contrast, motifs with low-affinity or homotypic binding distributed throughout REs lead to sensitive responses with minimal contributions to unperturbed accessibility. Both buffering and sensitizing features show signatures of purifying selection. We validated these predictive sequence features using reporter assays and showed that a biophysical model of TF-nucleosome competition can explain the sensitizing effect of low-affinity motifs. Our approach of combining transfer learning and quantitative measurements of the chromatin response to TF dosage therefore represents a powerful method to reveal additional layers of the cis-regulatory code.

3.

An updated compendium and reevaluation of the evidence for nuclear transcription factor occupancy over the mitochondrial genome.

Marinov, Georgi K; Ramalingam, Vivekanandan; Greenleaf, William J; Kundaje, Anshul.

bioRxiv ; 2024 Jun 06.

Article in English | MEDLINE | ID: mdl-38895386

ABSTRACT

In most eukaryotes, mitochondrial organelles contain their own genome, usually circular, which is the remnant of the genome of the ancestral bacterial endosymbiont that gave rise to modern mitochondria. Mitochondrial genomes are dramatically reduced in their gene content due to the process of endosymbiotic gene transfer to the nucleus; as a result most mitochondrial proteins are encoded in the nucleus and imported into mitochondria. This includes the components of the dedicated mitochondrial transcription and replication systems and regulatory factors, which are entirely distinct from the information processing systems in the nucleus. However, since the 1990s several nuclear transcription factors have been reported to act in mitochondria, and previously we identified 8 human and 3 mouse transcription factors (TFs) with strong localized enrichment over the mitochondrial genome using ChIP-seq (Chromatin Immunoprecipitation) datasets from the second phase of the ENCODE (Encyclopedia of DNA Elements) Project Consortium. Here, we analyze the ENCODE compendium of TF ChIP-seq datasets, which has expanded greatly in the intervening decade (a total of 6,153 ChIP experiments for 942 proteins, of which 763 are sequence-specific TFs), combined with interpretative deep learning models of TF occupancy, to create a comprehensive compendium of nuclear TFs that show evidence of association with the mitochondrial genome. We find some evidence for chrM occupancy for 50 nuclear TFs and two other proteins, with bZIP TFs emerging as most likely to be playing a role in mitochondria. However, we also observe that in cases where the same TF has been assayed with multiple antibodies and ChIP protocols, evidence for its chrM occupancy is not always reproducible. In the light of these findings, we discuss the evidential criteria for establishing chrM occupancy and reevaluate the overall compendium of putative mitochondrial-acting nuclear TFs.

4.

Multiplexed single-cell characterization of alternative polyadenylation regulators.

Kowalski, Madeline H; Wessels, Hans-Hermann; Linder, Johannes; Dalgarno, Carol; Mascio, Isabella; Choudhary, Saket; Hartman, Austin; Hao, Yuhan; Kundaje, Anshul; Satija, Rahul.

Cell ; 187(16): 4408-4425.e23, 2024 Aug 08.

Article in English | MEDLINE | ID: mdl-38925112

ABSTRACT

Most mammalian genes have multiple polyA sites, representing a substantial source of transcript diversity regulated by the cleavage and polyadenylation (CPA) machinery. To better understand how these proteins govern polyA site choice, we introduce CPA-Perturb-seq, a multiplexed perturbation screen dataset of 42 CPA regulators with a 3' scRNA-seq readout that enables transcriptome-wide inference of polyA site usage. We develop a framework to detect perturbation-dependent changes in polyadenylation and characterize modules of co-regulated polyA sites. We find groups of intronic polyA sites regulated by distinct components of the nuclear RNA life cycle, including elongation, splicing, termination, and surveillance. We train and validate a deep neural network (APARENT-Perturb) for tandem polyA site usage, delineating a cis-regulatory code that predicts perturbation response and reveals interactions between regulatory complexes. Our work highlights the potential for multiplexed single-cell perturbation screens to further our understanding of post-transcriptional regulation.

Subject(s)

Poly A; Polyadenylation; Single-Cell Analysis; Single-Cell Analysis/methods; Humans; Poly A/metabolism; Animals; Mice; Introns/genetics; Transcriptome/genetics; RNA, Messenger/metabolism; RNA, Messenger/genetics; Gene Expression Regulation

5.

Using a comprehensive atlas and predictive models to reveal the complexity and evolution of brain-active regulatory elements.

Pratt, Henry E; Andrews, Gregory; Shedd, Nicole; Phalke, Nishigandha; Li, Tongxin; Pampari, Anusri; Jensen, Matthew; Wen, Cindy; Consortium, PsychENCODE; Gandal, Michael J; Geschwind, Daniel H; Gerstein, Mark; Moore, Jill; Kundaje, Anshul; Colubri, Andrés; Weng, Zhiping.

Sci Adv ; 10(21): eadj4452, 2024 May 24.

Article in English | MEDLINE | ID: mdl-38781344

ABSTRACT

Most genetic variants associated with psychiatric disorders are located in noncoding regions of the genome. To investigate their functional implications, we integrate epigenetic data from the PsychENCODE Consortium and other published sources to construct a comprehensive atlas of candidate brain cis-regulatory elements. Using deep learning, we model these elements' sequence syntax and predict how binding sites for lineage-specific transcription factors contribute to cell type-specific gene regulation in various types of glia and neurons. The elements' evolutionary history suggests that new regulatory information in the brain emerges primarily via smaller sequence mutations within conserved mammalian elements rather than entirely new human- or primate-specific sequences. However, primate-specific candidate elements, particularly those active during fetal brain development and in excitatory neurons and astrocytes, are implicated in the heritability of brain-related human traits. Additionally, we introduce PsychSCREEN, a web-based platform offering interactive visualization of PsychENCODE-generated genetic and epigenetic data from diverse brain cell types in individuals with psychiatric disorders and healthy controls.

Subject(s)

Brain; Epigenesis, Genetic; Regulatory Sequences, Nucleic Acid; Humans; Brain/metabolism; Regulatory Sequences, Nucleic Acid/genetics; Animals; Evolution, Molecular; Mental Disorders/genetics; Regulatory Elements, Transcriptional/genetics; Neurons/metabolism; Gene Expression Regulation; Transcription Factors/genetics; Transcription Factors/metabolism

6.

Genome-wide interaction study of dietary intake of fibre, fruits, and vegetables with risk of colorectal cancer.

Papadimitriou, Nikos; Kim, Andre; Kawaguchi, Eric S; Morrison, John; Diez-Obrero, Virginia; Albanes, Demetrius; Berndt, Sonja I; Bézieau, Stéphane; Bien, Stephanie A; Bishop, D Timothy; Bouras, Emmanouil; Brenner, Hermann; Buchanan, Daniel D; Campbell, Peter T; Carreras-Torres, Robert; Chan, Andrew T; Chang-Claude, Jenny; Conti, David V; Devall, Matthew A; Dimou, Niki; Drew, David A; Gruber, Stephen B; Harrison, Tabitha A; Hoffmeister, Michael; Huyghe, Jeroen R; Joshi, Amit D; Keku, Temitope O; Kundaje, Anshul; Küry, Sébastien; Le Marchand, Loic; Lewinger, Juan Pablo; Li, Li; Lynch, Brigid M; Moreno, Victor; Newton, Christina C; Obón-Santacana, Mireia; Ose, Jennifer; Pellatt, Andrew J; Peoples, Anita R; Platz, Elizabeth A; Qu, Conghui; Rennert, Gad; Ruiz-Narvaez, Edward; Shcherbina, Anna; Stern, Mariana C; Su, Yu-Ru; Thomas, Duncan C; Thomas, Claire E; Tian, Yu; Tsilidis, Konstantinos K.

EBioMedicine ; 104: 105146, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38749303

ABSTRACT

BACKGROUND: Consumption of fibre, fruits and vegetables has been linked with lower colorectal cancer (CRC) risk. A genome-wide gene-environment (G × E) analysis was performed to test whether genetic variants modify these associations. METHODS: A pooled sample of 45 studies including up to 69,734 participants (cases: 29,896; controls: 39,838) of European ancestry was included. To identify G × E interactions, we used the traditional 1-degree-of-freedom (DF) G × E test and, to improve power, a 2-step procedure and a 3-DF joint test that investigates the association between a genetic variant and dietary exposure, CRC risk and G × E interaction simultaneously. FINDINGS: The 3-DF joint test revealed two significant loci with p-value <5 × 10⁻⁸. Rs4730274 close to the SLC26A3 gene showed an association with fibre (p-value: 2.4 × 10⁻³) and G × fibre interaction with CRC (OR per quartile of fibre increase = 0.87, 0.80, and 0.75 for CC, TC, and TT genotype, respectively; G × E p-value: 1.8 × 10⁻⁷). Rs1620977 in the NEGR1 gene showed an association with fruit intake (p-value: 1.0 × 10⁻⁸) and G × fruit interaction with CRC (OR per quartile of fruit increase = 0.75, 0.65, and 0.56 for AA, AG, and GG genotype, respectively; G × E p-value: 0.029). INTERPRETATION: We identified 2 loci associated with fibre and fruit intake that also modify the association of these dietary factors with CRC risk. Potential mechanisms include chronic inflammatory intestinal disorders and gut function. However, further studies are needed for mechanistic validation and replication of findings. FUNDING: National Institutes of Health, National Cancer Institute. Full funding details for the individual consortia are provided in acknowledgments.

Subject(s)

Colorectal Neoplasms; Dietary Fiber; Fruit; Gene-Environment Interaction; Genetic Predisposition to Disease; Genome-Wide Association Study; Polymorphism, Single Nucleotide; Vegetables; Humans; Colorectal Neoplasms/genetics; Colorectal Neoplasms/etiology; Dietary Fiber/administration & dosage; Genotype; Diet; Male; Female; Risk Factors

7.

Two genome-wide interaction loci modify the association of nonsteroidal anti-inflammatory drugs with colorectal cancer.

Drew, David A; Kim, Andre E; Lin, Yi; Qu, Conghui; Morrison, John; Lewinger, Juan Pablo; Kawaguchi, Eric; Wang, Jun; Fu, Yubo; Zemlianskaia, Natalia; Díez-Obrero, Virginia; Bien, Stephanie A; Dimou, Niki; Albanes, Demetrius; Baurley, James W; Wu, Anna H; Buchanan, Daniel D; Potter, John D; Prentice, Ross L; Harlid, Sophia; Arndt, Volker; Barry, Elizabeth L; Berndt, Sonja I; Bouras, Emmanouil; Brenner, Hermann; Budiarto, Arif; Burnett-Hartman, Andrea; Campbell, Peter T; Carreras-Torres, Robert; Casey, Graham; Chang-Claude, Jenny; Conti, David V; Devall, Matthew A M; Figueiredo, Jane C; Gruber, Stephen B; Gsur, Andrea; Gunter, Marc J; Harrison, Tabitha A; Hidaka, Akihisa; Hoffmeister, Michael; Huyghe, Jeroen R; Jenkins, Mark A; Jordahl, Kristina M; Kundaje, Anshul; Le Marchand, Loic; Li, Li; Lynch, Brigid M; Murphy, Neil; Nassir, Rami; Newcomb, Polly A.

Sci Adv ; 10(22): eadk3121, 2024 May 31.

Article in English | MEDLINE | ID: mdl-38809988

ABSTRACT

Regular, long-term aspirin use may act synergistically with genetic variants, particularly those in mechanistically relevant pathways, to confer a protective effect on colorectal cancer (CRC) risk. We leveraged pooled data from 52 clinical trial, cohort, and case-control studies that included 30,806 CRC cases and 41,861 controls of European ancestry to conduct a genome-wide interaction scan between regular aspirin/nonsteroidal anti-inflammatory drug (NSAID) use and imputed genetic variants. After adjusting for multiple comparisons, we identified statistically significant interactions between regular aspirin/NSAID use and variants in 6q24.1 (top hit rs72833769), which has evidence of influencing expression of TBC1D7 (a subunit of the TSC1-TSC2 complex, a key regulator of MTOR activity), and variants in 5p13.1 (top hit rs350047), which is associated with expression of PTGER4 (codes a cell surface receptor directly involved in the mode of action of aspirin). Genetic variants with functional impact may modulate the chemopreventive effect of regular aspirin use, and our study identifies putative previously unidentified targets for additional mechanistic interrogation.

Subject(s)

Anti-Inflammatory Agents, Non-Steroidal; Colorectal Neoplasms; Genome-Wide Association Study; Polymorphism, Single Nucleotide; Humans; Colorectal Neoplasms/genetics; Colorectal Neoplasms/drug therapy; Anti-Inflammatory Agents, Non-Steroidal/pharmacology; Aspirin/pharmacology; Receptors, Prostaglandin E, EP4 Subtype/genetics; Receptors, Prostaglandin E, EP4 Subtype/metabolism; Male; Genetic Predisposition to Disease; Female; Case-Control Studies; Middle Aged; Genetic Loci; Aged

8.

Predicting chromatin conformation contact maps.

Min, Alan; Schreiber, Jacob; Kundaje, Anshul; Noble, William Stafford.

bioRxiv ; 2024 Apr 14.

Article in English | MEDLINE | ID: mdl-38645064

ABSTRACT

Over the past 15 years, a variety of next-generation sequencing assays have been developed for measuring the 3D conformation of DNA in the nucleus. Each of these assays gives, for a particular cell or tissue type, a distinct picture of 3D chromatin architecture. Accordingly, making sense of the relationship between genome structure and function requires teasing apart two closely related questions: how does chromatin 3D structure change from one cell type to the next, and how do different measurements of that structure differ from one another, even when the two assays are carried out in the same cell type? In this work, we assemble a collection of chromatin 3D datasets, each represented as a 2D contact map, spanning multiple assay types and cell types. We then build a machine learning model that predicts missing contact maps in this collection. We use the model to systematically explore how genome 3D architecture changes, at the level of compartments, domains, and loops, between cell types and between assay types.

9.

Genetic risk impacts the association of menopausal hormone therapy with colorectal cancer risk.

Tian, Yu; Lin, Yi; Qu, Conghui; Arndt, Volker; Baurley, James W; Berndt, Sonja I; Bien, Stephanie A; Bishop, D Timothy; Brenner, Hermann; Buchanan, Daniel D; Budiarto, Arif; Campbell, Peter T; Carreras-Torres, Robert; Casey, Graham; Chan, Andrew T; Chen, Rui; Chen, Xuechen; Conti, David V; Díez-Obrero, Virginia; Dimou, Niki; Drew, David A; Figueiredo, Jane C; Gallinger, Steven; Giles, Graham G; Gruber, Stephen B; Gunter, Marc J; Harlid, Sophia; Harrison, Tabitha A; Hidaka, Akihisa; Hoffmeister, Michael; Huyghe, Jeroen R; Jenkins, Mark A; Jordahl, Kristina M; Joshi, Amit D; Keku, Temitope O; Kawaguchi, Eric; Kim, Andre E; Kundaje, Anshul; Larsson, Susanna C; Marchand, Loic Le; Lewinger, Juan Pablo; Li, Li; Moreno, Victor; Morrison, John; Murphy, Neil; Nan, Hongmei; Nassir, Rami; Newcomb, Polly A; Obón-Santacana, Mireia; Ogino, Shuji.

Br J Cancer ; 130(10): 1687-1696, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38561434

ABSTRACT

BACKGROUND: Menopausal hormone therapy (MHT), a common treatment to relieve symptoms of menopause, is associated with a lower risk of colorectal cancer (CRC). To inform CRC risk prediction and MHT risk-benefit assessment, we aimed to evaluate the joint association of a polygenic risk score (PRS) for CRC and MHT on CRC risk. METHODS: We used data from 28,486 postmenopausal women (11,519 cases and 16,967 controls) of European descent. A PRS based on 141 CRC-associated genetic variants was modeled as a categorical variable in quartiles. Multiplicative interaction between PRS and MHT use was evaluated using logistic regression. Additive interaction was measured using the relative excess risk due to interaction (RERI). 30-year cumulative risks of CRC for 50-year-old women according to MHT use and PRS were calculated. RESULTS: The reduction in odds ratios by MHT use was larger in women within the highest quartile of PRS compared to that in women within the lowest quartile of PRS (p-value = 2.7 × 10⁻⁸). At the highest quartile of PRS, the 30-year CRC risk was statistically significantly lower for women taking any MHT than for women not taking any MHT, 3.7% (3.3%-4.0%) vs 6.1% (5.7%-6.5%) (difference 2.4%, P-value = 1.83 × 10⁻¹⁴); these differences were also statistically significant but smaller in magnitude in the lowest PRS quartile, 1.6% (1.4%-1.8%) vs 2.2% (1.9%-2.4%) (difference 0.6%, P-value = 1.01 × 10⁻³), indicating a 4 times greater reduction in absolute risk associated with any MHT use in the highest compared to the lowest quartile of genetic CRC risk. CONCLUSIONS: MHT use has a greater impact on the reduction of CRC risk for women at higher genetic risk. These findings have implications for the development of risk prediction models for CRC and potentially for the consideration of genetic information in the risk-benefit assessment of MHT use.

Subject(s)

Colorectal Neoplasms; Genetic Predisposition to Disease; Humans; Female; Colorectal Neoplasms/genetics; Colorectal Neoplasms/epidemiology; Middle Aged; Case-Control Studies; Risk Factors; Aged; Hormone Replacement Therapy/adverse effects; Risk Assessment; Menopause; Postmenopause; Estrogen Replacement Therapy/adverse effects
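
As a hedged illustration of two quantities named in the abstract of record 9, the sketch below writes out the standard definition of the relative excess risk due to interaction (RERI) and rechecks the reported absolute risk differences. The function names are hypothetical, the relative risks passed to `reri` are invented for illustration, and only the four cumulative-risk percentages come from the abstract; this is not the authors' analysis code, and the study may have approximated relative risks with odds ratios.

```python
def reri(rr_a_only: float, rr_b_only: float, rr_both: float) -> float:
    """Relative excess risk due to interaction on the additive scale.

    All three relative risks share one reference group (neither factor present).
    RERI = RR_both - RR_a_only - RR_b_only + 1; zero means exact additivity.
    """
    return rr_both - rr_a_only - rr_b_only + 1.0


def absolute_risk_reduction(risk_without: float, risk_with: float) -> float:
    """Difference in cumulative risk, in percentage points."""
    return risk_without - risk_with


# 30-year cumulative CRC risks (%) quoted in the abstract: no MHT vs any MHT.
high_prs = absolute_risk_reduction(6.1, 3.7)  # highest PRS quartile
low_prs = absolute_risk_reduction(2.2, 1.6)   # lowest PRS quartile
fold = high_prs / low_prs

# Hypothetical relative risks, for illustration only (not from the study).
example_reri = reri(rr_a_only=2.0, rr_b_only=3.0, rr_both=6.0)

print(f"high PRS: {high_prs:.1f} points; low PRS: {low_prs:.1f} points; ratio: {fold:.1f}")
print(f"RERI for the hypothetical inputs: {example_reri:.1f}")
```

The ratio of the two differences reproduces the "4 times greater reduction" stated in the abstract. A RERI above zero indicates a joint effect larger than the sum of the separate effects on the additive scale.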

10.

Multicenter integrated analysis of noncoding CRISPRi screens.

Yao, David; Tycko, Josh; Oh, Jin Woo; Bounds, Lexi R; Gosai, Sager J; Lataniotis, Lazaros; Mackay-Smith, Ava; Doughty, Benjamin R; Gabdank, Idan; Schmidt, Henri; Guerrero-Altamirano, Tania; Siklenka, Keith; Guo, Katherine; White, Alexander D; Youngworth, Ingrid; Andreeva, Kalina; Ren, Xingjie; Barrera, Alejandro; Luo, Yunhai; Yardimci, Galip Gürkan; Tewhey, Ryan; Kundaje, Anshul; Greenleaf, William J; Sabeti, Pardis C; Leslie, Christina; Pritykin, Yuri; Moore, Jill E; Beer, Michael A; Gersbach, Charles A; Reddy, Timothy E; Shen, Yin; Engreitz, Jesse M; Bassik, Michael C; Reilly, Steven K.

Nat Methods ; 21(4): 723-734, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38504114

ABSTRACT

The ENCODE Consortium's efforts to annotate noncoding cis-regulatory elements (CREs) have advanced our understanding of gene regulatory landscapes. Pooled, noncoding CRISPR screens offer a systematic approach to investigate cis-regulatory mechanisms. The ENCODE4 Functional Characterization Centers conducted 108 screens in human cell lines, comprising >540,000 perturbations across 24.85 megabases of the genome. Using 332 functionally confirmed CRE-gene links in K562 cells, we established guidelines for screening endogenous noncoding elements with CRISPR interference (CRISPRi), including accurate detection of CREs that exhibit variable, often low, transcriptional effects. Benchmarking five screen analysis tools, we find that CASA produces the most conservative CRE calls and is robust to artifacts of low-specificity single guide RNAs. We uncover a subtle DNA strand bias for CRISPRi in transcribed regions with implications for screen design and analysis. Together, we provide an accessible data resource, predesigned single guide RNAs for targeting 3,275,697 ENCODE SCREEN candidate CREs with CRISPRi and screening guidelines to accelerate functional characterization of the noncoding genome.

Subject(s)

CRISPR-Cas Systems; Clustered Regularly Interspaced Short Palindromic Repeats; Humans; Clustered Regularly Interspaced Short Palindromic Repeats/genetics; CRISPR-Cas Systems/genetics; Genome; K562 Cells

11.

Protocol for mapping the three-dimensional organization of dinoflagellate genomes.

Marinov, Georgi K; Kundaje, Anshul; Greenleaf, William J; Grossman, Arthur R.

STAR Protoc ; 5(2): 102941, 2024 Jun 21.

Article in English | MEDLINE | ID: mdl-38483898

ABSTRACT

Dinoflagellate genomes often are very large and difficult to assemble, which has until recently precluded their analysis with modern functional genomic tools. Here, we present a protocol for mapping three-dimensional (3D) genome organization in dinoflagellates and using it for scaffolding their genome assemblies. We describe steps for crosslinking, nuclear lysis, denaturation, restriction digest, ligation, and DNA shearing and purification. We then detail procedures for sequencing library generation and computational analysis, including initial Hi-C read mapping and 3D-DNA scaffolding/assembly correction. For complete details on the use and execution of this protocol, please refer to Marinov et al. [1].

Subject(s)

Dinoflagellida; Genome, Protozoan; Dinoflagellida/genetics; Genome, Protozoan/genetics; Genomics/methods; Chromosome Mapping/methods; Sequence Analysis, DNA/methods

12.

Disease diagnostics using machine learning of immune receptors.

Zaslavsky, Maxim E; Craig, Erin; Michuda, Jackson K; Sehgal, Nidhi; Ram-Mohan, Nikhil; Lee, Ji-Yeun; Nguyen, Khoa D; Hoh, Ramona A; Pham, Tho D; Röltgen, Katharina; Lam, Brandon; Parsons, Ella S; Macwana, Susan R; DeJager, Wade; Drapeau, Elizabeth M; Roskin, Krishna M; Cunningham-Rundles, Charlotte; Moody, M Anthony; Haynes, Barton F; Goldman, Jason D; Heath, James R; Nadeau, Kari C; Pinsky, Benjamin A; Blish, Catherine A; Hensley, Scott E; Jensen, Kent; Meyer, Everett; Balboni, Imelda; Utz, Paul J; Merrill, Joan T; Guthridge, Joel M; James, Judith A; Yang, Samuel; Tibshirani, Robert; Kundaje, Anshul; Boyd, Scott D.

bioRxiv ; 2024 Apr 03.

Article in English | MEDLINE | ID: mdl-35547855

ABSTRACT

Clinical diagnosis typically incorporates physical examination, patient history, and various laboratory tests and imaging studies, but makes limited use of the human immune system's own record of antigen exposures encoded by receptors on B cells and T cells. We analyzed immune receptor datasets from 593 individuals to develop MAchine Learning for Immunological Diagnosis (Mal-ID), an interpretive framework to screen for multiple illnesses simultaneously or precisely test for one condition. This approach detects specific infections, autoimmune disorders, vaccine responses, and disease severity differences. Human-interpretable features of the model recapitulate known immune responses to SARS-CoV-2, Influenza, and HIV, highlight antigen-specific receptors, and reveal distinct characteristics of Systemic Lupus Erythematosus and Type-1 Diabetes autoreactivity. This analysis framework has broad potential for scientific and clinical interpretation of human immune responses.

13.

Drug Discovery in Low Data Regimes: Leveraging a Computational Pipeline for the Discovery of Novel SARS-CoV-2 Nsp14-MTase Inhibitors.

Nigam, AkshatKumar; Hurley, Matthew F D; Li, Fengling; Konkolová, Eva; Klíma, Martin; Trylcová, Jana; Pollice, Robert; Çinaroglu, Süleyman Selim; Levin-Konigsberg, Roni; Handjaya, Jasemine; Schapira, Matthieu; Chau, Irene; Perveen, Sumera; Ng, Ho-Leung; Ümit Kaniskan, H; Han, Yulin; Singh, Sukrit; Gorgulla, Christoph; Kundaje, Anshul; Jin, Jian; Voelz, Vincent A; Weber, Jan; Nencka, Radim; Boura, Evzen; Vedadi, Masoud; Aspuru-Guzik, Alán.

bioRxiv ; 2024 Jan 13.

Article in English | MEDLINE | ID: mdl-37873443

ABSTRACT

The COVID-19 pandemic, caused by the SARS-CoV-2 virus, has led to significant global morbidity and mortality. A crucial viral protein, the non-structural protein 14 (nsp14), catalyzes the methylation of viral RNA and plays a critical role in viral genome replication and transcription. Due to the low mutation rate in the nsp region among various SARS-CoV-2 variants, nsp14 has emerged as a promising therapeutic target. However, discovering potential inhibitors remains a challenge. In this work, we introduce a computational pipeline for the rapid and efficient identification of potential nsp14 inhibitors by leveraging virtual screening and the NCI open compound collection, which contains 250,000 freely available molecules for researchers worldwide. The introduced pipeline provides a cost-effective and efficient approach for early-stage drug discovery by allowing researchers to evaluate promising molecules without incurring synthesis expenses. Our pipeline successfully identified seven promising candidates after experimentally validating only 40 compounds. Notably, we discovered NSC620333, a compound that exhibits a strong binding affinity to nsp14 with a dissociation constant of 427 ± 84 nM. In addition, we gained new insights into the structure and function of this protein through molecular dynamics simulations. We identified new conformational states of the protein and determined that residues Phe367, Tyr368, and Gln354 within the binding pocket serve as stabilizing residues for novel ligand interactions. We also found that metal coordination complexes are crucial for the overall function of the binding pocket. Lastly, we present the solved crystal structure of the nsp14-MTase complexed with SS148 (PDB:8BWU), a potent inhibitor of methyltransferase activity at the nanomolar level (IC50 value of 70 ± 6 nM). Our computational pipeline accurately predicted the binding pose of SS148, demonstrating its effectiveness and potential in accelerating drug discovery efforts against SARS-CoV-2 and other emerging viruses.

14.

Genome-Wide Gene-Environment Interaction Analyses to Understand the Relationship between Red Meat and Processed Meat Intake and Colorectal Cancer Risk.

Stern, Mariana C; Sanchez Mendez, Joel; Kim, Andre E; Obón-Santacana, Mireia; Moratalla-Navarro, Ferran; Martín, Vicente; Moreno, Victor; Lin, Yi; Bien, Stephanie A; Qu, Conghui; Su, Yu-Ru; White, Emily; Harrison, Tabitha A; Huyghe, Jeroen R; Tangen, Catherine M; Newcomb, Polly A; Phipps, Amanda I; Thomas, Claire E; Kawaguchi, Eric S; Lewinger, Juan Pablo; Morrison, John L; Conti, David V; Wang, Jun; Thomas, Duncan C; Platz, Elizabeth A; Visvanathan, Kala; Keku, Temitope O; Newton, Christina C; Um, Caroline Y; Kundaje, Anshul; Shcherbina, Anna; Murphy, Neil; Gunter, Marc J; Dimou, Niki; Papadimitriou, Nikos; Bézieau, Stéphane; van Duijnhoven, Franzel J B; Männistö, Satu; Rennert, Gad; Wolk, Alicja; Hoffmeister, Michael; Brenner, Hermann; Chang-Claude, Jenny; Tian, Yu; Le Marchand, Loïc; Cotterchio, Michelle; Tsilidis, Konstantinos K; Bishop, D Timothy; Melaku, Yohannes Adama; Lynch, Brigid M.

Cancer Epidemiol Biomarkers Prev ; 33(3): 400-410, 2024 Mar 01.

Article in English | MEDLINE | ID: mdl-38112776

ABSTRACT

BACKGROUND: High red meat and/or processed meat consumption are established colorectal cancer risk factors. We conducted a genome-wide gene-environment (GxE) interaction analysis to identify genetic variants that may modify these associations. METHODS: A pooled sample of 29,842 colorectal cancer cases and 39,635 controls of European ancestry from 27 studies was included. Quantiles for red meat and processed meat intake were constructed from harmonized questionnaire data. Genotyping arrays were imputed to the Haplotype Reference Consortium. Two-step EDGE and joint tests of GxE interaction were utilized in our genome-wide scan. RESULTS: Meta-analyses confirmed positive associations between increased consumption of red meat and processed meat with colorectal cancer risk [per quartile red meat OR = 1.30; 95% confidence interval (CI) = 1.21-1.41; processed meat OR = 1.40; 95% CI = 1.20-1.63]. Two significant genome-wide GxE interactions for red meat consumption were found. Joint GxE tests revealed the rs4871179 SNP in chromosome 8 (downstream of HAS2); greater than median of consumption ORs = 1.38 (95% CI = 1.29-1.46), 1.20 (95% CI = 1.12-1.27), and 1.07 (95% CI = 0.95-1.19) for CC, CG, and GG, respectively. The two-step EDGE method identified the rs35352860 SNP in chromosome 18 (SMAD7 intron); greater than median of consumption ORs = 1.18 (95% CI = 1.11-1.24), 1.35 (95% CI = 1.26-1.44), and 1.46 (95% CI = 1.26-1.69) for CC, CT, and TT, respectively. CONCLUSIONS: We propose two novel biomarkers that support a role for meat consumption in the increased risk of colorectal cancer. IMPACT: The reported GxE interactions may explain the increased risk of colorectal cancer in certain population subgroups.

Subject(s)

Colorectal Neoplasms; Red Meat; Humans; Gene-Environment Interaction; Red Meat/adverse effects; Meat/adverse effects; Risk Factors; Colorectal Neoplasms/genetics

15.

Latent human herpesvirus 6 is reactivated in CAR T cells.

Lareau, Caleb A; Yin, Yajie; Maurer, Katie; Sandor, Katalin D; Daniel, Bence; Yagnik, Garima; Peña, José; Crawford, Jeremy Chase; Spanjaart, Anne M; Gutierrez, Jacob C; Haradhvala, Nicholas J; Riberdy, Janice M; Abay, Tsion; Stickels, Robert R; Verboon, Jeffrey M; Liu, Vincent; Buquicchio, Frank A; Wang, Fangyi; Southard, Jackson; Song, Ren; Li, Wenjing; Shrestha, Aastha; Parida, Laxmi; Getz, Gad; Maus, Marcela V; Li, Shuqiang; Moore, Alison; Roberts, Zachary J; Ludwig, Leif S; Talleur, Aimee C; Thomas, Paul G; Dehghani, Houman; Pertel, Thomas; Kundaje, Anshul; Gottschalk, Stephen; Roth, Theodore L; Kersten, Marie J; Wu, Catherine J; Majzner, Robbie G; Satpathy, Ansuman T.

Nature ; 623(7987): 608-615, 2023 Nov.

Article in English | MEDLINE | ID: mdl-37938768

ABSTRACT

Cell therapies have yielded durable clinical benefits for patients with cancer, but the risks associated with the development of therapies from manipulated human cells are understudied. For example, we lack a comprehensive understanding of the mechanisms of toxicities observed in patients receiving T cell therapies, including recent reports of encephalitis caused by reactivation of human herpesvirus 6 (HHV-6) [1]. Here, through petabase-scale viral genomics mining, we examine the landscape of human latent viral reactivation and demonstrate that HHV-6B can become reactivated in cultures of human CD4+ T cells. Using single-cell sequencing, we identify a rare population of HHV-6 'super-expressors' (about 1 in 300-10,000 cells) that possess high viral transcriptional activity, among research-grade allogeneic chimeric antigen receptor (CAR) T cells. By analysing single-cell sequencing data from patients receiving cell therapy products that are approved by the US Food and Drug Administration [2] or are in clinical studies [3-5], we identify the presence of HHV-6-super-expressor CAR T cells in patients in vivo. Together, the findings of our study demonstrate the utility of comprehensive genomics analyses in implicating cell therapy products as a potential source contributing to the lytic HHV-6 infection that has been reported in clinical trials [1,6-8] and may influence the design and production of autologous and allogeneic cell therapies.

Subject(s)

CD4-Positive T-Lymphocytes, Human Herpesvirus 6, Adoptive Immunotherapy, Chimeric Antigen Receptors, Virus Activation, Virus Latency, Humans, CD4-Positive T-Lymphocytes/immunology, CD4-Positive T-Lymphocytes/virology, Clinical Trials as Topic, Viral Gene Expression Regulation, Genomics, Human Herpesvirus 6/genetics, Human Herpesvirus 6/isolation & purification, Human Herpesvirus 6/physiology, Adoptive Immunotherapy/adverse effects, Adoptive Immunotherapy/methods, Infectious Encephalitis/complications, Infectious Encephalitis/virology, Chimeric Antigen Receptors/immunology, Roseolovirus Infections/complications, Roseolovirus Infections/virology, Single-Cell Gene Expression Analysis, Viral Load

16.

The chromatin landscape of the euryarchaeon Haloferax volcanii.

Marinov, Georgi K; Bagdatli, S Tansu; Wu, Tong; He, Chuan; Kundaje, Anshul; Greenleaf, William J.

Genome Biol ; 24(1): 253, 2023 Nov 06.

Article in English | MEDLINE | ID: mdl-37932847

ABSTRACT

BACKGROUND: Archaea, together with Bacteria, represent the two main divisions of life on Earth, with many of the defining characteristics of the more complex eukaryotes tracing their origin to evolutionary innovations first made in their archaeal ancestors. One of the most notable such features is nucleosomal chromatin, although archaeal histones and chromatin differ significantly from those of eukaryotes, not all archaea possess histones and it is not clear if histones are a main packaging component for all that do. Despite increased interest in archaeal chromatin in recent years, its properties have been little studied using genomic tools. RESULTS: Here, we adapt the ATAC-seq assay to archaea and use it to map the accessible landscape of the genome of the euryarchaeote Haloferax volcanii. We integrate the resulting datasets with genome-wide maps of active transcription and single-stranded DNA (ssDNA) and find that while H. volcanii promoters exist in a preferentially accessible state, unlike most eukaryotes, modulation of transcriptional activity is not associated with changes in promoter accessibility. Applying orthogonal single-molecule footprinting methods, we quantify the absolute levels of physical protection of H. volcanii and find that Haloferax chromatin is similarly or only slightly more accessible, in aggregate, than that of eukaryotes. We also evaluate the degree of coordination of transcription within archaeal operons and make the unexpected observation that some CRISPR arrays are associated with highly prevalent ssDNA structures. CONCLUSIONS: Our results provide the first comprehensive maps of chromatin accessibility and active transcription in Haloferax across conditions and thus a foundation for future functional studies of archaeal chromatin.

Subject(s)

Archaeal Proteins, Haloferax volcanii, Chromatin, Histones/genetics, Haloferax volcanii/genetics, Haloferax volcanii/metabolism, Nucleosomes, Biological Evolution, Eukaryota/genetics, Archaeal Proteins/genetics

17.

The landscape of the histone-organized chromatin of Bdellovibrionota bacteria.

Marinov, Georgi K; Doughty, Benjamin; Kundaje, Anshul; Greenleaf, William J.

bioRxiv ; 2023 Nov 02.

Article in English | MEDLINE | ID: mdl-37961278

ABSTRACT

Histone proteins have traditionally been thought to be restricted to eukaryotes and most archaea, with eukaryotic nucleosomal histones deriving from their archaeal ancestors. In contrast, bacteria lack histones as a rule. However, histone proteins have recently been identified in a few bacterial clades, most notably the phylum Bdellovibrionota, and these histones have been proposed to exhibit a range of divergent features compared to histones in archaea and eukaryotes. However, no functional genomic studies of the properties of Bdellovibrionota chromatin have been carried out. In this work, we map the landscape of chromatin accessibility, active transcription and three-dimensional genome organization in a member of Bdellovibrionota (a Bacteriovorax strain). We find that, similar to what is observed in some archaea and in eukaryotes with compact genomes such as yeast, Bacteriovorax chromatin is characterized by preferential accessibility around promoter regions. Similar to eukaryotes, chromatin accessibility in Bacteriovorax positively correlates with gene expression. Mapping active transcription through single-strand DNA (ssDNA) profiling revealed that unlike in yeast, but similar to the state of mammalian and fly promoters, Bacteriovorax promoters exhibit very strong polymerase pausing. Finally, similar to that of other bacteria without histones, the Bacteriovorax genome exists in a three-dimensional (3D) configuration organized by the parABS system along the axis defined by replication origin and termination regions. These results provide a foundation for understanding the chromatin biology of the unique Bdellovibrionota bacteria and the functional diversity in chromatin organization across the tree of life.

18.

Transcriptomics and chromatin accessibility in multiple African population samples.

DeGorter, Marianne K; Goddard, Page C; Karakoc, Emre; Kundu, Soumya; Yan, Stephanie M; Nachun, Daniel; Abell, Nathan; Aguirre, Matthew; Carstensen, Tommy; Chen, Ziwei; Durrant, Matthew; Dwaracherla, Vikranth R; Feng, Karen; Gloudemans, Michael J; Hunter, Naiomi; Moorthy, Mohana P S; Pomilla, Cristina; Rodrigues, Kameron B; Smith, Courtney J; Smith, Kevin S; Ungar, Rachel A; Balliu, Brunilda; Fellay, Jacques; Flicek, Paul; McLaren, Paul J; Henn, Brenna; McCoy, Rajiv C; Sugden, Lauren; Kundaje, Anshul; Sandhu, Manjinder S; Gurdasani, Deepti; Montgomery, Stephen B.

bioRxiv ; 2023 Nov 06.

Article in English | MEDLINE | ID: mdl-37986808

ABSTRACT

Mapping the functional human genome and impact of genetic variants is often limited to European-descendent population samples. To aid in overcoming this limitation, we measured gene expression using RNA sequencing in lymphoblastoid cell lines (LCLs) from 599 individuals from six African populations to identify novel transcripts including those not represented in the hg38 reference genome. We used whole genomes from the 1000 Genomes Project and 164 Maasai individuals to identify 8,881 expression and 6,949 splicing quantitative trait loci (eQTLs/sQTLs), and 2,611 structural variants associated with gene expression (SV-eQTLs). We further profiled chromatin accessibility using ATAC-Seq in a subset of 100 representative individuals, to identify chromatin accessibility quantitative trait loci (caQTLs) and allele-specific chromatin accessibility, and provide predictions for the functional effect of 78.9 million variants on chromatin accessibility. Using this map of eQTLs and caQTLs we fine-mapped GWAS signals for a range of complex diseases. Combined, this work expands global functional genomic data to identify novel transcripts, functional elements and variants, understand population genetic history of molecular quantitative trait loci, and further resolve the genetic basis of multiple human traits and disease.

19.

RNA polymerase II dynamics and mRNA stability feedback scale mRNA amounts with cell size.

Swaffer, Matthew P; Marinov, Georgi K; Zheng, Huan; Fuentes Valenzuela, Lucas; Tsui, Crystal Yee; Jones, Andrew W; Greenwood, Jessica; Kundaje, Anshul; Greenleaf, William J; Reyes-Lamothe, Rodrigo; Skotheim, Jan M.

Cell ; 186(24): 5254-5268.e26, 2023 Nov 22.

Article in English | MEDLINE | ID: mdl-37944513

ABSTRACT

A fundamental feature of cellular growth is that total protein and RNA amounts increase with cell size to keep concentrations approximately constant. A key component of this is that global transcription rates increase in larger cells. Here, we identify RNA polymerase II (RNAPII) as the limiting factor scaling mRNA transcription with cell size in budding yeast, as transcription is highly sensitive to the dosage of RNAPII but not to other components of the transcriptional machinery. Our experiments support a dynamic equilibrium model where global RNAPII transcription at a given size is set by the mass action recruitment kinetics of unengaged nucleoplasmic RNAPII to the genome. However, this only drives a sub-linear increase in transcription with size, which is then partially compensated for by a decrease in mRNA decay rates as cells enlarge. Thus, limiting RNAPII and feedback on mRNA stability work in concert to scale mRNA amounts with cell size.

Subject(s)

Cell Size, RNA Polymerase II, Genetic Transcription, Feedback, RNA Polymerase II/metabolism, RNA Stability, Messenger RNA/genetics, Messenger RNA/metabolism

20.

An encyclopedia of enhancer-gene regulatory interactions in the human genome.

Gschwind, Andreas R; Mualim, Kristy S; Karbalayghareh, Alireza; Sheth, Maya U; Dey, Kushal K; Jagoda, Evelyn; Nurtdinov, Ramil N; Xi, Wang; Tan, Anthony S; Jones, Hank; Ma, X Rosa; Yao, David; Nasser, Joseph; Avsec, Ziga; James, Benjamin T; Shamim, Muhammad S; Durand, Neva C; Rao, Suhas S P; Mahajan, Ragini; Doughty, Benjamin R; Andreeva, Kalina; Ulirsch, Jacob C; Fan, Kaili; Perez, Elizabeth M; Nguyen, Tri C; Kelley, David R; Finucane, Hilary K; Moore, Jill E; Weng, Zhiping; Kellis, Manolis; Bassik, Michael C; Price, Alkes L; Beer, Michael A; Guigó, Roderic; Stamatoyannopoulos, John A; Lieberman Aiden, Erez; Greenleaf, William J; Leslie, Christina S; Steinmetz, Lars M; Kundaje, Anshul; Engreitz, Jesse M.

bioRxiv ; 2023 Nov 13.

Article in English | MEDLINE | ID: mdl-38014075

ABSTRACT

Identifying transcriptional enhancers and their target genes is essential for understanding gene regulation and the impact of human genetic variation on disease [1-6]. Here we create and evaluate a resource of >13 million enhancer-gene regulatory interactions across 352 cell types and tissues, by integrating predictive models, measurements of chromatin state and 3D contacts, and large-scale genetic perturbations generated by the ENCODE Consortium [7]. We first create a systematic benchmarking pipeline to compare predictive models, assembling a dataset of 10,411 element-gene pairs measured in CRISPR perturbation experiments, >30,000 fine-mapped eQTLs, and 569 fine-mapped GWAS variants linked to a likely causal gene. Using this framework, we develop a new predictive model, ENCODE-rE2G, that achieves state-of-the-art performance across multiple prediction tasks, demonstrating a strategy involving iterative perturbations and supervised machine learning to build increasingly accurate predictive models of enhancer regulation. Using the ENCODE-rE2G model, we build an encyclopedia of enhancer-gene regulatory interactions in the human genome, which reveals global properties of enhancer networks, identifies differences in the functions of genes that have more or less complex regulatory landscapes, and improves analyses to link noncoding variants to target genes and cell types for common, complex diseases. By interpreting the model, we find evidence that, beyond enhancer activity and 3D enhancer-promoter contacts, additional features guide enhancer-promoter communication including promoter class and enhancer-enhancer synergy. Altogether, these genome-wide maps of enhancer-gene regulatory interactions, benchmarking software, predictive models, and insights about enhancer function provide a valuable resource for future studies of gene regulation and human genetics.
