Search | BVS Infectious and Parasitic Diseases

1.

Recombination of repeat elements generates somatic complexity in human genomes.

Pascarella, Giovanni; Hon, Chung Chau; Hashimoto, Kosuke; Busch, Annika; Luginbühl, Joachim; Parr, Callum; Hin Yip, Wing; Abe, Kazumi; Kratz, Anton; Bonetti, Alessandro; Agostini, Federico; Severin, Jessica; Murayama, Shigeo; Suzuki, Yutaka; Gustincich, Stefano; Frith, Martin; Carninci, Piero.

Cell ; 185(16): 3025-3040.e6, 2022 08 04.

Article in English | MEDLINE | ID: mdl-35882231

ABSTRACT

Non-allelic recombination between homologous repetitive elements contributes to evolution and human genetic disorders. Here, we combine short- and long-DNA read sequencing of repeat elements with a new bioinformatics pipeline to show that somatic recombination of Alu and L1 elements is widespread in the human genome. Our analysis uncovers tissue-specific non-allelic homologous recombination hallmarks; moreover, we find that centromeres and cancer-associated genes are enriched for retroelements that may act as recombination hotspots. We compare recombination profiles in human-induced pluripotent stem cells and differentiated neurons and find that the neuron-specific recombination of repeat elements accompanies chromatin changes during cell-fate determination. Finally, we report that somatic recombination profiles are altered in Parkinson's and Alzheimer's disease, suggesting a link between retroelement recombination and genomic instability in neurodegeneration. This work highlights a significant contribution of the somatic recombination of repeat elements to genomic diversity in health and disease.

Subjects

Human Genome , Retroelements , Alu Elements/genetics , Homologous Recombination , Humans , Long Interspersed Nucleotide Elements , Nucleic Acid Repetitive Sequences

2.

Heteromeric RNP Assembly at LINEs Controls Lineage-Specific RNA Processing.

Attig, Jan; Agostini, Federico; Gooding, Clare; Chakrabarti, Anob M; Singh, Aarti; Haberman, Nejc; Zagalak, Julian A; Emmett, Warren; Smith, Christopher W J; Luscombe, Nicholas M; Ule, Jernej.

Cell ; 174(5): 1067-1081.e17, 2018 08 23.

Article in English | MEDLINE | ID: mdl-30078707

ABSTRACT

Long mammalian introns make it challenging for the RNA processing machinery to identify exons accurately. We find that LINE-derived sequences (LINEs) contribute to this selection by recruiting dozens of RNA-binding proteins (RBPs) to introns. This includes MATR3, which promotes binding of PTBP1 to multivalent binding sites within LINEs. Both RBPs repress splicing and 3' end processing within and around LINEs. Notably, repressive RBPs preferentially bind to evolutionarily young LINEs, which are located far from exons. These RBPs insulate the LINEs and the surrounding intronic regions from RNA processing. Upon evolutionary divergence, changes in RNA motifs within LINEs lead to gradual loss of their insulation. Hence, older LINEs are located closer to exons, are a common source of tissue-specific exons, and increasingly bind to RBPs that enhance RNA processing. Thus, LINEs are hubs for the assembly of repressive RBPs and also contribute to the evolution of new, lineage-specific transcripts in mammals. VIDEO ABSTRACT.

Subjects

Heterogeneous-Nuclear Ribonucleoproteins/chemistry , Long Interspersed Nucleotide Elements , Nuclear Matrix-Associated Proteins/chemistry , Polyadenylation , Polypyrimidine Tract-Binding Protein/chemistry , RNA-Binding Proteins/chemistry , RNA/chemistry , Alternative Splicing , Animals , Binding Sites , Exons , HeLa Cells , Humans , Introns , Mice , Mutation , Nucleotide Motifs , Phylogeny , Protein Binding , Protein Interaction Mapping , RNA Splicing

3.

No way out: when RNA elements promote nuclear retention.

Agostini, Federico; Ule, Jernej; Zagalak, Julian A.

EMBO J ; 37(6)2018 03 15.

Article in English | MEDLINE | ID: mdl-29487065

Subjects

Cell Nucleus , RNA , Humans , Messenger RNA

4.

Molecular landscape of the interaction between the urease accessory proteins UreE and UreG.

Merloni, Anna; Dobrovolska, Olena; Zambelli, Barbara; Agostini, Federico; Bazzani, Micaela; Musiani, Francesco; Ciurli, Stefano.

Biochim Biophys Acta ; 1844(9): 1662-74, 2014 Sep.

Article in English | MEDLINE | ID: mdl-24982029

ABSTRACT

Urease, the most efficient enzyme so far discovered, depends on the presence of nickel ions in the catalytic site for its activity. The transformation of inactive apo-urease into active holo-urease requires the insertion of two Ni(II) ions in the substrate binding site, a process that involves the interaction of four accessory proteins named UreD, UreF, UreG and UreE. This study, carried out using calorimetric and NMR-based structural analysis, is focused on the interaction between UreE and UreG from Sporosarcina pasteurii, a highly ureolytic bacterium. Isothermal calorimetric protein-protein titrations revealed the occurrence of a binding event between SpUreE and SpUreG, entailing two independent steps with positive cooperativity (Kd1=42±9µM; Kd2=1.7±0.3µM). This was interpreted as indicating the formation of the (UreE)2(UreG)2 hetero-oligomer upon binding of two UreG monomers onto the pre-formed UreE dimer. The molecular details of this interaction were elucidated using high-resolution NMR spectroscopy. The occurrence of SpUreE chemical shift perturbations upon addition of SpUreG was investigated and analyzed to establish the protein-protein interaction site. The latter appears to involve the Ni(II) binding site as well as mobile portions on the C-terminal and the N-terminal domains. Docking calculations based on the information obtained from NMR provided a structural basis for the protein-protein contact site. The high sequence and structural similarity within these protein classes suggests a generality of the interaction mode among homologous proteins. The implications of these results on the molecular details of the urease activation process are considered and analyzed.

Subjects

Bacterial Proteins/chemistry , Carrier Proteins/chemistry , Nickel/chemistry , Sporosarcina/chemistry , Urease/chemistry , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Calorimetry , Carrier Proteins/genetics , Carrier Proteins/metabolism , Divalent Cations , Escherichia coli/genetics , Escherichia coli/metabolism , Gene Expression , Kinetics , Magnetic Resonance Spectroscopy , Molecular Docking Simulation , Nickel/metabolism , Phosphate-Binding Proteins , Protein Binding , Protein Interaction Domains and Motifs , Protein Multimerization , Secondary Protein Structure , Recombinant Proteins/chemistry , Recombinant Proteins/genetics , Recombinant Proteins/metabolism , Sporosarcina/enzymology , Thermodynamics , Urease/genetics , Urease/metabolism
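
As a worked illustration of the two dissociation constants reported in the UreE/UreG abstract above, the following sketch converts each Kd into a standard binding free energy using ΔG° = RT ln(Kd) and takes the ratio Kd1/Kd2 as a simple measure of the positive cooperativity. The temperature of 298.15 K and the 1 M standard state are assumptions, since the abstract does not state the experimental conditions, and the function name is purely illustrative.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kJ/mol) from a dissociation constant.

    Uses dG = R*T*ln(Kd) with a 1 M standard state. The default
    temperature is an assumption, not a value taken from the abstract.
    """
    return R * temperature_k * math.log(kd_molar) / 1000.0


# Dissociation constants reported in the abstract, converted to mol/L.
KD1 = 42e-6
KD2 = 1.7e-6

dg1 = binding_free_energy(KD1)
dg2 = binding_free_energy(KD2)

# Kd2 < Kd1 means the second UreG binds more tightly than the first,
# which is what "positive cooperativity" refers to here.
cooperativity = KD1 / KD2

print(f"dG step 1: {dg1:.1f} kJ/mol")
print(f"dG step 2: {dg2:.1f} kJ/mol")
print(f"Kd1/Kd2:   {cooperativity:.1f}")
```

A more negative free energy for the second step than for the first is the thermodynamic signature of the tighter second binding event.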

5.

Neurodegenerative diseases: quantitative predictions of protein-RNA interactions.

Cirillo, Davide; Agostini, Federico; Klus, Petr; Marchese, Domenica; Rodriguez, Silvia; Bolognesi, Benedetta; Tartaglia, Gian Gaetano.

RNA ; 19(2): 129-40, 2013 Feb.

Article in English | MEDLINE | ID: mdl-23264567

ABSTRACT

Increasing evidence indicates that RNA plays an active role in a number of neurodegenerative diseases. We recently introduced a theoretical framework, catRAPID, to predict the binding ability of protein and RNA molecules. Here, we use catRAPID to investigate ribonucleoprotein interactions linked to inherited intellectual disability, amyotrophic lateral sclerosis, Creutzfeldt-Jakob, Alzheimer's, and Parkinson's diseases. We specifically focus on (1) RNA interactions with fragile X mental retardation protein FMRP; (2) protein sequestration caused by CGG repeats; (3) noncoding transcripts regulated by TAR DNA-binding protein 43 TDP-43; (4) autogenous regulation of TDP-43 and FMRP; (5) iron-mediated expression of amyloid precursor protein APP and α-synuclein; (6) interactions between prions and RNA aptamers. Our results are in striking agreement with experimental evidence and provide new insights into processes associated with neuronal function and misfunction.

Subjects

Algorithms , Fragile X Syndrome/metabolism , Neurodegenerative Diseases/metabolism , RNA-Binding Proteins/metabolism , RNA/metabolism , Ribonucleoproteins/metabolism , Amyloid beta-Protein Precursor/metabolism , Nucleotide Aptamers/metabolism , DNA-Binding Proteins/metabolism , Female , Fragile X Mental Retardation Protein/metabolism , Fragile X Syndrome/genetics , Gene Expression Regulation , Humans , Male , Theoretical Models , Neurodegenerative Diseases/genetics , Prions/metabolism , Protein Binding , Untranslated RNA/metabolism , alpha-Synuclein/metabolism

6.

ccSOL omics: a webserver for solubility prediction of endogenous and heterologous expression in Escherichia coli.

Agostini, Federico; Cirillo, Davide; Livi, Carmen Maria; Delli Ponti, Riccardo; Tartaglia, Gian Gaetano.

Bioinformatics ; 30(20): 2975-7, 2014 Oct 15.

Article in English | MEDLINE | ID: mdl-24990610

ABSTRACT

SUMMARY: Here we introduce ccSOL omics, a webserver for large-scale calculations of protein solubility. Our method allows (i) proteome-wide predictions; (ii) identification of soluble fragments within each sequence; (iii) exhaustive single-point mutation analysis. RESULTS: Using coil/disorder, hydrophobicity, hydrophilicity, β-sheet and α-helix propensities, we built a predictor of protein solubility. Our approach shows an accuracy of 79% on the training set (36 990 Target Track entries). Validation on three independent sets indicates that ccSOL omics discriminates soluble and insoluble proteins with an accuracy of 74% on 31 760 proteins sharing <30% sequence similarity. AVAILABILITY AND IMPLEMENTATION: ccSOL omics can be freely accessed on the web at http://s.tartaglialab.com/page/ccsol_group. Documentation and tutorial are available at http://s.tartaglialab.com/static_files/shared/tutorial_ccsol_omics.html. CONTACT: gian.tartaglia@crg.es SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Escherichia coli Proteins/chemistry , Escherichia coli Proteins/genetics , Escherichia coli/genetics , Bacterial Gene Expression Regulation , Internet , Proteomics/methods , Algorithms , Gene Expression , Hydrophobic and Hydrophilic Interactions , Secondary Protein Structure , Solubility
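
To make the phrase "exhaustive single-point mutation analysis" in the ccSOL omics abstract above concrete, the sketch below enumerates every single-residue substitution of a protein sequence. It only generates the mutants; how ccSOL omics scores them is not described in the abstract, so no scoring is attempted here. The function name, the mutation label format and the example peptide are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def single_point_mutants(sequence):
    """Yield (label, mutant) for every single-residue substitution.

    Labels use the common wild-type/position/new-residue form, e.g. "M1A",
    with 1-based positions. A sequence of n standard residues yields
    19 * n mutants.
    """
    for pos, wild_type in enumerate(sequence):
        for residue in AMINO_ACIDS:
            if residue != wild_type:
                label = f"{wild_type}{pos + 1}{residue}"
                yield label, sequence[:pos] + residue + sequence[pos + 1:]


# Hypothetical 8-residue peptide used only for illustration.
mutants = dict(single_point_mutants("MKTAYIAK"))
print(len(mutants), "mutants generated")
```

Each mutant sequence could then be passed to any solubility predictor, which is where a tool such as ccSOL omics would come in.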

7.

The cleverSuite approach for protein characterization: predictions of structural properties, solubility, chaperone requirements and RNA-binding abilities.

Klus, Petr; Bolognesi, Benedetta; Agostini, Federico; Marchese, Domenica; Zanzoni, Andreas; Tartaglia, Gian Gaetano.

Bioinformatics ; 30(11): 1601-8, 2014 Jun 01.

Article in English | MEDLINE | ID: mdl-24493033

ABSTRACT

MOTIVATION: The recent shift towards high-throughput screening is posing new challenges for the interpretation of experimental results. Here we propose the cleverSuite approach for large-scale characterization of protein groups. DESCRIPTION: The central part of the cleverSuite is the cleverMachine (CM), an algorithm that performs statistics on protein sequences by comparing their physico-chemical propensities. The second element is called cleverClassifier and builds on top of the models generated by the CM to allow classification of new datasets. RESULTS: We applied the cleverSuite to predict secondary structure properties, solubility, chaperone requirements and RNA-binding abilities. Using cross-validation and independent datasets, the cleverSuite reproduces experimental findings with great accuracy and provides models that can be used for future investigations. AVAILABILITY: The intuitive interface for dataset exploration, analysis and prediction is available at http://s.tartaglialab.com/clever_suite.

Subjects

Molecular Chaperones/chemistry , Proteins/chemistry , RNA-Binding Proteins/chemistry , Software , Algorithms , Intrinsically Disordered Proteins/chemistry , Molecular Chaperones/metabolism , Secondary Protein Structure , RNA-Binding Proteins/metabolism , Protein Sequence Analysis , Solubility

8.

X-inactivation: quantitative predictions of protein interactions in the Xist network.

Agostini, Federico; Cirillo, Davide; Bolognesi, Benedetta; Tartaglia, Gian Gaetano.

Nucleic Acids Res ; 41(1): e31, 2013 Jan 07.

Article in English | MEDLINE | ID: mdl-23093590

ABSTRACT

The transcriptional silencing of one of the female X-chromosomes is a finely regulated process that requires accumulation in cis of the long non-coding RNA X-inactive-specific transcript (Xist) followed by a series of epigenetic modifications. Little is known about the molecular machinery regulating initiation and maintenance of chromosomal silencing. Here, we introduce a new version of our algorithm catRAPID to investigate Xist associations with a number of proteins involved in epigenetic regulation, nuclear scaffolding, transcription and splicing processes. Our method correctly identifies binding regions and affinities of protein interactions, providing a powerful theoretical framework for the study of X-chromosome inactivation and other events mediated by ribonucleoprotein associations.

Subjects

Algorithms , Long Noncoding RNA/metabolism , RNA-Binding Proteins/metabolism , X Chromosome Inactivation , Animals , Binding Sites , Enhancer of Zeste Homolog 2 Protein , Female , Heterogeneous-Nuclear Ribonucleoprotein U/metabolism , Matrix Attachment Region Binding Proteins/metabolism , Mice , Nuclear Proteins/metabolism , Polycomb Repressive Complex 2/metabolism , Long Noncoding RNA/chemistry , Nucleic Acid Repetitive Sequences , Serine-Arginine Splicing Factors , YY1 Transcription Factor/metabolism

9.

Principles of self-organization in biological pathways: a hypothesis on the autogenous association of alpha-synuclein.

Zanzoni, Andreas; Marchese, Domenica; Agostini, Federico; Bolognesi, Benedetta; Cirillo, Davide; Botta-Orfila, Maria; Livi, Carmen Maria; Rodriguez-Mulero, Silvia; Tartaglia, Gian Gaetano.

Nucleic Acids Res ; 41(22): 9987-98, 2013 Dec.

Article in English | MEDLINE | ID: mdl-24003031

ABSTRACT

Previous evidence indicates that a number of proteins are able to interact with cognate mRNAs. These autogenous associations represent important regulatory mechanisms that control gene expression at the translational level. Using the catRAPID approach to predict the propensity of proteins to bind to RNA, we investigated the occurrence of autogenous associations in the human proteome. Our algorithm correctly identified binding sites in well-known cases such as thymidylate synthase, tumor suppressor P53, synaptotagmin-1, serine/arginine-rich splicing factor 2, heat shock 70 kDa, ribonucleic particle-specific U1A and ribosomal protein S13. In addition, we found that several other proteins are able to bind to their own mRNAs. A large-scale analysis of biological pathways revealed that aggregation-prone and structurally disordered proteins have the highest propensity to interact with cognate RNAs. These findings are substantiated by experimental evidence on amyloidogenic proteins such as TAR DNA-binding protein 43 and fragile X mental retardation protein. Among the amyloidogenic proteins, we predicted that Parkinson's disease-related α-synuclein is highly prone to interact with cognate transcripts, which suggests the existence of RNA-dependent factors in its function and dysfunction. Indeed, as aggregation is intrinsically concentration dependent, it is possible that autogenous interactions play a crucial role in controlling protein homeostasis.

Subjects

RNA-Binding Proteins/metabolism , RNA/metabolism , alpha-Synuclein/metabolism , Algorithms , Binding Sites , Gene Expression Regulation , Humans , Intrinsically Disordered Proteins/chemistry , Intrinsically Disordered Proteins/metabolism , Nuclear Proteins/chemistry , Nuclear Proteins/genetics , Nuclear Proteins/metabolism , Protein Biosynthesis , RNA/chemistry , Messenger RNA/chemistry , Messenger RNA/metabolism , RNA-Binding Proteins/chemistry , RNA-Binding Proteins/genetics , Ribonucleoproteins/chemistry , Ribonucleoproteins/genetics , Ribonucleoproteins/metabolism , Serine-Arginine Splicing Factors , Tumor Suppressor Protein p53/genetics , Tumor Suppressor Protein p53/metabolism

10.

SeAMotE: a method for high-throughput motif discovery in nucleic acid sequences.

Agostini, Federico; Cirillo, Davide; Ponti, Riccardo Delli; Tartaglia, Gian Gaetano.

BMC Genomics ; 15: 925, 2014 Oct 23.

Article in English | MEDLINE | ID: mdl-25341390

ABSTRACT

BACKGROUND: The large amount of data produced by high-throughput sequencing poses new computational challenges. In the last decade, several tools have been developed for the identification of transcription and splicing factor binding sites. RESULTS: Here, we introduce the SeAMotE (Sequence Analysis of Motifs Enrichment) algorithm for discovery of regulatory regions in nucleic acid sequences. SeAMotE provides (i) a robust analysis of high-throughput sequence sets, (ii) a motif search based on pattern occurrences and (iii) an easy-to-use web-server interface. We applied our method to recently published data including 351 chromatin immunoprecipitation (ChIP) and 13 crosslinking immunoprecipitation (CLIP) experiments and compared our results with those of other well-established motif discovery tools. SeAMotE shows an average accuracy of 80% in finding discriminative motifs and outperforms other methods available in literature. CONCLUSIONS: SeAMotE is a fast, accurate and flexible algorithm for the identification of sequence patterns involved in protein-DNA and protein-RNA recognition. The server can be freely accessed at http://s.tartaglialab.com/new_submission/seamote.

Subjects

Software , Algorithms , Base Sequence , Chromatin Immunoprecipitation , DNA/chemistry , DNA/metabolism , High-Throughput Nucleotide Sequencing , Internet , Protein Binding , Proteins/chemistry , Proteins/metabolism , RNA/chemistry , RNA/metabolism , DNA Sequence Analysis , User-Computer Interface
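
The SeAMotE abstract above mentions "a motif search based on pattern occurrences" that finds discriminative motifs. The sketch below shows one very simple way such an occurrence-based comparison can work: it ranks k-mers by how much more often they appear in a positive sequence set than in a negative one. This is a generic illustration and not the SeAMotE algorithm itself; the function names, the scoring rule (difference in sequence coverage) and the toy sequences are all assumptions.

```python
from collections import Counter


def kmer_coverage(sequences, k):
    """Map each k-mer to the fraction of sequences containing it at least once."""
    counts = Counter()
    for seq in sequences:
        # A set ensures each k-mer is counted once per sequence.
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    total = len(sequences)
    return {kmer: count / total for kmer, count in counts.items()}


def discriminative_kmers(positive, negative, k=4):
    """Rank k-mers by coverage in the positive set minus coverage in the negative set."""
    pos_cov = kmer_coverage(positive, k)
    neg_cov = kmer_coverage(negative, k)
    scored = [(kmer, cov - neg_cov.get(kmer, 0.0)) for kmer, cov in pos_cov.items()]
    # Highest score first; ties broken alphabetically for reproducibility.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Toy data: every positive sequence carries GGGA, no negative sequence does.
positive = ["ACGTGGGAAT", "TTGGGACCGA", "CAGGGATTTC"]
negative = ["ACGTACGTAC", "TTTTCCCCAA", "CAGCATTTCA"]

ranked = discriminative_kmers(positive, negative, k=4)
print(ranked[:3])
```

Real motif discovery tools typically use degenerate patterns and statistical tests rather than a raw coverage difference, so this should be read only as a minimal picture of the idea.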

11.

catRAPID omics: a web server for large-scale prediction of protein-RNA interactions.

Agostini, Federico; Zanzoni, Andreas; Klus, Petr; Marchese, Domenica; Cirillo, Davide; Tartaglia, Gian Gaetano.

Bioinformatics ; 29(22): 2928-30, 2013 Nov 15.

Article in English | MEDLINE | ID: mdl-23975767

ABSTRACT

SUMMARY: Here we introduce catRAPID omics, a server for large-scale calculations of protein-RNA interactions. Our web server allows (i) predictions at proteomic and transcriptomic level; (ii) use of protein and RNA sequences without size restriction; (iii) analysis of nucleic acid binding regions in proteins; and (iv) detection of RNA motifs involved in protein recognition. RESULTS: We developed a web server to allow fast calculation of ribonucleoprotein associations in Caenorhabditis elegans, Danio rerio, Drosophila melanogaster, Homo sapiens, Mus musculus, Rattus norvegicus, Saccharomyces cerevisiae and Xenopus tropicalis (custom libraries can be also generated). The catRAPID omics was benchmarked on the recently published RNA interactomes of Serine/arginine-rich splicing factor 1 (SRSF1), Histone-lysine N-methyltransferase EZH2 (EZH2), TAR DNA-binding protein 43 (TDP43) and RNA-binding protein FUS (FUS) as well as on the protein interactomes of U1/U2 small nucleolar RNAs, X inactive specific transcript (Xist) repeat A region (RepA) and Crumbs homolog 3 (CRB3) 3'-untranslated region RNAs. Our predictions are highly significant (P < 0.05) and will help the experimentalist to identify candidates for further validation. AVAILABILITY: catRAPID omics can be freely accessed on the Web at http://s.tartaglialab.com/catrapid/omics. Documentation, tutorial and FAQs are available at http://s.tartaglialab.com/page/catrapid_group.

Subjects

RNA-Binding Proteins/chemistry , RNA-Binding Proteins/metabolism , RNA/chemistry , RNA/metabolism , Software , 3' Untranslated Regions , Algorithms , Animals , Caenorhabditis elegans , Gene Expression Profiling , Humans , Internet , Mice , Nucleotide Motifs , Tertiary Protein Structure , Proteomics , Rats , Protein Sequence Analysis , RNA Sequence Analysis

12.

Control of protein synthesis through mRNA pseudouridylation by dyskerin.

Pederiva, Chiara; Trevisan, Davide M; Peirasmaki, Dimitra; Chen, Shan; Savage, Sharon A; Larsson, Ola; Ule, Jernej; Baranello, Laura; Agostini, Federico; Farnebo, Marianne.

Sci Adv ; 9(30): eadg1805, 2023 07 28.

Article in English | MEDLINE | ID: mdl-37506213

ABSTRACT

Posttranscriptional modifications of mRNA have emerged as regulators of gene expression. Although pseudouridylation is the most abundant, its biological role remains poorly understood. Here, we demonstrate that the pseudouridine synthase dyskerin associates with RNA polymerase II, binds to thousands of mRNAs, and is responsible for their pseudouridylation, an action that occurs in chromatin and does not appear to require a guide RNA with full complementarity. In cells lacking dyskerin, mRNA pseudouridylation is reduced, while at the same time, de novo protein synthesis is enhanced, indicating that this modification interferes with translation. Accordingly, mRNAs with fewer pseudouridines due to knockdown of dyskerin are translated more efficiently. Moreover, mRNA pseudouridylation is severely reduced in patients with dyskeratosis congenita caused by inherited mutations in the gene encoding dyskerin (i.e., DKC1). Our findings demonstrate that pseudouridylation by dyskerin modulates mRNA translatability, with important implications for both normal development and disease.

Subjects

Nuclear Proteins , RNA-Binding Proteins , Humans , Messenger RNA/genetics , RNA-Binding Proteins/genetics , Nuclear Proteins/genetics , Nuclear Proteins/metabolism , Cell Cycle Proteins/metabolism

13.

Deficiency of the Heterogeneous Nuclear Ribonucleoprotein U locus leads to delayed hindbrain neurogenesis.

Mastropasqua, Francesca; Oksanen, Marika; Soldini, Cristina; Alatar, Shemim; Arora, Abishek; Ballarino, Roberto; Molinari, Maya; Agostini, Federico; Poulet, Axel; Watts, Michelle; Rabkina, Ielyzaveta; Becker, Martin; Li, Danyang; Anderlid, Britt-Marie; Isaksson, Johan; Lundin Remnelius, Karl; Moslem, Mohsen; Jacob, Yannick; Falk, Anna; Crosetto, Nicola; Bienko, Magda; Santini, Emanuela; Borgkvist, Anders; Bölte, Sven; Tammimies, Kristiina.

Biol Open ; 12(10)2023 10 15.

Article in English | MEDLINE | ID: mdl-37815090

ABSTRACT

Genetic variants affecting Heterogeneous Nuclear Ribonucleoprotein U (HNRNPU) have been identified in several neurodevelopmental disorders (NDDs). HNRNPU is widely expressed in the human brain and shows the highest postnatal expression in the cerebellum. Recent studies have investigated the role of HNRNPU in cerebral cortical development, but the effects of HNRNPU deficiency on cerebellar development remain unknown. Here, we describe the molecular and cellular outcomes of HNRNPU locus deficiency during in vitro neural differentiation of patient-derived and isogenic neuroepithelial stem cells with a hindbrain profile. We demonstrate that HNRNPU deficiency leads to chromatin remodeling of A/B compartments, and transcriptional rewiring, partly by impacting exon inclusion during mRNA processing. Genomic regions affected by the chromatin restructuring and host genes of exon usage differences show a strong enrichment for genes implicated in epilepsies, intellectual disability, and autism. Lastly, we show that at the cellular level HNRNPU downregulation leads to an increased fraction of neural progenitors in the maturing neuronal population. We conclude that the HNRNPU locus is involved in delayed commitment of neural progenitors to differentiate in cell types with hindbrain profile.

Subjects

Heterogeneous-Nuclear Ribonucleoprotein U , Neurodevelopmental Disorders , Humans , Chromatin , Heterogeneous-Nuclear Ribonucleoprotein U/genetics , Heterogeneous-Nuclear Ribonucleoprotein U/metabolism , Neurodevelopmental Disorders/genetics , Neurogenesis/genetics , Rhombencephalon/metabolism

14.

An atlas of endogenous DNA double-strand breaks arising during human neural cell fate determination.

Ballarino, Roberto; Bouwman, Britta A M; Agostini, Federico; Harbers, Luuk; Diekmann, Constantin; Wernersson, Erik; Bienko, Magda; Crosetto, Nicola.

Sci Data ; 9(1): 400, 2022 07 12.

Article in English | MEDLINE | ID: mdl-35821502

ABSTRACT

Endogenous DNA double-strand breaks (DSBs) occurring in neural cells have been implicated in the pathogenesis of neurodevelopmental disorders (NDDs). Currently, a genomic map of endogenous DSBs arising during human neurogenesis is missing. Here, we applied in-suspension Breaks Labeling In Situ and Sequencing (sBLISS), RNA-Seq, and Hi-C to chart the genomic landscape of DSBs and relate it to gene expression and genome architecture in 2D cultures of human neuroepithelial stem cells (NES), neural progenitor cells (NPC), and post-mitotic neural cells (NEU). Endogenous DSBs were enriched at the promoter and along the gene body of transcriptionally active genes, at the borders of topologically associating domains (TADs), and around chromatin loop anchors. NDD risk genes harbored significantly more DSBs in comparison to other protein-coding genes, especially in NEU cells. We provide sBLISS, RNA-Seq, and Hi-C datasets for each differentiation stage, and all the scripts needed to reproduce our analyses. Our datasets and tools represent a unique resource that can be harnessed to investigate the role of genome fragility in the pathogenesis of NDDs.

Subjects

Double-Stranded DNA Breaks , Neurogenesis , Tumor Cell Line , DNA/metabolism , Genomics , Humans

15.

Intergenic RNA mainly derives from nascent transcripts of known genes.

Agostini, Federico; Zagalak, Julian; Attig, Jan; Ule, Jernej; Luscombe, Nicholas M.

Genome Biol ; 22(1): 136, 2021 05 05.

Article in English | MEDLINE | ID: mdl-33952325

ABSTRACT

BACKGROUND: Eukaryotic genomes undergo pervasive transcription, leading to the production of many types of stable and unstable RNAs. Transcription is not restricted to regions with annotated gene features but includes almost any genomic context. Currently, the source and function of most RNAs originating from intergenic regions in the human genome remain unclear. RESULTS: We hypothesize that many intergenic RNAs can be ascribed to the presence of as-yet unannotated genes or the "fuzzy" transcription of known genes that extends beyond the annotated boundaries. To elucidate the contributions of these two sources, we assemble a dataset of more than 2.5 billion publicly available RNA-seq reads across 5 human cell lines and multiple cellular compartments to annotate transcriptional units in the human genome. About 80% of transcripts from unannotated intergenic regions can be attributed to the fuzzy transcription of existing genes; the remaining transcripts originate mainly from putative long non-coding RNA loci that are rarely spliced. We validate the transcriptional activity of these intergenic RNAs using independent measurements, including transcriptional start sites, chromatin signatures, and genomic occupancies of RNA polymerase II in various phosphorylation states. We also analyze the nuclear localization and sensitivities of intergenic transcripts to nucleases to illustrate that they tend to be rapidly degraded either on-chromatin by XRN2 or off-chromatin by the exosome. CONCLUSIONS: We provide a curated atlas of intergenic RNAs that distinguishes between alternative processing of well-annotated genes from independent transcriptional units based on the combined analysis of chromatin signatures, nuclear RNA localization, and degradation pathways.

Subjects

Intergenic DNA/genetics , Genes , Messenger RNA/genetics , Cell Line , Chromatin/genetics , Endonucleases/metabolism , Humans , Messenger RNA/metabolism , Genetic Transcription

16.

BLVector: Fast BLAST-Like Algorithm for Manycore CPU With Vectorization.

Gálvez, Sergio; Agostini, Federico; Caselli, Javier; Hernandez, Pilar; Dorado, Gabriel.

Front Genet ; 12: 618659, 2021.

Article in English | MEDLINE | ID: mdl-33603776

ABSTRACT

New High-Performance Computing architectures have been recently developed for commercial central processing units (CPUs). Yet, that has not improved the execution time of widely used bioinformatics applications, like BLAST+. This is due to a lack of optimization between the bases of the existing algorithms and the internals of the hardware that allows taking full advantage of the available CPU cores. To optimize the new architectures, algorithms must be revised and redesigned; usually rewritten from scratch. BLVector adapts the high-level concepts of BLAST+ to the x86 architectures with AVX-512, to harness their capabilities. A deep comprehensive study has been carried out to optimize the approach, with a significant reduction in execution time. BLVector reduces the execution time of BLAST+ when aligning up to mid-size protein sequences (~750 amino acids). The gain in real scenario cases is 3.2-fold. When applied to longer proteins, BLVector consumes more time than BLAST+, but retrieves a much larger set of results. BLVector and BLAST+ are fine-tuned heuristics. Therefore, the relevant results returned by both are the same, although they behave differently, especially when performing alignments with low scores. Hence, they can be considered complementary bioinformatics tools.

17.

Somatic Copy Number Alterations in Human Cancers: An Analysis of Publicly Available Data From The Cancer Genome Atlas.

Harbers, Luuk; Agostini, Federico; Nicos, Marcin; Poddighe, Dimitri; Bienko, Magda; Crosetto, Nicola.

Front Oncol ; 11: 700568, 2021.

Article in English | MEDLINE | ID: mdl-34395272

ABSTRACT

Somatic copy number alterations (SCNAs) are a pervasive trait of human cancers that contributes to tumorigenesis by affecting the dosage of multiple genes at the same time. In the past decade, The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) initiatives have generated and made publicly available SCNA genomic profiles from thousands of tumor samples across multiple cancer types. Here, we present a comprehensive analysis of 853,218 SCNAs across 10,729 tumor samples belonging to 32 cancer types using TCGA data. We then discuss current models for how SCNAs likely arise during carcinogenesis and how genomic SCNA profiles can inform clinical practice. Lastly, we highlight open questions in the field of cancer-associated SCNAs.

18.

Comparative Genomics, Evolution, and Drought-Induced Expression of Dehydrin Genes in Model Brachypodium Grasses.

Decena, Maria Angeles; Gálvez-Rojas, Sergio; Agostini, Federico; Sancho, Ruben; Contreras-Moreira, Bruno; Des Marais, David L; Hernandez, Pilar; Catalán, Pilar.

Plants (Basel) ; 10(12)2021 Dec 03.

Article in English | MEDLINE | ID: mdl-34961135

ABSTRACT

Dehydration proteins (dehydrins, DHNs) confer tolerance to water-stress deficit in plants. We performed a comparative genomics and evolutionary study of DHN genes in four model Brachypodium grass species. Due to limited knowledge on dehydrin expression under water deprivation stress in Brachypodium, we also performed a drought-induced gene expression analysis in 32 ecotypes of the genus' flagship species B. distachyon showing different hydric requirements. Genomic sequence analysis detected 10 types of dehydrin genes (Bdhn) across the Brachypodium species. Domain and conserved motif contents of peptides encoded by Bdhn genes revealed eight protein architectures. Bdhn genes were spread across several chromosomes. Selection analysis indicated that all the Bdhn genes were constrained by purifying selection. Three upstream cis-regulatory motifs (BES1, MYB124, ZAT) were detected in several Bdhn genes. Gene expression analysis demonstrated that only four Bdhn1-Bdhn2, Bdhn3, and Bdhn7 genes, orthologs of wheat, barley, rice, sorghum, and maize genes, were expressed in mature leaves of B. distachyon and that all of them were more highly expressed in plants under drought conditions. Brachypodium dehydrin expression was significantly correlated with drought-response phenotypic traits (plant biomass, leaf carbon and proline contents and water use efficiency increases, and leaf water and nitrogen content decreases) being more pronounced in drought-tolerant ecotypes. Our results indicate that dehydrin type and regulation could be a key factor determining the acquisition of water-stress tolerance in grasses.

19.

Genome-wide detection of DNA double-strand breaks by in-suspension BLISS.

Bouwman, Britta A M; Agostini, Federico; Garnerone, Silvano; Petrosino, Giuseppe; Gothe, Henrike J; Sayols, Sergi; Moor, Andreas E; Itzkovitz, Shalev; Bienko, Magda; Roukos, Vassilis; Crosetto, Nicola.

Nat Protoc ; 15(12): 3894-3941, 2020 12.

Article in English | MEDLINE | ID: mdl-33139954

ABSTRACT

sBLISS (in-suspension breaks labeling in situ and sequencing) is a versatile and widely applicable method for identification of endogenous and induced DNA double-strand breaks (DSBs) in any cell type that can be brought into suspension. sBLISS provides genome-wide profiles of the most consequential DNA lesion implicated in a variety of pathological, but also physiological, processes. In sBLISS, after in situ labeling, DSB ends are linearly amplified, followed by next-generation sequencing and DSB landscape analysis. Here, we present a step-by-step experimental protocol for sBLISS, as well as a basic computational analysis. The main advantages of sBLISS are (i) the suspension setup, which renders the protocol user-friendly and easily scalable; (ii) the possibility of adapting it to a high-throughput or single-cell workflow; and (iii) its flexibility and its applicability to virtually every cell type, including patient-derived cells, organoids, and isolated nuclei. The wet-lab protocol can be completed in 1.5 weeks and is suitable for researchers with intermediate expertise in molecular biology and genomics. For the computational analyses, basic-to-intermediate bioinformatics expertise is required.

Subjects

DNA Breaks, Double-Stranded , Genomics/methods , Base Sequence , Cell Line , Suspensions

20.

GPSeq reveals the radial organization of chromatin in the cell nucleus.

Girelli, Gabriele; Custodio, Joaquin; Kallas, Tomasz; Agostini, Federico; Wernersson, Erik; Spanjaard, Bastiaan; Mota, Ana; Kolbeinsdottir, Solrun; Gelali, Eleni; Crosetto, Nicola; Bienko, Magda.

Nat Biotechnol ; 38(10): 1184-1193, 2020 10.

Article in English | MEDLINE | ID: mdl-32451505

ABSTRACT

With the exception of lamina-associated domains, the radial organization of chromatin in mammalian cells remains largely unexplored. Here we describe genomic loci positioning by sequencing (GPSeq), a genome-wide method for inferring distances to the nuclear lamina all along the nuclear radius. GPSeq relies on gradual restriction digestion of chromatin from the nuclear lamina toward the nucleus center, followed by sequencing of the generated cut sites. Using GPSeq, we mapped the radial organization of the human genome at 100-kb resolution, which revealed radial patterns of genomic and epigenomic features and gene expression, as well as A and B subcompartments. By combining radial information with chromosome contact frequencies measured by Hi-C, we substantially improved the accuracy of whole-genome structure modeling. Finally, we charted the radial topography of DNA double-strand breaks, germline variants and cancer mutations and found that they have distinctive radial arrangements in A and B subcompartments. We conclude that GPSeq can reveal fundamental aspects of genome architecture.

Subjects

Cell Nucleus/genetics , Chromatin/genetics , Epigenomics , Genome, Human/genetics , Gene Expression Regulation/genetics , High-Throughput Nucleotide Sequencing , Humans
