ABSTRACT
In this issue, Shachar et al. report a high-throughput imaging position mapping platform (HIPmap) enabling large-scale, high-resolution localization of 3D gene positions in single cells. Coupling loss-of-function screens with HIPmap, the authors identify DNA replication rather than mitosis as a major determinant of genome positioning.
Subject(s)
Cell Nucleus/genetics , Genes , Genetic Techniques , Humans
ABSTRACT
Genomes are arranged non-randomly in the 3D space of the cell nucleus. Here, we have developed HIPMap, a high-precision, high-throughput, automated fluorescent in situ hybridization imaging pipeline, for mapping of the spatial location of genome regions at large scale. High-throughput imaging position mapping (HIPMap) enabled an unbiased siRNA screen for factors involved in genome organization in human cells. We identify 50 cellular factors required for proper positioning of a set of functionally diverse genomic loci. Positioning factors include chromatin remodelers, histone modifiers, and nuclear envelope and pore proteins. Components of the replication and post-replication chromatin re-assembly machinery are prominently represented among positioning factors, and timely progression of cells through replication, but not mitosis, is required for correct gene positioning. Our results establish a method for the large-scale mapping of genome locations and have led to the identification of a compendium of cellular factors involved in spatial genome organization.
Subject(s)
Cell Nucleus/genetics , Genes , Genetic Techniques , Cell Line , DNA Replication , Humans , Image Processing, Computer-Assisted/methods , In Situ Hybridization, Fluorescence/methods , Single-Cell Analysis/methods
ABSTRACT
The nucleus is highly organized, such that factors involved in the transcription and processing of distinct classes of RNA are confined within specific nuclear bodies1,2. One example is the nuclear speckle, which is defined by high concentrations of protein and noncoding RNA regulators of pre-mRNA splicing3. What functional role, if any, speckles might play in the process of mRNA splicing is unclear4,5. Here we show that genes localized near nuclear speckles display higher spliceosome concentrations, increased spliceosome binding to their pre-mRNAs and higher co-transcriptional splicing levels than genes that are located farther from nuclear speckles. Gene organization around nuclear speckles is dynamic between cell types, and changes in speckle proximity lead to differences in splicing efficiency. Finally, directed recruitment of a pre-mRNA to nuclear speckles is sufficient to increase mRNA splicing levels. Together, our results integrate the long-standing observations of nuclear speckles with the biochemistry of mRNA splicing and demonstrate a crucial role for dynamic three-dimensional spatial organization of genomic DNA in driving spliceosome concentrations and controlling the efficiency of mRNA splicing.
Subject(s)
Genome , Nuclear Speckles , RNA Precursors , RNA Splicing , RNA, Messenger , Spliceosomes , Animals , Humans , Male , Mice , Genes , Genome/genetics , Human Embryonic Stem Cells/metabolism , Mouse Embryonic Stem Cells/metabolism , Nuclear Speckles/genetics , Nuclear Speckles/metabolism , RNA Precursors/metabolism , RNA Precursors/genetics , RNA Splicing/genetics , RNA, Messenger/genetics , RNA, Messenger/metabolism , Spliceosomes/metabolism , Transcription, Genetic
ABSTRACT
Scientists have been trying to identify every gene in the human genome since the initial draft was published in 2001. In the years since, much progress has been made in identifying protein-coding genes, currently estimated to number fewer than 20,000, with an ever-expanding number of distinct protein-coding isoforms. Here we review the status of the human gene catalogue and the efforts to complete it in recent years. Besides the ongoing annotation of protein-coding genes, their isoforms and pseudogenes, the invention of high-throughput RNA sequencing and other technological breakthroughs have led to a rapid growth in the number of reported non-coding RNA genes. For most of these non-coding RNAs, the functional relevance is currently unclear; we look at recent advances that offer paths forward to identifying their functions and towards eventually completing the human gene catalogue. Finally, we examine the need for a universal annotation standard that includes all medically significant genes and maintains their relationships with different reference genomes for the use of the human gene catalogue in clinical settings.
Subject(s)
Genes , Genome, Human , Molecular Sequence Annotation , Protein Isoforms , Humans , Genome, Human/genetics , Molecular Sequence Annotation/standards , Molecular Sequence Annotation/trends , Protein Isoforms/genetics , Human Genome Project , Pseudogenes , RNA/genetics
ABSTRACT
The transcriptional machinery is thought to dissociate from DNA during replication. Certain proteins, termed epigenetic marks, must be transferred from parent to daughter DNA strands in order to maintain the memory of transcriptional states1,2. These proteins are believed to re-initiate rebuilding of chromatin structure, which ultimately recruits RNA polymerase II (Pol II) to the newly replicated daughter strands. It is believed that Pol II is recruited back to active genes only after chromatin is rebuilt3,4. However, there is little experimental evidence addressing the central questions of when and how Pol II is recruited back to the daughter strands and resumes transcription. Here we show that immediately after passage of the replication fork, Pol II in complex with other general transcription proteins and immature RNA re-associates with active genes on both leading and lagging strands of nascent DNA, and rapidly resumes transcription. This suggests that the transcriptionally active Pol II complex is retained in close proximity to DNA, with a Pol II-PCNA interaction potentially underlying this retention. These findings indicate that the Pol II machinery may not require epigenetic marks to be recruited to the newly synthesized DNA during the transition from DNA replication to resumption of transcription.
Subject(s)
Chromatin , DNA Replication , DNA , Genes , RNA Polymerase II , Transcription, Genetic , Chromatin/genetics , DNA/biosynthesis , DNA/genetics , DNA/metabolism , DNA Polymerase II/metabolism , Epigenesis, Genetic , Proliferating Cell Nuclear Antigen/metabolism , RNA Polymerase II/metabolism , Transcription Factors, General/metabolism , RNA/genetics , RNA/metabolism
ABSTRACT
The goal of genomics and systems biology is to understand how complex systems of factors assemble into pathways and structures that combine to form living organisms. Great advances in understanding biological processes result from determining the function of individual genes, a process that has classically relied on characterizing single mutations. Advances in DNA sequencing have made available the complete set of genetic instructions for an astonishing and growing number of species. To understand the function of this ever-increasing number of genes, a high-throughput method was developed that in a single experiment can measure the function of genes across the genome of an organism. This occurred approximately 10 years ago, when high-throughput DNA sequencing was combined with advances in transposon-mediated mutagenesis in a method termed transposon insertion sequencing (TIS). In the subsequent years, TIS succeeded in addressing fundamental questions regarding the genes of bacteria, many of which have been shown to play central roles in bacterial infections that result in major human diseases. The field of TIS has matured and resulted in studies of hundreds of species that include significant innovations with a number of transposons. Here, we summarize a number of TIS experiments to provide an understanding of the method and explanation of approaches that are instructive when designing a study. Importantly, we emphasize critical aspects of a TIS experiment and highlight the extension and applicability of TIS into nonbacterial species such as yeast.
Subject(s)
DNA Transposable Elements/genetics , Genes/genetics , Animals , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Humans , Mutagenesis/genetics , Mutation
ABSTRACT
Animals and fungi have radically distinct morphologies, yet both evolved within the same eukaryotic supergroup: Opisthokonta1,2. Here we reconstructed the trajectory of genetic changes that accompanied the origin of Metazoa and Fungi since the divergence of Opisthokonta with a dataset that includes four novel genomes from crucial positions in the Opisthokonta phylogeny. We show that animals arose only after the accumulation of genes functionally important for their multicellularity, a tendency that began in the pre-metazoan ancestors and later accelerated in the metazoan root. By contrast, the pre-fungal ancestors experienced net losses of most functional categories, including those gained in the path to Metazoa. On a broad-scale functional level, fungal genomes contain a higher proportion of metabolic genes and diverged less from the last common ancestor of Opisthokonta than did the gene repertoires of Metazoa. Metazoa and Fungi also show differences regarding gene gain mechanisms. Gene fusions are more prevalent in Metazoa, whereas a larger fraction of gene gains were detected as horizontal gene transfers in Fungi and protists, in agreement with the long-standing idea that transfers would be less relevant in Metazoa due to germline isolation3-5. Together, our results indicate that animals and fungi evolved under two contrasting trajectories of genetic change that predated the origin of both groups. The gradual establishment of two clearly differentiated genomic contexts thus set the stage for the emergence of Metazoa and Fungi.
Subject(s)
Evolution, Molecular , Fungi , Genome , Genomics , Phylogeny , Animals , Fungi/genetics , Gene Transfer, Horizontal , Genes , Genome/genetics , Genome, Fungal/genetics , Metabolism/genetics
ABSTRACT
Oxidative genome damage is an unavoidable consequence of cellular metabolism. It arises at gene regulatory elements by epigenetic demethylation during transcriptional activation1,2. Here we show that promoters are protected from oxidative damage via a process mediated by the nuclear mitotic apparatus protein NuMA (also known as NUMA1). NuMA exhibits genomic occupancy approximately 100 bp around transcription start sites. It binds the initiating form of RNA polymerase II, pause-release factors and single-strand break repair (SSBR) components such as TDP1. The binding is increased on chromatin following oxidative damage, and TDP1 enrichment at damaged chromatin is facilitated by NuMA. Depletion of NuMA increases oxidative damage at promoters. NuMA promotes transcription by limiting the polyADP-ribosylation of RNA polymerase II, increasing its availability and release from pausing at promoters. Metabolic labelling of nascent RNA identifies genes that depend on NuMA for transcription including immediate-early response genes. Complementation of NuMA-deficient cells with a mutant that mediates binding to SSBR, or a mitotic separation-of-function mutant, restores SSBR defects. These findings underscore the importance of oxidative DNA damage repair at gene regulatory elements and describe a process that fulfils this function.
Subject(s)
Cell Cycle Proteins , DNA Damage , DNA Repair , Oxidative Stress , Promoter Regions, Genetic , Cell Cycle Proteins/metabolism , Chromatin/genetics , Genes , Genetic Complementation Test , Mitosis , Mutation , Oxidative Stress/genetics , Phosphoric Diester Hydrolases/metabolism , Poly ADP Ribosylation , Promoter Regions, Genetic/genetics , RNA/biosynthesis , RNA/genetics , RNA Polymerase II/metabolism , Spindle Apparatus/metabolism , Transcription Initiation Site
ABSTRACT
We may never understand the function of all genes, findings by Freeman, Munro and colleagues suggest, unless we rethink our approaches. They make a thorough attempt at quantifying the unknownness of protein-coding genes and experimentally prove that many neglected genes hold the seed of important discoveries.
Subject(s)
Genes
ABSTRACT
The genetic approach, based on the study of inherited forms of deafness, has proven to be particularly effective for deciphering the molecular mechanisms underlying the development of the peripheral auditory system, the cochlea and its afferent auditory neurons, and how this system extracts the physical parameters of sound. Although this genetic dissection has provided little information about the central auditory system, scattered data suggest that some genes may have a critical role in both the peripheral and central auditory systems. Here, we review the genes controlling the development and function of the peripheral and central auditory systems, focusing on those with demonstrated intrinsic roles in both systems and highlighting the current underappreciation of these genes. Their encoded products are diverse, from transcription factors to ion channels, as are their roles in the central auditory system, mostly evaluated in brainstem nuclei. We examine the ontogenetic and evolutionary mechanisms that may underlie their expression at different sites.
Subject(s)
Auditory Pathways/physiology , Gene Expression Regulation, Developmental , Genes , Neurogenesis/genetics , Animals , Auditory Pathways/growth & development , Biological Evolution , Cochlea/embryology , Cochlea/growth & development , Cochlea/physiology , Gene Ontology , Hair Cells, Auditory/cytology , Hair Cells, Auditory/physiology , Hearing Disorders/genetics , Humans , Ion Channels/genetics , Ion Channels/physiology , Nerve Tissue Proteins/genetics , Nerve Tissue Proteins/physiology , Rhombencephalon/embryology , Rhombencephalon/growth & development , Rhombencephalon/physiology , Sensory Receptor Cells/physiology , Transcription Factors/genetics , Transcription Factors/physiology
ABSTRACT
The divergence of chimpanzee and bonobo provides one of the few examples of recent hominid speciation1,2. Here we describe a fully annotated, high-quality bonobo genome assembly, which was constructed without guidance from reference genomes by applying a multiplatform genomics approach. We generate a bonobo genome assembly in which more than 98% of genes are completely annotated and 99% of the gaps are closed, including the resolution of about half of the segmental duplications and almost all of the full-length mobile elements. We compare the bonobo genome to those of other great apes1,3-5 and identify more than 5,569 fixed structural variants that specifically distinguish the bonobo and chimpanzee lineages. We focus on genes that have been lost, changed in structure or expanded in the last few million years of bonobo evolution. We produce a high-resolution map of incomplete lineage sorting and estimate that around 5.1% of the human genome is genetically closer to chimpanzee or bonobo and that more than 36.5% of the genome shows incomplete lineage sorting if we consider a deeper phylogeny including gorilla and orangutan. We also show that 26% of the segments of incomplete lineage sorting between human and chimpanzee or human and bonobo are non-randomly distributed and that genes within these clustered segments show significant excess of amino acid replacement compared to the rest of the genome.
Subject(s)
Evolution, Molecular , Genome/genetics , Genomics , Pan paniscus/genetics , Phylogeny , Animals , Eukaryotic Initiation Factor-4A/genetics , Female , Genes , Gorilla gorilla/genetics , Molecular Sequence Annotation/standards , Pan troglodytes/genetics , Pongo/genetics , Segmental Duplications, Genomic , Sequence Analysis, DNA
ABSTRACT
The three-dimensional (3D) structure of chromatin is intrinsically associated with gene regulation and cell function1-3. Methods based on chromatin conformation capture have mapped chromatin structures in neuronal systems such as in vitro differentiated neurons, neurons isolated through fluorescence-activated cell sorting from cortical tissues pooled from different animals and from dissociated whole hippocampi4-6. However, changes in chromatin organization captured by imaging, such as the relocation of Bdnf away from the nuclear periphery after activation7, are invisible with such approaches8. Here we developed immunoGAM, an extension of genome architecture mapping (GAM)2,9, to map 3D chromatin topology genome-wide in specific brain cell types, without tissue disruption, from single animals. GAM is a ligation-free technology that maps genome topology by sequencing the DNA content from thin (about 220 nm) nuclear cryosections. Chromatin interactions are identified from the increased probability of co-segregation of contacting loci across a collection of nuclear slices. ImmunoGAM expands the scope of GAM to enable the selection of specific cell types using low cell numbers (approximately 1,000 cells) within a complex tissue and avoids tissue dissociation2,10. We report cell-type specialized 3D chromatin structures at multiple genomic scales that relate to patterns of gene expression. We discover extensive 'melting' of long genes when they are highly expressed and/or have high chromatin accessibility. The contacts most specific to neuron subtypes contain genes associated with specialized processes, such as addiction and synaptic plasticity, which harbour putative binding sites for neuronal transcription factors within accessible chromatin regions. Moreover, sensory receptor genes are preferentially found in heterochromatic compartments in brain cells, which establish strong contacts across tens of megabases. Our results demonstrate that highly specific chromatin conformations in brain cells are tightly related to gene regulation mechanisms and specialized functions.
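The co-segregation principle mentioned in this abstract can be illustrated with a toy calculation. The sketch below is not the immunoGAM pipeline: the segregation table, the locus names and the `cosegregation` helper are all hypothetical, and the linkage score D = f_ab - f_a * f_b is one simple normalization assumed here for illustration only.

```python
from itertools import combinations

def cosegregation(table):
    """Pairwise co-segregation from a binary segregation table.

    table maps a locus name to a list of 0/1 flags, one per nuclear slice
    (1 = locus detected in that slice). Returns {(a, b): (f_ab, D)} where
    f_ab is the fraction of slices containing both loci and
    D = f_ab - f_a * f_b (assumed normalization, hypothetical helper).
    """
    n = len(next(iter(table.values())))
    # Detection frequency of each locus across all slices.
    freq = {locus: sum(flags) / n for locus, flags in table.items()}
    result = {}
    for a, b in combinations(sorted(table), 2):
        f_ab = sum(x & y for x, y in zip(table[a], table[b])) / n
        result[(a, b)] = (f_ab, f_ab - freq[a] * freq[b])
    return result

# Hypothetical toy data: 8 slices, 3 loci.
slices = {
    "locusA": [1, 1, 0, 0, 1, 0, 1, 0],
    "locusB": [1, 1, 0, 0, 1, 0, 0, 0],
    "locusC": [0, 0, 1, 1, 0, 1, 0, 1],
}
scores = cosegregation(slices)
for pair, (f_ab, d) in scores.items():
    print(pair, f_ab, d)
```

In this toy table, locusA and locusB appear together more often than expected from their individual frequencies (positive D), whereas locusA and locusC never share a slice (negative D).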
Subject(s)
Brain/cytology , Cells/classification , Chromatin Assembly and Disassembly , Chromatin/chemistry , Chromatin/genetics , Genes , Molecular Conformation , Animals , Binding Sites , Cells/metabolism , Chromatin/metabolism , Gene Expression Regulation , Male , Mice , Multigene Family/genetics , Neurons/classification , Neurons/metabolism , Nucleic Acid Denaturation , Transcription Factors/metabolism
ABSTRACT
Homologous recombination (HR) repairs DNA double-strand breaks (DSBs) in the S and G2 phases of the cell cycle1-3. Several HR proteins are preferentially recruited to DSBs at transcriptionally active loci4-10, but how transcription promotes HR is poorly understood. Here we develop an assay to assess the effect of local transcription on HR. Using this assay, we find that transcription stimulates HR to a substantial extent. Tethering RNA transcripts to the vicinity of DSBs recapitulates the effects of local transcription, which suggests that transcription enhances HR through RNA transcripts. Tethered RNA transcripts stimulate HR in a sequence- and orientation-dependent manner, indicating that they function by forming DNA-RNA hybrids. In contrast to most HR proteins, RAD51-associated protein 1 (RAD51AP1) only promotes HR when local transcription is active. RAD51AP1 drives the formation of R-loops in vitro and is required for tethered RNAs to stimulate HR in cells. Notably, RAD51AP1 is necessary for the DSB-induced formation of DNA-RNA hybrids in donor DNA, linking R-loops to D-loops. In vitro, RAD51AP1-generated R-loops enhance the RAD51-mediated formation of D-loops locally and give rise to intermediates that we term 'DR-loops', which contain both DNA-DNA and DNA-RNA hybrids and favour RAD51 function. Thus, at DSBs in transcribed regions, RAD51AP1 promotes the invasion of RNA transcripts into donor DNA, and stimulates HR through the formation of DR-loops.
Subject(s)
DNA/genetics , DNA/metabolism , Homologous Recombination/genetics , R-Loop Structures/genetics , RNA, Messenger/genetics , RNA, Messenger/metabolism , Transcription, Genetic , Basic Helix-Loop-Helix Transcription Factors/genetics , Cell Line , DNA/chemistry , DNA Breaks, Double-Stranded , DNA Repair , DNA-Binding Proteins/metabolism , Genes/genetics , Genes, Reporter/genetics , Green Fluorescent Proteins/genetics , Humans , In Vitro Techniques , RNA, Messenger/chemistry , RNA-Binding Proteins/metabolism , Rad51 Recombinase/metabolism
ABSTRACT
DNase I hypersensitive sites (DHSs) are generic markers of regulatory DNA1-5 and contain genetic variations associated with diseases and phenotypic traits6-8. We created high-resolution maps of DHSs from 733 human biosamples encompassing 438 cell and tissue types and states, and integrated these to delineate and numerically index approximately 3.6 million DHSs within the human genome sequence, providing a common coordinate system for regulatory DNA. Here we show that these maps highly resolve the cis-regulatory compartment of the human genome, which encodes unexpectedly diverse cell- and tissue-selective regulatory programs at very high density. These programs can be captured comprehensively by a simple vocabulary that enables the assignment to each DHS of a regulatory barcode that encapsulates its tissue manifestations, and global annotation of protein-coding and non-coding RNA genes in a manner orthogonal to gene expression. Finally, we show that sharply resolved DHSs markedly enhance the genetic association and heritability signals of diseases and traits. Rather than being confined to a small number of distal elements or promoters, we find that genetic signals converge on congruently regulated sets of DHSs that decorate entire gene bodies. Together, our results create a universal, extensible coordinate system and vocabulary for human regulatory DNA marked by DHSs, and provide a new global perspective on the architecture of human gene regulation.
Subject(s)
Chromatin/genetics , DNA/metabolism , Deoxyribonuclease I/metabolism , Molecular Sequence Annotation , Chromatin/chemistry , Chromatin/metabolism , DNA/chemistry , DNA/genetics , Gene Expression Regulation , Genes/genetics , Genome, Human/genetics , Humans , Promoter Regions, Genetic/genetics , Regulatory Sequences, Nucleic Acid/genetics
ABSTRACT
Bridging the gap between genetic variations, environmental determinants, and phenotypic outcomes is critical for supporting clinical diagnosis and understanding mechanisms of diseases. It requires integrating open data at a global scale. The Monarch Initiative advances these goals by developing open ontologies, semantic data models, and knowledge graphs for translational research. The Monarch App is an integrated platform combining data about genes, phenotypes, and diseases across species. Monarch's APIs enable access to carefully curated datasets and advanced analysis tools that support the understanding and diagnosis of disease for diverse applications such as variant prioritization, deep phenotyping, and patient profile-matching. We have migrated our system into a scalable, cloud-based infrastructure; simplified Monarch's data ingestion and knowledge graph integration systems; enhanced data mapping and integration standards; and developed a new user interface with novel search and graph navigation features. Furthermore, we advanced Monarch's analytic tools by developing a customized plugin for OpenAI's ChatGPT to increase the reliability of its responses about phenotypic data, allowing us to interrogate the knowledge in the Monarch graph using state-of-the-art Large Language Models. The resources of the Monarch Initiative can be found at monarchinitiative.org and its corresponding code repository at github.com/monarch-initiative/monarch-app.
Subject(s)
Databases, Factual , Disease , Genes , Phenotype , Humans , Internet , Databases, Factual/standards , Software , Genes/genetics , Disease/genetics
ABSTRACT
RNA interference (RNAi) is an effective tool for genome-scale, high-throughput analysis of gene function. In the past five years, a number of genome-scale RNAi high-throughput screens (HTSs) have been done in both Drosophila and mammalian cultured cells to study diverse biological processes, including signal transduction, cancer biology, and host cell responses to infection. Results from these screens have led to the identification of new components of these processes and, importantly, have also provided insights into the complexity of biological systems, forcing new and innovative approaches to understanding functional networks in cells. Here, we review the main findings that have emerged from RNAi HTS and discuss technical issues that remain to be improved, in particular the verification of RNAi results and validation of their biological relevance. Furthermore, we discuss the importance of multiplexed and integrated experimental data analysis pipelines to RNAi HTS.
Subject(s)
Genes , Genetic Techniques , RNA Interference , Animals , Genome , Humans
ABSTRACT
Mammalian gene expression is inherently stochastic1,2, and results in discrete bursts of RNA molecules that are synthesized from each allele3-7. Although transcription is known to be regulated by promoters and enhancers, it is unclear how cis-regulatory sequences encode transcriptional burst kinetics. Characterization of transcriptional bursting, including the burst size and frequency, has mainly relied on live-cell4,6,8 or single-molecule RNA fluorescence in situ hybridization3,5,8,9 recordings of selected loci. Here we determine transcriptome-wide burst frequencies and sizes for endogenous mouse and human genes using allele-sensitive single-cell RNA sequencing. We show that core promoter elements affect burst size and uncover synergistic effects between TATA and initiator elements, which were masked at mean expression levels. Notably, we provide transcriptome-wide evidence that enhancers control burst frequencies, and demonstrate that cell-type-specific gene expression is primarily shaped by changes in burst frequencies. Together, our data show that burst frequency is primarily encoded in enhancers and burst size in core promoters, and that allelic single-cell RNA sequencing is a powerful model for investigating transcriptional kinetics.
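As a rough illustration of how burst frequency and burst size relate to an expression distribution, the sketch below applies a method-of-moments estimate under the bursty (negative binomial) limit of the two-state telegraph model, where mean = frequency x size and variance = mean x (1 + size). This is a simplified stand-in assumed here for illustration, not the inference procedure used in the study; the function name `burst_moments` and the example counts are hypothetical.

```python
from statistics import mean, pvariance

def burst_moments(counts):
    """Method-of-moments burst estimate for one allele (hypothetical helper).

    Assumes the bursty limit of the telegraph model, in which per-cell RNA
    counts are negative binomial with
        mean     = freq * size
        variance = mean * (1 + size)
    so size = variance / mean - 1 and freq = mean / size.
    freq is expressed in bursts per mRNA lifetime.
    Returns None when the data are not over-dispersed (variance <= mean),
    because the estimate is undefined in that case.
    """
    m = mean(counts)
    v = pvariance(counts)  # population variance of the observed counts
    if m <= 0 or v <= m:
        return None
    size = v / m - 1.0
    freq = m / size
    return freq, size

# Hypothetical per-cell counts for one allele: mostly silent, two large bursts.
counts = [0, 0, 0, 8, 0, 0, 12, 0, 0, 0]
estimate = burst_moments(counts)
print(estimate)
```

Two alleles with the same mean expression can differ sharply in these two parameters, which is why effects on burst size can be masked when only mean expression is compared.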
Subject(s)
Genes/genetics , Genomics , Transcription, Genetic/genetics , Alleles , Animals , Enhancer Elements, Genetic/genetics , Fibroblasts/metabolism , Humans , Kinetics , Male , Mice , Mouse Embryonic Stem Cells/metabolism , Organ Specificity/genetics , Polymorphism, Genetic , Promoter Regions, Genetic/genetics , Sequence Analysis, RNA , Sequence Deletion , Single-Cell Analysis , Stochastic Processes , TATA Box/genetics , Transcriptome/genetics
ABSTRACT
Large-scale genome sequencing is poised to provide a substantial increase in the rate of discovery of disease-associated mutations, but the functional interpretation of such mutations remains challenging. Here we show that deletions of a sequence on human chromosome 16 that we term the intestine-critical region (ICR) cause intractable congenital diarrhoea in infants1,2. Reporter assays in transgenic mice show that the ICR contains a regulatory sequence that activates transcription during the development of the gastrointestinal system. Targeted deletion of the ICR in mice caused symptoms that recapitulated the human condition. Transcriptome analysis revealed that an unannotated open reading frame (Percc1) flanks the regulatory sequence, and the expression of this gene was lost in the developing gut of mice that lacked the ICR. Percc1-knockout mice displayed phenotypes similar to those observed upon ICR deletion in mice and patients, whereas an ICR-driven Percc1 transgene was sufficient to rescue the phenotypes found in mice that lacked the ICR. Together, our results identify a gene that is critical for intestinal function and underscore the need for targeted in vivo studies to interpret the growing number of clinical genetic findings that do not affect known protein-coding genes.
Subject(s)
Diarrhea/congenital , Diarrhea/genetics , Enhancer Elements, Genetic/genetics , Gene Expression Regulation, Developmental , Genes , Intestines/physiology , Sequence Deletion/genetics , Animals , Chromosomes, Human, Pair 16/genetics , Disease Models, Animal , Female , Genes, Reporter , Genetic Loci/genetics , Humans , Male , Mice , Mice, Knockout , Mice, Transgenic , Pedigree , Phenotype , Transcriptional Activation , Transcriptome/genetics , Transgenes/genetics
ABSTRACT
g:Profiler is a reliable and up-to-date functional enrichment analysis tool that supports various evidence types, identifier types and organisms. The toolset integrates many databases, including Gene Ontology, KEGG and TRANSFAC, to provide a comprehensive and in-depth analysis of gene lists. It also provides interactive and intuitive user interfaces and supports ordered queries and custom statistical backgrounds, among other settings. g:Profiler provides multiple programmatic interfaces to access its functionality. These can be easily integrated into custom workflows and external tools, making them valuable resources for researchers who want to develop their own solutions. g:Profiler has been available since 2007 and is used to analyse millions of queries. Research reproducibility and transparency are achieved by maintaining working versions of all past database releases since 2015. g:Profiler supports 849 species, including vertebrates, plants, fungi, insects and parasites, and can analyse any organism through user-uploaded custom annotation files. In this update article, we introduce a novel filtering method highlighting Gene Ontology driver terms, accompanied by new graph visualizations providing a broader context for significant Gene Ontology terms. As a leading enrichment analysis and gene list interoperability service, g:Profiler offers a valuable resource for genetics, biology and medical researchers. It is freely accessible at https://biit.cs.ut.ee/gprofiler.
Subject(s)
Chromosome Mapping , Computational Biology , Genes , Software , Animals , Chromosome Mapping/instrumentation , Chromosome Mapping/methods , Databases, Genetic , Internet , Reproducibility of Results , User-Computer Interface , Computational Biology/instrumentation , Computational Biology/methods , Genes/genetics , Humans
ABSTRACT
Variation in gene expression across lineages is thought to explain much of the observed phenotypic variation and adaptation. The protein is closer to the target of natural selection but gene expression is typically measured as the amount of mRNA. The broad assumption that mRNA levels are good proxies for protein levels has been undermined by a number of studies reporting moderate or weak correlations between the two measures across species. One biological explanation for this discrepancy is that there has been compensatory evolution between the mRNA level and regulation of translation. However, we do not understand the evolutionary conditions necessary for this to occur nor the expected strength of the correlation between mRNA and protein levels. Here, we develop a theoretical model for the coevolution of mRNA and protein levels and investigate the dynamics of the model over time. We find that compensatory evolution is widespread when there is stabilizing selection on the protein level; this observation held true across a variety of regulatory pathways. When the protein level is under directional selection, the mRNA level of a gene and the translation rate of the same gene were negatively correlated across lineages but positively correlated across genes. These findings help explain results from comparative studies of gene expression and potentially enable researchers to disentangle biological and statistical hypotheses for the mismatch between transcriptomic and proteomic data.