Results 1 - 20 of 21
1.
Bioinformatics ; 39(8), 2023 08 01.
Article in English | MEDLINE | ID: mdl-37624924

ABSTRACT

SUMMARY: Many existing software libraries for genomics require researchers to pick between competing considerations: the performance of compiled languages and the accessibility of interpreted languages. Go, a modern compiled language, provides an opportunity to address this conflict. We introduce Gonomics, an open-source collection of command line programs and bioinformatic libraries implemented in Go that unites readability and performance for genomic analyses. Gonomics contains packages to read, write, and manipulate a wide array of file formats (e.g. FASTA, FASTQ, BED, BEDPE, SAM, BAM, and VCF), and can convert and interface between these formats. Furthermore, our modular library structure provides a flexible platform for researchers developing their own software tools to address specific questions. These commands can be combined and incorporated into complex pipelines to meet the growing need for high-performance bioinformatic resources. AVAILABILITY AND IMPLEMENTATION: Gonomics is implemented in the Go programming language. Source code, installation instructions, and documentation are freely available at https://github.com/vertgenlab/gonomics.


Subject(s)
Comprehension , Genomics , Computational Biology , Programming Languages , Documentation
2.
bioRxiv ; 2023 May 11.
Article in English | MEDLINE | ID: mdl-37214832

ABSTRACT

Spinocerebellar ataxia type 7 (SCA7) is an inherited neurodegenerative disorder caused by a CAG-polyglutamine repeat expansion. SCA7 patients display a striking loss of Purkinje cell (PC) neurons with disease progression; however, PCs are rare, making them difficult to characterize. We developed a PC nuclei enrichment protocol and applied it to single-nucleus RNA-seq of a SCA7 knock-in mouse model. Our results unify prior observations into a central mechanism of cell identity loss, impacting both glia and PCs, driving accumulation of inhibitory synapses and altered PC spiking. Zebrin-II subtype dysregulation is the predominant signal in PCs, leading to complete loss of zebrin-II striping at motor symptom onset in SCA7 mice. We show this zebrin-II subtype degradation is shared across Polyglutamine Ataxia mouse models and SCA7 patients. It has been speculated that PC subtype organization is critical for cerebellar function, and our results suggest that a breakdown of zebrin-II parasagittal striping is pathological.

3.
Nat Rev Genet ; 24(10): 687-711, 2023 10.
Article in English | MEDLINE | ID: mdl-36737647

ABSTRACT

Our ancestors acquired morphological, cognitive and metabolic modifications that enabled humans to colonize diverse habitats, develop extraordinary technologies and reshape the biosphere. Understanding the genetic, developmental and molecular bases for these changes will provide insights into how we became human. Connecting human-specific genetic changes to species differences has been challenging owing to an abundance of low-effect size genetic changes, limited descriptions of phenotypic differences across development at the level of cell types and lack of experimental models. Emerging approaches for single-cell sequencing, genetic manipulation and stem cell culture now support descriptive and functional studies in defined cell types with a human or ape genetic background. In this Review, we describe how the sequencing of genomes from modern and archaic hominins, great apes and other primates is revealing human-specific genetic changes and how new molecular and cellular approaches, including cell atlases and organoids, are enabling exploration of the candidate causal factors that underlie human-specific traits.


Subject(s)
Hominidae , Animals , Humans , Hominidae/genetics , Organoids , Biological Evolution , Evolution, Molecular
4.
Cell ; 185(24): 4587-4603.e23, 2022 11 23.
Article in English | MEDLINE | ID: mdl-36423581

ABSTRACT

Searches for the genetic underpinnings of uniquely human traits have focused on human-specific divergence in conserved genomic regions, which reflects adaptive modifications of existing functional elements. However, the study of conserved regions excludes functional elements that descended from previously neutral regions. Here, we demonstrate that the fastest-evolved regions of the human genome, which we term "human ancestor quickly evolved regions" (HAQERs), rapidly diverged in an episodic burst of directional positive selection prior to the human-Neanderthal split, before transitioning to constraint within hominins. HAQERs are enriched for bivalent chromatin states, particularly in gastrointestinal and neurodevelopmental tissues, and genetic variants linked to neurodevelopmental disease. We developed a multiplex, single-cell in vivo enhancer assay to discover that rapid sequence divergence in HAQERs generated hominin-unique enhancers in the developing cerebral cortex. We propose that a lack of pleiotropic constraints and elevated mutation rates poised HAQERs for rapid adaptation and subsequent susceptibility to disease.


Subject(s)
Hominidae , Neanderthals , Animals , Humans , Hominidae/genetics , Regulatory Sequences, Nucleic Acid , Neanderthals/genetics , Genome, Human , Genomics
5.
Cell ; 185(24): 4507-4525.e18, 2022 11 23.
Article in English | MEDLINE | ID: mdl-36356582

ABSTRACT

The human pathogen Mycobacterium tuberculosis typically causes lung disease but can also disseminate to other tissues. We identified a M. tuberculosis (Mtb) outbreak presenting with unusually high rates of extrapulmonary dissemination and bone disease. We found that the causal strain carried an ancestral full-length version of the type VII-secreted effector EsxM rather than the truncated version present in other modern Mtb lineages. The ancestral EsxM variant exacerbated dissemination through enhancement of macrophage motility, increased egress of macrophages from established granulomas, and alterations in macrophage actin dynamics. Reconstitution of the ancestral version of EsxM in an attenuated modern strain of Mtb altered the migratory mode of infected macrophages, enhancing their motility. In a zebrafish model, full-length EsxM promoted bone disease. The presence of a derived nonsense variant in EsxM throughout the major Mtb lineages 2, 3, and 4 is consistent with a role for EsxM in regulating the extent of dissemination.


Subject(s)
Bone Diseases , Mycobacterium marinum , Mycobacterium tuberculosis , Tuberculosis , Animals , Humans , Zebrafish , Tuberculosis/microbiology , Macrophages/microbiology , Bacterial Proteins/genetics
6.
Nat Ecol Evol ; 6(10): 1537-1552, 2022 10.
Article in English | MEDLINE | ID: mdl-36050398

ABSTRACT

Understanding the mechanisms leading to new traits or additional features in organisms is a fundamental goal of evolutionary biology. We show that HOXDB regulatory changes have been used repeatedly in different fish genera to alter the length and number of the prominent dorsal spines used to classify stickleback species. In Gasterosteus aculeatus (typically 'three-spine sticklebacks'), a variant HOXDB allele is genetically linked to shortening an existing spine and adding an additional spine. In Apeltes quadracus (typically 'four-spine sticklebacks'), a variant HOXDB allele is associated with lengthening a spine and adding an additional spine in natural populations. The variant alleles alter the same non-coding enhancer region in the HOXDB locus but do so by diverse mechanisms, including single-nucleotide polymorphisms, deletions and transposable element insertions. The independent regulatory changes are linked to anterior expansion or contraction of HOXDB expression. We propose that associated changes in spine lengths and numbers are partial identity transformations in a repeating skeletal series that forms major defensive structures in fish. Our findings support the long-standing hypothesis that natural Hox gene variation underlies key patterning changes in wild populations and illustrate how different mutational mechanisms affecting the same region may produce opposite gene expression changes with similar phenotypic outcomes.


Subject(s)
Genes, Homeobox , Smegmamorpha , Animals , DNA Transposable Elements , Phenotype , Smegmamorpha/genetics
7.
Cell ; 176(4): 743-756.e17, 2019 02 07.
Article in English | MEDLINE | ID: mdl-30735633

ABSTRACT

Direct comparisons of human and non-human primate brains can reveal molecular pathways underlying remarkable specializations of the human brain. However, chimpanzee tissue is inaccessible during neocortical neurogenesis when differences in brain size first appear. To identify human-specific features of cortical development, we leveraged recent innovations that permit generating pluripotent stem cell-derived cerebral organoids from chimpanzee. Despite metabolic differences, organoid models preserve gene regulatory networks related to primary cell types and developmental processes. We further identified 261 differentially expressed genes in human compared to both chimpanzee organoids and macaque cortex, enriched for recent gene duplications, and including multiple regulators of PI3K-AKT-mTOR signaling. We observed increased activation of this pathway in human radial glia, dependent on two receptors upregulated specifically in human: INSR and ITGB8. Our findings establish a platform for systematic analysis of molecular changes contributing to human brain development and evolution.


Subject(s)
Cerebral Cortex/cytology , Organoids/metabolism , Animals , Biological Evolution , Brain/cytology , Cell Culture Techniques/methods , Cell Differentiation/genetics , Cerebral Cortex/metabolism , Gene Regulatory Networks/genetics , Humans , Induced Pluripotent Stem Cells/cytology , Macaca , Neurogenesis/genetics , Organoids/growth & development , Pan troglodytes , Pluripotent Stem Cells/cytology , Single-Cell Analysis , Species Specificity , Transcriptome/genetics
8.
Am J Hum Genet ; 103(3): 421-430, 2018 09 06.
Article in English | MEDLINE | ID: mdl-30100087

ABSTRACT

Bipolar disorder (BD) and schizophrenia (SCZ) are highly heritable diseases that affect more than 3% of individuals worldwide. Genome-wide association studies have strongly and repeatedly linked risk for both of these neuropsychiatric diseases to a 100 kb interval in the third intron of the human calcium channel gene CACNA1C. However, the causative mutation is not yet known. We have identified a human-specific tandem repeat in this region that is composed of 30 bp units, often repeated hundreds of times. This large tandem repeat is unstable using standard polymerase chain reaction and bacterial cloning techniques, which may have resulted in its incorrect size in the human reference genome. The large 30-mer repeat region is polymorphic in both size and sequence in human populations. Particular sequence variants of the 30-mer are associated with risk status at several flanking single-nucleotide polymorphisms in the third intron of CACNA1C that have previously been linked to BD and SCZ. The tandem repeat arrays function as enhancers that increase reporter gene expression in a human neural progenitor cell line. Different human arrays vary in the magnitude of enhancer activity, and the 30-mer arrays associated with increased psychiatric disease risk status have decreased enhancer activity. Changes in the structure and sequence of these arrays likely contribute to changes in CACNA1C function during human evolution and may modulate neuropsychiatric disease risk in modern human populations.


Subject(s)
Bipolar Disorder/genetics , Genetic Predisposition to Disease/genetics , Polymorphism, Single Nucleotide/genetics , Schizophrenia/genetics , Tandem Repeat Sequences/genetics , Calcium Channels, L-Type/genetics , Genome, Human/genetics , Genome-Wide Association Study/methods , Humans , Introns/genetics
10.
Genome Res ; 28(2): 256-265, 2018 02.
Article in English | MEDLINE | ID: mdl-29229672

ABSTRACT

We present a method to detect copy number variants (CNVs) that are differentially present between two groups of sequenced samples. We use a finite-state transducer where the emitted read depth is conditioned on the mappability and GC-content of all reads that occur at a given base position. In this model, the read depth within a region is a mixture of binomials, which in simulations matches the read depth more closely than the often-used negative binomial distribution. The method analyzes all samples simultaneously, preserving uncertainty as to the breakpoints and magnitude of CNVs present in an individual when it identifies CNVs differentially present between the two groups. We apply this method to identify CNVs that are recurrently associated with postglacial adaptation of marine threespine stickleback (Gasterosteus aculeatus) to freshwater. We identify 6664 regions of the stickleback genome, totaling 1.7 Mbp, which show consistent copy number differences between marine and freshwater populations. These deletions and duplications affect both protein-coding genes and cis-regulatory elements, including a noncoding intronic telencephalon enhancer of DCHS1. The functions of the genes near or included within the 6664 CNVs are enriched for immunity and muscle development, as well as head and limb morphology. Although freshwater stickleback have repeatedly evolved from marine populations, we show that freshwater stickleback also act as reservoirs for ancient ancestral sequences that are highly conserved among distantly related teleosts, but largely missing from marine stickleback due to recent selective sweeps in marine populations.


Subject(s)
Adaptation, Physiological/genetics , DNA Copy Number Variations/genetics , Selection, Genetic , Smegmamorpha/genetics , Animals , Fresh Water , Genome/genetics , Polymorphism, Single Nucleotide/genetics , Sampling Studies
11.
Mol Biol Evol ; 32(1): 23-8, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25415961

ABSTRACT

The evolution of avian feathers has recently been illuminated by fossils and the identification of genes involved in feather patterning and morphogenesis. However, molecular studies have focused mainly on protein-coding genes. Using comparative genomics and more than 600,000 conserved regulatory elements, we show that patterns of genome evolution in the vicinity of feather genes are consistent with a major role for regulatory innovation in the evolution of feathers. Rates of innovation at feather regulatory elements exhibit an extended period of innovation with peaks in the ancestors of amniotes and archosaurs. We estimate that 86% of such regulatory elements and 100% of the nonkeratin feather gene set were present prior to the origin of Dinosauria. On the branch leading to modern birds, we detect a strong signal of regulatory innovation near insulin-like growth factor binding protein (IGFBP) 2 and IGFBP5, which have roles in body size reduction, and may represent a genomic signature for the miniaturization of dinosaurian body size preceding the origin of flight.


Subject(s)
Birds/genetics , Dinosaurs/anatomy & histology , Feathers/growth & development , Genomics/methods , Regulatory Elements, Transcriptional , Animals , Biological Evolution , Birds/anatomy & histology , Body Size , Dinosaurs/genetics , Dinosaurs/growth & development , Evolution, Molecular , Feathers/metabolism , Insulin-Like Growth Factor Binding Protein 2/genetics , Insulin-Like Growth Factor Binding Protein 5/genetics , Keratins/genetics , Mutation Rate , Phylogeny
12.
PLoS One ; 7(8): e43128, 2012.
Article in English | MEDLINE | ID: mdl-22952639

ABSTRACT

Recent research supports the view that changes in gene regulation, as opposed to changes in the genes themselves, play a significant role in morphological evolution. Gene regulation is largely dependent on transcription factor binding sites. Researchers are now able to use the available 29 mammalian genomes to measure selective constraint at the level of binding sites. This detailed map of constraint suggests that mammalian genomes co-opt fragments of mobile elements to act as gene regulatory sequence on a large scale. In the human genome we detect over 280,000 putative regulatory elements, totaling approximately 7 Mb of sequence, that originated as mobile element insertions. These putative regulatory regions are conserved non-exonic elements (CNEEs), which show considerable cross-species constraint and signatures of continued negative selection in humans, yet do not appear in a known mature transcript. These putative regulatory elements were co-opted from SINE, LINE, LTR and DNA transposon insertions. We demonstrate that at least 11%, and an estimated 20%, of gene regulatory sequence in the human genome showing cross-species conservation was co-opted from mobile elements. The location in the genome of CNEEs co-opted from mobile elements closely resembles that of CNEEs in general, except in the centers of the largest gene deserts where recognizable co-option events are relatively rare. We find that regions of certain mobile element insertions are more likely to be held under purifying selection than others. In particular, we show 6 examples where paralogous instances of an often co-opted mobile element region define a sequence motif that closely matches a transcription factor's binding profile.


Subject(s)
Genome, Human , Genome , Mammals/genetics , Regulatory Elements, Transcriptional , 5' Untranslated Regions , Animals , Binding Sites , Gene Frequency , Humans , Models, Genetic , Models, Statistical , Phylogeny , Protein Binding , Sequence Alignment , Transcription Factors/metabolism
13.
Nature ; 478(7370): 476-82, 2011 Oct 12.
Article in English | MEDLINE | ID: mdl-21993624

ABSTRACT

The comparison of related genomes has emerged as a powerful lens for genome interpretation. Here we report the sequencing and comparative analysis of 29 eutherian genomes. We confirm that at least 5.5% of the human genome has undergone purifying selection, and locate constrained elements covering ∼4.2% of the genome. We use evolutionary signatures and comparisons with experimental data sets to suggest candidate functions for ∼60% of constrained bases. These elements reveal a small number of new coding exons, candidate stop codon readthrough events and over 10,000 regions of overlapping synonymous constraint within protein-coding exons. We find 220 candidate RNA structural families, and nearly a million elements overlapping potential promoter, enhancer and insulator regions. We report specific amino acid residues that have undergone positive selection, 280,000 non-coding elements exapted from mobile elements and more than 1,000 primate- and human-accelerated elements. Overlap with disease-associated variants indicates that our findings will be relevant for studies of human biology, health and disease.


Subject(s)
Evolution, Molecular , Genome, Human/genetics , Genome/genetics , Mammals/genetics , Animals , Disease , Exons/genetics , Genomics , Health , Humans , Molecular Sequence Annotation , Phylogeny , RNA/classification , RNA/genetics , Selection, Genetic/genetics , Sequence Alignment , Sequence Analysis, DNA
14.
Nature ; 477(7366): 587-91, 2011 Aug 31.
Article in English | MEDLINE | ID: mdl-21881562

ABSTRACT

The evolution of the amniotic egg was one of the great evolutionary innovations in the history of life, freeing vertebrates from an obligatory connection to water and thus permitting the conquest of terrestrial environments. Among amniotes, genome sequences are available for mammals and birds, but not for non-avian reptiles. Here we report the genome sequence of the North American green anole lizard, Anolis carolinensis. We find that A. carolinensis microchromosomes are highly syntenic with chicken microchromosomes, yet do not exhibit the high GC and low repeat content that are characteristic of avian microchromosomes. Also, A. carolinensis mobile elements are very young and diverse, more so than in any other sequenced amniote genome. The GC content of this lizard genome is also unusual in its homogeneity, unlike the regionally variable GC content found in mammals and birds. We describe and assign sequence to the previously unknown A. carolinensis X chromosome. Comparative gene analysis shows that amniote egg proteins have evolved significantly more rapidly than other proteins. An anole phylogeny resolves basal branches to illuminate the history of their repeated adaptive radiations.


Subject(s)
Birds/genetics , Evolution, Molecular , Genome/genetics , Lizards/genetics , Mammals/genetics , Animals , Chickens/genetics , GC Rich Sequence/genetics , Genomics , Humans , Molecular Sequence Data , Phylogeny , Synteny/genetics , X Chromosome/genetics
15.
Science ; 333(6045): 1019-24, 2011 Aug 19.
Article in English | MEDLINE | ID: mdl-21852499

ABSTRACT

The gain, loss, and modification of gene regulatory elements may underlie a substantial proportion of phenotypic changes on animal lineages. To investigate the gain of regulatory elements throughout vertebrate evolution, we identified genome-wide sets of putative regulatory regions for five vertebrates, including humans. These putative regulatory regions are conserved nonexonic elements (CNEEs), which are evolutionarily conserved yet do not overlap any coding or noncoding mature transcript. We then inferred the branch on which each CNEE came under selective constraint. Our analysis identified three extended periods in the evolution of gene regulatory elements. Early vertebrate evolution was characterized by regulatory gains near transcription factors and developmental genes, but this trend was replaced by innovations near extracellular signaling genes, and then innovations near posttranslational protein modifiers.


Subject(s)
Biological Evolution , Conserved Sequence , Evolution, Molecular , Regulatory Elements, Transcriptional , Regulatory Sequences, Nucleic Acid , Vertebrates/genetics , Animals , Cattle , DNA, Intergenic/genetics , Gene Expression Regulation , Genes, Developmental , Genome , Humans , Markov Chains , Mice , Oryzias/genetics , Phylogeny , Protein Processing, Post-Translational/genetics , Selection, Genetic , Sequence Alignment , Smegmamorpha/genetics , Transcription Factors/genetics
16.
Nat Biotechnol ; 28(5): 495-501, 2010 May.
Article in English | MEDLINE | ID: mdl-20436461

ABSTRACT

We developed the Genomic Regions Enrichment of Annotations Tool (GREAT) to analyze the functional significance of cis-regulatory regions identified by localized measurements of DNA binding events across an entire genome. Whereas previous methods took into account only binding proximal to genes, GREAT is able to properly incorporate distal binding sites and control for false positives using a binomial test over the input genomic regions. GREAT incorporates annotations from 20 ontologies and is available as a web application. Applying GREAT to data sets from chromatin immunoprecipitation coupled with massively parallel sequencing (ChIP-seq) of multiple transcription-associated factors, including SRF, NRSF, GABP, Stat3 and p300 in different developmental contexts, we recover many functions of these factors that are missed by existing gene-based tools, and we generate testable hypotheses. The utility of GREAT is not limited to ChIP-seq, as it could also be applied to open chromatin, localized epigenomic markers and similar functional data sets, as well as comparative genomics sets.


Subject(s)
Data Mining/methods , Genome , Genomics/methods , Regulatory Elements, Transcriptional , Software , Animals , Chromatin Immunoprecipitation , Databases, Genetic , E1A-Associated p300 Protein , Humans , Jurkat Cells , Mice , Protein Binding , Serum Response Factor
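
Note on entry 16: the abstract above mentions a binomial test over the input genomic regions. As a minimal sketch of that general idea, and not GREAT's actual implementation, the snippet below computes an upper-tail binomial p-value: the chance of seeing at least k of n input regions fall inside an annotation's regulatory domains when those domains cover a fraction p of the genome. The function name, variable names and all numbers are hypothetical and chosen only for illustration.

```python
from math import comb


def binomial_upper_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    # Sum the binomial probability mass from k up to n.
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical inputs, for illustration only (not taken from the paper).
genome_fraction = 0.02  # assumed share of the genome covered by the term's domains
total_regions = 200     # assumed number of input regions, e.g. ChIP-seq peaks
regions_hit = 12        # assumed number of input regions landing in those domains

p_value = binomial_upper_tail(total_regions, regions_hit, genome_fraction)
print(f"enrichment p-value: {p_value:.3g}")
```

Because the count is taken over input regions and not over genes, distal binding sites contribute to the statistic in the same way as proximal ones, which matches the contrast the abstract draws with gene-based tools.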
17.
J Hered ; 101(4): 437-47, 2010.
Article in English | MEDLINE | ID: mdl-20332163

ABSTRACT

We report that 18 conserved, and by extension functional, elements in the human genome are the result of retroposon insertions that are evolving under purifying selection in mammals. We show evidence that 1 of the 18 elements regulates the expression of ASXL3 during development by encoding an alternatively spliced exon that causes nonsense-mediated decay of the transcript. The retroposon that gave rise to these functional elements was quickly inactivated in the mammalian ancestor, and all traces of it have been lost due to neutral decay. However, the tuatara has maintained a near-ancestral version of this retroposon in its extant genome, which allows us to connect the 18 human elements to the evolutionary events that created them. We propose that conservation efforts over more than 100 years may not have only prevented the tuatara from going extinct but could have preserved our ability to understand the evolutionary history of functional elements in the human genome. Through simulations, we argue that species with historically low population sizes are more likely to harbor ancient mobile elements for long periods of time and in near-ancestral states, making these species indispensable in understanding the evolutionary origin of functional elements in the human genome.


Subject(s)
Endangered Species , Evolution, Molecular , Genome, Human , Animals , Humans , Mammals/genetics , Retroelements , Transcription Factors/genetics
18.
PLoS Comput Biol ; 3(12): e247, 2007 Dec.
Article in English | MEDLINE | ID: mdl-18085818

ABSTRACT

Taking advantage of the complete genome sequences of several mammals, we developed a novel method to detect losses of well-established genes in the human genome through syntenic mapping of gene structures between the human, mouse, and dog genomes. Unlike most previous genomic methods for pseudogene identification, this analysis is able to differentiate losses of well-established genes from pseudogenes formed shortly after segmental duplication or generated via retrotransposition. Therefore, it enables us to find genes that were inactivated long after their birth, which were likely to have evolved nonredundant biological functions before being inactivated. The method was used to look for gene losses along the human lineage during the approximately 75 million years (My) since the common ancestor of primates and rodents (the euarchontoglire crown group). We identified 26 losses of well-established genes in the human genome that were all lost at least 50 My after their birth. Many of them were previously characterized pseudogenes in the human genome, such as GULO and UOX. Our methodology is highly effective at identifying losses of single-copy genes of ancient origin, allowing us to find a few well-known pseudogenes in the human genome missed by previous high-throughput genome-wide studies. In addition to confirming previously known gene losses, we identified 16 previously uncharacterized human pseudogenes that are definitive losses of long-established genes. Among them is ACYL3, an ancient enzyme present in archaea, bacteria, and eukaryotes, but lost approximately 6 to 8 Mya in the ancestor of humans and chimps. Although losses of well-established genes do not equate to adaptive gene losses, they are a useful proxy to use when searching for such genetic changes. This is especially true for adaptive losses that occurred more than 250,000 years ago, since any genetic evidence of the selective sweep indicative of such an event has been erased.


Subject(s)
Biological Evolution , Chromosome Mapping/methods , DNA Mutational Analysis/methods , Evolution, Molecular , Gene Deletion , Genome, Human/genetics , Pseudogenes/genetics , Animals , Dogs , Genetic Variation/genetics , Genomics/methods , Humans , Mice
19.
Proc Natl Acad Sci U S A ; 104(47): 18613-8, 2007 Nov 20.
Article in English | MEDLINE | ID: mdl-18003932

ABSTRACT

The evolutionary forces that establish and hone target gene networks of transcription factors are largely unknown. Transposition of retroelements may play a role, but its global importance, beyond a few well described examples for isolated genes, is not clear. We report that LTR class I endogenous retrovirus (ERV) retroelements impact considerably the transcriptional network of human tumor suppressor protein p53. A total of 1,509 of approximately 319,000 human ERV LTR regions have a near-perfect p53 DNA binding site. The LTR10 and MER61 families are particularly enriched for copies with a p53 site. These ERV families are primate-specific and transposed actively near the time when the New World and Old World monkey lineages split. Other mammalian species lack these p53 response elements. Analysis of published genomewide ChIP data for p53 indicates that more than one-third of identified p53 binding sites are accounted for by ERV copies with a p53 site. ChIP and expression studies for individual genes indicate that human ERV p53 sites are likely part of the p53 transcriptional program and direct regulation of p53 target genes. These results demonstrate how retroelements can significantly shape the regulatory network of a transcription factor in a species-specific manner.


Subject(s)
Endogenous Retroviruses/physiology , Gene Regulatory Networks/genetics , Tumor Suppressor Protein p53/genetics , Tumor Suppressor Protein p53/metabolism , Binding Sites , Cell Line, Tumor , Endogenous Retroviruses/classification , Evolution, Molecular , Gene Dosage/genetics , Gene Expression Regulation, Viral/genetics , Genome, Viral/genetics , Humans , Protein Binding , Regulatory Elements, Transcriptional/genetics
20.
Proc Natl Acad Sci U S A ; 104(19): 8005-10, 2007 May 08.
Article in English | MEDLINE | ID: mdl-17463089

ABSTRACT

At least 5% of the human genome predating the mammalian radiation is thought to have evolved under purifying selection, yet protein-coding and related untranslated exons occupy at most 2% of the genome. Thus, the majority of conserved and, by extension, functional sequence in the human genome seems to be nonexonic. Recent work has highlighted a handful of cases where mobile element insertions have resulted in the introduction of novel conserved nonexonic elements. Here, we present a genome-wide survey of 10,402 constrained nonexonic elements in the human genome that have all been deposited by characterized mobile elements. These repeat instances have been under strong purifying selection since at least the boreoeutherian ancestor (100 Mya). They are most often located in gene deserts and show a strong preference for residing closest to genes involved in development and transcription regulation. In particular, constrained nonexonic elements with clear repetitive origins are located near genes involved in cell adhesion, including all characterized cellular members of the reelin-signaling pathway. Overall, we find that mobile elements have contributed at least 5.5% of all constrained nonexonic elements unique to mammals, suggesting that mobile elements may have played a larger role than previously recognized in shaping and specializing the landscape of gene regulation during mammalian evolution.


Subject(s)
DNA Transposable Elements , Genes, Developmental , Selection, Genetic , Cell Adhesion Molecules, Neuronal/physiology , Extracellular Matrix Proteins/physiology , Genes, Regulator , Humans , Multigene Family , Nerve Tissue Proteins/physiology , Reelin Protein , Serine Endopeptidases/physiology , Signal Transduction