ABSTRACT
PURPOSE: With the advent of gene therapies for inherited retinal degenerations (IRDs), genetic diagnostics will have an increasing role in clinical decision-making. Yet the genetic cause of disease cannot be identified using exon-based sequencing for a significant portion of patients. We hypothesized that noncoding pathogenic variants contribute significantly to the genetic causality of IRDs and evaluated patients with single coding pathogenic variants in RPGRIP1 to test this hypothesis. METHODS: IRD families underwent targeted panel sequencing. Unsolved cases were explored by exome and genome sequencing looking for additional pathogenic variants. Candidate pathogenic variants were then validated by Sanger sequencing, quantitative polymerase chain reaction, and in vitro splicing assays in two cell lines analyzed through amplicon sequencing. RESULTS: Among 1722 families, 3 had biallelic loss-of-function pathogenic variants in RPGRIP1 while 7 had a single disruptive coding pathogenic variant. Exome and genome sequencing revealed potential noncoding pathogenic variants in these 7 families. In 6, the noncoding pathogenic variants were shown to lead to loss of function in vitro. CONCLUSION: Noncoding pathogenic variants were identified in 6 of 7 families with single coding pathogenic variants in RPGRIP1. The results suggest that noncoding pathogenic variants contribute significantly to the genetic causality of IRDs and RPGRIP1-mediated IRDs are more common than previously thought.
Subjects
Intergenic DNA/genetics, Proteins/genetics, Retinal Degeneration/genetics, Adult, Chromosome Mapping, Cytoskeletal Proteins, DNA Mutational Analysis/methods, Intergenic DNA/physiology, Female, HEK293 Cells, Humans, Male, Mutation, Pedigree, Proteins/physiology, Retinal Degeneration/etiology, Exome Sequencing/methods, Whole Genome Sequencing/methods
ABSTRACT
Bidirectional promoters are identified in diverse organisms with widely varied genome sizes, including bacteria, yeast, mammals, and plants. However, little research has been done on any individual endogenous bidirectional promoter from plants. Here, we describe a promoter positioned in the intergenic region of two defensin-like protein genes, Def1 and Def2 in maize (Zea mays). We examined the expression profiles of Def1 and Def2 in 14 maize tissues by qRT-PCR, and the results showed that this gene pair was expressed abundantly and specifically in seeds. When fused to either green fluorescent protein (GFP) or β-glucuronidase (GUS) reporter genes, PZmBD1, PZmDef1, and PZmDef2 were active and reproduced the expression patterns of both Def1 and Def2 genes in transformed immature maize embryos, as well as in developing seeds of transgenic maize. Comparative analysis revealed that PZmBD1 shared most of the expression characteristics of the two polar promoters, but displayed more stringent embryo specificity, delayed expression initiation, and asymmetric promoter activity. Moreover, a truncated promoter study revealed that the core promoters exhibit only basic bidirectional activity, whereas their interaction with necessary cis-elements produces polarity and different strengths. The sophisticated interaction or counteraction between the core promoter and cis-elements may potentially regulate bidirectional promoters.
Subjects
Intergenic DNA/physiology, Plant Genes/genetics, Plant Proteins/physiology, Genetic Promoter Regions/physiology, Zea mays/genetics, Intergenic DNA/genetics, Plant Gene Expression Regulation/genetics, Plant Gene Expression Regulation/physiology, Plant Genes/physiology, Plant Proteins/genetics, Genetically Modified Plants, Genetic Promoter Regions/genetics, Seeds/metabolism, Seeds/physiology, Transcriptome, Zea mays/physiology
ABSTRACT
Gene expression differences are shaped by selective pressures and contribute to phenotypic differences between species. We identified 964 copy number differences (CNDs) of conserved sequences across three primate species and examined their potential effects on gene expression profiles. Samples with copy number-different genes had significantly different expression from samples with neutral copy number. Genes encoding regulatory molecules differed in copy number and were associated with significant expression differences. Additionally, we identified 127 CNDs that were processed pseudogenes, some of which were expressed. Furthermore, there were copy number-different regulatory regions such as ultraconserved elements and long intergenic noncoding RNAs with the potential to affect expression. We postulate that CNDs of these conserved sequences fine-tune developmental pathways by altering the levels of RNA.
Subjects
Intergenic DNA/physiology, Gene Dosage/physiology, Gene Expression Regulation/physiology, Pseudogenes/physiology, Untranslated RNA/physiology, Transcriptional Regulatory Elements/physiology, Animals, Cell Line, Humans, Macaca mulatta, Pan troglodytes, Species Specificity
ABSTRACT
Most of the mammalian genome consists of nucleotide sequences not coding for proteins. Exons of genes make up only 3% of the human genome, while the significance of most other sequences remains unknown. Recent genome studies with high-throughput methods demonstrate that the so-called noncoding part of the genome may perform important functions. This hypothesis is supported by three groups of experimental data: 1) approximately 10% of the sequences, most of which are located in noncoding parts of the genome, are evolutionarily conserved and thus can be of functional importance; 2) up to 99% of the mammalian genome is transcribed, forming short and long noncoding RNAs in addition to common mRNA; and 3) mutations in noncoding parts of the genome can be accompanied by progression of pathological states of the organism. In light of these data, this review considers the functional role of numerous known sequences of noncoding parts of the genome including introns, DNA methylation regions, enhancers and locus control regions, insulators, S/MAR sequences, pseudogenes, and genes of noncoding RNAs, as well as transposons and simple repeats of centromeric and telomeric regions of chromosomes. The assumption is made that intergenic noncoding sequences without clearly defined functions can be involved in the spatial organization of genetic loci in interphase nuclei.
Subjects
Intergenic DNA/physiology, Genome, Mammals/genetics, Nucleic Acid Regulatory Sequences, Animals, Centromere/chemistry, DNA Transposable Elements, Intergenic DNA/chemistry, Humans, Pseudogenes, Untranslated RNA/chemistry, Untranslated RNA/genetics, Untranslated RNA/physiology, Telomere/chemistry
ABSTRACT
Asymmetrical segregation of differentiated sister chromatids is thought to be important for cellular differentiation in higher eukaryotes. Similarly, in fission yeast, cellular differentiation involves the asymmetrical segregation of a chromosomal imprint. This imprint has been shown to consist of two ribonucleotides that are incorporated into the DNA during lagging-strand synthesis in response to a replication pause, but the underlying mechanism remains unknown. Here we present key novel discoveries important for unravelling this process. Our data show that cis-acting sequences within the mat1 cassette mediate pausing of replication forks at the proximity of the imprinting site, and the results suggest that this pause dictates specific priming at the position of imprinting in a sequence-independent manner. Also, we identify a novel type of cis-acting spacer region important for the imprinting process that affects where subsequent primers are put down after the replication fork is released from the pause. Thus, our data suggest that the imprint is formed by ligation of a not-fully-processed Okazaki fragment to the subsequent fragment. The presented work addresses how differentiated sister chromatids are established during DNA replication through the involvement of replication barriers.
Subjects
Intergenic DNA/physiology, Genomic Imprinting, Schizosaccharomyces/genetics, Base Sequence, Southern Blotting, Cell Cycle, Chromosome Mapping, DNA Replication Timing/physiology, Intergenic DNA/genetics, Two-Dimensional Gel Electrophoresis, Genetic Loci, Molecular Sequence Data, Schizosaccharomyces/growth & development, DNA Sequence Analysis, Genetic Transcription, Transcriptional Activation
ABSTRACT
Clustered regularly interspaced short palindromic repeats (CRISPR) in bacterial and archaeal DNA have recently been shown to be a new type of antiviral immune system in these organisms. We here study the diversity of spacers in CRISPR under selective pressure. We propose a population dynamics model that explains the biological observation that the leader-proximal end of CRISPR is more diversified and the leader-distal end of CRISPR is more conserved. This result is shown to be in agreement with recent experiments. Our results show that the CRISPR spacer structure is influenced by and provides a record of the viral challenges that bacteria face.
Subjects
Bacteria/genetics, Intergenic DNA/genetics, Bacterial Genome/genetics, Inverted Repeat Sequences, Bacteria/classification, Bacteria/metabolism, Cluster Analysis, Bacterial DNA/genetics, Bacterial DNA/physiology, Intergenic DNA/physiology, Bacterial Genome/physiology, Genetic Polymorphism, Time Factors
ABSTRACT
Meiotic recombination is initiated by DNA double-strand breaks (DSBs) made by Spo11 (Rec12 in fission yeast), which becomes covalently linked to the DSB ends. Like recombination events, DSBs occur at hotspots in the genome, but the genetic factors responsible for most hotspots have remained elusive. Here we describe in fission yeast the genome-wide distribution of meiosis-specific Rec12-DNA linkages, which closely parallel DSBs measured by conventional Southern blot hybridization. Prominent DSB hotspots are located approximately 65 kb apart, separated by intervals with little or no detectable breakage. Most hotspots lie within exceptionally large intergenic regions. Thus, the chromosomal architecture responsible for hotspots in fission yeast is markedly different from that of budding yeast, in which DSB hotspots are much more closely spaced and, in many regions of the genome, occur at each promoter. Our analysis in fission yeast reveals a clearly identifiable chromosomal feature that can predict the majority of recombination hotspots across a whole genome and provides a basis for searching for the chromosomal features that dictate hotspots of meiotic recombination in other organisms, including humans.
Subjects
Double-Stranded DNA Breaks, Intergenic DNA/physiology, Meiosis/genetics, Schizosaccharomyces/genetics, Southern Blotting, Chromosome Mapping, Fungal Chromosomes, Schizosaccharomyces pombe Proteins/genetics, Schizosaccharomyces pombe Proteins/physiology
ABSTRACT
A systematic search for non-conventional open reading frames in human DNA reveals a large number of small ORFs encoding peptides generally smaller than 100 amino-acids. These ORFs are transcribed and translated into small proteins, which are demonstrated to have functional significance by bulk CRISPR inactivation. Evidence is also found for bicistronic mRNAs including such a small ORF upstream of a canonical coding sequence. These findings add a new facet to our understanding of biological processes.
Subjects
Intergenic DNA/physiology, Molecular Biology/trends, Amino Acid Sequence, Base Sequence, Molecular Evolution, 20th Century History, 21st Century History, Humans, Molecular Biology/history, Molecular Biology/methods, Open Reading Frames/genetics, Messenger RNA/genetics
ABSTRACT
BACKGROUND AND AIMS: A recently identified locus for coronary artery disease (CAD) tagged by rs8042271 is in a region of tight linkage disequilibrium (LD) between 2 genes (MFGE8, ABHD2) previously linked to atherosclerosis. Here we have explored the regulatory framework of this region to identify its functional relationship to CAD. METHODS: The CAD Associated Region between MFGE8 and ABHD2 (CARMA) was investigated by bioinformatic approaches and transcriptional reporter assays to prioritize target genes and identify putative causal variants. Findings were integrated with publicly available gene expression datasets. MFGE8 silencing was performed in cell models relevant to CAD. RESULTS: The regulatory potential of CARMA is disseminated sparsely over the entire region. CARMA contains multiple eQTL that regulate MFGE8 in coronary artery and coronary artery smooth muscle cell (CoSMC). SNPs that predict the expression of MFGE8 in artery are concordantly associated with higher risk of CAD (p = 0.0014). Targeting CARMA by CRISPR/Cas9 in a cellular model increased MFGE8 expression. MFGE8 silencing was found to reduce CoSMC and monocyte (THP-1) but not endothelial cell proliferation. CONCLUSIONS: These findings support a mechanistic link between a GWAS identified CAD risk locus and atherosclerosis. The intergenic locus CARMA regulates MFGE8 in a haplotype dependent manner. Individuals genetically susceptible to increased MFGE8 expression exhibit greater CAD risk. Suppressing MFGE8 expression reduced SMC and THP-1 proliferation. These data support an atherogenic contribution of CARMA/MFGE8 that may be linked to cell proliferation and/or improved survival of CAD relevant cell types.
Subjects
Surface Antigens/genetics, Atherosclerosis/genetics, Coronary Artery Disease/genetics, Intergenic DNA/physiology, Milk Proteins/genetics, Surface Antigens/physiology, Gene Expression Regulation, Humans
ABSTRACT
We previously demonstrated that an approximately 1 Mb domain of genes upstream of and including Hoxa13 is co-expressed in the developing mouse limbs and genitalia. A highly conserved non-coding sequence, mmA13CNS, was shown to be insufficient in transgenic mice to direct precise Hoxa13-like expression in the limb buds or genital bud, although some LacZ expression from the transgene was reproducibly found in these tissues. In this report, we used beta-globin minimal promoter LacZ recombinant BAC transgenes encompassing mmA13CNS to identify a single critical region involved in mouse Hoxa13-like embryonic genital bud expression. By analyzing the expression patterns of these overlapping BAC clones in transgenic mice, we show that at least two sequences remote to the HoxA cluster are required collectively to drive Hoxa13-like expression in developing distal limbs. Given that the paralogous posterior HoxD and neighboring genes have been shown to be under the influence of long-range distal limb and genital bud enhancers, we hypothesize that both long-range enhancers have one ancestral origin, which diverged in both sequence and function after the HoxA/D cluster duplication.
Subjects
Embryonic Development/genetics, Genetic Enhancer Elements, Genitalia/embryology, Homeodomain Proteins/genetics, Limb Buds/metabolism, Animals, Bacterial Artificial Chromosomes, Intergenic DNA/physiology, Developmental Gene Expression Regulation, Reporter Genes, Genitalia/metabolism, Homeodomain Proteins/metabolism, Lac Operon, Mice, Transgenic Mice/embryology, Transgenic Mice/metabolism, Transgenes
ABSTRACT
Several recent studies of genome evolution indicate that the rate of DNA loss exceeds that of DNA gain, leading to an underlying mutational pressure towards collapsing the length of noncoding DNA. That such a collapse is not observed suggests opposing mechanisms favoring longer noncoding regions. The presence of transposable elements alone also does not explain observed features of noncoding DNA. At present, a multidisciplinary approach--using population genetics techniques, large-scale genomic analyses, and in silico evolution--is beginning to provide new and valuable insights into the forces that shape the length of noncoding DNA and, ultimately, genome size. Recombination, in a broad sense, might be the missing key parameter for understanding the observed variation in length of noncoding DNA in eukaryotes.
Subjects
Conserved Sequence/genetics, Intergenic DNA/chemistry, Animals, Conserved Sequence/physiology, DNA Transposable Elements, Intergenic DNA/physiology, Drosophila, Molecular Evolution, Human Genome, Humans, Microsatellite Repeats, Mutation, Genetic Recombination, Genetic Selection
ABSTRACT
Expression of the seven open reading frames (ORFs) of single-stranded DNA Curtoviruses such as Beet curly top virus (BCTV) and Beet severe curly top virus (BSCTV) is driven by a bi-directional promoter. To investigate this bi-directional promoter activity with respect to viral late gene expression, transgenic Arabidopsis plants expressing a GUS reporter gene under the control of either the BCTV or BSCTV bi-directional promoter were constructed. Transgenic plants harboring constructs showed higher expression levels when the promoter of the less virulent BCTV was used than when the promoter of the more virulent BSCTV was used. In transgenic seedlings, the reporter gene constructs were expressed primarily in actively dividing tissues such as root tips and apical meristems. As the transgenic plants matured, reporter gene expression diminished but viral infection of mature transgenic plants restored reporter gene expression, particularly in transgenic plants containing BCTV virion-sense gene promoter constructs. A 30 base pair conserved late element (CLE) motif was identified that was present three times in tandem in the BCTV promoter and once in that of BSCTV. Progressive deletion of these repeats from the BCTV promoter resulted in decreased reporter gene expression, but BSCTV promoters in which one or two extra copies of this motif were inserted did not exhibit increased late gene promoter activity. These results demonstrate that Curtovirus late gene expression by virion-sense promoters depends on the developmental stage of the host plant as well as on the number of CLE motifs present in the promoter.
Subjects
Arabidopsis/virology, Geminiviridae/genetics, Genetically Modified Plants/virology, Genetic Promoter Regions/physiology, Base Sequence, Intergenic DNA/physiology, Viral Gene Expression Regulation/physiology, Molecular Sequence Data, Transcriptional Activation/physiology
ABSTRACT
Eukaryotic DNA is organized into chromatin domains that regulate gene expression and chromosome behavior. Insulators and/or scaffold-matrix attachment regions (S/MARs) mark the boundaries of these chromatin domains where they delimit enhancing and silencing effects from the outside. By recombinase-mediated cassette exchange (RMCE), we were able to compare these two types of bordering elements at a number of predefined genomic loci. Flanking an expression vector with either S/MARs or two copies of the non-S/MAR chicken hypersensitive site 4 insulator demonstrates that while these borders confer related expression characteristics at most loci, their effect on chromatin organization is clearly distinct. Our results suggest that the activity of bordering elements is most pronounced for the abundant class of loci with a low expression potential, but negligible at highly expressed sites. By the RMCE procedure, we demonstrate that expression parameters are not due to a potential targeting action of bordering elements, in the sense that a linked transgene is directed into a special class of loci. Instead, we can relate the observed transcriptional augmentation phenomena to their function as genomic insulators.
Subjects
Chromatin/metabolism, Gene Expression Regulation, Genome, Insulator Elements/physiology, Matrix Attachment Regions/genetics, Animals, Cell Line, Chickens/genetics, Chromatin/genetics, Intergenic DNA/genetics, Intergenic DNA/physiology, Reporter Genes/genetics, Green Fluorescent Proteins/analysis, Green Fluorescent Proteins/genetics, Insulator Elements/genetics, Mice, Recombinases/genetics, Recombinases/physiology, Genetic Recombination/genetics, beta-Galactosidase/analysis, beta-Galactosidase/genetics
ABSTRACT
A quantitative model was developed that detects a new function of noncoding sequences in the eukaryotic genome, namely, the protection of coding sequences from chemical (mainly endogenous) mutagens. It was shown that, under common ecological conditions, the number of nucleotides damaged by mutagens in coding sequences of the genome is inversely proportional to the size of their noncoding counterparts. Noncoding sequences can differently protect single genetic loci from chemical mutagens by the formation of specific spatial structures of the protected loci in the interphase nuclei. The significant differences in genome sizes between species (the C-value paradox) can be explained by different contributions of noncoding sequences to the total effect of genome protection from endogenous chemical mutagens.
Subjects
Intergenic DNA/physiology, Genome/drug effects, Biological Models, Mutagenesis/genetics, Mutagens/toxicity, Animals, Base Sequence, Plants/genetics
ABSTRACT
Recent work by Ivan Ovcharenko and colleagues has shed new light on the functional importance of gene deserts. They demonstrate that sequence conservation levels separate gene deserts into stable (more conserved) and variable classes. Both classes exhibit characteristics suggestive of function. The stable deserts in particular show features suggesting a role in the complex regulation of core vertebrate genes.
Subjects
Intergenic DNA/physiology, Molecular Evolution, Human Genome, Open Reading Frames/physiology, Intergenic DNA/genetics, Humans, Open Reading Frames/genetics
Subjects
Cardiovascular Diseases/genetics, Mammalian Chromosomes/genetics, Cyclin-Dependent Kinase Inhibitor p15/genetics, Cyclin-Dependent Kinase Inhibitor p16/genetics, Intergenic DNA/physiology, Animals, Blood Vessels/enzymology, Blood Vessels/pathology, Cardiovascular Diseases/enzymology, Cardiovascular Diseases/pathology, Cell Proliferation, Human Chromosome Pair 9/genetics, Cyclin-Dependent Kinase Inhibitor p15/biosynthesis, Cyclin-Dependent Kinase Inhibitor p16/biosynthesis, Gene Expression Regulation, Genetic Loci, Genetic Predisposition to Disease, Humans, Mice, Risk
ABSTRACT
BACKGROUND: Numerous tools have been developed to align genomic sequences. However, their relative performance in specific applications remains poorly characterized. Alignments of protein-coding sequences typically have been benchmarked against "correct" alignments inferred from structural data. For noncoding sequences, where such independent validation is lacking, simulation provides an effective means to generate "correct" alignments with which to benchmark alignment tools. RESULTS: Using rates of noncoding sequence evolution estimated from the genus Drosophila, we simulated alignments over a range of divergence times under varying models incorporating point substitution, insertion/deletion events, and short blocks of constrained sequences such as those found in cis-regulatory regions. We then compared "correct" alignments generated by a modified version of the ROSE simulation platform to alignments of the simulated derived sequences produced by eight pairwise alignment tools (Avid, BlastZ, Chaos, ClustalW, DiAlign, Lagan, Needle, and WABA) to determine the off-the-shelf performance of each tool. As expected, the ability to align noncoding sequences accurately decreases with increasing divergence for all tools, and declines faster in the presence of insertion/deletion evolution. Global alignment tools (Avid, ClustalW, Lagan, and Needle) typically have higher sensitivity over entire noncoding sequences as well as in constrained sequences. Local tools (BlastZ, Chaos, and WABA) have lower overall sensitivity as a consequence of incomplete coverage, but have high specificity to detect constrained sequences as well as high sensitivity within the subset of sequences they align. Tools such as DiAlign, which generate both local and global outputs, produce alignments of constrained sequences with both high sensitivity and specificity for divergence distances in the range of 1.25-3.0 substitutions per site. 
CONCLUSION: For species with genomic properties similar to Drosophila, we conclude that a single pair of optimally diverged species analyzed with a high performance alignment tool can yield accurate and specific alignments of functionally constrained noncoding sequences. Further algorithm development, optimization of alignment parameters, and benchmarking studies will be necessary to extract the maximal biological information from alignments of functional noncoding DNA.
Subjects
Benchmarking/methods, Intergenic DNA/genetics, Sequence Alignment/methods, Sequence Alignment/standards, Animals, Caenorhabditis/genetics, Caenorhabditis elegans/genetics, Computer Simulation, Helminth DNA/genetics, Helminth DNA/physiology, Intergenic DNA/physiology, Drosophila/genetics, Drosophila melanogaster/genetics, Molecular Evolution, Genomics/methods, Genomics/standards, Humans, Mice, Sensitivity and Specificity, Nucleic Acid Sequence Homology
ABSTRACT
Pseudogenes have been defined as nonfunctional sequences of genomic DNA (junk DNA) originally derived from functional genes. It is therefore assumed that pseudogenes are not subject to natural selection and consequently pseudogene mutations are selectively neutral and have equal probability to become fixed in the population. We describe some unexpected features of pseudogenes in diverse organisms that are inconsistent with this widely accepted point of view. Pseudogenes are often evolutionarily conserved and transcriptionally active. Moreover, pseudogenes that have been suitably investigated often exhibit functional roles, such as gene regulation, generation of genetic diversity, and other features that are expected in genes or DNA sequences that have functional roles. A review of the evidence leads to the conclusion that pseudogenes are important components of genomes, representing a repertoire of sequences available for functional evolution and subject to non-neutral evolutionary changes. Pseudogenes might be considered as potogenes, i.e. DNA sequences with the potential to become new genes or acquire new functions. Furthermore, we conjecture that some pseudogenes along with their parental sequences may constitute sets of indivisible functionally interacting entities (intergenic complexes or "intergenes"), in which all the component elements are required in order to fulfill a collective functional role.
Subjects
Intergenic DNA/physiology, Pseudogenes/physiology, Animals, Drosophila melanogaster, Esterases/genetics, Gene Expression Regulation, Genetic Variation, Humans, Multigene Family, Plants, Pseudogenes/genetics, Genetic Transcription
ABSTRACT
It is now clear that animal genomes are predominantly non-protein-coding, and that these sequences encode a wide array of RNA transcripts and other regulatory elements that are fundamental to the development of complex life. We have previously argued that the proportion of an animal genome that is non-protein-coding DNA (ncDNA) correlates well with its apparent biological complexity. Here we extend that work and, using data from a total of 1,627 prokaryotic and 153 eukaryotic complete and annotated genomes, show that the proportion of ncDNA per haploid genome is significantly positively correlated with a previously published proxy of biological complexity, the number of distinct cell types. This is in contrast to the amount of the genome that encodes proteins, which we show is essentially unchanged across Metazoa. Furthermore, using a total of 179 RNA-seq data sets from nematode (47), fruit fly (72), zebrafish (20) and human (42), we show, consistent with other recent reports, that the vast majority of ncDNA in animals is transcribed. This includes more than 60 human loci previously considered "gene deserts," many of which are expressed tissue-specifically and associated with previously reported GWAS SNPs. These results suggest that ncDNA, and the ncRNAs encoded within it, may be intimately involved in the evolution, maintenance and development of complex life.
Subjects
Intergenic DNA/physiology, Animals, Molecular Evolution, Genome, Humans, Introns/physiology, Open Reading Frames, Genetic Transcription, Transcriptome
ABSTRACT
The cluster of human neuronal nicotinic receptor genes (CHRNA5/A3/B4) (15q25.1) has been associated with a variety of smoking and drug-related behaviors, as well as risk for lung cancer. CHRNA3/B4 intergenic single nucleotide polymorphisms (SNPs) rs1948 and rs8023462 have been associated with early initiation of alcohol and tobacco use, and rs6495309 has been associated with nicotine dependence and risk for lung cancer. An in vitro luciferase expression assay was used to determine whether these SNPs and surrounding sequences contribute to differences in gene expression using cell lines either expressing proteins characteristic of neuronal tissue or derived from lung cancers. Electrophoretic mobility shift assays (EMSAs) were performed to investigate whether nuclear proteins from these cell lines bind SNP alleles differentially. Results from expression assays were dependent on cell culture type and haplotype. EMSAs indicated that rs8023462 and rs6495309 bind nuclear proteins in an allele-specific way. Additionally, GATA transcription factors appeared to bind rs8023462 only when the minor/risk allele was present. Much work has been done to describe the rat Chrnb4/a3 intergenic region, but few studies have examined the human intergenic region effects on expression; therefore, these studies greatly aid human genetic research as it relates to observed nicotine phenotypes, lung cancer risk and potential underlying genetic mechanisms. Data from these experiments support the hypothesis that SNPs associated with human addiction-related phenotypes and lung cancer risk can affect gene expression, and are potential therapeutic targets. Additionally, this is the first evidence that rs8023462 interacts with GATA transcription factors to influence gene expression.