Results 1 - 20 of 340
1.
Nature ; 555(7694): 107-111, 2018 03 01.
Article in English | MEDLINE | ID: mdl-29466324

ABSTRACT

Long noncoding RNAs (lncRNAs) are emerging as key parts of multiple cellular pathways, but their modes of action and how these are dictated by sequence remain unclear. lncRNAs tend to be enriched in the nuclear fraction, whereas most mRNAs are overtly cytoplasmic, although several studies have found that hundreds of mRNAs in various cell types are retained in the nucleus. It is thus conceivable that some mechanisms that promote nuclear enrichment are shared between lncRNAs and mRNAs. Here, to identify elements in lncRNAs and mRNAs that can force nuclear localization, we screened libraries of short fragments tiled across nuclear RNAs, which were cloned into the untranslated regions of an efficiently exported mRNA. The screen identified a short sequence derived from Alu elements and bound by HNRNPK that increased nuclear accumulation. Binding of HNRNPK to C-rich motifs outside Alu elements is also associated with nuclear enrichment in both lncRNAs and mRNAs, and this mechanism is conserved across species. Our results thus identify a pathway for regulation of RNA accumulation and subcellular localization that has been co-opted to regulate the fate of transcripts with integrated Alu elements.


Subject(s)
Alu Elements/genetics , Cell Nucleus/genetics , RNA Transport , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , Active Transport, Cell Nucleus , Animals , Base Sequence , Binding Sites , Conserved Sequence , Evolution, Molecular , HeLa Cells , Heterogeneous-Nuclear Ribonucleoprotein K/metabolism , Humans , MCF-7 Cells , Mice , Species Specificity , Untranslated Regions/genetics
2.
Mol Microbiol ; 117(1): 193-214, 2022 01.
Article in English | MEDLINE | ID: mdl-34783400

ABSTRACT

Staphylococcus aureus RsaG is a 3'-untranslated region (3'UTR) derived sRNA from the conserved uhpT gene encoding a glucose-6-phosphate (G6P) transporter expressed in response to extracellular G6P. The transcript uhpT-RsaG undergoes degradation from 5'- to 3'-end by the action of the exoribonucleases J1/J2, which are blocked by a stable hairpin structure at the 5'-end of RsaG, leading to its accumulation. RsaG together with uhpT is induced when bacteria are internalized into host cells or in the presence of mucus-secreting cells. Using MS2-affinity purification coupled with RNA sequencing, several RNAs were identified as targets including mRNAs encoding the transcriptional factors Rex, CcpA, SarA, and the sRNA RsaI. Our data suggested that RsaG contributes to the control of redox homeostasis and adjusts metabolism to changing environmental conditions. RsaG uses different molecular mechanisms to stabilize, degrade, or repress the translation of its mRNA targets. Although RsaG is conserved only in closely related species, the uhpT 3'UTR of the ape pathogen S. simiae harbors an sRNA, whose sequence is highly different, and which does not respond to G6P levels. Based on these results, we hypothesize that the 3'UTRs from UhpT transporter encoding mRNAs could have rapidly evolved to enable adaptation to host niches.


Subject(s)
Antiporters/metabolism , Monosaccharide Transport Proteins/metabolism , RNA, Small Untranslated/genetics , Staphylococcal Infections/microbiology , Staphylococcus aureus/genetics , Transcription Factors/metabolism , Untranslated Regions/genetics , Adaptation, Physiological , Antiporters/genetics , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Biological Transport , Gene Expression Regulation, Bacterial , Glucose-6-Phosphate/metabolism , Homeostasis , Monosaccharide Transport Proteins/genetics , Oxidation-Reduction , RNA Stability , Staphylococcus aureus/pathogenicity , Staphylococcus aureus/physiology , Transcription Factors/genetics
3.
PLoS Comput Biol ; 18(1): e1009804, 2022 01.
Article in English | MEDLINE | ID: mdl-35045069

ABSTRACT

Nonstructural protein 1 (nsp1) of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a 180-residue protein that blocks translation of host mRNAs in SARS-CoV-2-infected cells. Although it is known that SARS-CoV-2's own RNA evades nsp1's host translation shutoff, the molecular mechanism underlying the evasion was poorly understood. We performed an extended ensemble molecular dynamics simulation to investigate the mechanism of the viral RNA evasion. Simulation results suggested that the stem loop structure of the SARS-CoV-2 RNA 5'-untranslated region (SL1) binds to both nsp1's N-terminal globular region and intrinsically disordered region. The consistency of the results was assessed by modeling nsp1-40S ribosome structure based on reported nsp1 experiments, including the X-ray crystallographic structure analysis, the cryo-EM electron density map, and cross-linking experiments. The SL1 binding region predicted from the simulation was open to the solvent, yet the ribosome could interact with SL1. Cluster analysis of the binding mode and detailed analysis of the binding poses suggest residues Arg124, Lys47, Arg43, and Asn126 may be involved in the SL1 recognition mechanism, consistent with the existing mutational analysis.


Subject(s)
COVID-19/virology , Host-Pathogen Interactions/genetics , SARS-CoV-2 , Untranslated Regions/genetics , Viral Nonstructural Proteins , Computational Biology , Humans , Models, Genetic , Molecular Dynamics Simulation , Protein Binding , Protein Biosynthesis , SARS-CoV-2/genetics , SARS-CoV-2/pathogenicity , Viral Nonstructural Proteins/chemistry , Viral Nonstructural Proteins/genetics , Viral Nonstructural Proteins/metabolism
4.
J Virol ; 95(18): e0087821, 2021 08 25.
Article in English | MEDLINE | ID: mdl-34190596

ABSTRACT

The influenza A virus genome is comprised of eight single-stranded negative-sense viral RNA (vRNA) segments. Each of the eight vRNA segments contains segment-specific nonconserved noncoding regions (NCRs) of similar sequence and length in different influenza A virus strains. However, in the subtype-determinant segments, encoding hemagglutinin (HA) and neuraminidase (NA), the segment-specific noncoding regions are subtype specific, varying significantly in sequence and length at both the 3' and 5' termini among different subtypes. The significance of these subtype-specific noncoding regions (ssNCR) in the influenza virus replication cycle is not fully understood. In this study, we show that truncations of the 3'-end H1-subtype-specific noncoding region (H1-ssNCR) resulted in recombinant viruses with decreased HA vRNA replication and attenuated growth phenotype, although the vRNA replication was not affected in single-template RNP reconstitution assays. The attenuated viruses were unstable, and point mutations at nucleotide position 76 or 56 in the adjacent coding region of HA vRNA were found after serial passage. The mutations restored the HA vRNA replication and reversed the attenuated virus growth phenotype. We propose that the terminal noncoding and adjacent coding regions act synergistically to ensure optimal levels of HA vRNA replication in a multisegment environment. These results provide novel insights into the role of the 3'-end nonconserved noncoding regions and adjacent coding regions on template preference in multiple-segmented negative-strand RNA viruses.
IMPORTANCE While most influenza A virus vRNA segments contain segment-specific nonconserved noncoding regions of similar length and sequence, these regions vary considerably both in length and sequence in the segments encoding HA and NA, the two major antigenic determinants of influenza A viruses. In this study, we investigated the function of the 3'-end H1-ssNCR and observed a synergistic effect between the 3'-end H1-ssNCR nucleotides and adjacent coding nucleotide(s) of the HA segment on template preference in a multisegment environment. The results unravel an additional level of complexity in the regulation of RNA replication in multiple-segmented negative-strand RNA viruses.


Subject(s)
Hemagglutinin Glycoproteins, Influenza Virus/metabolism , Influenza A virus/growth & development , Influenza, Human/virology , Open Reading Frames/genetics , RNA, Viral/metabolism , Untranslated Regions/genetics , Viral Proteins/metabolism , Virus Replication , A549 Cells , Base Sequence , HEK293 Cells , Hemagglutinin Glycoproteins, Influenza Virus/genetics , Humans , Influenza A virus/genetics , Influenza A virus/metabolism , Influenza, Human/genetics , Influenza, Human/metabolism , RNA, Viral/genetics , Viral Proteins/genetics , Virus Assembly
5.
Arch Virol ; 166(10): 2859-2863, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34291341

ABSTRACT

Sclerotinia sclerotiorum ourmiavirus 17 (SsOV17) was isolated from the hypovirulent strain GF3 of Sclerotinia sclerotiorum. The genome of SsOV17 is 2,802 nt in length and contains a single long open reading frame (ORF) flanked by a short structured 5'-untranslated region (5'-UTR) (28 nt) and a long 3'-UTR (788 nt), respectively. The ORF encodes a protein with 663 amino acids and a predicted molecular mass of 75.0 kDa. A BLASTp search indicated that the protein encoded by SsOV17 is closely related to the putative RNA-dependent RNA polymerase (RdRp) of Sclerotinia sclerotiorum ourmiavirus 13 (71% identity). A multiple sequence alignment indicated that eight conserved amino acid motifs were present in the RdRp conserved region of SsOV17. Phylogenetic analysis demonstrated that SsOV17 clustered with members of the genus Botoulivirus.


Subject(s)
Ascomycota/virology , Fungal Viruses/classification , Plant Diseases/microbiology , RNA Viruses/classification , Amino Acid Motifs , Ascomycota/pathogenicity , Brassica napus/microbiology , Fungal Viruses/genetics , Fungal Viruses/isolation & purification , Genome, Viral/genetics , Open Reading Frames/genetics , Phylogeny , RNA Viruses/genetics , RNA Viruses/isolation & purification , RNA, Viral/genetics , RNA-Dependent RNA Polymerase/genetics , Untranslated Regions/genetics
6.
Int J Mol Sci ; 22(15)2021 Jul 27.
Article in English | MEDLINE | ID: mdl-34360778

ABSTRACT

G-quadruplexes are non-canonical nucleic acid structures that are preferentially formed in G-rich regions. These structures have been shown to be associated with many biological functions. Despite extensive work on DNA G-quadruplexes, we still have limited knowledge of RNA G-quadruplexes, especially at the transcriptome-wide level. Herein, by integrating DMS-seq with a bioinformatics pipeline, we profiled and depicted the RNA G-quadruplexes in the human transcriptome. The genes that contain RNA G-quadruplexes in their specific regions are significantly related to immune pathways and the COVID-19-related gene sets. Bioinformatics analysis reveals the potential regulatory functions of G-quadruplexes on miRNA targeting at the scale of the whole transcriptome. In addition, the G-quadruplexes are depleted in the putative, not the real, PAS-strong poly(A) sites, which may weaken the possibility of such sites being the real cleaved sites. In brief, our study provides insight into the potential function of RNA G-quadruplexes in post-transcription.


Subject(s)
G-Quadruplexes , Transcriptome/genetics , COVID-19/genetics , Cell Line , Computational Biology , Gene Expression Profiling , Humans , MicroRNAs/chemistry , MicroRNAs/metabolism , Poly A/genetics , SARS-CoV-2/genetics , SARS-CoV-2/metabolism , Untranslated Regions/genetics
7.
RNA ; 24(12): 1634-1646, 2018 12.
Article in English | MEDLINE | ID: mdl-30190375

ABSTRACT

Complementary sequences in cellular transcripts base-pair to form double-stranded RNA (dsRNA) structures. Because transposon-derived repeats often give rise to self-complementary sequences, dsRNA structures are prevalent in eukaryotic genomes, typically occurring in gene introns and untranslated regions (UTRs). However, the regulatory impact of double-stranded structures within genes is not fully understood. We used three independent methods to define loci in Caenorhabditis elegans predicted to form dsRNA and correlated these structures with patterns of gene expression, gene essentiality, and genome organization. As previously observed, dsRNA loci are enriched on distal arms of C. elegans autosomes, where genes typically show less conservation and lower overall expression. In contrast, we find that dsRNAs are associated with essential genes on autosome arms, and dsRNA-associated genes exhibit higher-than-expected expression and histone modification patterns associated with transcriptional elongation. Genes with significant repetitive sequence content are also highly expressed, and, thus, observed gene expression trends may relate either to dsRNA structures or to repeat content. Our results raise the possibility that as-yet-undescribed mechanisms promote expression of loci that produce dsRNAs, despite their well-characterized roles in gene silencing.


Subject(s)
Caenorhabditis elegans/genetics , Inverted Repeat Sequences/genetics , RNA, Double-Stranded/genetics , Animals , Gene Expression Regulation/genetics , Gene Silencing , Histone Code/genetics , Introns/genetics , Nucleic Acid Conformation , RNA Editing/genetics , RNA Interference , Untranslated Regions/genetics
8.
Nature ; 509(7502): 575-81, 2014 May 29.
Article in English | MEDLINE | ID: mdl-24870542

ABSTRACT

The availability of human genome sequence has transformed biomedical research over the past decade. However, an equivalent map for the human proteome with direct measurements of proteins and peptides does not exist yet. Here we present a draft map of the human proteome using high-resolution Fourier-transform mass spectrometry. In-depth proteomic profiling of 30 histologically normal human samples, including 17 adult tissues, 7 fetal tissues and 6 purified primary haematopoietic cells, resulted in identification of proteins encoded by 17,294 genes accounting for approximately 84% of the total annotated protein-coding genes in humans. A unique and comprehensive strategy for proteogenomic analysis enabled us to discover a number of novel protein-coding regions, which includes translated pseudogenes, non-coding RNAs and upstream open reading frames. This large human proteome catalogue (available as an interactive web-based resource at http://www.humanproteomemap.org) will complement available human genome and transcriptome data to accelerate biomedical research in health and disease.


Subject(s)
Proteome/metabolism , Proteomics , Adult , Cells, Cultured , Databases, Protein , Fetus/metabolism , Fourier Analysis , Gene Expression Profiling , Genome, Human/genetics , Hematopoietic Stem Cells/cytology , Hematopoietic Stem Cells/metabolism , Humans , Internet , Mass Spectrometry , Molecular Sequence Annotation , Open Reading Frames/genetics , Organ Specificity , Protein Biosynthesis , Protein Isoforms/analysis , Protein Isoforms/genetics , Protein Isoforms/metabolism , Protein Sorting Signals , Protein Transport , Proteome/analysis , Proteome/chemistry , Proteome/genetics , Pseudogenes/genetics , RNA, Untranslated/genetics , Reproducibility of Results , Untranslated Regions/genetics
9.
J Eur Acad Dermatol Venereol ; 34(1): 112-118, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31287604

ABSTRACT

BACKGROUND: Genetic predictors for treatment response could optimize allocation of biological treatment in patients with psoriasis. There is minimal knowledge about pharmacogenetics of anti-IL-17 agents. OBJECTIVES: To assess whether genetic variants in the protein-coding region or untranslated regions of the IL-17A gene are associated with response to IL-17A inhibitors in patients with psoriasis. METHODS: This was a multicenter European cohort study investigating pharmacogenetics of IL-17A inhibitors in patients with psoriasis. Patients with plaque psoriasis treated with secukinumab or ixekizumab in daily practice were included. For all participants, the protein-coding region and untranslated regions of the IL-17A gene were analysed using Sanger sequencing. Identified genetic variants were tested for association with response to secukinumab/ixekizumab, measured as ∆PASI, after 12 weeks (primary outcome) and after 24 weeks (secondary outcome). Association was tested using a linear regression model with correction for baseline PASI as a fixed covariate and for biological naivety and body mass index as additional covariates. RESULTS: In total, 134 patients treated with secukinumab or ixekizumab were included. Genotyping of the cohort identified genetic variants present in untranslated regions and intronic DNA, but not in the protein-coding region of the IL-17A gene. Five genetic variants in non-coding DNA with a known or suspected functional effect on IL-17A expression were selected for association analyses: rs2275913, rs8193037, rs3819025, rs7747909 and rs3748067. After 12 weeks, 62% of patients achieved PASI75 and 39% achieved PASI90. At week 24, PASI75 and PASI90 response rates were 72% and 62%, respectively. No associations were found between the five genetic variants and ∆PASI, PASI75 or PASI90 after 12 and 24 weeks of anti-IL-17A treatment. 
CONCLUSIONS: Response to IL-17A inhibitors secukinumab and ixekizumab cannot be explained by genetic variation in the protein-coding and untranslated regions of the IL-17A gene. Pharmacogenetics of IL-17A inhibitors in the treatment of psoriasis requires further exploration.


Subject(s)
Antibodies, Monoclonal, Humanized/therapeutic use , Dermatologic Agents/therapeutic use , Interleukin-17/genetics , Psoriasis/drug therapy , Psoriasis/genetics , Adult , Cohort Studies , Europe , Female , Genetic Variation/genetics , Humans , Male , Middle Aged , Open Reading Frames/genetics , Pharmacogenomic Testing , Treatment Outcome , Untranslated Regions/genetics
10.
Am J Hum Genet ; 99(3): 540-554, 2016 09 01.
Article in English | MEDLINE | ID: mdl-27569545

ABSTRACT

Rare mutations, including copy-number variants (CNVs), contribute significantly to autism spectrum disorder (ASD) risk. Although their importance has been established in families with only one affected child (simplex families), the contribution of both de novo and inherited CNVs to ASD in families with multiple affected individuals (multiplex families) is less well understood. We analyzed 1,532 families from the Autism Genetic Resource Exchange (AGRE) to assess the impact of de novo and rare CNVs on ASD risk in multiplex families. We observed a higher burden of large, rare CNVs, including inherited events, in individuals with ASD than in their unaffected siblings (odds ratio [OR] = 1.7), but the rate of de novo events was significantly lower than in simplex families. In previously characterized ASD risk loci, we identified 49 CNVs, comprising 24 inherited events, 19 de novo events, and 6 events of unknown inheritance, a significant enrichment in affected versus control individuals (OR = 3.3). In 21 of the 30 families (70%) in whom at least one affected sibling harbored an established ASD major risk CNV, including five families harboring inherited CNVs, the CNV was not shared by all affected siblings, indicating that other risk factors are contributing. We also identified a rare risk locus for ASD and language delay at chromosomal region 2q24 (implicating NR4A2) and another lower-penetrance locus involving inherited deletions and duplications of WWOX. The genetic architecture in multiplex families differs from that in simplex families and is complex, warranting more complete genetic characterization of larger multiplex ASD cohorts.


Subject(s)
Autism Spectrum Disorder/genetics , DNA Copy Number Variations/genetics , Genetic Predisposition to Disease/genetics , Chromosomes, Human, Pair 2/genetics , Cohort Studies , Databases, Genetic , Exons/genetics , Female , Gene Duplication/genetics , Genome-Wide Association Study , Humans , Language Development Disorders/genetics , Male , Odds Ratio , Oligonucleotide Array Sequence Analysis , Oxidoreductases/genetics , Penetrance , Promoter Regions, Genetic/genetics , Risk Factors , Sequence Deletion/genetics , Siblings , Tumor Suppressor Proteins/genetics , Untranslated Regions/genetics , WW Domain-Containing Oxidoreductase
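Note on the odds ratios quoted in record 10: the abstract reports ORs (1.7 and 3.3) without the underlying counts. The sketch below shows how an odds ratio and an approximate Wald 95% interval are usually computed from a 2x2 table. The counts and function names are hypothetical, chosen only for illustration. They are not taken from the study, and the study's own estimation procedure may differ.

```python
import math

def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a, b: affected individuals with / without the variant
    c, d: unaffected individuals with / without the variant
    """
    return (a * d) / (b * c)

def wald_ci(a, b, c, d, z=1.96):
    """Approximate 95% confidence interval computed on the log-odds scale."""
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)

# Hypothetical counts (NOT from the paper): 30 of 100 affected siblings
# and 20 of 100 unaffected siblings carry a large, rare CNV.
example_or = odds_ratio(30, 70, 20, 80)
example_ci = wald_ci(30, 70, 20, 80)
print(round(example_or, 2), tuple(round(v, 2) for v in example_ci))
```

An interval that excludes 1 would indicate enrichment at the chosen level. With small counts an exact method is generally preferred over the Wald approximation.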
11.
Plant Cell ; 28(2): 454-65, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26772995

ABSTRACT

C4 photosynthesis is a complex phenotype that allows more efficient carbon capture than the ancestral C3 pathway. In leaves of C4 species, hundreds of transcripts increase in abundance compared with C3 relatives and become restricted to mesophyll (M) or bundle sheath (BS) cells. However, no mechanism has been reported that regulates the compartmentation of multiple enzymes in M or BS cells. We examined mechanisms regulating CARBONIC ANHYDRASE4 (CA4) in C4 Gynandropsis gynandra. Increased abundance is directed by both the promoter region and introns of the G. gynandra gene. A nine-nucleotide motif located in the 5' untranslated region (UTR) is required for preferential accumulation of GUS in M cells. This element is present and functional in three additional 5' UTRs and six 3' UTRs where it determines accumulation of two isoforms of CA and pyruvate,orthophosphate dikinase in M cells. Although the GgCA4 5' UTR is sufficient to direct GUS accumulation in M cells, transcripts encoding GUS are abundant in both M and BS. Mutating the GgCA4 5' UTR abolishes enrichment of protein in M cells without affecting transcript abundance. The work identifies a mechanism that directs cell-preferential accumulation of multiple enzymes required for C4 photosynthesis.


Subject(s)
Cleome/genetics , Plant Proteins/metabolism , Carbonic Anhydrases/genetics , Carbonic Anhydrases/metabolism , Cleome/cytology , Cleome/enzymology , Genes, Reporter , Introns/genetics , Mesophyll Cells/enzymology , Photosynthesis/genetics , Plant Leaves/cytology , Plant Leaves/enzymology , Plant Leaves/genetics , Plant Proteins/genetics , Promoter Regions, Genetic/genetics , Sequence Alignment , Untranslated Regions/genetics
12.
Mol Biol Rep ; 46(1): 1413-1424, 2019 Feb.
Article in English | MEDLINE | ID: mdl-30448895

ABSTRACT

Human astrovirus (HAstV) constitutes a major cause of acute gastroenteritis in children. The viral 5' and 3' untranslated regions (UTR) have been involved in the regulation of several molecular mechanisms. However, in astroviruses they have been less characterized. Here, we analyzed the secondary structures of the 5' and 3' UTR of HAstV, as well as their putative target sites that might be recognized by cellular factors. To our knowledge, this is the first bioinformatic analysis that predicts the HAstV 5' UTR secondary structure. The analysis showed that both the UTR sequence and secondary structure are highly conserved in all HAstVs analyzed, suggesting a regulatory role in viral activities. Notably, the UTRs of HAstVs contain putative binding sites for the serine/arginine-rich factors SRSF2, SRSF5, SRSF6, SRSF3, and the multifunctional hnRNPE2 protein. More importantly, putative binding sites for PTB were localized in single-stranded RNA sequences, while hnRNPE2 sites were localized in double-stranded sequences of the HAstV 5' and 3' UTR structures. These analyses suggest that the combination of SRSF proteins, hnRNPE2 and PTB described here could be involved in the maintenance of the secondary structure of the HAstVs, possibly allowing the recruitment of the replication complex that selects and recruits viral RNA replication templates.


Subject(s)
Computer Simulation , Mamastrovirus/genetics , Proteins/metabolism , Untranslated Regions/genetics , Base Sequence , Binding Sites , Nucleic Acid Conformation
13.
Genet Sel Evol ; 51(1): 12, 2019 Apr 15.
Article in English | MEDLINE | ID: mdl-30987584

ABSTRACT

BACKGROUND: In quail, two feather colour phenotypes i.e. fawn-2/beige and yellow are associated with the ASIP locus. The aim of our study was to characterize the structural modifications within this locus that explain the yellow mutation (large deletion) and the fawn-2/beige mutation (assumed to be caused by a different structural modification). RESULTS: For the yellow phenotype, we identified a complex mutation that involves a 141,162-bp long deletion. For the fawn-2/beige phenotype, we identified a 71-kb tandem duplication that comprises one unchanged copy of ASIP and one copy present in the ITCH-ASIP fusion gene, which leads to a transcript coding for a normal ASIP protein. Although this agrees with previous reports that reported an increased level of ASIP transcripts in the skin of mutant animals, we show that in the skin from fawn-2/beige embryos, this level is higher than expected with a simple duplication of the ASIP gene. Thus, we hypothesize that the 5' region of the ITCH-ASIP fusion gene leads to a higher transcription level than the 5' region of the ASIP gene. CONCLUSIONS: We were able to conclude that the fawn-2 and beige phenotypes are caused by the same allele at the ASIP locus. Both of the associated mutations fawn-2/beige and yellow lead to the formation of a fusion gene, which encodes a transcript for the ASIP protein. In both cases, transcription of ASIP depends on the promoter of a different gene, which includes alternative up-regulating sequences. However, we cannot exclude the possibility that the loss of the 5' region of the ASIP gene itself has additional impacts, especially for the fawn-2/beige mutation. In addition, in several other species including mammals, the existence of other dominant gain-of-function structural modifications that are localized upstream of the ASIP coding sequences has been reported, which supports our hypothesis that repressors in the 5' region of ASIP are absent in the fawn-2/beige mutant.


Subject(s)
Agouti Signaling Protein/genetics , Pigmentation/genetics , Quail/genetics , Agouti Signaling Protein/metabolism , Alleles , Animals , Color , Exons/genetics , Feathers/metabolism , Genotype , Mutation/genetics , Phenotype , Untranslated Regions/genetics
14.
PLoS Genet ; 12(8): e1006130, 2016 08.
Article in English | MEDLINE | ID: mdl-27536991

ABSTRACT

Natural selection at one site shapes patterns of genetic variation at linked sites. Quantifying the effects of "linked selection" on levels of genetic diversity is key to making reliable inference about demography, building a null model in scans for targets of adaptation, and learning about the dynamics of natural selection. Here, we introduce the first method that jointly infers parameters of distinct modes of linked selection, notably background selection and selective sweeps, from genome-wide diversity data, functional annotations and genetic maps. The central idea is to calculate the probability that a neutral site is polymorphic given local annotations, substitution patterns, and recombination rates. Information is then combined across sites and samples using composite likelihood in order to estimate genome-wide parameters of distinct modes of selection. In addition to parameter estimation, this approach yields a map of the expected neutral diversity levels along the genome. To illustrate the utility of our approach, we apply it to genome-wide resequencing data from 125 lines in Drosophila melanogaster and reliably predict diversity levels at the 1Mb scale. Our results corroborate estimates of a high fraction of beneficial substitutions in proteins and untranslated regions (UTR). They allow us to distinguish between the contribution of sweeps and other modes of selection around amino acid substitutions and to uncover evidence for pervasive sweeps in untranslated regions (UTRs). Our inference further suggests a substantial effect of other modes of linked selection and of adaptation in particular. More generally, we demonstrate that linked selection has had a larger effect in reducing diversity levels and increasing their variance in D. melanogaster than previously appreciated.


Subject(s)
Drosophila melanogaster/genetics , Evolution, Molecular , Genetic Variation , Selection, Genetic/genetics , Adaptation, Biological/genetics , Amino Acid Substitution/genetics , Animals , Chromosome Mapping , Genome, Insect , Models, Genetic , Untranslated Regions/genetics
15.
Plant J ; 92(6): 1232-1244, 2017 Dec.
Article in English | MEDLINE | ID: mdl-28980350

ABSTRACT

Chlamydomonas reinhardtii is a unicellular green alga that has attracted interest due to its potential biotechnological applications, and as a model for algal biofuel and energy metabolism. Despite all the advantages that this unicellular alga offers, poor and inconsistent expression of nuclear transgenes remains an obstacle for basic and applied research. We used a data-mining strategy to identify highly expressed genes in Chlamydomonas whose flanking sequences were tested for the ability to drive heterologous nuclear transgene expression. Candidates identified in this search included two ribosomal protein genes, RPL35a and RPL23, and ferredoxin, FDX1, whose flanking regions including promoters, terminators and untranslated sequences could drive stable luciferase transgene expression to significantly higher levels than the commonly used Hsp70A-RBCS2 (AR) hybrid promoter/terminator sequences. The RPL23 flanking sequences were further tested using the zeocin resistance gene sh-ble as a reporter in monocistronic and dicistronic constructs, and consistently yielded higher numbers of zeocin-resistant transformants and higher levels of resistance than AR- or PSAD-based vectors. Chlamydomonas RPL23 sequences also enabled transgene expression in Volvox carteri. Our study provides an additional benchmark for strong constitutive expression of transgenes in Chlamydomonas, and develops a general approach for identifying flanking sequences that can be used to drive transgene expression for any organism where transcriptome data are available.


Subject(s)
3' Flanking Region/genetics , 5' Flanking Region/genetics , Chlamydomonas reinhardtii/genetics , Volvox/genetics , Cell Nucleus/metabolism , Gene Expression , Genetic Vectors/genetics , Luciferases/genetics , Promoter Regions, Genetic/genetics , Terminator Regions, Genetic/genetics , Transgenes , Untranslated Regions/genetics
16.
BMC Evol Biol ; 18(1): 35, 2018 03 27.
Article in English | MEDLINE | ID: mdl-29580206

ABSTRACT

BACKGROUND: Protein-coding genes expressed in sperm evolve at different rates. To gain deeper insight into the factors underlying this heterogeneity we examined the relative importance of a diverse set of previously described rate correlates in determining the evolution of murine sperm proteins. RESULTS: Using partial rank correlations we detected several major rate indicators: Phyletic gene age, numbers of protein-protein interactions, and survival essentiality emerged as particularly important rate correlates in murine sperm proteins. Tissue specificity, numbers of paralogs, and untranslated region lengths also correlate significantly with sperm genes' evolutionary rates, albeit to a lesser extent. Multifunctionality, coding sequence or average intron lengths, and mean expression level have insignificant or virtually no independent effects on evolutionary rates in murine sperm genes. Gene ontology enrichment analyses of three equally sized murine sperm protein groups classified based on their evolutionary rates indicate strongest sperm-specific functional specialization in the most quickly evolving gene class. CONCLUSIONS: We propose a model according to which slowly evolving murine sperm proteins tend to be constrained by factors such as survival essentiality, network connectivity, and/or broad expression. In contrast, evolutionary change may arise especially in less constrained sperm proteins, which might, moreover, be prone to specialize to reproduction-related functions. Our results should be taken into account in future studies on rate variations of reproductive genes.


Subject(s)
Evolution, Molecular , Proteome/metabolism , Spermatozoa/metabolism , Animals , Gene Expression Regulation , Gene Ontology , Genetic Pleiotropy , Introns/genetics , Male , Mice , Molecular Sequence Annotation , Open Reading Frames/genetics , Organ Specificity/genetics , Phylogeny , Statistics, Nonparametric , Untranslated Regions/genetics
17.
Hum Mol Genet ; 25(22): 4962-4982, 2016 11 15.
Article in English | MEDLINE | ID: mdl-28171598

ABSTRACT

We performed a thorough characterization of expressed repetitive element loci (RE) in the human orbitofrontal cortex (OFC) using directional RNA sequencing data. Considering only sequencing reads that map uniquely onto the human genome, we discovered that the overwhelming majority of intronic and exonic RE are expressed in the same orientation as the gene in which they reside. Our mapping approach enabled the identification of novel differentially expressed RE transcripts between the OFC and peripheral blood lymphocytes. Further analysis revealed that RE are extensively spliced into coding regions of gene transcripts yielding thousands of novel mRNA variants with altered coding potential. Lower frequency splicing of RE into untranslated regions of gene transcripts was also observed. The same pattern of RE splicing in the brain was also detected for Drosophila, zebrafish, mouse, rat, dog and rabbit. RE splicing occurs largely at canonical GT-AG splice junctions with LINE and SINE elements forming the most RE splice junctions in the human OFC. This type of splicing usually gives rise to a minor splice variant of the endogenous gene and in silico analysis suggests that RE splicing has the potential to introduce novel open reading frames. Reanalysis of previously published sequencing data performed in the mouse cerebellum revealed that thousands of RE splice variants are associated with translating ribosomes. Our results demonstrate that RE expression is more complex than previously envisioned and raise the possibility that RE splicing might generate functional protein isoforms.


Subject(s)
Interspersed Repetitive Sequences/genetics , RNA Splice Sites/genetics , RNA Splicing/genetics , Alternative Splicing/genetics , Animals , Base Sequence , Brain/metabolism , DNA/genetics , Exons , Gene Expression Profiling/methods , Genome/genetics , Humans , Introns , Open Reading Frames/genetics , Prefrontal Cortex/metabolism , Protein Isoforms/genetics , RNA, Messenger/genetics , Repetitive Sequences, Nucleic Acid/genetics , Sequence Analysis, RNA , Untranslated Regions/genetics
18.
Breast Cancer Res Treat ; 168(2): 311-325, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29236234

ABSTRACT

PURPOSE: The molecular mechanism of breast and/or ovarian cancer susceptibility remains unclear in the majority of patients. While germline mutations in the regulatory non-coding regions of BRCA1 and BRCA2 genes have been described, screening has generally been limited to coding regions. The aim of this study was to evaluate the contribution of BRCA1/2 non-coding variants. METHODS: Four BRCA1/2 non-coding regions were screened using high-resolution melting analysis/Sanger sequencing or next-generation sequencing on DNA extracted from index cases with breast and ovarian cancer predisposition (3926 for BRCA1 and 3910 for BRCA2). The impact of a set of variants on BRCA1/2 gene regulation was evaluated by site-directed mutagenesis and transfection, followed by a luciferase reporter gene assay. RESULTS: We identified a total of 117 variants and tested 12 BRCA1 and 8 BRCA2 variants mapping to promoter and intronic regions. We highlighted two neighboring BRCA1 promoter variants (c.-130del; c.-125C>T) and one BRCA2 promoter variant (c.-296C>T) that significantly inhibited promoter activity. In the functional assays, a regulatory region within intron 12 was found to have the same enhancing effect as the one within intron 2. Furthermore, the variants c.81-3980A>G and c.4186-2022C>T suppress the positive effect of introns 2 and 12, respectively, on BRCA1 promoter activity. We also found some variants that induced promoter activity. CONCLUSION: In this study, we highlighted, among many variants, some that negatively modulate the promoter activity of BRCA1 or BRCA2 and thus have a potential impact on the risk of developing cancer. This selection makes it possible to conduct future validation studies on a limited number of variants.


Subject(s)
BRCA1 Protein/genetics , BRCA2 Protein/genetics , Genes, BRCA1 , Genes, BRCA2 , Hereditary Breast and Ovarian Cancer Syndrome/genetics , Adult , Aged , Cohort Studies , Computational Biology , Female , Genetic Predisposition to Disease , Germ-Line Mutation , High-Throughput Nucleotide Sequencing , Humans , Introns/genetics , Middle Aged , Pedigree , Polymorphism, Single Nucleotide , Promoter Regions, Genetic/genetics , Untranslated Regions/genetics
19.
Mol Cell ; 40(2): 228-37, 2010 Oct 22.
Article in English | MEDLINE | ID: mdl-20965418

ABSTRACT

A number of stresses, including nutrient stress, temperature shock, DNA damage, and hypoxia, can lead to changes in gene expression patterns caused by a general shutdown and reprogramming of protein synthesis. Each of these stress conditions results in selective recruitment of ribosomes to mRNAs whose protein products are required for responding to stress. This recruitment is regulated by elements within the 5' and 3' untranslated regions of mRNAs, including internal ribosome entry segments, upstream open reading frames, and microRNA target sites. These elements can act singly or in combination and are themselves regulated by trans-acting factors. Translational reprogramming can result in increased life span, and conversely, deregulation of these translation pathways is associated with disease including cancer and diabetes.


Subject(s)
Eukaryotic Cells/metabolism , Gene Expression Regulation , Protein Biosynthesis/genetics , Stress, Physiological/physiology , Animals , Humans , Models, Genetic , RNA, Messenger/genetics , Untranslated Regions/genetics
20.
Nat Methods ; 11(3): 294-6, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24487584

ABSTRACT

Identifying functionally relevant variants against the background of ubiquitous genetic variation is a major challenge in human genetics. For variants in protein-coding regions, our understanding of the genetic code and splicing allows us to identify likely candidates, but interpreting variants outside genic regions is more difficult. Here we present genome-wide annotation of variants (GWAVA), a tool that supports prioritization of noncoding variants by integrating various genomic and epigenomic annotations.


Subject(s)
Molecular Sequence Annotation , Untranslated Regions/genetics , Algorithms , Computer Simulation , Genetic Variation , Humans