Results 1 - 20 of 65
1.
Mol Cell ; 72(3): 482-495.e7, 2018 11 01.
Article in English | MEDLINE | ID: mdl-30388410

ABSTRACT

Productive splicing of human precursor messenger RNAs (pre-mRNAs) requires the correct selection of authentic splice sites (SS) from the large pool of potential SS. Although SS consensus sequence and splicing regulatory proteins are known to influence SS usage, the mechanisms ensuring the effective suppression of cryptic SS are insufficiently explored. Here, we find that many aberrant exonic SS are efficiently silenced by the exon junction complex (EJC), a multi-protein complex that is deposited on spliced mRNA near the exon-exon junction. Upon depletion of EJC proteins, cryptic SS are de-repressed, leading to the mis-splicing of a broad set of mRNAs. Mechanistically, the EJC-mediated recruitment of the splicing regulator RNPS1 inhibits cryptic 5'SS usage, while the deposition of the EJC core directly masks reconstituted 3'SS, thereby precluding transcript disintegration. Thus, the EJC protects the transcriptome of mammalian cells from inadvertent loss of exonic sequences and safeguards the expression of intact, full-length mRNAs.


Subjects
Alternative Splicing/physiology , Exons/physiology , RNA Splice Sites/physiology , Consensus Sequence/genetics , DEAD-box RNA Helicases/metabolism , Eukaryotic Initiation Factor-4A/metabolism , HeLa Cells , Humans , Introns , RNA Precursors/physiology , RNA Splicing/physiology , Messenger RNA/genetics , RNA-Binding Proteins/metabolism , Ribonucleoproteins/metabolism , Transcriptome/genetics
2.
J Biol Chem ; 299(12): 105442, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37949222

ABSTRACT

Adenine base editors (ABEs) are genome-editing tools that have been harnessed to introduce precise A•T to G•C conversions. The discovery of split genes revealed that all introns contain two highly conserved dinucleotides, the canonical "AG" (acceptor) and "GT" (donor) splice sites. ABEs can directly edit the adenine (A) base of splice acceptor sites, leading to aberrant gene splicing, which may be further adopted to remodel splicing. However, spliced isoforms triggered by ABEs have not been well explored. To address this, we first generated a cell line harboring C-terminal enhanced GFP (eGFP)-tagged β-actin (ACTB), in which the eGFP signal tracks endogenous β-actin expression. As expected, after the editing of splice acceptor sites, we observed a dramatic decrease in the percentage of eGFP-positive cells and the generation of splicing products with a noncanonical splice site. Furthermore, we manipulated Peroxidasin in mouse embryos with ABE, in which a noncanonical acceptor was activated to remodel splicing, successfully generating a mouse disease model of anophthalmia and severely malformed microphthalmia. Collectively, we demonstrate that ABE-mediated splicing remodeling can activate a noncanonical acceptor to manipulate human and mouse genomes, which will facilitate basic and translational medicine studies.


Subjects
Adenine , RNA Splice Sites , Animals , Humans , Mice , Actins/genetics , Base Sequence , Gene Editing , Introns , RNA Splicing , HEK293 Cells
3.
Ann Hum Genet ; 2024 Oct 03.
Article in English | MEDLINE | ID: mdl-39361243

ABSTRACT

The DYNC2H1 gene has been associated with short-rib polydactyly syndrome (SRPS), among other skeletal ciliopathies. Two cases are presented of distinctive phenotypes resulting from splicing variants in DYNC2H1. The first is a 14-week-old fetus with enlarged nuchal translucency, oral hamartoma, malformed uvula, bifid epiglottis, short ribs, micromelia, long bone agenesis, polysyndactyly, heart defect, pancreatic cysts, multicystic dysplastic kidney, megabladder and trident acetabulum. A ciliopathies NGS panel revealed two compound heterozygous variants in DYNC2H1: c.7840-18T>G r.7841_7964del p.Gly2614Aspfs*5 and c.11070G>A r.11044_11116del p.Ile3682Aspfs*2. Both variants were initially classified as variants of uncertain significance but were reclassified as likely pathogenic after PCR-based RNA testing. The second is an 11-year-old overweight male with multiple accessory oral frenula, median cleft lip and alveolar ridge, polysyndactyly, brachydactyly, normal rib length, and hypogonadism. Exome sequencing revealed two compound heterozygous variants in DYNC2H1: c.6315del p.(Thr2106Glnfs*7), classified as likely pathogenic, and c.3303-16A>G p.(?), classified as a variant of uncertain significance. PCR-based RNA testing suggested that c.3303-16A>G induces an in-frame deletion: r.3303_3458del p.Asp1102_Arg1153del, although the normal transcript is still produced. These results are consistent with both SRPS type I/III in the first case and orofaciodigital syndrome in the second, an unprecedented description. This work thus improves the clinical and molecular knowledge of the phenotypes associated with splicing variants in the DYNC2H1 gene.

4.
Adv Exp Med Biol ; 1415: 183-187, 2023.
Article in English | MEDLINE | ID: mdl-37440032

ABSTRACT

Inherited retinal diseases (IRDs) are an extremely diverse group of ocular disorders characterized by progressive loss of photoreceptors leading to blindness. So far, pathogenic variants in over 300 genes are reported to structurally and functionally affect the retina resulting in visual impairment. Around 15% of all IRD mutations are known to affect an essential regulatory mechanism, pre-mRNA splicing, which contributes to the transcriptomic diversity. These variants disrupt potential donor and acceptor splice sites as well as other crucial cis-acting elements resulting in aberrant splicing. One group of these elements, the exonic splicing enhancers (ESEs), are involved in promoting exon definition and are likely to harbor "hidden" mutations since sequence-analyzing pipelines cannot identify them efficiently. The main focus of this review is to discuss the molecular mechanisms behind various exonic variants affecting splice sites and ESEs that lead to impaired splicing which in turn result in an IRD pathology.


Subjects
RNA Splicing , Retinal Diseases , Humans , RNA Splicing/genetics , Mutation , Exons/genetics , Retinal Diseases/genetics , Retina , Alternative Splicing
5.
BMC Bioinformatics ; 23(1): 413, 2022 Oct 06.
Article in English | MEDLINE | ID: mdl-36203144

ABSTRACT

BACKGROUND: Identifying splice site regions is an important step in the genomic DNA sequencing pipelines of biomedical and pharmaceutical research. Efficient and accurate splice site detection is therefore highly desirable, and a variety of computational models have been developed toward this end. Neural network architectures have recently been shown to outperform classical machine learning approaches for the task of splice site prediction. Despite these advances, there is still considerable potential for improvement, especially regarding model prediction accuracy and error rate. RESULTS: Given these deficits, we propose EnsembleSplice, an ensemble learning architecture that combines four (4) distinct convolutional neural network (CNN) model architectures and outperforms existing splice site detection methods on the experimental evaluation metrics considered, including accuracy and error rate. We trained and tested a variety of ensembles made up of CNNs and DNNs using five-fold cross-validation to identify the model that performed best across the evaluation and diversity metrics. As a result, we developed a diverse and highly effective splice site (SS) detection model, which we evaluated using two (2) genomic Homo sapiens datasets and an Arabidopsis thaliana dataset. For one of the Homo sapiens datasets, EnsembleSplice achieved accuracies of 94.16% for acceptor splice sites and 95.97% for donor splice sites, with corresponding error rates of 5.84% for acceptor splice sites and 4.03% for donor splice sites. CONCLUSIONS: Our five-fold cross-validation ensured that the prediction accuracy of our models is consistent. For reproducibility, all the datasets used, models generated, and results in our work are publicly available in our GitHub repository: https://github.com/OluwadareLab/EnsembleSplice.
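
The idea of combining several classifiers and reporting accuracy and error rate can be illustrated with a minimal sketch. It assumes simple soft voting (averaging predicted probabilities); the abstract does not state how EnsembleSplice combines its sub-networks, so the combination rule, the function names `soft_vote` and `accuracy`, and the toy numbers are all hypothetical.

```python
# Minimal soft-voting sketch. NOT the EnsembleSplice implementation:
# the averaging rule and all names here are illustrative assumptions.

def soft_vote(prob_lists):
    """Average per-sample probabilities across models.

    prob_lists: one list of P(splice site) per model, all the same length.
    """
    n_models = len(prob_lists)
    return [sum(column) / n_models for column in zip(*prob_lists)]


def accuracy(y_true, probs, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    preds = [1 if p >= threshold else 0 for p in probs]
    correct = sum(int(p == t) for p, t in zip(preds, y_true))
    return correct / len(y_true)


# Toy probabilities from three hypothetical models for four sequences.
model_probs = [
    [0.9, 0.2, 0.6, 0.4],
    [0.8, 0.4, 0.3, 0.1],
    [0.7, 0.3, 0.9, 0.7],
]
labels = [1, 0, 1, 1]

ensemble_probs = soft_vote(model_probs)
acc = accuracy(labels, ensemble_probs)
error_rate = 1.0 - acc  # error rate is the complement of accuracy
print(f"accuracy={acc:.2%} error_rate={error_rate:.2%}")
```

Because the error rate is the complement of accuracy, the reported pairs are mutually consistent: 100% - 94.16% = 5.84% and 100% - 95.97% = 4.03%.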


Subjects
Deep Learning , Genomics , Humans , Machine Learning , Computer Neural Networks , Reproducibility of Results
6.
RNA ; 26(7): 784-793, 2020 07.
Article in English | MEDLINE | ID: mdl-32241834

ABSTRACT

Long noncoding RNAs (lncRNAs) have recently emerged as prominent regulators of gene expression in eukaryotes. LncRNAs often drive the modification and maintenance of gene activation or gene silencing states via chromatin conformation rearrangements. In plants, lncRNAs have been shown to participate in gene regulation, and are essential to processes such as vernalization and photomorphogenesis. Despite their prominent functions, only just over a dozen lncRNAs have been experimentally and functionally characterized. As with their animal counterparts, the rates of sequence divergence are much higher in plant lncRNAs than in protein-coding mRNAs, making it difficult to identify lncRNA conservation using traditional sequence comparison methods. Beyond this, little is known about the evolutionary patterns of lncRNAs in plants. Here, we characterized the splicing conservation of lncRNAs in Brassicaceae. We generated a whole-genome alignment of 16 Brassica species and used it to identify syntenic lncRNA orthologs. Using a scoring system trained on transcriptomes from A. thaliana and B. oleracea, we identified splice sites across the whole alignment and measured their conservation. Our analysis revealed that 17.9% (112/627) of all intergenic lncRNAs display splicing conservation in at least one exon, an estimate that is substantially higher than previous estimates of lncRNA conservation in this group. Our findings agree with similar studies in vertebrates, demonstrating that splicing conservation can be evidence of stabilizing selection. We provide conclusive evidence for the existence of evolutionarily deeply conserved lncRNAs in plants and describe a generally applicable computational workflow to identify functional lncRNAs in plants.


Subjects
Conserved Sequence/genetics , RNA Splicing/genetics , Long Noncoding RNA/genetics , Plant RNA/genetics , Arabidopsis/genetics , Brassica/genetics , Molecular Evolution , Plant Genome/genetics , Messenger RNA/genetics
7.
Am J Med Genet A ; 188(10): 3089-3095, 2022 10.
Article in English | MEDLINE | ID: mdl-35946377

ABSTRACT

Alternative use of short-distance tandem sites such as NAGNnAG is a common mechanism of alternative splicing; however, single nucleotide variants are rarely reported as likely to generate or to disrupt tandem splice sites. We identify a pathogenic intron 5 STK11 variant (NM_000455.4:c.[735-6A>G];[=]) segregating with the mucocutaneous features but not the hamartomatous polyps of Peutz-Jeghers syndrome in two individuals. By RNAseq analysis of peripheral blood mRNA, this variant was shown to generate a novel and preferentially used tandem proximal splice acceptor (AAGTGAAG). The variant transcript (NM_000455.4:c.734_734+1insTGAAG), which encodes a frameshift (p.[Tyr246Glufs*43]), constituted 36%-43% of STK11 transcripts, suggesting partial escape from nonsense-mediated mRNA decay and translation of a truncated protein. A review of the ClinVar database identified other similar variants. We suggest that nucleotide changes creating or disrupting tandem alternative splice sites are a pertinent disease mechanism and require contextualization for clinical reporting. Additionally, we hypothesize that some pathogenic STK11 variants cause an attenuated phenotype.
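
The tandem-acceptor mechanism described above can be traced with a small sketch. It scans the last intronic nucleotides for AG dinucleotides and reports how many intronic bases would be retained if each AG were used as the acceptor. The variant octamer AAGTGAAG comes from the abstract; the wild-type octamer AAATGAAG is an assumption obtained by reverting the G at position -6 to A, and the function name `tandem_acceptors` is hypothetical.

```python
# Sketch: find candidate tandem acceptor AG sites at the 3' end of an intron.
# The wild-type sequence is inferred from c.735-6A>G (an assumption), and
# this scan ignores splice-site strength, branch point, and other context.

def tandem_acceptors(intron_tail):
    """Return (retained_length, retained_sequence) for every AG in the tail.

    intron_tail: the last nucleotides of the intron, ending at position -1.
    retained_sequence: intronic bases that would join the exon if that AG
    were used as the acceptor (empty for the canonical AG at -2/-1).
    """
    tail = intron_tail.upper()
    hits = []
    for i in range(len(tail) - 1):
        if tail[i:i + 2] == "AG":
            retained = tail[i + 2:]
            hits.append((len(retained), retained))
    return hits


wild_type = "AAATGAAG"  # assumed: A at position -6
variant = "AAGTGAAG"    # from the abstract: G at position -6

for name, seq in (("wild type", wild_type), ("variant", variant)):
    for length, retained in tandem_acceptors(seq):
        frame = "in frame" if length % 3 == 0 else "frameshift"
        print(f"{name}: retains {length} nt '{retained}' ({frame})")
```

Under these assumptions the variant creates a second AG whose use retains the 5 nt TGAAG, matching the reported insertion and, since 5 is not a multiple of 3, the reported frameshift.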


Subjects
Peutz-Jeghers Syndrome , AMP-Activated Protein Kinase Kinases , Alternative Splicing , Nonsense Codon , Humans , Nucleotides , Peutz-Jeghers Syndrome/genetics , Peutz-Jeghers Syndrome/pathology
8.
RNA Biol ; 19(1): 333-352, 2022.
Article in English | MEDLINE | ID: mdl-35220879

ABSTRACT

Latent 5' splice sites, not normally used, are highly abundant in human introns, but are activated under stress and in cancer, generating thousands of nonsense mRNAs. A previously proposed mechanism to suppress latent splicing was shown to be independent of NMD, with a pivotal role for initiator-tRNA independent of protein translation. To further elucidate this mechanism, we searched for nuclear proteins directly bound to initiator-tRNA. Starting with UV-crosslinking, we identified nucleolin (NCL) interacting directly and specifically with initiator-tRNA in the nucleus, but not in the cytoplasm. Next, we show the association of ini-tRNA and NCL with pre-mRNA. We further show that recovery of suppression of latent splicing by initiator-tRNA complementation is NCL dependent. Finally, upon nucleolin knockdown we show activation of latent splicing in hundreds of coding transcripts having important cellular functions. We thus propose nucleolin, a component of the endogenous spliceosome, through its direct binding to initiator-tRNA and its effect on latent splicing, as the first protein of a nuclear quality control mechanism regulating splice site selection to protect cells from latent splicing that can generate defective mRNAs.


Subjects
Binding Sites , Phosphoproteins/metabolism , RNA Splice Sites , RNA Splicing , RNA-Binding Proteins/metabolism , Cell Nucleus/genetics , Cell Nucleus/metabolism , Gene Knockdown Techniques , Humans , Mass Spectrometry , Protein Binding , RNA Interference , Transfer RNA/genetics , Nucleolin
9.
Appl Intell (Dordr) ; 52(3): 3002-3017, 2022.
Article in English | MEDLINE | ID: mdl-34764607

ABSTRACT

Viral infection causes a wide variety of human diseases, including cancer and COVID-19. Viruses invade host cells and associate with host molecules, potentially disrupting normal host function and leading to fatal diseases. Novel viral genome prediction is crucial for understanding complex viral diseases such as AIDS and Ebola. While most existing computational techniques classify viral genomes, the efficiency of the classification depends solely on the structural features extracted. State-of-the-art DNN models achieve excellent performance through automatic extraction of classification features, but their explainability is relatively poor. During model training for viral prediction, the proposed CNN- and CNN-LSTM-based methods (EdeepVPP and EdeepVPP-hybrid) automatically extract features. EdeepVPP also performs model interpretation in order to extract, through learned filters, the most important patterns that characterize viral genomes. It is an interpretable CNN model that extracts vital biologically relevant patterns (features) from feature maps of viral sequences. The EdeepVPP-hybrid predictor outperforms all the existing methods by achieving 0.992 mean AUC-ROC and 0.990 AUC-PR on 19 human metagenomic contig experiment datasets using 10-fold cross-validation. We evaluate the ability of CNN filters to detect patterns across high average activation values. To further assess the robustness of the EdeepVPP model, we perform leave-one-experiment-out cross-validation. The model can work as a recommendation system to further analyze the raw sequences labeled as 'unknown' by alignment-based methods. We show that our interpretable model can extract patterns that are considered to be the most important features for predicting virus sequences through learned filters.

10.
Hum Mutat ; 42(4): 342-345, 2021 04.
Article in English | MEDLINE | ID: mdl-33600011

ABSTRACT

Splice site variants may lead to transcript alterations, causing exon inclusion, exclusion, or truncation, or intron retention. Interpreting the consequences of a specific splice site variant is not straightforward, especially if the variant is located outside of the canonical splice sites. We developed MutSpliceDB (https://brb.nci.nih.gov/splicing), a public resource to facilitate the interpretation of the effects of splice site variants on splicing, based on manually reviewed RNA-seq BAM files from samples with splice site variants.


Subjects
RNA Splice Sites , RNA Splicing , Alternative Splicing , Exons/genetics , Humans , Introns/genetics , RNA Splice Sites/genetics , RNA Splicing/genetics , RNA-Seq
11.
Methods ; 176: 25-33, 2020 04 01.
Article in English | MEDLINE | ID: mdl-30926533

ABSTRACT

Introns in different genes, or even different introns within the same gene, often have different splice sites and differ in splicing efficiency (SE). One expects mass-transcribed genes to have introns with higher SE than weakly transcribed genes. However, such a simple expectation cannot be tested directly because variable SE for these genes is often not measured. Mechanistically, SE should depend on signal strength at key splice sites (SS) such as 5'SS, 3'SS and branchpoint site (BPS), i.e., SE = F(5'SS, 3'SS, BPS). However, without SE, we again cannot model how these splice sites contribute to SE. Here I present an RNA-Seq approach to quantify SE for each of the 304 introns in yeast (Saccharomyces cerevisiae) genes, including 24 in the 5'UTR, by measuring 1) the number of reads mapped to exon-exon junctions (NEE) as a proxy for the abundance of the spliced form, and 2) the number of reads mapped to exon-intron junctions (NEI5 and NEI3 at the 5' and 3' ends of the intron) as a proxy for the abundance of the unspliced form. The total mRNA is NTotal = NEE + p * NEI5 + (1-p) * NEI3, with the simplest choice being p = 0.5, but statistical methods are presented to estimate p from data. An estimated p is needed because NEI5 is expected to be smaller than NEI3 for three reasons: 1) step 1 of splicing occurs before step 2, so EI5 is broken before EI3, 2) poly(A) mRNA is enriched by oligo-dT, and 3) degradation proceeds from the 5' end. SE is defined as the proportion NEE/NTotal. Application of the method shows that ribosomal protein messages are efficiently and mostly cotranscriptionally spliced. Yeast genes with long introns are also spliced efficiently. HAC1/YFL031W is poorly spliced partly because its splicing involves a nonspliceosomal mechanism and partly because Ire1p, which participates in splicing HAC1, is hardly expressed. Many putative yeast genes have low SE, and some splice sites are incorrectly annotated.
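
The splicing-efficiency definition in the abstract translates directly into a few lines of code. The sketch below implements only the stated formula, NTotal = NEE + p * NEI5 + (1-p) * NEI3 and SE = NEE/NTotal; the function name `splicing_efficiency`, the input checks, and the read counts are illustrative assumptions, and the statistical estimation of p described by the author is not reproduced.

```python
# Sketch of SE = NEE / NTotal with NTotal = NEE + p*NEI5 + (1-p)*NEI3.
# Read counts below are made up; estimating p from data is out of scope.

def splicing_efficiency(n_ee, n_ei5, n_ei3, p=0.5):
    """Proportion of spliced transcripts for one intron.

    n_ee:  reads spanning the exon-exon junction (spliced form)
    n_ei5: reads spanning the 5' exon-intron junction (unspliced form)
    n_ei3: reads spanning the 3' intron-exon junction (unspliced form)
    p:     weight given to the 5' junction count, between 0 and 1
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    n_total = n_ee + p * n_ei5 + (1.0 - p) * n_ei3
    if n_total == 0:
        raise ValueError("no reads support either form of this intron")
    return n_ee / n_total


# Hypothetical counts: 90 spliced reads, 8 and 12 unspliced junction reads.
se = splicing_efficiency(n_ee=90, n_ei5=8, n_ei3=12, p=0.5)
print(f"SE = {se:.3f}")
```

With p = 0.5 the two exon-intron counts are simply averaged, so the hypothetical intron above has NTotal = 90 + 4 + 6 = 100 and SE = 0.9.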


Subjects
RNA-Seq/methods , Saccharomyces cerevisiae/genetics , 5' Untranslated Regions/genetics , Introns/genetics , Poly A/genetics , RNA Splicing , Fungal RNA/genetics , Messenger RNA/genetics , Ribosomal Proteins/genetics , Saccharomyces cerevisiae Proteins/genetics
12.
Genomics ; 112(2): 1847-1852, 2020 03.
Article in English | MEDLINE | ID: mdl-31704313

ABSTRACT

A novel method is proposed to detect the acceptor and donor splice sites using chaos game representation and artificial neural network. In order to achieve high accuracy, inputs to the neural network, or feature vector, shall reflect the true nature of the DNA segments. Therefore it is important to have one-to-one numerical representation, i.e. a feature vector should be able to represent the original data. Chaos game representation (CGR) is an iterative mapping technique that assigns each nucleotide in a DNA sequence to a respective position on the plane in a one-to-one manner. Using CGR, a DNA sequence can be mapped to a numerical sequence that reflects the true nature of the original sequence. In this research, we propose to use CGR as feature input to a neural network to detect splice sites on the NN269 dataset. Computational experiments indicate that this approach gives good accuracy while being simpler than other methods in the literature, with only one neural network component. The code and data for our method can be accessed from this link: https://github.com/thoang3/portfolio/tree/SpliceSites_ANN_CGR.
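
The chaos game representation mentioned above can be sketched as follows. Each nucleotide moves the current point halfway toward a corner of the unit square, so every prefix of the sequence lands at a distinct position. The corner assignment and the starting point (0.5, 0.5) follow a common convention and are assumptions here; the paper's exact layout and its neural network are not reproduced, and `cgr_points` is a hypothetical name.

```python
# Sketch of chaos game representation (CGR) for a DNA sequence.
# Corner layout and start point are a common convention, assumed here.

CORNERS = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def cgr_points(seq):
    """Map each nucleotide to a point in the unit square.

    Starting from the centre, each step moves halfway from the current
    point toward the corner of the next nucleotide.
    """
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        corner_x, corner_y = CORNERS[base]  # KeyError on non-ACGT input
        x, y = (x + corner_x) / 2, (y + corner_y) / 2
        points.append((x, y))
    return points


for base, point in zip("AGGT", cgr_points("AGGT")):
    print(base, point)
```

The resulting list of coordinates (or a histogram of them) is the kind of numerical feature vector that could then be passed to a classifier.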


Subjects
Computer Neural Networks , RNA Splice Sites , DNA Sequence Analysis/methods , Humans , Nonlinear Dynamics , Software
13.
Semin Cell Dev Biol ; 79: 103-112, 2018 07.
Article in English | MEDLINE | ID: mdl-28965864

ABSTRACT

The U12-dependent (minor) spliceosome excises a rare group of introns that are characterized by a highly conserved 5' splice site and branch point sequence. Several new congenital or somatic diseases have recently been associated with mutations in components of the minor spliceosome. A common theme in these diseases is the detection of elevated levels of transcripts containing U12-type introns, of which a subset is associated with other splicing defects. Here we review the present understanding of minor spliceosome diseases, particularly those associated with the specific components of the minor spliceosome. We also present a model for interpreting the molecular-level consequences of the different diseases.


Subjects
Disease/genetics , RNA Precursors/genetics , RNA Splicing , Small Nuclear Ribonucleoproteins/genetics , Spliceosomes/genetics , Animals , Base Sequence , Humans , Mutation , Messenger RNA/genetics
14.
RNA ; 24(10): 1314-1325, 2018 10.
Article in English | MEDLINE | ID: mdl-30006499

ABSTRACT

The tri-snRNP 27K protein is a component of the human U4/U6-U5 tri-snRNP and contains an N-terminal phosphorylated RS domain. In a forward genetic screen in C. elegans, we previously identified a dominant mutation, M141T, in the highly conserved C-terminal region of this protein. The mutant allele promotes changes in cryptic 5' splice site choice. To better understand the function of this poorly characterized splicing factor, we performed high-throughput mRNA sequencing analysis on worms containing this dominant mutation. Comparison of alternative splice site usage between the mutant and wild-type strains led to the identification of 26 native genes whose splicing changes in the presence of the snrp-27 mutation. The changes in splicing are specific to alternative 5' splice sites. Analysis of new alleles suggests that snrp-27 is an essential gene for worm viability. We performed a novel directed-mutation experiment in which we used the CRISPR-Cas9 system to randomly generate mutations specifically at M141 of SNRP-27. We identified eight amino acid substitutions at this position that are viable, and three that are homozygous lethal. All viable substitutions at M141 led to varying degrees of changes in alternative 5' splicing of native targets. We hypothesize a role for this SR-related factor in maintaining the position of the 5' splice site as U1 snRNA trades interactions at the 5' end of the intron with U6 snRNA and PRP8 as the catalytic site is assembled.


Subjects
Caenorhabditis elegans/genetics , Caenorhabditis elegans/metabolism , RNA Splice Sites , RNA Splicing , Small Nuclear Ribonucleoproteins/metabolism , Spliceosomes/metabolism , Animals , Caenorhabditis elegans Proteins/genetics , Caenorhabditis elegans Proteins/metabolism , High-Throughput Nucleotide Sequencing , Humans , Mutation , RNA Precursors/genetics , RNA Precursors/metabolism , Small Nuclear Ribonucleoproteins/genetics , RNA Sequence Analysis
15.
Int J Mol Sci ; 21(7), 2020 03 26.
Article in English | MEDLINE | ID: mdl-32225107

ABSTRACT

Noncanonical splice-site mutations are an important cause of inherited diseases. Based on in vitro and stem-cell-based studies, some splice-site variants show a stronger splice defect than expected based on their predicted effects, suggesting that other sequence motifs influence the outcome. We investigated whether splice defects due to human-inherited-disease-associated variants in noncanonical splice-site sequences in ABCA4, DMD, and TMC1 could be rescued by strengthening the splice site on the other side of the exon. Noncanonical 5'- and 3'-splice-site variants were selected. Rescue variants were introduced based on an increase in predicted splice-site strength, and the effects of these variants were analyzed using in vitro splice assays in HEK293T cells. Exon skipping due to five variants in noncanonical splice sites of exons in ABCA4, DMD, and TMC1 could be partially or completely rescued by increasing the predicted strengths of the other splice site of the same exon. We named this mechanism "splicing interdependency", and it is likely based on exon recognition by splicing machinery. Awareness of this interdependency is of importance in the classification of noncanonical splice-site variants associated with disease and may open new opportunities for treatments.


Subjects
Exons , RNA Splice Sites , ATP-Binding Cassette Transporters/genetics , ATP-Binding Cassette Transporters/metabolism , Dystrophin/genetics , Dystrophin/metabolism , HEK293 Cells , Humans , Membrane Proteins/genetics , Membrane Proteins/metabolism , RNA Splicing
16.
BMC Bioinformatics ; 20(Suppl 23): 652, 2019 Dec 27.
Article in English | MEDLINE | ID: mdl-31881982

ABSTRACT

BACKGROUND: Identifying splice sites is a necessary step in analyzing the location and structure of genes. Two dinucleotides, GT and AG, are highly frequent at splice sites, and many other patterns with important biological functions also occur at splice sites. Meanwhile, these dinucleotides also occur frequently in sequences without splice sites, which makes prediction prone to false positives. Most existing tools select all the sequences containing the two dimers and then focus on distinguishing the true splice sites from the pseudo ones. Such an approach decreases false positives; however, it misses non-canonical splice sites. RESULT: We have designed SpliceFinder, based on a convolutional neural network (CNN), to predict splice sites. To achieve ab initio prediction, we used human genomic data to train our neural network. An iterative approach is adopted to reconstruct the dataset, which tackles the data imbalance problem and forces the model to learn more features of splice sites. The proposed CNN obtains a classification accuracy of 90.25%, which is 10% higher than the existing algorithms. The method outperforms other existing methods in terms of area under the receiver operating characteristic curve (AUC), recall, precision, and F1 score. Furthermore, SpliceFinder can find the exact position of splice sites on long genomic sequences with a sliding window. Compared with other state-of-the-art splice site prediction tools, SpliceFinder generates about half as many false positives while keeping recall higher than 0.8. SpliceFinder also captures non-canonical splice sites. In addition, SpliceFinder performs well on the genomic sequences of Drosophila melanogaster, Mus musculus, Rattus, and Danio rerio without retraining. CONCLUSION: Based on a CNN, we have proposed a new ab initio splice site prediction tool, SpliceFinder, which generates fewer false positives and can detect non-canonical splice sites. Additionally, SpliceFinder is transferable to other species without retraining. The source code and additional materials are available at https://gitlab.deepomics.org/wangruohan/SpliceFinder.
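
The sliding-window scan over a long genomic sequence can be illustrated with a short sketch. It one-hot encodes each window and passes it to an arbitrary scoring function, so no GT/AG pre-filter is applied. This is not the SpliceFinder code: the window width, the threshold, the names `one_hot`, `sliding_windows`, and `scan`, and the toy scorer standing in for a trained CNN are all hypothetical.

```python
# Sketch of a sliding-window scan with one-hot encoded windows.
# The scorer is a placeholder for a trained model; all names are assumed.

ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}


def one_hot(window):
    """Encode a DNA string as a list of 4-tuples; unknown bases are all zero."""
    return [ONE_HOT.get(base, (0, 0, 0, 0)) for base in window.upper()]


def sliding_windows(seq, width, step=1):
    """Yield (start, window) for every full-width window of the sequence."""
    for start in range(0, len(seq) - width + 1, step):
        yield start, seq[start:start + width]


def scan(seq, width, score, threshold=0.5):
    """Return centre positions of windows whose score reaches the threshold."""
    hits = []
    for start, window in sliding_windows(seq, width):
        if score(one_hot(window)) >= threshold:
            hits.append(start + width // 2)
    return hits


def toy_score(encoded):
    """Placeholder scorer: 1.0 when the centre base is G, else 0.0."""
    return 1.0 if encoded[len(encoded) // 2] == ONE_HOT["G"] else 0.0


print(scan("ACGTACGT", width=3, score=toy_score))
```

In practice `score` would wrap a trained classifier's prediction for the encoded window, and the reported centre positions would be the candidate splice sites.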


Subjects
Computational Biology/methods , Computer Neural Networks , RNA Splice Sites/genetics , Software , Algorithms , Animals , Base Sequence , Genetic Databases , Genome , Humans , RNA Splicing/genetics
17.
Planta ; 250(2): 603-628, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31139927

ABSTRACT

MAIN CONCLUSION: Compared with its parents, Brassica hexaploid underwent significant AS changes, which may provide diversified gene expression regulation patterns and could enhance its adaptability during evolution. Polyploidization is considered a significant evolutionary force that promotes species formation. Alternative splicing (AS) plays a crucial role in multiple biological processes during plant growth and development. To explore the effects of allopolyploidization on the AS patterns of genes, a genome-wide AS analysis was performed by RNA-seq in Brassica hexaploid and its parents. In total, we found 7913 (27540 AS events), 14447 (70179 AS events), and 13205 (60804 AS events) AS genes in Brassica rapa, Brassica carinata, and Brassica hexaploid, respectively. A total of 920 new AS genes were discovered in Brassica hexaploid. There were 56 differently spliced genes between Brassica hexaploid and its parents. In addition, most of the alternative 5' splice sites were located 4 bp upstream of the dominant 5' splice sites, and most of the alternative 3' splice sites were located 3 bp downstream of the dominant 3' splice sites in Brassica hexaploid, which was similar to B. carinata. Furthermore, we cloned and sequenced all amplicons from the RT-PCR products of GRP7/8, namely, Bol045859, Bol016025 and Bol02880. The three genes were found to produce AS transcripts in a new way. The AS patterns of genes were diverse between Brassica hexaploid and its parents, including the loss and gain of AS events. Allopolyploidization changed alternative splicing sites of pre-mRNAs in Brassica hexaploid, which brought about alterations in the sequences of transcripts. Our study provides novel insights into the AS patterns of genes in allopolyploid plants, which may provide a reference for the study of polyploidy adaptability.


Subjects
Alternative Splicing , Brassica/genetics , Plant Gene Expression Regulation/genetics , Plant Genome/genetics , Physiological Adaptation , Biological Evolution , Brassica/physiology , Brassica rapa/genetics , Brassica rapa/physiology , Polyploidy
18.
RNA Biol ; 16(10): 1364-1376, 2019 10.
Article in English | MEDLINE | ID: mdl-31213135

ABSTRACT

Splicing-affecting mutations can disrupt gene function by altering the transcript assembly. To ascertain splicing dysregulation principles, we modified a minigene assay for the parallel high-throughput evaluation of different mutations by next-generation sequencing. In our model system, all exonic and six intronic positions of the SMN1 gene's exon 7 were mutated to all possible nucleotide variants, which amounted to 180 unique single-nucleotide mutants and 470 double mutants. The mutations resulted in a wide range of splicing aberrations. Exonic splicing-affecting mutations resulted either in substantial exon skipping, supposedly driven by a predicted exonic splicing silencer, or in the strengthening and use of a cryptic or de novo donor splice site (5'ss). On the other hand, a single disruption of an exonic splicing enhancer was not sufficient to cause major exon skipping, suggesting these elements can be substituted during exon recognition. While disrupting the acceptor splice site led only to exon skipping, some 5'ss mutations potentiated the use of three different cryptic 5'ss. Generally, single mutations supporting cryptic 5'ss use displayed better pre-mRNA/U1 snRNA duplex stability and increased splicing regulatory element strength across the original 5'ss. Analyzing double mutants supported the predominating effect of splicing regulatory elements, but U1 snRNA binding could contribute to the global balance of splicing isoforms. Based on these findings, we suggest that creating a new splicing enhancer across the mutated 5'ss can be one of the main factors driving cryptic 5'ss use.
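
The saturation design described above, in which every position is changed to every alternative nucleotide, can be enumerated with a short sketch. Each position yields three single-nucleotide variants, so 180 unique single mutants correspond to 60 mutated positions; that position count is inferred from the arithmetic rather than stated in the abstract. The placeholder sequence and the function name `single_mutants` are hypothetical and do not represent the SMN1 exon 7 sequence.

```python
# Sketch: enumerate all single-nucleotide variants of a sequence.
# The 60-nt placeholder below is NOT the real SMN1 exon 7 region.

BASES = "ACGT"


def single_mutants(seq):
    """Return (position, ref, alt, mutated_sequence) for every substitution."""
    seq = seq.upper()
    variants = []
    for pos, ref in enumerate(seq):
        for alt in BASES:
            if alt != ref:
                mutated = seq[:pos] + alt + seq[pos + 1:]
                variants.append((pos, ref, alt, mutated))
    return variants


placeholder = "ACGT" * 15  # 60 positions, purely illustrative
variants = single_mutants(placeholder)
print(len(variants), "single-nucleotide variants")
```

For a sequence of length L this produces 3L variants, which is why a 60-position region gives the 180 single mutants reported.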


Subjects
Alternative Splicing , Exons , Mutation , Survival of Motor Neuron 1 Protein/genetics , Cell Line , Computational Biology/methods , High-Throughput Nucleotide Sequencing , Humans , Molecular Dynamics Simulation , Mutagenesis , Nucleic Acid Conformation , Protein Binding , RNA Splice Sites , Small Nuclear RNA/chemistry , Small Nuclear RNA/genetics , Small Nuclear RNA/metabolism , Survival of Motor Neuron 1 Protein/chemistry , Survival of Motor Neuron 1 Protein/metabolism
19.
BMC Genomics ; 19(1): 237, 2018 Apr 05.
Article in English | MEDLINE | ID: mdl-29618315

ABSTRACT

BACKGROUND: An exceedingly large number of sequence variants have been discovered through whole genome sequencing in most populations, including cattle. Deciphering which of these affect complex traits is a major challenge. In this study we hypothesize that variants in some functional classes, such as splice site regions, coding regions, DNA methylated regions and long noncoding RNA, will explain more variance in complex traits than others. Two variance component approaches were used to test this hypothesis: the first determines whether variants in a functional class capture a greater proportion of the variance than expected by chance; the second uses the proportion of variance explained when variants in all annotations are fitted simultaneously. RESULTS: Our data set consisted of 28.3 million imputed whole genome sequence variants in 16,581 dairy cattle with records for 6 complex trait phenotypes, including production and fertility. We found that sequence variants in splice site regions and synonymous classes captured the greatest proportion of the variance, explaining up to 50% of the variance across all traits. We also found that sequence variants in target sites for DNA methylation (genomic regions that are found to be highly methylated in bovine placentas) captured a significant proportion of the variance. Per sequence variant, splice site variants explained the highest proportion of variance in this study. The proportion of variance captured by the missense predicted deleterious (from SIFT) and missense tolerated classes was relatively small. CONCLUSION: The results demonstrate that using functional annotations to filter whole genome sequence variants into more informative subsets could be useful for prioritizing the variants that are more likely to be associated with complex traits. In addition to variants found in splice sites and protein coding genes, regulatory variants and those found in DNA methylated regions explained considerable variation in milk production and fertility traits. In our analysis synonymous variants captured a significant proportion of the variance, which raises the possibility that synonymous mutations have some effect or, more likely, that these variants are mis-annotated; alternatively, the results may reflect imperfect imputation of the actual causative variants.


Subjects
Gene Regulatory Networks , Genetic Variation , Quantitative Trait Loci , Whole Genome Sequencing/veterinary , Animals , Cattle , Female , Fertility , Gene Frequency , Molecular Sequence Annotation , Pregnancy , RNA Splice Sites
20.
Hum Genomics ; 11(1): 7, 2017 05 04.
Article in English | MEDLINE | ID: mdl-28472998

ABSTRACT

BACKGROUND: SPINK1 (serine protease inhibitor, Kazal-type, 1), which encodes human pancreatic secretory trypsin inhibitor, is one of the most extensively studied genes underlying chronic pancreatitis. Recently, based upon data from qualitative reverse transcription-PCR (RT-PCR) analyses of transfected HEK293T cells, we concluded that 24 studied SPINK1 intronic variants were not of pathological significance, the sole exceptions being two canonical splice site variants (i.e., c.87+1G>A and c.194+2T>C). Herein, we employed the splicing prediction tools included within the Alamut software suite to prioritize the 'non-pathological' SPINK1 intronic variants for further quantitative RT-PCR analysis. RESULTS: Although our results demonstrated the utility of in silico prediction in classifying and prioritizing intronic variants, we made two observations worth noting. First, we established that most of the prediction tools employed ignored the general rule that GC is a weaker donor splice site than the canonical GT site. This finding is potentially important because, for a given disease gene, a GC variant donor splice site may be associated with a milder clinical manifestation. Second, the non-pathological c.194+13T>G variant was consistently predicted by different programs to generate a new and viable donor splice site, the prediction scores being comparable to those for the physiological c.194+2T donor splice site and even higher than those for the physiological c.87+1G donor splice site. We do, however, provide convincing in vitro evidence that the predicted donor splice site was in fact entirely spurious. CONCLUSIONS: Our findings, taken together, serve to emphasize the importance of functional analysis in helping to establish or refute the pathogenicity of specific intronic variants.


Subjects
Pancreatitis/genetics , RNA Splice Sites , RNA, Messenger/genetics , Trypsin Inhibitor, Kazal Pancreatic/genetics , Computer Simulation , Genetic Variation , Introns , RNA Stability , RNA, Messenger/metabolism , Reverse Transcriptase Polymerase Chain Reaction , Software