Results 1 - 20 of 44
1.
J Biomed Sci ; 31(1): 27, 2024 Feb 28.
Article in English | MEDLINE | ID: mdl-38419051

ABSTRACT

BACKGROUND: Long non-coding RNAs (lncRNAs) are pivotal players in cellular processes, and their unique cell-type specific expression patterns render them attractive biomarkers and therapeutic targets. Yet, the functional roles of most lncRNAs remain enigmatic. To address the need to identify new druggable lncRNAs, we developed a comprehensive approach integrating transcription factor binding data with other genetic features to generate a machine learning model, which we have called INFLAMeR (Identifying Novel Functional LncRNAs with Advanced Machine Learning Resources). METHODS: INFLAMeR was trained on high-throughput CRISPR interference (CRISPRi) screens across seven cell lines, and the algorithm was based on 71 genetic features. To validate the predictions, we selected candidate lncRNAs in the human K562 leukemia cell line and determined the impact of their knockdown (KD) on cell proliferation and chemotherapeutic drug response. We further performed transcriptomic analysis for candidate genes. Based on these findings, we assessed the lncRNA small nucleolar RNA host gene 6 (SNHG6) for its role in myeloid differentiation. Finally, we established a mouse K562 leukemia xenograft model to determine whether SNHG6 KD attenuates tumor growth in vivo. RESULTS: The INFLAMeR model successfully reconstituted CRISPRi screening data and predicted functional lncRNAs that were previously overlooked. Intensive cell-based and transcriptomic validation of nearly fifty genes in K562 revealed cell type-specific functionality for 85% of the predicted lncRNAs. In this respect, our cell-based and transcriptomic analyses predicted a role for SNHG6 in hematopoiesis and leukemia. Consistent with its predicted role in hematopoietic differentiation, SNHG6 transcription is regulated by hematopoiesis-associated transcription factors. SNHG6 KD reduced the proliferation of leukemia cells and sensitized them to differentiation. Treatment of K562 leukemic cells with hemin and PMA, respectively, demonstrated that SNHG6 inhibits red blood cell differentiation but strongly promotes megakaryocyte differentiation. Using a xenograft mouse model, we demonstrate that SNHG6 KD attenuated tumor growth in vivo. CONCLUSIONS: Our approach not only improved the identification and characterization of functional lncRNAs through genomic approaches in a cell type-specific manner, but also identified new lncRNAs with roles in hematopoiesis and leukemia. Such approaches can be readily applied to identify novel targets for precision medicine.


Subjects
Leukemia; RNA, Long Noncoding; Animals; Humans; Mice; Cell Differentiation/genetics; Cell Line, Tumor; Cell Proliferation/genetics; Gene Expression Regulation, Neoplastic; Genomics; Leukemia/genetics; RNA, Long Noncoding/genetics; RNA, Long Noncoding/metabolism
2.
BMC Genomics ; 23(1): 402, 2022 May 26.
Article in English | MEDLINE | ID: mdl-35619054

ABSTRACT

CRISPR-Cas9 screening libraries have arisen as a powerful tool to identify protein-coding (pc) and non-coding genes that play a role in different processes. In particular, the use of a nuclease-active Cas9 coupled to a single gRNA has proven to efficiently impair the expression of pc-genes by generating deleterious frameshifts. Here, we first demonstrate that targeting the same gene simultaneously with two guide RNAs (paired guide RNAs, pgRNAs) synergistically enhances the capacity of the CRISPR-Cas9 system to knock out pc-genes. We next design a library to target, in parallel, pc-genes and lncRNAs known to change expression during the transdifferentiation from pre-B cells to macrophages. We show that this system is able to identify known players in this process, and also predicts 26 potential novel ones, of which we select four (two pc-genes and two lncRNAs) for deeper characterization. Our results suggest that, in the case of the candidate lncRNAs, their impact on transdifferentiation may actually be mediated by enhancer regions at the targeted loci, rather than by the lncRNA transcripts themselves. CRISPR-Cas9 coupled to a pgRNA system is, therefore, a suitable tool to simultaneously target pc-genes and lncRNAs in genomic perturbation assays.


Subjects
RNA, Guide, Kinetoplastida; RNA, Long Noncoding; CRISPR-Cas Systems; Cell Transdifferentiation; Humans; Macrophages; RNA, Guide, Kinetoplastida/genetics; RNA, Long Noncoding/genetics
3.
Genes (Basel) ; 12(6)2021 05 27.
Article in English | MEDLINE | ID: mdl-34072165

ABSTRACT

This study investigated whether genetic factors involved in Alzheimer's disease (AD) are associated with enlargement of perivascular spaces (ePVS) in the brain. A total of 680 participants with T2-weighted MRI scans and genetic information were drawn from the ALFA study. ePVS in the basal ganglia (BG) and the centrum semiovale (CS) were assessed based on a validated visual rating scale. We used univariate and multivariate logistic regression models to investigate associations between ePVS in BG and CS with BIN1-rs744373, as well as APOE genotypes. We found a significant association between the BIN1-rs744373 polymorphism and the CS subscale (p value = 0.019; OR = 2.564), suggesting that G allele carriers have an increased risk of ePVS in comparison with A allele carriers. In stratified analysis by APOE-ε4 status (carriers vs. non-carriers), these results remained significant only for ε4 carriers (p value = 0.011; OR = 1.429). To our knowledge, the present study is the first to suggest that genetic predisposition for AD is associated with ePVS in CS. These findings provide evidence that underlying biological processes affecting AD may influence CS-ePVS.


Subjects
Alzheimer Disease/genetics; Genetic Predisposition to Disease; Glymphatic System/diagnostic imaging; Adaptor Proteins, Signal Transducing/genetics; Aged; Apolipoproteins E/genetics; Female; Humans; Male; Middle Aged; Nuclear Proteins/genetics; Pedigree; Polymorphism, Single Nucleotide; Tumor Suppressor Proteins/genetics
4.
Nat Commun ; 12(1): 2300, 2021 04 16.
Article in English | MEDLINE | ID: mdl-33863890

ABSTRACT

The ability of nucleic acids to form double-stranded structures is essential for all living systems on Earth. Current knowledge on functional RNA structures is focused on locally-occurring base pairs. However, crosslinking and proximity ligation experiments demonstrated that long-range RNA structures are highly abundant. Here, we present the most complete to-date catalog of conserved complementary regions (PCCRs) in human protein-coding genes. PCCRs tend to occur within introns, suppress intervening exons, and obstruct cryptic and inactive splice sites. Double-stranded structure of PCCRs is supported by decreased icSHAPE nucleotide accessibility, high abundance of RNA editing sites, and frequent occurrence of forked eCLIP peaks. Introns with PCCRs show a distinct splicing pattern in response to RNAPII slowdown suggesting that splicing is widely affected by co-transcriptional RNA folding. The enrichment of 3'-ends within PCCRs raises the intriguing hypothesis that coupling between RNA folding and splicing could mediate co-transcriptional suppression of premature pre-mRNA cleavage and polyadenylation.


Subjects
Base Pairing/physiology; DNA, Complementary/genetics; RNA Precursors/metabolism; RNA Splicing/physiology; A549 Cells; Base Sequence/genetics; Conserved Sequence/physiology; Gene Library; Hep G2 Cells; Humans; Introns/genetics; Polyadenylation; RNA Folding/physiology; RNA Precursors/genetics; RNA-Seq
5.
Pharmacol Res ; 161: 105249, 2020 11.
Article in English | MEDLINE | ID: mdl-33068730

ABSTRACT

The molecular complexity of human breast cancer (BC) renders the clinical management of the disease challenging. Long non-coding RNAs (lncRNAs) are promising biomarkers for BC patient stratification, early detection, and disease monitoring. Here, we identified the involvement of the long intergenic non-coding RNA 01087 (LINC01087) in breast oncogenesis. LINC01087 appeared significantly downregulated in triple-negative BCs (TNBCs) and upregulated in the luminal BC subtypes in comparison to mammary samples from cancer-free women and matched normal cancer pairs. Interestingly, deregulation of LINC01087 accurately distinguished luminal from TNBC specimens, independently of the clinicopathological parameters, and of the histological and TP53 or BRCA1/2 mutational status. Moreover, increased expression of LINC01087 predicted a better prognosis in luminal BCs, while TNBC tumors that harbored lower levels of LINC01087 were associated with reduced relapse-free survival. Furthermore, bioinformatics analyses were performed on TNBC and luminal BC samples and suggested that the putative tumor suppressor activity of LINC01087 may rely on interference with pathways involved in cell survival, proliferation, adhesion, invasion, inflammation and drug sensitivity. Altogether, these data suggest that the assessment of LINC01087 deregulation could represent a novel, specific and promising biomarker not only for the diagnosis and prognosis of luminal BC subtypes and TNBCs, but also as a predictive biomarker of pharmacological interventions.


Subjects
Biomarkers, Tumor/metabolism; Breast Neoplasms/metabolism; RNA, Long Noncoding/metabolism; Triple Negative Breast Neoplasms/metabolism; Biomarkers, Tumor/genetics; Breast Neoplasms/drug therapy; Breast Neoplasms/genetics; Breast Neoplasms/pathology; Female; Gene Expression Profiling; Gene Expression Regulation, Neoplastic; Gene Regulatory Networks; Humans; MCF-7 Cells; Neoplasm Metastasis; Neoplasm Recurrence, Local; Progression-Free Survival; Protein Interaction Maps; RNA, Long Noncoding/genetics; Signal Transduction; Time Factors; Transcriptome; Triple Negative Breast Neoplasms/drug therapy; Triple Negative Breast Neoplasms/genetics; Triple Negative Breast Neoplasms/pathology
6.
PLoS Comput Biol ; 16(10): e1008349, 2020 10.
Article in English | MEDLINE | ID: mdl-33075075

ABSTRACT

The development of increasingly sophisticated methods to acquire high-resolution images has led to the generation of large collections of biomedical imaging data, including images of tissues and organs. Many of the current machine learning methods that aim to extract biological knowledge from histopathological images require several data preprocessing stages, creating an overhead before the proper analysis. Here we present PyHIST (https://github.com/manuel-munoz-aguirre/PyHIST), an easy-to-use, open source whole slide histological image tissue segmentation and preprocessing command-line tool aimed at tile generation for machine learning applications. From a given input image, the PyHIST pipeline i) optionally rescales the image to a different resolution, ii) produces a mask for the input image which separates the background from the tissue, and iii) generates individual image tiles with tissue content.


Subjects
Histocytochemistry/methods; Image Interpretation, Computer-Assisted/methods; Machine Learning; Software; Computational Biology; Humans; Neoplasms/diagnostic imaging; Skin/diagnostic imaging
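
The three stages named in the abstract above (rescale, mask, tile) can be illustrated with a toy sketch. This is not PyHIST's code or its command-line interface: the function names, the brightness threshold, and the minimum tissue fraction are all assumptions made for illustration, and a simple brightness cutoff stands in for whatever segmentation the tool actually performs.

```python
def downscale(image, factor):
    """Downscale a 2-D grayscale image (list of rows) by block averaging."""
    rows = len(image) // factor
    cols = len(image[0]) // factor
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = [image[r * factor + i][c * factor + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def tissue_mask(image, background_threshold=0.8):
    """Mark pixels darker than the (assumed) threshold as tissue."""
    return [[pixel < background_threshold for pixel in row] for row in image]


def tiles_with_tissue(mask, tile_size, min_tissue_fraction=0.5):
    """Return the (row, col) origin of each non-overlapping tile with enough tissue."""
    kept = []
    for r in range(0, len(mask) - tile_size + 1, tile_size):
        for c in range(0, len(mask[0]) - tile_size + 1, tile_size):
            tissue = sum(mask[r + i][c + j]
                         for i in range(tile_size) for j in range(tile_size))
            if tissue / tile_size ** 2 >= min_tissue_fraction:
                kept.append((r, c))
    return kept


# Toy "slide": white background (1.0) with a dark square of tissue (0.2).
slide = [[0.2 if 16 <= r < 48 and 16 <= c < 48 else 1.0 for c in range(64)]
         for r in range(64)]

small = downscale(slide, 2)                  # i) rescale to 32 x 32
mask = tissue_mask(small)                    # ii) separate tissue from background
kept = tiles_with_tissue(mask, tile_size=8)  # iii) keep tiles containing tissue
print(kept)
```

Only the four central tiles overlap the dark square, so only their origins are kept; background-only tiles are dropped before any downstream analysis.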
7.
Genome Res ; 30(7): 1047-1059, 2020 07.
Article in English | MEDLINE | ID: mdl-32759341

ABSTRACT

We have produced RNA sequencing data for 53 primary cells from different locations in the human body. The clustering of these primary cells reveals that most cells in the human body share a few broad transcriptional programs, which define five major cell types: epithelial, endothelial, mesenchymal, neural, and blood cells. These act as basic components of many tissues and organs. Based on gene expression, these cell types redefine the basic histological types by which tissues have been traditionally classified. We identified genes whose expression is specific to these cell types, and from these genes, we estimated the contribution of the major cell types to the composition of human tissues. We found this cellular composition to be a characteristic signature of tissues and to reflect tissue morphological heterogeneity and histology. We identified changes in cellular composition in different tissues associated with age and sex, and found that departures from the normal cellular composition correlate with histological phenotypes associated with disease.


Subjects
Transcription, Genetic; Cell Line; Endothelial Cells/metabolism; Epithelial Cells/metabolism; Female; Gene Expression Profiling; Gynecomastia/genetics; Gynecomastia/metabolism; Humans; Male; Mesoderm/cytology; Mesoderm/metabolism; Neoplasms/genetics; Organ Specificity; Sequence Analysis, RNA
8.
Genome Med ; 12(1): 49, 2020 05 27.
Article in English | MEDLINE | ID: mdl-32460841

ABSTRACT

BACKGROUND: Mosaic mutations acquired during early embryogenesis can lead to severe early-onset genetic disorders and cancer predisposition, but are often undetectable in blood samples. The rate and mutational spectrum of embryonic mosaic mutations (EMMs) have only been studied in few tissues, and their contribution to genetic disorders is unknown. Therefore, we investigated how frequently mosaic mutations occur during embryogenesis across all germ layers and tissues. METHODS: Mosaic mutation detection in 49 normal tissues from 570 individuals (Genotype-Tissue Expression (GTEx) cohort) was performed using a newly developed multi-tissue, multi-individual variant calling approach for RNA-seq data. Our method allows for reliable identification of EMMs and the developmental stage during which they appeared. RESULTS: The analysis of EMMs in 570 individuals revealed that newborns on average harbor 0.5-1 EMMs in the exome affecting multiple organs (1.3230 × 10⁻⁸ per nucleotide per individual), a similar frequency as reported for germline de novo mutations. Our multi-tissue, multi-individual study design allowed us to distinguish mosaic mutations acquired during different stages of embryogenesis and adult life, as well as to provide insights into the rate and spectrum of mosaic mutations. We observed that EMMs are dominated by a mutational signature associated with spontaneous deamination of methylated cytosines and the number of cell divisions. After birth, cells continue to accumulate somatic mutations, which can lead to the development of cancer. Investigation of the mutational spectrum of the gastrointestinal tract revealed a mutational pattern associated with the food-borne carcinogen aflatoxin, a signature that has so far only been reported in liver cancer. CONCLUSIONS: In summary, our multi-tissue, multi-individual study reveals a surprisingly high number of embryonic mosaic mutations in coding regions, implying novel hypotheses and diagnostic procedures for investigating genetic causes of disease and cancer predisposition.


Subjects
Embryonic Development/genetics; Mosaicism; Humans; RNA-Seq; Exome Sequencing
9.
Noncoding RNA ; 6(1)2020 Mar 13.
Article in English | MEDLINE | ID: mdl-32182990

ABSTRACT

The biological role and therapeutic potential of long non-coding RNAs (lncRNAs) in chronic lymphocytic leukemia (CLL) are still open questions. Herein, we investigated the significance of the lncRNA NEAT1 in CLL. We examined NEAT1 expression in 310 newly diagnosed Binet A patients, in normal CD19+ B-cells, and other types of B-cell malignancies. Although global NEAT1 expression level was not statistically different in CLL cells compared to normal B cells, the median ratio of NEAT1_2 long isoform and global NEAT1 expression in CLL samples was significantly higher than in other groups. NEAT1_2 was more expressed in patients carrying mutated IGHV genes. Concerning cytogenetic aberrations, NEAT1_2 expression in CLL with trisomy 12 was lower with respect to patients without alterations. Although global NEAT1 expression appeared not to be associated with clinical outcome, patients with the lowest NEAT1_2 expression displayed the shortest time to first treatment; however, a multivariate regression analysis showed that the NEAT1_2 risk model was not independent from other known prognostic factors, particularly the IGHV mutational status. Overall, our data prompt future studies to investigate whether the increased amount of the long NEAT1_2 isoform detected in CLL cells may have a specific role in the pathology of the disease.

11.
Nat Genet ; 50(9): 1327-1334, 2018 09.
Article in English | MEDLINE | ID: mdl-30127527

ABSTRACT

Coding variants represent many of the strongest associations between genotype and phenotype; however, they exhibit inter-individual differences in effect, termed 'variable penetrance'. Here, we study how cis-regulatory variation modifies the penetrance of coding variants. Using functional genomic and genetic data from the Genotype-Tissue Expression Project (GTEx), we observed that in the general population, purifying selection has depleted haplotype combinations predicted to increase pathogenic coding variant penetrance. Conversely, in cancer and autism patients, we observed an enrichment of penetrance increasing haplotype configurations for pathogenic variants in disease-implicated genes, providing evidence that regulatory haplotype configuration of coding variants affects disease risk. Finally, we experimentally validated this model by editing a Mendelian single-nucleotide polymorphism (SNP) using CRISPR/Cas9 on distinct expression haplotypes with the transcriptome as a phenotypic readout. Our results demonstrate that joint regulatory and coding variant effects are an important part of the genetic architecture of human traits and contribute to modified penetrance of disease-causing variants.


Subjects
Disease/genetics; Genetic Predisposition to Disease; Polymorphism, Single Nucleotide; CRISPR-Cas Systems; Genome, Human; Haplotypes; Humans; Phenotype; Quantitative Trait Loci; Transcriptome
12.
Nat Med ; 24(6): 868-880, 2018 06.
Article in English | MEDLINE | ID: mdl-29785028

ABSTRACT

Chronic lymphocytic leukemia (CLL) is a frequent hematological neoplasm in which underlying epigenetic alterations are only partially understood. Here, we analyze the reference epigenome of seven primary CLLs and the regulatory chromatin landscape of 107 primary cases in the context of normal B cell differentiation. We identify that the CLL chromatin landscape is largely influenced by distinct dynamics during normal B cell maturation. Beyond this, we define extensive catalogues of regulatory elements de novo reprogrammed in CLL as a whole and in its major clinico-biological subtypes classified by IGHV somatic hypermutation levels. We uncover that IGHV-unmutated CLLs harbor more active and open chromatin than IGHV-mutated cases. Furthermore, we show that de novo active regions in CLL are enriched for NFAT, FOX and TCF/LEF transcription factor family binding sites. Although most genetic alterations are not associated with consistent epigenetic profiles, CLLs with MYD88 mutations and trisomy 12 show distinct chromatin configurations. Furthermore, we observe that non-coding mutations in IGHV-mutated CLLs are enriched in H3K27ac-associated regulatory elements outside accessible chromatin. Overall, this study provides an integrative portrait of the CLL epigenome, identifies extensive networks of altered regulatory elements and sheds light on the relationship between the genetic and epigenetic architecture of the disease.


Subjects
Chromatin/metabolism; Epigenomics; Leukemia, Lymphocytic, Chronic, B-Cell/genetics; B-Lymphocytes/metabolism; Base Sequence; Cohort Studies; Humans
13.
PLoS Comput Biol ; 13(3): e1005341, 2017 03.
Article in English | MEDLINE | ID: mdl-28253259

ABSTRACT

CRISPR-Cas9 technology can be used to engineer precise genomic deletions with pairs of single guide RNAs (sgRNAs). This approach has been widely adopted for diverse applications, from disease modelling of individual loci, to parallelized loss-of-function screens of thousands of regulatory elements. However, no solution has been presented for the unique bioinformatic design requirements of CRISPR deletion. We here present CRISPETa, a pipeline for flexible and scalable paired sgRNA design based on an empirical scoring model. Multiple sgRNA pairs are returned for each target, and any number of targets can be analyzed in parallel, making CRISPETa equally useful for focussed or high-throughput studies. Fast run-times are achieved using a pre-computed off-target database. sgRNA pair designs are output in a convenient format for visualisation and oligonucleotide ordering. We present pre-designed, high-coverage library designs for entire classes of protein-coding and non-coding elements in human, mouse, zebrafish, Drosophila melanogaster and Caenorhabditis elegans. In human cells, we reproducibly observe deletion efficiencies of ≥50% for CRISPETa designs targeting an enhancer and exonic fragment of the MALAT1 oncogene. In the latter case, deletion results in production of desired, truncated RNA. CRISPETa will be useful for researchers seeking to harness CRISPR for targeted genomic deletion, in a variety of model organisms, from single-target to high-throughput scales.


Subjects
CRISPR-Associated Proteins/genetics; Gene Deletion; Gene Editing/methods; Gene Knockdown Techniques/methods; RNA Editing/genetics; RNA, Guide, Kinetoplastida/genetics
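
The abstract above says that multiple scored sgRNA pairs are returned for each deletion target. A minimal sketch of that idea follows, under loud assumptions: the candidate names, the per-guide scores, and the rule of multiplying the two scores are all hypothetical, chosen only to show the pairing-and-ranking step. CRISPETa's empirical scoring model and off-target filtering are not described in the abstract and are not reproduced here.

```python
from itertools import product


def design_pairs(upstream, downstream, n_pairs=2):
    """Rank every upstream x downstream sgRNA combination; return the best n_pairs.

    Each candidate is a (name, score) tuple with a score in [0, 1].
    Multiplying the two scores is an illustrative assumption, not CRISPETa's model.
    """
    scored = [(u[0], d[0], u[1] * d[1]) for u, d in product(upstream, downstream)]
    scored.sort(key=lambda pair: pair[2], reverse=True)
    return scored[:n_pairs]


# Hypothetical candidates flanking one target region.
upstream = [("up1", 0.9), ("up2", 0.6)]
downstream = [("down1", 0.8), ("down2", 0.5)]

best = design_pairs(upstream, downstream, n_pairs=2)
for up_name, down_name, pair_score in best:
    print(up_name, down_name, round(pair_score, 2))
```

Returning several ranked pairs per target, rather than a single best pair, matches the abstract's point that the same design step serves both focused and high-throughput studies.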
14.
PLoS Comput Biol ; 13(2): e1005383, 2017 02.
Article in English | MEDLINE | ID: mdl-28192430

ABSTRACT

Selenocysteine (Sec) is known as the 21st amino acid, a cysteine analogue with selenium replacing sulphur. Sec is inserted co-translationally in a small fraction of proteins called selenoproteins. In selenoprotein genes, the Sec specific tRNA (tRNASec) drives the recoding of highly specific UGA codons from stop signals to Sec. Although found in organisms from the three domains of life, Sec is not universal. Many species are completely devoid of selenoprotein genes and lack the ability to synthesize Sec. Since tRNASec is a key component in selenoprotein biosynthesis, its efficient identification in genomes is instrumental to characterize the utilization of Sec across lineages. Available tRNA prediction methods fail to accurately predict tRNASec, due to its unusual structural fold. Here, we present Secmarker, a method based on manually curated covariance models capturing the specific tRNASec structure in archaea, bacteria and eukaryotes. We exploited the non-universality of Sec to build a proper benchmark set for tRNASec predictions, which is not possible for the predictions of other tRNAs. We show that Secmarker greatly improves the accuracy of previously existing methods constituting a valuable tool to identify tRNASec genes, and to efficiently determine whether a genome contains selenoproteins. We used Secmarker to analyze a large set of fully sequenced genomes, and the results revealed new insights in the biology of tRNASec, led to the discovery of a novel bacterial selenoprotein family, and shed additional light on the phylogenetic distribution of selenoprotein containing genomes. Secmarker is freely accessible for download, or online analysis through a web server at http://secmarker.crg.cat.


Subjects
Chromosome Mapping/methods; Genetic Markers/genetics; Genome/genetics; High-Throughput Screening Assays/methods; RNA, Transfer, Amino Acid-Specific/genetics; RNA, Transfer, Amino Acyl/genetics; Algorithms; Genome Components/genetics; Selenocysteine
15.
Sci Rep ; 7: 41544, 2017 01 27.
Article in English | MEDLINE | ID: mdl-28128360

ABSTRACT

Long noncoding RNAs (lncRNAs) represent a vast unexplored genetic space that may hold missing drivers of tumourigenesis, but few such "driver lncRNAs" are known. Until now, they have been discovered through changes in expression, leading to problems in distinguishing between causative roles and passenger effects. We here present a different approach for driver lncRNA discovery using mutational patterns in tumour DNA. Our pipeline, ExInAtor, identifies genes with excess load of somatic single nucleotide variants (SNVs) across panels of tumour genomes. Heterogeneity in mutational signatures between cancer types and individuals is accounted for using a simple local trinucleotide background model, which yields high precision and low computational demands. We use ExInAtor to predict drivers from the GENCODE annotation across 1112 entire genomes from 23 cancer types. Using a stratified approach, we identify 15 high-confidence candidates: 9 novel and 6 known cancer-related genes, including MALAT1, NEAT1 and SAMMSON. Both known and novel driver lncRNAs are distinguished by elevated gene length, evolutionary conservation and expression. We have presented a first catalogue of mutated lncRNA genes driving cancer, which will grow and improve with the application of ExInAtor to future tumour genome projects.


Subjects
Genome, Human; Genomics; Neoplasms/genetics; Oncogenes; RNA, Long Noncoding/genetics; Alternative Splicing; Biomarkers, Tumor; Computational Biology/methods; Databases, Genetic; Gene Expression Profiling; Genomics/methods; Humans; Mutation; Neoplasms/diagnosis; Open Reading Frames; Polymorphism, Single Nucleotide
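
The idea of testing a gene for excess mutational load against a local trinucleotide background, as described in the abstract above, can be sketched as follows. This is an illustration only and not ExInAtor's implementation: the context counts are invented, and the Poisson upper-tail test is a stand-in chosen for simplicity, since the abstract does not state which statistical test the pipeline uses.

```python
import math


def expected_mutations(exon_sites, background_sites, background_mutations):
    """Expected exonic mutation count if exons mutated at the local background rate.

    All arguments map a trinucleotide context (e.g. "ACA") to a count.
    The rate is estimated per context, so a context-biased mutational
    signature does not by itself make a gene look enriched.
    """
    expected = 0.0
    for context, n_exon_sites in exon_sites.items():
        n_background = background_sites.get(context, 0)
        if n_background == 0:
            continue  # no local estimate available for this context
        rate = background_mutations.get(context, 0) / n_background
        expected += rate * n_exon_sites
    return expected


def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected); a stand-in significance test."""
    below = sum(math.exp(-expected) * expected ** k / math.factorial(k)
                for k in range(observed))
    return 1.0 - below


# Invented counts for one hypothetical lncRNA gene and its flanking background.
exon_sites = {"ACA": 100, "TCG": 50}
background_sites = {"ACA": 10000, "TCG": 5000}
background_mutations = {"ACA": 10, "TCG": 20}

expected = expected_mutations(exon_sites, background_sites, background_mutations)
p_value = poisson_upper_tail(observed=4, expected=expected)
print(round(expected, 3), p_value)
```

In this toy case about 0.3 exonic mutations are expected, so observing four yields a small tail probability; a real analysis would also correct for testing many genes.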
16.
BMC Genomics ; 18(1): 7, 2017 01 03.
Article in English | MEDLINE | ID: mdl-28049418

ABSTRACT

BACKGROUND: Chimeric transcripts are commonly defined as transcripts linking two or more different genes in the genome, and can be explained by various biological mechanisms such as genomic rearrangement, read-through or trans-splicing, but also by technical or biological artefacts. Several studies have shown their importance in cancer, cell pluripotency and motility. Many programs have recently been developed to identify chimeras from Illumina RNA-seq data (mostly fusion genes in cancer). However, outputs of different programs on the same dataset can be widely inconsistent, and tend to include many false positives. Other issues relate to simulated datasets restricted to fusion genes, real datasets with limited numbers of validated cases, result inconsistencies between simulated and real datasets, and gene rather than junction level assessment. RESULTS: Here we present ChimPipe, a modular and easy-to-use method to reliably identify fusion genes and transcription-induced chimeras from paired-end Illumina RNA-seq data. We have also produced realistic simulated datasets for three different read lengths, and enhanced two gold-standard cancer datasets by associating exact junction points to validated gene fusions. Benchmarking ChimPipe together with four other state-of-the-art tools on this data showed ChimPipe to be the top program at identifying exact junction coordinates for both kinds of datasets, and the one showing the best trade-off between sensitivity and precision. Applied to 106 ENCODE human RNA-seq datasets, ChimPipe identified 137 high confidence chimeras connecting the protein coding sequence of their parent genes. In subsequent experiments, three out of four predicted chimeras, two of which were recurrently expressed in a large majority of the samples, could be validated. Cloning and sequencing of the three cases revealed several new chimeric transcript structures, three of which had the potential to encode a chimeric protein for which we hypothesized a new role. Applying ChimPipe to human and mouse ENCODE RNA-seq data led to the identification of 131 recurrent chimeras common to both species, and therefore potentially conserved. CONCLUSIONS: ChimPipe combines discordant paired-end reads and split-reads to detect any kind of chimeras, including those originating from polymerase read-through, and shows an excellent trade-off between sensitivity and precision. The chimeras found by ChimPipe can be validated in vitro with high accuracy.


Subjects
Oncogene Proteins, Fusion; Recombination, Genetic; Software; Transcription, Genetic; Animals; Computational Biology/methods; Computer Simulation; Genomics/methods; High-Throughput Nucleotide Sequencing; Humans; Mice; Reproducibility of Results; Sequence Analysis, RNA
17.
Open Biol ; 6(11)2016 11.
Article in English | MEDLINE | ID: mdl-27881738

ABSTRACT

Dynamic redefinition of the 10 UGAs in human and mouse selenoprotein P (Sepp1) mRNAs to specify selenocysteine instead of termination involves two 3' UTR structural elements (SECIS) and is regulated by selenium availability. In addition to the previously known human Sepp1 mRNA poly(A) addition site just 3' of SECIS 2, two further sites were identified with one resulting in 10-25% of the mRNA lacking SECIS 2. To address function, mutant mice were generated with either SECIS 1 or SECIS 2 deleted or with the first UGA substituted with a serine codon. They were fed on either high or selenium-deficient diets. The mutants had very different effects on the proportions of shorter and longer product Sepp1 protein isoforms isolated from plasma, and on viability. Spatially and functionally distinctive effects of the two SECIS elements on UGA decoding were inferred. We also bioinformatically identify two selenoprotein S mRNAs with different 5' sequences predicted to yield products with different N-termini. These results provide insights into SECIS function and mRNA processing in selenoprotein isoform diversity.


Subjects
Mutation; RNA, Messenger/metabolism; Selenocysteine/genetics; Selenoprotein P/genetics; 3' Untranslated Regions; Alternative Splicing; Animals; Codon, Terminator; Hep G2 Cells; Humans; Mice; Protein Isoforms/genetics; Selenium/metabolism
18.
Insect Biochem Mol Biol ; 69: 105-14, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26392061

ABSTRACT

The selenium-dependent glutathione peroxidase (SeGPx) is a well-studied enzyme that detoxifies organic and hydrogen peroxides and provides cells or extracellular fluids with a key antioxidant function. The presence of a SeGPx has not been unequivocally demonstrated in insects. In the present work, we identified the gene and studied the function of a Rhodnius prolixus SeGPx (RpSeGPx). The RpSeGPx mRNA presents the UGA codon that encodes the active site selenocysteine (Sec) and a corresponding Sec insertion sequence (SECIS) in the 3' UTR region. The encoded protein includes a signal peptide, which is consistent with the high levels of GPx enzymatic activity in the insect's hemolymph, and clusters phylogenetically with the extracellular mammalian GPx03. This result contrasts with all other known insect GPxs, which use a cysteine residue instead of Sec and cluster with the mammalian phospholipid hydroperoxide GPx04. RpSeGPx is widely expressed in insect organs, with higher expression levels in the fat body. RNA interference (RNAi) was used to reduce RpSeGPx gene expression and GPx activity in the hemolymph. Adult females were apparently unaffected by RpSeGPx RNAi, whereas first instar nymphs showed a three-day delay in ecdysis. Silencing of RpSeGPx did not alter the gene expression of the antioxidant enzymes catalase, xanthine dehydrogenase and a cysteine-GPx, but it reduced the levels of the dual oxidase and NADPH oxidase 5 transcripts that encode for enzymes releasing extracellular hydrogen peroxide/superoxide. Collectively, our data suggest that RpSeGPx functions in the regulation of extracellular (hemolymph) redox homeostasis of R. prolixus.


Subjects
Glutathione Peroxidase/chemistry; Glutathione Peroxidase/genetics; Rhodnius/enzymology; Rhodnius/genetics; Selenium/chemistry; Animals; Female; Inactivation, Metabolic/genetics; Molting; Phylogeny; RNA Interference; Rabbits; Rhodnius/growth & development; Selenocysteine/chemistry
19.
BMC Genomics ; 16: 846, 2015 Oct 23.
Article in English | MEDLINE | ID: mdl-26493208

ABSTRACT

BACKGROUND: CRISPR genome-editing technology makes it possible to quickly and cheaply delete non-protein-coding regulatory elements. We present a vector system adapted for this purpose called DECKO (Double Excision CRISPR Knockout), which applies a simple two-step cloning procedure to generate lentiviral vectors expressing two guide RNAs (gRNAs) simultaneously. The key feature of DECKO is its use of a single 165 bp starting oligonucleotide carrying the variable sequences of both gRNAs, making it fully scalable from single-locus studies to complex library cloning. RESULTS: We apply DECKO to deleting the promoters of one protein-coding gene and two oncogenic lncRNAs, UCA1 and the highly expressed MALAT1, the focus of many previous studies employing RNA interference approaches. DECKO successfully deleted genomic fragments ranging in size from 100 to 3000 bp in four human cell lines. Using a clone-derivation workflow lasting approximately 20 days, we obtained 9 homozygous and 17 heterozygous promoter knockouts in three human cell lines. Frequent target region inversions were observed. These clones have reductions in steady-state MALAT1 RNA levels of up to 98% and display reduced proliferation rates. CONCLUSIONS: We present a dual CRISPR tool, DECKO, which is cloned using a single starting oligonucleotide, thereby affording simplicity and scalability to CRISPR knockout studies of non-coding genomic elements, including long non-coding RNAs.


Subjects
CRISPR-Cas Systems/genetics , Genome , RNA, Guide, Kinetoplastida/genetics , RNA, Long Noncoding/genetics , Chromosome Inversion/genetics , Genetic Vectors , Genomics , Humans , Lentivirus/genetics , Sequence Deletion
20.
Genome Res ; 25(9): 1256-67, 2015 Sep.
Article in English | MEDLINE | ID: mdl-26194102

ABSTRACT

Selenoproteins are proteins that incorporate selenocysteine (Sec), a nonstandard amino acid encoded by UGA, normally a stop codon. Sec synthesis requires the enzyme selenophosphate synthetase (SPS or SelD), conserved in all prokaryotic and eukaryotic genomes encoding selenoproteins. Here, we study the evolutionary history of SPS genes, providing a map of selenoprotein function spanning the whole tree of life. SPS is itself a selenoprotein in many species, although functionally equivalent homologs that replace the Sec site with cysteine (Cys) are common. Many metazoans, however, possess SPS genes with substitutions other than Sec or Cys (collectively referred to as SPS1). Using complementation assays in fly mutants, we show that these genes share a common function, which appears to be distinct from the synthesis of selenophosphate carried out by the Sec- and Cys-SPS genes (termed SPS2), and unrelated to Sec synthesis. We show here that SPS1 genes originated through a number of independent gene duplications from an ancestral metazoan selenoprotein SPS2 gene that most likely already carried the SPS1 function. Thus, in SPS genes, parallel duplications and subsequent convergent subfunctionalization have resulted in the segregation to different loci of functions initially carried by a single gene. This evolutionary history constitutes a remarkable example of emergence and evolution of gene function, which we have been able to trace thanks to the singular features of SPS genes, wherein the amino acid at a single site unequivocally determines protein function and is intertwined with the evolutionary fate of the entire selenoproteome.


Subjects
Biological Evolution , Phosphotransferases/genetics , Phosphotransferases/metabolism , Animals , Biomarkers , Eukaryota/genetics , Eukaryota/metabolism , Gene Duplication , Humans , Insects , Phylogeny , Prokaryotic Cells/metabolism , Selection, Genetic , Selenium/metabolism , Selenoproteins/genetics , Selenoproteins/metabolism , Urochordata , Vertebrates