ABSTRACT
BACKGROUND: Recently, copy number variations (CNV) impacting genes involved in oncogenic pathways have attracted increasing attention as a means to manage disease susceptibility. CNV is one of the most important somatic aberrations in the genome of tumor cells. Oncogene activation and tumor suppressor gene inactivation are often attributed to copy number gain/amplification or deletion, respectively, in many cancer types and stages. Recent advances in next generation sequencing protocols allow for the addition of unique molecular identifiers (UMI) to each read. Each targeted DNA fragment is labeled with a unique random nucleotide sequence added to sequencing primers. UMI are especially useful for CNV detection because they make each DNA molecule in a population of reads distinct. RESULTS: Here, we present molecular Copy Number Alteration (mCNA), a new methodology allowing the detection of copy number changes using UMI. The algorithm is composed of four main steps: the construction of UMI count matrices, the use of control samples to construct a pseudo-reference, the computation of log-ratios, the segmentation and finally the statistical inference of abnormal segmented breaks. We demonstrate the success of mCNA on a dataset of patients suffering from Diffuse Large B-cell Lymphoma and we highlight that mCNA results correlate strongly with comparative genomic hybridization. CONCLUSION: We provide mCNA, a new approach for CNV detection, freely available at https://gitlab.com/pierrejulien.viailly/mcna/ under MIT license. mCNA can significantly improve detection accuracy of CNV changes by using UMI.
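The pseudo-reference and log-ratio steps named in the abstract above can be illustrated with a minimal Python sketch. This is not the mCNA implementation: the function name `log_ratios`, the pseudocount, the normalisation of each profile by its total UMI count, and the use of a per-region median across controls are all assumptions made for illustration only.

```python
import math
from statistics import median


def log_ratios(sample_counts, control_counts, pseudocount=1.0):
    """Return log2(sample / pseudo-reference) for each targeted region.

    sample_counts:  UMI counts per region for the tested sample.
    control_counts: one list of UMI counts per control sample.
    All names and defaults here are hypothetical, not taken from mCNA.
    """
    def normalise(counts):
        # Add a pseudocount to avoid log(0), then scale by the total so
        # that differences in sequencing depth cancel out.
        total = sum(counts) + pseudocount * len(counts)
        return [(c + pseudocount) / total for c in counts]

    normalised_controls = [normalise(c) for c in control_counts]
    # Pseudo-reference: per-region median over the normalised controls.
    pseudo_reference = [median(col) for col in zip(*normalised_controls)]
    normalised_sample = normalise(sample_counts)
    return [
        math.log2(s / r)
        for s, r in zip(normalised_sample, pseudo_reference)
    ]
```

A region with a copy number gain should show a positive log-ratio relative to the other regions; segmentation and statistical testing of the resulting profile, which the abstract lists as later steps, are not covered by this sketch.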
Subjects
Comparative Genomic Hybridization , DNA Copy Number Variations , High-Throughput Nucleotide Sequencing , Adult , Humans , Male , Middle Aged , Prospective Studies , Sequence Analysis, DNA
ABSTRACT
Influenza virus-like particles (VLPs) have been shown to induce a safe and potent immune response through both humoral and cellular responses. They represent promising novel influenza vaccines. Plant-based biotechnology allows for the large-scale production of VLPs of biopharmaceutical interest using different model organisms, including Nicotiana benthamiana plants. Through this platform, influenza VLPs bud from the plasma membrane and accumulate between the membrane and the plant cell wall. To design and optimize efficient production processes, a better understanding of the plant cell wall composition of infiltrated tobacco leaves is of major interest for the plant biotechnology industry. In this study, we have investigated the alteration of the biochemical composition of the cell walls of N. benthamiana leaves subjected to abiotic and biotic stresses induced by the Agrobacterium-mediated transient transformation and the resulting high expression levels of influenza VLPs. Results show that abiotic stress due to vacuum infiltration without Agrobacterium did not induce any detectable modification of the leaf cell wall when compared to non-infiltrated leaves. In contrast, various chemical changes of the leaf cell wall were observed post-Agrobacterium infiltration. Indeed, Agrobacterium infection induced deposition of callose and lignin, modified the pectin methylesterification and increased both arabinosylation of RG-I side chains and the expression of arabinogalactan proteins. Moreover, these modifications were slightly greater in plants expressing haemagglutinin-based VLP than in plants infiltrated with the Agrobacterium strain containing only the p19 suppressor of silencing.
Subjects
Agrobacterium/metabolism , Biotechnology/methods , Cell Wall/metabolism , Hemagglutinins/metabolism , Nicotiana/metabolism , Agrobacterium/genetics , Hemagglutinins/genetics , Influenza Vaccines/genetics , Influenza Vaccines/metabolism , Plants, Genetically Modified/genetics , Plants, Genetically Modified/metabolism , Nicotiana/genetics
ABSTRACT
The RNA exosome is the major 3'-5' RNA degradation machine of eukaryotic cells and participates in processing, surveillance and turnover of both nuclear and cytoplasmic RNA. In both yeast and human, all nuclear functions of the exosome require the RNA helicase MTR4. We show that the Arabidopsis core exosome can associate with two related RNA helicases, AtMTR4 and HEN2. Reciprocal co-immunoprecipitation shows that each of the RNA helicases co-purifies with the exosome core complex and with distinct sets of specific proteins. While AtMTR4 is a predominantly nucleolar protein, HEN2 is located in the nucleoplasm and appears to be excluded from nucleoli. We have previously shown that the major role of AtMTR4 is the degradation of rRNA precursors and rRNA maturation by-products. Here, we demonstrate that HEN2 is involved in the degradation of a large number of polyadenylated nuclear exosome substrates such as snoRNA and miRNA precursors, incompletely spliced mRNAs, and spurious transcripts produced from pseudogenes and intergenic regions. Only a weak accumulation of these exosome substrates is observed in mtr4 mutants, suggesting that MTR4 can contribute to, but plays only a minor role in, the degradation of non-ribosomal RNAs and cryptic transcripts in Arabidopsis. Consistently, transgene post-transcriptional gene silencing (PTGS) is marginally affected in mtr4 mutants, but increased in hen2 mutants, suggesting that it is mostly the nucleoplasmic exosome that degrades aberrant transgene RNAs to limit their entry into the PTGS pathway. Interestingly, HEN2 is conserved throughout green algae, mosses and land plants but absent from metazoans and other eukaryotic lineages. Our data indicate that, in contrast to human and yeast, plants have two functionally specialized RNA helicases that assist the exosome in the degradation of specific nucleolar and nucleoplasmic RNA populations, respectively.
Subjects
Arabidopsis/genetics , Exosomes/metabolism , RNA Helicases/genetics , RNA Stability/genetics , Basic Helix-Loop-Helix Transcription Factors/genetics , Basic Helix-Loop-Helix Transcription Factors/metabolism , Cell Nucleus/genetics , Exosomes/genetics , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , RNA Helicases/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , RNA, Small Nucleolar/genetics , RNA, Small Nucleolar/metabolism
ABSTRACT
Post-translational modification of histones and DNA methylation are important components of chromatin-level control of genome activity in eukaryotes. However, principles governing the combinatorial association of chromatin marks along the genome remain poorly understood. Here, we have generated epigenomic maps for eight histone modifications (H3K4me2 and 3, H3K27me1 and 2, H3K36me3, H3K56ac, H4K20me1 and H2Bub) in the model plant Arabidopsis and we have combined these maps with others, produced under identical conditions, for H3K9me2, H3K9me3, H3K27me3 and DNA methylation. Integrative analysis indicates that these 12 chromatin marks, which collectively cover ~90% of the genome, are present at any given position in a very limited number of combinations. Moreover, we show that the distribution of the 12 marks along the genomic sequence defines four main chromatin states, which preferentially index active genes, repressed genes, silent repeat elements and intergenic regions. Given the compact nature of the Arabidopsis genome, these four indexing states typically translate into short chromatin domains interspersed with each other. This first combinatorial view of the Arabidopsis epigenome points to simple principles of organization as in metazoans and provides a framework for further studies of chromatin-based regulatory mechanisms in plants.
Subjects
Arabidopsis/physiology , Chromatin/metabolism , Epigenesis, Genetic , Gene Expression Regulation, Plant , Arabidopsis/genetics , Arabidopsis Proteins/metabolism , Chromosomes/metabolism , DNA Methylation , Histones/metabolism , Protein Processing, Post-Translational
ABSTRACT
BACKGROUND: Chromatin immunoprecipitation coupled with hybridization to a tiling array (ChIP-chip) is a cost-effective and routinely used method to identify protein-DNA interactions or chromatin/histone modifications. The robust identification of ChIP-enriched regions is frequently complicated by noisy measurements. This identification can be improved by accounting for dependencies between adjacent probes on chromosomes and by modeling biological replicates. RESULTS: MultiChIPmixHMM is a user-friendly R package for analysing ChIP-chip data that models spatial dependencies between directly adjacent probes on a chromosome and enables a simultaneous analysis of replicates. It is based on a linear regression mixture model, designed to perform a joint modeling of immunoprecipitated and input measurements. CONCLUSION: We show the utility of MultiChIPmixHMM by analyzing histone modifications of Arabidopsis thaliana. MultiChIPmixHMM is implemented in R, includes functions in C, and is freely available from the CRAN web site: http://cran.r-project.org.
Subjects
Chromatin Immunoprecipitation/methods , Nucleic Acid Hybridization/methods , Oligonucleotide Array Sequence Analysis/methods , Arabidopsis/genetics , Histones/genetics , Linear Models , Models, Genetic , Plant Proteins/genetics
ABSTRACT
Plant genomes are earmarked with defined patterns of chromatin marks. Little is known about the stability of these epigenomes when related, but distinct genomes are brought together by intra-species hybridization. Arabidopsis thaliana accessions and their reciprocal hybrids were used as a model system to investigate the dynamics of histone modification patterns. The genome-wide distribution of histone modifications H3K4me2 and H3K27me3 in the inbred parental accessions Col-0, C24 and Cvi and their hybrid offspring was compared by chromatin immunoprecipitation in combination with genome tiling array hybridization. The analysis revealed that, in addition to DNA sequence polymorphisms, chromatin modification variations exist among accessions of A. thaliana. The range of these variations was higher for H3K27me3 (typically a repressive mark) than for H3K4me2 (typically an active mark). H3K4me2 and H3K27me3 were rather stable in response to intra-species hybridization, with mainly additive inheritance in hybrid offspring. In conclusion, intra-species hybridization does not result in gross changes to chromatin modifications.
Subjects
Arabidopsis/genetics , Genome, Plant/genetics , Histone-Lysine N-Methyltransferase/metabolism , Histones/metabolism , Polymorphism, Genetic/genetics , Protein Processing, Post-Translational , Arabidopsis/metabolism , Arabidopsis Proteins/metabolism , Chromatin/genetics , Comparative Genomic Hybridization , DNA Transposable Elements/genetics , DNA, Plant/genetics , Epigenesis, Genetic , Gene Expression Regulation, Plant , Hybridization, Genetic , Methylation , Oligonucleotide Array Sequence Analysis
ABSTRACT
Tiling arrays make possible a large-scale exploration of the genome thanks to probes which cover the whole genome with very high density, up to 2,000,000 probes. Biological questions usually addressed are either the expression difference between two conditions or the detection of transcribed regions. In this work, we propose to consider both questions simultaneously as an unsupervised classification problem by modeling the joint distribution of the two conditions. In contrast to previous methods, we account for all available information on the probes as well as biological knowledge such as annotation and spatial dependence between probes. Since probes are not biologically relevant units, we propose a classification rule for non-connected regions covered by several probes. Applications to transcriptomic and ChIP-chip data of Arabidopsis thaliana obtained with a NimbleGen tiling array highlight the importance of a precise modeling and of the region classification. The "TAHMMAnnot" package is implemented in R and C and is freely available from CRAN.
Subjects
Chromatin Immunoprecipitation/methods , Oligonucleotide Array Sequence Analysis/methods , Transcriptome , Arabidopsis/genetics , Chromosomes, Plant/genetics , Comparative Genomic Hybridization , Computational Biology/methods , Computer Simulation , DNA Probes/genetics , Exons , Gene Expression Profiling/methods , Genes, Plant , Histones/analysis , Histones/genetics , Markov Chains , Models, Genetic , Molecular Sequence Annotation , RNA, Plant/genetics
ABSTRACT
MOTIVATION: With Next Generation Sequencing becoming more affordable every year, NGS technologies have established themselves as the fastest and most reliable way to detect Single Nucleotide Variants (SNV) and Copy Number Variations (CNV) in cancer patients. These technologies can be used to sequence DNA at very high depths, thus allowing the detection of abnormalities present in tumor cells at very low frequencies. Multiple variant callers are publicly available and are usually efficient at calling variants. However, when frequencies drop below 1%, the specificity of these tools suffers greatly, as true variants at very low frequencies can easily be confused with sequencing or PCR artifacts. The recent use of Unique Molecular Identifiers (UMI) in NGS experiments has offered a way to accurately separate true variants from artifacts. UMI-based variant callers are slowly replacing raw-read-based variant callers as the standard method for the accurate detection of variants at very low frequencies. However, the benchmarking reported in the tools' publications is usually performed on real biological data in which the true variants are not known, making it difficult to assess their accuracy. RESULTS: We present UMI-Gen, a UMI-based read simulator for targeted sequencing paired-end data. UMI-Gen generates reference reads covering the targeted regions at a user-customizable depth. After that, using a number of control files, it estimates the background error rate at each position and then modifies the generated reads to mimic real biological data. Finally, it inserts real variants into the reads from a list provided by the user. AVAILABILITY: The entire pipeline is available at https://gitlab.com/vincent-sater/umigen under MIT license.
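The idea of estimating a per-position background error rate from controls and then applying it to clean reference reads can be sketched in a few lines of Python. This is a simplified illustration and not the UMI-Gen pipeline: it ignores UMI tags, paired ends and alignment, assumes every control read starts at the first reference position, and the function names `background_error_rates` and `simulate_reads` are hypothetical.

```python
import random


def background_error_rates(reference, control_reads):
    """Fraction of control bases that differ from the reference, per position.

    Assumption: each read is aligned to the start of `reference` and is no
    longer than it.
    """
    mismatches = [0] * len(reference)
    depth = [0] * len(reference)
    for read in control_reads:
        for i, base in enumerate(read):
            depth[i] += 1
            if base != reference[i]:
                mismatches[i] += 1
    return [m / d if d else 0.0 for m, d in zip(mismatches, depth)]


def simulate_reads(reference, rates, depth, rng):
    """Generate `depth` reads, substituting a base with probability rates[i]."""
    reads = []
    for _ in range(depth):
        bases = []
        for ref_base, rate in zip(reference, rates):
            if rng.random() < rate:
                # Replace with one of the three other nucleotides.
                bases.append(rng.choice([b for b in "ACGT" if b != ref_base]))
            else:
                bases.append(ref_base)
        reads.append("".join(bases))
    return reads
```

Passing a seeded `random.Random` instance as `rng` keeps a simulation reproducible, which matters when the simulated reads are used to benchmark variant callers against a known list of inserted variants.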
ABSTRACT
MOTIVATION: Chromatin immunoprecipitation (ChIP) combined with DNA microarray is a high-throughput technology to investigate DNA-protein binding or chromatin/histone modifications. ChIP-chip data require adapted statistical methods in order to identify enriched regions. All methods already proposed are based on the analysis of the log ratio (Ip/Input). Nevertheless, the assumption that the log ratio is a pertinent quantity to assess the probe status is not always verified, and this leads to poor data interpretation. RESULTS: Instead of working on the log ratio, we work directly with the Ip and Input signals of each probe by modeling the distribution of the Ip signal conditional on the Input signal. We propose a method named ChIPmix based on a linear regression mixture model to identify actual binding targets of the protein under study. Moreover, we are able to control the proportion of false positives. The efficiency of ChIPmix is illustrated on several datasets obtained from different organisms and hybridized either on tiling or promoter arrays. This validation shows that ChIPmix is suitable for any two-color array, whatever its density, and provides promising results. AVAILABILITY: The ChIPmix method is implemented in R and is available at http://www.agroparistech.fr/mia/outil_A.html.
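A two-component linear regression mixture of the kind described above can be illustrated by computing, for one probe, the posterior probability of belonging to the enriched component. The sketch below is written in Python rather than R, is not the ChIPmix code, and makes several assumptions: Gaussian errors, a standard deviation shared by both components, parameters that are already known (no EM fitting is shown), and a hypothetical function name `posterior_enriched`.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_enriched(ip, inp, pi, normal, enriched, sd):
    """Posterior probability that a probe is enriched.

    ip, inp:  log Ip and log Input signals of the probe.
    pi:       prior proportion of enriched probes.
    normal:   (intercept, slope) of the regression for non-enriched probes.
    enriched: (intercept, slope) of the regression for enriched probes.
    sd:       residual standard deviation (assumed common to both lines).
    """
    f_normal = (1.0 - pi) * normal_pdf(ip, normal[0] + normal[1] * inp, sd)
    f_enriched = pi * normal_pdf(ip, enriched[0] + enriched[1] * inp, sd)
    return f_enriched / (f_normal + f_enriched)
```

A probe would then be declared enriched when this posterior exceeds a threshold; how that threshold is chosen to control the proportion of false positives is specific to the published method and is not reproduced here.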
Subjects
Algorithms , Chromatin Immunoprecipitation/methods , Microscopy, Fluorescence, Multiphoton/methods , Models, Genetic , Sequence Analysis, DNA/methods , Base Sequence , Computer Simulation , Data Interpretation, Statistical , Models, Statistical , Molecular Sequence Data , Regression Analysis
ABSTRACT
Phaeodactylum tricornutum is the most studied diatom and is encountered principally in unstable coastal environments. It has been hypothesized that the great adaptability of P. tricornutum is probably due to its pleomorphism. Indeed, P. tricornutum is an atypical diatom since it can display three morphotypes: fusiform, triradiate and oval. Currently, little information is available regarding the physiological significance of this morphogenesis. In this study, we adapted the P. tricornutum Pt3 strain to obtain algal cultures particularly enriched in one dominant morphotype: fusiform, triradiate or oval. These cultures were used to run high-throughput RNA-Sequencing. The whole mRNA transcriptome of each morphotype was determined. Pairwise comparisons highlighted biological processes and molecular functions which are up- and down-regulated. Finally, intersection analysis allowed us to identify the specific features of the oval morphotype, which is of particular interest as it is often described as more resistant to stresses. This study represents the first transcriptome-wide characterization of the three morphotypes of P. tricornutum performed on specifically enriched cultures derived from the same Pt3 strain. This work represents an important step towards understanding morphogenesis in P. tricornutum and highlights the particular features of the oval morphotype.
Subjects
Diatoms/genetics , Phenotype , Sequence Analysis, RNA , Diatoms/physiology , Gene Expression Profiling , Stress, Physiological
ABSTRACT
Authors' Reply to the Letter to the Editor by Y. Lynn Wang (MYD88 mutations and sensitivity to ibrutinib therapy).
Subjects
Myeloid Differentiation Factor 88 , Adenine/analogs & derivatives , Mutation , Piperidines , Pyrazoles , Pyrimidines
ABSTRACT
Diffuse large B-cell lymphoma (DLBCL) is the most common non-Hodgkin lymphoma. It includes three major subtypes termed germinal center B-cell-like, activated B-cell-like, and primary mediastinal B-cell lymphoma. With the emergence of novel targeted therapies, accurate methods capable of interrogating this cell-of-origin classification should soon become essential in the clinic. To address this issue, we developed a novel gene expression profiling DLBCL classifier based on reverse transcriptase multiplex ligation-dependent probe amplification. This assay simultaneously evaluates the expression of 21 markers to differentiate primary mediastinal B-cell lymphoma, activated B-cell-like, germinal center B-cell-like, and also Epstein-Barr virus-positive DLBCLs. It was trained using 70 paraffin-embedded biopsies and validated using >160 independent samples. Compared with a reference classification established from Affymetrix U133 + 2 data, reverse transcriptase multiplex ligation-dependent probe amplification classified 85.0% of samples into the expected subtype, comparing favorably with current diagnostic methods. This assay also proved to be highly efficient in detecting the MYD88 L265P mutation, even in archival paraffin-embedded tissues. This reliable, rapid, and cost-effective method uses common instruments and reagents and could thus easily be implemented in routine diagnostic workflows to improve the management of these aggressive tumors.