Results 1 - 20 of 39
1.
BMC Bioinformatics ; 22(1): 323, 2021 Jun 14.
Article in English | MEDLINE | ID: mdl-34126932

ABSTRACT

BACKGROUND: Histone modification constitutes a basic mechanism for the genetic regulation of gene expression. In the early 2000s, a powerful technique emerged that couples chromatin immunoprecipitation with high-throughput sequencing (ChIP-seq). This technique provides a direct survey of the DNA regions associated with these modifications. In order to realize the full potential of this technique, increasingly sophisticated statistical algorithms have been developed or adapted to analyze the massive amount of data it generates. Many of these algorithms were built around natural assumptions such as the Poisson distribution to model the noise in the count data. In this work we start from these natural assumptions and show that it is possible to improve upon them. RESULTS: Our comparisons on seven reference datasets of histone modifications (H3K36me3 & H3K4me3) suggest that natural assumptions are not always realistic under application conditions. We show that the unconstrained multiple changepoint detection model with alternative noise assumptions and supervised learning of the penalty parameter reduces the over-dispersion exhibited by count data. These models, implemented in the R package CROCS ( https://github.com/aLiehrmann/CROCS ), detect the peaks more accurately than algorithms which rely on natural assumptions. CONCLUSION: The segmentation models we propose can benefit researchers in the field of epigenetics by providing new high-quality peak prediction tracks for H3K36me3 and H3K4me3 histone modifications.
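
To make the "natural assumption" mentioned in this abstract concrete, the following is a minimal sketch of penalized multiple changepoint detection on count data under Poisson noise. It is not the CROCS implementation: the function names, the quadratic-time dynamic program and the fixed, hand-chosen penalty are assumptions made for illustration, whereas the abstract describes alternative noise models and a penalty learned from labeled data.

```python
import math


def poisson_cost(total, length):
    """Negative Poisson log-likelihood of one segment, up to a constant,
    evaluated at the maximum-likelihood mean total / length."""
    if total == 0:
        return 0.0
    return total - total * math.log(total / length)


def segment_counts(counts, penalty):
    """Hypothetical helper: penalized optimal partitioning.

    Returns the start index of each segment. A larger penalty
    yields fewer changepoints.
    """
    n = len(counts)
    prefix = [0]
    for c in counts:
        prefix.append(prefix[-1] + c)
    best = [0.0] + [math.inf] * n  # best[t]: optimal cost of counts[:t]
    last = [0] * (n + 1)           # last[t]: start of the final segment of counts[:t]
    for t in range(1, n + 1):
        for s in range(t):
            cost = best[s] + poisson_cost(prefix[t] - prefix[s], t - s) + penalty
            if cost < best[t]:
                best[t], last[t] = cost, s
    # Backtrack through the stored segment starts.
    starts = []
    t = n
    while t > 0:
        starts.append(last[t])
        t = last[t]
    return starts[::-1]


coverage = [1, 0, 2, 1, 1, 20, 22, 19, 21, 20, 1, 2, 0, 1, 1]
print(segment_counts(coverage, penalty=5.0))
```

On this toy coverage vector the enriched stretch in the middle is separated from the two background stretches. Swapping `poisson_cost` for another loss is one way to explore a different noise assumption.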


Subject(s)
Chromatin Immunoprecipitation Sequencing , High-Throughput Nucleotide Sequencing , Algorithms , Chromatin Immunoprecipitation , Sequence Analysis, DNA
2.
New Phytol ; 229(2): 994-1006, 2021 01.
Article in English | MEDLINE | ID: mdl-32583438

ABSTRACT

The Anthropocene epoch is associated with the spreading of metals in the environment increasing oxidative and genotoxic stress on organisms. Interestingly, c. 520 plant species growing on metalliferous soils acquired the capacity to accumulate and tolerate a tremendous amount of nickel in their shoots. The wide phylogenetic distribution of these species suggests that nickel hyperaccumulation evolved multiple times independently. However, the exact nature of these mechanisms and whether they have been recruited convergently in distant species is not known. To address these questions, we have developed a cross-species RNA-Seq approach combining differential gene expression analysis and cluster of orthologous group annotation to identify genes linked to nickel hyperaccumulation in distant plant families. Our analysis reveals candidate orthologous genes encoding convergent function involved in nickel hyperaccumulation, including the biosynthesis of specialized metabolites and cell wall organization. Our data also point out that the high expression of IREG/Ferroportin transporters recurrently emerged as a mechanism involved in nickel hyperaccumulation in plants. We further provide genetic evidence in the hyperaccumulator Noccaea caerulescens for the role of the NcIREG2 transporter in nickel sequestration in vacuoles. Our results provide molecular tools to better understand the mechanisms of nickel hyperaccumulation and study their evolution in plants.


Subject(s)
Brassicaceae , Nickel , Brassicaceae/genetics , Phylogeny , RNA-Seq , Soil
3.
Int J Mol Sci ; 22(20), 2021 Oct 19.
Article in English | MEDLINE | ID: mdl-34681956

ABSTRACT

Plastid gene expression involves many post-transcriptional maturation steps resulting in a complex transcriptome composed of multiple isoforms. Although short-read RNA-Seq has considerably improved our understanding of the molecular mechanisms controlling these processes, it is unable to sequence full-length transcripts. This information is crucial, however, when it comes to understanding the interplay between the various steps of plastid gene expression. Here, we describe a protocol to study the plastid transcriptome using nanopore sequencing. In the leaf of Arabidopsis thaliana, with about 1.5 million strand-specific reads mapped to the chloroplast genome, we could recapitulate most of the complexity of the plastid transcriptome (polygenic transcripts, multiple isoforms associated with post-transcriptional processing) using virtual Northern blots. Even if the transcripts longer than about 2500 nucleotides were missing, the study of the co-occurrence of editing and splicing events identified 42 pairs of events that were not occurring independently. This study also highlighted a preferential chronology of maturation events with splicing happening after most sites were edited.


Subject(s)
Alternative Splicing , Arabidopsis Proteins/metabolism , Arabidopsis/metabolism , Gene Expression Regulation, Plant , Plastids/genetics , RNA, Plant/genetics , Transcriptome , Arabidopsis/genetics , Arabidopsis/growth & development , Arabidopsis Proteins/genetics , Plastids/metabolism , RNA, Plant/metabolism , RNA-Seq
4.
BMC Bioinformatics ; 21(1): 120, 2020 Mar 20.
Article in English | MEDLINE | ID: mdl-32197576

ABSTRACT

BACKGROUND: In unsupervised learning and clustering, data integration from different sources and types is a difficult question discussed in several research areas. For instance in omics analysis, dozens of clustering methods have been developed in the past decade. When a single source of data is at play, hierarchical clustering (HC) is extremely popular, as a tree structure is highly interpretable and arguably more informative than just a partition of the data. However, blindly applying HC to multiple sources of data raises computational and interpretation issues. RESULTS: We propose mergeTrees, a method that aggregates a set of trees with the same leaves to create a consensus tree. In our consensus tree, a cluster at height h contains the individuals that are in the same cluster for all the trees at height h. The method is exact and proven to be [Formula: see text], n being the number of individuals and q being the number of trees to aggregate. Our implementation is extremely effective on simulations, allowing us to process many large trees at a time. We also rely on mergeTrees to perform the cluster analysis of two real -omics data sets, introducing a spectral variant as an efficient and robust by-product. CONCLUSIONS: Our tree aggregation method can be used in conjunction with hierarchical clustering to perform efficient cluster analysis. This approach was found to be robust to the absence of clustering information in some of the data sets as well as an increased variability within true clusters. The method is implemented in R/C++ and available as an R package named mergeTrees, which makes it easy to integrate in existing or new pipelines in several research areas.
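
The consensus rule stated in this abstract (a cluster at height h groups the individuals that are clustered together in every tree at height h) can be illustrated at a single height. The sketch below assumes each tree has already been cut at the same height h and is given as a list of cluster labels, one per individual. The function name and input format are hypothetical, and this brute-force intersection is not the mergeTrees algorithm, which builds the whole consensus tree.

```python
def consensus_partition(partitions):
    """Hypothetical helper: intersect several partitions of the same individuals.

    partitions[k][i] is the cluster label of individual i in tree k,
    after cutting every tree at the same height. Two individuals share
    a consensus cluster only if they share a cluster in every tree.
    """
    n = len(partitions[0])
    groups = {}   # tuple of per-tree labels -> consensus label
    labels = []
    for i in range(n):
        key = tuple(p[i] for p in partitions)
        labels.append(groups.setdefault(key, len(groups)))
    return labels


tree_a = [0, 0, 0, 1, 1]  # labels from a first tree cut at height h
tree_b = [0, 0, 1, 1, 1]  # labels from a second tree cut at the same height
print(consensus_partition([tree_a, tree_b]))  # [0, 0, 1, 2, 2]
```

The third individual ends up alone because the two trees disagree about which group it belongs to.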


Subject(s)
Cluster Analysis , Algorithms , Gene Expression Profiling , Humans , Proteomics
5.
Brief Bioinform ; 19(1): 65-76, 2018 01 01.
Article in English | MEDLINE | ID: mdl-27742662

ABSTRACT

Numerous statistical pipelines are now available for the differential analysis of gene expression measured with RNA-sequencing technology. Most of them are based on similar statistical frameworks after normalization, differing primarily in the choice of data distribution, mean and variance estimation strategy and data filtering. We propose an evaluation of the impact of these choices when few biological replicates are available through the use of synthetic data sets. This framework is based on real data sets and allows the exploration of various scenarios differing in the proportion of non-differentially expressed genes. Hence, it provides an evaluation of the key ingredients of the differential analysis, free of the biases associated with the simulation of data using parametric models. Our results show the relevance of a proper modeling of the mean by using linear or generalized linear modeling. Once the mean is properly modeled, the impact of the other parameters on the performance of the test is much less important. Finally, we propose to use the simple visualization of the raw P-value histogram as a practical evaluation criterion of the performance of differential analysis methods on real data sets.
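
The raw P-value histogram proposed as an evaluation criterion in this abstract can be computed in a few lines. The sketch below is an assumption-laden illustration rather than the authors' code: the function name and the default of 20 bins are hypothetical choices.

```python
def pvalue_histogram(pvalues, n_bins=20):
    """Hypothetical helper: count raw p-values in equal-width bins over [0, 1]."""
    counts = [0] * n_bins
    for p in pvalues:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"p-value out of range: {p}")
        # p == 1.0 is assigned to the last bin.
        index = min(int(p * n_bins), n_bins - 1)
        counts[index] += 1
    return counts


print(pvalue_histogram([0.0, 0.01, 0.04, 0.5, 1.0], n_bins=10))
# [3, 0, 0, 0, 0, 1, 0, 0, 0, 1]
```

For a well-calibrated test, such a histogram is generally expected to look roughly flat, possibly with a peak near zero contributed by truly differentially expressed genes. Marked departures from that shape suggest a poorly fitting model.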


Subject(s)
Arabidopsis Proteins/genetics , Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods , RNA/genetics , Sequence Analysis, RNA/methods , Transcriptome , Arabidopsis/genetics , Computer Simulation , Datasets as Topic , Humans , Models, Statistical , Software
6.
Proc Natl Acad Sci U S A ; 114(33): 8877-8882, 2017 08 15.
Article in English | MEDLINE | ID: mdl-28760958

ABSTRACT

RNA editing converts hundreds of cytosines into uridines during organelle gene expression of land plants. The pentatricopeptide repeat (PPR) proteins are at the core of this posttranscriptional RNA modification. Even if a PPR protein defines the editing site, a DYW domain of the same or another PPR protein is believed to catalyze the deamination. To give insight into the organelle RNA editosome, we performed tandem affinity purification of the plastidial CHLOROPLAST BIOGENESIS 19 (CLB19) PPR editing factor. Two PPR proteins, dually targeted to mitochondria and chloroplasts, were identified as potential partners of CLB19. These two proteins, a P-type PPR and a member of a small PPR-DYW subfamily, were shown to interact in yeast. Insertional mutations resulted in embryo lethality that could be rescued by embryo-specific complementation. A transcriptome analysis of these complemented plants showed major editing defects in both organelles with a very high PPR type specificity, indicating that the two proteins are core members of E+-type PPR editosomes.


Subject(s)
Arabidopsis Proteins/metabolism , Arabidopsis/metabolism , Chloroplasts/metabolism , Mitochondria/metabolism , RNA Editing/physiology , RNA-Binding Proteins/metabolism , Arabidopsis/genetics , Arabidopsis Proteins/genetics , Chloroplasts/genetics , Mitochondria/genetics , RNA-Binding Proteins/genetics
7.
PLoS Genet ; 13(3): e1006666, 2017 03.
Article in English | MEDLINE | ID: mdl-28301472

ABSTRACT

Through the local selection of landraces, humans have guided the adaptation of crops to a vast range of climatic and ecological conditions. This is particularly true of maize, which was domesticated in a restricted area of Mexico but now displays one of the broadest cultivated ranges worldwide. Here, we sequenced 67 genomes with an average sequencing depth of 18x to document routes of introduction, admixture and selective history of European maize and its American counterparts. To avoid the confounding effects of recent breeding, we targeted germplasm (lines) directly derived from landraces. Among our lines, we discovered 22,294,769 SNPs and between 0.9% and 4.1% residual heterozygosity. Using a segmentation method, we identified 6,978 segments with an unexpectedly high rate of heterozygosity. These segments point to genes potentially involved in inbreeding depression, and to a lesser extent to the presence of structural variants. Genetic structuring and inferences of historical splits revealed 5 genetic groups and two independent European introductions, with modest bottleneck signatures. Our results further revealed admixtures between distinct sources that have contributed to the establishment of 3 groups at intermediate latitudes in North America and Europe. We combined differentiation- and diversity-based statistics to identify both genes and gene networks displaying strong signals of selection. These include genes/gene networks involved in flowering time, drought and cold tolerance, plant defense and starch properties. Overall, our results provide novel insights into the evolutionary history of European maize and highlight a major role of admixture in environmental adaptation, paralleling recent findings in humans.


Subject(s)
Adaptation, Physiological/genetics , Genes, Plant/genetics , Plant Breeding/methods , Zea mays/genetics , Europe , Genetic Variation , Genome, Plant/genetics , Geography , Heterozygote , High-Throughput Nucleotide Sequencing/methods , Humans , Models, Genetic , Phylogeny , Polymorphism, Single Nucleotide , Selection, Genetic , United States , Zea mays/classification
8.
Plant J ; 96(3): 635-650, 2018 11.
Article in English | MEDLINE | ID: mdl-30079488

ABSTRACT

Characterizing the natural diversity of gene expression across environments is an important step in understanding how genotype-by-environment interactions shape phenotypes. Here, we analyzed the impact of water deficit on gene expression levels in tomato at the genome-wide scale. We sequenced the transcriptome of growing leaves and fruit pericarps at cell expansion stage in a cherry and a large-fruited accession and their F1 hybrid grown under two watering regimes. Gene expression levels were steadily affected by the genotype and the watering regime. Whereas phenotypes showed mostly additive inheritance, ~80% of the genes displayed non-additive inheritance. By comparing allele-specific expression (ASE) in the F1 hybrid to the allelic expression in both parental lines, respectively, 3005 genes in leaf and 2857 genes in fruit deviated from 1:1 ratio independently of the watering regime. Among these genes, ~55% were controlled by cis factors, ~25% by trans factors and ~20% by a combination of both types of factors. A total of 328 genes in leaf and 113 in fruit exhibited significant ASE-by-watering regime interaction, among which ~80% presented trans-by-watering regime interaction, suggesting a response to water deficit mediated through a majority of trans-acting loci in tomato. We cross-validated the expression levels of 274 transcripts in fruit and leaves of 124 recombinant inbred lines (RILs) and identified 163 expression quantitative trait loci (eQTLs) mostly confirming the divergences identified by ASE. Combining phenotypic and expression data, we observed a complex network of variation between genes encoding enzymes involved in the sugar metabolism.


Subject(s)
Quantitative Trait Loci/genetics , Solanum lycopersicum/genetics , Transcriptome , Water/physiology , Alleles , Dehydration , Fruit/genetics , Fruit/physiology , Genotype , Solanum lycopersicum/physiology , Phenotype
9.
BMC Genomics ; 20(1): 634, 2019 Aug 06.
Article in English | MEDLINE | ID: mdl-31387530

ABSTRACT

BACKGROUND: The effective use of mutant populations for reverse genetic screens relies on the population-wide characterization of the induced mutations. Genome- and population-wide characterization of the mutations found in fast neutron populations has been hindered, however, by the wide range of mutations generated and the lack of affordable technologies to detect DNA sequence changes. In this study, we therefore aimed to test whether genotyping-by-sequencing (GBS) technology could be used to characterize copy number variation (CNV) induced by fast neutrons in a soybean mutant population. RESULTS: We called CNVs from GBS data in 79 soybean mutants and assessed the sensitivity and precision of this approach by validating our results against array comparative genomic hybridization (aCGH) data for 19 of these mutants as well as targeted PCR and ddPCR assays for a representative subset of the smallest events detected by GBS. Our GBS pipeline detected 55 of the 96 events found by aCGH, with approximate detection thresholds of 60 kb, 500 kb and 1 Mb for homozygous deletions, hemizygous deletions and duplications, respectively. Among the whole set of 79 mutants, the GBS data revealed 105 homozygous deletions, 32 hemizygous deletions and 19 duplications. This included several extremely large events, exhibiting maximum sizes of ~ 11.2 Mb for a homozygous deletion, ~ 11.6 Mb for a hemizygous deletion, and ~ 50 Mb for a duplication. CONCLUSIONS: This study provides a proof of concept that GBS can be used as an affordable high-throughput method for assessing CNVs in fast neutron mutants. The modularity of this GBS approach allows combining as many different libraries or sequencing runs as is necessary for reaching the goals of a particular study. This method should enable the low-cost genome-wide characterization of hundreds to thousands of individuals in fast neutron mutant populations or any population with large genomic deletions and duplications.


Subject(s)
DNA Copy Number Variations , DNA Mutational Analysis , Fast Neutrons , Genotyping Techniques , Glycine max/genetics , Mutation , Mutagenesis
10.
New Phytol ; 217(1): 367-377, 2018 Jan.
Article in English | MEDLINE | ID: mdl-29034956

ABSTRACT

Structural variation is a major source of genetic diversity and an important substrate for selection. In allopolyploids, homoeologous exchanges (i.e. between the constituent subgenomes) are a very frequent type of structural variant. However, their direct impact on gene content and gene expression had not been determined. Here, we used a tissue-specific mRNA-Seq dataset to measure the consequences of homoeologous exchanges (HE) on gene expression in Brassica napus, a representative allotetraploid crop. We demonstrate that expression changes are proportional to the change in gene copy number triggered by the HEs. Thus, when homoeologous gene pairs have unbalanced transcriptional contributions before the HE, duplication of one copy does not accurately compensate for loss of the other and combined homoeologue expression also changes. These effects are, however, mitigated over time. This study sheds light on the origins, timing and functional consequences of homoeologous exchanges in allopolyploids. It demonstrates that the interplay between new structural variation and the resulting impacts on gene expression influences allopolyploid genome evolution.


Subject(s)
Brassica napus/genetics , Gene Dosage , Genetic Variation , Genome, Plant/genetics , Gene Expression , Organ Specificity , Polyploidy , Recombination, Genetic , Sequence Analysis, RNA
11.
Brief Bioinform ; 16(4): 600-15, 2015 Jul.
Article in English | MEDLINE | ID: mdl-25202135

ABSTRACT

A number of bioinformatic or biostatistical methods are available for analyzing DNA copy number profiles measured from microarray or sequencing technologies. In the absence of rich enough gold standard data sets, the performance of these methods is generally assessed using unrealistic simulation studies, or based on small real data analyses. To make an objective and reproducible performance assessment, we have designed and implemented a framework to generate realistic DNA copy number profiles of cancer samples with known truth. These profiles are generated by resampling publicly available SNP microarray data from genomic regions with known copy-number state. The original data have been extracted from dilution series of tumor cell lines with matched blood samples at several concentrations. Therefore, the signal-to-noise ratio of the generated profiles can be controlled through the (known) percentage of tumor cells in the sample. This article describes this framework and its application to a comparison study between methods for segmenting DNA copy number profiles from SNP microarrays. This study indicates that no single method is uniformly better than all others. It also helps identify pros and cons of the compared methods as a function of biologically informative parameters, such as the fraction of tumor cells in the sample and the proportion of heterozygous markers. This comparison study may be reproduced using the open source and cross-platform R package jointseg, which implements the proposed data generation and evaluation framework: http://r-forge.r-project.org/R/?group_id=1562.


Subject(s)
DNA Copy Number Variations , Humans , Oligonucleotide Array Sequence Analysis , Polymorphism, Single Nucleotide
12.
Nucleic Acids Res ; 43(Database issue): D1010-7, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25392409

ABSTRACT

CATdb (http://urgv.evry.inra.fr/CATdb) is a database providing public access to a large collection of transcriptomic data, mainly for Arabidopsis but also for other plants. This resource has the rare advantage of containing several thousands of microarray experiments obtained with the same technical protocol and analyzed by the same statistical pipelines. In this paper, we present GEM2Net, a new module of CATdb that takes advantage of this homogeneous dataset to mine co-expression units and decipher Arabidopsis gene functions. GEM2Net explores 387 stress conditions organized into 18 biotic and abiotic stress categories. For each one, a model-based clustering is applied on expression differences to identify clusters of co-expressed genes. To characterize functions associated with these clusters, various resources are analyzed and integrated: Gene Ontology, subcellular localization of proteins, Hormone Families, Transcription Factor Families and a refined stress-related gene list associated with publications. Exploiting protein-protein interactions and transcription factor-target interactions makes it possible to display gene networks. GEM2Net presents the analysis of the 18 stress categories, in which 17,264 genes are involved and organized within 681 co-expression clusters. The meta-data analyses were stored and organized to compose a dynamic Web resource.


Subject(s)
Arabidopsis/genetics , Databases, Genetic , Gene Expression Regulation, Plant , Gene Regulatory Networks , Stress, Physiological/genetics , Arabidopsis Proteins/genetics , Arabidopsis Proteins/metabolism , Gene Expression Profiling , Internet , Models, Genetic , Protein Interaction Mapping
13.
BMC Genomics ; 17(1): 818, 2016 10 21.
Article in English | MEDLINE | ID: mdl-27769163

ABSTRACT

BACKGROUND: Higher plants have to cope with increasing concentrations of pollutants of both natural and anthropogenic origin. Given their capacity to concentrate and metabolize various compounds including pollutants, plants can be used to treat environmental problems - a process called phytoremediation. However, the molecular mechanisms underlying the stabilization, the extraction, the accumulation and partial or complete degradation of pollutants by plants remain poorly understood. RESULTS: Here, we determined the molecular events involved in the early plant response to phenanthrene, used as a model of polycyclic aromatic hydrocarbons. A transcriptomic and a metabolic analysis strongly suggest that energy availability is the crucial limiting factor leading to high and rapid transcriptional reprogramming that can ultimately lead to death. We show that the accumulation of phenanthrene in leaves inhibits electron transfer and photosynthesis within a few minutes, probably disrupting energy transformation. CONCLUSION: This kinetic analysis improved the resolution of the transcriptome in the initial plant response to phenanthrene, identifying genes that are involved in primary processes set up to sense and detoxify this pollutant but also in molecular mechanisms used by the plant to cope with such harmful stress. The identification of first events involved in plant response to phenanthrene is a key step in the selection of candidates for further functional characterization, with the prospect of engineering efficient ecological detoxification systems for polycyclic aromatic hydrocarbons.


Subject(s)
Environmental Pollutants/pharmacology , Phenanthrenes/pharmacology , Plant Physiological Phenomena/drug effects , Plant Physiological Phenomena/genetics , Cluster Analysis , Dose-Response Relationship, Drug , Energy Metabolism/drug effects , Energy Metabolism/genetics , Gene Expression Regulation, Plant/drug effects , Plant Development/drug effects , Plant Development/genetics , Transcriptome , Xenobiotics/pharmacology
14.
Bioinformatics ; 30(11): 1539-46, 2014 Jun 01.
Article in English | MEDLINE | ID: mdl-24493034

ABSTRACT

MOTIVATION: DNA copy number profiles characterize regions of chromosome gains, losses and breakpoints in tumor genomes. Although many models have been proposed to detect these alterations, it is not clear which model is appropriate before visual inspection of the signal, noise and models for a particular profile. RESULTS: We propose SegAnnDB, a Web-based computer vision system for genomic segmentation: first, visually inspect the profiles and manually annotate altered regions, then SegAnnDB determines the precise alteration locations using a mathematical model of the data and annotations. SegAnnDB facilitates collaboration between biologists and bioinformaticians, and uses the University of California, Santa Cruz genome browser to visualize copy number alterations alongside known genes. AVAILABILITY AND IMPLEMENTATION: The breakpoints project on INRIA GForge hosts the source code, an Amazon Machine Image can be launched and a demonstration Web site is http://bioviz.rocq.inria.fr.


Subject(s)
DNA Copy Number Variations , Software , Algorithms , Chromosome Breakpoints , Genomics/methods , Internet
15.
Nucleic Acids Res ; 41(21): e200, 2013 Nov.
Article in English | MEDLINE | ID: mdl-24062158

ABSTRACT

Traditional methods that aim to identify biomarkers that distinguish between two groups, like Significance Analysis of Microarrays or the t-test, perform optimally when such biomarkers show homogeneous behavior within each group and differential behavior between the groups. However, in many applications, this is not the case. Instead, a subgroup of samples in one group shows differential behavior with respect to all other samples. To successfully detect markers showing such imbalanced patterns of differential signal, a different approach is required. We propose a novel method, specifically designed for the Detection of Imbalanced Differential Signal (DIDS). We use an artificial dataset and a human breast cancer dataset to measure its performance and compare it with three traditional methods and four approaches that take imbalanced signal into account. Supported by extensive experimental results, we show that DIDS outperforms all other approaches in terms of power and positive predictive value. In a mouse breast cancer dataset, DIDS is the only approach that detects a functionally validated marker of chemotherapy resistance. DIDS can be applied to any continuous value data, including gene expression data, and in any context where imbalanced differential signal is manifested.


Subject(s)
Algorithms , Biomarkers, Tumor/metabolism , Gene Expression , Animals , Biomarkers, Tumor/genetics , Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Female , Humans , Mammary Neoplasms, Experimental/genetics , Mammary Neoplasms, Experimental/metabolism , Mice , Receptor, ErbB-2/analysis
16.
Bioinformatics ; 28(18): 2357-65, 2012 Sep 15.
Article in English | MEDLINE | ID: mdl-22796958

ABSTRACT

MOTIVATION: Target enrichment, also referred to as DNA capture, provides an effective way to focus sequencing efforts on a genomic region of interest. Capture data are typically used to detect single-nucleotide variants. They can also be used to detect copy number alterations, which is particularly useful in the context of cancer, where such changes occur frequently. In copy number analysis, it is a common practice to determine log-ratios between test and control samples, but this approach results in a loss of information as it disregards the total coverage or intensity at a locus. RESULTS: We modeled the coverage or intensity of the test sample as a linear function of the control sample. This regression approach is able to deal with regions that are completely deleted, which are problematic for methods that use log-ratios. To demonstrate the utility of our approach, we used capture data to determine copy number for a set of 600 genes in a panel of nine breast cancer cell lines. We found high concordance between our results and those generated using a single-nucleotide polymorphism genotyping platform. When we compared our results with other log-ratio-based methods, including ExomeCNV, we found that our approach produced better overall correlation with SNP data. AVAILABILITY: The algorithm is implemented in C and R and the code can be downloaded from http://bioinformatics.nki.nl/ocs/ CONTACT: l.wessels@nki.nl SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
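
The contrast drawn in this abstract between log-ratios and a linear model of test coverage against control coverage can be sketched with ordinary least squares. This is an illustration only: the function name, the per-region fit with an intercept and the toy coverage values are assumptions, and the abstract does not specify how the authors convert the fitted line into a copy number.

```python
def fit_line(control, test):
    """Hypothetical helper: least-squares fit of test = slope * control + intercept."""
    n = len(control)
    mean_x = sum(control) / n
    mean_y = sum(test) / n
    sxx = sum((x - mean_x) ** 2 for x in control)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(control, test))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


control = [100, 200, 300, 400]   # toy per-locus coverage in the control sample
half_lost = [50, 100, 150, 200]  # test coverage at half the control level
deleted = [0, 0, 0, 0]           # test coverage in a completely deleted region

print(fit_line(control, half_lost))  # (0.5, 0.0)
print(fit_line(control, deleted))    # (0.0, 0.0)
```

The deleted region yields a well-defined slope of zero, whereas a log-ratio at those loci would require the logarithm of zero.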


Subject(s)
DNA Copy Number Variations , Sequence Analysis, DNA , Algorithms , Breast Neoplasms/genetics , Cell Line, Tumor , Female , Genomics/methods , Genotype , Humans , Linear Models , Polymorphism, Single Nucleotide
17.
NAR Genom Bioinform ; 5(4): lqad098, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37954572

ABSTRACT

To fully understand gene regulation, it is necessary to have a thorough understanding of both the transcriptome and the enzymatic and RNA-binding activities that shape it. While many RNA-Seq-based tools have been developed to analyze the transcriptome, most only consider the abundance of sequencing reads along annotated patterns (such as genes). These annotations are typically incomplete, leading to errors in the differential expression analysis. To address this issue, we present DiffSegR - an R package that enables the discovery of transcriptome-wide expression differences between two biological conditions using RNA-Seq data. DiffSegR does not require prior annotation and uses a multiple changepoints detection algorithm to identify the boundaries of differentially expressed regions in the per-base log2 fold change. In a few minutes of computation, DiffSegR could rightfully predict the role of chloroplast ribonuclease Mini-III in rRNA maturation and chloroplast ribonuclease PNPase in (3'/5')-degradation of rRNA, mRNA and tRNA precursors as well as intron accumulation. We believe DiffSegR will benefit biologists working on transcriptomics as it allows access to information from a layer of the transcriptome overlooked by the classical differential expression analysis pipelines widely used today. DiffSegR is available at https://aliehrmann.github.io/DiffSegR/index.html.
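
The per-base log2 fold change that this abstract describes as the input signal for changepoint detection can be sketched as follows. The function name and the pseudocount of 1 are assumptions made so that positions with zero coverage stay finite. This is not the DiffSegR API, and the package may compute the signal differently.

```python
import math


def per_base_log2_fold_change(coverage_a, coverage_b, pseudocount=1.0):
    """Hypothetical helper: log2 ratio of condition B over condition A at each base."""
    if len(coverage_a) != len(coverage_b):
        raise ValueError("coverage vectors must have the same length")
    return [
        math.log2((b + pseudocount) / (a + pseudocount))
        for a, b in zip(coverage_a, coverage_b)
    ]


print(per_base_log2_fold_change([3, 3, 7], [3, 7, 3]))  # [0.0, 1.0, -1.0]
```

A segmentation step would then look for positions where the mean of this signal shifts, without relying on gene annotations.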

18.
Biostatistics ; 12(3): 413-28, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21209153

ABSTRACT

The statistical analysis of array comparative genomic hybridization (CGH) data has now shifted to the joint assessment of copy number variations at the cohort level. Considering multiple profiles gives the opportunity to correct for systematic biases observed on single profiles, such as probe GC content or the so-called "wave effect." In this article, we extend the segmentation model developed in the univariate case to the joint analysis of multiple CGH profiles. Our contribution is multiple: we propose an integrated model to perform joint segmentation, normalization, and calling for multiple array CGH profiles. This model shows great flexibility, especially in the modeling of the wave effect that gives a likelihood framework to approaches proposed by others. We propose a new dynamic programming algorithm for break point positioning, as well as a model selection criterion based on a modified bayesian information criterion proposed in the univariate case. The performance of our method is assessed using simulated and real data sets. Our method is implemented in the R package cghseg.


Subject(s)
Bayes Theorem , Comparative Genomic Hybridization/methods , Data Interpretation, Statistical , Models, Genetic , Models, Statistical , Algorithms , Computer Simulation , Haplotypes , Humans
19.
Front Plant Sci ; 13: 980587, 2022.
Article in English | MEDLINE | ID: mdl-36479518

ABSTRACT

Partial resistance in plants generally exerts a low selective pressure on pathogens, thus ensuring its durability in agrosystems. However, little is known about the effect of partial resistance on the molecular mechanisms of pathogenicity, knowledge that could advance plant breeding for sustainable plant health. Here we investigate the gene expression of Phytophthora capsici during infection of pepper (Capsicum annuum L.), where only partial genetic resistance is reported, using Illumina RNA-seq. Comparison of transcriptomes of P. capsici infecting susceptible and partially resistant peppers identified a small number of genes that redirected its own resources into lipid biosynthesis to subsist on partially resistant plants. The adapted and non-adapted isolates of P. capsici differed in expression of genes involved in nucleic acid synthesis and transporters. Transient ectopic expression of the RxLR effector genes CUST_2407 and CUST_16519 in pepper lines differing in resistance levels revealed specific host-isolate interactions that either triggered local necrotic lesions (hypersensitive response or HR) or elicited leaf abscission (extreme resistance or ER), preventing the spread of the pathogen to healthy tissue. Although these effectors did not unequivocally explain the quantitative host resistance, our findings highlight the importance of plant genes limiting nutrient resources to select pepper cultivars with sustainable resistance to P. capsici.

20.
Genes (Basel) ; 13(1), 2021 12 27.
Article in English | MEDLINE | ID: mdl-35052407

ABSTRACT

RNA silencing serves key roles in a multitude of cellular processes, including development, stress responses, metabolism, and maintenance of genome integrity. Dicer, Argonaute (AGO), double-stranded RNA binding (DRB) proteins, RNA-dependent RNA polymerase (RDR), and DNA-dependent RNA polymerases known as Pol IV and Pol V form core components to trigger RNA silencing. Common bean (Phaseolus vulgaris) is an important staple crop worldwide. In this study, we aimed to unravel the components of the RNA-guided silencing pathway in this non-model plant, taking advantage of the availability of two genome assemblies of Andean and Meso-American origin. We identified six PvDCLs, thirteen PvAGOs, ten PvDRBs and five PvRDRs in both genotypes, suggesting no recent gene amplification or deletion after the gene pool separation. In addition, we identified one PvNRPD1 and one PvNRPE1 encoding the largest subunits of Pol IV and Pol V, respectively. These genes were categorized into subgroups based on phylogenetic analyses. Comprehensive analyses of gene structure, genomic localization, and similarity among these genes were performed. Their expression patterns were investigated by means of expression models in different organs using online data and quantitative RT-PCR after pathogen infection. Several of the candidate genes were up-regulated after infection with the fungus Colletotrichum lindemuthianum.


Subject(s)
Colletotrichum/physiology , Gene Expression Regulation, Plant , Genome-Wide Association Study , Phaseolus/genetics , Plant Diseases/genetics , Plant Proteins/metabolism , RNA Interference , Argonaute Proteins/genetics , Argonaute Proteins/metabolism , DNA-Directed RNA Polymerases/genetics , DNA-Directed RNA Polymerases/metabolism , Phaseolus/growth & development , Phaseolus/immunology , Phaseolus/microbiology , Phylogeny , Plant Diseases/immunology , Plant Diseases/microbiology , Plant Proteins/genetics , RNA-Dependent RNA Polymerase/genetics , RNA-Dependent RNA Polymerase/metabolism , Transcriptome