Results 1 - 20 of 39
1.
BMC Bioinformatics ; 22(1): 323, 2021 Jun 14.
Article in English | MEDLINE | ID: mdl-34126932

ABSTRACT

BACKGROUND: Histone modification constitutes a basic mechanism for the genetic regulation of gene expression. In the early 2000s, a powerful technique emerged that couples chromatin immunoprecipitation with high-throughput sequencing (ChIP-seq). This technique provides a direct survey of the DNA regions associated with these modifications. In order to realize the full potential of this technique, increasingly sophisticated statistical algorithms have been developed or adapted to analyze the massive amount of data it generates. Many of these algorithms were built around natural assumptions such as the Poisson distribution to model the noise in the count data. In this work we start from these natural assumptions and show that it is possible to improve upon them. RESULTS: Our comparisons on seven reference datasets of histone modifications (H3K36me3 & H3K4me3) suggest that natural assumptions are not always realistic under application conditions. We show that the unconstrained multiple changepoint detection model with alternative noise assumptions and supervised learning of the penalty parameter reduces the over-dispersion exhibited by count data. These models, implemented in the R package CROCS ( https://github.com/aLiehrmann/CROCS ), detect the peaks more accurately than algorithms which rely on natural assumptions. CONCLUSION: The segmentation models we propose can benefit researchers in the field of epigenetics by providing new high-quality peak prediction tracks for H3K36me3 and H3K4me3 histone modifications.


Subject(s)
Chromatin Immunoprecipitation Sequencing , High-Throughput Nucleotide Sequencing , Algorithms , Chromatin Immunoprecipitation , Sequence Analysis, DNA
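
The penalized multiple changepoint idea named in the abstract above can be illustrated with a minimal sketch. It is not the CROCS implementation: it assumes a squared-error (Gaussian) loss, a hand-picked penalty instead of a learned one, and a plain quadratic-time dynamic program. The function name `optimal_partitioning` is hypothetical.

```python
from itertools import accumulate


def optimal_partitioning(y, penalty):
    """Return changepoint positions minimizing squared error + penalty per change.

    A changepoint at position k means a new segment starts at y[k].
    Assumption: squared-error loss; the abstract discusses other noise models.
    """
    n = len(y)
    # Prefix sums give the squared-error cost of any segment in O(1).
    s1 = [0.0] + list(accumulate(float(v) for v in y))
    s2 = [0.0] + list(accumulate(float(v) ** 2 for v in y))

    def cost(i, j):
        # Sum of squared deviations from the mean over y[i:j], with j > i.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    best = [0.0] + [float("inf")] * n  # best[j]: optimal penalized cost of y[:j]
    last = [0] * (n + 1)               # last[j]: start of the final segment of y[:j]
    for j in range(1, n + 1):
        for i in range(j):
            # Every segment start except the first one pays the penalty.
            c = best[i] + cost(i, j) + (penalty if i > 0 else 0.0)
            if c < best[j]:
                best[j], last[j] = c, i

    # Walk back through the stored segment starts.
    changepoints = []
    j = n
    while j > 0:
        j = last[j]
        if j > 0:
            changepoints.append(j)
    return changepoints[::-1]


signal = [0] * 10 + [5] * 10 + [0] * 10
print(optimal_partitioning(signal, penalty=1.0))
print(optimal_partitioning(signal, penalty=1000.0))
```

A larger penalty yields fewer changepoints, which is why the choice of penalty matters enough for the abstract to treat it as a parameter to be learned from labeled data.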
2.
New Phytol ; 229(2): 994-1006, 2021 01.
Article in English | MEDLINE | ID: mdl-32583438

ABSTRACT

The Anthropocene epoch is associated with the spreading of metals in the environment increasing oxidative and genotoxic stress on organisms. Interestingly, c. 520 plant species growing on metalliferous soils acquired the capacity to accumulate and tolerate a tremendous amount of nickel in their shoots. The wide phylogenetic distribution of these species suggests that nickel hyperaccumulation evolved multiple times independently. However, the exact nature of these mechanisms and whether they have been recruited convergently in distant species is not known. To address these questions, we have developed a cross-species RNA-Seq approach combining differential gene expression analysis and cluster of orthologous group annotation to identify genes linked to nickel hyperaccumulation in distant plant families. Our analysis reveals candidate orthologous genes encoding convergent function involved in nickel hyperaccumulation, including the biosynthesis of specialized metabolites and cell wall organization. Our data also point out that the high expression of IREG/Ferroportin transporters recurrently emerged as a mechanism involved in nickel hyperaccumulation in plants. We further provide genetic evidence in the hyperaccumulator Noccaea caerulescens for the role of the NcIREG2 transporter in nickel sequestration in vacuoles. Our results provide molecular tools to better understand the mechanisms of nickel hyperaccumulation and study their evolution in plants.


Subject(s)
Brassicaceae , Nickel , Brassicaceae/genetics , Phylogeny , RNA-Seq , Soil
3.
Int J Mol Sci ; 22(20)2021 Oct 19.
Article in English | MEDLINE | ID: mdl-34681956

ABSTRACT

Plastid gene expression involves many post-transcriptional maturation steps resulting in a complex transcriptome composed of multiple isoforms. Although short-read RNA-Seq has considerably improved our understanding of the molecular mechanisms controlling these processes, it is unable to sequence full-length transcripts. This information is crucial, however, when it comes to understanding the interplay between the various steps of plastid gene expression. Here, we describe a protocol to study the plastid transcriptome using nanopore sequencing. In the leaf of Arabidopsis thaliana, with about 1.5 million strand-specific reads mapped to the chloroplast genome, we could recapitulate most of the complexity of the plastid transcriptome (polygenic transcripts, multiple isoforms associated with post-transcriptional processing) using virtual Northern blots. Even if the transcripts longer than about 2500 nucleotides were missing, the study of the co-occurrence of editing and splicing events identified 42 pairs of events that were not occurring independently. This study also highlighted a preferential chronology of maturation events with splicing happening after most sites were edited.


Subject(s)
Alternative Splicing , Arabidopsis Proteins/metabolism , Arabidopsis/metabolism , Gene Expression Regulation, Plant , Plastids/genetics , RNA, Plant/genetics , Transcriptome , Arabidopsis/genetics , Arabidopsis/growth & development , Arabidopsis Proteins/genetics , Plastids/metabolism , RNA, Plant/metabolism , RNA-Seq
4.
BMC Bioinformatics ; 21(1): 120, 2020 Mar 20.
Article in English | MEDLINE | ID: mdl-32197576

ABSTRACT

BACKGROUND: In unsupervised learning and clustering, data integration from different sources and types is a difficult question discussed in several research areas. For instance in omics analysis, dozens of clustering methods have been developed in the past decade. When a single source of data is at play, hierarchical clustering (HC) is extremely popular, as a tree structure is highly interpretable and arguably more informative than just a partition of the data. However, blindly applying HC to multiple sources of data raises computational and interpretation issues. RESULTS: We propose mergeTrees, a method that aggregates a set of trees with the same leaves to create a consensus tree. In our consensus tree, a cluster at height h contains the individuals that are in the same cluster for all the trees at height h. The method is exact and proven to be [Formula: see text], n being the number of individuals and q being the number of trees to aggregate. Our implementation is extremely effective on simulations, allowing us to process many large trees at a time. We also rely on mergeTrees to perform the cluster analysis of two real -omics data sets, introducing a spectral variant as an efficient and robust by-product. CONCLUSIONS: Our tree aggregation method can be used in conjunction with hierarchical clustering to perform efficient cluster analysis. This approach was found to be robust to the absence of clustering information in some of the data sets as well as an increased variability within true clusters. The method is implemented in R/C++ and available as an R package named mergeTrees, which makes it easy to integrate in existing or new pipelines in several research areas.


Subject(s)
Cluster Analysis , Algorithms , Gene Expression Profiling , Humans , Proteomics
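
The consensus rule stated in the abstract above (a cluster at height h contains the individuals that share a cluster in every tree at height h) can be sketched for a single height as follows. This is only an illustration of the definition, not the mergeTrees algorithm or its R/C++ code: it does not build the full consensus tree and makes no claim about the published complexity. The tree encoding, a list of `(leaf_a, leaf_b, height)` merges, and both function names are assumptions made for this example.

```python
def cut_tree(n_leaves, merges, h):
    """Return one cluster label per leaf after applying every merge of height <= h."""
    parent = list(range(n_leaves))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, height in merges:
        if height <= h:
            parent[find(a)] = find(b)
    return [find(x) for x in range(n_leaves)]


def consensus_clusters(n_leaves, trees, h):
    """Group leaves that fall in the same cluster in every tree cut at height h."""
    labels = [cut_tree(n_leaves, merges, h) for merges in trees]
    groups = {}
    for leaf in range(n_leaves):
        # Two leaves stay together only if their labels agree in all trees.
        key = tuple(tree_labels[leaf] for tree_labels in labels)
        groups.setdefault(key, []).append(leaf)
    return sorted(groups.values())


# Two hypothetical trees on the same four leaves.
tree_a = [(0, 1, 1.0), (2, 3, 1.0), (0, 2, 9.0)]
tree_b = [(0, 2, 1.0), (1, 3, 1.0), (0, 1, 9.0)]
print(consensus_clusters(4, [tree_a, tree_b], h=2.0))
print(consensus_clusters(4, [tree_a, tree_b], h=10.0))
```

In this toy case the two trees disagree at low heights, so the consensus keeps every leaf separate until both trees have merged everything.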
5.
Brief Bioinform ; 19(1): 65-76, 2018 01 01.
Article in English | MEDLINE | ID: mdl-27742662

ABSTRACT

Numerous statistical pipelines are now available for the differential analysis of gene expression measured with RNA-sequencing technology. Most of them are based on similar statistical frameworks after normalization, differing primarily in the choice of data distribution, mean and variance estimation strategy and data filtering. We propose an evaluation of the impact of these choices when few biological replicates are available through the use of synthetic data sets. This framework is based on real data sets and allows the exploration of various scenarios differing in the proportion of non-differentially expressed genes. Hence, it provides an evaluation of the key ingredients of the differential analysis, free of the biases associated with the simulation of data using parametric models. Our results show the relevance of a proper modeling of the mean by using linear or generalized linear modeling. Once the mean is properly modeled, the impact of the other parameters on the performance of the test is much less important. Finally, we propose to use the simple visualization of the raw P-value histogram as a practical evaluation criterion of the performance of differential analysis methods on real data sets.


Subject(s)
Arabidopsis Proteins/genetics , Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods , RNA/genetics , Sequence Analysis, RNA/methods , Transcriptome , Arabidopsis/genetics , Computer Simulation , Datasets as Topic , Humans , Models, Statistical , Software
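
The raw P-value histogram that the abstract above proposes as a practical check can be tabulated with a few lines of code. This is a minimal sketch, not the authors' pipeline; the function name `pvalue_histogram`, the default of 20 bins and the sample values are assumptions made for illustration.

```python
def pvalue_histogram(pvalues, n_bins=20):
    """Count raw P-values in n_bins equal-width bins covering [0, 1]."""
    counts = [0] * n_bins
    for p in pvalues:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"P-value out of range: {p}")
        # p == 1.0 is placed in the last bin instead of overflowing.
        counts[min(int(p * n_bins), n_bins - 1)] += 1
    return counts


# Hypothetical P-values, for illustration only.
counts = pvalue_histogram([0.0, 0.01, 0.049, 0.5, 1.0])
for k, c in enumerate(counts):
    print(f"[{k / 20:.2f}, {(k + 1) / 20:.2f}) {'#' * c}")
```

When the test is well calibrated, P-values of non-differentially expressed genes are uniform on [0, 1], so the histogram is expected to look flat apart from a peak near zero; a markedly different shape is a warning sign about the modeling.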
6.
Proc Natl Acad Sci U S A ; 114(33): 8877-8882, 2017 08 15.
Article in English | MEDLINE | ID: mdl-28760958

ABSTRACT

RNA editing is converting hundreds of cytosines into uridines during organelle gene expression of land plants. The pentatricopeptide repeat (PPR) proteins are at the core of this posttranscriptional RNA modification. Even if a PPR protein defines the editing site, a DYW domain of the same or another PPR protein is believed to catalyze the deamination. To give insight into the organelle RNA editosome, we performed tandem affinity purification of the plastidial CHLOROPLAST BIOGENESIS 19 (CLB19) PPR editing factor. Two PPR proteins, dually targeted to mitochondria and chloroplasts, were identified as potential partners of CLB19. These two proteins, a P-type PPR and a member of a small PPR-DYW subfamily, were shown to interact in yeast. Insertional mutations resulted in embryo lethality that could be rescued by embryo-specific complementation. A transcriptome analysis of these complemented plants showed major editing defects in both organelles with a very high PPR type specificity, indicating that the two proteins are core members of E+-type PPR editosomes.


Subject(s)
Arabidopsis Proteins/metabolism , Arabidopsis/metabolism , Chloroplasts/metabolism , Mitochondria/metabolism , RNA Editing/physiology , RNA-Binding Proteins/metabolism , Arabidopsis/genetics , Arabidopsis Proteins/genetics , Chloroplasts/genetics , Mitochondria/genetics , RNA-Binding Proteins/genetics
7.
PLoS Genet ; 13(3): e1006666, 2017 03.
Article in English | MEDLINE | ID: mdl-28301472

ABSTRACT

Through the local selection of landraces, humans have guided the adaptation of crops to a vast range of climatic and ecological conditions. This is particularly true of maize, which was domesticated in a restricted area of Mexico but now displays one of the broadest cultivated ranges worldwide. Here, we sequenced 67 genomes with an average sequencing depth of 18x to document routes of introduction, admixture and selective history of European maize and its American counterparts. To avoid the confounding effects of recent breeding, we targeted germplasm (lines) directly derived from landraces. Among our lines, we discovered 22,294,769 SNPs and between 0.9% and 4.1% residual heterozygosity. Using a segmentation method, we identified 6,978 segments of unexpectedly high rate of heterozygosity. These segments point to genes potentially involved in inbreeding depression, and to a lesser extent to the presence of structural variants. Genetic structuring and inferences of historical splits revealed 5 genetic groups and two independent European introductions, with modest bottleneck signatures. Our results further revealed admixtures between distinct sources that have contributed to the establishment of 3 groups at intermediate latitudes in North America and Europe. We combined differentiation- and diversity-based statistics to identify both genes and gene networks displaying strong signals of selection. These include genes/gene networks involved in flowering time, drought and cold tolerance, plant defense and starch properties. Overall, our results provide novel insights into the evolutionary history of European maize and highlight a major role of admixture in environmental adaptation, paralleling recent findings in humans.


Subject(s)
Adaptation, Physiological/genetics , Genes, Plant/genetics , Plant Breeding/methods , Zea mays/genetics , Europe , Genetic Variation , Genome, Plant/genetics , Geography , Heterozygote , High-Throughput Nucleotide Sequencing/methods , Humans , Models, Genetic , Phylogeny , Polymorphism, Single Nucleotide , Selection, Genetic , United States , Zea mays/classification
8.
Plant J ; 96(3): 635-650, 2018 11.
Article in English | MEDLINE | ID: mdl-30079488

ABSTRACT

Characterizing the natural diversity of gene expression across environments is an important step in understanding how genotype-by-environment interactions shape phenotypes. Here, we analyzed the impact of water deficit on gene expression levels in tomato at the genome-wide scale. We sequenced the transcriptome of growing leaves and fruit pericarps at cell expansion stage in a cherry and a large-fruited accession and their F1 hybrid grown under two watering regimes. Gene expression levels were steadily affected by the genotype and the watering regime. Whereas phenotypes showed mostly additive inheritance, ~80% of the genes displayed non-additive inheritance. By comparing allele-specific expression (ASE) in the F1 hybrid to the allelic expression in both parental lines, respectively, 3005 genes in leaf and 2857 genes in fruit deviated from 1:1 ratio independently of the watering regime. Among these genes, ~55% were controlled by cis factors, ~25% by trans factors and ~20% by a combination of both types of factors. A total of 328 genes in leaf and 113 in fruit exhibited significant ASE-by-watering regime interaction, among which ~80% presented trans-by-watering regime interaction, suggesting a response to water deficit mediated through a majority of trans-acting loci in tomato. We cross-validated the expression levels of 274 transcripts in fruit and leaves of 124 recombinant inbred lines (RILs) and identified 163 expression quantitative trait loci (eQTLs) mostly confirming the divergences identified by ASE. Combining phenotypic and expression data, we observed a complex network of variation between genes encoding enzymes involved in the sugar metabolism.


Subject(s)
Quantitative Trait Loci/genetics , Solanum lycopersicum/genetics , Transcriptome , Water/physiology , Alleles , Dehydration , Fruit/genetics , Fruit/physiology , Genotype , Solanum lycopersicum/physiology , Phenotype
9.
BMC Genomics ; 20(1): 634, 2019 Aug 06.
Article in English | MEDLINE | ID: mdl-31387530

ABSTRACT

BACKGROUND: The effective use of mutant populations for reverse genetic screens relies on the population-wide characterization of the induced mutations. Genome- and population-wide characterization of the mutations found in fast neutron populations has been hindered, however, by the wide range of mutations generated and the lack of affordable technologies to detect DNA sequence changes. In this study, we therefore aimed to test whether genotyping-by-sequencing (GBS) technology could be used to characterize copy number variation (CNV) induced by fast neutrons in a soybean mutant population. RESULTS: We called CNVs from GBS data in 79 soybean mutants and assessed the sensitivity and precision of this approach by validating our results against array comparative genomic hybridization (aCGH) data for 19 of these mutants as well as targeted PCR and ddPCR assays for a representative subset of the smallest events detected by GBS. Our GBS pipeline detected 55 of the 96 events found by aCGH, with approximate detection thresholds of 60 kb, 500 kb and 1 Mb for homozygous deletions, hemizygous deletions and duplications, respectively. Among the whole set of 79 mutants, the GBS data revealed 105 homozygous deletions, 32 hemizygous deletions and 19 duplications. This included several extremely large events, exhibiting maximum sizes of ~ 11.2 Mb for a homozygous deletion, ~ 11.6 Mb for a hemizygous deletion, and ~ 50 Mb for a duplication. CONCLUSIONS: This study provides a proof of concept that GBS can be used as an affordable high-throughput method for assessing CNVs in fast neutron mutants. The modularity of this GBS approach allows combining as many different libraries or sequencing runs as is necessary for reaching the goals of a particular study. This method should enable the low-cost genome-wide characterization of hundreds to thousands of individuals in fast neutron mutant populations or any population with large genomic deletions and duplications.


Subject(s)
DNA Copy Number Variations , DNA Mutational Analysis , Fast Neutrons , Genotyping Techniques , Glycine max/genetics , Mutation , Mutagenesis
10.
New Phytol ; 217(1): 367-377, 2018 Jan.
Article in English | MEDLINE | ID: mdl-29034956

ABSTRACT

Structural variation is a major source of genetic diversity and an important substrate for selection. In allopolyploids, homoeologous exchanges (i.e. between the constituent subgenomes) are a very frequent type of structural variant. However, their direct impact on gene content and gene expression had not been determined. Here, we used a tissue-specific mRNA-Seq dataset to measure the consequences of homoeologous exchanges (HE) on gene expression in Brassica napus, a representative allotetraploid crop. We demonstrate that expression changes are proportional to the change in gene copy number triggered by the HEs. Thus, when homoeologous gene pairs have unbalanced transcriptional contributions before the HE, duplication of one copy does not accurately compensate for loss of the other and combined homoeologue expression also changes. These effects are, however, mitigated over time. This study sheds light on the origins, timing and functional consequences of homoeologous exchanges in allopolyploids. It demonstrates that the interplay between new structural variation and the resulting impacts on gene expression influences allopolyploid genome evolution.


Subject(s)
Brassica napus/genetics , Gene Dosage , Genetic Variation , Genome, Plant/genetics , Gene Expression , Organ Specificity , Polyploidy , Recombination, Genetic , Sequence Analysis, RNA
11.
Brief Bioinform ; 16(4): 600-15, 2015 Jul.
Article in English | MEDLINE | ID: mdl-25202135

ABSTRACT

A number of bioinformatic or biostatistical methods are available for analyzing DNA copy number profiles measured from microarray or sequencing technologies. In the absence of rich enough gold standard data sets, the performance of these methods is generally assessed using unrealistic simulation studies, or based on small real data analyses. To make an objective and reproducible performance assessment, we have designed and implemented a framework to generate realistic DNA copy number profiles of cancer samples with known truth. These profiles are generated by resampling publicly available SNP microarray data from genomic regions with known copy-number state. The original data have been extracted from dilution series of tumor cell lines with matched blood samples at several concentrations. Therefore, the signal-to-noise ratio of the generated profiles can be controlled through the (known) percentage of tumor cells in the sample. This article describes this framework and its application to a comparison study between methods for segmenting DNA copy number profiles from SNP microarrays. This study indicates that no single method is uniformly better than all others. It also helps identify pros and cons of the compared methods as a function of biologically informative parameters, such as the fraction of tumor cells in the sample and the proportion of heterozygous markers. This comparison study may be reproduced using the open source and cross-platform R package jointseg, which implements the proposed data generation and evaluation framework: http://r-forge.r-project.org/R/?group_id=1562.


Subject(s)
DNA Copy Number Variations , Humans , Oligonucleotide Array Sequence Analysis , Polymorphism, Single Nucleotide
12.
Nucleic Acids Res ; 43(Database issue): D1010-7, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25392409

ABSTRACT

CATdb (http://urgv.evry.inra.fr/CATdb) is a database providing public access to a large collection of transcriptomic data, mainly for Arabidopsis but also for other plants. This resource has the rare advantage of containing several thousand microarray experiments obtained with the same technical protocol and analyzed by the same statistical pipelines. In this paper, we present GEM2Net, a new module of CATdb that takes advantage of this homogeneous dataset to mine co-expression units and decipher Arabidopsis gene functions. GEM2Net explores 387 stress conditions organized into 18 biotic and abiotic stress categories. For each one, a model-based clustering is applied on expression differences to identify clusters of co-expressed genes. To characterize functions associated with these clusters, various resources are analyzed and integrated: Gene Ontology, subcellular localization of proteins, Hormone Families, Transcription Factor Families and a refined stress-related gene list associated with publications. Exploiting protein-protein interactions and transcription factor-target interactions enables the display of gene networks. GEM2Net presents the analysis of the 18 stress categories, in which 17,264 genes are involved and organized within 681 co-expression clusters. The meta-data analyses were stored and organized to compose a dynamic Web resource.


Subject(s)
Arabidopsis/genetics , Databases, Genetic , Gene Expression Regulation, Plant , Gene Regulatory Networks , Stress, Physiological/genetics , Arabidopsis Proteins/genetics , Arabidopsis Proteins/metabolism , Gene Expression Profiling , Internet , Models, Genetic , Protein Interaction Mapping
13.
BMC Genomics ; 17(1): 818, 2016 10 21.
Article in English | MEDLINE | ID: mdl-27769163

ABSTRACT

BACKGROUND: Higher plants have to cope with increasing concentrations of pollutants of both natural and anthropogenic origin. Given their capacity to concentrate and metabolize various compounds including pollutants, plants can be used to treat environmental problems - a process called phytoremediation. However, the molecular mechanisms underlying the stabilization, the extraction, the accumulation and partial or complete degradation of pollutants by plants remain poorly understood. RESULTS: Here, we determined the molecular events involved in the early plant response to phenanthrene, used as a model of polycyclic aromatic hydrocarbons. A transcriptomic and a metabolic analysis strongly suggest that energy availability is the crucial limiting factor leading to high and rapid transcriptional reprogramming that can ultimately lead to death. We show that the accumulation of phenanthrene in leaves inhibits electron transfer and photosynthesis within a few minutes, probably disrupting energy transformation. CONCLUSION: This kinetic analysis improved the resolution of the transcriptome in the initial plant response to phenanthrene, identifying genes that are involved in primary processes set up to sense and detoxify this pollutant but also in molecular mechanisms used by the plant to cope with such harmful stress. The identification of first events involved in plant response to phenanthrene is a key step in the selection of candidates for further functional characterization, with the prospect of engineering efficient ecological detoxification systems for polycyclic aromatic hydrocarbons.


Subject(s)
Environmental Pollutants/pharmacology , Phenanthrenes/pharmacology , Plant Physiological Phenomena/drug effects , Plant Physiological Phenomena/genetics , Cluster Analysis , Dose-Response Relationship, Drug , Energy Metabolism/drug effects , Energy Metabolism/genetics , Gene Expression Regulation, Plant/drug effects , Plant Development/drug effects , Plant Development/genetics , Transcriptome , Xenobiotics/pharmacology
14.
Bioinformatics ; 30(11): 1539-46, 2014 Jun 01.
Article in English | MEDLINE | ID: mdl-24493034

ABSTRACT

MOTIVATION: DNA copy number profiles characterize regions of chromosome gains, losses and breakpoints in tumor genomes. Although many models have been proposed to detect these alterations, it is not clear which model is appropriate before visual inspection of the signal, noise and models for a particular profile. RESULTS: We propose SegAnnDB, a Web-based computer vision system for genomic segmentation: first, visually inspect the profiles and manually annotate altered regions, then SegAnnDB determines the precise alteration locations using a mathematical model of the data and annotations. SegAnnDB facilitates collaboration between biologists and bioinformaticians, and uses the University of California, Santa Cruz genome browser to visualize copy number alterations alongside known genes. AVAILABILITY AND IMPLEMENTATION: The breakpoints project on INRIA GForge hosts the source code, an Amazon Machine Image can be launched and a demonstration Web site is http://bioviz.rocq.inria.fr.


Subject(s)
DNA Copy Number Variations , Software , Algorithms , Chromosome Breakpoints , Genomics/methods , Internet
15.
Nucleic Acids Res ; 41(21): e200, 2013 Nov.
Article in English | MEDLINE | ID: mdl-24062158

ABSTRACT

Traditional methods that aim to identify biomarkers that distinguish between two groups, like Significance Analysis of Microarrays or the t-test, perform optimally when such biomarkers show homogeneous behavior within each group and differential behavior between the groups. However, in many applications, this is not the case. Instead, a subgroup of samples in one group shows differential behavior with respect to all other samples. To successfully detect markers showing such imbalanced patterns of differential signal, a different approach is required. We propose a novel method, specifically designed for the Detection of Imbalanced Differential Signal (DIDS). We use an artificial dataset and a human breast cancer dataset to measure its performance and compare it with three traditional methods and four approaches that take imbalanced signal into account. Supported by extensive experimental results, we show that DIDS outperforms all other approaches in terms of power and positive predictive value. In a mouse breast cancer dataset, DIDS is the only approach that detects a functionally validated marker of chemotherapy resistance. DIDS can be applied to any continuous value data, including gene expression data, and in any context where imbalanced differential signal is manifested.


Subject(s)
Algorithms , Biomarkers, Tumor/metabolism , Gene Expression , Animals , Biomarkers, Tumor/genetics , Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Female , Humans , Mammary Neoplasms, Experimental/genetics , Mammary Neoplasms, Experimental/metabolism , Mice , Receptor, ErbB-2/analysis
16.
Bioinformatics ; 28(18): 2357-65, 2012 Sep 15.
Article in English | MEDLINE | ID: mdl-22796958

ABSTRACT

MOTIVATION: Target enrichment, also referred to as DNA capture, provides an effective way to focus sequencing efforts on a genomic region of interest. Capture data are typically used to detect single-nucleotide variants. They can also be used to detect copy number alterations, which is particularly useful in the context of cancer, where such changes occur frequently. In copy number analysis, it is a common practice to determine log-ratios between test and control samples, but this approach results in a loss of information as it disregards the total coverage or intensity at a locus. RESULTS: We modeled the coverage or intensity of the test sample as a linear function of the control sample. This regression approach is able to deal with regions that are completely deleted, which are problematic for methods that use log-ratios. To demonstrate the utility of our approach, we used capture data to determine copy number for a set of 600 genes in a panel of nine breast cancer cell lines. We found high concordance between our results and those generated using a single-nucleotide polymorphism genotyping platform. When we compared our results with other log-ratio-based methods, including ExomeCNV, we found that our approach produced better overall correlation with SNP data. AVAILABILITY: The algorithm is implemented in C and R and the code can be downloaded from http://bioinformatics.nki.nl/ocs/ CONTACT: l.wessels@nki.nl SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
DNA Copy Number Variations , Sequence Analysis, DNA , Algorithms , Breast Neoplasms/genetics , Cell Line, Tumor , Female , Genomics/methods , Genotype , Humans , Linear Models , Polymorphism, Single Nucleotide
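
The regression idea in the abstract above, modeling test coverage as a linear function of control coverage, can be illustrated with a minimal sketch. It is not the published C/R implementation: it assumes an ordinary least-squares line through the origin fitted over the loci of one region, and the function name `coverage_slope` and the coverage values are hypothetical.

```python
def coverage_slope(test, control):
    """Least-squares slope of test coverage on control coverage, with no intercept.

    Assumption: one slope per region; a slope near 1 suggests no copy number
    change relative to the control, and a slope of 0 a complete deletion.
    """
    if len(test) != len(control):
        raise ValueError("test and control must have the same length")
    sxx = sum(c * c for c in control)
    if sxx == 0:
        raise ValueError("control coverage is zero at every locus")
    return sum(t * c for t, c in zip(test, control)) / sxx


# Hypothetical per-locus coverage values for one region.
control = [100, 200, 300]
print(coverage_slope([50, 100, 150], control))  # half the control coverage
print(coverage_slope([0, 0, 0], control))       # completely deleted region
```

The second call shows the point made in the abstract: a fully deleted region still gives a finite estimate, whereas a log-ratio would require the logarithm of zero.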
17.
NAR Genom Bioinform ; 5(4): lqad098, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37954572

ABSTRACT

To fully understand gene regulation, it is necessary to have a thorough understanding of both the transcriptome and the enzymatic and RNA-binding activities that shape it. While many RNA-Seq-based tools have been developed to analyze the transcriptome, most only consider the abundance of sequencing reads along annotated patterns (such as genes). These annotations are typically incomplete, leading to errors in the differential expression analysis. To address this issue, we present DiffSegR - an R package that enables the discovery of transcriptome-wide expression differences between two biological conditions using RNA-Seq data. DiffSegR does not require prior annotation and uses a multiple changepoint detection algorithm to identify the boundaries of differentially expressed regions in the per-base log2 fold change. In a few minutes of computation, DiffSegR could correctly predict the role of chloroplast ribonuclease Mini-III in rRNA maturation and chloroplast ribonuclease PNPase in (3'/5')-degradation of rRNA, mRNA and tRNA precursors as well as intron accumulation. We believe DiffSegR will benefit biologists working on transcriptomics as it allows access to information from a layer of the transcriptome overlooked by the classical differential expression analysis pipelines widely used today. DiffSegR is available at https://aliehrmann.github.io/DiffSegR/index.html.

18.
Biostatistics ; 12(3): 413-28, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21209153

ABSTRACT

The statistical analysis of array comparative genomic hybridization (CGH) data has now shifted to the joint assessment of copy number variations at the cohort level. Considering multiple profiles gives the opportunity to correct for systematic biases observed on single profiles, such as probe GC content or the so-called "wave effect." In this article, we extend the segmentation model developed in the univariate case to the joint analysis of multiple CGH profiles. Our contribution is multiple: we propose an integrated model to perform joint segmentation, normalization, and calling for multiple array CGH profiles. This model shows great flexibility, especially in the modeling of the wave effect that gives a likelihood framework to approaches proposed by others. We propose a new dynamic programming algorithm for break point positioning, as well as a model selection criterion based on a modified Bayesian information criterion proposed in the univariate case. The performance of our method is assessed using simulated and real data sets. Our method is implemented in the R package cghseg.


Subject(s)
Bayes Theorem , Comparative Genomic Hybridization/methods , Statistical Data Interpretation , Genetic Models , Statistical Models , Algorithms , Computer Simulation , Haplotypes , Humans

19.
Front Plant Sci ; 13: 980587, 2022.
Article in English | MEDLINE | ID: mdl-36479518

ABSTRACT

Partial resistance in plants generally exerts a low selective pressure on pathogens, thus ensuring its durability in agrosystems. However, little is known about the effect of partial resistance on the molecular mechanisms of pathogenicity, knowledge that could advance plant breeding for sustainable plant health. Here we investigate the gene expression of Phytophthora capsici during infection of pepper (Capsicum annuum L.), where only partial genetic resistance is reported, using Illumina RNA-seq. Comparison of transcriptomes of P. capsici infecting susceptible and partially resistant peppers identified a small number of genes that redirected the pathogen's own resources into lipid biosynthesis to subsist on partially resistant plants. The adapted and non-adapted isolates of P. capsici differed in the expression of genes involved in nucleic acid synthesis and of transporter genes. Transient ectopic expression of the RxLR effector genes CUST_2407 and CUST_16519 in pepper lines differing in resistance levels revealed specific host-isolate interactions that either triggered local necrotic lesions (hypersensitive response or HR) or elicited leaf abscission (extreme resistance or ER), preventing the spread of the pathogen to healthy tissue. Although these effectors did not unequivocally explain the quantitative host resistance, our findings highlight the importance of plant genes that limit nutrient resources for selecting pepper cultivars with sustainable resistance to P. capsici.

20.
Genes (Basel) ; 13(1), 2021 Dec 27.
Article in English | MEDLINE | ID: mdl-35052407

ABSTRACT

RNA silencing plays key roles in a multitude of cellular processes, including development, stress responses, metabolism, and maintenance of genome integrity. Dicer, Argonaute (AGO), double-stranded RNA binding (DRB) proteins, RNA-dependent RNA polymerase (RDR), and the DNA-dependent RNA polymerases known as Pol IV and Pol V form the core components that trigger RNA silencing. Common bean (Phaseolus vulgaris) is an important staple crop worldwide. In this study, we aimed to unravel the components of the RNA-guided silencing pathway in this non-model plant, taking advantage of the availability of two genome assemblies of Andean and Meso-American origin. We identified six PvDCLs, thirteen PvAGOs, ten PvDRBs, and five PvRDRs in both genotypes, suggesting no recent gene amplification or deletion after the gene pool separation. In addition, we identified one PvNRPD1 and one PvNRPE1 encoding the largest subunits of Pol IV and Pol V, respectively. These genes were categorized into subgroups based on phylogenetic analyses. Comprehensive analyses of gene structure, genomic localization, and similarity among these genes were performed. Their expression patterns were investigated by means of expression models in different organs using online data and by quantitative RT-PCR after pathogen infection. Several of the candidate genes were up-regulated after infection with the fungus Colletotrichum lindemuthianum.


Subject(s)
Colletotrichum/physiology , Plant Gene Expression Regulation , Genome-Wide Association Study , Phaseolus/genetics , Plant Diseases/genetics , Plant Proteins/metabolism , RNA Interference , Argonaute Proteins/genetics , Argonaute Proteins/metabolism , DNA-Directed RNA Polymerases/genetics , DNA-Directed RNA Polymerases/metabolism , Phaseolus/growth & development , Phaseolus/immunology , Phaseolus/microbiology , Phylogeny , Plant Diseases/immunology , Plant Diseases/microbiology , Plant Proteins/genetics , RNA-Dependent RNA Polymerase/genetics , RNA-Dependent RNA Polymerase/metabolism , Transcriptome