1.
Environ Sci Technol ; 58(12): 5383-5393, 2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38478982

ABSTRACT

Cardiometabolic health is complex and characterized by an ensemble of correlated and/or co-occurring conditions including obesity, dyslipidemia, hypertension, and diabetes mellitus. It is affected by social, lifestyle, and environmental factors, which in turn exhibit complex correlation patterns. To account for the complexity of (i) exposure profiles and (ii) health outcomes, we propose to use a multitrait Bayesian variable selection approach and identify a sparse set of exposures jointly explanatory of the complex cardiometabolic health status. Using data from a subset (N = 941 participants) of the nutrition, environment, and cardiovascular health (NESCAV) study, we evaluated the link between measurements of the cumulative exposure to (N = 33) pollutants derived from hair and cardiometabolic health as proxied by up to nine measured traits. Our multitrait analysis showed increased statistical power, compared to single-trait analyses, to detect subtle contributions of exposures to a set of clinical phenotypes, while providing parsimonious results with improved interpretability. We identified six exposures that were jointly explanatory of cardiometabolic health as modeled by six complementary traits. Among these, we identified strong associations between hexachlorobenzene and trifluralin exposure and adverse cardiometabolic health, including traits of obesity, dyslipidemia, and hypertension. This supports the use of this type of approach for the joint modeling, in an exposome context, of correlated exposures in relation to complex and multifaceted outcomes.


Subjects
Dyslipidemias , Exposome , Hypertension , Humans , Bayes Theorem , Obesity/epidemiology , Hair , Environmental Exposure
2.
J R Stat Soc Ser C Appl Stat ; 72(5): 1375-1393, 2023 Nov.
Article in English | MEDLINE | ID: mdl-38143734

ABSTRACT

Stability selection represents an attractive approach to identify sparse sets of features jointly associated with an outcome in high-dimensional contexts. We introduce an automated calibration procedure based on the maximisation of an in-house stability score, which accommodates a priori known block structure in the data (e.g. multi-OMIC data). It applies to Least Absolute Shrinkage and Selection Operator (LASSO) penalised regression and graphical models. Simulations show that our approach outperforms both non-stability-based approaches and stability selection approaches using the original calibration. Application of the multi-block graphical LASSO to real (epigenetic and transcriptomic) data from the Norwegian Women and Cancer study reveals a central/credible and novel cross-OMIC role of LRRN3 in the biological response to smoking. The proposed approaches are implemented in the R package sharp.
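
As a rough illustration of the general idea behind stability selection (not the sharp package, and without its stability score or automated calibration), the sketch below refits a LASSO on random subsamples and records how often each feature is selected. The function names, the fixed penalty `lam`, and the toy data are all assumptions made for this example.

```python
import random

def lasso_cd(X, y, lam, n_iter=50):
    """LASSO by cyclic coordinate descent.

    Minimises (1/2n) * ||y - X b||^2 + lam * ||b||_1 for a fixed penalty lam.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, starting from beta = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # covariance of feature j with the partial residual
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            # soft-thresholding update
            new = max(abs(rho) - lam, 0.0) / col_sq[j]
            new = new if rho > 0 else -new
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def selection_proportions(X, y, lam, n_subsamples=30, frac=0.5, seed=0):
    """Fraction of subsamples in which each feature gets a non-zero coefficient."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    m = int(frac * n)
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), m)
        beta = lasso_cd([X[i] for i in idx], [y[i] for i in idx], lam)
        for j in range(p):
            if beta[j] != 0.0:
                counts[j] += 1
    return [c / n_subsamples for c in counts]

# Toy data (hypothetical): only features 0 and 1 drive the outcome.
_rng = random.Random(1)
X_toy = [[_rng.gauss(0, 1) for _ in range(6)] for _ in range(100)]
y_toy = [2.0 * row[0] - 1.5 * row[1] + _rng.gauss(0, 0.5) for row in X_toy]
props = selection_proportions(X_toy, y_toy, lam=0.3)
```

Features with a selection proportion above a chosen threshold would be retained as stably selected. In this sketch the penalty and the threshold are left to the user, whereas the abstract describes calibrating such hyper-parameters automatically.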

3.
Bioinformatics ; 39(11)2023 11 01.
Article in English | MEDLINE | ID: mdl-37847776

ABSTRACT

MOTIVATION: In consensus clustering, a clustering algorithm is used in combination with a subsampling procedure to detect stable clusters. Previous studies on both simulated and real data suggest that consensus clustering outperforms native algorithms. RESULTS: We extend here consensus clustering to allow for attribute weighting in the calculation of pairwise distances using existing regularized approaches. We propose a procedure for the calibration of the number of clusters (and regularization parameter) by maximizing the sharp score, a novel stability score calculated directly from consensus clustering outputs, making it extremely computationally competitive. Our simulation study shows better clustering performances of (i) approaches calibrated by maximizing the sharp score compared to existing calibration scores and (ii) weighted compared to unweighted approaches in the presence of features that do not contribute to cluster definition. Application on real gene expression data measured in lung tissue reveals clear clusters corresponding to different lung cancer subtypes. AVAILABILITY AND IMPLEMENTATION: The R package sharp (version ≥1.4.3) is available on CRAN at https://CRAN.R-project.org/package=sharp.
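
A minimal sketch of the subsampling step in consensus clustering is given below, assuming scalar data and a plain k-means as the base algorithm. It only builds the co-membership (consensus) matrix; the sharp score and the attribute weighting described in the abstract are not reproduced, and all names and data are illustrative.

```python
import random

def kmeans_1d(values, k, n_iter=20):
    """Plain Lloyd's k-means on scalars; returns one label per value."""
    srt = sorted(values)
    # deterministic initialisation: spread the centres over the sorted values
    centres = [srt[(2 * c + 1) * len(srt) // (2 * k)] for c in range(k)]
    labels = [0] * len(values)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centres[c] = sum(members) / len(members)
    return labels

def consensus_matrix(values, k, n_subsamples=100, frac=0.5, seed=0):
    """Proportion of subsamples in which two items are clustered together,
    among the subsamples that contain both items."""
    rng = random.Random(seed)
    n = len(values)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    m = int(frac * n)
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), m)
        labels = kmeans_1d([values[i] for i in idx], k)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                if labels[a] == labels[b]:
                    together[i][j] += 1
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]

# Toy data (hypothetical): two well-separated groups of four items each.
toy_values = [0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3]
consensus = consensus_matrix(toy_values, k=2)
```

Entries close to 1 indicate pairs that are consistently clustered together across subsamples, and entries close to 0 indicate pairs that are consistently separated. Stable clusters can then be read off this matrix, for example by clustering it in turn.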


Subjects
Algorithms , Consensus , Calibration , Computer Simulation , Cluster Analysis
4.
Plant Commun ; 4(5): 100676, 2023 09 11.
Article in English | MEDLINE | ID: mdl-37644724

ABSTRACT

Plant defense responses involve several biological processes that allow plants to fight against pathogenic attacks. How these different processes are orchestrated within organs and depend on specific cell types is poorly known. Here, using single-cell RNA sequencing (scRNA-seq) technology on three independent biological replicates, we identified several cell populations representing the core transcriptional responses of wild-type Arabidopsis leaves inoculated with the bacterial pathogen Pseudomonas syringae DC3000. Among these populations, we retrieved major cell types of the leaves (mesophyll, guard, epidermal, companion, and vascular S cells) with which we could associate characteristic transcriptional reprogramming and regulators, thereby specifying different cell-type responses to the pathogen. Further analyses of transcriptional dynamics, on the basis of inference of cell trajectories, indicated that the different cell types, in addition to their characteristic defense responses, can also share similar modules of gene reprogramming, uncovering a ubiquitous antagonism between immune and susceptible processes. Moreover, it appears that the defense responses of vascular S cells, epidermal cells, and mesophyll cells can evolve along two separate paths, one converging toward an identical cell fate, characterized mostly by lignification and detoxification functions. As this divergence does not correspond to the differentiation between immune and susceptible cells, we speculate that this might reflect the discrimination between cell-autonomous and non-cell-autonomous responses. Altogether our data provide an upgraded framework to describe, explore, and explain the specialization and the coordination of plant cell responses upon pathogenic challenge.


Subjects
Arabidopsis , Arabidopsis/genetics , Single-Cell Gene Expression Analysis , Plant Leaves/genetics , Cell Differentiation , Plant Cells
5.
G3 (Bethesda) ; 11(9)2021 09 06.
Article in English | MEDLINE | ID: mdl-34544146

ABSTRACT

Viticulture has to cope with climate change and to decrease pesticide inputs, while maintaining yield and wine quality. Breeding is a key lever to meet this challenge, and genomic prediction a promising tool to accelerate breeding programs. Multivariate methods are potentially more accurate than univariate ones. Moreover, some prediction methods also provide marker selection, thus allowing quantitative trait loci (QTLs) detection and the identification of positional candidate genes. To study both genomic prediction and QTL detection for drought-related traits in grapevine, we applied several methods, interval mapping (IM) as well as univariate and multivariate penalized regression, in a bi-parental progeny. With a dense genetic map, we simulated two traits under four QTL configurations, using the penalized regression method Elastic Net (EN) for genomic prediction and controlling the marginal False Discovery Rate on EN-selected markers to prioritize the QTLs. Indeed, penalized methods were more powerful than IM for QTL detection across various genetic architectures. Multivariate prediction did not perform better than its univariate counterpart, despite strong genetic correlation between traits. Using 14 traits measured in semi-controlled conditions under different watering conditions, penalized regression methods proved very efficient for intra-population prediction whatever the genetic architecture of the trait, with predictive abilities reaching 0.68. Compared to a previous study on the same traits, these methods applied on a denser map found new QTLs controlling traits linked to drought tolerance and provided relevant candidate genes. Overall, these findings provide a strong evidence base for implementing genomic prediction in grapevine breeding.


Subjects
Droughts , Quantitative Trait Loci , Chromosome Mapping , Genomics , Phenotype
6.
Ecol Lett ; 24(9): 1905-1916, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34231296

ABSTRACT

The relative importance of ecological factors and species interactions for shaping species distributions is still debated. The realised niches of eight sympatric tephritid fruit flies were inferred from field abundance data using joint species distribution modelling and network inference, on the whole community and separately on three host plant groups. These estimates were then compared with the fundamental niches of seven fly species, estimated through laboratory-measured fitnesses on host plants. Species abundances depended on host plants, followed by climatic factors, with a dose of competition between species sharing host plants. The relative importance of these factors changed mildly among the three host plant groups. Despite overlapping fundamental niches, specialists and generalists had almost distinct realised niches, with possible competitive exclusion of generalists by specialists on Cucurbitaceae. They had different assembly rules: specialists were mainly influenced by their adaptation to host plants, while generalist abundances varied regardless of their fundamental host use.


Subjects
Drosophila , Plants , Animals
7.
J Bioinform Comput Biol ; 19(1): 2140003, 2021 02.
Article in English | MEDLINE | ID: mdl-33653235

ABSTRACT

In many cancers, mechanisms of gene regulation can be severely altered. Identification of deregulated genes, which do not follow the regulation processes that exist between transcription factors and their target genes, is of importance to better understand the development of the disease. We propose a methodology to detect deregulation mechanisms with a particular focus on cancer subtypes. This strategy is based on the comparison between tumoral and healthy cells. First, we use gene expression data from healthy cells to infer a reference gene regulatory network. Then, we compare it with gene expression levels in tumor samples to detect deregulated target genes. We finally measure the ability of each transcription factor to explain these deregulations. We apply our method on a public bladder cancer data set derived from The Cancer Genome Atlas project and confirm that it captures hallmarks of cancer subtypes. We also show that it enables the discovery of new potential biomarkers.


Subjects
Algorithms , Gene Expression Regulation, Neoplastic , Models, Genetic , Neoplasms/genetics , Neoplasms/pathology , Gene Regulatory Networks , Humans , Transcription Factors/genetics , Urinary Bladder Neoplasms/genetics
8.
Algorithms Mol Biol ; 15: 13, 2020.
Article in English | MEDLINE | ID: mdl-32625242

ABSTRACT

MOTIVATION: Association studies have been widely used to search for associations between common genetic variants and a given phenotype. However, it is now generally accepted that genes and environment must be examined jointly when estimating phenotypic variance. In this work we consider two types of biological markers: genotypic markers, which characterize an observation in terms of inherited genetic information, and metagenomic markers, which are related to the environment. Both types of markers are available in their millions and can be used to characterize any observation uniquely. OBJECTIVE: Our focus is on detecting interactions between groups of genetic and metagenomic markers in order to gain a better understanding of the complex relationship between environment and genome in the expression of a given phenotype. CONTRIBUTIONS: We propose a novel approach for efficiently detecting interactions between complementary datasets in a high-dimensional setting with a reduced computational cost. The method, named SICOMORE, reduces the dimension of the search space by selecting a subset of supervariables in the two complementary datasets. These supervariables are given by a weighted group structure defined on sets of variables at different scales. A Lasso selection is then applied on each type of supervariable to obtain a subset of potential interactions that will be explored via linear model testing. RESULTS: We compare SICOMORE with other approaches in simulations, with varying sample sizes, noise, and numbers of true interactions. SICOMORE exhibits convincing results in terms of recall, as well as competitive performance with respect to running time. The method is also used to detect interactions between genomic markers in Medicago truncatula and metagenomic markers in its rhizosphere bacterial community. SOFTWARE AVAILABILITY: An R package is available [4], along with its documentation and associated scripts, allowing the reader to reproduce the results presented in the paper.

9.
BMC Bioinformatics ; 21(1): 120, 2020 Mar 20.
Article in English | MEDLINE | ID: mdl-32197576

ABSTRACT

BACKGROUND: In unsupervised learning and clustering, data integration from different sources and types is a difficult question discussed in several research areas. For instance, in omics analysis, dozens of clustering methods have been developed in the past decade. When a single source of data is at play, hierarchical clustering (HC) is extremely popular, as a tree structure is highly interpretable and arguably more informative than just a partition of the data. However, blindly applying HC to multiple sources of data raises computational and interpretation issues. RESULTS: We propose mergeTrees, a method that aggregates a set of trees with the same leaves to create a consensus tree. In our consensus tree, a cluster at height h contains the individuals that are in the same cluster for all the trees at height h. The method is exact and proven to be [Formula: see text], n being the number of individuals and q being the number of trees to aggregate. Our implementation is extremely effective on simulations, allowing us to process many large trees at a time. We also rely on mergeTrees to perform the cluster analysis of two real -omics data sets, introducing a spectral variant as an efficient and robust by-product. CONCLUSIONS: Our tree aggregation method can be used in conjunction with hierarchical clustering to perform efficient cluster analysis. This approach was found to be robust to the absence of clustering information in some of the data sets as well as to an increased variability within true clusters. The method is implemented in R/C++ and available as an R package named mergeTrees, which makes it easy to integrate in existing or new pipelines in several research areas.
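
The consensus rule stated in the abstract (individuals share a cluster at height h only if they share one in every tree at height h) can be sketched naively as follows. This is not the mergeTrees algorithm or its complexity: each tree is simply cut at h and the resulting partitions are intersected. The merge-list encoding of a tree and the function names are assumptions made for this example.

```python
from collections import defaultdict

def cut_tree(n_leaves, merges, h):
    """Cluster labels of the leaves after applying every merge of height <= h.

    `merges` is a list of (height, leaf_a, leaf_b) tuples, each merging the
    clusters that currently contain leaf_a and leaf_b (hypothetical encoding).
    """
    parent = list(range(n_leaves))

    def find(x):
        # union-find lookup with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for height, a, b in sorted(merges):
        if height <= h:
            parent[find(a)] = find(b)
    return [find(i) for i in range(n_leaves)]

def consensus_clusters(n_leaves, trees, h):
    """Leaves share a consensus cluster iff they share a cluster in every tree at h."""
    cuts = [cut_tree(n_leaves, merges, h) for merges in trees]
    groups = defaultdict(list)
    for leaf in range(n_leaves):
        # the tuple of per-tree labels identifies the intersection cell
        signature = tuple(cut[leaf] for cut in cuts)
        groups[signature].append(leaf)
    return sorted(groups.values())

# Two toy trees on the same four leaves (hypothetical data).
tree_1 = [(1.0, 0, 1), (2.0, 2, 3), (3.0, 0, 2)]
tree_2 = [(1.0, 0, 1), (1.5, 1, 2), (3.0, 2, 3)]
clusters_at_2 = consensus_clusters(4, [tree_1, tree_2], h=2.0)
```

At height 2.0 the first tree groups the leaves as {0, 1} and {2, 3}, the second as {0, 1, 2} and {3}, so only leaves 0 and 1 remain together in the consensus.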


Subjects
Cluster Analysis , Algorithms , Gene Expression Profiling , Humans , Proteomics
10.
Cell ; 179(2): 432-447.e21, 2019 10 03.
Article in English | MEDLINE | ID: mdl-31585082

ABSTRACT

Cell-cell communication involves a large number of molecular signals that function as words of a complex language whose grammar remains mostly unknown. Here, we describe an integrative approach involving (1) protein-level measurement of multiple communication signals coupled to output responses in receiving cells and (2) mathematical modeling to uncover input-output relationships and interactions between signals. Using human dendritic cell (DC)-T helper (Th) cell communication as a model, we measured 36 DC-derived signals and 17 Th cytokines broadly covering Th diversity in 428 observations. We developed a data-driven, computationally validated model capturing 56 already described and 290 potentially novel mechanisms of Th cell specification. By predicting context-dependent behaviors, we demonstrate a new function for IL-12p70 as an inducer of Th17 in an IL-1 signaling context. This work provides a unique resource to decipher the complex combinatorial rules governing DC-Th cell communication and guide their manipulation for vaccine design and immunotherapies.


Subjects
Cell Communication/immunology , Dendritic Cells/immunology , Interleukin-12/physiology , Th17 Cells/immunology , Adolescent , Adult , Aged , Cells, Cultured , Coculture Techniques , Healthy Volunteers , Humans , Interleukin-1/metabolism , Middle Aged , Models, Biological , Young Adult
11.
Methods Mol Biol ; 1883: 143-160, 2019.
Article in English | MEDLINE | ID: mdl-30547399

ABSTRACT

This chapter addresses the problem of reconstructing regulatory networks in molecular biology by integrating multiple sources of data. We consider data sets measured from diverse technologies all related to the same set of variables and individuals. This situation is becoming more and more common in molecular biology, for instance, when both proteomic and transcriptomic data related to the same set of "genes" are available on a given cohort of patients. To infer a consensus network that integrates both proteomic and transcriptomic data, we introduce a multivariate extension of Gaussian graphical models (GGM), which we refer to as multiattribute GGM. Indeed, the GGM framework offers a good proxy for modeling direct links between biological entities. We perform the inference of our multivariate GGM with a neighborhood selection procedure that operates at a multiscale level. This procedure employs a group-Lasso penalty in order to select interactions which operate both at the proteomic and at the transcriptomic level between two genes. We end up with a consensus network embedding information shared at multiple scales of the cell. We illustrate this method on two breast cancer data sets. An R package is publicly available on GitHub at https://github.com/jchiquet/multivarNetwork to promote reproducibility.


Subjects
Breast Neoplasms/genetics , Computational Biology/methods , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Models, Genetic , Algorithms , Computational Biology/instrumentation , Datasets as Topic , Female , Gene Expression Profiling/instrumentation , Gene Expression Profiling/methods , Humans , Normal Distribution , Proteomics/instrumentation , Proteomics/methods , Software
12.
Stat Appl Genet Mol Biol ; 17(5)2018 09 08.
Article in English | MEDLINE | ID: mdl-30205662

ABSTRACT

Omic data are characterized by the presence of strong dependence structures that result either from data acquisition or from some underlying biological processes. Applying statistical procedures that do not adjust the variable selection step to the dependence pattern may result in a loss of power and the selection of spurious variables. The goal of this paper is to propose a variable selection procedure within the multivariate linear model framework that accounts for the dependence between the multiple responses. We focus on a specific type of dependence, which consists in assuming that the responses of a given individual can be modelled as a time series. We propose a novel Lasso-based approach within the framework of the multivariate linear model that takes the dependence structure into account by using different types of stationary process covariance structures for the random error matrix. Our numerical experiments show that including the estimation of the covariance matrix of the random error matrix in the Lasso criterion dramatically improves the variable selection performance. Our approach is successfully applied to an untargeted LC-MS (Liquid Chromatography-Mass Spectrometry) data set made of African copal samples. Our methodology is implemented in the R package MultiVarSel, which is available from the Comprehensive R Archive Network (CRAN).


Subjects
Biomarkers/metabolism , Chromatography, Liquid/methods , Data Interpretation, Statistical , Metabolomics/methods , Tandem Mass Spectrometry/methods , Humans , Linear Models , Metabolomics/statistics & numerical data
13.
BMC Syst Biol ; 9 Suppl 6: S6, 2015.
Article in English | MEDLINE | ID: mdl-26679516

ABSTRACT

In tumoral cells, gene regulation mechanisms are severely altered. Genes that do not react normally to their regulators' activity can provide explanations for the tumoral behavior, and be characteristic of cancer subtypes. We thus propose a statistical methodology to identify the misregulated genes given a reference network and gene expression data.


Subjects
Models, Genetic , Transcriptome , Algorithms , False Positive Reactions , Gene Regulatory Networks , Humans , Urinary Bladder Neoplasms/genetics
14.
Science ; 345(6199): 950-3, 2014 Aug 22.
Article in English | MEDLINE | ID: mdl-25146293

ABSTRACT

Oilseed rape (Brassica napus L.) was formed ~7500 years ago by hybridization between B. rapa and B. oleracea, followed by chromosome doubling, a process known as allopolyploidy. Together with more ancient polyploidizations, this conferred an aggregate 72× genome multiplication since the origin of angiosperms and high gene content. We examined the B. napus genome and the consequences of its recent duplication. The constituent An and Cn subgenomes are engaged in subtle structural, functional, and epigenetic cross-talk, with abundant homeologous exchanges. Incipient gene loss and expression divergence have begun. Selection in B. napus oilseed types has accelerated the loss of glucosinolate genes, while preserving expansion of oil biosynthesis genes. These processes provide insights into allopolyploid evolution and its relationship with crop domestication and improvement.


Subjects
Brassica napus/genetics , Chromosome Duplication , Evolution, Molecular , Genome, Plant , Polyploidy , Seeds/genetics , Brassica napus/cytology
15.
New Phytol ; 197(3): 730-736, 2013 Feb.
Article in English | MEDLINE | ID: mdl-23278496

ABSTRACT

The reprogramming of gene expression appears as the major trend in synthetic and natural allopolyploids where expression of an important proportion of genes was shown to deviate from that of the parents or the average of the parents. In this study, we analyzed gene expression changes in previously reported, highly stable synthetic wheat allohexaploids that combine the D genome of Aegilops tauschii and the AB genome extracted from the natural hexaploid wheat Triticum aestivum. A comprehensive genome-wide analysis of transcriptional changes using the Affymetrix GeneChip Wheat Genome Array was conducted. Prevalence of gene expression additivity was observed where expression does not deviate from the average of the parents for 99.3% of 34,820 expressed transcripts. Moreover, nearly similar expression was observed (for 99.5% of genes) when comparing these synthetic and natural wheat allohexaploids. Such near-complete additivity has never been reported for other allopolyploids and, more interestingly, for other synthetic wheat allohexaploids that differ from the ones studied here by having the natural tetraploid Triticum turgidum as the AB genome progenitor. Our study gave insights into the dynamics of additive gene expression in the highly stable wheat allohexaploids.


Subjects
Polyploidy , Triticum/genetics , Gene Expression Profiling , Gene Expression Regulation, Plant , Genes, Plant , Genome, Plant , Genomic Instability
16.
Stat Appl Genet Mol Biol ; 9: Article 15, 2010.
Article in English | MEDLINE | ID: mdl-20196750

ABSTRACT

We present a weighted-LASSO method to infer the parameters of a first-order vector auto-regressive model that describes time course expression data generated by directed gene-to-gene regulation networks. These networks are assumed to have a prior internal connectivity structure, which drives the inference method. This prior structure can be either derived from prior biological knowledge or inferred by the method itself. We illustrate the performance of this structure-based penalization both on synthetic data and on two canonical regulatory networks (the yeast cell cycle regulation network and the E. coli S.O.S. DNA repair network).
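
For orientation, a generic form of a weighted-LASSO criterion for a first-order vector auto-regressive model can be written as below. The notation ($X_t$, $A$, $w_{ij}$, $\lambda$) is assumed for this illustration and the paper's exact criterion may differ, for instance in how the fit term is defined or normalised.

```latex
% First-order VAR model for the expression vector X_t of p genes
X_t = A\,X_{t-1} + \varepsilon_t, \qquad t = 1, \dots, T

% Generic weighted-LASSO estimate of the regulation matrix A:
% the weights w_{ij} encode the prior structure (a smaller weight
% penalises the edge j -> i less)
\hat{A} = \arg\min_{A \in \mathbb{R}^{p \times p}}
  \sum_{t=1}^{T} \bigl\lVert X_t - A\,X_{t-1} \bigr\rVert_2^2
  + \lambda \sum_{i=1}^{p} \sum_{j=1}^{p} w_{ij}\,\lvert A_{ij} \rvert
```

Non-zero entries of the estimated matrix are read as directed edges of the inferred regulation network.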


Subjects
Gene Expression Profiling/statistics & numerical data , Gene Regulatory Networks , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Regression Analysis , Algorithms , Biostatistics , Cell Cycle/genetics , Escherichia coli/genetics , Escherichia coli/metabolism , Likelihood Functions , Models, Genetic , Models, Statistical , SOS Response, Genetics/genetics , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae/genetics
17.
Bioinformatics ; 25(3): 417-8, 2009 Feb 01.
Article in English | MEDLINE | ID: mdl-19073589

ABSTRACT

SUMMARY: The R package SIMoNe (Statistical Inference for MOdular NEtworks) enables inference of gene-regulatory networks based on partial correlation coefficients from microarray experiments. Modelling gene expression data with a Gaussian graphical model (hereafter GGM), the algorithm estimates non-zero entries of the concentration matrix, in a sparse and possibly high-dimensional setting. Its originality lies in the fact that it searches for a latent modular structure to drive the inference procedure through adaptive penalization of the concentration matrix. AVAILABILITY: Under the GNU General Public Licence at http://cran.r-project.org/web/packages/simone/
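
The link between a concentration (inverse covariance) matrix and partial correlations that underlies Gaussian graphical models can be sketched as follows. This uses the standard identity for partial correlations and is not the SIMoNe API; the function names and the toy matrix are assumptions made for this example.

```python
import math

def partial_correlations(K):
    """Partial correlations from a concentration matrix K.

    Standard GGM identity: rho_ij = -K_ij / sqrt(K_ii * K_jj) for i != j.
    """
    p = len(K)
    return [
        [1.0 if i == j else -K[i][j] / math.sqrt(K[i][i] * K[j][j]) for j in range(p)]
        for i in range(p)
    ]

def network_edges(K, tol=1e-12):
    """Undirected edges of the graph: pairs with a non-zero concentration entry."""
    p = len(K)
    return [(i, j) for i in range(p) for j in range(i + 1, p) if abs(K[i][j]) > tol]

# Toy concentration matrix (hypothetical): a chain 0 - 1 - 2.
K_toy = [
    [2.0, -1.0, 0.0],
    [-1.0, 2.0, -1.0],
    [0.0, -1.0, 2.0],
]
R_toy = partial_correlations(K_toy)
edges_toy = network_edges(K_toy)
```

Genes 0 and 2 have a zero concentration entry, so they are conditionally independent given gene 1 and no edge is drawn between them. Estimating a sparse concentration matrix therefore amounts to inferring the network.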


Subjects
Algorithms , Gene Regulatory Networks , Software , Computer Simulation , Databases, Genetic , Gene Expression Profiling