ABSTRACT
The increasing use of high-throughput gene expression quantification technologies over the last two decades, together with the fact that most published studies are deposited in public databases, has triggered an explosion of studies available through public repositories. All this information is an invaluable resource that can be reused to generate new knowledge and scientific findings. In this context, great interest has focused on meta-analysis methods to integrate and jointly analyze different gene expression datasets. In this work, we describe the main steps of gene expression meta-analysis, from data preparation to state-of-the-art statistical methods. We also analyze the main types of applications and problems that can be approached in gene expression meta-analysis studies and provide a comparative overview of the available software and bioinformatics tools. A practical guide for choosing the most appropriate method in each case is also provided.
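As an illustration of the kind of statistical method such a meta-analysis can rely on, the sketch below pools per-study effect sizes for a single gene with a DerSimonian-Laird random-effects model. This is a generic textbook estimator chosen for illustration, not a method the abstract attributes to the authors; the function name `random_effects_meta` and the input values are hypothetical.

```python
import math


def random_effects_meta(effects, variances):
    """Pool per-study effect sizes with a DerSimonian-Laird random-effects model.

    effects   -- one effect size per study (e.g. a standardized mean difference)
    variances -- the sampling variance of each effect size
    Returns (pooled_effect, standard_error, tau2, two_sided_p_value).
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) estimate, used to measure heterogeneity.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and the DerSimonian-Laird between-study variance tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to every study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))

    # Two-sided p-value from the normal approximation.
    p_value = math.erfc(abs(pooled / se) / math.sqrt(2.0))
    return pooled, se, tau2, p_value


if __name__ == "__main__":
    # Hypothetical effect sizes for one gene measured in three studies.
    print(random_effects_meta([0.8, 0.5, 1.1], [0.04, 0.09, 0.06]))
```

In a real analysis this calculation would be repeated for every gene and followed by a multiple-testing correction.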
Subjects
Gene Expression, Computational Biology/methods, Datasets as Topic, Internet
ABSTRACT
BACKGROUND: Autoimmune diseases are heterogeneous pathologies that are difficult to diagnose and have few therapeutic options. In the last decade, several omics studies have provided significant insights into the molecular mechanisms of these diseases. Nevertheless, data from different cohorts and pathologies are stored independently in public repositories, and a unified resource is imperative to assist researchers in this field. RESULTS: Here, we present Autoimmune Diseases Explorer ( https://adex.genyo.es ), a database that integrates 82 curated transcriptomics and methylation studies covering 5609 samples for some of the most common autoimmune diseases. The database provides, in an easy-to-use environment, advanced data analysis and statistical methods for exploring omics datasets, including meta-analysis, differential expression and pathway analysis. CONCLUSIONS: This is the first omics database focused on autoimmune diseases. This resource incorporates homogeneously processed data to facilitate integrative analyses among studies.
Subjects
Autoimmune Diseases, Computational Biology, Autoimmune Diseases/epidemiology, Autoimmune Diseases/genetics, Factual Databases, Humans
ABSTRACT
Lupus nephritis (LN) is one of the most severe complications of systemic lupus erythematosus, leading to end-stage kidney disease in the worst cases. Current first-line therapies for LN, including mycophenolate mofetil (MMF) and azathioprine (AZA), fail to induce long-term remission in 60-70% of patients, which underscores the urgent need to close the molecular knowledge gap behind non-response to these therapies. A longitudinal cohort of treated LN patients, including clinical, cellular and transcriptomic data, was analyzed. Gene-expression signatures behind non-response to different drugs were revealed by differential expression analysis. Drug-specific non-response mechanisms and cell proportion differences were identified. Blood cell subsets mediating non-response were described using single-cell RNA-seq data. We show that AZA and MMF non-response involve different cells and regulatory functions. Mechanistic models were used to suggest add-on therapies to improve the performance of these drugs. Our results provide new insights into the molecular mechanisms associated with treatment failures in LN.
ABSTRACT
The relationship between SARS-CoV-2 transmission and environmental factors has been analyzed in numerous studies since the outbreak of the pandemic, yielding heterogeneous results and conclusions. This may be due to differences in methodology, the variables considered, confounding factors, the periods studied and/or a lack of adequate data. Furthermore, previous works have reported that the lack of population immunity is the fundamental driver of transmission dynamics and can mask the potential impact of environmental variables. In this study, we aimed to investigate the association between climate variables and COVID-19 transmission considering the influence of population immunity. We analyzed two different periods, characterized by the absence of vaccination (low population immunity) and a high degree of vaccination (high population immunity), respectively. Although this study has some limitations, such as the restriction to a specific climatic zone and the omission of other environmental factors, our results indicate that transmission of SARS-CoV-2 may increase independently of temperature and specific humidity in periods with low levels of population immunity, while a negative association is found under conditions with higher levels of population immunity in the analyzed regions.
Subjects
COVID-19, SARS-CoV-2, Humans, COVID-19/epidemiology, Humidity, Temperature, Pandemics
ABSTRACT
Statistical methods for enrichment analysis are important tools to extract biological information from omics experiments. Although these methods have been widely used for the analysis of gene and protein lists, the development of high-throughput technologies for regulatory elements demands dedicated statistical and bioinformatics tools. Here, we present a set of enrichment analysis methods for regulatory elements, including CpG sites, miRNAs, and transcription factors. Statistical significance is determined via a power weighting function for target genes and tested by the Wallenius noncentral hypergeometric distribution model to avoid selection bias. These new methodologies have been applied to the analysis of a set of miRNAs associated with arrhythmia, showing the potential of this tool to extract biological information from a list of regulatory elements. These new methods are available in GeneCodis 4, a web tool able to perform singular and modular enrichment analysis that allows the integration of heterogeneous information.
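A minimal sketch of how a bias-aware enrichment test of this kind could be set up is shown below. It is not the GeneCodis 4 implementation: the power weighting `power_weight`, its exponent `alpha`, the use of the ratio of mean weights as the odds parameter, and all gene names are assumptions made for illustration. The only library call is SciPy's `scipy.stats.nchypergeom_wallenius`.

```python
from scipy.stats import nchypergeom_wallenius


def power_weight(n_regulators, alpha=0.5):
    """Hypothetical power weighting: genes targeted by more regulatory
    elements receive a larger selection weight (alpha is an assumed exponent)."""
    return max(n_regulators, 1) ** alpha


def weighted_enrichment_pvalue(universe_weights, category, selected):
    """One-sided enrichment p-value under Wallenius' noncentral hypergeometric model.

    universe_weights -- dict mapping every background gene to a positive weight
    category         -- genes annotated to the term being tested
    selected         -- genes in the input list (e.g. targets of the miRNAs)
    """
    genes = set(universe_weights)
    in_cat = genes & set(category)
    out_cat = genes - in_cat
    drawn = genes & set(selected)
    if not in_cat or not out_cat:
        raise ValueError("category must cover part, but not all, of the universe")

    # Assumed odds: mean weight of annotated genes over mean weight of the rest.
    mean_in = sum(universe_weights[g] for g in in_cat) / len(in_cat)
    mean_out = sum(universe_weights[g] for g in out_cat) / len(out_cat)
    odds = mean_in / mean_out

    overlap = len(drawn & in_cat)
    # P(X >= overlap) equals the survival function evaluated at overlap - 1.
    return nchypergeom_wallenius.sf(
        overlap - 1, len(genes), len(in_cat), len(drawn), odds
    )


if __name__ == "__main__":
    # Hypothetical universe of 20 genes; the first 5 belong to the tested term
    # and are each targeted by 4 regulatory elements, the rest by 1.
    names = [f"g{i}" for i in range(20)]
    weights = {g: power_weight(4 if i < 5 else 1) for i, g in enumerate(names)}
    print(weighted_enrichment_pvalue(weights, names[:5], ["g0", "g1", "g7", "g9"]))
```

When every gene has the same weight the odds equal 1 and the test reduces to the ordinary hypergeometric test; weights that favor the annotated genes make the same overlap less surprising, which is how this kind of model corrects for selection bias.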
ABSTRACT
The coronavirus disease 2019 (COVID-19) pandemic has caused an unprecedented global health crisis, with several countries imposing lockdowns to control the spread of the coronavirus. Important research efforts are focused on evaluating the association of environmental factors with the survival and spread of the virus, and different works have been published, with contradictory results in some cases. Data with spatial and temporal information are key to obtaining reliable results and, although there are some data repositories for monitoring the disease both globally and locally, to the best of our knowledge no application that integrates and aggregates meteorological and air quality variables with COVID-19 information has been described so far. Here, we present DatAC (Data Against COVID-19), a data fusion project with an interactive web frontend that integrates COVID-19 and environmental data in Spain. DatAC provides powerful data analysis and statistical capabilities that allow users to explore and analyze individual trends and associations among the provided data. Using the application, we evaluated the impact of the Spanish lockdown on air quality, observing that NO2, CO, PM2.5, PM10 and SO2 levels decreased drastically in the entire territory, while O3 levels increased. We observed similar trends in urban and rural areas, although the impact was greater in the former. Moreover, the application allowed us to analyze correlations between climate factors, such as ambient temperature, and the incidence of COVID-19 in Spain. Our results indicate that temperature is not the driving factor and that, without effective control actions, outbreaks will appear and warm weather will not substantially limit the growth of the pandemic. DatAC is available at https://covid19.genyo.es.