1.
PLoS One ; 19(6): e0305874, 2024.
Article in English | MEDLINE | ID: mdl-38917129

ABSTRACT

Combining data from multispecies experiments contributes substantially to the understanding of basic disease mechanisms and the pathophysiology of pathogens that cross species boundaries. Multispecies gene expression analysis, however, is often challenging because of annotation inconsistencies and, when sample sizes are small, bias caused by batch effects. In this work we aim to demonstrate that an alternative to standard differential expression analysis in single-cell RNA sequencing (scRNA-seq), based on effect size profiles, is suitable for fusing data from small samples and multiple organisms. The analysis pipeline is based on effect size metric profiles of samples in specific cell clusters. The effect size replaces standard differential expression analyses based on p-values, and the profiles built from these effect size metrics serve as a tool to link cell type clusters between the studied organisms. The algorithms were tested on published scRNA-seq data sets derived from several species and subsequently validated on our own data from human and bovine peripheral blood mononuclear cells stimulated with Mycobacterium tuberculosis. Correlation of the effect size profiles between clusters allowed human and bovine cell types to be linked. Moreover, effect size ratios were used to identify differentially regulated genes in control and stimulated samples. The genes identified through effect size profiling were confirmed experimentally using qPCR. We demonstrate that in small-sample, multispecies single-cell studies where batch effects dominate cell type variation, effect size profiling is a valid alternative to traditional statistical inference techniques.
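
The abstract does not name the effect size metric or the correlation measure, so the following is only a minimal sketch under stated assumptions: Cohen's d (pooled standard deviation) as the per-gene effect size, and Pearson correlation between the resulting per-cluster profiles. The function names and the idea of passing expression values as plain lists are hypothetical, not the authors' pipeline.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treated, control):
    """Cohen's d with a pooled standard deviation (assumed metric, not confirmed by the abstract)."""
    n1, n2 = len(treated), len(control)
    pooled_var = ((n1 - 1) * variance(treated) + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    if pooled_var == 0:
        return 0.0  # no spread in either group: report no effect rather than divide by zero
    return (mean(treated) - mean(control)) / sqrt(pooled_var)


def effect_size_profile(treated_by_gene, control_by_gene, genes):
    """One effect size per gene, in a fixed gene order, for a single cell cluster."""
    return [cohens_d(treated_by_gene[g], control_by_gene[g]) for g in genes]


def pearson(x, y):
    """Pearson correlation between two equally long profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (sx * sy)
```

Under these assumptions, a human cluster and a bovine cluster would be candidates for linking when `pearson` applied to their two profiles, computed over shared orthologous genes, is high.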


Subject(s)
Mycobacterium tuberculosis , Single-Cell Analysis , Single-Cell Analysis/methods , Animals , Humans , Cattle , Mycobacterium tuberculosis/genetics , Gene Expression Profiling/methods , Algorithms , Mononuclear Leukocytes/metabolism , RNA Sequence Analysis/methods
2.
Comput Struct Biotechnol J ; 21: 4663-4674, 2023.
Article in English | MEDLINE | ID: mdl-37841335

ABSTRACT

Recent advances in sample preparation and sequencing technology have made it possible to profile the transcriptomes of individual cells using single-cell RNA sequencing (scRNA-Seq). Compared to bulk RNA-Seq data, single-cell data often contain a higher percentage of zero reads, mainly due to lower sequencing depth per cell, which affects mostly measurements of low-expression genes. However, discrepancies between platforms are observed regardless of expression level. Using four paired datasets with multiple samples each, we investigated technical and biological factors that can contribute to this expression shift. Using two separate machine learning models we found that, in addition to expression level, RNA integrity, gene or UTR3 length, and the number of transcripts potentially also influence the occurrence of zeros. These findings could enable the development of novel analytical methods for cross-platform expression shift correction. We also identified genes and biological pathways in our diverse datasets that consistently showed differences when assessed at the single cell versus bulk level to assist in interpreting analysis across transcriptomic platforms. At the gene level, 25 genes (0.12%) were found in all datasets as discordant, but at the pathway level, 7 pathways (2.02%) showed shared enrichment in discordant genes.

3.
Aging (Albany NY) ; 13(15): 19145-19164, 2021 08 10.
Article in English | MEDLINE | ID: mdl-34375949

ABSTRACT

DNA methylation analysis is becoming increasingly useful in biomedical research and forensic practice. The discovery of differentially methylated sites (DMSs) that continuously change over an individual's lifetime has led to breakthroughs in molecular age estimation. Although semen samples are often used in forensic DNA analysis, previous epigenetic age prediction studies mainly focused on somatic cell types. Here, Infinium MethylationEPIC BeadChip arrays were applied to semen-derived DNA samples, which identified numerous novel DMSs moderately correlated with age. Validation of the ten most age-correlated novel DMSs and three previously known sites in an independent set of semen-derived DNA samples, using targeted bisulfite massively parallel sequencing, confirmed age correlation for nine new and three previously known markers. Prediction modelling revealed the best model for semen, based on 6 CpGs from the newly identified genes SH2B2, EXOC3, IFITM2, and GALR2 as well as the previously known FOLH1B gene, which predicts age with a mean absolute error of 5.1 years in an independent test set. Further increases in the accuracy of age prediction from semen DNA will require technological progress to allow sensitive, simultaneous analysis of a much larger number of age-correlated DMSs from the compromised DNA typical of forensic semen stains.


Subject(s)
CpG Islands/genetics , DNA Methylation , Genetic Epigenesis , Genetic Models , Semen , Adult , Age Factors , Forensic Genetics/methods , High-Throughput Nucleotide Sequencing , Humans , Linear Models , Male , Middle Aged , Predictive Value of Tests , Young Adult
4.
Sci Rep ; 11(1): 13580, 2021 06 30.
Article in English | MEDLINE | ID: mdl-34193945

ABSTRACT

In the DECODE project, data were collected from 3,114 surveys completed by symptomatic patients who were RT-qPCR tested for SARS-CoV-2 in a single university centre in March-September 2020. The population was balanced in sex and age and included 759 SARS-CoV-2(+) patients. The most discriminative symptoms in SARS-CoV-2(+) patients at the early infection stage were loss of taste/smell (OR = 3.33, p < 0.0001), body temperature above 38℃ (OR = 1.67, p < 0.0001), muscle aches (OR = 1.30, p = 0.0242), headache (OR = 1.27, p = 0.0405), and cough (OR = 1.26, p = 0.0477). Dyspnea was more often reported among SARS-CoV-2(-) patients (OR = 0.55, p < 0.0001). Cough and dyspnea together were 3.5 times more frequent among SARS-CoV-2(-) patients (OR = 0.28, p < 0.0001). Co-occurrence of cough, muscle aches, headache, and loss of taste/smell (OR = 4.72, p = 0.0015) was significant, although co-occurrence of only two symptoms, cough and loss of smell or taste, gave OR = 2.49 (p < 0.0001). Temperature > 38℃ with cough was most frequent in men (20%), while loss of taste/smell with cough was most frequent in women (17%). For younger people, taste/smell impairment is sufficient to characterise infection, whereas in older patients co-occurrence of fever and cough is necessary. The study objectively quantifies the significance of single symptoms and their interactions in COVID-19 diagnosis and demonstrates diverse symptomatology across patient groups.
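
For readers less familiar with the odds ratios quoted above, here is a minimal sketch of how an OR and an approximate 95% confidence interval are obtained from a 2x2 table of symptom status versus test result. The Woolf (log-normal) interval is an assumption, since the abstract does not say how the statistics were computed, and the function names are hypothetical.

```python
from math import exp, log, sqrt


def odds_ratio(symptom_pos, symptom_neg, nosymptom_pos, nosymptom_neg):
    """Odds ratio of reporting the symptom in test-positive versus test-negative patients.

    symptom_pos:   symptom present, test positive
    symptom_neg:   symptom present, test negative
    nosymptom_pos: symptom absent, test positive
    nosymptom_neg: symptom absent, test negative
    """
    return (symptom_pos * nosymptom_neg) / (symptom_neg * nosymptom_pos)


def woolf_ci(symptom_pos, symptom_neg, nosymptom_pos, nosymptom_neg, z=1.96):
    """Approximate 95% CI on the log-odds scale (Woolf method; an assumed choice)."""
    log_or = log(odds_ratio(symptom_pos, symptom_neg, nosymptom_pos, nosymptom_neg))
    se = sqrt(1 / symptom_pos + 1 / symptom_neg + 1 / nosymptom_pos + 1 / nosymptom_neg)
    return exp(log_or - z * se), exp(log_or + z * se)
```

An OR above 1 means the symptom is more common among positive patients; an OR below 1, as reported for dyspnea, means it is more common among negative patients. Both functions require all four cell counts to be non-zero.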


Subject(s)
COVID-19/diagnosis , COVID-19/epidemiology , Respiratory Tract Infections/diagnosis , Respiratory Tract Infections/epidemiology , SARS-CoV-2 , Symptom Assessment/statistics & numerical data , Academic Medical Centers/statistics & numerical data , Adolescent , Adult , Aged , Aged 80 and over , Ageusia/etiology , COVID-19/complications , Child , Preschool Child , Cough/etiology , Differential Diagnosis , Dyspnea/etiology , Female , Fever/etiology , Headache/etiology , Humans , Infant , Male , Middle Aged , Odds Ratio , Olfaction Disorders/etiology , Pilot Projects , Poland/epidemiology , Respiratory Tract Infections/complications , Respiratory Tract Infections/microbiology , Surveys and Questionnaires , Symptom Assessment/classification , Young Adult
5.
Cancers (Basel) ; 12(3), 2020 Mar 22.
Article in English | MEDLINE | ID: mdl-32235817

ABSTRACT

Nearly half of all cancers are treated with radiotherapy alone or in combination with other treatments, where damage to normal tissues is a limiting factor for the treatment. Radiotherapy-induced adverse health effects, mostly of importance for cancer patients with long-term survival, may appear during or a long time after finishing radiotherapy and depend on the patient's radiosensitivity. Currently, there is no assay available that can reliably predict the individual's response to radiotherapy. We profiled two study sets from breast (n = 29) and head-and-neck cancer patients (n = 74) that included radiosensitive patients and matched radioresistant controls. We studied 55 single nucleotide polymorphisms (SNPs) in 33 genes by DNA genotyping and 130 circulating proteins by affinity-based plasma proteomics. In both study sets, we discovered several plasma proteins with the predictive power to find radiosensitive patients (adjusted p < 0.05) and validated the two most predictive proteins (THPO and STIM1) by sandwich immunoassays. By integrating genotypic and proteomic data into an analysis model, it was found that the proteins CHIT1, PDGFB, PNKD, RP2, SERPINC1, SLC4A, STIM1, and THPO, as well as the VEGFA gene variant rs69947, predicted radiosensitivity of our breast cancer (AUC = 0.76) and head-and-neck cancer (AUC = 0.89) patients. In conclusion, circulating proteins and a SNP variant of VEGFA suggest that processes such as vascular growth capacity, immune response, DNA repair and oxidative stress/hypoxia may be involved in an individual's risk of experiencing radiation-induced toxicity.

6.
Bioinformatics ; 35(11): 1885-1892, 2019 06 01.
Article in English | MEDLINE | ID: mdl-30357412

ABSTRACT

MOTIVATION: In contemporary biological experiments, bias, which interferes with the measurements, requires attentive processing. Batch effects are an important source of bias in high-throughput biological experiments, and diverse methods for removing them have been established. These include various normalization techniques, yet many require knowledge of the number of batches and the assignment of samples to batches. Only a few can deal with the problem of identifying a batch effect of unknown structure. For this reason, an original batch identification algorithm based on dynamic programming is introduced for omics data that may be sorted on a timescale. RESULTS: The BatchI algorithm is based on partitioning a series of high-throughput experiment samples into sub-series corresponding to estimated batches. The dynamic programming method is used for splitting data with maximal dispersion between batches, while maintaining minimal within-batch dispersion. The procedure has been tested on a number of available datasets with and without prior information about batch partitioning. Datasets with a priori identified batches have been split accordingly, as measured with the weighted average Dice Index. Batch effect correction is justified by higher intra-group correlation. In the blank datasets, identified batch divisions lead to improvement of parameters and quality of biological information, shown by literature study and Information Content. The outcome of the algorithm serves as a starting point for correction methods. It has been demonstrated that omitting the essential step of batch effect control may lead to waste of valuable potential discoveries. AVAILABILITY AND IMPLEMENTATION: The implementation is available within the BatchI R package at http://zaed.aei.polsl.pl/index.php/pl/111-software. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
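
The partitioning idea described in RESULTS can be illustrated with a generic dynamic programme. This is a sketch only and not the BatchI implementation: it assumes a single numeric summary per sample, a known number of batches `k`, and within-segment sum of squared deviations as the dispersion measure (for a fixed total, minimising it maximises between-segment dispersion). All names are hypothetical.

```python
def segment_cost(prefix, prefix_sq, i, j):
    """Sum of squared deviations from the mean for values[i:j], using prefix sums."""
    n = j - i
    s = prefix[j] - prefix[i]
    sq = prefix_sq[j] - prefix_sq[i]
    return sq - s * s / n


def partition_series(values, k):
    """Split a time-ordered series into k contiguous segments with minimal within-segment dispersion.

    Returns a list of (start, end) index pairs, end exclusive.
    """
    n = len(values)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of samples")
    prefix, prefix_sq = [0.0], [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    inf = float("inf")
    # best[m][j]: minimal total cost of splitting values[:j] into m segments
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                cost = best[m - 1][i] + segment_cost(prefix, prefix_sq, i, j)
                if cost < best[m][j]:
                    best[m][j] = cost
                    back[m][j] = i

    # Walk the back-pointers from the end of the series to recover the segment boundaries.
    bounds, j = [], n
    for m in range(k, 0, -1):
        i = back[m][j]
        bounds.append((i, j))
        j = i
    return bounds[::-1]
```

The triple loop costs O(k·n²) time, which is workable for the sample counts typical of omics experiments. Choosing `k` when the number of batches is unknown is a separate problem that this sketch does not address.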


Subject(s)
Algorithms , Research Design
7.
PLoS One ; 13(12): e0209626, 2018.
Article in English | MEDLINE | ID: mdl-30596717

ABSTRACT

Previous studies have suggested that exposure to ionizing radiation increases the risk of ischemic heart disease (IHD). The data from the Mayak nuclear worker cohort have indicated enhanced risk for IHD incidence. The goal of this study was to elucidate molecular mechanisms of radiation-induced IHD by integrating proteomics data with a transcriptomics study on post mortem cardiac left ventricle samples from Mayak workers categorized in four radiation dose groups (0 Gy, < 100 mGy, 100-500 mGy, > 500 mGy). The proteomics data newly analysed here originated from a label-free analysis of cardiac samples. The transcriptomics analysis was performed on a subset of these samples. Stepwise linear regression analyses were used to correct the age-dependent changes in protein expression, enabling the separation of proteins whose expression depended only on the radiation dose, only on age, or on both of these factors. Importantly, the majority of the proteins showed only dose-dependent expression changes. Hierarchical clustering of the proteome and transcriptome profiles confirmed the separation of control and high-dose samples. Restrictive (separate p-values) and integrative (combined p-value) approaches were used to investigate the enrichment of biological pathways. The integrative method proved superior in the validation of the key biological pathways found in the proteomics analysis, namely PPAR signalling, TCA cycle and glycolysis/gluconeogenesis. This study presents a novel, improved, and comprehensive statistical approach to analysing biological effects on a limited number of samples.
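
The abstract does not say how the combined p-value of the integrative approach was formed. One common choice, assumed here purely for illustration, is Fisher's method, in which the statistic -2·Σ ln(p_i) follows a chi-square distribution with 2k degrees of freedom under the null hypothesis. The function name is hypothetical.

```python
from math import exp, log


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method (an assumed choice of method).

    For an even number of degrees of freedom (2k) the chi-square survival function
    has the closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!, so no external library is needed.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    half_stat = -sum(log(p) for p in p_values)  # equals x/2 for x = -2 * sum(ln p)
    term, total = 1.0, 1.0
    for i in range(1, len(p_values)):
        term *= half_stat / i  # (x/2)^i / i!, built incrementally
        total += term
    return exp(-half_stat) * total
```

For one pathway, the proteomics and transcriptomics enrichment p-values would be passed together as `p_values`. Fisher's method assumes the inputs are independent, which may not hold when both data types come from the same samples.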


Subject(s)
Gene Expression Profiling , Myocardial Ischemia/etiology , Myocardial Ischemia/metabolism , Proteomics , Radiation Injuries/etiology , Radiation Injuries/metabolism , Computational Biology/methods , Gene Expression Profiling/methods , Gene Ontology , Humans , Male , Myocardial Ischemia/epidemiology , Proteomics/methods , Radiation Dosage , Radiation Injuries/epidemiology , Ionizing Radiation , Signal Transduction
8.
Interdiscip Sci ; 9(1): 24-35, 2017 Mar.
Article in English | MEDLINE | ID: mdl-28303531

ABSTRACT

Large collections of data in studies on cancers such as leukaemia call for tailored analysis algorithms to extract as much information as possible. In this work, a custom-fit pipeline is demonstrated for thorough investigation of the voluminous MILE gene expression data set. Three analyses are performed, each to gain a deeper understanding of the processes underlying leukaemia types and subtypes. First, the main disease groups are tested for differential expression against the healthy control, as in a standard case-control study. Here, the basic knowledge of molecular mechanisms is confirmed quantitatively and by literature references. Second, pairwise comparison testing is performed to compare the main leukaemia types with each other. In this case, the general relations are identified by means of the Dice coefficient similarity measure. Moreover, lists of candidate biomarkers for the main leukaemia groups are proposed. Finally, with this approach being successful, the third analysis provides insight into all of the studied subtypes and yields four leukaemia subtype biomarkers. In addition, the class-enhanced DEG signature obtained with the novel pipeline leads to significantly better classification power of multi-class data classifiers. The developed methodology, consisting of batch effect adjustment, adaptive noise and feature filtration coupled with adequate statistical testing and biomarker definition, proves to be an effective approach towards knowledge discovery in high-throughput molecular biology experiments.
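
The Dice coefficient used to compare leukaemia types can be sketched for two sets of differentially expressed genes as 2·|A ∩ B| / (|A| + |B|). How the authors built the gene lists is not stated in the abstract, so the function below only illustrates the similarity measure itself. Its name and its treatment of two empty sets are assumptions.

```python
def dice_coefficient(genes_a, genes_b):
    """Dice similarity between two collections of gene identifiers.

    Returns a value in [0, 1]: 0 for disjoint sets, 1 for identical sets.
    """
    set_a, set_b = set(genes_a), set(genes_b)
    if not set_a and not set_b:
        return 1.0  # convention chosen for this sketch: two empty lists count as identical
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))
```

Applied to every pair of disease groups, the resulting values form a symmetric similarity matrix from which the general relations between groups can be read.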


Subject(s)
Tumor Biomarkers/genetics , Leukemia/genetics , Humans