Search | VHL Regional Portal

Evaluation of zero counts to better understand the discrepancies between bulk and single-cell RNA-Seq platforms.

Zyla, Joanna; Papiez, Anna; Zhao, Jun; Qu, Rihao; Li, Xiaotong; Kluger, Yuval; Polanska, Joanna; Hatzis, Christos; Pusztai, Lajos; Marczyk, Michal.

Comput Struct Biotechnol J ; 21: 4663-4674, 2023.

Article in English | MEDLINE | ID: mdl-37841335

ABSTRACT

Recent advances in sample preparation and sequencing technology have made it possible to profile the transcriptomes of individual cells using single-cell RNA sequencing (scRNA-Seq). Compared to bulk RNA-Seq data, single-cell data often contain a higher percentage of zero reads, mainly due to lower sequencing depth per cell, which affects mostly measurements of low-expression genes. However, discrepancies between platforms are observed regardless of expression level. Using four paired datasets with multiple samples each, we investigated technical and biological factors that can contribute to this expression shift. Using two separate machine learning models we found that, in addition to expression level, RNA integrity, gene or UTR3 length, and the number of transcripts potentially also influence the occurrence of zeros. These findings could enable the development of novel analytical methods for cross-platform expression shift correction. We also identified genes and biological pathways in our diverse datasets that consistently showed differences when assessed at the single cell versus bulk level to assist in interpreting analysis across transcriptomic platforms. At the gene level, 25 genes (0.12%) were found in all datasets as discordant, but at the pathway level, 7 pathways (2.02%) showed shared enrichment in discordant genes.

Epigenetic age prediction in semen - marker selection and model development.

Pisarek, Aleksandra; Pospiech, Ewelina; Heidegger, Antonia; Xavier, Catarina; Papiez, Anna; Piniewska-Róg, Danuta; Kalamara, Vivian; Potabattula, Ramya; Bochenek, Michal; Sikora-Polaczek, Marta; Macur, Aneta; Wozniak, Anna; Janeczko, Jaroslaw; Phillips, Christopher; Haaf, Thomas; Polanska, Joanna; Parson, Walther; Kayser, Manfred; Branicki, Wojciech.

Aging (Albany NY) ; 13(15): 19145-19164, 2021 08 10.

Article in English | MEDLINE | ID: mdl-34375949

ABSTRACT

DNA methylation analysis is becoming increasingly useful in biomedical research and forensic practice. The discovery of differentially methylated sites (DMSs) that continuously change over an individual's lifetime has led to breakthroughs in molecular age estimation. Although semen samples are often used in forensic DNA analysis, previous epigenetic age prediction studies mainly focused on somatic cell types. Here, Infinium MethylationEPIC BeadChip arrays were applied to semen-derived DNA samples, which identified numerous novel DMSs moderately correlated with age. Validation of the ten most age-correlated novel DMSs and three previously known sites in an independent set of semen-derived DNA samples using targeted bisulfite massively parallel sequencing confirmed age-correlation for nine new and three previously known markers. Prediction modelling revealed the best model for semen, based on 6 CpGs from newly identified genes SH2B2, EXOC3, IFITM2, and GALR2 as well as the previously known FOLH1B gene, which predict age with a mean absolute error of 5.1 years in an independent test set. Further increases in the accuracy of age prediction from semen DNA will require technological progress to allow sensitive, simultaneous analysis of a much larger number of age-correlated DMSs from the compromised DNA typical of forensic semen stains.

Subjects

CpG Islands/genetics; DNA Methylation; Epigenesis, Genetic; Models, Genetic; Semen; Adult; Age Factors; Forensic Genetics/methods; High-Throughput Nucleotide Sequencing; Humans; Linear Models; Male; Middle Aged; Predictive Value of Tests; Young Adult

Symptom-based early-stage differentiation between SARS-CoV-2 versus other respiratory tract infections-Upper Silesia pilot study.

Mika, Justyna; Tobiasz, Joanna; Zyla, Joanna; Papiez, Anna; Bach, Malgorzata; Werner, Aleksandra; Kozielski, Michal; Kania, Mateusz; Gruca, Aleksandra; Piotrowski, Damian; Sobala-Szczygiel, Barbara; Wlostowska, Bozena; Foszner, Pawel; Sikora, Marek; Polanska, Joanna; Jaroszewicz, Jerzy.

Sci Rep ; 11(1): 13580, 2021 06 30.

Article in English | MEDLINE | ID: mdl-34193945

ABSTRACT

In the DECODE project, data were collected from 3,114 surveys filled in by symptomatic patients RT-qPCR tested for SARS-CoV-2 in a single university centre in March-September 2020. The population demonstrated balanced sex and age with 759 SARS-CoV-2(+) patients. The most discriminative symptoms in SARS-CoV-2(+) patients at early infection stage were loss of taste/smell (OR = 3.33, p < 0.0001), body temperature above 38 °C (OR = 1.67, p < 0.0001), muscle aches (OR = 1.30, p = 0.0242), headache (OR = 1.27, p = 0.0405), and cough (OR = 1.26, p = 0.0477). Dyspnea was more often reported among SARS-CoV-2(-) (OR = 0.55, p < 0.0001). Cough and dyspnea were 3.5 times more frequent among SARS-CoV-2(-) (OR = 0.28, p < 0.0001). Co-occurrence of cough, muscle aches, headache, loss of taste/smell (OR = 4.72, p = 0.0015) appeared significant, although co-occurrence of two symptoms only, cough and loss of smell or taste, means OR = 2.49 (p < 0.0001). Temperature > 38 °C with cough was most frequent in men (20%), while loss of taste/smell with cough in women (17%). For younger people, taste/smell impairment is sufficient to characterise infection, whereas in older patients co-occurrence of fever and cough is necessary. The presented study objectifies the single symptoms and interactions significance in COVID-19 diagnoses and demonstrates diverse symptomatology in patient groups.

Subjects

COVID-19/diagnosis; COVID-19/epidemiology; Respiratory Tract Infections/diagnosis; Respiratory Tract Infections/epidemiology; SARS-CoV-2; Symptom Assessment/statistics & numerical data; Academic Medical Centers/statistics & numerical data; Adolescent; Adult; Aged; Aged, 80 and over; Ageusia/etiology; COVID-19/complications; Child; Child, Preschool; Cough/etiology; Diagnosis, Differential; Dyspnea/etiology; Female; Fever/etiology; Headache/etiology; Humans; Infant; Male; Middle Aged; Odds Ratio; Olfaction Disorders/etiology; Pilot Projects; Poland/epidemiology; Respiratory Tract Infections/complications; Respiratory Tract Infections/microbiology; Surveys and Questionnaires; Symptom Assessment/classification; Young Adult
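As an illustration of how an odds ratio (OR) like those reported in the SARS-CoV-2 abstract above can be read, the following minimal sketch computes an OR with a Wald 95% confidence interval from a 2x2 table. The abstract does not state how its ORs were estimated, so this is only one common approach; the function name `odds_ratio` and the counts in the usage note are hypothetical and are not taken from the study.

```python
import math
from typing import Tuple


def odds_ratio(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: test-positive patients with the symptom
    b: test-positive patients without the symptom
    c: test-negative patients with the symptom
    d: test-negative patients without the symptom
    z: normal quantile for the interval (1.96 gives roughly 95%)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper
```

With hypothetical counts `odds_ratio(30, 70, 10, 90)`, the point estimate is 2700/700, about 3.86. An OR above 1 means the symptom is more common among test-positive patients, and an OR below 1 (as for dyspnea in the abstract) means it is more common among test-negative patients.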

Molecular Profiling for Predictors of Radiosensitivity in Patients with Breast or Head-and-Neck Cancer.

Drobin, Kimi; Marczyk, Michal; Halle, Martin; Danielsson, Daniel; Papiez, Anna; Sangsuwan, Traimate; Bendes, Annika; Hong, Mun-Gwan; Qundos, Ulrika; Harms-Ringdahl, Mats; Wersäll, Peter; Polanska, Joanna; Schwenk, Jochen M; Haghdoost, Siamak.

Cancers (Basel) ; 12(3)2020 Mar 22.

Article in English | MEDLINE | ID: mdl-32235817

ABSTRACT

Nearly half of all cancers are treated with radiotherapy alone or in combination with other treatments, where damage to normal tissues is a limiting factor for the treatment. Radiotherapy-induced adverse health effects, mostly of importance for cancer patients with long-term survival, may appear during or a long time after finishing radiotherapy and depend on the patient's radiosensitivity. Currently, there is no assay available that can reliably predict the individual's response to radiotherapy. We profiled two study sets from breast (n = 29) and head-and-neck cancer patients (n = 74) that included radiosensitive patients and matched radioresistant controls. We studied 55 single nucleotide polymorphisms (SNPs) in 33 genes by DNA genotyping and 130 circulating proteins by affinity-based plasma proteomics. In both study sets, we discovered several plasma proteins with the predictive power to find radiosensitive patients (adjusted p < 0.05) and validated the two most predictive proteins (THPO and STIM1) by sandwich immunoassays. By integrating genotypic and proteomic data into an analysis model, it was found that the proteins CHIT1, PDGFB, PNKD, RP2, SERPINC1, SLC4A, STIM1, and THPO, as well as the VEGFA gene variant rs69947, predicted radiosensitivity of our breast cancer (AUC = 0.76) and head-and-neck cancer (AUC = 0.89) patients. In conclusion, circulating proteins and a SNP variant of VEGFA suggest that processes such as vascular growth capacity, immune response, DNA repair and oxidative stress/hypoxia may be involved in an individual's risk of experiencing radiation-induced toxicity.

BatchI: Batch effect Identification in high-throughput screening data using a dynamic programming algorithm.

Papiez, Anna; Marczyk, Michal; Polanska, Joanna; Polanski, Andrzej.

Bioinformatics ; 35(11): 1885-1892, 2019 06 01.

Article in English | MEDLINE | ID: mdl-30357412

ABSTRACT

MOTIVATION: In contemporary biological experiments, bias, which interferes with the measurements, requires attentive processing. Important sources of bias in high-throughput biological experiments are batch effects, and diverse methods for the removal of batch effects have been established. These include various normalization techniques, yet many require knowledge of the number of batches and the assignment of samples to batches. Only a few can deal with the problem of identifying a batch effect of unknown structure. For this reason, an original batch identification algorithm based on dynamic programming is introduced for omics data that may be sorted on a timescale. RESULTS: The BatchI algorithm is based on partitioning a series of high-throughput experiment samples into sub-series corresponding to estimated batches. The dynamic programming method is used for splitting data with maximal dispersion between batches, while maintaining minimal within-batch dispersion. The procedure has been tested on a number of available datasets with and without prior information about batch partitioning. Datasets with a priori identified batches have been split accordingly, measured with the weighted average Dice Index. Batch effect correction is justified by higher intra-group correlation. In the blank datasets, identified batch divisions lead to improvement of parameters and quality of biological information, shown by literature study and Information Content. The outcome of the algorithm serves as a starting point for correction methods. It has been demonstrated that omitting the essential step of batch effect control may lead to waste of valuable potential discoveries. AVAILABILITY AND IMPLEMENTATION: The implementation is available within the BatchI R package at http://zaed.aei.polsl.pl/index.php/pl/111-software. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Algorithms; Research Design
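The BatchI abstract above describes splitting a time-ordered series of samples into contiguous sub-series with minimal within-batch dispersion. The sketch below illustrates that general idea with a textbook dynamic programming segmentation of a one-dimensional series. It is not the BatchI implementation: it assumes each sample has already been reduced to a single number, that the number of batches `k` is given, and that dispersion is measured as a within-segment sum of squares. The names `segment_cost` and `split_into_batches` are hypothetical.

```python
from typing import List, Sequence, Tuple


def segment_cost(prefix: List[float], prefix_sq: List[float], i: int, j: int) -> float:
    """Sum of squared deviations from the mean for values[i:j] (j exclusive)."""
    n = j - i
    total = prefix[j] - prefix[i]
    total_sq = prefix_sq[j] - prefix_sq[i]
    return total_sq - total * total / n


def split_into_batches(values: Sequence[float], k: int) -> Tuple[float, List[int]]:
    """Split an ordered series into k contiguous segments.

    Returns the minimal total within-segment sum of squares and the start
    index of each segment. Minimising the within-segment dispersion is
    equivalent to maximising the between-segment dispersion, because the
    total sum of squares is fixed.
    """
    n = len(values)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")

    # Prefix sums make every segment cost an O(1) lookup.
    prefix, prefix_sq = [0.0], [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    inf = float("inf")
    # best[m][j]: minimal cost of splitting values[:j] into m segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                cost = best[m - 1][i] + segment_cost(prefix, prefix_sq, i, j)
                if cost < best[m][j]:
                    best[m][j] = cost
                    cut[m][j] = i

    # Walk the stored cut points backwards to recover segment starts.
    starts: List[int] = []
    j = n
    for m in range(k, 0, -1):
        i = cut[m][j]
        starts.append(i)
        j = i
    starts.reverse()
    return best[k][n], starts
```

For a toy series such as `[1.0, 1.1, 0.9, 5.0, 5.2, 4.9, 9.0, 9.1]` with `k = 3`, the recovered segments start at indices 0, 3 and 6. The triple loop runs in O(k n²) time, which is acceptable for a sketch but may need pruning for long series.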

Integrative multiomics study for validation of mechanisms in radiation-induced ischemic heart disease in Mayak workers.

Papiez, Anna; Azimzadeh, Omid; Azizova, Tamara; Moseeva, Maria; Anastasov, Natasa; Smida, Jan; Tapio, Soile; Polanska, Joanna.

PLoS One ; 13(12): e0209626, 2018.

Article in English | MEDLINE | ID: mdl-30596717

ABSTRACT

Previous studies have suggested that exposure to ionizing radiation increases the risk of ischemic heart disease (IHD). The data from the Mayak nuclear worker cohort have indicated enhanced risk for IHD incidence. The goal of this study was to elucidate molecular mechanisms of radiation-induced IHD by integrating proteomics data with a transcriptomics study on post mortem cardiac left ventricle samples from Mayak workers categorized in four radiation dose groups (0 Gy, < 100 mGy, 100-500 mGy, > 500 mGy). The proteomics data that were newly analysed here originated from a label-free analysis of cardiac samples. The transcriptomics analysis was performed on a subset of these samples. Stepwise linear regression analyses were used to correct the age-dependent changes in protein expression, enabling the separation of proteins whose expression was dependent only on the radiation dose, only on age, or on both of these factors. Importantly, the majority of the proteins showed only dose-dependent expression changes. Hierarchical clustering of the proteome and transcriptome profiles confirmed the separation of control and high-dose samples. Restrictive (separate p-values) and integrative (combined p-value) approaches were used to investigate the enrichment of biological pathways. The integrative method proved superior in the validation of the key biological pathways found in the proteomics analysis, namely PPAR signalling, TCA cycle and glycolysis/gluconeogenesis. This study presents a novel, improved, and comprehensive statistical approach to analysing biological effects on a limited number of samples.

Subjects

Gene Expression Profiling; Myocardial Ischemia/etiology; Myocardial Ischemia/metabolism; Proteomics; Radiation Injuries/etiology; Radiation Injuries/metabolism; Computational Biology/methods; Gene Expression Profiling/methods; Gene Ontology; Humans; Male; Myocardial Ischemia/epidemiology; Proteomics/methods; Radiation Dosage; Radiation Injuries/epidemiology; Radiation, Ionizing; Signal Transduction
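The Mayak abstract above contrasts separate p-values with a combined p-value per pathway but does not say which combination rule was used. As a hedged illustration only, the sketch below implements Fisher's method, one widely used way to combine independent p-values (for example, one from proteomics and one from transcriptomics for the same pathway). The function name `fisher_combined_p` is hypothetical, and the independence of the inputs is an assumption.

```python
import math
from typing import Sequence


def fisher_combined_p(p_values: Sequence[float]) -> float:
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis. For even degrees
    of freedom the upper tail has a closed form, so no external library
    is needed:
        P(chi2_{2k} > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    k = len(p_values)
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total
```

With a single input the function returns that p-value unchanged, which is a quick sanity check. Two moderately small p-values combine into a smaller one, which is how an integrative analysis can support a pathway that neither data type would flag on its own.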

Comprehensive Analysis of MILE Gene Expression Data Set Advances Discovery of Leukaemia Type and Subtype Biomarkers.

Labaj, Wojciech; Papiez, Anna; Polanski, Andrzej; Polanska, Joanna.

Interdiscip Sci ; 9(1): 24-35, 2017 Mar.

Article in English | MEDLINE | ID: mdl-28303531

ABSTRACT

Large collections of data in cancer studies, such as those on leukaemia, call for tailored analysis algorithms to maximise information extraction. In this work, a custom-fit pipeline is demonstrated for thorough investigation of the voluminous MILE gene expression data set. Three analyses are carried out, each aimed at gaining a deeper understanding of the processes underlying leukaemia types and subtypes. First, the main disease groups are tested for differential expression against the healthy control as in a standard case-control study. Here, the basic knowledge on molecular mechanisms is confirmed quantitatively and by literature references. Second, pairwise comparison testing is performed to compare the main leukaemia types with each other. In this case, the general relations are pointed out by means of the Dice coefficient similarity measure. Moreover, lists of candidate main leukaemia group biomarkers are proposed. Finally, with this approach being successful, the third analysis provides insight into all of the studied subtypes, followed by the emergence of four leukaemia subtype biomarkers. In addition, the class enhanced DEG signature obtained on the basis of novel pipeline processing leads to significantly better classification power of multi-class data classifiers. The developed methodology consisting of batch effect adjustment, adaptive noise and feature filtration coupled with adequate statistical testing and biomarker definition proves to be an effective approach towards knowledge discovery in high-throughput molecular biology experiments.

Subjects

Biomarkers, Tumor/genetics; Leukemia/genetics; Humans
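The MILE abstract above uses the Dice coefficient to compare leukaemia types. A minimal sketch of the standard set-based definition, 2|A ∩ B| / (|A| + |B|), is given here, assuming the compared objects are lists of differentially expressed genes. The function name `dice_coefficient` and the placeholder gene labels in the usage note are hypothetical, and returning 1.0 for two empty sets is a convention chosen for this sketch.

```python
from typing import Hashable, Iterable


def dice_coefficient(a: Iterable[Hashable], b: Iterable[Hashable]) -> float:
    """Dice similarity between two collections, treated as sets.

    Ranges from 0.0 (no shared elements) to 1.0 (identical sets).
    """
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        # Convention for this sketch: two empty sets count as identical.
        return 1.0
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))
```

For example, `dice_coefficient(["g1", "g2", "g3"], ["g2", "g3", "g4"])` shares two of three genes on each side and evaluates to 2·2/6, about 0.67.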
