ABSTRACT
Cell-type composition is a key indicator of health. Advances in bulk gene expression data curation, single-cell RNA-sequencing technologies, and computational deconvolution approaches offer a quick and affordable way to learn about the composition of different cell types. In this study, we developed a quantile regression and deep learning-based method called Neural Network Immune Contexture Estimator (NNICE) to estimate cell type abundance and its uncertainty by automatically deconvolving bulk RNA-seq data. The proposed NNICE model successfully recovered ground-truth cell type fractions from unseen bulk mixture gene expression profiles drawn from the same dataset it was trained on. Compared with baseline methods, NNICE achieved better performance in deconvolving both pseudo-bulk gene expression data (Pearson correlation R = 0.9) and real bulk gene expression data (Pearson correlation R = 0.9) across all cell types. In conclusion, NNICE combines statistical inference with deep learning to provide accurate and interpretable cell type deconvolution from bulk gene expression.
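The abstract does not describe how NNICE combines quantile regression with a neural network. As a minimal sketch of the general idea only, the pinball (quantile) loss below is the standard objective for fitting a chosen quantile; training one output per quantile level is a common way to obtain an uncertainty interval around a predicted cell type fraction. The function name `pinball_loss` and the example fractions are hypothetical and are not taken from the NNICE implementation.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for a quantile level q in (0, 1).

    Under-predictions are weighted by q and over-predictions by (1 - q),
    so minimising this loss pushes predictions towards the q-th quantile.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    total = 0.0
    for t, p in zip(y_true, y_pred):
        diff = t - p
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


# Hypothetical cell type fractions for one bulk sample (illustrative only).
true_fractions = [0.10, 0.30, 0.60]
predicted_median = [0.15, 0.30, 0.50]

loss_median = pinball_loss(true_fractions, predicted_median, 0.5)
loss_upper = pinball_loss(true_fractions, predicted_median, 0.9)
```

At q = 0.5 the loss is half the mean absolute error; at q = 0.9 under-predicting is penalised nine times more heavily than over-predicting, which is what makes an upper quantile estimate sit above the median.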
Subjects
Algorithms; Deep Learning; Gene Expression Profiling; Neural Networks, Computer; Humans; Gene Expression Profiling/methods; Single-Cell Analysis/methods; Computational Biology/methods; RNA-Seq/methods; Sequence Analysis, RNA/methods; Transcriptome
ABSTRACT
Deep learning models have the potential to improve the performance of automated computer-assisted diagnosis tools in digital histopathology and to reduce subjectivity. The main objective of this study was to further improve the diagnostic potential of convolutional neural networks (CNNs) in detecting lymph node metastasis in breast cancer patients by augmenting input images with multiple segmentation channels. For this retrospective study, we used the PatchCamelyon dataset, consisting of 327,680 histopathology images of lymph node sections from breast cancer patients. Images were labeled for the presence or absence of metastatic tissue. In addition, we used four separate histopathology datasets with annotations for nucleus, mitosis, tubule, and epithelium to train four instances of U-net. Our baseline model was then trained with and without the additional segmentation channels, and the resulting performances were compared. Integrated gradients were used to visualize model attribution. The model trained on the original input concatenated with four additional segmentation channels, which we refer to as ConcatNet, was superior (AUC 0.924) to the baseline with or without augmentations (AUC 0.854; 0.884). The baseline model trained with one additional segmentation channel showed intermediate performance (AUC 0.870-0.895). ConcatNet had a sensitivity of 82.0% and a specificity of 87.8%, an improvement over the baseline (sensitivity of 74.6%; specificity of 80.4%). Integrated gradients showed that models trained with additional segmentation channels focused more closely on areas of the image containing aberrant cells. Augmenting images with additional segmentation channels improved baseline model performance as well as its ability to focus on discrete areas of the image.
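The channel concatenation described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the helper `concat_segmentation_channels` is hypothetical, the 96 x 96 RGB patch size is assumed, and random arrays stand in for the four U-net outputs (nucleus, mitosis, tubule, epithelium).

```python
import numpy as np


def concat_segmentation_channels(image, masks):
    """Append each H x W mask to an H x W x C image as an extra channel."""
    # Give every mask a trailing channel axis and match the image dtype.
    expanded = [m[..., np.newaxis].astype(image.dtype) for m in masks]
    return np.concatenate([image] + expanded, axis=-1)


rng = np.random.default_rng(0)

# Stand-in for one RGB histopathology patch (assumed size 96 x 96).
patch = rng.random((96, 96, 3))

# Stand-ins for four segmentation maps; a real pipeline would take these
# from the trained segmentation networks rather than a random generator.
masks = [rng.random((96, 96)) for _ in range(4)]

augmented = concat_segmentation_channels(patch, masks)
```

The classifier's first convolutional layer then has to accept seven input channels instead of three; the original RGB values are passed through unchanged in the first three channels.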
ABSTRACT
Young women with breast cancer have disproportionately poor clinical outcomes compared to their older counterparts. The biological differences underlying this age-dependent disparity are still unknown and warrant investigation. Recently, the tumor immune landscape has received much attention for its prognostic value and as a source of therapeutic targets. Differences in the tumor immune landscape between age groups in breast cancer have not yet been characterized and may contribute to the age-related differences in clinical outcomes. Computational deconvolution was used to quantify the abundance of immune cell types from bulk transcriptome profiles of breast cancer patients from two independent datasets. No significant differences in immune cell composition between the young and old age groups were found consistently in both cohorts. Despite the absence of significant differences, higher tumor infiltration of several immune cell types, such as CD8+ T and CD4+ T cells, was associated with better clinical outcomes in the young but not in the old age group. Mutational signature analysis showed that signatures not previously found in breast cancer were associated with tumor-infiltrating lymphocyte (TIL) levels in the young age group, whereas in the old group, all significant signatures were those previously found in breast cancer. Pathway analysis revealed different gene sets associated with TIL levels for each age group in the two cohorts. Overall, our results show trends towards better clinical outcomes for high TIL levels, especially CD8+ T cells, but only in the young age group. Furthermore, our work suggests that the underlying biological differences may involve multiple levels of tumor physiology.
ABSTRACT
Genome-wide copy-number association studies offer new opportunities to identify the mechanisms underlying complex diseases, including chronic inflammatory and psychiatric disorders, among others. We used genotyping microarrays to analyse the copy-number variants (CNVs) of 243 Caucasian individuals with Inflammatory Bowel Disease (IBD). The CNV data was obtained by applying multiple quality control measures and merging the results of three different CNV detection algorithms: PennCNV, iPattern, and QuantiSNP. The final dataset contains 4,402 high-confidence CNVs, each detected independently by two or three algorithms. This paper provides a detailed description of the data generation and quality control steps. For further interpretation of the data presented in this article, please see the research article entitled 'Copy number variation-based gene set analysis reveals cytokine signalling pathways associated with psychiatric comorbidity in patients with inflammatory bowel disease'.
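The abstract does not state how calls from the three algorithms were matched. As a rough sketch of the "detected by two or three algorithms" filter, the code below counts how many callers support each CNV, assuming that two calls match when they share chromosome and type and have at least 50% reciprocal overlap. The `Cnv` type, the helper names, the 50% threshold, and the coordinates are all hypothetical; the sketch only filters calls and does not merge matched calls into one consensus region.

```python
from typing import NamedTuple


class Cnv(NamedTuple):
    chrom: str
    start: int
    end: int
    kind: str  # "gain" or "loss"


def reciprocal_overlap(a, b):
    """Smaller of the two overlap fractions, or 0.0 if the calls cannot match."""
    if a.chrom != b.chrom or a.kind != b.kind:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def supported_calls(calls_by_algorithm, min_overlap=0.5, min_support=2):
    """Keep calls confirmed by at least `min_support` algorithms in total."""
    kept = []
    for name, calls in calls_by_algorithm.items():
        for call in calls:
            support = 1  # the algorithm that made the call
            for other_name, other_calls in calls_by_algorithm.items():
                if other_name == name:
                    continue
                if any(reciprocal_overlap(call, o) >= min_overlap
                       for o in other_calls):
                    support += 1
            if support >= min_support:
                kept.append((name, call, support))
    return kept


# Made-up coordinates for illustration only.
calls = {
    "PennCNV": [Cnv("chr1", 1000, 2000, "loss"), Cnv("chr2", 5000, 6000, "gain")],
    "iPattern": [Cnv("chr1", 1100, 2100, "loss")],
    "QuantiSNP": [Cnv("chr1", 900, 1900, "loss"), Cnv("chr3", 100, 200, "gain")],
}
kept = supported_calls(calls)
```

In this toy input the chr1 loss is reported by all three callers and is kept, while the chr2 and chr3 gains are each seen by a single caller and are dropped.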