Search | VHL Regional Portal

ScribbleDom: using scribble-annotated histology images to identify domains in spatial transcriptomics data.

Rahman, Mohammad Nuwaisir; Noman, Abdullah Al; Turza, Abir Mohammad; Abrar, Mohammed Abid; Samee, Md Abul Hassan; Rahman, M Saifur.

Bioinformatics; 39(10), 2023 Oct 03.

Article in English | MEDLINE | ID: mdl-37756699

ABSTRACT

MOTIVATION: Spatial domain identification is an important problem in spatial transcriptomics. State-of-the-art solutions to this problem focus on unsupervised methods, as there is a lack of data for a supervised learning formulation. The results obtained from these methods leave significant room for improvement.
RESULTS: In this article, we propose a semi-supervised approach based on a convolutional neural network. Named "ScribbleDom", our method leverages a human expert's input as a form of semi-supervision, thereby combining the cognitive abilities of human experts with the computational power of machines. ScribbleDom incorporates a loss function that integrates two components: similarity in gene expression profiles and adherence to the input of a human annotator, given as scribbles on histology images that provide prior knowledge about spot labels. The spatial continuity of the tissue domains is taken into account by extracting information on the spot microenvironment through convolution filters of varying sizes, in the form of "Inception" blocks. With this semi-supervised approach, ScribbleDom improves the quality of spatial domains both quantitatively and qualitatively. Our experiments on several benchmark datasets demonstrate the edge of ScribbleDom over state-of-the-art methods: improvements of between 1.82% and 169.38% in adjusted Rand index for 9 of the 12 human dorsolateral prefrontal cortex samples, and a 15.54% improvement on the melanoma cancer dataset. Notably, when expert input is absent, ScribbleDom can still operate in a fully unsupervised manner, like the state-of-the-art methods, and produces results that remain competitive.
AVAILABILITY AND IMPLEMENTATION: Source code is available on GitHub (https://github.com/1alnoman/ScribbleDom) and Zenodo (https://zenodo.org/badge/latestdoi/681572669).
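
The abstract above reports its gains in adjusted Rand index (ARI), which scores the agreement between a predicted spot labeling and a reference annotation, corrected for chance. The sketch below is a minimal pure-Python version of the standard ARI formula, not code from the ScribbleDom repository; the function name `adjusted_rand_index` and the toy labelings are illustrative assumptions.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Standard chance-corrected Rand index between two labelings.

    Hypothetical helper for illustration; not taken from ScribbleDom.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Contingency counts: how many spots share each (true, predicted) pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = sum_true * sum_pred / comb(n, 2)  # agreement expected by chance
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both labelings are trivial
    return (sum_pairs - expected) / (max_index - expected)


# Toy usage: domain names do not matter, only the grouping of spots.
same_grouping = adjusted_rand_index([0, 0, 1, 1], ["b", "b", "a", "a"])
crossed_grouping = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

An ARI of 1 means the two groupings coincide, values near 0 mean chance-level agreement, and negative values mean agreement below chance.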

NoVaTeST: identifying genes with location-dependent noise variance in spatial transcriptomics data.

Abrar, Mohammed Abid; Kaykobad, M; Rahman, M Saifur; Samee, Md Abul Hassan.

Bioinformatics; 39(6), 2023 Jun 01.

Article in English | MEDLINE | ID: mdl-37285319

ABSTRACT

MOTIVATION: Spatial transcriptomics (ST) can reveal the existence and extent of spatial variation of gene expression in complex tissues. Such analyses could help identify spatially localized processes underlying a tissue's function. Existing tools to detect spatially variable genes assume a constant noise variance across spatial locations. This assumption might miss important biological signals when the variance can change across locations.
RESULTS: In this article, we propose NoVaTeST, a framework to identify genes with location-dependent noise variance in ST data. NoVaTeST models gene expression as a function of spatial location and allows the noise to vary spatially. NoVaTeST then statistically compares this model to one with constant noise and detects genes showing significant spatial noise variation. We refer to these genes as "noisy genes." In tumor samples, the noisy genes detected by NoVaTeST are largely independent of the spatially variable genes detected by existing tools that assume constant noise, and provide important biological insights into tumor microenvironments.
AVAILABILITY AND IMPLEMENTATION: An implementation of the NoVaTeST framework in Python along with instructions for running the pipeline is available at https://github.com/abidabrar-bracu/NoVaTeST.
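
The comparison of a constant-noise model against a spatially varying one can be illustrated with a toy likelihood-ratio test. The sketch below is not the NoVaTeST implementation, and the abstract does not specify the models or the test it uses; as a stated simplification, it assumes zero-mean Gaussian residuals (the spatial mean already removed), splits the spots into just two hypothetical regions, and uses the chi-square approximation with one degree of freedom. All function names are illustrative.

```python
import math


def mle_variance(residuals):
    """Maximum-likelihood variance of zero-mean residuals (must be non-zero)."""
    return sum(r * r for r in residuals) / len(residuals)


def gaussian_loglik(residuals, variance):
    """Log-likelihood of zero-mean Gaussian residuals with the given variance."""
    n = len(residuals)
    squared_sum = sum(r * r for r in residuals)
    return -0.5 * n * math.log(2 * math.pi * variance) - squared_sum / (2 * variance)


def variance_lrt(residuals_a, residuals_b):
    """Toy test: one shared noise variance vs. one variance per region.

    Returns (statistic, p_value). Assumption: two regions only, so the
    alternative has one extra parameter and the statistic is compared
    with a chi-square distribution with 1 degree of freedom.
    """
    pooled = list(residuals_a) + list(residuals_b)
    ll_constant = gaussian_loglik(pooled, mle_variance(pooled))
    ll_varying = (
        gaussian_loglik(residuals_a, mle_variance(residuals_a))
        + gaussian_loglik(residuals_b, mle_variance(residuals_b))
    )
    statistic = max(2.0 * (ll_varying - ll_constant), 0.0)
    # Chi-square survival function for 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Toy usage: a quiet region next to a noisy region for one gene.
quiet = [0.1, -0.1] * 10
noisy = [3.0, -3.0] * 10
stat, p = variance_lrt(quiet, noisy)
```

A small p-value here would flag the gene as having region-dependent noise; a real analysis would also need a continuous spatial noise model and multiple-testing correction across genes.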

Subject(s)

Software, Transcriptome, Gene Expression Profiling

Multinomial Convolutions for Joint Modeling of Regulatory Motifs and Sequence Activity Readouts.

Park, Minjun; Singh, Salvi; Khan, Samin Rahman; Abrar, Mohammed Abid; Grisanti, Francisco; Rahman, M Sohel; Samee, Md Abul Hassan.

Genes (Basel); 13(9), 2022 Sep 08.

Article in English | MEDLINE | ID: mdl-36140783

ABSTRACT

A common goal in the convolutional neural network (CNN) modeling of genomic data is to discover specific sequence motifs. Post hoc analysis methods aid in this task but depend on parameters whose optimal values are unclear, and applying the discovered motifs to new genomic data is not straightforward. As an alternative, we propose to learn convolutions as multinomial distributions, thus streamlining interpretable motif discovery with CNN model fitting. We developed MuSeAM (Multinomial CNNs for Sequence Activity Modeling) by implementing multinomial convolutions in a CNN model. Through benchmarking, we demonstrate the efficacy of MuSeAM in accurately modeling genomic data while fitting multinomial convolutions that recapitulate known transcription factor motifs.
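
One way to picture a convolution learned as a multinomial distribution is a filter whose every position is a probability distribution over A, C, G and T, so the filter can be read directly as a motif. The sketch below is an assumption-laden illustration, not MuSeAM's code: the softmax parameterization, the uniform background of 0.25, the log-odds scoring, and all names are hypothetical choices that the abstract does not confirm.

```python
import math

ALPHABET = "ACGT"


def softmax(values):
    """Turn unconstrained weights into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def to_multinomial_filter(raw_weights):
    """Map each filter position (weights for A, C, G, T) to a distribution."""
    return [softmax(position) for position in raw_weights]


def scan(sequence, motif, background=0.25):
    """Slide the motif along the sequence and return log-odds scores.

    Each score compares the window's probability under the motif with
    its probability under a uniform background (an assumed choice).
    """
    width = len(motif)
    scores = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(
            math.log(motif[i][ALPHABET.index(base)] / background)
            for i, base in enumerate(window)
        )
        scores.append(score)
    return scores


# Toy usage: unconstrained weights that favor the motif "TAT".
raw = [[0.0, 0.0, 0.0, 3.0], [3.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 3.0]]
motif = to_multinomial_filter(raw)
scores = scan("GGTATCC", motif)
best_start = scores.index(max(scores))
```

Because each position is already a distribution, the fitted filter needs no post hoc conversion before it is compared with known transcription factor motifs.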

Subject(s)

Genomics, Neural Networks, Computer, Transcription Factors/genetics
