Results 1 - 20 of 96
1.
Nat Med ; 2024 Jun 18.
Article in English | MEDLINE | ID: mdl-38890530

ABSTRACT

The pathogenesis of allograft (dys)function has been increasingly studied using 'omics'-based technologies, but the focus on individual organs has created knowledge gaps that neither unify nor distinguish pathological mechanisms across allografts. Here we present a comprehensive study of human pan-organ allograft dysfunction, analyzing 150 datasets with more than 12,000 samples across four commonly transplanted solid organs (heart, lung, liver and kidney, n = 1,160, 1,241, 1,216 and 8,853 samples, respectively) that we leveraged to explore transcriptomic differences among allograft dysfunction (delayed graft function, acute rejection and fibrosis), tolerance and stable graft function. We identified genes that correlated robustly with allograft dysfunction across heart, lung, liver and kidney transplantation. Furthermore, we developed a transfer learning omics prediction framework that, by borrowing information across organs, demonstrated superior classifications compared to models trained on single organs. These findings were validated using a single-center prospective kidney transplant cohort study (a collective 329 samples across two timepoints), providing insights supporting the potential clinical utility of our approach. Our study establishes the capacity for machine learning models to learn across organs and presents a transcriptomic transplant resource that can be employed to develop pan-organ biomarkers of allograft dysfunction.

2.
Genome Res ; 34(1): 119-133, 2024 02 07.
Article in English | MEDLINE | ID: mdl-38190633

ABSTRACT

Single-cell technologies offer unprecedented opportunities to dissect gene regulatory mechanisms in context-specific ways. Although there are computational methods for extracting gene regulatory relationships from scRNA-seq and scATAC-seq data, the data integration problem, essential for accurate cell type identification, has been mostly treated as a standalone challenge. Here we present scTIE, a unified method that integrates temporal multimodal data and infers regulatory relationships predictive of cellular state changes. scTIE uses an autoencoder to embed cells from all time points into a common space by using iterative optimal transport, followed by extracting interpretable information to predict cell trajectories. Using a variety of synthetic and real temporal multimodal data sets, we show scTIE achieves effective data integration while preserving more biological signals than existing methods, particularly in the presence of batch effects and noise. Furthermore, on the exemplar multiome data set we generated from differentiating mouse embryonic stem cells over time, we show scTIE captures regulatory elements highly predictive of cell transition probabilities, providing new potentials to understand the regulatory landscape driving developmental processes.


Subject(s)
Gene Expression Profiling , Single-Cell Analysis , Animals , Mice , Gene Expression Profiling/methods , Single-Cell Analysis/methods , Gene Expression Regulation
3.
Nat Commun ; 15(1): 509, 2024 Jan 13.
Article in English | MEDLINE | ID: mdl-38218939

ABSTRACT

Recent advances in subcellular imaging transcriptomics platforms have enabled high-resolution spatial mapping of gene expression, while also introducing significant analytical challenges in accurately identifying cells and assigning transcripts. Existing methods grapple with cell segmentation, frequently leading to fragmented cells or oversized cells that capture contaminated expression. To this end, we present BIDCell, a self-supervised deep learning-based framework with biologically-informed loss functions that learn relationships between spatially resolved gene expression and cell morphology. BIDCell incorporates cell-type data, including single-cell transcriptomics data from public repositories, with cell morphology information. Using a comprehensive evaluation framework consisting of metrics in five complementary categories for cell segmentation performance, we demonstrate that BIDCell outperforms other state-of-the-art methods according to many metrics across a variety of tissue types and technology platforms. Our findings underscore the potential of BIDCell to significantly enhance single-cell spatial expression analyses, enabling great potential in biological discovery.


Subject(s)
Benchmarking , Gene Expression Profiling , Erythrocytes, Abnormal , Histocompatibility Testing , Supervised Machine Learning
4.
iScience ; 26(11): 108220, 2023 Nov 17.
Article in English | MEDLINE | ID: mdl-37965156

ABSTRACT

The mouse olfactory system regenerates constantly throughout life. While genes critical for the initial projection of olfactory sensory neurons (OSNs) to the olfactory bulb have been identified, which genes are important for maintaining the olfactory map during regeneration remains unknown. Here we show that a mutation in Protocadherin 19 (Pcdh19), a cell adhesion molecule and member of the cadherin superfamily, leads to defects in OSN coalescence during regeneration. Surprisingly, lateral glomeruli were more affected and males in particular showed a more severe phenotype. Single cell analysis unexpectedly showed that OSNs expressing the MOR28 odorant receptor could be subdivided into two major clusters. We showed that at least one protocadherin is differentially expressed between OSNs coalescing on the medial and lateral glomeruli. Moreover, females expressed a slightly different complement of genes from males. These features may explain the differential effects of mutating Pcdh19 on medial and lateral glomeruli in males and females.

5.
mBio ; : e0226223, 2023 Oct 16.
Article in English | MEDLINE | ID: mdl-37850732

ABSTRACT

Among the 16 two-component systems in the opportunistic human pathogen Staphylococcus aureus, only WalKR is essential. Like the orthologous systems in other Bacillota, S. aureus WalKR controls autolysins involved in peptidoglycan remodeling and is therefore intimately involved in cell division. However, despite the importance of WalKR in S. aureus, the basis for its essentiality is not understood and the regulon is poorly defined. Here, we defined a consensus WalR DNA-binding motif and the direct WalKR regulon by using functional genomics, including chromatin immunoprecipitation sequencing, with a panel of isogenic walKR mutants that had a spectrum of altered activities. Consistent with prior findings, the direct regulon includes multiple autolysin genes. However, this work also revealed that WalR directly regulates at least five essential genes involved in lipoteichoic acid synthesis (ltaS), translation (rplK), DNA compaction (hup), initiation of DNA replication (dnaA, hup) and purine nucleotide metabolism (prs). Thus, WalKR in S. aureus serves as a polyfunctional regulator that contributes to fundamental control over critical cell processes by coordinately linking cell wall homeostasis with purine biosynthesis, protein biosynthesis, and DNA replication. Our findings further address the essentiality of this locus and highlight the importance of WalKR as a bona fide target for novel anti-staphylococcal therapeutics. IMPORTANCE: The opportunistic human pathogen Staphylococcus aureus uses an array of protein sensing systems called two-component systems (TCS) to sense environmental signals and adapt its physiology in response by regulating different genes. This sensory network is key to S. aureus versatility and success as a pathogen. Here, we reveal for the first time the full extent of the regulatory network of WalKR, the only staphylococcal TCS that is indispensable for survival under laboratory conditions. We found that WalKR is a master regulator of cell growth, coordinating the expression of genes from multiple, fundamental S. aureus cellular processes, including those involved in maintaining cell wall metabolism, protein biosynthesis, nucleotide metabolism, and the initiation of DNA replication.

6.
Nat Commun ; 14(1): 6479, 2023 10 14.
Article in English | MEDLINE | ID: mdl-37838722

ABSTRACT

Global spread of multidrug-resistant, hospital-adapted Staphylococcus epidermidis lineages underscores the need for new therapeutic strategies. Here we show that many S. epidermidis isolates belonging to these lineages display cryptic susceptibility to penicillin/β-lactamase inhibitor combinations under in vitro conditions, despite carrying the methicillin resistance gene mecA. Using a mouse thigh model of S. epidermidis infection, we demonstrate that single-dose treatment with amoxicillin/clavulanic acid significantly reduces methicillin-resistant S. epidermidis loads without leading to detectable resistance development. On the other hand, we also show that methicillin-resistant S. epidermidis is capable of developing increased resistance to amoxicillin/clavulanic acid during long-term in vitro exposure to these drugs. These findings suggest that penicillin/β-lactamase inhibitor combinations could be a promising therapeutic candidate for treatment of a high proportion of methicillin-resistant S. epidermidis infections, although the in vivo risk of resistance development needs to be further addressed before they can be incorporated into clinical trials.


Subject(s)
Penicillins , Staphylococcal Infections , Humans , Penicillins/pharmacology , Penicillins/therapeutic use , beta-Lactamase Inhibitors/pharmacology , Staphylococcus epidermidis , Staphylococcal Infections/drug therapy , Clavulanic Acid/pharmacology , Clavulanic Acid/therapeutic use , Amoxicillin/pharmacology , Amoxicillin/therapeutic use , Microbial Sensitivity Tests , Anti-Bacterial Agents/pharmacology , Anti-Bacterial Agents/therapeutic use
7.
Nat Commun ; 14(1): 4272, 2023 07 17.
Article in English | MEDLINE | ID: mdl-37460600

ABSTRACT

The recent emergence of multi-sample multi-condition single-cell multi-cohort studies allows researchers to investigate different cell states. The effective integration of multiple large-cohort studies promises biological insights into cells under different conditions that individual studies cannot provide. Here, we present scMerge2, a scalable algorithm that allows data integration of atlas-scale multi-sample multi-condition single-cell studies. We have generalized scMerge2 to enable the merging of millions of cells from single-cell studies generated by various single-cell technologies. Using a large COVID-19 data collection with over five million cells from 1000+ individuals, we demonstrate that scMerge2 enables multi-sample multi-condition scRNA-seq data integration from multiple cohorts and reveals signatures derived from cell-type expression that are more accurate in discriminating disease progression. Further, we demonstrate that scMerge2 can remove dataset variability in CyTOF, imaging mass cytometry and CITE-seq experiments, demonstrating its applicability to a broad spectrum of single-cell profiling technologies.


Subject(s)
COVID-19 , Gene Expression Profiling , Humans , Gene Expression Profiling/methods , Single-Cell Analysis/methods , Algorithms , Exome Sequencing , Sequence Analysis, RNA/methods
8.
Biomolecules ; 13(6), 2023 05 31.
Article in English | MEDLINE | ID: mdl-37371497

ABSTRACT

The current coronary artery disease (CAD) risk scores for predicting future cardiovascular events rely on well-recognized traditional cardiovascular risk factors derived at the population level but often fail individuals, with up to 25% of first-time heart attack patients having no risk factors. Non-invasive imaging technology can directly measure coronary artery plaque burden. With an advanced lipidomic measurement methodology, for the first time, we aim to identify lipidomic biomarkers to enable intervention before cardiovascular events. With 994 participants from the BioHEART-CT Discovery Cohort, we collected clinical data and performed high-performance liquid chromatography with mass spectrometry to determine concentrations of 683 plasma lipid species. Statin-naive participants were selected based on subclinical CAD (sCAD) categories as the analytical cohort (n = 580), with sCAD+ (n = 243) compared to sCAD- (n = 337). Through a machine learning approach, we built a lipid risk score (LRS) and compared its performance with that of the existing Framingham Risk Score (FRS) in predicting sCAD+. We obtained individual classifiability scores and identified Body Mass Index (BMI) as the modifying variable. FRS and LRS models achieved similar areas under the receiver operating characteristic curve (AUC) in predicting the validation cohort. LRS enhanced the prediction of sCAD+ in the healthy-weight group (BMI < 25 kg/m²), where FRS performed poorly, and identified individuals at risk that FRS missed. Lipid features have strong potential as biomarkers to predict CAD plaque burden and can identify residual risk not captured by traditional risk factors/scores. LRS complements FRS in prediction and has the most significant benefit in healthy-weight individuals.


Subject(s)
Coronary Artery Disease , Myocardial Infarction , Plaque, Atherosclerotic , Humans , Lipidomics , Coronary Angiography/methods , Risk Assessment , Plaque, Atherosclerotic/diagnostic imaging , Tomography, X-Ray Computed , Biomarkers , Lipids
9.
bioRxiv ; 2023 May 22.
Article in English | MEDLINE | ID: mdl-37292801

ABSTRACT

Single-cell technologies offer unprecedented opportunities to dissect gene regulatory mechanisms in context-specific ways. Although there are computational methods for extracting gene regulatory relationships from scRNA-seq and scATAC-seq data, the data integration problem, essential for accurate cell type identification, has been mostly treated as a standalone challenge. Here we present scTIE, a unified method that integrates temporal multimodal data and infers regulatory relationships predictive of cellular state changes. scTIE uses an autoencoder to embed cells from all time points into a common space using iterative optimal transport, followed by extracting interpretable information to predict cell trajectories. Using a variety of synthetic and real temporal multimodal datasets, we demonstrate scTIE achieves effective data integration while preserving more biological signals than existing methods, particularly in the presence of batch effects and noise. Furthermore, on the exemplar multiome dataset we generated from differentiating mouse embryonic stem cells over time, we demonstrate scTIE captures regulatory elements highly predictive of cell transition probabilities, providing new potentials to understand the regulatory landscape driving developmental processes.

10.
iScience ; 26(5): 106633, 2023 May 19.
Article in English | MEDLINE | ID: mdl-37192969

ABSTRACT

Cardiovascular disease remains a leading cause of mortality with an estimated half a billion people affected in 2019. However, detecting signals between specific pathophysiology and coronary plaque phenotypes using complex multi-omic discovery datasets remains challenging due to the diversity of individuals and their risk factors. Given the complex cohort heterogeneity present in those with coronary artery disease (CAD), we illustrate several different methods, both knowledge-guided and data-driven approaches, for identifying subcohorts of individuals with subclinical CAD and distinct metabolomic signatures. We then demonstrate that utilizing these subcohorts can improve the prediction of subclinical CAD and can facilitate the discovery of novel biomarkers of subclinical disease. Analyses acknowledging cohort heterogeneity through identifying and utilizing these subcohorts may be able to advance our understanding of CVD and provide more effective preventative treatments to reduce the burden of this disease in individuals and in society as a whole.

11.
Microbiome ; 11(1): 51, 2023 03 15.
Article in English | MEDLINE | ID: mdl-36918961

ABSTRACT

BACKGROUND: Unravelling the interplay between diet, the microbiome, and the health state could enable the design of personalized intervention strategies and improve the health and well-being of individuals. A common approach to this is to divide the study population into smaller cohorts based on dietary preferences in the hope of identifying specific microbial signatures. However, classification of patients based solely on diet is unlikely to reflect the microbiome-host health relationship or the taxonomic microbiome makeup. RESULTS: We present a novel approach, the Nutrition-Ecotype Mixture of Experts (NEMoE) model, for establishing associations between gut microbiota and health state that accounts for diet-specific cohort variability using a regularized mixture of experts model framework with an integrated parameter sharing strategy to ensure data-driven diet-cohort identification consistency across taxonomic levels. The success of our approach was demonstrated through a series of simulation studies, in which NEMoE showed robustness with regard to parameter selection and varying degrees of data heterogeneity. Further application to real-world microbiome data from a Parkinson's disease cohort revealed that NEMoE is capable not only of improving predictive performance for Parkinson's disease but also of identifying diet-specific microbial signatures of disease. CONCLUSION: In summary, NEMoE can be used to uncover diet-specific relationships between nutritional-ecotype and patient health and to contextualize precision nutrition for different diseases.


Subject(s)
Microbiota , Parkinson Disease , Humans , Ecotype , Diet , Nutritional Status
12.
Comput Biol Med ; 154: 106576, 2023 03.
Article in English | MEDLINE | ID: mdl-36736097

ABSTRACT

The spatial architecture of the tumour microenvironment and phenotypic heterogeneity of tumour cells have been shown to be associated with cancer prognosis and clinical outcomes, including survival. Recent advances in highly multiplexed imaging, including imaging mass cytometry (IMC), capture spatially resolved, high-dimensional maps that quantify dozens of disease-relevant biomarkers at single-cell resolution and have the potential to inform patient-specific prognosis. Existing automated methods for predicting survival, on the other hand, typically do not leverage spatial phenotype information captured at the single-cell level. Furthermore, there is no end-to-end method designed to leverage the rich information in whole IMC images and all marker channels, and aggregate this information with clinical data in a complementary manner to predict survival with enhanced accuracy. To that end, we present a deep multimodal graph-based network (DMGN) with two modules: (1) a multimodal graph-based module that considers relationships between spatial phenotype information in all image regions and all clinical variables adaptively, and (2) a clinical embedding module that automatically generates embeddings specialised for each clinical variable to enhance multimodal aggregation. We demonstrate that our modules are consistently effective at improving survival prediction performance using two public breast cancer datasets, and that our new approach can outperform state-of-the-art methods in survival prediction.


Subject(s)
Neoplasms , Tumor Microenvironment , Humans , Phenotype , Upper Extremity , Neoplasms/diagnostic imaging
13.
Brief Bioinform ; 24(2), 2023 03 19.
Article in English | MEDLINE | ID: mdl-36813563

ABSTRACT

Cell-state transition can reveal additional information from single-cell ribonucleic acid (RNA)-sequencing data in time-resolved biological phenomena. However, most of the current methods are based on the time derivative of the gene expression state, which restricts them to the short-term evolution of cell states. Here, we present single-cell State Transition Across-samples of RNA-seq data (scSTAR), which overcomes this limitation by constructing a paired-cell projection between biological conditions with an arbitrary time span by maximizing the covariance between two feature spaces using partial least squares and minimum squared error methods. In mouse ageing data, the response to stress in CD4+ memory T cell subtypes was found to be associated with ageing. A novel Treg subtype characterized by mTORC activation was identified to be associated with antitumour immune suppression, which was confirmed by immunofluorescence microscopy and survival analysis in 11 cancers from The Cancer Genome Atlas Program. On melanoma data, scSTAR improved immunotherapy-response prediction accuracy from 0.8 to 0.96.


Subject(s)
Gene Expression Profiling , RNA , Animals , Mice , RNA/genetics , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Genome
14.
Nat Rev Microbiol ; 21(6): 380-395, 2023 06.
Article in English | MEDLINE | ID: mdl-36707725

ABSTRACT

Invasive Staphylococcus aureus infections are common, causing high mortality, compounded by the propensity of the bacterium to develop drug resistance. S. aureus is an excellent case study of the potential for a bacterium to be commensal, colonizing, latent or disease-causing; these states are defined by the interplay between S. aureus and host. This interplay is multidimensional and evolving, exemplified by the spread of S. aureus between humans and other animal reservoirs and the lack of success in vaccine development. In this Review, we examine recent advances in understanding the S. aureus-host interactions that lead to infections. We revisit the primary role of neutrophils in controlling infection, summarizing the discovery of new immune evasion molecules and the discovery of new functions ascribed to well-known virulence factors. We explore the intriguing intersection of bacterial and host metabolism, where crosstalk in both directions can influence immune responses and infection outcomes. This Review also assesses the surprising genomic plasticity of S. aureus, its dualism as a multi-mammalian species commensal and opportunistic pathogen and our developing understanding of the roles of other bacteria in shaping S. aureus colonization.


Subject(s)
Staphylococcal Infections , Staphylococcus aureus , Animals , Humans , Staphylococcus aureus/genetics , Immune Evasion , Virulence Factors/genetics , Adaptation, Physiological , Host-Pathogen Interactions , Mammals
15.
Preprint in English | bioRxiv | ID: ppbiorxiv-518997

ABSTRACT

Recent advancements in the use of single-cell technologies in large cohort studies enable the investigation of cellular response and mechanisms associated with disease outcome, including COVID-19. Several efforts have been made using single-cell RNA-sequencing to better understand the immune response to COVID-19 virus infection. Nonetheless, it is often difficult to compare or integrate data from multiple data sets due to challenges in data normalisation, metadata harmonisation, and the lack of a common interface to quickly query and access this vast amount of data. Here we present Covidscope (http://covidsc.d24h.hk/), a well-curated open web resource that currently contains single-cell gene expression data and associated metadata of almost 5 million blood and immune cells extracted from almost 1,000 COVID-19 patients across 20 studies around the world. Our collection contains the integrated data with harmonised metadata and multi-level cell type annotations. By combining NoSQL with an optimised index, Covidscope achieves rapid subsetting of high-dimensional gene expression data based on data set-level, donor-level (e.g., age and sex of patients) and cell-level (e.g., expression of specific gene markers) metadata, enabling multiple efficient downstream single-cell meta-analyses.

16.
Cancers (Basel) ; 14(21), 2022 Oct 24.
Article in English | MEDLINE | ID: mdl-36358632

ABSTRACT

Viruses are well-known drivers of several human malignancies. A causative factor for oral cavity squamous cell carcinoma (OSCC) in patients with limited exposure to traditional risk factors, including tobacco use, is yet to be identified. Our study aimed to comprehensively evaluate the role of viral drivers in OSCC patients with low cumulative exposure to traditional risk factors. Patients under 50 years of age with OSCC, defined using strict anatomic criteria, were selected for whole-genome sequencing (WGS). The WGS data were interrogated using viral detection tools (Kraken 2 and BLASTN), together examining >700,000 viruses. The findings were further verified using tissue microarrays of OSCC samples using both immunohistochemistry and RNA in situ hybridisation (ISH). Twenty-eight patients underwent WGS and comprehensive viral profiling. One 49-year-old male patient with OSCC of the hard palate demonstrated HPV35 integration. A total of 657 cases of OSCC were then evaluated for the presence of HPV integration through immunohistochemistry for p16 and HPV RNA ISH. HPV integration was seen in 8 (1.2%) patients, all middle-aged men with predominant floor of mouth involvement. In summary, a wide-ranging interrogation of >700,000 viruses using OSCC WGS data showed HPV integration in a minority of male OSCC patients, and this integration did not carry any prognostic significance.

17.
PLoS Comput Biol ; 18(10): e1010495, 2022 10.
Article in English | MEDLINE | ID: mdl-36197936

ABSTRACT

COVID-19 patients display a wide range of disease severity, ranging from asymptomatic to critical symptoms with high mortality risk. Our ability to understand the interaction of SARS-CoV-2 infected cells within the lung, and of protective or dysfunctional immune responses to the virus, is critical to effectively treat these patients. Currently, our understanding of cell-cell interactions across different disease states, and how such interactions may drive pathogenic outcomes, is incomplete. Here, we developed a generalizable and scalable workflow for identifying cells that are differentially interacting across COVID-19 patients with distinct disease outcomes and use this to examine eight public single-cell RNA-seq datasets (six from peripheral blood mononuclear cells, one from bronchoalveolar lavage and one from nasopharyngeal), with a total of 211 individual samples. By characterizing the cell-cell interaction patterns across epithelial and immune cells in lung tissues for patients with varying disease severity, we illustrate diverse communication patterns across individuals, and discover heterogeneous communication patterns among moderate and severe patients. We further illustrate patterns derived from cell-cell interactions are potential signatures for discriminating between moderate and severe patients. Overall, this workflow can be generalized and scaled to combine multiple scRNA-seq datasets to uncover cell-cell interactions.


Subject(s)
COVID-19 , Cell Communication , Humans , Leukocytes, Mononuclear , SARS-CoV-2 , Workflow
18.
Access Microbiol ; 4(4): 000346, 2022.
Article in English | MEDLINE | ID: mdl-35812709

ABSTRACT

Background: Australia's response to the coronavirus disease 2019 (COVID-19) pandemic relies on widespread availability of rapid, accurate testing and reporting of results to facilitate contact tracing. The extensive geographical area of Australia presents a logistical challenge, with many of the population located distant from a laboratory capable of robust severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) detection. A strategy to address this is the deployment of a mobile facility utilizing novel diagnostic platforms. This study aimed to evaluate the feasibility of a fully contained transportable SARS-CoV-2 testing laboratory using a range of rapid point-of-care tests. Method: A 20 ft (6.1 m) shipping container was refurbished (GeneWorks, Adelaide, South Australia) with climate controls, laboratory benches, hand-wash station and a class II biosafety cabinet. Portable marquees situated adjacent to the container served as stations for registration, sample acquisition and personal protective equipment for staff. Specimens were collected and tested on-site utilizing either the Abbott ID NOW or Abbott Panbio rapid tests. SARS-CoV-2 positive results from the rapid platforms or any participants reporting symptoms consistent with COVID-19 were tested on-site by GeneXpert Xpress RT-PCR. All samples were tested in parallel with a standard-of-care RT-PCR test (Panther Fusion SARS-CoV-2 assay) performed at the public health reference laboratory. In-laboratory environmental conditions and data management-related factors were also recorded. Results: Over a 3 week period, 415 participants were recruited for point-of-care SARS-CoV-2 testing. From time of enrolment, the median result turnaround time was 26 min for the Abbott ID NOW, 32 min for the Abbott Panbio and 75 min for the Xpert Xpress. 
The environmental conditions of the refurbished shipping container were found to be suitable for all platforms tested, although humidity may have produced condensation within the container. Available software enabled turnaround times to be recorded, although technical malfunction resulted in incomplete data capture. Conclusion: Transportable container laboratories can enable rapid COVID-19 results at the point of care and may be useful during outbreak settings, particularly in environments that are physically distant from centralized laboratories. They may also be appropriate in resource-limited settings. The results of this pilot study confirm feasibility, although larger trials to validate individual rapid point-of-care testing platforms in this environment are required.

19.
NPJ Digit Med ; 5(1): 85, 2022 Jul 04.
Article in English | MEDLINE | ID: mdl-35788693

ABSTRACT

In this modern era of precision medicine, molecular signatures identified from advanced omics technologies hold great promise to better guide clinical decisions. However, current approaches are often location-specific due to the inherent differences between platforms and across multiple centres, thus limiting the transferability of molecular signatures. We present Cross-Platform Omics Prediction (CPOP), a penalised regression model that can use omics data to predict patient outcomes in a platform-independent manner and across time and experiments. CPOP improves on the traditional prediction framework of using gene-based features by selecting ratio-based features with similar estimated effect sizes. These components gave CPOP the ability to have a stable performance across datasets of similar biology, minimising the effect of technical noise often generated by omics platforms. We present a comprehensive evaluation using melanoma transcriptomics data to demonstrate its potential to be used as a critical part of a clinical screening framework for precision medicine. Additional assessment of generalisation was demonstrated with ovarian cancer and inflammatory bowel disease studies.
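
The role of ratio-based features can be illustrated with a minimal sketch. It assumes, for illustration only, that a platform difference acts as a multiplicative shift on every gene in a sample; the function name `pairwise_log_ratios` and the toy values are hypothetical and are not taken from the CPOP software, whose actual feature selection and penalised regression steps are not described in this abstract.

```python
import math
from itertools import combinations


def pairwise_log_ratios(sample):
    """Return log(x_i / x_j) for every gene pair i < j in one sample.

    `sample` is a list of strictly positive expression values.
    """
    logs = [math.log(x) for x in sample]
    return [logs[i] - logs[j] for i, j in combinations(range(len(logs)), 2)]


# Toy expression values for four genes in a single sample (made up).
sample = [12.0, 3.0, 48.0, 6.0]

# Hypothetical platform effect: every gene in the sample is scaled by 2.5.
rescaled = [2.5 * x for x in sample]

ratios_a = pairwise_log_ratios(sample)
ratios_b = pairwise_log_ratios(rescaled)

# The common scale factor cancels inside each ratio, so the features agree.
scale_invariant = all(
    math.isclose(a, b, abs_tol=1e-9) for a, b in zip(ratios_a, ratios_b)
)
print(len(ratios_a), scale_invariant)
```

Because the shared factor cancels within each gene pair, a model fitted on such features sees the same inputs from both versions of the sample. Real platform effects are rarely a single multiplicative constant, so this shows only the intuition.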

20.
Gigascience ; 11, 2022 07 30.
Article in English | MEDLINE | ID: mdl-35906887

ABSTRACT

Survival analysis is a branch of statistics that deals with both the tracking of time and the survival status simultaneously as the dependent response. Current comparisons of survival model performance mostly center on clinical data with classic statistical survival models, with prediction accuracy often serving as the sole metric of model performance. Moreover, survival analysis approaches for censored omics data have not been thoroughly investigated. The common approach is to binarize the survival time and perform a classification analysis. Here, we develop a benchmarking design, SurvBenchmark, that evaluates a diverse collection of survival models for both clinical and omics data sets. SurvBenchmark not only focuses on classical approaches such as the Cox model but also evaluates state-of-the-art machine learning survival models. All approaches were assessed using multiple performance metrics; these include model predictability, stability, flexibility, and computational issues. Our systematic comparison design with 320 comparisons (20 methods over 16 data sets) shows that the performances of survival models vary in practice over real-world data sets and over the choice of the evaluation metric. In particular, we highlight that using multiple performance metrics is critical in providing a balanced assessment of various models. The results in our study will provide practical guidelines for translational scientists and clinicians, as well as define possible areas of investigation in both survival technique and benchmarking strategies.


Subject(s)
Benchmarking , Machine Learning , Algorithms , Proportional Hazards Models , Survival Analysis
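
The SurvBenchmark abstract above does not name its performance metrics. As an assumed example of a metric that uses censored data directly, rather than binarizing survival time, the following sketch computes Harrell's concordance index in plain Python. The function name and toy data are hypothetical and are not taken from SurvBenchmark.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs that the risk scores order correctly.

    A pair (i, j) is comparable when subject i has an observed event
    (events[i] is truthy) and times[i] < times[j]. Pairs with tied times
    are skipped in this simplified version; tied risk scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up): follow-up times, event indicators (1 = event, 0 = censored),
# and model risk scores where a higher score means a higher predicted risk.
times = [5, 8, 12, 20]
events = [1, 0, 1, 0]
risk = [0.9, 0.4, 0.1, 0.6]

c_index = concordance_index(times, events, risk)
print(c_index)
```

In this toy data the two censored subjects still contribute as the later member of a pair, which is information that a binarized classification analysis would discard.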