ABSTRACT
Recent advancements in generative approaches in AI have opened up the prospect of synthetic tabular clinical data generation. From filling in missing values in real-world data, these approaches have now advanced to creating complex multi-table datasets. This review explores the development of techniques capable of synthesizing patient data and modeling multiple tables. We highlight the challenges and opportunities of these methods for analyzing patient data in physiology and discuss their potential for improving clinical research, personalized medicine, and healthcare policy. The integration of these generative models into physiological settings may represent both a theoretical advancement and a practical tool that has the potential to improve mechanistic understanding and patient care. By providing a reliable source of synthetic data, these models can also help mitigate privacy concerns and facilitate large-scale data sharing.
ABSTRACT
Altered expression and functional roles of the transcribed ultraconserved regions (T-UCRs), genomic sequences with 100% conservation between the human, mouse, and rat genomes, in the pathophysiology of neoplasms have already been investigated. Nevertheless, the functional relevance of T-UCRs in gastric cancer (GC) is still the subject of inquiry. In the current study, we first used a genome-wide profiling approach to analyze the expression of T-UCRs in GC patients. Then, we constructed a three-component regulatory network and investigated potential diagnostic and prognostic values of the T-UCRs. The Cancer Genome Atlas Stomach Adenocarcinoma (TCGA-STAD) dataset was used as a resource for the RNA-sequencing data. FeatureCounts was utilized to quantify the number of reads mapped to each T-UCR. Differential expression analysis was then conducted using DESeq2. Subsequently, interactions between T-UCRs, microRNAs (miRNAs), and messenger RNAs (mRNAs) were combined into a three-component network. Enrichment analyses were performed and a protein-protein interaction (PPI) network was constructed. The R Survival package was utilized to identify survival-related significantly differentially expressed T-UCRs (DET-UCRs). Using an in-house cohort of GC tissues, expression of two DET-UCRs was furthermore experimentally verified. Our results showed that several T-UCRs were dysregulated in TCGA-STAD tumoral samples compared to nontumoral counterparts. The constructed three-component network was composed of DET-UCR, miRNA, and mRNA nodes. Functional enrichment and PPI network analyses revealed important enriched signaling pathways and gene ontologies such as "pathway in cancer" and regulation of cell proliferation and apoptosis. Five T-UCRs were significantly correlated with the overall survival of GC patients. While no expression of uc.232 was observed in our in-house cohort of GC tissues, uc.343 showed an increased expression, although not statistically significant, in gastric tumoral tissues. The constructed three-component regulatory network of T-UCRs in GC presents a comprehensive understanding of the underlying gene expression regulation processes involved in tumor development and can serve as a basis to investigate potential prognostic biomarkers and therapeutic targets.
Subject(s)
Adenocarcinoma; MicroRNAs; RNA, Long Noncoding; Stomach Neoplasms; Humans; Rats; Mice; Animals; Stomach Neoplasms/genetics; Prognosis; Conserved Sequence/genetics; Gene Expression Regulation, Neoplastic; MicroRNAs/genetics; Adenocarcinoma/genetics; Biomarkers; Gene Regulatory Networks; Biomarkers, Tumor/genetics
ABSTRACT
Background Aims: This meta-analysis aims to summarize the whole body of research on cell therapies for acute myocardial infarction (MI) in the mouse model in order to advance ongoing research in this field of regenerative medicine. Despite rather modest effects in clinical trials, pre-clinical studies continue to report beneficial effects of cardiac cell therapies for cardiac repair following acute ischemic injury. Results: The authors' meta-analysis of data from 166 mouse studies comprising 257 experimental groups demonstrated a significant improvement in left ventricular ejection fraction of 10.21% after cell therapy compared with control animals. Subgroup analysis indicated that second-generation cell therapies such as cardiac progenitor cells and pluripotent stem cell derivatives had the highest therapeutic potential for minimizing myocardial damage post-MI. Conclusions: Whereas the vision of functional tissue replacement has been replaced by the concept of regional scar modulation in most of the investigated studies, rather basic methods for assessing cardiac function were most frequently used. Hence, future studies will benefit greatly from integrating methods for assessment of regional wall properties to develop a deeper understanding of how to modulate cardiac healing after acute MI.
Subject(s)
Myocardial Infarction; Ventricular Function, Left; Animals; Mice; Stroke Volume; Heart; Myocardial Infarction/therapy; Stem Cell Transplantation/methods
ABSTRACT
The in vitro generation of human cardiomyocytes derived from induced pluripotent stem cells (iPSC) is of great importance for cardiac disease modeling, drug-testing applications, and regenerative medicine. Despite the development of various cultivation strategies, a sufficiently high degree of maturation is still a decisive limiting factor for the successful application of these cardiac cells. The maturation process includes, among others, the proper formation of sarcomere structures, which mediate the contraction of cardiomyocytes. To precisely monitor the maturation of the contractile machinery, we have established an imaging-based strategy that allows quantitative evaluation of important parameters defining the quality of the sarcomere network. iPSC-derived cardiomyocytes were subjected to different culture conditions to improve sarcomere formation, including prolonged cultivation time and micropatterned surfaces. Fluorescent images of α-actinin were acquired using super-resolution microscopy. Subsequently, we determined cell morphology, sarcomere density, filament alignment, Z-disc thickness, and sarcomere length of iPSC-derived cardiomyocytes. Cells from adult and neonatal heart tissue served as controls. Our image analysis revealed a profound effect on sarcomere content and filament orientation when iPSC-derived cardiomyocytes were cultured on structured, line-shaped surfaces. Similarly, prolonged cultivation time had a beneficial effect on structural maturation, leading to a more adult-like phenotype. Automatic evaluation of the sarcomere filaments by machine learning validated our data. Moreover, we successfully transferred this approach to skeletal muscle cells, showing improved sarcomere formation over different differentiation periods. Overall, our image-based workflow can be used as a straightforward tool to quantitatively estimate the structural maturation of contractile cells. As such, it can support the establishment of novel differentiation protocols to enhance sarcomere formation and maturity.
Subject(s)
Calcium Signaling/physiology; Cell Differentiation/physiology; Induced Pluripotent Stem Cells/cytology; Induced Pluripotent Stem Cells/metabolism; Sarcomeres/metabolism; Actinin/metabolism; Animals; Calcium/metabolism; Cells, Cultured; Humans; Machine Learning; Mice; Microscopy, Fluorescence/methods; Muscle, Skeletal/cytology; Myocardium/cytology; Phenotype; RNA/genetics; RNA/isolation & purification
ABSTRACT
The vast and heterogeneous data being constantly generated in clinics can provide great wealth for patients and research alike. The quickly evolving field of medical informatics research has contributed numerous concepts, algorithms, and standards to facilitate this development. However, the intricate relationships, complex terminologies, and multiple implementations involved can present obstacles for people who want to become active in the field. With a particular focus on medical informatics research conducted in Germany, we present in our Viewpoint a set of 10 important topics to improve the overall interdisciplinary communication between different stakeholders (eg, physicians, computational experts, experimentalists, students, patient representatives). This may lower the barriers to entry and offer a starting point for collaborations at different levels. The suggested topics are briefly introduced, then general best practice guidance is given, and further resources for in-depth reading or hands-on tutorials are recommended. In addition, the topics are set to cover current aspects and open research gaps of the medical informatics domain, including data regulations and concepts; data harmonization and processing; and data evaluation, visualization, and dissemination. Furthermore, we give an example of how these topics can be integrated into a medical informatics curriculum for higher education. By recognizing these topics, readers will be able to (1) set clinical and research data into the context of medical informatics, understanding what is possible to achieve with data or how data should be handled in terms of data privacy and storage; (2) distinguish current interoperability standards and obtain first insights into the processes leading to effective data transfer and analysis; and (3) value the use of newly developed technical approaches to utilize the full potential of clinical data.
Subject(s)
Medical Informatics; Humans; Curriculum; Algorithms; Germany
ABSTRACT
Single-cell RNA-sequencing (scRNA-seq) provides high-resolution insights into complex tissues. Cardiac tissue, however, poses a major challenge due to the delicate isolation process and the large size of mature cardiomyocytes. Regardless of the experimental technique, captured cells are often impaired and some capture sites may contain multiple cells or no cells at all. Such captures are considered "low quality" and can lead to data misinterpretation. Common standard quality control parameters involve the number of detected genes, transcripts per cell, and the fraction of transcripts from mitochondrial genes. While cutoffs for transcripts and genes per cell are usually user-defined for each experiment or individually calculated, a fixed threshold of 5% mitochondrial transcripts is standard and often set as default in scRNA-seq software. However, this parameter is highly dependent on the tissue type. In the heart, mitochondrial transcripts comprise almost 30% of total mRNA due to high energy demands. Here, we demonstrate that a 5% threshold not only causes an unacceptable exclusion of cardiomyocytes but also introduces a bias that particularly discriminates against pacemaker cells. This effect is apparent for our in vitro generated induced-sinoatrial-bodies (iSABs; highly enriched physiologically functional pacemaker cells), and also evident in a public data set of cells isolated from embryonal murine sinoatrial node tissue (Goodyer William et al. in Circ Res 125:379-397, 2019). Taken together, we recommend omitting this filtering parameter for scRNA-seq in cardiovascular applications whenever possible.
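The effect of a fixed mitochondrial cutoff can be illustrated with a minimal Python sketch. The gene names, counts, and helper functions below are hypothetical and chosen only for illustration; they are not taken from the study or from any scRNA-seq package.

```python
def mito_fraction(counts, gene_names, prefix="mt-"):
    """Return the fraction of a cell's transcripts that map to mitochondrial genes.

    Mitochondrial genes are recognised by a name prefix (assumed here to be
    "mt-", compared case-insensitively).
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    mito = sum(c for c, g in zip(counts, gene_names) if g.lower().startswith(prefix))
    return mito / total


def retained_cells(cells, gene_names, max_mito_fraction=None):
    """Return the ids of cells that pass the mitochondrial filter.

    With max_mito_fraction=None the filter is omitted and every cell is kept.
    """
    kept = []
    for cell_id, counts in cells.items():
        frac = mito_fraction(counts, gene_names)
        if max_mito_fraction is None or frac <= max_mito_fraction:
            kept.append(cell_id)
    return kept


# Hypothetical toy data: two cells, four genes (two of them mitochondrial).
genes = ["mt-Co1", "mt-Nd1", "Myh6", "Hcn4"]
cells = {
    "cardiomyocyte_like": [20, 10, 60, 10],  # 30 of 100 transcripts are mitochondrial
    "low_mito_cell": [2, 1, 50, 47],         # 3 of 100 transcripts are mitochondrial
}

with_cutoff = retained_cells(cells, genes, max_mito_fraction=0.05)
without_cutoff = retained_cells(cells, genes)
```

In this toy example the 5% cutoff removes the cell with a cardiomyocyte-like mitochondrial fraction, whereas omitting the filter keeps both cells.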
Subject(s)
RNA, Mitochondrial/genetics; RNA, Small Cytoplasmic/genetics; Single-Cell Analysis/methods; Animals; Cluster Analysis; Gene Expression Profiling/methods; Humans; Mice; Myocytes, Cardiac/physiology; Quality Control; RNA, Messenger/genetics; Sequence Analysis, RNA; Software; Exome Sequencing/methods
ABSTRACT
The current generation of sequencing technologies has led to significant advances in identifying novel disease-associated mutations and generated large amounts of data in a high-throughput manner. Such data, in conjunction with clinical routine data, have proven to be highly useful in deriving population-level and patient-level predictions, especially in the field of cancer precision medicine. However, data harmonization across multiple national and international clinical sites is an essential step for the assessment of events and outcomes associated with patients, and it is currently not adequately addressed. The Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) is an internationally established research data repository introduced by the Observational Health Data Science and Informatics (OHDSI) community to overcome this issue. To address the needs of cancer research, the genomic vocabulary extension was introduced in 2020 to support the standardization of subsequent data analysis. In this review, we evaluate the current applicability of the OMOP CDM to cancer prediction and how comprehensively the genomic vocabulary extension of the OMOP can serve current needs of AI-based predictions. For this, we systematically screened the literature for articles that use the OMOP CDM in predictive analyses in cancer and investigated the underlying predictive models/tools. We found 248 articles, most of which use the OMOP for harmonizing their data, but only 5 make use of predictive algorithms on OMOP-based data and fulfill our criteria. The studies present multicentric investigations, in which the OMOP played an essential role in discovering and optimizing machine learning (ML)-based models. Ultimately, the use of the OMOP CDM leads to standardized data-driven studies for multiple clinical sites and enables a more solid basis utilizing, e.g., ML models that can be reused and combined in early prediction, diagnosis, and improvement of personalized cancer care and biomarker discovery.
Subject(s)
Medical Informatics; Neoplasms; Biomarkers; Data Analysis; Databases, Factual; Electronic Health Records; Humans; Neoplasms/diagnosis; Neoplasms/genetics; Precision Medicine
ABSTRACT
Ventricular arrhythmias associated with myocardial infarction (MI) have a significant impact on mortality in patients following heart attack. Therefore, targeted reduction of arrhythmia represents a therapeutic approach for the prevention and treatment of severe events after infarction. Recent research transplanting mesenchymal stem cells (MSC) showed their potential in MI therapy. Our study aimed to investigate the effects of MSC injection on post-infarction arrhythmia. To more closely mimic the clinical situation, we used our previously established murine double infarction model and intramyocardially injected hypoxic pre-conditioned murine MSC into the infarction border. Thereafter, various types of arrhythmias were recorded and analyzed. We observed a homogeneous distribution of all types of arrhythmias after the first infarction, without any significant differences between the groups. Yet, MSC therapy after double infarction led to a highly significant reduction in simple and complex arrhythmias. Moreover, RNA-sequencing of samples from stem cell treated mice after re-infarction demonstrated a significant decline in most arrhythmias with reduced inflammatory pathways. Additionally, following stem-cell therapy we found numerous highly expressed genes to be linked to a lower risk of heart failure, cardiomyopathy, or sudden cardiac death. Moreover, genes known to be associated with arrhythmogenesis and key mutations underlying arrhythmias were downregulated. In summary, our stem-cell therapy led to a reduction in cardiac arrhythmias after MI and showed a downregulation of already established inflammatory pathways. Furthermore, our study reveals gene regulation pathways that have a potentially direct influence on arrhythmogenesis after myocardial infarction.
Subject(s)
Mesenchymal Stem Cell Transplantation; Mesenchymal Stem Cells; Myocardial Infarction; Animals; Arrhythmias, Cardiac/etiology; Arrhythmias, Cardiac/metabolism; Arrhythmias, Cardiac/therapy; Disease Models, Animal; Mesenchymal Stem Cells/metabolism; Mice; Myocardial Infarction/complications; Myocardial Infarction/metabolism; Myocardial Infarction/therapy
ABSTRACT
BACKGROUND: The research landscape of single-cell and single-nuclei RNA-sequencing is evolving rapidly. In particular, the detection of rare cells has been greatly facilitated by this technology. However, an automated, unbiased, and accurate annotation of rare subpopulations is challenging. Once rare cells are identified in one dataset, it is usually necessary to generate further specific datasets to enrich the analysis (e.g., with samples from other tissues). From a machine learning perspective, the challenge arises from the fact that rare-cell subpopulations constitute an imbalanced classification problem. We here introduce a Machine Learning (ML)-based oversampling method that uses gene expression counts of already identified rare cells as an input to generate synthetic cells, which are then used to identify similar (rare) cells in other publicly available experiments. We utilize single-cell synthetic oversampling (sc-SynO), which is based on the Localized Random Affine Shadowsampling (LoRAS) algorithm. The algorithm corrects for the overall imbalance ratio of the minority and majority class. RESULTS: We demonstrate the effectiveness of our method for three independent use cases, each consisting of already published datasets. The first use case identifies cardiac glial cells in snRNA-Seq data (17 nuclei out of 8635). This use case was designed to take a larger imbalance ratio (~1 to 500) into account and only uses single-nuclei data. The second use case was designed to jointly use snRNA-Seq data and scRNA-Seq on a lower imbalance ratio (~1 to 26) for the training step to likewise investigate the potential of the algorithm to consider both single-cell capture procedures and the impact of "less" rare-cell types. The third dataset refers to the murine data of the Allen Brain Atlas, including more than 1 million cells. For validation purposes only, all datasets have also been analyzed traditionally using common data analysis approaches, such as the Seurat workflow.
CONCLUSIONS: In comparison to baseline testing without oversampling, our approach identifies rare cells with a robust precision-recall balance, including a high accuracy and low false positive detection rate. A practical benefit of our algorithm is that it can be readily implemented in other and existing workflows. The code base in R and Python is publicly available at FairdomHub, as well as GitHub, and can easily be transferred to identify other rare-cell types.
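As a rough illustration of the oversampling idea, the following Python sketch generates synthetic minority-class profiles by adding small Gaussian noise to known rare cells ("shadow samples") and then forming weighted averages of a few of them. It is a simplified stand-in, not the sc-SynO or LoRAS implementation: it ignores the neighbourhood structure that LoRAS uses, and all function names, parameters, and values are assumptions made for this example.

```python
import random


def shadow_samples(point, n_shadow, noise_sd, rng):
    """Create noisy copies of one minority-class expression profile."""
    return [[x + rng.gauss(0.0, noise_sd) for x in point] for _ in range(n_shadow)]


def random_weights(k, rng):
    """Draw k non-negative weights that sum to one."""
    raw = [rng.random() for _ in range(k)]
    total = sum(raw)
    return [r / total for r in raw]


def synthesize(minority, n_synthetic, n_shadow=5, n_combine=3, noise_sd=0.05, seed=0):
    """Generate synthetic minority-class samples.

    Simplification (assumption): shadow samples of all minority points are
    pooled, whereas LoRAS restricts combinations to local neighbourhoods.
    """
    rng = random.Random(seed)
    pool = []
    for point in minority:
        pool.extend(shadow_samples(point, n_shadow, noise_sd, rng))
    n_features = len(minority[0])
    synthetic = []
    for _ in range(n_synthetic):
        chosen = rng.sample(pool, n_combine)
        weights = random_weights(n_combine, rng)
        synthetic.append(
            [sum(w * c[j] for w, c in zip(weights, chosen)) for j in range(n_features)]
        )
    return synthetic


# Hypothetical normalised expression values of three rare cells for two genes.
rare_cells = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.1]]
synthetic_cells = synthesize(rare_cells, n_synthetic=10)
```

The synthetic profiles would then be added to the minority class before training a classifier, so that the classifier sees a less extreme imbalance ratio.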
Subject(s)
RNA; Single-Cell Analysis; Animals; Cluster Analysis; Machine Learning; Mice; RNA/genetics; Sequence Analysis, RNA
ABSTRACT
BACKGROUND: Fifteen percent of atopic dermatitis (AD) liability-scale heritability could be attributed to 31 susceptibility loci identified by using genome-wide association studies, with only 3 of them (IL13, IL-6 receptor [IL6R], and filaggrin [FLG]) resolved to protein-coding variants. OBJECTIVE: We examined whether a significant portion of unexplained AD heritability is further explained by low-frequency and rare variants in the gene-coding sequence. METHODS: We evaluated common, low-frequency, and rare protein-coding variants using exome chip and replication genotype data of 15,574 patients and 377,839 control subjects combined with whole-transcriptome data on lesional, nonlesional, and healthy skin samples of 27 patients and 38 control subjects. RESULTS: An additional 12.56% (SE, 0.74%) of AD heritability is explained by rare protein-coding variation. We identified docking protein 2 (DOK2) and CD200 receptor 1 (CD200R1) as novel genome-wide significant susceptibility genes. Rare coding variants associated with AD are further enriched in 5 genes (IL-4 receptor [IL4R], IL13, Janus kinase 1 [JAK1], JAK2, and tyrosine kinase 2 [TYK2]) of the IL13 pathway, all of which are targets for novel systemic AD therapeutics. Multiomics-based network and RNA sequencing analysis revealed DOK2 as a central hub interacting with, among others, CD200R1, IL6R, and signal transducer and activator of transcription 3 (STAT3). Multitissue gene expression profile analysis for 53 tissue types from the Genotype-Tissue Expression project showed that disease-associated protein-coding variants exert their greatest effect in skin tissues. CONCLUSION: Our discoveries highlight a major role of rare coding variants in AD acting independently of common variants. Further extensive functional studies are required to detect all potential causal variants and to specify the contribution of the novel susceptibility genes DOK2 and CD200R1 to overall disease susceptibility.
Subject(s)
Adaptor Proteins, Signal Transducing/genetics; Dermatitis, Atopic/genetics; Genotype; Orexin Receptors/genetics; Phosphoproteins/genetics; Skin/metabolism; Adult; Cohort Studies; Filaggrin Proteins; Gene Frequency; Genetic Predisposition to Disease; Genome-Wide Association Study; Humans; Organ Specificity; Polymorphism, Genetic; Risk; Transcriptome
ABSTRACT
RNA-based regulation has become a major research topic in molecular biology. The analysis of epigenetic and expression data is therefore incomplete if RNA-based regulation is not taken into account. Thus, it is increasingly important but not yet standard to combine RNA-centric data and analysis tools with other types of experimental data such as RNA-seq or ChIP-seq. Here, we present the RNA workbench, a comprehensive set of analysis tools and consolidated workflows that enable the researcher to combine these two worlds. Based on the Galaxy framework, the workbench guarantees simple access, easy extension, flexible adaptation to personal and security needs, and sophisticated analyses that are independent of command-line knowledge. Currently, it includes more than 50 bioinformatics tools that are dedicated to different research areas of RNA biology including RNA structure analysis, RNA alignment, RNA annotation, RNA-protein interaction, ribosome profiling, RNA-seq analysis, and RNA target prediction. The workbench is developed and maintained by experts in RNA bioinformatics and the Galaxy framework. Together with the growing community evolving around this workbench, we are committed to keeping the workbench up-to-date for future standards and needs, providing researchers with a reliable and robust framework for RNA data analysis. AVAILABILITY: The RNA workbench is available at https://github.com/bgruening/galaxy-rna-workbench.
Subject(s)
High-Throughput Nucleotide Sequencing/methods; RNA/chemistry; Sequence Analysis, RNA/methods; Software; Computational Biology; Internet; Nucleic Acid Conformation; RNA/metabolism; RNA, Untranslated/chemistry; Workflow
ABSTRACT
AIMS: Stem cell-based regenerative therapies for the treatment of ischemic myocardium are currently a subject of intensive investigation. A variety of cell populations have been demonstrated to be safe and to exert some positive effects in human Phase I and II clinical trials; however, conclusive evidence of efficacy is still lacking. While the relevance of animal models for appropriate pre-clinical safety and efficacy testing with regard to application in Phase III studies continues to increase, concerns have been expressed regarding the validity of the mouse model to predict clinical results. Given that hundreds of preclinical studies have assessed the efficacy of numerous kinds of cell preparations, including pluripotent stem cells, for cardiac repair, we undertook a systematic re-evaluation of data from the mouse model, which initially paved the way for the first clinical trials in this field. METHODS AND RESULTS: A systematic literature screen was performed to identify publications reporting results of cardiac stem cell therapies for the treatment of myocardial ischemia in the mouse model. Only peer-reviewed and placebo-controlled studies using magnetic resonance imaging (MRI) for left ventricular ejection fraction (LVEF) assessment were included. Experimental data from 21 studies involving 583 animals demonstrate a significant improvement in LVEF of 8.59% ± 2.36; p = .012 (95% CI, 3.7-13.8) compared with control animals. CONCLUSION: The mouse is a valid model to evaluate the efficacy of cell-based advanced therapies for the treatment of ischemic myocardial damage. Further studies are required to understand the mechanisms underlying stem cell based improvement of cardiac function after ischemia.
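How per-study effects can be combined into one pooled estimate is sketched below in Python using inverse-variance (fixed-effect) weighting. The abstract does not state which pooling model the authors used, so this is only one common option, and the three study effects and standard errors are invented for illustration.

```python
import math


def pooled_effect(effects, std_errors, z=1.96):
    """Inverse-variance fixed-effect pooling of per-study mean differences.

    Returns the pooled effect, its standard error, and an approximate
    95% confidence interval (normal approximation with z = 1.96).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    interval = (pooled - z * pooled_se, pooled + z * pooled_se)
    return pooled, pooled_se, interval


# Hypothetical LVEF improvements (percentage points) and standard errors.
study_effects = [6.0, 10.0, 9.0]
study_std_errors = [2.0, 2.0, 1.0]

estimate, estimate_se, ci = pooled_effect(study_effects, study_std_errors)
```

Studies with smaller standard errors receive larger weights, so the third hypothetical study dominates the pooled estimate here. A random-effects model would additionally account for between-study heterogeneity.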
Subject(s)
Myocardial Infarction/therapy; Stem Cell Transplantation; Animals; Cell- and Tissue-Based Therapy; Databases, Factual; Disease Models, Animal; Heart/physiopathology; Humans; Mice; Myocardial Infarction/metabolism; Myocardial Infarction/physiopathology; Regeneration; Ventricular Function, Left/physiology
ABSTRACT
BACKGROUND: Technical advances in Next Generation Sequencing (NGS) provide a means to acquire deeper insights into cellular functions. The lack of standardized and automated methodologies poses a challenge for the analysis and interpretation of RNA sequencing data. We critically compare and evaluate state-of-the-art bioinformatics approaches and present a workflow that integrates the best performing data analysis, data evaluation and annotation methods in a Transparent, Reproducible and Automated PipeLINE (TRAPLINE) for RNA sequencing data processing (suitable for Illumina, SOLiD and Solexa). RESULTS: Comparative transcriptomics analyses with TRAPLINE result in a set of differentially expressed genes, their corresponding protein-protein interactions, splice variants, promoter activity, predicted miRNA-target interactions and files for single nucleotide polymorphism (SNP) calling. The obtained results are combined into a single file for downstream analysis such as network construction. We demonstrate the value of the proposed pipeline by characterizing the transcriptome of our recently described stem cell derived antibiotic selected cardiac bodies ('aCaBs'). CONCLUSION: TRAPLINE supports NGS-based research by providing a workflow that requires no bioinformatics skills, decreases the processing time of the analysis and works in the cloud. The pipeline is implemented in the biomedical research platform Galaxy and is freely accessible via www.sbi.uni-rostock.de/RNAseqTRAPLINE or the specific Galaxy manual page (https://usegalaxy.org/u/mwolfien/p/trapline---manual).
Subject(s)
Computational Biology/standards; High-Throughput Nucleotide Sequencing/standards; Sequence Analysis, RNA/standards; Computational Biology/methods; High-Throughput Nucleotide Sequencing/methods; MicroRNAs/genetics; MicroRNAs/metabolism; Molecular Sequence Annotation; Polymorphism, Single Nucleotide; Protein Interaction Domains and Motifs; Sequence Alignment; Sequence Analysis, RNA/methods; Transcriptome
ABSTRACT
Over the last decade, the exponential growth in patient data volume and velocity has transformed it into a valuable resource for researchers. Yet, accessing comprehensive, unique patient data sets remains a challenge, particularly when individuals have received treatments across various practices and hospitals. Traditional record linkage methods fall short in adequately protecting patient privacy in these scenarios. Privacy Preserving Record Linkage (PPRL) offers a solution, employing techniques such as cryptographic methods to identify common patients occurring in multiple datasets while maintaining the privacy of other patients. This paper proposes an investigation into combined approaches of two common German PPRL tools, namely E-PIX and MainSEL. Each tool, while aiming for 'privacy preservation', employs distinct methods that offer unique advantages and drawbacks. Our research aims to explore these in a combined approach to leverage their respective strengths and mitigate their limitations. We anticipate that this synergistic approach will not only enhance data privacy but also allow for easier synchronisation of research data. This study is particularly pertinent in light of evolving privacy regulations and the increasing complexity of healthcare data management. By advancing PPRL methodologies, we aim to contribute to more robust, privacy-compliant data analysis practices in healthcare research.
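The general idea of linking records without exchanging plain-text identifiers can be sketched in Python with a keyed Bloom-filter encoding and a Dice similarity score. The abstract does not describe the encodings used by E-PIX or MainSEL, so this is only one widely described PPRL technique shown for orientation. The secret key, filter size, number of hash functions, and example names are all assumptions.

```python
import hashlib
import hmac


def bigrams(value):
    """Split a padded, lower-cased string into its set of character bigrams."""
    padded = f"_{value.strip().lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value, secret, size=256, n_hashes=4):
    """Encode a string as the set of bit positions set in a keyed Bloom filter.

    Each bigram is hashed n_hashes times with HMAC-SHA256 under a secret
    shared by the data holders (but not by the linkage unit).
    """
    bits = set()
    for gram in bigrams(value):
        for k in range(n_hashes):
            message = f"{k}:{gram}".encode("utf-8")
            digest = hmac.new(secret, message, hashlib.sha256).digest()
            bits.add(int.from_bytes(digest[:8], "big") % size)
    return bits


def dice(a, b):
    """Dice coefficient of two bit-position sets (1.0 means identical)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical shared secret and surnames.
shared_secret = b"example-secret-agreed-between-sites"
site_a = bloom_encode("Meier", shared_secret)
site_b_similar = bloom_encode("Meyer", shared_secret)
site_b_different = bloom_encode("Schmidt", shared_secret)

similar_score = dice(site_a, site_b_similar)
different_score = dice(site_a, site_b_different)
```

Because similar spellings share most bigrams, their encodings overlap strongly, which lets a linkage unit tolerate typing variants while only seeing bit patterns. Production systems typically add further hardening against frequency attacks, which is omitted here.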
Subject(s)
Computer Security; Confidentiality; Electronic Health Records; Medical Record Linkage; Germany; Medical Record Linkage/methods; Humans
ABSTRACT
Background: Enhancer RNAs (eRNAs) are involved in gene expression regulation. Although functional roles of eRNAs in the pathophysiology of neoplasms have been reported, their involvement in gastric cancer (GC) is less known. Materials & methods: A network-based integrative approach was utilized for analyzing transcriptome and epigenome alterations in GC, and an eRNA was selected for experimental validation. Survival analysis and clinicopathological associations were also performed. Results: A hub eRNA, ENSR00000272060, showed significantly increased expression in tumor versus nontumor tissues, as well as an association with clinicopathological features. A seven-gene prognostic model was also constructed. Conclusion: The constructed network provides a comprehensive understanding of the underlying processes implicated in the progression of GC, along with a starting point from which to derive potential diagnostic/prognostic biomarkers.
What is this summary about? We provide an overview of a study on genetic materials related to stomach cancer. This study could help identify factors that change the progress of this disease. We used genetic information from a specific disease database. One of the genetic materials that was assessed is eRNA. It was examined in some samples of gastric cancer. We analyzed gastric tissues to confirm our findings. The goal of this study was to find out whether we could identify a disease-related eRNA. What were the results? We found an eRNA that showed genetic differences between examined samples. It was also related to the stage of the disease. What do the results mean? The results show that there is a difference in the amount of examined eRNA between samples. It suggests that we may be able to use it to detect the disease earlier.
Subject(s)
Stomach Neoplasms; Humans; Stomach Neoplasms/genetics; Stomach Neoplasms/pathology; Transcriptome; Epigenome; Gene Expression Regulation, Neoplastic; Biomarkers, Tumor/genetics
ABSTRACT
The emergence of collaborations that standardize and combine multiple clinical databases across different regions provides a rich source of data, which is fundamental for clinical prediction models, such as patient-level predictions. With the aid of such large data pools, researchers are able to develop clinical prediction models for improved disease classification, risk assessment, and beyond. To fully utilize this potential, Machine Learning (ML) methods are commonly required to process these large amounts of data on disease-specific patient cohorts. As a consequence, the Observational Health Data Sciences and Informatics (OHDSI) collaborative develops a framework to facilitate the application of ML models to these standardized patient datasets by using the Observational Medical Outcomes Partnership (OMOP) common data model (CDM). In this study, we compare the feasibility of current web-based OHDSI approaches, namely ATLAS and "Patient-level Prediction" (PLP), against a native solution (R based) to conduct such ML-based patient-level prediction analyses in OMOP. This will enable potential users to select the most suitable approach for their investigation. Each of the applied ML solutions was individually utilized to solve the same patient-level prediction task. Both approaches went through an exemplary benchmarking analysis to assess the weaknesses and strengths of the PLP R-Package. The performance of this package was then compared with that of the commonly used native R-package Machine Learning in R 3 (mlr3) and its sub-packages. The approaches were evaluated on performance, execution time, and ease of model implementation. The results show that the PLP package has shorter execution times, which indicates great scalability, as well as intuitive code implementation and numerous possibilities for visualization. However, limitations in comparison to native packages were observed in the implementation of specific ML classifiers (e.g., Lasso), which may result in decreased performance for real-world prediction problems. The findings here contribute to the overall effort of developing ML-based prediction models on a clinical scale and provide a snapshot for future studies that explicitly aim to develop patient-level prediction models in OMOP CDM.
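As a rough illustration of what evaluating two approaches on performance and execution time can look like, the following minimal Python sketch times two stand-in prediction functions on the same task and scores them with AUC. It is not the study's R-based PLP or mlr3 code: the toy cohort, the names `auc` and `benchmark`, and the two placeholder "approaches" are all assumptions made for this example.

```python
import random
import time


def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    # Count pairs where the positive case outranks the negative one; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def benchmark(name, predict, rows, labels):
    """Time one prediction pass over the cohort and score it with AUC."""
    start = time.perf_counter()
    scores = [predict(row) for row in rows]
    elapsed = time.perf_counter() - start
    return {"approach": name, "auc": auc(labels, scores), "seconds": elapsed}


# Hypothetical toy cohort: one numeric risk feature per patient, noisy outcome.
rng = random.Random(0)
rows = [rng.random() for _ in range(200)]
labels = [1 if x + rng.gauss(0, 0.2) > 0.5 else 0 for x in rows]

# Two placeholder "approaches" solving the same task (assumed, not real pipelines).
results = [
    benchmark("approach_a", lambda x: x, rows, labels),
    benchmark("approach_b", lambda x: round(x, 1), rows, labels),
]
for r in results:
    print(f"{r['approach']}: AUC={r['auc']:.3f}, time={r['seconds']:.6f}s")
```

Running both approaches on identical rows and labels keeps the comparison like-for-like; ease of implementation, the third criterion named above, is a qualitative judgment that a harness of this kind does not capture.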
Subject(s)
Machine Learning , Medical Informatics , Humans , Databases, Factual , Electronic Health Records
ABSTRACT
The use of artificial intelligence (AI) in healthcare is transforming a number of medical fields, including nephrology. The integration of various AI techniques in nephrology facilitates the early detection, diagnosis, prognosis, and treatment of kidney disease. Nevertheless, recent reports have demonstrated that the majority of published clinical AI studies lack uniform AI reporting standards, which poses significant challenges in interpreting, replicating, and translating the studies into routine clinical use. In response to these issues, worldwide initiatives have created guidelines for publishing AI-related studies that outline the minimal necessary information that researchers should include. By following standardized reporting frameworks, researchers and clinicians can ensure the reproducibility, reliability, and ethical use of AI models. This will ultimately lead to improved research outcomes, enhanced clinical decision-making, and better patient management. This review article highlights the importance of adhering to AI reporting guidelines in medical research and clinical practice, with a focus on nephrology and urology, for advancing the field and optimizing patient care.
ABSTRACT
This study advances the utility of synthetic study data in hematology, particularly for Acute Myeloid Leukemia (AML), by facilitating its integration into healthcare systems and research platforms through standardization into the Observational Medical Outcomes Partnership (OMOP) and Fast Healthcare Interoperability Resources (FHIR) formats. In our previous work, we addressed the need for high-quality patient data and used CTAB-GAN+ and Normalizing Flow (NFlow) to synthesize data from 1606 patients across four multicenter AML clinical trials. We published the generated synthetic cohorts, which accurately replicate the distributions of key demographic, laboratory, molecular, and cytogenetic variables, alongside patient outcomes, demonstrating high fidelity and usability. The conversion to the OMOP format opens avenues for comparative observational multi-center research by enabling seamless combination with related OMOP datasets, thereby broadening the scope of AML research. Similarly, standardization into FHIR facilitates the further development of applications, e.g., via the SMART-on-FHIR platform, by offering realistic test data. This effort aims to foster a more collaborative research environment and facilitate the development of innovative tools and applications in AML care and research.
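To make the FHIR standardization step more concrete, the sketch below shows one way a single synthetic tabular record could be turned into FHIR resources. It is an illustrative assumption, not the authors' mapping: the input field names (`patient_id`, `sex`, `hemoglobin_g_dl`) and the helper functions are hypothetical, and only a Patient and one laboratory Observation (hemoglobin, LOINC 718-7) are covered.

```python
import json


def to_fhir_patient(record):
    """Build a minimal FHIR Patient resource from one synthetic record."""
    return {
        "resourceType": "Patient",
        "id": record["patient_id"],
        "gender": record["sex"],  # FHIR allows: male | female | other | unknown
    }


def to_fhir_hemoglobin(record):
    """Build a FHIR Observation for the hemoglobin value of the same record."""
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            "coding": [{
                "system": "http://loinc.org",
                "code": "718-7",
                "display": "Hemoglobin [Mass/volume] in Blood",
            }]
        },
        # Link the observation back to the patient it belongs to.
        "subject": {"reference": f"Patient/{record['patient_id']}"},
        "valueQuantity": {
            "value": record["hemoglobin_g_dl"],
            "unit": "g/dL",
            "system": "http://unitsofmeasure.org",
            "code": "g/dL",
        },
    }


def to_bundle(record):
    """Wrap both resources in a FHIR Bundle of type 'collection'."""
    resources = [to_fhir_patient(record), to_fhir_hemoglobin(record)]
    return {
        "resourceType": "Bundle",
        "type": "collection",
        "entry": [{"resource": r} for r in resources],
    }


# Hypothetical synthetic row; column names are assumptions for this example.
synthetic_row = {"patient_id": "syn-0001", "sex": "female", "hemoglobin_g_dl": 9.4}
bundle = to_bundle(synthetic_row)
print(json.dumps(bundle, indent=2))
```

A bundle of this kind could serve as test data for an application that consumes FHIR resources; a complete conversion would also need to cover the molecular, cytogenetic, and outcome variables mentioned above.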
Subject(s)
Leukemia, Myeloid, Acute , Humans , Hematology , Health Information Interoperability , Electronic Health Records , Outcome Assessment, Health Care
ABSTRACT
The integration of artificial intelligence (AI) algorithms into clinical practice holds immense potential to improve patient care, but widespread adoption still faces significant challenges, including interoperability issues. We propose a concept for the agile development of an IT platform to integrate AI-based applications into clinical workflows for a use case in ophthalmology.
Subject(s)
Artificial Intelligence , Systems Integration , Ophthalmology , Decision Support Systems, Clinical/organization & administration , Humans , Electronic Health Records , Algorithms , Workflow
ABSTRACT
INTRODUCTION: Seamless interoperability of ophthalmic clinical data is beneficial for improving patient care and advancing research through the integration of data from various sources. Such consolidation increases the amount of data available, leading to more robust statistical analyses, and improving the accuracy and reliability of artificial intelligence models. However, the lack of consistent, harmonized data formats and meanings (syntactic and semantic interoperability) poses a significant challenge in sharing ophthalmic data. METHODS: The Health Level 7 (HL7) Fast Healthcare Interoperability Resources (FHIR), a standard for the exchange of healthcare data, emerges as a promising solution. To facilitate cross-site data exchange in research, the German Medical Informatics Initiative (MII) has developed a core data set (CDS) based on FHIR. RESULTS: This work investigates the suitability of the MII CDS specifications for exchanging ophthalmic clinical data necessary to train and validate a specific machine learning model designed for predicting visual acuity. In interdisciplinary collaborations, we identified and categorized the required ophthalmic clinical data and explored the possibility of its mapping to FHIR using the MII CDS specifications. DISCUSSION: We found that the current FHIR MII CDS specifications do not completely accommodate the ophthalmic clinical data we investigated, indicating that the creation of an extension module is essential.