1.
Healthcare (Basel) ; 12(9)2024 May 02.
Article En | MEDLINE | ID: mdl-38727496

Understanding the intricate relationships between diseases is critical for both prevention and recovery. However, there is a lack of suitable methodologies for exploring the precedence relationships within multiple censored time-to-event data, resulting in decreased analytical accuracy. This study introduces Censored Event Precedence Analysis (CEPA), a nonparametric Bayesian approach suitable for understanding the precedence relationships in censored multivariate events. CEPA analyzes the precedence relationships between events in order to predict subsequent occurrences effectively. We applied CEPA to neonatal data from the National Health Insurance Service, identifying the precedence relationships among the seven most commonly diagnosed diseases categorized by the International Classification of Diseases. This analysis revealed a typical diagnostic sequence, starting with respiratory diseases, followed by skin, infectious, digestive, ear, eye, and injury-related diseases. Furthermore, simulation studies were conducted to demonstrate CEPA's suitability for censored multivariate datasets compared to traditional models. The performance accuracy reached 76% for the uniform distribution and 65% for the exponential distribution, showing superior performance in all four tested environments. Therefore, the statistical approach based on CEPA enhances our understanding of disease interrelationships beyond competing methodologies. By identifying disease precedence with CEPA, we can preempt subsequent disease occurrences and propose a healthcare system based on these relationships.

2.
Sci Data ; 11(1): 371, 2024 Apr 11.
Article En | MEDLINE | ID: mdl-38605036

The simplified molecular-input line-entry system (SMILES) has been utilized in a variety of artificial intelligence analyses owing to its capability of representing chemical structures using line notation. However, its ease of representation is limited, which has led to the proposal of BigSMILES as an alternative method suitable for the representation of macromolecules. Nevertheless, research on BigSMILES remains limited due to its preprocessing requirements. Thus, this study proposes a conversion workflow for BigSMILES, focusing on its automated generation from SMILES representations of homopolymers. BigSMILES representations for 4,927,181 records are provided, thereby enabling their immediate use in various research and development applications. Our study presents a detailed description of a validation process to ensure the accuracy, interchangeability, and robustness of the conversion. Additionally, a systematic overview of the codes and functions used, emphasizing their relevance in the context of BigSMILES generation, is provided. This advancement is anticipated to significantly aid researchers and facilitate further studies in BigSMILES representation, including potential applications in deep learning and further extension to complex structures such as copolymers.

3.
PLoS One ; 18(11): e0294513, 2023.
Article En | MEDLINE | ID: mdl-37972018

Traditionally, datasets with multiple censored time-to-events have not been utilized in multivariate analysis because of their high level of complexity. In this paper, we propose the Censored Time Interval Analysis (CTIVA) method to address this issue. It estimates the joint probability distribution of actual event times in the censored dataset by applying a statistical probability density estimation technique to the dataset. Based on the estimated event times, CTIVA investigates variables correlated with the interval time of events via statistical tests. The proposed method handles both categorical and continuous variables simultaneously; thus, it is suitable for application to real-world censored time-to-event datasets, which include both categorical and continuous variables. CTIVA outperforms traditional censored time-to-event data handling methods by 5% on simulation data. The average area under the curve (AUC) of the proposed method on the simulation dataset exceeds 0.9 under various conditions. Further, CTIVA yields novel results on the National Sample Cohort Demo (NSCD) and proteasome inhibitor bortezomib datasets, real-world censored time-to-event datasets provided by the National Health Insurance Sharing Service (NHISS) and the National Center for Biotechnology Information (NCBI), the former covering the medical history of beneficiaries. We believe that the development of CTIVA is a milestone in the investigation of variables correlated with the interval time of events in the presence of censoring.


Survival Analysis , Humans , Computer Simulation , Probability , Multivariate Analysis , Time Factors
4.
Sci Rep ; 13(1): 17201, 2023 Oct 11.
Article En | MEDLINE | ID: mdl-37821628

Immunoglobulin A nephropathy (IgAN) is the most common primary glomerulonephritis worldwide. The clinical relevance of 11 urinary exosomal microRNAs (miRNAs) was evaluated in patients with IgAN. From January 2009 to November 2018, IgAN (n = 93), disease control (n = 11), and normal control (n = 19) groups were enrolled. We evaluated the expression levels of urinary exosomal miRNAs at the baseline and their relationship with clinical and pathologic features. This study aimed to discriminate statistically powerful urinary exosomal miRNAs for the prognosis of IgAN. Urinary miRNA levels of miR-16-5p, miR-29a-3p, miR-124-3p, miR-126-3p, miR-199a-3p, miR-199b-5p, and miR-335-3p showed significant correlation with both estimated glomerular filtration rate (eGFR) and urine protein-to-creatinine ratio (uPCR). In univariate regression analysis, age, body mass index, hypertension, eGFR, uPCR, Oxford classification E, and three miRNAs (miR-16-5p, miR-199a-3p, and miR-335-3p) were associated with disease progression in patients with IgAN. The area under the curve (AUC) of miR-199a-3p was high enough (0.749) without any other clinical or pathologic factors, considering that the AUC of the International IgAN Risk Prediction Tool was 0.853. Urinary exosomal miRNAs may serve as alternative prognostic biomarkers of IgAN with further research.


Glomerulonephritis, IGA , MicroRNAs , Humans , Glomerulonephritis, IGA/pathology , Clinical Relevance , MicroRNAs/metabolism , Prognosis , Disease Progression , Biomarkers/urine
6.
Front Immunol ; 14: 1190576, 2023.
Article En | MEDLINE | ID: mdl-37228607

Introduction: Acute rejection (AR) continues to be a significant obstacle for short- and long-term graft survival in kidney transplant recipients. Herein, we aimed to examine urinary exosomal microRNAs with the objective of identifying novel biomarkers of AR. Materials and methods: Candidate microRNAs were selected using NanoString-based urinary exosomal microRNA profiling, meta-analysis of web-based, public microRNA database, and literature review. The expression levels of these selected microRNAs were measured in the urinary exosomes of 108 recipients of the discovery cohort using quantitative real-time polymerase chain reaction (qPCR). Based on the differential microRNA expressions, AR signatures were generated, and their diagnostic powers were determined by assessing the urinary exosomes of 260 recipients in an independent validation cohort. Results: We identified 29 urinary exosomal microRNAs as candidate biomarkers of AR, of which 7 microRNAs were differentially expressed in recipients with AR, as confirmed by qPCR analysis. A three-microRNA AR signature, composed of hsa-miR-21-5p, hsa-miR-31-5p, and hsa-miR-4532, could discriminate recipients with AR from those maintaining stable graft function (area under the curve [AUC] = 0.85). This signature exhibited a fair discriminative power in the identification of AR in the validation cohort (AUC = 0.77). Conclusion: We have successfully demonstrated that urinary exosomal microRNA signatures may form potential biomarkers for the diagnosis of AR in kidney transplantation recipients.


Kidney Transplantation , MicroRNAs , Humans , Kidney Transplantation/adverse effects , MicroRNAs/genetics , Biomarkers , Real-Time Polymerase Chain Reaction
7.
Sci Rep ; 13(1): 435, 2023 Mar 06.
Article En | MEDLINE | ID: mdl-36878960

The significance of simulation in device design has been increasing because of the cost of real tests. The accuracy of a simulation increases with its resolution. However, high-resolution simulation is not suited to actual device design because the amount of computation increases exponentially as the resolution increases. In this study, we introduce a model that predicts high-resolution outcomes from low-resolution calculated values and thereby achieves high simulation accuracy at low computational cost. The fast residual learning super-resolution (FRSR) convolutional network model that we introduce can simulate the electromagnetic fields of optical devices. Our model achieved high accuracy when using the super-resolution technique on a 2D slit array under specific circumstances and ran approximately 18 times faster than the simulator. The proposed model restores high-resolution images using residual learning and a post-upsampling method, which reduces computation and training time, and it shows the best accuracy (R²: 0.9941). It also has the shortest training time (7000 s) among the models that use super-resolution. This model addresses the time limitations of high-resolution simulations of device module characteristics.

8.
Appl Clin Inform ; 13(4): 880-890, 2022 Aug.
Article En | MEDLINE | ID: mdl-36130711

BACKGROUND: A computerized 12-lead electrocardiogram (ECG) can automatically generate diagnostic statements, which are helpful for clinical purposes. Standardization is required for big data analysis when using ECG data generated by different interpretation algorithms. The common data model (CDM) is a standard schema designed to overcome heterogeneity between medical data. Diagnostic statements usually contain multiple CDM concepts and also include non-essential noise information, which should be removed during CDM conversion. Existing CDM conversion tools have several limitations, such as the requirement for manual validation, inability to extract multiple CDM concepts, and inadequate noise removal. OBJECTIVES: We aimed to develop a fully automated text data conversion algorithm that overcomes the limitations of existing tools and manual conversion. METHODS: We used interpretations printed by 12-lead resting ECG tests from three different vendors: GE Medical Systems, Philips Medical Systems, and Nihon Kohden. For automatic mapping, we first constructed an ontology-lexicon of ECG interpretations. After clinical coding, an optimized tool for converting ECG interpretations to CDM terminology was developed using term-based text processing. RESULTS: Using the ontology-lexicon, the cosine similarity-based algorithm and the rule-based hierarchical algorithm showed comparable conversion accuracy (97.8% and 99.6%, respectively), while an integrated algorithm based on a heuristic approach, ECG2CDM, demonstrated superior performance (99.9%) for datasets from three major vendors. CONCLUSION: We developed user-friendly software that runs the ECG2CDM algorithm and is easy to use even for users who are not familiar with CDM or medical terminology. We propose that automated algorithms can be helpful for further big data analysis with an integrated and standardized ECG dataset.


Electrocardiography , Vocabulary , Algorithms , Databases, Factual , Software
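
The abstract above names a cosine similarity-based algorithm for mapping ECG statements to lexicon entries but gives no implementation details. The following is only a minimal sketch of term-based cosine matching under stated assumptions: the lexicon entries, the placeholder concept labels, the `threshold` value, and all function names are hypothetical, and the sketch maps each statement to at most one entry, whereas the abstract notes that real statements may contain several concepts.

```python
import math
import re
from collections import Counter

# Hypothetical mini-lexicon: statement text -> placeholder concept label.
# The labels are illustrative only and are not real CDM concept identifiers.
LEXICON = {
    "sinus rhythm": "CONCEPT_SINUS_RHYTHM",
    "atrial fibrillation": "CONCEPT_ATRIAL_FIBRILLATION",
    "left ventricular hypertrophy": "CONCEPT_LVH",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic terms."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(statement, lexicon=LEXICON, threshold=0.5):
    """Return (label, score) for the most similar entry, or (None, score) if below threshold."""
    query = Counter(tokenize(statement))
    scored = [
        (cosine_similarity(query, Counter(tokenize(term))), label)
        for term, label in lexicon.items()
    ]
    score, label = max(scored)
    return (label, score) if score >= threshold else (None, score)
```

A statement that shares no terms with any entry scores 0 and is left unmapped, which is one simple way to treat the noise text mentioned in the abstract; the threshold would need tuning on labeled data.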
9.
Sci Rep ; 12(1): 1140, 2022 Jan 21.
Article En | MEDLINE | ID: mdl-35064166

The simulation and design of electronic devices such as transistors is vital for the semiconductor industry. Conventionally, a device is intuitively designed and simulated using model equations, which is a time-consuming and expensive process. However, recent machine learning approaches provide an unprecedented opportunity to improve these tasks by learning the underlying relationships between the device design and the specifications from extensively accumulated simulation data. This study implements various machine learning approaches for the simulation acceleration and inverse-design problems of fin field-effect transistors. In comparison to traditional simulators, the proposed neural network model demonstrated almost equivalent results (R² = 0.99) and was more than 122,000 times faster in simulation. Moreover, the proposed inverse-design model successfully generated design parameters that satisfied the desired target specifications with high accuracy (R² = 0.96). Overall, the results demonstrated that the proposed machine learning models aided in achieving efficient solutions for the simulation and design problems pertaining to electronic devices. Thus, the proposed approach can be further extended to more complex devices and other vital processes in the semiconductor industry.

10.
Front Endocrinol (Lausanne) ; 12: 774436, 2021.
Article En | MEDLINE | ID: mdl-34858345

The clinical manifestations of diabetic kidney disease (DKD) are more heterogeneous than those previously reported, and these observations mandate the recruitment of patients with biopsy-proven DKD in biomarker research. In this study, using the public Gene Expression Omnibus (GEO) repository, we aimed to identify urinary mRNA biomarkers that can predict histological severity and disease progression in patients with DKD in whom the diagnosis and histologic grade have been confirmed by kidney biopsy. We identified 30 DKD-specific mRNA candidates based on the analysis of the GEO datasets. Among these, there were significant alterations in the urinary levels of 17 mRNAs in patients with DKD, compared with healthy controls. Four urinary mRNAs (LYZ, C3, FKBP5, and G6PC) reflected tubulointerstitial inflammation and fibrosis in kidney biopsy and could predict rapid progression to end-stage kidney disease independently of the baseline eGFR (tertile 1 vs. tertile 3; adjusted hazard ratio of 9.68 and 95% confidence interval of 2.85-32.87, p < 0.001). In conclusion, we demonstrated that urinary mRNA signatures have the potential to indicate the pathologic status and predict adverse renal outcomes in patients with DKD.


Diabetic Nephropathies/diagnosis , Kidney Function Tests/methods , RNA, Messenger/urine , Adult , Aged , Biomarkers/urine , Biopsy , Case-Control Studies , Diabetes Mellitus, Type 2/complications , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/pathology , Diabetes Mellitus, Type 2/urine , Diabetic Nephropathies/genetics , Diabetic Nephropathies/pathology , Diabetic Nephropathies/urine , Disease Progression , Female , Glomerular Filtration Rate , Humans , Kidney/metabolism , Kidney/pathology , Kidney Failure, Chronic/diagnosis , Kidney Failure, Chronic/genetics , Kidney Failure, Chronic/pathology , Kidney Failure, Chronic/urine , Male , Middle Aged , Prognosis , Republic of Korea , Transcriptome
11.
Sensors (Basel) ; 21(18)2021 Sep 15.
Article En | MEDLINE | ID: mdl-34577397

Conventional predictive Artificial Neural Networks (ANNs) commonly employ deterministic weight matrices; therefore, their prediction is a point estimate. This deterministic nature limits the use of ANNs for medical diagnosis, legal problems, and portfolio management, in which not only the prediction but also its uncertainty is required. To address this problem, we propose a predictive probabilistic neural network model, which uses the generator of the conditional Generative Adversarial Network (cGAN), routinely used for conditional sample generation, in a different manner. By reversing the input and output of an ordinary cGAN, the model can be successfully used as a predictive model; moreover, the model is robust against noise since adversarial training is employed. In addition, to measure the uncertainty of predictions, we introduce the entropy and relative entropy for regression problems and classification problems, respectively. The proposed framework is applied to stock market data and an image classification task. As a result, the proposed framework shows superior estimation performance, especially on noisy data; moreover, it is demonstrated that the proposed framework can properly estimate the uncertainty of predictions.


Neural Networks, Computer , Uncertainty
12.
Front Immunol ; 12: 656632, 2021.
Article En | MEDLINE | ID: mdl-34177898

Urine has been regarded as a good resource based on the assumption that urine can directly reflect the state of the allograft or ongoing injury in kidney transplantation. Previous studies suggesting the usefulness of urinary mRNA as a biomarker of acute rejection (AR) imply that urinary mRNA mirrors the transcriptional activity of the kidneys. We selected 14 data-driven candidate genes through a meta-analysis and measured the candidate genes using quantitative PCR (qPCR) without pre-amplification in cross-sectional specimens from Korean kidney transplant patients. Expression of 9/14 genes (CXCL9, CD3ε, IP-10, LCK, C1QB, PSMB9, Tim-3, Foxp3, and FAM26F) was significantly different between acute rejection and stable graft function with normal pathology and long-term graft survival in 103 training samples. CXCL9 was also distinctly expressed in allografts with acute rejection in in situ hybridization analysis. This result, consistent with the qPCR result, implies that urinary mRNA could reflect the magnitude of allograft injury. We developed an AR prediction model with the urinary mRNAs by binary logistic regression, and the AUC of the model was 0.89 in the training set. The model was validated in 391 independent samples, and the AUC was 0.84 when the model was applied in a fixed manner. In addition, the decision curve analysis indicated a range of reasonable threshold probabilities for biopsy. Therefore, we suggest that the urine mRNA signature could be used as a non-invasive monitoring tool for acute rejection in clinical application and could help determine whether to perform a biopsy in a recipient with increased creatinine.


Allografts/immunology , Biomarkers , Graft Rejection/diagnosis , Graft Rejection/etiology , Kidney Transplantation/adverse effects , Liquid Biopsy/methods , RNA, Messenger/genetics , Acute Disease , Adult , Biomarkers/urine , Cell-Free Nucleic Acids , Female , Humans , Immunohistochemistry , Kidney Transplantation/methods , Male , Middle Aged , ROC Curve , Reproducibility of Results
14.
Sci Rep ; 10(1): 20265, 2020 Nov 20.
Article En | MEDLINE | ID: mdl-33219276

Pathology reports contain essential data for both clinical and research purposes. However, the extraction of meaningful, qualitative data from the original document is difficult due to the narrative and complex nature of such reports. Keyword extraction for pathology reports is necessary to summarize the informative text and reduce the time required. In this study, we employed a deep learning model for natural language processing to extract keywords from pathology reports and presented a supervised keyword extraction algorithm. We considered three types of pathological keywords, namely specimen, procedure, and pathology types. We compared the performance of the present algorithm with conventional keyword extraction methods on 3115 pathology reports that were manually labeled by professional pathologists. Additionally, we applied the present algorithm to 36,014 unlabeled pathology reports and analysed the extracted keywords with biomedical vocabulary sets. The results demonstrated the suitability of our model for practical application in extracting important data from pathology reports.


Algorithms , Deep Learning , Electronic Health Records , Natural Language Processing , Humans
15.
PLoS One ; 15(10): e0239760, 2020.
Article En | MEDLINE | ID: mdl-33002010

In general survival analysis, multiple studies have considered a single failure time corresponding to the time to the event of interest or to the occurrence of multiple events under the assumption that each event is independent. However, in real-world settings, one event may impact others. Essentially, the potential structure of the occurrence of multiple events can be observed in several survival datasets. The interrelations between the times to the occurrences of events are immensely challenging to analyze because of the presence of censoring. Censoring commonly arises in longitudinal studies in which some events are not observed for some of the subjects within the duration of the research. Although censoring distorts the data, advanced multivariate survival analysis methods that handle multiple events with censoring make it possible to estimate a bivariate probability density function for a pair of events. Building on this improvement, this paper proposes a method called censored network estimation to discover partially correlated relationships and construct the corresponding network composed of edges representing non-zero partial correlations among multiple censored events. To demonstrate its superior performance compared to conventional methods, the selecting power for the partially correlated events was evaluated in two types of networks with iterative simulation experiments. Additionally, the correlation structure was investigated on an electronic health records dataset of the times to the first diagnosis for newborn babies in South Korea. The results show significantly improved performance compared to edge measurement with competing methods, as well as reliability in terms of the interrelations of real-life diseases.


Multivariate Analysis , Survival Analysis , Data Interpretation, Statistical , Humans , Models, Statistical , Statistics as Topic , Time Factors
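
The abstract above builds a network whose edges represent non-zero partial correlations. As a hedged illustration of that idea only, the sketch below computes a first-order partial correlation for three fully observed variables; it does not handle censoring and is not the censored network estimation method itself. The function names and the simulated chain are assumptions made for the example.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def chain_example(n=5000, seed=0):
    """Simulate a chain a -> b -> c, so a and c are linked only through b."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [ai + rng.gauss(0, 1) for ai in a]
    c = [bi + rng.gauss(0, 1) for bi in b]
    return a, b, c
```

In the simulated chain, `pearson(a, c)` is clearly positive, while `partial_correlation(a, c, b)` is close to zero, so a partial-correlation network would keep the edges a-b and b-c and drop a-c.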
16.
Sci Rep ; 10(1): 10535, 2020 Jun 29.
Article En | MEDLINE | ID: mdl-32601349

When designing new optical devices, many simulations must be conducted to determine the optimal design parameters. Therefore, fast and accurate simulations are essential for designing optical devices. In this work, we introduce a deep learning approach that accelerates a simulator solving frequency-domain Maxwell equations. Our model predicts transmittance per wavelength in 2D slit arrays with high accuracy under certain conditions and produces results 160,000 times faster than the simulator. We generated a dataset using an open-source simulator and compared the model's performance with that of other machine learning models. Additionally, we propose a new loss function and performance evaluation method for creating better-performing models with multiple regression outputs from one input source. We observed that a loss function that adds, to the root-mean-squared error of the transmittance value, a binary cross-entropy loss predicting whether the differential of the transmittance between adjacent wavelengths is positive or negative is more effective for predicting variations in multiple regression outputs. The simulation results show that a four-layer convolutional neural network model demonstrates the best accuracy (R² score: 0.86). The overall approach presented here is expected to be useful for simulating and designing optical devices.
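
The loss described in the abstract above combines a root-mean-squared error on transmittance with a binary cross-entropy term on the sign of the change between adjacent wavelengths. The abstract does not say how the sign probability is obtained or how the two terms are weighted, so the sketch below is only one possible reading: it assumes the probability is a sigmoid of the predicted difference, and `sharpness`, `weight`, and the function names are hypothetical.

```python
import math


def rmse(pred, target):
    """Root-mean-squared error over all wavelengths."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))


def sign_bce(pred, target, sharpness=10.0, eps=1e-7):
    """Binary cross-entropy on whether transmittance rises between adjacent wavelengths.

    Assumption: the probability of a rise is a sigmoid of the predicted
    difference. Inputs are expected to be transmittances in [0, 1].
    """
    total = 0.0
    for i in range(len(pred) - 1):
        label = 1.0 if target[i + 1] - target[i] > 0 else 0.0
        prob = 1.0 / (1.0 + math.exp(-sharpness * (pred[i + 1] - pred[i])))
        prob = min(max(prob, eps), 1.0 - eps)  # keep log() finite
        total -= label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob)
    return total / (len(pred) - 1)


def combined_loss(pred, target, weight=1.0):
    """RMSE plus a weighted sign term; the weighting is an assumption."""
    return rmse(pred, target) + weight * sign_bce(pred, target)
```

For example, with `target = [0.5, 0.55, 0.5]`, the predictions `[0.6, 0.65, 0.6]` and `[0.6, 0.45, 0.6]` have the same RMSE, but the second reverses both slopes and therefore receives a larger combined loss. This is the property the abstract attributes to the added term: it penalizes predictions that get the direction of variation wrong.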

17.
PLoS One ; 15(6): e0234323, 2020.
Article En | MEDLINE | ID: mdl-32530943

We investigated the phenotype and molecular signatures of CD8+ T cell subsets in kidney-transplant recipients (KTRs) with biopsy-proven T cell-mediated rejection (TCMR). We included 121 KTRs and divided them into three groups according to the pathologic or clinical diagnosis: normal biopsy control (NC) (n = 32), TCMR (n = 50), and long-term graft survival (LTGS) (n = 39). We used flow cytometry and microarray to analyze the phenotype and molecular signatures of CD8+ T cell subsets using peripheral blood from those patients and analyzed significant gene expressions according to CD8+ T cell subsets. We investigated whether the analysis of CD8+ T cell subsets is useful for predicting the development of TCMR. CCR7+CD8+ T cells significantly decreased, but CD28nullCD57+CD8+ T cells and CCR7-CD45RA+CD8+ T cells showed an increase in the TCMR group compared to other groups (p < 0.05 for each); hence CCR7+CD8+ T cells showed significant negative correlations with both effector CD8+ T cell subsets. We identified genes significantly associated with the change of CCR7+CD8+ T, CCR7-CD45RA+CD8+ T, and CD28nullCD57+CD8+ T cells in an ex vivo study and found that most of them were included in the significant genes on in vitro CCR7+CD8+ T cells. Finally, the decrease of CCR7+CD8+ T cells relative to CD28nullCD57+ T or CCR7-CD45RA+CD8+ T cells can predict TCMR significantly in the whole clinical cohort. In conclusion, the phenotype and molecular signature of CD8+ T cell subsets showed a significant relationship to the development of TCMR; hence monitoring of CD8+ T cell subsets may be useful for predicting TCMR in KTRs.


CD8-Positive T-Lymphocytes/immunology , Graft Rejection/immunology , Kidney Transplantation/adverse effects , T-Lymphocyte Subsets/immunology , Adult , CD28 Antigens/genetics , CD28 Antigens/immunology , CD57 Antigens/genetics , CD57 Antigens/immunology , CD8-Positive T-Lymphocytes/classification , Cross-Sectional Studies , Female , Gene Expression Profiling , Graft Rejection/etiology , Graft Rejection/genetics , Healthy Volunteers , Humans , Immunophenotyping , In Vitro Techniques , Male , Middle Aged , Oligonucleotide Array Sequence Analysis , Phenotype , Receptors, CCR7/genetics , Receptors, CCR7/immunology , T-Lymphocyte Subsets/classification
18.
Sci Rep ; 10(1): 6339, 2020 Apr 14.
Article En | MEDLINE | ID: mdl-32286339

Exposure to environment-polluting chemicals (EPCs) is associated with the development of diabetes. Many EPCs exert toxic effects via the aryl hydrocarbon receptor (AhR) and/or mitochondrial inhibition. Here we investigated whether the levels of human exposure to a mixture of EPCs and/or mitochondrial inhibitors could predict the development of diabetes in a prospective study, the Korean Genome and Epidemiological Study (KoGES). We analysed AhR ligands (AhRL) and mitochondria-inhibiting substances (MIS) in serum samples (n = 1,537) collected during the 2008 Ansung KoGES survey with a 4-year follow-up. Serum AhRL, determined by the AhR-dependent luciferase reporter assay, represents the contamination level of the AhR ligand mixture in serum. Serum levels of MIS, analysed indirectly as MIS-ATP or MIS-ROS, are the serum MIS-induced mitochondria-inhibiting effects on ATP content or reactive oxygen species (ROS) production in cultured cells. Among 919 normal subjects at baseline, 7.1% developed impaired glucose tolerance (IGT) and 1.6% developed diabetes after 4 years. At baseline, diabetic and IGT sera displayed higher AhRL and MIS than normal sera, which correlated with indices of insulin resistance. When the subjects were classified according to ROC cut-off values, fully adjusted relative risks of diabetes development within 4 years were 7.60 (95% CI, 4.23-13.64), 4.27 (95% CI, 2.38-7.64), and 21.11 (95% CI, 8.46-52.67) for AhRL ≥ 2.70 pM, MIS-ATP ≤ 88.1%, and both, respectively. Gender analysis revealed that male subjects with AhRL ≥ 2.70 pM or MIS-ATP ≤ 88.1% showed higher risk than female subjects. High serum levels of AhRL and/or MIS strongly predict the future development of diabetes, suggesting that the accumulation of AhR ligands and/or mitochondrial inhibitors in the body may play an important role in the pathogenesis of diabetes.


Air Pollutants/toxicity , Basic Helix-Loop-Helix Transcription Factors/genetics , Biomarkers/blood , Diabetes Mellitus/blood , Mitochondria/drug effects , Receptors, Aryl Hydrocarbon/genetics , Aged , Basic Helix-Loop-Helix Transcription Factors/blood , Diabetes Mellitus/chemically induced , Diabetes Mellitus/pathology , Environmental Biomarkers/genetics , Female , Glucose Intolerance/blood , Glucose Intolerance/genetics , Glucose Tolerance Test , Humans , Insulin Resistance/genetics , Ligands , Male , Middle Aged , Reactive Oxygen Species/metabolism , Receptors, Aryl Hydrocarbon/blood , Republic of Korea
19.
Cancer Res Treat ; 52(3): 764-778, 2020 Jul.
Article En | MEDLINE | ID: mdl-32065847

PURPOSE: The purpose of this study was to identify the concordant or discordant genomic profiling between primary and matched metastatic tumors in patients with colorectal cancer (CRC) and to explore the clinical implication. MATERIALS AND METHODS: Surgical samples of primary and matched metastatic tissues from 158 patients (335 samples) with CRC at Korea University Anam Hospital were evaluated using the Ion AmpliSeq Cancer Hotspot Panel. We compared genetic variants and classified them as concordant, primary-specific, and metastasis-specific variants. We used a combination of principal components analysis and clustering to find genomic groups. Kaplan-Meier curves were used to appraise survival between genomic groups. We used machine learning to confirm the correlation between genetic variants and metastatic sites. RESULTS: A total of 282 types of deleterious non-synonymous variants were selected for analysis. Of a total of 897 variants, an average of 40% was discordant. Three genomic groups were yielded based on the genomic discrepancy patterns. Overall survival differed significantly between the genomic groups. The poorest group had the highest proportion of concordant KRAS G12V and additional metastasis-specific SMAD4. Correlation analysis between genetic variants and metastatic sites suggested that concordant KRAS mutations would have more disseminated metastases. CONCLUSION: Driver gene mutations were mostly concordant; however, discordant or metastasis-specific mutations were present. Clinically, the concordant driver genetic changes with additional metastasis-specific variants can predict poor prognosis for patients with CRC.


Adenocarcinoma, Mucinous/mortality , Biomarkers, Tumor/genetics , Colorectal Neoplasms/mortality , Colorectal Surgery/mortality , High-Throughput Nucleotide Sequencing/methods , Metastasectomy/mortality , Mutation , Adenocarcinoma, Mucinous/genetics , Adenocarcinoma, Mucinous/secondary , Adenocarcinoma, Mucinous/surgery , Colorectal Neoplasms/genetics , Colorectal Neoplasms/pathology , Colorectal Neoplasms/surgery , Female , Follow-Up Studies , Humans , Male , Middle Aged , Neoplasm Metastasis , Prognosis , Survival Rate
20.
Bioinformatics ; 35(23): 4898-4906, 2019 Dec 01.
Article En | MEDLINE | ID: mdl-31095279

MOTIVATION: Network-based analysis of biomedical data has been extensively studied over the last decades. As a successful application, gene networks have been used to illustrate interactions among genes and explain the associated phenotypes. However, gene network approaches have not been actively applied to survival analysis, which is one of the main interests of biomedical research. In addition, the few previous studies using gene networks for survival analysis construct networks mainly from prior knowledge, such as pathways, regulations and gene sets, so their performance depends considerably on the selection of prior knowledge. RESULTS: In this paper, we propose a data-driven construction method for survival risk-gene networks as well as a survival risk prediction method using the network structure. The proposed method constructs risk-gene networks with survival-associated genes using penalized regression. Then, gene expression indices are hierarchically adjusted through the networks to reduce the variance intrinsic to the datasets. By illustrating the risk-gene structure, the proposed method is expected to provide intuition about the relationship between genes and survival risks. The risk-gene network is applied to a low-grade glioma dataset and produces a hypothesis about the relationship between genetic biomarkers of low- and high-grade glioma. Moreover, with multiple datasets, we demonstrate that the proposed method shows superior prediction performance compared to conventional methods. AVAILABILITY AND IMPLEMENTATION: The R package of risk-gene networks is freely available on the web at http://cdal.korea.ac.kr/NetDA/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Algorithms , Gene Regulatory Networks , Computational Biology , Gene Expression , Survival Analysis
...