ABSTRACT
In China, patients usually must choose a medical specialty before registering with the corresponding specialists at a hospital. Making this choice requires considerable medical knowledge. As a result, many patients do not register with the correct specialty on the first attempt unless they receive help from the hospital. In this study, we attempt to automatically direct patients to the appropriate specialty based on the symptoms they describe. As far as we know, this is the first study to address this problem. We propose a neural network-based hybrid model integrated with an attention mechanism. To evaluate this hybrid model, we used a data set of more than 40,000 items covering eight departments, including Otorhinolaryngology, Pediatrics, and other common departments. The experimental results show that the hybrid model achieves more than 93.5% accuracy and generalizes well, outperforming traditional classification models.
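The abstract does not describe the architecture, so the following is only a generic illustration of what an attention mechanism contributes to text classification: it pools per-token feature vectors into a single vector using softmax weights, so that informative symptom words dominate the representation. The function names, the scoring vector `query`, and the toy numbers are assumptions made for this sketch and are not taken from the study.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(token_vectors, query):
    """Weight each token vector by softmax(dot(token, query)) and sum them.

    token_vectors: list of equal-length float lists (hypothetical encoder output)
    query: float list of the same length (hypothetical learned scoring vector)
    """
    scores = [sum(t * q for t, q in zip(vec, query)) for vec in token_vectors]
    weights = softmax(scores)
    dim = len(query)
    pooled = [
        sum(w * vec[i] for w, vec in zip(weights, token_vectors))
        for i in range(dim)
    ]
    return pooled, weights

# Toy input: three token vectors; the second one scores highest against the query.
tokens = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
pooled, weights = attention_pool(tokens, query=[1.0, 1.0])
```

In a full classifier, the pooled vector would typically be passed to a final layer that scores each of the eight departments.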
Subject(s)
Medicine , Neural Networks, Computer , Humans , Child , China
ABSTRACT
BACKGROUND: Classifying cancers by gene selection is among the most important and challenging procedures in biomedicine. A major challenge is to design an effective method that eliminates irrelevant, redundant, or noisy genes from the classification while retaining all of the highly discriminative genes. RESULTS: We propose a gene selection method called local hyperplane-based discriminant analysis (LHDA). LHDA adopts two central ideas. First, it uses a local approximation rather than a global measurement; second, it embeds a recently reported classification model, the K-Local Hyperplane Distance Nearest Neighbor (HKNN) classifier, into its discriminator. Through classification accuracy-based iterations, LHDA obtains the feature weight vector and finally extracts the optimal feature subset. The performance of the proposed method is evaluated in extensive experiments on synthetic and real microarray benchmark datasets. Eight classical feature selection methods, four classification models, and two popular embedded learning schemes, including k-nearest neighbor (KNN), hyperplane k-nearest neighbor (HKNN), Support Vector Machine (SVM), and Random Forest, are employed for comparison. CONCLUSION: The proposed method yielded performance comparable or superior to that of seven state-of-the-art models. This strong performance demonstrates the advantage of combining feature weighting with model learning in a unified framework that accomplishes the two tasks simultaneously.
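To make the local-hyperplane idea concrete, here is a minimal sketch of an HKNN-style decision rule in its simplest special case: for each class, the two training samples nearest to the query define a local affine line, and the query is assigned to the class whose line is closest. This is an illustration under stated assumptions (two neighbors per class, no regularization, no feature weights, made-up data); it is not the LHDA procedure or the authors' code, and the function names are hypothetical.

```python
import math

def dist_to_local_line(x, n1, n2):
    """Distance from point x to the affine line through neighbors n1 and n2."""
    d = [b - a for a, b in zip(n1, n2)]
    denom = sum(v * v for v in d)
    if denom == 0.0:
        # Coincident neighbors: the "line" degenerates to a point.
        return math.dist(x, n1)
    t = sum((xi - a) * v for xi, a, v in zip(x, n1, d)) / denom
    proj = [a + t * v for a, v in zip(n1, d)]
    return math.dist(x, proj)

def hknn_predict(x, samples_by_class):
    """Assign x to the class whose local line (two nearest samples) is closest."""
    best_label, best_dist = None, float("inf")
    for label, samples in samples_by_class.items():
        nearest = sorted(samples, key=lambda s: math.dist(x, s))[:2]
        if len(nearest) == 1:
            dist = math.dist(x, nearest[0])
        else:
            dist = dist_to_local_line(x, nearest[0], nearest[1])
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Made-up two-feature training data for two classes.
data = {
    "A": [[0.0, 0.0], [2.0, 0.0], [5.0, 5.0]],
    "B": [[0.0, 3.0], [2.0, 3.0], [-4.0, 8.0]],
}
label = hknn_predict([1.0, 1.0], data)
```

With more than two neighbors per class, the projection becomes a small least-squares problem rather than the closed-form line projection used here.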
Subject(s)
Cluster Analysis , Discriminant Analysis , Machine Learning/standards , Neoplasms/classification , Neoplasms/genetics , Support Vector Machine , Gene Expression Profiling , Gene Regulatory Networks , Humans
ABSTRACT
To investigate the antiobesity efficacy of anserine, C57BL/6 mice are fed a high-fat diet (HFD) and orally administered different doses of anserine (60, 120, and 240 mg/kg/day) for 16 weeks. Body weight, lipid levels, and epididymal fat content are measured, and liver damage is observed. The results show that body weight, epididymal fat content, and low-density lipoprotein cholesterol (LDL-C) content in the anserine groups are decreased by 4.36-18.71%, 7.57-35.12%, and 24.32-44.40%, respectively. To further investigate the antiobesity mechanism of anserine, the expression of SREBP-1, NLRP3, NF-κB p65 (p65), and p-NF-κB p65 (p-p65) proteins in the liver and of peroxisome proliferator-activated receptor gamma coactivator 1α (PGC1-α) and UCP-1 proteins in brown adipose tissue (BAT) is analyzed by Western blot. The results show that anserine significantly decreases the expression of the NLRP3, p65, p-p65, and SREBP-1 proteins and increases the expression of the PGC1-α and UCP-1 proteins. This study demonstrates that anserine lowers blood lipids and prevents obesity; its antiobesity mechanism may be related to the activation of brown fat by inflammation.
Subject(s)
Anti-Obesity Agents , Diet, High-Fat , Mice , Animals , Diet, High-Fat/adverse effects , Anserine , Sterol Regulatory Element Binding Protein 1/genetics , NLR Family, Pyrin Domain-Containing 3 Protein , NF-kappa B , Mice, Inbred C57BL , Obesity/drug therapy , Obesity/etiology , Obesity/metabolism , Body Weight , Anti-Obesity Agents/pharmacology
ABSTRACT
Background: The prognosis of breast cancer is often unfavorable, emphasizing the need for early metastasis risk detection and accurate treatment predictions. This study aimed to develop a novel multi-modal deep learning model using preoperative data to predict disease-free survival (DFS). Methods: We retrospectively collected pathology imaging, molecular, and clinical data from The Cancer Genome Atlas and one independent institution in China. We developed a novel Deep Learning Clinical Medicine Based Pathological Gene Multi-modal (DeepClinMed-PGM) model for DFS prediction, integrating clinicopathological data with molecular insights. The patients were divided into a training cohort (n = 741), an internal validation cohort (n = 184), and an external testing cohort (n = 95). Results: Integrating multi-modal data into the DeepClinMed-PGM model significantly improved area under the receiver operating characteristic curve (AUC) values. In the training cohort, AUC values for 1-, 3-, and 5-year DFS predictions increased to 0.979, 0.957, and 0.871, while in the external testing cohort, the values reached 0.851, 0.878, and 0.938 for 1-, 2-, and 3-year DFS predictions, respectively. The robust discriminative capability of DeepClinMed-PGM was consistently evident across cohorts, including the training cohort [hazard ratio (HR) 0.027, 95% confidence interval (CI) 0.0016-0.046, P < 0.0001], the internal validation cohort (HR 0.117, 95% CI 0.041-0.334, P < 0.0001), and the external cohort (HR 0.061, 95% CI 0.017-0.218, P < 0.0001). Additionally, the DeepClinMed-PGM model achieved C-index values of 0.925, 0.823, and 0.864 in the three cohorts, respectively. Conclusion: This study introduces an approach to breast cancer prognosis that integrates imaging, molecular, and clinical data for enhanced predictive accuracy, offering promise for personalized treatment strategies.
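For readers unfamiliar with the C-index reported above, the following is a minimal sketch of Harrell's concordance index for right-censored survival data: among all comparable patient pairs, it is the fraction in which the patient with the earlier observed event was also given the higher predicted risk. The toy follow-up times, event flags, and risk scores are invented for illustration and have no connection to the study's cohorts; the abstract does not say which implementation the authors used.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index.

    times:  follow-up time per patient
    events: 1 if the event (e.g. recurrence) was observed, 0 if censored
    risks:  predicted risk score per patient (higher means earlier event expected)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i has an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Invented toy data: four patients, the second one censored.
times = [5, 10, 15, 20]
events = [1, 0, 1, 1]
risks = [0.9, 0.4, 0.05, 0.1]
c_index = concordance_index(times, events, risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.925, 0.823, and 0.864 should be read.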
ABSTRACT
Although immunotherapy has revolutionized cancer management, most patients do not benefit from it. To explore an appropriate strategy for predicting immunotherapy efficacy, we collected transcriptome data from 6251 patients in a multicohort population and analyzed the data using a machine learning algorithm. We found that patients from three immune gene clusters had different overall survival when treated with immunotherapy (P < 0.001), and that these clusters differed in hypoxia scores and metabolic functions. The immune gene score showed good prediction of immunotherapy efficacy (AUC was 0.737 at 20 months), which was well validated. The immune gene score, tumor mutation burden, and long non-coding RNA score were further combined to build a tumor immune microenvironment signature, which correlated more strongly with overall survival (AUC, 0.814 at 20 months) than any single variable. Thus, we recommend characterizing the tumor immune microenvironment associated with immunotherapy efficacy through multi-omics analysis of cancer.
ABSTRACT
The hourly concentrations of 102 volatile organic compounds (VOCs) in Wuhan from June to July 2019 were obtained using an online monitoring instrument. The ρ(VOCs) varied from 24.9 to 254 µg·m⁻³, with a mean value of (67.7±32.2) µg·m⁻³. Based on the air quality standard for ozone, the observation period was divided into clean and polluted O3 episodes. The differences in meteorological parameters, VOC concentrations, compositions, sources, and ozone formation potential (OFP) between clean and polluted episodes were analyzed and compared. The average mass concentrations of NOx, CO, and VOCs in polluted periods exceeded those of clean periods by 34.9%, 25.0%, and 27.8%, respectively. The mass concentrations of alkanes, alkenes, aromatic hydrocarbons, and oxygenated volatile organic compounds in polluted periods were higher than those in clean periods by 40.7%, 39.5%, 26.9%, and 21.5%, respectively. The average OFP in polluted periods [(102±69.6) µg·m⁻³] exceeded that of clean periods by 33.5%. In polluted periods, the average contribution rates of LPG combustion, industrial sources, vehicle emissions, natural sources, and solvent usage to VOCs were 3.4%, 2.5%, 0.2%, 1.3%, and 1.4% lower than those of the clean periods, respectively, whereas the contribution of gasoline evaporation increased by 8.8%. The contributions of vehicle emissions and gasoline evaporation were higher in the morning and evening and lower in the afternoon, which may have been related to peak vehicle emissions. The contribution of LPG combustion peaked at cooking times. Concentration-weighted trajectory analysis showed that VOCs in polluted periods came mainly from local emissions and from surrounding regions northeast of Wuhan. To prevent summer O3 pollution in Wuhan, control of gasoline evaporation and LPG combustion should be prioritized during polluted periods.
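The abstract does not state how OFP was calculated. A common convention is to multiply each species' mass concentration by a reactivity coefficient (for example, a maximum incremental reactivity value) and sum over species; the sketch below assumes that convention. The species names and all numbers are placeholders chosen for easy arithmetic, not measured concentrations or published coefficients.

```python
def ozone_formation_potential(concentrations, reactivity):
    """Return per-species OFP and the total, assuming OFP_i = concentration_i * reactivity_i.

    concentrations: dict of species name -> mass concentration (µg·m⁻³)
    reactivity:     dict of species name -> reactivity coefficient (assumed, dimensionless here)
    """
    per_species = {
        name: conc * reactivity[name] for name, conc in concentrations.items()
    }
    return per_species, sum(per_species.values())

# Placeholder inputs, not real data.
conc = {"species_a": 10.0, "species_b": 4.0}
coef = {"species_a": 0.5, "species_b": 3.0}
per_species, total_ofp = ozone_formation_potential(conc, coef)
```

Under this convention a low-concentration but highly reactive species can dominate the total, which is why OFP rankings can differ from concentration rankings.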
Subject(s)
Air Pollutants , Ozone , Volatile Organic Compounds , Air Pollutants/analysis , China , Environmental Monitoring , Gasoline , Ozone/analysis , Vehicle Emissions/analysis , Volatile Organic Compounds/analysis
ABSTRACT
Variations in DNA copy number carry important information on genome evolution and regulation of DNA replication in cancer cells. The rapid development of single-cell sequencing technology enables exploration of gene-expression heterogeneity among single cells, providing important information on cell evolution. Evolutionary relationships in accumulated sequence data can be visualized by positioning similar cells adjacently so that similar copy-number profiles appear as block patterns. However, single-cell DNA sequencing usually starts from a small amount of genomic material, which requires an extra amplification step to obtain sufficient sample; this step introduces noise and makes regular pattern-finding challenging. In this paper, we propose to recover the hidden blocks within single-cell DNA-sequencing data through continuous sample permutations such that similar samples are positioned adjacently. The permutation is guided by the total variation norm of the recovered copy-number profiles and is continued until that norm is minimized, at which point similar samples are stacked together to reveal block patterns. An efficient numerical scheme for finding this permutation is designed, tailored from the alternating direction method of multipliers. Application of this method to both simulated and real data demonstrates its ability to recover the hidden structures of single-cell DNA sequences.
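The role of total variation in guiding the permutation can be illustrated with a small sketch: given a matrix whose rows are cells and whose columns are genomic bins, the total variation across the row ordering is the sum of absolute differences between consecutive rows, and it drops when similar cells are placed next to each other. The greedy nearest-neighbor ordering below is a simple stand-in chosen for brevity; it is not the alternating direction method of multipliers scheme described above, and the toy profiles are invented.

```python
def row_total_variation(rows, order):
    """Sum of L1 differences between consecutive rows in the given order."""
    return sum(
        sum(abs(a - b) for a, b in zip(rows[order[k]], rows[order[k + 1]]))
        for k in range(len(order) - 1)
    )

def greedy_order(rows):
    """Start from row 0 and repeatedly append the closest unused row (heuristic)."""
    remaining = list(range(1, len(rows)))
    order = [0]
    while remaining:
        last = rows[order[-1]]
        nxt = min(
            remaining,
            key=lambda r: sum(abs(a - b) for a, b in zip(last, rows[r])),
        )
        remaining.remove(nxt)
        order.append(nxt)
    return order

# Invented copy-number profiles: two cell types interleaved in the input order.
profiles = [
    [2, 2, 3, 3],
    [2, 2, 1, 1],
    [2, 2, 3, 3],
    [2, 2, 1, 1],
]
original = list(range(len(profiles)))
reordered = greedy_order(profiles)
tv_before = row_total_variation(profiles, original)
tv_after = row_total_variation(profiles, reordered)
```

On noisy data a greedy pass like this can get stuck in poor orderings, which is one motivation for posing the permutation search as a global optimization instead.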
Subject(s)
DNA Copy Number Variations , Neoplasms/pathology , Computational Biology , Humans , Sequence Analysis, DNA , Single-Cell Analysis
ABSTRACT
Copy number variations (CNVs) are known to be associated with complex diseases and particular tumor types, so reliable identification of CNVs is of great potential value. Recent advances in next generation sequencing (NGS) data analysis have helped reveal the richness of CNV information. However, the performance of these methods is not consistent. Finding CNVs in NGS data reliably and efficiently remains a challenging topic worthy of further investigation. Accordingly, we tackle the problem by formulating CNV identification as a quadratic optimization problem with two constraints. By imposing constraints of sparsity and smoothness, the read depth signal reconstructed from NGS data is expected to fit CNV patterns more accurately. An efficient numerical solution tailored from the alternating direction minimization (ADM) framework is developed. We demonstrate the advantages of the proposed method, named ADM-CNV, by comparing it with six popular CNV detection methods using synthetic, simulated, and empirical sequencing data. The proposed approach successfully reconstructs CNV patterns from raw data and achieves superior or comparable performance in CNV detection compared with existing methods.
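The abstract does not give the exact formulation, so the sketch below assumes one common way to combine a quadratic fit with sparsity and smoothness: minimize 0.5·‖y − x‖² + λ₁·‖x‖₁ + λ₂·Σ|xᵢ₊₁ − xᵢ|, where y is the read depth deviation from baseline and x is the reconstructed signal. It only evaluates that objective and shows the soft-thresholding operator that alternating direction schemes typically use for the sparsity term; it is not the ADM-CNV solver, and the parameter names and toy signal are assumptions.

```python
def soft_threshold(v, tau):
    """Proximal operator of tau * |v|: shrink v toward zero by tau."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0

def objective(y, x, lam_sparse, lam_smooth):
    """Assumed objective: quadratic fit + L1 sparsity + L1 smoothness of differences."""
    fit = 0.5 * sum((a - b) ** 2 for a, b in zip(y, x))
    sparsity = sum(abs(v) for v in x)
    smoothness = sum(abs(x[i + 1] - x[i]) for i in range(len(x) - 1))
    return fit + lam_sparse * sparsity + lam_smooth * smoothness

# Invented noisy signal with one gained segment in the middle.
y = [0.1, -0.1, 1.1, 0.9, 1.0, 0.0]
blocky = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0]  # piecewise-constant candidate
cost_raw = objective(y, y, lam_sparse=0.1, lam_smooth=0.5)
cost_blocky = objective(y, blocky, lam_sparse=0.1, lam_smooth=0.5)
shrunk = [soft_threshold(v, 0.5) for v in y]
```

The piecewise-constant candidate scores lower than the raw signal because the penalties reward zeros outside the variant and flat segments inside it, which is the behavior the two constraints are meant to encourage.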