Results 1 - 20 of 24
1.
Brief Bioinform ; 24(6)2023 09 22.
Article in English | MEDLINE | ID: mdl-37982712

ABSTRACT

Interpretation of cryo-electron microscopy (cryo-EM) maps requires building and fitting 3D atomic models of biological molecules. AlphaFold-predicted models generate initial 3D coordinates; however, model inaccuracy and conformational heterogeneity often necessitate labor-intensive manual model building and fitting into cryo-EM maps. In this work, we designed a protein model-building workflow, which combines a deep-learning cryo-EM map feature enhancement tool, CryoFEM (Cryo-EM Feature Enhancement Model) and AlphaFold. A benchmark test using 36 cryo-EM maps shows that CryoFEM achieves state-of-the-art performance in optimizing the Fourier Shell Correlations between the maps and the ground truth models. Furthermore, in a subset of 17 datasets where the initial AlphaFold predictions are less accurate, the workflow significantly improves their model accuracy. Our work demonstrates that the integration of modern deep learning image enhancement and AlphaFold may lead to automated model building and fitting for the atomistic interpretation of cryo-EM maps.


Subjects
Deep Learning , Cryoelectron Microscopy/methods , Molecular Models , Molecular Conformation , Protein Conformation
2.
J Phys Chem A ; 128(10): 1948-1957, 2024 Mar 14.
Article in English | MEDLINE | ID: mdl-38416723

ABSTRACT

Accurate classification of molecular chemical motifs from experimental measurement is an important problem in molecular physics, chemistry, and biology. In this work, we present neural network ensemble classifiers for predicting the presence (or lack thereof) of 41 different chemical motifs on small molecules from simulated C, N, and O K-edge X-ray absorption near-edge structure (XANES) spectra. Our classifiers not only achieve class-balanced accuracies of more than 0.95 but also accurately quantify uncertainty. We also show that including multiple XANES modalities improves predictions notably on average, demonstrating a "multimodal advantage" over any single modality. In addition to structure refinement, our approach can be generalized to broad applications with molecular design pipelines.

3.
Entropy (Basel) ; 23(4)2021 Apr 13.
Article in English | MEDLINE | ID: mdl-33924721

ABSTRACT

Distributed training across several quantum computers could significantly reduce training time, and sharing the learned model rather than the data could improve data privacy, because training would happen where the data is located. One potential scheme for achieving this is federated learning (FL), in which several clients or local nodes learn on their own data and a central node aggregates the models collected from those local nodes. However, to the best of our knowledge, no work has yet been done on quantum machine learning (QML) in a federated setting. In this work, we present federated training on hybrid quantum-classical machine learning models, although our framework could be generalized to pure quantum machine learning models. Specifically, we consider a quantum neural network (QNN) coupled with a classical pre-trained convolutional model. Our distributed federated learning scheme demonstrated almost the same level of trained model accuracy and yet significantly faster distributed training. It demonstrates a promising future research direction for scaling and privacy aspects.
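
The client/central-node scheme described in this abstract can be illustrated with a minimal federated-averaging sketch. It is not the authors' hybrid quantum-classical implementation: the model here is an assumed one-dimensional linear regression, and the function names (`local_update`, `federated_average`, `federated_training`) and hyperparameters are hypothetical, chosen only to show that clients share model weights and never raw data.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (x, y) pair held privately by one client


def local_update(weights: Sequence[float], data: Sequence[Sample],
                 lr: float = 0.1, epochs: int = 5) -> List[float]:
    """Train y = w*x + b on one client's own data, starting from the global weights."""
    w, b = weights
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Central node: average client models, weighted by local dataset size."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


def federated_training(clients: Sequence[Sequence[Sample]],
                       rounds: int = 50) -> List[float]:
    """Alternate local training and central aggregation; only weights are exchanged."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights
```

Weighting by dataset size is one common aggregation choice; the abstract does not say which aggregation rule the authors used.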

4.
Phys Rev Lett ; 124(15): 156401, 2020 Apr 17.
Article in English | MEDLINE | ID: mdl-32357067

ABSTRACT

Simulations of excited state properties, such as spectral functions, are often computationally expensive and therefore not suitable for high-throughput modeling. As a proof of principle, we demonstrate that graph-based neural networks can be used to predict the x-ray absorption near-edge structure spectra of molecules to quantitative accuracy. Specifically, the predicted spectra reproduce nearly all prominent peaks, with 90% of the predicted peak locations within 1 eV of the ground truth. Besides its own utility in spectral analysis and structure inference, our method can be combined with structure search algorithms to enable high-throughput spectrum sampling of the vast material configuration space, which opens up new pathways to material design and discovery.

5.
Nano Lett ; 19(6): 3457-3463, 2019 06 12.
Article in English | MEDLINE | ID: mdl-31046292

ABSTRACT

Due to its chemical stability, titania (TiO₂) thin films increasingly have significant impact when applied as passivation layers. However, optimization of growth conditions, key to achieving essential film quality and effectiveness, is challenging in the few-nanometers thickness regime. Furthermore, the atomic-scale structure of the nominally amorphous titania coating layers, particularly when applied to nanostructured supports, is difficult to probe. In this Letter, the quality of titania layers grown on ZnO nanowires is optimized using specific strategies for processing of the nanowire cores prior to titania coating. The best approach, low-pressure O₂ plasma treatment, results in significantly more-uniform titania films and a conformal coating. Characterization using X-ray absorption near edge structure (XANES) reveals the titania layer to be highly amorphous, with features in the Ti spectra significantly different from those observed for bulk TiO₂ polymorphs. Analysis based on first-principles calculations suggests that the titania shell contains a substantial fraction of under-coordinated Ti⁴⁺ ions. The best match to the experimental XANES spectrum is achieved with a "glassy" TiO₂ model that contains ∼50% of under-coordinated Ti⁴⁺ ions, in contrast to bulk crystalline TiO₂ that only contains 6-coordinated Ti⁴⁺ ions in octahedral sites.

6.
Plant J ; 86(6): 472-80, 2016 06.
Article in English | MEDLINE | ID: mdl-27015116

ABSTRACT

Transcriptome data sets from thousands of samples of the model plant Arabidopsis thaliana have been collectively generated by multiple individual labs. Although integration and meta-analysis of these samples has become routine in the plant research community, it is often hampered by a lack of metadata or differences in annotation styles of different labs. In this study, we carefully selected and integrated 6057 Arabidopsis microarray expression samples from 304 experiments deposited to the Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI). Metadata such as tissue type, growth conditions and developmental stage were manually curated for each sample. We then studied the global expression landscape of the integrated data set and found that samples of the same tissue tend to be more similar to each other than to samples of other tissues, even in different growth conditions or developmental stages. Root has the most distinct transcriptome, compared with aerial tissues, but the transcriptome of cultured root is more similar to the transcriptome of aerial tissues, as the cultured root samples lost their cellular identity. Using a simple computational classification method, we showed that the tissue type of a sample can be successfully predicted based on its expression profile, opening the door for automatic metadata extraction and facilitating the re-use of plant transcriptome data. As a proof of principle, we applied our automated annotation pipeline to 708 RNA-seq samples from public repositories and verified the accuracy of our predictions with sample metadata provided by the authors.
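
The abstract does not name the "simple computational classification method", so the following is only an assumed stand-in: a nearest-centroid classifier that assigns a sample to the tissue whose mean expression profile it correlates with most strongly. The function names and the toy four-gene profiles are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length profiles (assumes neither is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def tissue_centroids(samples: Sequence[Tuple[str, Sequence[float]]]) -> Dict[str, List[float]]:
    """Average the expression profiles of all curated samples of each tissue."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for tissue, profile in samples:
        if tissue not in sums:
            sums[tissue] = [0.0] * len(profile)
            counts[tissue] = 0
        sums[tissue] = [s + v for s, v in zip(sums[tissue], profile)]
        counts[tissue] += 1
    return {t: [s / counts[t] for s in sums[t]] for t in sums}


def predict_tissue(profile: Sequence[float], centroids: Dict[str, List[float]]) -> str:
    """Return the tissue whose centroid correlates best with the query profile."""
    return max(centroids, key=lambda t: pearson(profile, centroids[t]))


# Toy training set: (curated tissue label, expression values for four genes).
training = [
    ("root", [9.0, 1.0, 1.0, 2.0]),
    ("root", [8.0, 2.0, 1.0, 1.0]),
    ("leaf", [1.0, 9.0, 8.0, 1.0]),
    ("leaf", [2.0, 8.0, 9.0, 2.0]),
]
centroids = tissue_centroids(training)
```

Calling `predict_tissue([7.0, 1.0, 2.0, 1.0], centroids)` on this toy data picks the root centroid, mirroring the observation that samples of the same tissue resemble each other more than samples of other tissues.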


Subjects
Arabidopsis/genetics , Oligonucleotide Array Sequence Analysis/methods , Arabidopsis Proteins/genetics , Plant Gene Expression Regulation/genetics
7.
J Synchrotron Radiat ; 23(Pt 4): 937-46, 2016 07.
Article in English | MEDLINE | ID: mdl-27359142

ABSTRACT

This study investigates the distributions of Br, Ca, Cl, Cr, Cu, K, Fe, Mn, Pb, Ti, V and Zn in the Phragmites australis root system and the function of Fe nanoparticles in scavenging metals in the root epidermis using synchrotron X-ray microfluorescence, synchrotron transmission X-ray microscope measurement and synchrotron X-ray absorption near-edge structure techniques. The purpose of this study is to understand the mobility of metals in wetland plant root systems after their uptake from rhizosphere soils. Phragmites australis samples were collected in the Yangtze River intertidal zone in July 2013. The results indicate that Fe nanoparticles are present in the root epidermis and that other metals correlate significantly with Fe, suggesting that Fe nanoparticles play an important role in metal scavenging in the epidermis.

9.
BMC Bioinformatics ; 16: 273, 2015 Aug 28.
Article in English | MEDLINE | ID: mdl-26316173

ABSTRACT

BACKGROUND: Our study focuses on discovering gene regulatory networks from time series gene expression data using the Granger causality (GC) model. However, the number of available time points (T) usually is much smaller than the number of target genes (n) in biological datasets. The widely applied pairwise GC model (PGC) and other regularization strategies can lead to a significant number of false identifications when n>>T. RESULTS: In this study, we proposed a new method, viz., CGC-2SPR (CGC using two-step prior Ridge regularization) to resolve the problem by incorporating prior biological knowledge about a target gene data set. In our simulation experiments, the proposed new methodology CGC-2SPR showed significant performance improvement in terms of accuracy over other widely used GC modeling (PGC, Ridge and Lasso) and MI-based (MRNET and ARACNE) methods. In addition, we applied CGC-2SPR to a real biological dataset, i.e., the yeast metabolic cycle, and discovered more true positive edges with CGC-2SPR than with the other existing methods. CONCLUSIONS: In our research, we noticed a "1+1>2" effect when we combined prior knowledge and gene expression data to discover regulatory networks. Based on causality networks, we made a functional prediction that the Abm1 gene (its functions previously were unknown) might be related to the yeast's responses to different levels of glucose. Our research improves causality modeling by combining heterogeneous knowledge, which is well aligned with the future direction in systems biology. Furthermore, we proposed a method of Monte Carlo significance estimation (MCSE) to calculate the edge significances which provide statistical meanings to the discovered causality networks. All of our data and source code will be available under the link https://bitbucket.org/dtyu/granger-causality/wiki/Home.
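
For orientation, the pairwise Granger comparison that this abstract uses as a baseline can be sketched as follows. This is not CGC-2SPR and uses no prior knowledge: it is an assumed lag-1, two-series illustration with a plain ridge penalty, and the function name `granger_score`, the penalty `lam`, and the synthetic series are all hypothetical. Both series are assumed to be zero-mean, so no intercept is fitted.

```python
import math
import random
from typing import Sequence


def granger_score(cause: Sequence[float], effect: Sequence[float], lam: float = 0.01) -> float:
    """Log ratio of residual sums of squares: restricted model (effect's own lag only)
    versus full model (own lag plus the candidate cause's lag), both ridge-fitted."""
    y_now, y_prev, x_prev = effect[1:], effect[:-1], cause[:-1]

    syy = sum(v * v for v in y_prev)
    sxx = sum(v * v for v in x_prev)
    sxy = sum(a * b for a, b in zip(x_prev, y_prev))
    sy_t = sum(a * b for a, b in zip(y_prev, y_now))
    sx_t = sum(a * b for a, b in zip(x_prev, y_now))

    # Restricted model: y_t = a * y_{t-1}
    a_r = sy_t / (syy + lam)
    rss_restricted = sum((t - a_r * p) ** 2 for t, p in zip(y_now, y_prev))

    # Full model: y_t = a * y_{t-1} + b * x_{t-1}, solved by Cramer's rule
    det = (syy + lam) * (sxx + lam) - sxy * sxy
    a_f = (sy_t * (sxx + lam) - sxy * sx_t) / det
    b_f = ((syy + lam) * sx_t - sxy * sy_t) / det
    rss_full = sum((t - a_f * p - b_f * q) ** 2 for t, p, q in zip(y_now, y_prev, x_prev))

    return math.log(rss_restricted / rss_full)


# Synthetic example: x drives y with a one-step delay, but not the reverse.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.gauss(0.0, 1.0) for t in range(1, 200)]
score_x_to_y = granger_score(x, y)
score_y_to_x = granger_score(y, x)
```

A large score for x to y and a score near zero for y to x is the expected pattern here. With many genes and few time points, this kind of pairwise test is what the abstract reports as prone to false identifications.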


Subjects
Gene Expression Profiling/methods , Gene Regulatory Networks/genetics , Regression Analysis
10.
IEEE Trans Cybern ; PP2024 Apr 09.
Article in English | MEDLINE | ID: mdl-38593009

ABSTRACT

While deep neural networks (DNNs) have revolutionized many fields, their fragility to carefully designed adversarial attacks impedes the usage of DNNs in safety-critical applications. In this article, we strive to explore the robust features that are not affected by the adversarial perturbations, that is, invariant to the clean image and its adversarial examples (AEs), to improve the model's adversarial robustness. Specifically, we propose a feature disentanglement model to segregate the robust features from nonrobust features and domain-specific features. The extensive experiments on five widely used datasets with different attacks demonstrate that robust features obtained from our model improve the model's adversarial robustness compared to the state-of-the-art approaches. Moreover, the trained domain discriminator is able to identify the domain-specific features from the clean images and AEs almost perfectly. This enables AE detection without incurring additional computational costs. With that, we can also specify different classifiers for clean images and AEs, thereby avoiding any drop in clean image accuracy.

11.
IEEE Trans Neural Netw Learn Syst ; 35(3): 3340-3350, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38271160

ABSTRACT

Grid emergency voltage control (GEVC) is paramount in electric power systems to improve voltage stability and prevent cascading outages and blackouts in case of contingencies. While most deep reinforcement learning (DRL)-based paradigms deploy single agents in a static environment, real-world agents for GEVC are expected to cooperate in a dynamically shifting grid. Moreover, due to high uncertainties from the combinatorial nature of various contingencies and load consumption, along with the complexity of dynamic grid operation, the data efficiency and control performance of the existing DRL-based methods are challenged. To address these limitations, we propose a multi-agent graph-attention (GATT)-based DRL algorithm for GEVC in multi-area power systems. We develop graph convolutional network (GCN)-based agents for feature representation of the graph-structured voltages to improve the decision accuracy in a data-efficient manner. Furthermore, a cutting-edge attention mechanism concentrates on effective information sharing among multiple agents, synergizing different-sized subnetworks in the grid for cooperative learning. We address several key challenges in the existing DRL-based GEVC approaches, including low scalability and poor stability against high uncertainties. Test results in the IEEE benchmark system verify the advantages of the proposed method over several recent multi-agent DRL-based algorithms.

12.
IEEE Trans Image Process ; 33: 3508-3519, 2024.
Article in English | MEDLINE | ID: mdl-38809733

ABSTRACT

Domain Generalization (DG) aims to learn a generalizable model on the unseen target domain by only training on the multiple observed source domains. Although a variety of DG methods have focused on extracting domain-invariant features, the domain-specific class-relevant features have attracted attention and been argued to benefit generalization to the unseen target domain. To take into account the class-relevant domain-specific information, in this paper we propose an Information theory iNspired diSentanglement and pURification modEl (INSURE) to explicitly disentangle the latent features to obtain sufficient and compact (necessary) class-relevant feature for generalization to the unseen domain. Specifically, we first propose an information theory inspired loss function to ensure the disentangled class-relevant features contain sufficient class label information and the other disentangled auxiliary feature has sufficient domain information. We further propose a paired purification loss function to let the auxiliary feature discard all the class-relevant information and thus the class-relevant feature will contain sufficient and compact (necessary) class-relevant information. Moreover, instead of using multiple encoders, we propose to use a learnable binary mask as our disentangler to make the disentanglement more efficient and make the disentangled features complementary to each other. We conduct extensive experiments on five widely used DG benchmark datasets including PACS, VLCS, OfficeHome, TerraIncognita, and DomainNet. The proposed INSURE achieves state-of-the-art performance. We also empirically show that domain-specific class-relevant features are beneficial for domain generalization. The code is available at https://github.com/yuxi120407/INSURE.

13.
Front Bioinform ; 4: 1280971, 2024.
Article in English | MEDLINE | ID: mdl-38812660

ABSTRACT

Radiation exposure poses a significant threat to human health. Emerging research indicates that even low-dose radiation, once believed to be safe, may have harmful effects. This perception has spurred a growing interest in investigating the potential risks associated with low-dose radiation exposure across various scenarios. To comprehensively explore the health consequences of low-dose radiation, our study employs a robust statistical framework that examines whether specific groups of genes, belonging to known pathways, exhibit coordinated expression patterns that align with the radiation levels. Notably, our findings reveal the existence of intricate yet consistent signatures that reflect the molecular response to radiation exposure, distinguishing between low-dose and high-dose radiation. Moreover, we leverage a pathway-constrained variational autoencoder to capture the nonlinear interactions within gene expression data. By comparing these two analytical approaches, our study aims to gain valuable insights into the impact of low-dose radiation on gene expression patterns, identify pathways that are differentially affected, and harness the potential of machine learning to uncover hidden activity within biological networks. This comparative analysis contributes to a deeper understanding of the molecular consequences of low-dose radiation exposure.

14.
Cancer Med ; 13(12): e7253, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38899720

ABSTRACT

PURPOSE: Real world evidence is crucial to understanding the diffusion of new oncologic therapies, monitoring cancer outcomes, and detecting unexpected toxicities. In practice, real world evidence is challenging to collect rapidly and comprehensively, often requiring expensive and time-consuming manual case-finding and annotation of clinical text. In this Review, we summarise recent developments in the use of artificial intelligence to collect and analyze real world evidence in oncology. METHODS: We performed a narrative review of the major current trends and recent literature in artificial intelligence applications in oncology. RESULTS: Artificial intelligence (AI) approaches are increasingly used to efficiently phenotype patients and tumors at large scale. These tools also may provide novel biological insights and improve risk prediction through multimodal integration of radiographic, pathological, and genomic datasets. Custom language processing pipelines and large language models hold great promise for clinical prediction and phenotyping. CONCLUSIONS: Despite rapid advances, continued progress in computation, generalizability, interpretability, and reliability as well as prospective validation are needed to integrate AI approaches into routine clinical care and real-time monitoring of novel therapies.


Subjects
Artificial Intelligence , Medical Oncology , Neoplasms , Humans , Medical Oncology/methods , Medical Oncology/trends , Neoplasms/therapy
15.
Sci Rep ; 13(1): 2453, 2023 Feb 11.
Article in English | MEDLINE | ID: mdl-36774365

ABSTRACT

Quantum machine learning (QML) can complement the growing trend of using learned models for a myriad of classification tasks, from image recognition to natural speech processing. There exists the potential for a quantum advantage due to the intractability of quantum operations on a classical computer. Many datasets used in machine learning are crowd sourced or contain some private information, but to the best of our knowledge, no current QML models are equipped with privacy-preserving features. This raises concerns as it is paramount that models do not expose sensitive information. Thus, privacy-preserving algorithms need to be implemented with QML. One solution is to make the machine learning algorithm differentially private, meaning the effect of a single data point on the training dataset is minimized. Differentially private machine learning models have been investigated, but differential privacy has not been thoroughly studied in the context of QML. In this study, we develop a hybrid quantum-classical model that is trained to preserve privacy using a differentially private optimization algorithm. This marks the first proof-of-principle demonstration of privacy-preserving QML. The experiments demonstrate that differentially private QML can protect user-sensitive information without significantly diminishing model accuracy. Although the quantum model is simulated and tested on a classical computer, it demonstrates potential to be efficiently implemented on near-term quantum devices [noisy intermediate-scale quantum (NISQ)]. The approach's success is illustrated via the classification of spatially classed two-dimensional datasets and a binary MNIST classification. This implementation of privacy-preserving QML will ensure confidentiality and accurate learning on NISQ technology.
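
The abstract does not specify which differentially private optimizer was used. As a hedged illustration of how such optimizers typically limit the influence of a single data point, the sketch below follows the common clip-then-add-noise recipe (as in DP-SGD): each per-example gradient is clipped to a maximum norm, the clipped gradients are summed, Gaussian noise scaled to that norm is added, and the result is averaged. The names `clip_gradient` and `dp_gradient` and all parameter values are assumptions, and no privacy accounting is shown.

```python
import math
import random
from typing import List, Sequence


def clip_gradient(grad: Sequence[float], max_norm: float) -> List[float]:
    """Rescale one example's gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_gradient(per_example_grads: Sequence[Sequence[float]], max_norm: float,
                noise_multiplier: float, rng: random.Random) -> List[float]:
    """Clip each example's gradient, sum, add Gaussian noise, then average."""
    clipped = [clip_gradient(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # Noise standard deviation is tied to the clipping norm, which bounds
    # how much any single example can change the sum.
    noisy = [s + rng.gauss(0.0, noise_multiplier * max_norm) for s in summed]
    return [v / len(clipped) for v in noisy]
```

A training loop would pass the returned vector to an ordinary gradient step. Larger `noise_multiplier` values give stronger privacy at some cost in accuracy, which is the trade-off the abstract reports as small in its experiments.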

16.
Mar Pollut Bull ; 190: 114832, 2023 May.
Article in English | MEDLINE | ID: mdl-36934488

ABSTRACT

This study was conducted in northern New Jersey, USA, to estimate the nutrient fluxes from the Passaic River, the Hackensack River and other sources into Newark Bay and the nutrient residence time in Newark Bay. Bi-weekly total inorganic nitrogen (TIN) and orthophosphate concentration data in the Passaic River, the Hackensack River, and Newark Bay for over 15 years (2004-2019) were collected along with daily river discharge data from the public database. The annual TIN and orthophosphate (ortho-P) loading from the Passaic River ranged from 915 × 10³ kg y⁻¹ to 251 × 10⁴ kg y⁻¹ and 94 × 10³ kg y⁻¹ to 372 × 10³ kg y⁻¹, respectively. The annual TIN and ortho-P loading from the Hackensack River ranged from 3.13 × 10³ kg y⁻¹ to 234 × 10³ kg y⁻¹ and 0.28 × 10³ kg y⁻¹ to 6.97 × 10³ kg y⁻¹, respectively. Seasonal variation results indicated that hurricane events highly increased TIN and ortho-P loading from riverine input and reduced residence time in Newark Bay.
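
The abstract does not state how the bi-weekly concentrations were combined with daily discharge, so the following is only one plausible sketch of a load calculation: each day's load is concentration times discharge, with the most recent concentration sample carried forward until the next one. The function names, the step-hold assumption, and the units (mg/L for concentration, m³/s for discharge) are all assumptions for illustration.

```python
from typing import Sequence

SECONDS_PER_DAY = 86_400


def daily_load_kg(conc_mg_per_l: float, discharge_m3_per_s: float) -> float:
    """Mass transported in one day. 1 mg/L equals 1 g/m^3, so the product is in g/s."""
    grams_per_second = conc_mg_per_l * discharge_m3_per_s
    return grams_per_second * SECONDS_PER_DAY / 1000.0


def total_load_kg(sample_days: Sequence[int], sample_conc: Sequence[float],
                  daily_discharge: Sequence[float]) -> float:
    """Sum daily loads, holding each concentration sample until the next sample day.

    sample_days must be sorted ascending; day indices count from 0.
    """
    total = 0.0
    idx = 0
    for day, discharge in enumerate(daily_discharge):
        # Advance to the latest sample taken on or before this day.
        while idx + 1 < len(sample_days) and sample_days[idx + 1] <= day:
            idx += 1
        total += daily_load_kg(sample_conc[idx], discharge)
    return total
```

For example, 1 mg/L at a steady 10 m³/s corresponds to 864 kg per day, so a sustained year at those values would give roughly 315 × 10³ kg y⁻¹, the same order of magnitude as the ortho-P loads reported for the Passaic River.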


Subjects
Bays , Chemical Water Pollutants , Environmental Monitoring , New Jersey , Rivers , Chemical Water Pollutants/analysis
17.
BMC Genom Data ; 24(1): 52, 2023 09 14.
Article in English | MEDLINE | ID: mdl-37710206

ABSTRACT

BACKGROUND: When a polygenic risk score (PRS) is derived from summary statistics, independence between discovery and test sets cannot be monitored. We compared two types of PRS studies derived from raw genetic data (denoted as rPRS) and the summary statistics for IGAP (sPRS). RESULTS: Two variables with high heritability in UK Biobank, hypertension and height, are used to derive an exemplary scale effect of PRS. sPRS without APOE is derived from International Genomics of Alzheimer's Project (IGAP), which records ΔAUC and ΔR² of 0.051 ± 0.013 and 0.063 ± 0.015 for Alzheimer's Disease Sequencing Project (ADSP) and 0.060 and 0.086 for Accelerating Medicine Partnership - Alzheimer's Disease (AMP-AD). On UK Biobank, rPRS performances for hypertension assuming a similar size of discovery and test sets are 0.0036 ± 0.0027 (ΔAUC) and 0.0032 ± 0.0028 (ΔR²). For height, ΔR² is 0.029 ± 0.0037. CONCLUSION: Considering the high heritability of hypertension and height of UK Biobank and sample size of UK Biobank, sPRS results from AD databases are inflated. Independence between discovery and test sets is a well-known basic requirement for PRS studies. However, a lot of PRS studies cannot follow such requirements because of impossible direct comparisons when using summary statistics. Thus, for sPRS, potential duplications should be carefully considered within the same ethnic group.


Subjects
Alzheimer Disease , Hypertension , Humans , Factual Databases , Ethnicity , Genomics , Hypertension/genetics
18.
Sci Data ; 10(1): 349, 2023 06 02.
Article in English | MEDLINE | ID: mdl-37268638

ABSTRACT

X-ray absorption spectroscopy (XAS) is a premier technique for materials characterization, providing key information about the local chemical environment of the absorber atom. In this work, we develop a database of sulfur K-edge XAS spectra of crystalline and amorphous lithium thiophosphate materials based on the atomic structures reported in Chem. Mater., 34, 6702 (2022). The XAS database is based on simulations using the excited electron and core-hole pseudopotential approach implemented in the Vienna Ab initio Simulation Package. Our database contains 2681 S K-edge XAS spectra for 66 crystalline and glassy structure models, making it the largest collection of first-principles computational XAS spectra for glass/ceramic lithium thiophosphates to date. This database can be used to correlate S spectral features with distinct S species based on their local coordination and short-range ordering in sulfide-based solid electrolytes. The data is openly distributed via the Materials Cloud, allowing researchers to access it for free and use it for further analysis, such as spectral fingerprinting, matching with experiments, and developing machine learning models.

19.
Sci Rep ; 12(1): 17821, 2022 10 24.
Article in English | MEDLINE | ID: mdl-36280773

ABSTRACT

In recent years, data-driven, deep-learning-based models have shown great promise in medical risk prediction. By utilizing the large-scale Electronic Health Record data found in the U.S. Department of Veterans Affairs, the largest integrated healthcare system in the United States, we have developed an automated, personalized risk prediction model to support the clinical decision-making process for localized prostate cancer patients. This method combines the representative power of deep learning and the analytical interpretability of parametric regression models and can implement both time-dependent and static input data. To collect a comprehensive evaluation of model performances, we calculate time-dependent C-statistics [Formula: see text] over 2-, 5-, and 10-year time horizons using either a composite outcome or prostate cancer mortality as the target event. The composite outcome combines the Prostate-Specific Antigen (PSA) test, metastasis, and prostate cancer mortality. Our longitudinal model Recurrent Deep Survival Machine (RDSM) achieved [Formula: see text] 0.85 (0.83), 0.80 (0.83), and 0.76 (0.81), while the cross-sectional model Deep Survival Machine (DSM) attained [Formula: see text] 0.85 (0.82), 0.80 (0.82), and 0.76 (0.79) for the 2-, 5-, and 10-year composite (mortality) outcomes, respectively. In addition to estimating the survival probability, our method can quantify the uncertainty associated with the prediction. The uncertainty scores show a consistent correlation with the prediction accuracy. We find PSA and prostate cancer stage information are the most important indicators in risk prediction. Our work demonstrates the utility of the data-driven machine learning model in prostate cancer risk prediction, which can play a critical role in the clinical decision system.


Subjects
Deep Learning , Prostatic Neoplasms , Male , Humans , United States , Prostate-Specific Antigen , Cross-Sectional Studies , Prostatic Neoplasms/pathology , Survival Analysis
20.
IUCrJ ; 8(Pt 1): 12-21, 2021 Jan 01.
Article in English | MEDLINE | ID: mdl-33520239

ABSTRACT

The reconstruction of a single-particle image from the modulus of its Fourier transform, by phase-retrieval methods, has been extensively applied in X-ray structural science. Particularly for strong-phase objects, such as the phase domains found inside crystals by Bragg coherent diffraction imaging (BCDI), conventional iteration methods are time consuming and sensitive to their initial guess because of their iterative nature. Here, a deep-neural-network model is presented which gives a fast and accurate estimate of the complex single-particle image in the form of a universal approximator learned from synthetic data. A way to combine the deep-neural-network model with conventional iterative methods is then presented to refine the accuracy of the reconstructed results from the proposed deep-neural-network model. Improved convergence is also demonstrated with experimental BCDI data.
