Results 1 - 20 of 3,187
1.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38653490

ABSTRACT

Genome-wide Association Studies (GWAS) methods have identified individual single-nucleotide polymorphisms (SNPs) significantly associated with specific phenotypes. Nonetheless, many complex diseases are polygenic and are controlled by multiple genetic variants that are usually non-linearly dependent. These genetic variants are marginally less effective and remain undetected in GWAS analysis. Kernel-based tests (KBT), which evaluate the joint effect of a group of genetic variants, are therefore critical for complex disease analysis. However, choosing different kernel functions in KBT can significantly influence the type I error control and power, and selecting the optimal kernel remains a statistically challenging task. A few existing methods suffer from inflated type I errors, limited scalability, inferior power or issues of ambiguous conclusions. Here, we present a new Bayesian framework, BayesKAT (https://github.com/wangjr03/BayesKAT), which overcomes these kernel specification issues by selecting the optimal composite kernel adaptively from the data while testing genetic associations simultaneously. Furthermore, BayesKAT implements a scalable computational strategy to boost its applicability, especially for high-dimensional cases where other methods become less effective. Based on a series of performance comparisons using both simulated and real large-scale genetics data, BayesKAT outperforms the available methods in detecting complex group-level associations and controlling type I errors simultaneously. Applied to a variety of groups of functionally related genetic variants based on biological pathways, co-expression gene modules and protein complexes, BayesKAT deciphers the complex genetic basis and provides mechanistic insights into human diseases.


Subjects
Bayes Theorem , Genome-Wide Association Study , Polymorphism, Single Nucleotide , Humans , Genome-Wide Association Study/methods , Genetic Predisposition to Disease , Algorithms , Software , Computational Biology/methods , Genetic Association Studies/methods
2.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38622357

ABSTRACT

Pseudouridine is an RNA modification that is widely distributed in both prokaryotes and eukaryotes, and plays a critical role in numerous biological activities. Despite its importance, the precise identification of pseudouridine sites through experimental approaches poses significant challenges, requiring substantial time and resources. Therefore, there is a growing need for computational techniques that can reliably and quickly identify pseudouridine sites from vast amounts of RNA sequencing data. In this study, we propose fuzzy kernel evidence Random Forest (FKeERF) to identify pseudouridine sites. This method is called PseU-FKeERF, which demonstrates high accuracy in identifying pseudouridine sites from RNA sequencing data. The PseU-FKeERF model selected four RNA feature coding schemes with relatively good performance for feature combination, and then input them into the newly proposed FKeERF method for category prediction. FKeERF not only uses fuzzy logic to expand the original feature space, but also combines kernel methods that are easy to interpret in general for category prediction. Both cross-validation tests and independent tests on benchmark datasets have shown that PseU-FKeERF has better predictive performance than several state-of-the-art methods. This new method not only improves the accuracy of pseudouridine site identification, but also provides a certain reference for disease control and related drug development in the future.


Subjects
Pseudouridine , Random Forest , Pseudouridine/genetics , RNA/genetics , Base Sequence
3.
Proc Natl Acad Sci U S A ; 120(14): e2208779120, 2023 04 04.
Article in English | MEDLINE | ID: mdl-36996114

ABSTRACT

While neural networks are used for classification tasks across domains, a long-standing open problem in machine learning is determining whether neural networks trained using standard procedures are consistent for classification, i.e., whether such models minimize the probability of misclassification for arbitrary data distributions. In this work, we identify and construct an explicit set of neural network classifiers that are consistent. Since effective neural networks in practice are typically both wide and deep, we analyze infinitely wide networks that are also infinitely deep. In particular, using the recent connection between infinitely wide neural networks and neural tangent kernels, we provide explicit activation functions that can be used to construct networks that achieve consistency. Interestingly, these activation functions are simple and easy to implement, yet differ from commonly used activations such as ReLU or sigmoid. More generally, we create a taxonomy of infinitely wide and deep networks and show that these models implement one of three well-known classifiers depending on the activation function used: 1) 1-nearest neighbor (model predictions are given by the label of the nearest training example); 2) majority vote (model predictions are given by the label of the class with the greatest representation in the training set); or 3) singular kernel classifiers (a set of classifiers containing those that achieve consistency). Our results highlight the benefit of using deep networks for classification tasks, in contrast to regression tasks, where excessive depth is harmful.
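
The first two classifiers named in this taxonomy are simple enough to write down directly. The following is a minimal Python sketch, not the authors' code: the function names and the toy data are hypothetical, and Euclidean distance is an assumption for the nearest-neighbor rule. The third family, singular kernel classifiers, is not shown.

```python
import math
from collections import Counter


def one_nearest_neighbor(train_x, train_y, x):
    """Return the label of the training example closest to x (Euclidean distance assumed)."""
    nearest = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    return train_y[nearest]


def majority_vote(train_y):
    """Return the most frequent training label; the query point is ignored entirely."""
    return Counter(train_y).most_common(1)[0][0]


# Hypothetical toy data: three points labeled "a" and one labeled "b".
train_x = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
train_y = ["a", "a", "a", "b"]

query = (4.0, 4.0)
print(one_nearest_neighbor(train_x, train_y, query))  # label of the closest point, (5.0, 5.0)
print(majority_vote(train_y))                         # label of the largest class
```

On this toy set the two rules disagree for the query point, which shows how different the limiting classifiers can be on the same training data.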


Subjects
Machine Learning , Neural Networks, Computer
4.
Proc Natl Acad Sci U S A ; 120(8): e2211115120, 2023 Feb 21.
Article in English | MEDLINE | ID: mdl-36800390

ABSTRACT

We develop an algebraic framework for sequential data assimilation of partially observed dynamical systems. In this framework, Bayesian data assimilation is embedded in a nonabelian operator algebra, which provides a representation of observables by multiplication operators and probability densities by density operators (quantum states). In the algebraic approach, the forecast step of data assimilation is represented by a quantum operation induced by the Koopman operator of the dynamical system. Moreover, the analysis step is described by a quantum effect, which generalizes the Bayesian observational update rule. Projecting this formulation to finite-dimensional matrix algebras leads to computational schemes that are i) automatically positivity-preserving and ii) amenable to consistent data-driven approximation using kernel methods for machine learning. Moreover, these methods are natural candidates for implementation on quantum computers. Applications to the Lorenz 96 multiscale system and the El Niño Southern Oscillation in a climate model show promising results in terms of forecast skill and uncertainty quantification.

5.
Proc Natl Acad Sci U S A ; 120(11): e2201553120, 2023 03 14.
Article in English | MEDLINE | ID: mdl-36893275

ABSTRACT

Predicting the spread of populations across fragmented habitats is vital if we are to manage their persistence in the long term. We applied network theory with a model and an experiment to show that spread rate is jointly defined by the configuration of habitat networks (i.e., the arrangement and length of connections between habitat fragments) and the movement behavior of individuals. We found that population spread rate in the model was well predicted by algebraic connectivity of the habitat network. A multigeneration experiment with the microarthropod Folsomia candida validated this model prediction. The realized habitat connectivity and spread rate were determined by the interaction between dispersal behavior and habitat configuration, such that the network configurations that facilitated the fastest spread changed depending on the shape of the species' dispersal kernel. Predicting the spread rate of populations in fragmented landscapes requires combining knowledge of species-specific dispersal kernels and the spatial configuration of habitat networks. This information can be used to design landscapes to manage the spread and persistence of species in fragmented habitats.


Subjects
Ecosystem , Models, Biological , Seed Dispersal , Animal Distribution , Animals
6.
Trends Genet ; 38(10): 989-990, 2022 10.
Article in English | MEDLINE | ID: mdl-35715277

ABSTRACT

Maize and rice were domesticated from their wild progenitors independently. Whether their convergent phenotypic selection was driven by conserved molecular changes remains unclear. We discuss the implications of a recent genome-wide study of convergently selected maize and rice genes showing that maize KERNEL ROW NUMBER2 (KRN2) and its rice ortholog experienced convergent selection.


Subjects
Genome-Wide Association Study , Oryza , Alleles , Oryza/genetics , Zea mays/genetics
7.
Biostatistics ; 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38637995

ABSTRACT

Computed tomography (CT) has been a powerful diagnostic tool since its emergence in the 1970s. Using CT data, 3D structures of human internal organs and tissues, such as blood vessels, can be reconstructed using professional software. This 3D reconstruction is crucial for surgical operations and can serve as a vivid medical teaching example. However, traditional 3D reconstruction heavily relies on manual operations, which are time-consuming, subjective, and require substantial experience. To address this problem, we develop a novel semiparametric Gaussian mixture model tailored for the 3D reconstruction of blood vessels. This model extends the classical Gaussian mixture model by enabling nonparametric variations in the component-wise parameters of interest according to voxel positions. We develop a kernel-based expectation-maximization algorithm for estimating the model parameters, accompanied by a supporting asymptotic theory. Furthermore, we propose a novel regression method for optimal bandwidth selection. Compared to the conventional cross-validation-based (CV) method, the regression method outperforms the CV method in terms of computational and statistical efficiency. In application, this methodology facilitates the fully automated reconstruction of 3D blood vessel structures with remarkable accuracy.

8.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36433785

ABSTRACT

Differentiating cancer subtypes is crucial to guide personalized treatment and improve the prognosis for patients. Integrating multi-omics data can offer a comprehensive landscape of cancer biological process and provide promising ways for cancer diagnosis and treatment. Taking the heterogeneity of different omics data types into account, we propose a hierarchical multi-kernel learning (hMKL) approach, a novel cancer molecular subtyping method to identify cancer subtypes by adopting a two-stage kernel learning strategy. In stage 1, we obtain a composite kernel borrowing the cancer integration via multi-kernel learning (CIMLR) idea by optimizing the kernel parameters for individual omics data type. In stage 2, we obtain a final fused kernel through a weighted linear combination of individual kernels learned from stage 1 using an unsupervised multiple kernel learning method. Based on the final fusion kernel, k-means clustering is applied to identify cancer subtypes. Simulation studies show that hMKL outperforms the one-stage CIMLR method when there is data heterogeneity. hMKL can estimate the number of clusters correctly, which is the key challenge in subtyping. Application to two real data sets shows that hMKL identified meaningful subtypes and key cancer-associated biomarkers. The proposed method provides a novel toolkit for heterogeneous multi-omics data integration and cancer subtypes identification.
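
The stage 2 fusion step, a weighted linear combination of per-omics kernels, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the hMKL implementation: the function name `fuse_kernels`, the toy matrices, and the hand-picked weights are hypothetical. In the method described above the weights are learned by an unsupervised multiple kernel learning step, which is not reproduced here, and the non-negativity check is an added assumption.

```python
def fuse_kernels(kernels, weights):
    """Return sum_i weights[i] * kernels[i] for square kernel matrices of equal size."""
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    if any(w < 0 for w in weights):
        # Assumption: non-negative weights, so the fused matrix stays a valid kernel.
        raise ValueError("weights must be non-negative")
    n = len(kernels[0])
    fused = [[0.0] * n for _ in range(n)]
    for kernel, w in zip(kernels, weights):
        for i in range(n):
            for j in range(n):
                fused[i][j] += w * kernel[i][j]
    return fused


# Hypothetical 2-sample kernels from two omics layers, with weights fixed by hand.
k_expression = [[1.0, 0.5], [0.5, 1.0]]
k_methylation = [[1.0, 0.1], [0.1, 1.0]]
fused = fuse_kernels([k_expression, k_methylation], [0.75, 0.25])
print(fused)
```

The fused matrix would then be passed to a clustering step such as k-means, as the abstract describes.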


Subjects
Deep Learning , Neoplasms , Humans , Multiomics , Neoplasms/genetics , Cluster Analysis , Computer Simulation , Biomarkers, Tumor/genetics
9.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-37099694

ABSTRACT

Studies have found that the human microbiome is associated with and predictive of human health and diseases. Many statistical methods developed for microbiome data focus on different distance metrics that can capture various information in microbiomes. Prediction models were also developed for microbiome data, including deep learning methods with convolutional neural networks that consider both taxa abundance profiles and taxonomic relationships among microbial taxa from a phylogenetic tree. Studies have also suggested that a health outcome could associate with multiple forms of microbiome profiles. In addition to the abundance of some taxa that are associated with a health outcome, the presence/absence of some taxa is also associated with and predictive of the same health outcome. Moreover, associated taxa may be close to each other on a phylogenetic tree or spread apart on a phylogenetic tree. No prediction models currently exist that use multiple forms of microbiome-outcome associations. To address this, we propose a multi-kernel machine regression (MKMR) method that is able to capture various types of microbiome signals when doing predictions. MKMR utilizes multiple forms of microbiome signals through multiple kernels transformed from multiple distance metrics for microbiomes and learns an optimal conic combination of these kernels, with kernel weights helping us understand contributions of individual microbiome signal types. Simulation studies suggest a much-improved prediction performance over competing methods with a mixture of microbiome signals. Real data applications to predict multiple health outcomes using throat and gut microbiome data also suggest better prediction performance for MKMR than for competing methods.
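
The abstract does not say how a distance matrix is turned into a kernel. One common choice in microbiome kernel machine methods is double centering of the squared distances, K = -1/2 J D^2 J with J = I - (1/n) 1 1^T. The Python sketch below assumes that transformation; the function name and the toy distances are hypothetical, and this is not taken from the MKMR code.

```python
def distance_to_kernel(dist):
    """Double-center squared distances: K = -1/2 * J * D^2 * J, J = I - (1/n) * ones."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    col_mean = [sum(sq[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_mean) / n
    return [
        [-0.5 * (sq[i][j] - row_mean[i] - col_mean[j] + grand_mean) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical distances between three samples sitting at positions 0, 1 and 2 on a line.
dist = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.0],
    [2.0, 1.0, 0.0],
]
kernel = distance_to_kernel(dist)
for row in kernel:
    print(row)
```

For Euclidean distances this recovers the inner products of the centered samples. For non-Euclidean microbiome distances the result may have negative eigenvalues and would need a correction step, which is left out of this sketch.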


Subjects
Microbiota , Humans , Phylogeny , Computer Simulation , Neural Networks, Computer , Outcome Assessment, Health Care
10.
Brief Bioinform ; 24(2)2023 03 19.
Article in English | MEDLINE | ID: mdl-36702753

ABSTRACT

Microbes can affect the metabolism and immunity of the human body incessantly, and the dysbiosis of the human microbiome drives not only the occurrence but also the progression of disease (i.e. multiple statuses of disease). Recently, microbiome-based association tests have been widely developed to detect the association between the microbiome and host phenotype. However, the existing methods have not achieved satisfactory performance in testing the association between the microbiome and ordinal/nominal multicategory phenotypes (e.g. disease severity and tumor subtype). In this paper, we propose an optimal microbiome-based association test for multicategory phenotypes, namely, multiMiAT. Specifically, under the multinomial logit model framework, we first introduce a microbiome regression-based kernel association test for multicategory phenotypes (multiMiRKAT). As a data-driven optimal test, multiMiAT then integrates multiMiRKAT, score test and MiRKAT-MC to maintain excellent performance in diverse association patterns. Extensive simulation experiments demonstrate the success of our method. Furthermore, multiMiAT is also applied to real microbiome data experiments to detect the association between the gut microbiome and clinical statuses of colorectal cancer as well as for diverse statuses of Clostridium difficile infections.


Subjects
Gastrointestinal Microbiome , Microbiota , Humans , Microbiota/genetics , Computer Simulation , Phenotype , Logistic Models
11.
EMBO Rep ; 24(1): e55542, 2023 01 09.
Article in English | MEDLINE | ID: mdl-36394374

ABSTRACT

The Zn content in cereal seeds is an important trait for crop production as well as for human health. However, little is known about how Zn is loaded to plant seeds. Here, through a genome-wide association study (GWAS), we identify the Zn-NA (nicotianamine) transporter gene ZmYSL2 that is responsible for loading Zn to maize kernels. High promoter sequence variation in ZmYSL2 most likely drives the natural variation in Zn concentrations in maize kernels. ZmYSL2 is specifically localized on the plasma membrane facing the maternal tissue of the basal endosperm transfer cell layer (BETL) and functions in loading Zn-NA into the BETL. Overexpression of ZmYSL2 increases the Zn concentration in the kernels by 31.6%, which achieves the goal of Zn biofortification of maize. These findings resolve the mystery underlying the loading of Zn into plant seeds, providing an efficient strategy for breeding or engineering maize varieties with enriched Zn nutrition.


Subjects
Genome-Wide Association Study , Zea mays , Humans , Zea mays/genetics , Zea mays/metabolism , Zinc/metabolism , Plant Breeding , Seeds/genetics , Membrane Transport Proteins/genetics
12.
Cereb Cortex ; 34(1)2024 01 14.
Article in English | MEDLINE | ID: mdl-38100334

ABSTRACT

The functional connectome has shown remarkable potential in the diagnosis of neurological disorders, e.g. autism spectrum disorder. However, existing studies have primarily focused on a single connectivity pattern, such as full correlation, partial correlation, or causality. Such an approach fails to discover the potential complementary topology information of FCNs at different connection patterns, resulting in lower diagnostic performance. Consequently, toward an accurate autism spectrum disorder diagnosis, a straightforward ambition is to combine the multiple connectivity patterns for the diagnosis of neurological disorders. To this end, we use functional magnetic resonance imaging data to construct multiple brain networks with different connectivity patterns and employ kernel combination techniques to fuse information from different brain connectivity patterns for autism diagnosis. To verify the effectiveness of our approach, we assess the performance of the proposed method on the Autism Brain Imaging Data Exchange dataset for diagnosing autism spectrum disorder. The experimental findings demonstrate that our method achieves precise autism spectrum disorder diagnosis with exceptional accuracy (91.30%), sensitivity (91.48%), and specificity (91.11%).


Subjects
Autism Spectrum Disorder , Connectome , Nervous System Diseases , Humans , Connectome/methods , Autism Spectrum Disorder/diagnostic imaging , Magnetic Resonance Imaging/methods , Brain/diagnostic imaging , Brain Mapping/methods
13.
Cereb Cortex ; 34(1)2024 01 14.
Article in English | MEDLINE | ID: mdl-38012122

ABSTRACT

Mild cognitive impairment is considered the prodromal stage of Alzheimer's disease. Accurate diagnosis and the exploration of the pathological mechanism of mild cognitive impairment are extremely valuable for targeted Alzheimer's disease prevention and early intervention. In all, 100 mild cognitive impairment patients and 86 normal controls were recruited in this study. We innovatively constructed the individual morphological brain networks and derived multiple brain connectome features based on 3D-T1 structural magnetic resonance imaging with the Jensen-Shannon divergence similarity estimation method. Our results showed that the most distinguishing morphological brain connectome features in mild cognitive impairment patients were consensus connections and nodal graph metrics, mainly located in the frontal, occipital, limbic lobes, and subcortical gray matter nuclei, corresponding to the default mode network. Topological properties analysis revealed that mild cognitive impairment patients exhibited compensatory changes in the frontal lobe, while abnormal cortical-subcortical circuits associated with cognition were present. Moreover, the combination of multidimensional brain connectome features using multiple kernel-support vector machine achieved the best classification performance in distinguishing mild cognitive impairment patients and normal controls, with an accuracy of 84.21%. Therefore, our findings are of significant importance for developing potential brain imaging biomarkers for early detection of Alzheimer's disease and understanding the neuroimaging mechanisms of the disease.


Subjects
Alzheimer Disease , Cognitive Dysfunction , Connectome , Humans , Connectome/methods , Alzheimer Disease/diagnostic imaging , Alzheimer Disease/pathology , Brain/diagnostic imaging , Brain/pathology , Cognitive Dysfunction/diagnostic imaging , Cognitive Dysfunction/pathology , Magnetic Resonance Imaging/methods
14.
Cereb Cortex ; 34(7)2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38997209

ABSTRACT

Visual encoding models often use deep neural networks to describe the brain's visual cortex response to external stimuli. Inspired by biological findings, researchers found that large receptive fields built with large convolutional kernels improve convolutional encoding model performance. Inspired by scaling laws in recent years, this article investigates the performance of large convolutional kernel encoding models on larger parameter scales. This paper proposes a large-scale parameters framework with a sizeable convolutional kernel for encoding visual functional magnetic resonance imaging activity information. The proposed framework consists of three parts: First, the stimulus image feature extraction module is constructed using a large-kernel convolutional network while increasing channel numbers to expand the parameter size of the framework. Second, the multi-subject fusion module enlarges the input data during the training stage to accommodate the increase in parameters. Third, the voxel mapping module maps from stimulus image features to functional magnetic resonance imaging signals. Compared to sizeable convolutional kernel visual encoding networks with base parameter scale, our visual encoding framework improves by approximately 7% on the Natural Scenes Dataset, the dedicated dataset for the Algonauts 2023 Challenge. Further analysis shows that our encoding framework makes a trade-off between encoding performance and trainability. This paper confirms that expanding parameters in visual coding can bring performance improvements.


Subjects
Brain Mapping , Magnetic Resonance Imaging , Neural Networks, Computer , Visual Cortex , Magnetic Resonance Imaging/methods , Humans , Visual Cortex/physiology , Visual Cortex/diagnostic imaging , Brain Mapping/methods , Image Processing, Computer-Assisted/methods , Visual Perception/physiology , Photic Stimulation/methods
15.
Cereb Cortex ; 34(4)2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38615243

ABSTRACT

OBJECTIVE: To investigate the alterations in cortical-cerebellar circuits and assess their diagnostic potential in preschool children with autism spectrum disorder using multimodal magnetic resonance imaging. METHODS: We utilized diffusion basis spectrum imaging approaches, namely DBSI_20 and DBSI_combine, alongside 3D structural imaging to examine 31 autism spectrum disorder diagnosed patients and 30 healthy controls. The participants' brains were segmented into 120 anatomical regions for this analysis, and a multimodal strategy was adopted to assess the brain networks using a multi-kernel support vector machine for classification. RESULTS: The results revealed consensus connections in the cortical-cerebellar and subcortical-cerebellar circuits, notably in the thalamus and basal ganglia. These connections were predominantly positive in the frontoparietal and subcortical pathways, whereas negative consensus connections were mainly observed in frontotemporal and subcortical pathways. Among the models tested, DBSI_20 showed the highest accuracy rate of 86.88%. In addition, further analysis indicated that combining the 3 models resulted in the most effective performance. CONCLUSION: The connectivity network analysis of the multimodal brain data identified significant abnormalities in the cortical-cerebellar circuits in autism spectrum disorder patients. The DBSI_20 model not only provided the highest accuracy but also demonstrated efficiency, suggesting its potential for clinical application in autism spectrum disorder diagnosis.


Subjects
Autism Spectrum Disorder , Humans , Child, Preschool , Autism Spectrum Disorder/diagnostic imaging , Magnetic Resonance Imaging , Diffusion Magnetic Resonance Imaging , Cerebellum/diagnostic imaging , Brain
16.
Cereb Cortex ; 34(4)2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38679476

ABSTRACT

Spinocerebellar ataxia type 12 is a hereditary and neurodegenerative illness commonly found in India. However, there is no established noninvasive automatic diagnostic system for its diagnosis and identification of imaging biomarkers. This work proposes a novel four-phase machine learning-based diagnostic framework to find spinocerebellar ataxia type 12 disease-specific atrophic-brain regions and distinguish spinocerebellar ataxia type 12 from healthy using a real structural magnetic resonance imaging dataset. Firstly, each brain region is represented in terms of statistics of coefficients obtained using 3D-discrete wavelet transform. Secondly, a set of relevant regions are selected using a graph network-based method. Thirdly, a kernel support vector machine is used to capture nonlinear relationships among the voxels of a brain region. Finally, the linear relationship among the brain regions is captured to build a decision model to distinguish spinocerebellar ataxia type 12 from healthy by using the regularized logistic regression method. A classification accuracy of 95% and a harmonic mean of precision and recall, i.e. F1-score of 94.92%, is achieved. The proposed framework provides relevant regions responsible for the atrophy. The importance of each region is captured using Shapley Additive exPlanations values. We also performed a statistical analysis to find volumetric changes in spinocerebellar ataxia type 12 group compared to healthy. The promising result of the proposed framework shows that clinicians can use it for early and timely diagnosis of spinocerebellar ataxia type 12.
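
The two reported metrics follow standard definitions: accuracy is the share of correct predictions, and the F1-score is the harmonic mean of precision and recall. The Python sketch below computes both from confusion-matrix counts. The counts are hypothetical toy values chosen for illustration, not the study's data.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 9 true positives, 1 false positive, 3 false negatives, 7 true negatives.
tp, fp, fn, tn = 9, 1, 3, 7
print(f"accuracy = {accuracy(tp, fp, fn, tn):.4f}")
print(f"F1-score = {f1_score(tp, fp, fn):.4f}")
```

The harmonic mean penalizes an imbalance between precision and recall, so the F1-score can be higher or lower than the accuracy on the same predictions.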


Subjects
Biomarkers , Brain , Magnetic Resonance Imaging , Spinocerebellar Ataxias , Support Vector Machine , Humans , Magnetic Resonance Imaging/methods , Spinocerebellar Ataxias/diagnostic imaging , Spinocerebellar Ataxias/genetics , Spinocerebellar Ataxias/diagnosis , Brain/diagnostic imaging , Brain/pathology , Brain/metabolism , Biomarkers/analysis , Male , Female , Adult , Logistic Models , Middle Aged , Atrophy
17.
Proc Natl Acad Sci U S A ; 119(24): e2202679119, 2022 Jun 14.
Article in English | MEDLINE | ID: mdl-35687672

ABSTRACT

Following a brief review of the management of environmental externalities under strategic interactions in the traditional temporal domain, results are extended to the spatiotemporal domain. Conditions for spatial open-loop and feedback Nash equilibria, along with conditions for the benchmark cooperative solution, are presented and compared. A simplified numerical example illustrates the spatial patterns emerging at a steady state under Fickian diffusion and dispersal kernels, and the inefficiency of spatially flat emission taxes. This conceptual framework could provide new research areas.

18.
Proc Natl Acad Sci U S A ; 119(16): e2115064119, 2022 04 19.
Article in English | MEDLINE | ID: mdl-35412891

ABSTRACT

Matrix completion problems arise in many applications including recommendation systems, computer vision, and genomics. Increasingly larger neural networks have been successful in many of these applications but at considerable computational costs. Remarkably, taking the width of a neural network to infinity allows for improved computational performance. In this work, we develop an infinite width neural network framework for matrix completion that is simple, fast, and flexible. Simplicity and speed come from the connection between the infinite width limit of neural networks and kernels known as neural tangent kernels (NTK). In particular, we derive the NTK for fully connected and convolutional neural networks for matrix completion. The flexibility stems from a feature prior, which allows encoding relationships between coordinates of the target matrix, akin to semisupervised learning. The effectiveness of our framework is demonstrated through competitive results for virtual drug screening and image inpainting/reconstruction. We also provide an implementation in Python to make our framework accessible on standard hardware to a broad audience.


Subjects
Image Processing, Computer-Assisted , Neural Networks, Computer , Computers , Image Processing, Computer-Assisted/methods , Machine Learning , Supervised Machine Learning
19.
BMC Genomics ; 25(1): 466, 2024 May 13.
Article in English | MEDLINE | ID: mdl-38741045

ABSTRACT

BACKGROUND: Protein-protein interactions (PPIs) hold significant importance in biology, with precise PPI prediction as a pivotal factor in comprehending cellular processes and facilitating drug design. However, experimental determination of PPIs is laborious, time-consuming, and often constrained by technical limitations. METHODS: We introduce a new node representation method based on initial information fusion, called FFANE, which amalgamates PPI networks and protein sequence data to enhance the precision of PPIs' prediction. A Gaussian kernel similarity matrix is initially established by leveraging protein structural resemblances. Concurrently, protein sequence similarities are gauged using the Levenshtein distance, enabling the capture of diverse protein attributes. Subsequently, to construct an initial information matrix, these two feature matrices are merged by employing weighted fusion to achieve an organic amalgamation of structural and sequence details. To gain a more profound understanding of the amalgamated features, a Stacked Autoencoder (SAE) is employed for encoding learning, thereby yielding more representative feature representations. Ultimately, classification models are trained to predict PPIs by using the well-learned fusion feature. RESULTS: When employing 5-fold cross-validation experiments on SVM, our proposed method achieved average accuracies of 94.28%, 97.69%, and 84.05% in terms of Saccharomyces cerevisiae, Homo sapiens, and Helicobacter pylori datasets, respectively. CONCLUSION: Experimental findings across various authentic datasets validate the efficacy and superiority of this fusion feature representation approach, underscoring its potential value in bioinformatics.
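
The Levenshtein distance used for the sequence side of this pipeline can be computed with a short dynamic program. The Python sketch below is illustrative and is not the FFANE code: the helper `sequence_similarity`, which rescales the distance by the longer sequence length, is an assumed normalization, and the two peptide strings are made-up examples.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep on a match
            ))
        prev = curr
    return prev[-1]


def sequence_similarity(a, b):
    """Assumed normalization: 1 - distance / length of the longer sequence."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


# Made-up peptide fragments that differ by a single substitution.
seq_a = "MKTAYIAK"
seq_b = "MKTAYLAK"
print(levenshtein(seq_a, seq_b))
print(sequence_similarity(seq_a, seq_b))
```

Applying `sequence_similarity` to every pair of proteins would give a sequence similarity matrix of the same shape as the Gaussian kernel matrix, which is what a weighted fusion of the two requires.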


Subjects
Computational Biology , Protein Interaction Mapping , Protein Interaction Mapping/methods , Computational Biology/methods , Algorithms , Helicobacter pylori/metabolism , Helicobacter pylori/genetics , Support Vector Machine , Proteins/metabolism , Proteins/chemistry , Humans , Protein Interaction Maps , Databases, Protein
20.
Neuroimage ; 293: 120611, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38643890

ABSTRACT

Dynamic PET allows quantification of physiological parameters through tracer kinetic modeling. For dynamic imaging of brain or head and neck cancer on conventional PET scanners with a short axial field of view, the image-derived input function (ID-IF) from intracranial blood vessels such as the carotid artery (CA) suffers from severe partial volume effects. Alternatively, optimization-derived input function (OD-IF) by the simultaneous estimation (SIME) method does not rely on an ID-IF but derives the input function directly from the data. However, the optimization problem is often highly ill-posed. We proposed a new method that combines the ideas of OD-IF and ID-IF together through a kernel framework. While evaluation of such a method is challenging in human subjects, we used the uEXPLORER total-body PET system that covers major blood pools to provide a reference for validation. METHODS: The conventional SIME approach estimates an input function using a joint estimation together with kinetic parameters by fitting time activity curves from multiple regions of interests (ROIs). The input function is commonly parameterized with a highly nonlinear model which is difficult to estimate. The proposed kernel SIME method exploits the CA ID-IF as a priori information via a kernel representation to stabilize the SIME approach. The unknown parameters are linear and thus easier to estimate. The proposed method was evaluated using 18F-fluorodeoxyglucose studies with both computer simulations and 20 human-subject scans acquired on the uEXPLORER scanner. The effect of the number of ROIs on kernel SIME was also explored. RESULTS: The estimated OD-IF by kernel SIME showed a good match with the reference input function and provided more accurate estimation of kinetic parameters for both simulation and human-subject data. The kernel SIME led to the highest correlation coefficient (R = 0.97) and the lowest mean absolute error (MAE = 10.5 %) compared to using the CA ID-IF (R = 0.86, MAE = 108.2 %) and conventional SIME (R = 0.57, MAE = 78.7 %) in the human-subject evaluation. Adding more ROIs improved the overall performance of the kernel SIME method. CONCLUSION: The proposed kernel SIME method shows promise to provide an accurate estimation of the blood input function and kinetic parameters for brain PET parametric imaging.


Subjects
Brain , Positron-Emission Tomography , Humans , Positron-Emission Tomography/methods , Positron-Emission Tomography/standards , Brain/diagnostic imaging , Whole Body Imaging/methods , Image Processing, Computer-Assisted/methods , Algorithms