Results 1 - 20 of 151
1.
Front Microbiol ; 15: 1438942, 2024.
Article in English | MEDLINE | ID: mdl-39355422

ABSTRACT

Background: Clinical studies have demonstrated that microbes play a crucial role in human health and disease. Identifying microbe-disease interactions can provide insights into pathogenesis and promote the diagnosis, treatment, and prevention of disease. Although many computational methods have been designed to screen for novel microbe-disease associations, accurate and efficient methods are still lacking because of data inconsistency, underutilization of prior information, and limited model performance. Methods: In this study, we propose an improved deep learning-based framework, named GIMMDA, to identify latent microbe-disease associations; it is based on a graph autoencoder and inductive matrix completion. By co-training on information from the microbe and disease spaces, new representations of microbes and diseases are learned and used to reconstruct microbe-disease associations in an end-to-end framework. In particular, a similarity fusion strategy is applied to improve prediction performance. Results: The experimental results show that GIMMDA is competitive with existing state-of-the-art methods on 3 datasets (i.e., HMDAD, Disbiome, and multiMDA). In particular, it performs best, with areas under the receiver operating characteristic curve (AUC) of 0.9735, 0.9156, and 0.9396 on these 3 datasets, respectively. The results also confirm that different similarity fusions can improve prediction performance. Furthermore, case studies on two diseases, asthma and obesity, validate the effectiveness and reliability of the proposed model. Conclusion: The proposed GIMMDA model shows a strong capability for predicting microbe-disease associations. We expect that GIMMDA will help identify potential microbe-related diseases in the future.

2.
J Cell Mol Med ; 28(18): e70071, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39300612

ABSTRACT

The use of matrix completion methods to predict associations between microbes and diseases can effectively improve treatment efficiency. However, the similarity measures used in existing methods are often influenced by factors such as neighbourhood size, the choice of similarity metric, or the multiple parameters needed for similarity fusion, which makes reliable similarity measurement challenging. Additionally, matrix completion is currently limited by the sparsity of the initial association matrix, which restricts its predictive performance. To address these problems, we propose a matrix completion method based on adaptive neighbourhood similarity and sparse constraints (ANS-SCMC) for predicting potential microbe-disease associations. Adaptive neighbourhood similarity learning dynamically uses the decomposition results as effective information for the next learning iteration by simultaneously performing local manifold structure learning and decomposition. This approach effectively preserves fine local structure information and avoids the influence of weight parameters directly involved in similarity measurement. Additionally, the sparse constraint-based matrix completion approach can better handle the sparsity challenge in the association matrix. In validation, the proposed algorithm achieved significantly higher predictive performance than several commonly used prediction methods. Furthermore, in the case study, the prediction algorithm achieved an accuracy of up to 80% for the top 10 microbes associated with type 1 diabetes and 100% for Crohn's disease, respectively.


Subjects
Algorithms, Humans, Computational Biology/methods, Microbiota, Crohn Disease/microbiology
3.
Sci Rep ; 14(1): 19676, 2024 Aug 24.
Article in English | MEDLINE | ID: mdl-39181926

ABSTRACT

Despite the negative externalities on the environment and human health, today's economies still produce excessive carbon dioxide emissions. As a result, governments are trying to shift production and consumption to more sustainable models that reduce the environmental impact of carbon dioxide emissions. The European Union, in particular, has implemented an innovative policy to reduce carbon dioxide emissions by creating a market for emission rights, the emissions trading system. The objective of this paper is to perform a counterfactual analysis to measure the impact of the emissions trading system on the reduction of carbon dioxide emissions. For this purpose, a recently-developed statistical machine learning method called matrix completion with fixed effects estimation is used and compared to traditional econometric techniques. We apply matrix completion with fixed effects estimation to the prediction of missing counterfactual entries of a carbon dioxide emissions matrix whose elements (indexed row-wise by country and column-wise by year) represent emissions without the emissions trading system for country-year pairs. The results obtained, confirmed by robust diagnostic tests, show a significant effect of the emissions trading system on the reduction of carbon dioxide emissions: the majority of European Union countries included in our analysis reduced their total carbon dioxide emissions (associated with selected industries) by about 15.4% during the emissions trading system treatment period 2005-2020, compared to the total carbon dioxide emissions (associated with the same industries) that would have been achieved in the absence of the emissions trading system policy. Finally, several managerial/practical implications of the study are discussed, together with its possible extensions.

4.
J Comput Graph Stat ; 33(2): 551-566, 2024.
Article in English | MEDLINE | ID: mdl-38993268

ABSTRACT

In clinical practice and biomedical research, measurements are often collected sparsely and irregularly in time, while the data acquisition is expensive and inconvenient. Examples include measurements of spine bone mineral density, cancer growth through mammography or biopsy, a progression of defective vision, or assessment of gait in patients with neurological disorders. Practitioners often need to infer the progression of diseases from such sparse observations. A classical tool for analyzing such data is a mixed-effect model where time is treated as both a fixed effect (population progression curve) and a random effect (individual variability). Alternatively, researchers use Gaussian processes or functional data analysis, assuming that observations are drawn from a certain distribution of processes. While these models are flexible, they rely on probabilistic assumptions, require very careful implementation, and tend to be slow in practice. In this study, we propose an alternative elementary framework for analyzing longitudinal data motivated by matrix completion. Our method yields estimates of progression curves by iterative application of the Singular Value Decomposition. Our framework covers multivariate longitudinal data, and regression and can be easily extended to other settings. As it relies on existing tools for matrix algebra, it is efficient and easy to implement. We apply our methods to understand trends of progression of motor impairment in children with Cerebral Palsy. Our model approximates individual progression curves and explains 30% of the variability. Low-rank representation of progression trends enables identification of different progression trends in subtypes of Cerebral Palsy.
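
As a rough illustration of the idea in this abstract (estimating progression curves by iterative application of the SVD), the following Python sketch fills the missing entries of a subjects-by-time matrix with a fixed-rank SVD fit and repeats until the filled values stop changing. It is not the authors' implementation: the function name `iterative_svd_impute`, its parameters, and the choice of a fixed-rank truncation are assumptions made only for illustration.

```python
import numpy as np


def iterative_svd_impute(Y, mask, rank=2, n_iter=500, tol=1e-10):
    """Fill entries of Y where mask is False using a rank-`rank` SVD fit.

    Hypothetical helper for illustration; observed entries are never changed.
    """
    Y = np.asarray(Y, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    # Start by placing the mean of the observed values in every missing cell.
    X = np.where(mask, Y, Y[mask].mean())
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        # Best rank-`rank` approximation of the current filled matrix.
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Keep observed entries, refresh only the missing ones.
        X_new = np.where(mask, Y, low_rank)
        step = np.linalg.norm(X_new - X)
        X = X_new
        if step <= tol * max(1.0, np.linalg.norm(X)):
            break
    return X
```

On a small exactly low-rank matrix with a few hidden cells, the filled values approach the hidden ones. On noisy, sparsely sampled longitudinal data, some form of regularization (for example shrinking the singular values) is usually needed as well.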

5.
Biostatistics ; 25(4): 1062-1078, 2024 Oct 01.
Article in English | MEDLINE | ID: mdl-38850151

ABSTRACT

DNA methylation is an important epigenetic mark that modulates gene expression through the inhibition of transcriptional proteins binding to DNA. As in many other omics experiments, the issue of missing values is an important one, and appropriate imputation techniques are important in avoiding an unnecessary sample size reduction as well as to optimally leverage the information collected. We consider the case where relatively few samples are processed via an expensive high-density whole genome bisulfite sequencing (WGBS) strategy and a larger number of samples is processed using more affordable low-density, array-based technologies. In such cases, one can impute the low-coverage (array-based) methylation data using the high-density information provided by the WGBS samples. In this paper, we propose an efficient Linear Model of Coregionalisation with informative Covariates (LMCC) to predict missing values based on observed values and covariates. Our model assumes that at each site, the methylation vector of all samples is linked to the set of fixed factors (covariates) and a set of latent factors. Furthermore, we exploit the functional nature of the data and the spatial correlation across sites by assuming some Gaussian processes on the fixed and latent coefficient vectors, respectively. Our simulations show that the use of covariates can significantly improve the accuracy of imputed values, especially in cases where missing data contain some relevant information about the explanatory variable. We also showed that our proposed model is particularly efficient when the number of columns is much greater than the number of rows-which is usually the case in methylation data analysis. Finally, we apply and compare our proposed method with alternative approaches on two real methylation datasets, showing how covariates such as cell type, tissue type or age can enhance the accuracy of imputed values.


Subjects
DNA Methylation, Genetic Epigenesis, DNA Methylation/genetics, Humans, Statistical Models, Epigenomics/methods, Biostatistics/methods
6.
Comput Biol Med ; 177: 108612, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38838556

ABSTRACT

Alzheimer's disease (AD) is one of the most prevalent chronic neurodegenerative disorders globally, with a rapidly growing population of AD patients and currently no effective therapeutic interventions available. Consequently, the development of therapeutic anti-AD drugs and the identification of AD targets represent one of the most urgent tasks. In this study, in addition to considering known drugs and targets, we explore compound-protein interactions (CPIs) between compounds and proteins relevant to AD. We propose a deep learning model called CKG-IMC to predict Alzheimer's disease compound-protein interaction relationships. CKG-IMC comprises three modules: a collaborative knowledge graph (CKG), a principal neighborhood aggregation graph neural network (PNA), and an inductive matrix completion (IMC). The collaborative knowledge graph is used to learn semantic associations between entities, PNA is employed to extract structural features of the relationship network, and IMC is utilized for CPIs prediction. Compared with a total of 16 baseline models based on similarities, knowledge graphs, and graph neural networks, our model achieves state-of-the-art performance in experiments of 10-fold cross-validation and independent test. Furthermore, we use CKG-IMC to predict compounds interacting with two confirmed AD targets, 42-amino-acid ß-amyloid (Aß42) protein and microtubule-associated protein tau (tau protein), as well as proteins interacting with five FDA-approved anti-AD drugs. The results indicate that the majority of predictions are supported by literature, and molecular docking experiments demonstrate a strong affinity between the predicted compounds and targets.


Subjects
Alzheimer Disease, Deep Learning, Alzheimer Disease/metabolism, Alzheimer Disease/drug therapy, Humans, Computer Neural Networks, Protein Interaction Maps, Computational Biology/methods
7.
Sci Rep ; 14(1): 12761, 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38834687

ABSTRACT

Abundant research has consistently illustrated the crucial role of microRNAs (miRNAs) in a wide array of essential biological processes. Furthermore, miRNAs have been validated as promising therapeutic targets for addressing complex diseases. Given the costly and time-consuming nature of traditional experimental validation, it is imperative to develop computational methods. In this work, we developed a novel approach named efficient matrix completion (EMCMDA) for predicting miRNA-disease associations. First, we calculated similarities across multiple sources for miRNA/disease pairs and combined this information to create a holistic miRNA/disease similarity measure. Second, we used this biological information to create a heterogeneous network and established a target matrix derived from this network. Lastly, we framed miRNA-disease association prediction as a low-rank matrix completion problem, which we solved by minimizing the truncated Schatten p-norm of the matrix. Notably, we improved the conventional singular value contraction algorithm by using a weighted singular value contraction technique. This technique dynamically adjusts the degree of contraction based on the significance of each singular value, ensuring that the physical meaning of these singular values is fully considered. We evaluated the performance of EMCMDA with two distinct cross-validation experiments on two diverse databases, and the outcomes were statistically significant. In addition, we carried out comprehensive case studies on two prevalent human diseases, namely lung cancer and breast cancer. Following prediction and multiple validations, it was evident that EMCMDA proficiently predicts previously undisclosed disease-related miRNAs. These results underscore the robustness and efficacy of EMCMDA in miRNA-disease association prediction.


Subjects
Algorithms, Computational Biology, Genetic Predisposition to Disease, MicroRNAs, MicroRNAs/genetics, Humans, Computational Biology/methods, Breast Neoplasms/genetics
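
The weighted singular value contraction mentioned in this abstract can be pictured with a short Python sketch in which each singular value gets its own shrinkage amount, so that large (more significant) singular values are reduced less than small ones. The abstract does not give EMCMDA's weighting rule, so the rule used here (weights inversely proportional to the singular values, as in weighted nuclear-norm style methods) and the names `weighted_singular_value_shrink`, `tau`, and `eps` are assumptions, not the published algorithm.

```python
import numpy as np


def weighted_singular_value_shrink(X, tau=1.0, eps=1e-8):
    """Shrink each singular value of X by its own weighted threshold.

    Hypothetical weighting rule: weight_i = 1 / (sigma_i + eps), so that
    larger singular values are shrunk less than smaller ones.
    Returns the shrunk matrix, the original and the shrunk singular values.
    """
    U, s, Vt = np.linalg.svd(np.asarray(X, dtype=float), full_matrices=False)
    weights = 1.0 / (s + eps)
    # Soft-threshold with a per-value threshold tau * weight_i.
    s_shrunk = np.maximum(s - tau * weights, 0.0)
    return (U * s_shrunk) @ Vt, s, s_shrunk
```

With uniform weights this reduces to ordinary singular value soft-thresholding. The weighted variant keeps the dominant components nearly intact while removing the weakest ones entirely.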
8.
Magn Reson Med ; 92(4): 1440-1455, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38725430

ABSTRACT

PURPOSE: To develop a new sequence to simultaneously acquire Cartesian sodium (23Na) MRI and accelerated Cartesian single (SQ) and triple quantum (TQ) sodium MRI of in vivo human brain at 7 T by leveraging two dedicated low-rank reconstruction frameworks. THEORY AND METHODS: The Double Half-Echo technique enables short echo time Cartesian 23Na MRI and acquires two k-space halves, reconstructed by a low-rank coupling constraint. Additionally, three-dimensional (3D) 23Na Multi-Quantum Coherences (MQC) MRI requires multi-echo sampling paired with phase-cycling, exhibiting a redundant multidimensional space. Simultaneous Autocalibrating and k-Space Estimation (SAKE) was used to reconstruct highly undersampled 23Na MQC MRI. Reconstruction performance was assessed against five-dimensional (5D) CS, evaluating structural similarity index (SSIM), root mean squared error (RMSE), signal-to-noise ratio (SNR), and quantification of tissue sodium concentration and TQ/SQ ratio in silico, in vitro, and in vivo. RESULTS: The proposed sequence enabled the simultaneous acquisition of fully sampled 23Na MRI while leveraging prospective undersampling for 23Na MQC MRI. SAKE improved TQ image reconstruction regarding SSIM by 6% and reduced RMSE by 35% compared to 5D CS in vivo. Thanks to prospective undersampling, the spatial resolution of 23Na MQC MRI was enhanced from 8 × 8 × 15 mm³ to 8 × 8 × 8 mm³ while reducing acquisition time from 2 × 31 min to 2 × 23 min. CONCLUSION: The proposed sequence, coupled with low-rank reconstructions, provides an efficient framework for comprehensive whole-brain sodium MRI, combining TSC, T2*, and TQ/SQ ratio estimations. Additionally, low-rank matrix completion enables the reconstruction of highly undersampled 23Na MQC MRI, allowing for accelerated acquisition or enhanced spatial resolution.


Subjects
Algorithms, Brain, Computer-Assisted Image Processing, Magnetic Resonance Imaging, Imaging Phantoms, Signal-to-Noise Ratio, Sodium, Humans, Magnetic Resonance Imaging/methods, Brain/diagnostic imaging, Sodium/chemistry, Computer-Assisted Image Processing/methods, Sodium Isotopes, Three-Dimensional Imaging/methods, Computer Simulation
9.
Comput Biol Chem ; 110: 108071, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38718497

ABSTRACT

Incomplete data presents significant challenges in drug sensitivity analysis, especially in critical areas like oncology, where precision is paramount. Our study introduces an innovative imputation method designed specifically for low-rank matrices, addressing the crucial challenge of data completion in anticancer drug sensitivity testing. Our method unfolds in two main stages: Initially, the singular value thresholding algorithm is employed for preliminary matrix completion, establishing a solid foundation for subsequent steps. Then, the matrix rows are segmented into distinct blocks based on hierarchical clustering of correlation coefficients, applying singular value thresholding to the largest block, which has been proved to possess the largest entropy. This is followed by a refined data restoration process, where the reconstructed largest block is integrated into the initial matrix completion to achieve the final matrix completion. Compared to other methods, our approach not only improves the accuracy of data restoration but also ensures the integrity and reliability of the imputed values, establishing it as a robust tool for future drug sensitivity analysis.


Subjects
Algorithms, Antineoplastic Agents, Antineoplastic Agents/pharmacology, Antineoplastic Agents/chemistry, Humans, Drug Discovery, Antitumor Drug Screening Assays
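
The first stage described in this abstract relies on the singular value thresholding (SVT) algorithm. A minimal Python sketch of the standard SVT iteration follows: soft-threshold the singular values of an auxiliary matrix, then update that matrix with the residual on the observed entries. The block-wise clustering stage of the paper is not reproduced, and the names `shrink` and `svt_complete` and the default step size are illustrative assumptions.

```python
import numpy as np


def shrink(Y, tau):
    """Soft-threshold the singular values of Y by tau."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def svt_complete(M, mask, tau, delta=1.2, n_iter=500, tol=1e-6):
    """Estimate a low-rank matrix that matches M where mask is True.

    Illustrative sketch: entries of M where mask is False are ignored.
    """
    mask = np.asarray(mask, dtype=bool)
    M = np.where(mask, np.asarray(M, dtype=float), 0.0)
    Y = np.zeros_like(M)  # auxiliary (dual) variable
    X = np.zeros_like(M)  # current low-rank estimate
    norm_obs = np.linalg.norm(M)
    for _ in range(n_iter):
        X = shrink(Y, tau)
        # Residual restricted to the observed entries.
        residual = np.where(mask, M - X, 0.0)
        if np.linalg.norm(residual) <= tol * norm_obs:
            break
        Y += delta * residual
    return X
```

The threshold `tau` controls how strongly low rank is favoured, and `delta` is a step size that should stay between 0 and 2. Suitable values depend on the matrix size and on the fraction of observed entries.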
10.
J Cheminform ; 16(1): 44, 2024 Apr 16.
Article in English | MEDLINE | ID: mdl-38627866

ABSTRACT

Protein kinases have become an important source of potential drug targets. Developing new, efficient, and safe small-molecule kinase inhibitors has become an important topic in the field of drug research and development. In contrast with traditional wet experiments, which are time-consuming and expensive, machine learning-based approaches for predicting small-molecule inhibitors of protein kinases are time-saving and cost-effective, and are therefore highly desirable. However, the issue of sample scarcity (known active and inactive compounds are usually limited for most kinases) poses a challenge to the research and development of machine learning-based methods for predicting kinase inhibitor activity. To alleviate the data scarcity problem in the prediction of kinase inhibitors, in this study, we present a novel Meta-learning-based inductive logistic matrix completion method for the Prediction of Kinase Inhibitors (MetaILMC). MetaILMC adopts a meta-learning framework to learn a well-generalized model from tasks with sufficient samples, which can quickly adapt to new tasks with limited samples. As MetaILMC allows the effective transfer of prior knowledge learned from kinases with sufficient samples to kinases with a small number of samples, the proposed model can produce accurate predictions for kinases with limited data. Experimental results show that MetaILMC has excellent performance on prediction tasks for kinases with few-shot samples and is significantly superior to state-of-the-art multi-task learning in terms of AUC, AUPR, and various other performance metrics. Case studies are also provided in which kinase inhibitory scores are predicted for two drugs, further validating the proposed method's effectiveness and feasibility.
SCIENTIFIC CONTRIBUTION: Considering the potential correlation between activity prediction tasks for different kinases, we propose a novel meta learning algorithm MetaILMC, which learns a prior of strong generalization capacity during meta-training from the tasks with sufficient training samples, such that it can be easily and quickly adapted to the new tasks of the kinase with scarce data during meta-testing. Thus, MetaILMC can effectively alleviate the data scarcity problem in the prediction of kinase inhibitors.

11.
Comput Biol Med ; 174: 108403, 2024 May.
Article in English | MEDLINE | ID: mdl-38582002

ABSTRACT

In recent years, emerging evidence has revealed a strong association between dysregulation of long non-coding RNAs (lncRNAs) and complex human diseases. Biological experiments are adequate to identify such associations, but they are costly and time-consuming. Therefore, developing high-quality computational methods is a challenging and urgent task in the field of bioinformatics. This paper proposes a new lncRNA-disease association inference approach, NFMCLDA (Network Fusion and Matrix Completion lncRNA-Disease Association), which can effectively integrate multi-source association data. In this approach, miRNA information is used as the transition path, and an unbalanced random walk method on a three-layer heterogeneous network is adopted in the preprocessing. As a result, more effective information between networks can be mined and the sparsity problem of the association matrix can be alleviated. Finally, the matrix completion method predicts the associations. The results show that NFMCLDA can provide more accurate lncRNA-disease associations than state-of-the-art methods. The areas under the receiver operating characteristic curves are 0.9648 and 0.9713 under 5-fold and 10-fold cross-validation, respectively. Data from published case studies on four diseases - lung cancer, osteosarcoma, cervical cancer, and colon cancer - have confirmed the reliable predictive potential of the NFMCLDA model.


Subjects
MicroRNAs, Long Noncoding RNA, Humans, Long Noncoding RNA/genetics, Long Noncoding RNA/metabolism, MicroRNAs/genetics, MicroRNAs/metabolism, Computational Biology/methods, Neoplasms/genetics, Genetic Predisposition to Disease/genetics, Female
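
Several records in this list, including this one, report the area under the receiver operating characteristic curve (AUC) from cross-validation, in which known associations are hidden, scored by the model, and ranked against unknown pairs. A small Python sketch of the AUC itself, computed as the fraction of positive-negative pairs that are ranked correctly, may make the metric concrete. The function name `roc_auc` and the toy scores are illustrative and do not come from any of the listed studies.

```python
import numpy as np


def roc_auc(scores, labels):
    """AUC as the probability that a positive outscores a negative.

    Ties between a positive and a negative count as half a correct pair.
    Requires at least one positive and one negative label.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]
    neg = scores[~labels]
    # Compare every positive score with every negative score.
    diff = pos[:, None] - neg[None, :]
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return wins / (pos.size * neg.size)
```

For example, with scores `[0.9, 0.8, 0.3, 0.1]` and labels `[1, 0, 1, 0]`, three of the four positive-negative pairs are ordered correctly, so the AUC is 0.75.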
12.
Environ Sci Technol ; 58(13): 5889-5898, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38501580

ABSTRACT

Human exposure to toxic chemicals presents a huge health burden. Key to understanding chemical toxicity is knowledge of the molecular target(s) of the chemicals. Because a comprehensive safety assessment for all chemicals is infeasible due to limited resources, a robust computational method for discovering targets of environmental exposures is a promising direction for public health research. In this study, we implemented a novel matrix completion algorithm named coupled matrix-matrix completion (CMMC) for predicting direct and indirect exposome-target interactions, which exploits the vast amount of accumulated data regarding chemical exposures and their molecular targets. Our approach achieved an AUC of 0.89 on a benchmark data set generated using data from the Comparative Toxicogenomics Database. Our case studies with bisphenol A and its analogues, PFAS, dioxins, PCBs, and VOCs show that CMMC can be used to accurately predict molecular targets of novel chemicals without any prior bioactivity knowledge. Our results demonstrate the feasibility and promise of computationally predicting environmental chemical-target interactions to efficiently prioritize chemicals in hazard identification and risk assessment.


Subjects
Dioxins, Polychlorinated Biphenyls, Humans, Environmental Exposure/analysis, Polychlorinated Biphenyls/analysis, Risk Assessment, Public Health
13.
Magn Reson Med ; 91(5): 1978-1993, 2024 May.
Article in English | MEDLINE | ID: mdl-38102776

ABSTRACT

PURPOSE: To propose a new reconstruction method for multidimensional MR fingerprinting (mdMRF) to address shading artifacts caused by physiological motion-induced measurement errors without navigating or gating. METHODS: The proposed method comprises two procedures: self-calibration and subspace reconstruction. The first procedure (self-calibration) applies temporally local matrix completion to reconstruct low-resolution images from a subset of under-sampled data extracted from the k-space center. The second procedure (subspace reconstruction) utilizes temporally global subspace reconstruction with pre-estimated temporal subspace from low-resolution images to reconstruct aliasing-free, high-resolution, and time-resolved images. After reconstruction, a customized outlier detection algorithm was employed to automatically detect and remove images corrupted by measurement errors. Feasibility, robustness, and scan efficiency were evaluated through in vivo human brain imaging experiments. RESULTS: The proposed method successfully reconstructed aliasing-free, high-resolution, and time-resolved images, where the measurement errors were accurately represented. The corrupted images were automatically and robustly detected and removed. Artifact-free T1, T2, and ADC maps were generated simultaneously. The proposed reconstruction method demonstrated robustness across different scanners, parameter settings, and subjects. A high scan efficiency of less than 20 s per slice has been achieved. CONCLUSION: The proposed reconstruction method can effectively alleviate shading artifacts caused by physiological motion-induced measurement errors. It enables simultaneous and artifact-free quantification of T1, T2, and ADC using mdMRF scans without prospective gating, with robustness and high scan efficiency.


Subjects
Computer-Assisted Image Processing, Magnetic Resonance Imaging, Humans, Computer-Assisted Image Processing/methods, Magnetic Resonance Imaging/methods, Brain/diagnostic imaging, Algorithms, Imaging Phantoms, Artifacts
14.
Res Sq ; 2023 Nov 28.
Article in English | MEDLINE | ID: mdl-38076799

ABSTRACT

Sparsity finds applications in diverse areas such as statistics, machine learning, and signal processing. Computations over sparse structures are less complex compared to their dense counterparts and need less storage. This paper proposes a heuristic method for retrieving sparse approximate solutions of optimization problems via minimizing the ℓp quasi-norm, where 0

15.
Methods ; 219: 102-110, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37804962

ABSTRACT

MOTIVATION: The outbreak of the human coronavirus (SARS-CoV-2) has placed a huge burden on public health and the world economy. Compared with de novo drug discovery, drug repurposing is a promising therapeutic strategy that facilitates rapid clinical treatment decisions, shortens the development process, and reduces costs. RESULTS: In this study, we propose a weighted hypergraph learning and adaptive inductive matrix completion method, WHAIMC, for predicting potential virus-drug associations. First, we integrate multi-source data to describe viruses and drugs from multiple perspectives, including drug chemical structures, drug targets, virus complete genome sequences, and virus-drug associations. Then, WHAIMC establishes an adaptive inductive matrix completion model to improve performance through adaptive learning of similarity relations. Finally, WHAIMC introduces weighted hypergraph learning into adaptive inductive matrix completion to capture higher-order relationships of viruses (or drugs). The results showed that WHAIMC had strong predictive performance for new virus-drug associations, new viruses, and new drugs. The case study further demonstrates that WHAIMC is highly effective for repositioning antiviral drugs against SARS-CoV-2 and provides a new perspective for virus-drug association prediction. The code and data in this study are freely available at https://github.com/Mayingjun20179/WHAIMC.


Subjects
COVID-19, SARS-CoV-2, Humans, Drug Repositioning/methods, Antiviral Agents/pharmacology, Antiviral Agents/therapeutic use, Drug Discovery
16.
Cell Rep Methods ; 3(8): 100540, 2023 Aug 28.
Article in English | MEDLINE | ID: mdl-37671020

ABSTRACT

A central challenge in biology is to use existing measurements to predict the outcomes of future experiments. For the rapidly evolving influenza virus, variants examined in one study will often have little to no overlap with other studies, making it difficult to discern patterns or unify datasets. We develop a computational framework that predicts how an antibody or serum would inhibit any variant from any other study. We validate this method using hemagglutination inhibition data from seven studies and predict 2,000,000 new values ± uncertainties. Our analysis quantifies the transferability between vaccination and infection studies in humans and ferrets, shows that serum potency is negatively correlated with breadth, and provides a tool for pandemic preparedness. In essence, this approach enables a shift in perspective when analyzing data from "what you see is what you get" into "what anyone sees is what everyone gets."


Subjects
Ferrets, Volatile Oils, Animals, Humans, Antibodies, Hemagglutination Inhibition Tests, Machine Learning
17.
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi ; 40(4): 778-783, 2023 Aug 25.
Article in Chinese | MEDLINE | ID: mdl-37666769

ABSTRACT

Single-cell transcriptome sequencing (scRNA-seq) can resolve the expression characteristics of cells in tissues with single-cell precision, enabling researchers to quantify cellular heterogeneity within populations with higher resolution, revealing potentially heterogeneous cell populations and the dynamics of complex tissues. However, the presence of a large number of technical zeros in scRNA-seq data will have an impact on downstream analysis of cell clustering, differential genes, cell annotation, and pseudotime, hindering the discovery of meaningful biological signals. The main idea to solve this problem is to make use of the potential correlation between cells and genes, and to impute the technical zeros through the observed data. Based on this, this paper reviewed the basic methods of imputing technical zeros in the scRNA-seq data and discussed the advantages and disadvantages of the existing methods. Finally, recommendations and perspectives on the use and development of the method were provided.


Subjects
Transcriptome, Cluster Analysis
18.
Brief Bioinform ; 24(5), 2023 Sep 20.
Article in English | MEDLINE | ID: mdl-37529921

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) has emerged as a powerful tool for uncovering cellular heterogeneity. However, the high costs associated with this technique have rendered it impractical for studying large patient cohorts. We introduce ENIGMA (Deconvolution based on Regularized Matrix Completion), a method that addresses this limitation through accurately deconvoluting bulk tissue RNA-seq data into a readout with cell-type resolution by leveraging information from scRNA-seq data. By employing a matrix completion strategy, ENIGMA minimizes the distance between the mixture transcriptome obtained with bulk sequencing and a weighted combination of cell-type-specific expression. This allows the quantification of cell-type proportions and reconstruction of cell-type-specific transcriptomes. To validate its performance, ENIGMA was tested on both simulated and real datasets, including disease-related tissues, demonstrating its ability in uncovering novel biological insights.


Subjects
Gene Expression Profiling, Transcriptome, Humans, Gene Expression Profiling/methods, Software, RNA-Seq/methods, RNA Sequence Analysis/methods
19.
Open Mind (Camb) ; 7: 197-220, 2023.
Article in English | MEDLINE | ID: mdl-37416068

ABSTRACT

Relational reasoning is a key component of fluid intelligence and an important predictor of academic achievement. Relational reasoning is commonly assessed using matrix completion tasks, in which participants see an incomplete matrix of items that vary on different dimensions and select a response that best completes the matrix based on the relations among items. Performance on such assessments increases dramatically across childhood into adulthood. However, despite widespread use, little is known about the strategies associated with good or poor matrix completion performance in childhood. This study examined the strategies children and adults use to solve matrix completion problems, how those strategies change with age, and whether children and adults adapt strategies to difficulty. We used eyetracking to infer matrix completion strategy use in 6- and 9-year-old children and adults. Across ages, scanning across matrix rows and columns predicted good overall performance, and quicker and higher rates of consulting potential answers predicted poor performance, indicating that optimal matrix completion strategies are similar across development. Indices of good strategy use increased across childhood. As problems increased in difficulty, children and adults increased their scanning of matrix rows and columns, and adults and 9-year-olds also shifted strategies to rely more on consulting potential answers. Adapting strategies to matrix difficulty, particularly increased scanning of rows and columns, was associated with good overall performance in both children and adults. These findings underscore the importance of both spontaneous and adaptive strategy use in individual differences in relational reasoning and its development.

20.
Brief Bioinform ; 24(5), 2023 Sep 20.
Article in English | MEDLINE | ID: mdl-37482409

ABSTRACT

Numerous biological studies have shown that considering disease-associated micro RNAs (miRNAs) as potential biomarkers or therapeutic targets offers new avenues for the diagnosis of complex diseases. Computational methods have gradually been introduced to reveal disease-related miRNAs. Considering that previous models have not fused sufficiently diverse similarities, that their inappropriate fusion methods may lead to poor quality of the comprehensive similarity network and that their results are often limited by insufficiently known associations, we propose a computational model called Generative Adversarial Matrix Completion Network based on Multi-source Data Fusion (GAMCNMDF) for miRNA-disease association prediction. We create a diverse network connecting miRNAs and diseases, which is then represented using a matrix. The main task of GAMCNMDF is to complete the matrix and obtain the predicted results. The main innovations of GAMCNMDF are reflected in two aspects: GAMCNMDF integrates diverse data sources and employs a nonlinear fusion approach to update the similarity networks of miRNAs and diseases. Also, some additional information is provided to GAMCNMDF in the form of a 'hint' so that GAMCNMDF can work successfully even when complete data are not available. Compared with other methods, the outcomes of 10-fold cross-validation on two distinct databases validate the superior performance of GAMCNMDF with statistically significant results. It is worth mentioning that we apply GAMCNMDF in the identification of underlying small molecule-related miRNAs, yielding outstanding performance results in this specific domain. In addition, two case studies about two important neoplasms show that GAMCNMDF is a promising prediction method.


Subjects
MicroRNAs, Neoplasms, Humans, MicroRNAs/genetics, Algorithms, Computational Biology/methods, Neoplasms/genetics, Genetic Databases, Genetic Predisposition to Disease