Results 1 - 20 of 26
1.
Bioinformatics ; 40(7)2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38913860

ABSTRACT

MOTIVATION: Drug repurposing is a viable solution for reducing the time and cost associated with drug development. However, thus far, the proposed drug repurposing approaches still need to meet expectations. Therefore, it is crucial to offer a systematic approach for drug repurposing to achieve cost savings and enhance human lives. In recent years, using biological network-based methods for drug repurposing has generated promising results. Nevertheless, these methods have limitations. Primarily, the scope of these methods is generally limited concerning the size and variety of data they can effectively handle. Another issue arises from the treatment of heterogeneous data, which needs to be addressed or converted into homogeneous data, leading to a loss of information. A significant drawback is that most of these approaches lack end-to-end functionality, necessitating manual implementation and expert knowledge in certain stages. RESULTS: We propose a new solution, Heterogeneous Graph Transformer for Drug Repurposing (HGTDR), to address the challenges associated with drug repurposing. HGTDR is a three-step approach for knowledge graph-based drug repurposing: (1) constructing a heterogeneous knowledge graph, (2) utilizing a heterogeneous graph transformer network, and (3) computing relationship scores using a fully connected network. By leveraging HGTDR, users gain the ability to manipulate input graphs, extract information from diverse entities, and obtain their desired output. In the evaluation step, we demonstrate that HGTDR performs comparably to previous methods. Furthermore, we review medical studies to validate our method's top 10 drug repurposing suggestions, which have exhibited promising results. We also demonstrated HGTDR's capability to predict other types of relations through numerical and experimental validation, such as drug-protein and disease-protein inter-relations. 
AVAILABILITY AND IMPLEMENTATION: The source code and data are available at https://github.com/bcb-sut/HGTDR and http://git.dml.ir/BCB/HGTDR.
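
Step (3) of the approach described above, scoring a candidate relation with a fully connected network, can be illustrated with a minimal sketch. It assumes that node embeddings have already been produced by the graph transformer. The function name `score_pair`, the toy embeddings, and the hand-picked weights are hypothetical and are not taken from the HGTDR repository.

```python
import math

def relu(values):
    return [max(0.0, v) for v in values]

def dense(values, weights, bias):
    # weights holds one row per output unit
    return [sum(w * v for w, v in zip(row, values)) + b
            for row, b in zip(weights, bias)]

def score_pair(drug_emb, disease_emb, w1, b1, w2, b2):
    """Concatenate two embeddings and map them to a score in (0, 1)."""
    x = drug_emb + disease_emb          # list concatenation
    hidden = relu(dense(x, w1, b1))
    logit = dense(hidden, w2, b2)[0]
    return 1.0 / (1.0 + math.exp(-logit))

# Toy values, chosen by hand for illustration only (not trained weights).
drug = [0.2, -0.1]
disease = [0.4, 0.3]
W1 = [[0.5, -0.2, 0.1, 0.3], [-0.3, 0.8, 0.2, -0.1]]
B1 = [0.0, 0.1]
W2 = [[1.0, -1.0]]
B2 = [0.0]
score = score_pair(drug, disease, W1, B1, W2, B2)
```

In a trained model the weights would be learned from known drug-disease links, and candidate pairs would be ranked by this score.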


Subjects
Drug Repositioning , Drug Repositioning/methods , Humans , Algorithms , Computational Biology/methods , Software
2.
Heliyon ; 9(11): e21965, 2023 Nov.
Article in English | MEDLINE | ID: mdl-38058649

ABSTRACT

Purpose: The rapid spread of the COVID-19 omicron variant virus has resulted in an overload of hospitals around the globe. As a result, many patients are deprived of hospital facilities, increasing mortality rates. Mortality rates can be reduced by efficiently assigning facilities to higher-risk patients. Therefore, it is crucial to estimate patients' survival probability based on their conditions at the time of admission so that the minimum required facilities can be provided, allowing more opportunities to be available for those who need them. Although radiologic findings in chest computerized tomography scans show various patterns, considering the individual risk factors and other underlying diseases, it is difficult to predict patient prognosis through routine clinical or statistical analysis. Method: In this study, a deep neural network model is proposed for predicting survival based on simple clinical features, blood tests, axial computerized tomography scan images of lungs, and the patients' planned treatment. The model's architecture combines a Convolutional Neural Network and a Long Short Term Memory network. The model was trained using 390 survivors and 108 deceased patients from the Rasoul Akram Hospital and evaluated on 109 surviving and 36 deceased patients infected by the omicron variant. Results: The proposed model reached an accuracy of 87.5% on the test data, indicating that survival prediction is feasible. The accuracy was significantly higher than the accuracy achieved by classical machine learning methods without considering computerized tomography scan images (p-value <= 4E-5). The images were also replaced with hand-crafted features related to the ratio of infected lung lobes used in classical machine-learning models. The highest-performing classical model reached an accuracy of 84.5%, which was considerably higher than the models trained on mere clinical information (p-value <= 0.006). However, the performance was still significantly lower than that of the deep model (p-value <= 0.016). Conclusion: The proposed deep model achieved a higher accuracy than classical machine learning methods trained on features other than computerized tomography scan images. This indicates that the images contain additional information. Meanwhile, Artificial Intelligence methods with multimodal inputs can be more reliable and accurate than computerized tomography severity scores.

3.
BMC Health Serv Res ; 23(1): 1416, 2023 Dec 15.
Article in English | MEDLINE | ID: mdl-38102620

ABSTRACT

BACKGROUND: Policymakers require precise and in-time information to make informed decisions in complex environments such as health systems. Artificial intelligence (AI) is a novel approach that makes collecting and analyzing data in complex systems more accessible. This study highlights recent research on AI's application and capabilities in health policymaking. METHODS: We searched PubMed, Scopus, and the Web of Science databases to find relevant studies from 2000 to 2023, using the keywords "artificial intelligence" and "policymaking." We used Walt and Gilson's policy triangle framework for charting the data. RESULTS: The results revealed that using AI in health policy paved the way for novel analyses and innovative solutions for intelligent decision-making and data collection, potentially enhancing policymaking capacities, particularly in the evaluation phase. It can also be employed to create innovative agendas with fewer political constraints and greater rationality, resulting in evidence-based policies. By creating new platforms and toolkits, AI also offers the chance to make judgments based on solid facts. The majority of the proposed AI solutions for health policy aim to improve decision-making rather than replace experts. CONCLUSION: Numerous approaches exist for AI to influence the health policymaking process. Health systems can benefit from AI's potential to foster the meaningful use of evidence-based policymaking.


Subjects
Artificial Intelligence , Health Policy , Humans , Policy Making , Medical Assistance
4.
Cost Eff Resour Alloc ; 21(1): 83, 2023 Nov 06.
Article in English | MEDLINE | ID: mdl-37932778

ABSTRACT

INTRODUCTION: Artificial Intelligence (AI) represents a significant advancement in technology, and it is crucial for policymakers to incorporate AI thinking into policies and to fully explore, analyze and utilize massive data and conduct AI-related policies. AI has the potential to optimize healthcare financing systems. This study provides an overview of the AI application domains in healthcare financing. METHOD: We conducted a scoping review in six steps: formulating research questions, identifying relevant studies by conducting a comprehensive literature search using appropriate keywords, screening titles and abstracts for relevance, reviewing full texts of relevant articles, charting extracted data, and compiling and summarizing findings. Specifically, the research question sought to identify the applications of artificial intelligence in health financing supported by the published literature and explore potential future applications. PubMed, Scopus, and Web of Science databases were searched between 2000 and 2023. RESULTS: We discovered that AI has a significant impact on various aspects of health financing, such as governance, revenue raising, pooling, and strategic purchasing. We provide evidence-based recommendations for establishing and improving the health financing system based on AI. CONCLUSIONS: To ensure that vulnerable groups face minimum challenges and benefit from improved health financing, we urge national and international institutions worldwide to use and adopt AI tools and applications.

5.
BioData Min ; 16(1): 31, 2023 Oct 31.
Article in English | MEDLINE | ID: mdl-37904172

ABSTRACT

BACKGROUND: The governance of health systems is complex in nature due to several intertwined and multi-dimensional factors contributing to it. Recent challenges of health systems reflect the need for innovative approaches that can minimize adverse consequences of policies. Hence, there is compelling evidence of a distinct outlook on the health ecosystem using artificial intelligence (AI). Therefore, this study aimed to investigate the roles of AI and its applications in health system governance through an interpretive scoping review of current evidence. METHOD: This study intended to offer a research agenda and framework for the applications of AI in health systems governance. To include shreds of evidence with a greater focus on the application of AI in health governance from different perspectives, we searched the published literature from 2000 to 2023 through PubMed, Scopus, and Web of Science Databases. RESULTS: Our findings showed that integrating AI capabilities into health systems governance has the potential to influence three cardinal dimensions of health. These include social determinants of health, elements of governance, and health system tasks and goals. AI paves the way for strengthening the health system's governance through various aspects, i.e., intelligence innovations, flexible boundaries, multidimensional analysis, new insights, and cognition modifications to the health ecosystem area. CONCLUSION: AI is expected to be seen as a tool with new applications and capabilities, with the potential to change each component of governance in the health ecosystem, which can eventually help achieve health-related goals.

6.
PLoS Comput Biol ; 19(7): e1011249, 2023 07.
Article in English | MEDLINE | ID: mdl-37486921

ABSTRACT

The genetic etiology of brain disorders is highly heterogeneous, characterized by abnormalities in the development of the central nervous system that lead to diminished physical or intellectual capabilities. The process of determining which gene drives disease, known as "gene prioritization," is not entirely understood. Genome-wide searches for gene-disease associations are still underdeveloped due to reliance on previous discoveries and evidence sources with false positive or negative relations. This paper introduces DeepGenePrior, a model based on deep neural networks that prioritizes candidate genes in genetic diseases. Using the well-studied Variational AutoEncoder (VAE), we developed a score to measure the impact of genes on target diseases. Unlike other methods that use prior data to select candidate genes, based on the "guilt by association" principle and auxiliary data sources like protein networks, our study exclusively employs copy number variants (CNVs) for gene prioritization. By analyzing CNVs from 74,811 individuals with autism, schizophrenia, and developmental delay, we identified genes that best distinguish cases from controls. Our findings indicate a 12% increase in fold enrichment in brain-expressed genes compared to previous studies and a 15% increase in genes associated with mouse nervous system phenotypes. Furthermore, we identified common deletions in ZDHHC8, DGCR5, and CATG00000022283 among the top genes related to all three disorders, suggesting a common etiology among these clinically distinct conditions. DeepGenePrior is publicly available online at http://git.dml.ir/z_rahaie/DGP to address obstacles in existing gene prioritization studies identifying candidate genes.


Subjects
Autistic Disorder , Deep Learning , Animals , Mice , DNA Copy Number Variations/genetics , Autistic Disorder/genetics , Brain , Genetic Predisposition to Disease/genetics
7.
Soc Netw Anal Min ; 13(1): 60, 2023.
Article in English | MEDLINE | ID: mdl-37033472

ABSTRACT

Recent studies in network science and control have shown a meaningful relationship between epidemic processes (e.g., COVID-19 spread) and some network properties. This paper studies how such network properties, namely clustering coefficient and centrality measures (or node influence metrics), affect the spread of viruses and the growth of epidemics over scale-free networks. The results can be used to target individuals (the nodes in the network) to flatten the infection curve. Flattening the infection curve reduces health service costs and the burden on authorities/governments. Our Monte-Carlo simulation results show that the infection curve is, in general, easier to flatten in clustered networks, i.e., with the same connectivity and the same number of isolated individuals they result in more flattened curves. Moreover, distance-based centrality measures, which target the nodes based on their average network distance to other nodes (and not the node degrees), are better choices for targeting individuals for isolation/vaccination.
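
A distance-based centrality measure of the kind mentioned above can be sketched with closeness centrality, computed here by breadth-first search on an unweighted graph. This is an illustrative sketch only: the helper names and the toy path graph are assumptions, and the paper's simulations on scale-free networks are not reproduced.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness(adj):
    """Closeness = (number of reachable nodes) / (sum of distances to them)."""
    scores = {}
    for node in adj:
        dist = bfs_distances(adj, node)
        total = sum(dist.values())
        scores[node] = (len(dist) - 1) / total if total > 0 else 0.0
    return scores

def nodes_to_isolate(adj, k):
    """Pick the k nodes with the highest closeness (ties broken by node id)."""
    scores = closeness(adj)
    return sorted(adj, key=lambda n: (-scores[n], n))[:k]

# Toy path graph 0-1-2-3-4: the middle node is closest to all others.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
targets = nodes_to_isolate(path, 1)
```

Note that in this path graph nodes 1, 2, and 3 all have degree 2, so a degree-based ranking could not single out the middle node, whereas closeness does.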

8.
PLoS One ; 17(11): e0277887, 2022.
Article in English | MEDLINE | ID: mdl-36409705

ABSTRACT

Deep learning-based graph generation approaches have remarkable capacities for graph data modeling, allowing them to solve a wide range of real-world problems. Making these methods able to consider different conditions during the generation procedure even increases their effectiveness by empowering them to generate new graph samples that meet the desired criteria. This paper presents a conditional deep graph generation method called SCGG that considers a particular type of structural conditions. Specifically, our proposed SCGG model takes an initial subgraph and autoregressively generates new nodes and their corresponding edges on top of the given conditioning substructure. The architecture of SCGG consists of a graph representation learning network and an autoregressive generative model, which is trained end-to-end. More precisely, the graph representation learning network is designed to compute continuous representations for each node in a graph, which are not only affected by the features of adjacent nodes, but also by the ones of farther nodes. This network is primarily responsible for providing the generation procedure with the structural condition, while the autoregressive generative model mainly maintains the generation history. Using this model, we can address graph completion, a rampant and inherently difficult problem of recovering missing nodes and their associated edges of partially observed graphs. The computational complexity of the SCGG method is shown to be linear in the number of graph nodes. Experimental results on both synthetic and real-world datasets demonstrate the superiority of our method compared with state-of-the-art baselines.


Subjects
Maintenance , Structural Models
10.
BMC Bioinformatics ; 23(1): 331, 2022 Aug 11.
Article in English | MEDLINE | ID: mdl-35953785

ABSTRACT

BACKGROUND: Several types of RNA in the cell are usually involved in biological processes with multiple functions. Coding RNAs code for proteins while non-coding RNAs regulate gene expression. Some single-strand RNAs can create a circular shape via the back splicing process and convert into a new type called circular RNA (circRNA). circRNAs are among the essential non-coding RNAs in the cell and are involved in multiple disorders. One of the critical functions of circRNAs is to regulate the expression of other genes through sponging micro RNAs (miRNAs) in diseases. This mechanism, known as the competing endogenous RNA (ceRNA) hypothesis, and additional information obtained from biological datasets can be used by computational approaches to predict novel associations between disease and circRNAs. RESULTS: We applied multiple classifiers to validate the extracted features from the heterogeneous network and selected the most appropriate one based on some evaluation criteria. Then, XGBoost is utilized in our pipeline to generate a novel approach, called CircWalk, to predict circRNA-disease associations. Our results demonstrate that CircWalk has reasonable accuracy and AUC compared with other state-of-the-art algorithms. We also use CircWalk to predict novel circRNAs associated with lung, gastric, and colorectal cancers as a case study. The results show that our approach can accurately detect novel circRNAs related to these diseases. CONCLUSIONS: Considering the ceRNA hypothesis, we integrate multiple resources to construct a heterogeneous network from circRNAs, mRNAs, miRNAs, and diseases. Next, the DeepWalk algorithm is applied to the network to extract feature vectors for circRNAs and diseases. The extracted features are used to learn a classifier and generate a model to predict novel circRNA-disease associations. Our approach uses the concept of the ceRNA hypothesis and the miRNA sponge effect of circRNAs to predict their associations with diseases. Our results show that this outlook could help identify circRNA-disease associations more accurately.
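
The first stage of DeepWalk, generating truncated random walks over the network, can be sketched as follows. The node names and the tiny network are hypothetical, and the second stage (training a skip-gram model on the walks to obtain feature vectors) is deliberately omitted. This is not code from the CircWalk pipeline.

```python
import random

def random_walk(adj, start, length, rng):
    """Uniform random walk of at most `length` nodes, starting at `start`."""
    walk = [start]
    while len(walk) < length:
        neighbours = adj[walk[-1]]
        if not neighbours:
            break                      # dead end: stop the walk early
        walk.append(rng.choice(neighbours))
    return walk

def generate_walks(adj, walks_per_node, length, seed=0):
    """Start `walks_per_node` walks from every node, in shuffled order."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(adj)
        rng.shuffle(nodes)
        for node in nodes:
            walks.append(random_walk(adj, node, length, rng))
    return walks

# Hypothetical heterogeneous network: circRNA, miRNA, mRNA and disease nodes.
network = {
    "circ_A": ["miR_1"],
    "miR_1": ["circ_A", "mRNA_X", "disease_D"],
    "mRNA_X": ["miR_1"],
    "disease_D": ["miR_1"],
}
walks = generate_walks(network, walks_per_node=2, length=5)
```

Each walk is then treated like a sentence of node "words", so nodes that co-occur in walks end up with similar embeddings.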


Subjects
MicroRNAs , Circular RNA , Gene Expression Profiling/methods , Gene Ontology , MicroRNAs/genetics , MicroRNAs/metabolism , Messenger RNA/genetics
11.
BMC Bioinformatics ; 23(1): 298, 2022 Jul 25.
Article in English | MEDLINE | ID: mdl-35879674

ABSTRACT

BACKGROUND: The advent of high throughput sequencing has enabled researchers to systematically evaluate the genetic variations in cancer, identifying many cancer-associated genes. Although cancers in the same tissue are widely categorized in the same group, they demonstrate many differences concerning their mutational profiles. Hence, there is no definitive treatment for most cancer types. This reveals the importance of developing new pipelines to identify cancer-associated genes accurately and re-classify patients with similar mutational profiles. Classification of cancer patients with similar mutational profiles may help discover subtypes of cancer patients who might benefit from specific treatment types. RESULTS: In this study, we propose a new machine learning pipeline to identify protein-coding genes mutated in many samples to identify cancer subtypes. We apply our pipeline to 12,270 samples collected from the international cancer genome consortium, covering 19 cancer types. As a result, we identify 17 different cancer subtypes. Comprehensive phenotypic and genotypic analysis indicates distinguishable properties, including unique cancer-related signaling pathways. CONCLUSIONS: This new subtyping approach offers a novel opportunity for cancer drug development based on the mutational profile of patients. Additionally, we analyze the mutational signatures for samples in each subtype, which provides important insight into their active molecular mechanisms. Some of the pathways we identified in most subtypes, including the cell cycle and the Axon guidance pathways, are frequently observed in cancer disease. Interestingly, we also identified several mutated genes and different rates of mutation in multiple cancer subtypes. In addition, our study on "gene-motif" suggests the importance of considering both the context of the mutations and mutational processes in identifying cancer-associated genes. 
The source code for our proposed clustering pipeline and analysis is publicly available at: https://github.com/bcb-sut/Pan-Cancer.


Subjects
Neoplasms , Point Mutation , Cluster Analysis , Human Genome , Humans , Mutation , Neoplasms/genetics
12.
Commun Biol ; 5(1): 556, 2022 06 07.
Article in English | MEDLINE | ID: mdl-35672401

ABSTRACT

Non-coding RNAs (ncRNAs) form a large portion of the mammalian genome. However, their biological functions are poorly characterized in cancers. In this study, using a newly developed tool, SomaGene, we analyze de novo somatic point mutations from the International Cancer Genome Consortium (ICGC) whole-genome sequencing data of 1,855 breast cancer samples. We identify 1030 candidates of ncRNAs that are significantly and explicitly mutated in breast cancer samples. By integrating data from the ENCODE regulatory features and FANTOM5 expression atlas, we show that the candidate ncRNAs significantly enrich active chromatin histone marks (1.9 times), CTCF binding sites (2.45 times), DNase accessibility (1.76 times), HMM predicted enhancers (2.26 times) and eQTL polymorphisms (1.77 times). Importantly, we show that the 1030 ncRNAs contain a much higher level (3.64 times) of breast cancer-associated genome-wide association (GWAS) single nucleotide polymorphisms (SNPs) than genome-wide expectation. Such enrichment has not been seen with GWAS SNPs from other cancers. Using breast cell line related Hi-C data, we then show that 82% of our candidate ncRNAs (1.9 times) significantly interact with the promoter of protein-coding genes, including previously known cancer-associated genes, suggesting the critical role of candidate ncRNA genes in the activation of essential regulators of development and differentiation in breast cancer. We provide an extensive web-based resource ( https://www.ihealthe.unsw.edu.au/research ) to communicate our results with the research community. Our list of breast cancer-specific ncRNA genes has the potential to provide a better understanding of the underlying genetic causes of breast cancer. Lastly, the tool developed in this study can be used to analyze somatic mutations in all cancers.


Subjects
Breast Neoplasms , Genome-Wide Association Study , Breast Neoplasms/genetics , Female , Genome-Wide Association Study/methods , Humans , Point Mutation , Single Nucleotide Polymorphism , Untranslated RNA/genetics
13.
PLoS Comput Biol ; 18(6): e1010241, 2022 06.
Article in English | MEDLINE | ID: mdl-35749574

ABSTRACT

Hi-C is a genome-wide chromosome conformation capture technology that detects interactions between pairs of genomic regions and exploits higher order chromatin structures. Conceptually Hi-C data counts interaction frequencies between every position in the genome and every other position. Biologically functional interactions are expected to occur more frequently than transient background and artefactual interactions. To identify biologically relevant interactions, several background models that take biases such as distance, GC content and mappability into account have been proposed. Here we introduce MaxHiC, a background correction tool that deals with these complex biases and robustly identifies statistically significant interactions in both Hi-C and capture Hi-C experiments. MaxHiC uses a negative binomial distribution model and a maximum likelihood technique to correct biases in both Hi-C and capture Hi-C libraries. We systematically benchmark MaxHiC against major Hi-C background correction tools including Hi-C significant interaction callers (SIC) and Hi-C loop callers using published Hi-C, capture Hi-C, and Micro-C datasets. Our results demonstrate that 1) Interacting regions identified by MaxHiC have significantly greater levels of overlap with known regulatory features (e.g. active chromatin histone marks, CTCF binding sites, DNase sensitivity) and also disease-associated genome-wide association SNPs than those identified by currently existing models, 2) the pairs of interacting regions are more likely to be linked by eQTL pairs and 3) more likely to link known regulatory features including known functional enhancer-promoter pairs validated by CRISPRi than any of the existing methods. We also demonstrate that interactions between different genomic region types have distinct distance distributions only revealed by MaxHiC. MaxHiC is publicly available as a python package for the analysis of Hi-C, capture Hi-C and Micro-C data.
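
The idea of testing an observed interaction count against a negative binomial background can be sketched as below. This is not the MaxHiC API. The expected count `mu` and dispersion `r` are taken as given here, whereas the abstract describes fitting the model by maximum likelihood while accounting for biases, and the numeric values are invented for illustration.

```python
import math

def nb_log_pmf(k, mu, r):
    """Log-probability of count k under NB with mean mu > 0 and dispersion r > 0."""
    p = r / (r + mu)                   # "success" probability
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(p) + k * math.log(1.0 - p))

def nb_p_value(observed, mu, r):
    """Upper-tail probability P(X >= observed) under the background model."""
    below = sum(math.exp(nb_log_pmf(k, mu, r)) for k in range(observed))
    return max(0.0, 1.0 - below)

# Hypothetical background: 2 reads expected between two bins, dispersion 5.
expected_reads, dispersion = 2.0, 5.0
p_modest = nb_p_value(3, expected_reads, dispersion)
p_strong = nb_p_value(15, expected_reads, dispersion)
```

A pair of regions with far more reads than the background predicts gets a small p-value. In a genome-wide analysis such p-values would still need multiple-testing correction.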


Subjects
Chromatin , Genome-Wide Association Study , Binding Sites , Chromatin/genetics , Genome , Genomics/methods
14.
BMC Bioinformatics ; 23(1): 138, 2022 Apr 19.
Article in English | MEDLINE | ID: mdl-35439935

ABSTRACT

BACKGROUND: Colorectal cancer (CRC) is one of the leading causes of cancer-related deaths worldwide. Recent studies have observed causative mutations in susceptible genes related to colorectal cancer in 10 to 15% of the patients. This highlights the importance of identifying mutations for early detection of this cancer for more effective treatments among high risk individuals. Mutation is considered a key factor in cancer research. Many studies have performed cancer subtyping based on the type of frequently mutated genes, or the proportion of mutational processes. However, to the best of our knowledge, these features have never been used in combination for this task. This highlights the potential to introduce better and more inclusive subtype classification approaches using a wider range of related features to enable biomarker discovery and thus inform drug development for CRC. RESULTS: In this study, we develop a new pipeline based on a novel concept called 'gene-motif', which merges mutated gene information with the tri-nucleotide motif of mutated sites, for colorectal cancer subtype identification. We apply our pipeline to the International Cancer Genome Consortium (ICGC) CRC samples and identify, for the first time, 3131 gene-motif combinations that are significantly mutated in 536 ICGC colorectal cancer samples. Using these features, we identify seven CRC subtypes with distinguishable phenotypes and biomarkers, including unique cancer related signaling pathways, for most of which targeted treatment options are currently available. Interestingly, we also identify several genes that are mutated in multiple subtypes but with unique sequence contexts. CONCLUSION: Our results highlight the importance of considering both the mutation type and mutated genes in identification of cancer subtypes and cancer biomarkers.
The new CRC subtypes presented in this study demonstrate distinct phenotypic properties which can be effectively used to develop new treatments. By knowing the genes and phenotypes associated with the subtypes, a personalized treatment plan can be developed that considers the specific phenotypes associated with their genomic lesion.
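
The 'gene-motif' concept, a mutated gene combined with the tri-nucleotide context of the mutated site, can be illustrated with a small sketch. The label format, the function name `gene_motif`, the placeholder gene name `GENE1`, and the toy sequence are all assumptions made for illustration. The paper's actual feature encoding may differ.

```python
def gene_motif(gene, ref_seq, pos, alt):
    """Build a label such as 'GENE1:A[C>T]G' for a substitution.

    `pos` is the 0-based index of the mutated base in `ref_seq`;
    one flanking base on each side is required.
    """
    if pos < 1 or pos > len(ref_seq) - 2:
        raise ValueError("need one flanking base on each side")
    ref = ref_seq[pos]
    if alt == ref:
        raise ValueError("alternate base equals reference base")
    return f"{gene}:{ref_seq[pos - 1]}[{ref}>{alt}]{ref_seq[pos + 1]}"

# Hypothetical example: C>T at index 2 of the toy sequence TACGT.
feature = gene_motif("GENE1", "TACGT", 2, "T")
```

Two mutations in the same gene but in different sequence contexts yield different labels, which is what lets a gene appear in several subtypes with distinct contexts.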


Subjects
Colorectal Neoplasms , Tumor Biomarkers/genetics , Colorectal Neoplasms/genetics , Colorectal Neoplasms/pathology , Genomics , Humans , Mutation , Phenotype
15.
Med Image Anal ; 75: 102272, 2022 01.
Article in English | MEDLINE | ID: mdl-34731774

ABSTRACT

Disease prediction is a well-known classification problem in medical applications. Graph Convolutional Networks (GCNs) provide a powerful tool for analyzing the patients' features relative to each other. This can be achieved by modeling the problem as a graph node classification task, where each node is a patient. Due to the nature of such medical datasets, class imbalance is a prevalent issue in the field of disease prediction, where the distribution of classes is skewed. When the class imbalance is present in the data, the existing graph-based classifiers tend to be biased towards the major class(es) and neglect the samples in the minor class(es). On the other hand, the correct diagnosis of the rare positive cases (true-positives) among all the patients is vital in a healthcare system. In conventional methods, such imbalance is tackled by assigning appropriate weights to classes in the loss function which is still dependent on the relative values of weights, sensitive to outliers, and in some cases biased towards the minor class(es). In this paper, we propose a Re-weighted Adversarial Graph Convolutional Network (RA-GCN) to prevent the graph-based classifier from emphasizing the samples of any particular class. This is accomplished by associating a graph-based neural network to each class, which is responsible for weighting the class samples and changing the importance of each sample for the classifier. Therefore, the classifier adjusts itself and determines the boundary between classes with more attention to the important samples. The parameters of the classifier and weighting networks are trained by an adversarial approach. We show experiments on synthetic and three publicly available medical datasets. Our results demonstrate the superiority of RA-GCN compared to recent methods in identifying the patient's status on all three datasets. The detailed analysis of our method is provided as quantitative and qualitative experiments on synthetic datasets.
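
The conventional remedy mentioned above, assigning class weights in the loss function, can be sketched with inverse-frequency weights and a weighted cross-entropy. This illustrates the baseline that the abstract contrasts with, not RA-GCN itself. The weighting formula is one common choice, and the toy labels and probabilities are invented.

```python
import math
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, c = len(labels), len(counts)
    return {cls: n / (c * cnt) for cls, cnt in counts.items()}

def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class); probs[i][cls] is a probability."""
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    norm = sum(weights[y] for y in labels)
    return total / norm

# Toy imbalanced data: three samples of class 0, one of class 1.
labels = [0, 0, 0, 1]
weights = inverse_frequency_weights(labels)
uniform = [[0.5, 0.5]] * 4
loss = weighted_cross_entropy(uniform, labels, weights)
```

Here the single minority sample carries weight 2.0 against 2/3 for each majority sample, so the result depends directly on the relative weight values, the sensitivity that the abstract points out.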


Subjects
Computer Neural Networks , Humans
16.
Neural Netw ; 144: 726-736, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34678569

ABSTRACT

Autoencoders have recently been widely employed to approach the novelty detection problem. Trained only on the normal data, the AE is expected to reconstruct the normal data effectively while failing to regenerate the anomalous data. Based on this assumption, one could utilize the AE for novelty detection. However, it is known that this assumption does not always hold. Such an AE can often perfectly reconstruct the anomalous data due to modeling low-level and generic features in the input. We propose a novel training algorithm for the AE that facilitates learning more semantically meaningful features to address this problem. For this purpose, we exploit the fact that adversarial robustness promotes the learning of significant features. Therefore, we force the AE to learn such features by making its bottleneck layer more stable against adversarial perturbations. This idea is general and can be applied to other autoencoder-based approaches as well. We show that despite using a much simpler architecture than the prior methods, the proposed AE outperforms or is competitive to the state-of-the-art on four benchmark datasets and two medical datasets.


Subjects
Algorithms , Benchmarking
17.
Cancers (Basel) ; 13(17)2021 Aug 30.
Article in English | MEDLINE | ID: mdl-34503185

ABSTRACT

It is now known that at least 10% of samples with pancreatic cancers (PC) contain a causative mutation in the known susceptibility genes, suggesting the importance of identifying cancer-associated genes that carry the causative mutations in high-risk individuals for early detection of PC. In this study, we develop a statistical pipeline using a new concept, called gene-motif, that utilizes both mutated genes and mutational processes to identify 4211 3-nucleotide PC-associated gene-motifs within 203 significantly mutated genes in PC. Using these gene-motifs as distinguishable features for pancreatic cancer subtyping results in identifying five PC subtypes with distinguishable phenotypes and genotypes. Our comprehensive biological characterization reveals that these PC subtypes are associated with different molecular mechanisms including unique cancer related signaling pathways, in which for most of the subtypes targeted treatment options are currently available. Some of the pathways we identified in all five PC subtypes, including cell cycle and the Axon guidance pathway are frequently seen and mutated in cancer. We also identified Protein kinase C, EGFR (epidermal growth factor receptor) signaling pathway and P53 signaling pathways as potential targets for treatment of the PC subtypes. Altogether, our results uncover the importance of considering both the mutation type and mutated genes in the identification of cancer subtypes and biomarkers.

18.
PLoS One ; 16(2): e0244430, 2021.
Article in English | MEDLINE | ID: mdl-33630862

ABSTRACT

Understanding the functionality of proteins has emerged as a critical problem in recent years due to the significant roles of these macro-molecules in biological mechanisms. However, in-laboratory techniques for protein function prediction are not as efficient as methods developed and processed for protein sequencing. While more than 70 million protein sequences are available today, the functionality of only around one percent of them is known. These facts have encouraged researchers to develop computational methods to infer protein functionalities from their sequences. Gene Ontology is the most well-known database for protein functions, which has a hierarchical structure where deeper terms are more determinative and specific. However, the lack of experimentally approved annotations for these specific terms limits the performance of computational methods applied to them. In this work, we propose a method to improve protein function prediction using their sequences by deeply extracting relationships between Gene Ontology terms. To this end, we construct a conditional generative adversarial network which helps to effectively discover and incorporate term correlations in the annotation process. In addition to the baseline algorithms, we compare our method with two recently proposed deep techniques that attempt to utilize Gene Ontology term correlations. Our results confirm the superiority of the proposed method compared to the previous works. Moreover, we demonstrate how our model can effectively help to assign more specific terms to sequences.


Subjects
Gene Ontology , Gene Regulatory Networks , Proteins/metabolism , Computational Biology/methods , Humans , Protein Sequence Analysis/methods
19.
Bioinformatics ; 37(10): 1345-1351, 2021 06 16.
Article in English | MEDLINE | ID: mdl-33226074

ABSTRACT

MOTIVATION: Single-cell RNA-sequencing (scRNA-seq) offers the opportunity to dissect heterogeneous cellular compositions and interrogate cell-type-specific gene expression patterns across diverse conditions. However, batch effects such as laboratory conditions and individual variability hinder its usage in cross-condition designs. RESULTS: Here, we present a single-cell Generative Adversarial Network (scGAN) to simultaneously acquire patterns from raw data while minimizing the confounding effect driven by technical artifacts or other factors inherent to the data. Specifically, scGAN models the data likelihood of the raw scRNA-seq counts by projecting each cell onto a latent embedding. Meanwhile, scGAN attempts to minimize the correlation between the latent embeddings and the batch labels across all cells. We demonstrate scGAN on three public scRNA-seq datasets and show that our method confers superior performance over the state-of-the-art methods in forming clusters of known cell types and identifying known psychiatric genes that are associated with major depressive disorder. AVAILABILITY AND IMPLEMENTATION: The scGAN code and the information for the public scRNA-seq datasets are available at https://github.com/li-lab-mcgill/singlecell-deepfeature. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Major Depressive Disorder , Single-Cell Analysis , Gene Expression Profiling , Humans , RNA Sequence Analysis , Transcriptome
20.
Sci Rep ; 10(1): 1286, 2020 Jan 28.
Article in English | MEDLINE | ID: mdl-31992766

ABSTRACT

Analysis of cancer mutational signatures has been instrumental in identifying the endogenous and exogenous molecular processes responsible for cancer. The quantitative approach used to deconvolute mutational signatures is becoming an integral part of cancer research. Therefore, development of a stand-alone tool with a user-friendly interface for analysis of cancer mutational signatures is necessary. In this manuscript, we introduce CANCERSIGN, which enables users to identify 3-mer and 5-mer mutational signatures within whole genome, whole exome or pooled samples. Additionally, this tool enables users to perform clustering on tumor samples based on the proportion of mutational signatures in each sample. Using CANCERSIGN, we analysed all the whole genome somatic mutation datasets profiled by the International Cancer Genome Consortium (ICGC) and identified a number of novel signatures. By examining signatures found in exonic and non-exonic regions of the genome using WGS, and comparing these to signatures found in WES data, we observe that WGS can identify additional non-exonic signatures that are enriched in the non-coding regions of the genome, while the deeper sequencing of WES may help identify weak signatures that are otherwise missed in shallower WGS data.


Subjects
Nucleic Acid Databases , Exome , Human Genome , Mutation , Neoplasms/genetics , Software , Animals , Humans