Results 1 - 20 of 43
1.
Bioinformatics ; 40(Suppl 1): i169-i179, 2024 06 28.
Article in English | MEDLINE | ID: mdl-38940180

ABSTRACT

MOTIVATION: Electronic health records (EHRs) represent a comprehensive resource of a patient's medical history. EHRs are essential for utilizing advanced technologies such as deep learning (DL), enabling healthcare providers to analyze extensive data, extract valuable insights, and make precise and data-driven clinical decisions. DL methods such as recurrent neural networks (RNN) have been utilized to analyze EHRs to model disease progression and predict diagnosis. However, these methods do not address some inherent irregularities in EHR data such as irregular time intervals between clinical visits. Furthermore, most DL models are not interpretable. In this study, we propose two interpretable DL architectures based on RNN, namely time-aware RNN (TA-RNN) and TA-RNN-autoencoder (TA-RNN-AE), to predict a patient's clinical outcome in EHRs at the next visit and multiple visits ahead, respectively. To mitigate the impact of irregular time intervals, we propose incorporating time embedding of the elapsed times between visits. For interpretability, we propose employing a dual-level attention mechanism that operates between visits and features within each visit. RESULTS: The results of the experiments conducted on Alzheimer's Disease Neuroimaging Initiative (ADNI) and National Alzheimer's Coordinating Center (NACC) datasets indicated the superior performance of the proposed models for predicting Alzheimer's Disease (AD) compared to state-of-the-art and baseline approaches based on F2 and sensitivity. Additionally, TA-RNN showed superior performance on the Medical Information Mart for Intensive Care (MIMIC-III) dataset for mortality prediction. In our ablation study, we observed enhanced predictive performance by incorporating time embedding and attention mechanisms. Finally, investigating attention weights helped identify influential visits and features in predictions. AVAILABILITY AND IMPLEMENTATION: https://github.com/bozdaglab/TA-RNN.
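The abstract above reports results in terms of F2 and sensitivity. As a minimal sketch of what those metrics measure, the following computes the general F-beta score from confusion-matrix counts; with beta = 2, recall (the same quantity as sensitivity) is weighted more heavily than precision. The function name `f_beta` and the counts are hypothetical illustrations and are not taken from the paper or its repository.

```python
def f_beta(tp, fp, fn, beta=2.0):
    """F-beta score from confusion-matrix counts; beta > 1 favors recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # recall is the same quantity as sensitivity
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for illustration only (not results from the study).
f2_score = f_beta(tp=80, fp=40, fn=20)           # beta = 2 emphasizes missed cases
f1_score = f_beta(tp=80, fp=40, fn=20, beta=1)   # balanced variant for comparison
```

Because false negatives are penalized more than false positives, F2 is a common choice when missing a true case is costlier than raising a false alarm.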


Subjects
Deep Learning , Electronic Health Records , Neural Networks, Computer , Humans , Alzheimer Disease
2.
Bioinformatics ; 39(Suppl 1): i149-i157, 2023 06 30.
Article in English | MEDLINE | ID: mdl-37387135

ABSTRACT

MOTIVATION: Alzheimer's disease (AD) is a neurodegenerative disease that affects millions of people worldwide. Mild cognitive impairment (MCI) is an intermediary stage between the cognitively normal state and AD. Not all people who have MCI convert to AD. The diagnosis of AD is made after significant symptoms of dementia such as short-term memory loss are already present. Since AD is currently an irreversible disease, diagnosis at the onset of the disease brings a huge burden on patients, their caregivers, and the healthcare sector. Thus, there is a crucial need to develop methods for the early prediction of AD for patients who have MCI. Recurrent neural networks (RNN) have been successfully used to handle electronic health records (EHR) for predicting conversion from MCI to AD. However, RNN ignores irregular time intervals between successive events, which occur commonly in electronic health record data. In this study, we propose two deep learning architectures based on RNN, namely Predicting Progression of Alzheimer's Disease (PPAD) and PPAD-Autoencoder. PPAD and PPAD-Autoencoder are designed for early prediction of conversion from MCI to AD at the next visit and multiple visits ahead for patients, respectively. To minimize the effect of the irregular time intervals between visits, we propose using age in each visit as an indicator of time change between successive visits. RESULTS: Our experimental results on Alzheimer's Disease Neuroimaging Initiative and National Alzheimer's Coordinating Center datasets showed that our proposed models outperformed all baseline models for most prediction scenarios in terms of F2 and sensitivity. We also observed that the age feature was one of the top features and was able to address the irregular time interval problem. AVAILABILITY AND IMPLEMENTATION: https://github.com/bozdaglab/PPAD.


Subjects
Alzheimer Disease , Cognitive Dysfunction , Deep Learning , Neurodegenerative Diseases , Humans , Alzheimer Disease/diagnostic imaging , Cognitive Dysfunction/diagnostic imaging , Electronic Health Records
3.
J Arthroplasty ; 37(3): 414-418, 2022 03.
Article in English | MEDLINE | ID: mdl-34793857

ABSTRACT

BACKGROUND: Identifying risk factors for adverse outcomes and increased costs following total joint arthroplasty (TJA) is needed to ensure quality. The interaction between pre-operative healthcare utilization (pre-HU) and outcomes following TJA has not been fully characterized. METHODS: This is a retrospective cohort study of patients undergoing elective, primary total hip arthroplasty (THA, N = 1785) or total knee arthroplasty (TKA, N = 2159) between 2015 and 2019 at a single institution. Pre-HU and post-operative healthcare utilization (post-HU) included non-elective healthcare utilization in the 90 days prior to and following TJA, respectively (emergency department, urgent care, observation admission, inpatient admission). Multivariate regression models including age, gender, American Society of Anesthesiologists, Medicaid status, and body mass index were fit for 30-day readmission, Centers for Medicare and Medicaid Services (CMS)-defined complications, length of stay, and post-HU. RESULTS: The 30-day readmission rate was 3.2% and 3.4% and the CMS-defined complication rate was 3.8% and 2.9% for THA and TKA, respectively. Multivariate regression showed that for THA, presence of any pre-HU was associated with increased risk of 30-day readmission (odds ratio [OR] 2.85, 95% confidence interval [CI] 1.48-5.50, P = .002), CMS complications (OR 2.42, 95% CI 1.27-4.59, P = .007), and post-HU (OR 3.65, 95% CI 2.54-5.26, P < .001). For TKA, ≥2 pre-HU events were associated with increased risk of 30-day readmission (OR 3.52, 95% CI 1.17-10.61, P = .026) and post-HU (OR 2.64, 95% CI 1.29-5.40, P = .008). There were positive correlations for THA (any pre-HU) and TKA (≥2 pre-HU) with length of stay and number of post-HU events. CONCLUSION: Patients who utilize non-elective healthcare in the 90 days prior to TJA are at increased risk of readmission, complications, and unplanned post-HU. LEVEL OF EVIDENCE: Level III.
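The abstract above reports odds ratios with 95% confidence intervals from multivariate regression. As a simplified illustration of how to read such figures, the sketch below computes an unadjusted odds ratio and a Wald confidence interval from a 2x2 table. This is not the adjusted analysis used in the study, and the function name `odds_ratio_ci` and the counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald CI from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only (not data from the study).
or_value, ci_low, ci_high = odds_ratio_ci(a=10, b=20, c=5, d=40)
```

An interval that excludes 1, as in this toy table, corresponds to an association that is statistically significant at roughly the 5% level.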


Subjects
Arthroplasty, Replacement, Hip , Patient Readmission , Aged , Arthroplasty, Replacement, Hip/adverse effects , Humans , Length of Stay , Medicare , Patient Acceptance of Health Care , Postoperative Complications/etiology , Retrospective Studies , Risk Factors , United States/epidemiology
4.
J Arthroplasty ; 37(4): 668-673, 2022 04.
Article in English | MEDLINE | ID: mdl-34954019

ABSTRACT

BACKGROUND: There have been efforts to reduce adverse events and unplanned readmissions after total joint arthroplasty. The Rothman Index (RI) is a real-time, composite measure of medical acuity for hospitalized patients. We aimed to examine the association among in-hospital RI scores and complications, readmissions, and discharge location after total knee arthroplasty (TKA). We hypothesized that RI scores could be used to predict the outcomes of interest. METHODS: This is a retrospective study of an institutional database of elective, primary TKA from July 2018 until December 2019. Complications and readmissions were defined per Centers for Medicare and Medicaid Services. Analysis included multivariate regression, computation of the area under the curve (AUC), and the Youden Index to set RI thresholds. RESULTS: The study cohort's (n = 957) complications (2.4%), readmissions (3.6%), and nonhome discharge (13.7%) were reported. All RI metrics (minimum, maximum, last, mean, range, 25th%, and 75th%) were significantly associated with increased odds of readmission and home discharge (all P < .05). RI scores were not significantly associated with complications. The optimal RI thresholds for increased risk of readmission were last ≤ 71 (AUC = 0.65), mean ≤ 67 (AUC = 0.66), or maximum ≤ 80 (AUC = 0.63). The optimal RI thresholds for increased risk of home discharge were minimum ≥ 53 (AUC = 0.65), mean ≥ 69 (AUC = 0.65), or maximum ≥ 81 (AUC = 0.60). CONCLUSION: RI values may be used to predict readmission or home discharge after TKA.
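The abstract above mentions using the Youden Index to set Rothman Index thresholds. A minimal sketch of that idea follows: for each candidate cutoff, compute J = sensitivity + specificity - 1 and keep the cutoff with the largest J. The sketch assumes lower scores indicate higher risk (matching the "≤" thresholds reported for readmission); the function name `youden_threshold` and the toy data are hypothetical and do not come from the study.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A case is predicted positive when score <= threshold, i.e. a lower
    score is assumed to mean higher risk. Labels are 1 (event) or 0.
    Both classes must be present in `labels`.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_threshold, best_j = None, float("-inf")
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s <= t)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s <= t)
        sensitivity = tp / positives
        specificity = 1 - fp / negatives
        j = sensitivity + specificity - 1
        if j > best_j:  # ties keep the lowest threshold
            best_threshold, best_j = t, j
    return best_threshold, best_j


# Hypothetical scores and outcomes for illustration only.
toy_scores = [50, 60, 65, 70, 75, 80, 85, 90]
toy_labels = [1, 1, 1, 0, 0, 1, 0, 0]
threshold, j_stat = youden_threshold(toy_scores, toy_labels)
```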


Subjects
Arthroplasty, Replacement, Hip , Arthroplasty, Replacement, Knee , Aftercare , Aged , Arthroplasty, Replacement, Hip/adverse effects , Arthroplasty, Replacement, Knee/adverse effects , Hospitals , Humans , Medicare , Patient Discharge , Patient Readmission , Postoperative Complications/epidemiology , Postoperative Complications/etiology , Retrospective Studies , Risk Factors , United States/epidemiology
5.
BMC Bioinformatics ; 20(1): 115, 2019 Mar 06.
Article in English | MEDLINE | ID: mdl-30841846

ABSTRACT

BACKGROUND: RNA-seq, wherein RNA transcripts expressed in a sample are sequenced and quantified, has become a widely used technique to study disease and development. With RNA-seq, transcript abundance can be measured, and differentially expressed genes between groups and the functional enrichment of those genes can be computed. However, biological insights from RNA-seq are often limited by computational analysis and the enormous volume of resulting data, preventing facile and meaningful review and interpretation of gene expression profiles. Particularly, in cases where the samples under study exhibit uncontrolled variation, deeper analysis of functional enrichment would be necessary to visualize samples' gene expression activity under each biological function. RESULTS: We developed a Bioconductor package, rgsepd, that streamlines RNA-seq data analysis by wrapping the commonly used tools DESeq2 and GOSeq in a user-friendly interface and performs a gene-subset linear projection to cluster heterogeneous samples by Gene Ontology (GO) terms. Rgsepd computes significantly enriched GO terms for each experimental condition and generates multidimensional projection plots highlighting how each predefined gene set's multidimensional expression may delineate samples. CONCLUSIONS: The rgsepd package serves to automate differential expression, functional annotation, and exploratory data analyses to highlight subtle expression differences among samples based on each significant biological function.


Subjects
Sequence Analysis, RNA/methods , Software , Gene Ontology , Heart Atria/metabolism , Humans , RNA/genetics , RNA/metabolism
6.
PLoS Comput Biol ; 14(7): e1006318, 2018 07.
Article in English | MEDLINE | ID: mdl-30011266

ABSTRACT

MicroRNAs (miRNAs) inhibit expression of target genes by binding to their RNA transcripts. It has been recently shown that RNA transcripts targeted by the same miRNA could "compete" for the miRNA molecules and thereby indirectly regulate each other. Experimental evidence has suggested that the aberration of such miRNA-mediated interaction between RNAs, called competing endogenous RNA (ceRNA) interaction, can play important roles in tumorigenesis. Given the difficulty of deciphering context-specific miRNA binding, and the existence of various gene regulatory factors such as DNA methylation and copy number alteration, inferring context-specific ceRNA interactions accurately is a computationally challenging task. Here we propose a computational method called Cancerin to identify cancer-associated ceRNA interactions. Cancerin incorporates DNA methylation, copy number alteration, gene and miRNA expression datasets to construct cancer-specific ceRNA networks. We applied Cancerin to three cancer datasets from the Cancer Genome Atlas (TCGA) project. Our results indicated that ceRNAs were enriched with cancer-related genes, and ceRNA modules in the inferred ceRNA networks were involved in cancer-associated biological processes. Using the LINCS-L1000 shRNA-mediated gene knockdown experiment in a breast cancer cell line to assess accuracy, Cancerin was able to predict expression outcome of ceRNA genes with high accuracy.


Subjects
Breast Neoplasms/genetics , Computer Simulation , Gene Regulatory Networks , Genes, Neoplasm , RNA, Neoplasm/genetics , Atlases as Topic , Cell Line, Tumor , DNA Copy Number Variations , DNA Methylation , Datasets as Topic , Female , Gene Expression Regulation, Neoplastic , Humans , MicroRNAs/genetics , Neoplasm Proteins/metabolism , Prognosis , Protein Binding , RNA Processing, Post-Transcriptional
7.
Plant J ; 89(5): 1042-1054, 2017 Mar.
Article in English | MEDLINE | ID: mdl-27775877

ABSTRACT

Cowpea (Vigna unguiculata L. Walp.) is a legume crop that is resilient to hot and drought-prone climates, and a primary source of protein in sub-Saharan Africa and other parts of the developing world. However, genome resources for cowpea have lagged behind most other major crops. Here we describe foundational genome resources and their application to the analysis of germplasm currently in use in West African breeding programs. Resources developed from the African cultivar IT97K-499-35 include a whole-genome shotgun (WGS) assembly, a bacterial artificial chromosome (BAC) physical map, and assembled sequences from 4355 BACs. These resources and WGS sequences of an additional 36 diverse cowpea accessions supported the development of a genotyping assay for 51 128 SNPs, which was then applied to five bi-parental RIL populations to produce a consensus genetic map containing 37 372 SNPs. This genetic map enabled the anchoring of 100 Mb of WGS and 420 Mb of BAC sequences, an exploration of genetic diversity along each linkage group, and clarification of macrosynteny between cowpea and common bean. The SNP assay enabled a diversity analysis of materials from West African breeding programs. Two major subpopulations exist within those materials, one of which has significant parentage from South and East Africa and more diversity. There are genomic regions of high differentiation between subpopulations, one of which coincides with a cluster of nodulin genes. The new resources and knowledge help to define goals and accelerate the breeding of improved varieties to address food security issues related to limited-input small-holder farming and climate stress.


Subjects
Crops, Agricultural/genetics , Crops, Agricultural/physiology , Vigna/genetics , Vigna/physiology , Chromosomes, Artificial, Bacterial , Chromosomes, Plant/genetics , Climate , Food Supply , Genome, Plant/genetics , Genotype
8.
Genomics ; 109(3-4): 233-240, 2017 07.
Article in English | MEDLINE | ID: mdl-28438487

ABSTRACT

Copy number amplifications and deletions that are recurrent in cancer samples harbor genes that confer a fitness advantage to cancer tumor proliferation and survival. One important challenge in computational biology is to separate the causal (i.e., driver) genes from passenger genes in large, aberrated regions. Many previous studies focus on the genes within the aberration (i.e., cis genes), but do not utilize the genes that are outside of the aberrated region and dysregulated as a result of the aberration (i.e., trans genes). We propose a computational pipeline, called ProcessDriver, that prioritizes candidate drivers by relating cis genes to dysregulated trans genes and biological processes. ProcessDriver is based on the assumption that a driver cis gene should be closely associated with the dysregulated trans genes and biological processes, as opposed to previous studies that assume a driver cis gene should be the most correlated gene to the copy number of an aberrated region. We applied our method on breast, bladder and ovarian cancer data from the Cancer Genome Atlas database. Our results included previously known driver genes and cancer genes, as well as potentially novel driver genes. Additionally, many genes in the final set of drivers were linked to new tumor events after initial treatment using survival analysis. Our results highlight the importance of selecting driver genes based on their widespread downstream effects in trans.


Subjects
Breast Neoplasms/genetics , Gene Dosage , Genomics/methods , Oncogenes , Ovarian Neoplasms/genetics , Urinary Bladder Neoplasms/genetics , Algorithms , Breast Neoplasms/pathology , DNA Copy Number Variations , Disease Progression , Female , Humans , Ovarian Neoplasms/pathology , Urinary Bladder Neoplasms/pathology
9.
Plant J ; 84(1): 216-27, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26252423

ABSTRACT

Barley (Hordeum vulgare L.) possesses a large and highly repetitive genome of 5.1 Gb that has hindered the development of a complete sequence. In 2012, the International Barley Sequencing Consortium released a resource integrating whole-genome shotgun sequences with a physical and genetic framework. However, because only 6278 bacterial artificial chromosomes (BACs) in the physical map were sequenced, fine structure was limited. To gain access to the gene-containing portion of the barley genome at high resolution, we identified and sequenced 15 622 BACs representing the minimal tiling path of 72 052 physical-mapped gene-bearing BACs. This generated ~1.7 Gb of genomic sequence containing an estimated 2/3 of all Morex barley genes. Exploration of these sequenced BACs revealed that although distal ends of chromosomes contain most of the gene-enriched BACs and are characterized by high recombination rates, there are also gene-dense regions with suppressed recombination. We made use of published map-anchored sequence data from Aegilops tauschii to develop a synteny viewer between barley and the ancestor of the wheat D-genome. Except for some notable inversions, there is a high level of collinearity between the two species. The software HarvEST:Barley provides facile access to BAC sequences and their annotations, along with the barley-Ae. tauschii synteny viewer. These BAC sequences constitute a resource to improve the efficiency of marker development, map-based cloning, and comparative genomics in barley and related crops. Additional knowledge about regions of the barley genome that are gene-dense but low in recombination is particularly relevant.


Subjects
Chromosomes, Artificial, Bacterial/genetics , Genome, Plant/genetics , Hordeum/genetics , Molecular Sequence Data
10.
PLoS Comput Biol ; 9(4): e1003010, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23592960

ABSTRACT

For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered by the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as, in this case, gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding.


Subjects
Contig Mapping/methods , Hordeum/genetics , Sequence Analysis, DNA , Chromosomes, Artificial, Bacterial , Cloning, Molecular , Computational Biology/methods , Computer Simulation , Genes, Plant , Genetic Markers/genetics , Genomic Library , Genomics , Models, Genetic , Oryza/genetics , Physical Chromosome Mapping , Species Specificity
11.
Nucleic Acids Res ; 40(17): 8219-26, 2012 Sep 01.
Article in English | MEDLINE | ID: mdl-22743268

ABSTRACT

Collecting representative sets of cancer microRNAs (miRs) from the literature, we show that their corresponding families are enriched in sets of highly interacting miR families. Targeting cancer genes on a statistically significant level, such cancer miR families strongly intervene with signaling pathways that harbor numerous cancer genes. Clustering miR family-specific profiles of pathway intervention, we found that different miR families share similar interaction patterns. Resembling corresponding patterns of cancer miR families, such interaction patterns may indicate a miR family's potential role in cancer. As we find that the number of targeted cancer genes is a naïve proxy for a cancer miR family, we design a simple method to predict candidate miR families based on gene-specific interaction profiles. Assessing the impact of miR families to distinguish between (non-)cancer genes, we predict a set of 84 potential candidate families, including 75% of initially collected cancer miR families. Further confirming their relevance, predicted cancer miR families are significantly indicated in increasing, non-random numbers of tumor types.


Subjects
MicroRNAs/metabolism , Neoplasms/genetics , Gene Expression Regulation, Neoplastic , Genes, Neoplasm , Humans , MicroRNAs/classification , MicroRNAs/physiology , Neoplasms/metabolism , Protein Interaction Mapping , RNA, Messenger/metabolism , Signal Transduction/genetics
12.
Commun Biol ; 7(1): 1101, 2024 Sep 07.
Article in English | MEDLINE | ID: mdl-39244634

ABSTRACT

In pre-clinical trials of anti-cancer drugs, cell lines are utilized as a model for patient tumor samples to understand the response of drugs. However, in vitro culture of cell lines, in general, alters the biology of the cell lines and likely gives rise to systematic differences from the tumor samples' genomic profiles; hence the drug response of cell lines may deviate from actual patients' drug response. In this study, we computed a similarity score for the selection of cell lines depicting the close and far resemblance to patient tumor samples in twenty-two different cancer types at genetic, genomic, and epigenetic levels integrating multi-omics datasets. We also considered the presence of immune cells in tumor samples and cancer-related biological pathways in this score which aids personalized medicine research in cancer. We showed that based on these similarity scores, cell lines were able to recapitulate the drug response of patient tumor samples for several FDA-approved cancer drugs in multiple cancer types. Based on these scores, several of the high-rank cell lines were shown to have a close likeness to the corresponding tumor type in previously reported in vitro experiments.


Subjects
Antineoplastic Agents , Neoplasms , Humans , Neoplasms/genetics , Neoplasms/drug therapy , Neoplasms/pathology , Cell Line, Tumor , Antineoplastic Agents/pharmacology , Precision Medicine/methods , Genomics/methods
13.
bioRxiv ; 2024 Aug 10.
Article in English | MEDLINE | ID: mdl-39149262

ABSTRACT

Plants respond to biotic stressors by modulating various processes in an attempt to limit the attack by a pathogen or herbivore. Triggering these different defense processes requires orchestration of a network of proteins and RNA molecules that includes microRNAs (miRNAs). These short RNA molecules (20-22 nucleotides) have been shown to be important players in the early responses of plants to stresses because they can rapidly regulate the expression levels of a network of downstream genes. The ascomycete Fusarium graminearum is an important fungal pathogen that causes significant losses in cereal crops worldwide. Using the well-characterized Fusarium-Arabidopsis pathosystem, we investigated how plants change expression of their miRNAs globally during the early stages of infection by F. graminearum. In addition to miRNAs that have been previously implicated in stress responses, we have also identified evolutionarily young miRNAs whose levels change significantly in response to fungal infection. Some of these young miRNAs have homologs present in cereals. Thus, manipulating expression of these miRNAs may provide a unique path toward development of plants with increased resistance to fungal pathogens.

14.
Bioinform Adv ; 4(1): vbae099, 2024.
Article in English | MEDLINE | ID: mdl-39143982

ABSTRACT

Summary: Network biology is an interdisciplinary field bridging computational and biological sciences that has proved pivotal in advancing the understanding of cellular functions and diseases across biological systems and scales. Although the field has been around for two decades, it remains nascent. It has witnessed rapid evolution, accompanied by emerging challenges. These stem from various factors, notably the growing complexity and volume of data together with the increased diversity of data types describing different tiers of biological organization. We discuss prevailing research directions in network biology, focusing on molecular/cellular networks but also on other biological network types such as biomedical knowledge graphs, patient similarity networks, brain networks, and social/contact networks relevant to disease spread. In more detail, we highlight areas of inference and comparison of biological networks, multimodal data integration and heterogeneous networks, higher-order network analysis, machine learning on networks, and network-based personalized medicine. Following the overview of recent breakthroughs across these five areas, we offer a perspective on future directions of network biology. Additionally, we discuss scientific communities, educational initiatives, and the importance of fostering diversity within the field. This article establishes a roadmap for an immediate and long-term vision for network biology. Availability and implementation: Not applicable.

15.
NAR Genom Bioinform ; 5(2): lqad063, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37680392

ABSTRACT

To pave the road towards precision medicine in cancer, patients with similar biology ought to be grouped into the same cancer subtypes. Utilizing high-dimensional multiomics datasets, integrative approaches have been developed to uncover cancer subtypes. Recently, Graph Neural Networks have been shown to learn node embeddings utilizing node features and associations on graph-structured data. Some integrative prediction tools have been developed leveraging these advances on multiple networks with some limitations. Addressing these limitations, we developed SUPREME, a node classification framework, which integrates multiple data modalities on graph-structured data. On breast cancer subtyping, unlike existing tools, SUPREME generates patient embeddings from multiple similarity networks utilizing multiomics features and integrates them with raw features to capture complementary signals. On breast cancer subtype prediction tasks from three datasets, SUPREME outperformed other tools. SUPREME-inferred subtypes had significant survival differences, mostly having more significance than ground truth, and outperformed nine other approaches. These results suggest that with proper multiomics data utilization, SUPREME could demystify undiscovered characteristics in cancer subtypes that cause significant survival differences and could improve the ground truth labels, which depend mainly on one datatype. In addition, to show the model-agnostic property of SUPREME, we applied it to two additional datasets, where it clearly outperformed other tools.

16.
bioRxiv ; 2023 Dec 04.
Article in English | MEDLINE | ID: mdl-38045324

ABSTRACT

Alzheimer's disease (AD) is a neurodegenerative disorder, and timely diagnosis is crucial for early interventions. AD is known to have disruptive local and global brain neural connections that may be instrumental in understanding and extracting specific biomarkers. Previous machine-learning approaches are mostly based on convolutional neural network (CNN) and standard vision transformer (ViT) models, which may not sufficiently capture the multidimensional local and global patterns that may be indicative of AD. Therefore, in this paper, we propose a novel approach called PVTAD to classify AD and cognitively normal (CN) cases using a pretrained pyramid vision transformer (PVT) and white matter (WM) of T1-weighted structural MRI (sMRI) data. Our approach combines the advantages of CNN and standard ViT to extract both local and global features indicative of AD from the WM coronal middle slices. We performed experiments on subjects with T1-weighted MPRAGE sMRI scans from the ADNI dataset. Our results demonstrate that PVTAD achieves an average accuracy of 97.7% and F1-score of 97.6%, outperforming the single and parallel CNN and standard ViT architectures based on sMRI data for AD vs. CN classification.

17.
ACS Omega ; 8(23): 20379-20388, 2023 Jun 13.
Article in English | MEDLINE | ID: mdl-37323377

ABSTRACT

The nuclear receptor (NR) superfamily includes phylogenetically related ligand-activated proteins, which play a key role in various cellular activities. NR proteins are subdivided into seven subfamilies based on their function, mechanism, and nature of the interacting ligand. Developing robust tools to identify NRs could give insights into their functional relationships and involvement in disease pathways. Existing NR prediction tools only use a few types of sequence-based features and are tested on relatively similar independent datasets; thus, they may suffer from overfitting when extended to new genera of sequences. To address this problem, we developed Nuclear Receptor Prediction Tool (NRPreTo), a two-level NR prediction tool with a unique training approach where, in addition to the sequence-based features used by existing NR prediction tools, six additional feature groups depicting various physicochemical, structural, and evolutionary features of proteins were utilized. The first level of NRPreTo allows for the successful prediction of a query protein as NR or non-NR, and the second level further subclassifies the protein into one of the seven NR subfamilies. We developed Random Forest classifiers to test on benchmark datasets, as well as the entire human protein datasets from RefSeq and Human Protein Reference Database (HPRD). We observed that using additional feature groups improved the performance. We also observed that NRPreTo achieved high performance on the external datasets and predicted 59 novel NRs in the human proteome. The source code of NRPreTo is publicly available at https://github.com/bozdaglab/NRPreTo.

18.
bioRxiv ; 2023 Jan 31.
Article in English | MEDLINE | ID: mdl-36778453

ABSTRACT

Alzheimer's disease (AD) is a neurodegenerative disease that affects millions of people worldwide. Mild cognitive impairment (MCI) is an intermediary stage between the cognitively normal (CN) state and AD. Not all people who have MCI convert to AD. The diagnosis of AD is made after significant symptoms of dementia such as short-term memory loss are already present. Since AD is currently an irreversible disease, diagnosis at the onset of disease brings a huge burden on patients, their caregivers, and the healthcare sector. Thus, there is a crucial need to develop methods for the early prediction of AD for patients who have MCI. Recurrent Neural Networks (RNN) have been successfully used to handle Electronic Health Records (EHR) for predicting conversion from MCI to AD. However, RNN ignores irregular time intervals between successive events, which occur commonly in EHR data. In this study, we propose two deep learning architectures based on RNN, namely Predicting Progression of Alzheimer's Disease (PPAD) and PPAD-Autoencoder (PPAD-AE). PPAD and PPAD-AE are designed for early prediction of conversion from MCI to AD at the next visit and multiple visits ahead for patients, respectively. To minimize the effect of the irregular time intervals between visits, we propose using age in each visit as an indicator of time change between successive visits. Our experimental results on Alzheimer's Disease Neuroimaging Initiative (ADNI) and National Alzheimer's Coordinating Center (NACC) datasets showed that our proposed models outperformed all baseline models for most prediction scenarios in terms of F2 and sensitivity. We also observed that the age feature was one of the top features and was able to address the irregular time interval problem.

19.
Sci Rep ; 12(1): 3717, 2022 03 08.
Article in English | MEDLINE | ID: mdl-35260634

ABSTRACT

DNA copy number aberrated regions in cancer are known to harbor cancer driver genes and short non-coding RNA molecules, i.e., microRNAs. In this study, we integrated multi-omics datasets, namely copy number aberration, DNA methylation, and gene and microRNA expression, to identify signature microRNA-gene associations from frequently aberrated DNA regions across cancer types using a LASSO-based regression approach. We studied 7294 patient samples associated with eighteen different cancer types from The Cancer Genome Atlas (TCGA) database and identified several cancer-specific and common microRNA-gene interactions enriched in experimentally validated microRNA-target interactions. We highlighted several oncogenic and tumor suppressor microRNAs that were either specific to one cancer type or common to several. Our method substantially outperformed five state-of-the-art methods in selecting a significant number of known microRNA-gene interactions in multiple cancer types. Several microRNAs and genes were found to be associated with tumor survival and progression. Selected target genes were found to be significantly enriched in cancer-related pathways, cancer hallmarks, and Gene Ontology (GO) terms. Furthermore, subtype-specific potential gene signatures were discovered in multiple cancer types.
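
LASSO regression selects a sparse set of predictors by shrinking weak coefficients to exactly zero, which is what makes it usable for picking candidate microRNA-gene associations. The sketch below is a plain coordinate-descent LASSO on a toy dataset, not the authors' pipeline: the framing of a gene's expression as the response and microRNA expression as the predictors, the names `miR-A`, `miR-B`, `miR-C`, and all numbers are assumptions for illustration.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimise (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            w[j] = soft_threshold(rho, alpha) / z if z > 0 else 0.0
    return w


# Toy data (hypothetical): rows are samples, columns are three microRNAs.
mirnas = ["miR-A", "miR-B", "miR-C"]
X = [[1, 1, 1],
     [-1, 1, -1],
     [1, -1, -1],
     [-1, -1, 1]]
# Gene expression driven mostly by miR-A (negative effect), barely by miR-B.
y = [-1.95, 2.05, -2.05, 1.95]

w = lasso_coordinate_descent(X, y, alpha=0.1)
selected = [name for name, coef in zip(mirnas, w) if coef != 0.0]
print(w, selected)
```

Only `miR-A` keeps a nonzero coefficient here: the weak `miR-B` effect falls below the penalty `alpha` and is zeroed out, which is the selection behavior a LASSO-based approach relies on.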


Subjects
MicroRNAs , Neoplasms , DNA Methylation , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , Neoplasms/genetics , Oncogenes
20.
IEEE/ACM Trans Comput Biol Bioinform ; 19(5): 2950-2962, 2022.
Article in English | MEDLINE | ID: mdl-34283720

ABSTRACT

Uncovering genotype-phenotype relationships is a fundamental challenge in genomics. Gene prioritization is an important step in this endeavor: it reduces the thousands of genes coming from high-throughput studies to a short, manageable list. Network propagation methods are promising, state-of-the-art methods for gene prioritization, based on the premise that functionally related genes tend to be close to each other in biological networks. Recently, we introduced PhenoGeneRanker, a network-propagation algorithm for multiplex heterogeneous networks. PhenoGeneRanker allows multi-layer gene and phenotype networks. It also calculates empirical p-values of gene and phenotype ranks using stratified random sampling of gene and phenotype seeds based on their connectivity degree in the network. In this study, we introduce the PhenoGeneRanker Bioconductor package and its application to multi-omics rat genome datasets to rank hypertension-related genes and strains. We showed that PhenoGeneRanker ranked hypertension-related genes better using multiplex gene networks than using aggregated gene networks. We also showed that PhenoGeneRanker ranked hypertension-related strains better using a multiplex phenotype network than using single or aggregated phenotype networks. We performed a rigorous hyperparameter analysis and finally showed that Gene Ontology (GO) enrichment of statistically significant top-ranked genes resulted in hypertension-related GO terms.
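
The premise behind network propagation, that genes close to known seed genes score higher, can be shown with random walk with restart, one common propagation scheme. The sketch below runs it on a single-layer toy graph. It is not the PhenoGeneRanker implementation and does not handle multiplex or heterogeneous layers; the gene names, the graph, and the restart probability are assumptions for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until p stops changing.

    adj maps each node to a list of neighbours (undirected, no isolated nodes
    assumed); W spreads a node's probability uniformly over its neighbours.
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbours = adj[u]
            if not neighbours:
                continue  # an isolated node would simply drop its walking mass
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Toy gene network (hypothetical): a chain g1 - g2 - g3 - g4 - g5.
adj = {
    "g1": ["g2"],
    "g2": ["g1", "g3"],
    "g3": ["g2", "g4"],
    "g4": ["g3", "g5"],
    "g5": ["g4"],
}
seeds = {"g1"}  # the known disease gene

scores = random_walk_with_restart(adj, seeds)
ranking = sorted((g for g in adj if g not in seeds),
                 key=lambda g: scores[g], reverse=True)
print(ranking)
```

On this chain the non-seed genes are ranked by their distance from the seed, which is the behavior the proximity premise predicts. An empirical p-value for a rank could then be estimated by repeating the walk with randomly sampled seeds, as the abstract describes, but that step is left out of this sketch.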


Subjects
Algorithms , Hypertension , Animals , Gene Regulatory Networks/genetics , Genomics/methods , Phenotype , Rats