Search | BVS Enfermagem Research Portal

1.

Machine learning methods for prediction of cancer driver genes: a survey paper.

Andrades, Renan; Recamonde-Mendoza, Mariana.

Brief Bioinform ; 23(3)2022 05 13.

Article in English | MEDLINE | ID: mdl-35323900

ABSTRACT

Identifying the genes and mutations that drive the emergence of tumors is a critical step to improving our understanding of cancer and identifying new directions for disease diagnosis and treatment. Despite the large volume of genomics data, the precise detection of driver mutations and their carrying genes, known as cancer driver genes, from the millions of possible somatic mutations remains a challenge. Computational methods play an increasingly important role in discovering genomic patterns associated with cancer drivers and developing predictive models to identify these elements. Machine learning (ML), including deep learning, has been the engine behind many of these efforts and provides excellent opportunities for tackling remaining gaps in the field. Thus, this survey aims to perform a comprehensive analysis of ML-based computational approaches to identify cancer driver mutations and genes, providing an integrated, panoramic view of the broad data and algorithmic landscape within this scientific problem. We discuss how the interactions among data types and ML algorithms have been explored in previous solutions and outline current analytical limitations that deserve further attention from the scientific community. We hope that by helping readers become more familiar with significant developments in the field brought by ML, we may inspire new researchers to address open problems and advance our knowledge towards cancer driver discovery.

Subjects

Computational Biology , Neoplasms , Algorithms , Computational Biology/methods , Humans , Machine Learning , Mutation , Neoplasms/diagnosis , Neoplasms/genetics , Neoplasms/pathology , Oncogenes

2.

Graph neural networks for clinical risk prediction based on electronic health records: A survey.

Oss Boll, Heloísa; Amirahmadi, Ali; Ghazani, Mirfarid Musavian; Morais, Wagner Ourique de; Freitas, Edison Pignaton de; Soliman, Amira; Etminani, Farzaneh; Byttner, Stefan; Recamonde-Mendoza, Mariana.

J Biomed Inform ; 151: 104616, 2024 03.

Article in English | MEDLINE | ID: mdl-38423267

ABSTRACT

OBJECTIVE: This study aims to comprehensively review the use of graph neural networks (GNNs) for clinical risk prediction based on electronic health records (EHRs). The primary goal is to provide an overview of the state-of-the-art of this subject, highlighting ongoing research efforts and identifying existing challenges in developing effective GNNs for improved prediction of clinical risks. METHODS: A search was conducted in the Scopus, PubMed, ACM Digital Library, and Embase databases to identify relevant English-language papers that used GNNs for clinical risk prediction based on EHR data. The study includes original research papers published between January 2009 and May 2023. RESULTS: Following the initial screening process, 50 articles were included in the data collection. A significant increase in publications from 2020 was observed, with most selected papers focusing on diagnosis prediction (n = 36). The study revealed that the graph attention network (GAT) (n = 19) was the most prevalent architecture, and MIMIC-III (n = 23) was the most common data resource. CONCLUSION: GNNs are relevant tools for predicting clinical risk by accounting for the relational aspects among medical events and entities and managing large volumes of EHR data. Future studies in this area may address challenges such as EHR data heterogeneity, multimodality, and model interpretability, aiming to develop more holistic GNN models that can produce more accurate predictions, be effectively implemented in clinical settings, and ultimately improve patient care.

Subjects

Electronic Health Records , Language , Humans , Data Collection , Databases, Factual , Neural Networks, Computer

3.

Meta-analysis of Transcriptomic Data from Lung Autopsy and Cellular Models of SARS-CoV-2 Infection.

Cadore, Nathan Araujo; Lord, Vinicius Oliveira; Recamonde-Mendoza, Mariana; Kowalski, Thayne Woycinck; Vianna, Fernanda Sales Luiz.

Biochem Genet ; 62(2): 892-914, 2024 Apr.

Article in English | MEDLINE | ID: mdl-37486510

ABSTRACT

Severe COVID-19 is a systemic disorder involving excessive inflammatory response, metabolic dysfunction, multi-organ damage, and several clinical features. Here, we performed a transcriptome meta-analysis investigating genes and molecular mechanisms related to COVID-19 severity and outcomes. First, transcriptomic data of cellular models of SARS-CoV-2 infection were compiled to understand the first response to the infection. Then, transcriptomic data from lung autopsies of patients deceased due to COVID-19 were compiled to analyze altered genes of damaged lung tissue. These analyses were followed by functional enrichment analyses and gene-phenotype association. A biological network was constructed using the disturbed genes in the lung autopsy meta-analysis. Central genes were defined considering closeness and betweenness centrality degrees. A sub-network phenotype-gene interaction analysis was performed. The meta-analysis of cellular models found genes mainly associated with cytokine signaling and other pathogen response pathways. The meta-analysis of lung autopsy tissue found genes associated with coagulopathy, lung fibrosis, multi-organ damage, and long COVID-19. Only genes DNAH9 and FAM216B were found perturbed in both meta-analyses. BLNK, FABP4, GRIA1, ATF3, TREM2, TPPP, TPPP3, FOS, ALB, JUNB, LMNA, ADRB2, PPARG, TNNC1, and EGR1 were identified as central elements among perturbed genes in lung autopsy and were found associated with several clinical features of severe COVID-19. Central elements were suggested as interesting targets to investigate the relation with features of COVID-19 severity, such as coagulopathy, lung fibrosis, and organ damage.
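The abstract above ranks network genes by closeness and betweenness centrality. As a rough illustration of what those two measures compute, here is a minimal sketch on a toy undirected graph. The node names, the adjacency-dictionary layout, and the function names `closeness` and `betweenness` are assumptions made for this example and are not taken from the study; the betweenness routine follows the standard Brandes shortest-path counting scheme, which may differ from the tool the authors used.

```python
from collections import deque


def closeness(graph):
    """Closeness centrality: reachable nodes divided by summed shortest-path distance."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search gives unweighted shortest paths
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


def betweenness(graph):
    """Unnormalized betweenness centrality for an undirected, unweighted graph."""
    score = {v: 0.0 for v in graph}
    for source in graph:
        order = []                       # nodes in order of non-decreasing distance
        preds = {v: [] for v in graph}   # predecessors on shortest paths
        sigma = {v: 0 for v in graph}    # number of shortest paths from source
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):        # accumulate dependencies back toward source
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: s / 2.0 for v, s in score.items()}


# Toy path graph A - B - C (hypothetical nodes, not real gene interactions).
toy_graph = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
toy_closeness = closeness(toy_graph)
toy_betweenness = betweenness(toy_graph)
```

In this toy graph the middle node B lies on the only shortest path between A and C, so it receives the highest score on both measures; a gene flagged as "central" in a real network plays the analogous bridging role.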

4.

Broken silence: 22,841 predicted deleterious synonymous variants identified in the human exome through computational analysis.

Mello, Ana Carolina; Leao, Delva; Dias, Luis; Colombelli, Felipe; Recamonde-Mendoza, Mariana; Turchetto-Zolet, Andreia Carina; Matte, Ursula.

Genet Mol Biol ; 46(3 Suppl 1): e20230125, 2024.

Article in English | MEDLINE | ID: mdl-38259032

ABSTRACT

Synonymous single nucleotide variants (sSNVs) do not alter the primary structure of a protein, so they were long assumed to be neutral. Recently, several studies have demonstrated their significance in a range of diseases. Still, variant prioritization strategies lack focus on sSNVs. Here, we identified 22,841 deleterious synonymous variants in 125,748 human exomes using two in silico predictors (SilVA and CADD). While 98.2% of synonymous variants are classified as neutral, 1.8% are predicted to be deleterious, yielding an average of 9.82 neutral and 0.18 deleterious sSNVs per exome. Further investigation of prediction features via Heterogeneous Ensemble Feature Selection revealed that impact on amino acid sequence and conservation carry the most weight for a deleterious prediction. Thirty-nine detrimental sSNVs are not rare and are located in disease-associated genes. Ten distinct putatively non-deleterious sSNVs are likely to be under positive selection in the North-Western European and East Asian populations. Taken together, our analysis gives voice to the so-called silent mutations, as we propose a robust framework for evaluating the deleteriousness of sSNVs in variant prioritization studies.

5.

Brazilian women in Bioinformatics: Challenges and opportunities.

Kowalski, Thayne Woycinck; Giudicelli, Giovanna Câmara; Pinho, Maria Clara de Freitas; Rockenbach, Marília Körbes; Maciel-Fiuza, Miriãn Ferrão; Recamonde-Mendoza, Mariana; Vianna, Fernanda Sales Luiz.

Genet Mol Biol ; 46(3 Suppl 1): e20230134, 2024.

Article in English | MEDLINE | ID: mdl-38259034

ABSTRACT

Bioinformatics is a growing research field that gained great prominence during the years of the COVID-19 pandemic. It is a highly integrative area, comprising professionals from science, technology, engineering, and mathematics (STEM). As in the other STEM areas, several women have greatly contributed to the rise of bioinformatics; however, they had to overcome prejudice and stereotypes to achieve recognition and leadership positions, a path that studies have demonstrated to be easier for their male colleagues. In this review, we discuss the several difficulties that women in STEM, including bioinformatics, face during their careers. First, we present a historical context on bioinformatics and the main applications for this area. Then, we discuss gender disparity in STEM and present the challenges that still contribute to women's inequality in STEM compared to their male colleagues. We also present the opportunities and the transformation that we can start, acting in academia, inside the family and school environments, and as a society, hence contributing to gender equality in STEM. Finally, we discuss specific challenges in the bioinformatics field and how we can act to overcome them, especially in low- and middle-income countries, such as Brazil.

6.

A New Strategy for the Old Challenge of Thalidomide: Systems Biology Prioritization of Potential Immunomodulatory Drug (IMiD)-Targeted Transcription Factors.

Kowalski, Thayne Woycinck; Feira, Mariléa Furtado; Lord, Vinícius Oliveira; Gomes, Julia do Amaral; Giudicelli, Giovanna Câmara; Fraga, Lucas Rosa; Sanseverino, Maria Teresa Vieira; Recamonde-Mendoza, Mariana; Schuler-Faccini, Lavinia; Vianna, Fernanda Sales Luiz.

Int J Mol Sci ; 24(14)2023 Jul 15.

Article in English | MEDLINE | ID: mdl-37511270

ABSTRACT

Several molecular mechanisms of thalidomide embryopathy (TE) have been investigated, from anti-angiogenesis to oxidative stress to cereblon binding. Recently, it was discovered that thalidomide and its analogs, named immunomodulatory drugs (IMiDs), induced the degradation of C2H2 transcription factors (TFs). This mechanism might impact the strict transcriptional regulation of the developing embryo. Hence, this study aims to evaluate the TFs altered by IMiDs, prioritizing the ones associated with embryogenesis through transcriptome and systems biology-allied analyses. This study comprises only the experimental data accessed through bioinformatics databases. First, proteins and genes reported in the literature as altered/affected by the IMiDs were annotated. A protein systems biology network was evaluated. TFs beta-catenin (CTNNB1) and SP1 play more central roles: beta-catenin is an essential protein in the network, while SP1 is a putative C2H2 candidate for IMiD-induced degradation. Separately, the differential expressions of the annotated genes were analyzed through 23 publicly available transcriptomes, presenting 8624 differentially expressed genes (2947 in two or more datasets). Seventeen C2H2 TFs were identified as related to embryonic development but not studied for IMiD exposure; these TFs are potential IMiDs degradation neosubstrates. This is the first study to suggest an integration of IMiD molecular mechanisms through C2H2 TF degradation.

Subjects

Multiple Myeloma , Thalidomide , Humans , Thalidomide/pharmacology , Immunomodulating Agents , beta Catenin/genetics , beta Catenin/metabolism , Transcription Factors/metabolism , Systems Biology , Adaptor Proteins, Signal Transducing/metabolism , Immunologic Factors/pharmacology , Immunologic Factors/chemistry , Ubiquitin-Protein Ligases/metabolism , Multiple Myeloma/metabolism

7.

Gene Expression Analysis Platform (GEAP): A highly customizable, fast, versatile and ready-to-use microarray analysis platform.

Nunes, Itamar José Guimarães; Recamonde-Mendoza, Mariana; Feltes, Bruno César.

Genet Mol Biol ; 45(1): e20210077, 2021.

Article in English | MEDLINE | ID: mdl-34927664

ABSTRACT

There are still numerous challenges to be overcome in microarray data analysis because advanced, state-of-the-art analyses are restricted to programming users. Here we present the Gene Expression Analysis Platform, a versatile, customizable, optimized, and portable software developed for microarray analysis. GEAP was developed in C# for the graphical user interface, data querying, storage, results filtering and dynamic plotting, and R for data processing, quality analysis, and differential expression. Through a new automated system that identifies microarray file formats, retrieves contents, detects file corruption, and solves dependencies, GEAP deals with datasets independently of platform. GEAP covers 32 statistical options, supports quality assessment, differential expression from single and dual-channel experiments, and gene ontology. Users can explore results by different plots and filtering options. Finally, the entire data can be saved and organized through storage features, optimized for memory and data retrieval, with faster performance than R. These features, along with other new options, are not yet present in any microarray analysis software. GEAP accomplishes data analysis in a faster, straightforward, and friendlier way than other similar software, while keeping the flexibility for sophisticated procedures. By developing optimizations, unique customizations and new features, GEAP is destined for both advanced and non-programming users.

8.

How to make more from exposure data? An integrated machine learning pipeline to predict pathogen exposure.

Fountain-Jones, Nicholas M; Machado, Gustavo; Carver, Scott; Packer, Craig; Recamonde-Mendoza, Mariana; Craft, Meggan E.

J Anim Ecol ; 88(10): 1447-1461, 2019 10.

Article in English | MEDLINE | ID: mdl-31330063

ABSTRACT

Predicting infectious disease dynamics is a central challenge in disease ecology. Models that can assess which individuals are most at risk of being exposed to a pathogen not only provide valuable insights into disease transmission and dynamics but can also guide management interventions. Constructing such models for wild animal populations, however, is particularly challenging; often only serological data are available on a subset of individuals and nonlinear relationships between variables are common. Here we provide a guide to the latest advances in statistical machine learning to construct pathogen-risk models that automatically incorporate complex nonlinear relationships with minimal statistical assumptions from ecological data with missing data. Our approach compares multiple machine learning algorithms in a unified environment to find the model with the best predictive performance and uses game theory to better interpret results. We apply this framework on two major pathogens that infect African lions: canine distemper virus (CDV) and feline parvovirus. Our modelling approach provided enhanced predictive performance compared to more traditional approaches, as well as new insights into disease risks in a wild population. We were able to efficiently capture and visualize strong nonlinear patterns, as well as model complex interactions between variables in shaping exposure risk from CDV and feline parvovirus. For example, we found that lions were more likely to be exposed to CDV at a young age but only in low rainfall years. When combined with our data calibration approach, our framework helped us to answer questions about risk of pathogen exposure that are difficult to address with previous methods. Our framework not only has the potential to aid in predicting disease risk in animal populations, but also can be used to build robust predictive models suitable for other ecological applications such as modelling species distribution or diversity patterns.

Subjects

Distemper Virus, Canine , Lions , Animals , Animals, Wild , Ecology , Machine Learning

9.

The role of protein intrinsic disorder in major psychiatric disorders.

Tovo-Rodrigues, Luciana; Recamonde-Mendoza, Mariana; Paixão-Côrtes, Vanessa Rodrigues; Bruxel, Estela M; Schuch, Jaqueline B; Friedrich, Deise C; Rohde, Luis A; Hutz, Mara H.

Am J Med Genet B Neuropsychiatr Genet ; 171(6): 848-60, 2016 09.

Article in English | MEDLINE | ID: mdl-27184105

ABSTRACT

Although new candidate genes for Autism Spectrum Disorder (ASD), Schizophrenia (SCZ), Attention-Deficit/Hyperactivity Disorder (ADHD), and Bipolar Disorder (BD) have emerged from genome-wide association studies (GWAS), their underlying molecular mechanisms remain poorly understood. Evidence for the involvement of intrinsically disordered proteins in diseases has grown in the last decade. These proteins lack a three-dimensional structure under physiological conditions and are involved in important cellular functions such as signaling, recognition and regulation. The aim of the present study was to identify the role and abundance of intrinsically disordered proteins in a set of psychiatric diseases and to test whether the diseases differ regarding protein intrinsic disorder. Our hypothesis is that differences across the phenotypes and symptoms of psychiatric illnesses may arise from differences in intrinsic protein disorder content and properties of each group. A bioinformatics prediction of intrinsic disorder was performed in proteins retrieved based on top findings from GWAS, Copy Number Variation and candidate gene investigations for each disease. This approach revealed that about 80% of studied proteins presented long stretches of disorder. This proportion was significantly higher than that observed in general eukaryotic proteins and in those involved in cardiovascular diseases. These results suggest that proteins with intrinsic disorder are a common feature of neurodevelopment and synaptic transmission processes, which are potentially involved in the etiology of psychiatric diseases. Moreover, we identified differences between ADHD and ASD when the binary prediction of structure and putative binding sites were compared. These differences may be related to variation in symptom complexity between the two diseases. © 2016 Wiley Periodicals, Inc.

Subjects

Mental Disorders/genetics , Mental Disorders/physiopathology , Proteostasis Deficiencies/genetics , Attention Deficit Disorder with Hyperactivity/genetics , Autism Spectrum Disorder/genetics , Bipolar Disorder/genetics , DNA Copy Number Variations , Databases, Nucleic Acid , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study , Humans , Mental Disorders/metabolism , Polymorphism, Single Nucleotide/genetics , Schizophrenia/genetics

10.

Bioinformatics Methods for Transcriptome Analysis on Teratogenesis Testing.

Kowalski, Thayne Woycinck; Giudicelli, Giovanna Câmara; Gomes, Julia do Amaral; Recamonde-Mendoza, Mariana; Vianna, Fernanda Sales Luiz.

Methods Mol Biol ; 2753: 365-376, 2024.

Article in English | MEDLINE | ID: mdl-38285351

ABSTRACT

Teratogenesis testing can be challenging due to the limitations of both in vitro and in vivo models. Test-systems, based especially on human embryonic cells, have been helping to overcome the difficulties when allied to omics strategies, such as transcriptomics. In these test-systems, cells exposed to different compounds are then analyzed in microarray or RNA-seq platforms regarding the impacts of the potential teratogens in the gene expression. Nevertheless, microarray and RNA-seq dataset processing requires computational resources and bioinformatics knowledge. Here, a pipeline for microarray and RNA-seq processing is presented, aiming to help researchers from any field to interpret the main transcriptome results, such as differential gene expression, enrichment analysis, and statistical interpretation. This chapter also discusses the main difficulties that can be encountered in a transcriptome analysis and the better alternatives to overcome these issues, describing both programming codes and user-friendly tools. Finally, specific issues in the teratogenesis field, such as time-course analysis, are also described, demonstrating how the pipeline can be applied in these studies.

Subjects

Teratogenesis , Humans , Teratogenesis/genetics , Gene Expression Profiling , RNA-Seq , Transcriptome , Computational Biology

11.

Classification of Thyroid Tumors Based on DNA Methylation Patterns.

Marczyk, Vicente Rodrigues; Recamonde-Mendoza, Mariana; Maia, Ana Luiza; Goemann, Iuri Martin.

Thyroid ; 33(9): 1090-1099, 2023 09.

Article in English | MEDLINE | ID: mdl-37392021

ABSTRACT

Background: Alterations in DNA methylation are stable epigenetic events that can serve as clinical biomarkers. The aim of this study was to analyze methylation patterns among various follicular cell-derived thyroid neoplasms to identify disease subtypes and help understand and classify thyroid tumors. Methods: We employed an unsupervised machine learning method for class discovery to search for distinct methylation patterns among various thyroid neoplasms. Our algorithm was not provided with any clinical or pathological information, relying exclusively on DNA methylation data to classify samples. We analyzed 810 thyroid samples (n = 256 for discovery and n = 554 for validation), including benign and malignant tumors, as well as normal thyroid tissue. Results: Our unsupervised algorithm identified that samples could be classified into three subtypes based solely on their methylation profile. These methylation subtypes were strongly associated with histological diagnosis (p < 0.001) and were therefore named normal-like, follicular-like, and papillary thyroid carcinoma (PTC)-like. Follicular adenomas, follicular carcinomas, oncocytic adenomas, and oncocytic carcinomas clustered together forming the follicular-like methylation subtype. Conversely, classic papillary thyroid carcinomas (cPTC) and tall cell PTC clustered together forming the PTC-like subtype. These methylation subtypes were also strongly associated with genomic drivers: 98.7% of BRAFV600E-driven cancers were PTC-like, whereas 96.0% of RAS-driven cancers had a follicular-like methylation pattern. Interestingly, unlike other diagnoses, follicular variant PTC (FVPTC) samples were split into two methylation clusters (follicular-like and PTC-like), indicating a heterogeneous group likely to be formed by two distinct diseases. FVPTC samples with a follicular-like methylation pattern were enriched for RAS mutations (36.4% vs. 8.0%; p < 0.001), whereas FVPTC samples with PTC-like methylation patterns were enriched for BRAFV600E mutations (52.0% vs. 0%, Fisher exact p = 0.004) and RET fusions (16.0% vs. 0%, Fisher exact p = 0.003). Conclusions: Our data provide novel insights into the epigenetic alterations of thyroid tumors. Since our classification method relies on a fully unsupervised machine learning approach for subtype discovery, our results offer a robust background to support the classification of thyroid neoplasms based on methylation patterns.

Subjects

Adenocarcinoma, Follicular , Thyroid Neoplasms , Humans , DNA Methylation , Proto-Oncogene Proteins B-raf/genetics , Proto-Oncogene Proteins B-raf/metabolism , Thyroid Neoplasms/pathology , Thyroid Cancer, Papillary/genetics , Thyroid Cancer, Papillary/pathology , Adenocarcinoma, Follicular/genetics , Adenocarcinoma, Follicular/pathology , Mutation

12.

Downregulation of Microcephaly-Causing Genes as a Mechanism for ZIKV Teratogenesis: A Meta-analysis of RNA-Seq Studies.

Gomes, Julia A; Sgarioni, Eduarda; Kowalski, Thayne W; Giudicelli, Giovanna C; Recamonde-Mendoza, Mariana; Fraga, Lucas R; Schüler-Faccini, Lavínia; Vianna, Fernanda S L.

J Mol Neurosci ; 73(7-8): 566-577, 2023 Aug.

Article in English | MEDLINE | ID: mdl-37428363

ABSTRACT

Zika virus (ZIKV) is a neurotropic teratogen that causes congenital Zika syndrome (CZS), characterized by brain and eye anomalies. Impaired gene expression in neural cells after ZIKV infection has been demonstrated; however, the literature lacks studies comparing whether the differentially expressed genes in such cells are similar and how they can cause CZS. Therefore, the aim of this study was to compare the differential gene expression (DGE) after ZIKV infection in neural cells through a meta-analysis approach. The GEO database was searched for studies that evaluated DGE in cells exposed to the Asian lineage of ZIKV versus unexposed cells of the same type. Of the 119 studies found, five met our inclusion criteria. Their raw data were retrieved, pre-processed, and evaluated. The meta-analysis was carried out by comparing seven datasets from these five studies. We found 125 upregulated genes in neural cells, mainly interferon-stimulated genes, such as IFI6, ISG15, and OAS2, involved in the antiviral response. Furthermore, 167 genes were downregulated; these are involved in cell division. Among these downregulated genes, classic microcephaly-causing genes stood out, such as CENPJ, ASPM, CENPE, and CEP152, demonstrating a possible mechanism by which ZIKV impairs brain development and causes CZS.

Subjects

Microcephaly , Teratogenesis , Zika Virus Infection , Zika Virus , Humans , Zika Virus/genetics , Zika Virus Infection/genetics , Zika Virus Infection/congenital , Microcephaly/genetics , RNA-Seq , Down-Regulation , Cell Cycle Proteins/genetics

13.

Suicide risk classification with machine learning techniques in a large Brazilian community sample.

Roza, Thiago Henrique; Seibel, Gabriel de Souza; Recamonde-Mendoza, Mariana; Lotufo, Paulo A; Benseñor, Isabela M; Passos, Ives Cavalcante; Brunoni, Andre Russowsky.

Psychiatry Res ; 325: 115258, 2023 07.

Article in English | MEDLINE | ID: mdl-37263086

ABSTRACT

Even though suicide is a relatively preventable poor outcome, its prediction remains an elusive task. The main goal of this study was to develop machine learning classifiers to identify increased suicide risk in Brazilians with common mental disorders. With the use of clinical and sociodemographic baseline data (n = 4039 adult participants) from a large Brazilian community sample, we developed several models (Elastic Net, Random Forests, Naïve Bayes, and ensemble) for the classification of increased suicide risk among individuals with common mental disorders. 1120 participants (27.7%) presented increased suicide risk. The Random Forests model achieved the best AUC ROC (0.814), followed by Naive Bayes (0.798) and Elastic Net (0.773). Sensitivity varied from 0.922 (Naive Bayes) to 0.630 (Random Forests), while specificity varied from 0.792 (Random Forests) to 0.473 (Naive Bayes). The ensemble model presented an AUC ROC of 0.811, sensitivity of 0.899, and specificity of 0.510. Features representing depression symptoms were the most relevant for the classification of increased suicide risk. Some of our models presented good performance metrics in the classification of increased suicide risk in the investigated sample, which can provide the means to early preventive interventions.

Subjects

Mental Disorders , Suicide , Adult , Humans , Bayes Theorem , Brazil/epidemiology , Machine Learning
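The suicide-risk record above reports AUC ROC, sensitivity, and specificity for each classifier. The sketch below shows how these three metrics are conventionally computed for a binary classifier. The labels, scores, 0.5 decision threshold, and function names are illustrative assumptions only; none of them come from the study's data or code.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded as 0/1."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1:
            if pred == 1:
                tp += 1   # at-risk case correctly flagged
            else:
                fn += 1   # at-risk case missed
        else:
            if pred == 0:
                tn += 1   # non-case correctly cleared
            else:
                fp += 1   # non-case wrongly flagged
    return tp / (tp + fn), tn / (tn + fp)


def auc_roc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: three positives followed by four negatives.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

sens, spec = sensitivity_specificity(labels, predictions)
auc = auc_roc(labels, scores)
```

Sensitivity and specificity depend on the chosen threshold while AUC does not, which is why the abstract can report a model with the best AUC (Random Forests) that nonetheless has the lowest sensitivity.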

14.

EPGAT: Gene Essentiality Prediction With Graph Attention Networks.

Schapke, Joao; Tavares, Anderson; Recamonde-Mendoza, Mariana.

IEEE/ACM Trans Comput Biol Bioinform ; 19(3): 1615-1626, 2022.

Article in English | MEDLINE | ID: mdl-33497339

ABSTRACT

Identifying essential genes and proteins is a critical step towards a better understanding of human biology and pathology. Computational approaches helped to mitigate experimental constraints by exploring machine learning (ML) methods and the correlation of essentiality with biological information, especially protein-protein interaction (PPI) networks, to predict essential genes. Nonetheless, their performance is still limited, as network-based centralities are not exclusive proxies of essentiality, and traditional ML methods are unable to learn from non-euclidean domains such as graphs. Given these limitations, we proposed EPGAT, an approach for Essentiality Prediction based on Graph Attention Networks (GATs), which are attention-based Graph Neural Networks (GNNs), operating on graph-structured data. Our model directly learns gene essentiality patterns from PPI networks, integrating additional evidence from multiomics data encoded as node attributes. We benchmarked EPGAT for four organisms, including humans, accurately predicting gene essentiality with ROC AUC score ranging from 0.78 to 0.97. Our model significantly outperformed network-based and shallow ML-based methods and achieved a very competitive performance against the state-of-the-art node2vec embedding method. Notably, EPGAT was the most robust approach in scenarios with limited and imbalanced training data. Thus, the proposed approach offers a powerful and effective way to identify essential genes and proteins.

Subjects

Algorithms , Neural Networks, Computer , Genes, Essential/genetics , Humans , Machine Learning , Protein Interaction Maps/genetics , Proteins/genetics

15.

Transcriptome meta-analysis of valproic acid exposure in human embryonic stem cells.

Kowalski, Thayne Woycinck; Lord, Vinícius Oliveira; Sgarioni, Eduarda; Gomes, Julia do Amaral; Mariath, Luiza Monteavaro; Recamonde-Mendoza, Mariana; Vianna, Fernanda Sales Luiz.

Eur Neuropsychopharmacol ; 60: 76-88, 2022 07.

Article in English | MEDLINE | ID: mdl-35635998

ABSTRACT

Valproic acid (VPA) is a widely used antiepileptic drug not recommended in pregnancy because it is teratogenic. Many assays have assessed the impact of VPA exposure on the transcriptome of human embryonic stem cells (hESC), but the molecular perturbations that VPA exerts in neurodevelopment are not completely understood. This study aimed to perform a transcriptome meta-analysis of VPA-exposed hESC to elucidate the main biological mechanisms altered by VPA effects on gene expression. Publicly available microarray and RNA-seq transcriptomes were selected in the Gene Expression Omnibus (GEO) repository. Samples were processed according to the standard pipelines for each technology in the Galaxy server and R. Meta-analysis was performed using the Fisher-P method. Overrepresented genes were obtained by evaluating ontology, pathway, and phenotype databases. The meta-analysis performed in seven datasets resulted in 61 perturbed genes, 54 of them upregulated. Ontology and pathway enrichments suggested neurodevelopment and neuroinflammatory effects; phenotype overrepresentation included epilepsy-related genes, such as SCN1A and GABRB2. Upregulation of the NDNF gene was also identified; this gene is involved in neuron migration and survival during development. Sub-network analysis suggested activation of the TGFβ and BMP pathways. These results suggest VPA exerts effects on epilepsy-related genes even in embryonic cells. Neurodevelopmental genes, such as NDNF, were upregulated, and VPA might also disturb several developmental pathways. These mechanisms might help to explain the spectrum of VPA-induced congenital anomalies and the molecular effects on neurodevelopment.

Subjects

Epilepsy , Human Embryonic Stem Cells , Anticonvulsants/pharmacology , Anticonvulsants/therapeutic use , Female , Humans , Pregnancy , Transcriptome , Valproic Acid/pharmacology , Valproic Acid/therapeutic use
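The valproic acid record above states that the meta-analysis used the Fisher-P method. Assuming this refers to Fisher's combined probability test, the sketch below shows how per-dataset p-values for one gene could be merged into a single p-value. The function name and the example p-values are hypothetical, the inputs are assumed to be independent, and the authors' actual pipeline (run in Galaxy and R) is not reproduced here.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis, where k is the
    number of p-values.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    # For even degrees of freedom the chi-square upper tail has a closed form:
    # P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Hypothetical p-values for one gene across three datasets (not study data).
combined = fisher_combine([0.04, 0.10, 0.20])
```

The closed-form tail keeps the example free of third-party dependencies; in practice a statistics library's chi-square survival function would serve the same purpose.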

16.

Identifying posttraumatic stress disorder staging from clinical and sociodemographic features: a proof-of-concept study using a machine learning approach.

Ramos-Lima, Luis Francisco; Waikamp, Vitoria; Oliveira-Watanabe, Thauana; Recamonde-Mendoza, Mariana; Teche, Stefania Pigatto; Mello, Marcelo Feijo; Mello, Andrea Feijo; Freitas, Lucia Helena Machado.

Psychiatry Res ; 311: 114489, 2022 05.

Article in English | MEDLINE | ID: mdl-35276574

ABSTRACT

This proof-of-concept study aimed to investigate the viability of a predictive model to support posttraumatic stress disorder (PTSD) staging. We performed a naturalistic, cross-sectional study at two Brazilian centers: the Psychological Trauma Research and Treatment (NET-Trauma) Program at Universidade Federal do Rio Grande do Sul, and the Program for Research and Care on Violence and PTSD (PROVE) at Universidade Federal de São Paulo. Five supervised machine-learning algorithms were tested: Elastic Net, Gradient Boosting Machine, Random Forest, Support Vector Machine, and C5.0, using clinical (Clinician-Administered PTSD Scale version 5, CAPS-5) and sociodemographic features. One hundred and twelve patients were enrolled (61 from NET-Trauma and 51 from PROVE). We found a model with four classes suitable for PTSD staging, with the best performance metrics obtained by applying the C5.0 algorithm to the 15 CAPS-5 items plus sociodemographic features, with an accuracy of 65.6% for the training dataset and 52.9% for the test dataset (both significant). The number of symptoms, CAPS-5 total score, global severity score, and presence of current/previous trauma events appear as the main features to predict PTSD staging. This is the first study to evaluate staging in PTSD with machine learning algorithms using accessible clinical and sociodemographic features, which may be used in future research.

Subjects

Post-Traumatic Stress Disorders , Brazil/epidemiology , Cross-Sectional Studies , Humans , Machine Learning , Proof of Concept Study , Post-Traumatic Stress Disorders/diagnosis , Post-Traumatic Stress Disorders/epidemiology

17.

A call for citizen science in pandemic preparedness and response: beyond data collection.

Tan, Yi-Roe; Agrawal, Anurag; Matsoso, Malebona Precious; Katz, Rebecca; Davis, Sara L M; Winkler, Andrea Sylvia; Huber, Annalena; Joshi, Ashish; El-Mohandes, Ayman; Mellado, Bruce; Mubaira, Caroline Antonia; Canlas, Felipe C; Asiki, Gershim; Khosa, Harjyot; Lazarus, Jeffrey Victor; Choisy, Marc; Recamonde-Mendoza, Mariana; Keiser, Olivia; Okwen, Patrick; English, Rene; Stinckwich, Serge; Kiwuwa-Muyingo, Sylvia; Kutadza, Tariro; Sethi, Tavpritesh; Mathaha, Thuso; Nguyen, Vinh Kim; Gill, Amandeep; Yap, Peiling.

BMJ Glob Health ; 7(6)2022 06.

Article in English | MEDLINE | ID: mdl-35760438

ABSTRACT

The COVID-19 pandemic has underlined the need to partner with the community in pandemic preparedness and response in order to enable trust-building among stakeholders, which is key in pandemic management. Citizen science, defined here as a practice of public participation and collaboration in all aspects of scientific research to increase knowledge and build trust with governments and researchers, is a crucial approach to promoting community engagement. By harnessing the potential of digitally enabled citizen science, one could translate data into accessible, comprehensible and actionable outputs at the population level. The application of citizen science in health has grown over the years, but most of these approaches remain at the level of participatory data collection. This narrative review examines citizen science approaches in participatory data generation, modelling and visualisation, and calls for truly participatory and co-creation approaches across all domains of pandemic preparedness and response. Further research is needed to identify approaches that optimally generate short-term and long-term value for communities participating in population health. Feasible, sustainable and contextualised citizen science approaches that meaningfully engage affected communities for the long-term will need to be inclusive of all populations and their cultures, comprehensive of all domains, digitally enabled and viewed as a key component to allow trust-building among the stakeholders. The impact of COVID-19 on people's lives has created an opportune time to advance people's agency in science, particularly in pandemic preparedness and response.

Subjects

COVID-19 , Citizen Science , Community Participation , Data Collection , Humans , Pandemics

18.

Detecting Aedes aegypti mosquitoes through audio classification with convolutional neural networks.

Fernandes, Marcelo Schreiber; Cordeiro, Weverton; Recamonde-Mendoza, Mariana.

Comput Biol Med ; 129: 104152, 2021 02.

Article in English | MEDLINE | ID: mdl-33333363

ABSTRACT

The incidence of mosquito-borne diseases is significant in under-developed regions, mostly due to the lack of resources to implement aggressive control measurements against mosquito proliferation. A potential strategy to raise community awareness regarding mosquito proliferation is building a live map of mosquito incidences using smartphone apps and crowdsourcing. In this paper, we explore the possibility of identifying Aedes aegypti mosquitoes using machine learning techniques and audio analysis captured from commercially available smartphones. In summary, we downsampled Aedes aegypti wingbeat recordings and used them to train a convolutional neural network (CNN) through supervised learning. As a feature, we used the recording spectrogram to represent the mosquito wingbeat frequency over time visually. We trained and compared three classifiers: a binary, a multiclass, and an ensemble of binary classifiers. In our evaluation, the binary and ensemble models achieved accuracy of 97.65% (±0.55) and 94.56% (±0.77), respectively, whereas the multiclass had an accuracy of 78.12% (±2.09). The best sensitivity was observed in the ensemble approach (96.82% ± 1.62), followed by the multiclass for the particular case of Aedes aegypti (90.23% ± 3.83) and the binary (88.49% ± 6.68). The binary classifier and the multiclass classifier presented the best balance between precision and recall, with F1-measure close to 90%. Although the ensemble classifier achieved the lowest precision, thus impairing its F1-measure (79.95% ± 2.13), it was the most powerful classifier to detect Aedes aegypti in our dataset.
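The precision, recall (sensitivity) and F1-measure figures quoted above are tied together by the definition of F1 as the harmonic mean of precision and recall. The Python sketch below is not taken from the paper; it only illustrates that relationship. The function names are hypothetical, and the back-calculated precision is a rough approximation, because the abstract reports fold-averaged values and a mean F1 need not equal the F1 of the mean precision and recall.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, given F1 and R."""
    denominator = 2 * recall - f1
    if denominator <= 0:
        raise ValueError("no valid precision for this F1/recall pair")
    return f1 * recall / denominator


# Values quoted in the abstract for the ensemble classifier.
ensemble_recall = 0.9682
ensemble_f1 = 0.7995

# Approximate precision implied by those two numbers (an inference, not a reported result).
ensemble_precision = implied_precision(ensemble_f1, ensemble_recall)
print(f"implied ensemble precision: {ensemble_precision:.3f}")
```

Under these assumptions the implied precision is roughly 0.68, which is consistent with the abstract's remark that the ensemble had the lowest precision despite the highest sensitivity.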

Subjects

Aedes , Animals , Machine Learning , Mosquito Vectors , Computer Neural Networks

19.

Roux-en-Y Gastric Bypass Downregulates Angiotensin-Converting Enzyme 2 (ACE2) Gene Expression in Subcutaneous White Adipose Tissue: A Putative Protective Mechanism Against Severe COVID-19.

Kristem, Leonardo; Recamonde-Mendoza, Mariana; Cigerza, Giuliano C; Khoraki, Jad; Campos, Guilherme M; Mazzini, Guilherme S.

Obes Surg ; 31(6): 2831-2834, 2021 06.

Article in English | MEDLINE | ID: mdl-33611766

ABSTRACT

The angiotensin-converting enzyme 2 (ACE2) is the receptor for the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It is highly expressed in adipose tissue, possibly associated with progression to severe coronavirus disease 2019 (COVID-19) in obese subjects. We searched the Gene Expression Omnibus (GEO) and reanalyzed the GSE59034 containing microarray data from subcutaneous white adipose tissue (sWAT) biopsies from 16 women before and 2 years after RYGB, and 16 controls matched by sex, age, and BMI. After RYGB, there was a significant decrease in sWAT ACE2 gene expression (logFC=-0.4175, P=0.0015). Interestingly, after RYGB the sWAT ACE2 gene expression was significantly lower than in non-obese matched controls (LogFC=-0.32875, P=0.0014). Our data adds to the well-known benefits of RYGB, a potential protective mechanism against COVID-19.
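To make the logFC values above easier to read, the following Python sketch converts them to linear fold changes. It assumes the values are base-2 logarithms, as is conventional for microarray differential-expression analyses; the abstract does not state the base, so this is an assumption, and the function name is hypothetical.

```python
def log2fc_to_fold_change(log_fc: float) -> float:
    """Convert a log2 fold change to a linear expression ratio (assumes base 2)."""
    return 2.0 ** log_fc


# logFC values quoted in the abstract.
after_vs_before_rygb = log2fc_to_fold_change(-0.4175)
after_rygb_vs_controls = log2fc_to_fold_change(-0.32875)

for label, ratio in [
    ("after vs. before RYGB", after_vs_before_rygb),
    ("after RYGB vs. matched controls", after_rygb_vs_controls),
]:
    # A ratio below 1 means lower expression; (1 - ratio) is the relative decrease.
    print(f"{label}: ratio {ratio:.2f}, decrease {100 * (1 - ratio):.0f}%")
```

If the base-2 assumption holds, the two contrasts correspond to sWAT ACE2 expression that is roughly 25% and 20% lower, respectively.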

Subjects

COVID-19 , Gastric Bypass , Morbid Obesity , Adipose Tissue , Angiotensin-Converting Enzyme 2 , Female , Gene Expression , Humans , Morbid Obesity/surgery , Peptidyl-Dipeptidase A/genetics , SARS-CoV-2

20.

Patterns of high-risk drinking among medical students: A web-based survey with machine learning.

Marcon, Grasiela; de Ávila Pereira, Flávia; Zimerman, Aline; da Silva, Bruno Castro; von Diemen, Lisia; Passos, Ives Cavalcante; Recamonde-Mendoza, Mariana.

Comput Biol Med ; 136: 104747, 2021 09.

Article in English | MEDLINE | ID: mdl-34449306

ABSTRACT

BACKGROUND: Prior studies have found increased rates of alcohol consumption among physicians and medical students. The present study aims to build machine learning (ML) models to identify patterns of high-risk drinking (HRD), including alcohol use disorder, within this population. METHODS: We analyzed data collected through a web-based survey among Brazilian medical students. Variables included sociodemographic data, personal information, university status, and mental health. Stratification for HRD was carried out based on the AUDIT-C scores. Three ML algorithms were used to build classifiers to predict HRD among medical students: elastic net regularization, random forest, and artificial neural networks. Model interpretation techniques were adopted to assess the most influential predictors for models' decisions, which represent potential factors associated with HRD. RESULTS: A total of 4840 medical students were included in the study. The prevalence of HRD was 53.03%. The three ML models built were able to distinguish individuals with HRD from low-risk drinking (LRD) with very similar performance. The average AUC scores in the cross-validation procedure were around 0.72, and this performance was replicated in the test set. The most important features for the ML models were the use of tobacco and cannabis, monthly family income, marital status, sexual orientation, and physical activities. CONCLUSIONS: This study proposes that ML models may serve as tools for initial screening of students regarding their susceptibility for at-risk drinking or alcohol use disorder. In addition, we identified several key factors associated with HRD that could be further investigated and explored for preventive and assistance measures.
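The AUC reported above can be read as the probability that a randomly chosen high-risk drinker receives a higher model score than a randomly chosen low-risk drinker. The Python sketch below computes AUC from that pairwise definition. The labels and scores are invented toy values, not study data, and the function name is hypothetical.

```python
from typing import Sequence


def pairwise_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Toy example only: 1 = high-risk drinking (HRD), 0 = low-risk drinking (LRD).
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_auc = pairwise_auc(toy_labels, toy_scores)
print(f"toy AUC: {toy_auc:.3f}")
```

On this scale, 0.5 corresponds to random ranking and 1.0 to perfect separation, so an AUC around 0.72 indicates moderate discrimination.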

Subjects

Medical Students , Algorithms , Female , Humans , Internet , Machine Learning , Male , Computer Neural Networks
