1.
Transl Psychiatry ; 14(1): 246, 2024 Jun 08.
Article in English | MEDLINE | ID: mdl-38851761

ABSTRACT

Acute COVID-19 infection can be followed by diverse clinical manifestations referred to as Post-Acute Sequelae of SARS-CoV-2 Infection (PASC). Studies have shown an increased risk of being diagnosed with new-onset psychiatric disease following a diagnosis of acute COVID-19. However, it was unclear whether non-psychiatric PASC-associated manifestations (PASC-AMs) are associated with an increased risk of new-onset psychiatric disease following COVID-19. A retrospective electronic health record (EHR) cohort study of 2,391,006 individuals with acute COVID-19 was performed to evaluate whether non-psychiatric PASC-AMs are associated with new-onset psychiatric disease. Data were obtained from the National COVID Cohort Collaborative (N3C), which has EHR data from 76 clinical organizations. EHR codes were mapped to 151 non-psychiatric PASC-AMs recorded 28-120 days following SARS-CoV-2 diagnosis and before diagnosis of new-onset psychiatric disease. Association of newly diagnosed psychiatric disease with age, sex, race, pre-existing comorbidities, and PASC-AMs in seven categories was assessed by logistic regression. There were significant associations between a diagnosis of any psychiatric disease and five categories of PASC-AMs, with the highest odds ratios for neurological (1.31), cardiovascular (1.29), and constitutional (1.23) PASC-AMs. Secondary analysis revealed that the proportions of 50 individual clinical features significantly differed between patients diagnosed with different psychiatric diseases. Our study provides evidence for association between non-psychiatric PASC-AMs and the incidence of newly diagnosed psychiatric disease. Significant associations were found for features related to multiple organ systems. This information could prove useful in understanding risk stratification for new-onset psychiatric disease following COVID-19. Prospective studies are needed to corroborate these findings.


Subjects
COVID-19 , Mental Disorders , SARS-CoV-2 , Humans , COVID-19/psychology , COVID-19/complications , COVID-19/epidemiology , Male , Female , Mental Disorders/epidemiology , Middle Aged , Adult , Retrospective Studies , Aged , Phenotype , Post-Acute COVID-19 Syndrome , Comorbidity , Electronic Health Records , Young Adult , Risk Factors , Adolescent
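
As a reading aid for the odds ratios reported in this record, the following is a minimal sketch of how an unadjusted odds ratio and a Wald confidence interval can be computed from a 2x2 table. It is not the study's analysis: the abstract describes logistic regression adjusted for age, sex, race, and comorbidities, whereas this sketch ignores all covariates. The function name and the counts in the usage note are hypothetical.

```python
import math


def odds_ratio_with_ci(exposed_cases, exposed_noncases,
                       unexposed_cases, unexposed_noncases, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval (hypothetical helper).

    "Exposed" stands for patients with a given manifestation category and
    "cases" for patients with a new psychiatric diagnosis. All four counts
    must be positive.
    """
    odds_ratio = (exposed_cases * unexposed_noncases) / (
        exposed_noncases * unexposed_cases
    )
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(
        1 / exposed_cases + 1 / exposed_noncases
        + 1 / unexposed_cases + 1 / unexposed_noncases
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

With made-up counts, `odds_ratio_with_ci(20, 80, 10, 90)` gives an odds ratio of 2.25; an interval that excludes 1 would correspond to a significant association at the chosen level.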
2.
Sci Data ; 11(1): 363, 2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38605048

ABSTRACT

Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integration problems requires flexibility in the way knowledge is modeled. Moreover, existing KG construction methods provide robust tooling at the cost of fixed or limited choices among knowledge representation models. PheKnowLator (Phenotype Knowledge Translator) is a semantic ecosystem for automating the FAIR (Findable, Accessible, Interoperable, and Reusable) construction of ontologically grounded KGs with fully customizable knowledge representation. The ecosystem includes KG construction resources (e.g., data preparation APIs), analysis tools (e.g., SPARQL endpoint resources and abstraction algorithms), and benchmarks (e.g., prebuilt KGs). We evaluated the ecosystem by systematically comparing it to existing open-source KG construction methods and by analyzing its computational performance when used to construct 12 different large-scale KGs. With flexible knowledge representation, PheKnowLator enables fully customizable KGs without compromising performance or usability.


Subjects
Biological Science Disciplines , Knowledge Bases , Pattern Recognition, Automated , Algorithms , Translational Research, Biomedical
3.
Int J Med Inform ; 187: 105461, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38643701

ABSTRACT

OBJECTIVE: Female reproductive disorders (FRDs) are common health conditions that may present with significant symptoms. Diet and environment are potential areas for FRD interventions. We utilized a knowledge graph (KG) method to predict factors associated with common FRDs (for example, endometriosis, ovarian cyst, and uterine fibroids). MATERIALS AND METHODS: We harmonized survey data from the Personalized Environment and Genes Study (PEGS) on internal and external environmental exposures and health conditions with biomedical ontology content. We merged the harmonized data and ontologies with supplemental nutrient and agricultural chemical data to create a KG. We analyzed the KG by embedding edges and applying a random forest for edge prediction to identify variables potentially associated with FRDs. We also conducted logistic regression analysis for comparison. RESULTS: Across 9765 PEGS respondents, the KG analysis resulted in 8535 significant or suggestive predicted links between FRDs and chemicals, phenotypes, and diseases. Amongst these links, 32 were exact matches when compared with the logistic regression results, including comorbidities, medications, foods, and occupational exposures. DISCUSSION: Mechanistic underpinnings of predicted links documented in the literature may support some of our findings. Our KG methods are useful for predicting possible associations in large, survey-based datasets with added information on directionality and magnitude of effect from logistic regression. These results should not be construed as causal but can support hypothesis generation. CONCLUSION: This investigation enabled the generation of hypotheses on a variety of potential links between FRDs and exposures. Future investigations should prospectively evaluate the variables hypothesized to impact FRDs.


Subjects
Environmental Exposure , Humans , Female , Environmental Exposure/adverse effects , Genital Diseases, Female , Logistic Models , Nutritional Status , Diet , Adult , Random Forest
4.
J Pers Med ; 14(4)2024 Mar 25.
Article in English | MEDLINE | ID: mdl-38672968

ABSTRACT

Artificial intelligence (AI) approaches have been introduced in various disciplines but remain rather unused in head and neck (H&N) cancers. This survey aimed to infer the current applications of and attitudes toward AI in the multidisciplinary care of H&N cancers. From November 2020 to June 2022, a web-based questionnaire examining the relationship between AI usage and professionals' demographics and attitudes was delivered to different professionals involved in H&N cancers through social media and mailing lists. A total of 139 professionals completed the questionnaire. Only 49.7% of the respondents reported having experience with AI. The most frequent AI users were radiologists (66.2%). Significant predictors of AI use were primary specialty (V = 0.455; p < 0.001), academic qualification and age. AI's potential was seen in the improvement of diagnostic accuracy (72%), surgical planning (64.7%), treatment selection (57.6%), risk assessment (50.4%) and the prediction of complications (45.3%). Among participants, 42.7% had significant concerns over AI use, with the most frequent being the 'loss of control' (27.6%) and 'diagnostic errors' (57.0%). This survey reveals limited engagement with AI in multidisciplinary H&N cancer care, highlighting the need for broader implementation and further studies to explore its acceptance and benefits.

5.
Bioinform Adv ; 4(1): vbae036, 2024.
Article in English | MEDLINE | ID: mdl-38577542

ABSTRACT

Motivation: Graph representation learning is a family of related approaches that learn low-dimensional vector representations of nodes and other graph elements called embeddings. Embeddings approximate characteristics of the graph and can be used for a variety of machine-learning tasks such as novel edge prediction. For many biomedical applications, partial knowledge exists about positive edges that represent relationships between pairs of entities, but little to no knowledge is available about negative edges that represent the explicit lack of a relationship between two nodes. For this reason, classification procedures are forced to assume that the vast majority of unlabeled edges are negative. Existing approaches to sampling negative edges for training and evaluating classifiers do so by uniformly sampling pairs of nodes. Results: We show here that this sampling strategy typically leads to sets of positive and negative examples with imbalanced node degree distributions. Using a representative heterogeneous biomedical knowledge graph and random walk-based graph machine learning, we show that this strategy substantially impacts classification performance. If users of graph machine-learning models apply the models to prioritize examples that are drawn from approximately the same distribution as the positive examples, then performance of models as estimated in the validation phase may be artificially inflated. We present a degree-aware node sampling approach that mitigates this effect and is simple to implement. Availability and implementation: Our code and data are publicly available at https://github.com/monarch-initiative/negativeExampleSelection.
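
The idea of degree-aware negative sampling described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation (their code is at the URL above); it assumes one simple reading of "degree-aware", namely drawing both endpoints of a candidate negative edge with probability proportional to their degree among the positive edges. The function name and parameters are hypothetical.

```python
import random
from collections import Counter


def sample_negative_edges(positive_edges, n_samples, seed=0, max_tries=100_000):
    """Sample undirected negative edges with degree-proportional endpoints.

    Assumption: node degree is counted over the positive edges only, and any
    node pair absent from the positive set is treated as a negative.
    """
    rng = random.Random(seed)
    positives = {frozenset(edge) for edge in positive_edges}

    # Degree of every node in the positive graph.
    degree = Counter()
    for u, v in positive_edges:
        degree[u] += 1
        degree[v] += 1
    nodes = sorted(degree)
    weights = [degree[node] for node in nodes]

    negatives = set()
    tries = 0
    while len(negatives) < n_samples and tries < max_tries:
        tries += 1
        # High-degree nodes are drawn more often, unlike uniform sampling.
        u, v = rng.choices(nodes, weights=weights, k=2)
        pair = frozenset((u, v))
        if u == v or pair in positives or pair in negatives:
            continue
        negatives.add(pair)
    return sorted(tuple(sorted(pair)) for pair in negatives)
```

Uniform sampling would replace the weighted draw with `rng.sample(nodes, 2)`; comparing the degree distributions of the two negative sets against the positives is one way to see the imbalance the abstract reports.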

6.
medRxiv ; 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-37503093

ABSTRACT

Objective: Large Language Models such as GPT-4 previously have been applied to differential diagnostic challenges based on published case reports. Published case reports have a sophisticated narrative style that is not readily available from typical electronic health records (EHR). Furthermore, even if such a narrative were available in EHRs, privacy requirements would preclude sending it outside the hospital firewall. We therefore tested a method for parsing clinical texts to extract ontology terms and programmatically generating prompts that by design are free of protected health information. Materials and Methods: We investigated different methods to prepare prompts from 75 recently published case reports. We transformed the original narratives by extracting structured terms representing phenotypic abnormalities, comorbidities, treatments, and laboratory tests and creating prompts programmatically. Results: Performance of all of these approaches was modest, with the correct diagnosis ranked first in only 5.3-17.6% of cases. The performance of the prompts created from structured data was substantially worse than that of the original narrative texts, even if additional information was added following manual review of term extraction. Moreover, different versions of GPT-4 demonstrated substantially different performance on this task. Discussion: The sensitivity of the performance to the form of the prompt and the instability of results over two GPT-4 versions represent important current limitations to the use of GPT-4 to support diagnosis in real-life clinical settings. Conclusion: Research is needed to identify the best methods for creating prompts from typically available clinical data to support differential diagnostics.
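
The approach of building prompts programmatically from extracted terms, rather than from free narrative, can be sketched as follows. This is an illustration only: the prompt wording, the function name, and the example terms are assumptions and do not reproduce the templates used in the study.

```python
def build_prompt(phenotypes, comorbidities=(), treatments=(), lab_tests=()):
    """Assemble a diagnosis prompt from structured term labels (hypothetical).

    Only controlled-vocabulary labels are passed in, so the prompt contains no
    free-text narrative from the record. The template text is an assumption.
    """
    sections = [
        ("Phenotypic abnormalities", phenotypes),
        ("Comorbidities", comorbidities),
        ("Treatments", treatments),
        ("Laboratory tests", lab_tests),
    ]
    lines = [
        "Provide a ranked differential diagnosis for a patient "
        "with the following findings."
    ]
    for title, terms in sections:
        if terms:
            # Deduplicate and sort so the same term set always yields the same prompt.
            lines.append(f"{title}: " + "; ".join(sorted(set(terms))))
    return "\n".join(lines)
```

For example, `build_prompt(["Seizure", "Hepatomegaly"])` yields a two-line prompt whose second line is `Phenotypic abnormalities: Hepatomegaly; Seizure`. Sorting the terms makes prompts reproducible, which matters given the sensitivity to prompt form that the abstract reports.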

7.
Front Bioinform ; 3: 1304099, 2023.
Article in English | MEDLINE | ID: mdl-38076030

ABSTRACT

The recent breakthroughs of Large Language Models (LLMs) in the context of natural language processing have opened the way to significant advances in protein research. Indeed, the relationships between human natural language and the "language of proteins" invite the application and adaptation of LLMs to protein modeling and design. Considering the impressive results of GPT-4 and other recently developed LLMs in processing, generating and translating human languages, we anticipate analogous results with the language of proteins. Indeed, protein language models have already been trained to accurately predict protein properties and to generate novel, functionally characterized proteins, achieving state-of-the-art results. In this paper we discuss the promises and the open challenges raised by this novel and exciting research area, and we propose our perspective on how LLMs will affect protein modeling and design.

8.
J Clin Med ; 12(24)2023 Dec 09.
Article in English | MEDLINE | ID: mdl-38137667

ABSTRACT

PURPOSE: To evaluate the clinical impact of a protocol for the image-guided percutaneous microwave ablation (MWA) of hepatocellular carcinoma (HCC) that includes cone-beam computed tomography (CBCT), fusion imaging and ablation volume prediction in patients with HCC unsuitable for standard ultrasound (US) guidance. MATERIALS AND METHODS: This study included all patients with HCC treated with MWA between January 2021 and June 2022 in a tertiary institution. Patients were divided into two groups: Group A, treated following the protocol, and Group B, treated with standard US guidance. Follow-up images were reviewed to assess residual disease (RD), local tumor progression (LTP) and intrahepatic distant recurrence (IDR). Ablation response at 1 month was also evaluated according to mRECIST. Baseline variables and outcomes were compared between the groups. For 1-month RD, propensity score weighting (PSW) was performed. RESULTS: A total of 80 consecutive patients with 101 HCCs treated with MWA were divided into two groups. Group A had 41 HCCs in 37 patients, and Group B had 60 HCCs in 43 patients. Among all baseline variables, the groups differed in age (mean of 72 years in Group A and 64 years in Group B), new vs. residual tumor rates (48% Group A vs. 25% Group B, p < 0.05), number of subcapsular tumors (56.7% Group B vs. 31.7% Group A, p < 0.05) and number of perivascular tumors (51.7% Group B vs. 17.1% Group A, p < 0.05). The protocol led to repositioning the antenna in 49% of cases. There was a significant difference in 1-month local response between the groups measured as the RD rate and mRECIST outcomes. LTP rates at 3 and 6 months, and IDR rates at 1, 3 and 6 months, showed no significant differences. Among all variables, logistic regression after PSW demonstrated a protective effect of the protocol against 1-month RD.
CONCLUSIONS: The use of CBCT, fusion imaging and ablation volume prediction during percutaneous MWA of HCCs provided better 1-month local tumor control. Further studies with a larger population and longer follow-up are needed.

9.
EBioMedicine ; 96: 104777, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37672869

ABSTRACT

BACKGROUND: The cause and symptoms of long COVID are poorly understood. It is challenging to predict whether a given COVID-19 patient will develop long COVID in the future. METHODS: We used electronic health record (EHR) data from the National COVID Cohort Collaborative to predict the incidence of long COVID. We trained two machine learning (ML) models - logistic regression (LR) and random forest (RF). Features used to train predictors included symptoms and drugs ordered during acute infection, measures of COVID-19 treatment, pre-COVID comorbidities, and demographic information. We assigned the 'long COVID' label to patients diagnosed with the U09.9 ICD10-CM code. The cohorts included patients with (a) EHRs reported from data partners using U09.9 ICD10-CM code and (b) at least one EHR in each feature category. We analysed three cohorts: all patients (n = 2,190,579; diagnosed with long COVID = 17,036), inpatients (149,319; 3,295), and outpatients (2,041,260; 13,741). FINDINGS: LR and RF models yielded median AUROC of 0.76 and 0.75, respectively. Ablation study revealed that drugs had the highest influence on the prediction task. The SHAP method identified age, gender, cough, fatigue, albuterol, obesity, diabetes, and chronic lung disease as explanatory features. Models trained on data from one N3C partner and tested on data from the other partners had average AUROC of 0.75. INTERPRETATION: ML-based classification using EHR information from the acute infection period is effective in predicting long COVID. SHAP methods identified important features for prediction. Cross-site analysis demonstrated the generalizability of the proposed methodology. FUNDING: NCATS U24 TR002306, NCATS UL1 TR003015, Axle Informatics Subcontract: NCATS-P00438-B, NIH/NIDDK/OD, PSR2015-1720GVALE_01, G43C22001320007, and Director, Office of Science, Office of Basic Energy Sciences of the U.S. Department of Energy Contract No. DE-AC02-05CH11231.


Subjects
COVID-19 , Post-Acute COVID-19 Syndrome , Humans , COVID-19 Drug Treatment , Machine Learning , Obesity
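
The AUROC values reported in this record summarize how well a model ranks patients who developed long COVID above those who did not. A minimal sketch of that metric, computed directly from its pairwise definition, is given below. It is a didactic version with quadratic cost, not the evaluation code used in the study, and the function name is hypothetical.

```python
def auroc(labels, scores):
    """Area under the ROC curve from its pairwise-ranking definition.

    Returns the fraction of (positive, negative) pairs in which the positive
    example has the higher score, counting ties as half. Labels are 0 or 1.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

An AUROC of 0.75 therefore means that a randomly chosen positive patient receives a higher score than a randomly chosen negative patient about three times out of four; 0.5 corresponds to random ranking.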
10.
medRxiv ; 2023 Jul 16.
Article in English | MEDLINE | ID: mdl-37502882

ABSTRACT

Objective: Female reproductive disorders (FRDs) are common health conditions that may present with significant symptoms. Diet and environment are potential areas for FRD interventions. We utilized a knowledge graph (KG) method to predict factors associated with common FRDs (e.g., endometriosis, ovarian cyst, and uterine fibroids). Materials and Methods: We harmonized survey data from the Personalized Environment and Genes Study on internal and external environmental exposures and health conditions with biomedical ontology content. We merged the harmonized data and ontologies with supplemental nutrient and agricultural chemical data to create a KG. We analyzed the KG by embedding edges and applying a random forest for edge prediction to identify variables potentially associated with FRDs. We also conducted logistic regression analysis for comparison. Results: Across 9765 PEGS respondents, the KG analysis resulted in 8535 significant predicted links between FRDs and chemicals, phenotypes, and diseases. Amongst these links, 32 were exact matches when compared with the logistic regression results, including comorbidities, medications, foods, and occupational exposures. Discussion: Mechanistic underpinnings of predicted links documented in the literature may support some of our findings. Our KG methods are useful for predicting possible associations in large, survey-based datasets with added information on directionality and magnitude of effect from logistic regression. These results should not be construed as causal, but can support hypothesis generation. Conclusion: This investigation enabled the generation of hypotheses on a variety of potential links between FRDs and exposures. Future investigations should prospectively evaluate the variables hypothesized to impact FRDs.

11.
Diagnostics (Basel) ; 13(11)2023 May 25.
Article in English | MEDLINE | ID: mdl-37296701

ABSTRACT

(1) Background: The assessment of resection margins during surgery of oral cavity squamous cell cancer (OCSCC) dramatically impacts the prognosis of the patient as well as the need for adjuvant treatment in the future. Currently there is an unmet need to improve OCSCC surgical margins, which appear to be involved in around 45% of cases. Intraoperative imaging techniques, namely magnetic resonance imaging (MRI) and intraoral ultrasound (ioUS), have emerged as promising tools in guiding surgical resection, although the number of studies available on this subject is still low. The aim of this diagnostic test accuracy (DTA) review is to investigate the accuracy of intraoperative imaging in the assessment of OCSCC margins. (2) Methods: By using the Cochrane-supported platform Review Manager version 5.4, a systematic search was performed on the online databases MEDLINE-EMBASE-CENTRAL using the keywords "oral cavity cancer, squamous cell carcinoma, tongue cancer, surgical margins, magnetic resonance imaging, intraoperative, intra-oral ultrasound". (3) Results: Ten papers were identified for full-text analysis. The negative predictive value (cutoff < 5 mm) for ioUS ranged from 0.55 to 0.91, and that of MRI ranged from 0.5 to 0.91; accuracy analysis performed on four selected studies showed a sensitivity ranging from 0.07 to 0.75 and specificity ranging from 0.81 to 1. Image guidance allowed for a mean improvement in free margin resection of 35%. (4) Conclusions: ioUS shows comparable accuracy to that of ex vivo MRI for the assessment of close and involved surgical margins, and should be preferred as the more affordable and reproducible technique. Both techniques showed higher diagnostic yield if applied to early OCSCC (T1-T2 stages), and when histology is favorable.
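
The accuracy measures quoted in this abstract (sensitivity, specificity, negative predictive value) all derive from the four cells of a confusion matrix. The sketch below shows the standard definitions; the function name and the example counts are hypothetical and are not taken from the reviewed studies.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from confusion-matrix counts.

    Here a "positive" test would mean imaging flags a close or involved
    margin, and the reference standard would be histology.
    """
    return {
        "sensitivity": tp / (tp + fn),  # flagged among truly involved margins
        "specificity": tn / (tn + fp),  # cleared among truly free margins
        "npv": tn / (tn + fn),          # truly free among margins cleared by imaging
        "ppv": tp / (tp + fp),          # truly involved among margins flagged
    }
```

With made-up counts `tp=3, fp=1, fn=1, tn=15`, sensitivity is 0.75 while specificity and NPV are both 0.9375. Because NPV depends on how common involved margins are in the sample, values from different studies are not directly comparable.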

12.
NPJ Digit Med ; 6(1): 89, 2023 May 19.
Article in English | MEDLINE | ID: mdl-37208468

ABSTRACT

Common data models solve many challenges of standardizing electronic health record (EHR) data but are unable to semantically integrate all of the resources needed for deep phenotyping. Open Biological and Biomedical Ontology (OBO) Foundry ontologies provide computable representations of biological knowledge and enable the integration of heterogeneous data. However, mapping EHR data to OBO ontologies requires significant manual curation and domain expertise. We introduce OMOP2OBO, an algorithm for mapping Observational Medical Outcomes Partnership (OMOP) vocabularies to OBO ontologies. Using OMOP2OBO, we produced mappings for 92,367 conditions, 8611 drug ingredients, and 10,673 measurement results, which covered 68-99% of concepts used in clinical practice when examined across 24 hospitals. When used to phenotype rare disease patients, the mappings helped systematically identify undiagnosed patients who might benefit from genetic testing. By aligning OMOP vocabularies to OBO ontologies our algorithm presents new opportunities to advance EHR-based deep phenotyping.

13.
Bioinformatics ; 39(4)2023 Apr 03.
Article in English | MEDLINE | ID: mdl-36929917

ABSTRACT

MOTIVATION: Advances in RNA sequencing technologies have achieved an unprecedented accuracy in the quantification of mRNA isoforms, but our knowledge of isoform-specific functions has lagged behind. There is a need to understand the functional consequences of differential splicing, which could be supported by the generation of accurate and comprehensive isoform-specific gene ontology annotations. RESULTS: We present isoform interpretation, a method that uses expectation-maximization to infer isoform-specific functions based on the relationship between sequence and functional isoform similarity. We predicted isoform-specific functional annotations for 85 617 isoforms of 17 900 protein-coding human genes spanning a range of 17 430 distinct gene ontology terms. Comparison with a gold-standard corpus of manually annotated human isoform functions showed that isoform interpretation significantly outperforms state-of-the-art competing methods. We provide experimental evidence that functionally related isoforms predicted by isoform interpretation show a higher degree of domain sharing and expression correlation than functionally related genes. We also show that isoform sequence similarity correlates better with inferred isoform function than with gene-level function. AVAILABILITY AND IMPLEMENTATION: Source code, documentation, and resource files are freely available under a GNU3 license at https://github.com/TheJacksonLaboratory/isopretEM and https://zenodo.org/record/7594321.


Subjects
Motivation , Software , Humans , Protein Isoforms/genetics , Alternative Splicing , Sequence Analysis, RNA
14.
J Biomed Inform ; 139: 104295, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36716983

ABSTRACT

Healthcare datasets obtained from Electronic Health Records have proven to be extremely useful for assessing associations between patients' predictors and outcomes of interest. However, these datasets often suffer from missing values in a high proportion of cases, whose removal may introduce severe bias. Several multiple imputation algorithms have been proposed to attempt to recover the missing information under an assumed missingness mechanism. Each algorithm presents strengths and weaknesses, and there is currently no consensus on which multiple imputation algorithm works best in a given scenario. Furthermore, the selection of each algorithm's parameters and data-related modeling choices are also both crucial and challenging. In this paper we propose a novel framework to numerically evaluate strategies for handling missing data in the context of statistical analysis, with a particular focus on multiple imputation techniques. We demonstrate the feasibility of our approach on a large cohort of type-2 diabetes patients provided by the National COVID Cohort Collaborative (N3C) Enclave, where we explored the influence of various patient characteristics on outcomes related to COVID-19. Our analysis included classic multiple imputation techniques as well as simple complete-case Inverse Probability Weighted models. Extensive experiments show that our approach can effectively highlight the most promising and performant missing-data handling strategy for our case study. Moreover, our methodology allowed a better understanding of the behavior of the different models and of how it changed as we modified their parameters. Our method is general and can be applied to different research fields and on datasets containing heterogeneous types.


Subjects
COVID-19 , Humans , Algorithms , Research Design , Bias , Probability
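
To illustrate the complete-case inverse probability weighting mentioned in this abstract, here is a minimal sketch that estimates a mean when some values are missing. It is a deliberate simplification: the study fits weighted regression models, whereas this sketch only computes a weighted mean, and it assumes the probability of being observed depends on a single categorical stratum. All names are hypothetical.

```python
from collections import defaultdict


def ipw_complete_case_mean(records, stratum_key, value_key):
    """Inverse-probability-weighted mean over complete cases (hypothetical).

    Each observed value is weighted by 1 / P(observed | stratum), where the
    probability is estimated as the observed fraction within the stratum.
    Missing values are represented by None.
    """
    totals = defaultdict(int)
    observed = defaultdict(int)
    for record in records:
        totals[record[stratum_key]] += 1
        if record[value_key] is not None:
            observed[record[stratum_key]] += 1

    weighted_sum = 0.0
    weight_total = 0.0
    for record in records:
        value = record[value_key]
        if value is None:
            continue  # incomplete cases contribute only through the weights
        p_observed = observed[record[stratum_key]] / totals[record[stratum_key]]
        weight = 1.0 / p_observed
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total
```

If values are missing more often in one stratum, a plain complete-case mean under-represents that stratum; the weights restore its share, provided missingness depends only on the stratum.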
15.
EBioMedicine ; 87: 104413, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36563487

ABSTRACT

BACKGROUND: Stratification of patients with post-acute sequelae of SARS-CoV-2 infection (PASC, or long COVID) would allow precision clinical management strategies. However, long COVID is incompletely understood and characterised by a wide range of manifestations that are difficult to analyse computationally. Additionally, the generalisability of machine learning classification of COVID-19 clinical outcomes has rarely been tested. METHODS: We present a method for computationally modelling PASC phenotype data based on electronic healthcare records (EHRs) and for assessing pairwise phenotypic similarity between patients using semantic similarity. Our approach defines a nonlinear similarity function that maps from a feature space of phenotypic abnormalities to a matrix of pairwise patient similarity that can be clustered using unsupervised machine learning. FINDINGS: We found six clusters of PASC patients, each with distinct profiles of phenotypic abnormalities, including clusters with distinct pulmonary, neuropsychiatric, and cardiovascular abnormalities, and a cluster associated with broad, severe manifestations and increased mortality. There was significant association of cluster membership with a range of pre-existing conditions and measures of severity during acute COVID-19. We assigned new patients from other healthcare centres to clusters by maximum semantic similarity to the original patients, and showed that the clusters were generalisable across different hospital systems. The increased mortality rate originally identified in one cluster was consistently observed in patients assigned to that cluster in other hospital systems. INTERPRETATION: Semantic phenotypic clustering provides a foundation for assigning patients to stratified subgroups for natural history or therapy studies on PASC. FUNDING: NIH (TR002306/OT2HL161847-01/OD011883/HG010860), U.S.D.O.E. (DE-AC02-05CH11231), Donald A. Roux Family Fund at Jackson Laboratory, Marsico Family at CU Anschutz.


Subjects
COVID-19 , Post-Acute COVID-19 Syndrome , Humans , Disease Progression , SARS-CoV-2
16.
Head Neck ; 45(2): 482-491, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36349545

ABSTRACT

Machine learning (ML) is increasingly used to detect lymph node (LN) metastases in head and neck (H&N) carcinoma. We systematically reviewed the literature on radiomic-based ML for the detection of pathological LNs in H&N cancer. A systematic review was conducted in PubMed, EMBASE, and the Cochrane Library. Baseline study characteristics and methodological quality items (modeling, performance evaluation, clinical utility, and transparency items) were extracted and evaluated. The qualitative synthesis is presented using descriptive statistics. Seven studies were included in this study. Overall, the methodological quality items were generally favorable for modeling (57% of studies). The studies were mostly unsuccessful in terms of transparency (85.7%), evaluation of clinical utility (71.3%), and assessment of generalizability employing independent or external validation (72.5%). ML may be able to predict LN metastases in H&N cancer. Further studies are warranted to improve the generalizability assessment, clinical utility evaluation, and transparency items.


Subjects
Head and Neck Neoplasms , Lymph Nodes , Humans , Lymphatic Metastasis/pathology , Lymph Nodes/diagnostic imaging , Lymph Nodes/pathology , Head and Neck Neoplasms/diagnostic imaging , Head and Neck Neoplasms/pathology , Machine Learning
17.
Nat Comput Sci ; 3(6): 552-568, 2023 Jun.
Article in English | MEDLINE | ID: mdl-38177435

ABSTRACT

Graph representation learning methods opened new avenues for addressing complex, real-world problems represented by graphs. However, many graphs used in these applications comprise millions of nodes and billions of edges and are beyond the capabilities of current methods and software implementations. We present GRAPE (Graph Representation Learning, Prediction and Evaluation), a software resource for graph processing and embedding that is able to scale with big graphs by using specialized and smart data structures, algorithms, and a fast parallel implementation of random-walk-based methods. Compared with state-of-the-art software resources, GRAPE shows an improvement of orders of magnitude in empirical space and time complexity, as well as competitive edge- and node-label prediction performance. GRAPE comprises approximately 1.7 million well-documented lines of Python and Rust code and provides 69 node-embedding methods, 25 inference models, a collection of efficient graph-processing utilities, and over 80,000 graphs from the literature and other sources. Standardized interfaces allow a seamless integration of third-party libraries, while ready-to-use and modular pipelines permit an easy-to-use evaluation of graph-representation-learning methods, therefore also positioning GRAPE as a software resource that performs a fair comparison between methods and libraries for graph processing and embedding.


Subjects
Libraries , Vitis , Algorithms , Software , Learning
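
The random-walk-based methods mentioned in this abstract start by generating node sequences from the graph, which are then fed to an embedding model. The sketch below shows plain first-order uniform random walks in pure Python. It does not use or imitate the GRAPE API, and it has none of the optimizations the abstract describes; the function name and parameters are hypothetical.

```python
import random


def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Generate uniform first-order random walks from every node.

    `adjacency` maps each node to a list of its neighbours. A walk stops
    early if it reaches a node with no outgoing neighbours.
    """
    rng = random.Random(seed)
    walks = []
    for start in sorted(adjacency):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adjacency.get(walk[-1], [])
                if not neighbours:
                    break
                # Each neighbour is equally likely (no return or in-out bias).
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks
```

Nodes that co-occur within a short window of these walks are treated as context for one another by skip-gram-style embedding models, which is why the cost of walk generation dominates on graphs with billions of edges.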
18.
BMC Bioinformatics ; 23(Suppl 2): 154, 2022 Dec 12.
Article in English | MEDLINE | ID: mdl-36510125

ABSTRACT

BACKGROUND: Cis-regulatory regions (CRRs) are non-coding regions of the DNA that finely control the spatio-temporal pattern of transcription; they are involved in a wide range of pivotal processes such as the development of specific cell lines/tissues and the dynamic cell response to physiological stimuli. Recent studies showed that genetic variants occurring in CRRs are strongly correlated with pathogenicity or deleteriousness. Considering the central role of CRRs in the regulation of physiological and pathological conditions, the correct identification of CRRs and of their tissue-specific activity status through Machine Learning methods plays a major role in dissecting the impact of genetic variants on human diseases. Unfortunately, the problem is still open, though some promising results have already been reported by (deep) machine-learning-based methods that predict active promoters and enhancers in specific tissues or cell lines by encoding epigenetic or spectral features directly extracted from DNA sequences. RESULTS: We present the experiments we performed to compare two Deep Neural Networks, a Feed-Forward Neural Network model working on epigenomic features and a Convolutional Neural Network model working only on genomic sequence, targeted to the identification of enhancer and promoter activity in specific cell lines. While performing experiments to understand how the experimental setup influences the prediction performance of the methods, we particularly focused on (1) automatic model selection performed by Bayesian optimization and (2) exploring different data rebalancing setups for reducing the negative effects of data imbalance.
CONCLUSIONS: Results show that (1) automatic model selection by Bayesian optimization improves the quality of the learner; (2) data rebalancing considerably impacts the prediction performance of the models; test set rebalancing may provide over-optimistic results, and should therefore be cautiously applied; (3) despite working on sequence data, convolutional models obtain performance close to that of feed-forward models working on epigenomic information, which suggests that sequence data also carry informative content for CRR-activity prediction. We therefore suggest combining both models/data types in future work.
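
Conclusion (2) can be illustrated with a small sketch that is not taken from the paper. It assumes a hypothetical classifier with a fixed true-positive rate of 0.8 and false-positive rate of 0.1, and hypothetical test-set sizes; only the class ratio of the test set changes between the two evaluations.

```python
def precision(n_pos: int, n_neg: int, tpr: float, fpr: float) -> float:
    """Expected precision of a classifier with the given rates on a test set
    containing n_pos positive and n_neg negative examples."""
    true_positives = n_pos * tpr    # positives correctly flagged
    false_positives = n_neg * fpr   # negatives wrongly flagged
    return true_positives / (true_positives + false_positives)


# Hypothetical classifier quality (an assumption, not a result from the paper).
TPR, FPR = 0.8, 0.1

# Realistic, imbalanced test set: 100 active regions among 10,000 inactive ones.
p_imb = precision(n_pos=100, n_neg=10_000, tpr=TPR, fpr=FPR)

# The same classifier scored on a test set rebalanced to 1:1.
p_bal = precision(n_pos=100, n_neg=100, tpr=TPR, fpr=FPR)

print(f"precision on imbalanced test set: {p_imb:.3f}")
print(f"precision on rebalanced test set: {p_bal:.3f}")
```

The classifier itself is unchanged, yet its precision looks far higher on the rebalanced test set, which is one way test set rebalancing can produce over-optimistic estimates.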


Subjects
Deep Learning, Humans, Bayes Theorem, Nucleic Acid Regulatory Sequences, Computer Neural Networks, Machine Learning
19.
medRxiv ; 2022 Nov 01.
Article in English | MEDLINE | ID: mdl-36380762

ABSTRACT

Acute COVID-19 infection can be followed by diverse clinical manifestations referred to as Post Acute Sequelae of SARS-CoV2 Infection (PASC). Studies have shown an increased risk of being diagnosed with new-onset psychiatric disease following a diagnosis of acute COVID-19. However, it was unclear whether non-psychiatric PASC-associated manifestations (PASC-AMs) are associated with an increased risk of new-onset psychiatric disease following COVID-19. A retrospective EHR cohort study of 1,603,767 individuals with acute COVID-19 was performed to evaluate whether non-psychiatric PASC-AMs are associated with new-onset psychiatric disease. Data were obtained from the National COVID Cohort Collaborative (N3C), which has EHR data from 65 clinical organizations. EHR codes were mapped to 151 non-psychiatric PASC-AMs recorded 28-120 days following SARS-CoV-2 diagnosis and before diagnosis of new-onset psychiatric disease. Association of newly diagnosed psychiatric disease with age, sex, race, pre-existing comorbidities, and PASC-AMs in seven categories was assessed by logistic regression. There was a significant association between six categories and newly diagnosed anxiety, mood, and psychotic disorders, with odds ratios highest for cardiovascular (1.35, 1.27-1.42) PASC-AMs. Secondary analysis revealed that the proportions of 95 individual clinical features significantly differed between patients diagnosed with different psychiatric disorders. Our study provides evidence for association between non-psychiatric PASC-AMs and the incidence of newly diagnosed psychiatric disease. Significant associations were found for features related to multiple organ systems. This information could prove useful in understanding risk stratification for new-onset psychiatric disease following COVID-19. Prospective studies are needed to corroborate these findings. Funding: NCATS U24 TR002306.

20.
Diabetes Res Clin Pract ; 194: 110157, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36400170

ABSTRACT

AIMS: Studies suggest that metformin is associated with reduced COVID-19 severity in individuals with diabetes compared to other antihyperglycemics. We assessed whether metformin is associated with reduced incidence of severe COVID-19 for patients with prediabetes or polycystic ovary syndrome (PCOS), common diseases that increase the risk of severe COVID-19. METHODS: This observational, retrospective study utilized EHR data from 52 hospitals for COVID-19 patients with PCOS or prediabetes treated with metformin or levothyroxine/ondansetron (controls). After balancing via inverse probability score weighting, associations with COVID-19 severity were assessed by logistic regression. RESULTS: In the prediabetes cohort, when compared to levothyroxine, metformin was associated with a significantly lower incidence of COVID-19 with "mild-ED" or worse (OR [95% CI]: 0.636 [0.455-0.888]) and "moderate" or worse severity (0.493 [0.339-0.718]). Compared to ondansetron, metformin was associated with lower incidence of "mild-ED" or worse severity (0.039 [0.026-0.057]), "moderate" or worse (0.045 [0.03-0.069]), "severe" or worse (0.183 [0.077-0.431]), and "mortality/hospice" (0.223 [0.071-0.694]). For PCOS, metformin showed no significant differences in severity compared to levothyroxine, but was associated with a significantly lower incidence of "mild-ED" or worse (0.101 [0.061-0.166]), and "moderate" or worse (0.094 [0.049-0.18]) COVID-19 outcome compared to ondansetron. CONCLUSIONS: Metformin use is associated with less severe COVID-19 in patients with prediabetes or PCOS.


Subjects
COVID-19, Metformin, Polycystic Ovary Syndrome, Prediabetic State, Female, Humans, Metformin/therapeutic use, Retrospective Studies, COVID-19/epidemiology, COVID-19/complications, Prediabetic State/drug therapy, Prediabetic State/epidemiology, Prediabetic State/complications, Polycystic Ovary Syndrome/complications, Hypoglycemic Agents/therapeutic use, Thyroxine
...