1.
bioRxiv ; 2024 Jul 04.
Article in English | MEDLINE | ID: mdl-39005436

ABSTRACT

Objectives: Concept embeddings are low-dimensional vector representations of concepts such as MeSH:D009203 (Myocardial Infarction), whose similarity in the embedded vector space reflects their semantic similarity. Here, we test the hypothesis that non-biomedical concept synonym replacement can improve the quality of biomedical concept embeddings. Materials and methods: We developed an approach that leverages WordNet to replace sets of synonyms with the most common representative of the synonym set. Results: We tested our approach on 1055 concept sets and found that, on average, the mean intra-cluster distance in the vector space was reduced by 8%. Assuming that homophily of related concepts in the vector space is desirable, our approach tends to improve the quality of embeddings. Discussion and Conclusion: This pilot study shows that non-biomedical synonym replacement tends to improve the quality of embeddings of biomedical concepts using the Word2Vec algorithm. We have implemented our approach in a freely available Python package at https://github.com/TheJacksonLaboratory/wn2vec.
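
The two ideas in this abstract, collapsing synonyms to one representative and scoring a concept set by its mean intra-cluster distance, can be illustrated with a minimal sketch. This is not the wn2vec implementation: the synonym sets below are hypothetical stand-ins for WordNet synsets, "most common" is assumed to mean most frequent in the corpus, and cosine distance is an assumed choice of metric.

```python
from collections import Counter
from itertools import combinations
import math

# Hypothetical synonym sets; the paper derives these from WordNet.
SYNSETS = [
    {"big", "large", "great"},
    {"show", "demonstrate", "exhibit"},
]

def build_replacement_map(tokens, synsets):
    """Map every synonym to the most frequent member of its set in the corpus."""
    counts = Counter(tokens)
    mapping = {}
    for synset in synsets:
        # Sorting first makes tie-breaking deterministic.
        representative = max(sorted(synset), key=lambda word: counts[word])
        for word in synset:
            mapping[word] = representative
    return mapping

def replace_synonyms(tokens, mapping):
    """Replace each token by its representative; other tokens pass through."""
    return [mapping.get(token, token) for token in tokens]

def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def mean_intra_cluster_distance(vectors):
    """Mean pairwise cosine distance among the vectors of one concept set."""
    pairs = list(combinations(vectors, 2))
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)

tokens = "large studies show a large effect and big trials demonstrate what reviews show".split()
mapping = build_replacement_map(tokens, SYNSETS)
normalized = replace_synonyms(tokens, mapping)
print(" ".join(normalized))
```

The normalized token stream would then be fed to an embedding algorithm such as Word2Vec, and the mean intra-cluster distance compared before and after replacement for each concept set.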

2.
medRxiv ; 2024 Jul 22.
Article in English | MEDLINE | ID: mdl-39108510

ABSTRACT

Large language models (LLMs) have shown great promise in supporting differential diagnosis, but the 23 available published studies on diagnostic accuracy evaluated small cohorts (number of cases, 30-422, mean 104) and evaluated LLM responses subjectively by manual curation (23/23 studies). The performance of LLMs for rare disease diagnosis has not been evaluated systematically. Here, we perform a rigorous and large-scale analysis of the performance of GPT-4 in prioritizing candidate diagnoses, using the largest-ever cohort of rare disease patients. Our computational study used 5267 computational case reports from previously published data. Each case was formatted as a Global Alliance for Genomics and Health (GA4GH) phenopacket, in which clinical anomalies were represented as Human Phenotype Ontology (HPO) terms. We developed software to generate prompts from each phenopacket. Prompts were sent to Generative Pre-trained Transformer 4 (GPT-4), and the rank of the correct diagnosis, if present in the response, was recorded. The mean reciprocal rank (MRR) of the correct diagnosis was 0.24 (with the reciprocal of the MRR corresponding to a rank of 4.2), and the correct diagnosis was placed in rank 1 in 19.2% of the cases, in the first 3 ranks in 28.6%, and in the first 10 ranks in 32.5%. Our study is the largest to be reported to date and provides a realistic estimate of the performance of GPT-4 in rare disease medicine.
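
The ranking metrics reported above can be made concrete with a short sketch. The ranks below are hypothetical, and counting a case whose correct diagnosis is absent from the response as a reciprocal rank of 0 is an assumption rather than something the abstract states.

```python
def mean_reciprocal_rank(ranks):
    """ranks: 1-based rank of the correct diagnosis per case, or None if absent.

    Assumption: a missing diagnosis contributes a reciprocal rank of 0.
    """
    return sum(1.0 / r if r is not None else 0.0 for r in ranks) / len(ranks)

def top_k_rate(ranks, k):
    """Fraction of cases whose correct diagnosis appears within the first k ranks."""
    return sum(1 for r in ranks if r is not None and r <= k) / len(ranks)

# Hypothetical ranks for eight cases (None = correct diagnosis not returned).
ranks = [1, 3, None, 2, None, 10, 1, None]

mrr = mean_reciprocal_rank(ranks)
top1 = top_k_rate(ranks, 1)
top3 = top_k_rate(ranks, 3)
top10 = top_k_rate(ranks, 10)
print(f"MRR={mrr:.3f} top1={top1:.3f} top3={top3:.3f} top10={top10:.3f}")

# The abstract's parenthetical: an MRR of 0.24 corresponds to a rank of 1 / 0.24.
equivalent_rank = round(1 / 0.24, 1)
```

Because the MRR averages reciprocals, its inverse is a harmonic-style summary and is not the arithmetic mean rank.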

3.
Bioinform Adv ; 4(1): vbae036, 2024.
Article in English | MEDLINE | ID: mdl-38577542

ABSTRACT

Motivation: Graph representation learning is a family of related approaches that learn low-dimensional vector representations of nodes and other graph elements called embeddings. Embeddings approximate characteristics of the graph and can be used for a variety of machine-learning tasks such as novel edge prediction. For many biomedical applications, partial knowledge exists about positive edges that represent relationships between pairs of entities, but little to no knowledge is available about negative edges that represent the explicit lack of a relationship between two nodes. For this reason, classification procedures are forced to assume that the vast majority of unlabeled edges are negative. Existing approaches to sampling negative edges for training and evaluating classifiers do so by uniformly sampling pairs of nodes. Results: We show here that this sampling strategy typically leads to sets of positive and negative examples with imbalanced node degree distributions. Using a representative heterogeneous biomedical knowledge graph and random walk-based graph machine learning, we show that this strategy substantially impacts classification performance. If users of graph machine-learning models apply the models to prioritize examples that are drawn from approximately the same distribution as the positive examples, then the performance of models as estimated in the validation phase may be artificially inflated. We present a degree-aware node sampling approach that mitigates this effect and is simple to implement. Availability and implementation: Our code and data are publicly available at https://github.com/monarch-initiative/negativeExampleSelection.
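
One simple way to make negative sampling degree-aware is to draw each endpoint with probability proportional to its degree in the positive edge set, so that negatives have a node degree distribution closer to that of the positives. The sketch below illustrates that idea only; it is not necessarily the procedure used in the authors' repository, and the function name and toy edges are hypothetical.

```python
import random

def degree_aware_negatives(positive_edges, n_samples, seed=0):
    """Sample undirected negative edges with endpoints drawn proportionally to degree."""
    rng = random.Random(seed)
    positives = {frozenset(edge) for edge in positive_edges}
    # Each node appears in the pool once per incident positive edge, so a
    # uniform draw from the pool is a degree-proportional draw over nodes.
    pool = [node for edge in positive_edges for node in edge]
    negatives = set()
    attempts, max_attempts = 0, 100 * n_samples  # guard against dense graphs
    while len(negatives) < n_samples and attempts < max_attempts:
        attempts += 1
        u, v = rng.choice(pool), rng.choice(pool)
        pair = frozenset((u, v))
        # Reject self-loops, known positives, and duplicates.
        if u == v or pair in positives or pair in negatives:
            continue
        negatives.add(pair)
    return sorted(tuple(sorted(pair)) for pair in negatives)

# Hypothetical toy graph with one high-degree hub.
positive_edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("a", "b"), ("d", "e")]
negatives = degree_aware_negatives(positive_edges, n_samples=4)
print(negatives)
```

Uniform sampling would instead call `rng.choice` on the list of distinct nodes, which under-represents hubs relative to how often they occur in positive edges.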

4.
Sci Data ; 11(1): 906, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39174566

ABSTRACT

The "RNA world" represents a novel frontier for the study of fundamental biological processes and human diseases and is paving the way for the development of new drugs tailored to each patient's biomolecular characteristics. Although scientific data about coding and non-coding RNA molecules are constantly produced and available from public repositories, they are scattered across different databases, and a centralized, uniform, and semantically consistent representation of the "RNA world" is still lacking. We propose RNA-KG, a knowledge graph (KG) encompassing biological knowledge about RNAs gathered from more than 60 public databases, integrating functional relationships with genes, proteins, and chemicals and ontologically grounded biomedical concepts. To develop RNA-KG, we first identified, pre-processed, and characterized each data source; next, we built a meta-graph that provides an ontological description of the KG by representing all the bio-molecular entities and medical concepts of interest in this domain, as well as the types of interactions connecting them. Finally, we leveraged an instance-based semantically abstracted knowledge model to specify the ontological alignment according to which RNA-KG was generated. RNA-KG can be downloaded in different formats and also queried through a SPARQL endpoint. A thorough topological analysis of the resulting heterogeneous graph provides further insights into the characteristics of the "RNA world". RNA-KG can be directly explored and visualized, or analyzed by applying computational methods to infer bio-medical knowledge from its heterogeneous nodes and edges. The resource can be easily updated with new experimental data, and specific views of the overall KG can be extracted according to the bio-medical problem to be studied.


Subject(s)
RNA , RNA/genetics , Humans , Biological Ontologies
5.
Int J Med Inform ; 187: 105461, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38643701

ABSTRACT

OBJECTIVE: Female reproductive disorders (FRDs) are common health conditions that may present with significant symptoms. Diet and environment are potential areas for FRD interventions. We utilized a knowledge graph (KG) method to predict factors associated with common FRDs (for example, endometriosis, ovarian cyst, and uterine fibroids). MATERIALS AND METHODS: We harmonized survey data from the Personalized Environment and Genes Study (PEGS) on internal and external environmental exposures and health conditions with biomedical ontology content. We merged the harmonized data and ontologies with supplemental nutrient and agricultural chemical data to create a KG. We analyzed the KG by embedding edges and applying a random forest for edge prediction to identify variables potentially associated with FRDs. We also conducted logistic regression analysis for comparison. RESULTS: Across 9765 PEGS respondents, the KG analysis resulted in 8535 significant or suggestive predicted links between FRDs and chemicals, phenotypes, and diseases. Amongst these links, 32 were exact matches when compared with the logistic regression results, including comorbidities, medications, foods, and occupational exposures. DISCUSSION: Mechanistic underpinnings of predicted links documented in the literature may support some of our findings. Our KG methods are useful for predicting possible associations in large, survey-based datasets with added information on directionality and magnitude of effect from logistic regression. These results should not be construed as causal but can support hypothesis generation. CONCLUSION: This investigation enabled the generation of hypotheses on a variety of potential links between FRDs and exposures. Future investigations should prospectively evaluate the variables hypothesized to impact FRDs.


Subject(s)
Environmental Exposure , Humans , Female , Environmental Exposure/adverse effects , Genital Diseases, Female , Logistic Models , Nutritional Status , Diet , Adult , Random Forest
6.
J Pers Med ; 14(4)2024 Mar 25.
Article in English | MEDLINE | ID: mdl-38672968

ABSTRACT

Artificial intelligence (AI) approaches have been introduced in various disciplines but remain little used in head and neck (H&N) cancers. This survey aimed to infer the current applications of and attitudes toward AI in the multidisciplinary care of H&N cancers. From November 2020 to June 2022, a web-based questionnaire examining the relationship between AI usage and professionals' demographics and attitudes was delivered to different professionals involved in H&N cancers through social media and mailing lists. A total of 139 professionals completed the questionnaire. Only 49.7% of the respondents reported having experience with AI. The most frequent AI users were radiologists (66.2%). Significant predictors of AI use were primary specialty (V = 0.455; p < 0.001), academic qualification and age. AI's potential was seen in the improvement of diagnostic accuracy (72%), surgical planning (64.7%), treatment selection (57.6%), risk assessment (50.4%) and the prediction of complications (45.3%). Among participants, 42.7% had significant concerns over AI use, with the most frequent being the 'loss of control' (27.6%) and 'diagnostic errors' (57.0%). This survey reveals limited engagement with AI in multidisciplinary H&N cancer care, highlighting the need for broader implementation and further studies to explore its acceptance and benefits.

7.
Transl Psychiatry ; 14(1): 246, 2024 Jun 08.
Article in English | MEDLINE | ID: mdl-38851761

ABSTRACT

Acute COVID-19 infection can be followed by diverse clinical manifestations referred to as Post Acute Sequelae of SARS-CoV-2 Infection (PASC). Studies have shown an increased risk of being diagnosed with new-onset psychiatric disease following a diagnosis of acute COVID-19. However, it was unclear whether non-psychiatric PASC-associated manifestations (PASC-AMs) are associated with an increased risk of new-onset psychiatric disease following COVID-19. A retrospective electronic health record (EHR) cohort study of 2,391,006 individuals with acute COVID-19 was performed to evaluate whether non-psychiatric PASC-AMs are associated with new-onset psychiatric disease. Data were obtained from the National COVID Cohort Collaborative (N3C), which has EHR data from 76 clinical organizations. EHR codes were mapped to 151 non-psychiatric PASC-AMs recorded 28-120 days following SARS-CoV-2 diagnosis and before diagnosis of new-onset psychiatric disease. Association of newly diagnosed psychiatric disease with age, sex, race, pre-existing comorbidities, and PASC-AMs in seven categories was assessed by logistic regression. There were significant associations between a diagnosis of any psychiatric disease and five categories of PASC-AMs; odds ratios were highest for neurological, cardiovascular, and constitutional PASC-AMs (1.31, 1.29, and 1.23, respectively). Secondary analysis revealed that the proportions of 50 individual clinical features significantly differed between patients diagnosed with different psychiatric diseases. Our study provides evidence for an association between non-psychiatric PASC-AMs and the incidence of newly diagnosed psychiatric disease. Significant associations were found for features related to multiple organ systems. This information could prove useful in understanding risk stratification for new-onset psychiatric disease following COVID-19. Prospective studies are needed to corroborate these findings.


Subject(s)
COVID-19 , Mental Disorders , SARS-CoV-2 , Humans , COVID-19/psychology , COVID-19/complications , COVID-19/epidemiology , Male , Female , Mental Disorders/epidemiology , Middle Aged , Adult , Retrospective Studies , Aged , Phenotype , Post-Acute COVID-19 Syndrome , Comorbidity , Electronic Health Records , Young Adult , Risk Factors , Adolescent
8.
EBioMedicine ; 106: 105220, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39018755

ABSTRACT

BACKGROUND: Anthracycline-based neoadjuvant chemotherapy (NAC) may modify tumour immune infiltrate. This study characterized the spatial distribution of immune infiltrate after NAC in primary high-risk soft tissue sarcomas (STS) and investigated its association with prognosis. METHODS: The ISG-STS 1001 trial randomized STS patients to anthracycline plus ifosfamide (AI) or a histology-tailored (HT) NAC. Four areas of tumour specimens were sampled: the area showing the highest lymphocyte infiltrate (HI) at H&E; the area with lack of post-treatment changes (highest grade, HG); the area with post-treatment changes (lowest grade, LG); and the tumour edge (TE). CD3, CD8, PD-1, CD20, FOXP3, and CD163 were analyzed by immunohistochemistry and digital pathology. A machine learning method was used to generate sarcoma immune index scores (SIS) that predict patient disease-free and overall survival (DFS and OS). FINDINGS: Tumour infiltrating lymphocytes and PD-1+ cells together with CD163+ cells were more represented in STS histologies with complex compared with simple karyotype, while CD20+ B-cells were detected in both these histology groups. PD-1+ cells had a negative prognostic value irrespective of their spatial distribution. Enrichment in CD20+ B-cells at HI and TE areas was associated with better patient outcomes. We generated a prognostic SIS for each tumour area, with the HI-SIS showing the best performance. Such prognostic value was driven by treatment with AI. INTERPRETATION: The different spatial distribution of immune populations and their different association with prognosis support NAC as a modifier of tumour immune infiltrate in STS. FUNDING: Pharmamar; Italian Ministry of Health [RF-2019-12370923; GR-2016-02362609]; 5 × 1000 Funds-2016, Italian Ministry of Health; AIRC Grant [ID#28546].


Subject(s)
Lymphocytes, Tumor-Infiltrating , Neoadjuvant Therapy , Sarcoma , Humans , Sarcoma/drug therapy , Sarcoma/mortality , Sarcoma/immunology , Sarcoma/pathology , Female , Male , Lymphocytes, Tumor-Infiltrating/immunology , Lymphocytes, Tumor-Infiltrating/metabolism , Prognosis , Middle Aged , Adult , Aged , Treatment Outcome , Tumor Microenvironment/immunology , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Biomarkers, Tumor , Immunohistochemistry
9.
Sci Data ; 11(1): 363, 2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38605048

ABSTRACT

Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integration problems requires flexibility in the way knowledge is modeled. Moreover, existing KG construction methods provide robust tooling at the cost of fixed or limited choices among knowledge representation models. PheKnowLator (Phenotype Knowledge Translator) is a semantic ecosystem for automating the FAIR (Findable, Accessible, Interoperable, and Reusable) construction of ontologically grounded KGs with fully customizable knowledge representation. The ecosystem includes KG construction resources (e.g., data preparation APIs), analysis tools (e.g., SPARQL endpoint resources and abstraction algorithms), and benchmarks (e.g., prebuilt KGs). We evaluated the ecosystem by systematically comparing it to existing open-source KG construction methods and by analyzing its computational performance when used to construct 12 different large-scale KGs. With flexible knowledge representation, PheKnowLator enables fully customizable KGs without compromising performance or usability.


Subject(s)
Biological Science Disciplines , Knowledge Bases , Pattern Recognition, Automated , Algorithms , Translational Research, Biomedical
10.
J Clin Med ; 12(24)2023 Dec 09.
Article in English | MEDLINE | ID: mdl-38137667

ABSTRACT

PURPOSE: To evaluate the clinical impact of a protocol for the image-guided percutaneous microwave ablation (MWA) of hepatocellular carcinoma (HCC) that includes cone-beam computed tomography (CBCT), fusion imaging and ablation volume prediction in patients with HCC unsuitable for standard ultrasound (US) guidance. MATERIALS AND METHODS: This study included all patients with HCC treated with MWA between January 2021 and June 2022 in a tertiary institution. Patients were divided into two groups: Group A, treated following the protocol, and Group B, treated with standard US guidance. Follow-up images were reviewed to assess residual disease (RD), local tumor progression (LTP) and intrahepatic distant recurrence (IDR). Ablation response at 1 month was also evaluated according to mRECIST. Baseline variables and outcomes were compared between the groups. For 1-month RD, propensity score weighting (PSW) was performed. RESULTS: 80 consecutive patients with 101 HCCs treated with MWA were divided into two groups. Group A had 41 HCCs in 37 patients, and Group B had 60 HCCs in 43 patients. Among all baseline variables, the groups differed in age (mean of 72 years in Group A and 64 years in Group B), new vs. residual tumor rates (48% Group A vs. 25% Group B, p < 0.05), number of subcapsular tumors (56.7% Group B vs. 31.7% Group A, p < 0.05) and number of perivascular tumors (51.7% Group B vs. 17.1% Group A, p < 0.05). The protocol led to repositioning the antenna in 49% of cases. There was a significant difference in 1-month local response between the groups, measured as the RD rate and mRECIST outcomes. LTP rates at 3 and 6 months, and IDR rates at 1, 3 and 6 months, showed no significant differences. Among all variables, logistic regression after PSW demonstrated a protective effect of the protocol against 1-month RD. CONCLUSIONS: The use of CBCT, fusion imaging and ablation volume prediction during percutaneous MWA of HCCs provided better 1-month local tumor control. Further studies with a larger population and longer follow-up are needed.

11.
Nat Comput Sci ; 3(6): 552-568, 2023 Jun.
Article in English | MEDLINE | ID: mdl-38177435

ABSTRACT

Graph representation learning methods opened new avenues for addressing complex, real-world problems represented by graphs. However, many graphs used in these applications comprise millions of nodes and billions of edges and are beyond the capabilities of current methods and software implementations. We present GRAPE (Graph Representation Learning, Prediction and Evaluation), a software resource for graph processing and embedding that is able to scale with big graphs by using specialized and smart data structures, algorithms, and a fast parallel implementation of random-walk-based methods. Compared with state-of-the-art software resources, GRAPE shows an improvement of orders of magnitude in empirical space and time complexity, as well as competitive edge- and node-label prediction performance. GRAPE comprises approximately 1.7 million well-documented lines of Python and Rust code and provides 69 node-embedding methods, 25 inference models, a collection of efficient graph-processing utilities, and over 80,000 graphs from the literature and other sources. Standardized interfaces allow a seamless integration of third-party libraries, while ready-to-use and modular pipelines permit an easy-to-use evaluation of graph-representation-learning methods, therefore also positioning GRAPE as a software resource that performs a fair comparison between methods and libraries for graph processing and embedding.


Subject(s)
Libraries , Vitis , Algorithms , Software , Learning
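
The random-walk-based methods mentioned in the GRAPE abstract start from sampled node sequences, which are then passed to a Word2Vec-style model to learn node embeddings. The sketch below shows only that first step, with uniform first-order walks in plain Python. It does not use or imitate the GRAPE API; the function name and toy graph are hypothetical, and a production implementation would be parallel and far more memory-efficient.

```python
import random

def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Generate uniform first-order random walks from every node.

    adjacency: dict mapping each node to a list of its neighbors.
    """
    rng = random.Random(seed)
    walks = []
    for start in sorted(adjacency):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks

# Hypothetical undirected toy graph stored as adjacency lists.
adjacency = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}
walks = random_walks(adjacency, walk_length=5, walks_per_node=2)
for walk in walks:
    print(" -> ".join(walk))
```

Each walk plays the role of a sentence and each node the role of a word, which is why the same skip-gram machinery used for text can be reused for graphs.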