1.
Bioinformatics ; 40(10)2024 Oct 01.
Article in English | MEDLINE | ID: mdl-39271156

ABSTRACT

MOTIVATION: Molecular representation learning (MRL) models molecules with low-dimensional vectors to support biological and chemical applications. Current methods primarily rely on intrinsic molecular information to learn molecular representations, but they often overlook effectively integrating domain knowledge into MRL. RESULTS: In this article, we develop a reaction-enhanced graph learning (RXGL) framework for MRL, utilizing chemical reactions as domain knowledge. RXGL introduces dual graph learning modules to model molecule representation. One module employs graph convolutions on molecular graphs to capture molecule structures. The other module constructs a reaction-aware graph from chemical reactions and designs a novel graph attention network on this graph to integrate reaction-level relations into molecular modeling. To refine molecule representations, we design a reaction-based relation learning task, which considers the relations between the reactant and product sides in reactions. In addition, we introduce a cross-view contrastive task to strengthen the cooperative associations between molecular and reaction-aware graph learning. Experiment results show that our RXGL achieves strong performance in various downstream tasks, including product prediction, reaction classification, and molecular property prediction. AVAILABILITY AND IMPLEMENTATION: The code is publicly available at https://github.com/coder-ACAC/RLM.


Subjects
Machine Learning , Molecular Models , Algorithms , Computational Biology/methods , Chemical Models
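
The cross-view contrastive task mentioned in this abstract can be illustrated with a generic InfoNCE-style loss. This is a minimal sketch under stated assumptions, not the RXGL implementation: the function names, the cosine similarity, the temperature of 0.5, and the toy two-dimensional embeddings are all hypothetical. Row i of one view is treated as the positive match for row i of the other view.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss; row i of view_a is the positive for row i of view_b."""
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Negative log-softmax of the positive pair, computed stably.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Hypothetical embeddings of two molecules in two views.
molecular_view = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce(molecular_view, [[0.9, 0.1], [0.1, 0.9]])
swapped = info_nce(molecular_view, [[0.1, 0.9], [0.9, 0.1]])
```

The loss is lower when the two views of the same molecule agree (`aligned`) than when they are mismatched (`swapped`), which is the behaviour a contrastive objective rewards.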
2.
Brief Bioinform ; 23(4)2022 07 18.
Article in English | MEDLINE | ID: mdl-35679533

ABSTRACT

Patient similarity networks (PSNs), where patients are represented as nodes and their similarities as weighted edges, are being increasingly used in clinical research. These networks provide an insightful summary of the relationships among patients and can be exploited by inductive or transductive learning algorithms for the prediction of patient outcome, phenotype and disease risk. PSNs can also be easily visualized, thus offering a natural way to inspect complex heterogeneous patient data and providing some level of explainability of the predictions obtained by machine learning algorithms. The advent of high-throughput technologies, enabling us to acquire high-dimensional views of the same patients (e.g. omics data, laboratory data, imaging data), calls for the development of data fusion techniques for PSNs in order to leverage this rich heterogeneous information. In this article, we review existing methods for integrating multiple biomedical data views to construct PSNs, together with the different patient similarity measures that have been proposed. We also review methods that have appeared in the machine learning literature but have not yet been applied to PSNs, thus providing a resource to navigate the vast machine learning literature existing on this topic. In particular, we focus on methods that could be used to integrate very heterogeneous datasets, including multi-omics data as well as data derived from clinical information and medical imaging.


Subjects
Algorithms , Machine Learning
3.
Bioinformatics ; 39(4)2023 04 03.
Article in English | MEDLINE | ID: mdl-36929917

ABSTRACT

MOTIVATION: Advances in RNA sequencing technologies have achieved an unprecedented accuracy in the quantification of mRNA isoforms, but our knowledge of isoform-specific functions has lagged behind. There is a need to understand the functional consequences of differential splicing, which could be supported by the generation of accurate and comprehensive isoform-specific gene ontology annotations. RESULTS: We present isoform interpretation, a method that uses expectation-maximization to infer isoform-specific functions based on the relationship between sequence and functional isoform similarity. We predicted isoform-specific functional annotations for 85 617 isoforms of 17 900 protein-coding human genes spanning a range of 17 430 distinct gene ontology terms. Comparison with a gold-standard corpus of manually annotated human isoform functions showed that isoform interpretation significantly outperforms state-of-the-art competing methods. We provide experimental evidence that functionally related isoforms predicted by isoform interpretation show a higher degree of domain sharing and expression correlation than functionally related genes. We also show that isoform sequence similarity correlates better with inferred isoform function than with gene-level function. AVAILABILITY AND IMPLEMENTATION: Source code, documentation, and resource files are freely available under a GNU3 license at https://github.com/TheJacksonLaboratory/isopretEM and https://zenodo.org/record/7594321.


Subjects
Motivation , Software , Humans , Protein Isoforms/genetics , Alternative Splicing , RNA Sequence Analysis
4.
J Biomed Inform ; 139: 104295, 2023 03.
Article in English | MEDLINE | ID: mdl-36716983

ABSTRACT

Healthcare datasets obtained from Electronic Health Records have proven to be extremely useful for assessing associations between patients' predictors and outcomes of interest. However, these datasets often suffer from missing values in a high proportion of cases, whose removal may introduce severe bias. Several multiple imputation algorithms have been proposed to attempt to recover the missing information under an assumed missingness mechanism. Each algorithm presents strengths and weaknesses, and there is currently no consensus on which multiple imputation algorithm works best in a given scenario. Furthermore, the selection of each algorithm's parameters and data-related modeling choices are also both crucial and challenging. In this paper we propose a novel framework to numerically evaluate strategies for handling missing data in the context of statistical analysis, with a particular focus on multiple imputation techniques. We demonstrate the feasibility of our approach on a large cohort of type-2 diabetes patients provided by the National COVID Cohort Collaborative (N3C) Enclave, where we explored the influence of various patient characteristics on outcomes related to COVID-19. Our analysis included classic multiple imputation techniques as well as simple complete-case Inverse Probability Weighted models. Extensive experiments show that our approach can effectively highlight the most promising and performant missing-data handling strategy for our case study. Moreover, our methodology allowed a better understanding of the behavior of the different models and of how it changed as we modified their parameters. Our method is general and can be applied to different research fields and on datasets containing heterogeneous types.


Subjects
COVID-19 , Humans , Algorithms , Research Design , Bias , Probability
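
Multiple imputation produces one estimate per imputed dataset, and these are commonly combined with Rubin's rules. The abstract does not say how the authors pooled their results, so the sketch below is only an illustration of that standard pooling step; the function name and the three coefficient/variance pairs are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their variances with Rubin's rules."""
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean(variances)          # average sampling variance
    between = variance(estimates)     # variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, total


# Hypothetical coefficients and variances from three imputed datasets.
pooled_estimate, pooled_variance = pool_rubin(
    [0.50, 0.54, 0.46], [0.010, 0.012, 0.011]
)
```

The between-imputation term is what carries the extra uncertainty caused by the missing values; it vanishes only if every imputed dataset yields the same estimate.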
5.
BMC Bioinformatics ; 23(Suppl 2): 154, 2022 Dec 12.
Article in English | MEDLINE | ID: mdl-36510125

ABSTRACT

BACKGROUND: Cis-regulatory regions (CRRs) are non-coding regions of the DNA that finely control the spatio-temporal pattern of transcription; they are involved in a wide range of pivotal processes such as the development of specific cell-lines/tissues and the dynamic cell response to physiological stimuli. Recent studies showed that genetic variants occurring in CRRs are strongly correlated with pathogenicity or deleteriousness. Considering the central role of CRRs in the regulation of physiological and pathological conditions, the correct identification of CRRs and of their tissue-specific activity status through Machine Learning methods plays a major role in dissecting the impact of genetic variants on human diseases. Unfortunately, the problem is still open, though some promising results have already been reported by (deep) machine-learning based methods that predict active promoters and enhancers in specific tissues or cell lines by encoding epigenetic or spectral features directly extracted from DNA sequences. RESULTS: We present the experiments we performed to compare two Deep Neural Networks, a Feed-Forward Neural Network model working on epigenomic features, and a Convolutional Neural Network model working only on genomic sequence, targeted to the identification of enhancer- and promoter-activity in specific cell lines. While performing experiments to understand how the experimental setup influences the prediction performance of the methods, we particularly focused on (1) automatic model selection performed by Bayesian optimization and (2) exploring different data rebalancing setups for reducing negative unbalancing effects.
CONCLUSIONS: Results show that (1) automatic model selection by Bayesian optimization improves the quality of the learner; (2) data rebalancing considerably impacts the prediction performance of the models; test set rebalancing may provide over-optimistic results, and should therefore be cautiously applied; (3) despite working on sequence data, convolutional models obtain performance close to that of feed-forward models working on epigenomic information, which suggests that sequence data also carries informative content for CRR-activity prediction. We therefore suggest combining both models/data types in future works.


Subjects
Deep Learning , Humans , Bayes Theorem , Nucleic Acid Regulatory Sequences , Computer Neural Networks , Machine Learning
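
The warning that test set rebalancing may give over-optimistic results can be seen with simple arithmetic: for a classifier with fixed sensitivity and specificity, precision depends on the share of positives in the test set. The sketch below is illustrative only; the sensitivity, specificity, and prevalence values are hypothetical and are not taken from the study.

```python
def precision_at_prevalence(sensitivity, specificity, prevalence):
    """Expected precision of a classifier on a test set with the given share of positives."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1 - specificity) * (1 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Same hypothetical classifier, evaluated on a rebalanced and on an imbalanced test set.
balanced = precision_at_prevalence(0.8, 0.9, prevalence=0.5)
realistic = precision_at_prevalence(0.8, 0.9, prevalence=0.05)
```

With these assumed values, precision is about 0.89 on the balanced test set and about 0.30 when only 5% of the examples are positive, even though the classifier itself is unchanged.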
6.
Bioinformatics ; 37(23): 4526-4533, 2021 12 07.
Article in English | MEDLINE | ID: mdl-34240108

ABSTRACT

MOTIVATION: Automated protein function prediction is a complex multi-class, multi-label, structured classification problem in which protein functions are organized in a controlled vocabulary, according to the Gene Ontology (GO). 'Hierarchy-unaware' classifiers, also known as 'flat' methods, predict GO terms without exploiting the inherent structure of the ontology, potentially violating the True-Path-Rule (TPR) that governs the GO, while 'hierarchy-aware' approaches, even if they obey the TPR, do not always show clear improvements with respect to flat methods, or do not scale well when applied to the full GO. RESULTS: To overcome these limitations, we propose Hierarchical Ensemble Methods for Directed Acyclic Graphs (HEMDAG), a family of highly modular hierarchical ensembles of classifiers, able to build upon any flat method and to provide 'TPR-safe' predictions, by leveraging a combination of isotonic regression and TPR learning strategies. Extensive experiments on synthetic and real data across several organisms firstly show that HEMDAG can be used as a general tool to improve the predictions of flat classifiers, and secondly that HEMDAG is competitive versus state-of-the-art hierarchy-aware learning methods proposed in the last CAFA international challenges. AVAILABILITY AND IMPLEMENTATION: Fully tested R code freely available at https://anaconda.org/bioconda/r-hemdag. Tutorial and documentation at https://hemdag.readthedocs.io. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Computational Biology , Gene Ontology , Computational Biology/methods , Proteins/metabolism
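
The True-Path-Rule requires that a protein annotated with a term is also annotated with all of that term's ancestors, so a consistent set of scores never ranks a term above any of its ancestors. The sketch below shows one simple way to make flat scores consistent on a directed acyclic graph, by capping each term at the lowest corrected score among its parents. It is a simplified illustration, not the HEMDAG algorithm (which the abstract describes as combining isotonic regression and TPR learning strategies); the function name, term names, and scores are hypothetical.

```python
def enforce_true_path_rule(scores, parents):
    """Cap each term's score at the lowest corrected score among its ancestors."""
    corrected = {}

    def resolve(term):
        if term not in corrected:
            # Terms without parents are bounded only by the maximum score of 1.0.
            bound = min((resolve(p) for p in parents.get(term, [])), default=1.0)
            corrected[term] = min(scores[term], bound)
        return corrected[term]

    for term in scores:
        resolve(term)
    return corrected


# Toy DAG with hypothetical term names: "kinase" has two parents.
parents = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "kinase": ["binding", "catalysis"],
}
flat_scores = {"root": 0.9, "binding": 0.4, "catalysis": 0.7, "kinase": 0.8}
consistent = enforce_true_path_rule(flat_scores, parents)
```

In this toy case the flat score of 0.8 for `kinase` is lowered to 0.4, the score of its weakest parent, while the other terms are left unchanged.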
7.
Virol J ; 19(1): 84, 2022 05 15.
Article in English | MEDLINE | ID: mdl-35570298

ABSTRACT

BACKGROUND: Non-steroidal anti-inflammatory drugs (NSAIDs) are commonly used to reduce pain, fever, and inflammation but have been associated with complications in community-acquired pneumonia. Observations shortly after the start of the COVID-19 pandemic in 2020 suggested that ibuprofen was associated with an increased risk of adverse events in COVID-19 patients, but subsequent observational studies failed to demonstrate increased risk and in one case showed reduced risk associated with NSAID use. METHODS: A 38-center retrospective cohort study was performed that leveraged the harmonized, high-granularity electronic health record data of the National COVID Cohort Collaborative. A propensity-matched cohort of 19,746 COVID-19 inpatients was constructed by matching cases (treated with NSAIDs at the time of admission) and 19,746 controls (not treated) from 857,061 patients with COVID-19 available for analysis. The primary outcome of interest was COVID-19 severity in hospitalized patients, which was classified as: moderate, severe, or mortality/hospice. Secondary outcomes were acute kidney injury (AKI), extracorporeal membrane oxygenation (ECMO), invasive ventilation, and all-cause mortality at any time following COVID-19 diagnosis. RESULTS: Logistic regression showed that NSAID use was not associated with increased COVID-19 severity (OR: 0.57 95% CI: 0.53-0.61). Analysis of secondary outcomes using logistic regression showed that NSAID use was not associated with increased risk of all-cause mortality (OR 0.51 95% CI: 0.47-0.56), invasive ventilation (OR: 0.59 95% CI: 0.55-0.64), AKI (OR: 0.67 95% CI: 0.63-0.72), or ECMO (OR: 0.51 95% CI: 0.36-0.7). In contrast, the odds ratios indicate reduced risk of these outcomes, but our quantitative bias analysis showed E-values of between 1.9 and 3.3 for these associations, indicating that comparatively weak or moderate confounder associations could explain away the observed associations. 
CONCLUSIONS: Study interpretation is limited by the observational design. Recording of NSAID use may have been incomplete. Our study demonstrates that NSAID use is not associated with increased COVID-19 severity, all-cause mortality, invasive ventilation, AKI, or ECMO in COVID-19 inpatients. A conservative interpretation in light of the quantitative bias analysis is that there is no evidence that NSAID use is associated with risk of increased severity or the other measured outcomes. Our results confirm and extend analogous findings in previous observational studies using a large cohort of patients drawn from 38 centers in a nationally representative multicenter database.


Subjects
Acute Kidney Injury , COVID-19 , Non-Steroidal Anti-Inflammatory Agents/adverse effects , COVID-19 Testing , Cohort Studies , Humans , Pandemics , Retrospective Studies
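
The E-values quoted in this abstract come from a quantitative bias analysis. For a risk ratio RR, the standard E-value formula is RR + sqrt(RR × (RR − 1)), applied to 1/RR when the ratio is below 1. The sketch below applies that formula directly to one reported odds ratio, which assumes the odds ratio can be read as a risk ratio; the abstract does not state which conversion the authors used, so this is an illustration rather than a reproduction of their analysis.

```python
import math


def e_value(ratio):
    """E-value for a risk ratio; protective ratios (< 1) are inverted first."""
    rr = 1 / ratio if ratio < 1 else ratio
    return rr + math.sqrt(rr * (rr - 1))


# Odds ratio reported for all-cause mortality, treated here as a risk ratio (assumption).
e_mortality = e_value(0.51)
```

Under that assumption the result is about 3.3, which agrees with the upper end of the 1.9 to 3.3 range given in the abstract.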
8.
Radiol Med ; 127(4): 407-413, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35258775

ABSTRACT

OBJECTIVES: To evaluate the quality of the reports of loco-regional staging computed tomography (CT) or magnetic resonance imaging (MRI) in head and neck (H&N) cancer. METHODS: Consecutive reports of staging CT and MRI of all H&N cancer cases from 2018 to 2020 were collected. We created lists of quality indicators for tumor (T) for each district and for node (N). We marked these as 0 or 1 in the report, calculating a report score (RS) and a maximum sum (MS) for each list. Two radiologists and two otolaryngologists in consensus classified reports as low quality (LQ) if the RS fell in the percentage range 0-59% of MS and as high quality (HQ) if it fell in the range 60-100%, annotating technique and district. We evaluated the distribution of reports in these categories. RESULTS: Two hundred thirty-seven reports (97 CT and 140 MRI) of 95 oral cavity, 52 laryngeal, 47 oropharyngeal, 19 hypo-pharyngeal, 14 parotid, and 10 nasopharyngeal cancers were included. Sixty-six percent of all reports were LQ for T (66% of all MRI reports and 65% of all CT reports). Eighty-five percent of reports were HQ for N (85% of all MRI reports and 82% of all CT reports). Reports of oral cavity, oro-nasopharynx, and parotid were LQ in 76%, 73%, 100%, and 92% of cases, respectively. CONCLUSION: Reports of staging CT/MRI in H&N cancer were LQ for T description and HQ for N description.


Subjects
Head and Neck Neoplasms , Head and Neck Neoplasms/diagnostic imaging , Hospitals , Humans , Magnetic Resonance Imaging/methods , Neoplasm Staging , Parotid Gland , X-Ray Computed Tomography/methods
9.
Emerg Radiol ; 28(5): 911-919, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34021845

ABSTRACT

PURPOSE: To assess the incidence of erroneous diagnosis of pneumatosis (pseudo-pneumatosis) in patients who underwent an emergency abdominal CT and to verify the performance of imaging features, supported by artificial intelligence (AI) techniques, to reduce this misinterpretation. METHODS: We selected 71 radiological reports where the presence of pneumatosis was considered definitive or suspected. Surgical findings, clinical outcomes, and reevaluation of the CT scans were used to assess the correct diagnosis of pneumatosis. We identified four imaging signs from literature, to differentiate pneumatosis from pseudo-pneumatosis: gas location, dissecting gas in the bowel wall, a circumferential gas pattern, and intramural gas beyond a gas-fluid/faecal level. Two radiologists reevaluated in consensus all the CT scans, assessing the four above-mentioned variables. Variable discriminative importance was assessed using the Fisher exact test. Accurate and statistically significant variables (p-value < 0.05, accuracy > 75%) were pooled using boosted Random Forests (RFs) executed using a Leave-One-Out cross-validation (LOO cv) strategy to obtain unbiased estimates of individual variable importance by permutation analysis. After the LOO cv, the comparison of the variable importance distribution was validated by one-sided Wilcoxon test. RESULTS: Twenty-seven patients proved to have pseudo-pneumatosis (error: 38%). The most significant features to diagnose pneumatosis were presence of dissecting gas in the bowel wall (accuracy: 94%), presence of intramural gas beyond a gas-fluid/faecal level (accuracy: 86%), and a circumferential gas pattern (accuracy: 78%). CONCLUSION: The incidence of pseudo-pneumatosis can be high. The use of a checklist which includes three imaging signs can be useful to reduce this overestimation.


Subjects
Artificial Intelligence , Pneumatosis Cystoides Intestinalis , Checklist , Humans , Incidence , Intestines , Pneumatosis Cystoides Intestinalis/diagnostic imaging
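
The Fisher exact test used in this study to screen the imaging signs can be computed directly from a 2×2 table. The sketch below is a standard-library illustration of the two-sided test; the function name and the counts (sign present or absent versus true pneumatosis or pseudo-pneumatosis) are hypothetical and are not the study's data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def table_probability(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = table_probability(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Hypothetical counts: rows = sign present / absent, columns = pneumatosis / pseudo-pneumatosis.
p_value = fisher_exact_two_sided(8, 2, 1, 9)
```

For these assumed counts the p-value is well below the 0.05 threshold mentioned in the abstract, so such a sign would pass the significance screen.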
10.
BMC Emerg Med ; 21(1): 59, 2021 05 10.
Article in English | MEDLINE | ID: mdl-33971826

ABSTRACT

BACKGROUND: During the recent outbreak of COVID-19 (coronavirus disease 2019), Lombardy was the most affected region in Italy, with 87,000 patients and 15,876 deaths up to May 26, 2020. Since February 22, 2020, well before the Government declared a state of emergency, there was a huge reduction in the number of emergency surgeries performed at hospitals in Lombardy. A general decrease in attendance at emergency departments (EDs) was also observed. The aim of our study is to report the experience of the ED of a third-level hospital in downtown Milan, Lombardy, and provide possible explanations for the observed phenomena. METHODS: This retrospective, observational study assessed the volume of emergency surgeries and attendance at an ED during the course of the pandemic, i.e. immediately before, during and after a progressive community lockdown in response to the COVID-19 pandemic. These data were compared with data from the same time periods in 2019. The results are presented as means, standard error (SE), and 95% studentized confidence intervals (CI). The Wilcoxon signed-rank test at a 0.05 significance level was used to assess differences in per-day ED access distributions. RESULTS: Compared to 2019, a significant overall drop in emergency surgeries (60%, p < 0.002) and in ED admittance (66%, p ≅ 0) was observed in 2020. In particular, there were significant decreases in medical (40%), surgical (74%), specialist (ophthalmology, otolaryngology, traumatology, and urology) (92%), and psychiatric (60%) cases. ED admittance due to domestic violence (59%) and the number of individuals who left the ED without being seen (76%) also decreased. Conversely, the number of deaths increased by 196%. CONCLUSIONS: During the COVID-19 outbreak the volume of urgent surgeries and patients accessing our ED dropped. Currently, it is not known if mortality of people who did not seek care increased during the pandemic.
Further studies are needed to understand if such reductions during the COVID-19 pandemic will result in a rebound of patients left untreated or in unwanted consequences for population health.


Subjects
COVID-19/epidemiology , Emergencies , Hospital Emergency Service/statistics & numerical data , Health Services Accessibility , Viral Pneumonia/epidemiology , Operative Surgical Procedures , Female , Humans , Italy/epidemiology , Male , Pandemics , Viral Pneumonia/virology , SARS-CoV-2 , Tertiary Care Centers
11.
Radiol Med ; 125(1): 15-23, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31587182

ABSTRACT

OBJECTIVES: To evaluate the performance of the LI-RADS v.2018 scale by comparing it with the Likert scale, in the characterization of liver lesions. METHODS: A total of 39 patients with chronic liver disease underwent MR examination for characterization of 44 liver lesions. Images were independently analyzed by two radiologists using the LI-RADS scale and by another two radiologists using the Likert scale. The reference standard used was either histopathological evaluation or a 4-year MRI follow-up. Receiver operating characteristic analysis was performed. RESULTS: The LI-RADS scale obtained an accuracy of 80%, a sensitivity of 72%, a specificity of 93%, a positive predictive value (PPV) of 93% and a negative predictive value (NPV) of 70%, while the Likert scale achieved an accuracy of 79%, a sensitivity of 73%, a specificity of 87%, a PPV of 89% and a NPV of 70%. The area under the curve (AUC) was 85% for the LI-RADS scale and 83% for the Likert scale. The inter-observer agreement was strong (k = 0.89) between the LI-RADS evaluators and moderate (k = 0.69) between the Likert evaluators. CONCLUSIONS: There was no statistically significant difference between the performances of the two scales; nevertheless, we suggest that the LI-RADS scale be used, as it appeared more objective and consistent.


Subjects
Hepatocellular Carcinoma/diagnostic imaging , Liver Cirrhosis/diagnostic imaging , Liver Neoplasms/diagnostic imaging , Magnetic Resonance Imaging/methods , Precancerous Conditions/diagnostic imaging , Aged , Aged 80 and over , Differential Diagnosis , Female , Humans , Liver/diagnostic imaging , Magnetic Resonance Imaging/standards , Male , Middle Aged , Observer Variation , Predictive Value of Tests , ROC Curve , Reference Standards , Retrospective Studies , Sensitivity and Specificity , Ultrasonography
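
The inter-observer agreement values (k = 0.89 and k = 0.69) are kappa statistics. Assuming they are Cohen's kappa for two readers (the abstract does not name the variant), the statistic compares the observed agreement with the agreement expected by chance. The sketch below shows the calculation; the function name and the eight pairs of ratings are hypothetical and are not the study's data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    n = len(rater_a)
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the two raters' marginal label frequencies.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical category assignments by two readers for eight lesions.
reader_1 = ["LR-5", "LR-5", "LR-3", "LR-4", "LR-3", "LR-5", "LR-4", "LR-3"]
reader_2 = ["LR-5", "LR-5", "LR-3", "LR-4", "LR-4", "LR-5", "LR-4", "LR-3"]
kappa = cohen_kappa(reader_1, reader_2)
```

Here the readers agree on 7 of 8 lesions (87.5%), but because some agreement is expected by chance, kappa is lower, at about 0.81.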
12.
Radiol Med ; 125(12): 1260-1270, 2020 Dec.
Article in English | MEDLINE | ID: mdl-32862406

ABSTRACT

OBJECTIVES: We aimed to assess the diagnostic performance of CT in patients with a negative first RT-PCR test and to identify typical features of COVID-19 pneumonia that can guide diagnosis in this case. METHODS: Patients suspected of COVID-19 with a negative first RT-PCR test were retrospectively re-evaluated after undergoing CT. CT was reviewed by two radiologists and classified as suspected COVID-19 pneumonia, non-COVID-19 pneumonia or negative. The performance of both the first RT-PCR result and CT was evaluated by using sensitivity (SE), specificity (SP), positive predictive value (PPV), negative predictive value (NPV) and area under the curve (AUC), with the second RT-PCR test as the reference standard. CT findings for confirmed COVID-19 positive or negative patients were compared by using the Pearson chi-squared test (P values < 0.05). RESULTS: In total, 337 patients suspected of COVID-19 underwent CT and nasopharyngeal swabs in March 2020. Eighty-seven out of 337 patients had a negative first RT-PCR result; of these, 68 repeated RT-PCR testing and were included in the study. The first RT-PCR test showed SE = 0%, SP = 100%, PPV = NaN, NPV = 70%, AUC = 50%, and CT showed SE = 70%, SP = 79%, PPV = 86%, NPV = 76%, AUC = 75%. The most relevant CT variables were ground glass opacity more than 50% and peripheral and/or perihilar distribution. DISCUSSION: A negative RT-PCR test with positive CT features should be highly suggestive of COVID-19 in cluster or community transmission scenarios, and the second RT-PCR test should be promptly requested to confirm the final diagnosis.


Subjects
Betacoronavirus , Coronavirus Infections/diagnosis , Viral Pneumonia/diagnosis , Reverse Transcriptase Polymerase Chain Reaction , X-Ray Computed Tomography , Adult , Aged , Aged 80 and over , Area Under Curve , COVID-19 , Chi-Square Distribution , Coronavirus Infections/diagnostic imaging , Coronavirus Infections/epidemiology , False Negative Reactions , False Positive Reactions , Female , Humans , Italy/epidemiology , Lung/diagnostic imaging , Male , Middle Aged , Nasopharynx/virology , Pandemics , Viral Pneumonia/diagnostic imaging , Viral Pneumonia/epidemiology , Predictive Value of Tests , Probability , Thoracic Radiography/methods , Thoracic Radiography/statistics & numerical data , Reference Standards , Reproducibility of Results , Retrospective Studies , Reverse Transcriptase Polymerase Chain Reaction/statistics & numerical data , SARS-CoV-2 , Sensitivity and Specificity , X-Ray Computed Tomography/statistics & numerical data
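
The "PPV = NaN" reported for the first RT-PCR test follows from the study design: every included patient had a negative first test, so there are no positive results and the PPV denominator (true positives plus false positives) is zero. The sketch below computes the four metrics from a confusion matrix and returns NaN for an undefined ratio. The function names and the counts are hypothetical, chosen only so that the pattern resembles the reported one; they are not the study's counts.

```python
import math


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the ratio is undefined."""
    return numerator / denominator if denominator else math.nan


def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": safe_ratio(tp, tp + fn),
        "specificity": safe_ratio(tn, tn + fp),
        "ppv": safe_ratio(tp, tp + fp),
        "npv": safe_ratio(tn, tn + fn),
    }


# Hypothetical test that is negative for everyone: 3 diseased, 7 healthy patients.
all_negative = diagnostic_metrics(tp=0, fp=0, fn=3, tn=7)
```

A test that never returns a positive result has sensitivity 0, specificity 1, an undefined PPV, and an NPV equal to the share of healthy patients, which is the same pattern the abstract reports for the first RT-PCR test.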
13.
BMC Bioinformatics ; 20(1): 733, 2019 Dec 27.
Article in English | MEDLINE | ID: mdl-31881821

ABSTRACT

BACKGROUND: The protein ki67 (pki67) is a marker of tumor aggressiveness, and its expression has been proven to be useful in the prognostic and predictive evaluation of several types of tumors. To numerically quantify the pki67 presence in cancerous tissue areas, pathologists generally analyze histochemical images to count the number of tumor nuclei marked for pki67. This allows estimating the ki67-index, that is, the percentage of tumor nuclei positive for pki67 over all the tumor nuclei. Given the high image resolution and dimensions, its estimation by expert clinicians is particularly laborious and time consuming. Though automatic cell counting techniques have been presented so far, the problem is still open. RESULTS: In this paper we present a novel automatic approach for the estimation of the ki67-index. The method starts by exploiting the STRESS algorithm to produce a color enhanced image where all pixels belonging to nuclei are easily identified by thresholding, and then separated into positive (i.e. pixels belonging to nuclei marked for pki67) and negative by a binary classification tree. Next, positive and negative nuclei pixels are processed separately by two multiscale procedures identifying isolated nuclei and separating adjoining nuclei. The multiscale procedures exploit two Bayesian classification trees to recognize positive and negative nuclei-shaped regions. CONCLUSIONS: The evaluation of the computed results, both through experts' visual assessments and through the comparison of the computed indexes with those of experts, proved that the prototype is promising, so that experts believe in its potential as a tool to be exploited in clinical practice as a valid aid for clinicians estimating the ki67-index. The MATLAB source code is open source for research purposes.


Subjects
Computer-Assisted Image Processing/methods , Ki-67 Antigen/analysis , Neoplasms/chemistry , Algorithms , Animals , Bayes Theorem , Cell Nucleus/chemistry , Humans , Mice , Software
14.
BMC Bioinformatics ; 20(1): 422, 2019 Aug 14.
Article in English | MEDLINE | ID: mdl-31412768

ABSTRACT

BACKGROUND: One of the main issues in the automated protein function prediction (AFP) problem is the integration of multiple networked data sources. The UNIPred algorithm was thereby proposed to efficiently integrate (in a function-specific fashion) the protein networks by taking into account the imbalance that characterizes protein annotations, and to subsequently predict novel hypotheses about unannotated proteins. UNIPred is publicly available as R code, which may limit its use by non-expert users. Moreover, its application requires efforts in the acquisition and preparation of the networks to be integrated. Finally, the UNIPred source code does not handle the visualization of the resulting consensus network, whereas suitable views of the network topology are necessary to explore and interpret existing protein relationships. RESULTS: We address the aforementioned issues by proposing UNIPred-Web, a user-friendly Web tool for the application of the UNIPred algorithm to a variety of biomolecular networks, already supplied by the system, and for the visualization and exploration of protein networks. We support different organisms and different types of networks, e.g., co-expression, shared domains and physical interaction networks. Users are supported in the different phases of the process, ranging from the selection of the networks and the protein function to be predicted, to the navigation of the integrated network. The system also supports the upload of user-defined protein networks. The vertex-centric and highly interactive approach of UNIPred-Web allows a narrow exploration of specific proteins, and an interactive analysis of large sub-networks with only a few mouse clicks. CONCLUSIONS: UNIPred-Web offers practical and intuitive (visual) guidance to biologists interested in gaining insights into protein biomolecular functions.
UNIPred-Web provides facilities for the integration of networks, and supplies a framework for the imbalance-aware protein network integration of nine organisms, the prediction of thousands of GO protein functions, and an easy-to-use graphical interface for the visual analysis, navigation and interpretation of the integrated networks and of the functional predictions.


Subjects
Computational Biology/methods , Internet , Protein Interaction Maps , Proteins/metabolism , Software , Algorithms , User-Computer Interface
15.
BMC Bioinformatics ; 19(Suppl 10): 357, 2018 Oct 15.
Article in English | MEDLINE | ID: mdl-30367588

ABSTRACT

BACKGROUND: In the clinical practice, the objective quantification of histological results is essential not only to define objective and well-established protocols for diagnosis, treatment, and assessment, but also to ameliorate disease comprehension. SOFTWARE: The software MIAQuant_Learn presented in this work segments, quantifies and analyzes markers in histochemical and immunohistochemical images obtained by different biological procedures and imaging tools. MIAQuant_Learn employs supervised learning techniques to customize the marker segmentation process with respect to any marker color appearance. Our software expresses the location of the segmented markers with respect to regions of interest by mean-distance histograms, which are numerically compared by measuring their intersection. When contiguous tissue sections stained by different markers are available, MIAQuant_Learn aligns them and overlaps the segmented markers in a unique image enabling a visual comparative analysis of the spatial distribution of each marker (markers' relative location). Additionally, it computes novel measures of markers' co-existence in tissue volumes depending on their density. CONCLUSIONS: Applications of MIAQuant_Learn in clinical research studies have proven its effectiveness as a fast and efficient tool for the automatic extraction, quantification and analysis of histological sections. It is robust with respect to several deficits caused by image acquisition systems and produces objective and reproducible results. Thanks to its flexibility, MIAQuant_Learn represents an important tool to be exploited in basic research where needs are constantly changing.


Subjects
Algorithms , Computational Biology/methods , Image Processing, Computer-Assisted/methods , Staining and Labeling , Biomarkers, Tumor/metabolism , Decision Trees , Humans , Immunohistochemistry , Software , Support Vector Machine
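
The abstract above states that mean-distance histograms are compared by measuring their intersection. A minimal sketch of one common form of histogram intersection follows. The normalization to unit sum and the function name `histogram_intersection` are assumptions made for illustration; the abstract does not specify how MIAQuant_Learn bins or normalizes its histograms.

```python
def histogram_intersection(h1, h2):
    """Return the intersection of two histograms, each normalized to sum to 1.

    The result lies in [0, 1]: 1 for identical distributions, 0 for
    distributions with no overlapping mass. (Normalization is an assumption,
    not a detail taken from the abstract.)
    """
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    s1, s2 = sum(h1), sum(h2)
    if s1 == 0 or s2 == 0:
        raise ValueError("histograms must contain at least one count")
    # Sum, bin by bin, the smaller of the two normalized frequencies.
    return sum(min(a / s1, b / s2) for a, b in zip(h1, h2))


# Hypothetical distance histograms for two markers (counts per distance bin).
marker_a = [4, 10, 6, 0]
marker_b = [2, 5, 3, 10]
print(histogram_intersection(marker_a, marker_b))
```

A value near 1 would indicate that the two markers are distributed at similar distances from the region of interest; a value near 0 would indicate little overlap.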
16.
Radiol Med ; 119(10): 784-9, 2014 Oct.
Article in English | MEDLINE | ID: mdl-24553784

ABSTRACT

PURPOSE: This study was undertaken to collect information on the incidence and distribution of acute, non-traumatic conditions of the neck at our emergency radiology department and to review the literature about this topic. MATERIALS AND METHODS: We retrospectively reviewed 143 consecutive patients who underwent neck computed tomography (CT) for non-traumatic emergencies between 1 December 2008 and 31 December 2012. For each of the conditions identified, we defined the overall incidence, the incidence based on the site, gender, average age and age range. RESULTS: Computed tomography examination was positive in 125 out of 143 patients (87.4%), 74 men and 51 women, with an average age of 51.1 years, aged between 10 and 90 years. We found 79 inflammatory/infectious conditions (63.2% of positive cases, 55.2% of total cases), 46 men and 33 women, with an average age of 47 years. Computed tomography revealed 26 newly found tumours (20.8/18.2%), 19 men and 7 women, with an average age of 68.5 years, aged between 49 and 97 years. In 20 cases, 9 men and 11 women, with an average age of 57.3 years, aged between 21 and 90 years, we diagnosed other acute conditions: six cases of foreign body ingestion (4.8/4.2%), five benign swellings (4/3.5%), five cases of vascular disorders (4/3.5%), and four cases of oedema of the larynx (3.2/2.8%). CONCLUSIONS: Our study of emergency CT of non-traumatic conditions of the neck mainly revealed infectious/inflammatory diseases and newly found neoplasms.


Subjects
Emergencies , Foreign Bodies , Larynx , Mouth Neoplasms/diagnostic imaging , Neck/diagnostic imaging , Peritonsillar Abscess/diagnostic imaging , Retropharyngeal Abscess/diagnostic imaging , Tomography, X-Ray Computed/methods , Adolescent , Adult , Age Distribution , Aged , Aged, 80 and over , Child , Emergencies/epidemiology , Female , Foreign Bodies/epidemiology , Humans , Incidence , Italy/epidemiology , Laryngeal Neoplasms/diagnostic imaging , Male , Middle Aged , Mouth Neoplasms/epidemiology , Peritonsillar Abscess/epidemiology , Predictive Value of Tests , Retropharyngeal Abscess/epidemiology , Retrospective Studies , Risk Factors , Sensitivity and Specificity , Sex Distribution
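
The paired percentages in the abstract above (share of positive cases / share of all cases) can be recomputed from the reported counts. The sketch below uses only numbers stated in the abstract; the variable and function names are illustrative.

```python
TOTAL_PATIENTS = 143   # consecutive patients who underwent neck CT
POSITIVE_CASES = 125   # CT examinations reported as positive

# Counts per finding, as reported in the abstract.
findings = {
    "inflammatory/infectious conditions": 79,
    "newly found tumours": 26,
    "foreign body ingestion": 6,
    "benign swellings": 5,
    "vascular disorders": 5,
    "oedema of the larynx": 4,
}


def pct(count, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * count / denominator, 1)


# For each finding: (% of positive cases, % of all cases).
table = {
    name: (pct(n, POSITIVE_CASES), pct(n, TOTAL_PATIENTS))
    for name, n in findings.items()
}

print("positive rate:", pct(POSITIVE_CASES, TOTAL_PATIENTS))
for name, (of_positive, of_total) in table.items():
    print(f"{name}: {of_positive}% of positive, {of_total}% of total")
```

The per-finding counts sum to the 125 positive cases, which is a quick consistency check on the reported breakdown.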
17.
medRxiv ; 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-37503093

ABSTRACT

Objective: Large Language Models such as GPT-4 have previously been applied to differential diagnostic challenges based on published case reports. Published case reports have a sophisticated narrative style that is not readily available from typical electronic health records (EHR). Furthermore, even if such a narrative were available in EHRs, privacy requirements would preclude sending it outside the hospital firewall. We therefore tested a method for parsing clinical texts to extract ontology terms and programmatically generating prompts that by design are free of protected health information. Materials and Methods: We investigated different methods to prepare prompts from 75 recently published case reports. We transformed the original narratives by extracting structured terms representing phenotypic abnormalities, comorbidities, treatments, and laboratory tests and creating prompts programmatically. Results: Performance of all of these approaches was modest, with the correct diagnosis ranked first in only 5.3-17.6% of cases. The performance of the prompts created from structured data was substantially worse than that of the original narrative texts, even if additional information was added following manual review of term extraction. Moreover, different versions of GPT-4 demonstrated substantially different performance on this task. Discussion: The sensitivity of the performance to the form of the prompt and the instability of results over two GPT-4 versions represent important current limitations to the use of GPT-4 to support diagnosis in real-life clinical settings. Conclusion: Research is needed to identify the best methods for creating prompts from typically available clinical data to support differential diagnostics.

18.
bioRxiv ; 2024 Jul 04.
Article in English | MEDLINE | ID: mdl-39005436

ABSTRACT

Objectives: Concept embeddings are low-dimensional vector representations of concepts such as MeSH:D009203 (Myocardial Infarction), whose similarity in the embedded vector space reflects their semantic similarity. Here, we test the hypothesis that non-biomedical concept synonym replacement can improve the quality of biomedical concept embeddings. Materials and methods: We developed an approach that leverages WordNet to replace sets of synonyms with the most common representative of the synonym set. Results: We tested our approach on 1055 concept sets and found that, on average, the mean intra-cluster distance was reduced by 8% in the vector space. Assuming that homophily of related concepts in the vector space is desirable, our approach tends to improve the quality of embeddings. Discussion and Conclusion: This pilot study shows that non-biomedical synonym replacement tends to improve the quality of embeddings of biomedical concepts using the Word2Vec algorithm. We have implemented our approach in a freely available Python package at https://github.com/TheJacksonLaboratory/wn2vec.
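
The synonym-replacement step described above can be illustrated with a toy sketch. This is not the wn2vec implementation: the synonym sets are hard-coded rather than taken from WordNet, "most common" is assumed to mean most frequent in the corpus, and the function names are hypothetical.

```python
from collections import Counter


def build_replacement_map(synonym_sets, tokens):
    """Map every word in each synonym set to that set's most frequent member.

    Frequency is counted over `tokens`; ties are broken alphabetically.
    (Both choices are assumptions made for this illustration.)
    """
    counts = Counter(tokens)
    mapping = {}
    for syn_set in synonym_sets:
        representative = min(syn_set, key=lambda w: (-counts[w], w))
        for word in syn_set:
            mapping[word] = representative
    return mapping


def replace_synonyms(tokens, mapping):
    """Replace each token by its representative, leaving other tokens as-is."""
    return [mapping.get(token, token) for token in tokens]


# Hypothetical corpus and hand-written synonym sets (not from WordNet).
corpus = "a large dose and a big dose and a large risk".split()
synonym_sets = [{"big", "large", "huge"}]

mapping = build_replacement_map(synonym_sets, corpus)
print(" ".join(replace_synonyms(corpus, mapping)))
```

After replacement, the corpus would be passed to an embedding algorithm such as Word2Vec, so that contexts previously split across several synonyms are attributed to a single token.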

19.
medRxiv ; 2024 Jul 22.
Article in English | MEDLINE | ID: mdl-39108510

ABSTRACT

Large language models (LLMs) have shown great promise in supporting differential diagnosis, but the 23 available published studies of diagnostic accuracy evaluated small cohorts (30-422 cases, mean 104) and assessed LLM responses subjectively by manual curation (23/23 studies). The performance of LLMs for rare disease diagnosis has not been evaluated systematically. Here, we perform a rigorous and large-scale analysis of the performance of GPT-4 in prioritizing candidate diagnoses, using the largest-ever cohort of rare disease patients. Our computational study used 5267 computational case reports from previously published data. Each case was formatted as a Global Alliance for Genomics and Health (GA4GH) phenopacket, in which clinical anomalies were represented as Human Phenotype Ontology (HPO) terms. We developed software to generate prompts from each phenopacket. Prompts were sent to Generative Pre-trained Transformer 4 (GPT-4), and the rank of the correct diagnosis, if present in the response, was recorded. The mean reciprocal rank (MRR) of the correct diagnosis was 0.24 (with the reciprocal of the MRR corresponding to a rank of 4.2), and the correct diagnosis was placed in rank 1 in 19.2% of the cases, in the first 3 ranks in 28.6%, and in the first 10 ranks in 32.5%. Our study is the largest to be reported to date and provides a realistic estimate of the performance of GPT-4 in rare disease medicine.
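
The ranking metrics reported above can be sketched as follows. The ranks in the example are invented for illustration, and the treatment of cases where the correct diagnosis is absent from the response (contributing 0 to the mean reciprocal rank) is an assumed, commonly used convention that the abstract does not spell out.

```python
def mean_reciprocal_rank(ranks):
    """Mean of 1/rank over all cases.

    `ranks` holds the 1-based rank of the correct diagnosis per case, or
    None when it was not in the response (assumed to contribute 0).
    """
    if not ranks:
        raise ValueError("at least one case is required")
    return sum(1.0 / r if r is not None else 0.0 for r in ranks) / len(ranks)


def top_k_rate(ranks, k):
    """Fraction of cases whose correct diagnosis is within the first k ranks."""
    if not ranks:
        raise ValueError("at least one case is required")
    return sum(1 for r in ranks if r is not None and r <= k) / len(ranks)


# Hypothetical ranks for six cases; None means the diagnosis was missing.
example_ranks = [1, 3, None, 2, None, 10]

print("MRR:", mean_reciprocal_rank(example_ranks))
for k in (1, 3, 10):
    print(f"top-{k} rate:", top_k_rate(example_ranks, k))

# The abstract's own conversion: an MRR of 0.24 corresponds to a rank of
# about 1 / 0.24, which rounds to 4.2.
print("rank equivalent of MRR 0.24:", round(1 / 0.24, 1))
```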

20.
Int J Med Inform ; 187: 105461, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38643701

ABSTRACT

OBJECTIVE: Female reproductive disorders (FRDs) are common health conditions that may present with significant symptoms. Diet and environment are potential areas for FRD interventions. We utilized a knowledge graph (KG) method to predict factors associated with common FRDs (for example, endometriosis, ovarian cyst, and uterine fibroids). MATERIALS AND METHODS: We harmonized survey data from the Personalized Environment and Genes Study (PEGS) on internal and external environmental exposures and health conditions with biomedical ontology content. We merged the harmonized data and ontologies with supplemental nutrient and agricultural chemical data to create a KG. We analyzed the KG by embedding edges and applying a random forest for edge prediction to identify variables potentially associated with FRDs. We also conducted logistic regression analysis for comparison. RESULTS: Across 9765 PEGS respondents, the KG analysis resulted in 8535 significant or suggestive predicted links between FRDs and chemicals, phenotypes, and diseases. Amongst these links, 32 were exact matches when compared with the logistic regression results, including comorbidities, medications, foods, and occupational exposures. DISCUSSION: Mechanistic underpinnings of predicted links documented in the literature may support some of our findings. Our KG methods are useful for predicting possible associations in large, survey-based datasets with added information on directionality and magnitude of effect from logistic regression. These results should not be construed as causal but can support hypothesis generation. CONCLUSION: This investigation enabled the generation of hypotheses on a variety of potential links between FRDs and exposures. Future investigations should prospectively evaluate the variables hypothesized to impact FRDs.


Subjects
Environmental Exposure , Humans , Female , Environmental Exposure/adverse effects , Genital Diseases, Female , Logistic Models , Nutritional Status , Diet , Adult , Random Forest