1.
Am J Med Genet A ; 191(2): 518-525, 2023 02.
Article in English | MEDLINE | ID: mdl-36426646

ABSTRACT

Detecting obstructive sleep apnea (OSA) is important to both prevent significant comorbidities in people with Down syndrome (DS) and untangle contributions to other behavioral and mental health diagnoses. However, laboratory-based polysomnograms are often poorly tolerated, unavailable, or not covered by health insurance for this population. In previous work, our team developed a prediction model that seemed to hold promise in identifying which people with DS might not have significant apnea and, consequently, might be able to forgo a diagnostic polysomnogram. In this study, we sought to validate these findings in a novel set of participants with DS. We recruited an additional 64 participants with DS, ages 3-35 years. Caregivers completed the same validated questionnaires, and our study team collected vital signs, physical exam findings, and medical histories that were previously shown to be predictive. Patients then had a laboratory-based polysomnogram. The best modeling had a validated negative predictive value of 50% for an apnea-hypopnea index (AHI) > 1/hTST and 73.7% for AHI >5/hTST. The positive predictive values were 60% and 39.1%, respectively. As such, a clinically reliable screening tool for OSA in people with DS was not achieved. Patients with DS should continue to be monitored for OSA according to current healthcare guidelines.


Subjects
Down Syndrome; Sleep Apnea, Obstructive; Humans; Child, Preschool; Child; Adolescent; Young Adult; Adult; Down Syndrome/complications; Down Syndrome/diagnosis; Down Syndrome/epidemiology; Sleep Apnea, Obstructive/complications; Sleep Apnea, Obstructive/diagnosis; Sleep Apnea, Obstructive/epidemiology; Polysomnography; Comorbidity; Surveys and Questionnaires
2.
Liver Int ; 42(3): 615-627, 2022 03.
Article in English | MEDLINE | ID: mdl-34951722

ABSTRACT

BACKGROUND & AIMS: Machine learning (ML) provides new approaches for prognostication through the identification of novel subgroups of patients. We explored whether ML could support disease sub-phenotyping and risk stratification in primary biliary cholangitis (PBC). METHODS: ML was applied to an international dataset of PBC patients. The dataset was split into a derivation cohort (training set) and a validation cohort (validation set), and key clinical features were analysed. The outcome was a composite of liver-related death or liver transplantation. ML and standard survival analysis were performed. RESULTS: The training set was composed of 11,819 subjects, while the validation set was composed of 1,069 subjects. ML identified four clusters of patients characterized by different phenotypes and long-term prognosis. Cluster 1 (n = 3566) included patients with excellent prognosis, whereas Cluster 2 (n = 3966) consisted of individuals at worse prognosis differing from Cluster 1 only for albumin levels around the limit of normal. Cluster 3 (n = 2379) included young patients with florid cholestasis and Cluster 4 (n = 1908) comprised advanced cases. Further sub-analyses on the dynamics of albumin within the normal range revealed that ursodeoxycholic acid-induced increase of albumin >1.2 x lower limit of normal (LLN) is associated with improved transplant-free survival. CONCLUSIONS: Unsupervised ML identified four novel groups of PBC patients with different phenotypes and prognosis and highlighted subtle variations of albumin within the normal range. Therapy-induced increase of albumin >1.2 x LLN should be considered a treatment goal.


Subjects
Cholangitis; Liver Cirrhosis, Biliary; Cholagogues and Choleretics/therapeutic use; Cholangitis/complications; Humans; Liver Cirrhosis, Biliary/drug therapy; Machine Learning; Prognosis; Risk Assessment; Ursodeoxycholic Acid/therapeutic use
3.
Sensors (Basel) ; 21(19)2021 Sep 29.
Article in English | MEDLINE | ID: mdl-34640846

ABSTRACT

Edge computing makes it possible to perform measurement and cognitive decisions outside a central server by carrying out data storage, manipulation, and processing on the Internet of Things (IoT) node. Artificial Intelligence (AI) and Machine Learning (ML) applications have also become a basic procedure in virtually every industrial or preliminary system. Consequently, we adopted the Raspberry Pi, a low-cost computing platform that is profitably applied in the field of IoT. As for the software, among the plethora of ML paradigms reported in the literature, we identified Rulex as a good ML platform suitable for implementation on the Raspberry Pi. In this paper, we present the porting of the Rulex ML platform to the board to perform ML forecasts in an IoT setup. Specifically, we explain the porting of Rulex's libraries to Windows 32-bit, Ubuntu 64-bit, and Raspbian 32-bit. Then, with the aim of carrying out an in-depth verification of the application possibilities, we perform forecasts on five unrelated datasets from five different applications, varying in number of records, skewness, and dimensionality. These include a small Urban Classification dataset, three larger datasets concerning Human Activity detection, a Biomedical dataset related to mental state, and a Vehicle Activity Recognition dataset. The overall accuracies for the forecasts performed are 84.13%, 99.29% (for SVM), 95.47% (for SVM), and 95.27% (for KNN), respectively. Finally, an image-based gender classification dataset is employed to perform image classification on the Edge. Moreover, a novel image pre-processing algorithm was developed that converts images into time series by relying on statistical contour-based detection techniques. Even though the dataset contains inconsistent and random images in terms of subjects and settings, Rulex achieves an overall accuracy of 96.47%, competing with a literature that is dominated by forward-facing and mugshot images. Additionally, the power consumption of the Raspberry Pi in a Client/Server setup was compared with that of an HP laptop: the board takes more time but consumes less energy for the same ML task.


Subjects
Artificial Intelligence; Machine Learning; Algorithms; Humans; Image Processing, Computer-Assisted; Software
4.
BMC Bioinformatics ; 20(Suppl 9): 390, 2019 Nov 22.
Article in English | MEDLINE | ID: mdl-31757200

ABSTRACT

BACKGROUND: Logic Learning Machine (LLM) is an innovative method of supervised analysis capable of constructing models based on simple and intelligible rules. In this investigation the performance of LLM in classifying patients with cancer was evaluated using a set of eight publicly available gene expression databases for cancer diagnosis. LLM accuracy was assessed by summary ROC curve (sROC) analysis and estimated by the area under an sROC curve (sAUC). Its performance was compared in cross validation with that of standard supervised methods, namely: decision tree, artificial neural network, support vector machine (SVM) and k-nearest neighbor classifier. RESULTS: LLM showed an excellent accuracy (sAUC = 0.99, 95%CI: 0.98-1.0) and outperformed any other method except SVM. CONCLUSIONS: LLM is a new powerful tool for the analysis of gene expression data for cancer diagnosis. Simple rules generated by LLM could contribute to a better understanding of cancer biology, potentially addressing therapeutic approaches.


Subjects
Gene Expression Regulation, Neoplastic; Logic; Machine Learning; Neoplasms/diagnosis; Neoplasms/genetics; Adult; Child; Databases, Genetic; Female; Humans; Male; Neural Networks, Computer; ROC Curve
5.
Am J Med Genet A ; 173(4): 889-896, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28124477

ABSTRACT

Obstructive sleep apnea (OSA) occurs frequently in people with Down syndrome (DS) with reported prevalences ranging between 55% and 97%, compared to 1-4% in the neurotypical pediatric population. Sleep studies are often uncomfortable, costly, and poorly tolerated by individuals with DS. The objective of this study was to construct a tool to identify individuals with DS unlikely to have moderate or severe OSA and in whom sleep studies might offer little benefit. An observational, prospective cohort study was performed in an outpatient clinic and overnight sleep study center with 130 DS patients, ages 3-24 years. Exclusion criteria included previous adenoid and/or tonsil removal, a sleep study within the past 6 months, or being treated for apnea with continuous positive airway pressure. This study involved a physical examination/medical history, lateral cephalogram, 3D photograph, validated sleep questionnaires, an overnight polysomnogram, and urine samples. The main outcome measure was the apnea-hypopnea index. Using a Logic Learning Machine, the best model had a cross-validated negative predictive value of 73% for mild obstructive sleep apnea and 90% for moderate or severe obstructive sleep apnea; positive predictive values were 55% and 25%, respectively. The model included variables from survey questions, medication history, anthropometric measurements, vital signs, patient's age, and physical examination findings. With simple procedures that can be collected at minimal cost, the proposed model could predict which patients with DS were unlikely to have moderate to severe obstructive sleep apnea and thus may not need a diagnostic sleep study.


Subjects
Down Syndrome/diagnosis; Models, Statistical; Polysomnography/ethics; Sleep Apnea, Obstructive/diagnosis; Adolescent; Child; Child, Preschool; Down Syndrome/complications; Down Syndrome/physiopathology; Female; Humans; Machine Learning; Male; Outpatients; Polysomnography/economics; Prospective Studies; Severity of Illness Index; Sleep/physiology; Sleep Apnea, Obstructive/complications; Sleep Apnea, Obstructive/physiopathology; Surveys and Questionnaires; Young Adult
6.
J Gambl Stud ; 33(4): 1121-1137, 2017 Dec.
Article in English | MEDLINE | ID: mdl-28255941

ABSTRACT

Identifying potential risk factors for problem gambling (PG) is of primary importance for planning preventive and therapeutic interventions. We illustrate a new approach based on the combination of standard logistic regression (LR) and an innovative method of supervised data mining (Logic Learning Machine or LLM). Data were taken from a pilot cross-sectional study to identify subjects with PG behaviour, assessed by two internationally validated scales (SOGS and Lie/Bet). Information was obtained from 251 gamblers recruited in six betting establishments. Data on socio-demographic characteristics, lifestyle and cognitive-related factors, and type, place and frequency of preferred gambling were obtained by a self-administered questionnaire. The following variables associated with PG were identified: instant gratification games, alcohol abuse, cognitive distortion, illegal behaviours and having started gambling with a relative or a friend. Furthermore, the combination of LLM and LR indicated the presence of two different types of PG, namely: (a) daily gamblers, more prone to illegal behaviour, with poor money management skills and who started gambling at an early age, and (b) non-daily gamblers, characterised by superstitious beliefs and a higher preference for immediate reward games. Finally, instant gratification games were strongly associated with the number of games usually played. Studies on gamblers who habitually frequent betting shops are rare. The finding of different types of PG among habitual gamblers deserves further analysis in larger studies. Advanced data mining algorithms, like LLM, are powerful tools and potentially useful in identifying risk factors for PG.


Subjects
Anxiety/psychology; Behavior, Addictive/psychology; Gambling/psychology; Reward; Adult; Alcoholism/psychology; Cross-Sectional Studies; Female; Humans; Logistic Models; Male; Middle Aged; Risk Factors; Self Report; Superstitions/psychology; Surveys and Questionnaires
7.
BMC Bioinformatics ; 16 Suppl 9: S3, 2015.
Article in English | MEDLINE | ID: mdl-26051106

ABSTRACT

BACKGROUND: Tumour markers are standard tools for the differential diagnosis of cancer. However, the occurrence of nonspecific symptoms and different malignancies involving the same cancer site may lead to a high proportion of misclassifications. Classification accuracy can be improved by combining information from different markers using standard data mining techniques, like Decision Tree (DT), Artificial Neural Network (ANN), and k-Nearest Neighbour (KNN) classifier. Unfortunately, each method suffers from some unavoidable limitations. DT, in general, tends to show a low classification performance, whereas ANN and KNN produce a "black-box" classification that does not provide biological information useful for clinical purposes. METHODS: Logic Learning Machine (LLM) is an innovative method of supervised data analysis capable of building classifiers described by a set of intelligible rules including simple conditions in their antecedent part. It is essentially an efficient implementation of the Switching Neural Network model and reaches excellent classification accuracy while keeping the computational demand low. LLM was applied to data from a consecutive cohort of 169 patients admitted for diagnosis to two pulmonary departments in Northern Italy from 2009 to 2011. Patients included 52 malignant pleural mesotheliomas (MPM), 62 pleural metastases (MTX) from other tumours and 55 benign diseases (BD) associated with pleurisies. Concentration of three tumour markers (CEA, CYFRA 21-1 and SMRP) was measured in the pleural fluid of each patient and a cytological examination was also carried out. The performance of LLM and that of three competing methods (DT, KNN and ANN) was assessed by leave-one-out cross-validation. RESULTS: LLM outperformed all other considered methods. Global accuracy was 77.5% for LLM, 72.8% for DT, 54.4% for KNN, and 63.9% for ANN. In more detail, LLM correctly classified 79% of MPM, 66% of MTX and 89% of BD. The corresponding figures for DT were: MPM = 83%, MTX = 55% and BD = 84%; for KNN: MPM = 58%, MTX = 45%, BD = 62%; for ANN: MPM = 71%, MTX = 47%, BD = 76%. Finally, LLM provided classification rules in very good agreement with a priori knowledge about the biological role of the considered tumour markers. CONCLUSIONS: LLM is a new flexible tool potentially useful for the differential diagnosis of pleural mesothelioma.


Subjects
Artificial Intelligence; Biomarkers, Tumor/analysis; Lung Neoplasms/diagnosis; Mesothelioma/diagnosis; Pleural Neoplasms/diagnosis; Cohort Studies; Decision Trees; Diagnosis, Differential; Female; Humans; Logic; Male; Mesothelioma, Malignant; Middle Aged; Neoplasm Metastasis; Neural Networks, Computer
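
The global accuracies reported in the abstract of record 7 can be cross-checked against its per-class figures. The following is a minimal sketch, assuming that each per-class percentage corresponds to a whole number of correctly classified patients; the function name `global_accuracy` and the dictionaries are illustrative and do not come from the paper.

```python
# Class sizes reported in the abstract (169 patients in total).
CLASS_SIZES = {"MPM": 52, "MTX": 62, "BD": 55}

# Per-class percentages of correct classifications reported for each method.
PER_CLASS_PCT = {
    "LLM": {"MPM": 79, "MTX": 66, "BD": 89},
    "DT": {"MPM": 83, "MTX": 55, "BD": 84},
    "KNN": {"MPM": 58, "MTX": 45, "BD": 62},
    "ANN": {"MPM": 71, "MTX": 47, "BD": 76},
}


def global_accuracy(per_class_pct, class_sizes):
    """Return the overall percentage of correct classifications.

    Assumption: each reported per-class percentage is a rounded value, so the
    number of correct cases per class is recovered by rounding to an integer.
    """
    total = sum(class_sizes.values())
    correct = sum(
        round(per_class_pct[name] * size / 100)
        for name, size in class_sizes.items()
    )
    return 100 * correct / total


if __name__ == "__main__":
    for method, pct in PER_CLASS_PCT.items():
        print(method, round(global_accuracy(pct, CLASS_SIZES), 1))
```

Under that assumption, the size-weighted figures agree with the global accuracies quoted in the abstract to within rounding.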
8.
BMC Bioinformatics ; 15 Suppl 5: S4, 2014.
Article in English | MEDLINE | ID: mdl-25078098

ABSTRACT

BACKGROUND: A cancer patient's outcome is written, in part, in the gene expression profile of the tumor. We previously identified a 62-probe-set signature (NB-hypo) to identify tissue hypoxia in neuroblastoma tumors and showed that NB-hypo stratified neuroblastoma patients into good and poor outcome groups. It was important to develop a prognostic classifier to cluster patients into risk groups benefiting from defined therapeutic approaches. Novel classification and data discretization approaches can be instrumental for the generation of accurate predictors and robust tools for clinical decision support. We explored the application to gene expression data of Rulex, a novel software suite including the Attribute Driven Incremental Discretization technique for transforming continuous variables into simplified discrete ones and the Logic Learning Machine (LLM) model for intelligible rule generation. RESULTS: We applied Rulex components to the problem of predicting the outcome of neuroblastoma patients on the basis of the 62-probe-set NB-hypo gene expression signature. The resulting classifier consisted of 9 rules utilizing mainly two conditions on the relative expression of 11 probe sets. These rules were very effective predictors, as shown in an independent validation set, demonstrating the validity of the LLM algorithm applied to microarray data and patients' classification. The LLM performed as efficiently as Prediction Analysis of Microarray and Support Vector Machine, and outperformed other learning algorithms such as C4.5. Rulex carried out a feature selection by selecting a new signature (NB-hypo-II) of 11 probe sets that turned out to be the most relevant in predicting outcome among the 62 of the NB-hypo signature. Rules are easily interpretable as they involve only a few conditions. CONCLUSIONS: Our findings provided evidence that the application of Rulex to the expression values of the NB-hypo signature created a set of accurate, high quality, consistent and interpretable rules for the prediction of neuroblastoma patients' outcome. We identified the Rulex weighted classification as a flexible tool that can support clinical decisions. For these reasons, we consider Rulex to be a useful tool for cancer classification from microarray gene expression data.


Subjects
Artificial Intelligence; Computational Biology/instrumentation; Gene Expression Profiling/instrumentation; Neuroblastoma/genetics; Algorithms; Computational Biology/methods; Gene Expression Profiling/methods; Humans; Infant; Logic; Neuroblastoma/diagnosis; Prognosis; Software; Support Vector Machine
9.
BMC Bioinformatics ; 14 Suppl 7: S12, 2013.
Article in English | MEDLINE | ID: mdl-23815266

ABSTRACT

BACKGROUND: Neuroblastoma is the most common pediatric solid tumor. About fifty percent of high-risk patients die despite treatment, making the exploration of new and more effective strategies for improving stratification mandatory. Hypoxia is a condition of low oxygen tension occurring in poorly vascularized areas of the tumor associated with poor prognosis. We had previously defined a robust gene expression signature measuring the hypoxic component of neuroblastoma tumors (NB-hypo) which is a molecular risk factor. We wanted to develop a prognostic classifier of neuroblastoma patients' outcome blending existing knowledge on clinical and molecular risk factors with the prognostic NB-hypo signature. Furthermore, we were interested in classifiers outputting explicit rules that could be easily translated into the clinical setting. RESULTS: The Shadow Clustering (SC) technique, which leads to final models called Logic Learning Machine (LLM), exhibits good accuracy and promises to fulfill the aims of the work. We utilized this algorithm to classify NB patients on the basis of the following risk factors: age at diagnosis, INSS stage, MYCN amplification and NB-hypo. The algorithm generated explicit classification rules in good agreement with existing clinical knowledge. Through an iterative procedure we identified and removed from the dataset those examples which caused instability in the rules. This workflow generated a stable classifier that was very accurate in predicting good and poor outcome patients. The good performance of the classifier was validated in an independent dataset. NB-hypo was an important component of the rules with a strength similar to that of tumor staging. CONCLUSIONS: The novelty of our work is to identify stability, explicit rules and blending of molecular and clinical risk factors as the key features to generate classification rules for NB patients to be conveyed to the clinic and to be used to design new therapies. We derived, through LLM, a set of four stable rules identifying a new class of poor outcome patients that could benefit from new therapies potentially targeting tumor hypoxia or its consequences.


Subjects
Algorithms; Artificial Intelligence; Neuroblastoma/diagnosis; Neuroblastoma/genetics; Child; Humans; Logic; Middle Aged; Neoplasm Staging; Neuroblastoma/pathology; Prognosis; Proto-Oncogene Proteins c-myc/genetics; Risk Factors
10.
Am J Med Genet A ; 161A(3): 556-60, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23401177

ABSTRACT

Multiple osteochondromas (MO), previously known as hereditary multiple exostoses (HME), is an autosomal dominant disease characterized by the formation of several benign cartilage-capped bone growths called osteochondromas or exostoses. Various clinical classifications have been proposed but a consensus has not been reached. The aim of this study was to validate (using a machine learning approach) an "easy to use" tool to characterize MO patients in three classes according to the number of bone segments affected and the presence of skeletal deformities and/or functional limitations. The proposed classification has been validated (with a highly satisfactory mean accuracy) by analyzing 150 different variables on 289 MO patients through a Switching Neural Network approach (a novel classification technique capable of deriving models described by intelligible rules in if-then form). This approach allowed us to identify ankle valgism, Madelung deformity and limitation of the hip extra-rotation as "tags" of the three clinical classes. In conclusion, the proposed classification provides an efficient system to characterize this rare disease and is able to define homogeneous cohorts of patients to investigate MO pathogenesis.


Subjects
Exostoses, Multiple Hereditary/classification; Neural Networks, Computer; Adolescent; Adult; Aged; Child; Cluster Analysis; Computer Simulation; Exostoses, Multiple Hereditary/pathology; Female; Humans; Male; Middle Aged; Models, Biological; Young Adult
11.
J Clin Med ; 12(12)2023 Jun 16.
Article in English | MEDLINE | ID: mdl-37373787

ABSTRACT

Identifying and treating lipid abnormalities is crucial for preventing cardiovascular disease in diabetic patients, yet only two-thirds of patients reach recommended cholesterol levels. Elucidating the factors associated with lipid goal attainment represents an unmet clinical need. To address this knowledge gap, we conducted a real-world analysis of the lipid profiles of 11,252 patients from the Annals of the Italian Association of Medical Diabetologists (AMD) database from 2005 to 2019. We used a Logic Learning Machine (LLM) to extract and classify the most relevant variables predicting the achievement of a low-density lipoprotein cholesterol (LDL-C) value lower than 100 mg/dL (2.60 mmol/L) within two years of the start of lipid-lowering therapy. Our analysis showed that 61.4% of the patients achieved the treatment goal. The LLM model demonstrated good predictive performance, with a precision of 0.78, accuracy of 0.69, recall of 0.70, F1 score of 0.74, and ROC-AUC of 0.79. The most significant predictors of achieving the treatment goal were LDL-C values at the start of lipid-lowering therapy and their reduction after six months. Other predictors of a greater likelihood of reaching the target included high-density lipoprotein cholesterol, albuminuria, and body mass index at baseline, as well as younger age, male sex, more follow-up visits, no therapy discontinuation, higher Q-score, lower blood glucose and HbA1c levels, and the use of anti-hypertensive medication. At baseline, for each LDL-C range analysed, the LLM model also provided the minimum reduction that needs to be achieved by the next six-month visit to increase the likelihood of reaching the therapeutic goal within two years. These findings could serve as a useful tool to inform therapeutic decisions and to encourage further in-depth analysis and testing.
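
The precision, recall and F1 score quoted in the abstract of record 11 are linked by the standard definition of F1 as the harmonic mean of precision and recall. The snippet below is a small illustrative check, not code from the study; the function name `f1_score` is chosen here for convenience.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Values as reported in the abstract: precision 0.78, recall 0.70.
    print(round(f1_score(0.78, 0.70), 2))
```

With the reported precision and recall, the result rounds to the F1 score given in the abstract.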

12.
Clin Ther ; 45(8): 754-761, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37451913

ABSTRACT

PURPOSE: Recently, the 2022 American Diabetes Association and European Association for the Study of Diabetes (ADA-EASD) consensus report stressed the importance of weight control in the management of patients with type 2 diabetes; weight control should be a primary target of therapy. This retrospective analysis evaluated, through an artificial-intelligence (AI) projection of data from the AMD Annals database (a large collection of most Italian diabetology medical records covering 15 years, 2005-2019), the potential effects of the extended use of sodium-glucose co-transporter 2 inhibitors (SGLT-2is) and of glucagon-like peptide 1 receptor agonists (GLP-1-RAs) on HbA1c and weight. METHODS: Data from 4,927,548 visits in 558,097 patients were retrospectively extracted using these exclusion criteria: type 1 diabetes, pregnancy, age >75 years, dialysis, and lack of data on HbA1c or weight. The analysis revealed late prescribing of SGLT-2is and GLP-1-RAs (innovative drugs), and considering a time frame of 4 years (2014-2017), a paradoxically greater percentage of combined-goal (HbA1c <7% and weight gain <2%) achievement was found with older drugs than with innovative drugs, demonstrating aspects of therapeutic inertia. Through a machine-learning AI technique, a "what-if" analysis was performed, using query models of two outcomes: (1) achievement of the combined goal at the visit subsequent to a hypothetical initial prescribing of an SGLT-2i or a GLP-1-RA, with and without insulin, selected according to the 2018 ADA-EASD diabetes recommendations; and (2) persistence of the combined goal for 18 months. For the two models, sensitivity was 71.1% and 69.8%, and specificity was 67% and 76%, respectively. FINDINGS: The first query of the AI analysis showed a great improvement in achievement of the combined goal: 38.8% with prescribing in clinical practice versus 66.5% with prescribing in the "what-if" simulation. Addressing persistence at 18 months after the initial achievement of the combined goal, the simulation showed a potentially better performance of SGLT-2is and GLP-1-RAs with respect to each antidiabetic pharmacologic class or combination considered. IMPLICATIONS: AI appears potentially useful in the analysis of a great amount of data, such as that derived from the AMD Annals. In the present study, an LLM analysis revealed a great potential improvement in achieving metabolic targets with SGLT-2i and GLP-1-RA utilization. These results underscore the importance of early, timely, and extended use of these new drugs.

13.
Epidemiol Health ; 44: e2022088, 2022.
Article in English | MEDLINE | ID: mdl-36265519

ABSTRACT

OBJECTIVES: The area under a receiver operating characteristic (ROC) curve (AUC) is a popular measure of pure diagnostic accuracy that is independent from the proportion of diseased subjects in the analysed sample. However, its actual usefulness in the clinical context has been questioned, because it does not seem to be directly related to the actual performance of a diagnostic marker in identifying diseased and non-diseased subjects in real clinical settings. This study evaluates the relationship between the AUC and the proportion of correct classifications (global diagnostic accuracy, GDA) in relation to the shape of the corresponding ROC curves. METHODS: We demonstrate that AUC represents an upward-biased measure of GDA at an optimal accuracy cut-off for balanced groups. The magnitude of bias depends on the shape of the ROC plot and on the proportion of diseased and non-diseased subjects. In proper curves, the bias is independent from the diseased/non-diseased ratio and can be easily estimated and removed. Moreover, a comparison between 2 partial AUCs can be replaced by a more powerful test for the corresponding whole AUCs. RESULTS: Applications to 3 real datasets are provided: a marker for a hormone deficit in children, 2 tumour markers for malignant mesothelioma, and 2 gene expression profiles in ovarian cancer patients. CONCLUSIONS: The AUC is a measure of accuracy with potential clinical relevance for the evaluation of disease markers. The clinical meaning of ROC parameters should always be evaluated with an analysis of the shape of the corresponding ROC curve.


Subjects
ROC Curve; Child; Humans; Area Under Curve; Bias
14.
J Pers Med ; 12(10)2022 Sep 26.
Article in English | MEDLINE | ID: mdl-36294727

ABSTRACT

BACKGROUND: The application of Machine Learning (ML) to genetic individual-level data represents a foreseeable advancement for the field, which is still in its infancy. Here, we aimed to evaluate the feasibility and accuracy of an ML-based model for disease risk prediction applied to Primary Biliary Cholangitis (PBC). METHODS: Genome-wide significant variants identified in subjects of European ancestry in the recently released second international meta-analysis of GWAS in PBC were used as input data. Quality-checked, individual genomic data from two Italian cohorts were used. The ML included the following steps: import of genotype and phenotype data, genetic variant selection, supervised classification of PBC by genotype, generation of "if-then" rules for disease prediction by logic learning machine (LLM), and model validation in a different cohort. RESULTS: The training cohort included 1345 individuals: 444 were PBC cases and 901 were healthy controls. After pre-processing, 41,899 variants entered the analysis. Several configurations of parameters related to feature selection were simulated. The best LLM model reached an Accuracy of 71.7%, a Matthews correlation coefficient of 0.29, a Youden's value of 0.21, a Sensitivity of 0.28, a Specificity of 0.93, a Positive Predictive Value of 0.66, and a Negative Predictive Value of 0.72. Thirty-eight rules were generated. The rule with the highest covering (19.14) included the following genes: RIN3, KANSL1, TIMMDC1, TNPO3. The validation cohort included 834 individuals: 255 cases and 579 controls. By applying the ruleset derived in the training cohort, the Area under the Curve of the model was 0.73. CONCLUSIONS: This study represents the first illustration of an ML model applied to common variants associated with PBC. Our approach is computationally feasible, leverages individual-level data to generate intelligible rules, and can be used for disease prediction in at-risk individuals.
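
Several of the performance figures in the abstract of record 14 follow from the sensitivity, the specificity and the cohort composition. The sketch below rebuilds an approximate confusion matrix under the assumption, not stated explicitly in the abstract, that the metrics refer to the training cohort (444 cases, 901 controls); the function name and the returned keys are illustrative. Because the inputs are themselves rounded, the derived values only approximately reproduce the reported ones.

```python
from math import sqrt


def metrics_from_rates(sensitivity, specificity, n_cases, n_controls):
    """Derive common metrics from sensitivity, specificity and group sizes.

    The confusion-matrix cells are expected counts (possibly fractional),
    since they are reconstructed from rounded rates.
    """
    tp = sensitivity * n_cases      # true positives
    fn = n_cases - tp               # false negatives
    tn = specificity * n_controls   # true negatives
    fp = n_controls - tn            # false positives

    return {
        # Youden's J statistic: sensitivity + specificity - 1
        "youden": sensitivity + specificity - 1,
        "accuracy": (tp + tn) / (n_cases + n_controls),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # Matthews correlation coefficient
        "mcc": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


if __name__ == "__main__":
    # Sensitivity 0.28 and specificity 0.93 as reported in the abstract.
    for name, value in metrics_from_rates(0.28, 0.93, 444, 901).items():
        print(f"{name}: {value:.3f}")
```

Under the stated assumption, the derived Youden's value, predictive values and Matthews correlation coefficient fall within about 0.01 of the figures in the abstract.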

15.
Anal Biochem ; 417(2): 174-81, 2011 Oct 15.
Article in English | MEDLINE | ID: mdl-21756868

ABSTRACT

Although most time-of-flight (TOF) mass spectrometers come equipped with vacuum matrix-assisted laser desorption/ionization (MALDI) sources, the atmospheric pressure MALDI (API-MALDI) source is an attractive option because of its ability to be coupled to a wide range of analyzers. This article describes the use of an API-MALDI source coupled to a TOF mass spectrometer for evaluation of the effects of medium- and long-term storage on peptidomic profiles of cryopreserved serum samples from healthy women. Peptides were purified using superparamagnetic beads either from fresh sera or after serum storage at -80°C for 18 months or at -20°C for 8 years. Data were preprocessed using newly developed bioinformatic tools and then were subjected to statistical analysis and class prediction. The analyses showed a dramatic effect of storage on the abundance of several peptides such as fibrinopeptides A and B, complement fractions, bradykinin, and clusterin, indicated by other authors as disease biomarkers. Most of these results were confirmed by shadow clustering analysis, able to classify each sample in the correct group. In addition to demonstrating the suitability of the API-MALDI technique for peptidome profiling studies, our data are of relevance for retrospective studies that involve frozen sera stored for many years in biobanks.


Subjects
Atmospheric Pressure; Cryopreservation; Neoplasms/blood; Peptides/blood; Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods; Bradykinin/blood; Clusterin/blood; Complement System Proteins/analysis; Female; Fibrinopeptide A/analysis; Fibrinopeptide B/analysis; Humans; Specimen Handling
16.
Cancers (Basel) ; 12(9), 2020 Aug 19.
Article in English | MEDLINE | ID: mdl-32825087

ABSTRACT

The biological and clinical heterogeneity of neuroblastoma (NB) demands novel biomarkers and therapeutic targets to guide the most appropriate treatment for each patient. Hypoxia is a condition of low oxygen tension occurring in poorly vascularized tumor tissues. In this study, we aimed to assess the role of hypoxia in the pathogenesis of NB and to develop a new clinically relevant hypoxia-based predictor of outcome. We analyzed the gene expression profiles of 1882 untreated NB primary tumors collected at diagnosis and belonging to four existing data sets. Analyses took advantage of machine learning methods. We identified NB-hop, a seven-gene hypoxia biomarker, as a predictor of NB patient prognosis that discriminates, on a molecular basis, between two populations of patients with unfavorable or favorable outcome. NB-hop retained its prognostic value in a multivariate model adjusted for established risk factors and further stratified clinically relevant groups of patients. Tumors with an unfavorable NB-hop expression showed a significant association with telomerase activation and a hypoxic, immunosuppressive, poorly differentiated, and apoptosis-resistant tumor microenvironment. NB-hop defines a new population of NB patients with hypoxic tumors and unfavorable prognosis, and it represents a critical factor for the stratification and treatment of NB patients.

17.
BMC Bioinformatics ; 9: 410, 2008 Oct 03.
Article in English | MEDLINE | ID: mdl-18834513

ABSTRACT

BACKGROUND: Most microarray experiments are carried out with the purpose of identifying genes whose expression varies in relation to specific conditions or in response to environmental stimuli. In such studies, genes showing similar mean expression values between two or more groups are considered not differentially expressed, even if hidden subclasses with different expression values may exist. In this paper we propose a new method for identifying differentially expressed genes, based on the area between the ROC curve and the rising diagonal (ABCR). ABCR represents a more general approach than the standard area under the ROC curve (AUC), because it can identify both proper (i.e., concave) and not proper ROC curves (NPRC). In particular, NPRC may correspond to those genes that tend to escape standard selection methods. RESULTS: We assessed the performance of our method using data from a publicly available database of 4026 genes, including 14 normal B cell samples (NBC) and 20 heterogeneous lymphomas (namely, 9 follicular lymphomas and 11 chronic lymphocytic leukemias). Moreover, NBC also included two subclasses, i.e., 6 heavily stimulated and 8 slightly or not stimulated samples. We identified 1607 differentially expressed genes with an estimated False Discovery Rate of 15%. Among them, 16 corresponded to NPRC, and all escaped standard selection procedures based on AUC and t statistics. Moreover, a simple inspection of the shape of such plots allowed the two subclasses in either class to be identified in 13 cases (81%). CONCLUSION: NPRC represent a useful new tool for the analysis of microarray data.
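The contrast between AUC and ABCR described above can be illustrated with a toy gene whose cases split into a low and a high subclass around the controls. This is a minimal sketch under a stated assumption: ABCR is read here as the integral of |ROC(x) − x| over the empirical, piecewise-linear ROC curve, which may differ in detail from the authors' estimator. The function names and the expression values are hypothetical.

```python
def roc_points(values, is_case):
    """Empirical ROC points (FPR, TPR), calling 'case' when value >= threshold."""
    n_case = sum(is_case)
    n_ctrl = len(is_case) - n_case
    points = [(0.0, 0.0)]
    for thr in sorted(set(values), reverse=True):
        tp = sum(1 for v, c in zip(values, is_case) if c and v >= thr)
        fp = sum(1 for v, c in zip(values, is_case) if not c and v >= thr)
        points.append((fp / n_ctrl, tp / n_case))
    return points


def auc(points):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def abcr(points):
    """Area between the ROC curve and the rising diagonal: integral of |ROC(x) - x|."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        d0, d1, width = y0 - x0, y1 - x1, x1 - x0
        if d0 * d1 >= 0:
            # The segment stays on one side of the diagonal.
            total += width * (abs(d0) + abs(d1)) / 2
        else:
            # The segment crosses the diagonal: sum the two triangles exactly.
            total += width * (d0 * d0 + d1 * d1) / (2 * (abs(d0) + abs(d1)))
    return total


# Hypothetical expression values: cases form two hidden subclasses (low and high),
# controls sit in between, so both groups have the same mean (5).
values = [1, 2, 8, 9, 4, 5, 6]
is_case = [True, True, True, True, False, False, False]

pts = roc_points(values, is_case)
print("AUC :", auc(pts))
print("ABCR:", abcr(pts))
```

In this toy case the ROC curve crosses the diagonal, so the AUC sits at the chance level while the ABCR is clearly above zero, which is the kind of gene the abstract says escapes AUC-based selection.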


Subjects
Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic , Lymphoma/metabolism , Neoplasm Proteins/biosynthesis , Oligonucleotide Array Sequence Analysis/methods , ROC Curve , Animals , Databases, Genetic , Humans , Lymphoma/genetics , Neoplasm Proteins/genetics , Predictive Value of Tests
18.
Health Informatics J ; 24(1): 54-65, 2018 03.
Article in English | MEDLINE | ID: mdl-27354395

ABSTRACT

This study evaluates the performance of a set of machine learning techniques in predicting the prognosis of Hodgkin's lymphoma using clinical factors and gene expression data. Analysed samples from 130 Hodgkin's lymphoma patients included a small set of clinical variables and more than 54,000 gene features. Machine learning classifiers included three black-box algorithms (k-nearest neighbour, Artificial Neural Network, and Support Vector Machine) and two methods based on intelligible rules (Decision Tree and the innovative Logic Learning Machine method). Support Vector Machine clearly outperformed any of the other methods. Among the two rule-based algorithms, Logic Learning Machine performed better and identified a set of simple intelligible rules based on a combination of clinical variables and gene expressions. Decision Tree identified a non-coding gene (XIST) involved in the early phases of X chromosome inactivation that was overexpressed in females and in non-relapsed patients. XIST expression might be responsible for the better prognosis of female Hodgkin's lymphoma patients.


Subjects
Gene Expression/physiology , Hodgkin Disease/classification , Machine Learning/trends , Prognosis , Cluster Analysis , Decision Trees , Hodgkin Disease/diagnosis , Humans
19.
Int J Mol Med ; 12(3): 355-63, 2003 Sep.
Article in English | MEDLINE | ID: mdl-12883652

ABSTRACT

The 1785 nucleotides of the coding region of the estrogen receptor alpha (ER-alpha) are dispersed over a region of more than 300,000 nucleotides in the primary transcript. Splicing of this precursor RNA frequently leads to variants lacking one or more exons that have been associated with breast cancer progression. The most frequent splice variant lacks exon 4 and is expressed in the human mammary carcinoma cell line MCF-7 at a level similar to that of the full-length messenger. The in silico analysis of ER-alpha splice sites by Hamming clustering, a self-learning method trained on more than 28,000 experimentally proven splice sites, reveals high relevance for the 5' and 3' splice sites of exon 4. The splicing analysis of transfected mini-gene constructs containing drastically shortened introns rules out weak splice sites, intron or exon lengths, or splice enhancers as the cause of exon skipping. Exon 6 is never skipped in MCF-7 cells but is spliced out of mini-gene-derived primary transcripts if inserted between exons 3 and 5 in place of exon 4. Consequently, it appears that a particular splice site affinity of the exon 3 donor (5' splice site) and exon 5 acceptor (3' splice site) sites is responsible for skipping of the exon in between.


Subjects
Alternative Splicing/physiology , RNA Processing, Post-Transcriptional/physiology , RNA, Messenger/metabolism , Receptors, Estrogen/genetics , Computational Biology , Estrogen Receptor alpha , Exons/physiology , Introns/physiology , RNA Splice Sites/physiology , Receptors, Estrogen/biosynthesis
20.
IEEE Trans Neural Netw ; 15(3): 533-44, 2004 May.
Article in English | MEDLINE | ID: mdl-15384544

ABSTRACT

The general problem of reconstructing an unknown function from a finite collection of samples is considered for the case in which the position of each input vector in the training set is not fixed beforehand but is part of the learning process. In particular, the consistency of the empirical risk minimization (ERM) principle is analyzed when the points in the input space are generated by a purely deterministic algorithm (deterministic learning). When the output generation is not subject to noise, classical number-theoretic results involving discrepancy and variation yield a sufficient condition for the consistency of the ERM principle. In addition, the adoption of low-discrepancy sequences yields a learning rate of O(1/L), with L being the size of the training set. An extension to the noisy case is provided, which shows that the good properties of deterministic learning are preserved if the level of noise at the output is not high. Simulation results confirm the validity of the proposed approach.
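To make the notion of a low-discrepancy sequence concrete, the sketch below generates the base-2 van der Corput sequence in one dimension and evaluates its star discrepancy with the standard closed-form expression for sorted points. This is an illustrative assumption rather than the construction used in the paper: the abstract does not say which sequences were adopted, the function names are hypothetical, and the discrepancy of the sample points is only one ingredient of the stated learning rate, not the rate itself.

```python
def van_der_corput(index, base=2):
    """Radical inverse of a positive integer: mirror its base-b digits about the point."""
    value, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        value += digit / denom
    return value


def star_discrepancy_1d(points):
    """Exact star discrepancy of points in [0, 1] using the sorted-sample formula:
    D* = 1/(2N) + max_i |x_(i) - (2i - 1)/(2N)|."""
    xs = sorted(points)
    n = len(xs)
    return 1 / (2 * n) + max(
        abs(x - (2 * i - 1) / (2 * n)) for i, x in enumerate(xs, start=1)
    )


# Discrepancy of the first L points for a few training-set sizes L.
for size in (64, 256, 1024):
    sample = [van_der_corput(i) for i in range(1, size + 1)]
    d_star = star_discrepancy_1d(sample)
    print(f"L={size:5d}  D*={d_star:.6f}  L*D*={size * d_star:.3f}")
```

At these power-of-two sizes the product L·D* stays constant, which gives a feel for the 1/L-type decay that deterministic sampling can achieve.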


Subjects
Neural Networks, Computer , Artificial Intelligence