Search | Biblioteca Virtual em Saúde

The impact of imputation quality on machine learning classifiers for datasets with missing values.

Shadbahr, Tolou; Roberts, Michael; Stanczuk, Jan; Gilbey, Julian; Teare, Philip; Dittmer, Sören; Thorpe, Matthew; Torné, Ramon Viñas; Sala, Evis; Lió, Pietro; Patel, Mishal; Preller, Jacobus; Rudd, James H F; Mirtti, Tuomas; Rannikko, Antti Sakari; Aston, John A D; Tang, Jing; Schönlieb, Carola-Bibiane.

Commun Med (Lond) ; 3(1): 139, 2023 Oct 06.

Article in English | MEDLINE | ID: mdl-37803172

ABSTRACT

BACKGROUND: Classifying samples in incomplete datasets is a common aim for machine learning practitioners, but is non-trivial. Missing data is found in most real-world datasets and these missing values are typically imputed using established methods, followed by classification of the now complete samples. The focus of the machine learning researcher is to optimise the classifier's performance. METHODS: We utilise three simulated and three real-world clinical datasets with different feature types and missingness patterns. Initially, we evaluate how the downstream classifier performance depends on the choice of classifier and imputation methods. We employ ANOVA to quantitatively evaluate how the choice of missingness rate, imputation method, and classifier method influences the performance. Additionally, we compare commonly used methods for assessing imputation quality and introduce a class of discrepancy scores based on the sliced Wasserstein distance. We also assess the stability of the imputations and the interpretability of models built on the imputed data. RESULTS: The performance of the classifier is most affected by the percentage of missingness in the test data, with a considerable performance decline observed as the test missingness rate increases. We also show that the commonly used measures for assessing imputation quality tend to lead to imputed data which poorly matches the underlying data distribution, whereas our new class of discrepancy scores performs much better on this measure. Furthermore, we show that the interpretability of classifier models trained using poorly imputed data is compromised. CONCLUSIONS: It is imperative to consider the quality of the imputation when performing downstream classification as the effects on the classifier can be considerable.

Many artificial intelligence (AI) methods aim to classify samples of data into groups, e.g., patients with disease vs. those without. This often requires datasets to be complete, i.e., that all data has been collected for all samples. However, in clinical practice this is often not the case and some data can be missing. One solution is to 'complete' the dataset using a technique called imputation to replace those missing values. However, assessing how well the imputation method performs is challenging. In this work, we demonstrate why people should care about imputation, develop a new method for assessing imputation quality, and demonstrate that if we build AI models on poorly imputed data, the model can give different results to those we would hope for. Our findings may improve the utility and quality of AI models in the clinic.

The potential and pitfalls of artificial intelligence in clinical pharmacology.

Johnson, Martin; Patel, Mishal; Phipps, Alex; van der Schaar, Mihaela; Boulton, Dave; Gibbs, Megan.

CPT Pharmacometrics Syst Pharmacol ; 12(3): 279-284, 2023 03.

Article in English | MEDLINE | ID: mdl-36717763

Subjects

Artificial Intelligence, Clinical Pharmacology, Humans

Quantitative breast density analysis to predict interval and node-positive cancers in pursuit of improved screening protocols: a case-control study.

Burnside, Elizabeth S; Warren, Lucy M; Myles, Jonathan; Wilkinson, Louise S; Wallis, Matthew G; Patel, Mishal; Smith, Robert A; Young, Kenneth C; Massat, Nathalie J; Duffy, Stephen W.

Br J Cancer ; 125(6): 884-892, 2021 09.

Article in English | MEDLINE | ID: mdl-34168297

ABSTRACT

BACKGROUND: This study investigates whether quantitative breast density (BD) serves as an imaging biomarker for more intensive breast cancer screening by predicting interval and node-positive cancers. METHODS: This case-control study of 1204 women aged 47-73 includes 599 cancer cases (302 screen-detected, 297 interval; 239 node-positive, 360 node-negative) and 605 controls. Automated BD software calculated fibroglandular volume (FGV), volumetric breast density (VBD) and density grade (DG). A radiologist assessed BD using a visual analogue scale (VAS) from 0 to 100. Logistic regression and area under the receiver operating characteristic curves (AUC) determined whether BD could predict mode of detection (screen-detected or interval); node-negative cancers; node-positive cancers; and all cancers vs. controls. RESULTS: FGV, VBD, VAS, and DG all discriminated interval cancers (all p < 0.01) from controls. Only FGV-quartile discriminated screen-detected cancers (p < 0.01). Based on AUC, FGV discriminated all cancer types better than VBD or VAS. FGV showed a significantly greater discrimination of interval cancers, AUC = 0.65, than of screen-detected cancers, AUC = 0.61 (p < 0.01), as did VBD (0.63 and 0.53, respectively, p < 0.001). CONCLUSION: FGV, VBD, VAS and DG discriminate interval cancers from controls, reflecting some masking risk. Only FGV discriminates screen-detected cancers, perhaps adding a unique component of breast cancer risk.

Subjects

Breast Density, Breast Neoplasms/diagnostic imaging, Mammography/methods, Aged, Case-Control Studies, Early Detection of Cancer, Female, Humans, Middle Aged, Randomized Controlled Trials as Topic, Visual Analog Scale

Bilateral Single System Ectopic Ureters With Vaginal Insertion in a Female Child, A Rare Variant.

Patel, Mishal; Parikh, Urvish; Shrotriya, Radhika; Kadam, Spandan; Shah, Jainam; Chandna, Sudhir.

Urology ; 149: e37-e39, 2021 Mar.

Article in English | MEDLINE | ID: mdl-33129874

ABSTRACT

In most cases an ectopic ureter is associated with a duplicated renal collecting system, while a single system is found in only a few. Bilateral single system ureteral ectopia is even rarer. A 9-year-old girl presented with urinary incontinence. Investigations pointed towards bilateral single system ectopic ureters with ectopic openings into the vagina and a hypoplastic bladder. The left ureteric system was tortuous, with a malrotated and hypoplastic left kidney. A 4 × 2 cm hard calculus was found in the vagina. Right ureteric reimplantation with left-to-right uretero-ureterostomy was done, with satisfactory postoperative daytime continence at 6 months without the need for bladder reconstruction or urinary diversion.

Subjects

Multiple Abnormalities, Ureter/abnormalities, Vagina/abnormalities, Multiple Abnormalities/classification, Child, Female, Humans, Ureter/pathology

Objective assessment of cancer genes for drug discovery.

Patel, Mishal N; Halling-Brown, Mark D; Tym, Joseph E; Workman, Paul; Al-Lazikani, Bissan.

Nat Rev Drug Discov ; 12(1): 35-50, 2013 01.

Article in English | MEDLINE | ID: mdl-23274470

ABSTRACT

Selecting the best targets is a key challenge for drug discovery, and achieving this effectively, efficiently and systematically is particularly important for prioritizing candidates from the sizeable lists of potential therapeutic targets that are now emerging from large-scale multi-omics initiatives, such as those in oncology. Here, we describe an objective, systematic, multifaceted computational assessment of biological and chemical space that can be applied to any human gene set to prioritize targets for therapeutic exploration. We use this approach to evaluate an exemplar set of 479 cancer-associated genes, reveal the tension between biological relevance and chemical tractability, and describe major gaps in available knowledge that could be addressed to aid objective decision-making. We also propose drug repurposing opportunities and identify potentially druggable cancer-associated proteins that have been poorly explored with regard to the discovery of small-molecule modulators, despite their biological relevance.

Subjects

Antineoplastic Agents/pharmacology, Drug Discovery/methods, Molecular Targeted Therapy, Neoplasms/drug therapy, Decision Making, Drug Design, Humans, Neoplasms/genetics, Neoplasms/pathology

canSAR: an integrated cancer public translational research and drug discovery resource.

Halling-Brown, Mark D; Bulusu, Krishna C; Patel, Mishal; Tym, Joe E; Al-Lazikani, Bissan.

Nucleic Acids Res ; 40(Database issue): D947-56, 2012 Jan.

Article in English | MEDLINE | ID: mdl-22013161

ABSTRACT

canSAR is a fully integrated cancer research and drug discovery resource developed to utilize the growing publicly available biological annotation, chemical screening, RNA interference screening, expression, amplification and 3D structural data. Scientists can, in a single place, rapidly identify biological annotation of a target, its structural characterization, expression levels and protein interaction data, as well as suitable cell lines for experiments, potential tool compounds and similarity to known drug targets. canSAR has, from the outset, been completely use-case driven which has dramatically influenced the design of the back-end and the functionality provided through the interfaces. The Web interface at http://cansar.icr.ac.uk provides flexible, multipoint entry into canSAR. This allows easy access to the multidisciplinary data within, including target and compound synopses, bioactivity views and expert tools for chemogenomic, expression and protein interaction network data.

Subjects

Antineoplastic Agents/chemistry, Genetic Databases, Neoplasms/genetics, Neoplasms/metabolism, Antineoplastic Agents/pharmacology, Tumor Cell Line, Drug Discovery, Gene Expression, Genetic Variation, Humans, Internet, Molecular Models, Protein Interaction Maps, RNA Interference, Systems Integration, Translational Biomedical Research

Anisotropic elastic network modeling of entire microtubules.

Deriu, Marco A; Soncini, Monica; Orsi, Mario; Patel, Mishal; Essex, Jonathan W; Montevecchi, Franco M; Redaelli, Alberto.

Biophys J ; 99(7): 2190-9, 2010 Oct 06.

Article in English | MEDLINE | ID: mdl-20923653

ABSTRACT

Microtubules are supramolecular structures that make up the cytoskeleton and strongly affect the mechanical properties of the cell. Within the cytoskeleton filaments, the microtubule (MT) exhibits by far the highest bending stiffness. Bending stiffness depends on the mechanical properties and intermolecular interactions of the tubulin dimers (the MT building blocks). Computational molecular modeling has the potential for obtaining quantitative insights into this area. However, to our knowledge, standard molecular modeling techniques, such as molecular dynamics (MD) and normal mode analysis (NMA), are not yet able to simulate large molecular structures like the MTs; in fact, their possibilities are normally limited to much smaller protein complexes. In this work, we developed a multiscale approach by merging the modeling contribution from MD and NMA. In particular, MD simulations were used to refine the molecular conformation and arrangement of the tubulin dimers inside the MT lattice. Subsequently, NMA was used to investigate the vibrational properties of MTs modeled as an elastic network. The coarse-grain model here developed can describe systems of hundreds of interacting tubulin monomers (corresponding to up to 1,000,000 atoms). In particular, we were able to simulate coarse-grain models of entire MTs, with lengths up to 350 nm. A quantitative mechanical investigation was performed; from the bending and stretching modes, we estimated MT macroscopic properties such as bending stiffness, Young modulus, and persistence length, thus allowing a direct comparison with experimental data.

Subjects

Elasticity, Microtubules/metabolism, Biological Models, Anisotropy, Molecular Dynamics Simulation, Protein Multimerization, Reference Standards, Tubulin/chemistry
