Search | BVS España Search Portal

Deep neural nets as a method for quantitative structure-activity relationships.

Ma, Junshui; Sheridan, Robert P; Liaw, Andy; Dahl, George E; Svetnik, Vladimir.

J Chem Inf Model ; 55(2): 263-74, 2015 Feb 23.

Article in English | MEDLINE | ID: mdl-25635324

ABSTRACT

Neural networks were widely used for quantitative structure-activity relationships (QSAR) in the 1990s. Because of various practical issues (e.g., slow on large problems, difficult to train, prone to overfitting), they were superseded by more robust methods such as support vector machines (SVM) and random forests (RF), which arose in the early 2000s. The last 10 years have witnessed a revival of neural networks in the machine learning community thanks to new methods for preventing overfitting, more efficient training algorithms, and advances in computer hardware. In particular, deep neural nets (DNNs), i.e., neural nets with more than one hidden layer, have found great success in many applications, such as computer vision and natural language processing. Here we show that DNNs can routinely make better prospective predictions than RF on a set of large, diverse QSAR data sets taken from Merck's drug discovery effort. The number of adjustable parameters needed for DNNs is fairly large, but our results show that it is not necessary to optimize them for individual data sets, and a single set of recommended parameters can achieve better performance than RF for most of the data sets we studied. The usefulness of the parameters is demonstrated on additional data sets not used in the calibration. Although training DNNs is still computationally intensive, using graphics processing units (GPUs) can make this issue manageable.

Subject(s)

Neural Networks, Computer; Quantitative Structure-Activity Relationship; Algorithms; Drug Discovery; Machine Learning; Prospective Studies; Support Vector Machine; Workflow

A mobile-optimized artificial intelligence system for gestational age and fetal malpresentation assessment.

Gomes, Ryan G; Vwalika, Bellington; Lee, Chace; Willis, Angelica; Sieniek, Marcin; Price, Joan T; Chen, Christina; Kasaro, Margaret P; Taylor, James A; Stringer, Elizabeth M; McKinney, Scott Mayer; Sindano, Ntazana; Dahl, George E; Goodnight, William; Gilmer, Justin; Chi, Benjamin H; Lau, Charles; Spitz, Terry; Saensuksopa, T; Liu, Kris; Tiyasirichokchai, Tiya; Wong, Jonny; Pilgrim, Rory; Uddin, Akib; Corrado, Greg; Peng, Lily; Chou, Katherine; Tse, Daniel; Stringer, Jeffrey S A; Shetty, Shravya.

Commun Med (Lond) ; 2: 128, 2022.

Article in English | MEDLINE | ID: mdl-36249461

ABSTRACT

Background: Fetal ultrasound is an important component of antenatal care, but a shortage of adequately trained healthcare workers has limited its adoption in low-to-middle-income countries. This study investigated the use of artificial intelligence for fetal ultrasound in under-resourced settings. Methods: Blind sweep ultrasounds, consisting of six freehand ultrasound sweeps, were collected by sonographers in the USA and Zambia, and by novice operators in Zambia. We developed artificial intelligence (AI) models that used blind sweeps to predict gestational age (GA) and fetal malpresentation. AI GA estimates and standard fetal biometry estimates were compared to a previously established ground truth and evaluated for difference in absolute error. Fetal malpresentation (non-cephalic vs cephalic) was compared to sonographer assessment. On-device AI model run-times were benchmarked on Android mobile phones. Results: Here we show that GA estimation accuracy of the AI model is non-inferior to standard fetal biometry estimates (error difference -1.4 ± 4.5 days, 95% CI -1.8, -0.9, n = 406). Non-inferiority is maintained when blind sweeps are acquired by novice operators performing only two of six sweep motion types. Fetal malpresentation AUC-ROC is 0.977 (95% CI 0.949, 1.00, n = 613); sonographers and novices have similar AUC-ROCs. Software run-times on mobile phones for both diagnostic models are less than 3 s after completion of a sweep. Conclusions: The gestational age model is non-inferior to the clinical standard, and the fetal malpresentation model has high AUC-ROCs across operators and devices. Our AI models are able to run on-device, without internet connectivity, and provide feedback scores to help improve the capabilities of lightly trained ultrasound operators in low-resource settings.
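The non-inferiority claim in the abstract above rests on a confidence interval for the difference in absolute error. The following is a minimal sketch of that kind of check, not the study's own analysis: it assumes the quoted "± 4.5" is a standard deviation, uses a normal approximation, and applies a hypothetical margin (`MARGIN_DAYS`), since the abstract does not state one. Recomputing from rounded summary statistics gives an interval close to, but not necessarily identical to, the reported one.

```python
import math


def mean_diff_ci(mean_diff, sd, n, z=1.96):
    """Normal-approximation confidence interval for a mean difference."""
    half_width = z * sd / math.sqrt(n)
    return mean_diff - half_width, mean_diff + half_width


def is_non_inferior(ci_upper, margin):
    """Non-inferior if the upper bound of (AI error - reference error)
    stays below the chosen margin."""
    return ci_upper < margin


# Summary statistics quoted in the abstract, in days.
# ASSUMPTION: 4.5 is treated as the standard deviation of the difference.
lower, upper = mean_diff_ci(mean_diff=-1.4, sd=4.5, n=406)

# HYPOTHETICAL margin, for illustration only; not taken from the abstract.
MARGIN_DAYS = 1.0

print(f"approximate 95% CI: ({lower:.2f}, {upper:.2f}) days")
print("non-inferior at this margin:", is_non_inferior(upper, MARGIN_DAYS))
```

Because the whole interval lies below zero in this approximation, the conclusion does not depend on the particular positive margin chosen here.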

Machine learning guided aptamer refinement and discovery.

Bashir, Ali; Yang, Qin; Wang, Jinpeng; Hoyer, Stephan; Chou, Wenchuan; McLean, Cory; Davis, Geoff; Gong, Qiang; Armstrong, Zan; Jang, Junghoon; Kang, Hui; Pawlosky, Annalisa; Scott, Alexander; Dahl, George E; Berndl, Marc; Dimon, Michelle; Ferguson, B Scott.

Nat Commun ; 12(1): 2366, 2021 Apr 22.

Article in English | MEDLINE | ID: mdl-33888692

ABSTRACT

Aptamers are single-stranded nucleic acid ligands that bind to target molecules with high affinity and specificity. They are typically discovered by searching large libraries for sequences with desirable binding properties. These libraries, however, are practically constrained to a fraction of the theoretical sequence space. Machine learning provides an opportunity to intelligently navigate this space to identify high-performing aptamers. Here, we propose an approach that employs particle display (PD) to partition a library of aptamers by affinity, and uses such data to train machine learning models to predict affinity in silico. Our model predicted high-affinity DNA aptamers from experimental candidates at a rate 11-fold higher than random perturbation and generated novel, high-affinity aptamers at a greater rate than observed by PD alone. Our approach also facilitated the design of truncated aptamers 70% shorter and with higher binding affinity (1.5 nM) than the best experimental candidate. This work demonstrates how combining machine learning and physical approaches can be used to expedite the discovery of better diagnostic and therapeutic agents.

Subject(s)

Aptamers, Nucleotide/metabolism; Machine Learning; Aptamers, Nucleotide/chemistry; Aptamers, Nucleotide/genetics; Computer Simulation; Drug Discovery/methods; Gene Library; Ligands; Lipocalin-2/chemistry; Lipocalin-2/genetics; Lipocalin-2/metabolism; Models, Chemical; Protein Binding

Artificial Intelligence-Based Breast Cancer Nodal Metastasis Detection: Insights Into the Black Box for Pathologists.

Liu, Yun; Kohlberger, Timo; Norouzi, Mohammad; Dahl, George E; Smith, Jenny L; Mohtashamian, Arash; Olson, Niels; Peng, Lily H; Hipp, Jason D; Stumpe, Martin C.

Arch Pathol Lab Med ; 143(7): 859-868, 2019 Jul.

Article in English | MEDLINE | ID: mdl-30295070

ABSTRACT

CONTEXT: Nodal metastasis of a primary tumor influences therapy decisions for a variety of cancers. Histologic identification of tumor cells in lymph nodes can be laborious and error-prone, especially for small tumor foci. OBJECTIVE: To evaluate the application and clinical implementation of a state-of-the-art deep learning-based artificial intelligence algorithm (LYmph Node Assistant or LYNA) for detection of metastatic breast cancer in sentinel lymph node biopsies. DESIGN: Whole slide images were obtained from hematoxylin-eosin-stained lymph nodes from 399 patients (publicly available Camelyon16 challenge dataset). LYNA was developed by using 270 slides and evaluated on the remaining 129 slides. We compared the findings to those obtained from an independent laboratory (108 slides from 20 patients/86 blocks) using a different scanner to measure reproducibility. RESULTS: LYNA achieved a slide-level area under the receiver operating characteristic curve (AUC) of 99% and a tumor-level sensitivity of 91% at 1 false positive per patient on the Camelyon16 evaluation dataset. We also identified 2 "normal" slides that contained micrometastases. When applied to our second dataset, LYNA achieved an AUC of 99.6%. LYNA was not affected by common histology artifacts such as overfixation, poor staining, and air bubbles. CONCLUSIONS: Artificial intelligence algorithms can exhaustively evaluate every tissue patch on a slide, achieving higher tumor-level sensitivity than, and comparable slide-level performance to, pathologists. These techniques may improve the pathologist's productivity and reduce the number of false negatives associated with morphologic detection of tumor cells. We provide a framework to aid practicing pathologists in assessing such algorithms for adoption into their workflow (akin to how a pathologist assesses immunohistochemistry results).

Subject(s)

Breast Neoplasms/pathology; Deep Learning; Image Interpretation, Computer-Assisted/methods; Lymphatic Metastasis/diagnosis; Pathology, Clinical/methods; Female; Humans; Pathologists; Sentinel Lymph Node Biopsy
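The abstract for this record quotes a tumor-level sensitivity "at 1 false positive per patient". The sketch below illustrates one plausible way such an operating point can be computed from scored detections. It is not LYNA's evaluation code: the function name, the data layout, and the toy detections are all hypothetical, and tied scores are not given special treatment.

```python
def sensitivity_at_fp_rate(detections, n_tumors, n_patients,
                           max_fp_per_patient=1.0):
    """Best tumor-level sensitivity reachable while averaging at most
    `max_fp_per_patient` false positives per patient.

    detections: list of (score, tumor_id) pairs, where tumor_id is None
    for a detection that hits no tumor (a false positive).
    """
    found = set()            # distinct tumors detected so far
    false_positives = 0
    best = 0.0
    # Lower the score threshold one detection at a time, highest score first.
    for score, tumor_id in sorted(detections, key=lambda d: d[0], reverse=True):
        if tumor_id is None:
            false_positives += 1
            if false_positives / n_patients > max_fp_per_patient:
                break        # false-positive budget exceeded; stop here
        else:
            found.add(tumor_id)
            best = len(found) / n_tumors
    return best


# Toy data (entirely made up): 2 patients, 4 tumors in total.
toy_detections = [
    (0.99, "t1"), (0.95, None), (0.90, "t2"), (0.80, None),
    (0.70, "t3"), (0.60, None), (0.50, "t4"),
]
print(sensitivity_at_fp_rate(toy_detections, n_tumors=4, n_patients=2))
```

Tightening `max_fp_per_patient` trades sensitivity for fewer false alarms, which is why a sensitivity figure is only meaningful together with its false-positive rate.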

Prediction Errors of Molecular Machine Learning Models Lower than Hybrid DFT Error.

Faber, Felix A; Hutchison, Luke; Huang, Bing; Gilmer, Justin; Schoenholz, Samuel S; Dahl, George E; Vinyals, Oriol; Kearnes, Steven; Riley, Patrick F; von Lilienfeld, O Anatole.

J Chem Theory Comput ; 13(11): 5255-5264, 2017 Nov 14.

Article in English | MEDLINE | ID: mdl-28926232

ABSTRACT

We investigate the impact of choosing regressors and molecular representations for the construction of fast machine learning (ML) models of 13 electronic ground-state properties of organic molecules. The performance of each regressor/representation/property combination is assessed using learning curves which report out-of-sample errors as a function of training set size with up to ~118k distinct molecules. Molecular structures and properties at the hybrid density functional theory (DFT) level of theory come from the QM9 database [Ramakrishnan et al. Sci. Data 2014, 1, 140022] and include enthalpies and free energies of atomization, HOMO/LUMO energies and gap, dipole moment, polarizability, zero point vibrational energy, heat capacity, and the highest fundamental vibrational frequency. Various molecular representations have been studied (Coulomb matrix, bag of bonds, BAML and ECFP4, molecular graphs (MG)), as well as newly developed distribution based variants including histograms of distances (HD), angles (HDA/MARAD), and dihedrals (HDAD). Regressors include linear models (Bayesian ridge regression (BR) and linear regression with elastic net regularization (EN)), random forest (RF), kernel ridge regression (KRR), and two types of neural networks, graph convolutions (GC) and gated graph networks (GG). Out-of-sample errors are strongly dependent on the choice of representation, regressor, and molecular property. Electronic properties are typically best accounted for by MG and GC, while energetic properties are better described by HDAD and KRR. The specific combinations with the lowest out-of-sample errors in the ~118k training set size limit are (free) energies and enthalpies of atomization (HDAD/KRR), HOMO/LUMO eigenvalue and gap (MG/GC), dipole moment (MG/GC), static polarizability (MG/GG), zero point vibrational energy (HDAD/KRR), heat capacity at room temperature (HDAD/KRR), and highest fundamental vibrational frequency (BAML/RF). We present numerical evidence that ML model predictions deviate from DFT (B3LYP) less than DFT (B3LYP) deviates from experiment for all properties. Furthermore, out-of-sample prediction errors with respect to hybrid DFT reference are on par with, or close to, chemical accuracy. The results suggest that ML models could be more accurate than hybrid DFT if explicitly electron correlated quantum (or experimental) data were available.
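The abstract above assesses models with learning curves, that is, out-of-sample error as a function of training set size, and names kernel ridge regression (KRR) among the regressors. The following is a minimal, self-contained sketch of that idea on a synthetic one-dimensional target. It is not the paper's code or data: the Gaussian kernel, its width, the ridge value, and the toy target function are all assumptions chosen for illustration.

```python
import math
import random


def gaussian_kernel(a, b, width):
    return math.exp(-((a - b) ** 2) / (2.0 * width ** 2))


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def krr_fit_predict(train_x, train_y, test_x, width=0.5, ridge=1e-6):
    """Kernel ridge regression: alpha = (K + ridge * I)^-1 y, then predict."""
    n = len(train_x)
    k = [[gaussian_kernel(train_x[i], train_x[j], width)
          + (ridge if i == j else 0.0) for j in range(n)] for i in range(n)]
    alpha = solve(k, train_y)
    return [sum(a * gaussian_kernel(xt, xi, width)
                for a, xi in zip(alpha, train_x)) for xt in test_x]


def learning_curve(xs, ys, test_x, test_y, sizes):
    """Out-of-sample mean absolute error for each training-set size."""
    curve = []
    for size in sizes:
        preds = krr_fit_predict(xs[:size], ys[:size], test_x)
        mae = sum(abs(p - t) for p, t in zip(preds, test_y)) / len(test_y)
        curve.append((size, mae))
    return curve


def target(x):
    """Synthetic stand-in for a molecular property (not real chemistry)."""
    return math.sin(2.0 * x)


rng = random.Random(0)
xs = [rng.uniform(0.0, 3.0) for _ in range(64)]
ys = [target(x) for x in xs]
test_x = [rng.uniform(0.0, 3.0) for _ in range(50)]
test_y = [target(x) for x in test_x]

curve = learning_curve(xs, ys, test_x, test_y, sizes=[4, 16, 64])
for size, mae in curve:
    print(f"N = {size:3d}  MAE = {mae:.4f}")
```

Plotting such (size, error) pairs on log-log axes is the usual way to compare how quickly different representation and regressor combinations improve as data is added.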
