Results 1 - 20 of 41
1.
J Chem Inf Model ; 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38953560

ABSTRACT

Message passing neural networks (MPNNs) on molecular graphs generate continuous and differentiable encodings of small molecules with state-of-the-art performance on protein-ligand complex scoring tasks. Here, we describe the proximity graph network (PGN) package, an open-source toolkit that constructs ligand-receptor graphs based on atom proximity and allows users to rapidly apply and evaluate MPNN architectures for a broad range of tasks. We demonstrate the utility of PGN by introducing benchmarks for affinity and docking score prediction tasks. Graph networks generalize better than fingerprint-based models and perform strongly for the docking score prediction task. Overall, MPNNs with proximity graph data structures augment the prediction of ligand-receptor complex properties when ligand-receptor data are available.
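
The idea of a ligand-receptor graph built from atom proximity can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `proximity_edges`, the 4.5 angstrom cutoff, and the toy coordinates are hypothetical and are not taken from the PGN package or its API.

```python
import math

def proximity_edges(ligand_xyz, receptor_xyz, cutoff=4.5):
    """Return (ligand_index, receptor_index, distance) for every
    ligand-receptor atom pair whose distance is within `cutoff`.
    The cutoff value is an assumption chosen for illustration."""
    edges = []
    for i, lig_atom in enumerate(ligand_xyz):
        for j, rec_atom in enumerate(receptor_xyz):
            d = math.dist(lig_atom, rec_atom)  # Euclidean distance
            if d <= cutoff:
                edges.append((i, j, d))
    return edges

# Hypothetical toy coordinates in angstroms.
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
receptor = [(0.0, 3.0, 0.0), (10.0, 0.0, 0.0), (1.5, 0.0, 4.0)]

edges = proximity_edges(ligand, receptor)
edge_pairs = [(i, j) for i, j, _ in edges]
```

In this toy case the distant receptor atom (index 1) gets no edges, so it would not exchange messages with the ligand in a message passing network built on this graph.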

2.
bioRxiv ; 2023 Nov 22.
Article in English | MEDLINE | ID: mdl-38045231

ABSTRACT

The investigation of chromatin organization in single cells holds great promise for identifying causal relationships between genome structure and function. However, analysis of single-molecule data is hampered by extreme yet inherent heterogeneity, making it challenging to determine the contributions of individual chromatin fibers to bulk trends. To address this challenge, we propose ChromaFactor, a novel computational approach based on non-negative matrix factorization that deconvolves single-molecule chromatin organization datasets into their most salient primary components. ChromaFactor provides the ability to identify trends accounting for the maximum variance in the dataset while simultaneously describing the contribution of individual molecules to each component. Applying our approach to two single-molecule imaging datasets across different genomic scales, we find that these primary components demonstrate significant correlation with key functional phenotypes, including active transcription, enhancer-promoter distance, and genomic compartment. ChromaFactor offers a robust tool for understanding the complex interplay between chromatin structure and function on individual DNA molecules, pinpointing which subpopulations drive functional changes and fostering new insights into cellular heterogeneity and its implications for bulk genomic phenomena.
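
Non-negative matrix factorization, the technique named in this abstract, approximates a non-negative matrix V (here imagined as molecules by features) as a product W x H, where each row of W gives one molecule's contribution to each component and each row of H is a component. The sketch below uses the standard multiplicative update rules in plain Python. It is not the ChromaFactor implementation; the function names, the toy matrix, and the rank are assumptions made for illustration.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factor V (n x m) into W (n x rank) and H (rank x m), all non-negative,
    using multiplicative updates for the squared-error objective."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h

# Hypothetical toy data: 4 "molecules" (rows) by 4 features (columns),
# built from two underlying patterns.
toy = [[1.0, 1.0, 0.0, 0.0],
       [2.0, 2.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 1.0],
       [0.0, 0.0, 3.0, 3.0]]

w, h = nmf(toy, rank=2)
```

Reading W row by row shows how strongly each individual molecule loads on each component, which is the per-molecule contribution the abstract refers to.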

3.
Acta Neuropathol Commun ; 11(1): 202, 2023 12 18.
Article in English | MEDLINE | ID: mdl-38110981

ABSTRACT

Machine learning (ML) has increasingly been used to assist and expand current practices in neuropathology. However, generating large imaging datasets with quality labels is challenging in fields that demand high levels of expertise. Further complicating matters is the frequent disagreement between experts in neuropathology-related tasks, both at the case level and at a more granular level. Neurofibrillary tangles (NFTs) are a hallmark pathological feature of Alzheimer disease and are associated with disease progression, which warrants further investigation and granular quantification at a scale not currently accessible in routine human assessment. In this work, we first provide a baseline of annotator/rater agreement for the tasks of Braak NFT staging between experts and NFT detection using both experts and novices in neuropathology. We use a whole-slide-image (WSI) cohort of neuropathology cases from Emory University Hospital immunohistochemically stained for Tau. We develop a workflow for gathering annotations of the early-stage formation of NFTs (Pre-NFTs) and mature intracellular NFTs (iNFTs) and show that ML models can be trained to learn annotator nuances for the task of NFT detection in WSIs. We utilize a model-assisted-labeling approach and demonstrate that ML models can be used to aid in labeling large datasets efficiently. We also show that these models can be used to extract case-level features, which predict Braak NFT stages comparably to expert human raters, and do so at scale. This study provides a generalizable workflow for various pathology and related fields, and also provides a technique for accomplishing a high-level neuropathology task with limited human annotations.


Subject(s)
Alzheimer Disease; Neurodegenerative Diseases; Humans; Neurofibrillary Tangles/pathology; Neurodegenerative Diseases/pathology; tau Proteins/metabolism; Workflow; Brain/pathology; Alzheimer Disease/pathology; Machine Learning
4.
Cell Genom ; 3(10): 100410, 2023 Oct 11.
Article in English | MEDLINE | ID: mdl-37868032

ABSTRACT

Natural and experimental genetic variants can modify DNA loops and insulating boundaries to tune transcription, but it is unknown how sequence perturbations affect chromatin organization genome-wide. We developed a deep-learning strategy to quantify the effect of any insertion, deletion, or substitution on chromatin contacts and systematically scored millions of synthetic variants. While most genetic manipulations have little impact, regions with CTCF motifs and active transcription are highly sensitive, as expected. Our unbiased screen and subsequent targeted experiments also point to noncoding RNA genes and several families of repetitive elements as CTCF-motif-free DNA sequences with particularly large effects on nearby chromatin interactions, sometimes exceeding the effects of CTCF sites and explaining interactions that lack CTCF. We anticipate that our disruption tracks may be of broad interest and utility as a measure of 3D genome sensitivity, and our computational strategies may serve as a template for biological inquiry with deep learning.

5.
bioRxiv ; 2023 Aug 28.
Article in English | MEDLINE | ID: mdl-37693536

ABSTRACT

Chemical probes interrogate disease mechanisms at the molecular level by linking genetic changes to observable traits. However, comprehensive chemical screens in diverse biological models are impractical. To address this challenge, we developed ChemProbe, a model that predicts cellular sensitivity to hundreds of molecular probes and drugs by learning to combine transcriptomes and chemical structures. Using ChemProbe, we inferred the chemical sensitivity of cancer cell lines and tumor samples and analyzed how the model makes predictions. We retrospectively evaluated drug response predictions for precision breast cancer treatment and prospectively validated chemical sensitivity predictions in new cellular models, including a genetically modified cell line. Our model interpretation analysis identified transcriptome features reflecting compound targets and protein network modules, identifying genes that drive ferroptosis. ChemProbe is an interpretable in silico screening tool that allows researchers to measure cellular response to diverse compounds, facilitating research into molecular mechanisms of chemical sensitivity.

6.
Commun Biol ; 6(1): 668, 2023 06 24.
Article in English | MEDLINE | ID: mdl-37355729

ABSTRACT

Precise, scalable, and quantitative evaluation of whole slide images is crucial in neuropathology. We release a deep learning model for rapid object detection and precise information on the identification, locality, and counts of cored plaques and cerebral amyloid angiopathy (CAA). We trained this object detector using a repurposed image-tile dataset without any human-drawn bounding boxes. We evaluated the detector on a new manually-annotated dataset of whole slide images (WSIs) from three institutions, four staining procedures, and four human experts. The detector matched the cohort of neuropathology experts, achieving 0.64 (model) vs. 0.64 (cohort) average precision (AP) for cored plaques and 0.75 vs. 0.51 AP for CAAs at a 0.5 IOU threshold. It provided count and locality predictions that approximately correlated with gold-standard human CERAD-like WSI scoring (p = 0.07 ± 0.10). The openly-available model can quickly score WSIs in minutes without a GPU on a standard workstation.


Subject(s)
Amyloidogenic Proteins; Plaque, Amyloid; Humans; Records; Staining and Labeling; Virion
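
The "0.5 IOU threshold" in this abstract refers to intersection over union, the overlap ratio used to decide whether a predicted bounding box counts as matching an annotated one. A minimal sketch follows; the `(x1, y1, x2, y2)` box format, the function names, and the choice of `>=` at the threshold are assumptions for illustration, not details confirmed by the source.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as
    (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def is_match(predicted, annotated, threshold=0.5):
    """Treat a prediction as a true positive when IoU reaches the threshold."""
    return iou(predicted, annotated) >= threshold

# Hypothetical boxes in pixel coordinates.
annotated = (0, 0, 10, 10)
good_prediction = (0, 0, 10, 5)     # covers half of the annotated box
poor_prediction = (5, 5, 15, 15)    # overlaps only one corner
```

Average precision at a given IoU threshold is then computed from which predictions count as matches under this rule.
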
7.
bioRxiv ; 2023 Jan 17.
Article in English | MEDLINE | ID: mdl-36711704

ABSTRACT

Precise, scalable, and quantitative evaluation of whole slide images is crucial in neuropathology. We release a deep learning model for rapid object detection and precise information on the identification, locality, and counts of cored plaques and cerebral amyloid angiopathies (CAAs). We trained this object detector using a repurposed image-tile dataset without any human-drawn bounding boxes. We evaluated the detector on a new manually-annotated dataset of whole slide images (WSIs) from three institutions, four staining procedures, and four human experts. The detector matched the cohort of neuropathology experts, achieving 0.64 (model) vs. 0.64 (cohort) average precision (AP) for cored plaques and 0.75 vs. 0.51 AP for CAAs at a 0.5 IOU threshold. It provided count and locality predictions that correlated with gold-standard CERAD-like WSI scoring (p = 0.07 ± 0.10). The openly-available model can quickly score WSIs in minutes without a GPU on a standard workstation.

8.
Nat Mach Intell ; 4(6): 583-595, 2022 Jun.
Article in English | MEDLINE | ID: mdl-36276634

ABSTRACT

In microscopy-based drug screens, fluorescent markers carry critical information on how compounds affect different biological processes. However, practical considerations, such as the labor and preparation formats needed to produce different image channels, hinder the use of certain fluorescent markers. Consequently, completed screens may lack biologically informative but experimentally impractical markers. Here, we present a deep learning method for overcoming these limitations. We accurately generated predicted fluorescent signals from other related markers and validated this new machine learning (ML) method on two biologically distinct datasets. We used the ML method to improve the selection of biologically active compounds for Alzheimer's disease (AD) from a completed high-content high-throughput screen (HCS) that had only contained the original markers. The ML method identified novel compounds that effectively blocked tau aggregation, which had been missed by traditional screening approaches unguided by ML. The method improved triaging efficiency of compound rankings over conventional rankings by raw image channels. We reproduced this ML pipeline on a biologically independent cancer-based dataset, demonstrating its generalizability. The approach is disease-agnostic and applicable across diverse fluorescence microscopy datasets.

9.
J Chem Inf Model ; 62(18): 4300-4318, 2022 09 26.
Article in English | MEDLINE | ID: mdl-36102784

ABSTRACT

Machine learning-based drug discovery success depends on molecular representation. Yet traditional molecular fingerprints omit both the protein and pointers back to structural information that would enable better model interpretability. Therefore, we propose LUNA, a Python 3 toolkit that calculates and encodes protein-ligand interactions into new hashed fingerprints inspired by Extended Connectivity FingerPrint (ECFP): EIFP (Extended Interaction FingerPrint), FIFP (Functional Interaction FingerPrint), and Hybrid Interaction FingerPrint (HIFP). LUNA also provides visual strategies to make the fingerprints interpretable. We performed three major experiments exploring the fingerprints' use. First, we trained machine learning models to reproduce DOCK3.7 scores using 1 million docked Dopamine D4 complexes. We found that EIFP-4,096 performed better (R² = 0.61) than related molecular and interaction fingerprints. Second, we used LUNA to support interpretable machine learning models. Finally, we demonstrate that interaction fingerprints can accurately identify similarities across molecular complexes that other fingerprints overlook. Hence, we envision LUNA and its interface fingerprints as promising methods for machine learning-based virtual screening campaigns. LUNA is freely available at https://github.com/keiserlab/LUNA.


Subject(s)
Dopamine; Proteins; Drug Discovery/methods; Ligands; Machine Learning; Proteins/chemistry
10.
Acta Neuropathol Commun ; 10(1): 66, 2022 04 28.
Article in English | MEDLINE | ID: mdl-35484610

ABSTRACT

Pathologists can label pathologies differently, making it challenging to yield consistent assessments in the absence of one ground truth. To address this problem, we present a deep learning (DL) approach that draws on a cohort of experts, weighs each contribution, and is robust to noisy labels. We collected 100,495 annotations on 20,099 candidate amyloid beta neuropathologies (cerebral amyloid angiopathy (CAA), and cored and diffuse plaques) from three institutions, independently annotated by five experts. DL methods trained on a consensus-of-two strategy yielded 12.6-26% improvements by area under the precision recall curve (AUPRC) when compared to those that learned individualized annotations. This strategy surpassed individual-expert models, even when unfairly assessed on benchmarks favoring them. Moreover, ensembling over individual models was robust to hidden random annotators. In blind prospective tests of 52,555 subsequent expert-annotated images, the models labeled pathologies like their human counterparts (consensus model AUPRC = 0.74 cored; 0.69 CAA). This study demonstrates a means to combine multiple ground truths into a common-ground DL model that yields consistent diagnoses informed by multiple and potentially variable expert opinions.


Subject(s)
Cerebral Amyloid Angiopathy; Deep Learning; Amyloid beta-Peptides; Cerebral Amyloid Angiopathy/diagnosis; Humans; Neuropathology; Prospective Studies
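
One plausible reading of the "consensus-of-two" strategy in this abstract is that a candidate is labeled positive for a class when at least two of the five experts marked it positive. The sketch below encodes that reading; the interpretation, the function name `consensus_labels`, and the example votes are assumptions, not the authors' code.

```python
def consensus_labels(votes_by_class, min_votes=2):
    """Map each class to 1 if at least `min_votes` annotators voted positive.
    `votes_by_class` maps a class name to a list of 0/1 votes, one per annotator."""
    return {cls: int(sum(votes) >= min_votes)
            for cls, votes in votes_by_class.items()}

# Hypothetical votes from five annotators for a single candidate image.
candidate_votes = {
    "cored":   [1, 1, 0, 0, 0],  # two experts agree, so consensus-of-two is positive
    "diffuse": [1, 0, 0, 0, 0],  # a single vote is not enough
    "CAA":     [0, 0, 0, 0, 0],
}

labels = consensus_labels(candidate_votes)

# Arithmetic check on the abstract's counts: 20,099 candidates x 5 experts.
total_annotations = 20_099 * 5
```

Raising `min_votes` trades label recall for agreement, which is one way to compare consensus strategies against models trained on a single annotator's labels.
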
11.
NPJ Digit Med ; 4(1): 10, 2021 Jan 21.
Article in English | MEDLINE | ID: mdl-33479460

ABSTRACT

Artificial intelligence models match or exceed dermatologists in melanoma image classification. Less is known about their robustness against real-world variations, and clinicians may incorrectly assume that a model with an acceptable area under the receiver operating characteristic curve or related performance metric is ready for clinical use. Here, we systematically assessed the performance of dermatologist-level convolutional neural networks (CNNs) on real-world non-curated images by applying computational "stress tests". Our goal was to create a proxy environment in which to comprehensively test the generalizability of off-the-shelf CNNs developed without training or evaluation protocols specific to individual clinics. We found inconsistent predictions on images captured repeatedly in the same setting or subjected to simple transformations (e.g., rotation). Such transformations resulted in false positive or negative predictions for 6.5-22% of skin lesions across test datasets. Our findings indicate that models meeting conventionally reported metrics need further validation with computational stress tests to assess clinic readiness.

12.
J Chem Inf Model ; 60(12): 5957-5970, 2020 12 28.
Article in English | MEDLINE | ID: mdl-33245237

ABSTRACT

Multitask deep neural networks learn to predict ligand-target binding by example, yet public pharmacological data sets are sparse, imbalanced, and approximate. We constructed two hold-out benchmarks to approximate temporal and drug-screening test scenarios, whose characteristics differ from a random split of conventional training data sets. We developed a pharmacological data set augmentation procedure, Stochastic Negative Addition (SNA), which randomly assigns untested molecule-target pairs as transient negative examples during training. Under the SNA procedure, drug-screening benchmark performance increases from R² = 0.1926 ± 0.0186 to 0.4269 ± 0.0272 (122%). This gain was accompanied by a modest decrease in the temporal benchmark (13%). SNA increases in drug-screening performance were consistent for classification and regression tasks and outperformed y-randomized controls. Our results highlight where data and feature uncertainty may be problematic and how incorporating uncertainty into training improves predictions of drug-target relationships.


Subject(s)
Machine Learning; Neural Networks, Computer
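
The augmentation described in this abstract, randomly assigning untested molecule-target pairs as transient negatives during training, can be sketched as a per-epoch sampling step. The function name, the `ratio` parameter, the negative label value, and the toy data below are assumptions for illustration and do not come from the paper's code.

```python
import random

def add_stochastic_negatives(known, molecules, targets,
                             ratio=1.0, negative_value=0.0, rng=None):
    """Build one epoch of training examples.
    `known` maps (molecule, target) to a measured activity. Untested pairs are
    sampled afresh on every call and labeled with `negative_value`, so the
    added negatives are transient rather than fixed."""
    rng = rng or random.Random()
    untested = [(m, t) for m in molecules for t in targets if (m, t) not in known]
    n_negatives = min(len(untested), int(ratio * len(known)))
    sampled = rng.sample(untested, n_negatives)
    epoch = [(m, t, y) for (m, t), y in known.items()]
    epoch += [(m, t, negative_value) for m, t in sampled]
    rng.shuffle(epoch)
    return epoch

# Hypothetical toy data set: two measured pairs out of six possible.
known = {("m1", "t1"): 7.2, ("m2", "t2"): 5.1}
molecules = ["m1", "m2", "m3"]
targets = ["t1", "t2"]

epoch = add_stochastic_negatives(known, molecules, targets, rng=random.Random(0))

# The reported relative gain follows from the two R² values in the abstract.
relative_gain_percent = round((0.4269 - 0.1926) / 0.1926 * 100)
```

Calling the function again at the start of each epoch draws a different set of negatives, so no single untested pair is treated as a permanent negative.
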
13.
J Med Chem ; 63(16): 8705-8722, 2020 08 27.
Article in English | MEDLINE | ID: mdl-32366098

ABSTRACT

The accurate modeling and prediction of small molecule properties and bioactivities depend on the critical choice of molecular representation. Decades of informatics-driven research have relied on expert-designed molecular descriptors to establish quantitative structure-activity and structure-property relationships for drug discovery. Now, advances in deep learning make it possible to efficiently and compactly learn molecular representations directly from data. In this review, we discuss how active research in molecular deep learning can address limitations of current descriptors and fingerprints while creating new opportunities in cheminformatics and virtual screening. We provide a concise overview of the role of representations in cheminformatics and of key concepts in deep learning, and we argue that learning representations provides a way forward to improve the predictive modeling of small molecule bioactivities and properties.


Subject(s)
Chemistry, Pharmaceutical/methods; Deep Learning; Organic Chemicals/chemistry; Cheminformatics; Models, Molecular; Molecular Structure; Quantitative Structure-Activity Relationship
14.
Acta Neuropathol Commun ; 8(1): 59, 2020 04 28.
Article in English | MEDLINE | ID: mdl-32345363

ABSTRACT

Semi-quantitative scoring schemes like the Consortium to Establish a Registry for Alzheimer's Disease (CERAD) are the most commonly used method in Alzheimer's disease (AD) neuropathology practice. Computational approaches based on machine learning have recently generated quantitative scores for whole slide images (WSIs) that are highly correlated with human-derived semi-quantitative scores, such as those of CERAD, for Alzheimer's disease pathology. However, the robustness of such models has yet to be tested in different cohorts. To validate previously published machine learning algorithms using convolutional neural networks (CNNs) and determine if pathological heterogeneity may alter algorithm-derived measures, 40 cases from the Goizueta Emory Alzheimer's Disease Center brain bank displaying an array of pathological diagnoses (including AD with and without Lewy body disease (LBD) and/or TDP-43-positive inclusions) and levels of Aβ pathologies were evaluated. Furthermore, to provide deeper phenotyping, amyloid burden in gray matter vs whole tissue was compared, and quantitative CNN scores for both correlated significantly with CERAD-like scores. Quantitative scores also show clear stratification based on AD pathologies with or without additional diagnoses (including LBD and TDP-43 inclusions) vs cases with no significant neurodegeneration (control cases) as well as NIA Reagan scoring criteria. Specifically, the concomitant diagnosis group of AD + TDP-43 showed a significantly greater CNN score for cored plaques than the AD group. Finally, we report that whole tissue computational scores correlate better with CERAD-like categories than computational scores from a field of view with densest pathology, which is the standard of practice in neuropathological assessment per CERAD guidelines. Together, these findings validate and expand CNN models to be robust to cohort variations and provide additional proof-of-concept for future studies to incorporate machine learning algorithms into neuropathological practice.


Subject(s)
Alzheimer Disease/diagnosis; Machine Learning; Neural Networks, Computer; Neurodegenerative Diseases/diagnosis; Alzheimer Disease/pathology; Amyloid beta-Peptides; Humans; Image Interpretation, Computer-Assisted; Lewy Body Disease/diagnosis; Lewy Body Disease/pathology; Neurodegenerative Diseases/pathology; TDP-43 Proteinopathies/diagnosis; TDP-43 Proteinopathies/pathology
15.
J Invest Dermatol ; 140(8): 1504-1512, 2020 08.
Article in English | MEDLINE | ID: mdl-32229141

ABSTRACT

Artificial intelligence is becoming increasingly important in dermatology, with studies reporting accuracy matching or exceeding dermatologists for the diagnosis of skin lesions from clinical and dermoscopic images. However, real-world clinical validation is currently lacking. We review dermatological applications of deep learning, the leading artificial intelligence technology for image analysis, and discuss its current capabilities, potential failure modes, and challenges surrounding performance assessment and interpretability. We address the following three primary applications: (i) teledermatology, including triage for referral to dermatologists; (ii) augmenting clinical assessment during face-to-face visits; and (iii) dermatopathology. We discuss equity and ethical issues related to future clinical adoption and recommend specific standardization of metrics for reporting model performance.


Subject(s)
Deep Learning/ethics; Dermatology/methods; Image Processing, Computer-Assisted/methods; Skin Diseases/diagnosis; Skin/diagnostic imaging; Dermatology/ethics; Humans; Image Processing, Computer-Assisted/ethics; Referral and Consultation; Skin/pathology; Skin Diseases/pathology; Telemedicine/ethics; Telemedicine/methods; Triage/ethics; Triage/methods
16.
Nat Commun ; 10(1): 4078, 2019 09 09.
Article in English | MEDLINE | ID: mdl-31501447

ABSTRACT

Anesthetics are generally associated with sedation, but some anesthetics can also increase brain and motor activity, a phenomenon known as paradoxical excitation. Previous studies have identified GABA-A receptors as the primary targets of most anesthetic drugs, but how these compounds produce paradoxical excitation is poorly understood. To identify and understand such compounds, we applied a behavior-based drug profiling approach. Here, we show that a subset of central nervous system depressants cause paradoxical excitation in zebrafish. Using this behavior as a readout, we screened thousands of compounds and identified dozens of hits that caused paradoxical excitation. Many hit compounds modulated human GABA-A receptors, while others appeared to modulate different neuronal targets, including the human serotonin-6 receptor. Ligands at these receptors generally decreased neuronal activity, but paradoxically increased activity in the caudal hindbrain. Together, these studies identify ligands, targets, and neurons affecting sedation and paradoxical excitation in vivo in zebrafish.


Subject(s)
Behavior, Animal; Conscious Sedation; Receptors, GABA-A/metabolism; Receptors, Serotonin/metabolism; Zebrafish/metabolism; Animals; Ligands; Neural Inhibition; Neurons/physiology; Serotonin Antagonists/chemistry; Zebrafish Proteins/metabolism
17.
Nat Commun ; 10(1): 2173, 2019 05 15.
Article in English | MEDLINE | ID: mdl-31092819

ABSTRACT

Neuropathologists assess vast brain areas to identify diverse and subtly differentiated morphologies. Standard semi-quantitative scoring approaches, however, are coarse-grained and lack precise neuroanatomic localization. We report a proof-of-concept deep learning pipeline that identifies specific neuropathologies (amyloid plaques and cerebral amyloid angiopathy) in immunohistochemically stained archival slides. Using automated segmentation of stained objects and a cloud-based interface, we annotate >70,000 plaque candidates from 43 whole slide images (WSIs) to train and evaluate convolutional neural networks. Networks achieve strong plaque classification on a 10-WSI hold-out set (0.993 and 0.743 areas under the receiver operating characteristic and precision-recall curves, respectively). Prediction confidence maps visualize morphology distributions at high resolution. Resulting network-derived amyloid beta (Aβ)-burden scores correlate well with established semi-quantitative scores on a 30-WSI blinded hold-out. Finally, saliency mapping demonstrates that networks learn patterns agreeing with accepted pathologic features. This scalable means to augment a neuropathologist's ability suggests a route to neuropathologic deep phenotyping.


Subject(s)
Alzheimer Disease/pathology; Brain/pathology; Deep Learning; Image Processing, Computer-Assisted/methods; Aged; Aged, 80 and over; Cohort Studies; Datasets as Topic; Female; Humans; Male; ROC Curve
18.
Science ; 362(6416), 2018 11 16.
Article in English | MEDLINE | ID: mdl-30442776

ABSTRACT

Ahneman et al. (Reports, 13 April 2018) applied machine learning models to predict C-N cross-coupling reaction yields. The models use atomic, electronic, and vibrational descriptors as input features. However, the experimental design is insufficient to distinguish models trained on chemical features from those trained solely on random-valued features in retrospective and prospective test scenarios, thus failing classical controls in machine learning.

19.
ACS Chem Biol ; 13(10): 2819-2821, 2018 10 19.
Article in English | MEDLINE | ID: mdl-30336670

ABSTRACT

New machine learning methods to analyze raw chemical and biological data are now widely accessible as open-source toolkits. This positions researchers to leverage powerful, predictive models in their own domains. We caution, however, that the application of machine learning to experimental research merits careful consideration. Machine learning algorithms readily exploit confounding variables and experimental artifacts instead of relevant patterns, leading to overoptimistic performance and poor model generalization. In parallel to the strong control experiments that remain a cornerstone of experimental research, we advance the concept of adversarial controls for scientific machine learning: the design of exacting and purposeful experiments to ensure that predictive performance arises from meaningful models.


Subject(s)
Machine Learning/standards; Models, Theoretical; Logic
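
One simple control in the spirit of this abstract is label scrambling (y-randomization): if a model scores about as well on shuffled labels as on real ones, its performance does not arise from a meaningful feature-label relationship. The sketch below is only one possible control and is not taken from the paper; the function names, the use of squared Pearson correlation as the score, and the toy data are assumptions.

```python
import random

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy * sxy / (sxx * syy)

def scrambled_label_control(x, y, n_rounds=200, seed=0):
    """Return the real score and the fraction of label shuffles that
    score at least as well as the real labels."""
    rng = random.Random(seed)
    real_score = r_squared(x, y)
    shuffled_scores = []
    for _ in range(n_rounds):
        shuffled = list(y)
        rng.shuffle(shuffled)
        shuffled_scores.append(r_squared(x, shuffled))
    fraction_as_good = sum(s >= real_score for s in shuffled_scores) / n_rounds
    return real_score, fraction_as_good

# Hypothetical toy data with a genuine linear relationship.
feature = [float(v) for v in range(20)]
label = [2.0 * v + 1.0 for v in feature]

real_score, fraction_as_good = scrambled_label_control(feature, label)
```

A high `fraction_as_good` would be a warning sign that the score can be reached without any real signal; the same comparison can be run with random-valued features in place of shuffled labels.
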
20.
Cell ; 174(3): 505-520, 2018 07 26.
Article in English | MEDLINE | ID: mdl-30053424

ABSTRACT

Although gene discovery in neuropsychiatric disorders, including autism spectrum disorder, intellectual disability, epilepsy, schizophrenia, and Tourette disorder, has accelerated, resulting in a large number of molecular clues, it has proven difficult to generate specific hypotheses without the corresponding datasets at the protein complex and functional pathway level. Here, we describe one path forward: an initiative aimed at mapping the physical and genetic interaction networks of these conditions and then using these maps to connect the genomic data to neurobiology and, ultimately, the clinic. These efforts will include a team of geneticists, structural biologists, neurobiologists, systems biologists, and clinicians, leveraging a wide array of experimental approaches and creating a collaborative infrastructure necessary for long-term investigation. This initiative will ultimately intersect with parallel studies that focus on other diseases, as there is a significant overlap with genes implicated in cancer, infectious disease, and congenital heart defects.


Subject(s)
Chromosome Mapping/methods; Neurodevelopmental Disorders/genetics; Systems Biology/methods; Gene Regulatory Networks/genetics; Genetic Predisposition to Disease/genetics; Genome-Wide Association Study/methods; Genomics/methods; Humans; Neurobiology/methods; Neuropsychiatry
...