Results 1 - 13 of 13
1.
Cell ; 173(3): 792-803.e19, 2018 04 19.
Article in English | MEDLINE | ID: mdl-29656897

ABSTRACT

Microscopy is a central method in life sciences. Many popular methods, such as antibody labeling, are used to add physical fluorescent labels to specific cellular constituents. However, these approaches have significant drawbacks, including inconsistency; limitations in the number of simultaneous labels because of spectral overlap; and necessary perturbations of the experiment, such as fixing the cells, to generate the measurement. Here, we show that a computational machine-learning approach, which we call "in silico labeling" (ISL), reliably predicts some fluorescent labels from transmitted-light images of unlabeled fixed or live biological samples. ISL predicts a range of labels, such as those for nuclei, cell type (e.g., neural), and cell state (e.g., cell death). Because prediction happens in silico, the method is consistent, is not limited by spectral overlap, and does not disturb the experiment. ISL generates biological measurements that would otherwise be problematic or impossible to acquire.


Subjects
Fluorescent Dyes/chemistry; Image Processing, Computer-Assisted/methods; Microscopy, Fluorescence/methods; Motor Neurons/cytology; Algorithms; Animals; Cell Line, Tumor; Cell Survival; Cerebral Cortex/cytology; Humans; Induced Pluripotent Stem Cells/cytology; Machine Learning; Neural Networks, Computer; Neurosciences; Rats; Software; Stem Cells/cytology
2.
Proc Natl Acad Sci U S A ; 118(37), 2021 09 14.
Article in English | MEDLINE | ID: mdl-34508002

ABSTRACT

The quest to identify materials with tailored properties is increasingly expanding into high-order composition spaces, with a corresponding combinatorial explosion in the number of candidate materials. A key challenge is to discover regions in composition space where materials have novel properties. Traditional predictive models for material properties are not accurate enough to guide the search. Herein, we use high-throughput measurements of optical properties to identify novel regions in three-cation metal oxide composition spaces by identifying compositions whose optical trends cannot be explained by simple phase mixtures. We screen 376,752 distinct compositions from 108 three-cation oxide systems based on the cation elements Mg, Fe, Co, Ni, Cu, Y, In, Sn, Ce, and Ta. Data models for candidate phase diagrams and three-cation compositions with emergent optical properties guide the discovery of materials with complex phase-dependent properties, as demonstrated by the discovery of a Co-Ta-Sn substitutional alloy oxide with tunable transparency, catalytic activity, and stability in strong acid electrolytes. These results required close coupling of data validation to experiment design to generate a reliable end-to-end high-throughput workflow for accelerating scientific discovery.

3.
Nat Methods ; 16(6): 519-525, 2019 06.
Article in English | MEDLINE | ID: mdl-31133761

ABSTRACT

Peptide fragmentation spectra are routinely predicted in the interpretation of mass-spectrometry-based proteomics data. However, the generation of fragment ions has not been understood well enough for scientists to estimate fragment ion intensities accurately. Here, we demonstrate that machine learning can predict peptide fragmentation patterns in mass spectrometers with accuracy within the uncertainty of measurement. Moreover, analysis of our models reveals that peptide fragmentation depends on long-range interactions within a peptide sequence. We illustrate the utility of our models by applying them to the analysis of both data-dependent and data-independent acquisition datasets. In the former case, we observe a q-value-dependent increase in the total number of peptide identifications. In the latter case, we confirm that the use of predicted tandem mass spectrometry spectra is nearly equivalent to the use of spectra from experimental libraries.


Subjects
Biomarkers/blood; Data Analysis; Peptide Fragments/analysis; Peptide Library; Proteome/analysis; Software; Tandem Mass Spectrometry/methods; Algorithms; Amino Acid Sequence; Databases, Protein; HeLa Cells; Humans; Peptide Fragments/metabolism; Proteome/metabolism
4.
Mol Syst Biol ; 16(3): e9174, 2020 03.
Article in English | MEDLINE | ID: mdl-32181581

ABSTRACT

We present IDEA (the Induction Dynamics gene Expression Atlas), a dataset constructed by independently inducing hundreds of transcription factors (TFs) and measuring timecourses of the resulting gene expression responses in budding yeast. Each experiment captures a regulatory cascade connecting a single induced regulator to the genes it causally regulates. We discuss the regulatory cascade of a single TF, Aft1, in detail; however, IDEA contains > 200 TF induction experiments with 20 million individual observations and 100,000 signal-containing dynamic responses. As an application of IDEA, we integrate all timecourses into a whole-cell transcriptional model, which is used to predict and validate multiple new and underappreciated transcriptional regulators. We also find that the magnitudes of coefficients in this model are predictive of genetic interaction profile similarities. In addition to being a resource for exploring regulatory connectivity between TFs and their target genes, our modeling approach shows that combining rapid perturbations of individual genes with genome-scale time-series measurements is an effective strategy for elucidating gene regulatory networks.


Subjects
Computational Biology/methods; Gene Expression Profiling/methods; Saccharomycetales/genetics; Transcription Factors/genetics; Algorithms; Databases, Genetic; Fungal Proteins/genetics; Gene Expression Regulation
5.
BMC Bioinformatics ; 19(1): 77, 2018 03 15.
Article in English | MEDLINE | ID: mdl-29540156

ABSTRACT

BACKGROUND: Large image datasets acquired on automated microscopes typically have some fraction of low quality, out-of-focus images, despite the use of hardware autofocus systems. Identification of these images using automated image analysis with high accuracy is important for obtaining a clean, unbiased image dataset. Complicating this task is the fact that image focus quality is only well-defined in foreground regions of images, and as a result, most previous approaches only enable a computation of the relative difference in quality between two or more images, rather than an absolute measure of quality.
RESULTS: We present a deep neural network model capable of predicting an absolute measure of image focus on a single image in isolation, without any user-specified parameters. The model operates at the image-patch level, and also outputs a measure of prediction certainty, enabling interpretable predictions. The model was trained on only 384 in-focus Hoechst (nuclei) stain images of U2OS cells, which were synthetically defocused to one of 11 absolute defocus levels during training. The trained model can generalize on previously unseen real Hoechst stain images, identifying the absolute image focus to within one defocus level (approximately 3 pixel blur diameter difference) with 95% accuracy. On a simpler binary in/out-of-focus classification task, the trained model outperforms previous approaches on both Hoechst and Phalloidin (actin) stain images (F-scores of 0.89 and 0.86, respectively over 0.84 and 0.83), despite only having been presented Hoechst stain images during training. Lastly, we observe qualitatively that the model generalizes to two additional stains, Hoechst and Tubulin, of an unseen cell type (Human MCF-7) acquired on a different instrument.
CONCLUSIONS: Our deep neural network enables classification of out-of-focus microscope images with both higher accuracy and greater precision than previous approaches via interpretable patch-level focus and certainty predictions. The use of synthetically defocused images precludes the need for a manually annotated training dataset. The model also generalizes to different image and cell types. The framework for model training and image prediction is available as a free software library and the pre-trained model is available for immediate use in Fiji (ImageJ) and CellProfiler.
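
Illustrative note: the synthetic-defocus idea in this abstract (in-focus images blurred to one of 11 levels, roughly 3 pixels of blur diameter apart) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' released library: the pillbox (disk) kernel, the function names `disk_kernel` and `synthetic_defocus`, and the `pixels_per_level` parameter are all hypothetical choices made for illustration.

```python
import numpy as np

NUM_LEVELS = 11  # the abstract mentions 11 absolute defocus levels


def disk_kernel(diameter: int) -> np.ndarray:
    """Return a normalized pillbox kernel (assumed blur model, not the paper's)."""
    radius = diameter / 2.0
    size = diameter if diameter % 2 == 1 else diameter + 1  # keep an odd, centered grid
    center = size // 2
    yy, xx = np.mgrid[:size, :size]
    mask = (yy - center) ** 2 + (xx - center) ** 2 <= radius ** 2
    kernel = mask.astype(float)
    return kernel / kernel.sum()


def synthetic_defocus(image: np.ndarray, level: int, pixels_per_level: int = 3) -> np.ndarray:
    """Blur a 2D image to a given defocus level using circular FFT convolution."""
    if not 0 <= level < NUM_LEVELS:
        raise ValueError("level must be in 0..10")
    if level == 0:
        return image.astype(float).copy()  # level 0 is treated as in focus
    kernel = disk_kernel(1 + level * pixels_per_level)
    kh, kw = kernel.shape
    if kh > image.shape[0] or kw > image.shape[1]:
        raise ValueError("image is smaller than the blur kernel")
    # Embed the kernel in an image-sized array and shift its center to the origin.
    padded = np.zeros_like(image, dtype=float)
    padded[:kh, :kw] = kernel
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))
```

A training pipeline along these lines would draw a random level per image patch and use that level as the classification target; how the original work models the point spread function and noise is not stated in the abstract.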


Subjects
Diagnostic Imaging/methods; Image Processing, Computer-Assisted/methods; Machine Learning; Microscopy/methods; Osteosarcoma/diagnosis; Software; Bone Neoplasms/diagnosis; Humans; Tumor Cells, Cultured
6.
J Comput Aided Mol Des ; 30(8): 595-608, 2016 08.
Article in English | MEDLINE | ID: mdl-27558503

ABSTRACT

Molecular "fingerprints" encoding structural information are the workhorse of cheminformatics and machine learning in drug discovery applications. However, fingerprint representations necessarily emphasize particular aspects of the molecular structure while ignoring others, rather than allowing the model to make data-driven decisions. We describe molecular graph convolutions, a machine learning architecture for learning from undirected graphs, specifically small molecules. Graph convolutions use a simple encoding of the molecular graph (atoms, bonds, distances, etc.), which allows the model to take greater advantage of information in the graph structure. Although graph convolutions do not outperform all fingerprint-based methods, they (along with other graph-based methods) represent a new paradigm in ligand-based virtual screening with exciting opportunities for future improvement.
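
Illustrative note: the "simple encoding of the molecular graph" mentioned in this abstract can be pictured as a per-atom feature matrix plus a symmetric bond adjacency matrix. The sketch below uses a hypothetical three-atom molecule and a generic neighbor-sum update; it is a common message-passing simplification chosen for illustration, not the architecture described in the paper, and all names (`atom_features`, `adjacency`, `neighbor_sum_step`, `ELEMENTS`) are assumptions.

```python
import numpy as np

# Hypothetical toy molecule: the heavy atoms of ethanol, C-C-O (hydrogens omitted).
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]

ELEMENTS = ["C", "N", "O"]  # assumed tiny element vocabulary for one-hot features


def atom_features(atom_symbols):
    """One-hot encode each atom's element type."""
    feats = np.zeros((len(atom_symbols), len(ELEMENTS)))
    for i, symbol in enumerate(atom_symbols):
        feats[i, ELEMENTS.index(symbol)] = 1.0
    return feats


def adjacency(n_atoms, bond_list):
    """Symmetric adjacency matrix: the molecular graph is undirected."""
    adj = np.zeros((n_atoms, n_atoms))
    for i, j in bond_list:
        adj[i, j] = adj[j, i] = 1.0
    return adj


def neighbor_sum_step(feats, adj):
    """Update each atom with its own features plus the sum of its bonded neighbors'."""
    return feats + adj @ feats


X = atom_features(atoms)
A = adjacency(len(atoms), bonds)
H = neighbor_sum_step(X, A)
```

After one step, each row of `H` summarizes an atom together with its immediate bonded environment; a learned model would follow this with trainable weights and a nonlinearity.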


Subjects
Computer Graphics; Computer-Aided Design; Drug Design; Machine Learning; Neural Networks, Computer; Ligands; Molecular Structure; Pharmaceutical Preparations/chemistry
7.
Elife ; 12, 2023 04 17.
Article in English | MEDLINE | ID: mdl-36975205

ABSTRACT

Biological age, distinct from an individual's chronological age, has been studied extensively through predictive aging clocks. However, these clocks have limited accuracy on short time-scales. Here we trained deep learning models on fundus images from the EyePACS dataset to predict individuals' chronological age. Our retinal aging clock, 'eyeAge', predicted chronological age more accurately than other aging clocks (mean absolute error of 2.86 and 3.30 years on quality-filtered data from EyePACS and UK Biobank, respectively). Additionally, eyeAge was independent of blood marker-based measures of biological age, maintaining an all-cause mortality hazard ratio of 1.026 even when adjusted for phenotypic age. The individual-specific nature of eyeAge was reinforced via multiple GWAS hits in the UK Biobank cohort. The top GWAS locus was further validated via knockdown of the fly homolog, Alk, which slowed age-related decline in vision in flies. This study demonstrates the potential utility of a retinal aging clock for studying aging and age-related diseases and quantitatively measuring aging on very short time-scales, opening avenues for quick and actionable evaluation of gero-protective therapeutics.
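
Illustrative note: the accuracy figures in this abstract are mean absolute errors (MAE) in years, that is, the average size of the gap between predicted and chronological age. A minimal sketch of the metric follows; the ages below are hypothetical values chosen for illustration and are not data from the study.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between paired predictions and true values."""
    if len(predicted) == 0 or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical predicted and chronological ages, in years.
predicted_ages = [52.0, 61.5, 47.0, 70.0]
chronological_ages = [50.0, 64.0, 47.5, 66.0]

mae_years = mean_absolute_error(predicted_ages, chronological_ages)
```

Because the errors are averaged as absolute values, over- and under-predictions do not cancel out, which is why MAE is a common headline metric for aging clocks.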


Subjects
Aging; Genome-Wide Association Study; Humans; Child, Preschool; Aging/genetics; Retina; Fundus Oculi; Diagnostic Imaging; Epigenesis, Genetic
8.
STAR Protoc ; 3(4): 101724, 2022 12 16.
Article in English | MEDLINE | ID: mdl-36208449

ABSTRACT

Systematic evolution of ligands by exponential enrichment (SELEX) encompasses a wide variety of high-throughput screening techniques for producing nucleic acid binders to molecular targets through directed evolution. We describe here the design and selection steps for discovery of DNA aptamers with specificity for the two consecutive N-terminal amino acids (AAs) of a small peptide (8-10 amino acids). This bead-based method may be adapted for applications requiring binders which recognize a specific portion of the desired target. For complete details on the use and execution of this protocol, please refer to Hong et al. (2022).


Subjects
Aptamers, Nucleotide; SELEX Aptamer Technique; SELEX Aptamer Technique/methods; Dipeptides; Aptamers, Nucleotide/chemistry; Ligands; High-Throughput Screening Assays
9.
STAR Protoc ; 3(4): 101829, 2022 12 16.
Article in English | MEDLINE | ID: mdl-36386871

ABSTRACT

Large-scale, high-throughput specificity assays to characterize binding properties within a competitive and complex environment of potential binder-target pairs remain challenging and cost prohibitive. Barcode cycle sequencing (BCS) is a molecular binding assay for proteins, peptides, and other small molecules that is built on a next-generation sequencing (NGS) chip. BCS uses a binder library and targets labeled with unique DNA barcodes. Upon binding, binder barcodes are ligated to target barcodes and sequenced to identify encoded binding events. For complete details on the use and execution of this protocol, please refer to Hong et al. (2022).


Subjects
DNA Barcoding, Taxonomic; High-Throughput Nucleotide Sequencing; Gene Library; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; DNA Barcoding, Taxonomic/methods; Base Sequence
10.
iScience ; 25(1): 103586, 2022 Jan 21.
Article in English | MEDLINE | ID: mdl-35005536

ABSTRACT

We demonstrate early progress toward constructing a high-throughput, single-molecule protein sequencing technology utilizing barcoded DNA aptamers (binders) to recognize terminal amino acids of peptides (targets) tethered on a next-generation sequencing chip. DNA binders deposit unique, amino acid-identifying barcodes on the chip. The end goal is that, over multiple binding cycles, a sequential chain of DNA barcodes will identify the amino acid sequence of a peptide. Toward this, we demonstrate successful target identification with two sets of target-binder pairs: DNA-DNA and Peptide-Protein. For DNA-DNA binding, we show assembly and sequencing of DNA barcodes over six consecutive binding cycles. Intriguingly, our computational simulation predicts that a small set of semi-selective DNA binders offers significant coverage of the human proteome. Toward this end, we introduce a binder discovery pipeline that ultimately could merge with the chip assay into a technology called ProtSeq, for future high-throughput, single-molecule protein sequencing.

11.
Nat Commun ; 13(1): 1590, 2022 03 25.
Article in English | MEDLINE | ID: mdl-35338121

ABSTRACT

Drug discovery for diseases such as Parkinson's disease is impeded by the lack of screenable cellular phenotypes. We present an unbiased phenotypic profiling platform that combines automated cell culture, high-content imaging, Cell Painting, and deep learning. We applied this platform to primary fibroblasts from 91 Parkinson's disease patients and matched healthy controls, creating the largest publicly available Cell Painting image dataset to date at 48 terabytes. We use fixed weights from a convolutional deep neural network trained on ImageNet to generate deep embeddings from each image and train machine learning models to detect morphological disease phenotypes. Our platform's robustness and sensitivity allow the detection of individual-specific variation with high fidelity across batches and plate layouts. Lastly, our models confidently separate LRRK2 and sporadic Parkinson's disease lines from healthy controls (receiver operating characteristic area under curve 0.79 (0.08 standard deviation)), supporting the capacity of this platform for complex disease modeling and drug screening applications.
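
Illustrative note: the separation result in this abstract is reported as ROC AUC, which equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one (ties counted as half). A minimal pairwise sketch of that definition follows; the labels and scores are hypothetical and are not data from the study, and the function name `roc_auc` is an assumption.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive (1) and negative (0) scores."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores: 1 = disease line, 0 = healthy control.
example_labels = [1, 1, 1, 0, 0, 0]
example_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
example_auc = roc_auc(example_labels, example_scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives context for the 0.79 reported above.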


Subjects
Deep Learning; Parkinson Disease; Fibroblasts; Humans; Machine Learning; Neural Networks, Computer
12.
Nat Commun ; 12(1): 2366, 2021 04 22.
Article in English | MEDLINE | ID: mdl-33888692

ABSTRACT

Aptamers are single-stranded nucleic acid ligands that bind to target molecules with high affinity and specificity. They are typically discovered by searching large libraries for sequences with desirable binding properties. These libraries, however, are practically constrained to a fraction of the theoretical sequence space. Machine learning provides an opportunity to intelligently navigate this space to identify high-performing aptamers. Here, we propose an approach that employs particle display (PD) to partition a library of aptamers by affinity, and uses such data to train machine learning models to predict affinity in silico. Our model predicted high-affinity DNA aptamers from experimental candidates at a rate 11-fold higher than random perturbation and generated novel, high-affinity aptamers at a greater rate than observed by PD alone. Our approach also facilitated the design of truncated aptamers 70% shorter and with higher binding affinity (1.5 nM) than the best experimental candidate. This work demonstrates how combining machine learning and physical approaches can be used to expedite the discovery of better diagnostic and therapeutic agents.
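
Illustrative note: the statement that libraries cover only a fraction of the theoretical sequence space follows from simple counting, since a DNA sequence of length N has 4^N possible variants. The sketch below works through that arithmetic; the sequence length and library size are hypothetical round numbers chosen for illustration and are not taken from the paper.

```python
def sequence_space_size(length: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences of a given length (4 bases for DNA)."""
    return alphabet_size ** length


def library_coverage(library_size: int, length: int) -> float:
    """Upper bound on the fraction of sequence space a library can contain."""
    return library_size / sequence_space_size(length)


# Hypothetical values: a 40-nucleotide random region and a library of 10**15 molecules.
ASSUMED_LENGTH = 40
ASSUMED_LIBRARY_SIZE = 10 ** 15

coverage = library_coverage(ASSUMED_LIBRARY_SIZE, ASSUMED_LENGTH)
```

Under these assumed numbers the library samples less than one billionth of the possible sequences, which is the gap that a predictive model is meant to help navigate.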


Subjects
Aptamers, Nucleotide/metabolism; Machine Learning; Aptamers, Nucleotide/chemistry; Aptamers, Nucleotide/genetics; Computer Simulation; Drug Discovery/methods; Gene Library; Ligands; Lipocalin-2/chemistry; Lipocalin-2/genetics; Lipocalin-2/metabolism; Models, Chemical; Protein Binding
13.
SLAS Discov ; 24(8): 829-841, 2019 09.
Article in English | MEDLINE | ID: mdl-31284814

ABSTRACT

The etiological underpinnings of many CNS disorders are not well understood. This is likely due to the fact that individual diseases aggregate numerous pathological subtypes, each associated with a complex landscape of genetic risk factors. To overcome these challenges, researchers are integrating novel data types from numerous patients, including imaging studies capturing broadly applicable features from patient-derived materials. These datasets, when combined with machine learning, potentially hold the power to elucidate the subtle patterns that stratify patients by shared pathology. In this study, we interrogated whether high-content imaging of primary skin fibroblasts, using the Cell Painting method, could reveal disease-relevant information among patients. First, we showed that technical features such as batch/plate type, plate, and location within a plate lead to detectable nuisance signals, as revealed by a pre-trained deep neural network and analysis with deep image embeddings. Using a plate design and image acquisition strategy that accounts for these variables, we performed a pilot study with 12 healthy controls and 12 subjects affected by the severe genetic neurological disorder spinal muscular atrophy (SMA), and evaluated whether a convolutional neural network (CNN) generated using a subset of the cells could distinguish disease states on cells from the remaining unseen control-SMA pair. Our results indicate that these two populations could effectively be differentiated from one another and that model selectivity is insensitive to batch/plate type. One caveat is that the samples were also largely separated by source. These findings lay a foundation for how to conduct future studies exploring diseases with more complex genetic contributions and unknown subtypes.


Subjects
High-Throughput Screening Assays; Machine Learning; Molecular Imaging; Neural Networks, Computer; Deep Learning; Humans; Image Processing, Computer-Assisted