Results 1 - 6 of 6
1.
Cell ; 153(3): 707-20, 2013 Apr 25.
Article in English | MEDLINE | ID: mdl-23622250

ABSTRACT

The genetics of complex disease produce alterations in the molecular interactions of cellular pathways whose collective effect may become clear through the organized structure of molecular networks. To characterize molecular systems associated with late-onset Alzheimer's disease (LOAD), we constructed gene-regulatory networks in 1,647 postmortem brain tissues from LOAD patients and nondemented subjects, and we demonstrate that LOAD reconfigures specific portions of the molecular interaction structure. Through an integrative network-based approach, we rank-ordered these network structures for relevance to LOAD pathology, highlighting an immune- and microglia-specific module that is dominated by genes involved in pathogen phagocytosis, contains TYROBP as a key regulator, and is upregulated in LOAD. Mouse microglial cells overexpressing intact or truncated TYROBP revealed expression changes that significantly overlapped the human brain TYROBP network. Thus, the causal network structure is a useful predictor of response to gene perturbations and presents a framework to test models of disease mechanisms underlying LOAD.


Subjects
Alzheimer Disease/genetics , Brain/metabolism , Gene Regulatory Networks , Signal Transducing Adaptor Proteins/metabolism , Alzheimer Disease/metabolism , Animals , Bayes Theorem , Brain/pathology , Humans , Membrane Proteins/metabolism , Mice , Microglia/metabolism
2.
Article in English | MEDLINE | ID: mdl-33088611

ABSTRACT

The productivity of computational biologists is limited by the speed of their workflows and subsequent overall job throughput. Because most biomedical researchers are focused on better understanding scientific phenomena rather than developing and optimizing code, a computing and data system implemented in an adventitious and/or non-optimized manner can impede the progress of scientific discovery. In our experience, most computational, life-science applications do not generally leverage the full capabilities of high-performance computing, so tuning a system for these applications is especially critical. To optimize a system effectively, systems staff must understand the effects of the applications on the system. Effective stewardship of the system includes an analysis of the impact of the applications on the compute cores, file system, resource manager and queuing policies. The resulting improved system design, and enactment of a sustainability plan, help to enable a long-term resource for productive computational and data science. We present a case study of a typical biomedical computational workload at a leading academic medical center supporting over $100 million per year in computational biology research. Over the past eight years, our high-performance computing system has enabled over 900 biomedical publications in four major areas: genetics and population analysis, gene expression, machine learning, and structural and chemical biology. We have upgraded the system several times in response to trends, actual usage, and user feedback. Major components crucial to this evolution include scheduling structure and policies, memory size, compute type and speed, parallel file system capabilities, and deployment of cloud technologies. We evolved a 70 teraflop machine to a 1.4 petaflop machine in seven years and grew our user base nearly 10-fold. For long-term stability and sustainability, we established a chargeback fee structure. 
Our overarching guiding principle for each progression has been to increase scientific throughput and enable enhanced scientific fidelity with minimal impact to existing user workflows or code. This highly constrained system optimization has presented unique challenges, leading us to adopt new approaches to provide constructive pathways forward. We share our practical strategies resulting from our ongoing growth and assessments.

3.
Sci Rep ; 9(1): 12495, 2019 Aug 29.
Article in English | MEDLINE | ID: mdl-31467326

ABSTRACT

The rapid development of deep learning, a family of machine learning techniques, has spurred much interest in its application to medical imaging problems. Here, we develop a deep learning algorithm that can accurately detect breast cancer on screening mammograms using an "end-to-end" training approach that efficiently leverages training datasets with either complete clinical annotation or only the cancer status (label) of the whole image. In this approach, lesion annotations are required only in the initial training stage, and subsequent stages require only image-level labels, eliminating the reliance on rarely available lesion annotations. Our all convolutional network method for classifying screening mammograms attained excellent performance in comparison with previous methods. On an independent test set of digitized film mammograms from the Digital Database for Screening Mammography (CBIS-DDSM), the best single model achieved a per-image AUC of 0.88, and four-model averaging improved the AUC to 0.91 (sensitivity: 86.1%, specificity: 80.1%). On an independent test set of full-field digital mammography (FFDM) images from the INbreast database, the best single model achieved a per-image AUC of 0.95, and four-model averaging improved the AUC to 0.98 (sensitivity: 86.7%, specificity: 96.1%). We also demonstrate that a whole image classifier trained using our end-to-end approach on the CBIS-DDSM digitized film mammograms can be transferred to INbreast FFDM images using only a subset of the INbreast data for fine-tuning and without further reliance on the availability of lesion annotations. These findings show that automatic deep learning methods can be readily trained to attain high accuracy on heterogeneous mammography platforms, and hold tremendous promise for improving clinical tools to reduce false positive and false negative screening mammography results. Code and model available at: https://github.com/lishen/end2end-all-conv .
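The four-model averaging step described in this abstract can be sketched in a few lines. The scores and labels below are invented for illustration (they are not the paper's data), and AUC is computed here via the rank-based (Mann-Whitney) definition rather than the paper's evaluation pipeline:

```python
def auc(labels, scores):
    # Rank-based AUC: probability that a randomly chosen positive
    # scores higher than a randomly chosen negative (ties count 0.5).
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical per-image malignancy scores from four independently
# trained whole-image classifiers; 1 = cancer, 0 = no cancer.
model_scores = [
    [0.90, 0.20, 0.80, 0.30],
    [0.70, 0.40, 0.90, 0.10],
    [0.80, 0.30, 0.60, 0.20],
    [0.95, 0.10, 0.70, 0.35],
]
labels = [1, 0, 1, 0]

# "Four-model averaging": mean of the per-image scores across models,
# scored as a single ensemble prediction.
avg = [sum(col) / len(col) for col in zip(*model_scores)]
print(auc(labels, avg))
```

Averaging the raw scores before thresholding is what lets the ensemble AUC exceed the best single model's, since uncorrelated per-model errors partially cancel.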


Subjects
Breast Neoplasms/diagnostic imaging , Deep Learning , Mammography , Algorithms , Breast Neoplasms/diagnosis , Factual Databases , Computer-Assisted Diagnosis , Early Detection of Cancer , Female , Humans
4.
Gigascience ; 6(8): 1-13, 2017 Aug 01.
Article in English | MEDLINE | ID: mdl-28814063

ABSTRACT

Visualizations of biomolecular networks assist in systems-level data exploration in many cellular processes. Data generated from high-throughput experiments increasingly inform these networks, yet current tools do not adequately scale with the concomitant increase in their size and complexity. We present an open source software platform, interactome-CAVE (iCAVE), for visualizing large and complex biomolecular interaction networks in 3D. Users can explore networks (i) in 3D using a desktop, (ii) in stereoscopic 3D using 3D-vision glasses and a desktop, or (iii) in immersive 3D within a CAVE environment. iCAVE introduces 3D extensions of known 2D network layout, clustering, and edge-bundling algorithms, as well as new 3D network layout algorithms. Furthermore, users can simultaneously query several built-in databases within iCAVE for network generation or visualize their own networks (e.g., disease, drug, protein, metabolite). iCAVE has a modular structure that allows rapid development by addition of algorithms, datasets, or features without affecting other parts of the code. Overall, iCAVE is the first freely available open source tool that enables 3D (optionally stereoscopic or immersive) visualizations of complex, dense, or multi-layered biomolecular networks. While primarily designed for researchers utilizing biomolecular networks, iCAVE can assist researchers in any field.
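The "3D extensions of known 2D network layout algorithms" mentioned in this abstract amount to giving each node a z coordinate while keeping the 2D force model unchanged. A minimal sketch of that idea for a Fruchterman-Reingold-style spring layout follows; this is not iCAVE's actual code, and the toy graph, constants, and step counts are invented:

```python
import math
import random

# Toy graph: a triangle (0, 1, 2) with a pendant node 3.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
n, k, steps = 4, 1.0, 200

random.seed(0)
# The only change from the 2D algorithm: positions are (x, y, z).
pos = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(n)]

for step in range(steps):
    disp = [[0.0, 0.0, 0.0] for _ in range(n)]
    # Repulsion between every node pair, magnitude ~ k^2 / distance.
    for i in range(n):
        for j in range(i + 1, n):
            d = [pos[i][a] - pos[j][a] for a in range(3)]
            dist = max(math.dist(pos[i], pos[j]), 1e-6)
            f = k * k / dist
            for a in range(3):
                disp[i][a] += f * d[a] / dist
                disp[j][a] -= f * d[a] / dist
    # Attraction along edges, magnitude ~ distance^2 / k.
    for i, j in edges:
        d = [pos[i][a] - pos[j][a] for a in range(3)]
        dist = max(math.dist(pos[i], pos[j]), 1e-6)
        f = dist * dist / k
        for a in range(3):
            disp[i][a] -= f * d[a] / dist
            disp[j][a] += f * d[a] / dist
    # Cap each move with a cooling "temperature" so the layout settles.
    t = 0.1 * (1 - step / steps)
    for i in range(n):
        mag = max(math.dist(disp[i], [0, 0, 0]), 1e-6)
        for a in range(3):
            pos[i][a] += disp[i][a] / mag * min(mag, t)
```

The extra dimension gives dense networks more room to untangle, which is the scaling argument behind stereoscopic and CAVE rendering.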


Subjects
Computational Biology/methods , Software , Algorithms , Animals , Factual Databases , Gene Regulatory Networks , Humans , Metabolic Networks and Pathways , Protein Interaction Maps , Signal Transduction , User-Computer Interface
5.
SC Conf Proc ; 2015, 2015 Nov.
Article in English | MEDLINE | ID: mdl-30788464

ABSTRACT

As personalized medicine becomes more integrated into healthcare, the rate at which human genomes are being sequenced is rising quickly together with a concomitant acceleration in compute and storage requirements. To achieve the most effective solution for genomic workloads without re-architecting the industry-standard software, we performed a rigorous analysis of usage statistics, benchmarks and available technologies to design a system for maximum throughput. We share our experiences designing a system optimized for the "Genome Analysis ToolKit (GATK) Best Practices" whole genome DNA and RNA pipeline based on an evaluation of compute, workload and I/O characteristics. The characteristics of genomic-based workloads are vastly different from those of traditional HPC workloads, requiring different configurations of the scheduler and the I/O subsystem to achieve reliability, performance and scalability. By understanding how our researchers and clinicians work, we were able to employ techniques not only to speed up their workflow yielding improved and repeatable performance, but also to make more efficient use of storage and compute resources.

6.
J Chem Inf Comput Sci ; 43(3): 743-52, 2003.
Article in English | MEDLINE | ID: mdl-12767132

ABSTRACT

We present an application of a novel methodology called Text Influenced Molecular Indexing (TIMI) to mine the information in the scientific literature. TIMI is an extension of two existing methodologies: (1) Latent Semantic Structure Indexing (LaSSI), a method for calculating chemical similarity using two-dimensional topological descriptors, and (2) Latent Semantic Indexing (LSI), a method for generating correlations between textual terms. The singular value decomposition (SVD) of a feature/object matrix is the fundamental mathematical operation underlying LSI, LaSSI, and TIMI and is used in the identification of associations between textual and chemical descriptors. We present the results of our studies with a database containing 11,571 PubMed/MEDLINE abstracts, which show the advantages of merging textual and chemical descriptors over using either text or chemistry alone. Our work demonstrates that searching text-only databases limits retrieved documents to those that explicitly mention compounds by name in the text. Similarly, searching chemistry-only databases can only retrieve those documents that have chemical structures in them. TIMI, however, enables search and retrieval of documents with textual, chemical, and/or text- and chemistry-based queries. Thus, the TIMI system offers a powerful new approach to uncovering the contextual scientific knowledge sought by the medical research community.
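The SVD step underlying LSI, LaSSI, and TIMI can be illustrated on a tiny feature/object matrix. Everything below is a made-up toy (the descriptors, counts, and latent dimension k=2 are not from the paper): rows mix textual terms with a chemical descriptor, columns are documents, and truncating the SVD projects documents into a latent space where correlated text and chemistry descriptors share dimensions.

```python
import numpy as np

# Toy feature/object matrix: rows are descriptors (textual terms plus
# one chemical 2D descriptor), columns are documents. Counts invented.
#                 doc0 doc1 doc2
A = np.array([
    [2, 0, 1],   # "aspirin"    (textual term)
    [1, 0, 1],   # "analgesic"  (textual term)
    [0, 3, 0],   # "kinase"     (textual term)
    [1, 0, 2],   # ring-count   (chemical descriptor)
], dtype=float)

# A = U * diag(s) * Vt; keeping the top k singular values gives a
# low-rank latent space in which co-occurring descriptors collapse
# onto shared dimensions.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
docs_latent = (np.diag(s[:k]) @ Vt[:k]).T   # one k-vector per document

def cos(u, v):
    # Cosine similarity between two latent document vectors.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# doc0 and doc2 share text and chemistry descriptors; doc1 shares none,
# so it lands on an orthogonal latent axis.
print(cos(docs_latent[0], docs_latent[2]), cos(docs_latent[0], docs_latent[1]))
```

A mixed text-and-chemistry query would likewise be projected into the same latent space and matched against documents by cosine similarity, which is how TIMI can retrieve a document even when the query's exact terms or structures never appear in it.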


Subjects
Factual Databases , Organic Compounds , Pharmaceutical Preparations , Descriptors , Algorithms , Pharmaceutical Chemistry/methods , MEDLINE