Results 1 - 9 of 9
1.
J Pathol ; 253(3): 268-278, 2021 03.
Article in English | MEDLINE | ID: mdl-33197281

ABSTRACT

Inconsistencies in the preparation of histology slides and whole-slide images (WSIs) may lead to challenges with subsequent image analysis and machine learning approaches for interrogating the WSI. These variabilities are especially pronounced in multicenter cohorts, where batch effects (i.e. systematic technical artifacts unrelated to biological variability) may introduce biases to machine learning algorithms. To date, manual quality control (QC) has been the de facto standard for dataset curation, but remains highly subjective and is too laborious in light of the increasing scale of tissue slide digitization efforts. This study aimed to evaluate a computer-aided QC pipeline for facilitating a reproducible QC process of WSI datasets. An open source tool, HistoQC, was employed to identify image artifacts and compute quantitative metrics describing visual attributes of WSIs in the Nephrotic Syndrome Study Network (NEPTUNE) digital pathology repository. A comparison in inter-reader concordance between HistoQC aided and unaided curation was performed to quantify improvements in curation reproducibility. HistoQC metrics were additionally employed to quantify the presence of batch effects within NEPTUNE WSIs. Of the 1814 WSIs (458 H&E, 470 PAS, 438 silver, 448 trichrome) from n = 512 cases considered in this study, approximately 9% (163) were identified as unsuitable for subsequent computational analysis. The concordance in the identification of these WSIs among computational pathologists rose from moderate (Gwet's AC1 range 0.43 to 0.59 across stains) to excellent (Gwet's AC1 range 0.79 to 0.93 across stains) agreement when aided by HistoQC. Furthermore, statistically significant batch effects (p < 0.001) in the NEPTUNE WSI dataset were discovered. Taken together, our findings strongly suggest that quantitative QC is a necessary step in the curation of digital pathology cohorts. © 2020 The Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.


Subjects
Image Interpretation, Computer-Assisted/methods , Kidney Diseases/diagnosis , Pathology, Surgical/methods , Quality Control , Algorithms , Biopsy , Humans , Image Interpretation, Computer-Assisted/standards , Pathology, Surgical/standards
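
The abstract above reports inter-reader concordance as Gwet's AC1. As a minimal sketch of how that statistic is computed for two readers, the following uses the standard two-rater formula, AC1 = (pa - pe) / (1 - pe), where pa is the observed agreement and pe is the chance agreement derived from the average category prevalence. The function name and the "ok"/"bad" labels are hypothetical and are not taken from the study or from HistoQC.

```python
from collections import Counter


def gwet_ac1(ratings_a, ratings_b):
    """Gwet's AC1 agreement coefficient for two raters (illustrative sketch)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(ratings_a)
    categories = sorted(set(ratings_a) | set(ratings_b))
    k = len(categories)
    if k < 2:
        # Only one category was ever used, so the raters agree on every item.
        return 1.0
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    # Chance agreement: average prevalence per category, pi * (1 - pi),
    # summed over categories and scaled by 1 / (k - 1).
    chance = 0.0
    for category in categories:
        prevalence = (counts_a[category] + counts_b[category]) / (2 * n)
        chance += prevalence * (1 - prevalence)
    chance /= k - 1
    return (observed - chance) / (1 - chance)


# Hypothetical labels: two readers judging whether four slides are usable.
reader_1 = ["ok", "ok", "bad", "ok"]
reader_2 = ["ok", "ok", "bad", "bad"]
ac1 = gwet_ac1(reader_1, reader_2)
```

In this toy case the readers agree on three of four slides, giving pa = 0.75 and pe = 0.46875, so AC1 = 9/17. The study itself involved more readers and four stains, so its reported ranges would come from a multi-rater computation rather than this two-rater form.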
2.
Kidney Int ; 99(1): 86-101, 2021 01.
Article in English | MEDLINE | ID: mdl-32835732

ABSTRACT

The application of deep learning for automated segmentation (delineation of boundaries) of histologic primitives (structures) from whole slide images can facilitate the establishment of novel protocols for kidney biopsy assessment. Here, we developed and validated deep learning networks for the segmentation of histologic structures on kidney biopsies and nephrectomies. For development, we examined 125 biopsies for Minimal Change Disease collected across 29 NEPTUNE enrolling centers along with 459 whole slide images stained with Hematoxylin & Eosin (125), Periodic Acid Schiff (125), Silver (102), and Trichrome (107) divided into training, validation and testing sets (ratio 6:1:3). Histologic structures were manually segmented (30048 total annotations) by five nephropathologists. Twenty deep learning models were trained with optimal digital magnification across the structures and stains. Periodic Acid Schiff-stained whole slide images yielded the best concordance between pathologists and deep learning segmentation across all structures (F-scores: 0.93 for glomerular tufts, 0.94 for glomerular tuft plus Bowman's capsule, 0.91 for proximal tubules, 0.93 for distal tubular segments, 0.81 for peritubular capillaries, and 0.85 for arteries and afferent arterioles). Optimal digital magnifications were 5X for glomerular tuft/tuft plus Bowman's capsule, 10X for proximal/distal tubule, arteries and afferent arterioles, and 40X for peritubular capillaries. Silver stained whole slide images yielded the worst deep learning performance. Thus, this largest study to date adapted deep learning for the segmentation of kidney histologic structures across multiple stains and pathology laboratories. All data used for training and testing and a detailed online tutorial will be publicly available.


Subjects
Deep Learning , Biopsy , Coloring Agents , Kidney , Kidney Cortex/diagnostic imaging
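
The concordance figures in the abstract above are F-scores. A minimal sketch of a pixel-level F-score (equivalent to the Dice coefficient) between a predicted and a reference binary mask is shown below. Whether the study scored at the pixel level or the object level is not stated in the abstract, so the pixel-level choice, the function name, and the toy masks are assumptions made for illustration.

```python
def f_score(predicted_mask, reference_mask):
    """Pixel-level F-score between two binary masks given as nested lists."""
    true_pos = false_pos = false_neg = 0
    for pred_row, ref_row in zip(predicted_mask, reference_mask):
        for pred, ref in zip(pred_row, ref_row):
            if pred and ref:
                true_pos += 1
            elif pred and not ref:
                false_pos += 1
            elif ref and not pred:
                false_neg += 1
    if true_pos + false_pos + false_neg == 0:
        # Both masks are empty: treat as perfect agreement.
        return 1.0
    # F = 2TP / (2TP + FP + FN), the harmonic mean of precision and recall.
    return 2 * true_pos / (2 * true_pos + false_pos + false_neg)


# Hypothetical 2x3 masks: 1 marks a pixel labeled as the structure of interest.
predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]
score = f_score(predicted, reference)
```

Here two pixels are true positives, one is a false positive, and one is a false negative, so the score is 4 / 6.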
3.
Sleep ; 39(5): 1151-64, 2016 May 01.
Article in English | MEDLINE | ID: mdl-27070134

ABSTRACT

ABSTRACT: Professional sleep societies have identified a need for strategic research in multiple areas that may benefit from access to and aggregation of large, multidimensional datasets. Technological advances provide opportunities to extract and analyze physiological signals and other biomedical information from datasets of unprecedented size, heterogeneity, and complexity. The National Institutes of Health has implemented a Big Data to Knowledge (BD2K) initiative that aims to develop and disseminate state of the art big data access tools and analytical methods. The National Sleep Research Resource (NSRR) is a new National Heart, Lung, and Blood Institute resource designed to provide big data resources to the sleep research community. The NSRR is a web-based data portal that aggregates, harmonizes, and organizes sleep and clinical data from thousands of individuals studied as part of cohort studies or clinical trials and provides the user a suite of tools to facilitate data exploration and data visualization. Each deidentified study record minimally includes the summary results of an overnight sleep study; annotation files with scored events; the raw physiological signals from the sleep record; and available clinical and physiological data. NSRR is designed to be interoperable with other public data resources such as the Biologic Specimen and Data Repository Information Coordinating Center Demographics (BioLINCC) data and analyzed with methods provided by the Research Resource for Complex Physiological Signals (PhysioNet). This article reviews the key objectives, challenges and operational solutions to addressing big data opportunities for sleep research in the context of the national sleep research agenda. It provides information to facilitate further interactions of the user community with NSRR, a community resource.


Subjects
Biomedical Research/methods , Biomedical Research/organization & administration , Databases, Factual , Datasets as Topic , Sleep Medicine Specialty/organization & administration , Sleep Medicine Specialty/trends , Sleep , Clinical Trials as Topic , Cohort Studies , Health Resources , Humans , Internet , National Institutes of Health (U.S.)/organization & administration , Sleep Medicine Specialty/methods , United States
4.
Front Neuroinform ; 9: 4, 2015.
Article in English | MEDLINE | ID: mdl-25852536

ABSTRACT

Data-driven neuroscience research is providing new insights in progression of neurological disorders and supporting the development of improved treatment approaches. However, the volume, velocity, and variety of neuroscience data generated from sophisticated recording instruments and acquisition methods have exacerbated the limited scalability of existing neuroinformatics tools. This makes it difficult for neuroscience researchers to effectively leverage the growing multi-modal neuroscience data to advance research in serious neurological disorders, such as epilepsy. We describe the development of the Cloudwave data flow that uses new data partitioning techniques to store and analyze electrophysiological signal in distributed computing infrastructure. The Cloudwave data flow uses MapReduce parallel programming algorithm to implement an integrated signal data processing pipeline that scales with large volume of data generated at high velocity. Using an epilepsy domain ontology together with an epilepsy focused extensible data representation format called Cloudwave Signal Format (CSF), the data flow addresses the challenge of data heterogeneity and is interoperable with existing neuroinformatics data representation formats, such as HDF5. The scalability of the Cloudwave data flow is evaluated using a 30-node cluster installed with the open source Hadoop software stack. The results demonstrate that the Cloudwave data flow can process increasing volume of signal data by leveraging Hadoop Data Nodes to reduce the total data processing time. The Cloudwave data flow is a template for developing highly scalable neuroscience data processing pipelines using MapReduce algorithms to support a variety of user applications.
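
The abstract above describes partitioning signal data and processing the partitions with MapReduce. The following is a single-process sketch of that general pattern only: a signal is split into fixed-size chunks, a map step summarizes each chunk independently, and a reduce step merges the partial summaries. It does not use Hadoop and is not the Cloudwave implementation or the CSF format; all function names, the chunk size, and the sample values are hypothetical.

```python
from functools import reduce


def partition(samples, chunk_size):
    """Split a signal into consecutive fixed-size chunks (the last may be shorter)."""
    return [samples[i:i + chunk_size] for i in range(0, len(samples), chunk_size)]


def map_chunk(chunk):
    """Map step: summarize one chunk independently of all the others."""
    return {"count": len(chunk), "total": sum(chunk),
            "min": min(chunk), "max": max(chunk)}


def merge_stats(left, right):
    """Reduce step: combine two partial summaries into one."""
    return {"count": left["count"] + right["count"],
            "total": left["total"] + right["total"],
            "min": min(left["min"], right["min"]),
            "max": max(left["max"], right["max"])}


def signal_summary(samples, chunk_size=4):
    """Summarize a signal by mapping over partitions and reducing the results."""
    if not samples:
        raise ValueError("signal must contain at least one sample")
    partials = map(map_chunk, partition(samples, chunk_size))
    merged = reduce(merge_stats, partials)
    merged["mean"] = merged["total"] / merged["count"]
    return merged


# Hypothetical ten-sample signal split into chunks of four samples.
summary = signal_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], chunk_size=4)
```

Because each chunk is summarized without reference to the others and the merge is associative, the map step could in principle run on separate nodes, which is the property a distributed framework relies on to scale with data volume.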

5.
J Am Med Inform Assoc ; 21(1): 82-9, 2014.
Article in English | MEDLINE | ID: mdl-23686934

ABSTRACT

OBJECTIVE: Epilepsy encompasses an extensive array of clinical and research subdomains, many of which emphasize multi-modal physiological measurements such as electroencephalography and neuroimaging. The integration of structured, unstructured, and signal data into a coherent structure for patient care as well as clinical research requires an effective informatics infrastructure that is underpinned by a formal domain ontology. METHODS: We have developed an epilepsy and seizure ontology (EpSO) using a four-dimensional epilepsy classification system that integrates the latest International League Against Epilepsy terminology recommendations and National Institute of Neurological Disorders and Stroke (NINDS) common data elements. It imports concepts from existing ontologies, including the Neural ElectroMagnetic Ontologies, and uses formal concept analysis to create a taxonomy of epilepsy syndromes based on their seizure semiology and anatomical location. RESULTS: EpSO is used in a suite of informatics tools for (a) patient data entry, (b) epilepsy focused clinical free text processing, and (c) patient cohort identification as part of the multi-center NINDS-funded study on sudden unexpected death in epilepsy. EpSO is available for download at http://prism.case.edu/prism/index.php/EpilepsyOntology. DISCUSSION: An epilepsy ontology consortium is being created for community-driven extension, review, and adoption of EpSO. We are in the process of submitting EpSO to the BioPortal repository. CONCLUSIONS: EpSO plays a critical role in informatics tools for epilepsy patient care and multi-center clinical research.


Subjects
Epilepsy/classification , Seizures/classification , Vocabulary, Controlled , Death, Sudden/etiology , Electrodiagnosis , Epilepsy/complications , Humans , Medical Records Systems, Computerized
6.
J Am Med Inform Assoc ; 21(2): 263-71, 2014.
Article in English | MEDLINE | ID: mdl-24326538

ABSTRACT

OBJECTIVE: The rapidly growing volume of multimodal electrophysiological signal data is playing a critical role in patient care and clinical research across multiple disease domains, such as epilepsy and sleep medicine. To facilitate secondary use of these data, there is an urgent need to develop novel algorithms and informatics approaches using new cloud computing technologies as well as ontologies for collaborative multicenter studies. MATERIALS AND METHODS: We present the Cloudwave platform, which (a) defines parallelized algorithms for computing cardiac measures using the MapReduce parallel programming framework, (b) supports real-time interaction with large volumes of electrophysiological signals, and (c) features signal visualization and querying functionalities using an ontology-driven web-based interface. Cloudwave is currently used in the multicenter National Institute of Neurological Disorders and Stroke (NINDS)-funded Prevention and Risk Identification of SUDEP (sudden unexplained death in epilepsy) Mortality (PRISM) project to identify risk factors for sudden death in epilepsy. RESULTS: Comparative evaluations of Cloudwave with traditional desktop approaches to compute cardiac measures (eg, QRS complexes, RR intervals, and instantaneous heart rate) on epilepsy patient data show one order of magnitude improvement for single-channel ECG data and 20 times improvement for four-channel ECG data. This enables Cloudwave to support real-time user interaction with signal data, which is semantically annotated with a novel epilepsy and seizure ontology. DISCUSSION: Data privacy is a critical issue in using cloud infrastructure, and cloud platforms, such as Amazon Web Services, offer features to support Health Insurance Portability and Accountability Act standards. CONCLUSION: The Cloudwave platform is a new approach to leveraging large-scale electrophysiological data for advancing multicenter clinical research.


Subjects
Algorithms , Computer Communication Networks , Databases, Factual , Electrocardiography , Epilepsy/physiopathology , Signal Processing, Computer-Assisted , Arrhythmias, Cardiac/complications , Arrhythmias, Cardiac/diagnosis , Biomedical Research , Computer Communication Networks/economics , Confidentiality , Cost-Benefit Analysis , Death, Sudden , Electrophysiologic Techniques, Cardiac , Epilepsy/complications , Health Insurance Portability and Accountability Act , Humans , Internet , United States
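
Two of the cardiac measures named in the abstract above, RR intervals and instantaneous heart rate, follow directly from the times of detected R peaks. The sketch below assumes the QRS detection step has already produced R-peak timestamps in seconds; the detection itself, and the parallelized form used by Cloudwave, are not shown. Function names and the timestamps are hypothetical.

```python
def rr_intervals(r_peak_times):
    """RR intervals in seconds: time between each pair of consecutive R peaks."""
    return [later - earlier
            for earlier, later in zip(r_peak_times, r_peak_times[1:])]


def instantaneous_heart_rate(r_peak_times):
    """Instantaneous heart rate in beats per minute, one value per RR interval."""
    # 60 seconds divided by the beat-to-beat interval; non-positive intervals
    # (duplicate or out-of-order peaks) are skipped rather than guessed at.
    return [60.0 / rr for rr in rr_intervals(r_peak_times) if rr > 0]


# Hypothetical R-peak timestamps in seconds from a single ECG channel.
peaks = [0.0, 1.0, 1.5, 2.5]
intervals = rr_intervals(peaks)
heart_rates = instantaneous_heart_rate(peaks)
```

For these timestamps the intervals are 1.0 s, 0.5 s, and 1.0 s, which correspond to 60, 120, and 60 beats per minute.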
7.
Stud Health Technol Inform ; 192: 817-21, 2013.
Article in English | MEDLINE | ID: mdl-23920671

ABSTRACT

Epilepsy is the most common serious neurological disorder, affecting 50-60 million persons worldwide. Electrophysiological data recordings, such as electroencephalogram (EEG), are the gold standard for diagnosis and pre-surgical evaluation in epilepsy patients. The increasing trend towards multi-center clinical studies requires signal visualization and analysis tools to support real time interaction with signal data in a collaborative environment, which cannot be supported by traditional desktop-based standalone applications. As part of the Prevention and Risk Identification of SUDEP Mortality (PRISM) project, we have developed a Web-based electrophysiology data visualization and analysis platform called Cloudwave using highly scalable open source cloud computing infrastructure. Cloudwave is integrated with the PRISM patient cohort identification tool called MEDCIS (Multi-modality Epilepsy Data Capture and Integration System). The Epilepsy and Seizure Ontology (EpSO) underpins both Cloudwave and MEDCIS to support query composition and result retrieval. Cloudwave is being used by clinicians and research staff at the University Hospital - Case Medical Center (UH-CMC) Epilepsy Monitoring Unit (EMU) and will be progressively deployed at four EMUs in the United States and the United Kingdom as part of the PRISM project.


Subjects
Biomedical Research/methods , Diagnosis, Computer-Assisted/methods , Electroencephalography/methods , Epilepsy/diagnosis , Information Storage and Retrieval/methods , Internet , User-Computer Interface , Algorithms , Databases, Factual , Electroencephalography/statistics & numerical data , Epilepsy/physiopathology , Humans , Software
8.
AMIA Annu Symp Proc ; 2013: 691-700, 2013.
Article in English | MEDLINE | ID: mdl-24551370

ABSTRACT

Epilepsy is the most common serious neurological disorder, affecting 50-60 million persons worldwide. Multi-modal electrophysiological data, such as electroencephalography (EEG) and electrocardiography (EKG), are central to effective patient care and clinical research in epilepsy. Electrophysiological data is an example of clinical "big data" consisting of more than 100 multi-channel signals with recordings from each patient generating 5-10GB of data. Current approaches to store and analyze signal data using standalone tools, such as Nihon Kohden neurology software, are inadequate to meet the growing volume of data and the need for supporting multi-center collaborative studies with real time and interactive access. We introduce the Cloudwave platform in this paper that features a Web-based intuitive signal analysis interface integrated with a Hadoop-based data processing module implemented on clinical data stored in a "private cloud". Cloudwave has been developed as part of the National Institute of Neurological Disorders and Stroke (NINDS)-funded multi-center Prevention and Risk Identification of SUDEP Mortality (PRISM) project. The Cloudwave visualization interface provides real-time rendering of multi-modal signals with "montages" for EEG feature characterization over 2TB of patient data generated at the Case University Hospital Epilepsy Monitoring Unit. Results from performance evaluation of the Cloudwave Hadoop data processing module demonstrate one order of magnitude improvement in performance over 77GB of patient data. (Cloudwave project: http://prism.case.edu/prism/index.php/Cloudwave).


Subjects
Electroencephalography , Epilepsy/physiopathology , Internet , Signal Processing, Computer-Assisted , Biomedical Research , Electrocardiography , Electronic Data Processing , Humans , Software
9.
BMC Syst Biol ; 6 Suppl 3: S20, 2012.
Article in English | MEDLINE | ID: mdl-23282161

ABSTRACT

BACKGROUND: One of the primary challenges in translational research data management is breaking down the barriers between the multiple data silos and the integration of 'omics data with clinical information to complete the cycle from the bench to the bedside. The role of contextual metadata, also called provenance information, is a key factor in effective data integration, reproducibility of results, correct attribution of original source, and answering research queries involving "What", "Where", "When", "Which", "Who", "How", and "Why" (also known as the W7 model). But, at present there is limited or no effective approach to managing and leveraging provenance information for integrating data across studies or projects. Hence, there is an urgent need for a paradigm shift in creating a "provenance-aware" informatics platform to address this challenge. We introduce an ontology-driven, intuitive Semantic Proteomics Dashboard (SemPoD) that uses provenance together with domain information (semantic provenance) to enable researchers to query, compare, and correlate different types of data across multiple projects, and allow integration with legacy data to support their ongoing research. RESULTS: The SemPoD platform, currently in use at the Case Center for Proteomics and Bioinformatics (CPB), consists of three components: (a) Ontology-driven Visual Query Composer, (b) Result Explorer, and (c) Query Manager. Currently, SemPoD allows provenance-aware querying of 1153 mass-spectrometry experiments from 20 different projects. SemPoD uses the systems molecular biology provenance ontology (SysPro) to support a dynamic query composition interface, which automatically updates the components of the query interface based on previous user selections and efficiently prunes the result set using a "smart filtering" approach. The SysPro ontology re-uses terms from the PROV-ontology (PROV-O) being developed by the World Wide Web Consortium (W3C) provenance working group, the minimum information required for reporting a molecular interaction experiment (MIMIx), and the minimum information about a proteomics experiment (MIAPE) guidelines. SemPoD was evaluated in terms of both user feedback and scalability of the system. CONCLUSIONS: SemPoD is an intuitive and powerful provenance ontology-driven data access and query platform that uses the MIAPE and MIMIx metadata guidelines to create an integrated view over large-scale systems molecular biology datasets. SemPoD leverages the SysPro ontology to create an intuitive dashboard for biologists to compose queries, explore the results, and use a query manager for storing queries for later use. SemPoD can be deployed over many existing database applications storing 'omics data, including, as illustrated here, the LabKey data-management system. The initial user feedback evaluating the usability and functionality of SemPoD has been very positive and it is being considered for wider deployment beyond the proteomics domain, and in other 'omics' centers.


Subjects
Database Management Systems , Databases, Factual , Proteomics/methods , Translational Research, Biomedical , Algorithms , Animals , Computer Simulation , Disease/genetics , Disease Models, Animal , Humans , Mass Spectrometry , Mice , Polymorphism, Single Nucleotide , Proteins/analysis , Proteins/chemistry , Reproducibility of Results , Semantics , Signal Transduction , Software , Systems Biology , User-Computer Interface