ABSTRACT
For more than three decades, concurrent advances in laboratory technologies and computer science have driven the rise of cancer informatics. Today, software tools for cancer research are indispensable to the entire cancer research enterprise.
Subjects
Neoplasms, Humans, Computational Biology/methods, Biomedical Research, Software
ABSTRACT
BACKGROUND AND MOTIVATION: The high-throughput genomics communities have been successfully using standardized spreadsheet-based formats to capture and share data within labs and among public repositories. The nanomedicine community has yet to adopt similar standards to share the diverse and multi-dimensional types of data (including metadata) pertaining to the description and characterization of nanomaterials. Owing to the lack of standardization in representing and sharing nanomaterial data, most of the data currently shared via publications and data resources are incomplete, poorly integrated, and not suitable for meaningful interpretation and re-use. Specifically, in their current state, the data cannot be effectively utilized for the development of predictive models that will inform the rational design of nanomaterials. RESULTS: We have developed a specification called ISA-TAB-Nano, which comprises four spreadsheet-based file formats for representing and integrating various types of nanomaterial data. Three file formats (Investigation, Study, and Assay files) have been adapted from the established ISA-TAB specification, while the Material file format was developed de novo to more readily describe the complexity of nanomaterials and associated small molecules. In this paper, we discuss the main features of each file format and how to use them for sharing nanomaterial descriptions and assay metadata. CONCLUSION: The ISA-TAB-Nano file formats provide a general and flexible framework to record and integrate nanomaterial descriptions, assay data (metadata and endpoint measurements) and protocol information. Like ISA-TAB, ISA-TAB-Nano supports the use of ontology terms to promote standardized descriptions and to facilitate search and integration of the data. The ISA-TAB-Nano specification has been submitted as an ASTM work item to obtain community feedback and to provide a nanotechnology data-sharing standard for public development and adoption.
Subjects
Information Storage and Retrieval, Nanostructures/chemistry, Information Dissemination, Research
ABSTRACT
MOTIVATION: Business Architecture Models (BAMs) describe what a business does, who performs the activities, where and when activities are performed, how activities are accomplished and which data are present. The purpose of a BAM is to provide a common resource for understanding business functions and requirements and to guide software development. The cancer Biomedical Informatics Grid (caBIG®) Life Science BAM (LS BAM) provides a shared understanding of the vocabulary, goals and processes that are common in the business of LS research. RESULTS: LS BAM 1.1 includes 90 goals and 61 people and groups within Use Case and Activity Unified Modeling Language (UML) Diagrams. Here we report on the model's current release, LS BAM 1.1, its utility and usage, and plans for future use and continuing development for future releases. AVAILABILITY AND IMPLEMENTATION: The LS BAM is freely available as UML, PDF and HTML (https://wiki.nci.nih.gov/x/OFNyAQ).
Subjects
Biomedical Research, Neoplasms, Software, Controlled Vocabulary, Computational Biology/methods, Computer Systems, National Cancer Institute (U.S.), Neoplasms/drug therapy, Neoplasms/physiopathology, United States
ABSTRACT
Advancements in next-generation sequencing and other -omics technologies are accelerating the detailed molecular characterization of individual patient tumors, and driving the evolution of precision medicine. Cancer is no longer considered a single disease, but rather a diverse array of diseases wherein each patient has a unique collection of germline variants and somatic mutations. Molecular profiling of patient-derived samples has led to a data explosion that could help us understand the contributions of environment and germline to risk, therapeutic response, and outcome. To maximize the value of these data, an interdisciplinary approach is paramount. The National Cancer Institute (NCI) has initiated multiple projects to characterize tumor samples using multi-omic approaches. These projects harness the expertise of clinicians, biologists, computer scientists, and software engineers to investigate cancer biology and therapeutic response in multidisciplinary teams. Petabytes of cancer genomic, transcriptomic, epigenomic, proteomic, and imaging data have been generated by these projects. To address the data analysis challenges associated with these large datasets, the NCI has sponsored the development of the Genomic Data Commons (GDC) and three Cloud Resources. The GDC ensures data and metadata quality, ingests and harmonizes genomic data, and securely redistributes the data. During their pilot phase, the Cloud Resources tested multiple cloud-based approaches for enhancing data access, collaboration, computational scalability, resource democratization, and reproducibility. These NCI-led efforts are continuously being refined to better support open data practices and precision oncology, and to serve as building blocks of the NCI Cancer Research Data Commons.
ABSTRACT
[This corrects the article on p. 83 in vol. 5, PMID: 28983483.]
ABSTRACT
The cancer Nanotechnology Laboratory (caNanoLab) data portal is an online nanomaterial database that allows users to submit and retrieve information on well-characterized nanomaterials, including composition, in vitro and in vivo experimental characterizations, experimental protocols, and related publications. Initiated in 2006, caNanoLab serves as an established resource with an infrastructure supporting the structured collection of nanotechnology data to address the needs of the cancer biomedical and nanotechnology communities. The portal contains over 1,000 curated nanomaterial data records that are publicly accessible for review, comparison, and re-use, with the ultimate goal of accelerating the translation of nanotechnology-based cancer therapeutics, diagnostics, and imaging agents to the clinic. In this paper, we will discuss challenges associated with developing a nanomaterial database and recognized needs for nanotechnology data curation and sharing in the biomedical research community. We will also describe the latest version of caNanoLab, caNanoLab 2.0, which includes enhancements and new features to improve usability such as personalized views of data and enhanced search and navigation.
ABSTRACT
The use of nanotechnology in biomedicine involves the engineering of nanomaterials to act as therapeutic carriers, targeting agents and diagnostic imaging devices. The application of nanotechnology in cancer aims to transform early detection, targeted therapeutics and cancer prevention and control. To assist in expediting and validating the use of nanomaterials in biomedicine, the National Cancer Institute (NCI) Center for Biomedical Informatics and Information Technology, in collaboration with the NCI Alliance for Nanotechnology in Cancer (Alliance), has developed a data sharing portal called caNanoLab. caNanoLab provides access to experimental and literature curated data from the NCI Nanotechnology Characterization Laboratory, the Alliance and the greater cancer nanotechnology community.
Subjects
Biomedical Research/trends, Computational Biology/methods, Neoplasms/etiology, Biomedical Research/methods, Computational Biology/trends, Databases as Topic, Genomics/methods, Genomics/trends, High-Throughput Nucleotide Sequencing, Human Genome Project, Humans, Neoplasms/therapy, Systems Integration
ABSTRACT
OBJECTIVE: Meaningful exchange of information is a fundamental challenge in collaborative biomedical research. To help address this, the authors developed the Life Sciences Domain Analysis Model (LS DAM), an information model that provides a framework for communication among domain experts and technical teams developing information systems to support biomedical research. The LS DAM is harmonized with the Biomedical Research Integrated Domain Group (BRIDG) model of protocol-driven clinical research. Together, these models can facilitate data exchange for translational research. MATERIALS AND METHODS: The content of the LS DAM was driven by analysis of life sciences and translational research scenarios, and the concepts in the model are derived from existing information models, reference models and data exchange formats. The model is represented in the Unified Modeling Language and uses ISO 21090 data types. RESULTS: The LS DAM v2.2.1 comprises 130 classes and covers several core areas including Experiment, Molecular Biology, Molecular Databases and Specimen. Nearly half of these classes originate from the BRIDG model, emphasizing the semantic harmonization between these models. Validation of the LS DAM against independently derived information models, research scenarios and reference databases supports its general applicability to represent life sciences research. DISCUSSION: The LS DAM provides unambiguous definitions for concepts required to describe life sciences research. The processes established to achieve consensus among domain experts will be applied in future iterations and may be broadly applicable to other standardization efforts. CONCLUSIONS: The LS DAM provides common semantics for life sciences research. Through harmonization with BRIDG, it promotes interoperability in translational science.
Subjects
Biological Science Disciplines, Information Dissemination, Information Systems, Systems Integration, Translational Biomedical Research, Humans, Information Storage and Retrieval, Reference Standards, Semantics, Unified Medical Language System
ABSTRACT
There are several issues to be addressed concerning the management and effective use of information (or data) generated from nanotechnology studies in biomedical research and medicine. These data are large in volume, diverse in content, and beset with gaps and ambiguities in the description and characterization of nanomaterials. In this work, we have reviewed three areas of nanomedicine informatics: information resources; taxonomies, controlled vocabularies, and ontologies; and information standards. Informatics methods and standards in each of these areas are critical for enabling collaboration; data sharing; unambiguous representation and interpretation of data; semantic (meaningful) search and integration of data; and for ensuring data quality, reliability, and reproducibility. In particular, we have considered four types of information standards in this article: standard characterization protocols, common terminology standards, minimum information standards, and standard data communication (exchange) formats. Currently, because of gaps and ambiguities in the data, it is difficult to apply computational methods and machine learning techniques to analyze, interpret, and recognize patterns in data that are high dimensional in nature, and to relate variations in nanomaterial properties to variations in their chemical composition, synthesis, characterization protocols, and so on. Progress toward resolving the issues of information management in nanomedicine using the informatics methods and standards discussed in this article will be essential to the rapidly growing field of nanomedicine informatics.