ABSTRACT
OBJECTIVE: The increasing demands for curated, high-quality research data are driving the emergence of a novel registry type. The need to assemble, curate, and export this data grows, and the conventional simplicity of registry models is driving the need for advanced, multimodal data registries: the dawn of the next-generation registry. MATERIALS AND METHODS: The article provides an outline of the technology roles and responsibilities needed for successful implementations of next-generation registries. RESULTS: We propose a framework for the planning, construction, maintenance, and sustainability of this new registry type. DISCUSSION: A rubric of organizational, computational, and human resource needs is discussed in detail, backed by over 40 years of combined field experience by the authors. CONCLUSIONS: A novel field, registry science, within the clinical research informatics domain, has arisen to offer its insights into conceiving, structuring, and sustaining this new breed of tools.
Subjects
Data Accuracy, Humans, Registries
ABSTRACT
Clinical research in neurodevelopmental disorders remains reliant upon clinician and caregiver measures. Limitations of these approaches indicate a need for objective, quantitative, and reliable biomarkers to advance clinical research. Extant research suggests the potential utility of multiple candidate biomarkers; however, effective application of these markers in trials requires additional understanding of replicability, individual differences, and intra-individual stability over time. The Autism Biomarkers Consortium for Clinical Trials (ABC-CT) is a multi-site study designed to investigate a battery of electrophysiological (EEG) and eye-tracking (ET) indices as candidate biomarkers for autism spectrum disorder (ASD). The study complements published biomarker research through: inclusion of large, deeply phenotyped cohorts of children with ASD and typical development; a longitudinal design; a focus on well-evidenced candidate biomarkers harmonized with an independent sample; high levels of clinical, regulatory, technical, and statistical rigor; adoption of a governance structure incorporating diverse expertise in the ASD biomarker discovery and qualification process; prioritization of open science, including creation of a repository containing biomarker, clinical, and genetic data; and use of economical and scalable technologies that are applicable in developmental populations and those with special needs. The ABC-CT approach has yielded encouraging results, with one measure accepted into the FDA's Biomarker Qualification Program to date. Through these advances, the ABC-CT and other biomarker studies in progress hold promise to deliver novel tools to improve clinical trials research in ASD.
ABSTRACT
Mental illness is increasingly recognized as both a significant cost to society and a significant area of opportunity for biological breakthrough. As -omics and imaging technologies enable researchers to probe molecular and physiological underpinnings of multiple diseases, opportunities arise to explore the biological basis for behavioral health and disease. From individual investigators to large international consortia, researchers have generated rich data sets in the area of mental health, including genomic, transcriptomic, metabolomic, proteomic, clinical and imaging resources. General data repositories such as the Gene Expression Omnibus (GEO) and Database of Genotypes and Phenotypes (dbGaP) and mental health (MH)-specific initiatives, such as the Psychiatric Genomics Consortium, MH Research Network and PsychENCODE represent a wealth of information yet to be gleaned. At the same time, novel approaches to integrate and analyze data sets are enabling important discoveries in the area of mental and behavioral health. This review will discuss and catalog into an organizing framework the increasingly diverse set of MH data resources available, using schizophrenia as a focus area, and will describe novel and integrative approaches to molecular biomarker discovery that make use of mental health data.
Subjects
Computational Biology, Mental Health, Translational Biomedical Research, Biomarkers/metabolism, Humans
ABSTRACT
OBJECTIVE: This study assesses data management needs in clinical research from the perspectives of researchers, software analysts and developers. MATERIALS AND METHODS: This is a mixed-methods study that employs sublanguage analysis in an innovative manner to link the assessments. We performed content analysis using sublanguage theory on transcribed interviews conducted with researchers at four universities. A business analyst independently extracted potential software features from the transcriptions, which were translated into the sublanguage. This common sublanguage was then used to create survey questions for researchers, analysts and developers about the desirability and difficulty of features. Results were synthesized using the common sublanguage to compare stakeholder perceptions with the original content analysis. RESULTS: Individual researchers exhibited significant diversity of perspectives that did not correlate by role or site. Researchers had mixed feelings about their technologies, and sought improvements in integration, interoperability and interaction as well as engaging with study participants. Researchers and analysts agreed that data integration has higher desirability and mobile technology has lower desirability but disagreed on the desirability of data validation rules. Developers agreed that data integration and validation are the most difficult to implement. DISCUSSION: Researchers perceive tasks related to study execution, analysis and quality control as highly strategic, in contrast with tactical tasks related to data manipulation. Researchers have only partial technological support for analysis and quality control, and poor support for study execution. CONCLUSION: Software for data integration and validation appears critical to support clinical research, but may be expensive to implement. Features to support study workflow, collaboration and engagement have been underappreciated, but may prove to be easy successes. Software developers should consider the strategic goals of researchers with regard to the overall coordination of research projects and teams, workflow connecting data collection with analysis and processes for improving data quality.
Subjects
Biomedical Research/methods, Biomedical Research/trends, Knowledge Management, Medical Informatics/methods, Algorithms, Computers, Humans, Programming Languages, Quality Control, Software, User-Computer Interface
ABSTRACT
Many scientific questions are best approached by pooling data, collected by different groups or across large collaborative networks, into a combined analysis. Unfortunately, some of the most interesting and powerful datasets, such as health records, genetic data, and drug discovery data, cannot be freely shared because they contain sensitive information. In many situations, knowing whether private datasets overlap determines whether it is worthwhile to navigate the institutional, ethical, and legal barriers that govern access to sensitive, private data. We report the first method of publicly measuring the overlap between private datasets that is secure under a malicious model without relying on private protocols or message passing. This method uses a publicly shareable summary of a dataset's contents, its cryptoset, to estimate its overlap with other datasets. Cryptosets approach "information-theoretic" security, the strongest type of security possible in cryptography, which cannot be broken even with infinite computing power. We empirically and theoretically assess both the accuracy of these estimates and the security of the approach, demonstrating that cryptosets are informative, with a stable accuracy, and secure.
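The idea of estimating overlap from a public summary can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a cryptoset is a fixed-length histogram of hashed private identifiers, and it estimates overlap by subtracting the collisions expected by chance from the dot product of two histograms. The function names, the use of SHA-256, the shared salt, and the histogram length are all hypothetical choices made for the example.

```python
import hashlib


def cryptoset(private_ids, length, salt="shared-salt"):
    """Build a fixed-length histogram of hashed identifiers.

    Assumption: both parties agree on `length` and `salt` in advance.
    Many identifiers fall into the same bin, so the histogram does not
    reveal which identifiers are present.
    """
    counts = [0] * length
    for pid in private_ids:
        digest = hashlib.sha256(f"{salt}:{pid}".encode("utf-8")).digest()
        counts[int.from_bytes(digest, "big") % length] += 1
    return counts


def estimate_overlap(a, b):
    """Estimate the number of shared identifiers from two histograms.

    Each shared identifier adds 1 to the dot product; every other pair
    collides with probability 1/length. Solving
    E[a.b] = n + (|A||B| - n) / length for n gives the estimate below.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("histograms must have equal length of at least 2")
    length = len(a)
    dot = sum(x * y for x, y in zip(a, b))
    chance = sum(a) * sum(b) / length
    return max(0.0, (dot - chance) / (1 - 1 / length))


if __name__ == "__main__":
    site_a = cryptoset(range(200), 2048)       # identifiers 0..199
    site_b = cryptoset(range(100, 300), 2048)  # identifiers 100..299
    print(round(estimate_overlap(site_a, site_b)))  # true overlap is 100
```

In this sketch, a shorter histogram hides more about individual identifiers but makes the estimate noisier, which is the accuracy and security trade-off the abstract refers to.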
Subjects
Computer Security, Information Dissemination, Algorithms, Electronic Health Records, Humans, Theoretical Models
ABSTRACT
This paper describes a usability evaluation study of an innovative first-generation system (Data Dig) designed to retrieve phenotypic data from the large SFARI data set of 2700 families, each of which has one child affected by autism spectrum disorder. The usability methods included a cognitive walkthrough and usability testing. Although the subjects were able to learn to use the system, more than 50 usability problems of varying severity were noted. The most frequent problems resulted from users being unable to understand the meanings of variables, filter categories correctly, use the Boolean filter, and correctly interpret the feedback provided by the system. Subjects had difficulty forming a mental model of the organizational system underlying the database. This precluded them from making informed navigation choices while formulating queries. Clinical research informatics is a new and immensely promising discipline. However, in its nascent stage, it lacks a stable interaction paradigm to support a range of users on pertinent tasks. This presents a great opportunity for researchers to further this science by harnessing the powers of user-centered iterative design.
Subjects
Autistic Disorder/diagnosis, Medical Informatics/methods, Algorithms, Biomedical Research/trends, Child, Preschool Child, Computer Systems, Computers, Humans, Organizational Models, Statistical Models, Software, User-Computer Interface
ABSTRACT
OBJECTIVE: To propose a centralized method for generating global unique identifiers to link collections of research data and specimens. DESIGN: The work is a collaboration between the Simons Foundation Autism Research Initiative and the National Database for Autism Research. The system is implemented as a web service: an investigator inputs identifying information about a participant into a client application and sends encrypted information to a server application, which returns a generated global unique identifier. The authors evaluated the system using a volume test of one million simulated individuals and a field test on 2000 families (over 8000 individual participants) in an autism study. MEASUREMENTS: Inverse probability of hash codes; rate of false identity of two individuals; rate of false split of a single individual; percentage of subjects for which identifying information could be collected; percentage of hash codes generated successfully. RESULTS: Large-volume simulation generated no false splits or false identity. Field testing in the Simons Foundation Autism Research Initiative Simplex Collection produced identifiers for 96% of children in the study and 77% of parents. On average, four out of five hash codes per subject were generated perfectly (only one perfect hash is required for subsequent matching). DISCUSSION: The system must achieve balance among the competing goals of distinguishing individuals, collecting accurate information for matching, and protecting confidentiality. Considerable effort is required to obtain approval from institutional review boards, obtain consent from participants, and achieve compliance from sites during a multicenter study. CONCLUSION: Generic unique identifiers have the potential to link collections of research data, augment the amount and types of data available for individuals, support detection of overlap between collections, and facilitate replication of research findings.
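The client and server roles described in this abstract can be sketched as follows. This is a simplified illustration, not the system the authors built: the abstract does not specify the hash function, the identifying fields, or how fields are combined, so the field sets, SHA-256, the `hash_codes` helper, and the in-memory `GuidServer` class are all hypothetical. The sketch only shows the matching rule the abstract states, namely that a single matching hash code is enough to link a participant to an existing identifier.

```python
import hashlib
import uuid

# Hypothetical combinations of identifying fields; each yields one hash code.
FIELD_SETS = [
    ("first_name", "last_name", "birth_date"),
    ("first_name", "birth_date", "birth_city"),
    ("last_name", "birth_date", "sex"),
]


def hash_codes(person):
    """Client side: turn identifying fields into one-way hash codes.

    A field set is skipped when any of its fields is missing, so a
    participant with incomplete information still yields some codes.
    """
    codes = []
    for fields in FIELD_SETS:
        values = [person.get(name) for name in fields]
        if any(v is None or not str(v).strip() for v in values):
            continue
        normalized = "|".join(str(v).strip().lower() for v in values)
        payload = "|".join(fields) + "#" + normalized
        codes.append(hashlib.sha256(payload.encode("utf-8")).hexdigest())
    return codes


class GuidServer:
    """Server side: map hash codes to identifiers (in memory, for illustration)."""

    def __init__(self):
        self._guid_by_code = {}

    def resolve(self, codes):
        """Return the existing identifier if any code matches, else a new one."""
        if not codes:
            raise ValueError("at least one hash code is required")
        guid = next(
            (self._guid_by_code[c] for c in codes if c in self._guid_by_code),
            None,
        )
        if guid is None:
            guid = uuid.uuid4().hex  # random identifier, not derived from the data
        for code in codes:
            self._guid_by_code.setdefault(code, guid)
        return guid
```

In use, two sites that enter the same participant with partly different or missing fields would still receive the same identifier as long as one field set matches exactly. A production system would also need protection against dictionary attacks on low-entropy fields, which this sketch does not attempt.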