ABSTRACT
This commentary introduces a new clinical trial construct, the Master Observational Trial (MOT), which hybridizes the power of molecularly based master interventional protocols with the breadth of real-world data. The MOT provides a clinical venue to allow molecular medicine to rapidly advance, answers questions that traditional interventional trials generally do not address, and seamlessly integrates with interventional trials in both diagnostic and therapeutic arenas. The result is a more comprehensive data collection ecosystem in precision medicine.
Subjects
Observational Studies as Topic/methods, Precision Medicine/methods, Research Design/standards, Big Data, Clinical Trial Protocols as Topic, Humans, Molecular Targeted Therapy/methods, Molecular Targeted Therapy/trends, Observational Studies as Topic/standards
ABSTRACT
Scientists are enthusiastically imagining ways in which artificial intelligence (AI) tools might improve research. Why are AI tools so attractive and what are the risks of implementing them across the research pipeline? Here we develop a taxonomy of scientists' visions for AI, observing that their appeal comes from promises to improve productivity and objectivity by overcoming human shortcomings. But proposed AI solutions can also exploit our cognitive limitations, making us vulnerable to illusions of understanding in which we believe we understand more about the world than we actually do. Such illusions obscure the scientific community's ability to see the formation of scientific monocultures, in which some types of methods, questions and viewpoints come to dominate alternative approaches, making science less innovative and more vulnerable to errors. The proliferation of AI tools in science risks introducing a phase of scientific enquiry in which we produce more but understand less. By analysing the appeal of these tools, we provide a framework for advancing discussions of responsible knowledge production in the age of AI.
Subjects
Artificial Intelligence, Illusions, Knowledge, Research Design, Research Personnel, Humans, Artificial Intelligence/supply & distribution, Artificial Intelligence/trends, Cognition, Diffusion of Innovation, Efficiency, Reproducibility of Results, Research Design/standards, Research Design/trends, Risk, Research Personnel/psychology, Research Personnel/standards
ABSTRACT
Artificial intelligence (AI) is being increasingly integrated into scientific discovery to augment and accelerate research, helping scientists to generate hypotheses, design experiments, collect and interpret large datasets, and gain insights that might not have been possible using traditional scientific methods alone. Here we examine breakthroughs over the past decade that include self-supervised learning, which allows models to be trained on vast amounts of unlabelled data, and geometric deep learning, which leverages knowledge about the structure of scientific data to enhance model accuracy and efficiency. Generative AI methods can create designs, such as small-molecule drugs and proteins, by analysing diverse data modalities, including images and sequences. We discuss how these methods can help scientists throughout the scientific process and the central issues that remain despite such advances. Both developers and users of AI tools need a better understanding of when such approaches need improvement, and challenges posed by poor data quality and stewardship remain. These issues cut across scientific disciplines and require developing foundational algorithmic approaches that can contribute to scientific understanding or acquire it autonomously, making them critical areas of focus for AI innovation.
Subjects
Artificial Intelligence, Research Design, Artificial Intelligence/standards, Artificial Intelligence/trends, Datasets as Topic, Deep Learning, Research Design/standards, Research Design/trends, Unsupervised Machine Learning
ABSTRACT
Detailed method descriptions are essential for reproducibility, research evaluation, and effective data reuse. We summarize the key recommendations for life sciences researchers and research institutions described in the European Commission PRO-MaP report.
Subjects
Biological Science Disciplines, Biological Science Disciplines/methods, Humans, Research Design/standards, Reproducibility of Results
ABSTRACT
Upon completion of an experiment, if a trend is observed that is "not quite significant," it can be tempting to collect more data in an effort to achieve statistical significance. Such sample augmentation or "N-hacking" is condemned because it can lead to an excess of false positives, which can reduce the reproducibility of results. However, the scenarios used to prove this rule tend to be unrealistic, assuming the addition of unlimited extra samples to achieve statistical significance, or doing so when results are not even close to significant; an unlikely situation for most experiments involving patient samples, cultured cells, or live animals. If we were to examine some more realistic scenarios, could there be any situations where N-hacking might be an acceptable practice? This Essay aims to address this question, using simulations to demonstrate how N-hacking causes false positives and to investigate whether this increase is still relevant when using parameters based on real-life experimental settings.
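The kind of simulation the essay describes can be sketched in a few lines of Python. This is an illustrative sketch, not the author's actual code: the group sizes, augmentation budget, and "close to significant" zone (0.05 ≤ P ≤ 0.15) are assumptions chosen to mimic a realistic setting, and both groups are drawn from the same distribution so that every "significant" call is a false positive.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def n_hacked_trial(n0=10, n_extra=5, max_rounds=3, alpha=0.05, near_zone=0.15):
    """One experiment under the null hypothesis (both groups drawn from the
    same distribution), with sample augmentation: if the result is 'not
    quite significant', add n_extra samples per group and re-test."""
    a = list(rng.normal(size=n0))
    b = list(rng.normal(size=n0))
    for round_ in range(max_rounds + 1):
        p = stats.ttest_ind(a, b).pvalue
        if p < alpha:
            return True                     # declared 'significant': a false positive
        if p > near_zone or round_ == max_rounds:
            return False                    # far from significance, or budget exhausted
        a.extend(rng.normal(size=n_extra))  # N-hacking: augment both groups, re-test
        b.extend(rng.normal(size=n_extra))

n_trials = 10_000
fp_rate = sum(n_hacked_trial() for _ in range(n_trials)) / n_trials
print(f"False-positive rate with N-hacking: {fp_rate:.3f} (nominal alpha = 0.05)")
```

Any excess of the simulated rate over the nominal 0.05 quantifies the inflation produced by this particular augmentation rule; tightening the "near zone" or the augmentation budget shrinks it.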
Subjects
Data Accuracy, Research Design, Reproducibility of Results, Research Design/standards
ABSTRACT
Context-dependent biological variation presents a unique challenge to the reproducibility of results in experimental animal research, because organisms' responses to experimental treatments can vary with both genotype and environmental conditions. In March 2019, experts in animal biology, experimental design and statistics convened in Blonay, Switzerland, to discuss strategies addressing this challenge. In contrast to the current gold standard of rigorous standardization in experimental animal research, we recommend the systematic heterogenization of study samples and conditions, actively incorporating biological variation into study design. Here we provide the scientific rationale for this approach in the hope that researchers, regulators, funders and editors can embrace this paradigm shift. We also present a road map towards better practices in view of improving the reproducibility of animal research.
Subjects
Animal Experimentation/standards, Biological Variation, Population, Research Design/standards, Animals, Reproducibility of Results
ABSTRACT
The power of language to shape how readers interpret biomedical results should not be underestimated. Misreporting and misinterpretation are pressing problems in randomized controlled trial (RCT) reports. This may be partially related to the statistical significance paradigm used in clinical trials, centered on a P value cutoff of 0.05. Strict use of this cutoff may lead clinical researchers to describe results with P values approaching but not reaching the threshold as "almost significant." The question is how phrases expressing nonsignificant results have been reported in RCTs over the past 30 years. To this end, we conducted a quantitative analysis of the English full texts of 567,758 RCTs recorded in PubMed between 1990 and 2020 (81.5% of all published RCTs in PubMed). We determined the exact presence of 505 predefined phrases denoting results that approach but do not cross the line of formal statistical significance (P < 0.05). We modeled temporal trends in phrase data with Bayesian linear regression. Evidence for temporal change was obtained through Bayes factor (BF) analysis. In a randomly sampled subset, the associated P values were manually extracted. We identified 61,741 phrases in 49,134 RCTs indicating almost significant results (8.65%; 95% confidence interval (CI): 8.58% to 8.73%). The overall prevalence of these phrases remained stable over time, with the most prevalent phrases being "marginally significant" (in 7,735 RCTs), "all but significant" (7,015), "a nonsignificant trend" (3,442), "failed to reach statistical significance" (2,578), and "a strong trend" (1,700). The strongest evidence for an increased temporal prevalence was found for "a numerical trend," "a positive trend," "an increasing trend," and "nominally significant."
In contrast, the phrases "all but significant," "approaches statistical significance," "did not quite reach statistical significance," "difference was apparent," "failed to reach statistical significance," and "not quite significant" decreased over time. In a randomly sampled subset of 29,000 phrases, 11,926 corresponding P values were manually identified; 68.1% of these ranged between 0.05 and 0.15 (CI: 67. to 69.0; median 0.06). Our results show that RCT reports regularly contain specific phrases describing marginally nonsignificant results to report P values close to, but above, the dominant 0.05 cutoff. The stable prevalence of these phrases over time indicates that the practice of broadly interpreting P values close to a predefined threshold remains common. To enhance responsible and transparent interpretation of RCT results, researchers, clinicians, reviewers, and editors could reduce the focus on formal statistical significance thresholds, report P values together with corresponding effect sizes and CIs, and focus on the clinical relevance of the statistical differences found in RCTs.
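The core text-mining step — matching predefined phrases and extracting nearby P values from full texts — can be sketched as follows. The phrase list is a small illustrative subset of the study's 505 phrases, and the regular expressions are assumptions for illustration, not the authors' actual pipeline:

```python
import re

# Illustrative subset of the 505 predefined "almost significant" phrases.
PHRASES = [
    "marginally significant",
    "a nonsignificant trend",
    "failed to reach statistical significance",
    "not quite significant",
]
phrase_re = re.compile("|".join(re.escape(p) for p in PHRASES), re.IGNORECASE)

# P values written as 'P = 0.06', 'p=0.051', 'P < 0.10', and similar.
pval_re = re.compile(r"\bP\s*[=<>]\s*(0?\.\d+)", re.IGNORECASE)

def scan(text):
    """Return the hedging phrases and the P values found in one passage."""
    phrases = [m.group(0).lower() for m in phrase_re.finditer(text)]
    pvals = [float(v) for v in pval_re.findall(text)]
    return phrases, pvals

phrases, pvals = scan("The effect was marginally significant (P = 0.06).")
```

At corpus scale, pairing each matched phrase with the nearest extracted P value is what allows the distribution of "almost significant" P values to be characterized.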
Subjects
PubMed/standards, Publications/standards, Randomized Controlled Trials as Topic/standards, Research Design/standards, Research Report/standards, Bayes Theorem, Bias, Humans, Linear Models, Outcome Assessment, Health Care/methods, Outcome Assessment, Health Care/standards, Outcome Assessment, Health Care/statistics & numerical data, PubMed/statistics & numerical data, Publications/statistics & numerical data, Randomized Controlled Trials as Topic/statistics & numerical data, Reproducibility of Results
ABSTRACT
The goal of sex and gender analysis is to promote rigorous, reproducible and responsible science. Incorporating sex and gender analysis into experimental design has enabled advancements across many disciplines, such as improved treatment of heart disease and insights into the societal impact of algorithmic bias. Here we discuss the potential for sex and gender analysis to foster scientific discovery, improve experimental efficiency and enable social equality. We provide a roadmap for sex and gender analysis across scientific disciplines and call on researchers, funding agencies, peer-reviewed journals and universities to coordinate efforts to implement robust methods of sex and gender analysis.
Subjects
Engineering/methods, Engineering/standards, Research Design/standards, Research Design/trends, Science/methods, Science/standards, Sex Characteristics, Sex Factors, Animals, Artificial Intelligence, Female, Humans, Male, Molecular Targeted Therapy, Reproducibility of Results, Sample Size
ABSTRACT
NRG Oncology's Developmental Therapeutics and Radiation Therapy Subcommittee assembled an interdisciplinary group of investigators to address barriers to successful early phase clinical trials of novel combination therapies involving radiation. This Policy Review elucidates some of the many challenges associated with study design for early phase trials combining radiotherapy with novel systemic agents, which are distinct from drug-drug combination development and are often overlooked. We also advocate for potential solutions that could mitigate or eliminate some of these barriers, providing examples of specific clinical trial designs that could help facilitate efficient and effective evaluation of novel drug-radiotherapy combinations.
Subjects
Clinical Trials as Topic, Neoplasms, Humans, Neoplasms/radiotherapy, Chemoradiotherapy/adverse effects, Research Design/standards, Radiation Oncology/standards
ABSTRACT
The requirement for large-scale, expensive cancer screening trials spanning decades creates considerable barriers to the development, commercialisation, and implementation of novel screening tests. One way to address these problems is to use surrogate endpoints for the ultimate endpoint of interest, cancer mortality, at an earlier timepoint. This Review aims to highlight the issues underlying the choice and use of surrogate endpoints for cancer screening trials, to propose criteria for when and how we might use such endpoints, and to suggest possible candidates. We present the current landscape and challenges, and discuss lessons and shortcomings from the therapeutic trial setting. It is hugely challenging to validate a surrogate endpoint, even with carefully designed clinical studies. Nevertheless, we consider whether there are candidates that might satisfy the requirements defined by research and regulatory bodies.
Subjects
Early Detection of Cancer, Neoplasms, Humans, Early Detection of Cancer/methods, Neoplasms/diagnosis, Biomarkers, Tumor/analysis, Clinical Trials as Topic, Research Design/standards, Biomarkers/analysis, Endpoint Determination
ABSTRACT
BACKGROUND: When research evidence is limited, inconsistent, or absent, healthcare decisions and policies need to be based on consensus amongst interested stakeholders. In these processes, the knowledge, experience, and expertise of health professionals, researchers, policymakers, and the public are systematically collected and synthesised to reach agreed clinical recommendations and/or priorities. However, despite the influence of consensus exercises, the methods used to achieve agreement are often poorly reported. The ACCORD (ACcurate COnsensus Reporting Document) guideline was developed to help report any consensus methods used in biomedical research, regardless of the health field, techniques used, or application. This explanatory document facilitates the use of the ACCORD checklist. METHODS AND FINDINGS: This paper was built collaboratively based on classic and contemporary literature on consensus methods and publications reporting their use. For each ACCORD checklist item, this explanation and elaboration document unpacks the pieces of information that should be reported and provides a rationale on why it is essential to describe them in detail. Furthermore, this document offers a glossary of terms used in consensus exercises to clarify the meaning of common terms used across consensus methods, to promote uniformity, and to support understanding for consumers who read consensus statements, position statements, or clinical practice guidelines. The items are followed by examples of reporting items from the ACCORD guideline, in text, tables and figures. CONCLUSIONS: The ACCORD materials - including the reporting guideline and this explanation and elaboration document - can be used by anyone reporting a consensus exercise used in the context of health research. 
As a reporting guideline, ACCORD helps researchers to be transparent about the materials, resources (both human and financial), and procedures used in their investigations so readers can judge the trustworthiness and applicability of their results/recommendations.
Subjects
Checklist, Consensus, Humans, Biomedical Research/standards, Research Design/standards, Guidelines as Topic, Research Report/standards
ABSTRACT
BACKGROUND: Clinical studies are often limited by resources available, which results in constraints on sample size. We use simulated data to illustrate study implications when the sample size is too small. METHODS AND RESULTS: Using 2 theoretical populations each with N = 1000, we randomly sample 10 from each population and conduct a statistical comparison, to help make a conclusion about whether the 2 populations are different. This exercise is repeated for a total of 4 studies: 2 concluded that the 2 populations are statistically significantly different, while 2 showed no statistically significant difference. CONCLUSIONS: Our simulated examples demonstrate that sample sizes play important roles in clinical research. The results and conclusions, in terms of estimates of means, medians, Pearson correlations, chi-square test, and P values, are unreliable with small samples.
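The paper's exercise is easy to reproduce in outline. The sketch below uses assumed population parameters (means 100 and 110, SD 15) purely to show how unstable an estimate based on 10 samples per group is:

```python
import random
import statistics

random.seed(42)

# Two theoretical populations of N = 1000 each, with a true mean difference of 10.
pop_a = [random.gauss(100, 15) for _ in range(1000)]
pop_b = [random.gauss(110, 15) for _ in range(1000)]

def one_study(n=10):
    """Draw n subjects per group and estimate the between-group difference."""
    return (statistics.fmean(random.sample(pop_b, n))
            - statistics.fmean(random.sample(pop_a, n)))

# Repeat the 'study' four times, as in the paper's exercise.
diffs = [one_study() for _ in range(4)]
print([round(d, 1) for d in diffs])  # estimates scatter widely around the true gap
```

With n = 10 per group the standard error of the difference is roughly 15·√(2/10) ≈ 6.7, so individual studies can easily suggest anything from no difference to a doubled one, and significance tests on such samples flip accordingly.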
Subjects
Research Design, Sample Size, Humans, Research Design/standards
ABSTRACT
Evidence-based medicine (EBM) can be an unfamiliar territory for those working in tumor pathology research, and there is a great deal of uncertainty about how to undertake an EBM approach to planning and reporting histopathology-based studies. In this article, reviewed and endorsed by the World Health Organization International Agency for Research on Cancer's International Collaboration for Cancer Classification and Research, we aim to help pathologists and researchers understand the basics of planning an evidence-based tumor pathology research study, as well as our recommendations on how to report the findings from such studies. We introduce some basic EBM concepts, a framework for research questions, and thoughts on study design and emphasize the concept of reporting standards. There are many study-specific reporting guidelines available, and we provide an overview of these. However, existing reporting guidelines perhaps do not always fit tumor pathology research papers, and hence, here, we collate the key reporting data set together into one generic checklist that we think will simplify the task for pathologists. The article aims to complement our recent hierarchy of evidence for tumor pathology and glossary of evidence (study) types in tumor pathology. Together, these articles should help any researcher get to grips with the basics of EBM for planning and publishing research in tumor pathology, as well as encourage an improved standard of the reports available to us all in the literature.
Subjects
Evidence-Based Medicine, Neoplasms, World Health Organization, Humans, Neoplasms/pathology, Neoplasms/classification, Pathologists, Biomedical Research, Research Design/standards, Pathology/standards, Evidence Gaps
ABSTRACT
BACKGROUND: Non-inferiority (NI) trials require unique trial design and methods, which pose challenges in their interpretation and applicability, risking introduction of inferior therapies in clinical practice. With the abundance of novel therapies, published NI trials are increasing in number. Prior studies found inadequate quality of reporting of NI studies, but were limited to certain specialties/journals, lacked NI margin evaluation, and did not examine temporal changes in quality. We conducted a systematic review without restriction to journal type, journal impact factor, disease state or intervention to evaluate the quality of NI trials, including a comprehensive risk of bias assessment and comparison of quality over time. METHODOLOGY: We searched PubMed and Cochrane Library databases for NI trials published in English in 2014 and 2019. They were assessed for: study design and NI margin characteristics, primary results, and risk of bias for blinding, concealment, analysis method and missing outcome data. RESULTS: We included 823 studies. Between 2014 and 2019, a shift from publication in specialty to general journals (15% vs 28%, p < 0.001) and from pharmacological to non-pharmacological interventions (25% vs 38%, p = 0.025) was observed. The NI margin was specified in most trials for both years (94% vs 95%). Rationale for the NI margin increased (36% vs 57%, p < 0.001) but remained low, with clinical judgement the most common rationale (30% vs 23%), although more 2019 articles incorporated patient values (0.3% vs 21%, p < 0.001). Over 50% of studies were open-label for both years. Use of the gold standard method of analysis (both per protocol and (modified) intention to treat) declined over time (43% vs 36%, p < 0.001). DISCUSSION: The methodological quality and reporting of NI trials remain inadequate, although they are improving in some areas. Improved methods for NI margin justification, blinding, and analysis method are warranted to facilitate clinical decision-making.
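For readers unfamiliar with the design, the basic NI decision rule can be sketched as follows. The margin, event counts, and Wald-interval choice here are illustrative assumptions, not values from the review:

```python
import math

def noninferior(events_new, n_new, events_std, n_std, margin=0.10, z=1.96):
    """Non-inferiority check on a risk difference (new minus standard),
    where higher event rates are worse. The new therapy is declared
    non-inferior if the upper bound of the Wald 95% CI for the risk
    difference lies below the prespecified NI margin."""
    p_new, p_std = events_new / n_new, events_std / n_std
    diff = p_new - p_std
    se = math.sqrt(p_new * (1 - p_new) / n_new + p_std * (1 - p_std) / n_std)
    upper = diff + z * se
    return diff, upper, upper < margin

diff, upper, ni = noninferior(events_new=42, n_new=300, events_std=40, n_std=300)
```

The choice and justification of `margin` is precisely the weak spot the review highlights: with the same data, a more generous margin can flip the conclusion from "not shown" to "non-inferior".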
Subjects
Equivalence Trials as Topic, Humans, Research Design/standards
ABSTRACT
The premise of research in human physiology is to explore a multifaceted system whilst identifying one or a few outcomes of interest. Therefore, the control of potentially confounding variables requires careful thought regarding the extent of control and complexity of standardisation. One common factor to control prior to testing is diet, as provided food and fluid may deviate from participants' habitual diets, while self-report-and-replication methods can be undermined by under-reporting. Researchers may also need to consider standardisation of physical activity, whether it be through familiarisation trials, wash-out periods, or guidance on levels of physical activity to be achieved before trials. In terms of pharmacological agents, the ethical implications of standardisation require researchers to carefully consider how medications, caffeine consumption and oral contraceptive prescriptions may affect the study. For research in females, it should be considered whether between-participant or within-participant standardisation with regard to menstrual cycle phase is most relevant. The timing of measurements relative to various other daily events is relevant to all physiological research, and so it can be important to standardise when measurements are made. This review summarises these areas of standardisation, which we hope will be useful to anyone involved in human physiology research, including guidance on when and how standardisation can be applied in various contexts.
Subjects
Physiology, Humans, Physiology/standards, Physiology/methods, Research Design/standards, Female, Menstrual Cycle/physiology
ABSTRACT
The replicability of research results has been a cause of increasing concern to the scientific community. The long-held belief that experimental standardization begets replicability has also been recently challenged, with the observation that the reduction of variability within studies can lead to idiosyncratic, lab-specific results that cannot be replicated. An alternative approach is instead to deliberately introduce heterogeneity, known as "heterogenization" of experimental design. Here, we explore a novel perspective in the heterogenization program in a meta-analysis of variability in observed phenotypic outcomes in both control and experimental animal models of ischemic stroke. First, by quantifying interindividual variability across control groups, we illustrate that the amount of heterogeneity in disease state (infarct volume) differs according to methodological approach, for example, in disease induction methods and disease models. We argue that such methods may improve replicability by creating a diverse and representative distribution of baseline disease state in the reference group, against which treatment efficacy is assessed. Second, we illustrate how meta-analysis can be used to simultaneously assess efficacy and stability (i.e., mean effect and among-individual variability). We identify treatments that have efficacy and are generalizable to the population level (i.e., low interindividual variability), as well as those where there is high interindividual variability in response; for these latter treatments, translation to a clinical setting may require nuance. We argue that by embracing rather than seeking to minimize variability in phenotypic outcomes, we can motivate the shift toward heterogenization and improve both the replicability and generalizability of preclinical research.
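Variability comparisons of this kind are often expressed on the log coefficient-of-variation scale (a lnCVR-style contrast). A minimal sketch, with invented numbers standing in for, say, infarct volumes in control groups under two induction methods:

```python
import math
import statistics

def log_cv(values):
    """Natural log of the coefficient of variation: a scale-free measure of
    interindividual variability within a group."""
    return math.log(statistics.stdev(values) / statistics.fmean(values))

# Invented control-group outcomes under two disease-induction methods.
method_1 = [12.0, 14.0, 13.5, 12.5, 13.0]   # tightly clustered
method_2 = [8.0, 18.0, 11.0, 21.0, 6.0]     # heterogeneous

lncvr = log_cv(method_2) - log_cv(method_1)
print(f"lnCVR = {lncvr:.2f}")  # > 0: method 2 yields more heterogeneous outcomes
```

In a real meta-analysis each pair of groups contributes one such contrast, weighted by its sampling variance, so that variability itself, not just the mean effect, becomes an outcome.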
Subjects
Animal Experimentation/standards, Research Design/standards, Animals, Behavior, Animal/physiology, Brain Ischemia/metabolism, Humans, Meta-Analysis as Topic, Models, Animal, Phenotype, Reference Standards, Reproducibility of Results, Research Design/trends, Stroke/physiopathology
ABSTRACT
In an effort to better utilize published evidence obtained from animal experiments, systematic reviews of preclinical studies are increasingly common, along with methods and tools to appraise them (e.g., the SYstematic Review Center for Laboratory animal Experimentation [SYRCLE] risk of bias tool). We performed a cross-sectional study of a sample of recent preclinical systematic reviews (2015-2018), examined a range of epidemiological characteristics, and used a 46-item checklist to assess reporting details. We identified 442 reviews published across 43 countries in 23 different disease domains that used 26 animal species. Reporting of key details to ensure transparency and reproducibility was inconsistent across reviews and within article sections. Items were most completely reported in the title, introduction, and results sections of the reviews, and least reported in the methods and discussion sections. Less than half of the reviews reported that a risk of bias assessment for internal and external validity was undertaken, and none reported methods for evaluating construct validity. Our results demonstrate that a considerable number of preclinical systematic reviews investigating diverse topics have been conducted; however, their quality of reporting is inconsistent. Our study provides the justification and evidence to inform the development of guidelines for conducting and reporting preclinical systematic reviews.
Subjects
Peer Review, Research/methods, Peer Review, Research/standards, Research Design/standards, Animal Experimentation/standards, Animals, Bias, Checklist/standards, Drug Evaluation, Preclinical/methods, Drug Evaluation, Preclinical/standards, Empirical Research, Epidemiologic Methods, Epidemiology/trends, Humans, Peer Review, Research/trends, Publications, Reproducibility of Results, Research Design/trends
ABSTRACT
The United States (U.S.) National Institutes of Health-funded Environmental influences on Child Health Outcomes (ECHO)-wide Cohort was established to conduct high-impact, transdisciplinary science to improve child health and development. The cohort is a collaborative research design in which both extant and new data are contributed by over 57,000 children across 69 cohorts. In this review article, we focus on two key challenging issues in the ECHO-wide Cohort: data collection standardization and data harmonization. Data standardization using a Common Data Model and derived analytical variables based on a team science approach should facilitate timely analyses and reduce errors due to data misuse. However, given the complexity of collaborative research designs, such as the ECHO-wide Cohort, dedicated time is needed for harmonization and derivation of analytic variables. These activities need to be done methodically and with transparency to enhance research reproducibility. IMPACT: Many collaborative research studies require data harmonization either prior to analyses or in the analyses of compiled data. The Environmental influences on Child Health Outcomes (ECHO) Cohort pools extant data with new data collection from over 57,000 children in 69 cohorts to conduct high-impact, transdisciplinary science to improve child health and development, and to provide a national database and biorepository for use by the scientific community at large. We describe the tools, systems, and approaches we employed to facilitate harmonized data for impactful analyses of child health outcomes.
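Harmonization against a Common Data Model can be pictured as mapping each cohort's native variable names and units onto shared, derived variables. This is a toy sketch with invented variable names and conversions, not ECHO's actual Common Data Model:

```python
# Toy common data model: one canonical variable, with per-cohort source
# fields and unit conversions (all names here are invented for illustration).
COMMON_MODEL = {
    "birth_weight_g": {
        "cohort_a": ("bw_kg", lambda v: v * 1000),   # cohort A records kilograms
        "cohort_b": ("birthweight", lambda v: v),    # cohort B already uses grams
    },
}

def harmonize(cohort, record):
    """Translate one cohort-specific record into common-model variables."""
    out = {}
    for common_name, sources in COMMON_MODEL.items():
        source_field, convert = sources[cohort]
        if source_field in record:
            out[common_name] = convert(record[source_field])
    return out

rec_a = harmonize("cohort_a", {"bw_kg": 3.5})
rec_b = harmonize("cohort_b", {"birthweight": 3400})
```

Doing this mapping once, centrally and transparently, is what lets pooled analyses across 69 cohorts proceed without each team re-deriving (and possibly mis-deriving) the same analytic variables.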