Search | VHL Regional Portal

1.

Is generalization decay a fundamental law of psychology?

Mandel, David R.

Behav Brain Sci ; 47: e54, 2024 Feb 05.

Article in English | MEDLINE | ID: mdl-38311463

ABSTRACT

Generalizations strengthen in traditional sciences, but in psychology (and social and behavioral sciences, more generally) they decay. This is usually viewed as a problem requiring solution. It could be viewed instead as a law-like phenomenon. Generalization decay cannot be squelched because human behavior is metastable and all behavioral data collected thus far have resulted from a thin sliver of human time.

Subject(s)

Generalization, Psychological , Psychology , Humans

2.

When expert predictions fail.

Grossmann, Igor; Varnum, Michael E W; Hutcherson, Cendri A; Mandel, David R.

Trends Cogn Sci ; 28(2): 113-123, 2024 02.

Article in English | MEDLINE | ID: mdl-37949791

ABSTRACT

We examine the opportunities and challenges of expert judgment in the social sciences, scrutinizing the way social scientists make predictions. While social scientists show above-chance accuracy in predicting laboratory-based phenomena, they often struggle to predict real-world societal changes. We argue that most causal models used in social sciences are oversimplified, confuse levels of analysis to which a model applies, misalign the nature of the model with the nature of the phenomena, and fail to consider factors beyond the scientist's pet theory. Taking cues from physical sciences and meteorology, we advocate an approach that integrates broad foundational models with context-specific time series data. We call for a shift in the social sciences towards more precise, daring predictions and greater intellectual humility.

Subject(s)

Models, Theoretical , Social Sciences , Humans , Judgment , Time Factors

3.

Predicting reliability through structured expert elicitation with the repliCATS (Collaborative Assessments for Trustworthy Science) process.

Fraser, Hannah; Bush, Martin; Wintle, Bonnie C; Mody, Fallon; Smith, Eden T; Hanea, Anca M; Gould, Elliot; Hemming, Victoria; Hamilton, Daniel G; Rumpff, Libby; Wilkinson, David P; Pearson, Ross; Singleton Thorn, Felix; Ashton, Raquel; Willcox, Aaron; Gray, Charles T; Head, Andrew; Ross, Melissa; Groenewegen, Rebecca; Marcoci, Alexandru; Vercammen, Ans; Parker, Timothy H; Hoekstra, Rink; Nakagawa, Shinichi; Mandel, David R; van Ravenzwaaij, Don; McBride, Marissa; Sinnott, Richard O; Vesk, Peter; Burgman, Mark; Fidler, Fiona.

PLoS One ; 18(1): e0274429, 2023.

Article in English | MEDLINE | ID: mdl-36701303

ABSTRACT

As replications of individual studies are resource intensive, techniques for predicting the replicability are required. We introduce the repliCATS (Collaborative Assessments for Trustworthy Science) process, a new method for eliciting expert predictions about the replicability of research. This process is a structured expert elicitation approach based on a modified Delphi technique applied to the evaluation of research claims in social and behavioural sciences. The utility of processes to predict replicability is their capacity to test scientific claims without the costs of full replication. Experimental data supports the validity of this process, with a validation study producing a classification accuracy of 84% and an Area Under the Curve of 0.94, meeting or exceeding the accuracy of other techniques used to predict replicability. The repliCATS process provides other benefits. It is highly scalable, able to be deployed for both rapid assessment of small numbers of claims, and assessment of high volumes of claims over an extended period through an online elicitation platform, having been used to assess 3000 research claims over an 18 month period. It is available to be implemented in a range of ways and we describe one such implementation. An important advantage of the repliCATS process is that it collects qualitative data that has the potential to provide insight in understanding the limits of generalizability of scientific claims. The primary limitation of the repliCATS process is its reliance on human-derived predictions with consequent costs in terms of participant fatigue although careful design can minimise these costs. The repliCATS process has potential applications in alternative peer review and in the allocation of effort for replication studies.

Subject(s)

Behavioral Sciences , Data Accuracy , Humans , Reproducibility of Results , Costs and Cost Analysis , Peer Review

4.

Effect of exogenous testosterone in the context of energy deficit on risky choice: Behavioural and neural evidence from males.

Vartanian, Oshin; Lam, Timothy K; Mandel, David R; Ann Saint, Sidney; Navarrete, Gorka; Carmichael, Owen T; Murray, Kori; Pillai, Sreekrishna R; Shankapal, Preetham; Caldwell, John; Berryman, Claire E; Karl, J Philip; Harris, Melissa; Rood, Jennifer C; Pasiakos, Stefan M; Rice, Emma; Duncan, Matthew; Lieberman, Harris R.

Biol Psychol ; 176: 108468, 2023 01.

Article in English | MEDLINE | ID: mdl-36481265

ABSTRACT

Previous research has shown greater risk aversion when people make choices about lives than cash. We tested the hypothesis that compared to placebo, exogenous testosterone administration would lead to riskier choices about cash than lives, given testosterone's association with financial risk-taking and reward sensitivity. A double-blind, placebo-controlled, randomized trial was conducted to test this hypothesis (Clinical Trials Registry: NCT02734238, www.clinicaltrials.gov). We collected functional magnetic resonance imaging (fMRI) data from 50 non-obese males before and shortly after 28 days of severe exercise-and-diet-induced energy deficit, during which testosterone (200 mg testosterone enanthate per week in sesame oil) or placebo (sesame seed oil only) was administered. Because we expected circulating testosterone levels to be reduced due to severe energy deficit, testosterone administration served a restorative function to mitigate the impact of energy deficit on testosterone levels. The fMRI task involved making choices under uncertainty for lives and cash. We also manipulated whether the outcomes were presented as gains or losses. Consistent with prospect theory, we observed the reflection effect such that participants were more risk averse when outcomes were presented as gains than losses. Brain activation in the thalamus covaried with individual differences in exhibiting the reflection effect. Testosterone did not impact choice, but it increased sensitivity to negative feedback following risky choices. These results suggest that exogenous testosterone administration in the context of energy deficit can impact some aspects of risky choice, and that individual differences in the reflection effect engage a brain structure involved in processing emotion, reward and risk.

Subject(s)

Gambling , Risk-Taking , Male , Humans , Testosterone , Gambling/psychology , Choice Behavior/physiology , Brain , Reward , Decision Making/physiology

5.

Communicating uncertainty in national security intelligence: Expert and nonexpert interpretations of and preferences for verbal and numeric formats.

Irwin, Daniel; Mandel, David R.

Risk Anal ; 43(5): 943-957, 2023 05.

Article in English | MEDLINE | ID: mdl-35994518

ABSTRACT

Organizations in several domains including national security intelligence communicate judgments under uncertainty using verbal probabilities (e.g., likely) instead of numeric probabilities (e.g., 75% chance), despite research indicating that the former have variable meanings across individuals. In the intelligence domain, uncertainty is also communicated using terms such as low, moderate, or high to describe the analyst's confidence level. However, little research has examined how intelligence professionals interpret these terms and whether they prefer them to numeric uncertainty quantifiers. In two experiments (N = 481 and 624, respectively), uncertainty communication preferences of expert (n = 41 intelligence analysts in Experiment 1) and nonexpert intelligence consumers were elicited. We examined which format participants judged to be more informative and simpler to process. We further tested whether participants treated verbal probability and confidence terms as independent constructs and whether participants provided coherent numeric probability translations of verbal probabilities. Results showed that although most nonexperts favored the numeric format, experts were about equally split, and most participants in both samples regarded the numeric format as more informative. Experts and nonexperts consistently conflated probability and confidence. For instance, confidence intervals inferred from verbal confidence terms had a greater effect on the location of the estimate than the width of the estimate, contrary to normative expectation. Approximately one-fourth of experts and over one-half of nonexperts provided incoherent numeric probability translations for the terms likely and unlikely when the elicitation of best estimates and lower and upper bounds were briefly spaced by intervening tasks.

Subject(s)

Communication , Judgment , Humans , Uncertainty , Probability , Intelligence

6.

Predicting Clinical Trial Results: A Synthesis of Five Empirical Studies and Their Implications.

Kimmelman, Jonathan; Mandel, David R; Benjamin, Daniel M.

Perspect Biol Med ; 66(1): 107-128, 2023.

Article in English | MEDLINE | ID: mdl-38662011

ABSTRACT

Expectations about future events underlie practically every decision we make, including those in medical research. This paper reviews five studies undertaken to assess how well medical experts could predict the outcomes of clinical trials. It explains why expert trial forecasting was the focus of study and argues that forecasting skill affords insights into the quality of expert judgment and might be harnessed to improve decision-making in care, policy, and research. The paper also addresses potential criticisms of the research agenda and summarizes key findings from the five studies of trial forecasting. Together, the studies suggest that trials frequently deliver surprising results to expert communities and that individual experts are often uninformative when it comes to forecasting trial outcome and recruitment. However, the findings also suggest that expert forecasts often contain a "signal" about whether a trial will be positive, especially when forecasts are aggregated. The paper concludes with needs for further research and tentative policy recommendations.

Subject(s)

Clinical Trials as Topic , Humans , Clinical Trials as Topic/methods , Decision Making , Forecasting

7.

Framing, equivalence, and rational inference.

Mandel, David R.

Behav Brain Sci ; 45: e234, 2022 10 25.

Article in English | MEDLINE | ID: mdl-36281854

ABSTRACT

Bermúdez's case for rational framing effects, while original, is unconvincing and gives only parenthetical treatment to the problematic assumptions of extensional and semantic equivalence of alternative frames in framing experiments. If the assumptions are false, which they sometimes are, no valid inferences about "framing effects" follow and, then, neither do inferences about human rationality. This commentary recaps the central problem.

Subject(s)

Choice Behavior , Semantics , Humans

8.

Communicating uncertainty using words and numbers.

Dhami, Mandeep K; Mandel, David R.

Trends Cogn Sci ; 26(6): 514-526, 2022 06.

Article in English | MEDLINE | ID: mdl-35397985

ABSTRACT

Life in an increasingly information-rich but highly uncertain world calls for an effective means of communicating uncertainty to a range of audiences. Senders prefer to convey uncertainty using verbal (e.g., likely) rather than numeric (e.g., 75% chance) probabilities, even in consequential domains, such as climate science. However, verbal probabilities can convey something other than uncertainty, and senders may exploit this. For instance, senders can maintain credibility after making erroneous predictions. While verbal probabilities afford ease of expression, they can be easily misunderstood, and the potential for miscommunication is not effectively mitigated by assigning (imprecise) numeric probabilities to words. When making consequential decisions, recipients prefer (precise) numeric probabilities.

Subject(s)

Communication , Humans , Probability , Uncertainty

9.

Principal investigators over-optimistically forecast scientific and operational outcomes for clinical trials.

Benjamin, Daniel M; Hey, Spencer P; MacPherson, Amanda; Hachem, Yasmina; Smith, Kara S; Zhang, Sean X; Wong, Sandy; Dolter, Samantha; Mandel, David R; Kimmelman, Jonathan.

PLoS One ; 17(2): e0262862, 2022.

Article in English | MEDLINE | ID: mdl-35134071

ABSTRACT

OBJECTIVE: To assess the accuracy of principal investigators' (PIs) predictions about three events for their own clinical trials: positivity on trial primary outcomes, successful recruitment and timely trial completion. STUDY DESIGN AND SETTING: A short, electronic survey was used to elicit subjective probabilities within seven months of trial registration. When trial results became available, prediction skill was calculated using Brier scores (BS) and compared against uninformative prediction (i.e. predicting 50% all of the time). RESULTS: 740 PIs returned surveys (16.7% response rate). Predictions on all three events tended to exceed observed event frequency. Averaged PI skill did not surpass uninformative predictions (e.g., BS = 0.25) for primary outcomes (BS = 0.25, 95% CI 0.20, 0.30) and were significantly worse for recruitment and timeline predictions (BS 0.38, 95% CI 0.33, 0.42; BS = 0.52, 95% CI 0.50, 0.55, respectively). PIs showed poor calibration for primary outcome, recruitment, and timelines (calibration index = 0.064, 0.150 and 0.406, respectively), modest discrimination in primary outcome predictions (AUC = 0.76, 95% CI 0.65, 0.85) but minimal discrimination in the other two outcomes (AUC = 0.64, 95% CI 0.57, 0.70; and 0.55, 95% CI 0.47, 0.62, respectively). CONCLUSION: PIs showed overconfidence in favorable outcomes and exhibited limited skill in predicting scientific or operational outcomes for their own trials. They nevertheless showed modest ability to discriminate between positive and non-positive trial outcomes. Low survey response rates may limit generalizability.

Subject(s)

Forecasting , Research Personnel/psychology , Clinical Trials as Topic , Surveys and Questionnaires , Treatment Outcome

10.

Political Identity Over Personal Impact: Early U.S. Reactions to the COVID-19 Pandemic.

Collins, Robert N; Mandel, David R; Schywiola, Sarah S.

Front Psychol ; 12: 607639, 2021.

Article in English | MEDLINE | ID: mdl-33833708

ABSTRACT

Research suggests political identity has strong influence over individuals' attitudes and beliefs, which in turn can affect their behavior. Likewise, firsthand experience with an issue can also affect attitudes and beliefs. A large (N = 6,383) survey (Pew Research and Ipsos W64) of Americans was analyzed to investigate the effects of both political identity (i.e., Democrat or Republican) and personal impact (i.e., whether they suffered job or income loss) on individuals' reactions to the COVID-19 pandemic. Results show that political identity and personal impact influenced the American public's attitudes about and response to COVID-19. Consistent with prior research, political identity exerted a strong influence on self-reports of emotional distress, threat perception, discomfort with exposure, support for restrictions, and perception of under/overreaction by individuals and institutions. The difference between Democrats and Republican responses were consistent with their normative value differences and with the contemporary partisan messaging. Personal impact exerted a comparatively weaker influence on reported emotional distress and threat perception. Both factors had a weak influence on appraisal of individual and government responses. The dominating influence of political identity carried over into the bivariate relations among these self-reported attitudes and responses. In particular, the appraisal of government response divided along party lines, tied to opposing views of whether there has been over- or under-reaction to the pandemic. The dominance of political identity has important implications for crisis management and reflects the influence of normative value differences between the parties, partisan messaging on the pandemic, and polarization in American politics.

11.

On measuring agreement with numerically bounded linguistic probability schemes: A re-analysis of data from Wintle, Fraser, Wills, Nicholson, and Fidler (2019).

Mandel, David R; Irwin, Daniel.

PLoS One ; 16(3): e0248424, 2021.

Article in English | MEDLINE | ID: mdl-33735197

ABSTRACT

Across a wide range of domains, experts make probabilistic judgments under conditions of uncertainty to support decision-making. These judgments are often conveyed using linguistic expressions (e.g., x is likely). Seeking to foster shared understanding of these expressions between senders and receivers, the US intelligence community implemented a communication standard that prescribes a set of probability terms and assigns each term an equivalent numerical probability range. In an earlier PLOS ONE article, [1] tested whether access to the standard improves shared understanding and also explored the efficacy of various enhanced presentation formats. Notably, they found that embedding numeric equivalents in text (e.g., x is likely [55-80%]) substantially outperformed the status-quo approach in terms of the percentage overlap between participants' interpretations of linguistic probabilities (defined in terms of the numeric range equivalents they provided for each term) and the numeric ranges in the standard. These results have important prescriptive implications, yet Wintle et al.'s percentage overlap measure of agreement may be viewed as unfairly punitive because it penalizes individuals for being more precise than the stipulated guidelines even when the individuals' interpretations fall perfectly within the stipulated ranges. Arguably, subjects' within-range precision is a positive attribute and should not be penalized in scoring interpretive agreement. Accordingly, in the present article, we reanalyzed Wintle et al.'s data using an alternative measure of percentage overlap that does not penalize in-range precision. Using the alternative measure, we find that percentage overlap is substantially elevated across conditions. More importantly, however, the effects of presentation format and probability level are highly consistent with the original study. By removing the ambiguity caused by Wintle et al.'s unduly punitive measure of agreement, these findings buttress Wintle et al.'s original claim that the methods currently used by intelligence organizations are ineffective at coordinating the meaning of uncertainty expressions between intelligence producers and intelligence consumers. Future studies examining agreement between senders and receivers are also encouraged to reflect carefully on the most appropriate measures of agreement to employ in their experiments and to explicate the bases for their methodological choices.

Subject(s)

Communication , Decision Making , Terminology as Topic , Uncertainty , Adult , Data Analysis , Humans , Language , Research/standards

12.

Can Oncologists Predict the Efficacy of Treatments in Randomized Trials?

Benjamin, Daniel M; Mandel, David R; Barnes, Tristan; Krzyzanowska, Monika K; Leighl, Natasha; Tannock, Ian F; Kimmelman, Jonathan.

Oncologist ; 26(1): 56-62, 2021 01.

Article in English | MEDLINE | ID: mdl-32936509

ABSTRACT

BACKGROUND: Decisions about trial funding, ethical approval, or clinical practice guideline recommendations require expert judgments about the potential efficacy of new treatments. We tested whether individual and aggregated expert opinion of oncologists could predict reliably the efficacy of cancer treatments tested in randomized controlled trials. MATERIALS AND METHODS: An international sample of 137 oncologists specializing in genitourinary, lung, and colorectal cancer provided forecasts on primary outcome attainment for five active randomized cancer trials within their subspecialty; skill was assessed using Brier scores (BS), which measure the average squared deviation between forecasts and outcomes. RESULTS: A total of 40% of trials in our sample reported positive primary outcomes. Experts generally anticipated this overall frequency (mean forecast, 34%). Individual experts on average outperformed random predictions (mean BS = 0.29 [95% confidence interval (CI), 0.28-0.33] vs. 0.33) but underperformed prediction algorithms that always guessed 50% (BS = 0.25) or that were trained on base rates (BS = 0.19). Aggregating forecasts improved accuracy (BS = 0.25; 95% CI, 0.16-0.36]). Neither individual experts nor aggregated predictions showed appreciable discrimination between positive and nonpositive trials (area under the curve of a receiver operating characteristic curve, 0.52 and 0.43, respectively). CONCLUSION: These findings are based on a limited sample of trials. However, they reinforce the importance of basing research and policy decisions on the results of randomized trials rather than expert opinion or low-level evidence. IMPLICATIONS FOR PRACTICE: Predictions of oncologists, either individually or in the aggregate, did not anticipate reliably outcomes for randomized trials in cancer. These findings suggest that pooled expert opinion about treatment efficacy is no substitute for randomized trials. They also underscore the challenges of using expert opinion to prioritize interventions for clinical trials or to make recommendations in clinical practice guidelines.

Subject(s)

Expert Testimony , Oncologists , Humans , Randomized Controlled Trials as Topic , Treatment Outcome

13.

Words or numbers? Communicating probability in intelligence analysis.

Dhami, Mandeep K; Mandel, David R.

Am Psychol ; 76(3): 549-560, 2021 Apr.

Article in English | MEDLINE | ID: mdl-32700939

ABSTRACT

Intelligence analysis is fundamentally an exercise in expert judgment made under conditions of uncertainty. These judgments are used to inform consequential decisions. Following the major intelligence failure that led to the 2003 war in Iraq, intelligence organizations implemented policies for communicating probability in their assessments. Virtually all chose to convey probability using standardized linguistic lexicons in which an ordered set of select probability terms (e.g., highly likely) is associated with numeric ranges (e.g., 80-90%). We review the benefits and drawbacks of this approach, drawing on psychological research on probability communication and studies that have examined the effectiveness of standardized lexicons. We further discuss how numeric probabilities can overcome many of the shortcomings of linguistic probabilities. Numeric probabilities are not without drawbacks (e.g., they are more difficult to elicit and may be misunderstood by receivers with poor numeracy). However, these drawbacks can be ameliorated with training and practice, whereas the pitfalls of linguistic probabilities are endemic to the approach. We propose that, on balance, the benefits of using numeric probabilities outweigh their drawbacks. Given the enormous costs associated with intelligence failure, the intelligence community should reconsider its reliance on using linguistic probabilities to convey probability in intelligence assessments. Our discussion also has implications for probability communication in other domains such as climate science. (PsycInfo Database Record (c) 2021 APA, all rights reserved).

Subject(s)

Communication , Decision Making , Linguistics , Policy Making , Probability , Humans , Judgment , Uncertainty

14.

Improving Probability Judgment in Intelligence Analysis: From Structured Analysis to Statistical Aggregation.

Karvetski, Christopher W; Mandel, David R; Irwin, Daniel.

Risk Anal ; 40(5): 1040-1057, 2020 05.

Article in English | MEDLINE | ID: mdl-32065440

ABSTRACT

As in other areas of expert judgment, intelligence analysis often requires judging the probability that hypotheses are true. Intelligence organizations promote the use of structured methods such as "Analysis of Competing Hypotheses" (ACH) to improve judgment accuracy and analytic rigor, but these methods have received little empirical testing. In this experiment, we pitted ACH against a factorized Bayes's theorem (FBT) method, and we examined the value of recalibration (coherentization) and aggregation methods for improving the accuracy of probability judgment. Analytic techniques such as ACH and FBT were ineffective in improving accuracy and handling correlated evidence, and ACH in fact decreased the coherence of probability judgments. In contrast, statistical postanalytic methods (i.e., coherentization and aggregation) yielded large accuracy gains. A wide range of methods for instantiating these techniques were tested. The interactions among the factors considered suggest that prescriptive theorists and interventionists should examine the value of ensembles of judgment-support methods.

15.

Editorial: Judgment and Decision Making Under Uncertainty: Descriptive, Normative, and Prescriptive Perspectives.

Mandel, David R; Navarrete, Gorka; Dieckmann, Nathan; Nelson, Jonathan.

Front Psychol ; 10: 1506, 2019.

Article in English | MEDLINE | ID: mdl-31312160

16.

Cognitive Style and Frame Susceptibility in Decision-Making.

Mandel, David R; Kapler, Irina V.

Front Psychol ; 9: 1461, 2018.

Article in English | MEDLINE | ID: mdl-30147670

ABSTRACT

The susceptibility of decision-makers' choices to variations in option framing has been attributed to individual differences in cognitive style. According to this view, individuals who are prone to a more deliberate, or less intuitive, thinking style are less susceptible to framing manipulations. Research findings on the topic, however, have tended to yield small effects, with several studies also being limited in inferential value by methodological drawbacks. We report two experiments that examined the value of several cognitive-style variables, including measures of cognitive reflection, subjective numeracy, actively open-minded thinking, need for cognition, and hemispheric dominance, in predicting participants' frame-consistent choices. Our experiments used an isomorph of the Asian Disease Problem and we manipulated frames between participants. We controlled for participants' sex and age, and we manipulated the order in which choice options were presented to participants. In Experiment 1 (N = 190) using an undergraduate sample and in Experiment 2 (N = 316) using a sample of Amazon Mechanical Turk workers, we found no significant effect of any of the cognitive-style measures taken on predicting frame-consistent choice, regardless of whether we analyzed participants' binary choices or their choices weighted by the extent to which participants preferred their chosen option over the non-chosen option. The sole factor that significantly predicted frame-consistent choice was framing: in both experiments, participants were more likely to make frame-consistent choices when the frame was positive than when it was negative, consistent with the tendency toward risk aversion in the task. The present findings do not support the view that individual differences in people's susceptibility to framing manipulations can be substantially accounted for by individual differences in cognitive style.

17.

Correcting Judgment Correctives in National Security Intelligence.

Mandel, David R; Tetlock, Philip E.

Front Psychol ; 9: 2640, 2018.

Article in English | MEDLINE | ID: mdl-30622501

ABSTRACT

Intelligence analysts, like other professionals, form norms that define standards of tradecraft excellence. These norms, however, have evolved in an idiosyncratic manner that reflects the influence of prominent insiders who had keen psychological insights but little appreciation for how to translate those insights into testable hypotheses. The net result is that the prevailing tradecraft norms of best practice are only loosely grounded in the science of judgment and decision-making. The "common sense" of prestigious opinion leaders inside the intelligence community has pre-empted systematic validity testing of the training techniques and judgment aids endorsed by those opinion leaders. Drawing on the scientific literature, we advance hypotheses about how current best practices could well be reducing rather than increasing the quality of analytic products. One set of hypotheses pertain to the failure of tradecraft training to recognize the most basic threat to accuracy: measurement error in the interpretation of the same data and in the communication of interpretations. Another set of hypotheses focuses on the insensitivity of tradecraft training to the risk that issuing broad-brush, one-directional warnings against bias (e.g., over-confidence) will be less likely to encourage self-critical, deliberative cognition than simple response-threshold shifting that yields the mirror-image bias (e.g., under-confidence). Given the magnitude of the consequences of better and worse intelligence analysis flowing to policy-makers, we see a compelling case for greater funding of efforts to test what actually works.

18.

Can cancer researchers accurately judge whether preclinical reports will reproduce?

Benjamin, Daniel; Mandel, David R; Kimmelman, Jonathan.

PLoS Biol ; 15(6): e2002212, 2017 Jun.

Article in English | MEDLINE | ID: mdl-28662052

ABSTRACT

There is vigorous debate about the reproducibility of research findings in cancer biology. Whether scientists can accurately assess which experiments will reproduce original findings is important to determining the pace at which science self-corrects. We collected forecasts from basic and preclinical cancer researchers on the first 6 replication studies conducted by the Reproducibility Project: Cancer Biology (RP:CB) to assess the accuracy of expert judgments on specific replication outcomes. On average, researchers forecasted a 75% probability of replicating the statistical significance and a 50% probability of replicating the effect size, yet none of these studies successfully replicated on either criterion (for the 5 studies with results reported). Accuracy was related to expertise: experts with higher h-indices were more accurate, whereas experts with more topic-specific expertise were less accurate. Our findings suggest that experts, especially those with specialized knowledge, were overconfident about the RP:CB replicating individual experiments within published reports; researcher optimism likely reflects a combination of overestimating the validity of original studies and underestimating the difficulties of repeating their methodologies.

Subject(s)

Biomedical Research/standards , Judgment , Neoplasms/therapy , Research Personnel/standards , Research Report/standards , Science/standards , Animals , Biomedical Research/methods , Data Collection/methods , Data Collection/statistics & numerical data , Expert Testimony/methods , Humans , Mice , Neoplasms/diagnosis , Professional Competence/standards , Reproducibility of Results , Xenograft Model Antitumor Assays/methods , Xenograft Model Antitumor Assays/standards

19.

Debunking the Myth of Value-Neutral Virginity: Toward Truth in Scientific Advertising.

Mandel, David R; Tetlock, Philip E.

Front Psychol ; 7: 451, 2016.

Article in English | MEDLINE | ID: mdl-27064318

ABSTRACT

The scientific community often portrays science as a value-neutral enterprise that crisply demarcates facts from personal value judgments. We argue that this depiction is unrealistic and important to correct because science serves an important knowledge generation function in all modern societies. Policymakers often turn to scientists for sound advice, and it is important for the wellbeing of societies that science delivers. Nevertheless, scientists are human beings and human beings find it difficult to separate the epistemic functions of their judgments (accuracy) from the social-economic functions (from career advancement to promoting moral-political causes that "feel self-evidently right"). Drawing on a pluralistic social functionalist framework that identifies five functionalist mindsets-people as intuitive scientists, economists, politicians, prosecutors, and theologians-we consider how these mindsets are likely to be expressed in the conduct of scientists. We also explore how the context of policymaker advising is likely to activate or de-activate scientists' social functionalist mindsets. For instance, opportunities to advise policymakers can tempt scientists to promote their ideological beliefs and values, even if advising also brings with it additional accountability pressures. We end prescriptively with an appeal to scientists to be more circumspect in characterizing their objectivity and honesty and to reject idealized representations of scientific behavior that inaccurately portray scientists as value-neutral virgins.

20.

Editorial: Improving Bayesian Reasoning: What Works and Why?

Mandel, David R; Navarrete, Gorka.

Front Psychol ; 6: 1872, 2015.

Article in English | MEDLINE | ID: mdl-26696936

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL