Results 1 - 19 of 19
1.
Eval Rev ; 48(3): 403-409, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38590012

ABSTRACT

In a 1987 article, Peter H. Rossi promulgated "The Iron Law of Evaluation and Other Metallic Rules." The Metallic Laws were meant as an informal (and humorous) overstatement of the weakness of contemporary evaluations of social programs. Rossi's underlying worry was not so much about the state of evaluation technology in the abstract but, rather, about its inability to advance our broad understanding of social problems and what to do about them; in other words, to make evaluation policy relevant. Rossi attributed the continuing failure to develop successful "large-scale social programs" to the failure to build a strong knowledge base for this kind of "social engineering." The qualities of studies that enable such accumulated learning are variously labeled "external validity," "generalizability," "applicability," or "transferability." This Special Issue includes five papers that seek to explore and apply this understanding.


Subject(s)
Anxiety , Engineering , Program Evaluation
2.
Eval Rev ; 48(3): 410-426, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38235700

ABSTRACT

Assessing the transferability of lessons from social research or evaluation continues to raise challenges. Efforts to identify transferable lessons can be based on two different forms of argumentation. The first draws upon statistics and causal inferences. The second involves constructing a reasoned case based on weighing up different data collected along the causal chain from design to delivery. Both approaches benefit from designing research based upon existing evidence and ensuring that the descriptions of the programme, context, and intended beneficiaries are sufficiently rich. Identifying transferable lessons should not be thought of as a one-off event but as a contribution to the iterative learning of a scientific community. To understand the circumstances under which findings can be confidently transferred, we need to understand: (1) How far and why outcomes of interest have multiple, interacting and fluctuating causes. (2) The program design and implementation capacity. (3) Prior knowledge and causal landscapes (and how far these are included in the theory of change). (4) New and relevant knowledge: what can we learn in our 'disputatious community of truth seekers'?


Subject(s)
Learning
3.
Eval Rev ; 47(1): 43-70, 2023 02.
Article in English | MEDLINE | ID: mdl-33302732

ABSTRACT

In this article, we explore the reasons why multiarm trials have been conducted and the design and analysis issues they involve. We point to three fundamental reasons for such designs: (1) Multiarm designs allow the estimation of "response surfaces," that is, the variation in response to an intervention across a range of one or more continuous policy parameters. (2) Multiarm designs are an efficient way to test multiple policy approaches to the same social problem simultaneously, either to compare the effects of the different approaches or to estimate the effect of each separately. (3) Multiarm designs may allow for the estimation of the separate and combined effects of discrete program components. We illustrate each of these objectives with examples from the history of public policy experimentation over the past 50 years and discuss some design and analysis issues raised by each, including sample allocation, statistical power, multiple comparisons, and alignment of analysis with goals of the evaluation.


Subject(s)
Public Policy , Research Design
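
The abstract above lists multiple comparisons among the analysis issues raised by multiarm designs. As a minimal sketch, assuming each treatment arm is compared with a shared control group and that the p-values are already in hand, two common adjustments (Bonferroni and Holm) can be computed as follows. These methods and the p-values are illustrative assumptions, not taken from the article.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[i])
        # Enforce monotonicity: an adjusted p-value never falls below
        # the adjusted value of a smaller raw p-value.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Hypothetical p-values for three treatment arms, each versus control.
arm_p_values = [0.010, 0.040, 0.030]
bonferroni_adjusted = bonferroni(arm_p_values)
holm_adjusted = holm(arm_p_values)
print(bonferroni_adjusted)
print(holm_adjusted)
```

Holm is never more conservative than Bonferroni, which is one reason it is often preferred when several arms are tested against the same control.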
4.
Eval Rev ; 46(1): 32-57, 2022 02.
Article in English | MEDLINE | ID: mdl-33251816

ABSTRACT

PURPOSE: This case study discusses Mathematica's experience providing large-scale evaluation technical assistance (ETA) to 65 grantees across two cohorts of Teen Pregnancy Prevention (TPP) Program grants. The grantees were required to conduct rigorous evaluations with specific evaluation benchmarks. This case study provides an overview of the TPP grant program, the evaluation requirements, the ETA provider, and other key stakeholders and the ETA provided to the grantees. Finally, it discusses the successes, challenges, and lessons learned from the effort. CONCLUSION: One important lesson learned is that there are two related evaluation features, strong counterfactuals and insufficient target sample sizes, that funders should attend to prior to selecting awardees because they are not easy to change through ETA. In addition, if focused on particular outcomes (for TPP, the goal was to improve sexual behavior outcomes), the funder should prioritize studies with an opportunity to observe differences in these outcomes across conditions; several TPP grantees served young populations, and sexual behavior outcomes were not observed or were rare, limiting the opportunity to observe impacts. Unless funders are attentive to weeding out evaluations with critical limitations during the funding process, requiring grantees to conduct impact evaluations supported by ETA might unintentionally foster internally valid, yet underpowered studies that show nonsignificant program impacts. The TPP funder was able to overcome some of the limitations of the grantee evaluations by funding additional evidence-building activities, including federally led evaluations and a large meta-analysis of the effort, as part of a broader learning agenda.


Subject(s)
Pregnancy in Adolescence , Adolescent , Female , Humans , Pregnancy , Pregnancy in Adolescence/prevention & control , Program Evaluation , Sex Education , Sexual Behavior
5.
Eval Rev ; 45(3-4): 134-165, 2021.
Article in English | MEDLINE | ID: mdl-34693773

ABSTRACT

INTRODUCTION: Flavored tobacco appeals to new users. This paper describes evaluation results of California's early ordinances restricting flavored tobacco sales. METHODS: A multicomponent evaluation of proximal policy outcomes involved the following: (a) tracking the reach of local ordinances; (b) a retail observation survey; and (c) a statewide opinion poll of tobacco retailers. Change in the population covered by local ordinances was computed. Retail observations compared availability of flavored tobacco at retailers in jurisdictions with and without an ordinance. Mixed models compared ordinance and matched no-ordinance jurisdictions and adjusted for store type. An opinion poll assessed retailers' awareness and ease of compliance with local ordinances, comparing respondents in ordinance jurisdictions with the rest of California. RESULTS: The proportion of Californians living in a jurisdiction with an ordinance increased from 0.6% in April 2015 to 5.82% by January 1, 2019. Flavored tobacco availability was significantly lower in ordinance jurisdictions than in matched jurisdictions: menthol cigarettes (40.6% vs. 95.0%), cigarillos/cigar wraps with explicit flavor descriptors (56.4% vs. 85.0%), and vaping products with explicit flavor descriptors (6.1% vs. 56.9%). Over half of retailers felt compliance was easy; however, retailers in ordinance jurisdictions expressed lower support for flavor sales restrictions. CONCLUSIONS: The proportion of California's population covered by a flavor ordinance increased nine-fold between April 2015 and January 2019. Fewer retailers in ordinance jurisdictions had flavored tobacco products available compared to matched jurisdictions without an ordinance, but many still advertised flavored products they could not sell. Comprehensive ordinances and retailer outreach may facilitate sales-restriction support and compliance.


Subject(s)
Flavoring Agents , Tobacco Products , California , Commerce , Marketing , Tobacco Products/legislation & jurisprudence , Tobacco Products/supply & distribution
6.
Eval Rev ; 44(1): 3-50, 2020 02.
Article in English | MEDLINE | ID: mdl-32527152

ABSTRACT

Evidence-based policy is limited by the perception that randomized controlled trials (RCTs) are expensive and infeasible. We argue that carefully tailored research design can overcome these challenges and enable more widespread randomized evaluations of policy implementation. We demonstrate how a stepped-wedge (randomized rollout) design that adapts synthetic control methods overcame substantial practical, administrative, political, and statistical constraints to evaluating King County's new food safety rating system. The core RCT component of the evaluation came at little financial cost to the government, allowed the entire county to be treated, and resulted in no functional implementation delay. The case of restaurant sanitation grading has played a critical role in the scholarship on information disclosure, and our study provides the first evidence from a randomized trial of the causal effects of grading on health outcomes. We find that the grading system had no appreciable effects on foodborne illness, hospitalization, or food handling practices but that the system may have marginally increased public engagement by encouraging higher reporting.


Subject(s)
Food Safety , Health Policy , Research Design , Feasibility Studies , Foodborne Diseases/prevention & control , Humans , Restaurants , Sanitation , Washington
7.
Eval Rev ; 42(1): 3-33, 2018 02.
Article in English | MEDLINE | ID: mdl-30126296

ABSTRACT

BACKGROUND: This article explores the performance of regression discontinuity (RD) designs for measuring program impacts using a synthetic within-study comparison design. We generate synthetic RD data sets from experimental data sets from two recent evaluations of educational interventions (the Educational Technology Study and the Teach for America Study) and compare the RD impact estimates to the experimental estimates of the same intervention. OBJECTIVES: This article examines the performance of the RD estimator when the design is well implemented and also examines the extent of bias introduced by manipulation of the assignment variable in an RD design. RESEARCH DESIGN: We simulate RD analysis files by selectively dropping observations from the original experimental data files. We then compare impact estimates based on this RD design with those from the original experimental study. Finally, we simulate a situation in which some students manipulate the value of the assignment variable to receive treatment and compare RD estimates with and without manipulation. RESULTS AND CONCLUSION: RD and experimental estimators produce impact estimates that are not significantly different from one another and have a similar magnitude, on average. Manipulation of the assignment variable can substantially influence RD impact estimates, particularly if manipulation is related to the outcome and occurs close to the assignment variable's cutoff value.


Subject(s)
Program Evaluation , Regression Analysis , Research Design , Algorithms , Datasets as Topic , Educational Technology , Outcome Assessment, Health Care/statistics & numerical data , Program Evaluation/statistics & numerical data , Teaching
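
For readers unfamiliar with the RD estimator discussed in the abstract above, the following is a minimal sketch of a sharp RD estimate: a separate straight line is fitted on each side of the cutoff within a bandwidth, and the impact is the gap between the two lines at the cutoff. The function names, the bandwidth, and the noiseless synthetic data are all assumptions for illustration. The sketch does not reproduce the article's procedure of building RD files from experimental data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def sharp_rd_estimate(scores, outcomes, cutoff, bandwidth):
    """Gap between the two fitted lines evaluated at the cutoff.

    Units with score >= cutoff are assumed to be treated. Scores are
    centered at the cutoff so each intercept is the fitted value there.
    """
    left = [(s - cutoff, y) for s, y in zip(scores, outcomes)
            if cutoff - bandwidth <= s < cutoff]
    right = [(s - cutoff, y) for s, y in zip(scores, outcomes)
             if cutoff <= s <= cutoff + bandwidth]
    intercept_left, _ = fit_line(*zip(*left))
    intercept_right, _ = fit_line(*zip(*right))
    return intercept_right - intercept_left


# Hypothetical, noiseless data with a built-in jump of 3 at a cutoff of 0.
scores = [s / 10 for s in range(-20, 21)]
outcomes = [1.0 + 2.0 * s + (3.0 if s >= 0 else 0.0) for s in scores]
rd_effect = sharp_rd_estimate(scores, outcomes, cutoff=0.0, bandwidth=1.0)
print(round(rd_effect, 6))
```

With real data the estimate depends on the bandwidth and functional form, and, as the abstract notes, on whether units can manipulate the assignment variable near the cutoff.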
8.
Eval Rev ; 42(5-6): 575-615, 2018.
Article in English | MEDLINE | ID: mdl-30213214

ABSTRACT

BACKGROUND: This article reports on the Future to Discover Project, a Canadian randomized controlled trial of two high school interventions, where data on key postsecondary enrollment outcomes were collected for two phases. During the initial phase, outcomes were recorded from administrative data and follow-up surveys. During the later phase, data came from administrative records only. OBJECTIVES: The article provides analyses that are informative about the consequences of a change from administrative-only data to survey-only data (and vice versa) for the estimation of impacts. RESULTS: The change from administrative-only to survey-only data tended to produce apparent drops in postsecondary enrollment rates that varied by subgroup and education outcome. Nonetheless, levels and significance of impact with respect to postsecondary enrollment remained relatively stable. CONCLUSIONS: The findings of the article provide evidence that estimating education program impacts in the context of a randomized experiment can be relatively robust to the data sources chosen. They suggest that internal validity and conclusions for policy need not be affected by changing data sources even when the change produces marked changes in levels of the outcome of interest observed.


Subject(s)
Educational Status , Information Storage and Retrieval , Program Evaluation/methods , Surveys and Questionnaires , Adolescent , Canada , Data Collection/methods , Female , Humans , Male , Randomized Controlled Trials as Topic , Reproducibility of Results , Universities
9.
Eval Rev ; 42(3): 318-357, 2018 06.
Article in English | MEDLINE | ID: mdl-30081667

ABSTRACT

Policy makers face dilemmas when choosing a policy, program, or practice to implement. Researchers in education, public health, and other fields have proposed a sequential approach to identifying interventions worthy of broader adoption, involving pilot, efficacy, effectiveness, and scale-up studies. In this article, we examine a scale-up of an early math intervention to the state level, using a cluster randomized controlled trial. The intervention, Pre-K Mathematics, has produced robust positive effects on children's math ability in prior pilot, efficacy, and effectiveness studies. In the current study, we ask if it remains effective at a larger scale in a heterogeneous collection of pre-K programs that plausibly represent all low-income families with a child of pre-K age who live in California. We find that Pre-K Mathematics remains effective at the state level, with positive and statistically significant effects (effect size on the Early Childhood Longitudinal Study, Birth Cohort Mathematics Assessment = .30, p < .01). In addition, we develop a framework of the dimensions of scale-up to explain why effect sizes might decrease as scale increases. Using this framework, we compare the causal estimates from the present study to those from earlier, smaller studies. Consistent with our framework, we find that effect sizes have decreased over time. We conclude with a discussion of the implications of our study for how we think about the external validity of causal relationships.


Subject(s)
Early Intervention, Educational/standards , Evidence-Based Practice , California , Child, Preschool , Humans , Mathematics/education , Organizational Case Studies , Policy Making , Poverty , Program Evaluation/methods
10.
Eval Rev ; 41(5): 436-471, 2017 10.
Article in English | MEDLINE | ID: mdl-26785891

ABSTRACT

BACKGROUND: Variations in local context bedevil the assessment of external validity: the ability to generalize about effects of treatments. For evaluation, the challenges of assessing external validity are intimately tied to the translation and spread of evidence-based interventions. This makes external validity a question for decision makers, who need to determine whether to endorse, fund, or adopt interventions that were found to be effective and how to ensure high quality once they spread. OBJECTIVE: To present the rationale for using theory to assess external validity and the value of more systematic interaction of theory and practice. METHODS: We review advances in external validity, program theory, practitioner expertise, and local adaptation. Examples are provided for program theory, its adaptation to diverse contexts, and generalizing to contexts that have not yet been studied. The often critical role of practitioner experience is illustrated in these examples. Work is described that the Robert Wood Johnson Foundation is supporting to study treatment variation and context more systematically. RESULTS: Researchers and developers generally see a limited range of contexts in which the intervention is implemented. Individual practitioners see a different and often a wider range of contexts, albeit not a systematic sample. Organized and taken together, however, practitioner experiences can inform external validity by challenging the developers and researchers to consider a wider range of contexts. Researchers have developed a variety of ways to adapt interventions in light of such challenges. CONCLUSIONS: In systematic programs of inquiry, as opposed to individual studies, the problems of context can be better addressed. Evaluators have advocated an interaction of theory and practice for many years, but the process can be made more systematic and useful. 
Systematic interaction can set priorities for assessment of external validity by examining the prevalence and importance of context features and treatment variations. Practitioner interaction with researchers and developers can assist in sharpening program theory, reducing uncertainty about treatment variations that are consistent or inconsistent with the theory, inductively ruling out the ones that are harmful or irrelevant, and helping set priorities for more rigorous study of context and treatment variation.


Subject(s)
Models, Theoretical , Reproducibility of Results , Decision Making , Evaluation Studies as Topic , Evidence-Based Practice , Policy Making
11.
Eval Rev ; 41(6): 542-567, 2017 12.
Article in English | MEDLINE | ID: mdl-29232964

ABSTRACT

BACKGROUND: Youth who have experienced foster care are at risk of negative outcomes in adulthood. The family finding model aims to promote more positive outcomes by finding and engaging relatives of children in foster care in order to provide options for legal and emotional permanency. OBJECTIVES: The present study tested whether family finding, as implemented in North Carolina from 2008 through 2011, improved child welfare outcomes for youth at risk of emancipating from foster care without permanency. RESEARCH DESIGN: A randomized controlled trial evaluation was carried out in nine counties in North Carolina. All children eligible for intervention services between 2008 and 2011 underwent random assignment. Effects were tested with an intent-to-treat design. Outcome data were obtained for all subjects from child welfare administrative data. Additional outcome data for a subset of older youth came from in-person interviews. SUBJECTS: Subjects included 568 children who were in foster care, were 10-17 years old (at time of referral), had no identified permanent placement resource, and had no plan for reunification. MEASURES: The confirmatory outcome was moves to more family-like placements, whether through a step-down in foster care placement or discharge from foster care to legal permanency. RESULTS: No impact on the confirmatory outcome was observed. Findings regarding exploratory impacts are also described; these must be interpreted with caution, given the large number of outcomes compared. CONCLUSIONS: The evaluation failed to find evidence that family finding improves the outcomes of older youth at risk of emancipation from foster care.


Subject(s)
Child Behavior Disorders/epidemiology , Child Welfare , Family Relations/psychology , Foster Home Care/organization & administration , Psychotherapy/organization & administration , Adaptation, Psychological , Adolescent , Child , Child Abuse/prevention & control , Child Abuse/statistics & numerical data , Child Behavior Disorders/therapy , Humans , Incidence , Mental Health , North Carolina , Quality of Life , Risk Assessment , Socioeconomic Factors
12.
Eval Rev ; 41(2): 130-154, 2017 04.
Article in English | MEDLINE | ID: mdl-27671874

ABSTRACT

BACKGROUND: To limit the influence of attrition bias in assessments of intervention effectiveness, several federal evidence reviews have established a standard for acceptable levels of sample attrition in randomized controlled trials. These evidence reviews include the What Works Clearinghouse (WWC), the Home Visiting Evidence of Effectiveness Review, and the Teen Pregnancy Prevention Evidence Review. We believe the WWC attrition standard may constitute the first use of model-based, empirically supported bounds on attrition bias in the context of a federally sponsored systematic evidence review. Meeting the WWC attrition standard (or one of the attrition standards based on the WWC standard) is now an important consideration for researchers conducting studies that could potentially be reviewed by the WWC (or other evidence reviews). OBJECTIVES: The purpose of this article is to explain the WWC attrition model, how that model is used to establish attrition bounds, and to assess the sensitivity of attrition bounds to key parameter values. RESEARCH DESIGN: Results are based on equations derived in the article and values generated by applying those equations to a range of parameter values. RESULTS: The authors find that the attrition boundaries are more sensitive to the maximum level of bias that an evidence review is willing to tolerate than to other parameters in the attrition model. CONCLUSIONS: The authors conclude that the most productive refinements to existing attrition standards may be with respect to the definition of "maximum tolerable bias."


Subject(s)
Pregnancy in Adolescence/prevention & control , Research Design/standards , Sex Education/standards , Adolescent , Biomedical Research/standards , Female , Humans , Pregnancy , Program Evaluation
13.
Eval Rev ; 2016 Oct 25.
Article in English | MEDLINE | ID: mdl-27780907

ABSTRACT

BACKGROUND: Cluster randomized controlled trials (CRCTs) often require a large number of clusters in order to detect small effects with high probability. However, there are contexts where it may be possible to design a CRCT with a much smaller number of clusters (10 or fewer) and still detect meaningful effects. OBJECTIVES: The objective is to offer recommendations for best practices in design and analysis for small CRCTs. RESEARCH DESIGN: I use simulations to examine alternative design and analysis approaches. Specifically, I examine (1) which analytic approaches control Type I errors at the desired rate, (2) which design and analytic approaches yield the most power, (3) what is the design effect of spurious correlations, and (4) examples of specific scenarios under which impacts of different sizes can be detected with high probability. RESULTS/CONCLUSIONS: I find that (1) mixed effects modeling and using Ordinary Least Squares (OLS) on data aggregated to the cluster level both control the Type I error rate, (2) randomization within blocks is always recommended, but how best to account for blocking through covariate adjustment depends on whether the precision gains offset the degrees of freedom loss, (3) power calculations can be accurate when design effects from small sample, spurious correlations are taken into account, and (4) it is very difficult to detect small effects with just four clusters, but with six or more clusters, there are realistic circumstances under which small effects can be detected with high probability.
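
One of the two analytic approaches named in the abstract above, OLS on data aggregated to the cluster level, can be sketched as follows. With a single treatment indicator and no covariates or blocks, that regression reduces to a difference in cluster means with a pooled-variance standard error and (number of clusters - 2) degrees of freedom. The function name and the six-cluster data set are hypothetical, and the sketch omits the blocking and covariate adjustment the article discusses.

```python
from math import sqrt
from statistics import mean, variance


def cluster_level_estimate(clusters):
    """Impact estimate from cluster means.

    clusters: list of (treated, outcomes) pairs, one per cluster.
    Returns (impact, standard_error, degrees_of_freedom).
    """
    # Step 1: collapse each cluster to a single mean outcome.
    treated_means = [mean(y) for treated, y in clusters if treated]
    control_means = [mean(y) for treated, y in clusters if not treated]
    n_t, n_c = len(treated_means), len(control_means)

    # Step 2: difference in means across clusters.
    impact = mean(treated_means) - mean(control_means)

    # Step 3: pooled variance of the cluster means, as in OLS with
    # homoskedastic errors and one binary regressor.
    pooled_var = ((n_t - 1) * variance(treated_means)
                  + (n_c - 1) * variance(control_means)) / (n_t + n_c - 2)
    se = sqrt(pooled_var * (1 / n_t + 1 / n_c))
    return impact, se, n_t + n_c - 2


# Hypothetical trial with three treated and three control clusters.
example_clusters = [
    (True, [11.0, 13.0]), (True, [14.0, 14.0]), (True, [12.0, 14.0]),
    (False, [10.0, 10.0]), (False, [10.0, 12.0]), (False, [8.0, 10.0]),
]
impact, se, df = cluster_level_estimate(example_clusters)
print(impact, round(se, 4), df)
```

With so few clusters the t-statistic (impact / se) should be compared against a t distribution with the returned degrees of freedom rather than a normal distribution.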

14.
Eval Rev ; 40(6): 500-525, 2016 Dec.
Article in English | MEDLINE | ID: mdl-27784814

ABSTRACT

OBJECTIVE: Over the past two decades, the lack of reliable empirical evidence concerning the effectiveness of educational interventions has motivated a new wave of research in education in sub-Saharan Africa (and across most of the world) that focuses on impact evaluation through rigorous research designs such as experiments. Often these experiments draw on the random assignment of entire clusters, such as schools, to accommodate the multilevel structure of schooling and the theory of action underlying many school-based interventions. Planning effective and efficient school randomized studies, however, requires plausible values of the intraclass correlation coefficient (ICC) and the variance explained by covariates during the design stage. The purpose of this study was to improve the planning of two-level school-randomized studies in sub-Saharan Africa by providing empirical estimates of the ICC and the variance explained by covariates for education outcomes in 15 countries. METHOD: Our investigation drew on large-scale representative samples of sixth-grade students in 15 countries in sub-Saharan Africa and includes over 60,000 students across 2,500 schools. We examined two core education outcomes: standardized achievement in reading and mathematics. We estimated a series of two-level hierarchical linear models with students nested within schools to inform the design of two-level school-randomized trials. RESULTS: The analyses suggested that outcomes were substantially clustered within schools but that the magnitude of the clustering varied considerably across countries. Similarly, the results indicated that covariance adjustment generally reduced clustering but that the prognostic value of such adjustment varied across countries.
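
To illustrate why the ICC and the variance explained by covariates matter at the design stage, here is a minimal sketch using the standard formulas for a balanced two-level school-randomized trial with a standardized outcome: the design effect 1 + (n - 1) * ICC, and the standard error of the impact estimate. All numeric inputs are hypothetical and are not estimates from the study.

```python
from math import sqrt


def design_effect(icc, students_per_school):
    """Variance inflation from clustering, relative to simple random sampling."""
    return 1 + (students_per_school - 1) * icc


def impact_standard_error(icc, schools, students_per_school,
                          r2_school=0.0, r2_student=0.0, share_treated=0.5):
    """Standard error of the impact, in standard deviation units.

    r2_school and r2_student are the shares of school-level and
    student-level variance explained by covariates.
    """
    between = icc * (1 - r2_school)
    within = (1 - icc) * (1 - r2_student) / students_per_school
    allocation = share_treated * (1 - share_treated) * schools
    return sqrt((between + within) / allocation)


# Hypothetical design: 40 schools, 25 students each, ICC of 0.30.
deff = design_effect(icc=0.30, students_per_school=25)
se_unadjusted = impact_standard_error(icc=0.30, schools=40,
                                      students_per_school=25)
# Same design with a school-level covariate explaining half of the
# between-school variance.
se_adjusted = impact_standard_error(icc=0.30, schools=40,
                                    students_per_school=25, r2_school=0.5)
print(round(deff, 2), round(se_unadjusted, 4), round(se_adjusted, 4))
```

A common rule of thumb multiplies the standard error by roughly 2.8 to approximate the minimum detectable effect size at 80% power with a two-tailed 5% test. That multiplier assumes many degrees of freedom and rises when the number of schools is small.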

15.
Eval Rev ; 39(1): 3-18, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25740727

ABSTRACT

BACKGROUND: Scholarly publishing is an essential vehicle for actively participating in the scientific debate and for sustaining the invisible colleges of the modern research environment, which extend far beyond the borders of individual research institutions. However, its current dynamics have deeply transformed scientific life and conditioned in new ways the economics of academic knowledge production. They have also challenged the perceived common sense view of scientific research. METHOD: An analytical approach is used to set out a comprehensive framework for the current debate on scholarly publishing and to shed light on the peculiar organization and workings of this productive sector. RESULT: The way in which scientific knowledge is produced and transmitted has been dramatically affected by a series of recent major technosocietal transformations. Although the effects are many, the current overlap and interplay between two distinct and somewhat opposite stances, scientific and economic, tends in particular to blur the overall understanding of what scholarly publishing is and distorts its workings, which in turn affects scientific activities. The outcome is thus a series of intended and unintended effects on the production and dissemination of scientific knowledge. CONCLUSION: The article suggests that a substantial transformation characterizes science today, which seems more like a thrusting, entrepreneurial business than a contemplative, disinterested endeavor. In this essay, we provide a general overview of the pivotal role of scholarly publishing in fostering this change and of its pros and cons connected to the idiosyncratic interplay between social norms and market stances.


Subject(s)
Access to Information , Information Dissemination/methods , Publishing/organization & administration , Science/organization & administration , Humans , Total Quality Management
16.
Eval Rev ; 39(6): 555-86, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26908581

ABSTRACT

BACKGROUND: The impact of surveying on individuals' behavior and decision making has been widely studied in the academic literature on market research; the impact of monitoring on economic development interventions has received much less attention. OBJECTIVES: To estimate whether different monitoring strategies lead to improvement in participation levels and adoption of best practices for coffee production among farmers who participated in the TechnoServe Agronomy Training Program in Rwanda. RESEARCH DESIGN: Farmers were randomly identified, for monitoring purposes, as belonging to one of two groups and then selected on the additional criterion of having productive coffee trees. We estimate treatment-on-the-treated and intention-to-treat effects on training attendance rates and farmers' best-practice adoption using difference-in-differences estimation techniques. SUBJECTS: Farmers were randomly identified for high or low monitoring, which differed in the type and frequency of data collection, and were selected if they had productive coffee trees as part of the monitoring strategy. MEASURES: Attendance at training sessions by all farmers in the program and best-practice adoption data for improving coffee yield. RESULTS: We find that monitoring led to surprisingly large increases in farmer participation levels in the project and also improved best-practice adoption rates. We also find that the effects of higher-frequency data collection are long-lasting and more pronounced for low-attendance farmers. CONCLUSIONS: Monitoring not only provides more data and a better understanding of project dynamics, which in turn can help improve design, but can also improve processes and outcomes, in particular for the least engaged.


Subject(s)
Agriculture/education , Agriculture/organization & administration , Coffea/growth & development , Education/organization & administration , Farmers/education , Adaptation, Psychological , Data Collection , Developing Countries , Farmers/psychology , Humans , Monitoring, Physiologic/methods , Program Development , Program Evaluation , Rwanda , Surveys and Questionnaires
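
The difference-in-differences technique named in the abstract above can be sketched in its simplest two-group, two-period form: the change over time in the high-monitoring group minus the change over time in the low-monitoring group. The attendance rates below are invented for illustration, and the article's actual specification is presumably regression-based and richer than this.

```python
from statistics import mean


def difference_in_differences(treated_pre, treated_post,
                              control_pre, control_post):
    """(Change in the treated group) minus (change in the comparison group)."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical training attendance rates per farmer group.
high_monitoring_pre = [0.50, 0.60]
high_monitoring_post = [0.70, 0.80]
low_monitoring_pre = [0.50, 0.50]
low_monitoring_post = [0.55, 0.55]

did = difference_in_differences(high_monitoring_pre, high_monitoring_post,
                                low_monitoring_pre, low_monitoring_post)
print(round(did, 4))
```

Subtracting the comparison group's change removes any time trend shared by both groups, which is the identifying assumption of the method.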
17.
Eval Rev ; 39(2): 167-78, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25805301

ABSTRACT

BACKGROUND: For much of the last 40 years, the evaluation profession has been consumed in a battle over internal validity. Today, that battle has been decided. Random assignment, while still far from universal in practice, is almost universally acknowledged as the preferred method for impact evaluation. It is time for the profession to shift its attention to the remaining major flaws in the "standard model" of evaluation: (i) external validity and (ii) the high cost and low hit rate of experimental evaluations as currently practiced. RECOMMENDATIONS: To raise the profession's attention to external validity, the author recommends some simple, easy steps to be taken in every evaluation. The author makes two recommendations to increase the number of interventions found to be effective within existing resources: First, a two-stage evaluation strategy in which a cheap, streamlined Stage 1 evaluation is followed by a more intensive Stage 2 evaluation only for those interventions found to be effective in a Stage 1 trial and, second, use of random assignment to guide the myriad program management decisions that must be made in the course of routine program operations. This article is not intended as a solution to these issues: It is intended to stimulate the evaluation community to take these issues more seriously and to develop innovative solutions.


Subject(s)
Awards and Prizes , Evaluation Studies as Topic , Humans
18.
Eval Rev ; 38(3): 217-50, 2014 Jun.
Article in English | MEDLINE | ID: mdl-25015261

ABSTRACT

BACKGROUND: Cluster randomization (CR) is often used for program evaluation when simple random assignment is inappropriate or infeasible. Pairwise cluster random (PCR) assignment is a more efficient alternative, but evaluators seem to be deterred from PCR because of bias and identification problems. This article explains the problems, argues that they can be mitigated through design choices, and demonstrates that the suitability of PCR can be tested using Monte Carlo procedures. RESEARCH DESIGN: The article presents simple formulas showing how the PCR estimator is biased and explains why its standard error is not identified. Formal derivations appear in a longer companion article. Using those formulas, this article discusses how good design can mitigate the problems with bias and identification. Using Monte Carlo simulation, this article also shows how to choose between CR and PCR at the design stage. CONCLUSIONS: This article advocates for wider use of the PCR design. PCR loses its appeal when the investigator lacks baseline data for matching the clusters. Its use is less compelling when there are a large number of clusters. But when the evaluator is working with a fairly small number of clusters (26 in the running example used in this article), PCR is an attractive alternative to CR.


Subject(s)
Program Evaluation/methods , Random Allocation , Humans , Models, Statistical , Monte Carlo Method , Research Design
19.
Eval Rev ; 37(3-4): 239-73, 2013.
Article in English | MEDLINE | ID: mdl-24583377

ABSTRACT

BACKGROUND: Corrections agencies frequently place offenders into risk categories, within which offenders receive different levels of supervision and programming. This supervision strategy is seldom evaluated but often can be through routine use of a regression discontinuity design (RDD). This article argues that RDD provides a rigorous and cost-effective method for correctional agencies to evaluate and improve supervision strategies and advocates for using RDD routinely in corrections administration. The objective is to better employ correctional resources. METHOD: This article uses a Neyman-Pearson counterfactual framework to introduce readers to RDD, to provide intuition for why RDD should be used broadly, and to motivate a deeper reading into the methodology. The article also illustrates an application of RDD to evaluate an intensive supervision program for probationers. RESULT: Application of the RDD, which requires basic knowledge of regressions and some special diagnostic tools, is within the competencies of many criminal justice evaluators. RDD is shown to be an effective strategy to identify the treatment effect in a community corrections agency using supervision that meets the necessary conditions for RDD. CONCLUSION: The article concludes with a critical review of how RDD compares to experimental methods to answer policy questions. The article recommends using RDD to evaluate whether differing levels of control and correction reduce criminal recidivism. It also advocates for routine use of RDD as an administrative tool to determine cut points used to assign offenders into different risk categories based on the offenders' risk scores.


Subject(s)
Criminal Law/methods , Crime/prevention & control , Criminal Law/standards , Criminals/statistics & numerical data , Humans , Program Evaluation , Regression Analysis , Research Design