Why “Rehabilitating” Repeat Criminal Offenders Often Fails

Executive Summary
This report seeks to add much-needed perspective to America’s debate over criminal rehabilitation policies. Crucially, we document what is known and not known about the efficacy of rehabilitation programs in curtailing recidivism. We start by highlighting the psychological challenge of altering criminal behavior. We then review U.S. efforts to develop and deliver rehabilitation programs over the past century. We show how rhetoric has caused treatments to seem more successful than honest data analysis reveals them to be. Nearly all such programs have either failed or been almost impossible to replicate in clinical trials. We then detail how political narratives have bolstered misleading claims about rehabilitation’s successes and highlight the wasteful investments that have ensued, citing Washington State’s juvenile rehabilitation programs as the embodiment of all these issues.
Finally, we propose that policymakers and agency heads first acknowledge how hard it is to change behavior and then make moral and evidence-based decisions about program implementation and investment. We recommend, in particular, that they:
- Avoid adopting rehabilitation programs until blinded, randomized controlled trials that include sufficient sample sizes have been conducted, including at least three years of follow-up data.
- Require program evaluators and scholars to a priori establish concrete, readily quantifiable working definitions of rehabilitation and their effectiveness, success, and failure.
- Favor locally tailored programs with a strong independent evidence base over programs implemented statewide with less evidence of scalable success.
- Endorse treatment programs that retain public safety as a primary goal, and recognize that rehabilitation advocates often have competing, ideologically driven aims.
The Enormous Challenge of Altering Criminal Behavior
For over a century, Americans have searched for ways to rehabilitate criminal offenders. Despite our best efforts, in the aggregate, we have been unable to produce even minuscule reductions in overall recidivism. As influential scholar Robert Martinson soberly concluded decades ago: some programs work for people some of the time. When new rehabilitation treatments emerge and show promise, advocates and officials too quickly react by overpromising and moralizing, before subsequent results fail to confirm early findings.
Though politically unsavory, the fact is that high recidivism rates are the historical norm and have never substantively changed. After a hundred years of theorizing, testing, evaluating, and criticizing, social science has consistently demonstrated that serious criminal behavior remains stubbornly stable over time, situation, and place.[1] Those who commit crimes today will be those who commit crimes tomorrow, and they will be the same people who commit crimes until they are incapacitated by age, infirmity, imprisonment, or death.
Why is it so difficult to change criminal conduct, especially when its consequences are often so grave? A large body of criminological and psychological evidence finds that recidivistic criminals are not accidental, adjacent, or incidental criminals.[2] They share characteristics including hyperaggression, poor self-control, bad decision-making skills, disdain for conventions such as employment and education, social and economic parasitism, entitled attitudes, and manipulative behavior. They see nothing wrong or immoral with their impolite and dangerous behavior. As several qualitative studies of active offenders demonstrate, many report experiencing enjoyment at terrorizing, maiming, and killing others.[3] Criminals have friends and family who are also criminals, normalizing these traits.[4]
Criminal behaviors are habituated and manifest from early callous and unemotional traits in childhood that seamlessly unfold into antisocial personality throughout adulthood.[5] Criminals represent a class of people who are very different from the rest of society. They do not share conventional means, aspirations, and moral values; indeed, they see crime as an important element to their self-identity. Crime earns them status and respect on the street, empowers them to exert influence over others, and enables them to live recklessly nihilistic lives.[6] Criminals are more likely to resist than to earnestly embrace behavioral change. This is why rehabilitation programs characterized by high fidelity and dosage fail more often than not—and when they work, why they do so only in a limited fashion. Changing depressive symptoms in a patient is challenging for psychological practitioners; changing fundamental deficits in personality is a task of a different order.
Certainly, some high-rate criminal offenders change their lives. The literature on desistance from crime tells us, however, that we cannot predict who will terminate their criminal career or when.[7] Some desisters have a deeply religious conversion; others burn out after spending years in and out of jail and prison and suffering the deprivations of a life of crime. Some get a job they value or a spouse they wish to keep;[8] others simply decide to stop offending.
But few, if any, report ending their criminal career because of a correctional rehabilitation program. The hard reality is that a lifestyle of criminal behavior, backed by a strong self-identity reinforced by decades of poor behavioral choices, is very difficult to change. This, we believe, is the key underappreciated and politically intolerable lesson of a century of correctional efforts.
Within this context, it becomes easier to understand why correctional rehabilitation efforts are most often met with failure and why, when a program shows initial positive results, people are so quick to tout its efficacy. Rehabilitation offers the illusion of hope, sprinkled with the occasional story of short-term success.
But this does not mean that policymakers and correctional officials should abandon their rehabilitative efforts. Correctional programming in jails and prisons may confer little benefit to inmates once released, but it does benefit our correctional institutions by creating a pro-social environment, much like prison-based employment and job skills training. Programming keeps inmates busy and might yield positive benefits that affect the safety and security of the institution. There are also societal benefits of programming unrelated to recidivism, such as the significant value that it provides for inmates’ families and loved ones. Indeed, we have a societal duty to offer the opportunity for change, even while criminals are incarcerated.
Competing Correctional Ideals
The American correctional system has always juggled three interrelated but competing ideals: justice, punishment, and rehabilitation. Justice, arguably the most important, encompasses a cardinal social demand that the state pursue retribution on behalf of crime victims proportionate to the harm done. This, in turn, curbs citizens’ tendency to retaliate and perpetuate violence, since it sentences offenders to periods of incarceration, restricts their freedom, or imposes fines, among other legal inconveniences. Indeed, the correctional system is what balances the scales of justice and informs public perceptions of fairness.
The second ideal of the correctional system addresses the individual, holding an offender accountable for criminal behavior by punishment. Punishments range from financial penalties or inconveniences, such as going to see a probation officer or abiding by a curfew, to restriction of movement and liberty, such as house arrest, jail, prison, or death. Our correctional system is the institution that metes out proportional punishment for transgressions.
While our correctional system is influenced by ideals of justice and punishment, it is also tasked with altering or changing the behavior of those who are sanctioned so that they will not continue to commit crimes. Calls for offender rehabilitation justified our earliest jails and prisons. Surveys still show that while the public wants the state to hold offenders accountable and punish them for wrongdoing, it also wants to afford them the opportunity to redeem themselves.[9]
These three themes—justice, punishment, and rehabilitation—intertwine through the landscape of the American correctional system. As historical contexts vary, these shape the goals of the system and how it manages offenders.
Historical Evolution of Criminal Rehabilitation
During the Progressive Era (1890–1930), rehabilitation was introduced as the guiding principle of corrections. This period was characterized by rapid change, and public outlook regarding probation, parole, and juvenile court shifted.
The Progressive Era also saw rehabilitation capture reformist zeal, while the countervailing ideals of punishment and proportionality receded significantly. But importantly, unlike the prison abolitionists of today, Progressive-Era reformers were infatuated with institutions. As noted by Matt DeLisi and John Paul Wright, the number of state prisons increased from 61 to 106 between 1880 and 1950.[10] During the 1940s, the prison population crept down from 200,000 to 178,000, while the number of inmates in jails and workhouses fell from 99,000 to 86,000. Those in state mental hospitals rose slightly, from more than 591,000 to more than 600,000. By 1950, 140,000 juveniles were committed to dedicated facilities. Rehabilitation, reformers argued, was a two-tiered process and could occur only when individuals were isolated from the temptations of crime found on the outside as well as treated by criminal-behavior specialists.
Prisons and jails were transformed to treat the maladies believed to cause criminals to offend, backed by the expertise of the new American social sciences. In this context, each offender was seen as unique, spurring the introduction of individualized treatment plans. Efforts to rehabilitate often manifested in indeterminate sentences, where convicts would serve an indefinite number of years in prison—five to 20, say—until they were deemed “reformed.” Psychologists and psychiatrists were afforded total discretion, controlling when an individual was release-eligible.
This necessarily resulted in offenders convicted of the same crime serving dramatically different amounts of time.
The 1960s birthed many changes, including widespread social unrest. By the mid-1960s, Americans had begun to witness unprecedented, nationwide spikes in crime that would continue for the next three decades. This precipitous increase led to growing critiques of the rehabilitative model as “soft on crime” at the expense of proportional justice. Indeed, criminal sentences at that time were relatively limited, and most offenders were released from prison and jail well before completing their sentences.
In 1967, New York State commenced a study on effective inmate rehabilitation. A team of scholars reviewed intervention studies run from 1945 to 1967 and, using a set of scientific criteria, identified and evaluated 231 studies. Summarizing their eight hundred pages of findings in 1972, they wrote: “On the whole, the evidence from the survey indicated that the present array of correctional treatments has no appreciable effect—positive or negative—on the rates of recidivism of convicted offenders.”[11]
After New York State invested for decades in rehabilitative efforts, this was not the desired determination. The state’s official response was swift: it refused to publish the study, and even prohibited its authors from publishing it independently. But the review’s findings would eventually become public, due to the efforts of a lawyer who brought it in as evidence in a different case.
Writing in 1974 in The Public Interest,[12] Robert Martinson concluded damningly that, “with few and isolated exceptions, the rehabilitative efforts that have been reported so far have had no appreciable effect on recidivism.”
The pushback was severe. Martinson was vilified by rehabilitation advocates, therapists, and criminologists. According to Martinson, “some treatment advocates have been motivated to become kinglike and shoot, or at least shoot down, the messengers. We have been tagged ‘yellow scientists’ (apparently close kin of yellow journalists), pessimists, and idealists in search of magic cure for all offenders all the time.”[13] As the years progressed, Martinson became known for ushering in the “nothing works” era of criminal justice, a moniker undeserved and highly stigmatizing. He paid a devastating professional price and, in 1980, jumped to his death from his ninth-floor apartment in front of his teenage son.
But of course, Martinson’s critics were right to worry about the impact of his team’s study, since ending rehabilitation conflicted with entrenched economic and social interests. By the late 1970s, with violent crime showing no sign of tapering off, the public had wearied of rehabilitation and demanded more stringent responses. His report hit at a time of record-high crime rates, revolving-door justice, rampant drug use, and broadscale social decay. Divestitures in offender rehabilitation meant that programs would close, psychologists and counselors would be out of work, and academic fiefdoms built around program evaluation would end.
Despite these interests, the political Left used Martinson’s study to attack sentencing disparities attributed to the indeterminate sentences resulting from psychologists’ subjective determinations, while the Right used it to argue for reforming sentencing dispositions that they believed undermined public safety and justice. That Martinson never said “nothing works” never mattered; both parties demanded a return to championing the ideals of justice and proportionality, albeit for starkly different reasons. With the Supreme Court’s 1989 decision upholding the Sentencing Reform Act of 1984, the backbone of Progressive-Era rehabilitation, indeterminate sentencing, came to an end.
In reality, several other studies prior to Martinson had reached the same conclusions—namely, that most correctional programs lacked scientific efficacy. Yet, however unfairly, it was Martinson’s study that was blamed for ending the era of progressive rehabilitation and launching a new era of punitiveness. And this set the next stage: progressives had their villain, their cause, and a newfound desire to promote correctional rehabilitation at any cost.
From Advocacy to Entrepreneurship to Abolition
Over 50 years have passed since Martinson’s freighted report. During that time, correctional rehabilitation has been “revived, reaffirmed, and resold,”[14] and has been relabeled the “what works”[15] movement in order to reduce recidivism. According to rehabilitation advocates, Martinson’s “nothing works” dictum served as the intellectual impetus behind an effort to understand why some correctional rehabilitation programs apparently reduced offender recidivism while others did not. A small group of Canadian psychologists and American criminologists joined forces to promote the virtues of the rehabilitative ideal and to prove that science could be harnessed to reform the errant and to reduce imprisonment. At least “some things worked,” they insisted, and began their undertaking.
The “what works” movement sprang from both progressive politics and an earnest desire to improve offenders’ lives. While rehabilitation advocates have retained their almost religious zeal for converting criminals into productive citizens, they have taken a very different approach in their rhetoric and arguments. Prisons and jails are no longer therapeutic centers where inmates are evaluated, their maladies discovered and treated, and their release monitored. Some of today’s rehabilitation advocates are fully committed prison abolitionists radically opposed to offender punishment and stigmatization.[16] Anything resembling punishment, such as intensive supervision of inmates, is immediately scrutinized and lambasted. Even the terms “inmate” and “offender” are problematic for them, and they impugn criticism of rehabilitation as “anti-science.”
Setting up a choice between “moral” policies that promote rehabilitation and “immoral” policies that favor arrest and incarceration, today’s advocates divide criminal-justice policy into a false dichotomy rife with virtue signaling. The intention to help is what matters to these advocates, not actually helping; the effort to rehabilitate prevails, rather than actual rehabilitation. Support for offender rehabilitation indicates an ideological and political orientation that imbues itself with goodwill and casts its critics with evil intent. “This presumption against punishment as a moral value,” wrote criminologists Charles Logan and Gerald Gaes, “is a subtext running through much of the new literature on reviving or reaffirming rehabilitation.”[17]
This false dichotomy has been adopted as progressive dogma, pushed by intellectual radicals and skewed media reporting, and has gained dominance by slowly delegitimizing and dismantling the justice system. The narrative has caused rollbacks in preventive detention and bail, police restrictions and divestment, and constant attacks on the “prison industrial complex” (or the newer moniker of the “criminal legal system”).
Whereas rehabilitation attempts were once housed within institutions, progressives now seek to supplant incarceration with treatment. This embrace of the therapeutic state also serves other political aims, mirroring previous calls to deinstitutionalize the mentally ill, when advocates overpromised and oversold their ability to change behavior. Instead of downsizing expectations, deinstitutionalization supporters pushed to eliminate the institutions themselves, which at least had kept very sick and sometimes aggressive people housed and safe.
Downwardly Defining Rehabilitation Success
Organized skepticism challenges ideology-based arguments, forcing scholars to countenance that neither theories nor the observations on which they are based are sacrosanct.[18] Proponents of correctional rehabilitation are acutely aware of this and have convincingly used it to root out dogmatic ways of thinking.[19] To their credit, they have spent considerable effort in evaluating programs and interventions, such as boot camps,[20] intensive supervision,[21] and programs such as Scared Straight,[22] often with an eye toward debunking state efforts to punish offenders. Indeed, rehabilitation scholars should be commended for scrutinizing and promoting replication-based research on “what works” to reduce recidivism. Their position is seemingly buttressed by the scores of empirical assessments collectively suggesting that rehabilitative approaches can reduce offending.
Despite decades of studies showing that correctional rehabilitation typically failed to produce positive results, the “what works” movement adopted a powerful new method that obscures this disappointing reality: the meta-analysis. Based on the affirming studies included in these large-scale reviews, the field of correctional rehabilitation appears to practice organized skepticism regarding its theoretical and empirical foundation. However, a deep reading of the literature reveals the appearance of objectivity to be deceiving.
A meta-analysis measures the large-scale difference between “control” and experimental groups by aggregating the results of many studies into a single metric. Theoretically, this provides researchers an overall estimate of how well interventions work or don’t work by allowing simultaneous evaluation of dozens, if not hundreds, of studies from diverse fields that employ disparate methodologies. This approach supposedly screens out researcher bias because, whereas prior evidence reviews were narratives provided by authors, meta-analysis produces concrete, quantitative estimates of program effects.
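To make the aggregation step concrete, the sketch below pools a few study-level effect sizes into a single number using inverse-variance (fixed-effect) weighting, one common pooling rule. The three effect sizes and variances are hypothetical, and the reviews discussed in this report may have relied on other pooling methods.

```python
import math


def pooled_effect(effects, variances):
    """Fixed-effect pooling: weight each study by 1 / variance."""
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return pooled, standard_error


# Hypothetical inputs (not taken from any cited study): one small, noisy
# study with a large effect and two larger studies with small effects.
study_effects = [0.30, 0.05, 0.10]
study_variances = [0.04, 0.01, 0.0025]

pooled, standard_error = pooled_effect(study_effects, study_variances)
print(f"pooled effect = {pooled:.2f}, standard error = {standard_error:.3f}")
```

In this illustration the precise studies dominate the pooled estimate, so the single number that emerges depends heavily on which studies are included and how precisely each was measured.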
Indeed, meta-analytic reviews, if undertaken with great nuance and attention to the quality of the studies being aggregated and their outcomes, can afford important insights. However, results require intensely critical interpretation and careful translation into practice. As will be later illustrated in the Washington State case study, this approach did not provide objectivity or clarity in assessing correctional rehabilitation.
The first sets of meta-analyses of correctional rehabilitation programs produced eye-popping indications of program effectiveness. Not only did some programs reportedly reduce recidivism; most did—and some with remarkable effects.[23] Reductions of 50%–60% in reoffending were reported, and many also reported extraordinary reductions of 20%–30%. Martinson, it appeared, was not only a villain; he was also scientifically wrong. With new evidence in hand, rehabilitation advocates renewed their efforts to reintroduce correctional programming that appeared to show tremendous success. Intervention programs spanning nearly every part of the adult and juvenile justice system were developed, employed, and tested, often with great fanfare.
Truly, the advent of the meta-analysis helped rapidly rewrite the annals of correctional programming, and its use increased exponentially. Dozens of meta-analytic reviews now assess a diverse array of correctional treatment programs for reducing recidivism. There are also meta-analyses of meta-analyses of correctional intervention. These studies converge to show that, overall, correctional treatment reduces recidivism by about 10%. Some programs reduce recidivism by 20%–30%, while many have no effect, and some are correlated with increases in recidivism, known as iatrogenic effects. From these studies, and an average effect size of 0.10, rehabilitation advocates have readily embraced the success of correctional treatment—often arguing that treatment is superior to incarceration.
But the details of meta-analytic reviews and effect sizes—the measure of their success—tell a story not of widespread success but of an institutional effort to downwardly define what qualifies as “effective” rehabilitation.
Rehabilitation, commonly understood, translates to the cessation of offending and the embrace of other pro-social roles and institutions consistently over time. Perhaps the offender quit crime and became a productive member of society. Perhaps he became and stayed gainfully employed, found a caring romantic relationship, and earned a GED or high school diploma.
Rehabilitation means something far narrower, however, for scholars and advocates. By their definition, a program reduces recidivism not because it caused an individual to terminate offending. Rather, they mean that recidivism in a treatment group, typically measured as a rearrest or a reconviction, was statistically lower than recidivism in a control group. For example, suppose 75% of a control (untreated) group was rearrested over a two-year follow-up, compared with 65% of the treatment group. This 10-percentage-point difference between the two groups would be evidence that the treatment program “effectively” reduced recidivism. However, a large majority of each group was still rearrested over the follow-up period. Thus, a program can still be labeled “effective” even when the majority of those treated continue to commit crimes.
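The arithmetic in this example can be sketched in a few lines. The function name is illustrative only; the 75% and 65% rates are the hypothetical figures from the paragraph above.

```python
def compare_rearrest_rates(control_rate, treated_rate):
    """Return the percentage-point gap and the relative reduction."""
    absolute_gap = control_rate - treated_rate
    relative_reduction = absolute_gap / control_rate
    return absolute_gap, relative_reduction


# Hypothetical rates from the example: 75% of controls and 65% of treated
# individuals rearrested over a two-year follow-up.
absolute_gap, relative_reduction = compare_rearrest_rates(0.75, 0.65)
majority_still_rearrested = 0.65 > 0.50

print(f"gap = {absolute_gap * 100:.0f} percentage points")
print(f"relative reduction = {relative_reduction * 100:.1f}%")
print(f"majority of treated group still rearrested: {majority_still_rearrested}")
```

The gap is 10 percentage points, a relative reduction of about 13%, while roughly two-thirds of the treated group is still rearrested.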
The more substantive issue involves the policy relevance of effect sizes. This is where statistical significance and policy relevance clash and where definitions come into play. Problematically, meta-analytic reviews make the ungrounded assumption that the control or untreated group will have a 50% recidivism rate. In reality, recidivism rates are a function of time, with about 35% of people rearrested within the first year and 85% rearrested over the next nine years.[24] If the actual rate of recidivism departs from meta-analytic assumptions of 50%, effect size estimates break down.
Meta-analytic studies report several different quantitative estimates of program effectiveness, or “effect sizes,” including odds ratios and mean effect sizes. Of course, any quantitative estimate of effect size is associated with specific biases and limitations. Odds ratios, for example, which are widely reported in the research literature, can present highly distorted, upwardly biased estimates of program effectiveness. In that same vein, research has found that lower-quality studies produce substantively higher estimates of program effectiveness, with one study rating 84% of intervention studies as “weak” methodologically and the remaining 16% as “moderate” in quality. None were rated as “strong.”[25]
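A short sketch shows how an odds ratio can overstate a program's apparent effect when the outcome is common. It reuses the hypothetical 75% and 65% rearrest rates from the earlier example; the function names are illustrative and not drawn from any cited study.

```python
def risk_ratio(treated_rate, control_rate):
    """Ratio of rearrest probabilities (treated / control)."""
    return treated_rate / control_rate


def odds_ratio(treated_rate, control_rate):
    """Ratio of rearrest odds, where odds = p / (1 - p)."""
    treated_odds = treated_rate / (1.0 - treated_rate)
    control_odds = control_rate / (1.0 - control_rate)
    return treated_odds / control_odds


# Hypothetical rates: 65% of treated and 75% of controls rearrested.
rr = risk_ratio(0.65, 0.75)
or_ = odds_ratio(0.65, 0.75)

print(f"risk ratio = {rr:.2f}  (about {(1 - rr) * 100:.0f}% lower risk)")
print(f"odds ratio = {or_:.2f}  (about {(1 - or_) * 100:.0f}% lower odds)")
```

With these inputs the risk of rearrest is about 13% lower in the treated group, but the odds are about 38% lower. Reading the odds ratio as if it were a reduction in risk would make the same program look roughly three times as effective.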
With these caveats in mind, how do we interpret an effect size of 0.10, which is heralded by advocates as meaningful and policy-relevant? Is this sufficient evidence to widely embrace rehabilitation? To start, standard statistical thresholds consider effect sizes of about 0.20 to be small, about 0.50 to be moderate, and 0.80 and higher to be large. Yet many of the best, “most effective” correctional treatments generate effect sizes below 0.20, so an average effect of 0.10 is considered very small. When rehabilitation advocates say that a program reduces recidivism by 10%, they are saying that, compared with the control group’s assumed recidivism rate of 50%, the treated group’s recidivism rate was 10% lower in relative terms. Translated differently, 45% of the treatment group were rearrested, compared with 50% of the controls: only a 5-percentage-point difference.
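The conversion from a "10% reduction" to percentage points can be sketched as follows. The 50% base rate is the assumption attributed to meta-analytic reviews above; the 35% and 85% rows are added only to illustrate how the same relative claim translates under the other rearrest rates mentioned earlier, and the function name is hypothetical.

```python
def treated_rate_from_relative_reduction(control_rate, relative_reduction):
    """Rearrest rate in the treated group implied by a relative reduction."""
    return control_rate * (1.0 - relative_reduction)


RELATIVE_REDUCTION = 0.10  # the "10%" headline figure

# 0.50 is the assumed control rate; 0.35 and 0.85 are illustrative alternatives.
gaps = {}
for control_rate in (0.35, 0.50, 0.85):
    treated = treated_rate_from_relative_reduction(control_rate, RELATIVE_REDUCTION)
    gaps[control_rate] = control_rate - treated
    print(
        f"control {control_rate * 100:.0f}% -> treated {treated * 100:.1f}% "
        f"(gap {gaps[control_rate] * 100:.1f} percentage points)"
    )
```

At the assumed 50% base rate, the treated group's rate is 45%, a gap of 5 percentage points.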
Context matters because small effects could have large and important social effects. Advocates point to other small effects linked to specific medical interventions, such as aspirin use to prevent heart attacks (effect size = 0.02), as an instance where small-effect interventions are important to consider.[26] But in reality, in these different contexts, effect sizes can mean very different things. Aspirin has huge potential upsides—preventing heart attacks, major surgeries, or death—and few downsides, as it is inexpensive and holds a small range of minor side effects. Because the costs are so low and the risk of a heart attack is so severe, a low dose can be recommended even if very few individuals will be spared a heart attack. Critically, the same cannot be said of small effects found in the criminal rehabilitation research, where the outcome measured is typically an arrest for another crime.
Research blithely defines recidivism as a rearrest over the follow-up period, at which point a program is deemed failed. But studies show that most crimes, even those committed by high-rate offenders, do not result in arrest. For example, offenders may have simply avoided arrest although they committed dozens of crimes. These offenders would be labeled a treatment success—as would individuals who were too sick or injured to offend, or who moved out of state and were no longer on the particular criminal registries checked by researchers. The small effect of intervention studies is thus likely an inflated estimate of program effectiveness.
To place effect sizes in context, we point to Figure 1, where we plotted effect sizes found in the literature for correctional interventions alongside those for police strategies to apprehend active offenders, psychotherapy for the treatment of depression, and two medically related assessments. Compared with the effect sizes for juvenile and adult rehabilitation, those for police interventions are substantially larger, as are those for talk therapy for depression.

In Figure 2, we show graphically what the respective effect sizes correspond to for treatment and control groups.

Table 1
Understanding Effect Sizes
Intervention | % Overlap | Probability of Better Outcome (%) | Number Needed to Treat
Correctional Treatment | 96 | 52.8 | 25.11
Focused Deterrence | 85 | 60.6 | 6.59
Depression Treatment | 72 | 69.5 | 3.78
Dementia Testing | 34 | 91 | 2.12
Table 1 shows other measures that give a clearer picture of effect size differences. Looking first at the line on correctional treatment, an effect size of 0.10 means that the two groups (experimental and control) are almost indistinguishable. There is 96% overlap between the two groups, and chances are only 52.8% that a treated individual selected randomly would have a better outcome than one from the control group. Moreover, about 25 people would have to successfully graduate from treatment to prevent a single rearrest. Contrast these findings with the effect size for police-focused deterrence strategies. There is 85% overlap between the two groups, and chances are 61% that a person selected randomly from the experimental group would have a better outcome than one from the control group. Moreover, it would take only six or seven arrests under focused deterrence to achieve the crime reduction that requires treating 25 people in rehabilitation.
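The three measures in Table 1 can be approximated from a standardized effect size with textbook conversions. The sketch below assumes normally distributed outcomes with equal variance in both groups and a 50% control-group rearrest rate. These are our illustrative assumptions, and the function names are hypothetical rather than taken from the sources behind the table.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def overlap(d):
    """Share of the two outcome distributions that overlaps (0 to 1)."""
    return 2.0 * STD_NORMAL.cdf(-abs(d) / 2.0)


def probability_of_better_outcome(d):
    """Chance a random treated person outperforms a random control."""
    return STD_NORMAL.cdf(d / math.sqrt(2.0))


def number_needed_to_treat(d, control_rate=0.5):
    """People treated per rearrest prevented, given the control rearrest rate."""
    treated_rate = STD_NORMAL.cdf(STD_NORMAL.inv_cdf(control_rate) - d)
    return 1.0 / (control_rate - treated_rate)


d = 0.10  # average correctional treatment effect size cited in the text
print(f"overlap = {overlap(d) * 100:.0f}%")
print(f"probability of better outcome = {probability_of_better_outcome(d) * 100:.1f}%")
print(f"number needed to treat = {number_needed_to_treat(d):.2f}")
```

Under these assumptions, an effect size of 0.10 yields roughly 96% overlap, a 52.8% chance of a better outcome, and about 25 people treated per rearrest prevented, in line with the correctional treatment row. Changing `control_rate` shows how sensitive the last figure is to the assumed base rate.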
By any conventional standard, overall correctional treatment effects are very small and likely not clinically or socially meaningful. Perhaps they could be, if the same body of research examined simultaneously other indicators of pro-social adaptation, such as gaining and keeping employment or paying child support. Unfortunately, program success has been so narrowly defined that other outcomes, particularly in meta-analyses, escape assessment. Rehabilitation programs, especially those focused on cognitive-behavioral therapy, should generate a range of positive spillover effects. They should mean more than not being arrested one more time, and other indicators of success should be readily visible.
Thus far, we’ve assumed that the various meta-analytic reviews of the correctional rehabilitation literature arrive at similar conclusions and that they converge to show a small positive effect of programming. While this is generally true, there are important exceptions.
In a recent study of prison-based corrections programs, Gabrielle Beaudry and her coauthors examined randomized controlled trials (RCTs) of psychological interventions administered to adults and juveniles during incarceration. Their meta-analysis, published in The Lancet Psychiatry, identified 29 RCTs from seven countries that included 9,443 individuals.[27] Beaudry and her colleagues found a significant net reduction in recidivism linked to correctional treatment. The effect size of 0.18 was consistent with prior studies.
However, prior research has found that small sample sizes in RCTs produce upwardly biased estimates of effectiveness. When the sample was restricted to RCTs that contained at least 50 people, the positive results vanished (effect size = 0.07). Moreover, the authors found no evidence that the vaunted cognitive behavioral therapy had any impact on recidivism. They also noted that their results diverged from a prior systematic review that had reported 20%–30% reductions in recidivism. They attributed this difference to their inclusion of only RCTs, the exclusion of less methodologically rigorous studies and small-scale studies, and publication biases.
Similar observations were made most recently by Seena Fazel and colleagues in their assessment of the evidence base in support of the Risk-Need-Responsivity (RNR) model, the linchpin for proponents of correctional rehabilitation. Their review, based on 26 unique meta-analyses examining RNR principles, concluded: (1) the underlying evidence for RNR principles is mixed and mostly of low quality; (2) this evidence shows authorship bias and often lacks transparency (a point to which we return below); (3) higher-quality research is needed to support the claims about RNR principles.[28] At any rate, the most recent rigorous systematic and meta-analytic reviews of prison- and jail-based treatment found correctional rehabilitation programming unrelated to recidivism: a Martinson redux.
Indeed, meta-analyses are highly dependent on which studies are aggregated and generally do not include all studies or studies of uniformly high quality. In the world of research, there is a hierarchy of research design structures, with RCTs that rely on independent evaluators producing the most reliable results. Beneath RCTs are quasi-experimental studies, which take a control group and try statistically to match, for instance, people from a probation sample with those from an incarcerated sample. Quasi-experimental designs can be useful but cannot ferret out causal information because added “noise” and variability muddy results.
The vast majority of studies used in correctional rehabilitation analyses, to their detriment, are not RCTs. They are based either on correlational/observational data or on the aforementioned quasi-experimental approach.
Further, researcher biases in coding decisions matter and can sharply skew results.[29] The accuracy and validity of meta-analytic results often depend on the coding of individual studies that are run through the regression models. These coding decisions are muddied by the enormous variety in the study designs that the analyses aggregate. For instance, some studies measure rearrest over 12 months, and others measure over 18 or 30 months. Some track convictions rather than rearrest. Some studies sample only individuals on probation, others in prison, and others mixed. Some studies track whether participants complete programs, while others record the number of hours attended, and still others measure inmates’ degree of program engagement.
Most often, meta-analyses will simply code for the broadest categories because it is too difficult to code for all the different types of measures. But aggregating studies with varying methodologies, measurements, samples, sample sizes, and lengths of follow-up is comparing apples to oranges.
States and the federal government have invested heavily in correctional interventions. Many departments of correction have broadened their evaluation efforts and now regularly train staff on “best practices” and “what works” in offender change. Since the mid-1990s, millions of tax dollars have been spent on research, program development, implementation, staff and counselor training, and aftercare. One rehabilitation provider, MST Services, had received over $55 million by 2004 to develop and test the efficacy of its Multisystemic Therapy program. Indeed, it received about $900 (in current dollars) per youth and treated thousands of youths across the U.S. and Chile.[30] To greet this inflow of resources, a cottage industry was built around modifying offender behavior through “evidence-based” practices. Where did this evidence originate? Most often, it came from those paid to produce it. Obviously, having program developers also evaluate the effectiveness of their programs would seem to be a conflict of interest. Not so much in the grant-driven world of rehabilitation funding, however. This may be why, in a study of 300 RCTs, Anthony Petrosino and Haluk Soydan found that effect sizes produced by the developer of the intervention programs aimed at reducing recidivism were substantially larger than those produced by independent scholars—which often showed no effect.[31] For some scholars and program developers, advocacy has crossed over to entrepreneurship, and it is easy to understand why.
Washington’s CJAA: Reinforcing Bad Program Investment with Misleading Reporting
It is very hard to study a concept as nebulous as rehabilitation. For starters, researchers must assign and track the progress of otherwise recalcitrant offenders. Rates of attrition are troublingly high, and it takes a significant and concerted effort to maintain program fidelity. Furthermore, comparison groups aren’t always clear, and selection factors almost always pose a threat to a study’s validity. Even the highest-quality studies are, for the most part, very difficult to replicate. Research suggests that initial studies tend to yield the largest effect sizes, which then shrink rapidly or disappear in follow-up studies.
Faulty landmark evaluations that nonetheless document large effect sizes and statistical significance (worryingly often, as in the MST example above) frequently feature a team member—usually the first author or principal investigator—with a vested interest in a particular set of results.
All these issues were at play in Washington State’s 1997 Community Juvenile Accountability Act (CJAA), which enacted sweeping reforms to its juvenile justice system (Figure 3). CJAA was, in fact, the nation’s first experiment in statewide adoption and mandatory use of “evidence-based” programs. To identify programs for adoption, the Washington State Institute for Public Policy (WSIPP) reviewed the scientific literature but was guided more broadly by the University of Colorado’s Center for the Study and Prevention of Violence and its repositories of “blueprint programs.”[32] These programs were deemed to have met strict scientific standards and to be effective at reducing recidivism, although they had been examined only in small-scale studies and were rife with many of the problems delineated above.

Four such “blueprint” programs were selected: Functional Family Therapy (FFT), Aggression Replacement Therapy (ART), Coordination of Services (COS), and Multisystemic Therapy (MST). State legislators eliminated funding for almost all other intervention programs, arguing that they didn’t have sufficient evidence of effectiveness. Washington’s 33 juvenile courts were then mandated to select and implement one of the four blueprint programs. At the time of implementation, all these programs had published studies documenting their effectiveness, and some included RCTs. A total of 14 juvenile courts opted to use FFT, 26 courts selected ART, 1 court selected COS, and 3 courts embraced MST. Implementation began in January 1999,[33] and the first study[34] of results was completed in 2002.
This initial report of program effectiveness triggered widespread fanfare, as results seemingly confirmed the efficacy of Washington’s approach. For example, WSIPP found that program participation was associated with an overall 10% reduction in juvenile recidivism across the state. It reported a 38% reduction in felony recidivism linked to FFT; ART reduced felony recidivism 24%; and COS reduced felony recidivism an incredible 59%. MST was associated with a 10-percentage-point increase in recidivism—a fact, according to the authors, produced by implementation problems.
A more critical examination reveals that the report’s findings were based on faulty research and reporting decisions. For example, the report breaks down and highlights recidivism rates by therapist, labeling individual therapists “competent” or “not competent.” Competence, however, was not measured systematically but was determined by the subjective opinion of staff members. And they determined, troublingly, that over 50% of FFT therapists and 29% of the 704 ART therapists were not competent.
How could such a large percentage of deficient therapists achieve the purported 38% reduction in felony recidivism? That figure was generated by tallying only the “competent” therapists.
Indeed, those labeled “not competent” were associated with a significant 17% increase in recidivism. When considering outcomes for all the FFT program participants, regardless of therapist competence, treated youth had an 18-month felony recidivism rate of 24.2%—comparable with the control group’s 27% rate of felony reoffending. And this meager 2.8-percentage-point difference was then framed, in order to sound larger, as a “10% reduction” in reoffending.
But no matter how euphemistically it is worded, this difference was not statistically significant: the two groups’ averages quite possibly differed only by chance. Perhaps more telling, there was no difference between the FFT and control groups on overall arrests (for misdemeanors and felonies) or in violent felonies. In other words, FFT did not reduce reoffending.
This same analytical shell game distorted the ART report findings. Again, subjectively delineated “competent” therapists were associated with a significant reduction in felony recidivism (24%), while “noncompetent” therapists were associated with a 7% increase in recidivism. Even the significant reduction in felony recidivism was illusory, produced by using an extremely liberal statistical criterion never used in published research. Moreover, when incorporating all therapists’ cases, there was no difference in overall arrests or in violent felony arrests. ART did not reduce reoffending.
And that COS reduction of 59% in felony arrests? Another illusion. COS treated youth they deemed very unlikely to reoffend—and that showed in their data. Only 3.3% of the control group and 1.4% of the COS group were rearrested for violent felonies (for a 57.6% reduction).
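The gap between percentage-point differences and percentage reductions in the FFT and COS figures above is simple arithmetic. The following minimal sketch (the function name `risk_reduction` is ours, not WSIPP's) computes both measures from the rates quoted in the text.

```python
def risk_reduction(control_rate, treated_rate):
    """Return (absolute reduction in percentage points, relative reduction in percent)."""
    absolute = control_rate - treated_rate
    relative = absolute / control_rate * 100
    return absolute, relative


# Rates quoted in the text, in percent.
fft_abs, fft_rel = risk_reduction(control_rate=27.0, treated_rate=24.2)
cos_abs, cos_rel = risk_reduction(control_rate=3.3, treated_rate=1.4)

print(f"FFT: {fft_abs:.1f} points absolute, {fft_rel:.1f}% relative")
print(f"COS: {cos_abs:.1f} points absolute, {cos_rel:.1f}% relative")
```

The lower the control group's base rate, the more a small absolute difference is magnified when expressed as a relative reduction: a gap of under two percentage points for COS becomes a reduction of more than half.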
In 2017, WSIPP officials released another study evaluating MST, ART, and FFT. The report found that, proportionately, FFT produced one “therapeutic” effect to 188 “iatrogenic” effects, where the treatment actually increased recidivism. Similarly, for ART, researchers found one therapeutic effect to 124 iatrogenic effects; for MST, they reported six therapeutic effects to 125 iatrogenic effects. In simple terms, juveniles participating in these “evidence-based” programs were more likely to reoffend than had they not participated at all.
This was also true in 2023, when WSIPP found that juveniles in FFT fared significantly worse than their untreated counterparts.[35] FFT youth recidivated at a rate 23% greater than youth waiting for treatment. And remember the 2003 report that allegedly showed the critical importance of therapist competence? No such relationship was detected in 2023.
Bigger-Picture Policy Mistakes
What can be learned from Washington’s move to statewide juvenile rehabilitation? First, state officials committed time and substantial financial resources to select and implement only evidence-based programs. Unlike other efforts at correctional rehabilitation, these were neither piecemeal nor underfunded. Moreover, jurisdictions had a range of evidence-based programs to pick from, and they implemented these with the aid of respective program representatives. The effort was substantial, guided by experts, and rooted in best practices. Thus, the first lesson is that devoting time, effort, resources, and research does not guarantee that rehabilitation programs will reduce recidivism. Changing behavior, especially criminal behavior, is very difficult.
Second, initial and early reports of program success are often reversed in follow-up studies showing no effect or worse effects. For example, a single non-peer-reviewed study on Georgia’s boot camps for juvenile offenders in the 1990s led to widespread implementation of boot camps across several states.[36] But not only were the results of that study never again replicated; further assessment found that participants were more likely to recidivate. Similarly, the initial 2003 WSIPP report—which allegedly demonstrated meaningful reductions in juvenile recidivism and cost savings that were truly unbelievable—was touted as evidence of the wisdom of Washington State officials. The report is still heralded in the academic literature as justification for statewide implementation of evidence-based programming.[37]
Either the initial WSIPP (2003) study was wrong or biased, or the positive effects evaporated over time. Regardless, initial studies are often not confirmed by follow-up studies, especially when done by independent evaluators. This point was articulated by Peter Rossi in 1983, in what he termed “The Iron Law of Evaluation.”[38] The “Iron Law” stipulates that any evaluation of a public social program is likely to find it ineffective or marginally effective. Stated differently, we should expect evaluation results to show no effect or only a minor, perhaps defensible, effect. Initial findings showing modest to large positive effects should be viewed critically and with skepticism. “The findings of the majority of evaluations purporting to be impact assessments,” Rossi would later state, in 2003, “are not credible.”
Third, while there are benefits to using what WSIPP authors considered “evidence-based” correctional programs, their implementation is no guarantee that they will work in any given jurisdiction. As mentioned, implementing a program with therapeutic integrity and offering those services with consistent quality is very difficult. The process is often fraught with sticky personnel issues, policy and legal hurdles, logistical challenges, and clientele who are often antisocial, hostile, and less than committed to change. Furthermore, the quality of scientific evidence behind an “evidence-based” designation is too often questionable. Few “evidence-based” programs, for example, have been evaluated by independent assessors using RCTs with sufficiently lengthy follow-up periods. Even those that have, such as MST, have shown widely varying results.
The Subservience of Science to Anti-Carceral Ideology
Science has long been used to support ideology, and the story of correctional rehabilitation would not be complete without addressing the intersection between progressive morality, science, and political advocacy.
Whether, and under what circumstances, we can effectively alter criminal behavior through therapy is a matter of science. But the validity of science is undone by powerful pro-rehabilitation or anti-punishment narratives. When science is used to justify an ostensibly political agenda, replete with liberal underdogs and sinister conservatives, it is further discredited. To ignore the visible political threads running through much of the literature on offender rehabilitation is to accept any level of evidence as sufficient proof. An absence of evidence would not change a rehabilitation advocate’s mind.
Notwithstanding hype and moralizing, important insights have emerged about reducing criminal behavior. Hundreds of studies, many with conflicting findings, have been undertaken to understand offender change and how to implement programs with some modicum of evidence. It would not be fair to say that “nothing works,” as there is reliable evidence that some programs do work for some people some of the time. Yet it might also be fair to say that despite our best efforts, our ability to compel or incentivize criminals to improve their lives remains very limited and may not be worth the effort.
This is especially important given the costs and challenges of managing our criminal-justice system while honoring our social contract’s balance between freedom and social order. Citizens rightfully expect institutions to provide security and protection from interpersonal and other harms, while not abusing this role or overstepping their powers. As John Paul Wright has noted, establishing equilibrium in the administration of justice requires trade-offs.[39] For advocates of rehabilitation, who today bear a closer resemblance to prison abolitionists, this includes aggressive calls to wholly replace incarceration with less punitive dispositions.
Indeed, the anti-incarceration narrative and its corresponding rehabilitative ideals extend deep into the policy arena. It has garnered widespread support among politicians anxious to placate a growing number of agitated constituents and is fueled by powerful—yet fundamentally misguided—presuppositions about the American correctional system, including how it works and whom it affects.[40] Chief among them is that the U.S. criminal-justice system currently practices and promotes mass incarceration to the detriment of the disenfranchised. This sentiment is so prevalent that it persists in the wake of precipitous increases in crime nationwide, including violent crime, and when juxtaposed against data showing significant drops in the correctional population over time. Recent reports from the Bureau of Justice Statistics (BJS) and the Federal Bureau of Investigation (FBI) are especially telling. Collectively, they suggest that aggregate-level decreases in the state and federal prison population are linked to aggregate-level increases in crime, especially homicide.
Rising Crime Has Accompanied Decarceration
These reports also comport with several observations made in the academic literature—especially the seminal work of William Spelman, who demonstrated that prison populations necessarily and inversely correspond with crime rates.[41] Put simply, decreases in the prison population are associated with significant increases in crime (and vice versa)—a fact that is squarely, but needlessly, at odds with the position of rehabilitation proponents. In fact, according to economist Steven Levitt, increasing prison populations during the 1980s and 1990s was one of four major drivers of virtually all the crime decline.[42] The others were that era’s increase in the number of police, the waning crack epidemic, and the delayed impact of the 1973 Roe v. Wade decision, which legalized abortion in all 50 states and thereby theoretically reduced the number of unwanted children who would commit crimes as teenagers in the 1990s.
Consider that the U.S. correctional population has decreased by about 25% over the last decade, from approximately 1.6 million inmates in 2011 to approximately 1.2 million in 2021.[43] Besides the drop in overall inmate numbers, the country’s rate of incarceration similarly fell: at the end of 2021, there were 350 sentenced prisoners per 100,000 U.S. residents, a 2% decrease from 2020 and 29% decrease from 2011. In contrast, long-term trends documented by the FBI indicate a significant annual increase in homicides beginning in 2014, when the U.S. witnessed more than 14,000 murders. Homicides peaked at approximately 22,500 in 2020—the highest number of murder victims in nearly 30 years.
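The percentage changes implied by these figures can be checked directly. This short sketch uses the rounded values quoted above (the text gives "approximately" and "more than," so the results are approximate); the helper name `pct_change` is illustrative.

```python
def pct_change(start, end):
    """Percent change from start to end (negative means a decline)."""
    return (end - start) / start * 100


# Rounded figures quoted in the text.
prison_change = pct_change(1_600_000, 1_200_000)  # inmates, 2011 to 2021
homicide_change = pct_change(14_000, 22_500)      # murders, 2014 to 2020

print(f"prison population: {prison_change:+.1f}%")
print(f"homicides: {homicide_change:+.1f}%")
```

On these rounded inputs, a decline of about one-quarter in the prison population sits alongside an increase of roughly 60% in homicides between 2014 and 2020.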
Myth: It Is Easy to Wind Up in Prison
Progressives continue advocating for mass decarceration even when confronted with alarming increases in violence, the brunt of it borne by the marginalized communities they seek to help. Their commitment to decarceration is driven by at least two erroneous beliefs: that it is too easy to incarcerate someone in the U.S.; and, vehemently promulgated within academia, that the majority of offenders have been incarcerated for trivial or nonserious offenses and therefore their needs would be better served in the community than in an institution. In fact, it takes a great deal of effort to wind up in prison. Yet this reality is largely absent in the discourse of rehabilitation advocates and prison abolitionists. Their misguided sentiment both romanticizes the arrests of offenders and downplays the harm done to their communities that triggered their arrests.
The “life course” theory in criminology provides a useful adage about the nature of offending: past behavior predicts future behavior. The path to prison, for the overwhelming majority of offenders, is characterized by long-term/stable, frequent, and serious criminality. Put differently, many offenders actively work toward their incarceration. Although estimates vary depending on the source, BJS estimates—which leverage data from the National Corrections Reporting Program—show that the median number of arrests prior to incarceration is nine.[44] By the same token, these same data indicate that nearly 20% of offenders admitted to state prison have at least five prior arrests—the scientific criterion for career/habitual criminality—and approximately 25% have criminal careers spanning over two decades.
Although criminal histories vary widely among offenders, those who end up in prison share a trait: the generality of deviance. That is, criminals serving prison sentences tend not to have specialized in any one type of crime, despite popular belief among laypersons and many academics. As mentioned, one of the most heavily cited arguments by proponents of the rehabilitative ideal and decarceration movement is that most offenders are serving lengthy prison sentences for low-level drug offenses. This position has also been used to imply that excessive incarceration results in racial disparities in prison and a miscarriage of justice. Yet as criminologist Barry Latzer’s historical account of the rise and fall of violent crime in the U.S. shows, the overwhelming majority of inmates, irrespective of race or ethnicity, are serving time for serious, violent offenses.[45] In fact, his analyses show that if all drug offenders were removed from state prisons, the black inmate population would remain virtually unchanged proportionally. Given these trends, the accompanying research on the intractability of criminality, and the Herculean efforts made to get arrested by offenders who end up in prison, it is instructive to provide a sense of the fruitfulness and potential of correctional rehabilitation.
Abolition Creep and the Fallacies Behind Decarceration
Rehabilitation advocates view correctional institutions with disdain. Some wish to convert jails and prisons into therapy centers—as California governor Gavin Newsom intends to do with the San Quentin State Prison—while others want to abolish them altogether. Importantly, it is becoming increasingly difficult to distinguish between advocates and abolitionists. Are proponents of rehabilitation really interested in promoting best practices to establish “what works,” or has rehabilitation become a modern-day Trojan horse for abolitionists of the criminal-justice system writ large?
Regarding the latter, the hypothetical “pros” of widespread decarceration under the guise of rehabilitation, Rafael Mangual notes, would be greatly outweighed by its “cons,” especially as it pertains to promoting public safety among those deemed most vulnerable.[46] The communities to which most former inmates return are typically poorly resourced ones whose residents are not readily able to cope with an influx of high-risk individuals. One of the most sobering lessons learned from the mass deinstitutionalization of mentally ill patients in the 1970s was that many families actually opted to commit afflicted relatives to those institutions simply because they lacked the skill, expertise, or patience to take care of them. The overwhelming majority of those released from mental hospitals during that period did not evince patterns of career criminality; yet their communities still struggled to absorb the influx.
Estimates vary but suggest that 45%–70% of the state prison population meet the clinical threshold for antisocial personality disorder (ASPD).[47] As noted earlier, these personality traits are stable, enduring, and highly resistant to change. Progressives ignore this fact in favor of the more palatable presupposition that offenders, had they not been sent to prison, would otherwise be positively contributing to their communities. Yet as the literature also demonstrates, the path to prison is hard-earned and characterized by a trail of physical damage and emotional grief.
The Personification of the Dangers of Under-Sentencing: Darrell Brooks
Darrell Brooks drove his red SUV into a 2021 Christmas parade route in Waukesha, Wisconsin, and subsequently received six consecutive life sentences, plus an additional 700 years in prison, after a jury found him guilty of all 76 charges leveled against him.[48] These included six counts of first-degree homicide. Brooks’s criminal history prior to that incident spanned over two decades and raises the question of why he wasn’t already incarcerated the day he drove into the Christmas parade. Brooks is an example of the grave consequences of shunning incarceration in favor of more stylish but less substantive approaches.
Brooks’s criminal history demonstrates both the depth of criminal deviance and the degree of criminal effort required to trigger the serious prison sentences that activists decry. Despite the impressively long and weighty offending timeline below, Brooks served only short stints in prison, often with early release.
- September 1999: Substantial battery / intentionally causing bodily harm
- September 2002: Possession of THC
- September 2003: Resisting/obstructing an officer
- February 2005: Obstruction/failing to appear in court
- November 2006: Statutory rape after impregnating a 15-year-old girl
- 2010: Strangulation/suffocation (with previous conviction), battery, and criminal damage to property
- March 2011: Resisting/obstructing an officer
- November 2011: Possession of THC
- December 2011: Possession of THC and misdemeanor bail jumping
- 2012: Providing a false address regarding his sex-offender registration
- 2016: Failure to comply with sex-offender laws since 2008
- July 2020: Charged with two counts of second-degree recklessly endangering safety and one count of felon in possession of a firearm
- March 2021: Violation of court-ordered child-support payments
- May 2021: Arrested for domestic violence
- November 2021: Arrested and charged with running over his girlfriend and their child at a gas station
- December 2021: Charged with witness intimidation regarding the November 2021 case
Notably, in the context of lengthy prison sentences, the histories of offenders like Brooks are not the exception: they are the rule. These are also offenders who, by the admission of rehabilitation advocates, are almost certain to fail or drop out of any correctional intervention, no matter how rigorous.
Recommendations
Against this backdrop, we make the following recommendations for policymakers and justice officials.
1. Avoid adopting programs until blinded randomized controlled trials (RCTs) that include sufficient sample sizes have been conducted, based on at least three years of follow-up data.
All data sets should be made public and their results peer-reviewed. Studies should be conducted by independent organizations with the skills and resources to conduct complex intervention studies. Under no circumstances should program designers, staff, or those with financial or career interests be allowed to evaluate their own programs.
- This is particularly important if a member of the team that invented or implemented the program was involved in its initial evaluation. Proponents of rehabilitation are as much marketers as scientists. They have strong financial incentives and career interests in the success of programming. The cottage industry created by this co-opting effect wields power and influence, a development also present in other fields. For instance, a recent study appearing in the Journal of Clinical Epidemiology on researcher bias regarding the efficacy of antidepressants in psychiatry journals found that “studies reporting favorable results were more frequently published in psychiatric journals than nonpsychiatric journals and were more often conducted by lead authors with financial conflicts of interest (fCOI).”[49] Furthermore, the authors of the study noted that “within psychiatric journals, lead authors with fCOI published in journals with higher impact factors and rankings.”
- Inflated estimates obtained through non-RCTs, especially quasi-experimental designs, produce harmful statistical noise and diminish the potential to make any sort of reliable inference to the target population. This is also true for RCTs of small samples. This helps explain why the vast majority of evaluative studies on rehabilitation are completely at odds with what is known about the immutability of antisocial behavior.[50]
- Studies frequently fail to include extensive follow-up periods to determine whether services and treatments deterred offenders from repeating well-established criminal impulses. This, too, is an especially relevant concern because a chief argument leveraged by proponents of rehabilitation is that prisons don’t deter individuals from committing future crimes. Instead, they suggest restorative justice, a rehabilitation-adjacent approach to correcting behavior that has also been met with fanfare. As criminologists Francis Cullen and Cheryl Jonson deftly articulate: “Sentencing conferences (or ‘circles’) and restitution programs ‘sound good.’ But what would lead us to expect that these kinds of interventions are powerful enough to change deep-seated criminal impulses developed over 15 or 20 or 25 years? Do we really think that a couple of hours at a sentencing conference—even an emotional meeting with the victim—are capable of transforming such offenders? Does this risk becoming much like a liberal scared straight program?”[51]
2. Require program evaluators and scholars to a priori establish concrete, readily quantifiable/working definitions of rehabilitation as well as metrics for effectiveness, success, and failure. Current evaluations are essentially meaningless if they encompass any of the following faulty elements:
- In its current usage, “rehabilitation” doesn’t mean the termination of offending behavior or even the embrace of other pro-social roles.
- Depending on the study, a program can still be rated effective even if the majority of participants fail; even if treated individuals show no other improvements in life functioning; or even if the effect size is trivial to small.
- Effect sizes can be very misleading.
3. Acknowledge that locally tailored programs with a strong independent evidence base are generally more effective than programs implemented statewide and that even programs proved effective are very difficult to scale up and out.
- Evidence-based programs might work at one location but not another.
- Jurisdictions have different needs, different levels of buy-in and talent, and different problems.
- Without high-fidelity implementation, the program will fail.
4. Endorse treatment programs that retain public safety as a primary goal, and recognize that rehabilitation advocates have a competing aim.
Rehabilitation advocates view incarceration as part of a moral matrix that privileges the welfare of offenders, frequently ignores their victims, and scorns punishment meted out by institutions of incarceration. Their agenda is rooted in ideology rather than science, and therefore many will:
- advocate for prison and jail abolition, arguing that rehabilitation can safely occur in the community
- uncritically accept methodologically questionable evidence if programs appear to be effective, while ignoring or attacking evidence showing that arrests, incarceration, or punishment instead reduced reoffending
- portray a false choice between rehabilitation and punishment rather than pursue both simultaneously. Cullen and Jonson explain that rehabilitation proponents “do not like imprisonment” and “do their best to explain why prisons do not prevent crime”—a dominant sentiment that quashes organized skepticism or the objective study of correctional rehabilitation.[52]
Scholars in the field who are truly serious about offender rehabilitation should reaffirm their commitment to organized skepticism and to a scientific ethos rooted in objectivity and realistic expectations. To date, the field has largely become an echo chamber devoid of self-reflection or introspection. Scholars should also reaffirm a commitment to public safety: few in this area have expressed concern about widespread decarceration, including its mistaken assumptions and whom it hurts the most. To the extent that scholars are concerned by decarceration’s impact, many are equally concerned about being ostracized by their colleagues for supporting mass incarceration and the prison industrial complex, despite research that demonstrates the social utility of incapacitation.[53] Researchers should reembrace the work of Martinson, who was branded the ultimate villain in the tough-on-crime movement of the 1970s, but who noted that some programs work some of the time in certain situations. Given the lore surrounding Martinson and the supposed harm he caused, this might be especially hard for scholars. To this end, we suggest reaffirming a commitment to scholarly humility over hubris.
Endnotes