Researchers from Florida State University, Boot and colleagues, recently reported on a pervasive problem in psychological intervention research, one that undermines, if not entirely invalidates, the entire field.
The concern? Psychological interventions are rarely double-blinded, because the experimental group always knows it is receiving a treatment. Most reasonably designed studies attempt to address this concern by including a second “control” group, which undergoes a similar but distinct intervention in order to isolate the specific “treatment” effect of the experimental intervention.
However, Boot and colleagues argue that even the best “active” control groups are insufficient. They are pretty dismissive of others in the field, chastising the common belief that “an active control group automatically controls for placebo effects” [emphasis mine]. In contrast, the authors contend that differences in expectations between the experimental and active control group represent a “fundamental design flaw” that invalidates causal conclusions from these studies regarding efficacy.
Ouch. That’s pretty harsh.
In fact, I happen to agree with the authors. Inappropriate control groups are used all the time and researchers often gloss over (or completely ignore) the myriad and often blatant ways in which systematic differences between the treatment groups could confound the preferred interpretation of results.
But let’s not get carried away here. This particular critique doesn’t paint all studies in the same light; rather, it focuses exclusively on “psychological intervention studies”. These include common interventions like psychotherapy as well as the behavioral “cognitive enhancement” studies that I have discussed over the past month (e.g. math learning or visual attention enhancement).
Among psychological intervention studies, Boot and colleagues single out one area of research as exemplary, while noting that even these studies are still flawed. These studies all use action video game training paradigms, which tend to be among the best controlled interventions because they consistently include active control groups.
Nonetheless, these “active controls” are still insufficient, contends this new paper. For example, students in the experimental group who practice playing an intense, action video game might, in fact, expect to improve their performance on cognitive tasks far more than students in the active control group who play a more laid-back game like Tetris or The Sims. In my last discussion of one such study, I noted an example of how this could play out: students trained to play action games might be more motivated to perform well on the cognitive tasks, perhaps because they receive a more potent “rush” associated with the sense of competition. There are additional criticisms, but this “expectation” account is definitely worth considering.
Accounting for “expectation”
In the present paper, Boot and colleagues actually try to test this “expectation difference” hypothesis. To do so, they asked 200 participants from Amazon’s Mechanical Turk system to view short clips of an action game (Unreal Tournament) or an active game control (Tetris, The Sims). Participants were then asked to rate how much they expected to improve on cognitive tasks (given practice on one of the video games). The tasks included the field-of-view (FOV) task I mentioned in my previous post as well as the multiple-object-tracking (MOT) and mental rotation (MR) tasks seen in Bavelier’s TED talk (jump to 6:50 and 14:05).
Their survey results indicated that participants were more likely to expect performance improvements as a result of action game practice versus active control practice on the FOV (expectation: 83 vs 45% and 75 vs 38%) and MOT (86 vs 65%) tasks, whereas the opposite pattern was seen for the MR task (89 vs 69%, with higher expectation favoring the active control).
These expected performance improvements matched with the observed performance improvements seen in Green and Bavelier’s and Boot et al’s previous action video game intervention research.
In other words: participants’ expectations about how much gaming practice would improve performance (in this study) predicted the actual performance improvements found in other studies.
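As a rough sanity check on numbers like these, one could ask whether such a gap in expectation rates is larger than chance. The sketch below runs a simple two-proportion z-test on the FOV figures (83% vs 45%); note that the per-condition group sizes are my assumption (100 raters each), since the post only reports percentages and a total of 200 participants.

```python
from math import sqrt, erf

def two_proportion_z(p1, n1, p2, n2):
    """Two-proportion z-test on the difference between two
    observed proportions (pooled-variance standard error)."""
    p_pool = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # two-sided p-value from the standard normal CDF, via erf
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical split: 100 raters per condition (not reported above);
# 0.83 vs 0.45 are the expectation rates for the FOV task.
z, p = two_proportion_z(0.83, 100, 0.45, 100)
print(f"z = {z:.2f}, p = {p:.2g}")
```

With these assumed group sizes the difference is far beyond chance (z ≈ 5.6), though that says nothing about whether the measure of expectation itself is valid, which is exactly the worry raised below.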
So, are expectations a problem? Personally, I’m not all that impressed by the empirical evidence from this one survey study. It’s based upon an extremely select group of poorly paid participants (“Turkers” usually earn cents for these kinds of studies), who were recruited solely through an online system, so the sample itself is not ideal. Also, the authors are trying to infer participants’ expectations of performance improvement due to practice by asking people who never had a chance to practice. That’s pretty sketchy. Wording this another way: the authors are proposing that researchers should measure expectation in order to “validate” that performance enhancements are truly due to the intervention, but they don’t even validate the measure of expectation very well!
So, empirically, I’m not sure this adds all that much to the argument.
Poor control undermines causality
On the other hand, the theoretical critique itself is extremely valid. If there is even a possibility of differences in expectations about performance enhancement between control and intervention groups, then this undermines any causal inferences that could be drawn from the data. Moreover, the authors are right to point out that interventions from “commercial brain-fitness training” to “adaptive memory exercise to improve IQ” to psychotherapy in general or online therapy (and, in my opinion, an awfully large number of psychological studies) may all be suspect as a result.
The authors offer two concrete solutions:
The first is to “measure” expectancy (e.g. by asking participants about their expectations) and the second is to use better active control groups or “alternate” research designs (e.g. trick half of the experimental and control group into thinking that the effects come later).
These recommendations certainly sound simple, but I’m not entirely sure they will do the job. With respect to the video game example, I’m still concerned that, regardless of participant expectations, people who practice playing action video games will still be way more motivated to maximize their performance on cognitive tasks.
It also doesn’t deal with one of my core criticisms: Tetris is very similar to mental rotation, whereas action video games are not. On the other hand, action games have a lot more in common with field-of-view and object-tracking tasks than Tetris does. Thus, differences in performance improvement are probably due to simple practice effects: improved performance on the practiced skill, with no improvement in the general abilities. To deflate this criticism, these researchers need to start testing their gamers on cognitive tasks that don’t resemble video games so much!
This same principle can be applied to a lot of other interventions aimed at cognitive enhancement (in fact, it is a well-known problem in psychometric testing): if your measure of “success” is the same as what your “intervention” group uses for practice, then those people will become very “successful” at this one task used to measure success. But they won’t perform nearly so well at all of the other “general” tasks that we really care about. Nobody cares if you can score 150 on an IQ test but lack the ability to apply your skills to something useful!
Moreover, this also helps illustrate at least two key components of the generic “placebo effect”. Placebo effects include “expectancy” effects (i.e. physiological and psychological change due to belief in the efficacy of a treatment), but we can also discuss all sorts of non-specific factors, such as statistical variation over the course of a trial (e.g. regression to the mean, or “outliers tend to dissipate over time”) or various other measurement errors or flaws in the experimental design that might change the average outcome when, in fact, the treatment did nothing useful whatsoever. Notably, expectancy effects tend not to affect “objective” outcome measures, whereas there are myriad ways that a nonequivalent control group could wreak havoc with the measured outcome of a study.
Regardless, the authors conclude on a point with which I can unreservedly agree:
“In the case of cognitive interventions, the field has had enough speculation. Researchers, reviewers, and editors should no longer accept inadequate control conditions, and causal claims should be rejected unless a study demonstrably eliminates differential placebo effects.”