Implicit Attitudes, Science, and Philosophy (guest post)
“Philosophers, including myself, have for decades been too credulous about science, being misled by scientists’ marketing and ignoring the unavoidable uncertainties that affect the scientific process…”
The following is a guest post* by Edouard Machery, Distinguished Professor in the Department of History and Philosophy of Science at the University of Pittsburgh and Director of the university’s Center for Philosophy of Science. It is the first in a series of weekly guest posts by different authors at Daily Nous this summer.
Implicit Attitudes, Science, and Philosophy
by Edouard Machery
How can we be responsible and savvy consumers of science, particularly when it gives us morally and politically pleasing narratives? Philosophers’ fascination with the psychology of attitudes is an object lesson.
Some of the most exciting philosophy in the 21st century has been done with an eye towards philosophically significant developments in science. Social psychology has been a reliable source of insights: consider only how much ink has been spilled on situationism and virtue ethics or on Greene’s dual-process model of moral judgment and deontology.
That people can have, at the same time, perhaps without being aware of it, two distinct and possibly conflicting attitudes toward the same object (a brand like Apple, an abstract idea like capitalism, an individual like Obama, or a group such as the elderly or women philosophers) is one of the most remarkable ideas to come from social psychology: in addition to the attitude we can report (usually called “explicit”), people can harbor an unconscious attitude that influences behavior automatically (their “implicit” attitude)—or so we were told. We have all grown familiar with (and perhaps now we have all grown tired of) the well-meaning liberal who unbeknownst to them harbors negative attitudes toward some minority or other: women or African Americans, for instance.
While it was first discussed in the late 2000s—Tamar Gendler discussed the Implicit Association Test in her papers on aliefs and Dan Kelly, Luc Faucher, and I discussed how implicit attitudes bear on issues in the philosophy of race—this idea crystallized as an important philosophical topic through the series of conferences Implicit Bias & Philosophy, organized by Jennifer Saul in the early 2010s at Sheffield. This conference series led to two groundbreaking volumes edited by Michael Brownstein and Jennifer Saul (Implicit Bias and Philosophy, Volumes 1 and 2, Oxford University Press). By then, philosophers’ fascination with implicit attitudes was in sync with the obsession with the topic in society at large: implicit attitudes were discussed in dozens of articles and op-eds in the New York Times, by then-President Obama, and by Hillary Clinton during her presidential campaign. We were lectured to be on the lookout for our unconscious prejudices by deans and provosts, well-paid consultants on “debiasing,” and journalists.
Most remarkable is the range of areas of philosophy that engaged with implicit attitudes. Here is a small sample:
- Moral philosophy: Can people be held responsible for their implicit attitudes?
- Social and political philosophy: Should social inequalities be explained by means of structural/social or psychological factors?
- Metaphysics of mind: What kind of things are attitudes? How to think of beliefs in light of implicit attitudes?
- Philosophy of cognitive science: Are implicit attitudes propositional or associations?
- Epistemology: How should implicit bias impact our trust in our own faculties?
The social psychology of implicit attitudes also had another kind of impact in philosophy: it provided a ready explanation of women’s embarrassing underrepresentation and of the perduring inequalities between men and women philosophers. Jennifer Saul published a series of important articles on this theme, including “Ranking Exercises in Philosophy and Implicit Bias” in 2012 and “Implicit Bias, Stereotype Threat, and Women in Philosophy” in 2013. In the first article, after summarizing “what we know about implicit bias” (my emphasis), Saul concluded her discussion of the Philosophical Gourmet Report as follows:
There is plenty of room for implicit bias to detrimentally affect rankings of both areas and whole departments. However, it seems to me that this worry is much more acute in the case of whole department rankings. With that in mind, I offer what is sure to be a controversial suggestion: abandon the portion of the Gourmet Report that asks rankers to evaluate whole departments.
The British Philosophical Association was receptive to explaining gender inequalities in philosophy by means of implicit biases and to this day implicit attitudes are mentioned on its website. Of course, by doing so, philosophers were just following broader social trends in English-speaking countries.
Looking back, it is hard not to find this enthusiasm puzzling since the shortcomings of the scientific research on implicit attitudes have become glaring. In “Anomalies in Implicit Attitudes Research,” recently published in WIREs Cognitive Science, I have identified four fundamental shortcomings, which are still not addressed after nearly 25 years of research:
- It isn’t yet clear whether the indirect measurement of attitudes (via, e.g., the IAT) and their direct measurement measure different things; in fact, it seems increasingly dubious that we need to postulate implicit attitudes in addition to explicit attitudes.
- The indirect measurement of attitudes predicts individuals’ behavior very poorly, and it isn’t clear under what conditions its predictive power can be improved.
- Indirect measures of attitudes are temporally unstable.
- There is no evidence that whatever it is that indirect measures of attitudes happen to measure causally impacts behavior.
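For readers unfamiliar with how indirect measures yield scores at all, here is a simplified sketch of an IAT-style D score: the difference in mean response latency between the two pairing conditions, scaled by latency variability. This is only a sketch with invented latencies; the actual scoring algorithm (Greenwald, Nosek, and Banaji's D measure) adds trimming and block-wise computations.

```python
import statistics

def iat_d_score(compatible_rts, incompatible_rts):
    """Simplified IAT D score: mean latency difference between the
    incompatible and compatible blocks, divided by the pooled standard
    deviation of all latencies. Larger positive values mean faster
    responding in the 'compatible' pairing."""
    pooled_sd = statistics.stdev(compatible_rts + incompatible_rts)
    return (statistics.mean(incompatible_rts)
            - statistics.mean(compatible_rts)) / pooled_sd

# Invented latencies in milliseconds, purely for illustration.
compatible = [620, 655, 700, 640, 690, 610]
incompatible = [780, 820, 760, 850, 800, 790]
print(round(iat_d_score(compatible, incompatible), 2))  # → 1.76
```

Because the score is a mean difference scaled by within-task variability, it behaves like an effect size, which is why the group-level IAT effect can be "large" even when, as the anomalies above note, individual scores predict little.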
These four shortcomings should lead us to question whether the concept of implicit attitudes refers to anything at all (or, as psychologists and philosophers of science put it, to question its construct validity). To my surprise, leading researchers in this area such as psychologist Bertram Gawronski and philosophers Michael Brownstein and Alex Madva agree with the main thrust of my discussion (see “Anomalies in Implicit Attitudes Research: Not so Easily Dismissed”): indirect measures of attitudes do not measure stable traits that predict individuals’ behavior.
It thus appears that many of the beliefs that motivated philosophical discussion of implicit attitudes are either erroneous or scientifically uncertain—why worry about how to limit the influence of implicit attitudes in philosophy when they might not have any influence on anything at all?—and that philosophers have been way too quick to reify measures (the indirect measures of attitudes) into psychological entities (implicit attitudes).
Hindsight is of course 20/20, and it would be ill-advised to blame philosophers (including my former self) for taking seriously science in the making. On the other hand, philosophers failed to even listen and a fortiori to give a fair hearing to the dissenting voices challenging the relentless hype by implicit-attitudes cheerleaders. The lesson is not limited to implicit attitudes: the neuroscience of meditation, the neuroscience of oxytocin, the so-called love molecule, the experimental research on epigenetics in humans, and the research on gene x environment interaction in human genetics also come to mind.
Philosophers, including myself, have for decades been too credulous about science, being misled by scientists’ marketing and ignoring the unavoidable uncertainties that affect the scientific process: the frontier of science is replete with unreplicable results, it is affected by hype and exaggeration (COVID researchers, I am looking at you!), and its course is shaped by deeply rooted cognitive and motivational biases. In fact, we should be particularly mindful of the uncertainty of science when it appears to provide a simple explanation for, and promises a simple solution to, the moral, social, and political ills that we find repugnant such as the underrepresentation of women in philosophy and elsewhere and enduring racial inequalities in the broader society.
Thanks for this post, Edouard. I’ve been following this debate on and off, and mostly from a position of ignorance when it comes to the scientific literature on the topic. But I note that most objections to assessments like the IAT criticize its effectiveness at predicting behavior at the level of individuals. That seems to be the case, too, for most of the shortcomings you raise above. At the level of populations, however, it appears IATs show highly stable (and predictable) results over time concerning people’s attitudes towards things like race, gender, and sexual orientation: see, e.g., these results. Further, this piece from a few years ago by Payne, Niemi, and Doris suggests non-negligible correlations between such results and disparities in outcomes across different populations.
What are your thoughts on these aggregate-level patterns? If they even partially explain disparate outcomes, it seems the phenomenon of implicit bias isn’t totally dubious and that assessments like the IAT have at least some use, both when it comes to policy decisions and in making personal choices.
This is an excellent question. A few articles report correlations between aggregate behavioral criteria and aggregate indirect measures of attitudes/biases (see Payne et al.’s bias of crowds 2017 paper for the basic idea). It is not obvious what to make of those results. First, I worry about data dredging, and I’d like to see a preregistered study that specifies in advance which correlations are to be expected, rather than searching through all possible correlations for those that are significant. Second, the correlations themselves, supposing they are real, say very little about the nature of what is measured: The indirect measures might not measure anything different from what the direct measures already measure, and thus wouldn’t justify postulating *implicit* attitudes. (I’ll just note that showing that indirect measures add predictive power to direct measures isn’t sufficient.)
Thanks for this reply, Edouard, I’m looking forward to seeing the direction of the research on aggregate-level data that you and Chandra mention. This discussion has been helpful.
I also have another question. You say: “The indirect measures might not measure anything different from what the direct measures already measure, and thus wouldn’t justify postulating *implicit* attitudes.”
Maybe I have the wrong impression of what “direct measures” are but I’d expect that if you ask IAT respondents explicitly about their views concerning things like race, the majority would state they have no preference. Yet actual results on the IAT consistently show that number is under 20% — with close to 70% of respondents, in fact, expressing some “automatic preference for European American compared to African American.” Also startling: the same distribution patterns hold in IAT respondents’ attitudes towards things like gender and sexual orientation. Bracketing for now the relation of such data to real-world effects, I don’t see why this discrepancy between stated attitudes and IAT results doesn’t all by itself weigh strongly in favor of postulating implicit attitudes.
There are racists who are not in denial and do not bow to woke political correctness. They report pro-White attitudes. After correcting for measurement error, these reports correlate strongly with IAT scores.
At aggregated county levels the correlation is r = .95.
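For what it’s worth, aggregate correlations of that magnitude are statistically compatible with very weak individual-level correlations: averaging over many residents of a county cancels out individual measurement noise while preserving the shared county-level signal. A toy simulation (all parameters invented, not taken from any study) illustrates the point:

```python
import random
import statistics

def corr(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

random.seed(0)

# 100 hypothetical "counties", each with a true local bias level,
# and 200 residents each. Individual IAT scores and behavioral
# outcomes are the county signal plus large individual noise.
county_bias = [random.gauss(0, 1) for _ in range(100)]
iat, outcome = [], []
county_iat_means, county_outcome_means = [], []
for b in county_bias:
    person_iat = [b + random.gauss(0, 5) for _ in range(200)]
    person_outcome = [b + random.gauss(0, 5) for _ in range(200)]
    iat += person_iat
    outcome += person_outcome
    county_iat_means.append(statistics.mean(person_iat))
    county_outcome_means.append(statistics.mean(person_outcome))

print(round(corr(iat, outcome), 2))  # weak at the individual level
print(round(corr(county_iat_means, county_outcome_means), 2))  # strong at the county level
```

So a county-level r near .95 by itself tells us little about how well the measure works for individuals, which is exactly the bias-of-crowds line of argument.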
Sorry, but I don’t see how this answers my question. It’s not surprising to me that the statements of people who explicitly report pro-White attitudes correlate strongly with IAT scores that also register pro-White attitudes.
What I’m asking about is the discrepancy between (what I presume to be) the majority of all IAT respondents who, if asked, would explicitly report neutrality on pro-White vs. pro-Black attitudes and the fact that less than 20% of all IAT scores register such neutrality, with almost 70% reporting a pro-White preference and over 50% reporting a “moderate” or “strong” pro-White preference. Isn’t this discrepancy significant?
Such findings seem to have been quite stable over time and similar patterns appear across a range of IAT respondents’ attitudes towards things like gender, sexual orientation, age, disability, skin tone, and body weight. I think we can all guess how aggregate preferences shake out in these cases, and the fact that our guesses are confirmed by the IAT data is illuminating. Really, if you want a good Rorschach test for all the pro-attitudes of a culture, there doesn’t seem to be a better measure.
Not an expert, but I would suggest that people are reporting their settled views on race (e.g., “Black people and white people are moral equals”) rather than reporting the attitudes they hold that conflict with their settled views, attitudes that need not be “implicit” in any deep sense, i.e., unconscious. I take it we are all aware of attitudes or associations we have that we do not endorse (and not only those that are race/sex related). What’s not clear to me is how the IAT can distinguish between those sorts of attitudes/associations—those of which we are aware but reject—and those that might be said to be unconscious or otherwise “implicit” in any interesting sense of that word.
Given the pervasiveness and insidiousness of racism, it would be far from surprising if even the most egalitarian-minded (in terms of commitments, settled views, etc) harbored associations shaped by racism and that these were quite available to consciousness.
I wonder whether surveys that asked people about the associations they have but don’t endorse would reveal something closer to the results we get from IAT than surveys that ask people whether they harbor racist views but without making it explicit that these views need not be endorsed.
Perhaps there are such studies?
“I wonder whether surveys that asked people about the associations they have but don’t endorse would reveal something closer to the results we get from IAT than surveys that ask people whether they harbor racist views but without making it explicit that these views need not be endorsed.”
That sounds like a worthwhile study! I doubt, though, that the results for those expressing neutrality would be anywhere close to as low as 20%.
Right–but the percentage of people who would claim neutrality when the question is about associations/attitudes of which they are aware but do not endorse would be lower than the percentage of people who claim neutrality with respect to their commitments, settled views, etc. So the discrepancy between the explicit attitudes/associations and the IAT results would be smaller than it is now. I would predict much smaller, though this is not based on any sort of research.
Edouard, thanks for the concise summary of problems with implicit attitudes research. Too many theorists don’t know about these issues, and your review in WIREs is an important one.
However, I believe you and many other critics of implicit attitudes are making a serious mistake: You are failing to distinguish questions about ontology (do implicit attitudes exist?) from questions about measurement (can we quantify individual differences effectively?).
To illustrate the difference, consider semantic associations. If I present typical American subjects with the word “doctor”, they are much faster to recognize subsequent semantically related words (“nurse”) than unrelated words (“tree”). The size of this effect is huge, reflecting the *existence* of very strong associations between “doctor” and “nurse”. But this effect does not vary much across competent English speakers. So, you cannot measure individual differences in these associations effectively because such differences are essentially absent. Where everyone is similar, measurement — in the psychometric sense at issue here — necessarily fails.
Similarly, in the IAT, White American subjects are much faster in pairing White faces with positive words and Black faces with negative words. Once again, this effect is huge, reflecting the *existence* of very strong associations (e.g., associations between representations of Blacks as a social group and negative evaluative representations). But these associations do not vary much in White subjects. This is likely because they are inculcated through years of exposure to stigmatizing stimuli that permeate a shared cultural milieu. So, we end up with a dissociation: The IAT provides solid evidence that stigmatizing implicit attitudes exist, but we cannot measure individual differences effectively because meaningful differences aren’t present.
I think your critique of implicit attitudes research would be stronger if you made it clear that it is focused on attempts to *measure* individual differences, which many psychologists do in fact try to do. Such a critique is valuable and needed. However, whether implicit attitudes *exist* is a different question, and I believe there is strong evidence that they do.
I don’t think *anyone* is confused by the distinction between epistemology and ontology, but like most of those who have thought deeply about measurement (Van Fraassen, Chang, Tal, Meehl), I do not think you can separate them neatly: you can’t divorce ontology, the question of *what* exists, from epistemology, and you need to be clear about how evidence allows you to answer the ontological question.
So, surely, the IAT does measure something! The question is what it is and what the evidence tells us about it. Given what “implicit attitudes” means (for one, they are supposed to be attitudes, and they are supposed to be distinct from the attitudes reported by direct measures), you are jumping over questions of validation when you assert that they measure implicit attitudes.
Edouard, there is confusion. Measurement has a folk sense where it is a synonym for detection, a bland umbrella term for epistemic access. Measurement in the psychometric sense is about quantifying individual differences, a much more specialized endeavor. There is no confusion about ontology and measurement-in-the-folk-sense. There is confusion and ongoing disagreement about the relation between ontology and measurement-in-the-psychometric-sense specifically.
One key issue is that it is not widely recognized that failure of measurement-in-the-psychometric-sense, I mean near total failure, is fully compatible with the thing you are seeking to measure existing and being highly causally efficacious. For example, semantic associations between “doctor” and “nurse” provide the best explanation for reaction time differences in lexical priming tasks, and these associations undeniably exist. But there are no meaningful individual differences in a linguistic community, so there is a total failure of measurement-in-the-psychometric-sense. That this kind of dissociation can happen *is* a point of confusion.
Uli Schimmack, maybe the most influential current critic of the IAT and other indirect measures, appears to basically think the existence of stigmatizing implicit associations is not in doubt, and it is only measurement of individual differences, i.e., measurement-in-the-psychometric-sense, that has failed (his argument for this split view is similar to my own, see page 397 of his review in Perspectives on Psychological Science). You appear to think that failure of measurement-in-the-psychometric-sense should lead us to doubt that “the concept of implicit attitudes refers to anything at all”, a position that seems very different from Schimmack’s. Both cannot be right.
Ulrich Schimmack here.
My results show that there is some small amount of valid variance in race IAT scores but this variance is highly correlated with self reported attitude. This undermines the dual attitude model of one explicit and one implicit attitude.
Of course it is still possible that implicit attitudes exist, but there is no evidence for them and no clear concept of implicit attitudes.
You say: “My results show that there is some small amount of valid variance in race IAT scores but this variance is highly correlated with self reported attitude. This undermines the dual attitude model of one explicit and one implicit attitude.”
I strongly disagree. Assume for discussion’s sake the hypothesis of “excess uniformity”, which says the relevant White-favoring automatic associations are widely shared in a population without meaningful individual differences (like semantic associations). In the limiting case, there would be **zero** valid variance in race IAT difference scores, again because of excess uniformity. But as long as there is valid variance in explicit attitudes, then we have potentially two different psychological factors that are relevant to behavior: the uniform (non-varying) White-favoring automatic associations and the varying explicit attitudes. In this “excess uniformity” regime, correlation b/w valid variance in race IAT scores and explicit attitudes fails to be a test of the dual attitudes model. In statistical terms, the automatic associations shift the intercept for the population, but do not meaningfully contribute to individual differences in behavior.
Is “excess uniformity” at all credible, or is it just a trick to salvage the moribund implicit attitude research program? It easily meets the standards of credibility. Automatic semantic associations, which exhibit excess uniformity, are a very plausible model for race-directed automatic associations. It is very plausible that both are acquired through experiencing stimulus co-occurrence patterns within a shared cultural milieu. Also, experimental researchers take excess uniformity of automatic associations in the Stroop, flanker, Simon, etc. tasks as an assumption. And where the assumption is checked, as in conflict drift diffusion models, the assumption is upheld.
To be clear, I am not endorsing the dual-attitudes model. There are a lot of models that I am open to about the relationship b/w explicit and implicit attitudes (or “automatic valenced associations directed at social categories”, I am not going to argue about the term). My point is that the quote above from you, which has been pretty influential in the field, should be rejected if “excess uniformity” is true. If I am mistaken, please tell me how I am mistaken.
Tushar’s point is a good one. Even if implicit attitudes are pretty uniform in a population precluding measurement of individual-level differences, sometimes the population is structured so you do get some measurable differences at the level of aggregates, like Northern versus Southern states. There is some evidence that you see this, but more work is needed.
But the more fundamental point remains about the need to separate ontology from measurement. Whites, as a group, are much, much faster to pair White faces with positive words and Black faces with negative words. No one contests this observation, the effect is large, and it is very stable across testing sessions (this is different from test-retest reliability, which concerns stability of the rank ordering of individual differences). Arguably, the best explanation for this effect is the existence of implicit biases.
Now suppose every White person had this bias to an identical degree. In that case, you can’t measure individual differences whatsoever; the only thing that makes two White subjects differ on the IAT is noise. You’d see precisely the anomalies that Edouard is drawing attention to for *measurement* of implicit biases. But this failure of measurement of individual differences does not at all change the fact that implicit biases *exist*!
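A toy simulation (all parameters invented) makes the dissociation concrete: if every subject carries the same true effect and observed scores are that constant plus measurement noise, the group-level effect is large and replicates session after session, while the test-retest correlation of individual scores hovers around zero.

```python
import random
import statistics

random.seed(1)

# Suppose every subject has the SAME true underlying bias (say, a
# 150 ms latency difference), and observed IAT effects are that
# constant plus independent measurement noise in each session.
true_effect = 150.0
n = 500
session1 = [true_effect + random.gauss(0, 80) for _ in range(n)]
session2 = [true_effect + random.gauss(0, 80) for _ in range(n)]

# Group-level effect size: large and reliable in both sessions.
d1 = statistics.mean(session1) / statistics.stdev(session1)
d2 = statistics.mean(session2) / statistics.stdev(session2)

def corr(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

print(round(d1, 2), round(d2, 2))  # large group effect both times (d around 1.9)
print(round(corr(session1, session2), 2))  # test-retest near zero: pure noise vs noise
```

This is exactly the pattern of anomalies at issue: a robust group effect together with unstable, non-predictive individual scores.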
“Whites, as a group, are much, much faster to pair White faces with positive words and Black faces with negative words. No one contests this observation, the effect is large, and it is very stable across testing sessions (this is different than test-retest reliability which concerns stability of rank ordering of individual differences). Arguably, the best explanation for this effect is the existence of implicit biases.”
As far as I’ve seen, this summary is at least questionable on many points:
Hi Justin K.
Regarding (1), you are mixing up the size of the IAT effect (the tendency of subjects to be faster at pairing white faces with good and black faces with bad) and the size of predictive correlations between *individual differences in the IAT effect* and some other variable. The former is very large (Cohen’s d of around 1!) and no one disputes this. It is only the latter that is small. But my whole point is that there are settings where a factor can be causally important even when predictive correlations between individual differences in that factor and other variables are small. Suppose everyone has a virus (with the same viral load) that is causing their symptoms. The virus is causally important for the symptoms. But individual differences in viral load reflect measurement error and don’t correlate with anything.
Regarding (2) and (3), the IAT is a member of a larger class of tasks called “conflict tasks”, which includes the Stroop task and flanker task. The role of automatic associations in driving reaction time differences in these tasks has strong, convergent support. I review evidence for this point in the commentary linked to below after Michael’s post.
Differences in reaction times for (variations on) Black/Unpleasant v. White/Unpleasant cannot simply be attributed to racial attitudes, prejudice, or the like. They *may* capture that to some extent, but, if so, the extent to which they do, vs. are polluted and tainted by irrelevant artifacts is not very clear. There is also no evidence that anything the IAT measures is “unconscious” thereby mooting all proclamations about “unconscious racism” based on IAT research — which is nearly all such claims.
And with all due respect, Schimmack is super clear that, in the absence of a valid measure of a construct (in this case, the IAT), there is no evidence for that construct. Anything is possible; absence of evidence is not evidence of absence. But absence of evidence surely is not evidence of presence.
Here is one of his articles. The title is super clear on his view of this:
Schimmack, U. (2021). The Implicit Association Test: A method in search of a construct. Perspectives on Psychological Science, 16(2), 396-414.
“In search of a construct.” I.e., “the construct has not yet been found.”
Blanton, H., Jaccard, J., & Burrows, C. N. (2015a). Implications of the Implicit Association Test D-Transformation for Psychological Assessment. Assessment, 22(4), 429–440. https://doi.org/10.1177/1073191114551382
Blanton, H., Jaccard, J., Strauts, E., Mitchell, G., & Tetlock, P. E. (2015b). Toward a meaningful metric of implicit prejudice. Journal of Applied Psychology, 100(5), 1468-1481. https://doi.org/10.1037/a0038379
Bluemke, M., & Fiedler, K. (2009). Base rate effects on the IAT. Consciousness and Cognition, 18(4), 1029-1038. https://doi.org/10.1016/j.concog.2009.07.010
Conrey, F. R., Sherman, J. W., Gawronski, B., Hugenberg, K., & Groom, C. J. (2005). Separating multiple processes in implicit social cognition: The quad model of implicit task performance. Journal of Personality and Social Psychology, 89(4), 469-487.
Corneille, O., & Hütter, M. (2020). Implicit? What do you mean? A comprehensive review of the delusive implicitness construct in attitude research. Personality and Social Psychology Review, 24(3), 212-232.
Fiedler, K., Messner, C., & Bluemke, M. (2006). Unresolved problems with the “I”, the “A”, and the “T”: A logical and psychometric critique of the Implicit Association Test (IAT). European Review of Social Psychology, 17(1), 74-147.
Hahn, A., Judd, C. M., Hirsh, H. K., & Blair, I. V. (2014). Awareness of implicit attitudes. Journal of Experimental Psychology: General, 143(3), 1369-1392. https://doi.org/10.1037/a0035028
Mitchell, G. (2018). Jumping to conclusions: Advocacy and application of psychological research. In J. T. Crawford & L. Jussim (Eds.), The politics of social psychology (pp. 139-155). New York, NY: Routledge.
Lee, The highly reliable and stable group-averaged difference in reaction time seen in the IAT (white-good much faster than black-good, etc.) is the fixed point in this debate that all parties agree to, and no one, including you, me, Edouard, Uli, etc. has ever denied it. You agree it *may* be explained by stigmatizing automatic associations. I work on conflict tasks more broadly where it is widely accepted that these reaction time differences arise from automatic associations. For that and other reasons I say it is *likely* that is what they reflect. I do not see a huge difference here and reasonable people can disagree.
I don’t buy for a moment that implicit attitudes are “unconscious”. They reflect *automaticity*, which is a different thing. Talk of “unconscious attitudes” is deeply unfortunate, so we agree there too.
Schimmack’s title is as you say. But what he actually writes is clearer and more telling:
“Just like the Stroop test, the IAT shows highly reliable interference effects when the majority of participants have preexisting associations (e.g., most people like flowers and do not like insects). Few people doubt that the IAT can reveal average differences in associations (e.g., on average, people like flowers more than insects). Thus, there is no disagreement about the validity of the IAT as “as a general method for measuring relative association strengths” (Greenwald, Banaji, & Nosek, 2015, p. 553). There is also good evidence that the IAT can reveal group differences in associations. For example, German fans have positive associations with the German soccer team, and Brazilian fans have positive associations with the Brazilian team.”
This is not a person who doubts the existence of valenced implicit associations attaching to social categories (i.e., implicit attitudes). Moreover, he thinks that the IAT itself, via these highly reliable group-averaged effects, provides strong evidence for these associations. But he goes on to say:
“Showing reliable group differences in the strength of associations is not sufficient to demonstrate validity of the IAT as a measure of individual differences, which was the purpose for developing the IAT” (emphasis mine).
So he is not denying the existence of implicit attitudes, especially widely shared implicit attitudes in a group. He is denying that the IAT measures *individual differences* in these attitudes well. That has been my point throughout this discussion!
It would be good to stop equivocating between the claim that the IAT measures something and the claim that the IAT measures implicit attitudes, a point I made earlier but that doesn’t seem to have been put clearly enough.
Automatic does not equal without awareness. You seem to define implicit as automatic, but we can just say association. Some thoughts come to mind more readily. The process is implicit/automatic, but the thought is not.
“ Arguably, the best explanation for this effect is the existence of implicit biases.”
‘Arguably’ is doing a lot of work in this argument.
That is inarguable!
Respondents are aware that they have these associations. So it is still not clear that the reaction times are driven by implicit processes. This is partly due to a sloppy conflation of “uncontrollable” with “without awareness.”
Something worth emphasising about this: it isn’t just that the field hasn’t addressed some shortcomings, it’s that failed replications and problems were noticeable to anyone paying attention years ago, and yet these were not given much uptake by the broader philosophical community. See also stereotype threat, stereotypes ‘lead to their own confirmation’, and various work by Lee Jussim. Anyone here who has ever written on such topics should be asking themselves why they didn’t hear about these earlier, or why this information wasn’t given proper weight by others, or why it took so long for some of this information to become common knowledge. And we should all be doing more to implement policies which can reduce the chances of things like this happening again.
One might try to shift responsibility to the scientists and plead for an excuse because the original studies get cited far, far more than the failed replications. But it also seems pretty clear that this citation pattern itself exists in part because psychologists are a very, very liberal field, just like philosophy, and we should have the kind of self-awareness that lets us correct for these biases, just as we would for other kinds of biases affecting various forms of evidence.
Even if implicit bias turns out to be real, consistency demands that we also be concerned about the biases that lead to this kind of collective blind spot. That is, if biases whose effect sizes are so weak that they can only be detected with extremely large data sets, which still manifest lots of failed replications, and which have negligible effects on behaviour, were worth all this time, effort, awareness-raising, analysis, and campaigning, then it seems we should be just as concerned, or more, with the kinds of biases that led many people in our discipline to run with this research without the appropriate nuance and qualifiers. I won’t mention any particular issues, but it seems there are many other topics that people in our field regularly take as given, and for which people with different epistemic networks and ideological leanings might be able to offer a much-needed corrective.
I know many readers here make fun of calls for ideological/viewpoint diversity (‘why not invite creationists then?’), but it seems that people who do not find such calls persuasive owe us some alternative solutions; it wasn’t the folks at Heterodox Academy being too credulous about this research.
“…because psychologists are very very liberal as a field, just like philosophy…”
I get what you’re saying but it’s interesting how these things get sorted politically, no? I mean, saying that people are deep down still racist or sexist despite their introspection and protestations to the contrary, and despite decades of well-known rational argument against racism and sexism — well, it’s hard to imagine a statement about racism that fits better with the conservative intellectual tradition.
Edouard raises many interesting and important issues in his recent articles on implicit social cognition. This comment is aimed at clarifying just a few points in Edouard’s post.
First, readers might want to know that Edouard’s recent papers—linked in the post above—are part of an exchange in the journal WIREs Cognitive Science between himself and Bertram Gawronski, Alex Madva, and Michael Brownstein. The exchange consists of 4 papers, followed by responses from philosophers and psychologists who work on implicit social cognition. Readers interested in understanding these issues in more depth, and learning about views different from Edouard’s, might want to read the other papers too. Here they are in the order in which they were written:
(1) Brownstein, Madva, & Gawronski, “What do implicit measures measure?”
(2) Machery, “Anomalies in implicit attitude research”
(3) Gawronski, Brownstein, & Madva, “How should we think about implicit measures and their empirical ‘anomalies?’”
(4) Machery, “Anomalies in implicit attitude research: Not so easily dismissed”
–Dai & Albarracín, “It’s Time to Do More Research on the Attitude-Behavior Relation: A Commentary on Implicit Attitude Measures”
–Moors & Koster, “Behavior prediction requires implicit measures of stimulus-goal discrepancies and expected utilities of behavior options rather than of attitudes towards objects”
–Byrd & Thompson, “Testing for Implicit Bias: Values, Psychometrics, and Science Communication”
–Spaulding, “Assessing the Implicit Bias Research Programme: Comments on Brownstein, Gawronski and Madva vs. Machery”
(Not all the commentaries are published yet.)
Readers might also want to check out a recent special issue of the journal Social Cognition, “Twenty-Five Years of Research Using Implicit Measures,” which reviews up-to-date data and theorizing. From reading Edouard’s post above, one might think that research in implicit social cognition is moribund, but that is not at all the case.
Second, Edouard suggests that we agree with the main thrust of his discussion. That is not true. Edouard is here alluding to the point we make in our second article above (“How should we think . . .?”) that what we call the “modal view” of implicit social cognition is flawed. This is the now largely outdated view (albeit still very popular and influential outside social-cognitive psychology) that “implicit attitudes” are uncontrollable and unconscious representations stored in our minds alongside our conscious beliefs. Much as two epistemologists who agree that the justified-true-belief theory of knowledge is wrong can still disagree about a great deal, the fact that we agree with Edouard that this particular theory is unsupported leaves quite a lot for us to disagree about. We hope people will check out the exchange to explore the substantive points of disagreement.
Finally, we’d like to add that we think many philosophers go astray not by being too credulous of the science, but by failing to read it carefully or to stay up to date about current developments. There have been large upheavals in thinking about implicit social cognition (as there have been, the last time we checked, in pretty much every other domain of study of the mind). A different perspective on these upheavals is that it’s an exciting time to learn more about, and even participate in, these ongoing debates.
For just one example of a recent intriguing entry, we invite interested readers to check out Peter Kvam, Colin Smith, Louis H. Irving, and Konstantina Sokratous’s new preprint, “Improving the reliability and validity of the IAT with a dynamic model driven by associations.” Without dwelling too long in the weeds: prevailing methods of assessing IAT performance focus either on speed (the D-score) or on accuracy (the Quad Model), whereas Kvam and colleagues develop a dynamic model that incorporates both. They claim that their “model disentangles processes related to cognitive control, stimulus encoding, associations between concepts and categories, and processes unrelated to the choice itself. This approach to analyzing IAT data illustrates that the unreliability in the IAT is almost entirely attributable to the methods used to analyze data from the task: the model parameters show test-retest reliability around .8-.9, on par with that of many of the most reliable self-report measures.”
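For readers outside psychometrics, it may help to note what the quoted test-retest figure means: a measure’s test-retest reliability is just the correlation between the same individuals’ scores on two occasions, and it drops as session-to-session noise grows relative to stable individual differences. Here is a minimal, purely illustrative Python sketch with simulated data (the sample size and noise levels are our own assumptions, not numbers from the IAT literature):

```python
import random
import statistics

def pearson_r(xs, ys):
    # Pearson correlation: covariance divided by the product of standard deviations.
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Simulate two testing sessions: each person's observed score is a
# stable "true" component plus independent session noise.
random.seed(0)
true_scores = [random.gauss(0, 1) for _ in range(1000)]
session1 = [t + random.gauss(0, 0.4) for t in true_scores]
session2 = [t + random.gauss(0, 0.4) for t in true_scores]

# With noise SD 0.4 against a true-score SD of 1, the expected
# test-retest correlation is about 1 / (1 + 0.4**2) ≈ 0.86.
print(round(pearson_r(session1, session2), 2))
```

On this toy model, shrinking the noise term pushes the correlation toward 1 and inflating it pushes the correlation toward 0; a reported .8–.9 range corresponds to modest session noise relative to stable individual differences.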
Numerous open questions remain for the model, which has not yet been published. Our point is simply that, for those of us neck deep in the literature, new contributions like this (and there are innumerable others) are thrilling invitations to observe and take part in the long and tortuous journey to understanding the mind and social reality, one which—also the last time we checked—quite often takes more than 25 years.
–Michael Brownstein and Alex Madva
Thanks Michael and Alex. My comment on the exchange was also recently accepted at WIREs Cognitive Science:
Sripada, C. “Whether Implicit Attitudes Exist is One Question, and Whether We Can Measure Individual Differences Effectively is Another.”
So sorry for the oversight, Chandra! Thanks for posting your excellent commentary.
Thanks for linking to all these great articles. It is important to have a good sense of the diversity of opinions on this topic.
Research on implicit social cognition inspired by the IAT and other indirect measures is of course not moribund – too many careers are staked on it. But I predict not much will remain in 10 to 15 years.
At some point psychologists and funding agencies will start to ask what has been learned after nearly three decades of research.
I’ll gladly take that betReport
My fav line is from your final letter in the exchange: “Finally, we note that Machery’s repeated invocation of how much remains unsettled “30 years after the implicit revolution” (p. 5) takes it as self-evident that 30 years is too long for a scientific research program to settle on the kinds of debates and frameworks we discuss here. This is not at all self-evident. Basic theorizing about emotion, perception, attention, and consciousness is multipronged, conflicted, and ongoing—appropriately so—after much more than 30 years.”
So maybe people shouldn’t have been screaming about the horrors of unconscious racism for the last 24 years, designing “trainings,” writing articles for law journals to influence law and policy … you know, until the massive uncertainty surrounding almost all claims involving almost all things “implicit bias,” especially but not exclusively those based on the IAT, was actually resolved?
Who is doing the work to inform the public about wild claims like “70-80% of Americans are unconscious racists”? Or the claim that unconscious bias is independent of, and more powerful than, explicit bias? Or the fact that the role of “implicit bias” in producing anything anyone cares about, from discrimination to inequality to group gaps, is entirely unknown, and that no one should be acting like they “know” (“it’s based on the peer-reviewed science! Are you some sort of science denialist?”) that it does?
Folks might be interested in this paper Jules Holroyd and I published on the implications of these sorts of criticisms for reform efforts in philosophy: https://www.researchgate.net/publication/334108322_Implicit_Bias_and_Reform_Efforts_in_Philosophy_A_Defence
While I don’t agree with everything in this paper, I recommend readers check it out. I particularly agree with the following claim: “the reforms can be motivated quite independently of the implicit bias data”. This seems true for most reforms proposed in the last two decades.
Thanks for posting this paper, Professor Saul. Though I’m not steeped in the empirical literature on implicit bias and thus have difficulty concluding anything about the second half of the paper (or about much of the debate in this discussion on Daily Nous), I think the first half (or so) of the paper is really helpful for understanding the upshot of the debates about implicit bias and how these figure in efforts to diversify philosophy.
One thing I’d like to suggest is that if the various policy reforms are, as you say, multiply justified, then it might be a mistake for those leading the reform efforts (the ends of which I endorse, though for reasons other than “diversity” as that term seems to be understood by those leading the reforms) to invoke implicit bias/attitudes so frequently, and to make implicit bias so central in the analyses of what’s going on. It seems obvious to me that simply as a strategic matter, if one has multiple justifications for some policy, it is best to rely on those justifications that are strongest.
I also think the focus on implicit bias may actually be counterproductive because it leads to efforts that conflict. We are encouraged to employ anonymization to a greater extent, and also told that applying universal standards anonymously may not help because minorities-in-philosophy face special challenges that put them at a disadvantage when such standards are employed. If someone has faced special challenges on the basis of their race, sex, country of origin, or whatever, then we need to know these things in order to make whatever adjustments to our assessments are appropriate. But of course, in order to know these things, we can’t anonymize, at least not very far, and then if we don’t anonymize, the problem of implicit bias purportedly rears its head. Though perhaps resolvable with further specifications and complications, these tensions, along with the more general concerns about the validity of implicit bias research, risk, I think, giving people the impression that the efforts at addressing distributive injustice along the lines of race, sex, national origin, etc. aren’t well justified, when in fact they are.
It might be helpful to separate two questions:
1) Is ‘implicit bias’ a sufficiently well-evidenced phenomenon to justify further research within psychology?
2) Is ‘implicit bias’ a sufficiently well-evidenced phenomenon to justify those outside psychology, e.g. policy-makers and philosophers, to proceed on the assumption that it is correct?
It looks to me as if the experts in this thread agree that the answer to (1) is ‘yes’. (They have variable degrees of optimism/skepticism, but everyone agrees both that there is lots of room for uncertainty and doubt, and that there is serious science being done.)
But I read Edouard’s main point in the OP to be (2): that outsiders have been way too quick to treat implicit bias as settled science and to make policy, and do philosophy, on that basis. And that seems highly defensible, and not really challenged by the comments. (The typical form of a policy statement about implicit bias is not: ‘some psychologists have theorized that humans display ‘implicit bias’. This is highly contested within the field; however, if it were true it would suggest A,B,C, and since A,B,C are at worst harmless you could consider implementing them anyway’. It is ‘implicit bias has been scientifically established, so you should do A,B,C.’)
To draw an analogy to an area I know better: implicit bias sounds like string theory. String theory is really controversial in physics: some people think there’s a really strong case for it, others think it’s a largely-failed research program (I’m much more in the former camp FWIW) but pretty much everyone agrees it’s legitimate science. But it would be very premature to develop broader scientific or philosophical themes, outside theoretical physics, on the presumption that string theory was true.
David, this is a helpful intervention in this debate. If Edouard’s point is in fact only (2), then I strongly agree with him. Research on implicit bias *is* way too unsettled and uncertain for outsiders to draw any strong conclusions, especially strong ethical or policy conclusions. I am definitely not in favor of mandated implicit bias training sessions, university websites that present one-sided perspectives on implicit bias research, professors opining about how implicit bias explains this or that disparity, and the like.
But I see lots of things in what Edouard says that look more like he is denying (1). Anomalies are, after all, things that fundamentally threaten a scientific paradigm and lead to its demise! And I think he is putting way too much weight on failures of the psychometric properties of indirect measures of implicit attitudes as a justification to abandon the construct. I reject that inference for principled reasons that I lay out in my commentary linked above. Michael and Alex reject it for their own principled reasons that differ somewhat from mine. So some of the debate here is (legitimately) about (1).
I’ll weigh in here because I fit David’s point to a T. I do ongoing research using the IAT. There is no reason to abandon research using the IAT. But there are tons of reasons to be deeply skeptical of the most common conclusions about “implicit bias” that have been promulgated by IAT/implicit bias advocates for the last 25 years, and, given the known flaws and limitations of all implicit measures and the wildly unjustified conclusions long promulgated by implicit bias advocates, to express deep caution and uncertainty about *anything* found using the IAT and other implicit measures.
It seems to me that there are quite a number of views under consideration here, including:
As far as I can see, nobody in this discussion is saying either 1 or 4, though people are distancing themselves from both. The interesting question, therefore, seems to lie between 2 and 3. And I don’t think that any evidence provided so far gives clear support to 3 over 2. In fact, from what I can tell from what I understand of the literature, 2 (‘Something may be going on here, but we have no reason to think that the best explanation is implicit bigotry’) is to be preferred over 3 (‘The best explanation is clearly implicit bigotry’) at this point.
Comparisons have been made with the Stroop task. But as I understand the Stroop task, some important elements are different from the state of play in the IAT literature. First, as far as I’m aware (and I’ll admit that I’m not up on the recent Stroop task literature), the main candidate explanations for the Stroop task anomalies do not entail anything that has been shown to be false or dubious. However, the ‘implicit bigotry’ hypothesis for the IAT anomalies seems to entail a number of things that, as far as I know, have been shown to be at least extremely doubtful. Most important, perhaps, is the fact that implicitly bigoted people would presumably act and make decisions in a more bigoted manner, despite not realizing that they are doing so. But as I understand it, nobody after all these years has come up with any study that confirms this hypothesis. Doesn’t that present serious problems for the ‘implicit bigotry’ interpretation of the phenomena in ways that we don’t have in the Stroop task literature?
Also, Edouard Machery has raised a point that one would hope would be of interest to all serious philosophers: that we as a philosophical community were very quick to presume the truth of some purportedly scientific claims far before we were warranted in doing so. (I say ‘we’, because I made the same error at the time, trusting too easily — as I now see it — that the entire philosophical community would not rush headlong into such an error unless the evidence for the hypothesis were well-nigh-irrefutable). Edouard is now acting exactly as I would have hoped every philosopher would: finding that he had been in error, he is trying to figure out what led him astray. This seems to me a highly important thing for us to do together. Among other things, what has happened seems to highlight the risks to which we are all prone now that academia has become radically skewed in one political direction, and it seems that a big lesson for all of us is that, if we can’t achieve reasonable viewpoint diversity, we should at least tread very carefully when the science seems to favor a conclusion that would be remarkably convenient for the political direction favored by nearly all our colleagues (especially when it’s our own political direction!). I’m actually not too bothered by the fact that Machery, Gendler, Schwitzgebel, and others rushed ahead of the evidence and seemed to get things wrong: we all make mistakes, and I assume that they all acted in good faith. But I do find it chilling that the response to all this within philosophy (with very few exceptions, Edouard Machery being one of them) is crickets. Instead of considering how we all went wrong, rationalizations are given, and (worst of all) history from just a few years ago is being rewritten: we are now apparently to pretend that we didn’t feel certain that we had a reliable tool for finding hidden racism deep in our own hearts. We did. We jumped to conclusions, and we got things wrong.
Can we now discuss what went wrong and put some safeguards in place (yes, while the research into the IAT anomalies continues)?
This is helpful, but I don’t find David’s analogy with string theory apt since the stakes on developing policy proposals on the basis of string theory are so low. I think a better analogy for policies based on the (possible) truth of implicit bias are policies that regulate media content. How and why advertising works isn’t settled science. Nonetheless, given the public health risks associated with smoking, I support the regulation of ads that promote the sale of cigarettes.
This also works better as an analogy because the same unconscious mechanisms that link ad campaigns to patterns in consumer behavior seem to be present in the role that implicit bias may play in explaining patterns of political behavior. In both cases unreflective cognition of some sort is thought to have a role, though its connection with observable outcomes (if any) remains unclear. Still, to the extent that we believe regulating advertising is worth doing, there appear to be the same reasons to believe that addressing things like implicit bias is worth doing. (I’m leaving open here what the work of addressing such things should look like.)
I think this does a good job of crystallising the stakes of the IAT discussion and the question of its implementation in policy contexts.
But I don’t think the IAT emerges well from the comparison with regulated advertising, despite the ambiguities and uncertainties justifying both sets of policy decisions.
i) Policy aim of regulating media content (e.g. reducing or eliminating advertising for tobacco products): reduce the incidence of smoking-related illnesses.
Proposition 1: Consuming tobacco-related products causes a higher incidence of specific negative health outcomes
Proposition 2: Media content/ advertising induces/ triggers people to consume more tobacco-related products*
Proposition 3: [site of ambiguity] The mechanism by which advertising effectuates this behavioral change is undetermined
Conclusion: Regulating media content/ advertising is a sensible and justified policy in pursuit of the aim of reducing smoking-related illnesses; whereas
ii) Policy aim of implementing IAT/ implicit bias findings: reduce social impact of racist/ biased attitudes and/ or identify and eliminate racist/ biased attitudes.
Proposition 1: The IAT shows that there are mean group differences in reaction times to certain stimuli, including those representing particular racial or sexual characteristics
Proposition 2: The reaction times and group differences may possibly be best explained by (group) differences in automatic associations
Proposition 3: [site of ambiguity] These automatic associations may or may not translate into actual behaviour or attitudes amongst particular groups towards real-world people in real-world situations with varying stimuli, including those representing particular racial or sexual characteristics
Proposition 4: These automatic associations say nothing about individual behaviour or attitudes amongst particular groups towards real-world people in real-world situations with varying stimuli, including those representing particular racial or sexual characteristics
Proposition 5: [site of ambiguity] Group differences in reaction times in the IAT may or may not reflect something like unconscious bias or implicit attitudes or automatic associations or secret bigotry or social familiarity or individual mediation of social representations or something else
Proposition 6: The IAT does not measure stable traits that predict individuals’ behavior
Conclusion: Implementing IAT/ implicit bias findings in HR, recruitment, academic and other professional contexts is a sensible and justified policy in pursuit of the aim of reducing the social impact of racist/ biased attitudes and/ or identifying and eliminating racist/ biased attitudes.
I think you might agree that the conclusion to ii) is altogether weaker and less compelling than the conclusion to i), despite the presence of some evidential ambiguity in both contexts.
Not that this in and of itself would show the IAT to be a scientifically useless or debunked tool. But I think it certainly reinforces the thought that more philosophical interrogation of the substantive claims made on the basis of IAT/ implicit bias findings is warranted before we accept it as a basis for policy making.
The analogy with regulating advertising isn’t exact but there’s more symmetry here than comes across in your comparison. Ad campaigns are all about generating implicit associations and they succeed at this impressively, to the extent that to this day it’s hard for me to listen to Bach’s Air on a G String without also picturing a man lighting up a cigar. The ultimate aim is to affect public consumption but the proximate aim is just to establish a mental connection so that as many people as possible automatically think “a cigar called Hamlet” whenever they hear “happiness.” Who knows how the mechanism works but it seems to produce results in consumer behavior, with a $70 billion industry riding on it doing so.
Now, most of us don’t think it’s controversial to see a set of connections between the kind of representational content produced by ad agencies, the generation of implicit associations, and the effect of these associations in influencing consumer behavior. Most of us also believe it’s prudent to regulate the kind of content that ad agencies produce and to hold corporations accountable for promoting associations that are harmful to public health.
Isn’t this the same set of connections, essentially, that believers in implicit bias are trying to address? Again, we can debate what the work of addressing the phenomenon should look like, including the effectiveness of things like anti-bias training and even the term “implicit bias” itself, but in most of the recommendations made by Saul and others that I’m aware of, the advice is really just to be more mindful about including the representation of women on course syllabi, pictures of philosophers of color on department walls, etc. That is: the positive advice is directed at reforming institutions and content, not primarily at reforming individuals. Until the science is borne out more conclusively, such proposals strike me as a justified and fairly modest way of proceeding in the interim.
“…the advice is really just to be more mindful about including…women on course syllabi, pictures of philosophers of color on department walls, etc… such proposals strike me as a justified and fairly modest way of proceeding in the interim.”
I’m unsure how much to say about that proposal here, since it takes us away from the topic of implicit bias. But it seems to me that a number of questions ought to be discussed first. Here are four that come to mind:
1. Would following those recommendations really make, or help make, the desired change? The claim that diversifying syllabi will lead to more women staying in philosophy is repeated so often that few of us ever think to investigate it. But it has been tested on one occasion, and the results were not promising. Thompson, Adleberg, Sims, and Nahmias describe that experiment in Why Do Women Leave Philosophy? Surveying Students at the Introductory Level (umich.edu). Students were surveyed at the end of two large introductory courses, one of which was required to include a number of writings by women. Women taking the latter course were not more likely to indicate a willingness to continue in philosophy.
2. Are the benefits of adopting those recommendations worth the negative side effects? The side effects I’m thinking about here are negative for the very goal the recommendations are meant to pursue: the goal of making women feel more comfortable in philosophy. Here are two incidents that illustrate a couple of the unintended side effects I never see discussed. Many years ago, I had a female student in an introductory ethics course who came to office hours a few times. She was clearly very excited about philosophy. One day, she asked me where she could read more about it. I recommended that she read Judith Jarvis Thomson and Frances Kamm. When I said of Kamm that she had probably done the most to explore the nuances of the problem, the student’s expression suddenly changed from happy to serious, and she looked at me with what seemed like disappointment. She then asked me whether these two people really were the best people to read on the topic, or whether I had just mentioned them because they were women. I assured her that I would never let that be a factor in any recommendations I made to anyone, and that I was sincere, but it wasn’t easy. I had never discussed feminism or the position of women in philosophy with her or the class before, so she had no reason to suspect that I personally would skew my recommendations. And yet, she seemed to have developed a suspicion of any endorsement of female philosophers. A female student I had in another version of that same course once demanded (privately) that I justify my choice of a course textbook by a male author (it was the only text I used). It seems clear that the idea that professors ought to diversify their syllabi is well known to our students, and that many students therefore view all suggestions and readings from female philosophers with at least initial suspicion, as the first student I mentioned did.
The other incident involves a student who was in her second semester of philosophy when she took one of my courses. She, too, obviously loved doing philosophy, and was the most active student in class discussions. She came to see me once during office hours, near the end of the semester. But she didn’t have her customary frantic enthusiasm about philosophy when she started talking to me. Instead, she began grilling me about a poster in the hallway: it was an APA poster that showed a number of female philosophers, and said (as I recall) ‘Got Women?’. She peppered me with doubtful questions: was there some problem with women in philosophy? Is philosophy just something that women don’t take? How many majors are women? And so on. Again, we had something here that was intended to make philosophy more inviting to women, and it instead pulled at least some female philosophy students out of the excitement of doing philosophy with everyone else by making their sex salient and raising an issue that hadn’t seemed to bother them before.
I don’t know how common these sorts of reactions are in female students, and I’ve never seen anyone try to work it out or even consider these possible negative effects. But they clearly occur at least sometimes. In both cases, a class of students are treated as objects that can be manipulated through syllabi and posters so as to further positive social goals, with no thought for how those same students might easily see through our maneuvers (which we never do much to disguise) and resent it or feel distanced from philosophy.
3. Is it really good to use social engineering to achieve a balance of men and women in philosophy, and if so, what is the best way of achieving that end? Female students significantly outperform male students at all levels of education, and are significantly overrepresented at university: For every 100 male students in higher education, there are 131 female students. While there are some areas in which men predominate, these areas are more than offset by the predomination of women in other areas. Psychology has exactly the opposite ratios of male and female students that we find in philosophy. Other disciplines, like English, have an even higher level of female domination. And yet, I know of no campaigns to increase the ratio of male students there by contriving to include more men in syllabi or putting pictures of men on department walls with slogans that anyone would see were meant to manipulate men into feeling more comfortable there. In fact, there doesn’t seem to be any attempt to increase the ratio of men through any means at all.
The strange thing is that, given the predominance of women at university, increasing male enrollments in female-dominated fields would also increase female enrollments in fields like philosophy where men currently predominate, since the men who would be attracted to psychology, English, social work, etc. (if such campaigns really can engineer our choices so easily) would presumably be stripped out of the alternatives. So why are such things never proposed?
For that matter, if ending male preponderance in certain disciplines is so important, why shouldn’t it be approached more directly, without manipulations that students seem to have an easy time seeing through? All it would take would be a quota system in all courses, majors, and advanced degrees. These systems would prevent male or female students from enrolling where they are already overrepresented. This quota system would guarantee that every discipline would end up with a significant predomination by female students (since higher education as a whole has such a predomination by women). Why do we never hear such proposals, either? Such quotas would guarantee proportional equality very quickly, and it seems that nearly everyone feels that that aim is of crucial significance. Why is it better to try to sneakily manipulate people’s perceptions, especially since it seems that they will catch us in the act?
4. Why not do this for those whose viewpoints are more obviously underrepresented? One motivation that is frequently given for these measures is that we can’t have a fair discussion of the issues we need to discuss if we exclude certain perspectives. And yet, it is often unclear how much those perspectives correlate with the demographic groups people seem to assume they can use as a proxy. For instance, women are about equally as divided as men on the abortion issue, so bringing more women into a discussion on abortion doesn’t seem to tilt (or un-tilt) the discussion toward a certain side of that issue.
However, it is certainly the case now that certain sociopolitical viewpoints are underrepresented in nearly all philosophical discussions on those topics. I’m almost certain, for instance, that nobody in my department (including me) supported Trump in either of the last two elections. And yet, about half of the country did. Survey after survey shows a draining away of conservatives, libertarians and centrists from higher education, and it’s now at the point where discussions of many issues of philosophical interest cannot even be fairly conducted, since we don’t have a single philosopher to bring in to present the best arguments and objections from an entire side of the issue. The problem of viewpoint underrepresentation threatens the possibility of the very fair discussions we philosophers need to have. And yet we hear painfully little of any attempts to attract students of diverse political views to philosophy or even to diversify syllabi so that they represent a broad range of viewpoints (which really seems like a no-brainer).
Perhaps, if putting up posters of members of underrepresented groups and diversifying syllabi is the best thing for us to do, we could accomplish both tasks at once: we could use these means to promote philosophers who are not white and/or male, but who are conservatives, libertarians or centrists. But I wish we would start by getting clear on the exact purposes behind these projects, the likely outcomes (good and bad) of pursuing them, and the evidence that they will have the intended effects.
I’ll bite on (parts of) 2. Let “implicit” attitudes be as Edouard defines them here: unconscious, which here I take to mean simply ‘unknown to the subject’ (and so unreportable), mental states that potentially influence behavior.
It seems obvious to many people that we have these attitudes. I might discover, for instance, that I have an association, like that if you say ‘salt’ the first thing that comes to my mind is ‘pepper.’ I might also discover that I was biased for or against somebody–e.g., sympathetically being more lenient towards students from my hometown or enviously holding it against Edouard that he crafts such clever arguments. If you have ever discovered that you said something when in a bad mood that you later regretted–or if you have ever spent time with a ‘hangry’ seven year old–and if you can trace those behaviors to something fairly called an attitude, then this kind of phenomenon should not be surprising. And since we can discover such attitudes–they are at times unknown and yet capable of influencing us–they are implicit on Edouard’s definition.
As long as we accept those commonsense data, we face the moral question that Edouard mentioned above: are we responsible for such attitudes and the behaviors, if any, that stem from them? If you are not even aware that your (say) grading behavior is biased toward students from your hometown, then you might be violating a version of the knowledge condition for responsibility. And if such attitudes are not part of your true self, again you might not be responsible for them. Cue the philosophers.
Obviously, the fact that we have such philosophically interesting mental states does not by itself vindicate the IAT or any other way of trying to track our implicit attitudes. Nor does it mean that we should use the results of the IAT or any other instrument to justify policy changes. Nor does it mean, by itself, that there are troubling implicit-attitude patterns regarding race, gender, or other social categories. But of course, the failure of some measure of implicit attitudes does not mean that there is no phenomenon of philosophical relevance to be studied. It just means that those measures aren’t tracking that phenomenon, and the science has to catch up to the phenomenon. (Though on this point, see Chandra’s comments.) Meanwhile, we can (and should) keep engaging the relevant philosophical questions.
I weighed in several times and it’s been an interesting and pleasantly civil set of exchanges. No one is calling anyone else a racist, fascist, White supremacist, etc., which has become all too common in online discussions like this, so huge hat tip to all here, even when I disagree.
Some of you may find two resources useful:
That is an open repository of more than 40 sources (most peer-reviewed, some chapters, a couple of talks) that provide criticisms of the IAT or implicit bias, or walk back some of the more extreme claims based on said research over the last 25 years. Anyone wishing to defend the IAT or implicit bias should probably familiarize themselves with that literature.
Then there is this:
It is a work in progress, a short list of the tricks of the trade that (mostly social) scientists use to make their findings and arguments sound vastly more credible and impressive than they actually are. It is generic, not on the IAT or implicit bias, but lord above does much of it apply.
David and Justin K. offer their takes on what we are actually disagreeing about in this thread. I think we are disagreeing about one and only one thing. But to get there, I want to give it one more go at clarification.
Suppose you give a large number of White Americans the race IAT. You will observe at least TWO very different kinds of results that have gotten totally muddled in the discussion so far and need to be kept separate.
1) IAT effect: You will observe a very large reaction time difference in most subjects where they are quicker to make pairings in the congruent condition (White-good and Black-bad) versus the incongruent condition (White-bad and Black-good). The group average of this difference is the IAT effect. This effect is large and temporally very stable. If you bring these subjects back in a few weeks, you are going to see a very similar large White-favoring difference. The IAT effect is not a “measure” in the psychometric sense—it is a group average so definitionally it cannot be used to quantify individual differences.
2) IAT difference scores: This is the reaction time difference between congruent and incongruent conditions for each subject. It is not a group average. When people talk about the IAT as a “measure” of individual differences, this is what they are referring to.
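The distinction between (1) and (2) can be sketched in a few lines of code. This is a toy simulation with invented numbers (the real IAT uses a more elaborate scoring algorithm), intended only to show that the two quantities are computed from the same reaction times but are different things:

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_trials = 200, 40

# Invented generative model: every subject shares the same 80 ms
# congruency advantage; trial-to-trial noise is large (sd = 150 ms).
true_effect_ms = 80.0
congruent = rng.normal(600, 150, (n_subjects, n_trials))
incongruent = rng.normal(600 + true_effect_ms, 150, (n_subjects, n_trials))

# (2) IAT difference scores: one reaction-time difference per subject.
diff_scores = incongruent.mean(axis=1) - congruent.mean(axis=1)

# (1) The IAT effect: the group average of those difference scores.
iat_effect = diff_scores.mean()

print(f"group-average IAT effect: {iat_effect:.1f} ms")
print(f"between-subject sd of difference scores: {diff_scores.std():.1f} ms")
```

With these made-up numbers, the group average lands close to the built-in 80 ms effect, while the individual difference scores scatter widely around it from trial noise alone.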
No one in this thread disputes that the IAT effect is large and temporally stable, etc. And to be clear, there are no “anomalies” with respect to the IAT effect. It was reported in Greenwald’s original paper and has been consistently replicated.
Also no one in this thread really disputes that IAT difference scores have poor psychometric properties. Just as Edouard says, difference scores aren’t cohesive across indirect measures, they have poor test-retest reliability, and they have poor predictive relationships with behavior.
Since much of the early IAT literature was based on claims that IAT difference scores DO predict behavior well, then people who accepted this uncritically do have to ask themselves why they were so credulous. This too is not in dispute. No one here is an apologist for false claims that the IAT can discern who is a racist.
The only real dispute in this thread concerns this:
How damning is the poor psychometric performance of IAT difference scores for the construct of implicit attitudes and for the implicit attitudes research program?
Edouard seems to believe it is very damning, and it is basically enough to undermine the implicit attitudes research program. I disagree and so do others.
One problem for Edouard’s position is that you can’t just criticize IAT difference scores and stop there. What explains the huge IAT effect? Edouard doesn’t really address this in detail.
A second key problem is that he treats indirect measures like the IAT in isolation. The IAT is a member of a larger class of conflict tasks like the Stroop task. These tasks have a congruent condition with “matching” stimuli and an incongruent condition with “mismatched” stimuli. All these tasks, like the IAT, generate huge group-averaged congruency effects.
And way before the IAT ever came on the scene, many theorists were already convinced of the following claim for a number of convergent reasons:
(a) The best explanation of the group-averaged reaction time advantage for congruent versus incongruent conditions in conflict tasks is the operation of automatic associations.
More recently, people have found that difference scores from conflict tasks all have very poor psychometric properties. I myself wrote in a paper in Cognition in 2021 that conflict task difference scores are not cohesive across tasks, they have poor test-retest reliability, and they have poor predictive relationships with behavior, eerily similar to what Edouard writes about IAT difference scores.
When faced with evidence of poor psychometric properties of difference scores from conflict tasks, most theorists did not abandon (a)—I am aware of no one who did that. They instead retained (a) and proposed a specific explanation for the failure of difference scores. The two leading approaches are i) excess uniformity across subjects in the factors that produce the congruency effect; ii) failure of pure insertion. I won’t explain these here, but you can look at my Commentary for some details. But I doubt anyone in this thread disputes (i) and (ii) are viable explanations for why (a) is compatible with poor psychometric performance of difference scores.
So that is a second key problem for Edouard’s view. In the case of conflict tasks like the Stroop, psychometric problems for difference scores were not seen as a reason to abandon (a). But Edouard seems to think they are damning for the analogous case of the IAT.
Edouard’s post is supposed to be a plea for circumspection in how we treat scientific evidence. My criticisms of his post are very much in that spirit. We are hearing from him key problems for the IAT and his reasons for pessimism. I and others are pushing back with key considerations in favor of the implicit attitudes construct. That right there is pretty much the only real thing in dispute.
The idea that there are no anomalies in the IAT effect is simply wrong. The well-replicated mean group difference in reaction times is exceedingly poorly understood. One reason is that when an individual measure (which is what the IAT is) is so heavily polluted by non-attitudinal artifacts and biases, the conditions under which those artifacts and biases simply evaporate when averaging across groups are rare indeed.
Concretely, Blanton, Jaccard & Burrows (2015) used Monte Carlo simulations to show that, even when 90% of a sample were unbiased and 9.8% were only slightly pro-white, the IAT can produce results showing massive pro-white bias. That this is even possible clouds any simple interpretation of the mean IAT effect as saying much about bias (implicit or otherwise).
Relatedly, Conrey et al 2005 showed that no fewer than four separate processes, only one of which is implicit attitude, contribute to IAT scores. Again, this seriously complexifies interpretation of any group differences.
I see that the Payne theory paper has gotten a lot of play in these discussions. Anyone wishing to make an informed judgment about how much credibility to ascribe to it should also then read Connor & Evers’s (2020, Perspectives on Psych Sci) critique of the Payne theory. Here is a quote from their abstract:
“using real and simulated data, we show how each of Payne and colleagues’ proposed puzzles can be explained as being the result of measurement error and its reduction via aggregation. Second, we discuss why the authors’ counterarguments against this explanation have been unconvincing. Finally, we test a hypothesis derived from the bias-of-crowds model about the effect of an individually targeted “implicit-bias-based expulsion program” within universities and show the model to lack empirical support.”
The IAT may measure something useful rather than nothing. If this is the argument, this is such a minimalist claim that it raises the question, “Have we really spent 25 years and godawful numbers of journal pages and scholar time and lord knows how many tens of millions of dollars to discover that the IAT is *not completely useless*?” I leave that for the rest of you to debate.
Last, I will leave you with this quote, from the first wave of research on implicit cognition (Reber, 1989, J. Experimental Psych: General):
“The conclusions reached are as follows: (a) Implicit learning produces a tacit knowledge base that is abstract and representative of the structure of the environment; (b) such knowledge is optimally acquired independently of conscious efforts to learn; and (c) it can be used implicitly to solve problems and make accurate decisions about novel stimulus circumstances.”
In other words, it may well be that Payne almost got there. Yup, group diffs are stable. Yup, they may reflect existing inequalities. Think about that: it may not be that they cause inequalities very much (or at all). They may *reflect* them.
There is abundant evidence that people’s beliefs about groups often reflect realities to at least some degree, often to a large degree. Therefore, simple correlations between beliefs & attitudes (even implicitly measured ones) are completely uninformative about whether the belief/attitude causes the reality or whether the reality causes the belief/attitude.
The existence of replicable group differences has not solved the anomaly problem.
In fact, that it is often interpreted as some version of “replicability means credibility” speaks to another misunderstanding of how to interpret psych science (probably science more broadly but I will for now eschew too much epistemic trespassing).
The reification of replication is not justified. It is certainly true that if something is not replicable it deserves no credibility. But that some result is replicable says nothing about what that result means.
Aspirin thins the blood. Highly replicable. Not at all clear that taking it reduces the chance of heart attack.
Moral of the story: Please do not overinterpret or reify replicability of a finding using a particular method as equivalent to credibility or justification of some psychological conclusion or claim.
Lee, Claims that IAT difference scores strongly predict behavior failed replication again and again. We all agree that is a big issue, and some will agree with Edouard that this is even an anomaly that threatens to undermine the implicit attitudes research program and render “implicit attitudes” as a term that fails to refer.
In contrast, the group averaged IAT effect replicated again and again. Yes, we don’t understand it well. But not understanding a well-replicated effect is not an *anomaly*! Instead, it invites and demands that we study the replicated effect. Conrey and many others showed irrelevant processes, like individual differences in executive control, contribute to the group-averaged IAT effect. These irrelevant processes usually add *noise variance* to the data making it all the more remarkable that the group averaged IAT effect is so large. All the more reason to study this large, replicated effect and understand its drivers.
I love your Reber quote. You might be right that implicit attitudes mostly reflect inequalities rather than cause them. But posing this question already means the term “implicit attitudes” refers to something real—precisely the nub of the disagreement.
Chandra, ok, let’s start where we agree. Do people have associations between concepts in memory? Of course. Can these involve stereotypes and prejudice? Of course.
“Can” is fair but doing a lot of work here.
Does the IAT do a good job of capturing any of this at the level of group differences? No one knows. “Good” needs to be defined.
Does the IAT capture something-as-opposed-to-nothing? Sure. But so does any bad questionnaire measure with an internal reliability greater than .20. No one would take that seriously.
Your analysis of the Conrey et al Quad model seems to me to be the type of thing Edouard highlighted — leaping to an unjustified conclusion. You wrote: “These irrelevant processes usually add *noise variance* to the data making it all the more remarkable that the group averaged IAT effect is so large.”
Conrey et al neither described non-associative processes as “noise” nor treated them that way. They never reported the intercorrelations among their four parameters. Therefore, the extent to which the four processes they identify contribute to the IAT effect found in large samples is unknown absent someone actually testing it.
The Blanton et al paper showing that nearly-total egalitarianism in a sample can produce a large IAT Effect is certainly plausibly described as an anomaly. So is Blanton et al’s finding (separate paper, same year) that IAT scores move toward a gigantic bias effect of D=2.0 when within block variance moves to 0, regardless of how small the absolute bias.
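Blanton et al’s second finding is easy to see in a sketch. The IAT’s D score divides the mean latency difference by the variability of the latencies, so shrinking within-block variance inflates D toward its ceiling even when the absolute bias is tiny. The following is a simplified illustration with invented numbers (the actual scoring algorithm adds error penalties and trial trimming):

```python
import numpy as np

def d_score(congruent, incongruent):
    """Simplified Greenwald-style D: mean latency difference divided by
    the pooled standard deviation of all trials."""
    diff = incongruent.mean() - congruent.mean()
    pooled_sd = np.concatenate([congruent, incongruent]).std()
    return diff / pooled_sd

rng = np.random.default_rng(1)
bias_ms = 10.0  # a tiny, fixed 10 ms absolute advantage

d_by_noise = {}
for noise_sd in (100.0, 10.0, 1.0):
    c = rng.normal(600, noise_sd, 10_000)
    i = rng.normal(600 + bias_ms, noise_sd, 10_000)
    d_by_noise[noise_sd] = d_score(c, i)
    print(f"trial noise sd = {noise_sd:6.1f} ms -> D = {d_by_noise[noise_sd]:.2f}")
```

Holding the 10 ms absolute advantage fixed, D climbs from roughly 0.1 to nearly 2.0 purely as a function of how noisy the trials are, which is the artifact Blanton et al describe.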
I do agree that whether IAT Effects are causal versus caused is not an anomaly. What is an anomaly is the nearly-total blindness of the social science community to have even considered the possibility that reality causes the IAT effects, especially given what was known about implicit cognition before the IAT was created.
But that’s what happens when one lives in a conceptual, theoretical, and political bubble (I am referring to what Schimmack called the Prisoners of the Implicit Social Cognition Paradigm, not to you, Chandra).
I just read this article and the whole comment thread (as of 9am est May 26 2022) and I have to say I’m confused: is IAT “implicit association test” or “implicit attitude test”? It seems to me that everyone who’s questioning the “construct validity” is working with the understanding that the construct in question (i.e. the thing supposedly being measured) is ‘attitude’ and Chandra keeps pointing out that the construct of ‘association’ is very well established.
I, also, wish someone would make heavier weather about this distinction.
Hi Hunter and Guy,
We raise and elaborate on this point on page 3 of (the preprint of) our commentary (DOI: 10.31234/osf.io/y5nm9) that Michael Brownstein linked to in a separate comment (dailynous.com/…/#comment-432809). Happy to talk about it more on- or off-line.
Chandra puts the claim under discussion as follows: “a) The best explanation of the group-averaged reaction time advantage for congruent versus incongruent conditions in conflict tasks is the operation of automatic associations.”
As Lee J. points out, there are many confounding factors here, especially in the interpretation. But if the version of the IAT construct that is still salvageable is nothing more than the above articulation by Chandra, then very little of what is generally taken to follow from it seems to follow from it (I think that Chandra might agree with that). It would not follow that the IAT effect gives us any reason at all to think that individuals *or even groups — even at the scale of society as a whole* — are biased, or that bias has any causal power in generating the IAT effect (for the reasons that Lee outlines). If Chandra’s claims about the results being so robust are true, but nothing more can be said, then all sorts of things that were introduced not many years ago with great fanfare need to be thrown out the window. This is not limited to mandated antiracism training. As has been discussed already, many philosophical projects (e.g. those about belief/alief and moral responsibility) have been built on the rock of the more robust version of the implicit attitudes paradigm (the one that posits unconscious bigotry as the cause of the effect). I cannot even count the number of seminars and talks on teaching I’ve attended, or discussions I’ve read about grading, that warned people to heed the alleged results of the IAT ‘science’. Whether or not the effect Chandra is interested in is robust, a great deal here is due for an overhaul.
It’s as though much discussion within academia and beyond presumed the truth of phrenology for a decade or two, after which it became very doubtful that there was any association between people’s attitudes and the bumps on their skulls, at which point some scientists shifted to using the term ‘phrenology’ to refer to the simple study of bumps on skulls without presuming anything about any possible effect of those bumps on attitudes, and just about everyone else either kept assuming that bumps on skulls determine our attitudes or else suddenly started denying that they ever thought so. Maybe there are still interesting bump-on-skull things to keep looking at, and maybe that study should still be called phrenology; but one would hope for the deafening crash of all the old junk being thrown out the window at once while we figure out how we got this so wrong. I’ve got nothing against the continuation of the IAT research after the main promise of it (i.e. the revelation and understanding of seemingly deep and hidden levels of bigotry) seems permanently unfulfilled. But given the immense importance that was initially given to the supposed causal and explanatory significance of the IAT results, it sure feels strange that that’s apparently not the big headline people are seeing here.
Yes, Justin, if it turns out that stigmatizing implicit attitudes are widely shared in a population much like semantic associations between “doctor” and “nurse”, then it will be hard to establish ties to behavior using the standard individual difference paradigms that nearly all social psychologists like to use. That is because meaningful individual differences are absent. The same thing would be true if I am trying to link height to basketball playing behaviors, and all my subjects are pretty much 6 foot 5. This doesn’t make implicit attitude research like phrenology. It just makes studying the phenomenon a lot harder.
My sense is that people are upset and feel lied to by early overheated claims that one’s personal implicit attitudes are strong predictors of differences in behavior, and the IAT could tell you if you are an implicit bigot or not, and so forth. When those claims turn out to be totally false, people want to abandon implicit attitudes altogether. But that is just anger and frustration, not circumspection in dealing with scientific results, which is the whole point of this discussion.
Thank you for the response, Chandra. I think we’re getting closer to something common that we take to be at issue, but that we’re not there yet.
As I understand the literature (and I hope that you and Lee will put me right on this), the problem is not just that the exact causal mechanism that connects [whatever the IAT measures] and racist behaviors is difficult to get clear on. It’s that there isn’t evidence that there’s any correlation at all.
In other words, as I understand it, no significant correlation has been found between people who take longer than average to make the black/positive and white/negative pairings and people who act in a manner that independently suggests bigotry. And yet, if there were any causal mechanism at all (even one we haven’t yet identified) leading from [whatever causes longer hesitation times on the IAT] to bigoted behavior, it seems that we would be very likely to have found that.
Do you think that I have the facts wrong here?
Thanks, Justin. But that is not the state of play as I see it.
Maybe I am being thick-headed or pedantic, but I would never use a phrase like “whatever it is that the IAT measures”, as if it needs to measure something. I use “measure” and its cognates in a psychometric sense to refer exclusively to instruments that quantify individual differences. If racial implicit attitudes are like semantic associations, i.e., they are pretty uniform in strength across a population and they do not meaningfully vary across individuals, then the IAT measures nothing in this psychometric sense. If someone is faster than another person at pairing white and good versus black and good, that will be just due to noise.

But keep in mind that in this “excess uniformity” regime, we will still see a huge group-averaged IAT effect (see my long comment above for the difference between the group-average IAT effect and IAT difference scores). That group-averaged IAT effect still calls out for explanation. And at least some people, like me, will point out that there are good reasons to explain that group-averaged IAT effect in terms of automatic associations like we do in other conflict tasks, in this case automatic associations between a social category and a valenced representation (which is what I mean by implicit attitude). So we will have some evidence that white-favoring implicit associations operate in people’s minds. But we will have a hard time tying differences in those associations to behavior, because true differences in association strength are absent—whatever differences there are reflect noise.

It follows that I reject this statement: “And yet, if there were any causal mechanism at all (even one we haven’t yet identified) leading from [whatever causes longer hesitation times on the IAT] to bigoted behavior, it seems that we would be very likely to have found that.” In an excess uniformity regime, we won’t easily find these mechanisms.
Sorry if I’m missing something here, Chandra. But let me try to put what I’m saying more succinctly to see whether that gets us anywhere.
You say, “In an excess uniformity regime, we won’t easily find these mechanisms.”
But what I’m saying is, even if we’re unable to find mechanisms (causation), a lack of correlation should be sufficient to raise doubts that there are any mechanisms to look for at all.
You say that, if we rise above the individual level to the group level, we find an effect on the IAT that can’t be explained away by chance. Okay: let me attempt an analogy. Suppose we find that Americans are apt to drink more coffee than tea. Suppose further that nobody has found much to confirm this at the individual level, but that the effect becomes visible when we examine America’s hot beverage drinking at a national level. Some people who study these sociological questions about hot beverage drinking in America posit, let’s say, that the reason for this is that Americans unconsciously associate patriotism with a rejection of tea, and that Americans tend to be unconsciously patriotic on about seventeen days of the month.
Now, this would be an interesting hypothesis to explore, even though it might be difficult to work out exactly what *mechanism* could be responsible for turning the underlying feelings of patriotism into a choice to drink coffee over tea. But even if we had no idea what the *mechanism* were, we could presumably check whether there is a *correlation* at all between choosing coffee over tea on a certain day and being patriotic on that day. For instance, we could give people patriotism tests wherever they get their hot beverages and see whether, on the whole, the coffee drinkers tend to be more patriotic. If they are, then we’ve got some support for the hypothesis. But if they’re not, then we have some stronger evidence against the hypothesis. We would have good reason for doubting that there is any such correlation. We would have that counter-evidence without ever having to get into the question of what the mechanism could be. Similarly, the correlational point I’m discussing seems to tell against the implicit bigotry hypothesis even if it’s true that the mechanisms would be difficult to find.
“we could presumably check whether there is a *correlation* at all between choosing coffee over tea on a certain day and being patriotic on that day.”
Excess uniformity means precisely that day-by-day variation in patriotism does not as a matter of course happen. So what this maps onto in terms of doable studies that exhibit the needed attitude variation is really not clear.
I noticed that my previous comment contained all the same points as my very first comment, some 40 comments above 🙂 It’s been a good discussion, and I learned more about how philosophers see these issues — certainly very different from my own take. But I suspect it is time to wind this one down…
Justin, even I am convinced that there is *some* nonzero correlation between IAT scores and discrimination-like behavior. Even the skeptics (such as Tetlock) have found this, albeit at levels most of us consider quite low (~r=.15) but which advocates make all sorts of arguments to salvage as “not small” (I think they’re bad arguments but let’s put that aside).
There is no evidence that those correlations are justifiably interpretable as IAT scores *causing* discrim. Implicit bias trainings don’t reduce discrim even when they do change IAT scores!
So *is it possible* that *something* as opposed to nothing interesting is going on? Of course. As I wrote, I DO work using the IAT.
But after 25 years, oodles of grant $, deployment of the very poorly justified claims about implicit bias for naked political purposes (Clinton, Harris, every DEI or bias training that justifies itself based on the “science”), whole careers made on the backs of overwhelmingly unjustified claims while other scientists labor in obscurity doing the hard work of finding things that are actually true and can’t get grants and get published in C-level journals (to be clear, this is way not about me, my career has been more than fine, thank you very much), after all that, perhaps the feeling of frustration, of having been lied to, of having been sold a heap of snake oil masquerading as “science!” is more than a little justified? As is huge heaping mountains of skepticism about anything that ever emerges from the implicit social cognition paradigm ever again?
Fool me once, shame on you. Fool me twice…
Thanks for this, Lee. I had read elsewhere that nobody had ever succeeded in showing any significant correlation between IAT scores and discriminatory behavior in any context, but you are clearly one of the great experts in the field, and so I of course defer to you on all these empirical issues!
Either way, it seems that your general point stands — that scientific support for the robust claims about implicit bias that have long been taken for granted in politics, antidiscrimination training, and so on is inadequate by a vast degree. Nobody here seems to dispute that, and I think it shows that Machery is right in thinking that the attempts to build philosophical projects on the foundation of what was taken to be the science of IAT really jumped the gun.
Thanks, Chandra. I agree that your recent comments repeat what you already said in your original comment. What I’ve been asking all this time has been about something slightly different than what you keep saying, which is why I’ve continued to try to ask it in different ways. Either I haven’t been able to get across my true meaning, or perhaps the two things really are the same, despite appearances, and I just don’t see it. Either way, thanks for the responses.
I am very happy with this thread. It was a substantial discussion that remained relatively cordial in spite of the disagreements and the stakes. I also would like to thank Lee and Ulrich for their contribution. They have done a lot to challenge the orthodoxy in this area of psychology (and in others too).
It is important that readers do not miss how much *everyone* in this thread, as far as I can see, *agrees* on some crucial points, despite being keen to emphasize differences.
Here they are:
1. The IAT and other indirect measures of attitudes do not measure stable constructs and do not capture individual differences. At least this is what current evidence suggests.
2. So the race IAT and other race-related indirect measures (mutatis mutandis for other biases) do not measure anything we usually understand as racism.
3. The IAT and other indirect measures of attitudes do not predict cross-situational biased behavior well at all. From a pro-white IAT score, you can’t predict how someone will behave in a racial context, particularly if you have controlled for their explicitly reported bias.
While readers of this blog post have not commented on this, no one has challenged the claim that there is to date ZERO evidence of a CAUSAL link between what the IAT and other indirect measures of attitudes measure and behavior.
Something I didn’t mention in my post, but talked about in the papers, is that debiasing programs are costly but, as far as we know, utterly useless.
Now, the IAT measures something. What it is and how well the IAT measures it are unclear, as is why we should care about it. Calling its measurand “implicit attitudes” is just equivocating.
My sense is that philosophers, policy makers, and the lay public got interested in the IAT and other indirect measures of attitudes precisely because we thought that they measure something reasonably well related to what we call biases. They don’t. So perhaps the IAT will remain a useful tool in social psychology (in the long run, I doubt it: it will join the heap of forgotten indirect measures of attitudes in psychology), but its possible uses do nothing to salvage our past enthusiasm.
Here is something I think we can all agree on: Claims that the IAT (as currently used) can quantify individual differences in “unconscious” racial attitudes and reveal whether you personally are an inner bigot are blatantly false. And we should all do some soul-searching to figure out how such claims gained wide traction, including in the academy. I do feel like this needs to be done. And it is likely I hijacked this discussion a bit with other, more “in the weeds” issues that loom larger for me. If I did that, my bad.
And yet, here I go again. I am happy with Edouard’s summary, but there are some subtle differences. Here is my own take on what we agree on and some important areas where we do not have agreement:
1. We all agree, I think, that there is a reliable, replicated, and large group-averaged IAT effect in which White Americans are quicker to pair White and good versus Black and good (and Black and bad versus White and bad).
2. We do not have agreement on what best explains this group-averaged IAT effect. It could arise from: 2a) an automatic association between social category representations and valenced representations; 2b) some other confounding factor, like the salience of the relevant social categories, that has nothing to do with valenced representations; or 2c) something else.
3. We all agree, I think, that per-subject IAT difference scores correlate poorly, if at all, with discriminatory behaviors. To repeat, here we are focusing on individual differences in IAT difference scores (i.e., individual differences relative to an elevated mean), not the group average IAT effect discussed in (1).
4. We all agree, I think, that there is to date no evidence of a CAUSAL link between whatever it is that individual differences in per-subject IAT difference scores quantify and behavior.
5. We do not have agreement on why the individual-difference correlations with behavior are poor. This could be because: 5a) there are no automatic race-related valenced associations at all (as in 2b and 2c); 5b) there are automatic race-related valenced associations, but they do not meaningfully differ across subjects (“excess uniformity”); 5c) there are automatic race-related valenced associations, but they are causally inert with respect to behavior; 5d) there are automatic race-related valenced associations that are causally efficacious, but they are not quantified well by difference scores or D scores, and some other computational model is needed; or 5e) something else.
6. We all agree, I think, that given the unsettled state of the science, we should not be doing things like mandated implicit bias training, debiasing training, etc. Teachers and professors should not be telling students we know that certain disparities arise due to implicit biases.
I *think* everyone agrees on (1), (3), (4), and (6). In any case, I strongly agree with them.
Regarding (2) and (5), I think 2a and 5b are certainly viable and are as good or better than any alternative hypothesis. But hopefully we agree that reasonable people can disagree about which options to take in (2) and (5).
Thank you, Chandra, for your contributions to this thread. As an outsider to this topic, I’ve found them extremely helpful.
This thread has been really illuminating for me. I’ve heard of implicit bias tests but have never taken one. It’s good to hear the science behind this issue because it is so politically sensitive.
Readers may also be interested in the following post by Eric Schliesser, which focuses on another aspect of my original blog post: the relation between philosophers and science.
Save for David Wallace, few here have commented on the sense that philosophers tend to be credulous when it comes to science (or, a slightly different problem, use it in a “lawyerly” way to make a point).
I am sympathetic to Eric’s criticism of the consumer model of science. On the other hand, I am mindful of epistemic asymmetries between scientists and non-scientists and of course among scientists themselves: Scientific communities are built in part on sharing information that recipients need, but aren’t able to fully assess.
While I think much of your criticism of the evidence is on point, this narrative about being misled by those unreliable scientists who overhype and under-replicate their work is just making excuses. Philosophers were just as much to blame here as scientists.
The failure with these tests wasn’t that anyone lied, nor was it primarily that scientists assumed a result replicated. It was a failure of inference: what had been tested didn’t support the kind of uses it was put to (which implicitly assumed it measured what we ordinarily mean by bias). Spotting issues like that is kinda the point of philosophy of science. If philosophy is supposed to offer value (and I believe it can), it’s exactly situations like this where it should show up and point out the issues (and sooner than we saw). And given the piles of philosophical work about the limitations/flaws/etc. in the scientific process, you can hardly claim to be naive about the dangers of taking scientific publications at face value.
No, philosophers were ideally situated to notice this issue and call it out. The reason they published all those papers about it instead is exactly the same reason the scientists weren’t more skeptical. Taking it at face value let you write some great papers that were sure to get accepted while making you feel warm and fuzzy. Challenging the claims risked being branded as some kind of right-wing denier of racism. Yes, it’s absurd, but once a criticism becomes associated with partisans on one side, people tend to assume anyone who makes it has that motive.
So yeah, science needs to do better, but look to your/our own house first. Once you can fix the problems that caused so many philosophers who should have known better to take it at face value, you can give the scientists some tips.
Peter, that is a killer set of points. However true it is of philosophers, though, *it is even more true* of social psychologists, who have all the training to have known better. A few did — Tetlock, Blanton, Jaccard, LeBel, and, more recently, Schimmack, Corneille, me. But you had to risk your career to do it — publishing critiques was massively more difficult than publishing endorsements (of implicit bias), so it was a huge timesuck. And you would then be subject to whisper campaigns (and some much more public accusations) that you were some sort of right-wing reactionary/white supremacist. Jost et al.’s “10 Studies No Manager Should Ignore”** paper all but says this in so many words — imagine what went on behind the scenes. The reputational risk wasn’t worth it for most people, no matter how much they saw through the nonsense.
But what we have here is a global failure to engage in the type of thoughtful skeptical vetting of popular ideas that is the supposed responsibility of truth-seeking institutions such as academia. Do you know who blew the lid off the worst of the IAT nonsense? A journalist, Jesse Singal. This one report cracked open the door to legitimate criticism:
It is no coincidence that that sort of hammer blow came from *outside* academia.
**Go here for an analysis justifying its recommendation that the “10 Studies No Manager Should Ignore” should, in fact, be ignored (e.g., because few were actually about racial discrimination, and those that were found hardly any):
“But you had to risk your career to do it — publishing critiques was massively more difficult than endorsements (of implicit bias) so it was a huge timesuck. And you would then be subject to whisper campaigns (and some much more public) accusations that you were some sort of rightwing reactionary/white supremacist.”
That Jesse Singal piece you cite was what tipped the thing for me. I remember being quite shocked when I read it. I credulously taught the IAT to my students for years as presenting the unvarnished truth about human nature. For years while all this was going on, I just thought it was a matter of ‘trusting the science’ — because I knew, or so I thought, that psychologists were subjecting it all to rigorous testing and would have blown the whistle if there were anything amiss or if there were any other plausible interpretation.
I also assumed, for years, that anything discussed and endorsed by virtually every philosopher who had looked at it had to be solid. After all, isn’t this what we’re paid to do, and what we’re so good at doing, and what drives us to get into philosophy in the first place? When I was making my way through grad school, I saw people all around me who would jump on the slightest weak point in an argument or position and expose the problems. The possibility that hundreds or thousands of philosophers could stare right at something, talk about it day after day with their philosophical glasses on, and unanimously come out endorsing it to the point where it could serve as support for an entire area of philosophical research seemed preposterous unless the thing was rock-solid.
And yet, that turns out to be exactly what’s happened not just with the IAT but with many other things we haven’t discussed here.
The result? At this point, I have much less faith in the consensus of philosophers on any topic. But when the topic is anything to do with sociopolitical issues, and the thing just about every philosopher and/or social scientist agrees on conveniently supports the very political view that more or less everyone in academia endorses independently, I try not to let the academic consensus move my belief in any way. At this point, it just seems epistemically irresponsible to let such a consensus move the needle at all.
It’s not even as though the pressures are hidden very carefully any longer. For years now, we’ve seen people openly attack philosophers and journals for publishing views they find objectionable — often in open letters, no less. And now it’s even got to the point where people have to endorse certain views as part of the official job application process in order to have a chance at being a member of the profession after completing a decade’s worth of preparations for it.
I’ve been called a ‘sea lion’ — by others on this blog, even — for pointing out that the evidence in favor of views everyone else wanted to take for granted was very weak. (I’ve generally turned out to be right on those matters, but by that time the attention had shifted elsewhere.) But I don’t understand how any informed and reasonable person could *resist* a skeptical, wait-and-see view, at least, on these issues.
This is one of the many things that the dogmatists, enthusiasts and grandstanders keep missing. They present the issue as one between the comfort and feelings of inclusion of certain people on one side, and on the other, the freedom of some people they deem bad to say things they find objectionable. But there’s so much at stake beyond that short-sighted vision. One of the many other things at risk here is the quick erosion of public trust in anything scientists and philosophers say, and in academia’s role as a neutral and rigorous arbiter of ideas. If the public stops trusting us on issues of social science and ethics, then whom exactly will they trust, if anyone?
I became aware of these problems after years of endorsing only the progressive side of the related issues. As a result, I’ve become much more agnostic on many things I felt confident about before. If these sorts of misrepresentations of the state of play have caused me to lose faith, what are they apt to do to members of the public who don’t share my predispositions, when they see how easily those of us in academia can be tricked into chasing each other into overconfidence on issues on which we have a collective and personal bias? And can we really blame them, if we routinely fail to live up to the ideals that justify the public support and trust we’ve enjoyed, and then shrug our shoulders and engage in denialism when everyone else can see the thumb on the scale?
People like to say, “Well, what would you prefer: an environment in which flat earthers, creationists, defenders of slavery, and other cranks get to argue for their points as much as they like?” I wouldn’t say I’d like to hear everything they have to say, but it seems to me that the damage done by a handful of cranks who will have a hard time coming up with persuasive arguments to use against their well-informed and level-headed peers would be vastly less than the harm caused by our continuing to stifle — and be seen to stifle — free discussion to the point where hardly anyone hears about the major problems with the inferences drawn from IAT research (say) until those inferences have permeated everyone’s thinking and practices in the name of science and ethics. In a liberal society, trust in the integrity of an *unbiased* research system is at least as important as trust in the integrity of unbiased law courts. And we’re blowing it.
Heh. My Chair term ends 6/30/22. I will be on sabbatical. Only half-joking, Justin, I propose a collaboration on a paper tentatively titled:
As a white person who never showed the standard pattern on these tests, I was always aware of *wanting* to believe the claims made for them (since they would suggest that I am egalitarian even at the level of my implicit attitudes) and yet of simultaneously doubting those claims (since I’m aware that I do have biases). And now that does make me interested in this question of why so many people jumped on them.
I wonder if people who were in fact dimly aware of having racist biases and had earlier been anxiously trying to repress them were attracted to a view that says “But don’t worry, you can’t help having them.”
Or maybe it’s much simpler than that; maybe it just made racial inequalities seem to be the fault of biased individuals (vast numbers of biased individuals, but still, individuals) rather than the fault of systems or structures or inheritance from past injustices.
In any event I suspect that the answer here has something to do with what this science was a science of, even though I’m sure there’s also a general temptation to jump too quickly on bold social scientific claims on a number of topics.
When I was in university, I saw different reactions to implicit bias research. Dissenting voices were more encouraged and taken more seriously even within psychology departments than they were in the English and humanities departments (most disappointingly in philosophy, where I graduated).
Also, about your statement on the science of meditation: are you saying that the evidence is scant, or that it is worth the hype? My own research into the topic suggests that the science of mindfulness is borderline bankrupt: held up by scant evidence, conceptually infused with religious doctrine, and with some researchers even suggesting that there is a bias towards publishing positive evidence (look up “dark nights” in mindfulness, which cause some patients with deep psychological traumas to have full-blown dissociations and even experience psychosis).
Anyways, thanks for the post.