Rule o' Thumb

Anyone who has ever read a textbook on research methods in the social sciences has come across the claim that correlation is not causation. Taken at face value, that's about as valuable as saying that a thermometer reading is not temperature. What's meant is that correlation does not imply causation.

That's correct. Given that in the social sciences (with the exception of psychology, assuming you call that a social science) you can rarely run an experiment, it's also not very helpful. The rule of thumb that I use is that correlation implies causation to the extent that alternative explanations for the correlation are implausible. In other words, if I say that this correlation suggests causation, it is not good enough for you to offer the "correlation is not causation" mantra; you also have to come up with a better explanation for why the correlation is there.

Max Goldt said that a general problem with "writing" (his scare quotes, with which, I think, he meant to distinguish writing proper from writing shopping lists) is that you can never be certain that you developed an idea you present as your own yourself. My introspection tells me that I came up with this rule of thumb all on my own years ago, but I know that memory is far from perfect and that I'm the kind of person who is likely to read texts which treat this topic. Also, the idea isn't exactly Nobel material, so I think it's unlikely that no one before me thought of it. I am comfortable saying, however, that all of the textbooks I've read said that correlation is not (or does not imply) causality (as they should) and gave reasons for that statement (as they should), but none included the above rule of thumb. As they also should have.

And when you read the discourse, it shows.


pj said...

I think you need to add that before positing that a given correlation does in fact imply causation, you should have made a pretty strenuous effort to eliminate other actual or potential confounding factors.

In the field of medicine most similar to the social sciences, epidemiology, I've lost track of the number of studies on, say, the effect of local deprivation on heart disease that don't even measure smoking, let alone attempt to control for it, and then announce with great fanfare that the stress hormones from being poor give people heart disease.

And there are good arguments made that you can't always meaningfully statistically control for the effects of confounds. Analysis of covariance and linear correlations are not always the answer.

LemmusLemmus said...


The rule I proposed was that "correlation implies causation to the extent that alternative explanations for the correlation are implausible." When you say that there are possible confounding factors that were not controlled for, this is the same as saying that there are alternative explanations that are plausible, no?
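To make the deprivation and smoking example concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the probabilities are made up, the names `simulate` and `stratum_r` are hypothetical, and by construction deprivation affects heart disease only through smoking. The raw correlation between deprivation and disease is clearly positive, yet within smokers and within non-smokers it hovers around zero. That is what a plausible alternative explanation looks like in data.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def simulate(n=20000, seed=1):
    """Made-up world: deprivation raises smoking, smoking raises disease.

    Deprivation has NO direct effect on disease here (an assumption of
    this sketch, not a claim about the real world).
    """
    rng = random.Random(seed)
    deprived, smoker, disease = [], [], []
    for _ in range(n):
        d = 1 if rng.random() < 0.5 else 0
        s = 1 if rng.random() < (0.6 if d else 0.2) else 0
        h = 1 if rng.random() < (0.3 if s else 0.1) else 0
        deprived.append(d)
        smoker.append(s)
        disease.append(h)
    return deprived, smoker, disease


def stratum_r(deprived, smoker, disease, value):
    """Deprivation/disease correlation among people with smoker == value."""
    rows = [(d, h) for d, s, h in zip(deprived, smoker, disease) if s == value]
    return pearson([d for d, _ in rows], [h for _, h in rows])


deprived, smoker, disease = simulate()
r_all = pearson(deprived, disease)
r_smokers = stratum_r(deprived, smoker, disease, 1)
r_nonsmokers = stratum_r(deprived, smoker, disease, 0)

print(f"all respondents:  r = {r_all:.3f}")
print(f"smokers only:     r = {r_smokers:.3f}")
print(f"non-smokers only: r = {r_nonsmokers:.3f}")
```

Splitting by smoking status works here only because the made-up world is this simple; it is not meant to suggest that stratifying always settles the matter.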

The example you mention from epidemiology simply sounds like shoddy, and possibly politically motivated, science. Apart from the problem of insufficient controls, my superficial impression (and correct me if I'm wrong here) of epidemiology is that pretty much anything goes: Just collect 100 independent and five dependent variables, and, sure, you're going to find some significant associations. And if not, you can still partition the data ("divide and conquer").
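The "100 independent and five dependent variables" point is easy to check with a toy simulation. The following Python sketch is an illustration under stated assumptions, not anyone's actual study: all variables are pure noise, the sample size of 200 is arbitrary, the name `count_spurious_hits` is hypothetical, and the 5% cutoff uses the large-sample approximation 1.96 / sqrt(n) rather than an exact test. With 500 correlations and no real effects anywhere, you would expect roughly 5% of them, so around 25, to come out "significant".

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def count_spurious_hits(n_obs=200, n_indep=100, n_dep=5, seed=0):
    """Correlate every noise predictor with every noise outcome.

    Returns (number of |r| values above the approximate 5% cutoff,
    total number of correlations computed).
    """
    rng = random.Random(seed)
    indep = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_indep)]
    dep = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_dep)]
    # Large-sample approximation to the two-sided 5% critical value of r.
    critical_r = 1.96 / math.sqrt(n_obs)
    hits = sum(
        1 for x in indep for y in dep if abs(pearson(x, y)) > critical_r
    )
    return hits, n_indep * n_dep


hits, tests = count_spurious_hits()
print(f"{hits} of {tests} pure-noise correlations pass the 5% threshold")
```

Partitioning the data afterwards multiplies the number of tests again, and with it the number of chance hits.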

This kind of approach is much less common (although by no means unheard of) in the segment of the social sciences I know of: Usually measurements are motivated by theory.

I'm not against epidemiology. If you find the same result again and again (using appropriate controls), that's strongly suggestive. If you can show the same thing in a randomized controlled trial, even better. And if then you can show the actual physical mechanism at work, perfect. So I think there is a role for epidemiology mainly as a hypothesis generator, but when I read in the paper that eating broccoli causes short-sightedness, I switch to the next article.

pj said...

The problem is that many people seem to think that the burden of proof falls on those questioning the finding to prove that the result is due to confounders - rather than it simply being sufficient to highlight that confounders were not controlled for.

This is something I've noticed a lot - people accept that a study is fundamentally flawed but still believe the result (and thus quote the paper) because it accords with their prejudices.