Showing posts with label Introduction to Causality. Show all posts

13/04/2011

Unpacking Karl Smith on Experiments and Regressions (An Introduction to Causality and How to Measure It), Part III

Part I, Part II

Internal Validity (Part C)
Smith writes: "Notably, double-blind experiments are an attempt in medicine to go beyond simple randomness because simple randomness not enough." If I understand Smith correctly, he brings up a very interesting problem here, namely the difference between the independent variable you're actually interested in and the treatment you are in fact administering. The two can be different, which can bring about new confounding issues. I've never heard a general term for this problem, so let's call it treatment confounding. The classic example is from medicine, as mentioned by Smith. Researchers are actually interested in the consequences of introducing a medical agent into the body. But if subjects in the treatment group are given a medicine, while subjects in the control group are given nothing, there are differences between the two groups other than the introduction of the agent into the body: They now also differ on the expectation of getting help, the act of taking a medicine, etc. Using placebos means matching treatment and control on these aspects. Making the administering person blind means matching treatment and control group on the expectations of that person. A randomized double-blind design means that treatment and control group differ on nothing but the introduction of the agent into the body (the independent variable of interest).

The treatment confounding problem is not confined to medicine. For example, you might do a psychological experiment on aggression in the lab. You're interested in the effect of aggressive affect on aggressive behaviour. To instil aggressive affect in the treatment group you make them write essays on "a situation in the past that made you feel really aggressive;" the control group write essays about something else. You measure aggressive behaviour afterwards. Did you really measure the effect of aggressive affect? Perhaps what you actually measured was the effect of signaling that it is o.k. to express aggression (an experimenter effect) or the accessibility of aggressive scripts.

So, there's a potential problem to keep in mind. But our topic is comparing lab experiments and regressions with respect to the treatment confounding problem. Where do you think the problem is bigger, in multivariate regressions on observational data or randomized lab experiments? That's not such a tough one, is it?

10/04/2011

Unpacking Karl Smith on Experiments and Regressions (An Introduction to Causality and How to Measure It), Part II

Here's where we left off.

Internal Validity (Part B)
So, multiple regression is an attempt to establish a causal connection between X and Y by controlling for everything that may have an influence on Y besides X. This will never work perfectly. First, there are variables that you would like to have measures for, but don't. Second, there are those you haven't even thought of, but which in fact do have an influence on Y. In the social sciences, you typically have both of these problems.

Then there's the problem of overcontrol. The textbook example here is the influence of IQ (which we may have measured early in life) on earnings (which we may be measuring at age 40). Education has an influence on earnings, so we want to control for that, right? Well, given that education is itself influenced by IQ (it is a "mediating variable," to use the technical term), you would underestimate the full effect of IQ on earnings if you controlled for education. That's the problem of overcontrol.*

So, you don't control for education. But that might be wrong as well. If education does have an influence of its own on earnings (over and above IQ), and if it is not simply a function of IQ, then you have what Angrist & Pischke call a "proxy control" problem. That is, you want to control for education, but at the same time, you don't want to. There's not really a way out of that dilemma. The best you can do is report estimates with and without education controlled for and call one "upper bound" and the other "lower bound." Let's hope they're similar!
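The overcontrol and proxy-control problems can be made concrete with a small simulation. This is a minimal sketch; the data-generating process and all numbers are my own assumptions, not from the post. IQ affects education, and both affect earnings, so controlling for the mediator strips out part of IQ's total effect:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Assumed data-generating process: IQ -> education -> earnings,
# plus a direct IQ -> earnings path.
iq = rng.normal(100, 15, n)
education = 0.1 * iq + rng.normal(0, 1, n)          # years beyond baseline
earnings = 500 * iq + 2000 * education + rng.normal(0, 5000, n)
# Total effect of IQ on earnings: 500 + 2000 * 0.1 = 700 per IQ point.

def ols(y, regressors):
    """OLS coefficients, with an intercept column prepended."""
    X = np.column_stack([np.ones(len(y))] + list(regressors))
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Not controlling for education recovers the total effect (~700).
total = ols(earnings, [iq])[1]

# Controlling for the mediator leaves only the direct effect (~500).
direct = ols(earnings, [iq, education])[1]

print(f"total effect  ~ {total:.0f}")
print(f"direct effect ~ {direct:.0f}")
```

In this setup the two estimates bracket nothing, because education here is purely a function of IQ plus noise; if education also had independent causes that correlate with earnings, neither regression would hit the true total effect, which is the bind the "upper bound"/"lower bound" reporting is meant to acknowledge.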

There is also the problem of causal direction. The setup of a multivariate regression does nothing to establish that, but it might be that you have information about your variables that helps. For example, if you try to establish an effect of the weather on violent crime rates, you can be pretty sure that an association between the two does not represent an effect of violent crime rates on the weather.

Compare & contrast with the randomized experiment. If your sample is reasonably large, randomization takes care of all other variables that might have an influence on the outcome, whether you had thought of them or not. (Because you've randomized, the treatment and control groups are very similar on other variables.) Plus, you establish causal direction because you know that you have manipulated one variable, but not the other. VoilĂ : Causality established.
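To see randomization doing this work, here is a minimal sketch (the effect size and the covariate are my own assumptions). The covariate is "unobserved" in the sense that the estimator never uses it, yet the simple difference in group means recovers the causal effect, because random assignment balances the covariate across groups:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# An unobserved covariate that also affects the outcome.
hidden = rng.normal(0, 1, n)

# Randomized assignment: independent of everything, observed or not.
treated = rng.integers(0, 2, n).astype(bool)

# True treatment effect is 2.0; 'hidden' adds noise but no bias.
outcome = 2.0 * treated + 3.0 * hidden + rng.normal(0, 1, n)

# Balance check: the covariate's group means are almost identical.
balance = hidden[treated].mean() - hidden[~treated].mean()

# The raw difference in mean outcomes estimates the causal effect.
effect = outcome[treated].mean() - outcome[~treated].mean()

print(f"covariate imbalance ~ {balance:.3f}")  # near 0
print(f"estimated effect    ~ {effect:.2f}")   # near 2.0
```

With small samples the balance would be worse and the estimate noisier, which is why "reasonably large" matters in the paragraph above.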

So you can say that "[t]here is no fundamental difference between performing a regression on data collected in the field and data generated in the lab," but that's like saying that there is no fundamental difference between the broken-down, rusty Lada with no wheels in my backyard and a brand-new Porsche. They're both cars, right?

As far as I can see, that takes care of the first two of Smith's sentences that I quoted. Which leaves us with one post about the very interesting bit about "double-blind" and the failure of controls, one about external validity, and perhaps an appendix post about assorted issues. I'm confident I'll be finished by the end of summer!

_____
*You might want to know whether an effect of IQ remains after you've controlled for education, but then you'd be asking a different question.

20/03/2011

Unpacking Karl Smith on Experiments and Regressions (An Introduction to Causality and How to Measure It), Part I

Karl Smith writes:
A random trial is simply a physically controlled regression analysis. There is no fundamental difference between performing a regression on data collected in the field and data generated in the lab. It is simply that in the lab you hope that you have performed all the necessary controls physically rather than statistically.

However, the physical controls can still fail. Notably, double-blind experiments are an attempt in medicine to go beyond simple randomness because simple randomness not enough. Even with double blind, however, results are often not generalizable.
Oh man! Smith jumps from one aspect to the next, only to introduce a third in the following sentence, and veils it all in vague language.

My idea was to write a quick post to say what I think is wrong with this statement. It turned out a "quick post" wasn't possible. I'll hence write this post in multiple installments. This is the first; I guess there'll be at least two more. I'm making no promises regarding the dates of publication of the others. I'll try my best, but I'm short on time these weeks. Follow along using the Introduction to Causality label.

A useful starting point is to distinguish between internal and external validity. Internal validity refers to the question of whether you've measured what you say you've measured. This includes, but is not limited to, the question of whether a correlation you report and claim represents causation actually does represent causation.

External validity refers to the question of whether what you have observed in a specific situation (such as a lab experiment) may also be expected to be observed in similar, but different, situations (such as "the real world").

Internal validity (Part A)

As the internal validity question often is discussed as though it referred only to causality, let's note that you can get internal validity wrong without getting causality wrong. Let's say you want to measure the influence of frustration on human aggression. You define the latter, as usual, as behaviour intended to harm another person who is motivated to avoid that harm. You study this question in a psychological lab experiment. You take a bunch of undergraduates and divide them into a treatment group and a control group. The treatment group gets frustrated in some way, the control group does not. Then you have participants in both groups write essays about "an important event from your childhood and how you felt about it." You collect the essays and afterwards have a bunch of raters (who are blind as to which group the authors of the essays belonged to) rate them for the number of expressions of aggression, such as ". . . and every time I think about this, I'd really like to kick my dad's head in." You find that the essays from the treatment group contain significantly more expressions of aggression. Did you demonstrate that frustration causes aggression? No, because expressions of aggression in an essay are not intended to harm another person who is motivated to avoid that harm (or at least it would be hard to make that argument). Your dependent variable does not measure aggression. Hence, internal validity fail.

More often the question of internal validity refers to whether the measure of statistical association you say represents causality actually does. Which raises the question of what causality is in the first place. The definition that underlies the whole logic of experimentation, and that has widespread acceptance among people who try to develop methods for establishing causality with nonexperimental (observational) data, is what philosophers call "the standard interpretation of the counterfactual model of causality":

X may be said to have caused Y if X and Y co-occur, but Y would not have occurred in an otherwise identical situation had X not been present.

This definition immediately lays bare the problem of establishing causality: You cannot make observations in two parallel worlds, one of which features X and one of which doesn't. Your mission is to approximate this "parallel worlds" ideal as well as possible. That's where this whole business of "control" comes in. When you control for variables other than X, you try to statistically hold constant all variables that may have an influence on Y. In this way, you hope to isolate the effect specifically of X on Y. If you knew that you had controlled for all other relevant variables (measured without error, just like X and Y), you could be certain that any correlation that's left between X and Y represents a causal connection between the two.
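The logic of statistical control can be illustrated with a small simulation. This is a sketch under assumed numbers (the confounder, effect sizes, and noise are all my inventions): a confounder Z drives both X and Y, the naive regression of Y on X is biased, and adding Z as a control recovers the true effect:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# A confounder Z drives both X and Y (all numbers assumed).
z = rng.normal(0, 1, n)
x = 1.5 * z + rng.normal(0, 1, n)
y = 2.0 * x + 4.0 * z + rng.normal(0, 1, n)  # true effect of X is 2.0

def ols(y, regressors):
    """OLS coefficients, with an intercept column prepended."""
    X = np.column_stack([np.ones(len(y))] + list(regressors))
    return np.linalg.lstsq(X, y, rcond=None)[0]

naive = ols(y, [x])[1]        # biased upward by the X-Z association
adjusted = ols(y, [x, z])[1]  # controlling for Z recovers ~2.0

print(f"naive    ~ {naive:.2f}")
print(f"adjusted ~ {adjusted:.2f}")
```

The catch, of course, is the conditional in the paragraph above: in the simulation we know Z is the only confounder and we observe it without error. With real observational data you never know either.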

There are multiple problems with this technique, however, which I'll take up in the next part.