Civil War and Violence on the Football Pitch Revisited, Leading to General Comments on Academics and Bloggers

About 20 months ago, I posted a critical analysis of a paper called "National Cultures and Soccer Violence" by Edward Miguel, Sebastián M. Saiegh and Shanker Satyanath. I concluded:
A paper which operationalizes both the dependent and the main independent variable poorly, uses a questionable statistical model and finds a significant association in only one out of three tests of the hypothesis - an association which turns out not to be robust.
Now Tyler Cowen posts the most recent (October 2009) version of the paper, now entitled "Civil War Exposure and Violence". The front page footer reads (my emphasis):
We are grateful to Dan Altman, Ray Fisman, Matias Iaryczower, Abdul Nouri, Dani Rodrik, seminar participants at Stanford, UCSD, UCLA, IPES, and at the 4th Annual HiCN Workshop at Yale, and a host of anonymous bloggers for useful comments, and Dan Hartley, Teferi Mergo, Melanie Wasserman and Tom Zeitzoff for excellent research assistance. All errors remain our own.
I'll get back to that. First, let's have a look at what happened to the paper's weaknesses which I highlighted. Quotes from my old post are indented, new comments are unindented and quotations from the new version of the Miguel et al. paper are indented and in italics.
1. The authors do cite evidence showing that "civil conflict" is followed by more violent crime, but treating "years of conflict" as a measure of culture - a vague and broad concept - is taking it a bit far.
Addressed. This whole culture idea has disappeared; the paper is now framed more narrowly in terms of "civil war and violence". In fact, the word "culture" makes only one appearance in the main text of the paper.
2. Generally, the authors seem to have a poor grasp of football. For example, they write: "A player who receives a yellow card continues to play in the match, yet the yellow card serves as the first and last warning." Incorrect; anyone who watches football regularly knows that players who are already booked will often be verbally cautioned by refs that they're close to being sent off.
Addressed. "last warning" changed to "last formal warning" (p.5).
[pt. 2 ctd.] More importantly, they also write: "In some cases, a yellow card may be awarded for persistent fouling, or for non-violent forms of unsporting behavior, for instance, disobeying an explicit order given by the referee. However, in practice the vast majority of cards are granted for flagrantly hard fouls." I'd like to see some numbers on this claim; I would guess that about a fifth of bookings are handed out for "non-violent forms of unsporting behaviour". Note, however, that this biases the results against the authors' hypothesis.
Addressed with data (pp.5-6):
Figure 1 illustrates the causes of yellow cards in the Italian league during the 2005/2006, 2006/2007 and 2007/2008 seasons, and in the UEFA Champions League in 2004/2005 and 2005/2006.6 In the Italian league, nearly three quarters of all yellow cards were awarded for violent fouls (“assault”), while in the UEFA data the proportion is close to two thirds.
3. The authors write: "As mentioned, actual crime rates are unsatisfactory as measures of a 'culture of violence', since individuals’ real-world actions plausibly reflect the combined influence of legal institutions and economic factors, in addition to cultural norms." (pp. 2-3) Is that not true of "civil conflict"? (They later give better reasons for not using crime rates, yet this point reinforces the argument that the authors' measure of culture is poor.)
Addressed. Sentence dropped.
4. This one's hard to believe. The authors write: "The magnitude [of the association] is quite large [...]. The predicted number of yellow cards for [an African player in the French league] increases by 3.6 percent when civil conflict prevalence in his home country increases by one standard deviation, or 4 years" (pp. 8-9). You're calling that "large"?
Addressed, sort of. On p. 11, they write:
The predicted number of yellow cards for such a player increases by 3.6 percent when civil conflict prevalence in his home country increases by one standard deviation, or 4 years. Player age is also positively correlated with yellow cards and can serve as a basis for comparison. If the age of the representative African player decreases by two years, his estimated number of yellow cards decreases by 3.0 percent, roughly offsetting the positive conflict effect.
Yet on p. 12 they go back to calling it large.
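To put the two magnitudes quoted above on a common scale, here is a back-of-the-envelope sketch in Python. It assumes the effects compound multiplicatively from year to year, as they would in a count model with a log link; that is an assumption of this sketch, not something stated in the quoted passage. The helper name per_year_change is made up for the illustration.

```python
def per_year_change(total_change, years):
    """Convert a total proportional change over `years` into a per-year rate,
    assuming the effect compounds multiplicatively (an assumption)."""
    return (1.0 + total_change) ** (1.0 / years) - 1.0

# Figures quoted from p. 11 of the paper:
# +3.6% yellow cards per 4 extra years of civil conflict,
# -3.0% yellow cards when the player is 2 years younger.
conflict_per_year = per_year_change(0.036, 4)
age_per_year = per_year_change(-0.030, 2)

print(f"conflict: {conflict_per_year:+.2%} per year of civil war")
print(f"age:      {age_per_year:+.2%} per year younger")
```

On these assumptions, one additional year of civil conflict in the home country corresponds to a change of a bit under one percent in predicted yellow cards, smaller in absolute size than the change associated with one year of player age.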
5. The authors do not control for the quality of the team the player is on. I have not crunched the numbers on this but am pretty sure that holding player quality (which the authors control for) constant, players on lower-quality teams commit more fouls due to a) weaker teams having possession of the ball less often combined with b) players being much more likely to commit a foul when their team is not in possession.
Addressed, but not totally satisfactorily. The authors write (p.13):
The result is also robust to accounting for team quality, measured by their league standings in two variables: the first variable indicates if the team finished among the top five teams in its league, while the second indicates if they finished among the bottom five. Players on top-five teams are less likely to receive yellow cards (coefficient estimate -0.043, z-score 1.68) while players in lowly teams receive somewhat more cards (0.063, z-score 1.66), but most importantly, the point estimate on the civil war measure remains large and statistically significant (0.0072, z-score 2.48, not shown) when these team controls are included.
Top and bottom five dummies are not the first measures that come to mind when one wants to control for team quality; number of points per game played would have been more obvious. That this measure is not used raises suspicion.
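For readers who want to translate the z-scores in the quoted passage into p-values, the following sketch may help. It assumes the reported z-scores can be referred to a standard normal distribution and that the tests are two-sided; both are assumptions of this sketch, since the passage does not say how significance was assessed.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a z-score under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# z-scores as quoted from p. 13 of the paper
quoted = [
    ("top-five team dummy", 1.68),
    ("bottom-five team dummy", 1.66),
    ("civil war measure", 2.48),
]

for label, z in quoted:
    print(f"{label}: z = {z:.2f}, two-sided p = {two_sided_p(z):.3f}")
```

Under these assumptions, the two team-quality dummies fall between the 5% and 10% levels, while the civil war coefficient clears the conventional 5% threshold.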
6. The authors do control for exposure time by including variables on matches started and matches come on as a substitute (minutes on the pitch would have been better), but if their hypothesis were right, we should expect an interaction effect between these and the "years of conflict" variable. (Whatever your "culture", you're unlikely to get booked when you're not on the pitch.) They do not test for this.
Not addressed. No results for interactions. The authors do explain, however, that no statistics for minutes played are easily available.
7. When Colombia (an outlier) is excluded from the analysis, the results are not significant at the conventional 5% level anymore (p. 10).
This is still the case (Appendix Table 2, Column 4). However, the authors use a formal statistical test I'm not familiar with and exclude the outliers thus identified, which yields significant results (Appendix Table 2, Column 1). It is unclear which countries' players were excluded in this specification. Overall, the results seem a little unstable.
8. The association between years of conflict and red cards received - a much better measure of violent behaviour (see pt. 2) - is not significant at the 5% level (p. 11). In fairness, red cards are rare, which makes it hard to find an association.
Still the case and discussed pretty much along these lines.
9. The authors find no association of years of conflict with fouls that do not lead to a booking (p. 11). Strangely, they interpret this as supporting their hypothesis; I draw the opposite conclusion.
Even more strangely, this measure has disappeared altogether.

My new conclusion: Although it is nice to see many of the problems I pointed out addressed, I am still not convinced by the authors' results, due to what I said in my comments to points 4-7 and 9. Plus, no collinearity statistics.

Two more general points about the interplay between academics and bloggers

1. As acknowledged by Miguel et al., they profited from comments from blogistan. Unless this causes problems in the formal publication process (because journals want "unpublished" papers and may be strict about it), it seems like an excellent idea to get your paper onto blogs. Even if you get only ten valuable comments, it is definitely worth it. It should not be forgotten in this context that reactions from blogs often come quickly, in contrast to reactions from journal reviewers. And however many faults they find, bloggers can't reject your paper.

2. Although I'd like to think so, I don't know whether I was one of the commenters whose criticisms were used in rewriting the paper. But that is not the point. The point is that when you use someone's written comments in rewriting your paper, you mention that person by name. Even if it is a not very scholarly-sounding alias.
