Tuesday, May 10, 2016

Baseball, final exams, and omitted variables

Here are some fun graphics I put on the final exam this year. One of the fundamental problems in social science inference is that we observe inputs and outcomes without necessarily seeing all relevant variables, and it's difficult to say much about which X's are causing Y.

Here's a look at annual payroll and regular season wins in Major League Baseball between 2007 and 2015, courtesy of MLB.com and the USA Today payroll database.

It's definitely a cloud, but OLS finds an upward-sloping line. Now take a look at two randomly selected teams, the Oakland Athletics and San Francisco Giants, just for kicks:

There's not much of a clear pattern here at all, in either case. If there really were an upward-sloping causal relationship between payroll and wins, you'd also expect to see it here within a team's history over time. But check out the LA Dodgers, whose owner experienced a messy, costly divorce, filed for bankruptcy, and finally sold the team before the 2012 season:

This is seriously fun stuff, because the natural experiment here, namely the essentially forced sale of the Dodgers to cover the enormous costs of a shattered marriage, revealed what looks like a pretty clear upward-sloping relationship between payroll and wins. Even 2015, when both payroll and wins fell, fits the pattern.
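For anyone who wants to play with the omitted-variable idea, here is a minimal sketch in Python. It does not use the MLB data above. Everything in it is simulated, and the numbers (a hypothetical unobserved "market" trait, a true payroll effect of 0.05 wins per million dollars, the noise levels) are assumptions chosen purely for illustration. It compares the pooled OLS slope across all team-seasons with the slope you get after subtracting each team's own averages, which is the "within a team's history" comparison.

```python
import random


def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(n_teams=30, n_years=9, true_effect=0.05, seed=0):
    """Simulated (team, payroll, wins) rows. All parameters are made up."""
    rng = random.Random(seed)
    rows = []
    for team in range(n_teams):
        # Unobserved team trait (think market size) that raises
        # both payroll and wins: the omitted variable.
        market = rng.gauss(0, 1)
        for _ in range(n_years):
            payroll = 100 + 30 * market + rng.gauss(0, 15)  # $ millions
            wins = (81 + true_effect * (payroll - 100)
                    + 5 * market + rng.gauss(0, 6))
            rows.append((team, payroll, wins))
    return rows


def pooled_and_within(rows):
    """Return (pooled slope, within-team slope) of wins on payroll."""
    pooled = ols_slope([p for _, p, _ in rows], [w for _, _, w in rows])

    # Group observations by team, then subtract each team's own means.
    by_team = {}
    for team, payroll, wins in rows:
        by_team.setdefault(team, []).append((payroll, wins))

    demeaned_payroll, demeaned_wins = [], []
    for obs in by_team.values():
        mean_p = sum(p for p, _ in obs) / len(obs)
        mean_w = sum(w for _, w in obs) / len(obs)
        demeaned_payroll.extend(p - mean_p for p, _ in obs)
        demeaned_wins.extend(w - mean_w for _, w in obs)

    within = ols_slope(demeaned_payroll, demeaned_wins)
    return pooled, within


if __name__ == "__main__":
    pooled, within = pooled_and_within(simulate())
    print(f"pooled slope: {pooled:.3f}")
    print(f"within-team slope: {within:.3f}")
```

In this made-up world the pooled slope overstates the payroll effect, because the unobserved trait pushes payroll and wins in the same direction, while the within-team slope tends to land nearer the true effect. With only 30 teams and 9 seasons the within-team estimate is noisy, which is one reason a single team's scatter can look like no pattern at all.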

Thursday, February 4, 2016


Based on little research and an extreme approach to risk management, the CDC now recommends "that women who are pregnant or might be pregnant not drink alcohol at all." The Times writeup is here. Emily Oster, whose excellent book includes a chapter on this issue, has weighed in:

Wednesday, January 13, 2016

ACA and employment

The Washington Post reports that several recent articles find essentially no effect of the Affordable Care Act (Obamacare) on employment to date among Medicaid populations or part-time workers. The authors point out, however, that long-term effects could differ from the short-term effects that these studies measure.

Tuesday, January 12, 2016

Marriage and health

Last year, Vox posted a helpful summary of a paper on the protective effect of marriage that utilized panel data in the PSID to help control for the selection of healthy people into marriage. The authors found that about half of the protective effects associated with marriage appeared to reflect selection.

Thursday, January 7, 2016

Unauthorized data visualizer

The Center for Migration Studies provides an online data visualizer that shows estimated characteristics of unauthorized immigrants, courtesy of extensive work by U-MN's Rob Warren. Pew also estimates authorization status in Census data and produces reports on trends, one of which concerns unauthorized workers by industry.

Thursday, December 3, 2015

Data science at Cal

This spring, Data Science 8: Foundations of Data Science returns to Cal for its second offering after debuting this fall. I'll also be teaching L&S 88: Health, Human Behavior, and Data, a 2-unit connector course that accompanies DS 8, for the second time. Here's a YouTube video presentation about the course by springtime lead DS 8 instructor John DeNero, in which I appear at about the 34-minute mark to talk about L&S 88.

Wednesday, December 2, 2015

Climate, growth, and conflict

The Times takes the long view of economic growth, conflict, and sustainable development, and its pessimism is well summarized by the invocation of "Blade Runner," "Mad Max," and "The Hunger Games." It's fun to read, and probably to write, pop-news pieces that unite like-minded strands of thinkers to push a particular theme. But it also seems like a disservice to spin apocalyptic yarns during a period when we should all be taking action on climate change more seriously.