To Punt or not to Punt: Policy Advice based on Observational Data

Rocky Long, second-year head football coach of the San Diego State Aztecs, is facing a problem that is similar, in some respects, to what voters face.  Coach Long may choose never to punt the football once the Aztecs cross midfield, regardless of the distance to go on fourth down.  He is listening to advice suggesting that the strategies of many previous coaches were wrong.  Coach Long says “We had one of our professors in our business school … go over a system we are thinking about using.  We’ll have a chart come game time that will determine what we do in different situations.”

Voters must decide whether the policies of another Obama Administration will help turn the economy around and contribute to more job creation, or if the Romney campaign proposals will be more successful in achieving these goals.  Football and economic policy advice are both derived from observational, not experimental, data.  It is difficult to know what might happen if alternative policies were to be enacted.  Analysts look to history, in similar circumstances with similar policy options, and hope this provides useful guidance.

Coach Long’s decision apparently follows the strategy of high school coach Kevin Kelley of Pulaski Academy in Little Rock, Arkansas, who has been remarkably successful.  But college football is not high school football.  The strategy also appears similar to advice given by Berkeley macroeconomist David Romer, who several years ago found that NFL teams kick surprisingly often on fourth down.

Romer concluded that teams are not pursuing strategies that would maximize their chance of winning the game.  Romer may be correct, but we should be cautious because his study is based purely on observational data.  It is possible that the real world problem is more complicated than the model used to analyze the data.  Whenever economics (or other social science) professors explain that agents motivated by self-interest are making choices not in their self-interest, the professors may have mis-specified their model or may misunderstand the real-world problem that people are attempting to solve.

Romer’s paper is interesting, well written, and well executed.  All of the criticisms raised here are ones that Romer acknowledges, but they aren’t enough to sway his opinion and policy advice.  The data Romer analyzed indicated that NFL teams rarely try for a first down on fourth down.  The primary question is why.  Romer’s explanation is myopia.  An alternative explanation is that a failed fourth down attempt will shift momentum in the game.

The key problem with observational data is that it is difficult to calculate the expected outcome from counterfactual decisions and policies.  Romer argues that teams are not optimizing because he believes they would be successful fairly often on fourth down and two yards to go.  He concludes this despite rarely seeing teams attempt this play.  So how does Romer “guesstimate” the likelihood of success on fourth down and two?  He looks at outcomes of third down plays with two yards to go in the first quarter of games.  He uses first quarter plays because once enough game time has elapsed and the score is uneven, both teams will adjust their game strategies.  He assumes that third down strategies and outcomes are very similar to what would happen on fourth down plays (if they actually were to occur) because he doesn’t have enough data on fourth down plays.  Even if he observed more fourth down plays, they would not be a random or representative sample of teams and/or game situations.  Romer’s assumptions are made out of necessity, not because they are realistic or accurate.
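The proxy logic described above can be sketched in a few lines of Python.  This is only an illustration of the idea, not Romer’s actual code or data: the `Play` record, the function names, the handful of plays, and the point values assigned to success, failure, and kicking are all invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Play:
    down: int
    yards_to_go: int
    quarter: int
    yards_gained: int


def proxy_conversion_rate(plays, yards_to_go):
    """Estimate fourth down success from first quarter THIRD down plays.

    This is the substitution the text describes: fourth down attempts are
    too rare (and too unrepresentative) to use directly.
    """
    sample = [
        p for p in plays
        if p.down == 3 and p.quarter == 1 and p.yards_to_go == yards_to_go
    ]
    if not sample:
        raise ValueError("no proxy plays at this distance")
    converted = sum(1 for p in sample if p.yards_gained >= p.yards_to_go)
    return converted / len(sample)


def value_of_going_for_it(p_convert, value_if_success, value_if_failure):
    """Expected value of the attempt, given assumed payoffs for each outcome."""
    return p_convert * value_if_success + (1 - p_convert) * value_if_failure


# Made-up plays, for illustration only.
plays = [
    Play(down=3, yards_to_go=2, quarter=1, yards_gained=3),
    Play(down=3, yards_to_go=2, quarter=1, yards_gained=0),
    Play(down=3, yards_to_go=2, quarter=1, yards_gained=5),
    Play(down=3, yards_to_go=2, quarter=1, yards_gained=1),
    Play(down=3, yards_to_go=2, quarter=3, yards_gained=4),  # dropped: not first quarter
    Play(down=4, yards_to_go=2, quarter=1, yards_gained=2),  # dropped: a real fourth down
]

p_convert = proxy_conversion_rate(plays, yards_to_go=2)

# Hypothetical payoffs, in expected points; none of these come from the paper.
go_value = value_of_going_for_it(p_convert, value_if_success=2.0, value_if_failure=-1.5)
kick_value = 0.5

print(f"proxy conversion rate: {p_convert:.2f}")
print(f"go for it: {go_value:.2f}  kick: {kick_value:.2f}")
```

The comparison at the end is only as good as the proxy.  If defenses and offenses behave differently on a real fourth down than on third down, `p_convert` is biased and the recommendation can flip.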

It is also important to note that Romer’s empirical model does not allow for momentum.  The value of having possession of the ball with first down and ten yards to go on a given yard line (say midfield) in his empirical model is completely independent of how one reached that position on the field.  In the language of dynamic programming, the state variables for the optimization problem include down and yardage, but not the sequence of plays leading up to that point.  He imposes this assumption, as good applied economists often do, to make the problem tractable.  He recognizes that momentum could matter and makes some supplemental calculations to show that teams don’t perform much differently immediately after very bad plays (fumbles, interceptions, blocked kicks, and long kickoff and punt returns by the opponent) and just after very good plays (touchdowns) than they do after typical plays.  This is, at best, a half-hearted attempt to determine whether there is momentum in NFL games.  The model simply doesn’t take the idea of momentum shifts seriously, but avoiding these shifts seems to be a key reason why coaches kick field goals rather than go for it on fourth down.
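To make the state variable point concrete, here is a small Python sketch of what it means for a value function to ignore history.  The `State` tuple, the value table, and the momentum adjustments are hypothetical numbers chosen for illustration; they are not taken from Romer’s paper.

```python
from typing import NamedTuple


class State(NamedTuple):
    down: int
    yards_to_go: int
    yard_line: int  # yards from the team's own goal line; 50 is midfield


# Hypothetical value (in expected points) of first and ten at midfield.
VALUE = {State(down=1, yards_to_go=10, yard_line=50): 2.0}


def value_without_momentum(state, history):
    """History is accepted but ignored: only down, distance, and field position matter."""
    return VALUE[state]


def value_with_momentum(state, history, momentum_adjustment):
    """A richer model in which the most recent event shifts the value of the same state."""
    last_event = history[-1] if history else None
    return VALUE[state] + momentum_adjustment.get(last_event, 0.0)


midfield = State(down=1, yards_to_go=10, yard_line=50)
after_long_drive = ["run", "pass", "run"]
after_failed_fourth_down_by_opponent = ["opponent_failed_fourth_down"]

# Same state, different paths, identical value under the no-momentum assumption.
same = (
    value_without_momentum(midfield, after_long_drive)
    == value_without_momentum(midfield, after_failed_fourth_down_by_opponent)
)

# Assumed adjustment: taking over after a defensive stop is worth a bit more.
adjustment = {"opponent_failed_fourth_down": 0.4}
with_stop = value_with_momentum(midfield, after_failed_fourth_down_by_opponent, adjustment)
with_drive = value_with_momentum(midfield, after_long_drive, adjustment)

print(same, with_drive, with_stop)
```

Adding history to the state makes the table of values much larger and much harder to estimate from the available data, which is the tractability trade-off described above.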

Rocky Long, at San Diego State, may follow the advice of economics and business school professors and forsake the punt.  But he should do so understanding how the policy advice was determined.  It is extremely difficult for economics professors to evaluate counterfactual policies, whether the task is a forecast of what would happen if teams ran and passed on fourth down or a “guesstimate” of what the 2012 unemployment rate would have been had there been no stimulus package in 2009.

Asking economics professors, the Congressional Budget Office or other forecasters to evaluate alternative policies and predict what might happen over the next decade also has limited value.  Many of these same professionals either didn’t forecast the recession or underestimated its severity.  Government economists and advisors didn’t know how deep the downturn was until a year or two later when the data came in.  Mis-estimation of the recession in the midst of the downturn is the explanation given for the woefully inaccurate prediction that the stimulus would keep the unemployment rate below 8%.

It is unwise to rely too heavily on economists as authorities on counterfactual policies.  Economists can’t easily determine what would have happened had there been no stimulus, or how the economy might perform if taxpayers earning more than $200,000 were to face higher marginal tax rates.  In fact, they struggle to measure output and employment in real time.  Predictions about hypothetical economic policies are as fraught with error as predictions about fourth down decisions that have rarely been tried in the past.
