Unpacking the election results using Bayesian inference

As anyone who’s read this blog recently can surmise, I’m pretty interested in how this election turned out, and have been doing some exploratory research into the makeup of our electorate. Over the past few weeks I’ve taken the analysis a step further and built a regression that goes as far as anything I’ve seen to unpack what happened.

Background on probability distributions

(Skip this section if you’re familiar with the beta and binomial distributions.)

Before I get started explaining how the model works, we need to discuss some important probability distributions.

The first one is easy: the coin flip. In math, we call a coin flip a Bernoulli trial, but they’re the same thing. A flip of a fair coin is what a mathematician would call a “Bernoulli trial with p = 0.5”. The “p = 0.5” part simply means that the coin has a 50% chance of landing heads (and 50% chance of landing tails). But in principle you can weight coins however you want, and you can have Bernoulli trials with p = 0.1, p = 0.75, p = 0.9999999, or whatever.

Now let’s imagine we flip one of these coins 100 times. What is the probability that it comes up heads 50 times? Even if the coin is fair (p = 0.5), just by random chance it may come up heads only 40 times, or more often than you’d expect – like 60 times. It’s even possible for it to come up heads all 100 times, although the odds of that are vanishingly small.

The distribution of possible times the coin comes up heads is called a binomial distribution. A probability distribution is a set of numbers that assigns a value to every possible outcome. In the case of 100 coin flips, the binomial distribution will assign a value to every number between 0 and 100 (which are all the possible numbers of times the coin could come up heads), and all of these values will sum to 1.
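This is easy to check numerically. A quick sketch using scipy (mine, not from the original post):

```python
from scipy.stats import binom

n, p = 100, 0.5  # 100 flips of a fair coin

# Probability of exactly 50 heads: the single likeliest count,
# but still only about 8%
print(binom.pmf(50, n, p))  # ~0.0796

# The distribution assigns a probability to every count from 0 to 100,
# and those probabilities sum to 1
print(sum(binom.pmf(k, n, p) for k in range(n + 1)))  # ~1.0

# All 100 heads is possible, but vanishingly unlikely
print(binom.pmf(100, n, p))  # 0.5**100, about 8e-31
```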

Now let’s go one step further. Let’s imagine you have a big bag of different coins, all with different weights. Let’s imagine we grab a bunch of coins out of the bag and then flip them. How can we model the distribution of the number of times those coins will come up heads?

First, we need to think about the distribution of possible weights the coins have. Let’s imagine we line up the coins from the lowest weight to the highest weight, and stack coins with the same weight on top of each other. The relative “heights” of each stack tell us how likely it is that we grab a coin with that weight.

Now we basically have something called the beta distribution – a family of distributions over numbers between 0 and 1. Beta distributions are very flexible, and they can look like any of these shapes and almost everything in between:

Taken from Bruce Hardie: http://www.brucehardie.com/talks/cba_tut_art_16_HO.pdf

So if you had a bag like the upper left, most of the coins would be weighted to come up tails, and if you had a bag like the lower right, most of the coins would be weighted to come up heads; if you had a bag like the lower left, the coins would either be weighted very strongly to come up tails or very strongly to come up heads.

The resulting distribution of heads – binomial flips whose probability is itself drawn from a beta distribution – is called the beta-binomial.
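A sketch of the bag-of-coins process (the bag’s shape parameters here are invented): draw each coin’s weight from a beta distribution, flip it, and compare against scipy’s built-in beta-binomial:

```python
import numpy as np
from scipy.stats import betabinom

rng = np.random.default_rng(0)
a, b, n_flips = 2.0, 5.0, 100  # a bag weighted toward tails

# Two-stage simulation: draw a weight from the bag (beta),
# then flip that coin n_flips times (binomial)
weights = rng.beta(a, b, size=100_000)
heads = rng.binomial(n_flips, weights)

# The simulated average matches the beta-binomial mean, n * a / (a + b)
print(heads.mean())                   # ~28.6
print(betabinom.mean(n_flips, a, b))  # ~28.6
```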

Model setup

You might now be seeing where this is going. While we can’t observe individuals’ voting behavior (other than whether or not they voted), we can look at the tallies at local levels, like counties. Say that, some time before the election, you lined up every voter in a county and stacked them the same way we did with the coins before – but instead of the probability of “coming up heads”, you looked at each voter’s probability of voting for one of the two major candidates. That would look like a beta distribution. You could then model the number of votes for a particular candidate in a particular county as a beta-binomial distribution.

So in our model we can say the number of votes V[i] in county i is distributed beta-binomial with N[i] voters, each with propensity p[i] to vote for that candidate:

V[i] ~ binomial(p[i], N[i])

But we’re keeping in mind that p[i] is not a single number but a beta distribution with parameters alpha[i] and beta[i]:

p[i] ~ beta(alpha[i], beta[i])

So now we need to talk about alpha and beta. A beta distribution needs two parameters to tell you what shape it has. Commonly, these are called alpha and beta (I know, it’s confusing to have the name of the distribution and one of its parameters be the same), and the way to think about them is that alpha “pushes” the distribution to the right (as in the lower right above) and beta “pushes” the distribution to the left (as in the upper left above). Both alpha and beta have to be greater than zero.

Unfortunately, while this helps us understand what’s going on with the shape of the distribution, it’s not a useful way to encapsulate the information if we were to talk about voting behavior. If something (say unemployment) were to “push” the distribution one way (say having an effect on alpha), it would also likely have an effect on beta (because they push in opposite directions). Ideally, we’d separate alpha and beta into two unrelated pieces of information. Let’s see how we can do that.

It’s a property of the beta distribution that its average is:

 
   alpha
------------
alpha + beta

So let’s just define a new term called mu that’s equal to this average.

        alpha
mu = ------------
     alpha + beta

And then we can define a new term phi like so

       alpha
phi = --------
        mu  

With a few lines of arithmetic, we can solve for everything else:

 
phi = alpha + beta
alpha = mu * phi 
beta = (1 - mu) * phi

And if alpha is the amount of “pushing” to the right and beta is the amount of “pushing” to the left in the distribution, then phi is all of the pushing (either left or right) in the distribution. This is a sort of “uniformity” parameter. Large values of phi mean that almost all of the distribution is near the average (think the upper right beta distribution above) – the alpha and beta are pushing up against each other – and small values of phi mean that almost all the values are away from the average (think the beta distribution on the lower left above).
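The algebra above is easy to sanity-check in a few lines; a minimal sketch, using the post’s names mu and phi:

```python
def to_mu_phi(alpha, beta):
    """Convert the beta's shape parameters to a mean (mu) and a
    concentration / uniformity parameter (phi)."""
    phi = alpha + beta
    mu = alpha / phi
    return mu, phi

def to_alpha_beta(mu, phi):
    """Invert: recover the shape parameters from mu and phi."""
    return mu * phi, (1 - mu) * phi

mu, phi = to_mu_phi(alpha=3.0, beta=7.0)
print(mu, phi)                 # 0.3 10.0
print(to_alpha_beta(mu, phi))  # (3.0, 7.0), up to floating-point rounding
```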

In this parameterization, we can model propensity and polarization independently.

So now we can use county-level information to set up regressions on mu and phi – and therefore on the county’s distribution of voters, and how they ended up voting. Since mu has to be between 0 and 1, we use the logit link function, and since phi has to be greater than zero, we use the log link function:

logit(mu[i]) = linear function of predictors in county i
log(phi[i]) = linear function of predictors in county i

The “linear functions of predictors” have the format:

coef[uninsured] * uninsured[i] + coef[unemployment] * unemployment[i] + ...

Where uninsured[i] is the uninsurance rate in that county and coef[uninsured] is the effect that uninsurance has on the average propensity of voters in that county (in the first equation) or the polarity/centrality of the voting distribution (in the second equation).
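Concretely, the two links work like this; every number below (predictor values and coefficients alike) is made up for illustration, not a fitted value from the model:

```python
import numpy as np

def inv_logit(x):
    # Inverse of the logit link: squashes any real number into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

# A hypothetical county's standardized predictors and made-up coefficients
predictors = np.array([0.2, -1.1, 0.5])  # e.g. uninsured, unemployment, white
coef_mu = np.array([-0.3, 0.1, -0.8])
coef_phi = np.array([0.2, -0.1, 0.4])
intercept_mu, intercept_phi = -0.5, 3.0

mu = inv_logit(intercept_mu + predictors @ coef_mu)  # propensity, in (0, 1)
phi = np.exp(intercept_phi + predictors @ coef_phi)  # uniformity, > 0

# Back to the beta's shape parameters for this county
alpha, beta = mu * phi, (1 - mu) * phi
print(mu, phi, alpha, beta)
```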

For each county, I extracted nine pieces of information:

  • The proportion of residents that do not have insurance
  • The rate of unemployment
  • The rate of diabetes (a proxy for overall health levels)
  • The median income
  • The violent crime rate
  • The median age
  • The Gini coefficient (an index of income heterogeneity)
  • The rate of high-school graduation
  • The proportion of residents that are white

Since each of the above pieces of information had two coefficients (one each in the equations for mu and phi), plus the two intercepts, the model I used had twenty parameters against 3,111 observations.

The source for the data is the same as in this post, and is available and described here.

The BUGS model code is available here, in the file county_binom_model.bugs.R (along with all of the other code).
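For readers who don’t want to wade through BUGS, here is a rough Python illustration (my sketch, not the linked model code) of the beta-binomial log-likelihood in the mu/phi parameterization that the model samples from:

```python
from scipy.special import betaln, gammaln
from scipy.stats import binom

def betabinom_logpmf(v, n, mu, phi):
    """Log-probability of v votes out of n under a beta-binomial with
    mean mu and concentration phi (alpha = mu*phi, beta = (1-mu)*phi)."""
    a, b = mu * phi, (1 - mu) * phi
    log_choose = gammaln(n + 1) - gammaln(v + 1) - gammaln(n - v + 1)
    return log_choose + betaln(v + a, n - v + b) - betaln(a, b)

# Sanity check: as phi grows, the beta collapses to a point mass and the
# beta-binomial approaches a plain binomial
print(betabinom_logpmf(40, 100, 0.4, 1e7))  # close to the line below
print(binom.logpmf(40, 100, 0.4))
```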

Model results / validation

The model performs very well on first inspection, especially when we take the log of the actual votes and the prediction (upper right plot), and even more so when we do that and restrict it only to counties with greater than 20,000 votes (lower left plot):

actual_v_estimate

This is actually cheating a bit, since the number of votes for HRC (which the model is fitting) in any county is constrained by the number of votes overall. Here’s a plot showing the estimated proportion vs. the actual proportion of votes for HRC, weighted by the number of votes overall:

proportions_plot

Here is the plot of coefficients for mu (the average propensity within a county):

mu_coefs_plot

All else being equal, coefficients to the left of the vertical bar helped Trump, and to the right helped Clinton. As we can see, since more Democratic support is concentrated in dense urban areas, there are many more counties that supported Trump, so the intercept is far to the left. Unsurprisingly (but perhaps sadly) whiteness was the strongest predictor overall and was very strong for Trump.

In addition, the rate of uninsurance was a relatively strong predictor for Trump support, and diabetes (a proxy for overall health) was a smaller but significant factor.

Economic factors (income, gini / income inequality, and unemployment) were either not a factor or predicted support for Clinton.

The effects on polarity can be seen here:

phi_coefs_plot

What we can see here (as the intercept is far to the right) is that most individual counties have a fairly uniform voter base. High rates of diabetes and whiteness predict high uniformity, and basically nothing except income inequality predicts diversity in voting patterns (which is unsurprising).

What is also striking is that we can map mu and phi against each other. This is a plot of “uniformity” – how similar voting preferences are within a county vs. “propensity” – the average direction a vote will go within a county. In this graph, mu is on the y axis, and log(phi) is on the x axis, and the size of a county is represented by the size of a circle:

propensity_uniformity

What we see is a positive relationship between support for Trump and uniformity within a county – and, conversely, between support for Clinton and diversity.

And if you’re interested in Bayesian inference using Gibbs sampling, here are the trace plots for the parameters to show they converged nicely: mu trace / phi trace.

Conclusion and potential next steps

This modeling approach has the advantage of closely approximating the underlying dynamics of voting, and the plots showing the actual outcome vs. predicted outcome show the model has pretty good fit.

It also shows that whiteness was a major driver of Trump support, and that economic factors on their own were decidedly not a factor in supporting Trump. If anything, they predicted support for Clinton. It also provides an interesting way of directly modeling unit-level (in this case, county-level) uniformity / polarity among the electorate. This approach could perhaps be of use in better identifying “swing counties” (or at least a different approach in identifying them).

This modeling approach can be extended in a number of interesting ways. For example, instead of using a beta-binomial distribution to model two-way voting patterns, we could use a Dirichlet-multinomial distribution (basically, the extension of the beta-binomial to more than two possible outcomes) to model voting patterns across all candidates (including Libertarian and Green), and even flexibly model turnout by including not voting as an outcome in the distribution.
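The extension is straightforward to sketch with numpy’s built-in samplers; the concentration parameters below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical concentration parameters for one county's outcomes:
# [Clinton, Trump, Libertarian, Green, did-not-vote]
alphas = np.array([30.0, 35.0, 3.0, 1.0, 31.0])
n_eligible = 50_000

# Exactly analogous to beta -> binomial: a Dirichlet draw gives the
# county's propensity vector, a multinomial draw gives the counts
shares = rng.dirichlet(alphas)
counts = rng.multinomial(n_eligible, shares)
print(counts)        # five non-negative counts
print(counts.sum())  # 50000
```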

We could build similar regressions for past elections and see how coefficients have changed over time.

We could even match voting records across the ’12 and ’16 elections to make inferences about the components of the county-level vote swing: voters flipping their vote, voting in ’12 and not voting in ’16, or not voting in ’12 and then voting in ’16 – and which candidate they came to support.

The importance of model thinking

The other day, I got into a weird argument with my cousin (who is among the smartest people I know). We were discussing Sam Wang, the Princeton professor who runs the Princeton Election Consortium (PEC). Of the well-known prognosticators about the election, Dr. Wang was the most wrong, with his site estimating a probability of HRC winning to be over 99%.

My cousin was arguing that the results eliminated Dr. Wang’s credibility in this field and that we basically shouldn’t be listening to him any longer. Because he had been so spectacularly wrong, why should he be trusted again?

But this is wrong, and why it’s wrong is important in the discourse of ideas. First, Dr. Wang wasn’t reporting that he himself was estimating these odds for HRC, he was reporting that his model was outputting these estimates. This is an important distinction. He may have been convinced by the statistical model he was referring to, and he may have also believed its reported estimates, but what’s important for these purposes is that he was reporting the results of an independent model, not simply saying that’s what he believed.

Certainly, the model that the PEC had been using has lost its credibility. We now know that it didn’t properly incorporate correlated error in the outcomes at the state level (e.g. a miss in PA making a miss in MI and WI more likely), and it underestimated the distribution of overall polling bias. We shouldn’t use it again.

But what if Dr. Wang creates a new model that corrects for these mistakes? How now should we take my cousin’s advice to disregard Dr. Wang? Do we not even bother with the new model since the source is tainted by the previous election results? Do we inspect the model independently?

We can see here that my cousin’s advice doesn’t make much sense if you treat the model and its author separately. Clearly the new model must be treated on its own merits.

But this gets at a deeper question. What is a recommendation, a forecast, an estimate, an analysis, etc., without a model? The answer is that there is always a model, because there is always some kind of computation that leads to the end result, even if that computation takes place entirely within the neural circuitry of the analyst. In these cases, when people simply come to their own conclusions, the author and the model are one and the same. There are no equations, parameters, logical relations, etc., that observers can evaluate to see whether the specification does or does not make sense.

If Dr. Wang had not made his model explicit and had simply been reporting his own estimates, then my cousin perhaps would have been right. In this world, the logic would go something like this: his (Dr. Wang’s) model turned out to be bad, and his model was him, so disregarding his model and disregarding him are one and the same.

But this of course was not the case, and this is why it is so important to think in terms of explicit models. If you don’t have a model in mind when facing something in the real world, it’s not even clear to me how you update your knowledge, aside from adding one more memory to your bank of heuristics. When you understand how a model functions – the relationships between its several parts – you can adapt and improve it in the face of real-world experience.


Three reasons to be a raving lunatic about Trump

The other night, my mind was literally 💥 when two of my very smart friends challenged me on the idea that Trump being elected president was not the worst thing to happen in 2016. To me it’s an absolute no-brainer, and a conclusion that even a little imagination combined with historical knowledge would usher you toward.

I guess in part it depends on how bad you think things could get; I think the worst cases are so bad that it’s driven me to become a raving lunatic. Here’s why I think you should be one too:

International Conflict

This one is especially salient given the murder of the Russian ambassador to Turkey.

Matthew White calls the beginning of the 20th Century the “Hemoclysm” – literally, blood flood – because of the staggeringly large conflicts and loss of life. WWI, WWII, the Holocaust, the Great Purge, and two nuclear bombs all happened during this time. How did this era begin? As Steven Pinker puts it in The Better Angels of our Nature:

The war was a perfect storm of destructive currents, brought suddenly together by the iron dice of Mars: an ideological background of militarism and nationalism, a sudden contest of honor that threatened the credibility of each of the great powers, a Hobbesian trap that frightened leaders into attacking before they were attacked first, an overconfidence that deluded each of them into thinking that victory would come swiftly, military machines that could deliver massive quantities of men to a front that could mow them down as quickly as they arrived, and a game of attrition that locked the two sides into sinking exponentially greater costs into a ruinous situation – all set off by a Serbian nationalist who had a lucky day.

We live in a fragile world. The complexities of international diplomacy are enormous, and the consequences can be more severe than we can imagine. And Trump has shown himself to be quite willing to fly by the seat of his pants on things like the One China policy, our support of NATO, and just bombing the shit out of wherever.

The President’s #1 job is to keep America safe. This is not as simple as battening down the hatches and keeping “America First”; it involves real, nuanced thinking about how to carefully deploy threats, when to back them up, and when to offer the olive branch. Are we going to get anything close to that with Trump? Don’t tell me he has good advisers – all the easy decisions get made before they reach the Oval Office.

Norms of American Democracy

Although we have a Constitution that is the ultimate source of law in this country, most of what makes our system of government actually so great is that our leaders respect norms regarding the use and transfer of power. When incumbents lose, they leave office! When candidates lose, they concede! Until, maybe, now.

In addition, candidates for higher office have been careful to make clear that they have no conflicts of interest (say, by releasing their tax returns) and that they are acting in the interest of the country (at least according to their worldview).

Trump has done the exact opposite of all this. He refused to say he’d concede in the event of a loss. He hasn’t released his tax returns. His businesses present a bewildering array of conflicts of interest. He’s installing his family (who are managing his businesses) as his closest confidantes.

Now, putting aside whether any of this is illegal, it certainly screws up the incentives in our country. The way to succeed becomes to curry favor with the government. This is how countries become Russia – a sham democracy and a corrupt kleptocracy.

And then of course there’s the lying. So much lying. Like, for instance, that he had one of the largest electoral college wins (his was one of the smallest); or that the murder rates are the highest they’ve been in decades (they’re the lowest). Politicians lie, have always done so, and will always do so. But – at least in the USA – their lies have maintained a veneer of respect for the truth, couched in euphemism or misdirection. Trump has no respect for the truth. His lies are so easy to fact-check that it’s hard to escape the impression that, for him, lies like this are as much a demonstration of domination (“look how easily and bigly I can lie and you can’t do anything about it”) as anything else.

Degrading the norms of our republic – as Trump has done and seems to be intent on continuing to do – is one way to start ending the great American experiment in democracy.

Climate Change

While certainly not as sexy a way to go out as nuclear war, climate change poses a similar kind of existential threat. We need a president who can at least recognize the science and the tradeoffs that come with different forms of climate policy – not someone who has dismissed it as a “Chinese hoax”, who puts a climate denier in charge of the EPA, or who puts a man who once forgot, in a nationally televised debate, the third federal department he’d like to eliminate in charge of that very department – the Department of Energy, whose mandate includes safeguarding our stockpile of nuclear waste. What kind of thinking puts a man famous for saying “oops” in charge of that department?

Putting it all together

Maybe “raving lunatic” was not the right term – you have to be taken seriously at the end of the day. But we should be uncomfortable about all this. We can’t let our vigilance slide and allow this (the tax returns, the family posse rolling up to the White House, the shattering of stable alliances, the abuses on Twitter, the lying, etc.) to feel normal. Like the proverbial frog in the pot of increasingly hot water, we can’t just let our government crumble around us and say “this is fine”. It’s exhausting, but we can’t get tired of calling Trump and his enablers out. As Voltaire said:

Those who can make you believe absurdities, can make you commit atrocities.

A different kind of political geography

In the wake of the surprising (to say the least) electoral result, I initiated a few projects to try to understand the politics of the country. One thing I wanted to understand was the impact of demographic and underlying situational variables (e.g. health, income, unemployment) on how people voted. Was the vote about Obamacare? Was it about lost jobs? Was it about education levels? Or was it all racism? Theories have been floated, but I haven’t seen a rigorous evaluation of these hypotheses. What’s below is just an exploratory analysis, but the data does point in some interesting directions.

What follows below is a series of visualizations of a large, aggregated dataset of demographic, situational, and electoral data. Sources for the demographic and situational data are listed here, and the electoral data is from the New York Times.

The type of visualization is called a self-organizing map. Roughly speaking, each hexagon is a group of counties; the map arranges the counties such that similar ones are closer to each other on the map and dissimilar ones are further apart:

hs_diploma

For any given variable (here, the proportion of residents in the counties that graduated from high school) the map is a heatmap. Redder colors mean the counties index higher; bluer colors mean they index lower. Here, the upper right holds the counties where the fewest people have a high-school diploma, and the lower left the most educated.
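For the curious, the core of a self-organizing map fits in a few dozen lines of numpy. This is a generic rectangular-grid sketch on toy data – not the code behind these maps, which use a hexagonal grid:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_som(data, grid=(8, 8), epochs=20, lr0=0.5, sigma0=3.0):
    """Train a tiny rectangular self-organizing map.

    Each grid cell holds a weight vector; after training, similar
    inputs map to nearby cells and dissimilar inputs to distant ones."""
    rows, cols = grid
    weights = rng.normal(size=(rows, cols, data.shape[1]))
    coords = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                  indexing="ij"), axis=-1)
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.permutation(data):
            # Decay the learning rate and neighborhood radius over time
            frac = step / n_steps
            lr, sigma = lr0 * (1 - frac), sigma0 * (1 - frac) + 0.5
            # Best-matching unit: the cell whose weights are closest to x
            dists = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(dists.argmin(), dists.shape)
            # Pull the BMU and its grid neighbors toward x
            grid_d2 = ((coords - np.array(bmu)) ** 2).sum(axis=-1)
            influence = np.exp(-grid_d2 / (2 * sigma ** 2))
            weights += lr * influence[..., None] * (x - weights)
            step += 1
    return weights

# Toy "counties" with two standardized features
data = rng.normal(size=(200, 2))
som = train_som(data)
print(som.shape)  # (8, 8, 2)
```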

All of the maps shown below are available at this interactive site. (A very similar set of maps, but for voting swings as opposed to voting share, is available here)

Below, we look at the voting share for Hillary. The counties are arranged in the same way as above, but since we’re looking at a different variable, the map is colored differently. (Confusingly, more votes for HRC are red, as opposed to the customary blue for liberals, but work with me here.) The reason this map is more organized than the rest is that I used this variable to “supervise” the organization (don’t worry about the details – basically, it just guaranteed that this particular coloring, which is the reference point, would be organized).

base_plot

Now that we have the basics in place, we can look at other variables: let’s check a few variables and see if they line up w/ the HRC voting share map. What we can do is draw a boundary around the areas that went strongly for Trump and for HRC like so:

base_plot_w_annotation

And we’ll keep these annotations throughout.

Health and insurance:

Uninsurance and health variables like obesity and diabetes don’t break down along electoral lines: the split runs in the opposite direction, with the highest-uninsurance and lowest-health areas going to both candidates:

uninsured

adult_obesity

diabetes

Economic variables:

These graphs should put the “economic anxiety” argument to rest, as the areas with highest unemployment went strongest to HRC and those with the least went strongest to Trump.

unemploymen

earnings

Ethnic Variables:

A few graphs line up pretty well: whiteness and ethnic homogeneity – and the two line up basically on top of each other. This would support the hypothesis that the election for Trump was mainly a cultural (and not a policy) event; white enclaves are reacting against a diminishing place in the cultural landscape – hence the making things great again:

whiteness

homogeneity

See for yourself: 

My code is available here (it is not very well commented or formatted, but it’s there).

As mentioned above, all the maps above are available at this site. If you want to see something similar but with voting swings – the amount the county changed their vote from ’12 to ’16, you can see that here.

Inauguration Day Fundraiser

I sent this email to a group of my friends that live in NYC. If you live in the area and want to help organize, let me know at thomas.vladeck@gmail.com

Friends,

If you’re like me you’re shocked about what happened on election day. I can’t believe our country decided to put that orange-skinned, pussy-grabbing, bankrupt con artist that’s likely an agent of the Russian government into the Oval Office.

But here we are.

As bad as it is now, he’s not in office yet. And as unpredictable as he is, we have no idea what to expect he’ll do when he gets there (see these two back-to-back tweets as he does a real-time A/B test of governing styles), but it likely won’t be good. One of his top lieutenants already proposed bringing back the House Un-American Activities Committee (a black stain on the history of American civil liberties), and if he implements his campaign proposals to remove 11 million immigrants, ban Muslims, punish women for having abortions, reinstate waterboarding, and change our libel and slander laws – he’ll be violating the First, Fourth, Fifth, Eighth, and Fourteenth Amendments.

Not to mention he’s planning to rip up the watershed Paris Accords (makes sense as Climate Change is a Chinese hoax…) in a year that has seen every indicator of Global Warming reach dizzying new heights.

While we’re on the subject of existential threats, it’s becoming clear where the suspiciously pro-Russia policies of the Trump campaign came from, as Russian diplomats had been in contact with the Trump campaign throughout. This at a time that NATO has put three hundred thousand ground troops on high alert because of a feared confrontation with Russia.

I don’t know about you, but I’m probably going to be a wreck on inauguration day (January 20th). Instead of just watching, let’s do something. And that something we should do is party together, raising money for organizations that are going to fight these abhorrences tooth and nail.

Here’s what I’m thinking:

  • We rent out a space
  • We invite all our friends
  • We have a great time
  • Some great groups get some needed cash to keep the fight going

Who’s in?

Tom


Re: the election

Those who can make you believe absurdities, can make you commit atrocities

I owe many people an apology today. I was extremely confident that HRC had this election in the bag and presumptuously advertised this confidence over the past weeks, dismissing any thought that she would lose. It wasn’t an act, and it has made it all the more devastating to me now that she has in fact lost.

Because of my work with Gradient, which does statistical modeling in a business setting, over this cycle a lot of people have asked me for my interpretation of the various forecasting models. When discussing it with them, I was bullish, overconfident, and as it turns out, terribly wrong. I feel terrible for giving people this false impression of security and making the result any more jarring and devastating than it already is.

What was I wrong about? Well, mostly everything – but two very large things stand out: polling bias and the correlation of polling errors. In general there are two types of statistical error: bias and variance. Variance is when you’re dancing around the right result – any one measurement is off, but on average the errors cancel out; bias is when the errors don’t cancel. Polling bias (even in state polls) in presidential elections has been estimated over many cycles, and has typically been small (about 1%). What polling bias means in concrete terms is that many more people voted for Trump than the polls captured; there was a social movement happening in front of our eyes that did not show up in the data. I trusted the data and did not foresee the possibility that the polling bias would be so large across the board in swing states.

That leads to the next point. State results are obviously correlated – blue and red states tend to vote together. This means, too, that polls of states should be correlated. So outcomes, and the measurements of those outcomes, should be correlated – but should the errors be correlated? I had no reason to think so. For example, polls had HRC ahead in Michigan and Wisconsin; I thought that all the correlation between the two would be captured by the correlated polling results across the two states. I did not think that a polling error in one state meant that a polling error in the same direction in the other state was more likely. It is obvious now that this was the case.
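A quick Monte Carlo makes the point; the margins and error size below are invented, not actual 2016 polling numbers:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical poll margins (candidate lead, in points) in three states
margins = np.array([3.4, 6.5, 1.9])
error_sd = 3.0  # standard deviation of each state's polling error
n_sims = 200_000

def p_sweep_miss(rho):
    """Probability the leader loses all three states, given correlation
    rho between the states' polling errors."""
    cov = error_sd**2 * (np.full((3, 3), rho) + (1 - rho) * np.eye(3))
    errors = rng.multivariate_normal(np.zeros(3), cov, size=n_sims)
    return ((margins + errors) < 0).all(axis=1).mean()

print(p_sweep_miss(0.0))   # independent errors: a three-state miss is rare
print(p_sweep_miss(0.75))  # correlated errors: one miss drags the others
```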

This election is going to prompt a rethink across the entire polling and data analysis industry – myself very much included, even though politics is not my professional remit. But that’s small potatoes compared to the vastly more consequential implications of this election: climate change; respect for women, immigrants, and Muslims; the Supreme Court; our security and trade relationships around the world; the nuclear codes; and so on.

As for what happens next, I hope you agree with me that it’s critical that we stay engaged with our country’s politics as opposed to recoiling in horror. Quite literally, I think our country needs us and people like us to fight tooth and nail to preserve our vision of what America is and what makes it great.

Variations on a theme

Anyone that knows me has probably noticed that over the past year-plus, I have gotten into photography in a major way. But as much as I post photos, I spend about as much time ogling the photos of more talented and able photographers. One thing I noticed is that the best photographers tend to have a very specific style – for Randy Martin, it’s a perfectly centered subject in a larger scene; for Michael Goldberg, it’s close-ups on the street using flash to illuminate his subject’s imperfections; for Nguan, it’s warmly-lit subjects lost in a moment of ennui.

I guess this shouldn’t be too much of a surprise, as variations on a theme are the rule in other forms of art, like music (bands have a sound that cuts across their songs and albums), movies (Wes Anderson), and painting (Dalí). I want each photo to stand on its own and be considered in its own right, and I want the ability to take any photo that I think is compelling for whatever reason. But that’s not how it works; your audience (whoever and however many that is) wants something familiar as well as novel and interesting. Perhaps it’s that the commonality between your works establishes common ground with your audience, and without a familiar “vocabulary” they would lack the basis to appreciate the work – but I’m just speculating.

I don’t (yet) have a style. Or at least not one that I’m aware of. Perhaps it takes time to find what you like. Perhaps it takes time to develop the skills that allow you to impose your style on the scene in front of you. I really don’t know! But it’s on my mind this morning.

Fossil Fuel Divestment

There was a great pair of articles recently in the New York Times discussing Stanford's recent divestment from coal companies: an op-ed stating the case that divestment would do little, and a few responses to that argument. I think these articles really capture the debate, and I agree with both sides: the partial-equilibrium effect of divestment should be essentially zero impact on the targeted companies' share price, profitability, or behavior – and yet divestment is a powerful political statement, not unlike a boycott.

Let’s back up: one can make roughly three arguments in favor of divestment:

  1. That it will reduce the share price of the targeted companies, increasing their cost of capital, in turn both reducing investment in the sector and increasing their cost of doing business.
  2. That the sector targeted for divestment is very high risk, so exposure needs to be curtailed. With respect to fossil fuels, this is the "carbon bubble" argument, which goes roughly like this: the current share prices of fossil fuel companies reflect a level of profitability they could only achieve by burning more carbon than we can afford to burn without frying the Earth. So, the argument goes, eventually policy will step in and their valuations will crash.
  3. Finally, there’s the moral argument: you do not want your money invested in companies that do something you disagree with or find objectionable. Divesting, as Stanford has done, can be a powerful political statement. These political statements in turn can change the political environment.

Let's take on these arguments in turn: first, the argument that divestment on its own will reduce the level of activity by increasing the cost of doing business. The fundamental reason this doesn't work is that for every seller, there's a buyer. And there are many, many, many buyers that don't give a fuck about anything moral, and will buy these fossil assets on the cheap and drive the price back up to where it was. But let's even say you DID succeed in impacting the cost of capital and driving up the cost of doing business: this would just raise the marginal price, helping these companies finance new investments internally. And even if you made a few more heroic assumptions about the shape of the supply curve, etc., most energy assets (especially in the oil market) are owned by governments, so capital market prices aren't nearly as much of a factor.

On to the carbon bubble: I certainly hope there is a carbon bubble. I've devoted my young professional life to climate change, and it saddens me to think how little progress we've made. I hope that global policy comes around and puts fossil fuel companies out of business. But – so the question goes – what about our financial system? Don't institutional investors depend on these companies to stay afloat? Well, yes and no. Exxon is one of the largest components of the S&P 500, so it's certainly going to be a part of many investment portfolios. Wouldn't it be a Bad Thing if it went bust? Well, not really. Here's why: (a) if Exxon goes bankrupt, it's because some other company is getting rich supplying the same energy services (i.e. Tesla), and (b) institutional investors are so widely diversified that they'll own the "other side" of the energy transition. Exxon's market value won't disappear; it will be replaced.

Finally, the moral argument. This argument resonates with me, and I support those who make this decision. There is a tricky side to it, though, because if you're selling for non-financial reasons (and therefore at below-market prices), you're just creating economic opportunity for less-savory people. Think of the Koch brothers buying a refinery on the cheap because you were divesting – you make a below-market return, the refinery doesn't go offline, and the worst people ever make more than the buck they deserve. Not a great outcome.

The point of all this is that there is no substitute for policies that address the underlying economics of fossil fuel use. If it's more expensive to burn gasoline and coal, we'll do less of it. Attacking the share prices of companies is the tail wagging the dog.