Declining real income in the US

I am not a big fan of income inequality per se as a policy issue. I care more about poverty alleviation and improving the living situation of the majority of Americans than about capping the income or wealth of a segment of the population. Of course, there is an argument around concentration of power, and of heading off a course where we end up having the modern equivalent of a landed aristocracy, which I’m receptive to. But putting that aside for the moment, when I see graphs that show that income inequality is rising, I find myself thinking “that is a combination of two statistics, only one of which (the income of the average or lower-income segment of Americans) I really care about”.

So I decided to do a quick analysis, thinking it would show that income was rising across the board for all segments (even if income was rising faster in some segments vs. others, creating inequality). This is not what I found.

I took a sample of IPUMS data, which has individual-level data from the US census. I then built a series of smooth additive quantile regression models using the package qgam. (A quantile regression is where you’re estimating a conditional percentile, such as the median — 50th percentile — or some other, like 90th or 10th, as opposed to the mean. Smooth and additive = think generalized additive model).
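
For readers who want the definition behind that parenthetical, here is a sketch in standard notation. It is the textbook formulation, and qgam's actual fitting procedure may differ in detail; the symbols y_i (observed income), x_i (year, age, sex), q (the fitted function) and τ (the target percentile as a fraction) are labels chosen here for illustration. The τ-th conditional quantile is the function that minimizes the total "pinball" loss:

```latex
\rho_\tau(u) = u\,\bigl(\tau - \mathbf{1}\{u < 0\}\bigr),
\qquad
\hat{q}_\tau = \arg\min_{q} \sum_{i=1}^{n} \rho_\tau\bigl(y_i - q(x_i)\bigr)
```

With τ = 0.5 the loss is half the absolute error, which is why the 50th-percentile fit is the median. With τ = 0.9, under-predicting costs nine times as much as over-predicting, which pushes the fit up toward the 90th percentile.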

I then regressed the individual-level income data on the year, age of the respondent, and the sex of the respondent. Here are the graphs of the year coefficient for the 10th, 50th, and 90th percentiles:

Basically, the largest incomes have kept increasing. The median income has stagnated from sometime between 1990 and 2000. And worst of all, the lower incomes have declined to roughly 1980 levels in the past 20 years.

Now, this isn’t the same thing as saying the income of the wealthiest Americans has kept increasing (people’s incomes change quite a bit from year to year), but it is true to say that if you happen to find yourself in the 10th percentile of incomes in a given year, you’re going to be earning substantially less than you would have been in 2000. That is a Bad Thing.

What is socialism?

Like many Americans, I have watched with some amazement the rise of AOC and other forces on the left, often under the guise of “socialism”.

But does anyone know what it means? “Socialism” has come to confuse two different things:
  1. A form of economy: State or collective ownership of the “means of production”, meaning government planning and control
  2. Policy aims: income redistribution, state subsidies to reduce poverty, government provision of healthcare
And boy, are these extremely different. Consider: the classic book The Road to Serfdom, which argued that a planned economy (socialist under definition [1] above) would invariably lead to a loss of freedoms, also contained explicit exceptions for the policy aims of a welfare state, so long as they were pursued within the context of a capitalist economy.
But is this true? At the time I wrote that tweet, I didn’t have any data to back up my assertion. I had just recently read The Road to Serfdom, so I had a fairly good idea of how socialism was viewed then, and I follow the political discourse, so I had an intuition about how it’s viewed today. Turns out, according to Gallup data from 1949 and today, that my intuition was luckily right.
Why does this matter? You might, like me, strenuously object to any movement towards socialism if it meant collective control of the economy while being comfortable with a welfare state. But the muddled definition of socialism out there does not help people draw lines effectively. According to the graph above, most people think of socialism as a set of policy aims; whereas most die-hards view socialism as a different form of economy. That can lead to this kind of confusion:

solving the “wedding table” problem in prolog

what is prolog?

in a logic program you write down facts and rules, and run the program by writing a query. a typical fact in prolog looks like this:
adult(tom).
which means that tom is an adult. in many ways this looks like a function, but it’s not; nothing gets returned by this line. a rule looks like this:
grandparent(X, Y) :-
  parent(X, Z),
  parent(Z, Y).
the :- part can be read aloud as “is true if”. the whole rule can be read very naturally as “X is Y’s grandparent if X is the parent of Z, and Z is the parent of Y”. (in prolog, variables are written with capital letters). and then once you have your facts and rules written down, you run the program by writing a query. say you wrote this:
grandparent(X, tom).
the program would try to find some X that satisfies the conditions of the program and makes everything true.
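
to make this concrete, here is a minimal sketch you can load into SWI-Prolog (an assumption; any standard prolog should behave the same way). the parent facts and the names alice and bob are invented for illustration, and the grandparent rule is the one from above.

```prolog
% made-up facts for illustration: alice is bob's parent, bob is tom's parent
parent(alice, bob).
parent(bob, tom).

% X is Y's grandparent if X is the parent of some Z, and Z is the parent of Y
grandparent(X, Y) :-
    parent(X, Z),
    parent(Z, Y).

% query, typed at the ?- prompt:
% ?- grandparent(X, tom).
% X = alice
```

prolog tries each parent fact in turn: alice and bob fit the first goal, bob and tom fit the second, so X is bound to alice. no other combination of facts satisfies both goals, so that is the only answer.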

what is the wedding table problem?

now that we’ve gotten those basics out of the way, let’s dive into the problem we’re going to solve. i’ve never had a wedding, but many of my friends have. a common problem, as i understand from second-hand experience, is how to set the seating arrangement. you have dozens of family and friends from all stages of life, in many cases coming from all over the country (or the world), all sitting down at the same moment. it’s a big deal! complicating matters, within any given stratum, your guests are probably friends with each other. but, crucially, some of your guests may hold grudges against others. being the good host that you are, you want to seat friends with each other, and keep people with grudges apart. this is all, of course, subject to the constraints on the number of tables and seats that the venue holds.

let’s work through the solution

let’s start simple. first we need to know if two people are at the same table.
at_this_table(Person1, Person2, ThisTable) :-
    member(Person1, ThisTable),
    member(Person2, ThisTable).
this is pretty self-explanatory. member tests for set membership: is Person1 in the set called ThisTable? but in general we’re not going to care much whether two people are at a particular table, but rather that they’re both at the same table, whichever it is. to do that, we simply write the following.
at_same_table(Person1, Person2, AllTables) :-
    member(ThisTable, AllTables),
    at_this_table(Person1, Person2, ThisTable).

pair_at_same_table(Pair, AllTables) :-
    nth1(1, Pair, Person1),
    nth1(2, Pair, Person2),
    at_same_table(Person1, Person2, AllTables).
first, we’ve now introduced this idea of AllTables, which is a variable that will be important throughout the rest of the code. the first goal is basically just saying that, first, there is a table among all of the tables available to us, and second, that that table contains both of the people we’re interested in. the second goal simply makes it a bit easier to work with a set containing both of the people in the pair. let’s move to the more dramatic issue: making sure that enemies don’t sit at the same table.
at_different_tables(Person1, Person2, AllTables) :-
    member(Table1, AllTables),
    member(Table2, AllTables),
    member(Person1, Table1),
    member(Person2, Table2),
    intersection(Table1, Table2, []).
this one was a bit tough to write. first of all, it’d be trivially easy to satisfy this constraint by not seating one of the people at any of the tables. to guard against that, we first have to make sure that both people are, in fact, sitting at one of the tables in AllTables. that’s what the first four lines are doing. the last line is just saying that those two tables share no common member: that they are different tables. there is probably a cleaner and more direct way of doing this, but it works, so ¯_(ツ)_/¯
extending this so we can handle pairs of people, we get the following. the first clause just handles the special case where the “pair” is an empty set.
pair_at_different_tables([], _) :- !.

pair_at_different_tables(Pair, AllTables) :-
    nth1(1, Pair, Person1),
    nth1(2, Pair, Person2),
    at_different_tables(Person1, Person2, AllTables).
so, now we have a bit of a problem. it’d be really easy for the program to get around the constraints by making multiple copies of people! say you want david to sit with geoff, and geoff to sit with andrew, but you want to keep david and andrew apart. easy! geoff will sit in two places at once. if you don’t explicitly tell the program that this is not possible, that’s exactly what you’ll get. but, when i tried writing simple solutions to this, i kept running into code that would hang endlessly. some investigations of the debugger told me why. to understand it, you have to understand a bit about what it means to be a “ground” variable in a logic program.
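
as a quick preview, and assuming SWI-Prolog: the built-in ground/1 succeeds when a term contains no unbound variables. seat_status is a hypothetical helper, invented here (it is not part of the program) to label each seat of a half-filled table.

```prolog
% seat_status(+Seat, -Status): a seat is 'taken' once it is bound to an
% actual person, and 'open' while it is still an unbound variable.
seat_status(Seat, Status) :-
    (   ground(Seat)
    ->  Status = taken
    ;   Status = open
    ).

% a table with three seats, where the middle one is not decided yet:
% ?- maplist(seat_status, [david, _, geoff], Status).
% Status = [taken, open, taken]
```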


let’s say you’re solving a sudoku puzzle. in the early stages of solving it, if you were to enumerate all the options in each cell that you hadn’t eliminated yet, you might have something that looks like this:
   4      1679   12679  |  139     2369    269   |   8      1239     5
 26789     3    1256789 | 14589   24569   245689 | 12679    1249   124679
  2689   15689   125689 |   7     234569  245689 | 12369   12349   123469
  3789     2     15789  |  3459   34579    4579  | 13579     6     13789
  3679   15679   15679  |  359      8     25679  |   4     12359   12379
 36789     4     56789  |  359      1     25679  | 23579   23589   23789
  289      89     289   |   6      459      3    |  1259     7     12489
   5      6789     3    |   2      479      1    |   69     489     4689
   1      6789     4    |  589     579     5789  | 23569   23589   23689
most of the cells are not locked in yet. the 4 and 3 in the upper left have been resolved to single values, but everything else in that 3×3 sector is still open. in this example, the 4 and 3 are “ground” (tied to a specific value), whereas the other cells are unground: they could still inhabit a range of values.
when the logic program executes, it’s going to create a number of these unground variables that could take on a range of values. so if we were to stop the program mid-way and take a look at one of the tables in our arrangement, it would be a mix of people’s names and variables representing a range of possible people. when we want to make sure that someone is only at one table, we only care about these “ground” variables, those representing actual people. if we don’t do this, the program doesn’t really know how to execute, and just hangs. you’ll notice this \+ in the third goal below. that means “not”. let’s work through these goals in turn.
ground_member(X, Y) :-
    member(X, Y),
    ground(X).
simple — this is just finding an X such that X is ground and part of Y
ground_members_of_table(Table, GroundTable) :-
    findall(X, ground_member(X, Table), GroundTable).
sliiiightly more complicated. this is true if GroundTable has all of the “named” members of a table at any point in time. we’ll use that below.
not_ground_member_of_table(Person, AnotherTable) :-
    ground_members_of_table(AnotherTable, GroundTable),
    \+ member(Person, GroundTable).
as before, this pulls all of the “named” members into GroundTable and the next line ensures that the person we’re testing is not part of that set. finally we get to the end:
only_at_one_table(Person, AllTables) :-
    member(Table1, AllTables),
    member(Person, Table1),
    subtract(AllTables, [Table1], OtherTables),
    foreach(member(AnotherTable, OtherTables), not_ground_member_of_table(Person, AnotherTable)).
which very simply does a bit of work to create all the tables the person should not be at (any other table than the one they’re currently at), and then ensures that the person is, in fact, not at any of those other tables.
we’re so close! now, we just have to make sure that our computer doesn’t do any funny stuff by making some tables magically large or small on us. we need a way to constrain the tables to be as large as we want. so below we have assign_length, which, for reasons that will make sense below, assumes that you have a list of tables and a list of lengths that you want to map to them, and works on one table at a time (the one at the specified index Idx).
assign_length(Idx, AllTables, Lengths) :-
    nth1(Idx, AllTables, ThisTable),
    nth1(Idx, Lengths, LengthOfThisTable),
    length(ThisTable, LengthOfThisTable).
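
the reason length/2 can pin down the size of a table: when it is called with an unbound list and a known number, it builds a list of that many fresh variables. a minimal sketch, assuming SWI-Prolog; empty_table is a made-up helper name, not part of the program above.

```prolog
% empty_table(+Seats, -Table): Table is a list of Seats unbound variables,
% i.e. a table with that many seats and nobody sitting down yet.
empty_table(Seats, Table) :-
    length(Table, Seats).

% ?- empty_table(3, Table).
% Table is a list of three distinct unbound variables
% (how they are printed, e.g. _ or _123, depends on the prolog version)

% once the length is fixed, a fourth guest cannot be squeezed in:
% ?- empty_table(3, Table), Table = [a, b, c, d].
% false.
```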
okay, so now we can pull all this together. this is the final goal which we can query to find arrangements that will work! going through it line by line, it’s pretty straightforward. the first three lines ensure that there will only be as many tables as we want, and that they will have only as many seats as we want. the fourth and fifth lines ensure that pairs that we want to sit together are at the same table, and that pairs that we want to sit apart are at different tables. the last two lines ensure that everyone is just sitting at one table.
arrangement(Tables, Lengths, Pairs, Exclusions) :-
    length(Tables, Y),
    length(Lengths, Y),
    foreach(between(1, Y, X), assign_length(X, Tables, Lengths)),
    foreach(member(Pair, Pairs), pair_at_same_table(Pair, Tables)),
    foreach(member(Excl, Exclusions), pair_at_different_tables(Excl, Tables)),
    flatten(Pairs, AllPeople),
    foreach(member(Person, AllPeople), only_at_one_table(Person, Tables)).
run this code, and what do you get?
?- arrangement([X,Y], [3,3], [[a,b], [c,d]], [[a,c]]).
X = [a, b, a],
Y = [c, d, c]
% variables may be duplicated within the same table if there is space
and impossible arrangements will fail:
?- arrangement([X,Y], [3,3], [[a,b], [c,d]], [[a,b]]).
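
if you want to double-check an answer once every seat is filled in, one option is a separate checker that only looks at finished arrangements. this is a sketch under assumptions, not part of the program above: valid_seating and same_table are hypothetical names, it assumes SWI-Prolog's library(lists), and it is stricter than arrangement in that it rejects any name that shows up in more than one seat.

```prolog
:- use_module(library(lists)).

% same_table(+P1, +P2, +Tables): some single table contains both people.
same_table(P1, P2, Tables) :-
    member(Table, Tables),
    memberchk(P1, Table),
    memberchk(P2, Table),
    !.

% valid_seating(+Tables, +Pairs, +Exclusions): Tables is fully filled in,
% nobody is seated twice, every pair shares a table, no excluded pair does.
valid_seating(Tables, Pairs, Exclusions) :-
    ground(Tables),
    flatten(Tables, Seats),
    is_set(Seats),
    forall(member([A, B], Pairs), same_table(A, B, Tables)),
    forall(member([C, D], Exclusions), \+ same_table(C, D, Tables)).

% ?- valid_seating([[a,b,e], [c,d,f]], [[a,b], [c,d]], [[a,c]]).
% true.
% ?- valid_seating([[a,b,a], [c,d,c]], [[a,b], [c,d]], [[a,c]]).
% false.
```

the second query fails because a and c each take up two seats, which is_set/1 rejects.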

Microsoft should provide an R environment in Excel (and buy RStudio to help do it)

Excel doesn’t seem to be going anywhere, and nor should it: it’s a great tool, and the best one for countless use cases. At the same time, analyses that would previously live in Excel are now being done in R and Python, and many more analyses that would previously not have been done at all are now being done in R and Python.

(Everything I’m about to write applies to both R and Python equally, but I know the R space a lot better.)

At the same time, the consumers of these analyses are often not code-literate. But they want to feel ownership of the analysis, be able to play with assumptions, etc. They don’t just want the results handed to them. This is a core benefit of Excel.

Additionally, while un- or semi-structured data (e.g. natural text, JSON files) and media (images, audio, video) are becoming more common as the subject of analyses, tabular data is, and will be, the most common data format for the foreseeable future. Excel handles this format natively and transparently as worksheets.

I can’t tell you how many times I would have loved to have done the following for a client:

  1. Store the raw data as (a) worksheet(s)
  2. Have a key assumptions sheet that lists the key inputs for an analysis, which the user can change
  3. Have R scripts that read from the assumptions and the raw data, and produce outputs (charts, statistics, tables, etc) right into Excel
  4. Have a button that, when pressed, sources a script (that can of course call into other scripts)

Obvious bonus points would be if you could fire up an R editor right within Excel that could refer to worksheets and cells inside of its environment as just another tab inside of Excel. In short, why can’t Excel look like this:

Screenshot 2018-11-07 19.20.30

Of course, it would be great if the text files could be used with version control like Git.

Obviously, this would not be straightforward to do, but Microsoft could make it happen and build Excel for the next generation. It would also open up a zillion new use cases and deliverable formats for internal or external consulting firms (like mine).

Why do I make the additional assertion that Microsoft should buy RStudio? Because R is growing like crazy, and RStudio and Jupyter are becoming the default environments for doing analysis in R. But Jupyter is an open source organization, so not something Microsoft can buy, while RStudio is a company focused on supporting its development products.

If Microsoft wanted to catch up to the curve in terms of where the best teams do their analysis (no longer Excel), it should find a way to own the R space.

And it is trying to do that: it has an entire R ecosystem. But even as someone that works day-in, day-out in R, and follows everything going on in R, I have no idea how it works. It feels to me that it’s aimed at IT managers in enterprises and not the individual data scientists that are doing the most to drive adoption within those companies. It’s not the right approach, as they know, because they bought GitHub, whose primary adoption vector was through individual (or small team) adoption within larger enterprises.

If I were in corporate development and/or product strategy at Microsoft, this would be a no-brainer to me.


I think “should” — especially when found in its past tense construction “should have” — is one of the most dangerous words to utter and thoughts to allow yourself to think.

“Should” can’t exist on its own. There is no objective “should”. When something “should” be some way or another, that’s only true in relation to a particular standard, that’s not typically referenced or even acknowledged.

Why “should” is dangerous is that it’s easy to forget (or most people don’t know) that this standard needs to come along with it, and the “should” feels just objectively true. When one says “this should be easier”, “I shouldn’t have to work this hard”, etc, etc, it’s easy to feel like that “should” is just objectively true. But it’s not.

Make the Senate matter less

On Politics Twitter there has been a lot of chatter about how undemocratic the Senate is as an institution — e.g. that a person in Wyoming has 68x more influence on the Senate via their vote as someone in California.

The Senate is not going to change any time soon, even if other things, like DC or Puerto Rican statehood, may even out the partisan imbalance:

And there is at least one straightforward way to make the presidential election by popular vote.

But there is another way to address this issue: make the Senate matter less.

I continue to see the intractability of agreement at the federal level as something that could be more providentially solved by moving more domestic policy to the state level.

This is not an argument for less government, or more government, just a shift in the locus of government from the federal level to the state level. Theoretically, this should not be a partisan issue: let liberals in California and New York have the government they want and conservatives in Utah and Kansas have their style of government.


Trump & Federalism

I don’t think I’m the only person who grew up distrustful of the generically Republican position that we should weaken the federal government and “send more power back to the States” who is now re-evaluating the power of the federal government, now that it is in the hands of the spray tan menace.

I think for a lot of us, this position seemed a bit too close to a modern articulation of “states’ rights”. It was definitely true in the past (and is probably true today) that it is impossible to evaluate Republican policy positions separately from the party’s antagonism toward civil rights.

But, assuming it is possible to do that, I’m finding the idea of moving more and more policy decisions out of the federal government and into respective state governments more and more attractive.

For example, the country does not agree on health care policy. The citizens of New York, Massachusetts, and California probably agree. And those of Utah and Idaho probably agree. (In all cases, I mean agree to the degree required to reach a consensus in government). But New York and Utah definitely don’t. But should they have to?

Most states are as large as national governments elsewhere in the world. Even Utah, which is 31st in US State GDP, has an economy almost as large as New Zealand’s and a population over 3 million people. By comparison, Denmark and Norway, two Scandinavian countries often cited as the gold standard for quality of government, have populations of 5.7 million and 5.2 million, respectively.

So each state definitely has the resources to run its own health care system. Why not let Utah run its system and let New York run its system? I don’t know why this is not possible, but I also don’t think it makes much sense to continue trying to get the whole country to adopt a single system when some coalitions in some states want one thing and other coalitions in other states want something else. And why wouldn’t this be the best possible way to resolve the disagreement? If one approach succeeds and the other fails, won’t there be a lot of pressure on the unsuccessful states to adopt the more successful approach?

This logic definitely doesn’t apply everywhere — some things really do have to be handled at a national level. Climate policy, for example; or the military.

But many domestic policies that are intractable at the national level are much more tractable at the state level (or lower). I’m beginning to believe we should stop trying to agree on policies and accept our disagreement.

Whatever else ____ is, ____ is also a thought.

Over the past few years I’ve experimented with meditation. I’ve always been an “in my own head” guy and, like many millennials, I’ve turned to meditation in an attempt to wrangle more control over the thoughts and feelings spinning in my brain.

I’ve got much more to say about my experience with meditation, but for now I’ll share one transformative (and perhaps banal) personal insight I had the other day.

I used to be a Headspace guy, but recently I’ve been trying 10% Happier and am really loving the variety of approaches and teachers. One meditation I’ve done a few times is called “The Reset” and is designed for times when you’re feeling overwhelmed and totally crazy busy.

The line the teacher used is: “whatever else ‘busy-ness’ is, it’s also a thought”. And it is! Busy-ness is real in the sense that you may have way more things to do than can reasonably be done; you may be running late; etc. But it’s also the experience of feeling “I’m so busy”, “I have so much to do”, “I’m so overwhelmed”, etc. And those are thoughts. They are prompted by the real-life experience you’re having, but they are also independent of it.

Meditation teaches you to identify, label, and ultimately dismiss thoughts on command. So why not do so with busy-ness? It can be done.

But there’s nothing special about the experience of “busy-ness”. Whatever else ____ is, ____ is a thought. You can learn to say, “oh, that’s me being angry”, or “oh, that’s me being nervous” or “oh, that’s me feeling slighted”, and so on.

So am I trying to say that I’ve reached this level of enlightenment and that I can choose not to feel busy, get angry, or feel anxious? No — not at all. Or at least not in all situations all the time. But, sometimes, I can. Maybe about 10% of the time. And that’s a pretty good improvement.


I have read many times that successful people often have very stringent routines. The book Daily Routines is literally just a list of different habits of famous artists and creatives. You can go to the accompanying blog to get an idea.

The most important constant is that there is a routine at all; even if there aren’t many similarities across routines (some work in the morning, some in the evening, and so on).

I’ve always struggled to create a routine for myself. For instance, I exercise about 70% of days. But the time I exercise is hardly the same from day to day. Breakfast? Who knows? Lunch? Could be anywhere from 11am to 4pm. And so on.

This writing — every day when I have my coffee, I write a short blog post — is an(other) attempt at creating a rhythm for myself.

For the most part where I’ve landed is the idea of a daily checklist; I have a few things that I want to do every day, and I hit the x on each one as I do them. And as I go about my day, I’m thinking about getting those things done. I use Strides to do that (which is, for instance, how I know I work out about 70% of days).

The checklist helps keep me on track, but I still want to get that rhythm down.


On the same day this article came out about Linus Torvalds, the founder and BDFL of Linux

Screenshot 2018-09-28 11.47.11

I had the occasion to apologize to someone I work with. The parallels between Linus’s story and my own struck me (although everything about his story is much larger compared to mine): we both lead technical, remote teams that principally communicate in writing. And we both can have the tendency to forget how much of an impact our words can have when we fire them into the ether.

In short, I had been an asshole, and it was time for me to apologize.

I’ll never understand why people don’t apologize, and apologize directly. I think it’s one of the most underutilized skills and behaviors in the world. Everyone is human; everyone makes mistakes; everyone hurts other people, intentionally or unintentionally. If you don’t take responsibility for it, those transgressions accumulate and eventually people (even the people you’re closest with) will no longer want to have anything to do with you. If you do, then not only do you reset the relationship, but you can actually build a stronger one by signaling that you’re willing to accept some humble pie for the sake of the relationship, that it matters that much to you. There’s a very big difference in results between the two behaviors and it just doesn’t compute to me why people don’t see that and can’t bring themselves to say they’re sorry.

Relatedly, Sorry Watch is a pretty funny blog that analyzes public apologies (e.g. by celebrities and corporations) and savages wishy-washy non-apologies.