Microsoft should provide an R environment in Excel (and buy RStudio to help do it)

Excel doesn’t seem to be going anywhere, and nor should it — it’s a great tool, and the best one for countless use cases. At the same time, analyses that would previously live in Excel are now being done in R and Python, and many more analyses that would previously not have been done at all are now being done in R and Python.

(Everything I’m about to write applies to both R and Python equally, but I know the R space a lot better.)

At the same time, the consumers of these analyses are often not code-literate. But they want to feel ownership of the analysis, be able to play with assumptions, etc. They don’t just want the results handed to them. This is a core benefit of Excel.

Additionally, while un- or semi-structured data (e.g. natural text, JSON files) and media (images, audio, video) are becoming more common as the subject of analyses, tabular data is, and will be, the most common data format for the foreseeable future. Excel handles these formats natively and transparently as worksheets.

I can’t tell you how many times I would have loved to have done the following for a client:

  1. Store the raw data as (a) worksheet(s)
  2. Have a key assumptions sheet that lists the key inputs for an analysis, which the user can change
  3. Have R scripts that read from the assumptions and the raw data, and produce outputs (charts, statistics, tables, etc) right into Excel
  4. Have a button that, when pressed, sources a script (that can of course call into other scripts)
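To make those four steps concrete, here is a minimal sketch of the workflow. It is written in Python rather than R (the same idea applies to both), and it uses CSV text as a stand-in for worksheets, since no such Excel integration exists to call. Every name in it (the sheet contents, `read_sheet`, `load_assumptions`, `run_analysis`, the `growth_rate` and `min_revenue` inputs) is hypothetical and only there for illustration.

```python
import csv
import io

# Hypothetical stand-in for the "key assumptions" worksheet (step 2).
ASSUMPTIONS_SHEET = """name,value
growth_rate,0.05
min_revenue,100
"""

# Hypothetical stand-in for a raw data worksheet (step 1).
RAW_DATA_SHEET = """region,revenue
North,120
South,80
East,200
"""


def read_sheet(text):
    """Parse one worksheet (given as CSV text) into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def load_assumptions(text):
    """Turn the assumptions sheet into a {name: float} mapping."""
    return {row["name"]: float(row["value"]) for row in read_sheet(text)}


def run_analysis(assumptions_text, raw_text):
    """What the button in step 4 would trigger: read the inputs and
    return rows destined for an output worksheet (step 3)."""
    assumptions = load_assumptions(assumptions_text)
    rows = read_sheet(raw_text)
    # Keep only regions at or above the user-editable revenue threshold.
    kept = [r for r in rows if float(r["revenue"]) >= assumptions["min_revenue"]]
    return [
        {
            "region": r["region"],
            "projected_revenue": round(
                float(r["revenue"]) * (1 + assumptions["growth_rate"]), 2
            ),
        }
        for r in kept
    ]


results = run_analysis(ASSUMPTIONS_SHEET, RAW_DATA_SHEET)
for row in results:
    print(row)
```

The point of the design is that the user only ever edits the assumptions sheet; changing `growth_rate` or `min_revenue` there and re-running changes the outputs without anyone touching the script.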

Obvious bonus points would be if you could fire up an R editor right within Excel that could refer to worksheets and cells inside of its environment as just another tab inside of Excel. In short, why can’t Excel look like this:

Screenshot 2018-11-07 19.20.30

Of course, it would be great if the text files could be used with version control like Git.

Obviously, this would not be straightforward to do, but Microsoft could make it happen and build Excel for the next generation. It would also open up a zillion new use cases and deliverable formats for internal or external consulting firms (like mine).

Why do I make the additional assertion that Microsoft should buy RStudio? Because R is growing like crazy, and RStudio and Jupyter are becoming the default environments for doing analysis in R. But Jupyter is an open source organization, so not something Microsoft can buy, while RStudio is a company focused on supporting its development products.

If Microsoft wanted to catch up to the curve in terms of where the best teams do their analysis (no longer Excel), it should find a way to own the R space.

And it is trying to do that — it has an entire R ecosystem. But even as someone who works day-in, day-out in R, and follows everything going on in R, I have no idea how it works. It feels to me that it’s aimed at IT managers in enterprises and not the individual data scientists that are doing the most to drive adoption within those companies. It’s not the right approach — as they know, because they bought GitHub, whose primary adoption vector was through individual (or small team) adoption within larger enterprises.

If I were in corporate development and/or product strategy at Microsoft, this would be a no-brainer to me.

“Should”

I think “should” — especially when found in its past tense construction “should have” — is one of the most dangerous words to utter and thoughts to allow yourself to think.

“Should” can’t exist on its own. There is no objective “should”. When something “should” be some way or another, that’s only true in relation to a particular standard, that’s not typically referenced or even acknowledged.

Why “should” is dangerous is that it’s easy to forget (or most people don’t know) that this standard needs to come along with it, and the “should” feels just objectively true. When one says “this should be easier”, “I shouldn’t have to work this hard”, etc, etc, it’s easy to feel like that “should” is just objectively true. But it’s not.

Make the Senate matter less

On Politics Twitter there has been a lot of chatter about how undemocratic the Senate is as an institution — e.g. that a person in Wyoming has 68x more influence on the Senate via their vote as someone in California.

The Senate is not going to change any time soon, even if other things, like DC or Puerto Rican statehood, may even out the partisan imbalance.

And there is at least one straightforward way to decide the presidential election by popular vote.

But there is another way to address this issue: make the Senate matter less.

I continue to see the intractability of agreement at the federal level as something that could be more providentially solved by moving more domestic policy to the state level.

This is not an argument for less government, or more government, just a shift in the locus of government from the federal level to the state level. Theoretically, this should not be a partisan issue: let liberals in California and New York have the government they want and conservatives in Utah and Kansas have their style of government.

 

Trump & Federalism

I don’t think I’m the only person who grew up distrustful of the generically Republican position that we should weaken the federal government and “send more power back to the States” who is now re-evaluating the power of the federal government, now that it is in the hands of the spray tan menace.

I think for a lot of us, this position seemed a bit too close to a modern articulation of “states’ rights”. It was definitely true in the past — and is probably true today — that it is impossible to evaluate Republican policy positions separately from the party’s antagonism toward civil rights.

But, assuming it is possible to do that, I’m finding the idea of moving more and more policy decisions out of the federal government and into respective state governments more and more attractive.

For example, the country does not agree on health care policy. The citizens of New York, Massachusetts, and California probably agree. And those of Utah and Idaho probably agree. (In all cases, I mean agree to the degree required to reach a consensus in government). But New York and Utah definitely don’t. But should they have to?

Most states are as large as national governments elsewhere in the world. Even Utah, which is 31st in US State GDP, has an economy almost as large as New Zealand’s and a population over 3 million people. By comparison, Denmark and Norway, two Scandinavian countries often cited as the gold standard for quality of government, have populations of 5.7 million and 5.2 million, respectively.

So each state definitely has the resources to run their own health care system. Why not let Utah run their system and let New York run their system? I don’t know why this is not possible, but I also don’t think it makes much sense to continue trying to get the whole country to adopt a single system when some coalitions in some states want one thing and other coalitions in other states want something else. And why wouldn’t this be the best possible way to resolve the disagreement? If one approach succeeds and the other fails, won’t there be a lot of pressure on the unsuccessful states to adopt the more successful approach?

This logic definitely doesn’t apply everywhere — some things really do have to be handled at a national level. Climate policy, for example; or the military.

But many domestic policies that are intractable at the national level are much more tractable at the state level (or lower). I’m beginning to believe we should stop trying to agree on policies and accept our disagreement.

Whatever else ____ is, ____ is also a thought.

Over the past few years I’ve experimented with meditation. I’ve always been an “in my own head” guy and — like many millennials — I’ve turned to meditation in an attempt to wrangle more control over the thoughts and feelings spinning in my brain.

I’ve got much more to say about my experience with meditation, but for now I’ll share one transformative (and perhaps banal) personal insight I had the other day.

I used to be a Headspace guy, but recently I’ve been trying 10% Happier and am really loving the variety of approaches and teachers. One meditation I’ve done a few times is called “The Reset” and is designed for times when you’re feeling overwhelmed and totally crazy busy.

The line the teacher used is: “whatever else ‘busy-ness’ is, it’s also a thought”. And it is! Busy-ness is real in the sense that you may have way more things to do than can reasonably be done; you may be running late; etc. But it’s also the experience of feeling “I’m so busy”, “I have so much to do”, “I’m so overwhelmed”, etc. And those are thoughts. They are prompted by the real-life experience you’re having, but they are also independent of it.

Meditation teaches you to identify, label, and ultimately dismiss thoughts on command. So why not do so with busy-ness? It can be done.

But there’s nothing special about the experience of “busy-ness”. Whatever else ____ is, ____ is a thought. You can learn to say, “oh, that’s me being angry”, or “oh, that’s me being nervous” or “oh, that’s me feeling slighted”, and so on.

So am I trying to say that I’ve reached this level of enlightenment and that I can choose not to feel busy, get angry, or feel anxious? No — not at all. Or at least not in all situations all the time. But, sometimes, I can. Maybe about 10% of the time. And that’s a pretty good improvement.

Routines

I have read many times that successful people often have very stringent routines. The book Daily Routines is literally just a list of different habits of famous artists and creatives. You can go to the accompanying blog to get an idea.

The most important constant is that there is a routine at all, even if there aren’t many similarities across routines (some work in the morning, some in the evening, and so on).

I’ve always struggled to create a routine for myself. For instance, I exercise about 70% of days. But the time I exercise is hardly the same from day to day. Breakfast? Who knows? Lunch? Could be anywhere from 11am to 4pm. And so on.

This writing — every day when I have my coffee, I write a short blog post — is an(other) attempt at creating a rhythm for myself.

For the most part, where I’ve landed is the idea of a daily checklist; I have a few things that I want to do every day, and I hit the x on each one as I do them. And as I go about my day, I’m thinking about getting those things done. I use Strides to do that (which is, for instance, how I know I work out about 70% of days).

The checklist helps keep me on track, but I still want to get that rhythm down.

Apologies

On the same day this article came out about Linus Torvalds, the founder and BDFL of Linux

Screenshot 2018-09-28 11.47.11

I had the occasion to apologize to someone I work with. The parallels between Linus’s story and my own struck me (although everything about his story is much larger compared to mine): we both lead technical, remote teams that principally communicate in writing. And we both can have the tendency to forget how much of an impact our words can have when we fire them into the ether.

In short, I had been an asshole, and it was time for me to apologize.

I’ll never understand why people don’t apologize, and apologize directly. I think it’s one of the most underutilized skills and behaviors in the world. Everyone is human; everyone makes mistakes; everyone hurts other people, intentionally or unintentionally. If you don’t take responsibility for it, those transgressions accumulate and eventually people (even the people you’re closest with) will no longer want to have anything to do with you. If you do, then not only do you reset the relationship, but you can actually build a stronger one by signaling that you’re willing to accept some humble pie for the sake of the relationship — that it matters that much to you. There’s a very big difference in results between the two behaviors and it just doesn’t compute to me why people don’t see that and can’t bring themselves to say they’re sorry.

Relatedly, Sorry Watch is a pretty funny blog that analyzes public apologies (e.g. by celebrities and corporations) and savages wishy-washy non-apologies.

 

Insomnia

I’ve struggled with on-and-off insomnia my whole life. Sometimes I can’t fall asleep. Sometimes I can’t stay asleep. For months at a time, I will sleep normally, and hope that the issue is behind me. Then, out of nowhere, or sometimes out of somewhere, insomnia will come roaring back. For a week, or sometimes weeks at a time, I will barely sleep. The thought of “it’s bedtime, time to get into bed” fills me with anxiety because I know that tonight ain’t it. Only when I finally build up a ton of sleep pressure do I know that things will sort themselves out. I just got out of a major insomnia streak, but last night things reverted. Here’s hoping they revert back quickly.

Textbooks

Textbooks are great.

Although I have some training in statistics and machine learning, for the most part I’m completely self-taught. A big part of that teaching has been to buy a great textbook on a given subject and to follow along until I really understood the subject. Coursera, YouTube, Stack Overflow, tutorials on the internet, etc are great, but textbooks are excellent too, and I think they are underutilized. Especially since there is a mental categorization of “that’s for when you’re in school” that people fall into.

It was at business school, when I started doing some serious data science, that my addiction to buying textbooks started. Right now I’m working through Design and Analysis of Experiments with R since Gradient is doing more (and more complex) survey-based experiments (I think at this very moment we are running four discrete choice experiments for clients).

Screenshot 2018-09-26 08.57.41

I’ll be honest, this one is a bit of a slog. But getting experimental design right is crucial for successful projects so I’m working through it; I’ve had a few aha moments already — like why they’re called “orthogonal” designs.

Most of the textbooks I use are written to work with R, although not all. Some of the textbooks that go into my hall of fame are:

How we think about hiring

Gradient is growing, and we need to hire. We are kicking off our process this week. It’s funny: along with sales, how to hire well is one of those crucial, practical things they don’t teach you in business school. We have been sort of flying by the seat of our pants each time we run this process.

In our typical fashion, we decided we would try thinking things through and doing them a bit differently. For our last process, we decided very early on that we would not think about defining a “role” and finding someone that could be slotted into that. I basically don’t believe in defined roles for knowledge-work. The whole idea is that it’s something that requires creative thought, so how can you predict in advance what will be needed?

Anyway, even if defined roles make sense elsewhere, work at Gradient is far too fluid for it to work with us. We all need to pitch in on all areas of the business: building statistical models, hooking up our CRM system to our email system, evaluating different AWS servers, and so on.

So the way we did it last time was to define a whole bunch of

  • Experiences (projects or models they had built in the past that were synonymous with what we do)
  • Skills (things they know how to do, from statistical modeling to server management and so on)
  • Traits (most importantly: do they want to work remotely?)

And give each of them a weight. People that had a bunch of highly-weighted attributes were more attractive to us than people that did not, simple as that. We even automated a large part of this process, having interested candidates take a survey that gave us a score. Take it here, if you’re interested — although consider this one obsolete.
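The weighting idea is simple enough to sketch in a few lines. This is only an illustration of the general approach, not our actual survey scoring: the attribute names, the weights, and the two example candidates below are all made up.

```python
# Hypothetical attributes and weights; the real ones are not listed in this post.
WEIGHTS = {
    "built_survey_models": 5,   # an experience
    "statistical_modeling": 4,  # a skill
    "server_management": 2,     # a skill
    "wants_remote_work": 5,     # a trait
}


def score(candidate_attributes, weights=WEIGHTS):
    """Sum the weights of every attribute the candidate reports having."""
    return sum(w for name, w in weights.items() if name in candidate_attributes)


# Hypothetical survey responses, one set of attributes per candidate.
candidates = {
    "candidate_a": {"statistical_modeling", "wants_remote_work"},
    "candidate_b": {"built_survey_models", "statistical_modeling", "server_management"},
}

# Rank candidates from highest to lowest total score.
ranked = sorted(candidates, key=lambda name: score(candidates[name]), reverse=True)
for name in ranked:
    print(name, score(candidates[name]))
```

A plain weighted sum like this has no notion of a "role": a candidate can score well through any mix of experiences, skills, and traits, which is the property we wanted.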

I think that for the next hire, we’ll follow a similar process. It maps well to my mental model of how our company works and worked very well for us last time.