Metamuse Episode 29 — April 29, 2021

Thinking in probabilities with Taimur Abdaal

Probabilistic modeling is useful for answering all kinds of questions, from assessing financial risk to making engineering time estimates. Yet spreadsheets are poor at this job, which is why Taimur and his colleagues are building Causal. Taimur talks with Mark and Adam about ranges as an intuitive way to estimate; the usefulness of Monte Carlo simulations; and the role of math in dating cave paintings.


Transcript

00:00:00 - Speaker 1: I can really empathize with it because even in my own sort of maths degree, I really struggled with terminology and notation. And I think a big problem in kind of maths education generally is that there’s a lot of focus on notation and terminology, and you kind of miss the forest for the trees.

00:00:25 - Speaker 2: Hello and welcome to Metamuse. Muse is a tool for thought on iPad. This podcast isn’t about Muse the product, it’s about Muse the company and the small team behind it. I’m Adam Wiggins, joined today by my colleague Mark McGranaghan.

00:00:39 - Speaker 2: Hey, Adam, and our guest Taimur Abdaal of Causal.

00:00:43 - Speaker 1: Hey guys, how’s it going?

00:00:44 - Speaker 2: And Taimur, I understand you’re enjoying the London spring so far.

00:00:48 - Speaker 1: Yeah, I actually went outdoors for the first time in, I don’t know, 6 months or something. I’d forgotten just how nice it is to sit on the grass in the sun, just chatting with friends about nothing in particular. Yeah, it was amazing. What an experience.

00:01:03 - Speaker 2: Yeah, the ability to go out and enjoy, we’ve had sort of triple threat here in my household because we’ve had one, the lockdown, which has been pretty severe, of course, for the last 6 months or so.

2, we had a pretty serious winter. In fact, it was snowing today, and 3, I’ve got a, a young child at home, so all of those things mean that I basically barely leave the house.

Happily I do have a dog, so I have to go out for walks on that. If it wasn’t for that, I would never see the outside, I think. Yeah, that’s pretty rough. Well, Tamara, maybe you can tell our audience a little bit about your background, including your podcast and the product you’re working on now.

00:01:41 - Speaker 1: Awesome. So I’m Taymor. I’m one of the co-founders of a company called Causal. We’re building a spreadsheet just for number crunching. So anything involving numbers, we want causal to be the way to do that. On the side, I have a podcast with my brother where we just catch up once a week and chat about whatever’s on our mind. And my background is mostly in maths, so I studied maths at university, and specialized in statistics and machine learning and that kind of stuff.

00:02:06 - Speaker 2: And what sorts of things do people use your product for? Is this a total replacement for a spreadsheet or just a subset of that?

00:02:13 - Speaker 1: Yeah, so it’s really just a subset of that. We can sort of think of spreadsheets as something like causal, our products, plus something like Air Table. So Air Table is kind of taking Over all the non-numerical stuff you might do in a spreadsheet. So making lists of things, managing processes, you know, internal tools and that kind of stuff. And we want causal to be used for anything involving numbers. So any time you need to sort of write formulas that do calculations or visualize data, that kind of stuff is really what causal is about.

00:02:42 - Speaker 2: And I certainly think that one of the main uses for spreadsheets for me in my business life, I guess, as well as helping others, is this modeling, often financial modeling, where you’re just trying to understand, cause of course, money is the lifeblood of a business, but how you earn money because that proves you’re providing value to people, as well as just not running out the money in the bank so that your business doesn’t die, and spreadsheets as a what if tool to understand. Both what might happen in the future, but in many cases it’s just the viability of your business model.

One example I remember is I had a friend who was starting a retro, kind of 1980s arcade, and they really wanted to run the games off of quarters, because that gives that authentic 80s feel. And I said, well, OK, but if you look at the inflation since the 1980s, a quarter isn’t what it used to be. This is a US dollar quarter, of course. So we actually modeled all that out and plugged in a bunch of what-if values, and basically figured out that under no reasonable scenario would it work. We were just taking guesstimates for how many games an hour someone’s going to play, how long they’re going to spend in the arcade, that sort of thing, but basically nothing we modeled showed that it would be viable to stay in business with all the costs. They eventually did settle on a model which was more like a flat rate, you pay $10 or $12 or something when you come in the front door, which ends up both feeling more fair and more fun for the patrons, but also is actually viable. And I think it’s an example of where having ranges, which we were talking about a little bit earlier, helps: you don’t necessarily know exactly what each patron is going to spend on drinks, how many quarters they’re going to put in, or how many games they’re going to play an hour, but you can plug in reasonable ranges, and from that you can infer, maybe it ends up being like a Drake equation kind of thing, what’s viable and what isn’t.

00:04:28 - Speaker 1: Yeah, I think this idea of ranges is really powerful.

And personally, whenever I’m giving an estimate for something, I get really anxious that my estimate is gonna be wrong. And it just gives me a lot of comfort to provide a range, because then I know that it’s probably right, rather than precisely wrong.

And so I find that even when just communicating day to day, if someone asks me for an estimate of something and I give a single number, then for the next 5 minutes I’ll be thinking it through in my head, like, oh, maybe that was wrong. Whereas if I give a range, like, yeah, I think it’s between 5 and 10, then I have the peace of mind of knowing that I haven’t been too inaccurate, I guess.

00:05:06 - Speaker 2: And I think that’s also a way to train yourself to give estimates. I’ve run into this with a lot of folks who, exactly as you said, don’t feel comfortable giving an estimate because they feel like, well, I don’t know. But you can always kind of start with: OK, can you guess the price of product X in a supermarket, or can you guess the weight of this? And maybe you can’t do that, or you feel like you can’t, but you can come up with a number that is so low that it’s clearly outside the bottom of the range, and you can come up with another number that’s clearly so high it’s outside the top of the range. All right, so now you’re working on it. Now let’s narrow this window in.

00:05:40 - Speaker 1: Yeah, that’s one of my favorite tactics is a strong word, but one of my favorite things is, if I’m like talking to a friend, and, yeah, exactly like you described, I think if you ask someone to try and quantify something that they’re not used to quantifying, then they’ll probably just say, Oh, I don’t know, I, I could possibly put a number on that. But then if you ask them, Well, is it more than 10? Is it less than 500, you know, you can actually get to a pretty good range. And it is actually helpful to know that range, rather than just put your hands up and say that it’s unquantifiable.

00:06:08 - Speaker 2: Yeah. Well, maybe that brings us to our topic for today, which is thinking in probabilities. I thought it was really interesting that you mentioned this as kind of a founding idea for you, and then maybe in some ways you moved away from it in the product, or maybe just in the marketing. But tell us what it means to think in probabilities.

00:06:28 - Speaker 1: Yeah, absolutely. So the sort of origin story for causal, it kind of comes from some work I did as a data scientist in a previous job.

I was working for a property tech company where, essentially, the company was placing big bets on houses. And so, in typical fashion, we had a bunch of spreadsheet financial models that would forecast the company’s cash flow, and some pretty big decisions were made on the back of these models, like how many deals can we do every month, how many people can we hire, so on.

And one of the really important things for this company was understanding the risk that we were taking on in each deal. If we were placing a big bet on a house, the house might be worth a lot more than what we thought it’d be worth, or it might be worth a lot less. And actually understanding how those would affect our bottom line was really important. And, you know, in spreadsheets, Google Sheets in this instance, we had to do a bunch of workarounds and hacks to try and get at this idea of, essentially, a probability distribution for how much a house would be worth. There are various ways to try and approximate that in a spreadsheet. But essentially, trying to get at this idea of probability added so much complexity to these spreadsheets that they became unmaintainable. No one really understood how they worked. It was very hard to actually iterate on them. And so that was kind of my first exposure to this problem of how do you crunch numbers when some of them are uncertain? How do you build probabilistic models to try and understand the world? And our starting point for Causal, our original mission, was to bring probability to the masses: to build a tool that makes it so easy to work with probability and uncertainty that it becomes the standard way that people think numerically. Does that kind of make sense?

00:07:59 - Speaker 2: Yeah, to me it leads into the question of how much of it is a tools gap. For the average intelligent, educated, let’s say knowledge worker who has a reason to want to think in probabilities or model uncertainty numerically, how much is it that the tools make it tricky, like you described with spreadsheets? And how much is it more a matter of it being very hard for humans to think this way? Even for intelligent, educated people, it doesn’t come naturally unless you’ve studied math or made this your career or your passion in life, and you’re going to struggle to apply this approach.

00:08:39 - Speaker 1: Yeah, absolutely. I think it’s a really good question and it’s hard to know which side leads to which. An example that I often think of is this idea of having a line of best fit for some data set. It’s quite common, even sort of newspapers, magazines, to see like a 2D chart with a bunch of data points, and there’s some kind of straight line drawn through these data points to kind of extrapolate some kind of trend and tell some kind of story. And if we think about what does that actually mean? I think most people, if they look at a graph like that, they will understand immediately what the graph is trying to say. The graph will typically be trying to say that as this one thing increases, this other thing increases as well, or as this one thing increases, this other thing decreases, without a particularly maths-y background, you can read a chart like that and you understand what’s going on. I think the really cool thing about the line of best fits that is now just sort of super common and everyone gets it.

Is that very few people, unless you’ve sort of studied maths or maybe computer science, very few people will be able to tell you how you’d arrive in that line of best fit.

And the best part is, they don’t need to be able to tell you that. They don’t need to know that behind the scenes, you have to invert a matrix in order to, like, figure out this line or anything like that. And I think in that sense, just visualizing something in the right way is kind of a powerful tool to unlock intuition that we already had.

And so, in the example of the line of best fit, I can describe to you some effect, like: as you get closer to the center of London, property prices go up. I can describe that to you, and you understand what that means in your head. And if I showed that to you on a chart, you’d immediately get what I’m trying to communicate. And so, I think the probability stuff might be similar, where so far, we haven’t had the line-of-best-fit moment for probability. We haven’t found the killer tool, or the killer visualization, that anyone can look at and understand.

I do think probability is just really unintuitive in general as well. But again, it’s hard to say whether it’s unintuitive because we haven’t had some really basic tools like just being able to visualize it, or whether it’s sort of inherently unintuitive for humans.

So I studied a lot of probability and statistics in my degree. And so, after graduating, I kind of felt like I had a good handle on this stuff. But it was after actually facing a lot of these problems, involving how you account for uncertainty in models and things like that, that I realized studying the theory of probability, being able to prove certain theorems and things like that, is actually almost a completely separate task from having the right intuition about these things. There’s a really common example that Nassim Taleb is a big fan of, which is that you wouldn’t want to cross a river that is 4 ft deep on average. And, yeah, obviously, if it’s 4 ft deep on average, it might be 8 ft deep in one particular part, and you might drown.

And so I think he often talks about the dangers of working with averages.

I think another kind of illustrative example has to do with a buffet. Imagine you’re putting together a buffet and there are 10 dishes in it, and each dish takes, on average, about an hour to prepare, and the whole buffet is ready once all 10 dishes are ready. So each dish has an average time of 1 hour. And if you were trying to think about what the average time for the whole buffet to be ready is, it’s tempting to think that…

00:11:50 - Speaker 2: Each dish is ready in an hour on average, and so the whole buffet will be ready in 1 hour on average. But the two answers you would jump to there are either 10 hours, because it’s sequential, or 1 hour, because it’s all parallel.

00:11:57 - Speaker 1: Yeah, exactly. So actually, even in the parallel case, it turns out that the average time for the buffet to be ready is actually a lot more than 1 hour and.

This is sort of like the most basic example of where average outcomes don’t always come from sort of average inputs, essentially.

But I think even after studying statistics at a university level, that would be the kind of thing that I wouldn’t immediately spot.

And now, having spent a lot of time thinking about this and building a product around this concept of probability, any time I hear the word average, an alarm bell basically goes off in my head: OK, what are the 3 or 4 different traps I can fall into when thinking about this problem through the lens of averages?
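
A quick way to see the buffet effect is to simulate it. This is a minimal sketch with made-up numbers: each dish averages 1 hour but varies uniformly between half an hour and an hour and a half (the uniform shape is an assumption for illustration, not from the episode).

```python
import random

# Made-up numbers: each of 10 dishes averages 1 hour, but varies
# uniformly between 0.5 and 1.5 hours. Prep happens in parallel, so
# the buffet is ready when the slowest dish is.
N_SIMULATIONS = 10_000
N_DISHES = 10

buffet_times = []
for _ in range(N_SIMULATIONS):
    dish_times = [random.uniform(0.5, 1.5) for _ in range(N_DISHES)]
    buffet_times.append(max(dish_times))  # wait for the slowest dish

print("average dish time:   1.00 hours (by construction)")
print(f"average buffet time: {sum(buffet_times) / N_SIMULATIONS:.2f} hours")
# Prints roughly 1.41 hours: the average of the maximum is well above
# the maximum of the averages.
```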

00:12:37 - Speaker 3: Yeah, I agree. I tend to think there are two big hurdles people have to overcome.

The first is recognizing that you’re in a probabilistic situation, which is almost all the time: you can’t use a point estimate, you can’t use an average, you need to understand the distributions and the sampling.

And the second is, what is the correct formula to use, or how exactly do you mathematically navigate this probabilistic situation? And in my experience, most people miss the first step. They go to a point estimate, and then it’s already over before it started. You’re not even wrong, right? You’re in flatland; your answer has the wrong shape. And so I think there’s a lot of value in having tools that help you navigate the mathematics once you’re over the first step, but perhaps even more so, tools, stories, experiences, histories that help people be more likely to raise the probabilistic flag, like: warning, we’re entering probabilistic territory. That alarm bell should be going off almost all the time. And so I’m very interested in things that help people get more acclimated to that idea.

00:13:38 - Speaker 1: Yeah, absolutely. I think one sort of common-ish thing people do with spreadsheets, if they do want to understand the uncertainty of whatever they’re trying to model, is to have 3 different scenarios: a best case scenario, a worst case scenario, and a sort of average case scenario. And then you run your whole model for the best case, the worst case, and the average case, and you have these 3 estimates for what your outcomes could be. So some people do make an effort to do that in some settings.

And it’s a step in the right direction, but actually, under the hood, the maths doesn’t really work out there, right? You know, back to our buffet, we have these 10 dishes, which we can prepare in parallel, so we can do them all at the same time. If we said that, OK, on average, each dish takes 1 hour to prepare, and in the worst case, it takes an hour and a half, and in the best case, it takes half an hour.

If you were then trying to figure out the total time for the buffet, you might get some kind of range by assuming they all hit the best case scenario, and that would be the best case scenario for the buffet, and then assuming they all hit the worst case scenario, and that would be the worst case scenario for the buffet. But the maths doesn’t quite work out there. And it’s mostly because our definition of best case and worst case changes from the start to the finish.

So, by best case scenario for a single dish, in our heads, we probably don’t mean the absolute best case. We probably mean something like: 95% of the time, it’ll be slower than this. And same for the worst case; the absolute worst case scenario is the dish doesn’t get ready for 3 years or something, right? So you’re not actually thinking about the best case and the worst case, you’re thinking about a plausible range. But the issue is, when you start combining these plausible ranges lots of times, and we’re doing it 10 times in this case because we have 10 dishes, the equivalent plausible range for the total buffet is not when every dish hits the bottom of its range or every dish hits the top of its range, because every dish hitting the top or the bottom at once is actually extremely unlikely. It’s very implausible.

00:15:42 - Speaker 2: I think that scenario you just described is how engineers estimate their time in a sprint, which is that every single thing they’re going to implement is going to be the best possible scenario.

00:15:54 - Speaker 1: Yeah, absolutely. Yeah, I think this is why it’s so hard to plan projects, because if you just do it on the basis of averages, then, you know, there’s a decent chance at least one of your tasks is not going to be delivered on time.

And if you do want to get some kind of bounds, like a best case and worst case scenario, when you have 10 tasks or whatever, you can’t just take the best case for each and sum them up, or take the worst case for each and sum them up. The only rigorous way to do this is by running lots and lots of simulations of possible scenarios. So in one simulation of the buffet, 3 dishes might take less than an hour and 7 dishes might take more than an hour; in another simulation, they could all take less than an hour, and so on. And if you ran a few thousand simulations, you could get an idea of, say, 95% of the time, how long does the buffet take? Running these simulations is actually the only general and rigorous way to understand the range of possible outcomes for your buffet. Does that kind of make sense?

00:16:51 - Speaker 2: And what you’re talking about here is a Monte Carlo simulation, is that right?

00:16:55 - Speaker 1: Exactly, yeah, yeah. So in maths, this would be called a Monte Carlo simulation.

And actually, you know, running thousands of Monte Carlo simulations for a basic calculation that you might be doing, it’s usually pretty tricky.

The only way to really do it is to write a script that can loop through some calculation 10,000 times and then show you, you know, 95% of the time your buffet takes between this time and this time. And a big part of what we’re trying to do with Causal is abstract away all of this stuff around simulation and probability distributions, and let people just say, hey, each of my dishes takes between 45 and 90 minutes to cook. Now, can you just tell me what the equivalent range is for the total buffet?
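
A rough sketch of the kind of script Taimur describes here, assuming (as a modeling choice, not something from the episode) that each dish’s 45-to-90-minute range is uniform:

```python
import random

# Each dish takes "between 45 and 90 minutes", modeled here as a
# uniform distribution. What range should we quote for the whole
# buffet, prepared in parallel?
N_SIMULATIONS = 10_000
N_DISHES = 10

totals = sorted(
    max(random.uniform(45, 90) for _ in range(N_DISHES))
    for _ in range(N_SIMULATIONS)
)

low = totals[int(0.025 * N_SIMULATIONS)]   # 2.5th percentile
high = totals[int(0.975 * N_SIMULATIONS)]  # 97.5th percentile
print(f"95% of the time, the buffet takes {low:.0f} to {high:.0f} minutes")
# Roughly 76 to 90 minutes: nowhere near the naive "45 if everything
# goes well, 90 if everything goes badly".
```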

00:17:39 - Speaker 2: Yeah, I can see how simulation does cover it, but there is something fun about the Monte Carlo name. When I first learned about it, and unlike, I think, both of you, I don’t have any kind of solid educational background in mathematics, I was digging into the data science world of things, particularly working with the R programming language, and there were exercises that involved doing these simulations. Some very visual ones I quite liked, where they said: OK, you can calculate the area of a circle with the formula, or you can run a simulation where you essentially draw a circle on the wall and then throw darts that land in random X-Y locations. If you do that a thousand times and count how many darts are on the inside of the circle and how many on the outside, you can close in on the value of pi, which I found a very amusing and fun way of going about things.
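
For reference, a minimal version of that dart-throwing estimate (written in Python here rather than R) looks something like this:

```python
import random

# Throw darts uniformly at a 2x2 square containing the unit circle.
# The fraction landing inside approaches (circle area) / (square area),
# which is pi / 4, so multiplying by 4 recovers pi.
N_DARTS = 1_000_000

inside = 0
for _ in range(N_DARTS):
    x, y = random.uniform(-1, 1), random.uniform(-1, 1)
    if x * x + y * y <= 1:
        inside += 1

print(f"pi is approximately {4 * inside / N_DARTS:.4f}")
```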

00:18:35 - Speaker 1: I love that example. Yeah, I think simulation is a surprisingly powerful tool where if you can reframe any problem as almost like a probability question where you can run simulations, it’s surprisingly generally applicable.

And so in the example you gave, you’re reframing the question of the area of the circle in terms of the probability of a dart landing inside the circle versus outside it. And as soon as you reframe it in terms of probabilities, you can just run a bunch of simulations. It takes a while, but you don’t have to be particularly smart about it.

I think most complex problems in maths, they’re often intractable. You know, it’s very hard to express them as a clean equation that you have to solve.

And even if you can express it as a clean equation, there’s often no general way to solve this equation. And so reframing things in terms of like, how can we just do this really dumb thing a million times to get like a really good approximation to the answer is surprisingly generally applicable.

00:19:28 - Speaker 3: Yeah, very powerful technique, and especially useful for situations where you have multiple steps or branches. Even just a few of those can be very simple to describe in human terms, if this, then that, with some chance, and so forth, but once you have any complexity in the situation, it often becomes impossible to get a so-called closed-form solution, which is what you were alluding to: where you have basically some formula you can write down, you plug in numbers, and you get the result. Mathematicians always like such closed-form solutions, to the point where I think initially they kind of pooh-poohed the Monte Carlo world, but I think now it’s shown its power and folks are more open to the numerical approaches.

The study of probability is so interesting because it pops up in so many domains. Once you know to be looking for probabilistic situations, you see them everywhere.

I can give two examples from my experience. The first was in college, where I worked on this thing called RoboCup. RoboCup is where you have toy robotic dogs play soccer. These are dogs that can do basic seeing, and you use video processing algorithms to extract information and program the dogs to play soccer autonomously on a sort of toy soccer field. Anyway, one of the big advantages our team had was the ability for the dogs to locate themselves on the field, which, as you can imagine, is a fundamental thing for programming dogs to play soccer.

And the reason this was so hard was that these were really bad cameras, basically, so you’re getting really choppy visual information. The only way to deal with that is probabilistically, because the data coming in is so noisy, you can’t do anything with it on an if-this-then-that basis. You basically have to say: OK, given all of these observations I’m making about the different landmarks I know about on the field, what is probabilistically the most likely location for me to be in? And furthermore, what is my probability cloud of where I plausibly am on the field? And if I have enough certainty about this probability cloud, then I can undertake certain actions, like kick the ball towards the goal, and so on.
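
A toy, one-dimensional sketch of that idea, not the RoboCup team’s actual algorithm: score a grid of candidate positions by how well each one explains some noisy landmark readings, which yields a normalized “probability cloud”. The landmark layout, noise level, and true position here are all hypothetical.

```python
import math
import random

# Toy 1-D localization: a robot somewhere on a 10m line takes noisy
# distance readings to landmarks at known positions.
LANDMARKS = [0.0, 4.0, 10.0]  # hypothetical known landmark positions
NOISE_SD = 1.0                # assumed sensor noise (standard deviation)
TRUE_POS = 6.3

def likelihood(pos, readings):
    """How well a candidate position explains the sensor readings."""
    score = 1.0
    for landmark, reading in zip(LANDMARKS, readings):
        error = reading - abs(pos - landmark)
        score *= math.exp(-error ** 2 / (2 * NOISE_SD ** 2))
    return score

# Simulate one set of noisy readings, then score a grid of candidates.
readings = [abs(TRUE_POS - lm) + random.gauss(0, NOISE_SD) for lm in LANDMARKS]
grid = [i / 10 for i in range(101)]  # candidate positions 0.0 .. 10.0
weights = [likelihood(p, readings) for p in grid]
total = sum(weights)
posterior = [w / total for w in weights]  # the "probability cloud"

best = grid[max(range(len(grid)), key=lambda i: posterior[i])]
print(f"most likely position: {best:.1f}m (true position: {TRUE_POS}m)")
```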

And then, to give a very different example, in the world of engineering management, I think it’s very fundamental to understand that engineering is a risky endeavor, especially when you’re developing new products. This is an area where I think a lot of people think too deterministically. One example I like to give: imagine you have a multi-step software development process, you need to do A and B and C, and this is actually kind of similar to the buffet example. Each one takes an engineer one unit of work, and an engineer can do 1 unit of work at any given time. Now, you might think you should just assign one engineer to A, one engineer to B, and one engineer to C. In a totally deterministic world, that works perfectly. The gears all mesh, everything turns in unison, it’s perfect. But you have to recognize that there’s inherent variability in how long these tasks take. And so what can happen is, if you’re running the entire team at so-called maximum capacity, then if anyone experiences a task that’s slightly harder than anticipated, you basically grind the gears for the entire thing, because A is holding up B is holding up C, and you go from this world where everyone is fully, optimally working to one where everyone is basically stuck waiting for someone else, and everything kind of grinds to a halt.

And that’s where this idea of slack comes from: if you’re in a situation where you have uncertainty about how long things are going to take, and you have dependencies, counterintuitively, the correct thing to do is to spend some of your time twiddling your thumbs, basically. Because if you try to be doing stuff all the time, you’re inevitably going to get into the situation where you’re grinding the gears.
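
This is classic queueing behavior, and a small simulation makes it concrete. The sketch below is my illustration, not something from the episode: one engineer works through randomly arriving tasks, and average wait grows roughly like utilization / (1 − utilization), exploding as utilization approaches 100%.

```python
import random

# One engineer handles randomly arriving tasks in order. Service times
# average 1 unit of work; "utilization" is the fraction of the
# engineer's time that incoming work demands.
def average_wait(utilization, n_tasks=100_000):
    arrival = free_at = total_wait = 0.0
    for _ in range(n_tasks):
        arrival += random.expovariate(utilization)  # mean gap = 1/utilization
        start = max(arrival, free_at)               # queue if engineer is busy
        total_wait += start - arrival
        free_at = start + random.expovariate(1.0)   # service time, mean 1
    return total_wait / n_tasks

for u in (0.5, 0.8, 0.95):
    print(f"utilization {u:.0%}: average wait {average_wait(u):.1f} units")
# Prints roughly 1, 4, and 19: running near full capacity makes waits
# explode, which is the case for deliberately keeping some slack.
```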

00:22:50 - Speaker 2: And I think there, by slack, you’re referring to the concept of slack, not the product. There’s a book that was influential to me, recommended by one of our mutual colleagues at Heroku, a management book titled Slack, that makes that very argument. It’s sort of a queueing theory thing a little bit, and there are some things about creativity as well, but ultimately, even if you just want to think of everyone on the team as a worker automaton that needs to provide n units of productivity, it actually turns out you have a more efficient system when there’s space in the system, when there’s slack in the system.

00:23:25 - Speaker 3: Yeah, and along these lines, for people who enjoy thinking in probabilistic terms, I would also highly recommend The Principles of Product Development Flow. This is basically a mathematical, queueing-theoretic treatment of product development. When I first heard that, I thought, how can you possibly write interesting equations about product development? But if you just approach it with this lens of probability, or alternatively risk, all kinds of interesting things fall out. So for folks who have a mathematical inclination, I suggest that book.

00:23:54 - Speaker 2: Hm. Yeah, I guess, are risk and probability the same thing in what we’re talking about here? It seems like one is sort of just the inverse of the other, at least in my kind of layperson’s understanding, but I don’t know if that’s correct.

00:24:07 - Speaker 3: Yeah, that’s my intuition. So you could think of risk in engineering delivery time means that there’s a probability distribution.

And in fact, it’s probably long-tailed, where there’s some chance it goes on time, there’s a frankly small chance it happens before you expect it to, and then there’s the real possibility it takes 2 or 3 times as long, or it never gets done, right? That’s what I mean by risk. And similarly, there’s a probability distribution around how much customers are likely to value a given feature or not, and that’s another thing that’s important to consider. You can’t say customers will definitely like it; in fact, there’s some chance they like it, some chance they don’t like it, some chance they really like it.

And in the same way that you need to correctly consider distributions when you’re planning your buffet preparation, you need to consider these distributions when you’re doing product development.

00:24:50 - Speaker 1: Yeah, I think just to add to that, when I think about sort of risk and probability and kind of how are these concepts related, I think risk also kind of captures, I guess, kind of the magnitude of what could result from something. So, for example, if you knew that there was a 1% chance that you’d die by driving a car, yeah, that would be a much higher risk than if there was a 20% chance of getting wet, you know, from walking outside. So I think risk also sort of captures the actual impact of some low probability event. Right.

00:25:19 - Speaker 2: Yeah, there’s some good discussion of this in the 80,000 Hours group, which I follow. They spend a lot of time talking about these kinds of tail-risk events: pandemics, which they were big on before we had one that captured the Western consciousness.

But also things like meteor strikes, and other events, obviously including things that are climate-related. In many cases it’s an acknowledgement that, yeah, the chance of this happening, the probability of this happening, is small, but the expected value of something is the likelihood of it happening times the result.

And so if the result is this huge, huge event, like a species-ending extinction event, even a very small chance of it may be worth investing some resources in protecting against.

00:26:04 - Speaker 3: Yeah, this is an area where even if you do take that first jump of thinking probabilistically, you can still fall short, in particular, if the cases that end up mattering in the expected value calculation are outside of the intuitive probable range.

So you can think of things like meteor strikes and nuclear war and so on, but one that’s very familiar to us, Adam, is earthquakes in California.

So the chance of a very serious earthquake in California is on the order of 1 every 100 years.

So if you just take that, you know, it’s basically outside the 95% confidence interval. So we could say, if we weren’t being too careful, that basically we’re not going to have an earthquake, don’t worry about it. But in fact, the expected damage from such an earthquake is enormous. So in any year, the EV on earthquakes in California is actually non-trivial, and therefore you should do some amount of preparation.
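
The arithmetic is simple once you frame it as expected value; the numbers below are entirely made up for illustration:

```python
# Back-of-the-envelope expected value; both numbers are hypothetical.
p_per_year = 1 / 100    # chance of a major earthquake in a given year
damage = 200e9          # assumed damage in dollars if it happens

expected_cost = p_per_year * damage
print(f"expected earthquake cost per year: ${expected_cost / 1e9:.0f}B")
# ~$2B/year in expectation: "probably no earthquake this year" is true,
# and spending real money on preparation is still justified.
```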

00:26:49 - Speaker 2: That also highlights another challenge, or fallacy, or just a way this whole thing is unintuitive for the way humans think, which is that you often hear folks in California speaking in terms of, quote unquote, we’re due for a big one, because you hear that we should get one every 100 years. That actually masks, or does not correctly capture, the probability we’re trying to express. People convert it to more of a cyclical-time thing, like the way we expect the sun to rise once a day.

In fact, that is not at all what it is. So, working on Causal, you’re working with users and customers who, again, are smart, educated people who need to think in terms of probabilities or risks for their work, and yet maybe don’t have the same mathematics background that both of you have.

I mean, there are countless well-known fallacies, like expecting a string of coin flips to have fewer long runs of heads and tails than it does in actuality.

But what are some of the things where, one, you see folks’ intuition not matching reality? And two, what are some things you’ve found, in the product or maybe more on the marketing and explaining side, that help folks bridge that gap without necessarily getting the mathematics degree?

00:28:09 - Speaker 1: Yeah, for sure. Yeah, I think we’ve had a ton of learnings on the more sort of marketing and positioning side of this kind of product.

In the very early days, our mission was to really focus on this probability stuff. And so on our landing page, we would literally describe Causal as a probabilistic modeling tool. That means something to us.

But I think what we didn’t realize is that for people without a maths background, words like probabilistic, and phrases like Monte Carlo simulation, are just quite scary. I initially found this a little bit frustrating because the term probabilistic model means a very specific thing to me, and it was really hard to try and describe this concept to folks with less mathematical background.

But actually, I can really empathize with it, because even in my own maths degree, I really struggled with terminology and notation and things like that.

And I think a big problem in kind of maths education generally is that there’s a lot of focus on notation and terminology, and you kind of miss the forest for the trees.

And so, you know, even when I was in my 2nd or 3rd year of university, any time I would see a capital sigma, the big sum symbol, which is basically everywhere, because in every branch of maths you’re going to be summing things up…

Any time I’d see the sum of some expression, I’d immediately think: oh man, this is so hard. This looks really complicated. There are all these symbols going on. So I’ve definitely felt that pain of being intimidated by terminology and notation. And I think that was part of the problem initially, when we were using words like probabilistic and Monte Carlo. I’d say by the 4th year of my maths degree, I didn’t have notation anxiety anymore, but it took me a long time to get over it. And I think for a lot of people who didn’t like maths in school, or feel like they were bad at maths, a lot of it just comes down to notation. Once you’re introduced to algebra, you start seeing all these symbols, like X and Y and so on, and it takes a while to get comfortable with that. And it’s easy to fall into the trap of thinking: oh man, I find the notation confusing, therefore I am bad at maths, therefore this stuff isn’t for me.

But I think getting past the language, getting past the notation, is actually a big hurdle. And so for Causal specifically, it took us a few months to figure this out, but we stopped using words like probabilistic. We stopped using words like Monte Carlo. I think, generally, people understand the idea of uncertainty. And so, in terms of how we position the probabilistic aspect of Causal, we usually describe it like this: if you’re uncertain about a particular number, instead of writing a single number, you can say, hey, I think it’s between 3 and 5, or I think it’s between 5 and 10. People pretty intuitively understand ranges. They can probably come up with a range for any quantity in their day-to-day life that they might want to model. And saying, I think something is between 5 and 10, doesn’t require any technical knowledge; it’s pure intuition.

And so, in Causal, people just need to apply their intuition at the point where they can do it well: estimating a range for a particular quantity. Where the intuition breaks down is when you have a model with a bunch of formulas and calculations, where you’re taking all of these 5-to-10s and 10-to-20s and so on, and combining them in some weird way to get a final result. That’s where intuition really breaks down. It’s actually very hard to crunch those numbers in your head.

And that’s where Causal handles it for you. It runs 10,000 simulations and then just shows you the resulting range, say 10 to 20, rather than you having to worry about that side of things. So I think, yeah, lots of learnings on the positioning and marketing side of things.

In terms of actually getting people to think more probabilistically: I think most of the folks who use Causal previously used spreadsheets, and if you’ve had to build a financial model in a spreadsheet, you’re probably somewhat familiar with the idea of best case and worst case scenarios. But I think most people just don’t do them, because it’s very fiddly; it requires a bunch of formulas and things like that.

And so actually getting people to start thinking in terms of ranges has been pretty easy, because people have wanted to do that anyway. It’s just so much of a pain to set up in a spreadsheet that they haven’t ended up doing it. Being able to just write an expression like 5 to 10 in Causal comes very naturally to people, and they do tend to do that quite a lot, because Causal handles the complexity of all of that.

00:32:16 - Speaker 2: In terms of the output they see, you mentioned that you put in a range or a series of ranges and you get out a single range, but have you also found ways to represent that visually, in plots?

00:32:28 - Speaker 1: Yeah, so representing it visually is trickier. I mean, Causal, under the hood, is running all these simulations, and so it has a lot more information than just the range of your possible outcome. It also has the precise distribution of your possible outcome.

You know, the range might be 5 to 10, but it might be more likely to be closer to 10 than to 5, and so on. Or it might have this sort of bimodal thing, where it’s really likely to be close to 5 or to 10, but not likely to be anywhere in the middle.

And so there’s lots of different distribution shapes that might underlie a range, like 5 to 10.

We found that most folks don’t have too much familiarity with reading probability distribution charts. It is a feature in Causal: you can actually see it, like a bell curve if it happens to be like that, or other equivalent charts. But most people aren’t too familiar with those, and so most people don’t end up using them.

What people are fairly familiar with is fan charts. If you’re projecting something over time, you might have a single line or something, and then instead of a single line, you have a sort of fanning-out range, where there’s a visible upper bound to the range and a visible lower bound. Most people really intuitively understand what a fan chart means, and so those are really common. But unfortunately, a fan chart does kind of hide the underlying distribution, and we haven’t yet figured out a really intuitive way to show people the actual distribution in a way they’ll understand.

00:33:44 - Speaker 3: I do feel like those fan charts, which now I know the name for, that’s useful, are perhaps the closest thing we have to the line through the dots in terms of comprehensibility and universality.

I’ve seen those a lot in the financial domain, where you have a balance or a bankroll or a similar investment balance and you run a hundred simulations. You can kind of get a sense of the probability distribution if you have the right number of lines in your fan chart, because you see there are more lines in the middle and fewer lines at the scraggly edges. Not perfect, but it’s pretty intuitive. I also like those because they show the dynamism: if you’re looking at a bankroll, for example, you see that some of these lines really dip close to zero, and some go way up but then come back down, and a lot of them just kind of chug along, so you get some sense of the randomness.

00:34:28 - Speaker 1: Yeah, we’ve had to put a lot of thought into how much detail we want to show in these kinds of visualizations.

So when it comes to fan charts, for example, Causal does have all 10,000 of the simulations, and we could draw each of those 10,000, maybe with a 1% opacity or something, so that you can actually get an idea of the distribution.

But it just adds a lot more complexity to the visualization. And so we’ve had to try and find the balance between complexity and comprehensibility. If we try to be super rigorous and show every single simulation on the chart, chances are most people will look at it, get a bit confused, and not be able to make any sense of it. Whereas if we show the 90% range or the 95% range, it’s much more understandable, and at least people will have an idea of the range of possible outcomes; then maybe, if they want, they can double-click and zoom into the distribution itself. But it is very challenging to visually represent uncertainty. There are a few research departments at universities doing a lot of work on figuring out the best ways to visually represent uncertainty. But, yeah, it’s all about the balance between complexity and comprehensibility.
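
For the curious, the percentile bands of a fan chart can be computed directly from the simulations. A minimal sketch, with a made-up random-walk balance standing in for a real model:

```python
import random

# Simulate many trajectories of a hypothetical monthly balance, then
# report percentiles per time step; shading between the percentile
# lines is what gives the familiar fan shape.
N_SIMULATIONS, N_MONTHS = 10_000, 24

trajectories = []
for _ in range(N_SIMULATIONS):
    balance, path = 100.0, []
    for _ in range(N_MONTHS):
        balance *= 1 + random.gauss(0.01, 0.05)  # made-up monthly return
        path.append(balance)
    trajectories.append(path)

for month in (6, 12, 24):
    values = sorted(t[month - 1] for t in trajectories)
    p5 = values[int(0.05 * N_SIMULATIONS)]
    p50 = values[int(0.50 * N_SIMULATIONS)]
    p95 = values[int(0.95 * N_SIMULATIONS)]
    print(f"month {month}: 5th {p5:.0f}, median {p50:.0f}, 95th {p95:.0f}")
```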

00:35:35 - Speaker 3: Now, we’ve talked mostly about modeling in the sense of going forward: you’re about to begin preparing the buffet, what should you expect in terms of the completion time, approximately one hour from now? There’s also this very interesting world of probability which is basically going backwards: you’ve observed that everything completed in 1 hour and 15 minutes; what does that mean about the underlying tendency for us to complete individual sections of the buffet? And there are all kinds of other examples we could talk about. I’m curious if you see those sorts of use cases in Causal, or if you have other thoughts on that space.

00:36:07 - Speaker 1: We definitely see less of those use cases. The one time it does come up is when you have a bunch of historical data about a particular quantity, maybe a bunch of historical exchange rates between the dollar and some other currency or something. If you then want to use that exchange rate to project something forward, it is helpful to look at what the distribution of this exchange rate has been historically.

And then let’s just assume it’ll probably have a similar distribution going forwards. And so, in that way, instead of just plucking a range out of thin air, like, oh, I think the exchange rate is between 0.9 and 0.99 or something, you can actually infer the distribution from historical data.

And that is a feature we do have: if you have a bunch of historical data for something, we can try and fit what would technically be called an empirical probability distribution onto it, so that you don’t have to put your finger in the air and come up with a range. But we see a lot less of that; the more useful thing does seem to be applying ranges based on your own assumptions, rather than figuring out ranges or distributions from historical data.
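
A minimal sketch of that idea, resampling historical observations as a simple empirical distribution; the rates and the revenue figure below are hypothetical, not Causal’s actual implementation:

```python
import random

# Resample past observations instead of hand-picking a range.
historical_rates = [0.91, 0.93, 0.92, 0.95, 0.97, 0.94, 0.96, 0.93, 0.92]

def sample_rate():
    # Drawing uniformly from history = sampling the empirical distribution.
    return random.choice(historical_rates)

revenue_foreign = 1_000_000  # e.g. next year's revenue in another currency
simulated = sorted(revenue_foreign * sample_rate() for _ in range(10_000))
print(f"90% range: {simulated[500]:,.0f} to {simulated[9_500]:,.0f} dollars")
```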

00:37:09 - Speaker 3: Yeah, maybe we can just talk about some examples from our own experience of this type of probability.

One example that I think is really cool, and this one’s due to Samo Burja, I hope I’m pronouncing his name correctly. This is Samo of Bismarck Analysis, we can link to him in the show notes. He’s made this point about how we’ve historically thought about archaeological discoveries: our timelines only go backwards.

So say, for example, we find the first cave painting and we date it to 5,000 years ago, and we say cave painting has been around for 5,000 years. And then we find another cave painting, and it’s 8,000 years old, and we say, I guess cave painting has been around for 8,000 years. Now, the first observation is that if you take this naive approach, our timelines are only ever going to go backwards, because any time we discover a newer one, OK, we already knew about that, and any time we discover an older one, our timelines for when humans were doing certain things go backwards. Perhaps the correct way to think about this probabilistically would be to say that when we discover the 8,000-year-old cave painting, there’s some underlying distribution of cave paintings, some of which are probably older than 8,000 years old. So the correct estimate is probably older than that. And if we were doing that correctly, we wouldn’t always be getting older; we would be getting more and more refined around the true date, on either side. It’s just one example of how, if you don’t think about things in careful probabilistic terms, especially when you’re doing this sort of backwards projection onto the underlying distribution, you can very easily make mistakes.
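
One way to make this concrete is the classic “max of uniform samples” correction (my framing of the point, not necessarily Burja’s): if finds are spread uniformly over the true timespan, the oldest find systematically understates that span, and scaling the oldest find by (n + 1) / n roughly removes the bias.

```python
import random

# Hypothetical setup: painting has really been around for 10,000 years,
# finds are spread uniformly over that span, and we've found 5 of them.
TRUE_SPAN = 10_000
N_FOUND = 5
N_TRIALS = 10_000

naive_bias = corrected_bias = 0.0
for _ in range(N_TRIALS):
    ages = [random.uniform(0, TRUE_SPAN) for _ in range(N_FOUND)]
    oldest = max(ages)
    naive_bias += oldest - TRUE_SPAN
    corrected_bias += oldest * (N_FOUND + 1) / N_FOUND - TRUE_SPAN

print(f"naive 'oldest find' estimate: {naive_bias / N_TRIALS:+.0f} years off on average")
print(f"corrected estimate:           {corrected_bias / N_TRIALS:+.0f} years off on average")
# The naive estimate runs ~1,700 years young, so new finds keep pushing
# it back; the corrected one is roughly unbiased and moves both ways.
```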

00:38:38 - Speaker 1: Yeah, that’s a really interesting example. I think this actually came up during some of the stats courses that I did at the university. We did a course on Bayesian inference, so using kind of Bayesian theory of probability.

And I think this is one of the few areas where people have actually been applying sort of Bayesian ideas of probability in practice in real life. And I think we actually had a bunch of examples in our sort of lecture notes specifically around archaeological digs. And if you dig up something that’s 50 layers of, I don’t know, sand deep or something, and you think that that’s dated from a certain period, how should you actually think about your new best estimate for how long we’ve been doing the cave paintings? And so there’s a bunch of maths that can actually sort of help you with that. And from my understanding, people are using that maths in archaeological stuff.

Nice. I’m curious how you guys personally think about how much to trust numbers, how much to trust data and statistics. I’ve found that, for myself, I’m just very skeptical of any numbers that anyone tries to throw at me, and I’m usually much more convinced by a theory or an argument that I find highly plausible than by someone trying to convince me of something using data. Where do you guys fall on that spectrum? In what contexts do you trust numbers that people throw at you, and in what contexts do you not?

00:39:56 - Speaker 3: Oh man, so this gets us into the conversation of what you should believe when you read “according to a study” in the newspaper.

And so for me that’s very little, basically nothing. And so I have a lot of trust in statistics and numbers and experiments, but you gotta consider the whole ecosystem.

And when you’re looking, for example, at the ecosystem of publicly described science, there are many, many steps where the data gets systematically corrupted. And so what you’re likely to read at the end is just not that useful.

So just to give some examples here, when a newspaper reports on a scientific study, they’re very likely to report incorrectly because of probabilistic illiteracy.

And then, even among the studies they choose to report on, they’re sampling from the universe of studies, and they might have biases or reasons to only report on a subset of them.

And then furthermore, the stuff that gets published is systematically corrupted, because only certain types of results get published.

And then in terms of the data that goes into both the published and unpublished studies, there’s a lot of fraud and other issues with it. And so by the time you get out to the end, it’s just not that useful.

And if you want to have a chance, you basically need to do a meta-review or a meta-study, I forget what the exact term is, maybe you know. But basically, you round up all of the studies that have ever existed, both published and unpublished, and try to synthesize all the data to say something useful. So because these universes tend to be so complex, and because of all the principal-agent problems involved, I tend not to trust them that much. But when I have my hand on a specific experiment that I understand end to end, and ideally was pre-registered, then I’m quite likely to trust it.

00:41:29 - Speaker 2: And pre-registered here means they didn’t extract a meaning or find meaning post hoc once they looked at the data, but rather that they were using it to test or falsify or prove or falsify a particular hypothesis.

00:41:43 - Speaker 3: Right, so this is one of the areas where historically scientific publishing has gone very wrong. So say you have a new drug, for example, or you have 100 new drugs.

And you privately conduct tests on all 100 of the drugs, using the standard 95% confidence interval. Well, even if all the drugs are placebos that do nothing, you would expect, given your 95% confidence intervals, that 5 of those placebos will falsely return a result saying they’re helpful, by definition. And so if you have the opportunity to publish or not publish whatever studies you want, you can just publish those 5 and say, hey, look, we have 5 drugs that are magic, and in fact you’re attempting to fool the public by randomness. Whereas if you pre-register all 100 studies, then you can’t do that. People can see that, wait, 95% of the stuff you thought might be useful is actually not useful, so you’re just not a very good development company.
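
This is easy to verify in simulation; a minimal sketch:

```python
import random

# 100 drugs that all do nothing, tested at a 95% confidence threshold:
# each placebo trial has a 5% chance of a false positive by construction.
N_BATCHES = 10_000

false_positives = [
    sum(1 for _ in range(100) if random.random() < 0.05)
    for _ in range(N_BATCHES)
]

print(f"average 'effective' placebos per batch of 100: "
      f"{sum(false_positives) / N_BATCHES:.1f}")
# ~5 every time. Publish only those 5 studies and you appear to have 5
# working drugs; pre-registering all 100 trials removes that option.
```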

00:42:33 - Speaker 2: That makes me think of a related concept, in terms of that 95% sort of effectiveness, which is medical tests, and the fact that there’s a pretty strong argument against indiscriminate testing. You would think that the best thing to do, whether you’re talking about a disease or early cancer screening or anything like that, is just to test as much and as often as possible.

But the challenge with a lot of these things is that you get this asymmetry between the false positives and the false negatives. Even if the test is 99% accurate, if the disease only appears in 1 out of every 50,000 people, the number of people who get a false positive, that is to say, a result saying they have the disease when they don’t, vastly outweighs the people who get correct positives. And then you have a bunch of stressed-out people thinking they have a terrible disease. So in fact, you want the doctors to make the judgment call, is there some reason, some symptom we see here, that makes us want to do the test, rather than testing proactively. Which I thought was very interesting, and again was a surprising result to me.

Intuitively, you don’t think of a test with, for example, 99% accuracy as something that would produce such skewed or misleading results. But without knowing that other number, the incidence of the disease in the population you’re running the test against, you actually don’t know what the balance of false positives to true positives is.
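
Working those numbers through Bayes’ rule makes the asymmetry vivid. Assuming, for simplicity, that “99% accurate” means both 99% sensitivity and 99% specificity:

```python
# A test for a disease affecting 1 in 50,000 people.
prevalence = 1 / 50_000
sensitivity = 0.99   # P(positive | disease)
specificity = 0.99   # P(negative | no disease)

true_pos = prevalence * sensitivity
false_pos = (1 - prevalence) * (1 - specificity)

p_disease_given_pos = true_pos / (true_pos + false_pos)
print(f"P(disease | positive test) = {p_disease_given_pos:.2%}")
print(f"false positives per true positive: {false_pos / true_pos:.0f}")
# About 0.2%, i.e. roughly 500 false alarms for every real case, which
# is the argument against indiscriminate screening.
```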

For me, the question of whether I’m convinced by data: certainly, I think it does come down to the fact that putting numbers on things, quantifying things, brings, I don’t know if rigor is quite the right word, but perhaps a concreteness.

When you say something is really, really big, versus saying it is 50 m tall, those two have very different qualities to them. And I feel that when people bring numbers in, either of their own volition or because they’re forced to by scientific practices or something like that, it actually sharpens the thinking.

Now, that doesn’t mean numbers are a magic wand, and that by quantifying things and turning them into data sets, whether it’s a spreadsheet or something else, or the new favorite magic wand, which is data science, your results are suddenly unimpeachable. But in general, that is probably going to do better than more broad, abstract reasoning by analogy or something like that.

But yeah, I guess, certainly the specific case Mark named, of studies as reported on in the news, is something to be very suspect of, but looking at a data set and using it to draw some conclusions can be a very powerful way to understand the world. Taimur, I might also turn the question back to you on product development: to what degree, being a very numerically literate person, do you use some kind of data or quantification in making product decisions or business decisions? Or do you really guide that by intuition, especially maybe in the early days, when the n, in terms of number of users, number of customers, total time elapsed, just isn’t big enough, and you need to go with building what’s in your heart, as I said earlier, or follow your product intuition, and not get hung up on trying to make sense out of a small data set?

00:46:03 - Speaker 1: Yeah, so I think, ironically, we very much err on the side of our intuition and conviction on things. I think, particularly when it comes to big product things, you’re just not going to find the answers in any data set. One really big thing that we grappled with from day one was: we’re building this tool for working with numbers, it’s very general, and so on.

What should the UI for this thing actually be? And, you know, we were always kind of aware that, well, maybe we could make it look a bit like a spreadsheet, because that’ll be more familiar to people. But maybe we want to move people away from that and get them to stop thinking in those terms, so maybe we shouldn’t do that. We actually had our own proprietary UI until about 4 months ago.

And maybe about 6 months ago, we decided: actually, you know what, a lot of people are having trouble getting onboarded. No one was explicitly telling us, look, give me a spreadsheet interface, but there was a lot of friction; there were a lot of things which just weren’t quite working out. And so, in the absence of data, we had to come up with our own analytical model of the world: hey, we’re having these problems because our interface is too hard for new people to use, it’s too confusing, and so we should build the spreadsheet interface. Maybe we could have run some survey asking people, hey, would you prefer a spreadsheet interface or another interface? But designing a survey in a way that I would actually trust would be really tricky, and I don’t know how much I would trust the results of that kind of survey. So I think big product stuff generally does not come from any kind of data. It’s more around our own intuitions.

I’m very happy to trust data to tune an existing thing that we have created. I think data is very good for tuning something you’ve come up with. An onboarding flow is an example of this; I think you guys had a previous episode about onboarding, in the early days of the podcast. Onboarding is a big challenge for Causal, and we have a guided onboarding. Once you make an account, we show little dots on different parts of the UI saying: click here, now type this thing in, now press enter, and so on, to guide people through the main flows of the product. And that’s the kind of thing where we can come up with the structure, OK, we think these are the five steps and this is what we should tell people, and then we can look at the data to optimize that structure. We can see that loads of people are falling off after step 3, so there’s probably a problem there and we should probably change it. Data wouldn’t tell us what the steps should be; we have to come up with that structure ourselves. And then once you have the structure, data is good for refining it and tuning it. So that’s really how I see data: it’s up to us and our own conviction to build the main structure, and then we can use data to refine it a little bit.

00:48:42 - Speaker 2: That reminds me of something a product manager from Pinterest told me they did there, at least at the time, which was to use split tests automatically, just to check that there isn’t a regression in whatever their core metrics are, which might include monetary things, people converting to purchase, or whatever was there. Basically, check their core metrics and make sure this exciting new feature they rolled out didn’t cause something important to tank. It’s kind of a safety check, so it’s almost more of a regression test than something intended to decide product direction.

00:49:20 - Speaker 3: This conversation reminds me of a couple of things. One is so-called AA tests, where you test your AB testing framework and analysis by making the two sides of the test exactly the same.

And so that’s a good way to see if you are likely to fool yourself by randomness, because if you come back with the result that A is bigger than A, well, something’s probably wrong with your probabilistic reasoning.

When you mentioned the idea of a spreadsheet interface, that’s something you see in a lot of tools for thought and productivity apps. For example, Notion and Airtable have this idea of a sort of spreadsheet-like thing that you can put in.

It reminds me of the phenomenon of carcinization, which is the tendency of crustaceans to evolve into crab-like things. Spreadsheets are sort of the crab of the productivity tool world. It’s like everyone kind of wants to be a crab slash spreadsheet, depending on where you are.

00:50:08 - Speaker 1: Yeah, that’s really funny. We had similar things. One of our investors slash advisors is a chap who’s been in the financial modeling game for a very long time. He has a business selling Excel financial model templates, so he’s tried every number-crunching tool under the sun, really. And early on, he basically predicted this. He told us: look, every tool I’ve ever seen created for this eventually ends up looking like a spreadsheet. That’s all I’m saying, you know, do whatever you want with that information. He called it about a year and a half ago.

00:50:39 - Speaker 2: Yeah, well, sometimes the process of being a product creator, especially when you’re trying to do something truly novel, is to try all your weird and exciting ideas, and unfortunately, most of them will probably turn out to be not effective, and then you realize why it is that the boring standard thing that everyone uses is boring and standard, is because it really works. But hopefully you find those few weird ideas that, in fact, are breakthrough and can make a difference in the world.

00:51:11 - Speaker 1: Yeah, we almost sort of had to figure out from first principles that a 2D grid is a good way of displaying two dimensional data.

00:51:20 - Speaker 2: Yeah, I like a little bit that approach of throwing out assumptions and throwing out kind of a sense of, well we’re doing it this way because that’s the way we’ve always done it. I think it’s very easy to build products that way to say, OK, well, obviously you start with the login page because everyone has a login page and then you have a page that’s like this and a screen that’s like this. And you’re just going based on assumptions of following established patterns and throwing those out and saying, OK, now what are we trying to accomplish here and what if we design something truly new? And more often than not, you do end up back at those established patterns because they’re good for a reason or they work well for a reason, but I feel like it’s a truer and more pure way to arrive at those, kind of building it up yourself, as opposed to kind of just imitating without knowing the underlying reasons.

00:52:05 - Speaker 1: Yeah, absolutely.

00:52:06 - Speaker 3: This is actually reminding me of the importance, to my mind, of studying combinatorics and probability as a predecessor to statistics. A lot of folks I see these days study statistics and just get the formulas, like how you do a two-tailed t-test or whatever, and they don’t have the underlying intuition. I think it’s much more useful to have the underlying intuition, especially from combinatorics, which is the study of counting, and therefore gives you probability. So if folks are interested in this space, I would suggest starting with how to count things, in combinatorics.

00:52:39 - Speaker 2: Well, let’s wrap it there. Thanks everyone for listening. If you have feedback, write us on Twitter at MuseAppHQ or via email, hello at museapp.com. You can help us out by leaving a review on Apple Podcasts. And Taimur, I’m glad you’re building a tool for thinking in probabilities, because I think we all need it.

00:52:59 - Speaker 1: Cool, thanks a lot for having me. This has been a lot of fun.


Metamuse is a podcast about tools for thought, product design & how to have good ideas.

Hosted by Mark McGranaghan and Adam Wiggins