Smooth Brain Society
In an attempt to change the way information is presented, we’ll be speaking to researchers, experts, and all round wrinkly brained individuals, making them simplify what they have to say and in turn, hopefully, improving our understanding of a broad range of topics rooted in psychology. Join us as we try to develop ourselves, one brain fold at a time.
Instagram: @thesmoothbrainsociety
TikTok: @thesmoothbrainsociety
Youtube: @thesmoothbrainsociety
Facebook: @thesmoothbrainsociety
Threads: @thesmoothbrainsociety
X/twitter: @smoothbrainsoc
https://linktr.ee/thesmoothbrainsociety
#37. Lies, Damned Lies and Statistics - Dr. Liza Bolton
The use of stats and throwing around numbers in conversation is incredibly common, yet statistics itself is poorly understood. Dr. Liza Bolton from @universityofauckland discusses the dark art that is statistics. Using examples, she takes us through some misconceptions and dispels the notion that numbers don't lie. We cover how to identify the best ice cream store, how not to be fooled when stats are mischaracterized in media and politics, why we worship the nat 20 dice roll, and whether the 27 club for musicians is a real thing.
Dr. Bolton's website: https://www.lizabolton.com/
Missing data in police warrant information: https://www.stuff.co.nz/pou-tiaki/300938784/data-shows-ongoing-racial-bias-in-police-warrantless-searches
Gang numbers: https://thespinoff.co.nz/politics/23-04-2024/have-gang-numbers-really-skyrocketed-in-recent-years
Support us and reach out!
https://smoothbrainsociety.com
https://www.patreon.com/SmoothBrainSociety
Instagram: @thesmoothbrainsociety
TikTok: @thesmoothbrainsociety
Twitter/X: @SmoothBrainSoc
Facebook: @thesmoothbrainsociety
Merch and all other links: Linktree
email: thesmoothbrainsociety@gmail.com
Statistics is a lifestyle, baby. Like, how do you think better in the world? You should be saving this for when the actual podcast starts too. That's such a good line. Saying these things. That's such a good line. Yeah, hey, anything you like, I can do it for the TikTok. Welcome everybody to the Smooth Brain Society. Today Alex is back with us, all the way from Germany now actually, after his move, and I'm in India right now, so it's international. And
we have Dr. Liza Bolton on from the University
of Auckland. Wait, is it Auckland University
or University of Auckland? Ooh, I think our brand guide would tell you University of Auckland, but that does not stop a single person calling it Auckland Uni. Yeah, no, cause, cause we
were technically Victoria University of Wellington
and they tried to rebrand us for God knows
how long and people still call it Vic. So, yeah.
But yeah. So Dr. Liza Bolton is a teaching fellow at Auckland University and describes herself as a stats enthusiast. We met at a
conference where she was talking about statistics
and what do you say, journalistic practices
and understanding stats. That was basically
the crux of her talk and she's been keen to
come on and kind of talk about some of the misconceptions
in statistics, some things to be wary of. I
think it's very important considering any news
article ever or anything, even vaguely political,
stats are thrown around, but they're not necessarily
always interpreted in the right way. So thank
you for coming on, Liza. Thank you so much for
having me and for also being flexible about
the timing when I was sick last week. So I will
hopefully not cough into your microphone too
much for today's session. And yeah, thank you
for that introduction. I think that's a really...
I'll just bring you everywhere with me to introduce
me actually, because that's a much better job
than I tend to do.
No, I quite like the way you introduce yourself
on your LinkedIn, on your website, all those
things. You make stats sound way more fun than
someone like me who does research thinks it
is.
I think I tend to have a lot more fun with statistics than a lot of people's experiences have been, which makes me feel like I'm almost a little bit of a stats evangelist, where I'm
like, everyone could be having this much fun
if only, if only. So I am, I'm definitely on
a mission to give you a bit of a statistical
smile by the end of this. I love that phrasing.
I'm just gonna, I'm gonna tell everybody, for
audio listeners mainly, that Liza came on with,
I think, a 60-slide presentation as well, with all different examples which she
wants to share with us. Of course we're not
going to go through all of them but that just
kind of shows the excitement in wanting to talk
about her work and we're happy to learn from
her. So I'll actually turn it over to you and
I'll let you decide where you want to get started
from. Where do you think we should go? Oh yeah,
no, I love that, very dangerous, but I'm very
excited to have control of the ship for a little
bit. So what I thought I would start with is
actually putting up one of my close to 60 slides,
you're right, because one of the things I really
like about your show is the sort of premise
of having someone like me who's really keen
to talk about my topic, having yourself of course
as the host, and having Alex who I've just
met for the first time, sort of here as, I guess, our guinea pig. That's my understanding, that we're here to experiment. I thought I would get to know Alex live a little bit by asking him which of these bogs he falls into. So a bog is like a swamp or a wetland, basically something you don't really want to stick your foot in; you want to stay on the boardwalk in between. And this is a concept that's sort of an old concept in stats education, which is part of my background, but I think it's a really helpful concept for all of us to kind of check in on. So I'll explain what these two things
are. And for folks who are only listening to
the audio, there is like a visual for this,
but it does not matter whatsoever. It is just
two ChatGPT/DALL-E generated pictures of bogs.
So on one side you have the bog of overconfidence.
And this is the person who kind of thinks that
statistics is basically math, right? Or maths, I should say, for our New Zealand and British audience members, but
that statistics is this kind of perfect science
and that it tells us the truth and only the
truth. And it is one way to just be right. You
win all your arguments, you're always correct.
Ta-da, that is statistics. That is the bog of
overconfidence. If you ever interacted with
numbers and statistics and thought of them as
these perfect truths, that might be the bog
that you're... you're teetering off the boardwalk
towards. On the other side though, there is
the bog of disbelief. And this is probably the
bog that me as a statistician, I am most liable
to fall into, Alex. So I'm curious which one
of these will be your bog. But the bog of disbelief
is kind of, oh, well, statistics is just magic.
If you are, you know, unscrupulous enough,
you can make the numbers say anything you want.
Lies, damned lies, and statistics. So yeah,
this bog of disbelief is there's nothing of
value. There's nothing I can trust here because
anyone can just make it say anything. It can
mean anything. Does that make sense? Do you
sort of feel those two different swamps? Which way does your swamp balance tip? Which swamp would you like to take home today? Um, wow. It's such
a great, such great options between these two
bogs. Um, I think I've definitely, I think when
I was younger, like back at like middle and
high school. totally the bog of overconfidence
where I was really into science and I thought
stats and science meant the same thing and or
were representative of each other. And it was
only when I became more, I don't know, I started
taking up more subjects in the humanities.
And for context, I did a film and media studies
degree where I would probably put myself in
the bog of disbelief because I feel like I've
become a lot more interested in bias or the
ways in which people present information to
get certain results. And I think that's, you
know, that tweaking that you were mentioning,
that sort of magic, it can seem like where
you get a set of data and two different people
with two different perspectives might come
up with completely separate, you know, interpretations.
So I definitely would, I'm going to take home
the bog of disbelief. I think that's me. That's
my swamp. All right. Ding, ding, ding. We have a winner. Lovely. Yeah. I really like the way
you described that. How about for you, Sahir?
Has one of those bogs called to you in your research experience? So it's actually really
funny. We were... I can give you an anecdote,
which is pretty cool. One of my colleagues,
lab mates, he was doing his master's project
and he was doing it on proteomics. So this
is like this entire thing of trying to measure
proteins in your blood and looking at their
composition and basically the idea that different
levels of different proteins are going to kind
of indicate towards different mental health disorders.
That was basically the idea but it involved
a lot of data and therefore a lot of stats and
then a lot of the readings can also get messed
up. So how do you clean all your data and so
on and so forth. I asked him, so what stats
analysis are you going to do? And he just said,
I'm basically going to sacrifice a goat over
the keyboard and see what shows up. And statistics is very, very close to that. And you kind of see it, the more you start working with stats. And I've worked with a lot of stats.
I've done like large data stuff. I've done like
molecular research and all these things and
the different types of stats you use and the
different formulations people come up with
it can of skin it can at sometimes feel like
a bit of a dark art it can That yeah, what's
going on? There's this thing called a this is
type of correction. I think called LSD correction
and My my supervisor loves it because he's like
it's LSD correction. Who doesn't love LSD?
Yeah, we have drugs, sex, and rock and roll.
That is the statistics way, as we say. I love
that. Oh, I love that. But yes, long story short, I'm also probably more in the bog of
disbelief now. Yeah, or more towards that side.
I know that you need stats and they can be
very useful and they are very useful, generally
speaking. But yeah. Yeah. No, it's interesting
because I'm definitely in there with you on
my darker days. But when I've done this kind
of question with my students, students who are
intentionally in a statistics classroom, a
lot more than I expected put themselves in the
bog of overconfidence. And this is something
we tend to know in stats education that the
folks who come to us on purpose, even though
there's a lot of different pathways to being
a wannabe statistician or realizing you need
some for your degree, whatever else it is, I
think... Because I'm someone who really cares
about teaching ethics and statistics and teaching
statisticians how to write and talk to non-statisticians.
And yet a lot of the folks who traditionally
come to us come with that mindset of statistics
as just the math that pays better. You know,
you get to maybe be a data scientist and they
have good marketing and there's a good job at
the end of it. And so actually trying to help
them see how much of human decision making and
human choice and our attitudes and biases are present in this work we do. That's
something that I really, really care about.
And, cause you were talking, Alex, a little
bit about, especially being exposed more to
the humanities. There's this really cool art
installation about statistics. And this is
one of the reasons why I really like statistics
because statistics art installation is probably
not something we all have a lot of experience
with. But being in this field is actually like
quite a creatively fulfilling kind of place
to be. And so I think there's a lot more to
the experience of being in a statistical field than what we usually represent to the outside world.
And so I don't actually know how to say this
name, but it's an art installation called the
Library of Missing Datasets. So in 2016, it
had a physical installation, but you can go
see some cool pictures of it online. And one
of the really important skills of a statistician,
but also of... all of us, I would argue, is
this idea of seeing what's not there as well.
So you can only sacrifice the goat over the
keyboard for the data you have, but how did
you choose to get that data? So, Sahir, in this
research, you set out an experiment, you just
decided certain measures to collect, but what
about the things you didn't think to measure?
Or what about the group of people who aren't
represented in your study? Because you only
have access to college students from a certain place? Sure, a lot of my day-to-day work is dealing
with what's in front of me in terms of my data,
but having an eye for what's not there, seeing
the unseen, and advocating to measure the unmeasured,
I think, can be a really important role. It's
the yin-yang, I guess, of statistics. You work
with the data in front of you, sure, you do
your best, but it's just as important you advocate
for what's not there, and realize what isn't there might mean you're missing something, if that makes sense.
So a little bit of statistical art for you.
I was not expecting that. That's really cool.
That's really interesting. A way of representing,
you know, to people, there is a bigger picture
when you're looking at statistics. That's really
interesting. I've just been thinking about the
bogs metaphor. It feels like people, let's say actors who use statistics in a political context, for example... From your description, it seems like there's a trust
that people view statistics with that overconfidence
that when someone is handed the statistic, it
will meaningfully change the things they believe
because it's a statistic and statistics hold
power. But I think it's quite interesting to
realize that statistics are an interpretation
of a data set that was constructed with purpose.
Right? Yeah, that's really, that's really cool.
That was so beautifully said, Alex. Like, so
beautifully said.
Here's an interesting example from just, like, New Zealand news today. I saw an article, or maybe it was last night, that popped up on The Spinoff, and it was about some very specific
statistical claims being made about gang membership
in New Zealand. Now, this is not my research
area or anything like that, but... as someone
who likes to think about numbers and how we
measure things. I was really curious to see
what else they'd say about it. And it turns
out that these, it basically was claiming these
really high proportional increases of like
six new gang members a day in New Zealand and
it's increased 41%. I don't have the number
in front of me, but I'll give you the link to
make sure you can pop it in the doobly-doo
if you want. But people were kind of going like,
oh, that sounds bad, but is that true? And...
If you have a look through the article, the
source of the data is New Zealand police. There's
sort of a list of, my understanding is gang
members that they are like aware of basically.
So a registry of some format. And the police
themselves were actually pretty good about
trying to communicate the limitations of that
data and that changes in how they do surveillance
or how they categorize these different engagements
could be having just as much of an effect on
those numbers at the moment as actual increases
in members. So you could have actually had
a decrease in gang membership, but if there
was better systems from the police to capture
information and add it to this registry, you
would see an increase. Right. So it's like
one of those exactly as you were saying, Alex,
this piece of political data that is being
used to hammer home that National is tough on crime and Labour wasn't. And yet, the data they're using... well, I'm sure there is some signal there.
Um, and doesn't mean it's not a matter for
concern or discussion, but I think making sure
that the people, as you were sort of talking
about, Alex, who are engaging with that data politically, realize that statistics is the science of uncertainty,
not the science of, I am absolutely right and
you should listen to me and exactly what I
say should change your mind. Like that's not,
that's not what we do here. Wow. That's interesting.
I haven't looked at that article. I should look
at that because that sounds really interesting
how they explore that limitation and how that's
been perceived by the media or political groups
as well in the general public. Yeah. Wow. I
have a friend who's a data journalist who's
worked, tried to work quite a lot with New Zealand
police data. And we had a really interesting,
so there's articles out, so I'm not saying anything
that hasn't been published yet, but they were
looking at missingness. So once again, this
idea of what's not there. And I believe it
was like... uh, search warrant data, if I'm
remembering, and this is the articles by Sapir
Mehran for Stuff. And it was looking at basically
interactions from police, I think in the context
of search warrants. And it was looking at ethnicity for this. Um, but when we were having this conversation,
we're going, well, how does this data capture
someone with more than one ethnicity? Are they
only choosing one? Is the police officer looking
at someone and picking what they think their
ethnicity is? Or is the person being stopped asked about it? And it was the kind
of stuff that was really quite hard to get a
handle on the quality of this data because
it's not collected as the main purpose of a
search warrant. It's collected as administrative
data alongside trying to do whatever else is
being done. And so trying to tease out like
one of the points of the story was that the
number of missing ethnicity codes seemed to
really increase. And you could say, well, is
that a sign that they're choosing not to record
it because they got a lot of flak over how high the proportions of Māori and Pacific people
were, or has it just become harder for police
to collect that data? Did someone make the
form back at the office worse and it's harder
to collect it? And like, there's so many logistical
and deeply human reasons why data quality might
not be good that has very little to do with
the data itself in some ways. So. Yeah, well,
what's not there is a pretty important feature
of the New Zealand political social landscape.
It's a tricky one. Yeah. Should we move from
what's not there to what is there? Because that's
the only thing we can actually really interpret
and use. Well, no, I guess we can fill things in with what's not there. But I see you pulled
up something called restaurant reviews, so you
know where you want to go with this. Yeah,
it's a short one for you. It's like a little
interlude. It's a little more fun than searching
people's homes and police data. So I have made
up two fake restaurants for the folks listening.
Because I'm a girl who loves a reference and
a pun, they're called Gentleman's Gelato and
Ihaka's Ice Cream after Ross Ihaka and Robert
Gentleman, the folks who were here at the University
of Auckland when they created R, which is a
statistical programming language. I am not
going to make anyone do any programming today,
but I did think I might like to honour these
folks in a very, very silly way. So Alex, I'm
putting you on the spot, my friend. You are
picking, we're all getting ice cream after this,
you know, you and Germany, me here in New Zealand,
Sahir in India. And it's going to be up to you
to pick where we're ordering delivery. Okay.
So I've got two options for you. You've got
Gentleman's Gelato, they have a 4.9 star rating
with four reviews, and you have Ihaka's Ice
Cream, slightly lower at 4.7 with 3,141 reviews.
Right. Ignoring the names and the signs because
these are made up. Well, all of it's made up.
But don't use the vibes, use the reviews and
the ratings. Tell me a little bit about where
you would order ice cream for us from. I think...
I feel like I... there's an idea, I guess the
idea of safety in numbers would come up to me,
right? Like looking at Gentleman's Gelato with
the four ratings but a higher overall rating,
I feel like I would say, well, I guess four
people potentially had an amazing experience
here or three people and then one didn't have
such a great experience, but that's great. But then more people have had a really awesome, or on average awesome, experience at Ihaka's Ice Cream. So I'd probably go with
the one with more people, even though it's a
slightly lower rating. I love that, really well talked through. Sahir, how about you?
See, you shouldn't ask me these questions because
I'll say something completely different. I would
be going to Gentleman's Gelato more because
I want to find out where they lost that 0.1
star. Like, what happened?
Or maybe it's because they're new and like we'll
try a new place out. Maybe that's why they
don't have many reviews. But on the face of
it, suppose instead of like going to an ice
cream shop, this was like a recipe for something
online. And if you're looking for recipes,
the two recipes are there. One has like thousands
of reviews, the other has three or four. Even
if it has like a slightly lower star rating,
I'd probably go with that one, because I feel more people would have tried the recipe out and vouched for it. Yeah, I love
how you both describe that. This is what I taught
my dad to do whenever he looks at Google reviews
for places. He always tells me when he's done
it. I'm very proud of him. Because yeah, exactly
as you had mentioned, there's certainly a whole
bunch of other human considerations here. Like,
oh, four reviews, they must be new. I should
give them a try. So I'm not saying... only
make your decisions with statistics, but I am
saying you can use statistics to support good
decisions. There used to be sort of a saying
in the field of like using data to drive decision
making, and I think it should be more like using
data to support or inform decision making,
because it's no good taking the human out of
it, but it can be really good putting the data
into our decision making as a support. So my
statistician-approved order for our ice cream, gentlemen, will indeed be Ihaka's Ice Cream here. But exactly as you were verbalizing
to me about, I'd be willing to take the slightly
lower rating for having more information here,
more reviews. And this actually comes back to
a pretty important idea in all statistics.
So anytime you have run like a statistical test
in your research, there's these ideas, both
of the magnitude of the effect, the size of
this effect, but also how our sample size, or the quantity of data we have available
to us, how that helps us think about uncertainty.
So the way you could think about it is, okay,
sure, you rock up to Gentleman's Gelato, you
might have that exceptional 4.9 experience,
but you could just as easily, if that's the
average experience, but there's not a lot going
on in there, sort of pull up and get the two-star
version of the restaurant's experience here.
That smaller sample size is going to make
it harder for you to sort of narrow in on your
uncertainty. Basically, you see a small amount
of data, you realize you have a lot of uncertainty
about what this experience is going to be like.
Even though it looks good, the signal looks
good, that 4.9, lovely. But the statistician in me would be like, nope, let someone else be adventurous and try that; I want a good ice cream experience. Whereas with the
Ihaka's ice cream, even though it's a lower
value, because I have so much more data, so
much more information, other people, assuming
there's not a bunch of bots behind our website,
which is a whole different story. In this case,
I'm going, well, hey, I might have a 4.8 or
a 4.6 experience, but in general, it's probably
going to be something pretty much in that range.
Like, you know, I'm expecting a pretty solid
experience. A 4.7 is not bad at all. I wouldn't trade away the certainty that comes from a sample size that large just for the difference between those two ratings. Yeah, that makes a lot of sense. It kind of matters in things like polling as well. Sometimes it can feel like 1,000 people is a really small number to try to predict an election with, but that's actually a pretty good sample size for those kinds of questions. Whereas if I asked five people, I would not be putting any money on predicting an outcome at a societal level from something like that.
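For anyone who wants to see that uncertainty idea as arithmetic, here is a minimal sketch in Python. The ratings and review counts are the made-up ones from the episode; the spread of individual reviews (0.6 stars) is an assumed figure purely for illustration.

```python
import math

def approx_ci(mean, sd, n, z=1.96):
    """Rough 95% confidence interval for a mean rating."""
    se = sd / math.sqrt(n)        # the standard error shrinks with sqrt(n)
    return (mean - z * se, mean + z * se)

# Gentleman's Gelato: 4.9 stars from only 4 reviews -> a very wide interval
print(approx_ci(4.9, 0.6, 4))     # ~ (4.31, 5.49)

# Ihaka's Ice Cream: 4.7 stars from 3,141 reviews -> a very tight interval
print(approx_ci(4.7, 0.6, 3141))  # ~ (4.68, 4.72)
```

The first interval even spills past the 5-star maximum, a hint that the normal approximation is strained with only four reviews; the point is simply that more reviews pin down the likely experience much more tightly.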
So in that case, suppose we're looking at news articles or whatever
about a claim, what kind of sample size would
you advise people to be looking for on, yeah,
for any claim that's made. And this is one that
students ask me and I always disappoint them
on, so I'm going to disappoint you guys too,
is that while there's definitely rules of thumb
out there, there's no true perfect line here.
There's not some place where it moves from darkness into light; it has a lot to do with the context. So you might see people say 30, or the number 100, but a lot of it also
really matters on how you're cutting up your
data into groups. So if I'm trying to make a
comment about all New Zealanders and make a
pretty good prediction for an election outcome
or some opinion poll type thing, a thousand
is actually pretty good if I'm trying to make
a claim for let's say adult New Zealanders
and I ask adult New Zealanders. But then if
I try to take that same data and make a claim
about women in their 30s who live in Auckland,
suddenly that's a lot smaller of a group with
that original data. So even though I might be
reasonably trusting of the quality of an estimate
from that larger dataset, once I'm starting
to cut it up into smaller and smaller pieces,
that's losing some of that ability to be precise
about it for me. So it's not that the data
quality has in any way got worse, it just means
that now in the same way with the only four
review gelato place, I don't have a lot of confidence.
Basically, the range of numbers that I'd be willing to accept
would be true is gonna get really, really wide
because I just have a lot of uncertainty about
this number if I don't have a lot of data to
sort of drive my prediction. And so how you
cut the data up really matters. I'd still want
at least that sort of 30ish if I'm doing some
sort of simple group kind of discussion. But
then if I'm trying to make more complicated subgroups within my groups, that's definitely not gonna be enough. So the short answer is, you'll probably see somewhere in a textbook, or if you ask ChatGPT, it'll tell you, you need at least 30 observations. But it depends on your method, and it depends what you care about saying, and how.
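As a rough sketch of why 1,000 respondents can be plenty for a national question but not once you slice the data, here is the standard worst-case margin-of-error formula in Python; the subgroup size of 80 is a hypothetical number, not one from the episode.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Worst-case 95% margin of error for an estimated proportion."""
    return z * math.sqrt(p * (1 - p) / n)

print(margin_of_error(1000))  # ~0.031: about +/-3 points for a 1,000-person poll
print(margin_of_error(80))    # ~0.110: the same survey cut down to a small subgroup
```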
Yeah, that's really interesting. I've never done a stats course, especially not
at the university level. But I remember when
I was doing like science and I did like high
school psychology or something, learning about
sample size. I think the thing that surprised
me the most was you mentioned it earlier, that
a lot of studies only have access to college-age
students as their sample group. And so a lot
of these generalizations that are made
about society at large have a very limited
sample or at least range of people or observations,
as you said, because it's this one specific
group, usually in North America, I think. And
so that obviously has impacts for the rest
of the world as well. But yeah, it's, yeah,
just again, bog of disbelief, I think, it's
where I'm at. Yeah, I love that they were exposing
that to you because that also ties back into
the who's not here, who's not being measured.
And I'm forgetting what the acronym or the
initialism is, but they sometimes talk about
WASP and WEIRD. White, is it European or educated? I- White, educated, yeah. What's R? I'm forgetting
what R is. Yeah, I'll have to come back to
it. But yeah, it was Western, educated, industrialized.
I don't know what R is. And then D is developed,
but yeah. Yes, yeah. So basically what Alex
did a better job of describing because we both
forgot the initialism is that, yeah, you have
in your studies, the 30 to 40 people who saw
the poster on campus thought that they could
do with an extra, you know, $20 gift card for
half an hour of their time and went and did
the study or went and did the survey. And that
really limits our understanding of the world
around us and our understanding of the variety
and uncertainty. It's not all about getting
a good guess of a specific single number. Part
of the goal is also to understand the variability,
the variation between people is often a really
interesting and important question when we're
planning for a world that humans live in together.
I googled the acronym and it's Western educated,
industrialized, rich and democratic. So there
we go. There we go. R was for rich. I feel bad
now. Rich. That's the one I forgot. I'm disappointed
we all forgot to be rich today. Yeah, damn it.
It's that meme which I keep seeing these days.
I was in my mom's womb in 1995 as opposed to
buying a house for 25 pence or something. Yeah,
that's on you, man. A really unproductive financial fetus, you were. All right. Yeah. So what next,
Queen of Stats? I've got a story for you. This
one's kind of old, but I think it's a really
good one and that you'll see echoes of the kind
of issues that we as humans have with information.
You'll see echoes of this in probably what you
saw from people on social media at COVID. what
you might have seen from people around elections.
But it's one specific sort of example of how
we're not always the most reasonable with information.
I do just want to flag as a brief content warning
that I'm going to quickly mention abortion in
the context of this case study, but it's not
the main topic. But if that is an issue for
any of the listeners, maybe you can timestamp
when we're finished or something like that.
I'm putting that on future you. But yeah, it's
not a big part of the study. Alright, so this
one is old, I guess 1995, back when you were
not buying a house, silly, silly you. The story
takes place in the United Kingdom. And so there's
a United Kingdom Committee on Safety of Medicines, or at least there was at the time. And so if you have
people in your lives who take oral contraceptive
pills, you may be aware already that these,
like, for a long time now have an associated
risk of blood clots. So current ones as well,
this has kind of been a true feature of these.
But by improving the formulation of them or
tweaking different things, that's something
that's being worked on. But it is just one
of those things that's a risk with this particular
medication, as any medication can have side
effects. But there was this sort of, they called
it like a doctor's note, I guess, that came
out and was sort of meant for the public to be aware of. And they announced that
this third generation, this sort of third version of the pill, which was what was most widely being prescribed and taken by people in the UK,
actually had double the risk of blood clots
compared to what that previous version had.
And they were only sort of seeing that now
as they had more data on adverse events from
this, but like that's a hundred percent increase.
And so this came out and this was publicized
quite a lot in the newspapers at the time.
So I've got a couple of snippets on screen.
Oh, I'm sorry, my alt text has kind of shown up for people who can see it.
But basically I've got a clipping from the
Guardian newspaper and the title is Blood Clot
Alert on the Pill and that women were being
warned about seven brands of contraceptive that
were under that third generation and this doubled
risk of blood clots. And as you might imagine, this caused some concern. People who, you know, had been taking this, might take this every day of their life, were now hearing about this very scary feature of the medication that they were on, and it was probably quite important to them. And so there were like health clinic
lines that were full of callers, like there
was quite a lot of panic is basically what I'm
trying to get across at this point in the story.
And so as I said at the beginning, we already
sort of knew that oral contraceptives could
cause blood clots and that was a risk, and if you had a pre-existing condition that made you more likely
to have them, it wasn't an option for you. And
the other thing here was that part of that
same message that I told you about at the beginning
where it says double the risk, 100% increase,
it also in the same message said, for the vast
majority of women, the pill is a safe and highly
effective form of contraception. No one needs to stop taking the pill before obtaining medical
advice. So Alex, what do you think people did?
Do you think they calmly waited for medical
advice? Or do you think they did something else?
I'm going to wager that they did something
else. You're a betting man, you're safe for
this one. Um, the, uh, use of this pill dropped like 80%. People just stopped taking it. They were freaked out. They just didn't want to deal with the risk. They just stopped taking
it. There was like a run on the pharmacies.
People were trying to get the second generation
one, which, you know, was the previous one
that this new one has doubled the risk of, so
let's go back to the safer one. And none of the chemists had anything left in them. Like it
was a real little bit of a public panic kind
of moment. And now one really important thing
to consider here is it wasn't just that people
stopped taking the pill and that then meant
there were fewer blood clots. Like sure, if
it was just that, fine. But there's kind of
a reason that people take this medication in
the first place. And there's lots of different
reasons why different people might be taking
it, but for quite a lot of people, it is for
the purposes of contraception. So for people
listening to the audio alone, I've got a graph
on the screen and the title is that there was...
13,500 more abortions in the period after, or
it's sort of the appropriate period after October
1995, and that sort of flow-on effect. And so
the graph on my screen shows that in general,
the rate of abortions had been, in fact, not
even just the rate, the absolute number of
abortions in the UK had been decreasing, even
as the population was probably increasing at
the same time. So you were on trend where people
had access to contraception that worked for
them, arguably. and you're seeing fewer and
fewer abortions. You saw them in this period
after 1995 jump right back up to what they had
been five years previously before we were doing
a good job of decreasing. And then, sorry, excuse
me. And then proceeded to stay high for a few
years afterwards as the information about the
fear remained. But that first part of the message, from the same time that it was announced, where it's like, no one needs
to stop taking this. It's fine. Talk to your
doctor at your next checkup. That part didn't
really manage to be part of this persistent
messaging. And I think one fact that was particularly
shocking to me, I guess, from this, or that
really gave me sort of emotional connection
to the story, was that there were 800 additional conceptions among girls under 16. So folks who might've been relying on this in some way, who might not have a lot of access to other options, and
arguably may not have been intending pregnancy
at that time. And I found that quite a surprising
number, that this statistic and its influence
on our emotions was a really powerful social
disruptor for a little while there. This also
just has dollar signs attached, so if you're
like, I don't care, I don't need to take the
pill, four to six million pounds more were
spent by the National Health Service in that
time, which is a good chunk of change. So to me, clearly something went wrong here.
This wasn't the appropriate response from the
public, although it probably could have been
anticipated. I'm curious if I sort of stop
at this point, what questions do you have? Like,
what would you have wanted to know if you were
receiving this in the media or if you had a
person in your life who was trying to make
a decision? Do you feel like you had all the
information you needed to make a choice here?
I think I'm interested in double the risk, right?
Because double the risk sounds really alarming,
and I think you can kind of understand the response
from that perspective, because that's a scary
phrase. But I'm interested in the initial risk
from the second generation before it gets doubled.
Because if it's like the risk is like 0.01,
and it doubles to 0.02, is that correct for
doubling? Is that significant? I don't know.
I love that because yes, if only Alex, folks
that had you to ask at that time, because that
is exactly the question you want folks asking
themselves. And there's a whole other area of
study that's outside of mind about like our
risk tolerance and like when we think of different
risks and things like that. But fundamentally,
if you double a really small number, you still
have a really small number. I'll show you sort of briefly what this kind of looks like, I guess, as a
picture. But basically, you've got two types
of risk that you might have reported to you
in the media. And this is a pretty common statistic.
Maybe odds are more common if you're really
into horse betting, but risk, like absolute
risk is just a straight probability. So your
risk of getting hit by lightning is pretty
low. Unless you're a New Zealand MP, it appears, in which case we've had like three of them hit by lightning. I do wonder. But absolute risk is
just the chance of something happening. And
so for the oral contraceptive pill, you could
calculate that for yourself and what the people,
the researchers had done, was they looked at
the total number of people who were taking
that pill, and then they found out how many
people had a blood clot. And that's just one
number divided by another number. So I have
on screen some terrible animations that I still
love because I'm the one who drew them. And
so let's say we have our 10 people who are
taking an oral contraceptive pill. If three of
them get blood clots, that's three out of 10
people, that's 30%, that's awful. That's a pretty
high level of blood clots that I probably wouldn't,
you know, I too would be jumping ship pretty
quickly on a medication like that. But that's
not what we really meant here, because what
does it mean to double the risk? This is something
we call a relative risk, because it's one number
being compared to another. It's not that baseline
or that initial risk that you were talking about,
Alex. And you might hear this as like twice
the risk or a two-fold increase or any of this
kind of language of doubling and increments.
This is this relative risk idea. And so exactly
as you were trying to think through for yourself,
you're like, okay, 0.01, if I double that, that's
only 0.02. You can see that if you had an absolute
risk of 30% and that doubled, that would be
60%. Awful. Jump ship. Absolutely. If you had
a 3% risk, that's a 6% risk. I don't feel
as invested in that difference in this case,
personally at least, right? And that goes back
to not just statistics, but your personal risk
tolerance. But now if you're going from 0.3
to 0.6, I'm struggling to muster much of an
emotional response at all to that difference.
Like that doesn't feel like something that's
gonna overly motivate me to change, sorry, to
change what I'm doing. I don't actually have
the number directly in front of me, but the
risk, from the second generation oral contraceptives
was much smaller than even what I have on screen.
It was something like 0.1 in 1000 or even smaller
than that, I think. I'm sorry, I don't have
it to hand. I can send it to you later if you
wanna put it somewhere. And so in the end, the
doubling of that risk, while it may have some
influences on a population level across millions
and millions of people, personally, the risk
of not taking a medication like this far outweighed
or the risks associated with not taking the
medication would far outweigh, and I think on
a social level, the potentially negative consequences
also far outweighed what the blood clots could
have caused on both a societal level and on
an individual level. And so exactly as you said,
Alex, like freaking scary to have this headline
about blood clots. Awful, awful, no thank you.
But then if we don't know how to ask ourselves
these questions, we can very easily trap ourselves in situations where we're just making decisions
based on fear of things we don't need to be
afraid of. There's enough things I'm afraid
of on a day-to-day basis anyway. I don't need
to have fake new ones, you know? Yeah. Wow.
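To make the absolute-versus-relative distinction concrete, here is a small Python sketch of the arithmetic Liza walks through. The 30%, 3%, and 1-in-10,000 baselines are illustrative only; the actual clinical figures aren't given on air.

```python
def absolute_change(baseline, relative_risk):
    """Translate a relative risk into the new absolute risk and the excess."""
    new = baseline * relative_risk
    return new, new - baseline

# 'Double the risk' (relative risk 2.0) applied to illustrative baselines.
for baseline in [0.30, 0.03, 0.0001]:
    new, excess = absolute_change(baseline, 2.0)
    print(f"{baseline:.2%} -> {new:.2%} "
          f"(+{excess * 10_000:.1f} extra cases per 10,000 people)")
# 30.00% -> 60.00% (+3000.0 per 10,000): jump ship
# 3.00% -> 6.00% (+300.0 per 10,000): concerning
# 0.01% -> 0.02% (+1.0 per 10,000): hard to be scared of
```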
All right. So I have a question for you, Alex.
Do you eat bacon personally or do you have
a favorite food? I have bacon occasionally.
I don't know that I could come up with one
single favorite food. Let's take bacon, because
I have had a look at these numbers once upon
a time. But a more recent example is, I don't
know if you ever saw it, and by recent I think
it's probably the last five years, is that bacon
was listed as a carcinogen, a cancer-causing
substance, along with, and you perhaps may be
sorry to hear this in Germany, land of the
lovely sausages, that any of those kind of cured
meats or processed meats in some way increased
people's risk specifically of... bowel cancer.
And I believe, and I'm sorry, I don't have
this number in front of me, but it was something
like a 14% increase in risk. And when I started
looking at that, I was like, the people I know
who really like bacon, that seems like not
that bad an increase or anything like that.
And then when I was taking a look at the numbers,
especially in New Zealand, and especially I
believe it was among older men, and there's
also variation by ethnicity, the baseline prevalence
or sort of if you were eating fairly healthy,
just your general risk of getting bowel cancer
is actually kind of high for some of our populations.
And so while a 14% increase is nothing as scary
sounding as a 100% increase, when I was looking
at those numbers, and if I'm going to report my own bias, I am a vegetarian, so perhaps I don't
have as much skin in this game. But looking
at that increase, to me, I was like, oh, that
was actually enough to like, tell my dad that
I don't think you should use much bacon because...
even though it was a smaller relative increase, it was still off a kind of high baseline prevalence: people just get quite a lot of bowel cancer in New Zealand. It's
a real medical problem for us. That actually
had me kind of worried. So I think it's a total
overreaction to stop your pill for a hundred
percent increase, but I don't think it's that
bad an overreaction to limit your bacon intake
for a 14% increase. So that relative and that
absolute, it really matters that you have the full picture there.
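The same arithmetic applies to the bacon comparison: a big relative increase on a tiny baseline can mean fewer extra cases than a small relative increase on a high one. The baselines below are hypothetical, since the actual New Zealand bowel cancer figure isn't given in the episode.

```python
def excess_per_100k(baseline, relative_increase):
    """Extra cases per 100,000 people from a relative increase in risk."""
    return baseline * relative_increase * 100_000

# Hypothetical baselines for illustration only.
print(excess_per_100k(0.0001, 1.00))  # a 100% increase on 1-in-10,000: 10 extra cases
print(excess_per_100k(0.05, 0.14))    # a 14% increase on 5-in-100: 700 extra cases
```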
Well, I was gonna ask about another example just to elaborate on that
point. This is a news example so maybe you
might not have the exact things or I don't have
the exact numbers either but I remember during
COVID there was a lot of justification by certain
governments being like, oh, it's only 3% of the population that is affected by this or is dying, or something like that, right? Like they were saying it's only 2% or 3% of the population that will probably die from this or whatever. But then you kind of extrapolate it
to the pure numbers and you think, so you're
talking about, what's it? If you have a hundred
million people in a country, 3%, you're talking
about 3 million people dying here. You're okay
with this. All of a sudden these small percentages
actually mean quite large numbers. I think that's
such an important human example. If the question
is, okay, I am sort of discounting small percentages
in some cases. But in the COVID example, you're
like, oh, it's only 3% of the population. But
then when you're starting to think of, you
know, thousands or hundreds of thousands of
people or millions of people globally, I think
it gets to this really important interface between
humans and numbers. Because a small change,
like a change between 0.3 and 0.6 that I was
talking about earlier as not being particularly
meaningful to me, in some contexts, that would
actually be super important. And... It's one
of those things where for some stories, it would
be a rounding error. But for other contexts,
that could be the difference between a business
going bankrupt or, you know, growing and profiting.
It could be the difference between a global
pandemic and, oh, thank goodness we had that
contained and nothing happened. And I think
just as you sort of questioned earlier about
sample sizes, like what's the threshold? What's
the threshold? All of this is so context dependent
that you need to know. And one thing I really
cherish as a statistician is working with subject
matter experts in the applied disciplines that I'm working in. Right now, I have been talking to
this wonderful woman doing research on tuna
and how people interact with and use tuna in
Kiribati. And I know nothing about any of those
things, but I know a little bit about survey
design. And so having conversations with a subject
area expert helps you tease out what actually
matters to measure and how. And that's the same
with interpreting these numbers. And I guess
the only guide I'd have for folks there is the
same sort of media literacy you'd apply to
influencers on social media, to political claims,
even if they don't involve numbers. The key
question to ask yourself as a person who has
to navigate this information saturated world
is: who is the person who's telling me this? What do they want me to think? And why might it be important to look farther than that? And
numbers are just one of those tools that we
have to influence people. And as Alex was saying
earlier, for those of us who might fall into
the bog of sort of overconfidence, where we
think a statistic can be used and wielded basically
as a cudgel against the non-believers, we want
to be really careful of like, why are we using
that number that way? Are we trying to present
it as the small percentage? Because we know
our reader or our listener would be horrified
by the actual numbers. No easy answer, we just
got to use our brains, as smooth as they may
be. That was nice. That was very well done.
That was very eloquent. Well, I was so thrilled
with your response, Alex. What question would
you ask? You nailed it in one. That is the question
you want to be asking yourself when you're
presented with those things. Thank you. Yeah,
no. It just feels like, I don't know, I guess
I am quite suspicious of data. I think it is
because in media studies and stuff, especially
when you're studying things like news, this
sort of thing comes up when you're looking
at how different events, let's say climate change
is portrayed by people on either side of the
issue. People who strongly believe in climate
change and are urging people to how they use
data versus, I don't know, should we say skeptics
would use other data to try and say, well,
look at this. It's periods of... heating and cooling, or whatever they're saying, right?
Totally. Yeah, numbers are a weapon that everyone
seems to use. Yeah. Heck yeah. And you want
your weapon sharp and your wits sharper when
it comes to numbers, I think. Yeah. I love
how you describe that. I think from a more science
perspective, I think one of the biggest things which changed between what we were taught in undergrad versus our approach to looking at research in masters and PhD was that we went from reading all these articles and believing that this is the truth, that oh, they found this association, oh, they found this... we went from that in
undergrad to all of a sudden being like, okay,
so how many sample... what was their sample
size? Like, how did they run this experiment?
Or if we're doing a bio experiment, what chemicals
did they use because that could affect things?
There were so many other things we started looking
at and started critiquing, which we didn't do before.
What stats, what actual statistics did they
use? Like, what analysis did they use? We were also asking, are they hiding any data in the graphs which they presented? You kind of change completely from what you're taught in undergrad.
And I feel, I think all of us would have been
a lot better off if they had taught us those
things beforehand, because then all of a sudden
we wouldn't be so scared when we read these
massive eye-catching, like... stories and like
all these weird graphs which show up without
a correct X and Y axis. Yeah, I feel like there's
a real missed opportunity for so long in terms
of that helping people learn how to understand
statistics and how they work. Yeah, that's so
well said. Now Alex, I did a little bit of
sneaky looking up of you before I actually came
on this call. I swear this is related, Sahir, to what you just said. Would I be correct in thinking you are interested in TTRPGs, tabletop role-playing games?
Oh my goodness, you're 100% correct. Yeah,
yeah, all right, well, I was very happy to see
that because I didn't have to plant some knowledge
in y'all beforehand. So Sahir, when you have done statistics, or read statistics specifically, whether it's reading other people's work or doing your own work, have you come across something called a p-value?
Yeah. Have you learned to kind of hate and
fear them in the same way that I have? I mostly
hate them, sometimes fear them, but yeah, I
tell all my students that it's not scary, it's
easy, and trust me, but at the same time, inside you know that it's a bit more than that.
Have you ever heard the quote about democracy
that democracy is the worst system except for
all the others? Yeah. That is basically how
I feel about the p-value. Although if you have
any Bayesians in your audience, which is like
kind of the version of statisticians that like
wear leather jackets, and they'd smoke, if smoking wasn't bad for you and was cool. Like, think sort of Grease, 'Greased Lightning' greasers.
Like they're the cool kids and I'm not a Bayesian.
I'm just a boring old normie frequentist. But
anyway, the p-value. It is the worst metric
except for all the others. It's the best one
we've got. But the reason, Alex, I wanted to
ask you about tabletop games, specifically ones
using dice, is as someone who's interested
in this gaming, could you tell me what the chances are of getting a natural 20? Rolling a 20 on a 20-sided die. You don't have to calculate it.
I know, I think it's 5%. You got a one in 20
chance, obviously. I think that's 5%, right?
Yeah, absolutely. Right. One in 20 is exactly
right. Assuming you brought a fair dice to
the table, which I'm sure we all would do here.
You're exactly right. It's a one in 20 chance.
And so that, I think, tends to be special. For anyone who's listening, going, what are they talking about? Getting this 20 is like a pretty special experience in gaming, because usually it means something good, and it's rare enough that it's special if it happens, but it's not so rare that it feels impossible. Like, you'd still be like, maybe I'll get one, even if you're trying something nearly impossible. And that kind
of vibe is basically the vibe that underpins
most statistical research. We use this 5% threshold
to say, if this was true, so we make some statement
like, okay, if in this research, there's really
no difference between the drug I'm testing
and a sugar pill. Sorry about the window again.
If there's really no difference, how likely
is it that I would see the kind of result I
got in my study? And basically Alex, what we
do in statistics, in applied statistics, is
a lot of the time if we say if it's less likely
than getting a nat 20 that I would see this
result if nothing was going on, I'm going to
claim something is going on. But the thing about
this is that it is still uncertainty. It is
still a roll of a dice. I could see a really big difference between two values just by chance, or I could fail to see a really important difference because I don't have enough data, or I just got unlucky with the rolls of the dice the universe was giving me.
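For anyone who wants to see the nat-20 analogy run, here is a tiny Python simulation; the 0.03 p-value at the end is a made-up result purely to show the decision rule, not a figure from the episode.

```python
import random

random.seed(1)

# A natural 20 on a fair d20 comes up 1 time in 20: the same 5% that
# applied statistics uses as its conventional significance threshold.
rolls = [random.randint(1, 20) for _ in range(100_000)]
print(sum(r == 20 for r in rolls) / len(rolls))  # ~0.05

# The p-value logic in miniature: if a result would be rarer than a
# nat 20 under 'nothing is going on', we claim something is going on.
p_value = 0.03  # hypothetical study result
print("claim an effect" if p_value < 0.05 else "stay agnostic")
```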
So we use it, and a lot of folks, especially those who come through,
I think psychology is probably very much persecuted
by the p-value. I think we sort of raise our
young researchers to have this 0.05 threshold.
And I think at the back of their heads, it must
be in some stone scroll or like stone tablet
upon a hill. Surely there is some reason why
we devote our lives to the worship of the 0.05
threshold. And I've heard two stories about
why this is the case. One is that back in the
day when statistics books had to have tables
in the back of them, so you could, like, look up different numbers, when we couldn't just put it in a calculator. The version I first heard was whoever was making it did a couple of columns, got tired, and figured that was a good place to stop, because yeah, it's pretty good. I may be absolutely
spreading misinformation, but it's joyful misinformation,
because I have not found anyone online who
corroborates that. The other part of the story
I've heard is there was a little bit of a copyright
war between two statisticians in the 70s who
weren't really getting along. Statistics tea
that's like 50 years old for you, so piping
hot. And basically the one guy was like, I
can't just copy the table the other guy did
because he'll sue me because he's like that.
So I'll just pick a couple of columns and make
my own version and it will be different enough.
And I think the underlying effort was to try to make it a little bit easier for non-statisticians or new statisticians. But in this effort to simplify, as is the case in, I think, all fields of research in life, some of that nuance is lost. And so now we worship at the altar
of the nat 20, effectively, Alex, in a lot of our research. And I think, as Sahir was saying earlier, we sort of get our
students reading these papers or reading this
research and kind of thinking about it as like
these researchers have done this perfectly and
these findings are perfect. And this is truth.
This is bog of overconfidence. And then the
farther in you go, you're like, we're doing
it on vibes, man. One in 20 seems kind of cool.
Let's make our whole discipline do that, right?
So I think it's so important to kind of dismantle
some of this, like, the ivory tower that statistics
and academia is in to try to give people not
so that they doubt us, right, not so that they
think, oh, there's no point listening to scientists
or listening to researchers. But so they kind
of realize it's a bunch of folks doing their
best to give you the best information they
have to hand. And even when they do their best,
they will be wrong. This is why we collaborate.
This is why we iterate. And this is why we don't
do something only once. You missed out. You could have said, this is why we replicate, in order to rhyme really well. Take it from the top, boys. Yeah. I'll save that one for my statistics Dr. Seuss book, you know, publication imminent for
sure. Sounds good. Great. So you've got another
slide up. I think because we've got about 10,
15 minutes left, maybe we can go through one
last example on what you were talking about.
Yeah. And yeah. Does the music one sound like
a good one to end on? Let's do it. All right.
So Alex, what kind of music do you listen to
if you don't mind me asking you? I'm mostly,
oh gosh, I mostly listen to indie rock. Indie
rock? Yeah. And lots of British rock and stuff
like that. I love it, I love it. Any bands you want to drop for the discerning among us to go frantically
check out later to act like we're cool? Well,
I would say my all-time favorite band would
be Wolf Alice, who are definitely in that rock
genre. Yeah, and very indie, I think. Gosh,
I can't think of anything really bigger right
now. I'm listening to a lot of Hozier at the
moment. Oh, he's a man of good taste for sure. How old is Wolf Alice, for that matter? Is it
a band? Is it a single person?
They've been making music for a decade now and
I think they're in their mid 30s, maybe early
to mid 30s, I think. I'm not sure. Yeah, perfect.
And that was actually going to be the exact
next thing I asked you, so thanks for setting
that one up for me. Okay. Yeah, so Indie Rock
certainly isn't a completely new genre, right?
Because even just your example, they've been
around for 10 years. But I don't think my Nana, who's turning, I believe, 88 next month, was listening to a lot of indie rock back in her sort of teens, 20s, or anything like
that. Does that seem a little fair? She could
have just been missing out, but I think it
was generally because there wasn't a lot of
indie rock back then. Yeah. Now, the reason
I've sort of wanted to have that little bit
of a conversation first is there is a chart
I have on the screen, and this was really sort
of doing the numbers on social media, actually
probably about 10 years ago, I think, at this
point, maybe a little less. But it was a really
interesting study that someone had done looking
at kind of like the survival of musicians.
Now, have you guys heard of the idea of like
the 27 Club, like Kurt Cobain and Amy Winehouse
tragically dying really young? And there's-
Jimi Hendrix as well? Oh, I think you might be right. Jimi Hendrix as well, 27? Yeah, so some basically iconic members of this pretty tragic club idea. And I think you probably also have the general sense that certain types of music are associated with harder living. There might be a drug culture, or it might be a lot more common to be shot, or something like that. Like it
could be dangerous to be a popular musician
in certain genres and certain places in the
world, right? We don't tend to think of jazz
as that dangerous in the modern era. I don't
know about you, but I haven't heard too many jazz musician rivalries ending with a shootout,
at least this year, you know, maybe last year.
So the graph that I have in front of me is the last thing I wanted to share with you, because I think it comes up a lot in things we see in our lives, but isn't necessarily something we even teach in an intro stats course, or do a good job of teaching people. And I put some of that blame on myself. So
on this chart, you have a range of different
genres from blues at the far left end, down
to like rap and hip hop on the far right. And there are a couple of things wrong with this graph, and there's a link that I can share if you want to go deeper into it.
But basically the sense of this graph is you
can see that U.S. female and male life expectancy
has sort of increased over time. But you can
see really big differences between musicians
who are like blues, jazz and country who are
kind of doing pretty well for the national
averages. And yet you're seeing this really
shocking sort of dip for more, what you might
associate with harder lifestyles these days,
perhaps, in punk, in metal, in rap, in hip
hop. And so the reason this chart was so popular
is, and we've just had this conversation, right?
27 Club, you can see in your head, how many
rap songs do I know that are about drugs? A
lot more than I know that are about jazz. Then
again, Frank Sinatra got up to some stuff with the Rat Pack. But anyway, sorry, you can see where my musical reference points are coming in. Take a look
at this graph, imagining it comes up on your
Facebook feed or at the time it was published,
your Twitter feed. It really sort of confirmed
things that people kind of already might have
had a vibe on. And so it got shared, and was really used to kind of, you know, beat up on metal, rap and hip hop for those terrible lifestyle choices. The thing that we sort of talked about,
Alex, though, with the genre that you like
with indie rock, is it's not that old a genre.
So some of the sort of leading lights in it
are probably not even that old, right? So the
band you were talking about are in their 30s.
You could argue until you're blue in the face about the origins of rap and hip hop, and they all
have deep historical origins. But as popular
music genres that are recognizable, they're
not very old. No, the 70s maybe. So for you
to be an artist in that genre, you haven't
had a chance to live to a ripe old age yet.
You cannot be a hundred and have started as
a rapper in your 20s at this point. That's just
not, unless you're doing some time travel,
that's probably not going to math. The math
is not mathing. You absolutely can be a hundred-year-old former jazz singer, because jazz has been around for a long time. And this is part of the family of survival or selection biases, but it's a specific subset. So the vocab I'm going to drop on you is this idea of right censoring. Censoring you might think of, like in Alex's world of media, as putting black redaction tape over text. And it is kind of that idea.
I can't see into the future. No matter how
good I get at statistics, I do not get that
power. And by my inability to see into the
future, it is censored to me. It is redacted.
It is a black line on the text of the future.
And so I can't see how long the people who are
still alive are gonna live. I can only see
the people who have currently died. And because
it's a young genre, if you've died, you've
died young at this time. So 400 years from now,
when the ethnomusicologists do this graph again
for these exact same sets of genres, and we've moved far past them into super techno solar
cyberpunk, we will probably see all of these
look pretty similar because they will just
be people who are making music and dying like
all people do. Is that an upbeat thing to finish
your podcast on? Yes! And music, and rock and
roll! I mean, kind of. It's good to know that
Snoop Dogg and Eminem and all will probably
live to this average age of most jazz musicians.
It may be reassuring. Yeah, we can have them around for a while longer. I'm so thankful. I guess with the graph as well, since it shows increasing life expectancy, as life expectancy goes up you might even see, a hundred years down the line, some rock and hip hop and all of these artists actually exceeding or going over it, right? Because, uh, yeah, the music they play is probably not the driver of their longevity.
Exactly. As you're saying, the context they live in, better medicine. We know you shouldn't smoke now. Like, all of that societal knowledge is probably going to be a bigger driver, for the majority of musicians, of what their long-term health looks like. Not just that they are like
cool and hip and slightly emo bands or whatever.
That's such a good insight. Cause yeah, that's
exactly right. Those ethnomusicologists 400
years from now might see something that looks
completely inverted from this where you go,
wow, what good health all our hip hop stars
were in. How delightful. Wow.
I hadn't even realized, I was trying to make sense of this graph myself.
It's a tricky one. I don't love it. What could be the thing?
Yeah, no, that's great though. The youth of the genre means that you just – that line you said, you only have the
data of the people who have died and because the genre is young, they've died young.
So it's going to skew it completely. I just didn't even make that connection.
Wow. I think that was a great one to end on.
Your first thing is already like, oh yeah, this
agrees with my preconceptions. And I will be
fair to the original author, who does mention this, I believe, as a confounding feature. But you can't see that from the chart itself, because as we were talking about before, anytime you attempt
to simplify, you lose nuance. And that's actually
really a hard kind of thing to get across in
a graphic. You'd really have to change the
graphic here. And so I think whoever probably
put this together was doing the best with what
they had. But a really key part of the way to think about this data in order to understand it is completely gone if you don't
have the little questioning voice, the little
statistician who sits on your shoulder and asks
you these questions. Would a more, let's say,
quote unquote, fair, or I don't know if that's
the right word, graph be something that looked
at the relative age of these genres and then went back to something like blues and said, let's take, say, 30 years, and then looked at the deaths there, I don't know, I don't even know how you'd do that, but comparing the same span of time
would probably provide a more equal result,
right? That's a fantastic idea, because exactly,
if you were interested, if you changed your
research question a little bit, and were like,
which genre of music saw the worst premature
death rates? And you go death before 30 or death
before 35, pick something that seems kind of
fair, based on some of the genres you're picking.
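A hypothetical Python sketch of that reframed question, with invented records: compare the share of each genre's artists who died before 35, counting only people whose lives we could actually observe up to 35.

CUTOFF = 35

# Invented records: (genre, age at death or None if alive, current age or None if dead)
artists = [
    ("jazz",    74,   None), ("jazz",    31,   None), ("jazz",    None, 88),
    ("hip hop", 26,   None), ("hip hop", None, 52),   ("hip hop", None, 33),
]

def premature_death_rate(genre, cutoff=CUTOFF):
    events, at_risk = 0, 0
    for g, age_at_death, current_age in artists:
        if g != genre:
            continue
        if age_at_death is not None:    # a death we observed, early or late
            at_risk += 1
            events += age_at_death < cutoff
        elif current_age >= cutoff:     # alive and already past the cutoff: a non-event
            at_risk += 1
        # Alive but still under the cutoff: censored, so left out entirely
        # instead of being silently counted one way or the other.
    return events / at_risk

for genre in ("jazz", "hip hop"):
    print(genre, f"{premature_death_rate(genre):.0%}")

Leaving out the still-young living is the key move; the naive chart instead averages only the dead.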
And that's a fabulous question, Alex. And that
would probably give you a really different insight, where for old, old blues or jazz, you might be seeing people who were still suffering from
tuberculosis. People who are dying in World
War II when like, you know, there's all these
other features that exactly as Sahir was saying
before, you might actually see that there's
lower premature mortality in some of our modern
genres because these folks haven't gone to fight
in a war. They have access to treatment for
tuberculosis. Like a lot of the things that
are actually driving mortality here are not
your taste in music. Wow. That's such a good
question. That was really good. I think that's
a good place to end on. I think so. Yeah, absolutely.
That was great. Just like letting me run wild
on all my favorite topics. How cool is this?
Well, we can let you run wild for longer. Maybe we can have you back on next time with more examples. We haven't covered so many- Don't
tempt the girl. We haven't covered so many things.
I saw a slide there called P-hacking, which
is a whole different thing. We just touched
upon the P-value, but we didn't even get into
any of the scarier stuff of how things, data
can be manipulated. Oh yeah, there's a fantastic interactive that you can pop a link to, or yeah, if your listeners will let you have me back, believe me, I could tell you some more stories about what you can do if you want to do wrong. Awesome. Well, before you say goodbye
to the listeners, the last thing we will ask you is if you have any one piece of advice for them before you go, and that's how we would end. So yeah, thank you for this question. What
I hope folks might have got from this or might
be able to get from other sources is you don't
have to have been a math genius to be a good
statistical thinker. And I want for all of us,
if I was ruler of the world or had my magic
wand, good statistical thinking would be the
wish. If I was the world's fairy godmother,
let's make it a little more benign. I wave my
little wand and we are all more confident and
thoughtful statistical thinkers. On the other
side, if you are a math whiz, that doesn't
automatically make you a good statistical thinker
because of comfort with uncertainty. the ability
to see what's not there, and a sort of questioning
mindset. Might be part of your good math set,
but if you're just a timetable's whiz, that's
not the same thing. So I hope your listeners
are a little more enthusiastic. They have perhaps
the beginnings of a statistical smile after
this conversation, and maybe a little bit more
confidence that they're asking good questions
and that there are good ways to interrogate
the data that is being presented to them. Half the world goes to the polls this year, in 2024 when we're recording. And I think there's no more important time than the present to be more
confident in our ability to make good decisions
and to have data as one of the things in our
toolkit to make those good decisions. Awesome.
Beautiful. Well said. Very well said. Thanks
for letting me say it. Awesome. So thanks guys.
Thanks Alex. I hope you enjoyed that. I loved
that. It was really great. Super fun. Okay.
I hope you enjoyed Dr. Bolton stalking you online as well. Or... Yeah, well, that was a genuine surprise. I love your Substacks! Shout out to Alex's Substacks! Yeah, Alex has got
a few podcasts going as well. You should follow
all of them. Yeah. Thanks. It's actually really
interesting. I've spoken to quite a few people
about statistics and dice things as part of
those projects, which has been really fun. I
literally coded one up to make better decisions in Baldur's Gate 3, because I was trying to figure out whether using bless or having advantage was going to be better for one of my skill checks. So... That way is a slippery slope to further statistics, my friend. Oh, delightful. All right, awesome.
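For the curious, a minimal Python sketch of that kind of dice comparison, not Alex's actual tool: bless adds 1d4 to the d20 roll, advantage takes the higher of two d20s, and critical-roll rules are ignored for simplicity.

from fractions import Fraction

def p_bless(dc, modifier=0):
    # Chance that d20 + 1d4 + modifier meets the difficulty class, by full enumeration.
    wins = sum(1 for d20 in range(1, 21) for d4 in range(1, 5)
               if d20 + d4 + modifier >= dc)
    return Fraction(wins, 20 * 4)

def p_advantage(dc, modifier=0):
    # Chance that the higher of two d20s + modifier meets the difficulty class.
    wins = sum(1 for a in range(1, 21) for b in range(1, 21)
               if max(a, b) + modifier >= dc)
    return Fraction(wins, 20 * 20)

for dc in (10, 15, 20):
    print(f"DC {dc}: bless {float(p_bless(dc)):.3f}, advantage {float(p_advantage(dc)):.3f}")

Roughly, advantage wins on easy and middling checks while bless pulls ahead on the very hardest ones, exactly the kind of question a little enumeration settles faster than vibes.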
Thanks, guys. Thanks everyone for listening
and yeah, see you guys next time. Thank you.