Smooth Brain Society
In an attempt to change the way information is presented, we’ll be speaking to researchers, experts, and all round wrinkly brained individuals, making them simplify what they have to say and in turn, hopefully, improving our understanding of a broad range of topics rooted in psychology. Join us as we try to develop ourselves, one brain fold at a time.
Instagram: @thesmoothbrainsociety
TikTok: @thesmoothbrainsociety
Youtube: @thesmoothbrainsociety
Facebook: @thesmoothbrainsociety
Threads: @thesmoothbrainsociety
X/twitter: @smoothbrainsoc
https://linktr.ee/thesmoothbrainsociety
#37. Lies, Damned Lies and Statistics - Dr. Liza Bolton
The use of stats and throwing around numbers in conversation is incredibly common, yet statistics itself is poorly understood. Dr. Liza Bolton from @universityofauckland discusses the dark art that is statistics. Using examples, she takes us through some misconceptions and dispels the notion that numbers don't lie. We cover how to identify the best ice cream store, how not to be fooled when stats are mischaracterized in media and politics, why we worship the nat 20 dice roll, and whether the 27 club for musicians is a real thing.
Dr. Bolton's website: https://www.lizabolton.com/
Missing data in police warrant information: https://www.stuff.co.nz/pou-tiaki/300938784/data-shows-ongoing-racial-bias-in-police-warrantless-searches
Gang numbers: https://thespinoff.co.nz/politics/23-04-2024/have-gang-numbers-really-skyrocketed-in-recent-years
Support us and reach out!
https://smoothbrainsociety.com
https://www.patreon.com/SmoothBrainSociety
Instagram: @thesmoothbrainsociety
TikTok: @thesmoothbrainsociety
Twitter/X: @SmoothBrainSoc
Facebook: @thesmoothbrainsociety
Merch and all other links: Linktree
email: thesmoothbrainsociety@gmail.com
Statistics is a lifestyle, baby. Like, how do you think better in the world? You should be saving this for when the actual podcast starts too. That's such a good line. Saying these things. That's such a good line. Yeah, hey, anything you like, I can do it for the TikTok. Welcome everybody to the Smooth Brain Society. Today Alex is back with us, all the way from Germany now actually, after his move, and I'm in India right now, so it's international. And
we have Dr. Liza Bolton on from the University
of Auckland. Wait, is it Auckland University
or University of Auckland? Ooh, I think our brand guide would tell you University of Auckland, but that does not stop a single person calling it Auckland Uni. Yeah, no, cause, cause we
were technically Victoria University of Wellington
and they tried to rebrand us for God knows
how long and people still call it Vic. So, yeah.
But yeah. So Dr. Liza Bolton is a teaching fellow at Auckland University and describes herself as a stats enthusiast. We met at a
conference where she was talking about statistics
and what do you say, journalistic practices
and understanding stats. That was basically
the crux of her talk and she's been keen to
come on and kind of talk about some of the misconceptions
in statistics, some things to be wary of. I
think it's very important considering any news
article ever or anything, even vaguely political,
stats are thrown around, but they're not necessarily
always interpreted in the right way. So thank
you for coming on, Liza. Thank you so much for
having me and for also being flexible about
the timing when I was sick last week. So I will
hopefully not cough into your microphone too
much for today's session. And yeah, thank you
for that introduction. I think that's a really...
I'll just bring you everywhere with me to introduce
me actually, because that's a much better job
than I tend to do.
No, I quite like the way you introduce yourself
on your LinkedIn, on your website, all those
things. You make stats sound way more fun than
someone like me who does research thinks it
is.
I think I tend to have a lot more fun with statistics than a lot of people's experiences have been, which makes me feel like I'm almost a little bit of a stats evangelist, where I'm
like, everyone could be having this much fun
if only, if only. So I am, I'm definitely on
a mission to give you a bit of a statistical
smile by the end of this. I love that phrasing.
I'm just gonna, I'm gonna tell everybody, for
audio listeners mainly, that Liza came on with,
I think, a 60-slide presentation as well, with all different examples which she
wants to share with us. Of course we're not
going to go through all of them but that just
kind of shows the excitement in wanting to talk
about her work and we're happy to learn from
her. So I'll actually turn it over to you and
I'll let you decide where you want to get started
from. Where do you think we should go? Oh yeah,
no, I love that, very dangerous, but I'm very
excited to have control of the ship for a little
bit. So what I thought I would start with is
actually putting up one of my close to 60 slides,
you're right, because one of the things I really
like about your show is the sort of premise
of having someone like me who's really keen
to talk about my topic, having yourself of course
as the host, and having Alex who I've just
met for the first time, sort of here as, I guess, our guinea pig. That's my understanding, that we're here to experiment. I thought I would get to know Alex live a little bit by asking him which of these bogs he falls into. So a bog is like a swamp or a wetland, basically something you don't really want to stick your foot in; you want to stay on the boardwalk in between. And this is a concept that's sort of an old concept in stats education, which is part of my background, but I think it's a really helpful concept for all of us to kind of check in on. So I'll explain what these two things
are. And for folks who are only listening to
the audio, there is like a visual for this,
but it does not matter whatsoever. It is just
two ChatGPT/DALL-E generated pictures of bogs.
So on one side you have the bog of overconfidence.
And this is the person who kind of thinks that
statistics is basically math, right? Or maths, I should say, for our New Zealand and British audience members, but
that statistics is this kind of perfect science
and that it tells us the truth and only the
truth. And it is one way to just be right. You
win all your arguments, you're always correct.
Ta-da, that is statistics. That is the bog of
overconfidence. If you ever interacted with
numbers and statistics and thought of them as
these perfect truths, that might be the bog
that you're... you're teetering off the boardwalk
towards. On the other side though, there is
the bog of disbelief. And this is probably the
bog that me as a statistician, I am most liable
to fall into, Alex. So I'm curious which one
of these will be your bog. But the bog of disbelief
is kind of, oh, well, statistics is just magic.
If you are, you know, unscrupulous enough,
you can make the numbers say anything you want.
Lies, damned lies, and statistics. So yeah,
this bog of disbelief is there's nothing of
value. There's nothing I can trust here because
anyone can just make it say anything. It can
mean anything. Does that make sense? Do you
sort of feel those two different swamps? Which way does your swamp balance tip? Which swamp would you like to take home today? Um, wow. It's such
a great, such great options between these two
bogs. Um, I think I've definitely, I think when
I was younger, like back at like middle and
high school. totally the bog of overconfidence
where I was really into science and I thought
stats and science meant the same thing and or
were representative of each other. And it was
only when I became more, I don't know, I started
taking up more subjects in the humanities.
And for context, I did a film and media studies
degree where I would probably put myself in
the bog of disbelief because I feel like I've
become a lot more interested in bias or the
ways in which people present information to
get certain results. And I think that's, you
know, that tweaking that you were mentioning,
that sort of magic, it can seem like where
you get a set of data and two different people
with two different perspectives might come
up with completely separate, you know, interpretations.
So I definitely would, I'm going to take home
the bog of disbelief. I think that's me. That's
my swamp. All right. Ding, ding, ding. We have a winner. Lovely. Yeah. I really like the way
you described that. How about for you, Sahir?
Has one of those bogs called to you in your research experience? So it's actually really
funny. We were... I can give you an anecdote,
which is pretty cool. One of my colleagues,
lab mates, he was doing his master's project
and he was doing it on proteomics. So this
is like this entire thing of trying to measure
proteins in your blood and looking at their
composition and basically the idea that different
levels of different proteins are going to kind
of indicate towards different mental health disorders.
That was basically the idea but it involved
a lot of data and therefore a lot of stats and
then a lot of the readings can also get messed
up. So how do you clean all your data and so
on and so forth. I asked him, so what stats
analysis are you going to do? And he just said,
I'm basically going to sacrifice a goat over
the keyboard and see what shows up. And statistics is very, very close to that. And you kind of see it, the more you start working with stats. And I've worked with a lot of stats.
I've done like large data stuff. I've done like
molecular research and all these things and
the different types of stats you use and the
different formulations people come up with
it can of skin it can at sometimes feel like
a bit of a dark art it can That yeah, what's
going on? There's this thing called a this is
type of correction. I think called LSD correction
and My my supervisor loves it because he's like
it's LSD correction. Who doesn't love LSD?
Yeah, we have drugs, sex, and rock and roll.
That is the statistics way, as we say. I love
that. Oh, I love that. But yes, long story short, I'm also probably more in the bog of
disbelief now. Yeah, or more towards that side.
I know that you need stats and they can be
very useful and they are very useful, generally
speaking. But yeah. Yeah. No, it's interesting
because I'm definitely in there with you on
my darker days. But when I've done this kind
of question with my students, students who are
intentionally in a statistics classroom, a
lot more than I expected put themselves in the
bog of overconfidence. And this is something
we tend to know in stats education that the
folks who come to us on purpose, even though
there's a lot of different pathways to being
a wannabe statistician or realizing you need
some for your degree, whatever else it is, I
think... Because I'm someone who really cares
about teaching ethics and statistics and teaching
statisticians how to write and talk to non-statisticians.
And yet a lot of the folks who traditionally
come to us come with that mindset of statistics
as just the math that pays better. You know,
you get to maybe be a data scientist and they
have good marketing and there's a good job at
the end of it. And so actually trying to help
them see how much of human decision making and
human choice and our attitudes and biases are present in this work we do. That's
something that I really, really care about.
And, cause you were talking, Alex, a little
bit about, especially being exposed more to
the humanities. There's this really cool art
installation about statistics. And this is
one of the reasons why I really like statistics
because statistics art installation is probably
not something we all have a lot of experience
with. But being in this field is actually like
quite a creatively fulfilling kind of place
to be. And so I think there's a lot more to
the experience of being in a statistical field than what we usually represent to the outside world.
And so I don't actually know how to say this
name, but it's an art installation called the
Library of Missing Datasets. So in 2016, it
had a physical installation, but you can go
see some cool pictures of it online. And one
of the really important skills of a statistician,
but also of... all of us, I would argue, is
this idea of seeing what's not there as well.
So you can only sacrifice the goat over the
keyboard for the data you have, but how did
you choose to get that data? So, Sahir, in this
research, you set out an experiment, you just
decided certain measures to collect, but what
about the things you didn't think to measure?
Or what about the group of people who aren't
represented in your study? Because you only
have access to college students from a certain place? Sure, a lot of my day-to-day work is dealing
with what's in front of me in terms of my data,
but having an eye for what's not there, seeing
the unseen, and advocating to measure the unmeasured,
I think, can be a really important role. It's
the yin-yang, I guess, of statistics. You work
with the data in front of you, sure, you do
your best, but it's just as important you advocate
for what's not there, and realize what isn't there might mean you're missing something, if that makes sense.
So a little bit of statistical art for you.
I was not expecting that. That's really cool.
That's really interesting. A way of representing,
you know, to people, there is a bigger picture
when you're looking at statistics. That's really
interesting. I've just been thinking about the
bogs metaphor. It feels like people, let's say actors who use statistics in a political context, for example... From your description, it seems like there's a trust
that people view statistics with that overconfidence
that when someone is handed the statistic, it
will meaningfully change the things they believe
because it's a statistic and statistics hold
power. But I think it's quite interesting to
realize that statistics are an interpretation
of a data set that was constructed with purpose.
Right? Yeah, that's really, that's really cool.
That was so beautifully said, Alex. Like, so
beautifully said.
Here's an interesting example from just, like, New Zealand news today. I saw an article, or maybe it was last night, that popped up on The Spinoff, and it was about some very specific
statistical claims being made about gang membership
in New Zealand. Now, this is not my research
area or anything like that, but... as someone
who likes to think about numbers and how we
measure things. I was really curious to see
what else they'd say about it. And it turns
out that these, it basically was claiming these
really high proportional increases of like
six new gang members a day in New Zealand and
it's increased 41%. I don't have the number
in front of me, but I'll give you the link to
make sure you can pop it in the doobly-doo
if you want. But people were kind of going like,
oh, that sounds bad, but is that true? And...
If you have a look through the article, the
source of the data is New Zealand police. There's
sort of a list of, my understanding is gang
members that they are like aware of basically.
So a registry of some format. And the police
themselves were actually pretty good about
trying to communicate the limitations of that
data and that changes in how they do surveillance
or how they categorize these different engagements
could be having just as much of an effect on
those numbers at the moment as actual increases
in members. So you could have actually had
a decrease in gang membership, but if there
was better systems from the police to capture
information and add it to this registry, you
would see an increase. Right. So it's like
one of those exactly as you were saying, Alex,
this piece of political data that is being
used to hammer home that National is tough on crime and Labour wasn't. And yet, the data they're using... well, I'm sure there is some signal there.
Um, and doesn't mean it's not a matter for
concern or discussion, but I think making sure
that the people, as you were sort of talking
about, Alex, who are engaging with that data politically, realize that statistics is the science of uncertainty,
not the science of, I am absolutely right and
you should listen to me and exactly what I
say should change your mind. Like that's not,
that's not what we do here. Wow. That's interesting.
I haven't looked at that article. I should look
at that because that sounds really interesting
how they explore that limitation and how that's
been perceived by the media or political groups
as well in the general public. Yeah. Wow. I
have a friend who's a data journalist who's
worked, tried to work quite a lot with New Zealand
police data. And we had a really interesting,
so there's articles out, so I'm not saying anything
that hasn't been published yet, but they were
looking at missingness. So once again, this
idea of what's not there. And I believe it
was like... uh, search warrant data, if I'm
remembering, and this is the articles by Sapir
Mehran for Stuff. And it was looking at basically
interactions from police, I think in the context
of search warrants. And it was looking at ethnicity for this. Um, but when we were having this conversation,
we're going, well, how does this data capture
someone with more than one ethnicity? Are they
only choosing one? Is the police officer looking
at someone and picking what they think their
ethnicity is? Or is the person being stopped asked about it? And it was the kind
of stuff that was really quite hard to get a
handle on the quality of this data because
it's not collected as the main purpose of a
search warrant. It's collected as administrative
data alongside trying to do whatever else is
being done. And so trying to tease out like
one of the points of the story was that the
number of missing ethnicity codes seemed to
really increase. And you could say, well, is
that a sign that they're choosing not to record
it because they got a lot of flak over how high the proportions of Māori and Pacific people
were, or has it just become harder for police
to collect that data? Did someone make the
form back at the office worse and it's harder
to collect it? And like, there's so many logistical
and deeply human reasons why data quality might
not be good that has very little to do with
the data itself in some ways. So. Yeah, well,
what's not there is a pretty important feature
of the New Zealand political social landscape.
It's a tricky one. Yeah. Should we move from
what's not there to what is there? Because that's
the only thing we can actually really interpret
and use. Well, no, I guess we can fill things in with what's not there. But I see you pulled
up something called restaurant reviews, so you
know where you want to go with this. Yeah,
it's a short one for you. It's like a little
interlude. It's a little more fun than searching
people's homes and police data. So I have made
up two fake restaurants for the folks listening.
Because I'm a girl who loves a reference and
a pun, they're called Gentleman's Gelato and
Ihaka's Ice Cream after Ross Ihaka and Robert
Gentleman, the folks who were here at the University
of Auckland when they created R, which is a
statistical programming language. I am not
going to make anyone do any programming today,
but I did think I might like to honour these
folks in a very, very silly way. So Alex, I'm
putting you on the spot, my friend. You are
picking, we're all getting ice cream after this,
you know, you and Germany, me here in New Zealand,
Sahir in India. And it's going to be up to you
to pick where we're ordering delivery. Okay.
So I've got two options for you. You've got
Gentleman's Gelato, they have a 4.9 star rating
with four reviews, and you have Ihaka's Ice
Cream, slightly lower at 4.7 with 3,141 reviews.
Right. Ignoring the names and the signs because
these are made up. Well, all of it's made up.
But don't use the vibes, use the reviews and
the ratings. Tell me a little bit about where
you would order ice cream for us from. I think...
I feel like I... there's an idea, I guess the
idea of safety in numbers would come up to me,
right? Like looking at Gentleman's Gelato with
the four ratings but a higher overall rating,
I feel like I would say, well, I guess four
people potentially had an amazing experience
here or three people and then one didn't have
such a great experience, but that's great. But then more people have had a really awesome, or on average awesome, experience at Ihaka's Ice Cream. So I'd probably go with
the one with more people, even though it's a
slightly lower rating. I love that, really well talked through. Sahir, how about you?
See, you shouldn't ask me these questions because
I'll say something completely different. I would
be going to Gentleman's Gelato more because
I want to find out where they lost that 0.1
star. Like, what happened?
Or maybe it's because they're new and like we'll
try a new place out. Maybe that's why they
don't have many reviews. But on the face of
it, suppose instead of like going to an ice
cream shop, this was like a recipe for something
online. And if you're looking for recipes,
the two recipes are there. One has like thousands
of reviews, the other has three or four. Even
if it has like a slightly lower star rating,
I'd probably go with that one, because I feel more people would have tried the recipe out and vouched for it. Yeah, I love
how you both describe that. This is what I taught
my dad to do whenever he looks at Google reviews
for places. He always tells me when he's done
it. I'm very proud of him. Because yeah, exactly
as you had mentioned, there's certainly a whole
bunch of other human considerations here. Like,
oh, four reviews, they must be new. I should
give them a try. So I'm not saying... only
make your decisions with statistics, but I am
saying you can use statistics to support good
decisions. There used to be sort of a saying
in the field of like using data to drive decision
making, and I think it should be more like using
data to support or inform decision making,
because it's no good taking the human out of
it, but it can be really good putting the data
into our decision making as a support. So my
statistician-approved order for our ice cream, gentlemen, will indeed be Ihaka's Ice Cream here. But exactly as you were verbalizing
to me about, I'd be willing to take the slightly
lower rating for having more information here,
more reviews. And this actually comes back to
a pretty important idea in all statistics.
So anytime you have run like a statistical test
in your research, there's these ideas, both
of the magnitude of the effect, the size of
this effect, but also how our sample size, or the quantity of data we have available
to us, how that helps us think about uncertainty.
So the way you could think about it is, okay,
sure, you rock up to Gentleman's Gelato, you
might have that exceptional 4.9 experience,
but you could just as easily, if that's the
average experience, but there's not a lot going
on in there, sort of pull up and get the two-star
version of the restaurant's experience here.
That smaller sample size is going to make
it harder for you to sort of narrow in on your
uncertainty. Basically, you see a small amount
of data, you realize you have a lot of uncertainty
about what this experience is going to be like.
Even though it looks good, the signal looks
good, that 4.9, lovely. But the statistician in me would be like, nope, let someone else be adventurous and try that; I want a good ice cream experience. Whereas with the
Ihaka's ice cream, even though it's a lower
value, because I have so much more data, so
much more information, other people, assuming
there's not a bunch of bots behind our website,
which is a whole different story. In this case,
I'm going, well, hey, I might have a 4.8 or
a 4.6 experience, but in general, it's probably
going to be something pretty much in that range.
Like, you know, I'm expecting a pretty solid
experience. A 4.7 is not bad at all. I wouldn't trade away the certainty that comes from a sample size that large just for the difference between those two ratings. Yeah, that makes a lot of sense. It kind of matters in things like polling as well. Sometimes it can feel like 1,000 people is a really small number to try to predict an election with, but that's actually a pretty good sample size for those kinds of questions. Whereas if I asked five people, I would not be putting any money on predicting an outcome at a societal level from something like that.
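For anyone who wants to see that uncertainty idea as arithmetic, here is a minimal sketch in Python. The ratings and review counts are the made-up ones from the episode; the spread of individual reviews (0.6 stars) is an assumed figure purely for illustration.

```python
import math

def approx_ci(mean, sd, n, z=1.96):
    """Rough 95% confidence interval for a mean rating."""
    se = sd / math.sqrt(n)        # the standard error shrinks with sqrt(n)
    return (mean - z * se, mean + z * se)

# Gentleman's Gelato: 4.9 stars from only 4 reviews -> a very wide interval
print(approx_ci(4.9, 0.6, 4))     # ~ (4.31, 5.49)

# Ihaka's Ice Cream: 4.7 stars from 3,141 reviews -> a very tight interval
print(approx_ci(4.7, 0.6, 3141))  # ~ (4.68, 4.72)
```

The first interval even spills past the 5-star maximum, a hint that the normal approximation is strained with only four reviews; the point is simply that more reviews pin down the likely experience much more tightly.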
So in that case, suppose we're looking at news articles or whatever
about a claim, what kind of sample size would
you advise people to be looking for on, yeah,
for any claim that's made. And this is one that
students ask me and I always disappoint them
on, so I'm going to disappoint you guys too,
is that while there's definitely rules of thumb
out there, there's no true perfect line here.
There's not some place where it moves from darkness into light; it has a lot to do with the context. So you might see people say 30, or the number 100, but a lot of it also
really matters on how you're cutting up your
data into groups. So if I'm trying to make a
comment about all New Zealanders and make a
pretty good prediction for an election outcome
or some opinion poll type thing, a thousand
is actually pretty good if I'm trying to make
a claim for let's say adult New Zealanders
and I ask adult New Zealanders. But then if
I try to take that same data and make a claim
about women in their 30s who live in Auckland,
suddenly that's a lot smaller of a group with
that original data. So even though I might be
reasonably trusting of the quality of an estimate
from that larger dataset, once I'm starting
to cut it up into smaller and smaller pieces,
that's losing some of that ability to be precise
about it for me. So it's not that the data
quality has in any way got worse, it just means
that now in the same way with the only four
review gelato place, I don't have a lot of confidence.
Basically, the range of numbers that I'd be willing to accept
would be true is gonna get really, really wide
because I just have a lot of uncertainty about
this number if I don't have a lot of data to
sort of drive my prediction. And so how you
cut the data up really matters. I'd still want
at least that sort of 30ish if I'm doing some
sort of simple group kind of discussion. But
then if I'm trying to make more complicated subgroups within my groups, that's definitely not gonna be enough. So the short answer is, you'll probably see somewhere in a textbook, or if you ask ChatGPT, it'll tell you, you need at least 30 observations. But it depends on your method, and it depends what you care about saying, and how.
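As a rough sketch of why 1,000 respondents can be plenty for a national question but not once you slice the data, here is the standard worst-case margin-of-error formula in Python; the subgroup size of 80 is a hypothetical number, not one from the episode.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Worst-case 95% margin of error for an estimated proportion."""
    return z * math.sqrt(p * (1 - p) / n)

print(margin_of_error(1000))  # ~0.031: about +/-3 points for a 1,000-person poll
print(margin_of_error(80))    # ~0.110: the same survey cut down to a small subgroup
```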
Yeah, that's really interesting. I've never done a stats course, especially not
at the university level. But I remember when
I was doing like science and I did like high
school psychology or something, learning about
sample size. I think the thing that surprised
me the most was you mentioned it earlier, that
a lot of studies only have access to college-age
students as their sample group. And so a lot
of these generalizations that are made
about society at large have a very limited
sample or at least range of people or observations,
as you said, because it's this one specific
group, usually in North America, I think. And
so that obviously has impacts for the rest
of the world as well. But yeah, it's, yeah,
just again, bog of disbelief, I think, it's
where I'm at. Yeah, I love that they were exposing
that to you because that also ties back into
the who's not here, who's not being measured.
And I'm forgetting what the acronym or the
initialism is, but they sometimes talk about
WASP and WEIRD. White, is it European or educated? I- White, educated, yeah. What's R? I'm forgetting
what R is. Yeah, I'll have to come back to
it. But yeah, it was Western, educated, industrialized.
I don't know what R is. And then D is developed,
but yeah. Yes, yeah. So basically what Alex
did a better job of describing because we both
forgot the initialism is that, yeah, you have
in your studies, the 30 to 40 people who saw
the poster on campus thought that they could
do with an extra, you know, $20 gift card for
half an hour of their time and went and did
the study or went and did the survey. And that
really limits our understanding of the world
around us and our understanding of the variety
and uncertainty. It's not all about getting
a good guess of a specific single number. Part
of the goal is also to understand the variability,
the variation between people is often a really
interesting and important question when we're
planning for a world that humans live in together.
I googled the acronym and it's Western educated,
industrialized, rich and democratic. So there
we go. There we go. R was for rich. I feel bad
now. Rich. That's the one I forgot. I'm disappointed
we all forgot to be rich today. Yeah, damn it.
It's that meme which I keep seeing these days.
I was in my mom's womb in 1995 as opposed to
buying a house for 25 pence or something. Yeah,
that's on you, man. A really unproductive financial fetus, you were. All right. Yeah. So what next,
Queen of Stats? I've got a story for you. This
one's kind of old, but I think it's a really
good one and that you'll see echoes of the kind
of issues that we as humans have with information.
You'll see echoes of this in probably what you
saw from people on social media at COVID. what
you might have seen from people around elections.
But it's one specific sort of example of how
we're not always the most reasonable with information.
I do just want to flag as a brief content warning
that I'm going to quickly mention abortion in
the context of this case study, but it's not
the main topic. But if that is an issue for
any of the listeners, maybe you can timestamp
when we're finished or something like that.
I'm putting that on future you. But yeah, it's
not a big part of the study. Alright, so this
one is old, I guess 1995, back when you were
not buying a house, silly, silly you. The story
takes place in the United Kingdom. And so there's
a United Kingdom Committee on Safety of Medicines, or at least there was at the time. And so if you have
people in your lives who take oral contraceptive
pills, you may be aware already that these,
like, for a long time now have an associated
risk of blood clots. So current ones as well,
this has kind of been a true feature of these.
But by improving the formulation of them or
tweaking different things, that's something
that's being worked on. But it is just one
of those things that's a risk with this particular
medication, as any medication can have side
effects. But there was this sort of, they called
it like a doctor's note, I guess, that came
out and was sort of meant for the public to be aware of. And they announced that
this third generation, this sort of third version of the pill, which was what was most widely being prescribed and taken by people in the UK,
actually had double the risk of blood clots
compared to what that previous version had.
And they were only sort of seeing that now
as they had more data on adverse events from
this, but like that's a hundred percent increase.
And so this came out and this was publicized
quite a lot in the newspapers at the time.
So I've got a couple of snippets on screen.
Oh, I'm sorry, my alt text has kind of shown up for people who can see it.
But basically I've got a clipping from the
Guardian newspaper and the title is Blood Clot
Alert on the Pill and that women were being
warned about seven brands of contraceptive that
were under that third generation and this doubled
risk of blood clots. And as you might imagine, this caused some concern. People who, you know, had been taking this, might take this every day of their life, were now hearing about this very scary feature of the medication that they were on, and it was probably quite important to them. And so there were like health clinic
lines that were full of callers, like there
was quite a lot of panic is basically what I'm
trying to get across at this point in the story.
And so as I said at the beginning, we already
sort of knew that oral contraceptives could
cause blood clots and that was a risk, and if you had a pre-existing condition that made you more likely
to have them, it wasn't an option for you. And
the other thing here was that part of that
same message that I told you about at the beginning
where it says double the risk, 100% increase,
it also in the same message said, for the vast
majority of women, the pill is a safe and highly
effective form of contraception. No one needs to stop taking the pill before obtaining medical
advice. So Alex, what do you think people did?
Do you think they calmly waited for medical
advice? Or do you think they did something else?
I'm going to wager that they did something
else. You're a betting man, you're safe for
this one. Um, the, uh, use of this pill dropped like 80%. People just stopped taking it. They were freaked out. They just didn't want to deal with the risk. They just stopped taking
it. There was like a run on the pharmacies.
People were trying to get the second generation
one, which, you know, was the previous one
that this new one has doubled the risk of, so
let's go back to the safer one. And none of the chemists had anything left in them. Like it
was a real little bit of a public panic kind
of moment. And now one really important thing
to consider here is it wasn't just that people
stopped taking the pill and that then meant
there were fewer blood clots. Like sure, if
it was just that, fine. But there's kind of
a reason that people take this medication in
the first place. And there's lots of different
reasons why different people might be taking
it, but for quite a lot of people, it is for
the purposes of contraception. So for people
listening to the audio alone, I've got a graph
on the screen and the title is that there was...
13,500 more abortions in the period after, or
it's sort of the appropriate period after October
1995, and that sort of flow-on effect. And so
the graph on my screen shows that in general,
the rate of abortions had been, in fact, not
even just the rate, the absolute number of
abortions in the UK had been decreasing, even
as the population was probably increasing at
the same time. So you were on trend where people
had access to contraception that worked for
them, arguably. and you're seeing fewer and
fewer abortions. You saw them in this period
after 1995 jump right back up to what they had
been five years previously before we were doing
a good job of decreasing. And then, sorry, excuse
me. And then proceeded to stay high for a few
years afterwards as the information about the
fear remained. But that first part of the message, from the same time that it was announced, where it's like, no one needs
to stop taking this. It's fine. Talk to your
doctor at your next checkup. That part didn't
really manage to be part of this persistent
messaging. And I think one fact that was particularly
shocking to me, I guess, from this, or that
really gave me sort of emotional connection
to the story, was that there were 800 additional conceptions among girls under 16. So folks who might've been relying on this in some way, who might not have a lot of access to other options, and
arguably may not have been intending pregnancy
at that time. And I found that quite a surprising
number, that this statistic and its influence
on our emotions was a really powerful social
disruptor for a little while there. This also
just has dollar signs attached, so if you're
like, I don't care, I don't need to take the
pill, four to six million pounds more were
spent by the National Health Service in that
time, which is a good chunk of change. So to me, clearly something went wrong here.
This wasn't the appropriate response from the
public, although it probably could have been
anticipated. I'm curious if I sort of stop
at this point, what questions do you have? Like,
what would you have wanted to know if you were
receiving this in the media or if you had a
person in your life who was trying to make
a decision? Do you feel like you had all the
information you needed to make a choice here?
I think I'm interested in double the risk, right?
Because double the risk sounds really alarming,
and I think you can kind of understand the response
from that perspective, because that's a scary
phrase. But I'm interested in the initial risk
from the second generation before it gets doubled.
Because if it's like the risk is like 0.01,
and it doubles to 0.02, is that correct for
doubling? Is that significant? I don't know.
I love that because yes, if only Alex, folks
that had you to ask at that time, because that
is exactly the question you want folks asking
themselves. And there's a whole other area of
study that's outside of mind about like our
risk tolerance and like when we think of different
risks and things like that. But fundamentally,
if you double a really small number, you still
have a really small number. I'll show you sort of briefly what this kind of looks like, I guess, as a
picture. But basically, you've got two types
of risk that you might have reported to you
in the media. And this is a pretty common statistic.
Maybe odds are more common if you're really
into horse betting, but risk, like absolute
risk is just a straight probability. So your
risk of getting hit by lightning is pretty
low. Unless you're a New Zealand MP, it appears, in which case we've had like three of them hit by lightning. I do wonder. But absolute risk is
just the chance of something happening. And
so for the oral contraceptive pill, you could
calculate that for yourself and what the people,
the researchers had done, was they looked at
the total number of people who were taking
that pill, and then they found out how many
people had a blood clot. And that's just one
number divided by another number. So I have
on screen some terrible animations that I still
love because I'm the one who drew them. And
so let's say we have our 10 people who are
taking an oral contraceptive pill. If three of
them get blood clots, that's three out of 10
people, that's 30%, that's awful. That's a pretty
high level of blood clots that I probably wouldn't,
you know, I too would be jumping ship pretty
quickly on a medication like that. But that's
not what we really meant here, because what
does it mean to double the risk? This is something
we call a relative risk, because it's one number
being compared to another. It's not that baseline
or that initial risk that you were talking about,
Alex. And you might hear this as like twice
the risk or a two-fold increase or any of this
kind of language of doubling and increments.
This is this relative risk idea. And so exactly
as you were trying to think through for yourself,
you're like, okay, 0.01, if I double that, that's
only 0.02. You can see that if you had an absolute
risk of 30% and that doubled, that would be
60%. Awful. Jump ship. Absolutely. If you had
a 3% risk, that's a 6% risk. I don't feel
as invested in that difference in this case,
personally at least, right? And that goes back
to not just statistics, but your personal risk
tolerance. But now if you're going from 0.3
to 0.6, I'm struggling to muster much of an
emotional response at all to that difference.
Like that doesn't feel like something that's
gonna overly motivate me to change, sorry, to
change what I'm doing. I don't actually have
the number directly in front of me, but the
risk, from the second generation oral contraceptives
was much smaller than even what I have on screen.
It was something like 0.1 in 1000 or even smaller
than that, I think. I'm sorry, I don't have
it to hand. I can send it to you later if you
wanna put it somewhere. And so in the end, the
doubling of that risk, while it may have some
influences on a population level across millions
and millions of people, personally, the risk
of not taking a medication like this far outweighed
or the risks associated with not taking the
medication would far outweigh, and I think on
a social level, the potentially negative consequences
also far outweighed what the blood clots could
have caused on both a societal level and on
an individual level. And so exactly as you said,
Alex, like freaking scary to have this headline
about blood clots. Awful, awful, no thank you.
But then if we don't know how to ask ourselves
these questions, we can very easily trap ourselves in situations where we're just making decisions
based on fear of things we don't need to be
afraid of. There's enough things I'm afraid
of on a day-to-day basis anyway. I don't need
to have fake new ones, you know? Yeah. Wow.
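To make the absolute-versus-relative distinction concrete, here is a small Python sketch of the arithmetic Liza walks through. The 30%, 3%, and 1-in-10,000 baselines are illustrative only; the actual clinical figures aren't given on air.

```python
def absolute_change(baseline, relative_risk):
    """Translate a relative risk into the new absolute risk and the excess."""
    new = baseline * relative_risk
    return new, new - baseline

# 'Double the risk' (relative risk 2.0) applied to illustrative baselines.
for baseline in [0.30, 0.03, 0.0001]:
    new, excess = absolute_change(baseline, 2.0)
    print(f"{baseline:.2%} -> {new:.2%} "
          f"(+{excess * 10_000:.1f} extra cases per 10,000 people)")
# 30.00% -> 60.00% (+3000.0 per 10,000): jump ship
# 3.00% -> 6.00% (+300.0 per 10,000): concerning
# 0.01% -> 0.02% (+1.0 per 10,000): hard to be scared of
```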
All right. So I have a question for you, Alex.
Do you eat bacon personally or do you have
a favorite food? I have bacon occasionally.
I don't know that I could come up with one
single favorite food. Let's take bacon, because
I have had a look at these numbers once upon
a time. But a more recent example is, I don't
know if you ever saw it, and by recent I think
it's probably the last five years, is that bacon
was listed as a carcinogen, a cancer-causing
substance, along with, and you perhaps may be
sorry to hear this in Germany, land of the
lovely sausages, that any of those kind of cured
meats or processed meats in some way increased
people's risk specifically of... bowel cancer.
And I believe, and I'm sorry, I don't have
this number in front of me, but it was something
like a 14% increase in risk. And when I started
looking at that, I was like, the people I know
who really like bacon, that seems like not
that bad an increase or anything like that.
And then when I was taking a look at the numbers,
especially in New Zealand, and especially I
believe it was among older men, and there's
also variation by ethnicity, the baseline prevalence
or sort of if you were eating fairly healthy,
just your general risk of getting bowel cancer
is actually kind of high for some of our populations.
And so while a 14% increase is nothing as scary
sounding as a 100% increase, when I was looking
at those numbers, and if I'm going to report my own bias, I am a vegetarian, so perhaps I don't
have as much skin in this game. But looking
at that increase, to me, I was like, oh, that
was actually enough to like, tell my dad that
I don't think you should use much bacon because...
even though it was a smaller relative increase, it was still off a kind of high baseline prevalence: people just get quite a lot of bowel cancer in New Zealand. It's
a real medical problem for us. That actually
had me kind of worried. So I think it's a total
overreaction to stop your pill for a hundred
percent increase, but I don't think it's that
bad an overreaction to limit your bacon intake
for a 14% increase. So that relative and that
absolute, it really matters that you have the full picture there.
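The same arithmetic applies to the bacon comparison: a big relative increase on a tiny baseline can mean fewer extra cases than a small relative increase on a high one. The baselines below are hypothetical, since the actual New Zealand bowel cancer figure isn't given in the episode.

```python
def excess_per_100k(baseline, relative_increase):
    """Extra cases per 100,000 people from a relative increase in risk."""
    return baseline * relative_increase * 100_000

# Hypothetical baselines for illustration only.
print(excess_per_100k(0.0001, 1.00))  # a 100% increase on 1-in-10,000: 10 extra cases
print(excess_per_100k(0.05, 0.14))    # a 14% increase on 5-in-100: 700 extra cases
```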
Well, I was gonna ask about another example just to elaborate on that
point. This is a news example so maybe you
might not have the exact things or I don't have
the exact numbers either but I remember during
COVID there was a lot of justification by certain
governments being like, oh, it's only 3% of the population that is affected by this or is dying, or something like that, right? Like they were saying it's only 2% or 3% of the population that will probably die from this or whatever. But then you kind of extrapolate it
to the pure numbers and you think, so you're
talking about, what's it? If you have a hundred
million people in a country, 3%, you're talking
about 3 million people dying here. You're okay
with this. All of a sudden these small percentages
actually mean quite large numbers. I think that's
such an important human example. If the question
is, okay, I am sort of discounting small percentages
in some cases. But in the COVID example, you're
like, oh, it's only 3% of the population. But
then when you're starting to think of, you
know, thousands or hundreds of thousands of
people or millions of people globally, I think
it gets to this really important interface between
humans and numbers. Because a small change,
like a change between 0.3 and 0.6 that I was
talking about earlier as not being particularly
meaningful to me, in some contexts, that would
actually be super important. And... It's one
of those things where for some stories, it would
be a rounding error. But for other contexts,
that could be the difference between a business
going bankrupt or, you know, growing and profiting.
It could be the difference between a global
pandemic and, oh, thank goodness we had that
contained and nothing happened. And I think
just as you sort of questioned earlier about
sample sizes, like what's the threshold? What's
the threshold? All of this is so context dependent
that you need to know. And one thing I really
cherish as a statistician is working with subject
matter experts in the applied disciplines that I'm working in. Right now, I have been talking to
this wonderful woman doing research on tuna
and how people interact with and use tuna in
Kiribati. And I know nothing about any of those
things, but I know a little bit about survey
design. And so having conversations with a subject
area expert helps you tease out what actually
matters to measure and how. And that's the same
with interpreting these numbers. And I guess
the only guide I'd have for folks there is the
same sort of media literacy you'd apply to
influencers on social media, to political claims,
even if they don't involve numbers. The key
question to ask yourself as a person who has
to navigate this information saturated world
is: who is the person who's telling me this? What do they want me to think? And why might it be important to look farther than that? And
numbers are just one of those tools that we
have to influence people. And as Alex was saying
earlier, for those of us who might fall into
the bog of sort of overconfidence, where we
think a statistic can be used and wielded basically
as a cudgel against the non-believers, we want
to be really careful of like, why are we using
that number that way? Are we trying to present
it as the small percentage? Because we know
our reader or our listener would be horrified
by the actual numbers. No easy answer, we just
got to use our brains, as smooth as they may
be. That was nice. That was very well done.
That was very eloquent. Well, I was so thrilled
with your response, Alex. What question would
you ask? You nailed it in one. That is the question
you want to be asking yourself when you're
presented with those things. Thank you. Yeah,
no. It just feels like, I don't know, I guess
I am quite suspicious of data. I think it is
because in media studies and stuff, especially
when you're studying things like news, this
sort of thing comes up when you're looking
at how different events, let's say climate change
is portrayed by people on either side of the
issue. People who strongly believe in climate
change and are urging people to how they use
data versus, I don't know, should we say skeptics
would use other data to try and say, well,
look at this. It's periods of... heating and cooling, or whatever they're saying, right?
Totally. Yeah, numbers are a weapon that everyone
seems to use. Yeah. Heck yeah. And you want
your weapon sharp and your wits sharper when
it comes to numbers, I think. Yeah. I love
how you describe that. I think from a more science
perspective, I think one of the biggest things which changed between what we were taught in undergrad versus our approach to looking at research in masters and PhD was that we went from reading all these articles and believing that this is the truth, that oh, they found this association, oh, they found this... we went from that in
undergrad to all of a sudden being like, okay,
so how many sample... what was their sample
size? Like, how did they run this experiment?
Or if we're doing a bio experiment, what chemicals
did they use because that could affect things?
There were so many other things we started looking
at and started critiquing, which we didn't do before.
What stats, what actual statistics did they
use? Like, what analysis did they use? We were also asking, are they hiding any data in the graphs which they presented? You kind of change completely from what you're taught in undergrad.
And I feel, I think all of us would have been
a lot better off if they had taught us those
things beforehand, because then all of a sudden
we wouldn't be so scared when we read these
massive eye-catching, like... stories and like
all these weird graphs which show up without
a correct X and Y axis. Yeah, I feel like there's
a real missed opportunity for so long in terms
of that helping people learn how to understand
statistics and how they work. Yeah, that's so
well said. Now Alex, I did a little bit of
sneaky looking up of you before I actually came
on this call. I swear this is related, Sahir, to what you just said. Would I be correct in thinking you are interested in TTRPGs, tabletop role-playing games?
Oh my goodness, you're 100% correct. Yeah,
yeah, all right, well, I was very happy to see
that because I didn't have to plant some knowledge
in y'all beforehand. So Sahir, when you have done statistics, or read statistics specifically, whether it's reading other people's work or doing your own work, have you come across something called a p-value?
Yeah. Have you learned to kind of hate and
fear them in the same way that I have? I mostly
hate them, sometimes fear them, but yeah, I
tell all my students that it's not scary, it's
easy, and trust me, but at the same time, inside you know that it's a bit more than that.
Have you ever heard the quote about democracy
that democracy is the worst system except for
all the others? Yeah. That is basically how
I feel about the p-value. Although if you have
any Bayesians in your audience, which is like
kind of the version of statisticians that like
wear leather jackets, and they'd smoke, if smoking wasn't bad for you and was cool. Like, think sort of Grease, 'Greased Lightning' greasers.
Like they're the cool kids and I'm not a Bayesian.
I'm just a boring old normie frequentist. But
anyway, the p-value. It is the worst metric
except for all the others. It's the best one
we've got. But the reason, Alex, I wanted to
ask you about tabletop games, specifically ones
using dice, is as someone who's interested
in this gaming, could you tell me what the chances are of getting a natural 20? Rolling a 20 on a 20-sided die. You don't have to calculate it.
I know, I think it's 5%. You got a one in 20
chance, obviously. I think that's 5%, right?
Yeah, absolutely. Right. One in 20 is exactly
right. Assuming you brought a fair dice to
the table, which I'm sure we all would do here.
You're exactly right. It's a one in 20 chance.
And so that, I think, tends to be special. For anyone who's listening, going, what are they talking about? Getting this 20 is like a pretty special experience in gaming, because usually it means something good, and it's rare enough that it's special if it happens, but it's not so rare that it feels impossible. Like, you'd still be like, maybe I'll get one, even if you're trying something nearly impossible. And that kind
of vibe is basically the vibe that underpins
most statistical research. We use this 5% threshold
to say, if this was true, so we make some statement
like, okay, if in this research, there's really
no difference between the drug I'm testing
and a sugar pill. Sorry about the window again.
If there's really no difference, how likely
is it that I would see the kind of result I
got in my study? And basically Alex, what we
do in statistics, in applied statistics, is
a lot of the time if we say if it's less likely
than getting a nat 20 that I would see this
result if nothing was going on, I'm going to
claim something is going on. But the thing about
this is that it is still uncertainty. It is
still a roll of a dice. I could see a really big difference between two values just by chance, or I could fail to see a really important difference because I don't have enough data, or I just got unlucky with the rolls of the dice the universe was giving me.
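For anyone who wants to see the nat-20 analogy run, here is a tiny Python simulation; the 0.03 p-value at the end is a made-up result purely to show the decision rule, not a figure from the episode.

```python
import random

random.seed(1)

# A natural 20 on a fair d20 comes up 1 time in 20: the same 5% that
# applied statistics uses as its conventional significance threshold.
rolls = [random.randint(1, 20) for _ in range(100_000)]
print(sum(r == 20 for r in rolls) / len(rolls))  # ~0.05

# The p-value logic in miniature: if a result would be rarer than a
# nat 20 under 'nothing is going on', we claim something is going on.
p_value = 0.03  # hypothetical study result
print("claim an effect" if p_value < 0.05 else "stay agnostic")
```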
So we use it, and a lot of folks, especially those who come through,
I think psychology is probably very much persecuted
by the p-value. I think we sort of raise our
young researchers to have this 0.05 threshold.
And I think at the back of their heads, it must
be in some stone scroll or like stone tablet
upon a hill. Surely there is some reason why
we devote our lives to the worship of the 0.05
threshold. And I've heard two stories about
why this is the case. One is that back in the
day when statistics books had to have tables
in the back of them, so you could, like, look up different numbers, when we couldn't just put it in a calculator. The version I first heard was whoever was making it did a couple of columns, got tired, and figured that was a good place to stop, because yeah, it's pretty good. I may be absolutely
spreading misinformation, but it's joyful misinformation,
because I have not found anyone online who
corroborates that. The other part of the story
I've heard is there was a little bit of a copyright
war between two statisticians in the 70s who
weren't really getting along. Statistics tea
that's like 50 years old for you, so piping
hot. And basically the one guy was like, I
can't just copy the table the other guy did
because he'll sue me because he's like that.
So I'll just pick a couple of columns and make
my own version and it will be different enough.
And I think the underlying effort was to try to make it a little bit easier for non-statisticians or new statisticians. But in this effort to simplify, as is the case in, I think, all fields of research in life, some of that nuance is lost. And so now we worship at the altar
of the nat 20, effectively, Alex, in a lot of our research. And I think, as Sahir was saying earlier, we sort of get our
students reading these papers or reading this
research and kind of thinking about it as like
these researchers have done this perfectly and
these findings are perfect. And this is truth.
This is bog of overconfidence. And then the
farther in you go, you're like, we're doing
it on vibes, man. One in 20 seems kind of cool.
Let's make our whole discipline do that, right?
So I think it's so important to kind of dismantle
some of this, like, the ivory tower that statistics
and academia is in to try to give people not
so that they doubt us, right, not so that they
think, oh, there's no point listening to scientists
or listening to researchers. But so they kind
of realize it's a bunch of folks doing their
best to give you the best information they
have to hand. And even when they do their best,
they will be wrong. This is why we collaborate.
This is why we iterate. And this is why we don't
do something only once. You missed out. You could have said, this is why we replicate, in order to rhyme really well. Take it from the top, boys. Yeah. I'll save that one for my statistics Dr. Seuss book, you know, publication imminent for
sure. Sounds good. Great. So you've got another
slide up. I think because we've got about 10,
15 minutes left, maybe we can go through one
last example on what you were talking about.
Yeah. And yeah. Does the music one sound like
a good one to end on? Let's do it. All right.
So Alex, what kind of music do you listen to
if you don't mind me asking you? I'm mostly,
oh gosh, I mostly listen to indie rock. Indie
rock? Yeah. And lots of British rock and stuff
like that. I love it, I love it. Any bands you want to drop for the discerning among us to go frantically
check out later to act like we're cool? Well,
I would say my all-time favorite band would
be Wolf Alice, who are definitely in that rock
genre. Yeah, and very indie, I think. Gosh,
I can't think of anything really bigger right
now. I'm listening to a lot of Hozier at the
moment. Oh, he's a man of good taste for sure. How old is Wolf Alice, for that matter? Is it
a band? Is it a single person?
They've been making music for a decade now and
I think they're in their mid 30s, maybe early
to mid 30s, I think. I'm not sure. Yeah, perfect.
And that was actually going to be the exact
next thing I asked you, so thanks for setting
that one up for me. Okay. Yeah, so Indie Rock
certainly isn't a completely new genre, right?
Because even just your example, they've been
around for 10 years. But I don't think my Nana, who's turning, I believe, 88 next month, was listening to a lot of indie rock back in her sort of teens, 20s, or anything like
that. Does that seem a little fair? She could
have just been missing out, but I think it
was generally because there wasn't a lot of
indie rock back then. Yeah. Now, the reason
I've sort of wanted to have that little bit
of a conversation first is there is a chart
I have on the screen, and this was really sort
of doing the numbers on social media, actually
probably about 10 years ago, I think, at this
point, maybe a little less. But it was a really
interesting study that someone had done looking
at kind of like the survival of musicians.
Now, have you guys heard of the idea of like
the 27 Club, like Kurt Cobain and Amy Winehouse
tragically dying really young? And there's-
Jimi Hendrix as well? Oh, I think you might be right. Jimi Hendrix as well, 27? Yeah, so some basically iconic members of this pretty tragic club idea. And I think you probably also have the general sense that certain types of music are associated with harder living. There might be a drug culture, or it might be a lot more common to be shot, or something like that. Like it
could be dangerous to be a popular musician
in certain genres and certain places in the
world, right? We don't tend to think of jazz
as that dangerous in the modern era. I don't
know about you, but I haven't heard too many jazz musician rivalries ending with a shootout,
at least this year, you know, maybe last year.
So the graph that I have in front of me is the last thing I wanted to share with you, because I think it comes up a lot in things we see in our lives, but isn't necessarily something we even teach in an intro stats course, or do a good job of teaching people. And I put some of that blame on myself. So
on this chart, you have a range of different
genres from blues at the far left end, down
to like rap and hip hop on the far right. And there are a couple of things wrong with this graph, and there's a link that I can share if you want to go deeper into it.
But basically the sense of this graph is you
can see that U.S. female and male life expectancy
has sort of increased over time. But you can
see really big differences between musicians
who are like blues, jazz and country who are
kind of doing pretty well for the national
averages. And yet you're seeing this really
shocking sort of dip for more, what you might
associate with harder lifestyles these days,
perhaps, in punk, in metal, in rap, in hip
hop. And so the reason this chart was so popular
is, and we've just had this conversation, right?
27 Club, you can see in your head, how many
rap songs do I know that are about drugs? A
lot more than I know that are about jazz. Then
again, Frank Sinatra got up to some stuff with the Rat Pack. But anyway, sorry, you can see where my musical reference points are coming in. Take a look
at this graph, imagining it comes up on your
Facebook feed or at the time it was published,
your Twitter feed. It really sort of confirmed
things that people kind of already might have
had a vibe on. And so it got shared, and was really used to kind of, you know, beat up on metal, rap and hip hop for those terrible lifestyle choices. The thing that we sort of talked about,
Alex, though, with the genre that you like
with indie rock, is it's not that old a genre.
So some of the sort of leading lights in it
are probably not even that old, right? So the
band you were talking about are in their 30s.
You could argue until you're blue in the face about the origins of rap and hip hop, and they all
have deep historical origins. But as popular
music genres that are recognizable, they're
not very old. No, the 70s maybe. So for you
to be an artist in that genre, you haven't
had a chance to live to a ripe old age yet.
You cannot be a hundred and have started as
a rapper in your 20s at this point. That's just
not, unless you're doing some time travel,
that's probably not going to math. The math
is not mathing. You absolutely can be a hundred-year-old former jazz singer, because jazz has been around for a long time. And this is part of the family of survival or selection biases, but it's a specific subset. So the vocab I'm going to drop on you is this idea of right censoring. Censoring you might think of, like in Alex's world of media, as putting black redaction tape over text. And it is kind of that idea.
I can't see into the future. No matter how
good I get at statistics, I do not get that
power. And by my inability to see into the
future, it is censored to me. It is redacted.
It is a black line on the text of the future.
And so I can't see how long the people who are
still alive are gonna live. I can only see
the people who have currently died. And because
it's a young genre, if you've died, you've
died young at this time. So 400 years from now,
when the ethnomusicologists do this graph again
for these exact same sets of genres, and we've moved far past them into super techno solar
cyberpunk, we will probably see all of these
look pretty similar because they will just
be people who are making music and dying like
all people do. Is that an upbeat thing to finish
your podcast on? Yes! And music, and rock and
roll! I mean, kind of. It's good to know that
Snoop Dogg and Eminem and all will probably
live to this average age of most jazz musicians.
It may be reassuring. Yeah, we can have them around for a while longer. I'm so thankful. I guess with the graph as well, since it shows increasing life expectancy, as life expectancy goes up you might even see, a hundred years down the line, some rock and hip hop and all of these artists actually exceeding or going over it, right? Because, uh, yeah, the music they play is probably not the driver of their longevity.
Exactly. As you're saying, the context they live in, better medicine. We know you shouldn't smoke now. Like, all of that societal knowledge is probably going to be a bigger driver, for the majority of musicians, of what their long-term health looks like. Not just that they are like
cool and hip and slightly emo bands or whatever.
That's such a good insight. Cause yeah, that's
exactly right. Those ethnomusicologists 400
years from now might see something that looks
completely inverted from this where you go,
wow, what good health all our hip hop stars
were in. How delightful. Wow.
I hadn't even realized, I was trying to make sense of this graph myself.
It's a tricky one. I don't love it. What could be the thing?
Yeah, no, that's great though. The youth of the genre means that you just – that line you said, you only have the
data of the people who have died and because the genre is young, they've died young.
So it's going to skew it completely. I just didn't even make that connection.
Wow. I think that was a great one to end on.
Your first thing is already like, oh yeah, this
agrees with my preconceptions. And I will be
fair to the original author, who does mention this, I believe, as a confounding feature. But you can't see that from the chart itself, because as we were talking about before, anytime you attempt
to simplify, you lose nuance. And that's actually
really a hard kind of thing to get across in
a graphic. You'd really have to change the
graphic here. And so I think whoever probably
put this together was doing the best with what
they had. But a really key part of the way to think about this data in order to understand it is completely gone if you don't
have the little questioning voice, the little
statistician who sits on your shoulder and asks
you these questions. Would a more, let's say,
quote unquote, fair, or I don't know if that's
the right word, graph be something that looked
at the relative age of these genres and then went back to something like blues and said, let's take, say, 30 years, and then looked at the deaths there, I don't know, I don't even know how you'd do that, but comparing the same span of time
would probably provide a more equal result,
right? That's a fantastic idea, because exactly,
if you were interested, if you changed your
research question a little bit, and were like,
which genre of music saw the worst premature
death rates? And you go death before 30 or death
before 35, pick something that seems kind of
fair, based on some of the genres you're picking.
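A hypothetical Python sketch of that reframed question, with invented records: compare the share of each genre's artists who died before 35, counting only people whose lives we could actually observe up to 35.

CUTOFF = 35

# Invented records: (genre, age at death or None if alive, current age or None if dead)
artists = [
    ("jazz",    74,   None), ("jazz",    31,   None), ("jazz",    None, 88),
    ("hip hop", 26,   None), ("hip hop", None, 52),   ("hip hop", None, 33),
]

def premature_death_rate(genre, cutoff=CUTOFF):
    events, at_risk = 0, 0
    for g, age_at_death, current_age in artists:
        if g != genre:
            continue
        if age_at_death is not None:    # a death we observed, early or late
            at_risk += 1
            events += age_at_death < cutoff
        elif current_age >= cutoff:     # alive and already past the cutoff: a non-event
            at_risk += 1
        # Alive but still under the cutoff: censored, so left out entirely
        # instead of being silently counted one way or the other.
    return events / at_risk

for genre in ("jazz", "hip hop"):
    print(genre, f"{premature_death_rate(genre):.0%}")

Leaving out the still-young living is the key move; the naive chart instead averages only the dead.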
And that's a fabulous question, Alex. And that
would probably give you a really different insight, where for old, old blues or jazz, you might be seeing people who were still suffering from
tuberculosis. People who are dying in World
War II when like, you know, there's all these
other features that exactly as Sahir was saying
before, you might actually see that there's
lower premature mortality in some of our modern
genres because these folks haven't gone to fight
in a war. They have access to treatment for
tuberculosis. Like a lot of the things that
are actually driving mortality here are not
your taste in music. Wow. That's such a good
question. That was really good. I think that's
a good place to end on. I think so. Yeah, absolutely.
That was great. Just like letting me run wild
on all my favorite topics. How cool is this?
Well, we can let you run wild for longer. Maybe we can have you back on next time with more examples. We haven't covered so many- Don't
tempt the girl. We haven't covered so many things.
I saw a slide there called P-hacking, which
is a whole different thing. We just touched
upon the P-value, but we didn't even get into
any of the scarier stuff of how things, data
can be manipulated. Oh yeah, there's a fantastic interactive that you can pop a link to, or yeah, if your listeners will let you have me back, believe me, I could tell you some more stories about what you can do if you want to do wrong. Awesome. Well, before you say goodbye
to the listeners, the last thing we will ask you is if you have any one piece of advice for them before you go, and that's how we would end. So yeah, thank you for this question. What
I hope folks might have got from this or might
be able to get from other sources is you don't
have to have been a math genius to be a good
statistical thinker. And I want for all of us,
if I was ruler of the world or had my magic
wand, good statistical thinking would be the
wish. If I was the world's fairy godmother,
let's make it a little more benign. I wave my
little wand and we are all more confident and
thoughtful statistical thinkers. On the other
side, if you are a math whiz, that doesn't
automatically make you a good statistical thinker
because of comfort with uncertainty. the ability
to see what's not there, and a sort of questioning
mindset. Might be part of your good math set,
but if you're just a timetable's whiz, that's
not the same thing. So I hope your listeners
are a little more enthusiastic. They have perhaps
the beginnings of a statistical smile after
this conversation, and maybe a little bit more
confidence that they're asking good questions
and that there are good ways to interrogate
the data that is being presented to them. Half the world goes to the polls this year, in 2024 when we're recording. And I think there's no more important time than the present to be more
confident in our ability to make good decisions
and to have data as one of the things in our
toolkit to make those good decisions. Awesome.
Beautiful. Well said. Very well said. Thanks
for letting me say it. Awesome. So thanks guys.
Thanks Alex. I hope you enjoyed that. I loved
that. It was really great. Super fun. Okay.
I hope you enjoyed Dr. Bolton stalking you online as well. Or... Yeah, well, that was a genuine surprise. I love your Substacks! Shout out to Alex's Substacks! Yeah, Alex has got
a few podcasts going as well. You should follow
all of them. Yeah. Thanks. It's actually really
interesting. I've spoken to quite a few people
about statistics and dice things as part of
those projects, which has been really fun. I
literally coded one up to make better decisions in Baldur's Gate 3, because I was trying to figure out whether using bless or having advantage was going to be better for one of my skill checks. So... That way is a slippery slope to further statistics, my friend. Oh, delightful. All right, awesome.
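For the curious, a minimal Python sketch of that kind of dice comparison, not Alex's actual tool: bless adds 1d4 to the d20 roll, advantage takes the higher of two d20s, and critical-roll rules are ignored for simplicity.

from fractions import Fraction

def p_bless(dc, modifier=0):
    # Chance that d20 + 1d4 + modifier meets the difficulty class, by full enumeration.
    wins = sum(1 for d20 in range(1, 21) for d4 in range(1, 5)
               if d20 + d4 + modifier >= dc)
    return Fraction(wins, 20 * 4)

def p_advantage(dc, modifier=0):
    # Chance that the higher of two d20s + modifier meets the difficulty class.
    wins = sum(1 for a in range(1, 21) for b in range(1, 21)
               if max(a, b) + modifier >= dc)
    return Fraction(wins, 20 * 20)

for dc in (10, 15, 20):
    print(f"DC {dc}: bless {float(p_bless(dc)):.3f}, advantage {float(p_advantage(dc)):.3f}")

Roughly, advantage wins on easy and middling checks while bless pulls ahead on the very hardest ones, exactly the kind of question a little enumeration settles faster than vibes.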
Thanks, guys. Thanks everyone for listening
and yeah, see you guys next time. Thank you.