Smooth Brain Society

#86. Thinking Together: Humans, AI, and Better Decisions - Dr. Eeshan Hasan

Smooth Brain Society Season 2 Episode 86



We explore the rapidly evolving intersection of human psychology and artificial intelligence with Dr. Eeshan Hasan of The Ohio State University, who specialises in computational models of decision-making.

We dive into how humans and AI can work together more effectively, from improving medical diagnoses using "wisdom of the crowd" techniques to tackling misinformation and distortions on social media. Dr. Hasan shares insights into how both human and machine intelligence represent information, and what that means for the future of healthcare and human-AI collaboration.

This conversation unpacks big questions: Can AI help us think better? Where do human biases still outperform algorithms? And how do we design systems that enhance, rather than distort, our decision-making?

If you're interested in the future of AI, psychology, and the way we navigate an increasingly digital world, this episode offers a thoughtful and accessible deep dive.

Dr. Eeshan Hasan: https://eeshanhasan.com/

Support the show

Support us and reach out!
https://smoothbrainsociety.com
https://www.patreon.com/SmoothBrainSociety

Instagram: @thesmoothbrainsociety
TikTok: @thesmoothbrainsociety
Twitter/X:  @SmoothBrainSoc
Facebook: @thesmoothbrainsociety
Merch and all other links: Linktree
email: thesmoothbrainsociety@gmail.com


Hello everyone, welcome to the Smooth Brain Society. I am leading this episode, Mr. Hussain, with Dr. Hussain co-hosting. Today we'll be talking about the fascinating frontier where human psychology meets artificial intelligence. To do this we have, as mentioned, Dr. Eeshan Hasan, postdoctoral researcher at The Ohio State University. He specializes in computational models of how we make decisions. What makes Eeshan's work so timely is his focus on human-AI collaboration. In his work he has used wisdom of the crowd techniques to sharpen medical diagnosis, and some of his current work examines representations in human and artificial intelligence, as well as how we could address cognitive distortions on social media. He's been published in top-tier journals. Today we are going to dive into how his research is shaping the future of healthcare, digital truth, and the way we interact with machines. Eeshan, welcome to the show.

Thanks for having me, Sahir and Amer.

Awesome! So to start off, for those of you who don't know, which is probably most of our listeners: Eeshan, Dr. Hasan, and Dr. Hussain and myself, Mr. Hussain, all grew up together as kids. So we know a lot about you, but could you give the audience a little taste of your origin story?

Yeah, we do know each other, we go way back. So I was always interested in mathematics, even growing up, so I started studying mathematics. That's what I did my bachelor's and master's in, at the University of Hyderabad. And then I took this one course in philosophy which made me question maths and science. It basically said that, hey, mathematics is sort of true regardless of what we observe in the real world. So then I started asking myself: what was I even studying in the first place, if it didn't have anything to do with the real world? I wanted something that was relevant, something that actually interacted with the real world. So I moved to the very applied field of data science and worked for a year as a data scientist. But very quickly I realized I wanted to answer more abstract, theoretical questions, and the questions at that time were about machine learning, learning, intelligence. And I felt like there are these two related fields, psychology and cognitive science, that were really answering these at a very deep theoretical level. So I decided to do my PhD in that. I started off at Vanderbilt University, then I moved to Indiana University, and now I'm at Ohio State.

I think a good place to start in terms of your research journey is some of your earlier work. From an AI perspective, with my background, you'd consider this ensemble learning, or group thinking. So I have a softball question to start: given the current trajectory of collective thinking we see around us, in the news and on social media, will crowd-based thinking even be capable of, in a literal sense, raging against the machine once AI takes over, and therefore lead humanity to its inevitable conclusion?

So collective intelligence, or crowd-based thinking, is thought to underlie a lot of our systems, right from democracy to stock markets to Google searches and things like that. And for it to function effectively, you kind of need to extract it in an intelligent way. And I fear that right now we may not be doing the best.
So I'm actually worried about this, and I think we may have to really keep it in mind while designing future technology.

Yeah, fair enough. I just wanted to throw you off there, apologies. Let's circle back.

Yeah. So for instance, the stuff on social media: for collective intelligence you really need somewhat independent evaluation of evidence, right? You need independent decision-making. And if you don't have that, and all these decisions are somewhat correlated, then you don't necessarily get collective intelligence. And if you have extremely effective social media that amplifies some weird voices, you don't necessarily get collective intelligence either.

That's very true. So I think let's take a step back for the general public and start with the foundation of this research: collective thinking. Let's start there. What did you look into?

So there are two different things that come to mind when you think about collective thinking. One of them is where you have people interacting, maybe in a group, and making a decision as a group. But what I studied was non-interacting individuals who each give a vote, and that vote is used to do something intelligent. In my case, it was to produce high-quality medical data sets. This happens in medicine all the time: you have a doctor, and the doctor is unsure, so they ask someone else for a second opinion, that other person gives their opinion, and somehow their collective opinion can outperform an individual doctor. This is the basis of collective intelligence. What we showed, fascinatingly, in our specific tasks, was that even novices, if you put enough of them together, like undergrads, can outperform experts in something like medical image classification.

Well, so let's look into that cohort. What was the sample like? What type of people did you involve in the study?

So during my PhD I had two major papers on this. One of them just used undergrads at Vanderbilt University. Admittedly, they are not the most random sample, they are kind of smart undergrads, but they're definitely not medical doctors. And we trained them to classify white blood cells. These are cells that are very important for diagnosing cancer. They went through a very quick training phase, maybe 10 or 15 minutes, where they just learn through feedback. So they were not told: here's a cell, this is the morphology, this is the DNA, this is what the DNA does. It was literally: this is cancerous, this is not cancerous, this is cancerous, this is not cancerous. And at the end of that, they started making their own decisions. We combined decisions across people, and we found you can have a wisdom of the crowds. You can also have a wisdom of the inner crowd, where you can combine two decisions made by the same person and outperform their average performance.

Wait, wait. What does that mean, just for somebody who doesn't know? So you're showing people images of white blood cells and you're saying, okay, this one is cancerous, this one isn't. And when you're talking about inner wisdom, you're saying that the same person over time is basically going to... sorry, not over time, but the same person's decision on one cell is going to get better? Or what does inner wisdom mean in that sense, compared to the average?
Great question. So you do a training phase, which is where you learn the task, and after that you're tested without feedback. Wisdom of the inner crowd is: how can you use one mind, sample information from one mind, to get a more accurate decision? And the basic argument is that if you get two samples from the same person and you take the more confident decision, that confident decision is more likely to be accurate. It improves performance by something like 2%. So you ask the same question twice, you have two decisions on it, and you have the confidence on each of these decisions. You take the more confident decision, and it improves performance. This is called maximum confidence slating.

Okay, so you're also measuring how confident they are in their choice when you ask them yes or no, whether something's cancerous. Was there any... I imagine there'd be bias when you're taking answers from the same person on the same set of questions, right? So wouldn't the bias from the initial answer affect the results? How is that taken into consideration in terms of determining this overall confidence level?

So wisdom of the crowds technically is not about bias, it's about reducing noise. Humans make probabilistic decisions, so even when they make a decision, it's not necessarily their best decision. If you ask them the same question again, they may make a different decision, and that's what you leverage. And if there is a bias, if that person is making biased responses, then combining those two decisions will not really help. That's why it's better to combine decisions from two different people than from the same person. However, it's also interesting that you can resample decisions from the same individual to improve performance.

I guess that comes to the thing of: if I had done this again, would I have done it differently? Using hindsight to influence your future decisions.

Right, exactly. Although in this case I don't think they would remember their old decision, because they made, say, 300 decisions in block one and then 300 decisions in block two, and you make these decisions pretty quickly, maybe two to four seconds each, and they all look like these blobby images. So I don't think you'd remember it in that way, but it might have implications for that as well.

Okay, awesome. So the key thing you said was that random students were outperforming experts. Were they outperforming individual experts, or groups of experts? I'm interested in that.

Yeah, great question. Groups of experts outperformed these undergrads; they were way above. The undergrads could start matching a single expert's performance.

OK.
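A minimal sketch of the maximum confidence slating rule described above: given two (decision, confidence) samples from the same person on the same image, keep the more confident one. The simulated rater below is an illustrative assumption (confidence is made to carry a deliberately strong signal about correctness, so the effect is much larger than the roughly 2% gain reported in the episode):

```python
import random

def max_confidence_slate(sample_a, sample_b):
    """Pick the more confident of two (decision, confidence) judgments
    made by the same person on the same stimulus."""
    return sample_a if sample_a[1] >= sample_b[1] else sample_b

# Toy rater: ~65% accurate, slightly more confident when correct.
# This confidence-accuracy link is an assumption for illustration.
def judge(truth):
    correct = random.random() < 0.65
    decision = truth if correct else 1 - truth
    confidence = random.uniform(0.5, 1.0) + (0.15 if correct else 0.0)
    return decision, confidence

random.seed(0)
trials = 10_000
single_hits = slated_hits = 0
for _ in range(trials):
    truth = random.randint(0, 1)              # 1 = cancerous, 0 = not
    first, second = judge(truth), judge(truth)
    single_hits += (first[0] == truth)        # baseline: first answer only
    slated_hits += (max_confidence_slate(first, second)[0] == truth)

print(f"single decision:        {single_hits / trials:.3f}")
print(f"max-confidence slating: {slated_hits / trials:.3f}")
```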
So in a funny way, and I mean this is a very specific, one-case thing, but in a funny way, if you had an issue, you might as well ask 300 students versus a single doctor, and you might get the same, or almost a better, average response from the 300 students. Something like that?

Right. So if they have a little bit of task training, a little bit of perceptual training in this one specific task, then in our studies we at least find that they might start outperforming experts in these very, very specific kinds of tasks. Now, I'm not sure you can actually go and get yourself diagnosed with a certain condition this way, but I might imagine, and no one has done this experiment, that it would be interesting to check as a collective intelligence kind of thing: you have some kind of problem, you post it on some sort of platform, and individual people combine the little bits of information they each have, and they might actually be able to help you out. But we don't know the limits of these things, right? In our study we had about 30 undergrads, and you could plot the performance curve: an individual person is about 60 to 65% accurate, and if you put all 30 people together, they were about 80% accurate. So they were doing pretty well, but the group of experts was at about 95%. I've always wondered what that gap is like. When does the wisdom of the novice crowd never get something right? When does that happen?

Right, especially given that the questions you're asking seem to be in the field of medicine, and medicine is a practice, right? At this point there's no perfect doctor, no perfect solution to a problem. So that raises a very fascinating question: in a field that keeps growing, how do the experts grow with it, and where is the gap between the novice and the expert? Does it increase or decrease over time?

Yeah, exactly. And even experts disagree with each other. That's one thing we found. If you have two experts and you want wisdom of the crowds, that requires that these experts don't always agree with each other. So there may not be a right answer in some sense, and that's itself a little difficult to establish. And then, what are the limits of what novices might be able to do with a little bit of training?

So these novices, these undergrad students, are they from a particular school? The school of psychology, the school of medical sciences, arts?

I don't think we asked that question, but this was done on Sona, which is a university platform where mostly psychology undergrads participate for course credit. So it's probably the psychology students. And depending on the university, they probably don't have much cell biology experience. So pretty novice that way. I don't know how your university works.
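To make the shape of that performance curve concrete, here is a small simulation, a sketch under the unrealistic assumption of fully independent voters who are each 65% accurate. Real raters are correlated, which is one reason the study's crowd of 30 landed around 80% rather than the higher number this idealized model produces:

```python
import random

def crowd_accuracy(n_raters, p_correct=0.65, trials=20_000, rng=None):
    """Accuracy of a simple majority vote over independent raters."""
    rng = rng or random.Random(1)
    hits = 0
    for _ in range(trials):
        votes = sum(rng.random() < p_correct for _ in range(n_raters))
        if votes * 2 > n_raters:        # strict majority is correct
            hits += 1
        elif votes * 2 == n_raters:     # break ties at random
            hits += rng.random() < 0.5
    return hits / trials

for n in [1, 3, 7, 15, 31]:
    print(f"{n:>2} raters -> {crowd_accuracy(n):.3f}")
# Gains shrink as n grows: most of the improvement comes from the
# first handful of raters, matching the diminishing returns discussed later.
```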
But just medically, I'm trying to think of the relevance of this, because you said people saw 300-odd images of cells but had to make each decision quite quickly, four or five seconds. In the real world, I'd guess an expert, or even a novice, if you show them the cell, might have ages to make a decision. Does that play a factor in decision making, just the sheer speed of it in your experimental condition versus what real life might look like?

I haven't studied this myself, but there's a big question of what speed does. You can draw these really nice speed-accuracy trade-off curves: you can be quick and maybe inaccurate, or slow and accurate. You can draw those curves and try to figure out why people are making those kinds of errors. Typically, performance goes down as you force people to speed up. I've analyzed some data from that kind of data set, and there are interesting cognitive models of how people make the trade-off.

So you mentioned there's not a lot of literature on this particular aspect of your research. Would the next step be trying a different route? I imagine there could be certain innate environmental biases from taking undergraduate students, right? So would the next expansion be trying to get a cohort of vocational students, high school students, or just a larger population as a whole?

Great question. This actually has an application. There's a commercial company called Centaur Labs, which produces really high-quality medical data sets, and the way they do that is with the wisdom of the crowds. We worked with them to validate this technique. The question there was not "can you lower expertise and still improve performance", but rather: can we have an app-based interface where people log in and participate for performance-based rewards, like little competitions, and can that be used to annotate data sets? And this was with a different task, a skin lesion classification task: melanoma versus non-melanoma, seven different categories. Most people on that app were actually medical students, but there were other people who were non-medical students, and some people who were super experts. So it's not a completely lay crowd in some sense, but the wisdom of that crowd also outperformed an individual board-certified dermatologist. So yes, this has applications, and even though I've moved on, my lab is continuing to collaborate with them.

So that begs the question: do you think there is a crowd, and I'm talking specifically about a crowd in the field of medicine, which would see these results and go, hmm, maybe we should be training our doctors better, rather than reckoning with the fact that maybe collective intelligence could be as good as a singular expert, not a group of experts?

So, I think if you can improve education, that's obviously going to improve things. Of course it would be better to bring in better technology to train people better. However, you also need to acknowledge that doctors are humans too, right? There was this big report which came out around the beginning of the 2000s saying: look, we can't expect doctors to be completely perfect; we have to design systems around them, knowing that they're humans and that humans make certain kinds of errors. And now we are in the age of AI, where we are trying to produce these really high-quality data sets, and we have to remember that we can't have this one doctor who gave you the truth, the correct, gold-standard label. You might need a group of experts, and you might need to acknowledge that individual experts can make mistakes.
Okay, so this flows on nicely to the next step: do you then use this crowdsourced knowledge to train artificial intelligence, to train these models to protect doctors from the errors an individual doctor can make?

Yeah, we are working on things like that. You can actually train these machines on the wisdom-of-the-crowd output, so something that's not just a zero or a one, but something that actually gives you the uncertainty, and we are trying to see what the implications of that are. In this other project, we are looking at what happens when you have a really accurate AI that gives you confidence judgments and was trained on this crowd output. The AI is extremely accurate, like 95%, so its confidence judgments are around 95%. But the individual participants, this was an MTurk population, these humans interacting with this really high-quality AI system learn to also become overly confident in their own responses. So one way you can address human overconfidence is by actually reducing the confidence with which the AI reports its own judgments. Let me take a step back. What I'm saying is that you can use wisdom of the crowds to train machines; there's a big question of how those machines give outputs; and if you're not careful, humans might become overconfident, because these machines might be really accurate on these tasks.

It's the thing of: would you like a machine that provides accountability for its decisions, versus, oh, it's always 100% accurate, so I don't really need to care about the underlying thinking?

I really care about that. Machines that just give you an output and not really a reason are difficult to trust, and difficult to debug.

There's a school of thought in the field of AI, though: does it really matter, if the machine is right 99.9% of the time? That's still better than the false-positive rates of most tests, right? I personally agree with you, but that is a school of thought in the AI industry.

So one possibility is that we hand certain tasks off to AI entirely, and I think that's on a case-by-case basis, and we might have to do it. But as long as we have humans in the loop, you have to worry about the utilization of the AI machine. And you have this tricky balance of over- and under-utilization. Over-utilization is where you over-rely on the machine and don't even use your own judgment; under-utilization is where you just completely ignore it. And it's actually very tricky to design systems that are at that perfect level of utilization.
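Dr. Hasan's remedy above, reporting the AI's judgments with reduced confidence, can be pictured as temperature scaling of the model's output probability. This is only an illustrative sketch (the episode does not say how the adjustment was actually implemented); a temperature above 1 pulls the reported confidence toward 50/50 without changing which label the model prefers:

```python
import math

def tempered_confidence(p_cancer, temperature=2.0):
    """Soften a binary classifier's reported probability.
    temperature = 1 leaves it unchanged; larger values pull it toward 0.5."""
    logit = math.log(p_cancer / (1.0 - p_cancer))
    return 1.0 / (1.0 + math.exp(-logit / temperature))

for p in [0.95, 0.80, 0.60]:
    print(f"raw {p:.2f} -> reported {tempered_confidence(p):.2f}")
# raw 0.95 -> reported 0.81, and so on: the predicted label is unchanged,
# but the displayed confidence is less likely to sweep the human along.
```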
I want to take a step back. You both are AI people; I am not. I'm trying to think about how a crowdsource-trained AI is different from how we train AI now, in the generic sense of most AI models you see today. What's the difference between the two? I'll stop there for now.

I mean, at one level, a lot of AI right now is crowdsourced, in different ways. For instance, you have models trained on ImageNet. ImageNet is a big database of everyday objects, cats, dogs, ships, airplanes, and its labels were produced by lots of people, so in some sense that was a crowdsourced thing. You also have LLMs, which are trained on a large corpus of language. So in some sense, different kinds of collective intelligence are already being used to train these machines. And in medicine, again, you have wisdom-of-the-crowds kinds of approaches, where you don't have one expert, you have many experts and you take the mean of that. What we did was formalize that and study it a little more seriously.

But you can compare this to a few other intuitive ideas. You have a big data set and you want to produce labels, so you go ask this one expert and they give you labels. That is kind of dangerous. In fact, what we show in our study with Centaur is that if you go get the best performers, you sort people by accuracy and take the labels from that one best person, that again could be a little dangerous. That could also give lower performance in terms of the AI, partially because of selection biases: you pick people based on their training accuracy, based on what they did in the one task where you knew the correct answer, but their test accuracy might be slightly lower precisely because of that selection bias. The other issue is that if you just have one label, then you don't have a measure of the underlying uncertainty, and those kinds of AI systems are thought to be a little less robust than systems that have more sense of the underlying uncertainty.

I want your take on this, Eeshan, just to extend the point you made about crowdsourcing versus these old-school image sets that in some sense are crowdsourced as well. Those earlier data sets are predominantly classification tasks, right? This is an image of a vase, this is an image of a cat, this is an image of a dog. You can rely on fewer experts, because not every expert has all the time in the world to do these labeling tasks, and those are tasks which would naturally have a high confidence level in their decisions. Even if there are some errors, a second or third opinion can amend those. But this collective form of intelligence gathering can open the door to training on more subjective topics. At the end of the day, medicine is a practice. There could be some cases of 100% confidence, but there can also be cases of 60% confidence, where maybe crowdsourced thinking can elevate that from 60% to maybe 70 or 75%, and add robustness to those types of data sets.

Yeah, you're right. If you're thinking about annotating a data set of cats and dogs, it's not as difficult a task; you maybe just need to make sure there isn't a random mistake that a person has made. But with something like medicine, you have many features that you're considering, and you're weighting each of these different features, and every time you see an image you might think of slightly different features. That's maybe why people sometimes disagree with themselves, which is what we were talking about right at the beginning, and disagree with each other. And that's why it helps to take many, many samples.
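One concrete reading of "train on the wisdom-of-the-crowd output rather than a zero or one" is to use the crowd's vote fractions as soft labels in a cross-entropy loss. A minimal sketch: the episode does not say which loss or framework the team actually used, so the PyTorch usage below is an assumption for illustration.

```python
import torch
import torch.nn.functional as F

# Ten raters looked at each image; the vote fraction becomes a soft label
# carrying the crowd's uncertainty, instead of a hard 0/1 from one expert.
votes_cancerous = torch.tensor([9, 6, 1, 5], dtype=torch.float32)
soft_labels = votes_cancerous / 10.0            # e.g. 0.9, 0.6, 0.1, 0.5

logits = torch.randn(4, requires_grad=True)     # stand-in model outputs
loss = F.binary_cross_entropy_with_logits(logits, soft_labels)
loss.backward()                                 # gradients now pull the model
                                                # toward the crowd's probabilities
print(loss.item())
```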
And you get this really nice relationship between accuracy and number of people.

But you said it kind of hits a limit of accuracy at one point, right? With people, it doesn't go up to 100 percent, does it?

Well, we had 35 people, right? And the way these curves work is that they don't grow that quickly. Going from one person to three, five, seven gives you really nice improvements, but then the next step is from seven to maybe fifteen, and then thirty, and the next big improvement might be around a hundred. So you have these diminishing returns from each additional person in terms of the increase in accuracy. I don't know whether, if we had a thousand people, we would actually hit a hundred percent accuracy. All I know is that with 35 people, we were not there.

I imagine just the floor raises, rather than increasing the limits.

Maybe. So there are these cases where wisdom of the crowd fails, and people have some idea of why that might happen. These are thought to be wicked cases, subtle cases. One example from general knowledge: if you ask people, hey, what's the capital of Australia, most people might say something like Sydney, or Melbourne, because that's what comes to mind, but the correct answer is Canberra. These are cases where the crowd consensus is wrong. And the people that say Sydney might also be more confident, and the people that say Canberra might be less confident. So it's an open question: why do these happen, when do they occur, how often do they happen in medicine, and what can we do to handle them?

Do you have any theories or ideas on why this happens?

I think it might have to do with the same things that explain biases in perception or in decision making. The Canberra example might be explained by some sort of heuristic: often the largest or most popular city is the capital, right? So that might explain why people do this. These kinds of errors also occur with perceptual illusions. If I draw a squiggly line that is circular and another one that's straight-edged, the circular one is often underestimated in terms of its total length, if you ask people for the lengths. So you have these mistakes that happen because of the perceptual system, and I think those might be the kinds of cases. In medicine it's easy to imagine something like this happening, where an image mostly looks, say, non-cancerous, but there's this one very subtle feature which not everyone picks up on, though some people do. There are these "surprisingly popular" types of wisdom-of-the-crowd algorithms, where an answer that you would not expect to be correct, but that gets an unusual number of responses, can recover these kinds of hidden, wicked cases. But yeah, I haven't looked at that too much, at least in my tasks.
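The "surprisingly popular" idea mentioned above (formalized by Prelec, Seung, and McCoy) can be sketched in a few lines: alongside their own answer, each respondent predicts what fraction of others will pick each option, and the answer whose actual popularity most exceeds its predicted popularity wins. A toy version with the Canberra example; the numbers are made up for illustration:

```python
def surprisingly_popular(answers, predicted_shares):
    """answers: each respondent's own answer.
    predicted_shares: one dict per respondent, forecasting the fraction
    of people who will give each answer."""
    options = set(answers)
    n = len(answers)
    actual = {o: answers.count(o) / n for o in options}
    predicted = {o: sum(p.get(o, 0.0) for p in predicted_shares) / n
                 for o in options}
    # Pick the option whose actual share most exceeds its predicted share.
    return max(options, key=lambda o: actual[o] - predicted[o])

# Toy data: 70% say Sydney, and even Canberra-sayers expect Sydney to
# dominate, so Canberra ends up more popular than anyone predicted.
answers = ["Sydney"] * 7 + ["Canberra"] * 3
predicted = [{"Sydney": 0.9, "Canberra": 0.1}] * 7 + \
            [{"Sydney": 0.8, "Canberra": 0.2}] * 3
print(surprisingly_popular(answers, predicted))   # -> Canberra
```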
No, fair enough. How about we move into some of the other things you're looking at? One of them is developing models of human cognition using AI, so comparisons of artificial intelligence to human intelligence. From my understanding, and it's a very lay understanding, artificial intelligence basically runs on probabilities: if you put something in, it gets to an accurate answer by being trained on these things many, many times, and it seems to be very logical. Human responses aren't always very logical. So how do these two things line up? How can you use one to understand the other?

I don't know if human responses are necessarily illogical. I think they're mostly logical, until you look for these weird kinds of biases or weird kinds of examples.

And in fact that's where the interest comes in, right? It's not the standard logical responses; it's where the differences lie between a standard logical model and human decisions, which are not necessarily rational.

Yeah, so that's a really good question. I'm also interested in the similarities between humans and AI, because AI has a huge history in cognitive science and cognitive psychology, which focused on human decision making, human judgments, learning, things like that. You have something like the perceptron that was developed, which was a system that learned, and then you stack perceptrons together to build these modern AI systems, paired with learning algorithms like backpropagation. I would argue that a lot of these were actually inspired by trying to model how humans think, learn, and decide. So I'm really interested in asking, for the modern machines we have right now, whether their representations are similar to how we represent things, and if they differ, why and when they differ.

So, to start there, what has been the first example you've been looking into to make these comparisons?

Great question. I really like this one idea in cognitive science of psychological representational spaces. The idea is that we see something, and we have this internal representational space that is almost geometric, and you can place every object in that representational space, and then you do something with that to make decisions. For instance, I might represent food, and I might represent the different kinds of pasta. If you could actually look into my mind, you would see all the pastas clustering together, and you would see all the different kinds of kebabs clustering together. So if I'm trying to decide what to eat, I transform different stimuli, different foods, into these abstract spaces, and then I do some math on them in my mind, which is what cognitive science argues for, and then I make a decision. What I'm interested in is seeing whether these human representations are similar to how machines represent things too. Because you can pass stimuli through these artificial neural networks and, yes, out plops an answer, but between the inputs and the outputs you've got all of these layers, and you can pull out activation patterns in any of these layers and start asking: these representations in my mind, do they correspond to the representations in this machine? That's what I've been doing with medical images right now.

OK, that's very interesting, because this is just one theory of how we represent things in our mind.
Are there any other theories of how humans might represent things in their mind?

I mean, the biggest issue is trying to reconcile this with logical, abstract, symbolic representations. So, symbolic representations: let's say you're trying to tell me what you like, in terms of food. One idea is that I build up this big representational space and try to figure out where your preferences lie. That's one explanation. Another thing I could do is think it through with a rule that captures your preferences, and that rule could be expressed in symbols, like "Sahir likes meat". It's a statement, and I've learned the statement. So how do you take this abstract representational space, where everything is represented in this geometry, and reconcile it with these symbolic statements: Sahir likes meat, and therefore Sahir should like lamb? Something like that.

This actually reminds me of... isn't there a part of the brain, in a previous episode we discussed this, and I could never get its name right and I just kept calling it the hippopotamus...

The hippocampus, yes.

Doesn't this remind you of the function of the hippocampus, which sort of creates your perception of the environment around you? Go on, Eeshan. Sorry, both of you, it's because Sahir and I did that episode maybe a year and a half ago.

I mean, the current theory as far as I understand it, in a nutshell, is the idea that maybe the hippocampus, like you said, stores individual examples, but then the cortex does something to those examples to make meaning, or semantics. And that's how you learn, and go from individual examples to these rule-based systems.

I guess another thing I'll add is that what we were talking about then, with Professor Kate Jeffery, was place cells. That was more about how a visual representation of space is represented in our mind in those particular cells, which are found in the hippocampus. And considering that's more the external space being represented internally, for our sense of direction and of where things are, this probably comes more into how the hippocampus accesses memory in a different way to internally represent things: pulling different things from our cortex and our working memory together into what we perceive while somebody's talking or doing something. Like Eeshan said with the "Sahir likes meat" example: pulling memories of Sahir, pulling memories of what he likes, pulling memories of food, and then going through what matches and what doesn't match based on memory. Am I representing this correctly, Eeshan, or am I not?

So it looks like our lay audience has a PhD in psychology.

I believe at this segment I am the smooth brain, because we've moved from the AI side to the psychology side.

Well, I think with the hippocampus place cells, that's one place where we have a very good understanding of what these different neurons are representing. And that has laid the groundwork for a lot of very cool work comparing humans and machines. So that's one place where people study this.
And we know a lot about the neuroscience there; we know a lot about what individual cells are doing and how they are arranged, and whether they correspond to the outside world, and things like that. We've also mapped this out really well for the visual stream, where the visual inputs go all the way to the back of the brain and then get processed along two very distinct streams: one of them handles motion, and the other identifies individual objects. And at least at the beginning of those streams, we know what almost individual cells are doing. Individual cells identify a specific edge, then others identify different shapes, and you can map this on really nicely. But for these more complex stimuli, we don't know how the brain does it. At least, I don't know whether we know where and how people represent these very specific medical images. But we can study these things behaviorally too. And with the machine, we know exactly what individual neurons are doing, so we can really pull out those representations.

So, having taken the throne of smooth brain now, I'm trying to think of an analogy to make sure I've understood this. A perceptron is effectively a nerve cell, correct? And multiple connections of perceptrons represent a neural network? So for us as humans, the perceptron nerve cell takes the information, gives it to, say, either the spinal cord or the brain for the central operations and processing, and then messages back the output of what needs to be done.

So, Amer, I think you are on the right track, but the analogy does not work as deeply as you are making it out to be. The way artificial intelligence studies these perceptrons is by having layers of perceptrons and trying to see what they are learning under different constraints: different loss functions, different tasks, different data sets, different capacities, different architectures. Do you really need attention? Do you need some other kind of system? How does that change things? Does it work the same in language tasks and visual tasks? What about multimodality? Those are the kinds of questions people are asking. Now, you have specific people who study vision neuroscience, who identify these special cells and what they're responding to. You have people in connectivity, who study how this information moves back and forth within the brain. But what I am doing, for instance, is not even looking at the brain. I am doing this really tedious task of asking people: here are two images, do you think they're similar to each other or not? Here are two other images, do you think they're similar or not? And over time I can see which images they think are similar, which images cluster together. And I can see which images cluster together for the machine. And then the question is: are they clustering the same kinds of images together, or are they not?
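Turning many pairwise similarity ratings into the geometric "atlas" Dr. Hasan describes next is commonly done with multidimensional scaling. A minimal sketch using scikit-learn; the library choice and the toy data are assumptions, since the episode does not name the exact technique used:

```python
import numpy as np
from sklearn.manifold import MDS

# Toy similarity ratings over 4 images on a 1-10 scale; (10 - rating)
# turns them into distances. Images 0,1 play "cancerous" and 2,3 play
# "non-cancerous" purely for illustration.
ratings = np.array([
    [10, 9, 2, 3],
    [ 9, 10, 3, 2],
    [ 2, 3, 10, 9],
    [ 3, 2, 9, 10],
], dtype=float)
dissimilarity = 10.0 - ratings

mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dissimilarity)   # each row: an image's position
print(coords)                               # the two pairs land in two clusters
```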
Mm hmm. So have you done any experiments like that, or are you in the process of doing them?

I am in the process of doing them, and I have initial results, one interesting result. This was with novices who had almost no training on this white blood cell task, the same task I was talking about. I got them to say: what's similar, what's different, what's similar, what's different. These are people you can recruit through CloudResearch, a website where people participate in online psych experiments. And all I'm asking them is: here are two images, do you think they're similar? On a one-to-ten scale, how similar do you think they are? Here are two other images. It's not a very difficult task; almost anyone can do it. And if you take that whole thing and map it out with the technique I was mentioning before, you can produce an atlas, a geometrical representation of what that person's representation looks like. And it seems like the cancer cells are clustering together and the non-cancer cells are clustering together. That makes me think that, even for an untrained novice, maybe there's something about their representation where they can somehow extract features, maybe cell size or the texture of the nucleus, which really correspond to morphological attributes of the cell. You and I can look at a cell and pick up on the texture of the nucleus, and that maps on pretty well to something like chromatin density. So there's something really amazing about our representations, where even if you've never been exposed to these images, we can extract information from them in a somewhat reasonable way. And machines may not be able to do that, AI may not be able to do that right now, at least the way we are training them.

But as far as representations go, do you think the reasons for the clustering are the same, or could they be different? Could different people be reasoning differently about why certain images cluster together, yet give the same response?

So whenever you're doing this cognitive science stuff, at one level you ask the question: what does the average person look like? You superimpose all humans ever, and you say: all of these people see this perceptual illusion, all or most of these people have this one regularity that I'm finding. I'm at that stage right now, just trying to see what this average human being's representation looks like. And then you can start accounting for how individuals might differ from each other. But that's hard, because you can only ask people to make these similarity judgments for about 300, maybe 400 to 500 pairs, before they start getting annoyed. Unfortunately, some of these cognitive science experiments are not the most interesting, but we've got to do what we've got to do, right? To learn about the human mind.
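Comparing the human atlas with a network's internal space is usually framed as representational similarity analysis: build one pairwise-similarity matrix from human ratings, another from a model layer's activations, and correlate them. A hypothetical sketch, with the layer choice, metric, and data all being illustrative assumptions:

```python
import numpy as np
from scipy.stats import spearmanr

def rsa_score(human_sim, activations):
    """Correlate human pairwise similarities with a model layer's.
    human_sim: (n, n) ratings; activations: (n, d) layer vectors."""
    # Cosine similarity between every pair of activation vectors.
    norm = activations / np.linalg.norm(activations, axis=1, keepdims=True)
    model_sim = norm @ norm.T
    iu = np.triu_indices(len(human_sim), k=1)   # off-diagonal pairs only
    rho, _ = spearmanr(human_sim[iu], model_sim[iu])
    return rho

rng = np.random.default_rng(0)
human_sim = rng.uniform(1, 10, size=(6, 6))     # stand-in rating matrix
human_sim = (human_sim + human_sim.T) / 2       # ratings are symmetric
acts = rng.normal(size=(6, 128))                # stand-in layer activations
print(rsa_score(human_sim, acts))               # near 0 for random data
```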
That just reminded me of the company you're partnering with which has the app, gamifying things. They should lean into late-stage capitalism and turn it into a gambling game. That would definitely widen the audience who participate.

But how do I give a ground truth for how similar two images are? If a person thinks two images are not similar to each other, then I sort of have to accept it.

So it's kind of how Polymarket works. You see the line move depending on how many people say similar or different.

Right, and you can do strange things. You can say: okay, I'm not asking for your similarity judgment; I'm going to ask 20 other people as well, and whoever's closest to the mean gets the reward. So there might be ways of doing it, but incentives are tricky. Because now, say at the end of all of this I conclude: look, people have somewhat similar representations, because they all correlate with each other very highly. Someone is going to argue: hey, look, that was built into your system by the way you gave the incentives.

OK, so as far as those representations go, it looks like in your experiment it's clustering fairly neatly between cancerous and non-cancerous cells, or whatever objects you're trying to separate. Do you have any inkling of how AI represents these differences or similarities?

So I've been looking at AI systems too, and again, these are initial results. But take something trained on ImageNet, something trained to classify everyday objects, cats and dogs and plants: they don't seem to learn the same kind of clustering. They have these overlapping representations, which might be a bit of a problem. In fact, this gets into a pretty big question and debate in AI. You have LLMs, everyone's been using LLMs, and they can produce the next word, we can all see that. But why do they produce it? How do they produce it? What are they representing? What are they misrepresenting? It's not really known. Some people argue that these machines build what's called a world model, which basically means they learn all of these features and represent everything very nicely, and you develop these really nice representations that can help you do many different tasks. And other people argue that these machines are just learning shortcuts: they're just identifying, as in the ImageNet example, some weird stripes which are associated with the tiger, and that's kind of it. And this has implications. There was a machine which had really high accuracy in medical image classification, and when people looked under the hood, they found it was actually identifying the ruler in the image, because the cancerous images mostly had a ruler in them, placed there to show how big the lesion was. If you are not careful, we can be in trouble. There was another recent study with LLMs which showed something very interesting. Again, this is on arXiv, so take it with care. They had some sort of medical reasoning task, but due to some mistake, they could not give the model the images associated with the questions. And yet the machine actually did pretty well on the exam. People were surprised: how can this machine do that? It was not 90%, but it was definitely better than chance.
And the explanation that's coming out of it is: maybe there's information in the question itself, even if you don't have the image. It's like what happens when you go into an exam and you haven't really studied, and yet you can look at the answer options and say: this is probably what the teacher was trying to do, this is probably what they're trying to confuse me about, so it's probably A or B. Those statistical regularities might actually give you non-trivial results. So why these machines work, and what they are representing, are big questions. The world-model people argue that these machines represent everything about the world, and the shortcut-learning people argue that they represent interesting shortcuts and statistical regularities about the world.

A world where strawberry has two Rs.

Yeah, that's another good example, where it's not actually representing what we would have represented, which is S-T-R-A-W-B-E-R-R-Y. For us, it's trivial. But the machine was representing some transformation of these tokens, which looks pretty different from our alphabet, which is why it made that mistake.

So that leads me to ask: all this background is now set for us. How are you proceeding with your research in this field?

So one line of work is literally what we spoke about just now: I'm trying to map out human representations and machine representations, to understand what both of these systems are actually representing and why they differ. I'm also trying to take these machine representations and make models of human cognition. This assumes, at some level, that machines are able to represent these images, and that gives you some kind of similarity between two images. And if I know that two images are similar, I can start modeling your decision space. Let's go back to the food example, because that's easier. I can start modeling: I know that Amer likes pasta of type A and pasta of type B, and therefore I will try to predict that Amer also likes pasta of type C. I can infer this because I have an underlying model that tells me what is similar to what, and then I work out some more mathematics to predict how much you might like individual things. And this is very helpful for medical decisions, which is what I've been looking at. And this ties right back to the initial question we were talking about. I have this project where I'm asking: if there's wisdom of the inner crowd, can we figure out when we should ask someone to make a second decision? The argument there is: I have a model which can predict someone's decision, and I also have their actual decision, and if the two are inconsistent with each other, then something may be wrong; they didn't say what I expected them to say. So I can ask them to re-evaluate that decision. And this has been pretty successful. So even if we don't know just how similar human representations are to machine representations, as long as they're somewhat correlated, you can start using them to build very specific, individual models of decision making.
Just to make sure I've understood it: you take whatever decisions a human would make, whatever inference a human would possibly make relating different things, and see if the machine has a similar thought process connecting those things?

Not quite. I think we are bleeding in once again from the previous project. Here, I'm starting off with the assumption that machines can tell you when two objects are similar to each other. I can pull out representations from machines, compare two representations, and know whether two images are similar, or two foods, or two real-world objects. Now, using that information, I can build a cognitive model of a person's preferences. So, for instance, I ask you for your preferences on, say, 300 different food items, and I have representations of those individual food items. I'm trying to build a model of what Amer likes in terms of food, and I have various clues: Amer likes pasta of type A, and Amer likes pasta of type B. So I can calculate the similarity to pasta of type C and say: Amer will probably also like pasta of type C. Or rather, if I have your food preferences, I know that you like pesto, I know that you like Alfredo sauce, and I see that you don't like marinara, I can use that to say that anything else similar to marinara you will probably not like, and anything similar to these lighter Alfredo-type sauces, not too acidic or tomatoey, you actually will like. The underlying model uses similarity, which is obtained from the representation, and I use that to generalize and produce an entire decision or preference space. So in medical decisions, I have 300 decisions made by Amer, and I try to figure out whether specific decisions are not what Amer would ordinarily have said. You said cancerous on a specific image, and you said cancerous on other similar images: that's great. But what if you said cancerous on one image, while on other similar images you said not cancerous? I infer that Amer probably made a mistake and maybe means not cancerous, so I might ask you to re-evaluate it, and you can design all kinds of systems around that.

And if there's some consistent connection between the non-cancers, you understand why I'm saying they're non-cancers.

Yeah, if there's some consistent pattern, then it doesn't help to ask you to re-evaluate, because you actually did mean it's not cancerous. So it's this very specific kind of flagging system, which I think might answer some of those over- and under-reliance problems you were talking about. The human-AI collaboration research has many different parts, but the part I find most interesting argues that if you're designing human-AI collaborative systems, the AI might need to have a model of the task it's trying to solve, but also a model of the human it's trying to support. It has to combine information from both of these sources in order to make an accurate, helpful suggestion to the human.

No, awesome.
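A hypothetical sketch of that flagging idea: use similarity from a model's representation space to predict what a person "should" say from their own other answers (a similarity-weighted nearest-neighbor vote), then flag decisions that contradict the prediction for re-evaluation. The function name, parameters, and data below are invented for illustration; the episode does not give the actual model.

```python
import numpy as np

def flag_inconsistent(embeddings, decisions, k=5, threshold=0.8):
    """embeddings: (n, d) machine representations of n images.
    decisions: (n,) one person's 0/1 calls on the same images.
    Flags image i when its k most similar images (by cosine similarity)
    overwhelmingly got the opposite call from the same person."""
    norm = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = norm @ norm.T
    flags = []
    for i in range(len(decisions)):
        neighbors = np.argsort(sim[i])[::-1][1:k + 1]   # skip the image itself
        predicted = np.average(decisions[neighbors], weights=sim[i, neighbors])
        # Confident prediction that contradicts the actual call -> flag.
        if abs(predicted - decisions[i]) > threshold:
            flags.append(i)
    return flags

rng = np.random.default_rng(0)
centers = np.zeros((2, 16))
centers[0, 0] = centers[1, 1] = 3.0            # two latent cell types
cluster = rng.integers(0, 2, size=50)
emb = centers[cluster] + rng.normal(scale=0.5, size=(50, 16))
dec = cluster.copy()
dec[3] = 1 - dec[3]                            # one deliberate slip
print(flag_inconsistent(emb, dec))             # expect image 3 to be flagged
```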
So then it's basically like the AI, instead of saying this is right, this is wrong, is more like flagging things: if a doctor inputs that something looks cancerous, the AI can double-check and say, based on how you usually respond, this is not consistent. So the AI is basically flagging the doctor's decision making, in this one regard.

Yeah, and then you don't need ground-truth labels. You just need a representation and a person, and you can evaluate whether they're consistent or inconsistent at any point in time.

So the AI is not making the decision for you, and you're hoping nobody becomes too reliant on it, as you said with over- and under-utilization.

Yeah, and this really gets at that human-AI collaboration question.

But are you afraid that this might reduce certainty, or make people more uncertain, if every time they give a response they lean on the machine: can you check if I think it's this, can you check for me? And over time they become more uncertain of their own decisions?

So what I haven't done is actually run tests of this algorithm with people. I've run simulations showing that it's effective, and one thing that constantly comes up whenever you're studying humans is that simulations can only go so far. You have to test these things, because there are all kinds of interesting factors like trust and self-efficacy, your own judgment of how good you are. You have to test them in the real world to see what's happening. Because if you give up on doing the task because the machine keeps pointing out inconsistencies, that might be a problem. But it might also not be a problem: if you're an expert in the task, it probably doesn't happen too often, and you might just find the machine is like a helpful colleague that understands your specific decision-making tendencies and supports you well. Also, if you're looking for consistency and someone gives really bad responses, you actually won't be able to identify much, because everything is messy. It's when there are patterns that you can start identifying places for improvement.

Now I think we can very briefly cover the third research area we mentioned at the start of the episode, focused on distorted thinking. So Eeshan, could you tell us a bit more about what it means?

Distorted thinking is an unrealistic, exaggerated way of thinking, which is associated with many mental illnesses, particularly depression. One example of a distortion is: "I'll never find love." If you believe this, it might lead to things like social withdrawal, because you give up on life and society and people, and then it might actually come true, right? But the point is that it's this really exaggerated framing, "I'll never find love", versus something like, "I haven't found love yet, but I hope to find it soon." You take the same content, but you help people reframe it so that it does not lead to these maladaptive behaviors. This is a big part of what cognitive behavioral therapy does.
So the underlying idea is that you have thoughts, and thoughts drive behaviors. If you have distorted thoughts, this exaggerated, unrealistic thinking, that causes maladaptive behaviors. The way you address maladaptive behaviors is by addressing the distorted thinking and helping people reframe it, so that they can act effectively and healthily in the world.

Awesome. So would you say it's a negative sort of distortion that leads to these emotions? Versus, say, helping them change the distortion to think glass half full instead of glass half empty.

Yeah, that's a great example of distorted thinking and how you can reframe it to take a more positive outlook, if that helps you in life. There's another interesting thing you brought up, which people often ask me about: positive distortions. We've talked a lot about this in our lab, and there's some disagreement on whether that counts as distorted thinking too, because you can believe you're the best person in the world, and that might actually lead to maladaptive behaviors like arrogance, or turning down something that was good for you because you thought you were too good for it. Now, not all distorted thinking is a problem; only some of it leads to maladaptive behavior. So if you're a therapist, you might want to help people reframe only the distorted stuff that's driving maladaptive behavior. But if you're teaching reframing, what's called cognitive reappraisal, in a session, then you might want to teach it more generally, for other things in life too.

So what are you doing within this space? You described it quite well, but what role are you playing here, apart from debating in the lab whether positive distorted thinking is distorted thinking?

Yeah, so I was very concerned about depression, because I've known so many people, so many friends and family members, who have struggled with it at different points in time. I got into the project because I wanted to understand what can be done about it. I collaborated with a social media expert, Johan Bollen, and Lorenzo Lorenzo-Luaces, who is a clinical psychologist. We came together, along with my advisor, Jennifer Trueblood, and designed an experiment to study social media and depression, and whether you can do anything about it. We designed this little experiment where people saw tweets that we actually generated with ChatGPT, with distorted and non-distorted language, and we tested whether training people to identify distortions changed what they liked and retweeted online.

Awesome. And how did that work? You just generated heaps of tweets, got people to look through them, showed them what was distorted and what was not, and then tracked their behavior?

So a lot of social media work is observational rather than experimental, and this was an experimental study. We had an interaction block where people saw these little tweets, with a little heart and retweet button. This was designed by a really amazing lab mate I had called Gunner, who made these really nice-looking, realistic tweets. We wanted to see what people were liking and retweeting, so we generated 30 tweets that were distorted and 30 tweets that were not distorted.
Lorenzo made sure these tweets were valid in some sense: the ones ChatGPT thought had distortions actually did have distortions, and the ones it thought did not, did not. So we could measure how people interacted with them. We measured depression with what's called the PHQ-9, a really short nine-question scale based on the APA's diagnostic manual for depression. It asks really simple questions, about appetite, sleeplessness, things often associated with depression, so you get a pretty good measure of how depressed a person is, and then you can see what they're interacting with. We also had a training block that sometimes came before the interaction block and sometimes after, so we could train people either before or after they interacted with the tweets. And we found that if you train people to identify distortions, they liked and retweeted distorted content a little less.

Hmm. So basically, giving people a little more awareness of what distorted thinking is reduces how much of it they tend to condone when they see these tweets.

Yeah, this was pretty inspired by a lot of social media research on misinformation, which is about identifying true and false information. There, people have shown that short trainings on how misinformation works, or even just prompting people to think about misinformation, can change what they like and retweet. So training people to identify distortions, using a very short training document written by Lorenzo (it's actually online, you can read it if you want), can help people identify these pretty well and change their behavior. Now, what's even more interesting, or concerning, is that in our study, depressed people interacted with distorted content more than people with lower levels of depression. And previous work in the lab also showed that people who reported having depression produced more distorted content.

So the next question would be whether this can produce negative reinforcing cycles for people interacting with content online, because the algorithm is well known to push more controversial or divisive posts. That just spirals; I wouldn't say a self-fulfilling prophecy, more akin to a catch-22, a chicken-and-egg problem: it keeps pushing these distorted views, and when they try to get away they can't, because it comes back again and again. If they start off distorted, they end up depressed; if they're depressed, they produce distorted content.

Yeah, so the kind of dynamics you're describing have actually been explored a lot with regard to political polarization, but the way this works for mental health is not very well understood, in my opinion. Is it distorted language, or is it just political polarization producing this kind of stuff? There's other interesting work which argues that moral-emotional content is often the stuff that goes viral online.
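For readers who want a feel for how a design like this might be analyzed, here is a minimal sketch on synthetic data. The column names, sample sizes, and effect sizes are all illustrative assumptions, not the study's actual data or results.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic stand-in for a design like the one described:
# 30 distorted + 30 non-distorted tweets, training before or after
# the interaction block, and a PHQ-9 score per participant.
rows = []
for p in range(200):
    phq9 = int(rng.integers(0, 28))    # PHQ-9 depression score, 0-27
    trained_first = p % 2 == 0         # training before vs. after interaction
    for t in range(60):
        distorted = t < 30
        rate = 0.30                    # baseline like/retweet probability
        rate += 0.10 * distorted * (phq9 / 27)    # toy effect: depression raises engagement with distorted tweets
        rate -= 0.08 * distorted * trained_first  # toy effect: training lowers engagement with distorted tweets
        rows.append((p, phq9, trained_first, distorted, rng.random() < rate))

df = pd.DataFrame(
    rows, columns=["participant", "phq9", "trained_first", "distorted", "engaged"]
)

# Mean engagement with distorted vs. non-distorted tweets,
# split by whether the distortion training came first.
print(df.groupby(["trained_first", "distorted"])["engaged"].mean().unstack("distorted"))
```

On data shaped like this, the trained-first rows show lower engagement with distorted tweets, which is the qualitative pattern Eeshan reports.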
And this kind of moral-emotional content is also somewhat related to distorted thinking. So there's that connection too, which we drew out a little bit in our paper.

So it's not only related; you're also more likely to see such content online, because that's the kind of thing that goes viral.

Technically, that paper looked at moral-emotional language, which is somewhat associated with depression and with distorted content, but it isn't distorted content by definition; there's just this overlap.

Awesome. Given those three aspects of your research we've spoken about, I guess the next question, which we kind of ask everybody, is: if ethics are on holiday and you have all the money in the world, what would be your dream next research project?

Well, the one I'm really, really excited about is mapping out these human and machine representations. There, data and human constraints are reasonable constraints, but I would really scale it up if I could. With these pairwise similarity judgments, the number of judgments you need grows really quickly, roughly quadratically, with the number of images you have. So I would really try to map out what human representations look like, then really carefully map out what machine representations look like, and do fine-grained analyses of why they differ: do machines represent what humans do, and if not, why not? So I'm actually living the dream, because I'm already doing this project; it just needs to be upscaled. I would recruit tens of thousands of people to participate.

Make it a betting platform. That's how you do it.

Yeah, yeah.

And the next question, Dr. Eeshan Hasan: what is your science hot take?

So I'm going to mention two, okay? One of them is about the wisdom of the crowds. One of the important findings, and this goes back to our first question, actually, is that wisdom of the crowds and collective intelligence are super effective when people are making independent decisions. And I think we focus a lot on performance, but not on independence, in terms of decision-making.

So what do you mean by that?

If everyone gets correlated information and everyone does the same task extremely well, there's not too much to be gained in terms of collective intelligence. If you had the exact same person many times over, you'd actually gain nothing at all; you'd just multiply your labor force. But if you have diverse opinions and different schools of thought, then you have real potential in combining information. I think the way we design many of our systems right now, we ask questions about performance, but not so much about ensuring independence and de-correlation of opinions. So that's my hot take: I feel we should focus on that more.
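A quick simulation makes the independence point concrete. This is a standard statistical illustration rather than anything from Eeshan's own work, and the parameter values are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
truth, sigma, n_judges, n_trials = 100.0, 10.0, 25, 10_000

def crowd_rmse(rho):
    """RMSE of the crowd's mean estimate when judges' errors share a
    pairwise correlation rho (equicorrelated: shared + private noise)."""
    shared = rng.normal(0.0, sigma * np.sqrt(rho), size=(n_trials, 1))
    private = rng.normal(0.0, sigma * np.sqrt(1.0 - rho), size=(n_trials, n_judges))
    estimates = truth + shared + private   # each judge: truth + correlated error
    crowd_mean = estimates.mean(axis=1)
    return np.sqrt(np.mean((crowd_mean - truth) ** 2))

for rho in (0.0, 0.3, 0.7):
    print(f"rho={rho:.1f}: crowd RMSE ~ {crowd_rmse(rho):.2f}")

# Independent judges (rho = 0) give RMSE of about sigma / sqrt(n_judges);
# as rho grows, averaging stops helping, because every judge is making
# the same shared mistake: the "correlated information" problem.
```

With 25 judges whose individual error is 10, independent opinions shave the crowd's error to about 2, while a correlation of 0.7 leaves it above 8. Individual performance is identical in both cases; only the independence differs.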
The other one is that I think we should legitimately study machine cognition. Right now, when I'm justifying a lot of my work, I say I'm using machine cognition to understand how humans work, or, if I'm studying machines, I explain that doing this will actually improve performance on some task. But I actually think that studying machine cognition is a legitimate thing that we could have more and more journals and studies and departments on: literally studying how these neural networks work.

Like, how machines think.

How machines think, yeah. So you have cognitive psychology departments which are taking over a lot of cognitive science, but cognitive science as a field also exists, and I was lucky enough to be part of Indiana University, which actually has a cognitive science program. So I have a great appreciation for cognition independent of human cognition. And I think machine cognition is so important right now, and yet it's always justified in terms of some downstream application. I think it's a legitimate subject of basic research at this point.

So, long story short, you want a major called the psychology of machines.

That's exactly what I'm trying to argue against: we don't need the "psychology," just cognition. We don't need to appeal to humans and psychology; we can study these machines as inherently interesting.

That is a fascinating perspective. Well, on that note, thank you, Dr. Hasan. Thank you, Dr. Hussain. And thank you to our audience for joining us for another episode of the Smooth Brain Society. Until the next one, ciao. Ciao! Yeah. Ciao.