Now we're starting with a really general solution, this model that is pretty good at everything, and the trick is how do we get this general model to focus up, right? Like, focus in on medicine. Be better at that specific task.

Hi everyone, and welcome to another episode of EMPlify. I'm your host, Sam Ashoo. Before we dive into this month's episode, I want to say thank you for joining us. I sincerely hope that you find it to be helpful and informative for your clinical practice, and I want to remind you that you can go to ebmedicine.net, where you will find our three journals, Emergency Medicine Practice, Pediatric Emergency Medicine Practice, and Evidence-Based Urgent Care, and a multitude of other resources, like the EKG course, the laceration course, interactive clinical pathways, just tons of information to support your practice and help you in your patient care. And now, let's jump into this month's episode.

Hey, I'm Jack Teitel.

Great. Thanks, Jack, for joining us on the podcast today. I asked you to come on as a special guest because we've had a lot of interest recently on artificial intelligence in medicine, and I cannot think of someone with more expertise than you. You have a pretty significant background in healthcare and artificial intelligence. Tell me where that journey started for you.

Totally. Well, it started all the way back in college for me. I actually tripped and fell into an AI lab where we were doing a lot of population health level stuff. So looking at social media, tracking flu trends through Twitter, analyzing real-time outbreaks of food poisoning, again via Twitter, all that kind of stuff, and then that kind of led to my passion in AI and also for healthcare. From there, I went and actually started leading an AI lab at a hospital network up in upstate New York. This was originally a software lab that they called an innovation team. Basically, different department heads or researchers who wanted custom software built and had a little bit of budget to spare could come to us, and we would build it for them. And they brought me in to see what AI was all about. This is about 10 years ago now.

Mm-hmm. Wow.

Yeah, this is about 10 years ago.
We started working in AI; obviously at that time it looked very different than it does now, a lot of machine learning algorithms, a lot of computer vision. And so I ran that team for about five years, grew it to about five of us, worked on everything from predicting surgical outcomes using chart data, to image analysis, helping providers read x-rays and ultrasounds better, doing some things like flow analysis of patients in the hospital, helping do analyses of how physicians interact with the medical record to help optimize that process. All sorts of things. After doing that for five years, I switched over to Blue Cross Blue Shield and moved down south, where it's a little bit warmer than upstate New York.

Yes.

Started working for Blue Cross Blue Shield doing things like fraud detection and member profiling for claims overpayment. And for the past five years, I've been working in just AI consulting: general custom AI work for various companies, including a lot of health tech companies, sports medicine, medical bill coding, that kind of stuff. The past year of that has been actually as my own company, Title AI. So yeah, quite a long track record in healthcare and in AI specifically, touching a lot of different points along the way. Also, I used to teach as well. Back when I was working in New York, I was adjunct at the local university, where I taught a course, mostly based on my own experience, for PhD students on how to apply machine learning, deep learning, and AI concepts in the healthcare space effectively. So not just how do you build good models, but how do you deploy them? How do you validate them? How do you make sure people are actually using them? All that stuff that can slip through the cracks a lot of the time.

Now this is fascinating, because I think most of the people who are listening think of artificial intelligence as something that was born like in 2023, maybe the end of 2024. But this has been a work in progress for decades. Is that right?

Yeah, I could talk about the history of AI for the whole podcast, but I'll contain myself here and just do a little quick summary. AI kinda started in the fifties, and that was mathematicians like Alan Turing, for example; some of you may have heard of him, godfather of AI, godfather of computer science.
The Turing test, right?

Yeah, exactly. The guy who came up with the Turing test. They were just theorizing about what AI could look like, what neural networks could look like, what the mathematical foundations of these were. That stayed theoretical until about the nineties, roughly, when people started actually being able to implement neural networks. And those are nothing like the AI we see these days. They were five layers as opposed to a thousand layers, and a thousand parameters instead of a trillion parameters, which is the size we're looking at now. So much smaller, really task specific, but they worked and it was cool. And then around the late 2010s, 2016, 2017, that kind of era, is when we started being able to run these on GPUs. And that is when AI kind of took off, because now it stopped being theoretical or super simple and we started being able to build really serious models. And that kind of culminated in 2023: GPT-3.5 came out, ChatGPT as everybody knows it. And AI moved from this niche research space, where there were already a lot of us implementing it for many years and getting really cool results, to being just generally good at everything, as opposed to having to be really trained hard on task-specific things, and it entered the public consciousness and became AI as everybody sees it today. Which is pretty cool.

Yeah, so again, you think in 2023 we finally hit that culmination where our tech level and our software finally just coincided in this miraculous point where it was good enough to offer it to the general public as a tool, instead of just something that was in a closed lab at a university?

Yeah, it was the perfect storm, because people had been working on the software side of things since the fifties; that's all the theory. The hardware side of things had finally caught up, right? We'd been able to run things on GPUs for five, six years, we were finally learning how to chain multiple GPUs together and train on these huge clusters of GPUs, and also the internet, right? Having data to train these models on was really important. These initial models are trained on all of the data on the internet.
113 00:06:16,482 --> 00:06:19,632 That's how they know things about the world and without that data there to 114 00:06:19,632 --> 00:06:22,872 train the models, it wouldn't matter how good our hardware is or how good our 115 00:06:22,872 --> 00:06:24,642 software is, we still wouldn't have AI. 116 00:06:24,642 --> 00:06:28,212 So this kind of perfect storm conditions happened and we were able to build these 117 00:06:28,212 --> 00:06:30,752 models that are generally intelligent. 118 00:06:30,752 --> 00:06:32,455 They were called foundational models. 119 00:06:32,455 --> 00:06:34,975 So they're no longer super task specific. 120 00:06:35,095 --> 00:06:37,225 They're pretty good at everything. 121 00:06:37,642 --> 00:06:40,125 So that is when people paid attention. 122 00:06:40,125 --> 00:06:43,475 It was like, wow, I can use this in my job and what I do every day 123 00:06:43,607 --> 00:06:44,027 Mm-hmm. 124 00:06:44,175 --> 00:06:47,445 And it wasn't just somebody in a really niche field with a fine tuned 125 00:06:47,445 --> 00:06:50,415 model saying, I can use this model to do one specific thing in my day. 126 00:06:50,625 --> 00:06:55,610 It was everybody in every industry around the world saying, oh, I can use this. 127 00:06:55,717 --> 00:06:56,007 Yeah. 128 00:06:56,050 --> 00:07:00,510 And that's why it blew up, ChatGPT reached a million users, I think four 129 00:07:00,510 --> 00:07:04,080 or five times faster than the last app which I believe is Instagram. 130 00:07:04,330 --> 00:07:08,512 Now I think that's still fascinating because I feel like we've gone from hey, 131 00:07:08,512 --> 00:07:13,162 here's a cool tool that you can apply in any field to do anything and the 132 00:07:13,162 --> 00:07:18,082 pendulum is swinging now as we applied it in healthcare and we went, oh okay. 133 00:07:18,112 --> 00:07:21,022 It can help, but man, it sure makes a lot of mistakes. 134 00:07:21,182 --> 00:07:24,722 And now the pendulum is swinging and people are going what you really need 135 00:07:24,722 --> 00:07:27,692 is this AI that's tailored for medicine. 136 00:07:27,872 --> 00:07:29,299 It's now task specific. 137 00:07:29,299 --> 00:07:31,669 The pendulum seems to be swinging the other direction. 138 00:07:32,152 --> 00:07:32,692 Yeah. 139 00:07:32,792 --> 00:07:34,382 That is happening everywhere. 140 00:07:34,482 --> 00:07:37,772 Being generally good at everything does not make you a good doctor. 141 00:07:37,789 --> 00:07:40,337 Usually Usually these doctors have to go to school. 142 00:07:40,337 --> 00:07:43,644 They have extra years of schooling and residency and all these things which 143 00:07:43,644 --> 00:07:46,631 hyper specialize them to be good at this one specific thing: medicine. 144 00:07:46,871 --> 00:07:49,391 You can't expect just an average person who's good at everything 145 00:07:49,391 --> 00:07:50,621 to come in and be able to do that. 146 00:07:50,951 --> 00:07:54,531 So what we're starting to see is exactly what we've been seeing before. 147 00:07:55,031 --> 00:07:59,441 Basically, the way that I've always been doing AI is task specific, which 148 00:07:59,441 --> 00:08:01,001 is still the best way to do things. 149 00:08:01,188 --> 00:08:01,608 Mm-hmm. 150 00:08:01,851 --> 00:08:05,821 But it used to be we have to build a very specific model architecture, get a very 151 00:08:05,821 --> 00:08:09,391 specific set of data, and train it to solve a really specific set of problems. 
So we were starting from nothing and building a really narrow solution. Now we're starting with a really general solution, this model that is pretty good at everything. And the trick is how do we get this general model to focus up, right? Like, focus in on medicine. Be better at that specific task. So instead of starting from nothing and building something narrow, we're starting from something very wide and focusing it in, into kind of a focused beam. Which is how a lot of folks have been approaching this and seeing a lot of success. Along with that, LLMs have some very foundational failings, like memory and knowledge and that kind of stuff. So building structures around these AI systems to shore up those areas where they're a little weaker has led to some really big gains as well.

One of the things that's fascinating is that the large language models actually have a really good command of language. If you can give them something that then just needs to be summarized or searched or understood and comprehended, that seems like a no-brainer. Really haven't seen any issues there. It's more pointing it in the right direction for the content; that seems to be the specific target that we're hitting for healthcare now in these newer products. Does that sound right?

Yeah, totally. It's hard. It's a hard problem to solve. Think about how much more medical knowledge there is now than there was a decade ago or 20 years ago. How much more training doctors have to do and papers people have to read to stay on top of things. It's a hard task, and the expectations for these models are high, right? This is not just a general assist tool. This is supposed to give me really specific answers for really specific edge cases in my specialty, right?

Yeah.

We're expecting the single model to perform as well as top-tier specialists and sort through all of the medical literature on the internet to find the answer.

Yeah. Easy.

Come on.

Yeah, easy, right? Why isn't it better?

So yeah, of course it makes mistakes. It does hallucinate.
That's a foundational problem here. But the question isn't, how close is this to perfect, in my mind. The question is always, what's the alternative? What are we comparing this against? Holding it to the standard of "this needs to be perfect or else" is tough, right? It feels good. It feels like that's what it should be. It's a robot. It's a machine. It should be perfect. But as long as it makes an improvement over the current system, it's really good. Now the tricky part comes in. So in AI, we really talk, especially in healthcare, about human in the loop, right?

Mm-hmm.

These models are really good, but they do make mistakes and have zero accountability for their mistakes, right? No one's suing the AI model for getting something wrong; they're suing you. You have to be in the loop, supervising and monitoring these outputs. But it is right 85% of the time, 90% of the time maybe. So it's really hard for a doctor who is stressed and overworked and having to do documentation and see a thousand patients every day to take the time to sit there and meticulously check an answer that is right 90% of the time.

Hmm.

It's really easy for that to just slip through the cracks. So in my mind, a lot of the work that still needs to be done is building systems and policies and things around these tools so that everybody is using the same tools, and using them in the same way, and using them in a way that doesn't easily lead to mistakes. Right now, I wonder if you have the same experience with this, but to me it feels like the wild West out here. Everybody is trying their own tools and coming up with their own workflows, and some people do

Yeah.

a lot better than others, and some people do a lot worse, and everybody's just trying to figure this out on their own, and that's not really sustainable, you know?

Yeah, and no one's really sharing their benchmarks publicly. I couldn't tell you today that ChatGPT's error rate is 15%, Gemini's is 8%, OpenEvidence's is 3.5%. We don't know, but it sure would be nice, because we do have some model for that in healthcare.
We never really quantify on an individual physician basis what the error rates are. Like, I couldn't tell you what my error rate is compared to my partner. But we can tell you that if you follow, say, an institutional guideline or one of our organizational guidelines on, say, acute coronary syndrome: you know, what's your risk of a heart attack? What's your risk of missing a heart attack if you have two normal blood tests and a normal EKG and you've been observed in the emergency department? Well, it's less than 1%. So at least I can have that conversation with a patient and say, "Hey. We did these two blood tests. They're very high sensitivity. We did this EKG. You have chest pain, but your chances are less than 1%. You don't have very many risks. You're looking at about this. So you know, happy to put you in observation and do a stress and do all these things, but it's not gonna really reduce that risk very much. It's not a very valuable use of your time and of our healthcare resources, but I'm happy to do it for you if you're still super concerned." And most of those patients appreciate that kind of quantification. I don't really get that when I'm using a model and saying, "Hey, what's the answer to this question?" It seems like it's very confident in itself, but unable to tell me its own level of inaccuracy.

Totally. Okay, there's so much to unpack there. In terms of confidence, it's a very interesting phenomenon with AI. It's not an accident. So there's a couple things at play here. AI mostly is trained on the internet. I don't know if you've been on the internet, but people tend to speak very confidently. That's how the AI learned to speak, right? It's a predictive model. It samples from its distribution of what it knows, and what it knows is the internet. And the internet is very confident. You don't see a lot of people just questioning life and being uncertain or talking in very statistical terms. It is out there, but it's not the main thing out there.

That's right, my blog of uncertainty is not very popular.

Exactly.
281 00:13:36,680 --> 00:13:40,760 So overcoming that is tough, just inherently. 282 00:13:41,260 --> 00:13:44,950 And the second thing, which you may have noticed, it's not quite as much 283 00:13:44,950 --> 00:13:46,900 confidence, but more reinforcement. 284 00:13:47,287 --> 00:13:48,770 AI likes to tell you that you're right. 285 00:13:49,384 --> 00:13:50,034 It does it a lot. 286 00:13:50,034 --> 00:13:53,020 I'm sure you've seen, especially ChatGPT is the biggest offender of this. 287 00:13:53,020 --> 00:13:53,735 Claude is a little bit better. 288 00:13:54,235 --> 00:13:57,875 But ChatGPT you say something and it'll be, oh, what a great idea. 289 00:13:57,875 --> 00:13:58,865 You are a genius. 290 00:13:58,865 --> 00:13:59,975 This is why you're right. 291 00:13:59,975 --> 00:14:01,445 And then you go, wait a minute, I'm not right. 292 00:14:01,445 --> 00:14:02,512 And it's like, oh, wow. 293 00:14:02,752 --> 00:14:04,102 You are so insightful. 294 00:14:04,102 --> 00:14:05,812 Absolutely, I was wrong before. 295 00:14:05,812 --> 00:14:06,742 Now you're right. 296 00:14:06,892 --> 00:14:10,402 There's been studies that if you just prompt an AI to ask it, 297 00:14:10,462 --> 00:14:11,842 should you flip your answer or not? 298 00:14:11,842 --> 00:14:15,232 About 75% of the time it will, whether the original answer was right or not. 299 00:14:15,542 --> 00:14:15,852 But the reason 300 00:14:15,890 --> 00:14:16,220 Wow. 301 00:14:16,272 --> 00:14:18,145 it does this is because they train it to. 302 00:14:18,365 --> 00:14:21,365 People, it turns out, like hearing that they're right. 303 00:14:21,420 --> 00:14:25,187 These AI systems are built to make money, right? 304 00:14:25,187 --> 00:14:26,537 They want people to use them. 305 00:14:26,867 --> 00:14:29,137 And so it's tuned for that engagement level. 306 00:14:29,137 --> 00:14:32,697 They actually have a specific phase of training towards the end called 307 00:14:32,869 --> 00:14:36,192 reinforcement based on human feedback, where they tune the model to give 308 00:14:36,192 --> 00:14:39,677 answers that people prefer and people prefer to hear that they're right. 309 00:14:39,977 --> 00:14:44,267 Some models, like I said, Claude is doing a lot better about mitigating that, but 310 00:14:44,267 --> 00:14:46,357 it's just baked into human psychology. 311 00:14:46,537 --> 00:14:50,407 That we like to hear that they're right and these models are built to cater to 312 00:14:50,407 --> 00:14:53,010 us humans And so that's what they do. 313 00:14:53,507 --> 00:14:56,470 Okay, back to the benchmark stuff, 'cause this is actually very interesting. 314 00:14:56,910 --> 00:14:58,800 There are a lot of benchmarks out there. 315 00:14:59,300 --> 00:15:02,750 You should always be very skeptical of benchmarks unless they are brand new. 316 00:15:02,879 --> 00:15:06,659 The thing about AI systems, is they are very good at gaming things. 317 00:15:06,955 --> 00:15:08,319 Very good at memorization. 318 00:15:08,465 --> 00:15:10,999 Very good at fitting to a specific task. 319 00:15:11,355 --> 00:15:12,040 Like a high school student? 320 00:15:12,690 --> 00:15:15,294 Kind of, the AI systems will cheat wherever possible. 321 00:15:15,314 --> 00:15:19,524 Cheat and memorize as much possible to get the best results on a specific task. 322 00:15:19,790 --> 00:15:22,574 Which is called overfitting, that's the technical name for it, overfitting. 323 00:15:22,584 --> 00:15:24,184 You're overfitting to the task. 
And what happens when you do that is you become less good at everything else. So you can think of it as memorizing the answers to a test without actually learning the content. So when people release benchmarks publicly, there is incentive for people to train the models to those benchmarks, and they might perform

Mm-hmm.

really well on those benchmarks and be kind of bad when you actually go to use them in the real world. This is something that Meta was a little bit notorious for, especially with Llama 4, which they released last year. Llama 4 came out, open source model. Big splash in the AI community. It was killing all the benchmarks, looking really good. And then people went to use it and they're like, wait a minute.

Yeah.

While it would be awesome to say, let's get public benchmarks out there to test these models and see how they stack up in the real world, it's not really possible, because the systems will become better at those benchmarks and it's hard to know: are these systems better at those benchmarks because they've been out for a year or two and all the AI developers know these are the benchmarks that people are looking at? Or because the models have gotten better?

This is like the AI can pass a board examination for a specialty certification in some area of medicine, but it's still not a good doctor.

Yeah, exactly. There's so much to being a doctor beyond doing one test. It can't encompass all there is. So defining the intelligence of the system in a really huge area with just a few questions isn't gonna work. But more importantly, having one measure for that across everybody: it's like if everybody took the same test in the world, and the test questions stayed the same every time, and then everybody knew what the test questions were and knew that if they did that, they would be hired into a job. You're gonna study those questions. You're not gonna learn how to do the job. It's just kind of about this implicit reward and bias of these models and these systems and how it's trained. And there's not really much to do about it. It's not like the developers are wrong for doing that, or we should say, hey, don't do that. Again, it comes back to human nature and how we build out these systems.
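Jack's "memorizing the answers to a test" line maps onto a standard demonstration. Below is a deliberately silly scikit-learn sketch (a classical model, not an LLM, and not something from the conversation) where the labels are pure noise, so there is nothing real to learn: the model looks perfect on questions it has already seen and guesses randomly on new ones, which is overfitting in miniature.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))        # 200 "questions" with 20 meaningless features
y = rng.integers(0, 2, size=200)      # labels are random noise: there is nothing real to learn

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An unconstrained decision tree will happily memorize the training set.
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

print("accuracy on questions it has seen:    ", model.score(X_train, y_train))  # ~1.0
print("accuracy on questions it has not seen:", model.score(X_test, y_test))    # ~0.5, i.e. guessing
```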
So in my opinion, it's very important to have your own personal test. What are the things that you do? What are problems that you've tried to solve with AI where it didn't do good? Or where it did do good? If you're using these tools on a regular basis, it's not that hard to just take one question you ask it a week, and the answer that you liked or didn't like, put that down in a spreadsheet or something, and then the next time a new system comes out, run your 15, 20 questions through it, and then see how well does this do versus the last thing I tried. It takes a little while to build that up. It's not something I would say take a week and do it all right now. No, do it as you naturally work, but it'll really be helpful for just having a little bit of empirical evidence, particularly related to you and the work that you do, that you can use to judge these systems by. And I think that, again, comes back to these kind of departmental policies, right? Wouldn't it be great if your department had a set of benchmarks that they don't release publicly, that you use internally within your department? You can't train on those, you can't tune on those. You can't game that system, and it tells you how well that model is gonna work for your team, in your area, with the patient population that you have and the problems that you deal with. I think having those kind of system-level benchmarks, as opposed to general global-level benchmarks, is really the future of measuring the success of these models. But they take time and money to build and maintain, and it's tough to invest in that because you don't get an immediate reward, right? It's this kind of slow-rolling thing where eventually you'll adopt better models and improve efficiencies, but it's tough to justify the budget for those kind of things right now, I think.
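A minimal sketch of the personal test set Jack describes above: keep the questions you actually ask, along with what a good answer should mention, and re-run them whenever a new model or tool appears. The cases, the reference phrases, and the placeholder_model stub are all made up for illustration; nothing here is tied to any particular product's API.

```python
from typing import Callable

# Two toy entries stand in for the spreadsheet of real questions you accumulate over time.
# reference_phrase is just something a good answer should mention, in your own judgment.
CASES = [
    {"question": "First-line drug and route for adult anaphylaxis?", "reference_phrase": "epinephrine"},
    {"question": "Agent to reverse warfarin in major bleeding?", "reference_phrase": "vitamin k"},
]

def run_personal_benchmark(ask_model: Callable[[str], str], cases: list[dict]) -> float:
    """Run every saved question through one model and crudely score the answers.

    ask_model is whatever function calls the system being evaluated. The "score"
    here is only whether the reference phrase appears in the answer, so a human
    still reviews anything flagged REVIEW; the point is the bookkeeping, not the metric.
    """
    hits = 0
    for case in cases:
        answer = ask_model(case["question"])
        ok = case["reference_phrase"] in answer.lower()
        hits += ok
        print(f"{'PASS' if ok else 'REVIEW'} | {case['question']}")
    return hits / len(cases)

if __name__ == "__main__":
    # Swap this stub for a call to whichever new model or tool you want to compare this month.
    def placeholder_model(question: str) -> str:
        return "I would give epinephrine 0.3 mg IM."

    score = run_personal_benchmark(placeholder_model, CASES)
    print(f"Automatic pass rate: {score:.0%} of {len(CASES)} saved questions")
```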
Yeah, so tomorrow I'm gonna release the Sam Ashoo Benchmark System that's completely black-boxed and closed, and it's just gonna be a number associated with the LLMs, and no one's gonna know how I scored them or why. They're just gonna have to take my word that, hey, this one is performing better than that one for today. And I'll run it again in a month, and I'll let you know if that's still a live, running, recurring benchmark score.

It's funny, you say it as a joke, but people are actually starting to do things like that.

Yeah.

Some of these bigger companies... there's whole companies now who specialize in building benchmarks. And they're starting to do more along those lines of: we're not really gonna tell you a lot of the details. We're gonna tell you the general concept that we're testing for, and we're gonna have a public testing set that you can use to measure yourself against. And then we're gonna have a private testing set that is different, and we're gonna test it independently. And that's kind of how it's gonna work. But those are trust-based systems, which are tough in healthcare.

Yeah, exactly. That's right. Trust me, it's okay.

Yeah, it's tough. Doctors like to see evidence trails and papers and published results and study criteria, and if you publish that stuff, then the AI is going to be able to game your system. So it's really a tough situation that we're in here.

And then some of that can be overcome in the system design, right? So like we talked about how it might tailor a response because it knows that people like to be reinforced or told that they're correct, but you don't have to have that version, right? It would be ideal to say, well, in healthcare I've got a bunch of doctors using this system. Can we just turn this feature off? You know, can I just toggle that to off so that it just gives me an honest answer and doesn't try to reinforce a yes or try and reinforce my bias as I'm entering this question? Is that not possible?

I love it. The answer is no, not possible, but let's break it down as to why.

Okay.

There's two things at play in AI systems that are different and that most people conflate as being the same. AI models have a set of parameters. You can think of an AI model as a brain, which was the original inspiration for how these models work:
it's layers of artificial neurons that are all connected to each other, and each connection has a weight. You can think of it as a neuron firing: it takes a certain amount of charge into a neuron, and then it fires an electrical signal, which passes to the next neuron, and that continues along this whole big network to take raw inputs, which to us are our senses and to the LLM are text, language, and translate them into outputs. So those weights are what we train when we are building an LLM. That is the difference between one model and the next. That is those training phases I talked about, where it's trained on all the data on the internet and then it's tuned on things like human reinforcement. Those are baked into the weights. You can't just turn those on or off. It would be like deleting neurons in your brain to try and make your brain perform better at one thing versus another. It's just not really possible. What you could do is fine-tune your own network, right? One that doesn't have those specific post-training steps, or has post-training steps geared more specifically to the things you want. But it turns out that is super, super expensive to do and to do well. That's why there's only maybe three or four frontier-level AI models. We have Claude, Gemini, Grok, ChatGPT that are all kind of state of the art, and you don't have Joe Schmo in his basement building his own state-of-the-art AI system. It's pretty expensive to do.

Yeah.

Takes a lot of data. Takes a lot of compute resources. Takes a very long time to train these systems. So the customization layer, though, is very interesting. 'Cause there's all sorts of tools and harnesses and scaffolds and frameworks that you can put these models into to make them more specialized and perform in a manner that you want. So things like prompt optimization, right? Giving the AI a persona, telling it explicitly things that you want or don't want.
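To make the layers-of-weighted-connections picture concrete, here is a toy forward pass in NumPy. It only illustrates what "parameters" means; real LLMs are transformers with billions to trillions of weights, but the basic idea of weighted connections feeding the next layer is the same, and "training" means adjusting exactly these numbers.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny 2-layer network: 4 inputs -> 3 hidden "neurons" -> 1 output.
# Every arrow between neurons is one trainable weight; training means adjusting these numbers.
W1, b1 = rng.normal(size=(4, 3)), np.zeros(3)
W2, b2 = rng.normal(size=(3, 1)), np.zeros(1)

def forward(x: np.ndarray) -> np.ndarray:
    hidden = np.maximum(0, x @ W1 + b1)   # each hidden neuron "fires" based on its weighted inputs
    return hidden @ W2 + b2               # the output layer combines the hidden activations

x = np.array([0.2, -1.0, 0.5, 0.7])       # raw input (for an LLM this would be encoded text)
print("output:", forward(x))

# This toy model has 4*3 + 3 + 3*1 + 1 = 19 parameters; frontier LLMs have on the order of a trillion.
print("parameter count:", W1.size + b1.size + W2.size + b2.size)
```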
Giving the AI a memory. So built into those weights is a form of memory, because it can remember everything that it's seen. But what if you want it to specifically focus on medical research, as opposed to weighting medical research equally with Elon Musk Twitter posts? You know, that happens in the AI system. It considers all the information on the internet, not just a specific subset. You can give it memory features. These are systems that are built in between the user and the AI, where you ask a question, and then you can imagine you have a whole database of information, and you can go and query that database of information using your question and say, give me everything related to this question. And then it can automatically pull all the stuff out of that database related to the question, paste that right on top of the question, and say, answer this question according to this information. And then you feed that to the LLM, and now it has a really focused chunk of memory that is directly associated with what you need, from your set of resources that you say, this is the stuff that I want you to pay attention to, and that way we can reinforce what we want it to know. That's how systems like OpenEvidence work, with those kind of intermediary layers. They're generally referred to as RAG systems: retrieval-augmented generation.
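A stripped-down sketch of the retrieval-augmented generation pattern just described: search a curated library, paste the most relevant passages on top of the question, and only then hand the whole thing to the model. The keyword-overlap retrieval and the call_llm stub are placeholders (production systems typically use embedding search and a real model API), and this is not a description of how any particular product such as OpenEvidence is actually built; the shape of the pipeline is the point.

```python
# A toy "library" standing in for a curated set of guidelines or journal articles.
LIBRARY = [
    "Guideline A: low-risk chest pain with two negative troponins and a normal ECG can be discharged ...",
    "Guideline B: sepsis screening recommends early lactate measurement and blood cultures ...",
    "Guideline C: febrile infants under 28 days require a full sepsis workup ...",
]

def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    """Crude keyword-overlap retrieval; real RAG systems use vector embeddings instead."""
    q_words = set(question.lower().split())
    scored = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return scored[:k]

def call_llm(prompt: str) -> str:
    """Placeholder for whatever model you actually call."""
    return "[model answer would appear here]"

def answer(question: str) -> str:
    context = "\n".join(retrieve(question, LIBRARY))
    prompt = (
        "Answer using ONLY the sources below. If they do not contain the answer, say so.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    return call_llm(prompt)

print(answer("How should low-risk chest pain with negative troponins be managed?"))
```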
And then there's all sorts of other things. Like, you can fine-tune models: you can do your own little training phase and tweak the actual weights of the system to be what you want, although it gets really tough. The more you tweak them, the further away they get from the general intelligence and the more specialized they get. So they might become better at one thing but worse at everything else. And then also these systems are moving so fast, right? If you train your own system, what happens next year when ChatGPT 6 comes out and it's way better than your system is? So you have to retrain. It becomes this kind of expensive game of maintenance and whatnot. Dropping these large language models into these kind of scaffolds, things like RAG and things like, you know, enforcing citations on the system and checking the output against the model: there's a lot of small steps before and after you get the output from the LLM system that you can use to verify, reduce hallucinations, and focus its intent on specific things.

That has been, in my opinion, the best way that we've adapted these models to perform well in specialized domains. And I think that's most of what you're seeing when you see Bob's AI for Doctors or whatever new specialty tool comes out. There's so many of them these days. A lot of them are using those types of techniques, as opposed to changing the foundations of how the model is trained.

Yeah, I mean, so things like OpenEvidence, Doximity, DocsGPT, UpToDate's AI, all of these guys have taken some form of a large language model and just fed it or restricted it to their library, whatever their library is, and said, you will search just this library and nothing else, and you're not allowed to access the internet. Just find the answer to the question from this library, and if it isn't there, just say, hey, it isn't here. I don't have any evidence to answer your question. Right?

It sounds simple, right?

It should be, right?

It should be.

Trying to rein it in is proving to be more difficult.

Yeah, it's a lot of work. You're fighting the nature of these systems a little bit when you're doing that. And there's a lot of techniques, and like I said, this is the wild West. This technology basically hit the public three years ago. These types of systems for controlling and specializing these models, two years ago. This is all brand new stuff, and AI moves fast, and people are coming up with new tools and techniques every day. And the amount of people working on this is staggering. And the knowledge sharing is excellent, but it's still a new tool. And there are new systems, and we're still figuring out really what are the best ways. I wouldn't say there's any established best practices for how to build these systems, or things like, this is the right way to do it, this is the wrong way to do it. We're still figuring it out as we go.

Like today, in April 2026, if a colleague came to me and said, hey, I'm going to use AI to answer this question, or I want to do a study on whether AI is helpful in the clinical setting with my residents or something, I wouldn't recommend that they just run with out-of-the-box GPT and have at it.
I'd say, use something that's actually tailored to medicine and has some kind of medical library, and not one of just the commercial ones. "Yeah, I use Gemini," you know; I'm like, well, that's good for you, but I wouldn't trust anything that's coming out of any of those models to treat a patient right now anyway.

I think there is, at least for the time being, an understanding that, yeah, okay, there is this general commercial product, which is great. You wanna throw in some PDFs, crunch some numbers, have it spit out an Excel sheet, whatever, that's great. But if you're actually treating someone in front of you, or putting someone's health at risk, you need to be careful which one of these things you're gonna put some trust in, right?

And even that still isn't gonna give you a confidence percentage or anything of that sort.

Exactly. I fully agree, but you know, which of the systems? There's so many coming out. Epic has some tools. Anthropic, with Claude, has tools specifically designed for clinicians. ChatGPT is starting to get into that realm, although they're more on the consumer health side of things. There's all sorts of boutique tools that are coming out there. OpenEvidence. Which tool do you use, and why, and how?

Right. And how.

What are the cases where you should be using it? What are the cases where you need to be checking it? How do you check it? These are systems that, at this point, people just learn by doing.

Yeah.

I bet that you didn't take a course on how to use these systems or have somebody tell you, oh, this is the best way. You probably learned by doing it and talking to your colleagues.

Yeah, absolutely.

And different people figured out different things. And that's not sustainable, right? You can't establish real systems that work effectively just through everybody figuring it out on their own. So to me, that's where the real challenge lies: in adoption of AI systems. There's a fun stat that came out from... ooh, I can't remember. I think it was outta MIT, I'm not sure.
But they said 95% of AI pilots fail in deployment. Most of the reason for that, it's not because the models aren't good. These models work. They are good, they give good results. It's because people don't know what they're doing.

People, the users? Or people, the programmers? Or the people in between? Which people are we talking about?

All of 'em. All of 'em. All people.

Okay.

Not just the end users. The folks who are developing the systems don't think about the end users enough. How many AI tools are just a new screen, a new tool, a new app that you have to go out of your workflow for, go load up another thing, copy and paste stuff into, copy and paste the answer back?

Yeah.

There's a lot of friction in those systems, which, regardless of how good they are, makes them hard to use. Doctors' time has already been eaten away by dealing with patients and dealing with charting and all this stuff. How much time do they have to open up a new screen outside their system, learn to use a new app, copy and paste between systems? It's a tough ask.

Yeah.

How do you validate these systems moving forward over time? I have this great story. One of the very first AI tools I built, again, this is eight, nine years ago, whatever. It's a great tool. It was predicting skilled nursing facility placement after total joint replacement: who's gonna have to go to a nursing home for recovery after hip or knee replacement. And it worked great. We were getting fantastic results for about a year, until the system changed and we went to bundled payments, and the incentive structure for how people were sent to these nursing facilities changed. Almost overnight, our model went from very accurate, very predictive (we were doing intervention planning: what happened if the patient stopped smoking or lost weight, how would that change their outcome) to... it didn't work at all.
We had to completely retrain it to get it to work with the new system, and that took time and it took monitoring. If we weren't watching it, we never would have known that it didn't work. A lot of people aren't thinking ahead enough as to model drift and how priorities change over time. What would happen if we had AI before COVID, right? If AI had become popular five years before COVID and had been trained to work in those types of hospital systems, and then COVID hit and people were relying on AI... the whole medical system was turned on its head. The problems people were dealing with on a daily basis changed entirely, and it wasn't stuff that was out there on the internet for you to train on. These AI systems would have gone from really useful to, you know...

Garbage.

Yeah, exactly. So: friction in the system, how easy is it for people to actually adopt it? How well is this system going to hold up over time and consistently give you good results, so that you can have confidence that those are good results? And then training. The number of people that I've seen open up a specialized AI tool, ask it one question, not get the answer they like, and then never use it again. It's staggering. So many people try that, and usually it's not because the AI tool is bad. It's because they didn't ask it the question in a good way. Or they asked it something that it never stood a chance of answering, or they didn't give it enough context around the situation. So just knowing how to interact with these tools and utilize them in the way that they were designed, to enable them to give you the good results that they can, is a skill, and it's not one that's usually taught. Usually when an AI system is deployed, they show benchmarks that they've tested on. They show, here's what we predict if everybody in the hospital uses it 100% of the time, here's the amount of savings you're gonna get. They deploy it, and then they move on to the next system.

The next hospital, the next AI system, the next thing.

Sticking around to do that maintenance, the training, the monitoring, the dirty work: that is where all of the value lies, right?
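The monitoring Jack is describing can be sketched very simply: keep scoring recent predictions against the outcomes that actually happened, compare that to the accuracy the model had at validation, and raise a flag when it drifts. The baseline figure, window size, and threshold below are invented for illustration, as is the simulated payment-model change.

```python
from collections import deque

BASELINE_ACCURACY = 0.88   # accuracy measured when the model was validated (assumed figure)
ALERT_DROP = 0.10          # flag if recent accuracy falls this far below baseline

recent = deque(maxlen=200)  # rolling window of the last 200 scored cases

def record_outcome(predicted: bool, actual: bool) -> None:
    """Call this once the real outcome (e.g. SNF placement) is known for a prediction."""
    recent.append(predicted == actual)
    if len(recent) == recent.maxlen:
        accuracy = sum(recent) / len(recent)
        if accuracy < BASELINE_ACCURACY - ALERT_DROP:
            print(f"DRIFT ALERT: rolling accuracy {accuracy:.2f} vs baseline {BASELINE_ACCURACY:.2f};"
                  " time to investigate or retrain")

# Example: simulate an incentive change that suddenly breaks the predictions after case 200.
for i in range(250):
    model_is_still_valid = i < 200
    record_outcome(predicted=True, actual=model_is_still_valid)
```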
685 00:31:49,416 --> 00:31:52,026 You can build the best system in the world, but if nobody 686 00:31:52,026 --> 00:31:55,716 knows how to use it or uses it regularly, it doesn't really matter. 687 00:31:56,544 --> 00:32:00,234 At least in our community hospital here, I'm starting to get involved in 688 00:32:00,264 --> 00:32:04,914 an AI curriculum for residents and physicians, and teaching them more 689 00:32:04,914 --> 00:32:08,864 about the differences, what models are all about, what prompts are all about, 690 00:32:08,864 --> 00:32:14,624 and how the way that you ask a question can lead down a path that is gonna 691 00:32:14,649 --> 00:32:17,984 give you an incorrect answer simply based on how you ask the question. 692 00:32:18,134 --> 00:32:23,564 Which seems a little strange to say, because you're dealing with a 693 00:32:23,564 --> 00:32:28,754 model that has a command of human language that surpasses most humans, 694 00:32:29,024 --> 00:32:33,794 and yet you still have to be careful how you phrase a question, as though 695 00:32:33,794 --> 00:32:35,054 you were speaking to a human. 696 00:32:35,054 --> 00:32:38,258 Like, I could go up to a resident and say, Hey, don't you think this person 697 00:32:38,258 --> 00:32:41,528 is a little too high risk to be put in observation for their chest pain? 698 00:32:41,618 --> 00:32:45,098 And just by the nature of how I ask that question, that resident's gonna go, 699 00:32:45,098 --> 00:32:47,568 oh, clearly he thinks this person is too high risk. 700 00:32:47,568 --> 00:32:48,468 Let me reevaluate. 701 00:32:48,468 --> 00:32:49,188 Lemme take a look. 702 00:32:49,188 --> 00:32:50,058 Okay. 703 00:32:50,058 --> 00:32:50,068 Yeah. 704 00:32:50,068 --> 00:32:50,078 Okay. 705 00:32:50,078 --> 00:32:51,738 I'm just gonna, I don't really know the answer, I'm just gonna 706 00:32:51,738 --> 00:32:52,608 reinforce what he is saying. 707 00:32:52,608 --> 00:32:53,958 So yeah, actually, you're right. 708 00:32:53,958 --> 00:32:56,648 This person is too high risk, and thank you for bringing that to my attention. 709 00:32:56,648 --> 00:32:57,818 I'm gonna go change that right now. 710 00:32:57,978 --> 00:33:03,583 Now, I don't expect that from an artificial intelligence LLM, but that is 711 00:33:03,583 --> 00:33:05,769 the case. How I ask a question matters. 712 00:33:06,003 --> 00:33:06,363 Yeah. 713 00:33:06,553 --> 00:33:09,823 I mean, there's so much detail to communication. 714 00:33:10,033 --> 00:33:13,414 First off, when you're communicating with a human being, it's not just text, right? 715 00:33:13,414 --> 00:33:14,654 Unless you're literally texting them. 716 00:33:14,654 --> 00:33:18,034 And we all know texting can lead to miscommunications plenty of times. 717 00:33:18,273 --> 00:33:18,873 Yes. 718 00:33:18,911 --> 00:33:22,744 You're dealing with voice inflection, cadence, facial expression, 719 00:33:22,744 --> 00:33:25,319 body language, speed of speech. 720 00:33:25,586 --> 00:33:29,769 These different factors beyond the words that you're saying are really important. 721 00:33:30,059 --> 00:33:33,029 But also, at the end of the day, what's even more interesting, it's not just 722 00:33:33,419 --> 00:33:37,689 that if you ask an AI a question in different ways it'll give different answers.
723 00:33:37,809 --> 00:33:40,689 You can ask it the same question, using the same language, and still 724 00:33:40,689 --> 00:33:43,753 get different answers, because these are probabilistic models at the end 725 00:33:43,753 --> 00:33:45,133 of the day, they're not deterministic. 726 00:33:45,133 --> 00:33:47,833 They don't always do the same thing, but what they basically do is, you 727 00:33:47,833 --> 00:33:52,306 feed it a list of words, which is your prompt, and then at the end, all it 728 00:33:52,306 --> 00:33:53,746 does is try to predict the next word. 729 00:33:53,866 --> 00:33:56,116 And then it takes that next word, adds it onto the list, and 730 00:33:56,116 --> 00:33:57,226 tries to predict the next word. 731 00:33:57,466 --> 00:34:00,226 And it just does that over and over again until it gives you a response. 732 00:34:00,406 --> 00:34:03,886 And the way it chooses that next word is, it looks at all of the words it knows 733 00:34:04,156 --> 00:34:09,188 and it says, out of these words, which are the most likely to be the next word, and 734 00:34:09,188 --> 00:34:11,024 then it does a probabilistic sampling. 735 00:34:11,024 --> 00:34:12,524 It doesn't always pick the top one. 736 00:34:12,701 --> 00:34:17,524 It picks out of the top 10, 15, whatever, based on how likely each 737 00:34:17,524 --> 00:34:19,174 one is to be the right answer. 738 00:34:19,501 --> 00:34:21,884 So it can give you different responses. 739 00:34:21,884 --> 00:34:25,968 Now, these days, that's not as big of an issue as it used to be a few years ago. 740 00:34:26,144 --> 00:34:27,448 But it's still something to be aware of. 741 00:34:27,448 --> 00:34:29,184 These are not humans. 742 00:34:29,184 --> 00:34:30,234 These are not human brains. 743 00:34:30,234 --> 00:34:32,184 These are probabilistic reasoning machines. 744 00:34:32,184 --> 00:34:34,734 So the amount of information you give it matters. 745 00:34:34,854 --> 00:34:39,184 The more context and information you dump into that string, into that 746 00:34:39,286 --> 00:34:43,006 prompt, the better it's gonna do, because that's going to basically 747 00:34:43,006 --> 00:34:44,536 clue in that probabilistic model: 748 00:34:44,536 --> 00:34:46,456 okay, what is the right thing to answer this? 749 00:34:46,456 --> 00:34:48,706 If you just leave a vague question and you don't add enough 750 00:34:48,706 --> 00:34:50,386 detail, it's not gonna know. 751 00:34:51,126 --> 00:34:54,156 That said, the way you ask questions and the way you expect responses and 752 00:34:54,156 --> 00:34:57,656 interact with these systems is, again, very important, just in terms of how 753 00:34:57,656 --> 00:35:00,746 the system functions, how it's been trained, how it deals with information. 754 00:35:01,123 --> 00:35:04,393 It's very tricky, and there are some guidelines that you can build out, but it 755 00:35:04,393 --> 00:35:06,013 is a little bit of a feel kind of thing. 756 00:35:06,906 --> 00:35:10,599 Yeah, and just to add another layer of complexity, there's the reference 757 00:35:10,599 --> 00:35:12,279 or the knowledge base that it has. 758 00:35:12,369 --> 00:35:15,399 Which, if you're using one of these general models, is updated, 759 00:35:15,399 --> 00:35:16,419 you know, intermittently. 760 00:35:16,599 --> 00:35:17,109 You don't know. 761 00:35:17,109 --> 00:35:17,769 You could ask it. 762 00:35:17,769 --> 00:35:20,469 You could ask ChatGPT, Hey, when was your knowledge base last updated? 763 00:35:20,469 --> 00:35:21,129 And it'll tell you.
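To make the next-word prediction and probabilistic sampling described above concrete, here is a minimal sketch of top-k sampling. The vocabulary, probabilities, and top-k value are made-up illustrations, not taken from any real model.

import random

# Toy next-word distribution a language model might produce for some prompt.
# The candidate words and their probabilities are entirely hypothetical.
next_word_probs = {
    "observation": 0.30,
    "chest": 0.22,
    "sepsis": 0.15,
    "surgery": 0.12,
    "pneumonia": 0.08,
    "monitoring": 0.07,
    "evaluation": 0.06,
}

def sample_next_word(probs: dict[str, float], top_k: int = 5) -> str:
    """Pick the next word from the top-k most likely candidates,
    weighted by probability, so the same prompt can yield different words."""
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    words, weights = zip(*top)
    return random.choices(words, weights=weights, k=1)[0]

# Run it a few times: the most likely word usually wins, but not always.
for _ in range(5):
    print(sample_next_word(next_word_probs))

Generation then repeats this step, appending each sampled word to the prompt and sampling again, which is why two identical prompts can drift into different answers.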
764 00:35:21,453 --> 00:35:25,233 I think it was last month, I sat and I went through some of the commercial 765 00:35:25,233 --> 00:35:28,713 ones, OpenEvidence and DocsGPT and a couple of others, and some of the general 766 00:35:28,713 --> 00:35:33,443 models, and I just asked them about a medication that I knew the manufacturer 767 00:35:33,443 --> 00:35:37,246 had withdrawn from the US market at the end of December of last year. 768 00:35:37,296 --> 00:35:40,146 And I said, what is the adult dosing of this medicine? 769 00:35:40,356 --> 00:35:43,879 That's all I asked, and my assumptions in asking that question were, its 770 00:35:43,879 --> 00:35:46,759 knowledge base is probably not up to date enough to answer this 771 00:35:46,759 --> 00:35:48,409 question and know it's withdrawn. 772 00:35:48,649 --> 00:35:52,363 And second, it should at least know from my IP address or from my previous 773 00:35:52,363 --> 00:35:54,193 conversations that I'm a US physician. 774 00:35:54,193 --> 00:35:55,506 So I'm working in the United States. 775 00:35:55,506 --> 00:35:59,556 And sure enough, out of like the five or six models I tested, only one of 776 00:35:59,556 --> 00:36:04,176 them threw back an answer saying, here's the adult dosing, but first you 777 00:36:04,176 --> 00:36:08,856 should know this has been withdrawn from the US market as of December 24th. 778 00:36:08,856 --> 00:36:11,736 And then it spit out the adult dosing, because that's what I asked it. 779 00:36:11,886 --> 00:36:15,929 All the others just gave me the answer and maybe suggested some follow-up questions. 780 00:36:15,929 --> 00:36:17,099 And I went, gosh, okay. 781 00:36:17,219 --> 00:36:19,499 And so then, I went back and asked each one. 782 00:36:19,889 --> 00:36:22,559 Okay, when was your knowledge base last updated? 783 00:36:22,559 --> 00:36:26,549 And half of them told me, the other half said, yeah, that's not in my 784 00:36:26,549 --> 00:36:29,699 programming to answer and I can't give you the answer to that question. 785 00:36:29,879 --> 00:36:33,481 So again, it's one of those scenarios where you have to be like four steps 786 00:36:33,481 --> 00:36:37,201 ahead to understand where the output is coming from, because ultimately it's 787 00:36:37,201 --> 00:36:41,271 still a machine, and if you don't know the parameters, how good it is, what 788 00:36:41,271 --> 00:36:44,644 library it's accessing, when its knowledge base was last updated, yada. 789 00:36:45,082 --> 00:36:47,719 You take this answer and you kind of go, okay. 790 00:36:47,959 --> 00:36:51,882 And on top of that, I don't know any physician who's verifying the output. 791 00:36:52,509 --> 00:36:54,339 If it agrees with me, it must be right. 792 00:36:54,519 --> 00:36:57,669 If it disagrees with me, then I'm gonna verify or start looking 793 00:36:57,669 --> 00:36:59,041 at some of those citations. 794 00:36:59,468 --> 00:37:02,078 And like we said, these models are tuned to agree with you. 795 00:37:02,378 --> 00:37:03,208 That's their job. 796 00:37:03,788 --> 00:37:04,328 Is to agree with you. 797 00:37:04,346 --> 00:37:04,886 It's crazy. 798 00:37:05,010 --> 00:37:06,863 And it's tough to get 'em to break outta that. 799 00:37:06,863 --> 00:37:08,953 A lot of companies are trying to get 'em to break outta 800 00:37:08,953 --> 00:37:10,393 that behavior, and it's tough. 801 00:37:10,660 --> 00:37:11,903 It's an interesting problem out there. 802 00:37:12,420 --> 00:37:14,880 Hey, listen, we have taken up enough of your time.
803 00:37:14,880 --> 00:37:17,040 I really appreciate you being on the podcast. 804 00:37:17,040 --> 00:37:20,910 I would love to have you on here again to share your knowledge and to continue 805 00:37:20,910 --> 00:37:24,173 to engage in this dialogue about artificial intelligence in medicine. 806 00:37:24,223 --> 00:37:27,373 Tell me before we leave, if there's someone out there looking for 807 00:37:27,433 --> 00:37:30,733 a consultant and they wanna try and reach you, is there a webpage, 808 00:37:30,733 --> 00:37:31,513 or how do they get ahold of you? 809 00:37:31,783 --> 00:37:33,638 Yeah, totally, LinkedIn is always good. 810 00:37:33,638 --> 00:37:34,898 I'm always active on there. 811 00:37:34,975 --> 00:37:38,588 Just Jack Teitel, T-E-I-T-E-L, on LinkedIn, or you could go to my 812 00:37:38,588 --> 00:37:44,165 website, title T-I-T-L-E - ai.com, and there's a little contact form there. 813 00:37:44,371 --> 00:37:45,285 Always happy to chat. 814 00:37:46,178 --> 00:37:46,838 Fantastic. 815 00:37:46,838 --> 00:37:47,378 Thanks, Jack. 816 00:37:47,408 --> 00:37:50,772 I really appreciate you being on the show, sharing your knowledge, and I 817 00:37:50,772 --> 00:37:52,512 look forward to talking with you again. 818 00:37:52,614 --> 00:37:53,364 All right, thanks, Sam. 819 00:37:53,364 --> 00:37:53,994 Thanks for having me on. 820 00:37:53,994 --> 00:37:55,104 It's a pleasure chatting with you. 821 00:37:56,056 --> 00:37:57,876 And that's a wrap for this month's episode. 822 00:37:57,916 --> 00:38:00,496 I hope you found it educational and informative. 823 00:38:00,696 --> 00:38:05,556 Don't forget to go to ebmedicine.net to read the article and claim your CME. 824 00:38:05,726 --> 00:38:08,916 And of course, check out all three of the journals and the multitude of 825 00:38:08,916 --> 00:38:13,276 resources available to you, for emergency medicine, pediatric emergency 826 00:38:13,276 --> 00:38:15,546 medicine, and evidence based urgent care. 827 00:38:15,856 --> 00:38:17,826 Until next time, everyone be safe.