WEBVTT

00:00:00.000 --> 00:00:05.560
[MUSIC PLAYS]

00:00:05.560 --> 00:00:11.480
ASHLEY LLORENS: I’m Ashley Llorens with Microsoft Research.
In this podcast series, I share conversations with&nbsp;&nbsp;

00:00:11.480 --> 00:00:16.640
fellow researchers about the latest developments&nbsp;
in AI models, the work we’re doing to understand&nbsp;&nbsp;

00:00:16.640 --> 00:00:21.880
their capabilities and limitations, and ultimately&nbsp;
how innovations like these can have the greatest&nbsp;&nbsp;

00:00:21.880 --> 00:00:29.400
benefit for humanity. Welcome to AI Frontiers.
Today, I’ll speak with Ida Momennejad. Ida works&nbsp;&nbsp;

00:00:29.400 --> 00:00:34.280
at Microsoft Research in New York City at&nbsp;
the intersection of machine learning and&nbsp;&nbsp;

00:00:34.280 --> 00:00:40.320
human cognition and behavior. Her current work&nbsp;
focuses on building and evaluating multi-agent&nbsp;&nbsp;

00:00:40.320 --> 00:00:46.920
AI architectures, drawing from her background in&nbsp;
both computer science and cognitive neuroscience.&nbsp;&nbsp;

00:00:46.920 --> 00:00:53.320
Over the past decade, she has focused on studying&nbsp;
how humans and AI agents build and use models of&nbsp;&nbsp;

00:00:53.320 --> 00:00:55.000
their environment.
[MUSIC FADES]&nbsp;

00:00:55.000 --> 00:01:02.600
Let’s dive right in. We are undergoing a paradigm&nbsp;
shift where AI models and systems are starting to&nbsp;&nbsp;

00:01:02.600 --> 00:01:10.040
exhibit characteristics that I and, of course,&nbsp;
many others have described as more general&nbsp;&nbsp;

00:01:10.040 --> 00:01:16.120
intelligence. When I say general in this context,&nbsp;
I think I mean systems with abilities like&nbsp;&nbsp;

00:01:16.120 --> 00:01:21.920
reasoning and problem-solving that can be applied&nbsp;
to many different tasks, even tasks they were not&nbsp;&nbsp;

00:01:21.920 --> 00:01:27.640
explicitly trained to perform. Despite all of&nbsp;
this, I think it’s also important to admit that&nbsp;&nbsp;

00:01:27.640 --> 00:01:34.840
we—and by we here, I mean humanity—are not very&nbsp;
good at measuring general intelligence, especially&nbsp;&nbsp;

00:01:34.840 --> 00:01:41.360
in machines. So I’m excited to dig further into&nbsp;
this topic with you today, especially given&nbsp;&nbsp;

00:01:41.360 --> 00:01:47.440
your background and insights into both human and&nbsp;
machine intelligence. And so I just want to start&nbsp;&nbsp;

00:01:47.440 --> 00:01:54.040
here: for you, Ida, what is general intelligence?
IDA MOMENNEJAD: Thank you for asking that. We&nbsp;&nbsp;

00:01:54.040 --> 00:01:58.640
could look at general intelligence from the&nbsp;
perspective of history of cognitive science and&nbsp;&nbsp;

00:01:58.640 --> 00:02:06.600
neuroscience. And in doing so, I’d like to mention&nbsp;
its discontents, as well. There was a time where&nbsp;&nbsp;

00:02:06.600 --> 00:02:12.600
general intelligence was introduced as the idea&nbsp;
of a kind of intelligence that was separate from&nbsp;&nbsp;

00:02:12.600 --> 00:02:18.240
what you knew or the knowledge that you had on a&nbsp;
particular topic. It was this general capacity to&nbsp;&nbsp;

00:02:18.240 --> 00:02:23.960
acquire different types of knowledge and reason&nbsp;
over different things. And this was at some point&nbsp;&nbsp;

00:02:23.960 --> 00:02:29.600
known as g, and it’s still known as g. There have&nbsp;
been many different kinds of critiques of this&nbsp;&nbsp;

00:02:29.600 --> 00:02:36.480
concept because some people said that it’s very&nbsp;
much focused on the idea of logic and a particular&nbsp;&nbsp;

00:02:36.480 --> 00:02:41.560
kind of reasoning. Some people made cultural&nbsp;
critiques of it. They said it’s very Western&nbsp;&nbsp;

00:02:41.560 --> 00:02:46.600
oriented. Others said it’s very individualistic.&nbsp;
It doesn’t consider collective or interpersonal&nbsp;&nbsp;

00:02:46.600 --> 00:02:52.400
intelligence or physical intelligence. There&nbsp;
are many critiques of it. But at the core of it,&nbsp;&nbsp;

00:02:52.400 --> 00:02:57.280
there might be something useful and helpful.&nbsp;
And I think the useful part is that there&nbsp;&nbsp;

00:02:57.280 --> 00:03:04.120
could be some general ability in humans, at&nbsp;
least the way that g was intended initially,&nbsp;&nbsp;

00:03:04.120 --> 00:03:07.880
where they can learn many different things&nbsp;
and reason over many different domains,&nbsp;&nbsp;

00:03:07.880 --> 00:03:16.120
and they can transfer ability to reason over a&nbsp;
particular domain to another. And then in the AGI,&nbsp;&nbsp;

00:03:16.120 --> 00:03:21.480
or artificial general intelligence, notion of&nbsp;
it, people took this idea of many different&nbsp;&nbsp;

00:03:21.480 --> 00:03:30.560
abilities or skills for cognitive and reasoning&nbsp;
and logic problem-solving at once. There have&nbsp;&nbsp;

00:03:30.560 --> 00:03:35.200
been different iterations of what this&nbsp;
means in different times. In principle,&nbsp;&nbsp;

00:03:35.200 --> 00:03:39.840
the concept in itself does not provide the&nbsp;
criteria on its own. Different people at&nbsp;&nbsp;

00:03:39.840 --> 00:03:43.720
different times provide different criteria&nbsp;
for what would be the artificial general&nbsp;&nbsp;

00:03:43.720 --> 00:03:48.240
intelligence notion. Some people say that they&nbsp;
have achieved it. Some people say we are on the&nbsp;&nbsp;

00:03:48.240 --> 00:03:53.440
brink of achieving it. Some people say we will&nbsp;
never achieve it. However, there is this idea,&nbsp;&nbsp;

00:03:53.440 --> 00:03:58.720
if you look at it from an evolutionary and&nbsp;
neuroscience and cognitive neuroscience lens,&nbsp;&nbsp;

00:03:58.720 --> 00:04:05.160
that in evolution, intelligence has evolved&nbsp;
multiple times in a way that is adaptive to the&nbsp;&nbsp;

00:04:05.160 --> 00:04:11.680
environment. So there were organisms that needed&nbsp;
to be adaptive to the environment where they were,&nbsp;&nbsp;

00:04:11.680 --> 00:04:17.280
that intelligence has evolved in multiple&nbsp;
different species, so there’s not one solution&nbsp;&nbsp;

00:04:17.280 --> 00:04:22.480
to it, and it depends on the ecological niche&nbsp;
that that particular species needed to adapt&nbsp;&nbsp;

00:04:22.480 --> 00:04:29.200
to and survive in. And it’s very much related to&nbsp;
the idea of being adaptive of certain kinds of,&nbsp;&nbsp;

00:04:29.200 --> 00:04:34.240
different kinds of problem-solving that&nbsp;
are specific to that particular ecology.&nbsp;&nbsp;

00:04:34.240 --> 00:04:39.360
There is also this other idea that there is&nbsp;
no free lunch and the no-free-lunch theorem,&nbsp;&nbsp;

00:04:39.360 --> 00:04:45.240
that you cannot have one particular machine&nbsp;
learning solution that can solve everything.&nbsp;&nbsp;

00:04:45.240 --> 00:04:51.840
So the idea of general artificial intelligence&nbsp;
in terms of an approach that can solve everything&nbsp;&nbsp;

00:04:51.840 --> 00:04:57.840
and there is one end-to-end training that can be&nbsp;
useful to solve every possible problem that it has&nbsp;&nbsp;

00:04:57.840 --> 00:05:03.800
never seen before seems a little bit untenable&nbsp;
to me, at least at this point. What does seem&nbsp;&nbsp;

00:05:03.800 --> 00:05:09.360
tenable to me in terms of general intelligence&nbsp;
is if we understand and study, the same way that&nbsp;&nbsp;

00:05:09.360 --> 00:05:15.880
we can do it in nature, the foundational&nbsp;
components of reasoning, of intelligence,&nbsp;&nbsp;

00:05:15.880 --> 00:05:20.280
of different particular types of intelligence,&nbsp;
of different particular skills—whether it has&nbsp;&nbsp;

00:05:20.280 --> 00:05:24.720
to do with cultural accumulation of written&nbsp;
reasoning and intelligence skills, whether it&nbsp;&nbsp;

00:05:24.720 --> 00:05:31.160
has to do with logic, whether it has to do with&nbsp;
planning—and then working on the particular types&nbsp;&nbsp;

00:05:31.160 --> 00:05:37.160
of artificial agents that are capable of putting&nbsp;
these particular foundational building blocks&nbsp;&nbsp;

00:05:37.160 --> 00:05:42.840
together in order to solve problems they’ve never&nbsp;
seen before. A little bit like putting Lego pieces&nbsp;&nbsp;

00:05:42.840 --> 00:05:49.400
together. So to wrap it up, to sum up what I just&nbsp;
said, the idea of general intelligence had a more&nbsp;&nbsp;

00:05:49.400 --> 00:05:56.080
limited meaning in cognitive science, referring to&nbsp;
human ability to have multiple different types of&nbsp;&nbsp;

00:05:56.080 --> 00:06:02.080
skills for problem-solving and reasoning. Later&nbsp;
on, it was also, of course, criticized in terms&nbsp;&nbsp;

00:06:02.080 --> 00:06:09.560
of the specificity of it and ignoring different&nbsp;
kinds of intelligence. In AI, this notion has&nbsp;&nbsp;

00:06:09.560 --> 00:06:16.640
been having many different kinds of meanings. If&nbsp;
we just mean it’s a kind of a toolbox of general&nbsp;&nbsp;

00:06:16.640 --> 00:06:21.080
kinds of intelligence for something that can&nbsp;
be akin to an assistant to a human, that could&nbsp;&nbsp;

00:06:21.080 --> 00:06:26.800
make sense. But if we go too far and use it in the&nbsp;
kind of absolute notion of general intelligence,&nbsp;&nbsp;

00:06:26.800 --> 00:06:32.840
as it has to encompass all kinds of intelligence&nbsp;
possible, that might be untenable. And also&nbsp;&nbsp;

00:06:32.840 --> 00:06:38.800
perhaps we shouldn’t think about it in terms of a&nbsp;
lump of one end-to-end system that can get all of&nbsp;&nbsp;

00:06:38.800 --> 00:06:45.640
it down. Perhaps we can think about it in terms&nbsp;
of understanding the different components that we&nbsp;&nbsp;

00:06:45.640 --> 00:06:52.120
have also seen emerge in evolution in different&nbsp;
species. Some of them are robust across many&nbsp;&nbsp;

00:06:52.120 --> 00:06:57.160
different species. Some of them are more specific&nbsp;
to some species with a specific ecological niche&nbsp;&nbsp;

00:06:57.160 --> 00:07:04.800
or specific problems to solve. But I think perhaps&nbsp;
it could be more helpful to find those cognitive&nbsp;&nbsp;

00:07:04.800 --> 00:07:10.880
and other interpersonal, cultural, different&nbsp;
notions of intelligence; break them down into&nbsp;&nbsp;

00:07:10.880 --> 00:07:17.040
their foundational building blocks; and then&nbsp;
see how a particular artificial intelligence&nbsp;&nbsp;

00:07:17.040 --> 00:07:24.840
agent can bring together different skills from&nbsp;
this kind of a library of intelligence skills in&nbsp;&nbsp;

00:07:24.840 --> 00:07:31.080
order to solve problems it’s never seen before.
LLORENS: There are two concepts that jump out&nbsp;&nbsp;

00:07:31.080 --> 00:07:39.160
at me based on what you said. One is artificial&nbsp;
general intelligence and the other is humanlike&nbsp;&nbsp;

00:07:39.160 --> 00:07:45.640
intelligence or human-level intelligence. And&nbsp;
you’ve referenced the fact that, you know,&nbsp;&nbsp;

00:07:45.640 --> 00:07:50.160
oftentimes, we equate the two or at least&nbsp;
it’s not clear sometimes how the two relate&nbsp;&nbsp;

00:07:50.160 --> 00:07:55.960
to each other. Certainly, human intelligence&nbsp;
has been an important inspiration for what&nbsp;&nbsp;

00:07:55.960 --> 00:08:01.160
we’ve done—a lot of what we’ve done—in AI&nbsp;
and, in many cases, a kind of evaluation&nbsp;&nbsp;

00:08:01.160 --> 00:08:07.040
target in terms of how we measure progress or&nbsp;
performance. But I wonder if we could just back&nbsp;&nbsp;

00:08:07.040 --> 00:08:12.600
up a minute. Artificial general intelligence&nbsp;
and humanlike, human-level intelligence—how&nbsp;&nbsp;

00:08:12.600 --> 00:08:17.560
do these two concepts relate to you?
MOMENNEJAD: Great question. I like that you&nbsp;&nbsp;

00:08:17.560 --> 00:08:23.360
asked “to me” because I think it would be different&nbsp;
for different people. I’ve written about this,&nbsp;&nbsp;

00:08:23.360 --> 00:08:30.680
in fact. I think humanlike intelligence or&nbsp;
human-level intelligence would require performance&nbsp;&nbsp;

00:08:30.680 --> 00:08:38.800
that is similar to humans, at least behaviorally,&nbsp;
not just in terms of what the agent gets right,&nbsp;&nbsp;

00:08:38.800 --> 00:08:43.360
but also in terms of the kinds of mistakes and&nbsp;
biases that the agent might have. It should look&nbsp;&nbsp;

00:08:43.360 --> 00:08:49.560
like human intelligence. For instance, humans&nbsp;
show primacy bias, recency bias, [a] variety of&nbsp;&nbsp;

00:08:49.560 --> 00:08:56.080
biases. And this seems like it’s unhelpful in&nbsp;
a lot of situations. But in some situations,&nbsp;&nbsp;

00:08:56.080 --> 00:09:02.080
it helps to come [up] with fast and frugal solutions&nbsp;
on the go. It helps to summarize certain things&nbsp;&nbsp;

00:09:02.080 --> 00:09:07.800
or make inferences really fast that can help&nbsp;
in human intelligence. For instance, there is&nbsp;&nbsp;

00:09:07.800 --> 00:09:14.160
analogical reasoning. That is, there are different&nbsp;
types of intelligence that humans do. Now, if you&nbsp;&nbsp;

00:09:14.160 --> 00:09:20.760
look at what are tasks that are difficult and what&nbsp;
are tasks that are easier for humans and compare&nbsp;&nbsp;

00:09:20.760 --> 00:09:27.360
that to a, for instance, let’s say just a large&nbsp;
language model like GPT-4, you will see whether&nbsp;&nbsp;

00:09:27.360 --> 00:09:32.960
they find similar things simple and similar&nbsp;
things difficult or not. When they don’t find&nbsp;&nbsp;

00:09:32.960 --> 00:09:38.200
similar things easy or difficult, I think that&nbsp;
we should not say that this is humanlike per se,&nbsp;&nbsp;

00:09:38.200 --> 00:09:44.640
unless we mean for a specific task. Perhaps&nbsp;
on specific sets of tasks, an agent can be,&nbsp;&nbsp;

00:09:44.640 --> 00:09:51.080
can have human-level or humanlike intelligent&nbsp;
behavior; however, if we look overall, as long&nbsp;&nbsp;

00:09:51.080 --> 00:09:58.920
as there are particular skills that are more or&nbsp;
less difficult for one or the other, it might be&nbsp;&nbsp;

00:09:58.920 --> 00:10:05.400
not reasonable to compare them. That being said,&nbsp;
there are many things that some AI agent and even&nbsp;&nbsp;

00:10:05.400 --> 00:10:10.480
a [programming] language would be better [than]&nbsp;
humans at. Does that mean that they are generally&nbsp;&nbsp;

00:10:10.480 --> 00:10:14.840
more intelligent? No, it doesn’t because there&nbsp;
are also many things that humans are far better&nbsp;&nbsp;

00:10:14.840 --> 00:10:22.400
than AI at. The second component of this is the&nbsp;
mechanisms by which humans do the intelligent&nbsp;&nbsp;

00:10:22.400 --> 00:10:28.960
things that we do. We are very energy efficient.&nbsp;
With very little amount of energy consumption,&nbsp;&nbsp;

00:10:28.960 --> 00:10:34.200
we can solve very complicated problems. If you&nbsp;
put some of us next to each other or at least&nbsp;&nbsp;

00:10:34.200 --> 00:10:40.120
give a pen and paper to one of us, this can be&nbsp;
even a lot more effective; however, the amount&nbsp;&nbsp;

00:10:40.120 --> 00:10:46.920
of energy consumption that it takes in order for&nbsp;
any machine to solve similar problems is a lot&nbsp;&nbsp;

00:10:46.920 --> 00:10:55.160
higher. So another difference between humanlike&nbsp;
intelligence or biologically inspired intelligence&nbsp;&nbsp;

00:10:55.160 --> 00:11:01.920
and the kind of intelligence that is in silico&nbsp;
is efficiency, energy efficiency in general. And&nbsp;&nbsp;

00:11:01.920 --> 00:11:10.560
finally, the amount of data that goes into current&nbsp;
state-of-[the-art] AI versus perhaps the amount of&nbsp;&nbsp;

00:11:10.560 --> 00:11:17.960
data that a human might need to learn new tasks or&nbsp;
acquire new skills seem to be also different. So&nbsp;&nbsp;

00:11:17.960 --> 00:11:25.840
it seems like there are a number of different&nbsp;
approaches to comparing human and machine&nbsp;&nbsp;

00:11:25.840 --> 00:11:32.600
intelligence and deriving what are the criteria&nbsp;
for a machine intelligence to be more humanlike.&nbsp;&nbsp;

00:11:32.600 --> 00:11:38.160
But other than the conceptual aspect of it, it’s&nbsp;
not clear that we necessarily want something&nbsp;&nbsp;

00:11:38.160 --> 00:11:43.920
that’s entirely humanlike. Perhaps we want in&nbsp;
some tasks and in some particular use cases for&nbsp;&nbsp;

00:11:43.920 --> 00:11:50.560
the agent to be humanlike but not in everything.
LLORENS: You mentioned some of the ways in which&nbsp;&nbsp;

00:11:50.560 --> 00:11:57.840
human intelligence is inferior or has weaknesses.&nbsp;
You mentioned some of the weaknesses of human&nbsp;&nbsp;

00:11:57.840 --> 00:12:06.320
intelligence, like recency bias. What are some&nbsp;
of the weaknesses of artificial intelligence,&nbsp;&nbsp;

00:12:06.320 --> 00:12:13.080
especially frontier systems today? You’ve recently&nbsp;
published some works that have gotten into new&nbsp;&nbsp;

00:12:13.080 --> 00:12:18.040
paradigms for evaluation, and you’ve explored some&nbsp;
of these weaknesses. And so can you tell us more&nbsp;&nbsp;

00:12:18.560 --> 00:12:27.440
about that work and about your view on this?
MOMENNEJAD: Certainly. So inspired by a very&nbsp;&nbsp;

00:12:27.440 --> 00:12:34.000
long-standing tradition of evaluating cognitive&nbsp;
capacities—those Lego pieces that bring together&nbsp;&nbsp;

00:12:34.000 --> 00:12:41.040
intelligence that I was mentioning in humans and&nbsp;
animals—I have conducted a number of experiments,&nbsp;&nbsp;

00:12:41.040 --> 00:12:48.440
first in humans, and built reinforcement learning&nbsp;
models over the past more than a decade on the&nbsp;&nbsp;

00:12:48.440 --> 00:12:54.880
idea of multistep reasoning and planning.&nbsp;
It is in the general domain of reasoning,&nbsp;&nbsp;

00:12:54.880 --> 00:13:01.880
planning, and decision making. And I particularly&nbsp;
focused on what kind of memory representations&nbsp;&nbsp;

00:13:01.880 --> 00:13:08.800
allow brains and reinforcement learning models&nbsp;
inspired by human brain and behavior to be able&nbsp;&nbsp;

00:13:08.800 --> 00:13:15.000
to predict the future and plan the future and&nbsp;
reason over the past and the future seamlessly&nbsp;&nbsp;

00:13:15.000 --> 00:13:22.840
using the same representations. Inspired by the&nbsp;
same research that goes back in tradition to&nbsp;&nbsp;

00:13:22.840 --> 00:13:29.160
Edward Tolman’s idea of cognitive maps and&nbsp;
latent learning in the early 20th century,&nbsp;&nbsp;

00:13:29.160 --> 00:13:36.760
culminating in his very influential 1948 paper,&nbsp;
“Cognitive Maps in Rats and Men,” I sat down with&nbsp;&nbsp;

00:13:36.760 --> 00:13:43.040
a couple of colleagues last year—exactly this&nbsp;
time, probably—and we worked on figuring out if&nbsp;&nbsp;

00:13:43.040 --> 00:13:51.200
we can devise similar experiments to that in order&nbsp;
to test cognitive maps and planning and multistep&nbsp;&nbsp;

00:13:51.200 --> 00:13:57.120
reasoning abilities in large language models. So&nbsp;
I first turned some of the experiments that I had&nbsp;&nbsp;

00:13:57.120 --> 00:14:01.840
conducted in humans and some of the experiments&nbsp;
that were done by Edward Tolman on the topic in&nbsp;&nbsp;

00:14:01.840 --> 00:14:09.920
rodents and turned them into prompts for ChatGPT.&nbsp;
That’s where I started, with GPT-4. The reason I&nbsp;&nbsp;

00:14:09.920 --> 00:14:16.360
did that was that I wanted to make sure that I&nbsp;
will create some prompts that have not been in&nbsp;&nbsp;

00:14:16.360 --> 00:14:22.280
the training set. My experiments, although the&nbsp;
papers have been published, the stimuli of the&nbsp;&nbsp;

00:14:22.280 --> 00:14:27.440
experiments were not linguistic. They were&nbsp;
visual sequences that the human would see,&nbsp;&nbsp;

00:14:27.440 --> 00:14:32.080
and they would have to have some reinforcement&nbsp;
learning and learn from the sequences to make&nbsp;&nbsp;

00:14:32.080 --> 00:14:37.120
inferences about relationships between different&nbsp;
states and find what is the path that would&nbsp;&nbsp;

00:14:37.120 --> 00:14:44.400
give them optimal rewards. Very simple human&nbsp;
reinforcement learning paradigms. However, with&nbsp;&nbsp;

00:14:44.400 --> 00:14:51.080
different kind[s] of structures. The inspirations&nbsp;
that I had drawn from the cognitive maps works&nbsp;&nbsp;

00:14:51.080 --> 00:14:58.400
by Edward Tolman and others was in this idea that&nbsp;
in order for a creature, whether it’s a rodent,&nbsp;&nbsp;

00:14:58.400 --> 00:15:04.000
a human, or a machine, to be able to reason in&nbsp;
[multiple] steps, plan, and have cognitive maps,&nbsp;&nbsp;

00:15:04.000 --> 00:15:09.840
which is simply a representation of the&nbsp;
relational structure of the environment,&nbsp;&nbsp;

00:15:09.840 --> 00:15:16.160
in order for a creature to have these abilities or&nbsp;
these capacities, it means that the creature needs&nbsp;&nbsp;

00:15:16.160 --> 00:15:23.480
to be sensitive and adaptive to local changes&nbsp;
in the environment. So I designed the, sort of,&nbsp;&nbsp;

00:15:23.480 --> 00:15:33.000
the initial prompts and recruited a number of very&nbsp;
smart and generous-with-their-time colleagues who&nbsp;&nbsp;

00:15:33.000 --> 00:15:37.600
we sat together and created these prompts&nbsp;
in different domains. For instance, we also&nbsp;&nbsp;

00:15:37.600 --> 00:15:43.560
created social prompts. We also created the same&nbsp;
kind of graph structures but for reasoning over&nbsp;&nbsp;

00:15:43.560 --> 00:15:48.480
social structures. For instance, I say, Ashley’s&nbsp;
friends with Matt. Matt is friends with Michael.&nbsp;&nbsp;

00:15:48.480 --> 00:15:52.760
If I want to pass a message to Michael, what&nbsp;
is the path that I can choose? Which would be,&nbsp;&nbsp;

00:15:52.760 --> 00:15:58.120
I have to tell Ashley. Ashley will tell Matt. Matt&nbsp;
will tell Michael. This is very similar to another&nbsp;&nbsp;

00:15:58.120 --> 00:16:04.960
paradigm which was more like a maze, which would&nbsp;
be similar to saying, there is a castle; it has 16&nbsp;&nbsp;

00:16:04.960 --> 00:16:10.240
rooms. You enter Room 1. You open the door. It&nbsp;
opens to Room 2. In Room 2, you open the door,&nbsp;&nbsp;

00:16:10.240 --> 00:16:16.760
and so on and so forth. So you describe, using&nbsp;
language, the structure of a social environment&nbsp;&nbsp;

00:16:16.760 --> 00:16:23.440
or the structure of a spatial environment, and&nbsp;
then you ask certain questions that have to&nbsp;&nbsp;

00:16:23.440 --> 00:16:30.240
do with getting from A to B in this social or&nbsp;
spatial environment from the LLM, or you say,&nbsp;&nbsp;

00:16:30.240 --> 00:16:35.640
oh, you know, Matt and Michael don’t talk to each&nbsp;
other anymore. So now in order to pass a message,&nbsp;&nbsp;

00:16:35.640 --> 00:16:41.280
what should I do? So I need to find a detour. Or,&nbsp;
for instance, I say, you know, Ashley has become&nbsp;&nbsp;

00:16:41.280 --> 00:16:46.640
close to Michael now. So now I have a shortcut,&nbsp;
so I can directly give the message to Ashley,&nbsp;&nbsp;

00:16:46.640 --> 00:16:51.160
and Ashley can directly give the message to&nbsp;
Michael. My path to Michael is shorter now.&nbsp;&nbsp;
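
NOTE
A minimal sketch (not from the recording) of the message-passing example above.
It assumes the friendships form an undirected graph and uses breadth-first
search as a stand-in for "finding the path". "Dana" is a hypothetical extra
person, added only so that a detour exists; the study's actual graphs and
prompts are not reproduced here.
```python
from collections import deque
def shortest_path(edges, start, goal):
    """Breadth-first search over an undirected friendship graph."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    queue, parent = deque([start]), {start: None}
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk back through the parents to the start
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in sorted(neighbors.get(node, ())):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None  # goal unreachable
# "Me", "Ashley", "Matt", "Michael" follow the spoken example; "Dana" is hypothetical.
base = {("Me", "Ashley"), ("Ashley", "Matt"), ("Matt", "Michael"), ("Matt", "Dana"), ("Dana", "Michael")}
baseline = shortest_path(base, "Me", "Michael")
detour = shortest_path(base - {("Matt", "Michael")}, "Me", "Michael")  # Matt and Michael stop talking
shortcut = shortest_path(base | {("Ashley", "Michael")}, "Me", "Michael")  # Ashley becomes close to Michael
print(baseline, detour, shortcut, sep="\n")
```
Under these assumptions, the baseline path is Me, Ashley, Matt, Michael; the
detour runs through Dana; and the shortcut is Me, Ashley, Michael.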

00:16:51.160 --> 00:16:56.480
So finding things like detours, shortcuts, or if&nbsp;
the reward location changes, these are the kinds&nbsp;&nbsp;

00:16:56.480 --> 00:17:03.880
of changes that, inspired by my own past work&nbsp;
and inspired by the work of Tolman and others,&nbsp;&nbsp;

00:17:03.880 --> 00:17:10.320
we implemented in all of our experiments. This led&nbsp;
to 15 different tasks for every single graph, and&nbsp;&nbsp;

00:17:10.320 --> 00:17:16.640
we have six graphs total of different complexity&nbsp;
levels with different graph theoretic features,&nbsp;&nbsp;

00:17:16.640 --> 00:17:22.280
and [for] each of them, we had three domains.&nbsp;
We had a spatial domain that was with rooms&nbsp;&nbsp;

00:17:22.280 --> 00:17:28.160
that had orders like Room 1, Room 2, Room 3; a&nbsp;
spatial domain that there was no number, there&nbsp;&nbsp;

00:17:28.160 --> 00:17:33.120
was no ordinal order to the rooms; and a social&nbsp;
environment where it was the names of different&nbsp;&nbsp;

00:17:33.120 --> 00:17:40.400
people and so the reasoning was over social, sort&nbsp;
of, spaces. So you can see this is a very large&nbsp;&nbsp;

00:17:40.400 --> 00:17:47.840
number of tasks. It’s 6 times 15 times 3, and&nbsp;
each of the prompts we ran 30 times for different&nbsp;&nbsp;

00:17:47.840 --> 00:17:53.960
temperatures. Three temperatures: 0, 0.5, and&nbsp;
1. And for those who are not familiar with this,&nbsp;&nbsp;

00:17:53.960 --> 00:17:59.560
a temperature of a large language model determines&nbsp;
how random it will be or how much it will stick to&nbsp;&nbsp;

00:17:59.560 --> 00:18:07.120
the first or the best option that comes to it&nbsp;
at the last layer. And so when there are some&nbsp;&nbsp;

00:18:07.120 --> 00:18:12.800
problems [where] maybe the first obvious answer&nbsp;
that it finds [is] not good, perhaps increasing&nbsp;&nbsp;

00:18:12.800 --> 00:18:17.280
the temperature could help, or perhaps [for] a problem&nbsp;
that needs precision, increasing the temperature&nbsp;&nbsp;

00:18:17.280 --> 00:18:22.880
would make it worse. So based on these ideas, we&nbsp;
also tried it for different temperatures. And we&nbsp;&nbsp;

00:18:22.880 --> 00:18:29.160
tested eight different language models like this&nbsp;
in order to systematically evaluate their ability&nbsp;&nbsp;

00:18:29.160 --> 00:18:36.880
for this multistep reasoning and planning, and&nbsp;
the framework that we use—we call it CogEval—and&nbsp;&nbsp;

00:18:36.880 --> 00:18:42.680
CogEval is a framework that’s not just for&nbsp;
reasoning and multistep planning. Other tasks can&nbsp;&nbsp;

00:18:42.680 --> 00:18:48.320
be used in this framework in order to be tested,&nbsp;
as well. And the first step of it is always to&nbsp;&nbsp;

00:18:48.320 --> 00:18:53.600
operationalize the cognitive capacity in terms of&nbsp;
many different tasks like I just mentioned. And&nbsp;&nbsp;

00:18:53.600 --> 00:18:58.080
then the second [step] is designing the specific&nbsp;
experiments with different domains like spatial&nbsp;&nbsp;

00:18:58.080 --> 00:19:04.200
and social; with different structures, like the&nbsp;
graphs that I told you; and with different kind&nbsp;&nbsp;

00:19:04.200 --> 00:19:12.040
of repetitions and with different tasks, like&nbsp;
the detour, shortcut, and the reward revaluation,&nbsp;&nbsp;

00:19:12.040 --> 00:19:17.000
transition revaluation, and just traversal, all&nbsp;
the different tasks that I mentioned. And then the&nbsp;&nbsp;

00:19:17.000 --> 00:19:24.160
third step is to generate many prompts and then&nbsp;
test them with many repetitions using different&nbsp;&nbsp;

00:19:24.160 --> 00:19:30.280
temperatures. Why is that? I think something&nbsp;
that Sam Altman had said is relevant here,&nbsp;&nbsp;

00:19:30.280 --> 00:19:35.560
which is sometimes with some problems,&nbsp;
you ask GPT-4 a hundred times, and one&nbsp;&nbsp;

00:19:35.560 --> 00:19:40.400
out of those hundred, it would give the correct&nbsp;
answer. Sometimes 30 out of a hundred, it will&nbsp;&nbsp;

00:19:40.400 --> 00:19:45.080
give the correct answer. You obviously want&nbsp;
it to give [a] hundred out of [a] hundred the correct&nbsp;&nbsp;

00:19:45.080 --> 00:19:50.480
answer. But we didn’t want to rely on just one&nbsp;
try and miss the opportunity to see whether it&nbsp;&nbsp;

00:19:50.480 --> 00:19:56.720
could give the answer if you probed it again.&nbsp;
And in all of the eight large language models,&nbsp;&nbsp;

00:19:56.720 --> 00:20:02.120
we saw that none of the large language models&nbsp;
was robust to the graph structure. Meaning,&nbsp;&nbsp;

00:20:02.120 --> 00:20:06.960
its performance got really worse as soon as the&nbsp;
graph structure, [which] didn’t even have many&nbsp;&nbsp;

00:20:06.960 --> 00:20:14.920
nodes but just had a tree structure that was six&nbsp;
or seven nodes, or a six- or seven-node tree was&nbsp;&nbsp;

00:20:14.920 --> 00:20:20.160
much more difficult for it to solve than a graph&nbsp;
that had 15 nodes but had a simpler structure that&nbsp;&nbsp;

00:20:20.160 --> 00:20:26.920
was just two lines. We noted that sometimes,&nbsp;
counterintuitively, some graph structures&nbsp;&nbsp;

00:20:26.920 --> 00:20:31.280
that you think should be easy to solve were more&nbsp;
difficult for them. On the other hand, they were&nbsp;&nbsp;

00:20:31.280 --> 00:20:37.440
not robust to the task set. So the specific task&nbsp;
that we tried, whether it was detour, shortcut,&nbsp;&nbsp;

00:20:37.440 --> 00:20:42.240
or it was reward revaluation or traversal, it&nbsp;
mattered. For instance, shortcut and detour&nbsp;&nbsp;

00:20:42.240 --> 00:20:47.160
were very difficult for all of them. Another&nbsp;
thing that we noticed was that all of them,&nbsp;&nbsp;

00:20:47.160 --> 00:20:53.560
including GPT-4, hallucinated paths that didn’t&nbsp;
exist. For instance, there was no door between&nbsp;&nbsp;

00:20:53.560 --> 00:20:58.160
Room 12 and Room 16. They would hallucinate that&nbsp;
there is a door, and they would give a response&nbsp;&nbsp;

00:20:58.160 --> 00:21:04.240
that includes that door. Another kind of failure&nbsp;
mode that we observed was that they would fail to&nbsp;&nbsp;

00:21:04.240 --> 00:21:09.000
even find a one-step path. Let’s say between Room&nbsp;
7 and 8, there is a direct door. We would say,&nbsp;&nbsp;

00:21:09.000 --> 00:21:15.000
what is the path from 7 [to] 8? And they would take&nbsp;
a longer path to go from it. And a final mode that&nbsp;&nbsp;

00:21:15.000 --> 00:21:19.560
we observed was that they would sometimes fall&nbsp;
in loops. Even though we would directly ask them&nbsp;&nbsp;

00:21:19.560 --> 00:21:26.880
to find the shortest path, they would sometimes&nbsp;
fall into a loop on the way to getting to their&nbsp;&nbsp;

00:21:26.880 --> 00:21:31.640
destination, which obviously you shouldn’t do&nbsp;
if you are trying to find the shortest path.&nbsp;&nbsp;
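
NOTE
A minimal sketch (not from the recording) of how the failure modes described
above could be told apart when checking a proposed route. The label names,
the classify_path helper, and the small castle fragment are hypothetical
illustrations, not the study's scoring scheme; only the direct door between
Room 7 and Room 8 comes from the spoken example.
```python
def classify_path(edges, path, start, goal, shortest_len):
    """Label a proposed room sequence against the known doors."""
    doors = {frozenset(e) for e in edges}  # doors work in both directions
    if not path or path[0] != start or path[-1] != goal:
        return "wrong endpoints"
    for a, b in zip(path, path[1:]):
        if frozenset((a, b)) not in doors:
            return "hallucinated door"  # uses a connection that does not exist
    if len(set(path)) < len(path):
        return "valid but loops"  # revisits a room on the way
    if len(path) - 1 == shortest_len:
        return "shortest"
    return "satisficing"  # gets there, but by a longer route
# Hypothetical fragment: 7 and 8 share a door, and 7-9-8 is a longer way round.
castle = [(7, 8), (7, 9), (9, 8), (9, 10)]
labels = [classify_path(castle, p, 7, 8, 1) for p in ([7, 8], [7, 9, 8], [7, 10, 8], [7, 9, 10, 9, 8])]
print(labels)
```
The four example routes are labeled, in order, as shortest, satisficing,
hallucinated door, and valid but loops.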

00:21:31.640 --> 00:21:36.280
That said, there [are] two differing notions&nbsp;
of accuracy here. You can have satisficing,&nbsp;&nbsp;

00:21:36.280 --> 00:21:41.800
which means you get there; you just take a longer&nbsp;
path. And there is this notion that you cannot&nbsp;&nbsp;

00:21:41.800 --> 00:21:47.240
get there because you used some imaginary path or&nbsp;
you did something that didn’t make sense and you,&nbsp;&nbsp;

00:21:47.240 --> 00:21:52.640
sort of, gave a nonsensical response. We had&nbsp;
both of those kinds of issues, so we had a lot&nbsp;&nbsp;

00:21:52.640 --> 00:21:58.640
of issues with giving nonsensical answers,&nbsp;
repeating the question that we were asking,&nbsp;&nbsp;

00:21:58.640 --> 00:22:05.400
producing gibberish. So there were numerous kinds&nbsp;
of challenges. What we did observe was that GPT-4&nbsp;&nbsp;

00:22:05.400 --> 00:22:14.400
was far better than the other LLMs in this regard,&nbsp;
at least at the time that we tested it; however,&nbsp;&nbsp;

00:22:14.400 --> 00:22:23.040
this is obviously on the basis of the particular&nbsp;
kinds of tasks that we tried. In another study,&nbsp;&nbsp;

00:22:23.040 --> 00:22:29.520
we tried Tower of Hanoi, which is also a classic&nbsp;
cognitive science approach to [testing] planning&nbsp;&nbsp;

00:22:29.520 --> 00:22:35.240
abilities and hierarchical planning abilities.&nbsp;
And we found that GPT-4 does between zero and&nbsp;&nbsp;

00:22:35.240 --> 00:22:42.440
10 percent in the three-disk problem and zero&nbsp;
percent for the four-disk problem. And that is&nbsp;&nbsp;

00:22:42.440 --> 00:22:49.000
when we started to think about having more&nbsp;
brain-inspired solutions to improve that&nbsp;&nbsp;

00:22:49.000 --> 00:22:53.800
approach. But I’m going to leave that for next.
LLORENS: So it sounds like a very extensive set&nbsp;&nbsp;

00:22:53.800 --> 00:23:00.760
of experiments across many different tasks&nbsp;
and with many different leading AI models,&nbsp;&nbsp;

00:23:00.760 --> 00:23:07.520
and you’ve uncovered a lack of robustness across&nbsp;
some of these different tasks. One curiosity that&nbsp;&nbsp;

00:23:07.520 --> 00:23:13.800
I have here is how would you assess the relative&nbsp;
difficulty of these particular tasks for human&nbsp;&nbsp;

00:23:13.800 --> 00:23:19.440
beings? Would all of these be relatively&nbsp;
easy for a person to do or not so much?&nbsp;
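
NOTE
Editor's sketch, not part of the episode: the path-finding failure modes described above
(a hallucinated door, a loop, a valid but longer "satisficing" route) can be told apart
mechanically. The room graph DOORS and the helper names shortest_length and classify are
hypothetical and invented for illustration; they are not the graphs or code used in the study.
```python
from collections import deque
# Hypothetical room graph; the doors below are invented for illustration only.
DOORS = {7: {8, 9}, 8: {7, 10}, 9: {7, 10}, 10: {8, 9}}
def shortest_length(start, goal):
    # Breadth-first search: number of doors on a shortest route, or None if unreachable.
    frontier, seen = deque([(start, 0)]), {start}
    while frontier:
        room, dist = frontier.popleft()
        if room == goal:
            return dist
        for nxt in DOORS[room] - seen:
            seen.add(nxt)
            frontier.append((nxt, dist + 1))
    return None
def classify(path, start, goal):
    # Labels a proposed route with the failure modes described in the conversation.
    if not path or path[0] != start or path[-1] != goal:
        return "wrong endpoints"
    if any(b not in DOORS.get(a, set()) for a, b in zip(path, path[1:])):
        return "hallucinated door"
    if len(set(path)) < len(path):
        return "loop"
    if len(path) - 1 > shortest_length(start, goal):
        return "satisficing (valid but longer)"
    return "shortest"
```
Under these assumptions, classify([7, 9, 10, 8], 7, 8) reports a satisficing route, because
Rooms 7 and 8 share a direct door in the hypothetical graph.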

00:23:19.440 --> 00:23:24.840
MOMENNEJAD: Great question. So I have conducted&nbsp;
some of these experiments already and have&nbsp;&nbsp;

00:23:24.840 --> 00:23:30.520
published them before. Humans do not perform&nbsp;
symmetrically on all these tasks, for sure;&nbsp;&nbsp;

00:23:32.080 --> 00:23:38.120
however, for instance, Tower of Hanoi is a problem&nbsp;
that we know humans can solve. People might have&nbsp;&nbsp;

00:23:38.120 --> 00:23:45.520
seen this. It’s three little rods. Usually it’s a&nbsp;
wooden structure, so you have a physical version&nbsp;&nbsp;

00:23:45.520 --> 00:23:49.480
of it, or you can have a virtual version of it,&nbsp;
and there are different disks with different&nbsp;&nbsp;

00:23:49.480 --> 00:23:55.000
colors and sizes. There are some rules. You cannot&nbsp;
put certain disks on top of others. So there is a&nbsp;&nbsp;

00:23:55.000 --> 00:24:00.200
particular order in which you can stack the disks.&nbsp;
Usually what happens is that all the disks are on&nbsp;&nbsp;

00:24:00.200 --> 00:24:05.280
one side—and when I say a three-disk problem, it&nbsp;
means you have three total disks. And there is&nbsp;&nbsp;

00:24:05.280 --> 00:24:11.480
usually a target solution that you are shown,&nbsp;
and you’re told to get there in a particular&nbsp;&nbsp;

00:24:11.480 --> 00:24:16.880
number of moves or in a minimum number of moves&nbsp;
without violating the rules. So in this case,&nbsp;&nbsp;

00:24:16.880 --> 00:24:23.640
the rules would be that you wouldn’t put certain&nbsp;
disks on top of others. And based on that, you’re&nbsp;&nbsp;

00:24:23.640 --> 00:24:30.000
expected to solve the problem. And the performance&nbsp;
of GPT-4 on Tower of Hanoi three disk is between&nbsp;&nbsp;

00:24:30.000 --> 00:24:36.840
0 and 10 percent and on Tower of Hanoi four&nbsp;
disks is zero percent—zero shot. With the help,&nbsp;&nbsp;

00:24:36.840 --> 00:24:42.680
it can get better. With some support, it gets&nbsp;
better. So in this regard, it seems like Tower&nbsp;&nbsp;

00:24:42.680 --> 00:24:48.480
of Hanoi is extremely difficult for GPT-4. It&nbsp;
doesn’t seem as difficult as it is for GPT-4 for&nbsp;&nbsp;

00:24:48.480 --> 00:24:56.760
humans. It seems for some reason, that it couldn’t&nbsp;
even improve itself when we explained the problem&nbsp;&nbsp;

00:24:56.760 --> 00:25:02.040
even further to it and explained to it what it did&nbsp;
wrong. Sometimes—if people want to try it out,&nbsp;&nbsp;

00:25:02.040 --> 00:25:06.080
they should—sometimes, it would argue back and&nbsp;
say, “No, you’re wrong. I did this right.” Which&nbsp;&nbsp;

00:25:06.080 --> 00:25:13.600
was a very interesting moment for us with ChatGPT.&nbsp;
That was the experience that we had for trying it&nbsp;&nbsp;

00:25:13.600 --> 00:25:21.120
out first without giving it, sort of, more support&nbsp;
than that, but I can tell you what we did next,&nbsp;&nbsp;

00:25:21.120 --> 00:25:26.600
but I want to make sure that we cover your other&nbsp;
questions. But just to wrap this part up, inspired&nbsp;&nbsp;

00:25:26.600 --> 00:25:32.800
by tasks that have been used for evaluation of&nbsp;
cognitive capacities such as multistep reasoning&nbsp;&nbsp;

00:25:32.800 --> 00:25:40.880
and planning in humans, it is possible to evaluate&nbsp;
cognitive capacities and skills such as multistep&nbsp;&nbsp;

00:25:40.880 --> 00:25:46.400
reasoning and planning also in large language&nbsp;
models. And I think that’s the takeaway from this&nbsp;&nbsp;

00:25:46.400 --> 00:25:53.760
particular study and from this general cognitive&nbsp;
science–inspired approach. And I would like to say&nbsp;&nbsp;

00:25:53.760 --> 00:25:59.800
also it is not just human tasks that are useful.&nbsp;
Tolman’s tasks were done in rodents. A lot of&nbsp;&nbsp;

00:25:59.800 --> 00:26:07.320
people have done experiments in fruit flies, in&nbsp;
C. elegans, in worms, in various kinds of other&nbsp;&nbsp;

00:26:07.320 --> 00:26:15.080
species that are very relevant to testing, as&nbsp;
well. So I think there is a general possibility of&nbsp;&nbsp;

00:26:15.080 --> 00:26:23.200
testing particular intelligence skills, evaluating&nbsp;
it, inspired by experiments and evaluation methods&nbsp;&nbsp;

00:26:23.200 --> 00:26:29.480
for humans and other biological species.
LLORENS: Let’s explore the way forward&nbsp;&nbsp;

00:26:29.480 --> 00:26:36.120
for AI from your perspective. You know,&nbsp;
as you’ve described your recent works,&nbsp;&nbsp;

00:26:36.120 --> 00:26:44.040
it’s clear that you have, that your work is deeply&nbsp;
informed by insights from cognitive science,&nbsp;&nbsp;

00:26:44.040 --> 00:26:50.440
insights from neuroscience, and recent works—your&nbsp;
recent works—have called for the development,&nbsp;&nbsp;

00:26:50.440 --> 00:26:56.360
for example, of a prefrontal cortex for AI, and&nbsp;
I understand this to be the part of the brain&nbsp;&nbsp;

00:26:56.360 --> 00:27:03.160
that facilitates executive function. How does, how&nbsp;
does this relate to the, you know, extending the&nbsp;&nbsp;

00:27:03.160 --> 00:27:09.320
capabilities of AI, a prefrontal cortex for AI?
MOMENNEJAD: Thank you for that question. So let&nbsp;&nbsp;

00:27:09.320 --> 00:27:17.760
me start by reiterating something I said earlier,&nbsp;
which is the brain didn’t evolve in a lump. There&nbsp;&nbsp;

00:27:17.760 --> 00:27:24.160
were different components of brains and nervous&nbsp;
systems and neurons that evolved at different&nbsp;&nbsp;

00:27:24.160 --> 00:27:30.160
evolutionary scales. There are some parts of&nbsp;
the brain that appear in many different species,&nbsp;&nbsp;

00:27:30.160 --> 00:27:34.720
so they’re robust across many species. And there&nbsp;
are some parts of the brain that appear in some&nbsp;&nbsp;

00:27:34.720 --> 00:27:39.840
species that had some particular needs, some&nbsp;
particular problems they were facing, or some&nbsp;&nbsp;

00:27:39.840 --> 00:27:47.680
ecological niche. What is, however, in common in&nbsp;
many of them is that there seems to be some kind&nbsp;&nbsp;

00:27:47.680 --> 00:27:57.560
of a modular or multicomponent aspect to what we&nbsp;
call higher cognitive function or what we call&nbsp;&nbsp;

00:27:57.560 --> 00:28:05.120
executive function. And so the kinds of animals&nbsp;
that we ascribe some form of executive function&nbsp;&nbsp;

00:28:05.120 --> 00:28:12.200
of sorts to seem to have brains that have parts or&nbsp;
modules that do different things. It doesn’t mean&nbsp;&nbsp;

00:28:12.200 --> 00:28:19.240
that they only do that. It’s not a very extreme&nbsp;
Fodorian view of modularity. But it is the view&nbsp;&nbsp;

00:28:19.240 --> 00:28:25.640
that, broadly speaking, when, for instance, we&nbsp;
observe patients that have damage to a particular&nbsp;&nbsp;

00:28:25.640 --> 00:28:30.760
part of their prefrontal cortex, it could be that&nbsp;
they perform the same on an IQ test, but they have&nbsp;&nbsp;

00:28:30.760 --> 00:28:35.040
problems holding their relationship or their&nbsp;
jobs. So there are different parts of the brain&nbsp;&nbsp;

00:28:35.040 --> 00:28:41.840
that selective damage to those areas, because&nbsp;
of accidents or coma or such, it seems to impair&nbsp;&nbsp;

00:28:41.840 --> 00:28:48.040
specific cognitive capacities. So this is what&nbsp;
very much inspired me. I have been investigating&nbsp;&nbsp;

00:28:48.040 --> 00:28:55.960
the prefrontal cortex for, I guess, 17 years&nbsp;
now, [LAUGHS] which is a scary number to say. But&nbsp;&nbsp;

00:28:55.960 --> 00:29:03.000
been ... basically since I started my PhD and even&nbsp;
during my master’s thesis, I have been focused on&nbsp;&nbsp;

00:29:03.000 --> 00:29:10.040
the role of the prefrontal cortex in our ability&nbsp;
for long-term reasoning and planning in not just&nbsp;&nbsp;

00:29:10.040 --> 00:29:18.000
this moment—long-term, open-ended reasoning and&nbsp;
planning. Inspired by this work, I thought, OK,&nbsp;&nbsp;

00:29:18.000 --> 00:29:25.320
if I want to improve GPT-4’s performance on, let’s&nbsp;
say, Tower of Hanoi, can we get inspired by this&nbsp;&nbsp;

00:29:25.320 --> 00:29:30.920
kind of multiple roles that different parts of&nbsp;
the brain play in executive function, specifically&nbsp;&nbsp;

00:29:30.920 --> 00:29:35.320
different parts of the neocortex and specifically&nbsp;
different parts of the prefrontal cortex,&nbsp;&nbsp;

00:29:35.320 --> 00:29:41.360
part of the neocortex, in humans? Can we get&nbsp;
inspired by some of these main roles that I have&nbsp;&nbsp;

00:29:41.360 --> 00:29:50.800
studied before and ask GPT-4 to play the role of&nbsp;
those different parts and solve different parts of&nbsp;&nbsp;

00:29:50.800 --> 00:29:56.960
the planning and reasoning problem—the multistep&nbsp;
planning and reasoning problem—using these roles&nbsp;&nbsp;

00:29:56.960 --> 00:30:04.400
and particular rules of how to iterate over them.&nbsp;
For instance, there is a part of the brain called&nbsp;&nbsp;

00:30:04.400 --> 00:30:10.720
anterior cingulate cortex. Among other things, it&nbsp;
seems to be involved in monitoring for errors and&nbsp;&nbsp;

00:30:10.720 --> 00:30:16.080
signaling when there is a need to exercise more&nbsp;
control or move from what people like to call a&nbsp;&nbsp;

00:30:16.080 --> 00:30:23.120
faster way of thinking to a slower way of thinking&nbsp;
to solve a particular problem. And there is … so&nbsp;&nbsp;

00:30:23.120 --> 00:30:28.840
let’s call this the cognitive function of this&nbsp;
part. Let’s call it the monitor. This is a part of&nbsp;&nbsp;

00:30:28.840 --> 00:30:34.560
the brain that monitors for when there is a need&nbsp;
for exercising more control or changing something&nbsp;&nbsp;

00:30:34.560 --> 00:30:41.360
because there is an error maybe. There is another&nbsp;
part of the brain in the frontal lobe that is&nbsp;&nbsp;

00:30:41.360 --> 00:30:46.200
the, for instance, dorsolateral prefrontal&nbsp;
cortex; that one is involved in working&nbsp;&nbsp;

00:30:46.200 --> 00:30:53.880
memory and coming up with, like, simpler plans to&nbsp;
execute. Then there is a ventromedial prefrontal&nbsp;&nbsp;

00:30:53.880 --> 00:30:59.080
cortex that is involved in the value of states and&nbsp;
predicting what is the next state and integrating&nbsp;&nbsp;

00:30:59.080 --> 00:31:04.760
it with information from other parts of the brain&nbsp;
to figure out what is the value. So you put all of&nbsp;&nbsp;

00:31:04.760 --> 00:31:09.960
these things together, you can basically write&nbsp;
different algorithms that have these different&nbsp;&nbsp;

00:31:09.960 --> 00:31:16.440
components talking to each other. And we have in&nbsp;
that paper also, written in a pseudocode style,&nbsp;&nbsp;

00:31:16.440 --> 00:31:23.120
the different algorithms that are basically akin&nbsp;
to a tree search, in fact. So there is a part of&nbsp;&nbsp;

00:31:23.120 --> 00:31:33.600
the role … they’re part of the multicomponent&nbsp;
or multi-agent realization of a prefrontal&nbsp;&nbsp;

00:31:34.720 --> 00:31:41.840
cortex-like GPT-4 solution. One part of it&nbsp;
would propose a plan. The monitor would say,&nbsp;&nbsp;

00:31:41.840 --> 00:31:46.960
thanks for that; let me pass it on to the part&nbsp;
that is evaluating what is the outcome of this&nbsp;&nbsp;

00:31:46.960 --> 00:31:51.920
and what’s the value of that, and get back to&nbsp;
you. It evaluates there and comes back and says,&nbsp;&nbsp;

00:31:51.920 --> 00:31:57.120
you know, this is not a good plan; give me another&nbsp;
one. And in this iteration, sometimes it takes&nbsp;&nbsp;

00:31:57.120 --> 00:32:05.120
10 iterations; sometimes it takes 20 iterations.&nbsp;
This kind of council of different types of roles,&nbsp;&nbsp;

00:32:05.120 --> 00:32:12.560
they come up with a solution that is solving the&nbsp;
Tower of Hanoi problem. And we managed to bring&nbsp;&nbsp;

00:32:12.560 --> 00:32:22.080
the performance from 0 to 10 [percent] in GPT-4&nbsp;
to, I think, about 70—70 percent—in Tower of&nbsp;&nbsp;

00:32:22.080 --> 00:32:28.200
Hanoi three disks, and OOD, or out-of-distribution&nbsp;
generalization, without giving any examples of a&nbsp;&nbsp;

00:32:28.200 --> 00:32:34.800
four disk, it could generalize to above 20 percent&nbsp;
in four-disk problems. Another impressive thing&nbsp;&nbsp;

00:32:34.800 --> 00:32:40.120
that happened here—and we tested it on the CogEval&nbsp;
and the planning tasks from the other experiment,&nbsp;&nbsp;

00:32:40.120 --> 00:32:47.120
too—was that it brought all of the, sort of,&nbsp;
hallucinations from about 20 to 30 percent—in&nbsp;&nbsp;

00:32:47.120 --> 00:32:53.400
some cases, much higher percentages—to&nbsp;
zero percent. So we had slow thinking;&nbsp;&nbsp;

00:32:53.400 --> 00:32:58.400
we had 30 iterations, so it took a lot longer.&nbsp;
And this is, you know, fast and slow thinking.&nbsp;&nbsp;

00:32:58.400 --> 00:33:02.960
This is very slow thinking. However, we had&nbsp;
no hallucinations anymore. And hallucination&nbsp;&nbsp;

00:33:02.960 --> 00:33:11.880
in Tower of Hanoi would be making a move that is&nbsp;
impossible. For instance, putting in a, kind of,&nbsp;&nbsp;

00:33:11.880 --> 00:33:18.800
a disk on top of another that you cannot do&nbsp;
because you violate a rule or taking out a middle&nbsp;&nbsp;

00:33:18.800 --> 00:33:22.960
disk that you cannot pull out actually. So those&nbsp;
would be the kinds of hallucinations in Tower of&nbsp;&nbsp;

00:33:22.960 --> 00:33:28.800
Hanoi. All of those also went to zero. And so&nbsp;
that is one thing that we have done already,&nbsp;&nbsp;

00:33:28.800 --> 00:33:33.880
which I have been very excited about.
LLORENS: So you painted a pretty&nbsp;&nbsp;

00:33:33.880 --> 00:33:41.920
interesting—fascinating, really—picture of a&nbsp;
multi-agent framework where different instances&nbsp;&nbsp;

00:33:41.920 --> 00:33:50.760
of an advanced model like GPT-4 would be prompted&nbsp;
to play the roles of different parts of the brain&nbsp;&nbsp;

00:33:50.760 --> 00:34:00.440
and, kind of, work together. And so my question is&nbsp;
a pragmatic one. How do you prompt GPT-4 to play&nbsp;&nbsp;

00:34:00.440 --> 00:34:04.260
the role of a specific part of the human&nbsp;
brain? What does that prompt look like?&nbsp;

00:34:04.260 --> 00:34:10.080
MOMENNEJAD: Great question. I can actually, well,&nbsp;
we have all of that at the end of our paper,&nbsp;&nbsp;

00:34:10.080 --> 00:34:17.560
so I can even read some of them if that was of&nbsp;
interest. But just a quick response to that is&nbsp;&nbsp;

00:34:17.560 --> 00:34:26.680
you can basically describe the function that you&nbsp;
want the LLM—in this case GPT-4—to play. You can&nbsp;&nbsp;

00:34:26.680 --> 00:34:33.960
write that in simple language. You don’t have to&nbsp;
tell it that this is inspired by the brain. It is&nbsp;&nbsp;

00:34:33.960 --> 00:34:41.400
completely sufficient to just basically provide&nbsp;
certain sets of rules in order for it, in order&nbsp;&nbsp;

00:34:41.400 --> 00:34:51.840
to be able to do that. For instance, after you&nbsp;
provide the problem, sort of, description … let&nbsp;&nbsp;

00:34:51.840 --> 00:34:56.560
me see if I can actually read some part of this&nbsp;
for you. For instance, you give it a problem,&nbsp;&nbsp;

00:34:56.560 --> 00:35:01.560
and you say, consider this problem. Rule 1: you&nbsp;
can only move a number if it’s at this and that.&nbsp;&nbsp;

00:35:01.560 --> 00:35:07.840
You clarify the rules. Here are examples. Here are&nbsp;
proposed moves. And then you say, for instance,&nbsp;&nbsp;

00:35:08.360 --> 00:35:18.080
your role is to find whether this particular&nbsp;
number generated as a solution is accurate.&nbsp;&nbsp;

00:35:18.080 --> 00:35:24.920
In order to do that, you can call on this other&nbsp;
function, which is the predictor and evaluator&nbsp;&nbsp;

00:35:24.920 --> 00:35:31.080
that says, OK, if I do this, what state do I end&nbsp;
up in, and what is the value of that state? And&nbsp;&nbsp;

00:35:31.080 --> 00:35:35.560
you get that information, and then based on that&nbsp;
information, you decide whether the proposed move&nbsp;&nbsp;

00:35:35.560 --> 00:35:41.560
for this problem is a good move or not. If it&nbsp;
is, then you pass a message that says, all right,&nbsp;&nbsp;

00:35:41.560 --> 00:35:46.160
give me the next step of the plan. If it’s not,&nbsp;
then you say, OK, this is not a good plan; propose&nbsp;&nbsp;

00:35:46.160 --> 00:35:52.760
another plan. And then the part of, the part that&nbsp;
plays the role of, “hey, here is the problem. Here&nbsp;&nbsp;

00:35:52.760 --> 00:35:57.200
are the rules. Propose the first [step] towards the&nbsp;
subgoal or find the subgoal towards this and&nbsp;&nbsp;

00:35:57.200 --> 00:36:02.520
propose the next step.” And that one receives&nbsp;
this feedback from the monitor. And monitor has&nbsp;&nbsp;

00:36:02.520 --> 00:36:07.920
asked the predictor and evaluator, hey, what&nbsp;
happens if I do these things and what would&nbsp;&nbsp;

00:36:07.920 --> 00:36:14.080
be the value of that in order to say, hey, this&nbsp;
is not a great idea. So in a way this becomes a&nbsp;&nbsp;

00:36:14.080 --> 00:36:21.120
very simple prefrontal cortex–inspired multi-agent&nbsp;
system. All of them are within the same … sort of,&nbsp;&nbsp;

00:36:21.120 --> 00:36:25.680
different calls to GPT-4 but the same instance.&nbsp;
Just, like, because we were calling it in a code,&nbsp;&nbsp;

00:36:25.680 --> 00:36:30.720
it’s just, you just call, it’s called multiple&nbsp;
times and each time with this kind of a very&nbsp;&nbsp;

00:36:30.720 --> 00:36:38.760
simple in-context learning text that, in text, it&nbsp;
describes, hey, here’s the kind of problem you’re&nbsp;&nbsp;

00:36:38.760 --> 00:36:44.640
going to see. Here’s the role I want you to play.&nbsp;
And here is what other kind of rules you need to&nbsp;&nbsp;

00:36:44.640 --> 00:36:50.600
call in order to play your role here. And then&nbsp;
it’s up to the LLM to decide how many times it’s&nbsp;&nbsp;

00:36:50.600 --> 00:36:55.600
going to call which components in order to solve&nbsp;
the problem. We don’t decide. We can only decide,&nbsp;&nbsp;

00:36:55.600 --> 00:37:02.200
hey, cap it at 10 times, for instance, or cap it&nbsp;
at 30 iterations and then see how it performs.&nbsp;
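
NOTE
Editor's sketch, not part of the episode: the propose / monitor / predict-and-evaluate loop
described above, reduced to a small capped tree search on three-disk Tower of Hanoi. This is
a sketch under stated assumptions. In the work discussed, each role is a separately prompted
GPT-4 call; here every role (actor, monitor, predictor, evaluator) is a hypothetical plain
Python function, and the breadth-first control flow is a stand-in, not the paper's algorithm.
```python
from collections import deque
# Hypothetical sketch: each role is a plain function standing in for one prompted LLM call.
START = ((3, 2, 1), (), ())  # each peg lists its disks from bottom to top
GOAL = ((), (), (3, 2, 1))
def actor(state):
    # Proposes candidate moves as (source peg, target peg); some may be illegal.
    return [(s, t) for s in range(3) for t in range(3) if s != t]
def monitor(state, move):
    # Rejects rule violations: an empty source peg, or a larger disk onto a smaller one.
    src, dst = state[move[0]], state[move[1]]
    return bool(src) and (not dst or src[-1] < dst[-1])
def predictor(state, move):
    # Returns the state reached by applying an accepted move.
    pegs = [list(p) for p in state]
    pegs[move[1]].append(pegs[move[0]].pop())
    return tuple(tuple(p) for p in pegs)
def evaluator(state):
    # Scores a state; in this toy version, simply whether it is the goal.
    return state == GOAL
def solve(start=START, max_iterations=1000):
    # Breadth-first search over the roles, capped at max_iterations state expansions.
    frontier = deque([(start, [])])
    seen = {start}
    for _ in range(max_iterations):
        if not frontier:
            break
        state, plan = frontier.popleft()
        if evaluator(state):
            return plan
        for move in actor(state):
            if not monitor(state, move):
                continue
            nxt = predictor(state, move)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, plan + [move]))
    return None
```
Because the monitor filters every proposal before it is applied, no illegal move can enter
the returned plan; the max_iterations argument plays the part of the iteration cap.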

00:37:02.200 --> 00:37:04.580
LLORENS: So, Ida, what’s next&nbsp;
for you and your research?&nbsp;

00:37:04.580 --> 00:37:13.920
MOMENNEJAD: Thank you for that. I have always&nbsp;
been interested in understanding minds and&nbsp;&nbsp;

00:37:13.920 --> 00:37:20.680
making minds, and this has been something that&nbsp;
I’ve wanted to do since I was a teenager. And I&nbsp;&nbsp;

00:37:20.680 --> 00:37:26.360
think that my approaches in cognitive neuroscience&nbsp;
have really helped me to understand minds to the&nbsp;&nbsp;

00:37:26.360 --> 00:37:35.400
extent that is possible. And my understanding of&nbsp;
how to make minds comes from basically the work&nbsp;&nbsp;

00:37:35.400 --> 00:37:45.147
that I’ve done in AI and computer science since my&nbsp;
undergrad. What I would be interested in is—and I&nbsp;&nbsp;

00:37:45.147 --> 00:37:50.920
have learned over the years that you cannot think&nbsp;
about the mind in general when you are trying to&nbsp;&nbsp;

00:37:50.920 --> 00:37:56.960
isolate some components and building them—is&nbsp;
that my interest is very much in reasoning and&nbsp;&nbsp;

00:37:56.960 --> 00:38:04.440
multistep planning, especially in complex problems&nbsp;
and very long-term problems and how they relate to&nbsp;&nbsp;

00:38:04.440 --> 00:38:10.600
memory, how the past and the future relate to&nbsp;
one another. And so something that I would be&nbsp;&nbsp;

00:38:10.600 --> 00:38:21.880
very interested in is making more efficient types&nbsp;
of multi-agent brain-inspired AI but also to train&nbsp;&nbsp;

00:38:21.880 --> 00:38:29.000
smaller large language models, perhaps using&nbsp;
the process of reasoning in order to improve&nbsp;&nbsp;

00:38:29.000 --> 00:38:33.760
their reasoning abilities. Because it’s one thing&nbsp;
to train on outcome and outcome can be inputs and&nbsp;&nbsp;

00:38:33.760 --> 00:38:38.960
outputs, and that’s the most of the training data&nbsp;
that LLMs receive. But it’s an entirely different&nbsp;&nbsp;

00:38:38.960 --> 00:38:44.920
approach to teach the process and probe them on&nbsp;
different parts of the process as opposed to just&nbsp;&nbsp;

00:38:44.920 --> 00:38:49.280
the input and output. So I wonder whether&nbsp;
with that kind of an approach, which would&nbsp;&nbsp;

00:38:49.280 --> 00:38:54.760
require generating a lot of synthetic data that&nbsp;
relates to different types of reasoning skills,&nbsp;&nbsp;

00:38:54.760 --> 00:39:00.560
whether it’s possible to teach LLMs reasoning&nbsp;
skills, and by reasoning skills, I mean very&nbsp;&nbsp;

00:39:00.560 --> 00:39:08.600
clearly operationalized—similar to the CogEval&nbsp;
approach—operationalized, very well-researched,&nbsp;&nbsp;

00:39:08.600 --> 00:39:14.200
specific cognitive constructs that have construct&nbsp;
validity and then operationalizing them in terms&nbsp;&nbsp;

00:39:14.200 --> 00:39:18.760
of many tasks. And something that’s important&nbsp;
to me is a very important idea and a part of&nbsp;&nbsp;

00:39:18.760 --> 00:39:24.000
intelligence that maybe I didn’t highlight enough&nbsp;
in the first part is being able to transfer to&nbsp;&nbsp;

00:39:24.000 --> 00:39:29.120
tasks that they have never seen before, and they&nbsp;
can piece together different intelligence skills&nbsp;&nbsp;

00:39:29.120 --> 00:39:34.040
or reasoning skills in order to solve them.&nbsp;
Another thing that I have done and I will&nbsp;&nbsp;

00:39:34.040 --> 00:39:38.760
continue to do is collective intelligence.&nbsp;
So we talked about multi-agent systems,&nbsp;&nbsp;

00:39:38.760 --> 00:39:43.560
that they are playing the roles of different parts&nbsp;
inside one brain. But I’ve also done experiments&nbsp;&nbsp;

00:39:43.560 --> 00:39:50.080
with multiple humans and how different structures&nbsp;
of human communication lead to better memory or&nbsp;&nbsp;

00:39:50.080 --> 00:39:56.400
problem-solving. Humans, also, we invent things;&nbsp;
we innovate things in cultural accumulation,&nbsp;&nbsp;

00:39:56.400 --> 00:40:01.880
which requires [building] on a lot of … some&nbsp;
people do something, I take that outcome,&nbsp;&nbsp;

00:40:01.880 --> 00:40:05.160
take another outcome, put them together, make&nbsp;
something. Someone takes my approach and adds&nbsp;&nbsp;

00:40:05.160 --> 00:40:09.200
something to it; makes something else. So this&nbsp;
kind of cultural accumulation, we have done some&nbsp;&nbsp;

00:40:09.200 --> 00:40:13.360
work on that with deep reinforcement learning&nbsp;
models that share their replay buffer as a way&nbsp;&nbsp;

00:40:13.360 --> 00:40:19.160
of sharing skill with each other; however,&nbsp;
as humans become a lot more accustomed to&nbsp;&nbsp;

00:40:19.160 --> 00:40:25.960
using LLMs and other generative AI, basically&nbsp;
generative AI would start participating in this&nbsp;&nbsp;

00:40:25.960 --> 00:40:31.080
kind of cultural accumulation. So the notion of&nbsp;
collective cognition, collective intelligence,&nbsp;&nbsp;

00:40:31.080 --> 00:40:36.600
and collective memory will now have to incorporate&nbsp;
the idea of generative AI being a part of&nbsp;&nbsp;

00:40:36.600 --> 00:40:44.160
it. And so I’m also interested in different&nbsp;
approaches to modeling that, understanding that,&nbsp;&nbsp;

00:40:44.160 --> 00:40:50.480
optimizing that, identifying in what ways it’s&nbsp;
better. We have found both in humans and in&nbsp;&nbsp;

00:40:50.480 --> 00:40:55.560
deep reinforcement learning agents, for instance,&nbsp;
that particular structures of communication that&nbsp;&nbsp;

00:40:55.560 --> 00:41:00.760
are actually not the most energy-consuming,&nbsp;
not all-to-all communication, but particular&nbsp;&nbsp;

00:41:01.680 --> 00:41:06.640
partially connected structures are better for&nbsp;
innovation than others. And some other structures&nbsp;&nbsp;

00:41:06.640 --> 00:41:11.280
might be better for memory or collective memory&nbsp;
converging with each other. So I think it would be&nbsp;&nbsp;

00:41:11.280 --> 00:41:15.240
very interesting—the same way that we are looking&nbsp;
at what kind of components talk to each other in&nbsp;&nbsp;

00:41:15.240 --> 00:41:21.920
one brain to solve certain problems—to think about&nbsp;
what kind of structures or roles can interact with&nbsp;&nbsp;

00:41:21.920 --> 00:41:29.080
each other, in what shape and in what frequency of&nbsp;
communication, in order to solve larger, sort of,&nbsp;&nbsp;

00:41:29.800 --> 00:41:32.551
cultural accumulation problems.
[MUSIC PLAYS]&nbsp;

00:41:32.551 --> 00:41:35.680
LLORENS: Well, that’s a compelling vision.&nbsp;
I really look forward to seeing how far&nbsp;&nbsp;

00:41:35.680 --> 00:41:38.640
you and the team can take it. And&nbsp;
thanks for a fascinating discussion.&nbsp;

00:41:38.640 --> 00:41:47.817
MOMENNEJAD: Thank you so much.
[MUSIC FADES]

