00:00:00.000 --> 00:00:01.067
[TEASER]  

00:00:01.067 --> 00:00:01.567
[MUSIC PLAYS UNDER DIALOGUE] 

00:00:01.567 --> 00:00:05.280
EMRE KICIMAN: I think it's really important&nbsp;
for people to find passion and joy in the&nbsp;&nbsp;

00:00:05.280 --> 00:00:11.000
work that they do. At some point, do the work&nbsp;
for the work's sake. I think this will drive&nbsp;&nbsp;

00:00:11.000 --> 00:00:16.000
you through the challenges that you'll&nbsp;
inevitably face with any sort of project&nbsp;&nbsp;

00:00:16.000 --> 00:00:20.097
and give you the persistence that you need to&nbsp;
really have the impact that you want to have.

00:00:20.097 --> 00:00:21.763
[TEASER ENDS] 

00:00:26.762 --> 00:00:30.760
JOHANNES GEHRKE: Microsoft Research works at&nbsp;
the cutting edge. But how much do we know about&nbsp;&nbsp;

00:00:30.760 --> 00:00:35.600
the people behind the science and technology&nbsp;
that we create? This is What’s Your Story,&nbsp;&nbsp;

00:00:35.600 --> 00:00:40.960
and I’m Johannes Gehrke. In my 10 years&nbsp;
with Microsoft, across product and research,&nbsp;&nbsp;

00:00:40.960 --> 00:00:44.400
I’ve been continuously excited and&nbsp;
inspired by the people I work with,&nbsp;&nbsp;

00:00:44.400 --> 00:00:49.360
and I’m curious about how they became the&nbsp;
talented and passionate people they are today.&nbsp;&nbsp;

00:00:49.360 --> 00:00:55.080
So I sat down with some of them. Now, I’m sharing&nbsp;
their stories with you. In this podcast series,&nbsp;&nbsp;

00:00:55.080 --> 00:00:58.080
you’ll hear from them about how they&nbsp;
grew up, the critical choices that&nbsp;&nbsp;

00:00:58.080 --> 00:01:02.321
shaped their lives, and their advice to&nbsp;
others looking to carve a similar path. 

00:01:04.421 --> 00:01:05.921
[MUSIC FADES]

00:01:06.898 --> 00:01:12.360
In this episode, I’m talking with Emre Kiciman,&nbsp;
the senior principal research manager leading the&nbsp;&nbsp;

00:01:12.360 --> 00:01:17.760
AI for Industry research team at Microsoft&nbsp;
Research Redmond. After completing a PhD in&nbsp;&nbsp;

00:01:17.760 --> 00:01:23.960
systems and networking in 2005, Emre began his&nbsp;
career with Microsoft Research in the same area,&nbsp;&nbsp;

00:01:23.960 --> 00:01:29.640
studying reliability in large-scale internet&nbsp;
services. Exposure to social data inspired him&nbsp;&nbsp;

00:01:29.640 --> 00:01:36.200
to refocus his research pursuits: his recent&nbsp;
work in causal analysis—including DoWhy,&nbsp;&nbsp;

00:01:36.200 --> 00:01:42.080
a Python library for causal inference—is helping&nbsp;
to connect the whats and whys in the abundance of&nbsp;&nbsp;

00:01:42.080 --> 00:01:48.480
data that exists. Meanwhile, his work with large&nbsp;
language models is geared toward making AI systems&nbsp;&nbsp;

00:01:48.480 --> 00:01:54.360
more secure and maximizing their benefit to&nbsp;
society. Here’s my conversation with Emre,&nbsp;&nbsp;

00:01:54.360 --> 00:01:58.760
beginning with some of his work at Microsoft&nbsp;
Research and how he landed in computer science.

00:01:58.760 --> 00:02:02.240
GEHRKE: Welcome to What's Your Story. So can you&nbsp;&nbsp;

00:02:02.240 --> 00:02:04.591
just tell us a little bit about what&nbsp;
you do at MSR [Microsoft Research]?

00:02:04.591 --> 00:02:11.840
KICIMAN: Sure. I work primarily on two areas&nbsp;
at the moment, I guess. One is causal analysis,&nbsp;&nbsp;

00:02:11.840 --> 00:02:17.160
where we work on trying to answer cause-and-effect&nbsp;
questions from data in a wide variety of domains,&nbsp;&nbsp;

00:02:17.160 --> 00:02:23.880
kind of, building that horizontal platform.&nbsp;
And I work a lot recently, especially with this&nbsp;&nbsp;

00:02:23.880 --> 00:02:30.000
large language model focus, on the security&nbsp;
of AI-driven systems: how do we make sure&nbsp;&nbsp;

00:02:30.000 --> 00:02:35.380
that these AI systems that we're building are&nbsp;
not opening up new vulnerabilities to attackers?

00:02:35.380 --> 00:02:41.760
GEHRKE: Super interesting. And maybe we can start&nbsp;
out even before we go more in depth into that by,&nbsp;&nbsp;

00:02:41.760 --> 00:02:45.980
you know, how did you actually end up in computer&nbsp;
science? I learned that you grew up in Berkeley.

00:02:45.980 --> 00:02:48.360
KICIMAN: Yeah, on average, I like to say.

00:02:48.360 --> 00:02:49.231
GEHRKE: On average? [LAUGHTER]

00:02:49.231 --> 00:02:53.800
KICIMAN: So I moved to the US with my&nbsp;
parents when I was 2 years old, and&nbsp;&nbsp;

00:02:53.800 --> 00:03:01.240
we lived in El Cerrito, a small town just north&nbsp;
of Berkeley. And then around middle school age,&nbsp;&nbsp;

00:03:01.240 --> 00:03:06.840
we moved to Piedmont, just south of Berkeley.&nbsp;
So on average, yes, I grew up in Berkeley,&nbsp;&nbsp;

00:03:06.840 --> 00:03:11.240
and I did end up going there for college.&nbsp;
And you asked about how I got into computer&nbsp;&nbsp;

00:03:11.240 --> 00:03:17.560
science. When I was probably around third or&nbsp;
fourth grade, my dad, who was a civil engineer,&nbsp;&nbsp;

00:03:17.560 --> 00:03:24.840
decided that he wanted to start a business on&nbsp;
the side, and he loved software engineering&nbsp;&nbsp;

00:03:24.840 --> 00:03:35.520
and wanted to build software to help automate a&nbsp;
lot of the more cumbersome design tasks in the&nbsp;&nbsp;

00:03:35.520 --> 00:03:43.160
design of steel connections, and so he wrote ...&nbsp;
he bought a PC and brought it home and started&nbsp;&nbsp;

00:03:43.160 --> 00:03:49.180
working on his work. But then that was also&nbsp;
my opportunity to learn what a computer was.

00:03:49.180 --> 00:03:52.000
GEHRKE: So that was your&nbsp;
first computer? Was it an x86?

00:03:52.000 --> 00:04:01.920
KICIMAN: Yes, it was an IBM PC, the first x86,&nbsp;
the one before the 286. And—it wasn't the very&nbsp;&nbsp;

00:04:01.920 --> 00:04:07.660
original PC. It did have a CGA—color graphics&nbsp;
adapter—so we could have four colors at once.

00:04:07.660 --> 00:04:08.760
GEHRKE: Nice.

00:04:08.760 --> 00:04:13.320
KICIMAN: And, yeah, that's ...&nbsp;
it came with—luckily for me,&nbsp;&nbsp;

00:04:13.320 --> 00:04:17.820
I guess—it came with a BASIC manual. So reading&nbsp;
that manual is how I learned how to program.

00:04:17.820 --> 00:04:20.320
GEHRKE: And this is the typical IBM white box with&nbsp;&nbsp;

00:04:20.320 --> 00:04:23.831
a monitor on top of it and a floppy&nbsp;
drive, or how should I picture it?

00:04:23.831 --> 00:04:25.351
KICIMAN: Yeah, two floppy drives ...
GEHRKE: Two floppy drives? OK ...

00:04:25.351 --> 00:04:28.120
KICIMAN: Two floppy drives, yeah, so&nbsp;
you could copy from one to the other.

00:04:28.120 --> 00:04:29.440
GEHRKE: Five and a quarter or three and a half?

00:04:29.440 --> 00:04:31.160
KICIMAN: Five and a quarter,&nbsp;&nbsp;

00:04:31.160 --> 00:04:37.440
yeah, yeah. The loud, clickety-clack keyboard and,&nbsp;
yeah, a nice monitor. So not the green and black;&nbsp;&nbsp;

00:04:37.440 --> 00:04:43.120
the one that could display the colors. And,&nbsp;
yeah, had a lot of fun with programming.

00:04:43.120 --> 00:04:46.280
GEHRKE: So what were some of&nbsp;
the first things that you wrote?

00:04:46.280 --> 00:04:50.080
KICIMAN: A lot of the first ones were just&nbsp;
the examples from the book, the for loops,&nbsp;&nbsp;

00:04:50.080 --> 00:04:56.480
for example. But then after that, I started&nbsp;
getting into some of the, you know, building,&nbsp;&nbsp;

00:04:56.480 --> 00:05:00.440
like, little mini painting tools. You know,&nbsp;
you could move a cursor around the screen,&nbsp;&nbsp;

00:05:00.440 --> 00:05:06.160
click a button and paint to fill in a region,&nbsp;
and then save the commands that you did to&nbsp;&nbsp;

00:05:06.160 --> 00:05:11.040
make graphics. Eventually, that actually&nbsp;
turned into, like, a friend and I really&nbsp;&nbsp;

00:05:11.040 --> 00:05:15.520
enjoyed playing computer games, so we had in&nbsp;
our mind we're going to build a computer game.

00:05:15.520 --> 00:05:16.671
GEHRKE: Who doesn't think that?

00:05:16.671 --> 00:05:17.204
KICIMAN: Of course, right?

00:05:17.204 --> 00:05:18.360
GEHRKE: Of course …

00:05:18.360 --> 00:05:22.920
KICIMAN: And so we had, like, a "choose your&nbsp;
own adventure"–style program. I think we had&nbsp;&nbsp;

00:05:22.920 --> 00:05:32.040
maybe even four or five screens you could step&nbsp;
through, right. And he was able to get some boxes,&nbsp;&nbsp;

00:05:32.040 --> 00:05:36.640
and we printed some manuals even. We had big&nbsp;
plans, but then we didn't know what to do,&nbsp;&nbsp;

00:05:36.640 --> 00:05:40.380
how to finish the game, how to get it out&nbsp;
there, so ... but we had a lot of fun.

00:05:40.380 --> 00:05:41.939
GEHRKE: Wow, that sounds amazing.

00:05:41.939 --> 00:05:42.512
KICIMAN: Really fond memories, yeah.

00:05:42.512 --> 00:05:47.360
GEHRKE: That sounds amazing. And then you&nbsp;
went to Berkeley afterwards? Is that how&nbsp;&nbsp;

00:05:47.360 --> 00:05:50.220
you realized your passion, or how did&nbsp;
you decide to study computer science?

00:05:50.220 --> 00:05:55.200
KICIMAN: Yeah ... so from that age, I was&nbsp;
set on computing. I think my parents were&nbsp;&nbsp;

00:05:55.200 --> 00:06:01.160
a bit of a devil's advocate. They wanted me to&nbsp;
consider my options. So I did consider, like,&nbsp;&nbsp;

00:06:01.160 --> 00:06:06.600
mechanical engineering or industrial engineering&nbsp;
in, like, maybe junior year of high school, but&nbsp;&nbsp;

00:06:06.600 --> 00:06:13.680
it never felt right. I went into computing, had a&nbsp;
very smooth transition into Berkeley. They have a&nbsp;&nbsp;

00:06:13.680 --> 00:06:18.280
local program where students from the local high&nbsp;
school can start to take college classes early.&nbsp;&nbsp;

00:06:18.280 --> 00:06:24.200
So I'd even started taking some computer classes&nbsp;
and then just went right into my freshman year.

00:06:24.200 --> 00:06:26.960
GEHRKE: Sounds like a very&nbsp;
smooth transition. Anything&nbsp;&nbsp;

00:06:26.960 --> 00:06:30.660
bumpy? Anything bumpy on the ride out there, or …?

00:06:30.660 --> 00:06:36.320
KICIMAN: Nothing really, nothing&nbsp;
really bumpy. I had one general&nbsp;&nbsp;

00:06:36.320 --> 00:06:41.383
engineering class that somehow got&nbsp;
on my schedule at 8 AM freshman year.

00:06:41.383 --> 00:06:42.240
GEHRKE: [LAUGHS] That's a tough one.

00:06:42.240 --> 00:06:48.800
KICIMAN: That's a tough one, yeah. And so&nbsp;
there were a few weeks I didn't attend class,&nbsp;&nbsp;

00:06:48.800 --> 00:06:53.760
and I knew there was a midterm coming up,&nbsp;
so I show up. Because, you know, next week,&nbsp;&nbsp;

00:06:53.760 --> 00:06:58.280
there's a midterm. I better figure out what&nbsp;
they're, what they're learning. And I come in&nbsp;&nbsp;

00:06:58.280 --> 00:07:02.080
a couple minutes late because it's, even though&nbsp;
I'm intending to go, it's still an 8 AM class.&nbsp;&nbsp;

00:07:02.080 --> 00:07:07.720
I show up a few minutes late, and everyone is&nbsp;
heads down writing on pieces of paper. The whole&nbsp;&nbsp;

00:07:07.720 --> 00:07:14.160
room is quiet. And the TA gives me a packet&nbsp;
and says, you might as well start now. "Oh&nbsp;&nbsp;

00:07:14.160 --> 00:07:22.440
no." And I'm like freaking out. Like this is,&nbsp;
this is a bad dream. [LAUGHS] And I'm flipping&nbsp;&nbsp;

00:07:22.440 --> 00:07:27.720
through ... not only do I not know how to answer&nbsp;
the questions; I don't understand the questions,&nbsp;&nbsp;

00:07:27.720 --> 00:07:34.080
like the vocabulary. It's only been three weeks.&nbsp;
How did they learn so much? And then I noticed&nbsp;&nbsp;

00:07:34.080 --> 00:07:40.080
that it's an open-book exam and I don't have my&nbsp;
book on top of it, like ... but what I didn't&nbsp;&nbsp;

00:07:40.080 --> 00:07:46.480
notice and what became apparent in about 20&nbsp;
minutes … the TA clapped his hands, and said,&nbsp;&nbsp;

00:07:47.480 --> 00:07:52.100
“All right, everyone, put it down. We'll go&nbsp;
over the answers now.” It was a practice.

00:07:52.100 --> 00:07:53.911
GEHRKE: Oh, lucky you.

00:07:53.911 --> 00:07:57.000
KICIMAN: Oh, my god, yes. So&nbsp;
I did nothing but study for&nbsp;&nbsp;

00:07:57.000 --> 00:08:00.220
that exam for the next week and did fine on it.

00:08:00.220 --> 00:08:02.111
GEHRKE: So you didn't have to drop&nbsp;
the class or anything like that?

00:08:02.111 --> 00:08:07.960
KICIMAN: No, no, no. I studied enough that&nbsp;
I did reasonably, you know, reasonably well.

00:08:07.960 --> 00:08:10.600
GEHRKE: At what point in time was it&nbsp;
clear to you that you wanted to do a&nbsp;&nbsp;

00:08:10.600 --> 00:08:13.280
PhD or that you wanted to continue your studies?

00:08:13.280 --> 00:08:16.800
KICIMAN: I tried to explore&nbsp;
a lot during my undergrad,&nbsp;&nbsp;

00:08:17.960 --> 00:08:24.720
so I did go off to industry for&nbsp;
a summer internship. Super fun.

00:08:24.720 --> 00:08:25.640
GEHRKE: Where did you, where did you work?

00:08:25.640 --> 00:08:26.360
KICIMAN: It was Netscape.

00:08:26.360 --> 00:08:27.060
GEHRKE: Oh, Netscape.

00:08:27.060 --> 00:08:29.440
KICIMAN: And it was a joint project with IBM.

00:08:29.440 --> 00:08:30.740
GEHRKE: Which year was that in?

00:08:30.740 --> 00:08:33.300
KICIMAN: This would have been '90, around '93.

00:08:33.300 --> 00:08:36.000
GEHRKE: ’93 … OK, so the very&nbsp;
early days of Netscape, actually.

00:08:36.000 --> 00:08:38.720
KICIMAN: Yeah, yeah. They were&nbsp;
building Netscape Navigator 4,&nbsp;&nbsp;

00:08:38.720 --> 00:08:42.300
and the project I was on was&nbsp;
Netscape Navigator for OS/2.

00:08:42.300 --> 00:08:43.240
GEHRKE: OK.

00:08:43.240 --> 00:08:48.560
KICIMAN: IBM's OS/2 had come out and was doing&nbsp;
poorly against NT, and they wanted to raise its&nbsp;&nbsp;

00:08:48.560 --> 00:08:58.240
profile. And this team of 20 people were really&nbsp;
just focused on getting this out there. And so I&nbsp;&nbsp;

00:08:58.240 --> 00:09:03.225
always thought of, you know—and I was an OS/2 user&nbsp;
already, which is how I got onto that project.

00:09:03.225 --> 00:09:04.840
GEHRKE: OK … And how was&nbsp;
the culture there, or ...?

00:09:04.840 --> 00:09:09.280
KICIMAN: The culture, it's what you would&nbsp;
think of as a startup culture. You know,&nbsp;&nbsp;

00:09:09.280 --> 00:09:13.400
they gave out all their meals. There&nbsp;
was lots of fun events. You know,&nbsp;&nbsp;

00:09:13.400 --> 00:09:17.600
dentists came into the parking lot like&nbsp;
once a month or something like that.

00:09:17.600 --> 00:09:18.640
GEHRKE: Dentist?

00:09:18.640 --> 00:09:21.600
KICIMAN: There was, like, a&nbsp;
yeah, it was, yeah, you know,&nbsp;&nbsp;

00:09:21.600 --> 00:09:25.500
everyone's working too much at the office,&nbsp;
so the company wanted to make things easy.

00:09:25.500 --> 00:09:26.520
GEHRKE: That sounds great.

00:09:26.520 --> 00:09:34.320
KICIMAN: But the next summer then, I did a&nbsp;
research internship, a research assistantship,&nbsp;&nbsp;

00:09:34.320 --> 00:09:39.800
at Berkeley. I worked with Randy&nbsp;
Katz and Eric Brewer and got into,&nbsp;&nbsp;

00:09:39.800 --> 00:09:44.720
you know, trying to understand cellphone&nbsp;
networks and what they were thinking about,&nbsp;&nbsp;

00:09:44.720 --> 00:09:48.740
you know, cloud infrastructure&nbsp;
for new cellular technologies.

00:09:48.740 --> 00:09:52.631
GEHRKE: And Eric Brewer, was he, at that point&nbsp;
in time, already running Inktomi, or ... ?

00:09:52.631 --> 00:09:55.080
KICIMAN: He was already running&nbsp;
Inktomi. Yeah, yeah, he'd already&nbsp;&nbsp;

00:09:55.080 --> 00:09:59.120
started it. I don't think it was public&nbsp;
yet at the time, but maybe getting there.

00:09:59.120 --> 00:10:02.520
GEHRKE: OK. Well, this was right at the&nbsp;
beginning when, like, all the, you know,&nbsp;&nbsp;

00:10:02.520 --> 00:10:06.880
cloud infrastructure was defined and,&nbsp;
you know, a lot of the basics were set.&nbsp;&nbsp;

00:10:06.880 --> 00:10:10.760
So you did this internship then in your,&nbsp;
after your junior year, the second one?

00:10:10.760 --> 00:10:13.320
KICIMAN: Yeah, after my junior&nbsp;
year. It was then senior year,&nbsp;&nbsp;

00:10:13.320 --> 00:10:19.240
and it was time to apply for, you know,&nbsp;
what's going to come after college.&nbsp;&nbsp;

00:10:19.240 --> 00:10:24.400
And I knew it … after that assistantship at&nbsp;
Berkeley, I knew I was going to go do a PhD.

00:10:24.400 --> 00:10:28.860
GEHRKE: So what is the thing about the internship&nbsp;
that made you want to stay in research?

00:10:28.860 --> 00:10:36.400
KICIMAN: Oh, it's just the ... it gave a vision&nbsp;
of the future. Like, we were playing with, like,&nbsp;&nbsp;

00:10:36.400 --> 00:10:40.720
you know, there were people in the lab&nbsp;
playing with video over the internet and,&nbsp;&nbsp;

00:10:40.720 --> 00:10:48.520
you know, teleconferencing, and just seeing&nbsp;
that, it felt like you were seeing into the&nbsp;&nbsp;

00:10:48.520 --> 00:10:57.080
future and diving deep technically across the&nbsp;
stack in a way that the industry internship&nbsp;&nbsp;

00:10:57.840 --> 00:11:02.520
hadn't done. And so that part of it and&nbsp;
obviously lots of particulars. You know,&nbsp;&nbsp;

00:11:02.520 --> 00:11:05.840
lots of internships do go very&nbsp;
deep in industry, as well,&nbsp;&nbsp;

00:11:05.840 --> 00:11:12.700
but that's what struck me, is that, kind&nbsp;
of, wanting to learn was the big driver.

00:11:12.700 --> 00:11:17.160
GEHRKE: And what excited you about systems&nbsp;
as compared to something that's more&nbsp;&nbsp;

00:11:17.160 --> 00:11:21.000
applications-oriented or more touching the user?&nbsp;
I feel like systems you always have to have this,&nbsp;&nbsp;

00:11:21.000 --> 00:11:24.840
kind of, drive for infrastructure&nbsp;
and for scale and for, you know,&nbsp;&nbsp;

00:11:24.840 --> 00:11:28.240
building the foundation as compared&nbsp;
to, like, directly impacting the user.

00:11:28.240 --> 00:11:35.360
KICIMAN: I think the way I think about&nbsp;
systems today—and I can't remember what&nbsp;&nbsp;

00:11:35.360 --> 00:11:41.720
it was about systems then. I'd always done&nbsp;
operating ... like, operating systems was&nbsp;&nbsp;

00:11:41.720 --> 00:11:45.000
one of my first upper-division courses&nbsp;
at Berkeley and everything. So, like,&nbsp;&nbsp;

00:11:45.000 --> 00:11:51.680
I certainly enjoyed it a lot. But the way I&nbsp;
think about systems now—and I think I do bring&nbsp;&nbsp;

00:11:51.680 --> 00:11:59.240
systems thinking to a lot of the work I do, even&nbsp;
in AI and responsible AI—is the way you structure&nbsp;&nbsp;

00:11:59.840 --> 00:12:06.600
software, it feels like you should be making a&nbsp;
statement about what the underlying problem is,&nbsp;&nbsp;

00:12:06.600 --> 00:12:12.240
what is the component you should be building from&nbsp;
an elegance or first-principles perspective. But&nbsp;&nbsp;

00:12:12.240 --> 00:12:19.080
really, it's about the people who are going&nbsp;
to be using and building and maintaining that&nbsp;&nbsp;

00:12:19.080 --> 00:12:25.120
system. You want to componentize it so that&nbsp;
the teams who are going to be building the&nbsp;&nbsp;

00:12:25.120 --> 00:12:31.240
bigger thing can work independently, revise&nbsp;
and update their software without having to&nbsp;&nbsp;

00:12:31.240 --> 00:12:35.800
coordinate every little thing. I think that's&nbsp;
where that systems thinking comes in for me,&nbsp;&nbsp;

00:12:35.800 --> 00:12:40.400
is what's the right abstraction that's&nbsp;
going to decouple folks from each other.

00:12:40.400 --> 00:12:45.280
GEHRKE: That's a really great analogy because the&nbsp;
way it was once told to me was that systems is&nbsp;&nbsp;

00:12:45.280 --> 00:12:49.920
really about discovering the beauty in large&nbsp;
software. Because once you touch the user,&nbsp;&nbsp;

00:12:49.920 --> 00:12:52.120
you, sort of, have to do whatever is necessary to,&nbsp;&nbsp;

00:12:52.120 --> 00:12:55.760
you know, make the user happy. But in the&nbsp;
foundations, you should have simplicity;&nbsp;&nbsp;

00:12:55.760 --> 00:12:59.120
you should have ease; you should have&nbsp;
elegance. Is that how you think about it?

00:12:59.120 --> 00:13:03.880
KICIMAN: I do think about those aspects, but it's&nbsp;
for a purpose. You know, you want the elegance&nbsp;&nbsp;

00:13:03.880 --> 00:13:09.760
and the simplicity so that you can have, you&nbsp;
know, one team working on Layer 1 of the stack,&nbsp;&nbsp;

00:13:09.760 --> 00:13:13.120
another team working on Layer 2 of the&nbsp;
stack, and you don't want them to have&nbsp;&nbsp;

00:13:13.120 --> 00:13:19.960
to talk to each other every 10 minutes when&nbsp;
they're making any change to any line of code,&nbsp;&nbsp;

00:13:19.960 --> 00:13:23.640
right. And so thinking about, what is the&nbsp;
more fundamental layer of abstraction that&nbsp;&nbsp;

00:13:23.640 --> 00:13:29.720
lets these people work on separate problems?&nbsp;
That's what's important to me. And, of course,&nbsp;&nbsp;

00:13:29.720 --> 00:13:35.640
like, that then interplays with people's&nbsp;
interests and expertise. And as people's&nbsp;&nbsp;

00:13:35.640 --> 00:13:41.560
expertise evolves, that might mean that that&nbsp;
has implications for the design of your system.

00:13:41.560 --> 00:13:44.360
GEHRKE: And so you're, OK, you're&nbsp;
an undergrad. You have done this&nbsp;&nbsp;

00:13:44.360 --> 00:13:47.160
research experience; you now apply. So now you go&nbsp;&nbsp;

00:13:47.160 --> 00:13:49.860
to grad school. Do you do anything fun&nbsp;
between your undergrad and grad school?

00:13:49.860 --> 00:13:50.980
KICIMAN: No, I went straight in.

00:13:50.980 --> 00:13:51.620
GEHRKE: Right straight in?

00:13:51.620 --> 00:13:55.440
KICIMAN: Right straight in. I did&nbsp;
my PhD at Stanford. So I went,&nbsp;&nbsp;

00:13:55.440 --> 00:13:56.587
you know, a little way to school.

00:13:56.587 --> 00:13:59.068
GEHRKE: To a rival school, isn't&nbsp;
it? Isn't it a big rival school?

00:13:59.068 --> 00:14:00.760
KICIMAN: To a rival school. Well,&nbsp;
the undergrad school wins. I think&nbsp;&nbsp;

00:14:00.760 --> 00:14:05.800
that's the general rule of thumb.&nbsp;
But I did continue working with&nbsp;&nbsp;

00:14:05.800 --> 00:14:09.480
folks at Berkeley. So my adviser&nbsp;
was also from Berkeley and so ...

00:14:09.480 --> 00:14:10.360
GEHRKE: Who was your adviser?

00:14:10.360 --> 00:14:11.765
KICIMAN: My adviser was Armando Fox, …

00:14:11.765 --> 00:14:13.267
GEHRKE: OK, yeah. Mm-hmm.

00:14:13.267 --> 00:14:14.260
KICIMAN: … and we had a ...

00:14:14.260 --> 00:14:15.120
GEHRKE: Recovery-oriented computing?
KICIMAN: Yes, exactly. Recovery-oriented

00:14:15.120 --> 00:14:23.000
computing. And the other person on the&nbsp;
recovery-oriented computing project ...

00:14:23.000 --> 00:14:24.027
GEHRKE: Dave Patterson ...

00:14:24.027 --> 00:14:25.180
KICIMAN: ... was Dave Patterson, yeah.

00:14:25.180 --> 00:14:28.360
GEHRKE: So it was really a true, sort of,&nbsp;
Stanford-Berkeley joint project in a way?

00:14:28.360 --> 00:14:37.400
KICIMAN: Yes, yeah. And that was my PhD. The work&nbsp;
I did then was the first work to apply machine&nbsp;&nbsp;

00:14:37.400 --> 00:14:43.680
learning to the problem of fault detection and&nbsp;
diagnosis in large-scale systems. I worked with&nbsp;&nbsp;

00:14:43.680 --> 00:14:50.960
two large companies—one of them was Amazon; one&nbsp;
of them was anonymous—to test out these ideas&nbsp;&nbsp;

00:14:50.960 --> 00:14:56.840
in more realistic settings. And then I did a lot&nbsp;
of open-source work with J2EE to demonstrate how&nbsp;&nbsp;

00:14:56.840 --> 00:15:01.960
you can trace the behavior of a system and build&nbsp;
up models of its behavior and detect anomalies.&nbsp;&nbsp;

00:15:01.960 --> 00:15:09.800
Funnily enough, I know this is going to sound a&nbsp;
little alien to us now maybe in today's world:&nbsp;&nbsp;

00:15:09.800 --> 00:15:14.560
Dave and Armando would not let me use&nbsp;
the phrase "artificial intelligence"&nbsp;&nbsp;

00:15:14.560 --> 00:15:18.420
anywhere in my thesis because they were&nbsp;
worried I would not be able to get a job.

00:15:18.420 --> 00:15:22.160
GEHRKE: I see. Because that was, sort of,&nbsp;
one of ... I mean, AI goes through these&nbsp;&nbsp;

00:15:22.160 --> 00:15:27.040
hype cycles and then, you know, the winters&nbsp;
again, and so this was one of the winter times?

00:15:27.040 --> 00:15:32.200
KICIMAN: This was definitely a wintertime. I was&nbsp;
able to use the phrase "machine learning" in the&nbsp;&nbsp;

00:15:32.200 --> 00:15:39.520
body of the thesis, but I had to make up something&nbsp;
about statistical monitoring for the title.

00:15:39.520 --> 00:15:42.280
GEHRKE: So what is the actual final&nbsp;
title of your thesis, if you remember it?

00:15:42.280 --> 00:15:44.680
KICIMAN: "Statistical monitoring for fault&nbsp;&nbsp;

00:15:44.680 --> 00:15:49.516
detection and diagnosis in large-scale&nbsp;
internet services" or something like that.

00:15:49.512 --> 00:15:51.720
GEHRKE: So you replaced AI&nbsp;
with statistical modeling&nbsp;&nbsp;

00:15:51.720 --> 00:15:52.831
and then everything [turned out all right]?

00:15:52.831 --> 00:15:55.960
KICIMAN: Yes, yeah. Everything ...&nbsp;
then it didn't sound too hype-y.

00:15:55.960 --> 00:16:01.920
GEHRKE: And then after your PhD, you&nbsp;
went straight to MSR, is that right?

00:16:01.920 --> 00:16:08.880
KICIMAN: Yeah. I mean, so here I'm coming out of&nbsp;
my PhD with a focus on academic-style research for&nbsp;&nbsp;

00:16:08.880 --> 00:16:15.200
large-scale systems. Kind of boxed myself in&nbsp;
a little bit. No university has a large-scale&nbsp;&nbsp;

00:16:15.200 --> 00:16:20.440
internet service, and most large-scale internet&nbsp;
service companies don't have research arms. So&nbsp;&nbsp;

00:16:20.440 --> 00:16:25.320
Microsoft Research was actually the perfect&nbsp;
fit for this work. And when I got here,&nbsp;&nbsp;

00:16:25.320 --> 00:16:28.960
I started diving in and actually expanding&nbsp;
a little bit and thinking about what are the&nbsp;&nbsp;

00:16:28.960 --> 00:16:35.640
end-to-end reliability issues with our services.&nbsp;
So assume that the back end is running well. What&nbsp;&nbsp;

00:16:35.640 --> 00:16:39.680
else could go wrong that's going to get in the&nbsp;
way of the user? So I had one project going on,&nbsp;&nbsp;

00:16:39.680 --> 00:16:44.512
wide area network reliability with&nbsp;
David Maltz, and one project ...

00:16:44.512 --> 00:16:45.840
GEHRKE: Who is now CVP [corporate vice president] in Azure.

00:16:45.840 --> 00:16:48.920
KICIMAN: Who's now, yeah, leading&nbsp;
Azure network—the head of Azure&nbsp;&nbsp;

00:16:48.920 --> 00:16:59.240
networking. And one project on how we can&nbsp;
monitor the behavior of our JavaScript&nbsp;&nbsp;

00:16:59.240 --> 00:17:03.320
applications that were just starting to become&nbsp;
big. Like around then is when, you know,&nbsp;&nbsp;

00:17:03.320 --> 00:17:09.120
the first 10,000-line, 100,000-line-of-code&nbsp;
JavaScript applications [were] appearing,&nbsp;&nbsp;

00:17:09.120 --> 00:17:12.280
and we had no idea whether they were actually&nbsp;
running correctly, right? They're running&nbsp;&nbsp;

00:17:12.280 --> 00:17:15.540
on someone else's browser and someone&nbsp;
else's operating system. We didn't know.

00:17:15.540 --> 00:17:17.960
GEHRKE: A big one at that point in time,&nbsp;
I think was Gmail, right? This was,&nbsp;&nbsp;

00:17:17.960 --> 00:17:20.300
sort of, a really big one. But did&nbsp;
we have any big ones in Microsoft?

00:17:20.300 --> 00:17:22.783
KICIMAN: Gmail was the first&nbsp;
big one in the industry.

00:17:22.783 --> 00:17:23.840
GEHRKE: Hotmail, was it also&nbsp;
JavaScript based?

00:17:23.840 --> 00:17:28.240
KICIMAN: Hotmail was not initially&nbsp;
JavaScript based. The biggest one at&nbsp;&nbsp;

00:17:28.240 --> 00:17:33.800
that time was our maps. Not Bing&nbsp;
Maps, but whatever we called it.

00:17:33.800 --> 00:17:35.440
GEHRKE: MSN Maps, or ...

00:17:35.440 --> 00:17:37.100
KICIMAN: Probably something like that, yeah, yeah.

00:17:37.100 --> 00:17:41.800
GEHRKE: I see. And so you applied your techniques&nbsp;
to that code base and tried to find a lot of bugs?

00:17:41.800 --> 00:17:46.560
KICIMAN: Yeah, this project was—and this was&nbsp;
about data gathering, right, so I'm still&nbsp;&nbsp;

00:17:46.560 --> 00:17:51.120
thinking about it from the perspective of how&nbsp;
do I analyze data to tell me what's going on.&nbsp;&nbsp;

00:17:51.120 --> 00:17:55.880
We had data for the wide area network, but these&nbsp;
web applications, we didn't have any. So I'm,&nbsp;&nbsp;

00:17:55.880 --> 00:18:00.120
like, I'm going to build this infrastructure,&nbsp;
collect the data, so that in a couple years,&nbsp;&nbsp;

00:18:00.120 --> 00:18:08.760
I can analyze it. And so what I wrote was a proxy&nbsp;
that sat on the side of the IIS server and just&nbsp;&nbsp;

00:18:08.760 --> 00:18:15.200
dynamically instrumented all the JavaScript that&nbsp;
got shipped out. And the idea was that no one user&nbsp;&nbsp;

00:18:15.200 --> 00:18:21.880
was going to pay the cost of the instrumentation,&nbsp;
but everyone would pay a little small percentage,&nbsp;&nbsp;

00:18:21.880 --> 00:18:25.240
and then you could collect it in the back&nbsp;
end to get the full complete picture.

00:18:25.240 --> 00:18:28.760
GEHRKE: Right. It's so interesting because, I&nbsp;
mean, in those days, right, you still thought&nbsp;&nbsp;

00:18:29.600 --> 00:18:32.920
maybe in terms of years and so on, right.&nbsp;
I mean, you've said, well, I instrumented,&nbsp;&nbsp;

00:18:32.920 --> 00:18:35.480
then maybe in a year, I have some&nbsp;
data. And today it happens that I&nbsp;&nbsp;

00:18:35.480 --> 00:18:38.840
instrument, and tomorrow I have enough data&nbsp;
to make a decision on an A/B test and so on,&nbsp;&nbsp;

00:18:38.840 --> 00:18:42.120
right. It was a very different time, right.&nbsp;
And also, it was probably a defining time&nbsp;&nbsp;

00:18:42.120 --> 00:18:46.040
for Microsoft because we moved into online&nbsp;
services, right. We moved into large-scale&nbsp;&nbsp;

00:18:46.040 --> 00:18:49.320
internet services. So it must have been&nbsp;
exciting to be in the middle of all of this.

00:18:49.320 --> 00:18:54.640
KICIMAN: It really was. I mean, there was a lot of&nbsp;
change happening both inside Microsoft and outside&nbsp;&nbsp;

00:18:54.640 --> 00:19:02.880
Microsoft. That's when ... soon after this is&nbsp;
when social networking started to become big,&nbsp;&nbsp;

00:19:02.880 --> 00:19:12.440
right. You started seeing Facebook and&nbsp;
Twitter show up, and search became a&nbsp;&nbsp;

00:19:12.440 --> 00:19:17.480
bigger deal for Microsoft when we started&nbsp;
investing in Windows Live and then Bing,&nbsp;&nbsp;

00:19:20.320 --> 00:19:25.120
and that's actually ... my manager, Yi-Min&nbsp;
Wang, actually joined up with Harry Shum&nbsp;&nbsp;

00:19:25.120 --> 00:19:29.720
to create the Internet Services Research&nbsp;
Center with the specific focus of helping&nbsp;&nbsp;

00:19:29.720 --> 00:19:36.520
Bing. And so that also shifted my focus a&nbsp;
little bit and so had me looking more at&nbsp;&nbsp;

00:19:36.520 --> 00:19:40.320
some of the social data that would, kind of,&nbsp;
take my trajectory on a little bit further.

00:19:40.320 --> 00:19:43.600
GEHRKE: Right. I mean, so you're unique&nbsp;
in that, you know, people very often,&nbsp;&nbsp;

00:19:43.600 --> 00:19:46.480
they come in here and, you know, they're&nbsp;
specialists in systems, and they branch&nbsp;&nbsp;

00:19:46.480 --> 00:19:50.360
out within systems a little bit and, you know,&nbsp;
of course, move with time. Maybe now they do,&nbsp;&nbsp;

00:19:50.360 --> 00:19:55.480
you know, AI infrastructure. But you have&nbsp;
really moved quite a bit, right. I mean,&nbsp;&nbsp;

00:19:55.480 --> 00:20:01.600
you did your PhD on systems … I mean, systems&nbsp;
and AI really, the way I understand it. Then you&nbsp;&nbsp;

00:20:01.600 --> 00:20:06.080
worked here a little bit more on systems in wide&nbsp;
area and large-scale systems. But then, you know,&nbsp;&nbsp;

00:20:06.080 --> 00:20:12.200
you really became also an expert in causality and&nbsp;
looked at, sort of, the social side. And now you,&nbsp;&nbsp;

00:20:12.200 --> 00:20:17.200
of course, have started to move very deeply into&nbsp;
LLMs. So rather than talking about the topics

00:20:17.200 --> 00:20:23.000
themselves, how do you decide? How do you make these
decisions? How do you ... you know, you're a world&nbsp;&nbsp;

00:20:23.000 --> 00:20:28.000
expert on x, and how do you, in some sense,&nbsp;
throw it all away and go to y? Do you decide&nbsp;&nbsp;

00:20:28.000 --> 00:20:32.400
one day, "I'm interested in y"? Do you, sort of,&nbsp;
shift over time a little bit? How do you do it?

00:20:32.400 --> 00:20:38.080
KICIMAN: I've done it, I think, two or maybe&nbsp;
three times, depending on if you count now,&nbsp;&nbsp;

00:20:38.080 --> 00:20:44.840
and some transitions have gone better&nbsp;
than others. I think my transition from&nbsp;&nbsp;

00:20:44.840 --> 00:20:52.720
systems to social data and computational&nbsp;
social science, it was driven by a project&nbsp;&nbsp;

00:20:52.720 --> 00:21:00.640
that we did for search at the time. Shuo Chen,&nbsp;
another researcher here at Microsoft Research,&nbsp;&nbsp;

00:21:00.640 --> 00:21:07.520
built a web application that let you give very
concrete feedback to Windows Live. You could

00:21:07.520 --> 00:21:14.280
drag and drop the results around and say, this&nbsp;
is what I wanted it to look like. And this made,&nbsp;&nbsp;

00:21:14.280 --> 00:21:19.400
you know, feedback much more actionable and helped&nbsp;
really understand DSATs and where they're coming&nbsp;&nbsp;

00:21:19.400 --> 00:21:24.920
from. DSAT being dissatisfactions. And I&nbsp;
looked at that and I was like, I want to&nbsp;&nbsp;

00:21:24.920 --> 00:21:31.160
be able to move search results around and share&nbsp;
with my friends. And I, kind of, poked at Shuo,&nbsp;&nbsp;

00:21:31.160 --> 00:21:35.600
you know, asked him if he would build this, and&nbsp;
he said no. He said he's busy. So eventually,&nbsp;&nbsp;

00:21:35.600 --> 00:21:43.360
I—because I knew something about JavaScript&nbsp;
applications—decided to just drop things and spend&nbsp;&nbsp;

00:21:43.360 --> 00:21:49.600
six months building out this application. So I&nbsp;
built out this social search application where you&nbsp;&nbsp;

00:21:49.600 --> 00:21:54.400
could drag and drop search results around, share&nbsp;
it with your friends, and we put it out, actually.&nbsp;&nbsp;

00:21:54.400 --> 00:22:00.820
We got it deployed as an external service.&nbsp;
We had maybe 10,000 people kick the tires.

00:22:00.820 --> 00:22:02.591
GEHRKE: Within Microsoft or ...?

00:22:02.591 --> 00:22:03.120
KICIMAN: No, externally.

00:22:03.120 --> 00:22:03.760
GEHRKE: OK.

00:22:03.760 --> 00:22:09.360
KICIMAN: Yeah. There was a great headline that,&nbsp;
like, Google then fast followed with a similar&nbsp;&nbsp;

00:22:09.360 --> 00:22:13.840
feature, and the headline was like, Google fast&nbsp;
follows, basically, on Microsoft. Our PR folks&nbsp;&nbsp;

00:22:13.840 --> 00:22:21.400
were very excited about that. I say this all&nbsp;
... I mean, it's all history now. But certainly,&nbsp;&nbsp;

00:22:21.400 --> 00:22:29.000
it was fun at the time. But now we're&nbsp;
... I'm giving this demo, this talk,&nbsp;&nbsp;

00:22:29.000 --> 00:22:33.520
about this prototype that we built and what we're&nbsp;
learning about, you know, what's in people's way,&nbsp;&nbsp;

00:22:33.520 --> 00:22:38.280
what's friction, what do they like and not like,&nbsp;
etc. And I'm standing up and, you know, giving&nbsp;&nbsp;

00:22:38.280 --> 00:22:42.920
this presentation, this demo, and someone says,&nbsp;
hey could you, could you go back to, you know, go&nbsp;&nbsp;

00:22:42.920 --> 00:22:49.400
back in the browser? On the bottom right corner,&nbsp;
it says Mike did something on this search page;&nbsp;&nbsp;

00:22:49.400 --> 00:22:55.520
he edited some search results. Could you click on&nbsp;
that? I want to know what he did. I'm like, OK,&nbsp;&nbsp;

00:22:55.520 --> 00:23:01.240
yeah, sure. I click on it. And [it’s like], OK,&nbsp;
that's great. That's, that's really interesting.&nbsp;&nbsp;

00:23:01.240 --> 00:23:05.680
And this happened multiple times. Like, in a&nbsp;
formal presentation, for someone to interrupt&nbsp;&nbsp;

00:23:05.680 --> 00:23:11.960
you and ask a personal question just out of their&nbsp;
own curiosity, that's what showed me … that's what&nbsp;&nbsp;

00:23:11.960 --> 00:23:17.800
got me really thinking deeply about the value of&nbsp;
this social data and, like, why is it locked up&nbsp;&nbsp;

00:23:17.800 --> 00:23:22.760
in a very specific interface. What else could&nbsp;
you do with this data if it's so engaging, so&nbsp;&nbsp;

00:23:22.760 --> 00:23:30.840
fascinating, that people are willing to interrupt&nbsp;
a speaker for some totally irrelevant, basically,&nbsp;&nbsp;

00:23:30.840 --> 00:23:36.800
question? And that's when I switched to really&nbsp;
trying to figure out what to do with social data.

00:23:36.800 --> 00:23:41.000
GEHRKE: I see. So it was this, kind&nbsp;
of, really personal experience of&nbsp;&nbsp;

00:23:41.000 --> 00:23:45.000
people being so excited about that social&nbsp;
interaction on the demos that you're giving.

00:23:45.000 --> 00:23:47.560
KICIMAN: Exactly. They cared&nbsp;
about their friends and what&nbsp;&nbsp;

00:23:47.560 --> 00:23:51.020
their friends did, and that was super clear.

00:23:51.020 --> 00:23:53.840
GEHRKE: So, so coming back,&nbsp;
let's go there in a second,&nbsp;&nbsp;

00:23:53.840 --> 00:23:57.020
but coming back to the story that you told,&nbsp;
you said you had 10,000 external users.

00:23:57.020 --> 00:23:57.620
KICIMAN: Yeah.

00:23:57.620 --> 00:24:02.600
GEHRKE: So I'm still, you know, also always&nbsp;
trying to learn what we can do better because&nbsp;&nbsp;

00:24:02.600 --> 00:24:08.160
we sometimes have prototypes that are incredibly&nbsp;
valuable. They're prototypes that have fans;&nbsp;&nbsp;

00:24:08.160 --> 00:24:12.960
they're prototypes that, you know, the fans even&nbsp;
want to contribute. But then somehow, we get stuck&nbsp;&nbsp;

00:24:12.960 --> 00:24:16.680
in the middle; and they don't scale, and they&nbsp;
don't become a business. What happened with that?

00:24:16.680 --> 00:24:17.860
KICIMAN: Yeah.

00:24:17.860 --> 00:24:19.071
GEHRKE: Also in [retrospect], ...

00:24:19.071 --> 00:24:20.022
KICIMAN: In retrospect ...

00:24:20.022 --> 00:24:21.640
GEHRKE: … what, what ... should&nbsp;
we have done something different,&nbsp;&nbsp;

00:24:21.640 --> 00:24:23.660
or did it live up to its potential?

00:24:23.660 --> 00:24:28.560
KICIMAN: I think we learned something. I think&nbsp;
that there were a couple of things we learned. One&nbsp;&nbsp;

00:24:28.560 --> 00:24:34.480
was that, you know, every extra click that&nbsp;
people wanted to do, you know, took the number&nbsp;&nbsp;

00:24:34.480 --> 00:24:40.480
of interactions down by, you know, an order of&nbsp;
magnitude. So starring something and bringing&nbsp;&nbsp;

00:24:40.480 --> 00:24:46.280
it to the top, that was very popular. Dragging&nbsp;
and dropping? Little bit less so. Dragging and&nbsp;&nbsp;

00:24:46.280 --> 00:24:54.480
dropping from one search to a different search?&nbsp;
So maybe I'll search for, you know, "Johannes,"&nbsp;&nbsp;

00:24:54.480 --> 00:24:59.680
find your homepage, and then drag and drop it to,&nbsp;
like, people's, you know, publications list to,&nbsp;&nbsp;

00:24:59.680 --> 00:25:08.440
like, keep an eye on or something. Like that,&nbsp;
almost never. And people were very wary about&nbsp;&nbsp;

00:25:08.440 --> 00:25:14.920
editing the page. Like, what if I make a mistake?&nbsp;
What if it's just, just me, like, who wants this,&nbsp;&nbsp;

00:25:14.920 --> 00:25:19.160
and I'm messing up search for the rest of the&nbsp;
world? And it's like, no, no, it's just your&nbsp;&nbsp;

00:25:19.160 --> 00:25:23.560
friends, like just you and your friends who are&nbsp;
going to see this. And so we learned a lot about&nbsp;&nbsp;

00:25:23.560 --> 00:25:28.320
people's mental models and, like, what stood in&nbsp;
the way of, you know, interactions on the web.&nbsp;&nbsp;

00:25:29.280 --> 00:25:34.920
There were lots of challenges to doing this&nbsp;
at scale. I mean, we needed, for example,&nbsp;&nbsp;

00:25:34.920 --> 00:25:40.160
a way of tracking users. We needed a way&nbsp;
of very quickly, within 100 milliseconds,&nbsp;&nbsp;

00:25:40.160 --> 00:25:47.760
getting information about a user's past edits&nbsp;
to search pages into, you know, into memory&nbsp;&nbsp;

00:25:47.760 --> 00:25:51.380
if we were going to do this for real on Windows&nbsp;
Live. And we just didn't have the infrastructure.

00:25:51.380 --> 00:25:53.560
GEHRKE: I see. And those&nbsp;
problems were hard in those days.

00:25:53.560 --> 00:25:59.760
KICIMAN: Yeah. A prototype is fine. People,&nbsp;
you know, will handle a little bit of latency&nbsp;&nbsp;

00:25:59.760 --> 00:26:05.800
if it's a research prototype, but for&nbsp;
everyday use, you need something more.

00:26:05.800 --> 00:26:10.471
GEHRKE: And there was no push to try&nbsp;
it, to land it somehow, or what ... ?

00:26:10.471 --> 00:26:13.312
KICIMAN: There were big pushes, but&nbsp;
the infrastructure, it was really ...

00:26:13.312 --> 00:26:15.031
GEHRKE: I see. It was really an&nbsp;
infrastructure problem, then?

00:26:15.031 --> 00:26:16.256
KICIMAN: Yeah, yeah.

00:26:16.256 --> 00:26:18.600
GEHRKE: OK. Interesting because it sounds to me&nbsp;
like, wow, there's an exciting research problem&nbsp;&nbsp;

00:26:18.600 --> 00:26:22.520
there; now you need the infrastructure to try&nbsp;
to make all of these things really, really fast.&nbsp;&nbsp;

00:26:22.520 --> 00:26:27.380
It's always fascinating to see, you know, where&nbsp;
things get stuck and how they, how they proceed.

00:26:27.380 --> 00:26:29.920
KICIMAN: Yeah, I think it'd be a&nbsp;
lot easier to build that—from an&nbsp;&nbsp;

00:26:29.920 --> 00:26:34.200
infrastructure point of view—today. But, of&nbsp;
course, then there's lots of other questions,&nbsp;&nbsp;

00:26:34.200 --> 00:26:37.000
like is this really, you know,
the best thing to do. Like I mentioned,

00:26:37.000 --> 00:26:42.388
Google had this fast follow feature.&nbsp;
They also removed it afterwards, as well.

00:26:42.388 --> 00:26:48.360
GEHRKE: OK. Yeah, hindsight is always, you&nbsp;
know, twenty-twenty. So, OK, so you're now&nbsp;&nbsp;

00:26:48.360 --> 00:26:53.480
starting to move into social computing, right,&nbsp;
and trying to understand more about social&nbsp;&nbsp;

00:26:53.480 --> 00:26:58.440
interactions between users. How did you end up in&nbsp;
causality, and then how did you make the switch&nbsp;&nbsp;

00:26:58.440 --> 00:27:01.920
to LLMs? And maybe even more about this; I&nbsp;
mean, I understand here this was, sort of,&nbsp;&nbsp;

00:27:01.920 --> 00:27:07.880
this personal story that you really saw that,&nbsp;
you know, the audience was really asking you&nbsp;&nbsp;

00:27:07.880 --> 00:27:12.840
about what's happening here and that, sort of,&nbsp;
motivated you. Was it always this personal drive,&nbsp;&nbsp;

00:27:12.840 --> 00:27:16.680
or was it always others who pulled you?&nbsp;
And how did you make these switches?

00:27:16.680 --> 00:27:22.200
KICIMAN: I think the switch from systems into&nbsp;
social, it was about trying to get closer to&nbsp;&nbsp;

00:27:22.200 --> 00:27:29.200
problems that really mattered to people. I really&nbsp;
enjoy working on systems problems, but oftentimes,&nbsp;&nbsp;

00:27:29.200 --> 00:27:34.560
they feel like they're in the back end. And so&nbsp;
I wanted something where, you know, even if I'm&nbsp;&nbsp;

00:27:34.560 --> 00:27:39.720
not the domain expert working on something,&nbsp;
I can feel like I'm making a contribution to&nbsp;&nbsp;

00:27:39.720 --> 00:27:51.720
that problem. The transition with social data then&nbsp;
into causality and, um, and LLMs, that was a bit&nbsp;&nbsp;

00:27:51.720 --> 00:27:57.000
smoother. So working with social data, trying to&nbsp;
understand what it meant and what it said about&nbsp;&nbsp;

00:27:57.000 --> 00:28:03.920
the world in aggregate, was a super-fascinating
problem. So much information is embedded in the

00:28:03.920 --> 00:28:09.520
digital traces that people leave behind. But it&nbsp;
was really difficult for people to come to solid&nbsp;&nbsp;

00:28:09.520 --> 00:28:16.080
conclusions. So there was one conference I went&nbsp;
to where almost every presentation that day gave&nbsp;&nbsp;

00:28:16.080 --> 00:28:21.480
some fascinating insight. This is how people make&nbsp;
friendships. This is how, you know, we're seeing,&nbsp;&nbsp;

00:28:21.480 --> 00:28:28.080
like, signs of disease spread in, you know,&nbsp;
through real-world interactions as they're in&nbsp;&nbsp;

00:28:28.080 --> 00:28:34.320
social data. Here's how people spend their time.&nbsp;
And then people would, and then people would&nbsp;&nbsp;

00:28:34.320 --> 00:28:39.760
close; their conclusion slide every time was,&nbsp;
"And, of course, correlation is not causation,&nbsp;&nbsp;

00:28:39.760 --> 00:28:46.400
so anything could actually be happening." Like,&nbsp;
that is such, that is such a bummer. Like,&nbsp;&nbsp;

00:28:46.400 --> 00:28:51.760
beautiful theory, great understanding. You spent&nbsp;
so much time. I feel like I got some insight. And&nbsp;&nbsp;

00:28:51.760 --> 00:28:59.800
then you pull the rug out and say, but maybe&nbsp;
not. And I'd heard about this work on ... that&nbsp;&nbsp;

00:28:59.800 --> 00:29:05.200
there was work on causal analysis and that there&nbsp;
were certain conditions and ways to get actual&nbsp;&nbsp;

00:29:05.200 --> 00:29:09.480
learned causal relationships from data. So that's&nbsp;
the day I decided I'm going to go figure out what&nbsp;&nbsp;

00:29:09.480 --> 00:29:14.480
that is and how to apply it to social data&nbsp;
for these types of questions. And I went out,&nbsp;&nbsp;

00:29:14.480 --> 00:29:21.640
and the first work there was a collaboration with&nbsp;
Munmun De Choudhury, faculty at Georgia Tech,&nbsp;&nbsp;

00:29:21.640 --> 00:29:31.080
looking at online traces related to mental health&nbsp;
and suicidal ideation and trying to understand&nbsp;&nbsp;

00:29:31.080 --> 00:29:38.080
what some of the factors were in a more, in&nbsp;
a more solid and causal fashion. And so this&nbsp;&nbsp;

00:29:38.080 --> 00:29:43.240
really became, like, this was ... this interest&nbsp;
in computational social science really ended up&nbsp;&nbsp;

00:29:43.240 --> 00:29:48.520
branching out into two areas. One, obviously,&nbsp;
I'm caring about, what can we learn about the&nbsp;&nbsp;

00:29:48.520 --> 00:29:54.280
world? Part of this is, of course, thinking&nbsp;
deeply about the implications of AI on society,&nbsp;&nbsp;

00:29:54.280 --> 00:29:58.120
like what is it going to mean that we&nbsp;
have this data for all of these, you know,&nbsp;&nbsp;

00:29:58.120 --> 00:30:04.400
societal challenges? And then causality. So the&nbsp;
AI and its implications on society is what led&nbsp;&nbsp;

00:30:04.400 --> 00:30:10.720
towards the work on the security of AI systems&nbsp;
and now security of AI as it relates to large&nbsp;&nbsp;

00:30:10.720 --> 00:30:15.840
language models. And then causality was the&nbsp;
other branch that split off from there. Both&nbsp;&nbsp;

00:30:15.840 --> 00:30:21.720
of them really stemming from this desire to&nbsp;
see that we have a positive impact with AI.

00:30:21.720 --> 00:30:24.640
GEHRKE: So you mentioned that, you know, you&nbsp;
were sitting in these talks and people are&nbsp;&nbsp;

00:30:24.640 --> 00:30:28.000
talking about the correlation, and&nbsp;
now you finally have this new tool,&nbsp;&nbsp;

00:30:28.000 --> 00:30:32.240
which is causation. So what are some of the&nbsp;
examples where, you know, with correlation&nbsp;&nbsp;

00:30:32.240 --> 00:30:36.680
you came out with answer A, but now causation&nbsp;
gave you some better, some real deep insights?

00:30:36.680 --> 00:30:41.736
KICIMAN: I haven't gone looking&nbsp;
to refute studies, so ...

00:30:41.736 --> 00:30:42.827
GEHRKE: I see. OK.

00:30:42.827 --> 00:30:46.920
KICIMAN: ... but there are many well-known&nbsp;
studies in the past where people have made&nbsp;&nbsp;

00:30:46.920 --> 00:30:52.360
mistakes because they didn't account for the&nbsp;
right confounding variables. Ronny Kohavi has&nbsp;&nbsp;

00:30:52.360 --> 00:30:58.280
a great list of these on one of his websites.&nbsp;
But a fun one is a study that came out in the&nbsp;&nbsp;

00:30:58.280 --> 00:31:08.320
late '90s on the influence of night lights on&nbsp;
myopia in children. So this was a big splash.&nbsp;&nbsp;

00:31:08.320 --> 00:31:11.640
I think it made it to like Newsweek or&nbsp;
60 Minutes and stuff, that if you have&nbsp;&nbsp;

00:31:11.640 --> 00:31:19.800
night lights in the house, your kids are more&nbsp;
likely to need glasses. And this was wrong.

00:31:19.800 --> 00:31:22.080
GEHRKE: My parents told me all the&nbsp;
time, don't read in bed, you know,&nbsp;&nbsp;

00:31:22.080 --> 00:31:24.817
with your flashlight because&nbsp;
your eyes are going to get bad.

00:31:24.817 --> 00:31:26.520
KICIMAN: Yes.
GEHRKE: That's the story basically, right?

00:31:26.520 --> 00:31:29.393
KICIMAN: This was, yeah, the night&nbsp;
lights that plug in the wall.

00:31:29.393 --> 00:31:30.493
GEHRKE: But that's the ...
KICIMAN: That's the idea, the same thing.

00:31:30.493 --> 00:31:31.951
GEHRKE: The same thing, right.

00:31:31.951 --> 00:31:36.240
KICIMAN: And so these people analyzed a&nbsp;
bunch of data, and they found that there&nbsp;&nbsp;

00:31:36.240 --> 00:31:42.360
was a correlation, and they said that, you know,&nbsp;
it's a cause; you know, this is a cause. And the&nbsp;&nbsp;

00:31:42.360 --> 00:31:49.040
problem was that they didn't account for the&nbsp;
parents' myopia. Apparently, parents who had&nbsp;&nbsp;

00:31:49.040 --> 00:31:55.040
myopia were more likely to install night lights.&nbsp;
And then you have the genetic factor then actually&nbsp;&nbsp;

00:31:55.040 --> 00:32:02.280
causing the myopia. Very simple. But, you know,
people had to replicate this study to, you know,

00:32:02.280 --> 00:32:07.480
realize it was a mistake. Others were things
like correlations, I think, around vitamin C that have

00:32:07.480 --> 00:32:13.880
been reported repeatedly and then refuted in
randomized controlled trials. But there are many of

00:32:13.880 --> 00:32:20.440
these. Medicine, in particular, has a long history&nbsp;
of false correlations leading people astray.

00:32:20.440 --> 00:32:22.680
GEHRKE: Do you have a story&nbsp;
where here at Microsoft your&nbsp;&nbsp;

00:32:22.680 --> 00:32:24.880
work in causation had a really big impact?

00:32:24.880 --> 00:32:34.320
KICIMAN: You know, the one—it's still ongoing—but&nbsp;
one of the ones that I'm really excited about now,&nbsp;&nbsp;

00:32:34.320 --> 00:32:37.240
and thinking also from the&nbsp;
broader societal impact lens,&nbsp;&nbsp;

00:32:37.240 --> 00:32:44.840
is a collaboration with Ranveer Chandra and his&nbsp;
group. So with a close collaborator at MSR India,&nbsp;&nbsp;

00:32:44.840 --> 00:32:54.520
Amit Sharma, we've developed a connection between&nbsp;
representation learning and underlying causal&nbsp;&nbsp;

00:32:54.520 --> 00:33:00.000
representation of the data-generating process&nbsp;
that's driving something. So if you imagine, like,&nbsp;&nbsp;

00:33:00.000 --> 00:33:05.960
we want to learn a classifier on an object, on an&nbsp;
image, and we want that classifier to generalize&nbsp;&nbsp;

00:33:05.960 --> 00:33:13.160
to other settings, there's lots of reasons why&nbsp;
this can go wrong. You know, you have, you know,&nbsp;&nbsp;

00:33:13.160 --> 00:33:17.880
like a classic example is the question of, is&nbsp;
this picture showing you a camel, or is it showing&nbsp;&nbsp;

00:33:17.880 --> 00:33:24.040
you a cow? The classifier is much more likely to&nbsp;
look at the background, and if it's green grass,&nbsp;&nbsp;

00:33:24.040 --> 00:33:29.080
it's probably a cow. If it's sandy desert, it's&nbsp;
probably a camel. But then you fail if you look&nbsp;&nbsp;

00:33:29.080 --> 00:33:36.120
at a camel in the zoo or a cow on a beach, right.&nbsp;
So how do you make sure that you're looking at the&nbsp;&nbsp;

00:33:36.120 --> 00:33:44.520
real features? People have developed algorithms&nbsp;
for these. But no algorithm actually is robust&nbsp;&nbsp;

00:33:44.520 --> 00:33:48.200
across all the different kinds of distribution&nbsp;
shifts that people see in the real world. Some&nbsp;&nbsp;

00:33:48.200 --> 00:33:52.080
algorithms work on these kinds of distribution&nbsp;
shifts. Some algorithms work on those kinds of&nbsp;&nbsp;

00:33:52.080 --> 00:33:56.560
distribution shifts. And it was a bit of an&nbsp;
interesting, I think, puzzle as to why. And&nbsp;&nbsp;

00:33:56.560 --> 00:34:03.240
so we realized that these distribution shifts,&nbsp;
if you look at them from a causal perspective,&nbsp;&nbsp;

00:34:03.240 --> 00:34:07.600
you can see that the algorithms are actually&nbsp;
imposing different statistical independence&nbsp;&nbsp;

00:34:07.600 --> 00:34:13.120
constraints. And you can read those statistical&nbsp;
independence constraints off of a causal graph.&nbsp;&nbsp;

00:34:13.120 --> 00:34:20.880
And the reason that some algorithms worked well&nbsp;
in some settings was that the underlying causal&nbsp;&nbsp;

00:34:20.880 --> 00:34:26.120
graph implied a different set of statistical&nbsp;
independence constraints in that setting.&nbsp;&nbsp;

00:34:26.120 --> 00:34:30.600
And so that algorithm was the right one for that&nbsp;
setting. If you have a different causal graph with&nbsp;&nbsp;

00:34:30.600 --> 00:34:34.520
different statistical independence constraints,&nbsp;
the other algorithm was better. And so now you&nbsp;&nbsp;

00:34:34.520 --> 00:34:38.920
can see that no one algorithm is going to work&nbsp;
well across all of them. So we built an adaptive&nbsp;&nbsp;

00:34:38.920 --> 00:34:42.760
algorithm that looks at the causal graph,&nbsp;
picks the right statistical independencies,&nbsp;&nbsp;

00:34:42.760 --> 00:34:48.400
and applies them, and now what we're doing with&nbsp;
this algorithm is we're applying it to satellite&nbsp;&nbsp;

00:34:48.400 --> 00:35:01.320
imagery to help us build a more generalizable,&nbsp;
more robust model of carbon in farm fields so&nbsp;&nbsp;

00:35:01.320 --> 00:35:08.712
we can remotely sense and predict what the carbon&nbsp;
level is in a field. And so, the early results ...

00:35:08.712 --> 00:35:09.780
GEHRKE: And that's important for what?

00:35:09.780 --> 00:35:19.680
KICIMAN: And so this is important because soil is&nbsp;
seen as a very promising method for sequestering&nbsp;&nbsp;

00:35:19.680 --> 00:35:28.760
carbon from a climate change perspective. And it's
also the more carbon there is … the higher your&nbsp;&nbsp;

00:35:28.760 --> 00:35:33.480
soil carbon, usually the healthier the soil is,&nbsp;
as well. It's able to absorb more water, so less&nbsp;&nbsp;

00:35:33.480 --> 00:35:39.000
flooding; your crops are more productive because&nbsp;
of the microbial growth that's happening. And so&nbsp;&nbsp;

00:35:39.000 --> 00:35:44.280
people want to adopt policies and methods that&nbsp;
increase the soil carbon in the fields for all of&nbsp;&nbsp;

00:35:44.280 --> 00:35:49.840
these reasons. But measuring soil carbon is really&nbsp;
intensive. You have to go sample it, take it off&nbsp;&nbsp;

00:35:49.840 --> 00:35:56.200
to a lab, and it's too expensive for people to do&nbsp;
regularly. And so if we can develop remote-sensing&nbsp;&nbsp;

00:35:56.200 --> 00:36:03.040
methods that are able to take a satellite image&nbsp;
and, you know, really robustly predict what the&nbsp;&nbsp;

00:36:03.040 --> 00:36:08.240
real soil carbon measurement would be, that's&nbsp;
really game changing. That's something that, you&nbsp;&nbsp;

00:36:08.240 --> 00:36:15.120
know, will help us evaluate policies and whether&nbsp;
they're working; help us evaluate, you know, what&nbsp;&nbsp;

00:36:15.120 --> 00:36:19.760
the right practices should be for a particular&nbsp;
field. So I'm really excited about that.

00:36:19.760 --> 00:36:27.680
GEHRKE: That's really exciting. You'd mentioned&nbsp;
when we talked before that you'd benefited in&nbsp;&nbsp;

00:36:27.680 --> 00:36:32.680
your career from several good mentors.&nbsp;
How do you think about mentoring,&nbsp;&nbsp;

00:36:32.680 --> 00:36:36.000
and what are the ways that you&nbsp;
benefited from it? And how do you,&nbsp;&nbsp;

00:36:36.000 --> 00:36:40.180
you know, live that now in your daily life as&nbsp;
you're a mentor now to the next generation?

00:36:40.180 --> 00:36:49.400
KICIMAN: Yeah, the way I look at all&nbsp;
the people—and there's so many—who have,&nbsp;&nbsp;

00:36:49.400 --> 00:36:57.520
you know, given me a hand and advice
along the way, I often

00:36:58.520 --> 00:37:10.240
find I pick up on some attributes of my mentors,&nbsp;
of a particular mentor, and find that it's&nbsp;&nbsp;

00:37:10.240 --> 00:37:16.280
something that I want to emulate. So recognizing,&nbsp;
you know, everyone is complicated and no one is&nbsp;&nbsp;

00:37:16.280 --> 00:37:21.160
perfect, but, you know, there's so many ways&nbsp;
that, you know, individuals get things right&nbsp;&nbsp;

00:37:21.160 --> 00:37:26.080
and trying to understand what it is that they're&nbsp;
doing right and how I can try and repeat that for,&nbsp;&nbsp;

00:37:26.080 --> 00:37:30.880
like you said, the next generation, I think, is
really, really important. It's like one story,&nbsp;&nbsp;

00:37:30.880 --> 00:37:38.040
for example, around 2008, while I was still&nbsp;
working on large-scale internet services,&nbsp;&nbsp;

00:37:38.040 --> 00:37:45.160
I was going around the company to, kind of, get&nbsp;
a sense of, you know, what's the current state&nbsp;&nbsp;

00:37:45.160 --> 00:37:51.680
of the reliability of our services and how we&nbsp;
architect them and run them. And so I was talking&nbsp;&nbsp;

00:37:51.680 --> 00:37:57.720
to developers and architects and Ops folks around&nbsp;
the company, and James Hamilton was a great mentor&nbsp;&nbsp;

00:37:57.720 --> 00:38:03.880
at that moment, helping me to connect,&nbsp;
helping suggest questions that I might ask.

00:38:03.880 --> 00:38:06.280
GEHRKE: So he was working on&nbsp;
SQL Server reliability, right,&nbsp;&nbsp;

00:38:06.280 --> 00:38:08.080
at that point in time or on Windows reliability?

00:38:08.080 --> 00:38:12.960
KICIMAN: He was already starting to move over&nbsp;
into datacenter reliability. I think at the time,&nbsp;&nbsp;

00:38:12.960 --> 00:38:20.120
right before he moved over to the research side of&nbsp;
things, I think he was one of the heads of the, of&nbsp;&nbsp;

00:38:20.120 --> 00:38:29.040
our enterprise email businesses, and then he came&nbsp;
over to research to focus on, I think, datacenters&nbsp;&nbsp;

00:38:29.040 --> 00:38:37.360
in general. And, yeah, and he just donated so much&nbsp;
of his time. He was so generous with, you know,&nbsp;&nbsp;

00:38:37.360 --> 00:38:45.680
reviewing this large report that I was writing and&nbsp;
just helping me out with insights. That struck me&nbsp;&nbsp;

00:38:45.680 --> 00:38:49.920
as, like ... he's a very busy person. He's doing&nbsp;
all this stuff, and he's spending, you know,&nbsp;&nbsp;

00:38:49.920 --> 00:38:55.120
I sent him an email with, you know, 15 pages,&nbsp;
and he responds with feedback within a couple&nbsp;&nbsp;

00:38:55.120 --> 00:39:01.840
of hours every morning. That was astonishing&nbsp;
to me, especially in hindsight, and so … but&nbsp;&nbsp;

00:39:01.840 --> 00:39:08.480
that kind of generosity of time and trying&nbsp;
to help direct people's work in a way that's&nbsp;&nbsp;

00:39:08.480 --> 00:39:14.460
going to be most impactful for what they want to&nbsp;
achieve, that's something I try and emulate today.

00:39:14.460 --> 00:39:18.160
GEHRKE: So, so, you know, you've benefited from a&nbsp;
lot of great mentors and you said you're now also&nbsp;&nbsp;

00:39:18.160 --> 00:39:23.080
a mentor to others. Do you have any last&nbsp;
piece of advice for any of our listeners?

00:39:23.080 --> 00:39:27.560
KICIMAN: I think it's really important&nbsp;
for people to find passion and joy&nbsp;&nbsp;

00:39:27.560 --> 00:39:35.600
in the work that they do and, at some point, do&nbsp;
the work for the work's sake. I think this will&nbsp;&nbsp;

00:39:35.600 --> 00:39:42.560
drive you through the challenges that you'll&nbsp;
inevitably face with any sort of project and&nbsp;&nbsp;

00:39:42.560 --> 00:39:47.360
give you the persistence that you need to&nbsp;
really have the impact that you want to have.

00:39:47.360 --> 00:39:51.071
GEHRKE: Well, thanks for that advice. And&nbsp;
thanks for being in What's Your Story, Emre.

00:39:51.071 --> 00:39:53.440
KICIMAN: Thanks very much,&nbsp;
Johannes. Great to be here.

00:39:53.440 --> 00:39:54.520
[MUSIC]

00:39:54.520 --> 00:39:56.800
To learn more about Emre or to see photos of&nbsp;&nbsp;

00:39:56.800 --> 00:40:04.291
Emre as a child in California,&nbsp;
visit aka.ms/ResearcherStories.

00:40:04.291 --> 00:40:07.089
[MUSIC FADES]

