WEBVTT

00:00:03.449 --> 00:00:04.449
[MUSIC PLAYS]

00:00:04.449 --> 00:00:07.660
GRETCHEN HUIZINGA: Welcome to Abstracts, a
Microsoft Research Podcast that puts the spotlight

00:00:07.660 --> 00:00:10.389
on world-class research in brief.

00:00:10.389 --> 00:00:15.349
I’m Dr. Gretchen Huizinga.

00:00:15.349 --> 00:00:19.900
In this series, members of the research community
at Microsoft give us a quick snapshot—or

00:00:19.900 --> 00:00:23.699
a podcast abstract—of their new and noteworthy
papers.

00:00:23.699 --> 00:00:24.750
[MUSIC FADES]

00:00:24.750 --> 00:00:30.650
Today, I’m talking to Dr. Chang Liu, a senior
researcher from Microsoft Research AI4Science.

00:00:30.650 --> 00:00:37.989
Dr. Liu is coauthor of a paper called “Overcoming
the Barrier of Orbital-Free Density Functional

00:00:37.989 --> 00:00:41.910
Theory for Molecular Systems Using Deep Learning.”

00:00:41.910 --> 00:00:44.270
Chang Liu, thanks for joining us on Abstracts!

00:00:44.270 --> 00:00:45.540
CHANG LIU: Thank you.

00:00:45.540 --> 00:00:48.370
Thank you for this opportunity to share our
work.

00:00:48.370 --> 00:00:54.240
HUIZINGA: So in a few sentences, tell us about
the issue or problem your paper addresses

00:00:54.240 --> 00:00:57.170
and why people should care about this research.

00:00:57.170 --> 00:00:58.510
LIU: Sure.

00:00:58.510 --> 00:01:03.780
Since this is an AI4Science work, let’s
start from this perspective.

00:01:03.780 --> 00:01:08.970
In science, people have always wanted to understand
the properties of matter, such as why some

00:01:08.970 --> 00:01:14.430
substances can cure disease and why some materials
are heavy or conductive.

00:01:14.430 --> 00:01:21.100
For a very long time, these properties
could only be studied by observation and experiment,

00:01:21.100 --> 00:01:23.740
and the outcomes would just look like magic
to us.

00:01:23.740 --> 00:01:30.390
If we can understand the underlying mechanism
and calculate these properties on our computer,

00:01:30.390 --> 00:01:36.180
then we can do the magic ourselves, and it
can, hence, accelerate industries like medicine

00:01:36.180 --> 00:01:39.310
development and material discovery.

00:01:39.310 --> 00:01:44.829
Our work aims to develop a method that handles
the most fundamental part of such property

00:01:44.829 --> 00:01:49.580
calculation with better accuracy and efficiency.

00:01:49.580 --> 00:01:55.600
If you zoom into the problem, properties of
matter are determined by the properties of

00:01:55.600 --> 00:01:58.700
molecules that constitute the matter.

00:01:58.700 --> 00:02:03.170
For example, the energy of a molecule is an
important property.

00:02:03.170 --> 00:02:09.030
It determines which structure it mostly takes,
and the structure indicates whether it can

00:02:09.030 --> 00:02:12.600
bind to a disease-related biomolecule.

00:02:12.600 --> 00:02:19.640
You may know that molecules consist of atoms,
and atoms consist of nuclei and electrons,

00:02:19.640 --> 00:02:25.950
so properties of a molecule are the result
of the interaction among the nuclei and the

00:02:25.950 --> 00:02:28.290
electrons in the molecule.

00:02:28.290 --> 00:02:34.640
The nuclei can be treated as classical particles,
but electrons exhibit significant quantum

00:02:34.640 --> 00:02:35.670
effects.

00:02:35.670 --> 00:02:42.909
You can imagine that electrons move so
fast that they appear like a cloud or mist spreading

00:02:42.909 --> 00:02:44.560
over the space.

00:02:44.560 --> 00:02:50.290
To calculate the properties of the molecule,
you need to first solve the electronic structure—that

00:02:50.290 --> 00:02:54.269
is, how the electrons spread over this space.

00:02:54.269 --> 00:02:57.950
This is governed by an equation that is hard
to solve.

00:02:57.950 --> 00:03:03.549
The target of our research is hence to develop
a method that solves the electronic structure

00:03:03.549 --> 00:03:10.390
more accurately and more efficiently so that
properties of molecules can be calculated

00:03:10.390 --> 00:03:17.140
at a higher level of accuracy and efficiency,
which leads to better ways to solve industrial

00:03:17.140 --> 00:03:18.140
problems.

00:03:18.140 --> 00:03:23.010
HUIZINGA: Well, most research owes a debt
to work that went before but also moves the

00:03:23.010 --> 00:03:24.200
science forward.

00:03:24.200 --> 00:03:29.599
So how does your approach build on and/or
differ from related research in this field?

00:03:29.599 --> 00:03:35.920
LIU: Yes, there are indeed quite a few methods
that can solve the electronic structure, but

00:03:35.920 --> 00:03:39.950
they show a harsh tradeoff between accuracy
and efficiency.

00:03:39.950 --> 00:03:46.170
Currently, density functional theory, often
called DFT, achieves a preferred balance for

00:03:46.170 --> 00:03:50.310
most cases and is perhaps the most popular
choice.

00:03:50.310 --> 00:03:56.950
But DFT still requires a considerable cost
for large molecular systems.

00:03:56.950 --> 00:03:59.500
It has a cubic cost scaling.

00:03:59.500 --> 00:04:04.420
We hope to develop a method that scales with
a milder cost increase.

00:04:04.420 --> 00:04:11.480
We noted an alternative type of method called
orbital-free DFT, or OFDFT, which has

00:04:11.480 --> 00:04:13.709
a lower order of cost scaling.

00:04:13.709 --> 00:04:20.620
But existing OFDFT methods cannot achieve
satisfactory accuracy on molecules.

00:04:20.620 --> 00:04:27.050
So our work leverages deep learning to achieve
an accurate OFDFT method.

00:04:27.050 --> 00:04:34.660
The method can achieve the same level of accuracy
as conventional DFT; meanwhile, it inherits

00:04:34.660 --> 00:04:41.080
the cost scaling of OFDFT and hence is more efficient
than the conventional DFT.

00:04:41.080 --> 00:04:48.250
HUIZINGA: OK, so we’re moving acronyms from
DFT to OFDFT, and you’ve got an acronym

00:04:48.250 --> 00:04:49.780
that goes M-OFDFT.

00:04:49.780 --> 00:04:52.419
What does that stand for?

00:04:52.419 --> 00:05:01.009
LIU: The M represents molecules, since it
is especially hard for classical or existing

00:05:01.009 --> 00:05:05.700
OFDFT to achieve good accuracy on molecules.

00:05:05.700 --> 00:05:08.500
So our development tackles that challenge.

00:05:08.500 --> 00:05:10.180
HUIZINGA: Great.

00:05:10.180 --> 00:05:13.230
And I’m eager to hear about your methodology
and your findings.

00:05:13.230 --> 00:05:14.740
So let’s go there.

00:05:14.740 --> 00:05:20.990
Tell us a bit about how you conducted this
research and what your methodology was.

00:05:20.990 --> 00:05:21.990
LIU: Yeah.

00:05:21.990 --> 00:05:24.810
Regarding methodology, let me delve a
bit into some details.

00:05:24.810 --> 00:05:31.340
We follow the formulation of OFDFT, which
solves the electronic structure by optimizing

00:05:31.340 --> 00:05:39.340
the electron density, where the optimization
objective is to minimize the electronic energy.
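
NOTE
Editor's sketch, not from the episode: a toy picture of "optimize the density to
minimize the energy." Everything here is an assumption made for illustration. The
density is a short coefficient vector p, the energy is a stand-in convex quadratic
(not a real electronic energy functional), and the names energy, minimize_energy,
A, b, w, and n_electrons are hypothetical. The projection step keeps the electron
count w . p fixed while gradient descent lowers the energy.
```python
import numpy as np
# Stand-in problem data (made up for illustration).
A = np.diag([1.0, 2.0, 3.0])      # curvature of the toy energy
b = np.array([-1.0, 0.5, 0.2])    # linear term of the toy energy
w = np.array([1.0, 1.0, 1.0])     # weights that turn coefficients into an electron count
n_electrons = 2.0
def energy(p):
    """Toy energy E(p) = 0.5 * p^T A p + b^T p (a convex quadratic)."""
    return 0.5 * p @ A @ p + b @ p
def minimize_energy(p0, lr=0.1, steps=2000):
    """Projected gradient descent that keeps w . p equal to its initial value."""
    p = p0.copy()
    for _ in range(steps):
        grad = A @ p + b                      # gradient of the toy energy
        grad -= (w @ grad) / (w @ w) * w      # remove the part that would change w . p
        p -= lr * grad
    return p
p_start = n_electrons * w / (w @ w)           # feasible start: w . p_start == n_electrons
p_opt = minimize_energy(p_start)
```
In this sketch, p_opt has lower energy than p_start and still satisfies the
electron-count constraint. In the method described in the episode, a learned
model supplies the hard part of the energy instead of a fixed quadratic.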

00:05:39.340 --> 00:05:46.750
The challenge in OFDFT is that part of the electronic
energy, specifically the kinetic energy, is

00:05:46.750 --> 00:05:51.470
hard to calculate accurately, especially for
molecular systems.

00:05:51.470 --> 00:05:58.100
Existing computation formulas are based on
approximate physical models, but the approximation

00:05:58.100 --> 00:06:00.790
accuracy is not satisfactory.

00:06:00.790 --> 00:06:05.780
Our method uses a deep learning model to calculate
the kinetic energy.

00:06:05.780 --> 00:06:11.981
We train the model on labeled data, and thanks to
its powerful learning ability, the model can

00:06:11.981 --> 00:06:14.750
give a more accurate result.

00:06:14.750 --> 00:06:19.319
This is the general idea, but there are many
technical challenges.

00:06:19.319 --> 00:06:25.250
For example, since the model is used as an
optimization objective, it needs to capture

00:06:25.250 --> 00:06:28.900
the overall landscape of the function.

00:06:28.900 --> 00:06:34.300
The model cannot recover the landscape if
only one labeled data point is provided.

00:06:34.300 --> 00:06:40.530
For this, we made a theoretical analysis on
the data generation method and found a way

00:06:40.530 --> 00:06:45.780
to generate multiple labeled data points for
each molecular structure.

00:06:45.780 --> 00:06:52.110
Moreover, we can also calculate a gradient
label for each data point, which provides

00:06:52.110 --> 00:06:55.800
the slope information on the landscape.
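
NOTE
Editor's sketch, not from the episode: why gradient labels help a model recover
a landscape from few points. This is a toy under stated assumptions: the
"landscape" is a hypothetical one-dimensional quadratic, the model is linear in
its three coefficients, and the names features, feature_slopes, and true_w are
made up for illustration. It is not the deep learning model from the paper.
```python
import numpy as np
# Hypothetical "true" landscape that the labels come from: f(x) = 0.5*x^2 - x + 2.
true_w = np.array([2.0, -1.0, 0.5])           # coefficients of 1, x, x^2
def features(x):
    """Rows of [1, x, x^2], so that f(x) = features(x) @ w."""
    return np.stack([np.ones_like(x), x, x**2], axis=1)
def feature_slopes(x):
    """Rows of [0, 1, 2x], so that f'(x) = feature_slopes(x) @ w."""
    return np.stack([np.zeros_like(x), np.ones_like(x), 2 * x], axis=1)
x = np.array([0.0, 1.0])                       # only two labeled points
value_labels = features(x) @ true_w            # energy-like labels
slope_labels = feature_slopes(x) @ true_w      # gradient labels
# Values only: 2 equations for 3 unknowns, so the landscape is not pinned down.
w_values_only, *_ = np.linalg.lstsq(features(x), value_labels, rcond=None)
# Values plus slopes: 4 equations, enough to recover all 3 coefficients.
design = np.vstack([features(x), feature_slopes(x)])
targets = np.concatenate([value_labels, slope_labels])
w_with_slopes, *_ = np.linalg.lstsq(design, targets, rcond=None)
```
With the slope labels included, the fit recovers true_w; with values alone, the
least-squares solution matches the two labeled values but not the curve itself.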

00:06:55.800 --> 00:07:01.490
Another challenge is that the kinetic energy
has a strong non-local effect, meaning that

00:07:01.490 --> 00:07:08.210
the model needs to account for the interaction
between any pair of spots in space.

00:07:08.210 --> 00:07:13.900
This incurs a significant cost if using the
conventional way to represent density—that

00:07:13.900 --> 00:07:16.650
is, using a grid.

00:07:16.650 --> 00:07:23.120
For this challenge, we choose to expand the
density function on a set of basis functions

00:07:23.120 --> 00:07:28.090
and use the expansion coefficients to represent
the density.

00:07:28.090 --> 00:07:34.650
The benefit is that it greatly reduces the
representation dimension, which in turn reduces

00:07:34.650 --> 00:07:38.139
the cost for non-local calculation.
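
NOTE
Editor's sketch, not from the episode: representing a density by expansion
coefficients instead of grid values. The one-dimensional grid, the Gaussian
basis functions, their centers, and the example density are all assumptions
chosen for illustration; the actual basis set used in the paper is not described
in this conversation.
```python
import numpy as np
grid = np.linspace(-5.0, 5.0, 2001)                  # 2001 values in the grid picture
centers = np.linspace(-2.0, 2.0, 5)                  # hypothetical basis-function centers
# One Gaussian basis function per column, evaluated on the grid.
basis = np.exp(-((grid[:, None] - centers[None, :]) ** 2))
# A made-up density that lies in the span of the basis, for illustration only.
true_coeffs = np.array([0.1, 0.6, 1.0, 0.6, 0.1])
density_on_grid = basis @ true_coeffs
# Fit expansion coefficients by least squares: 5 numbers stand in for 2001.
coeffs, *_ = np.linalg.lstsq(basis, density_on_grid, rcond=None)
reconstruction = basis @ coeffs
max_error = np.max(np.abs(reconstruction - density_on_grid))
# A dense pairwise (non-local) interaction table grows with the square of the size.
pair_entries_grid = grid.size ** 2                   # 2001 * 2001
pair_entries_coeffs = coeffs.size ** 2               # 5 * 5
```
The point of the comparison at the end is only that any all-pairs calculation
is far smaller over a handful of coefficients than over every pair of grid points.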

00:07:38.139 --> 00:07:43.810
These two design choices also distinguish our method
from other deep learning OFDFT work.

00:07:43.810 --> 00:07:48.200
There are more technical designs, and you
may check them in the paper.

00:07:48.200 --> 00:07:50.659
HUIZINGA: So talk about your findings.

00:07:50.659 --> 00:07:56.000
After you completed and analyzed what you
did, what were your major takeaways or findings?

00:07:56.000 --> 00:08:01.860
LIU: Yeah, let’s dive into the details,
into the empirical findings.

00:08:01.860 --> 00:08:11.020
We find that our deep learning OFDFT, abbreviated
as M-OFDFT, is much more accurate than existing

00:08:11.020 --> 00:08:17.750
OFDFT methods, with tens to hundreds of times
lower error, and achieves the same level of

00:08:17.750 --> 00:08:20.020
accuracy as the conventional DFT.

00:08:20.020 --> 00:08:21.020
HUIZINGA: Wow …

00:08:21.020 --> 00:08:25.880
LIU: On the other hand, the speed is indeed
improved over conventional DFT.

00:08:25.880 --> 00:08:33.479
For example, on a protein molecule with more
than 700 atoms, our method achieves nearly

00:08:33.479 --> 00:08:35.800
a 30-fold speedup.

00:08:35.800 --> 00:08:42.300
The empirical cost scaling is lower than quadratic
and is one order lower than that of conventional

00:08:42.300 --> 00:08:43.330
DFT.

00:08:43.330 --> 00:08:48.060
So the speed advantage would be more significant
on larger molecules.
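
NOTE
Editor's sketch, not from the episode: what "cubic" versus "quadratic" cost
scaling means in numbers, and how an empirical scaling exponent can be read off
a log-log fit. The function names cost_growth and fit_exponent are hypothetical,
and the timings are synthetic values made up to be exactly quadratic; they are
not measurements from the paper.
```python
import numpy as np
def cost_growth(size_ratio, exponent):
    """Factor by which cost grows when size grows by size_ratio, if cost ~ size**exponent."""
    return size_ratio ** exponent
def fit_exponent(sizes, times):
    """Slope of log(time) against log(size), i.e. the empirical scaling exponent."""
    slope, _intercept = np.polyfit(np.log(sizes), np.log(times), 1)
    return slope
# Doubling the system size under cubic versus quadratic scaling.
cubic_growth = cost_growth(2, 3)        # 8
quadratic_growth = cost_growth(2, 2)    # 4
# Synthetic timings (made up, exactly quadratic) to show the fit.
sizes = np.array([100.0, 200.0, 400.0, 800.0])
times = 1e-4 * sizes ** 2
exponent = fit_exponent(sizes, times)
```
Because the gap between the two growth factors widens with every doubling, a
method with a lower exponent gains more on larger systems.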

00:08:48.060 --> 00:08:53.079
I’d also like to mention an interesting
observation.

00:08:53.079 --> 00:08:58.000
Since our method is based on deep learning,
a natural question is how accurate would the

00:08:58.000 --> 00:09:04.850
method be if applied to much larger molecules
than those used for training the deep learning

00:09:04.850 --> 00:09:06.200
model?

00:09:06.200 --> 00:09:11.630
This is the generalization challenge and is
one of the major challenges of deep learning

00:09:11.630 --> 00:09:14.990
methods for molecular science applications.

00:09:14.990 --> 00:09:20.780
We investigated this question in our method
and found that the error increases more slowly

00:09:20.780 --> 00:09:24.180
than linearly with molecular size.

00:09:24.180 --> 00:09:30.550
Although this is not perfect since the error
is still increasing, it is better than

00:09:30.550 --> 00:09:37.310
using the same model to predict the property
directly, which shows an error that increases

00:09:37.310 --> 00:09:39.640
faster than linearly.

00:09:39.640 --> 00:09:45.779
This, to some extent, shows the benefits of leveraging
the OFDFT framework for using a deep learning

00:09:45.779 --> 00:09:48.519
method to solve molecular tasks.

00:09:48.519 --> 00:09:52.010
HUIZINGA: Well, let’s talk about real-world
impact for a second.

00:09:52.010 --> 00:09:56.120
You’ve got this research going on in the
lab, so to speak.

00:09:56.120 --> 00:09:59.010
How does it impact real-life situations?

00:09:59.010 --> 00:10:02.050
Who does this work help the most and how?

00:10:02.050 --> 00:10:09.630
LIU: Since our method achieves the same level
of accuracy as conventional DFT but runs faster,

00:10:09.630 --> 00:10:15.920
it could accelerate molecular property calculation
and molecular dynamics simulation, especially

00:10:15.920 --> 00:10:22.800
for large molecules; hence, it has the potential
to accelerate solving problems such as medicine

00:10:22.800 --> 00:10:26.240
development and material discovery.

00:10:26.240 --> 00:10:32.810
Our method also shows that AI techniques can
create new opportunities for other electronic

00:10:32.810 --> 00:10:39.649
structure formulations, which could inspire
more methods to break the long-standing tradeoff

00:10:39.649 --> 00:10:42.750
between accuracy and efficiency in this field.

00:10:42.750 --> 00:10:47.550
HUIZINGA: So if there was one thing you wanted
our listeners to take away, just one little

00:10:47.550 --> 00:10:50.700
nugget from your research, what would that
be?

00:10:50.700 --> 00:10:57.060
LIU: If only one thing, it would be that
we developed a method that calculates molecular

00:10:57.060 --> 00:11:03.170
properties more accurately and efficiently
than the current portfolio of available methods.

00:11:03.170 --> 00:11:10.000
HUIZINGA: So finally, Chang, what are the
big unanswered questions and unsolved problems

00:11:10.000 --> 00:11:13.750
that remain in this field, and what’s next
on your research agenda?

00:11:13.750 --> 00:11:15.380
LIU: Yeah, sure.

00:11:15.380 --> 00:11:20.370
There indeed remain problems and challenges.

00:11:20.370 --> 00:11:25.840
One remaining challenge mentioned earlier is
the generalization to molecules much larger

00:11:25.840 --> 00:11:27.750
than those in training.

00:11:27.750 --> 00:11:34.870
Although the OFDFT method is better than directly
predicting properties, there is still room

00:11:34.870 --> 00:11:36.270
to improve.

00:11:36.270 --> 00:11:42.270
One possibility is to consider the success
of large language models by including more

00:11:42.270 --> 00:11:49.920
abundant data and more diverse data in training
and using a large model to digest all the

00:11:49.920 --> 00:11:50.990
data.

00:11:50.990 --> 00:11:55.269
This can be costly, but it may give us a surprise.

00:11:55.269 --> 00:12:02.279
And another way we may consider is to incorporate
mathematical structures of the learning target

00:12:02.279 --> 00:12:10.500
functional into the model, such as convexity,
lower and upper bounds, and some invariance.

00:12:10.500 --> 00:12:16.680
And such structures could regularize the model
when applied to larger systems than it has

00:12:16.680 --> 00:12:19.040
seen during training.

00:12:19.040 --> 00:12:25.851
So we have actually incorporated some such
structures into the model, for example, the

00:12:25.851 --> 00:12:32.410
geometric invariance, but other mathematical
properties are nontrivial to incorporate.

00:12:32.410 --> 00:12:39.940
We discuss this in the paper, and
we’ll keep working in that direction in

00:12:39.940 --> 00:12:41.180
the future.

00:12:41.180 --> 00:12:48.370
The ultimate goal underlying this technical
development is to build a computational method

00:12:48.370 --> 00:12:54.579
that is fast and accurate universally so that
we can simulate the molecular world of any

00:12:54.579 --> 00:12:55.579
kind.

00:12:55.579 --> 00:12:56.579
[MUSIC PLAYS]

00:12:56.579 --> 00:13:00.490
HUIZINGA: Well, Chang Liu, thanks for joining
us today, and to our listeners, thanks for

00:13:00.490 --> 00:13:01.570
tuning in.

00:13:01.570 --> 00:13:08.500
If you want to read this paper, you can find
a link at aka.ms/abstracts.

00:13:08.500 --> 00:13:14.650
You can also read it on arXiv, or you can
check out the March 2024 issue of Nature Computational

00:13:14.650 --> 00:13:16.089
Science.

00:13:16.089 --> 00:13:19.320
See you next time on Abstracts!

00:13:19.320 --> 00:13:24.040
[MUSIC FADES]