Of course, we’re looking at ways to apply MuZero to real-world problems, and there are some encouraging early results. To give a concrete example, traffic on the Internet is dominated by video, and a big open problem is how to compress those videos as efficiently as possible. You can think of this as a reinforcement learning problem: there are very complicated programs that compress the video, but what comes next in the video is unknown. When you plug something like MuZero into it, our initial results look very promising in terms of saving significant amounts of data, maybe about 5 percent of the bits used when compressing a video.
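The interview doesn’t describe how MuZero is actually wired into a codec, so the sketch below is only a toy illustration of the framing described above: an agent chooses a per-frame encoding setting without seeing future frames, and its reward trades saved bits against quality. All names and numbers here (FrameEncodingEnv, the quantization range, the reward shape) are hypothetical and purely illustrative.

```python
# Toy sketch (not MuZero): framing per-frame video encoding as a
# reinforcement learning problem. Everything here is made up for illustration.
import random


class FrameEncodingEnv:
    """Each step, the agent picks a quantization parameter (the action) for
    the next frame without knowing how complex that frame will be."""

    def __init__(self, num_frames=100, target_quality=0.9):
        self.num_frames = num_frames
        self.target_quality = target_quality
        self.frame = 0

    def reset(self):
        self.frame = 0
        return {"frame_index": self.frame}

    def step(self, qp):
        # Hypothetical model: higher qp -> fewer bits but lower quality,
        # scaled by a frame complexity the agent cannot see in advance.
        complexity = random.uniform(0.5, 1.5)
        bits = complexity * (60 - qp) * 1000
        quality = max(0.0, 1.0 - 0.01 * qp * complexity)
        # Reward: save bits, but pay a penalty if quality drops too far.
        reward = -bits / 1e5 - (10.0 if quality < self.target_quality else 0.0)
        self.frame += 1
        done = self.frame >= self.num_frames
        return {"frame_index": self.frame}, reward, done


# A random policy as a baseline; a planning agent like MuZero would instead
# use a learned model to choose settings that maximize the episode return.
env = FrameEncodingEnv()
obs, done, total = env.reset(), False, 0.0
while not done:
    action = random.randint(20, 40)  # pick a quantization parameter
    obs, reward, done = env.step(action)
    total += reward
print(f"episode return: {total:.2f}")
```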
Where do you think reinforcement learning will have the greatest impact in the longer term?
Imagine a system that can help you, as a user, achieve your goals as effectively as possible. A really powerful system that sees all the things you see, that has the same senses you have, and that can help you achieve your goals in life. I think that’s a very important one. Another transformative, long-term application is personalized healthcare. There are privacy and ethical issues to be addressed, but it will have tremendous transformative value; it will change the face of medicine and people’s quality of life.
Is there anything you think machines won’t be able to learn to do in your lifetime?
I don’t want to put a timetable on it, but I would say that whatever a human can accomplish, in the end I think a machine can too. The brain is a computational process; I don’t think there is any magic going on there.
Can we get to the point where we can understand and implement algorithms as effective and powerful as the human brain? Well, I don’t know what the timescale is. But I think the journey is exciting. And we should strive for that. The first step in taking that journey is to try to understand what it even means to achieve intelligence. What problem are we trying to solve in solving intelligence?
Beyond practical uses, are you sure you can move from mastering games like chess and Atari to real intelligence? Why do you think reinforcement learning will lead to machines with common sense?
There is a hypothesis, we call it the reward-is-enough hypothesis, which says that the essential process of intelligence could be as simple as a system trying to maximize its reward, and that this process of trying to achieve a goal and maximize reward is sufficient to give rise to all the properties of intelligence that we see in natural intelligence. It’s a hypothesis, we don’t know if it’s true, but it gives direction to the investigation.
When we take common sense specifically, the reward-is-enough hypothesis says: well, if common sense is useful to a system, that means it should actually help the system achieve its goals better, and so a system that gets really good at achieving its goals should end up acquiring it.
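To make “a system trying to maximize its reward” concrete, here is a generic, textbook-style tabular Q-learning loop on a toy corridor task. It is not any system mentioned in the interview; the task, constants, and update rule are a standard reinforcement-learning sketch of reward maximization.

```python
# Minimal illustration of reward maximization: tabular Q-learning on a
# five-state corridor where reward is only given at the goal state.
import random
from collections import defaultdict

N_STATES, GOAL = 5, 4            # states 0..4, goal at the right end
q = defaultdict(float)           # Q-values keyed by (state, action)
alpha, gamma, epsilon = 0.1, 0.9, 0.1

for episode in range(500):
    s = 0
    while s != GOAL:
        # Epsilon-greedy choice between moving left (-1) and right (+1).
        if random.random() < epsilon:
            a = random.choice([-1, 1])
        else:
            a = max([-1, 1], key=lambda act: q[(s, act)])
        s_next = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s_next == GOAL else 0.0
        # Update toward the reward plus the discounted value of the next state.
        best_next = max(q[(s_next, -1)], q[(s_next, 1)])
        q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
        s = s_next

# Learned state values rise toward the goal: behavior useful for the goal
# emerges purely from the drive to maximize reward.
print({s: round(max(q[(s, -1)], q[(s, 1)]), 2) for s in range(N_STATES)})
```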
It sounds like you think your field of reinforcement learning is in a sense fundamental to understanding or “solving” intelligence. Is that correct?
I really see it as essential. I think the big question is, is it true? Because it certainly conflicts with how many people look at AI, which is that there’s an incredibly complex set of mechanisms involved in intelligence, and each of them has its own kind of problem that it solves or its own special way of working, or maybe there is no clear problem definition at all for something like common sense. This theory says, no, there may be a very clear and simple way to think about all of intelligence, which is that it is a goal-optimizing system, and if we find a way to optimize goals really, really well, all these other things will emerge from that process.