11–26 Nov 2021
Europe/Budapest timezone

Hardness of planning and learning with linear function approximation in reinforcement learning

Not scheduled
20m
Online lecture

Speaker

Csaba Szepesvari (DeepMind/University of Alberta)

Description

Markov decision processes (MDPs) form a minimalist framework designed to capture the most important aspects of decision making under uncertainty, a problem of major practical interest. Unfortunately, the price of the minimalist approach is that MDPs lack structure, and as such, planning and learning in MDPs with combinatorial-sized state and action spaces are strongly intractable. An appealing idea for overcoming this intractability is to assume that the optimal action-value function, which effectively captures everything that needs to be known about an MDP in order to act optimally in it, enjoys some distinct structural property, such as being expressible as a linear combination of a tractable number of basis functions. The question is whether this structure is sufficient to design algorithms that overcome the inherent intractability of planning and learning in MDPs. In this talk, I will look back at research addressing this question (starting in the 1960s) and then present a recent result showing that the MDP planning and learning problems remain strongly intractable even with the extra assumptions, as far as scaling with the planning horizon and the number of basis functions is concerned. I will finish by listing a number of open problems.
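
For concreteness, the linear structure mentioned in the abstract is usually formalized as follows; this is a standard statement from the linear function approximation literature rather than a quote from the talk, and the symbols Q*, \phi, \theta, and d are introduced here only for illustration:

    Q^\star(s,a) \;=\; \langle \theta, \phi(s,a) \rangle \;=\; \sum_{i=1}^{d} \theta_i \, \phi_i(s,a) \qquad \text{for all } (s,a) \in \mathcal{S} \times \mathcal{A},

where \phi : \mathcal{S} \times \mathcal{A} \to \mathbb{R}^d stacks the d basis functions and \theta \in \mathbb{R}^d is a fixed coefficient vector. The hardness question addressed in the talk is how the cost of planning and learning must scale with d and the planning horizon under this kind of assumption.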

Title

Hardness of planning and learning with linear function approximation in reinforcement learning

Authors: Weisz, Gellert; Amortila, Philip; Szepesvári, Csaba
Affiliations: DeepMind; University of Illinois, Urbana-Champaign, USA; DeepMind/University of Alberta

Primary authors

Csaba Szepesvari (DeepMind/University of Alberta)
Mr Gellert Weisz (DeepMind)
Mr Philip Amortila (University of Illinois, Urbana-Champaign, USA)

Presentation materials

There are no materials yet.