Increasing Policy Network Size Does Not Guarantee Better Performance in Deep Reinforcement Learning

Berg, Zachery Peter

doi:10.25394/PGS.19651251.v1

thesis

posted on 2022-04-25, 17:33, authored by Zachery Peter Berg

The capacity of deep reinforcement learning policy networks has been found to affect the performance of trained agents. It has been observed that policy networks with more parameters have better training performance and generalization ability than smaller networks. In this work, we find cases where this does not hold. We observe unimodal variance in the zero-shot test return of policies of varying width, accompanied by a drop in both train and test return. Empirically, we demonstrate mostly monotonically increasing performance or mostly optimal performance as the width of deep policy networks increases, except near the variance mode. Finally, we find a scenario where performance increases with network size up to a point and then decreases. We hypothesize that these observations align with the theory of double descent in supervised learning, although with specific differences.

Degree Type

Master of Science

Department

Computer Science

Campus location

West Lafayette

Advisor/Supervisor/Committee Chair

Yexiang Xue

Additional Committee Member 2

Kamyar Azizzadenesheli

Additional Committee Member 3

Bruno Ribeiro

Keywords

Deep Reinforcement Learning (DRL); Reinforcement Learning (RL); Double descent; Policy network size; bias-variance tradeoff; Reinforcement Learning Generalization; overparameterization; Theoretical Computer Science

Licence

CC BY 4.0
