Learning in Stochastic Stackelberg Games

Das, Pranoy

doi:10.25394/PGS.25620726.v1

thesis

posted on 2024-04-19, 18:57, authored by Pranoy Das

Nash equilibrium was originally defined for normal-form games, but the notion has since been extended to many other classes of games, including leader-follower (Stackelberg) games, extensive-form games, stochastic games, games of incomplete information, and cooperative games. We focus on general-sum stochastic Stackelberg games in this work. Such games arise naturally in security settings, where a defender wishes to protect a set of targets by deploying limited resources and an attacker wishes to attack those targets strategically for their own benefit. The hierarchical order of play arises naturally, since the defender typically acts first and deploys a strategy, while the attacker observes the defender's strategy before attacking. Another example that fits this framework is testing during epidemics, where the leader (the government) sets testing policies and the followers (the citizens) decide at every time step whether to get tested. The government wishes to minimize the number of infected people in the population, while the citizens wish to minimize the cost of getting sick and of testing. This thesis presents a learning algorithm with which the players converge to stationary policies in a general-sum stochastic sequential Stackelberg game. The algorithm is a two-timescale implicit policy gradient algorithm that provably converges to stationary points of the two players' optimization problems. Our analysis allows us to move beyond the assumptions of zero-sum or static Stackelberg games that the existing literature makes to establish convergence of learning algorithms.

Funding

The work was partially supported by ARO through grants W911NF2310111 and W911NF2310266, AFOSR through grant F.10052139.02.005, ONR through grant 13001274, and NSF through grants 2300355 and 2222097.

Degree Type

Master of Science in Electrical and Computer Engineering

Department

Electrical and Computer Engineering

Campus location

West Lafayette

Advisor/Supervisor/Committee Chair

Vijay Gupta

Additional Committee Member 2

Abolfazl Hashemi

Additional Committee Member 3

Mahsa Ghasemi

Keywords

Game theory, optimization techniques, Federated Learning, Reinforcement Learning, Learning

Licence

CC BY 4.0
