
Paper deep dive

Evolutionarily Stable Stackelberg Equilibrium

Sam Ganzfried

Year: 2026 · Venue: arXiv preprint · Area: cs.GT · Type: Preprint · Embeddings: 47

Intelligence

Status: succeeded | Model: google/gemini-3.1-flash-lite-preview | Prompt: intel-v1 | Confidence: 94%

Last extracted: 3/22/2026, 6:03:24 AM

Summary

The paper introduces 'Evolutionarily Stable Stackelberg Equilibrium' (SESS), a game-theoretic solution concept that refines Stackelberg equilibrium by requiring the follower population to play an evolutionarily stable strategy (ESS). It provides definitions for discrete and continuous games, including optimistic and pessimistic variants, and discusses applications in biological systems such as cancer treatment.

Entities (4)

ESS · game-theory-concept · 98%
SESS · solution-concept · 98%
Stackelberg Equilibrium · game-theory-concept · 95%
Cancer Treatment · application-domain · 92%

Relation Signals (2)

SESS incorporates ESS

confidence 95% · The leader selects an optimal mixed strategy, anticipating that the follower population plays an evolutionarily stable strategy (ESS)

SESS refines Stackelberg Equilibrium

confidence 95% · SESS refines Stackelberg equilibrium by restricting the follower response to strategies that are evolutionarily stable

Cypher Suggestions (2)

Find all game theory concepts related to SESS · confidence 90% · unvalidated

MATCH (s:Concept {name: 'SESS'})-[:REFINES|INCORPORATES]->(related) RETURN related.name, related.entity_type

List applications of the SESS framework · confidence 85% · unvalidated

MATCH (a:Application)-[:APPLIES_TO]->(s:Concept {name: 'SESS'}) RETURN a.name

Abstract

We present a new solution concept called evolutionarily stable Stackelberg equilibrium (SESS). We study the Stackelberg evolutionary game setting in which there is a single leading player and a symmetric population of followers. The leader selects an optimal mixed strategy, anticipating that the follower population plays an evolutionarily stable strategy (ESS) in the induced subgame and may satisfy additional ecological conditions. We consider both leader-optimal and follower-optimal selection among ESSs, which arise as special cases of our framework. Prior approaches to Stackelberg evolutionary games either define the follower response via evolutionary dynamics or assume rational best-response behavior, without explicitly enforcing stability against invasion by mutations. We present algorithms for computing SESS in discrete and continuous games, and validate the latter empirically. Our model applies naturally to biological settings; for example, in cancer treatment the leader represents the physician and the followers correspond to competing cancer cell phenotypes.

Tags

ai-safety (imported, 100%) · csgt (suggested, 92%) · preprint (suggested, 88%)


Full Text

46,674 characters extracted from source content.


Evolutionarily Stable Stackelberg Equilibrium

Sam Ganzfried
Ganzfried Research
sam.ganzfried@gmail.com

I Introduction

While Nash equilibrium has emerged as the standard solution concept in game theory, it is often criticized as being too weak: games often contain multiple Nash equilibria (sometimes even infinitely many), and we want to select one that satisfies other natural properties. For example, one popular concept that refines Nash equilibrium is the evolutionarily stable strategy (ESS). A mixed strategy in a two-player symmetric game is an evolutionarily stable strategy if it is robust to being overtaken by a mutation strategy. Formally, $x^*$ is an ESS if for every mixed strategy $x$ that differs from $x^*$, there exists $\epsilon_0 = \epsilon_0(x) > 0$ such that, for all $\epsilon \in (0, \epsilon_0)$,

$$(1-\epsilon)\,u_1(x, x^*) + \epsilon\, u_1(x, x) < (1-\epsilon)\,u_1(x^*, x^*) + \epsilon\, u_1(x^*, x).$$
From a biological perspective, we can interpret $x^*$ as a distribution among "normal" individuals within a population, and consider a mutation that uses strategy $x$, where the proportion of mutants in the population is $\epsilon$. In an ESS, the expected payoff of the mutation is smaller than the expected payoff of a normal individual, so the proportion of mutants will decrease and eventually disappear over time, with the composition of the population returning to being mostly $x^*$. An ESS is therefore a mixed strategy that is immune to being overtaken by mutations. ESS was initially proposed by mathematical biologists motivated by applications such as population dynamics (e.g., maintaining robustness to mutations within a population of humans or animals) [21, 20]. A common example is the 2x2 game where strategies correspond to an "aggressive" Hawk or a "peaceful" Dove strategy. A recent paper has proposed a similar game in which an aggressive malignant cell competes with a passive normal cell for biological energy, with applications to cancer eradication [6]. While Nash equilibrium is defined for general multiplayer games, ESS is traditionally defined specifically for two-player symmetric games. ESS is a refinement of Nash equilibrium: if $x^*$ is an ESS, then $(x^*, x^*)$ (i.e., the strategy profile where both players play $x^*$) is a (symmetric) Nash equilibrium [19]. Of course the converse is not necessarily true (not every symmetric Nash equilibrium is an ESS), or else ESS would be a trivial refinement. In fact, an ESS is not guaranteed to exist in games with more than two pure strategies per player (while Nash equilibrium is guaranteed to exist in all finite games). For example, while rock-paper-scissors has a mixed-strategy Nash equilibrium (which puts equal weight on all three actions), it has no ESS [19] (that work considers a version where payoffs are 1 for a victory, 0 for a loss, and 2/3 for a tie).
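The ESS condition above can be spot-checked numerically. A minimal pure-Python sketch: the Hawk-Dove payoffs (with value 2 and cost 4) and the rock-paper-scissors payoffs (2/3 for a tie) instantiate the examples discussed above; the finite mutant sample is only an illustrative check, not an exhaustive certificate.

```python
# Numeric sketch of the two-player ESS condition: x* is an ESS iff for every
# mutant x != x*, either (i) u(x*,x*) > u(x,x*), or (ii) those payoffs are
# equal and u(x*,x) > u(x,x). Payoff matrices below are illustrative.

def payoff(A, x, y):
    """Expected payoff u(x, y) = x^T A y for the row player."""
    return sum(x[i] * A[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

def resists_invasion(A, xstar, x, tol=1e-9):
    """True if mutant x cannot invade resident x* under the ESS condition."""
    a, b = payoff(A, xstar, xstar), payoff(A, x, xstar)
    if a > b + tol:
        return True
    if abs(a - b) <= tol:
        return payoff(A, xstar, x) > payoff(A, x, x) + tol
    return False

# Hawk-Dove with V=2, C=4: the mixed strategy (1/2, 1/2) is an ESS.
HD = [[-1.0, 2.0], [0.0, 1.0]]
hd_star = [0.5, 0.5]
mutants = [[1.0, 0.0], [0.0, 1.0], [0.25, 0.75], [0.9, 0.1]]
print(all(resists_invasion(HD, hd_star, x) for x in mutants))  # True

# Rock-paper-scissors with payoffs 1/0 for win/loss and 2/3 for a tie:
# the uniform Nash equilibrium is NOT an ESS (pure Rock invades via case ii).
RPS = [[2/3, 1.0, 0.0], [0.0, 2/3, 1.0], [1.0, 0.0, 2/3]]
print(resists_invasion(RPS, [1/3, 1/3, 1/3], [1.0, 0.0, 0.0]))  # False
```

The rock-paper-scissors check illustrates exactly the failure mode described in the text: the mutant ties against the resident but outperforms it against itself.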
There exists a polynomial-time algorithm for computing Nash equilibrium (NE) in two-player zero-sum games, while for two-player non-zero-sum and multiplayer games computing an NE is PPAD-complete, and it is widely conjectured that no efficient (polynomial-time) algorithm exists. However, several algorithms have been devised that perform well in practice. The problem of determining whether a game has an ESS was shown to be both NP-hard and coNP-hard, and also to be contained in $\Sigma_2^P$ (the class of decision problems that can be solved in nondeterministic polynomial time given access to an NP oracle) [7]. Subsequently it was shown that this problem is in fact $\Sigma_2^P$-complete [5], and even more challenging for more than two players [2]. Note that this result is for determining whether an ESS exists (as discussed above, some games have no ESS), not for the complexity of computing an ESS in games for which one exists. Thus, computing an ESS is significantly more difficult than computing an NE. Several approaches have been proposed for computing ESS in two-player games [14, 1, 4, 3, 23].

A normal-form game consists of a finite set of players $N = \{1, \dots, n\}$, a finite set of pure strategies $S_i$ for each player $i$, and a real-valued utility for each player for each strategy vector (aka strategy profile), $u_i : \times_i S_i \to \mathbb{R}$. In a symmetric normal-form game, all strategy spaces $S_i$ are equal and the utility functions satisfy the following symmetry condition: for every player $i \in N$, pure strategy profile $(s_1, \dots, s_n) \in S^n$, and permutation $\pi$ of the players,

$$u_i(s_1, \dots, s_n) = u_{\pi(i)}(s_{\pi(1)}, \dots, s_{\pi(n)}).$$

This allows us to remove the player index of the utility function and just write $u(s_1, \dots, s_n)$, where it is implied that the utility is for player 1 (we can simply permute the players to obtain the utilities of the other players).
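In the two-player case the symmetry condition reduces to the familiar statement that player 2's payoff matrix is the transpose of player 1's; a quick sketch (the matrix is the illustrative Hawk-Dove game from above):

```python
# Two-player symmetric game: the symmetry condition u_i(s_1,...,s_n) =
# u_{pi(i)}(s_{pi(1)},...,s_{pi(n)}) reduces to A2 = A1^T, so a single
# payoff matrix suffices. Matrix values are illustrative (Hawk-Dove).

A1 = [[-1.0, 2.0], [0.0, 1.0]]                         # row player's payoffs
A2 = [[A1[j][i] for j in range(2)] for i in range(2)]  # transpose of A1

def u(player, s1, s2):
    """u_i(s1, s2): payoff to `player` when player 1 plays s1, player 2 plays s2."""
    return A1[s1][s2] if player == 1 else A2[s1][s2]

# Symmetry: swapping the players' seats swaps the payoffs.
print(all(u(1, s1, s2) == u(2, s2, s1)
          for s1 in range(2) for s2 in range(2)))  # True
```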
We write $u_i$ for notational convenience, but note that only a single utility function must be specified, which applies to all players. Let $\Sigma_i$ denote the set of mixed strategies of player $i$ (probability distributions over elements of $S_i$). If players follow mixed strategy profile $x = (x^{(1)}, \dots, x^{(n)})$, where $x^{(i)} \in \Sigma_i$, the expected payoff to player $i$ is

$$u_i(x^{(1)}, \dots, x^{(n)}) = \sum_{s_1, \dots, s_n \in S} x^{(1)}_{s_1} \cdots x^{(n)}_{s_n}\, u_i(s_1, \dots, s_n).$$

We write $u_i(x) = u_i(x^{(i)}, x^{(-i)})$, where $x^{(-i)}$ denotes the vector of strategies of all players except $i$. If all players follow the same mixed strategy $x$, then for all players $i$:

$$u_i(x) = u_i(x, \dots, x) = \sum_{s_1, \dots, s_n \in S} x_{s_1} \cdots x_{s_n}\, u_i(s_1, \dots, s_n).$$

Definition 1. A mixed strategy profile $x^*$ is a Nash equilibrium if for each player $i \in N$ and for each mixed strategy $x^{(i)} \in \Sigma_i$:

$$u_i(x^{*(i)}, x^{*(-i)}) \geq u_i(x^{(i)}, x^{*(-i)}).$$

Definition 2. A mixed strategy profile $x^*$ in a symmetric normal-form game is a symmetric Nash equilibrium if it is a Nash equilibrium and $x^{*(1)} = x^{*(2)} = \dots = x^{*(n)}$.

Definition 3. A mixed strategy $x^* \in \Sigma_1$ is evolutionarily stable in a symmetric normal-form game if for each mixed strategy $x \neq x^*$ exactly one of the following holds:
1. $u_1(x^*, x^*, \dots, x^*) > u_1(x, x^*, \dots, x^*)$;
2. $u_1(x^*, x^*, \dots, x^*) = u_1(x, x^*, \dots, x^*)$ and $u_1(x^*, x, x^*, \dots, x^*) > u_1(x, x, x^*, \dots, x^*)$.

It has been proven that every symmetric normal-form game has at least one symmetric Nash equilibrium [24]. It is clear from Definition 3 that every evolutionarily stable strategy in a symmetric normal-form game must be a symmetric Nash equilibrium (SNE). Thus, a natural approach for ESS computation is to first compute SNEs and then perform subsequent procedures to determine whether they are ESSs.
Now we present the Stackelberg evolutionary game setting. The leader's set of pure strategies is $M = \{1, \dots, m\}$ and the set of follower phenotypes is $P = \{1, \dots, n\}$. The leader's utility function is given by $A_L \in \mathbb{R}^{m \times n}$, where $A_L(\ell, i)$ gives the payoff to the leader when the leader plays pure strategy $\ell \in M$ and the follower population state is phenotype $i \in P$. The leader's expected utility when the leader plays mixed strategy $\sigma$ and the follower plays mixed strategy $x$ is

$$U_L(\sigma, x) = \sum_{\ell=1}^{m} \sum_{i=1}^{n} \sigma_\ell\, x_i\, A_L(\ell, i).$$

The follower payoff tensor is $A_F \in \mathbb{R}^{m \times n \times n}$, where $A_F(\ell, i, j)$ is the payoff (fitness) to a follower of phenotype $i \in P$ when the leader plays pure strategy $\ell \in M$ and the interacting follower has phenotype $j \in P$. For each fixed $\ell \in M$, $A_F(\ell, \cdot, \cdot)$ is an $n \times n$ symmetric payoff matrix. After the leader commits to mixed strategy $\sigma \in \Delta(M)$, the induced follower payoff matrix is $B_\sigma \in \mathbb{R}^{n \times n}$ with

$$B_\sigma = \sum_{\ell=1}^{m} \sigma_\ell\, A_F(\ell, \cdot, \cdot).$$

The follower population then plays a symmetric evolutionary game with payoff matrix $B_\sigma$. A population state $x \in \Delta(P)$ is admissible for $\sigma$ iff it is an ESS of $B_\sigma$. Let $\mathcal{E}(\sigma)$ denote the set of ESSs of $B_\sigma$, let $g_\sigma : \Delta(P) \to \mathbb{R}$ denote the follower's ESS selection function, and let $G(\sigma) = \arg\max_{x \in \mathcal{E}(\sigma)} g_\sigma(x)$.

We now present our general definition for evolutionarily stable Stackelberg equilibrium (SESS) in Definition 4. We also define two natural special cases. In the first case, the followers play the ESS that is most beneficial to the leader. We refer to this as an optimistic evolutionarily stable Stackelberg equilibrium (OSESS); in this case $g_\sigma(x) = U_L(\sigma, x)$. We also consider pessimistic evolutionarily stable Stackelberg equilibrium (PSESS), in which the followers play the ESS that is worst for the leader.
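The construction of the induced follower game and the leader's expected utility is direct arithmetic; a minimal sketch (the 2x2x2 tensor $A_F$ and matrix $A_L$ below are hypothetical examples, not from the paper):

```python
# Sketch of the induced follower game: after the leader commits to mixed
# strategy sigma, the followers play the symmetric game with payoff matrix
# B_sigma = sum_l sigma_l * A_F(l, ., .). All payoff values are illustrative.

def induced_matrix(A_F, sigma):
    """B_sigma[i][j] = sum over leader actions l of sigma[l] * A_F[l][i][j]."""
    n = len(A_F[0])
    return [[sum(sigma[l] * A_F[l][i][j] for l in range(len(sigma)))
             for j in range(n)] for i in range(n)]

def leader_utility(A_L, sigma, x):
    """U_L(sigma, x) = sum_{l,i} sigma_l * x_i * A_L[l][i]."""
    return sum(sigma[l] * x[i] * A_L[l][i]
               for l in range(len(sigma)) for i in range(len(x)))

# Two leader actions, two follower phenotypes (hypothetical payoffs).
A_F = [[[3.0, 0.0], [0.0, 2.0]],   # follower matrix under leader action 1
       [[1.0, 1.0], [1.0, 1.0]]]   # follower matrix under leader action 2
A_L = [[4.0, 1.0], [2.0, 2.0]]
sigma = [0.5, 0.5]
B = induced_matrix(A_F, sigma)
print(B)                                    # [[2.0, 0.5], [0.5, 1.5]]
print(leader_utility(A_L, sigma, [0.25, 0.75]))  # 1.875
```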
While the followers are not rationally trying to harm the leader, we can view PSESS as an SESS that is robust to worst-case evolutionary stability. For PSESS we have $g_\sigma(x) = -U_L(\sigma, x)$. OSESS and PSESS are defined below in Definitions 5-6. For OSESS we have $G(\sigma) = \arg\max_{x \in \mathcal{E}(\sigma)} U_L(\sigma, x)$, and for PSESS $G(\sigma) = \arg\min_{x \in \mathcal{E}(\sigma)} U_L(\sigma, x)$. SESS refines Stackelberg equilibrium by restricting the follower response to strategies that are evolutionarily stable in the induced population game.

Definition 4 (Evolutionarily Stable Stackelberg Equilibrium). Given a selection correspondence $G(\sigma) \subseteq \mathcal{E}(\sigma)$, $(\sigma^*, x^*)$ is an evolutionarily stable Stackelberg equilibrium if

$$\sigma^* \in \arg\max_{\sigma \in \Delta(M)}\; \max_{x \in G(\sigma)} U_L(\sigma, x) \quad \text{and} \quad x^* \in G(\sigma^*).$$

Definition 5 (Optimistic SESS). A pair $(\sigma^*, x^*) \in \Delta(M) \times \Delta(P)$ is an optimistic evolutionarily stable Stackelberg equilibrium if

$$(\sigma^*, x^*) \in \arg\max_{\sigma \in \Delta(M),\, x \in \mathcal{E}(\sigma)} U_L(\sigma, x).$$

Definition 6 (Pessimistic SESS). A pair $(\sigma^*, x^*) \in \Delta(M) \times \Delta(P)$ is a pessimistic evolutionarily stable Stackelberg equilibrium if

$$\sigma^* \in \arg\max_{\sigma \in \Delta(M)}\; \min_{x \in \mathcal{E}(\sigma)} U_L(\sigma, x) \quad \text{and} \quad x^* \in \arg\min_{x \in \mathcal{E}(\sigma^*)} U_L(\sigma^*, x).$$

II Continuous Stackelberg-ESS

We now extend the Stackelberg-ESS framework to continuous Stackelberg evolutionary games, which arise naturally in biological and control applications such as cancer treatment. In this setting, the leader selects a continuous decision variable, while the followers correspond to an evolving population whose response is described by coupled ecological and evolutionary dynamics. Let $M \subseteq \mathbb{R}_+^d$ denote the leader decision set (e.g., treatment doses or control parameters). For a given leader decision $m \in M$, the followers are described by a collection of continuous state variables, including population sizes and traits.
We denote the follower outcome by $z = (x, u) \in Z$, where $x \in \mathbb{R}_+^n$ represents population abundances and $u \in U \subseteq \mathbb{R}^n$ represents trait or strategy variables. As in prior work, we assume that $U = [0,1]^n$. In the context of cancer treatment, $d$ denotes the number of drugs and $n$ denotes the number of cancer cell phenotypes. The leader objective is given by a function $Q : M \times Z \to \mathbb{R}$. For each $m \in M$, the leader induces an eco-evolutionary system governing the dynamics of $(x, u)$. For a fixed leader decision $m$, we define the set of admissible follower responses $\mathcal{E}(m) \subseteq Z$ as the set of eco-evolutionarily stable outcomes under $m$. Informally, a follower outcome $z^* = (x^*, u^*)$ is admissible if it satisfies any required ecological equilibrium conditions and is stable against invasion by rare mutant phenotypes in the induced eco-evolutionary system. We assume throughout that $\mathcal{E}(m)$ is nonempty for the leader decisions under consideration. When multiple evolutionarily stable outcomes exist for the same leader decision, we model their selection via a correspondence $G(m) \subseteq \mathcal{E}(m)$. We now define continuous analogues of Stackelberg-ESS.

Definition 7 (Continuous Stackelberg-ESS). Let $G(m) \subseteq \mathcal{E}(m)$ be a selection correspondence. A pair $(m^*, z^*) \in M \times Z$ is a continuous Stackelberg-ESS if

$$m^* \in \arg\max_{m \in M}\; \max_{z \in G(m)} Q(m, z) \quad \text{and} \quad z^* \in G(m^*).$$

Definition 8 (Optimistic continuous Stackelberg-ESS). A pair $(m^*, z^*) \in M \times Z$ is an optimistic continuous Stackelberg-ESS if

$$(m^*, z^*) \in \arg\max_{m \in M,\, z \in \mathcal{E}(m)} Q(m, z).$$

Definition 9 (Pessimistic continuous Stackelberg-ESS). A pair $(m^*, z^*) \in M \times Z$ is a pessimistic continuous Stackelberg-ESS if

$$m^* \in \arg\max_{m \in M}\; \min_{z \in \mathcal{E}(m)} Q(m, z) \quad \text{and} \quad z^* \in \arg\min_{z \in \mathcal{E}(m^*)} Q(m^*, z).$$

First we present the Stackelberg evolutionary game formulation of cancer treatment from prior work [18].
In this formulation (Equation 1), the objective $Q$ corresponds to the quality-of-life function that is maximized by the leader. For each cell type $i$, there is a fitness function $G_i(u_i, m, x)$ that the follower is trying to maximize. We assume that the dynamics of the population $x$ are governed by $\dot{x} = G(t)\,x$. To ensure that we are in an equilibrium of the ecological dynamics we must have $\dot{x}_i = 0$ for all $i$. Note that in this formulation the follower is playing a best response by maximizing the fitness function. By contrast, in Stackelberg-ESS we assume that the behavior of the cancer cells is determined by evolutionary stability conditions as opposed to explicit utility maximization. An algorithm for solving this problem has been presented that is based on a nonconvex quadratic program formulation [8]. This algorithm has been demonstrated to quickly compute a globally optimal solution on a realistic example problem proposed by prior work [18].

$$\max_{m^*, u^*, x^*} \; Q(m^*, u^*, x^*) \tag{1}$$
$$\text{s.t.} \quad \dot{x}_i^* = 0, \quad i = 1, \dots, n,$$
$$u_i^* \in \arg\max_{u_i \in [0,1]} G_i(u_i, m^*, x^*), \quad i = 1, \dots, n,$$
$$m^* \geq 0, \quad x^* \geq 0, \quad 0 \leq u^* \leq 1.$$

The formulation for optimistic continuous Stackelberg-ESS is given by Equation 2. This formulation is similar to the previous one for standard Stackelberg equilibrium, except that instead of maximizing the fitness functions, the follower ensures that their strategy is an ESS in the induced subgame given that the leader plays $m^*$. We will focus our attention on the problem of computing OSESS under this formulation, since it is perhaps the easiest case of SESS to solve: both players are aligned in maximizing $Q$.

$$\max_{m^*, u^*, x^*} \; Q(m^*, u^*, x^*) \tag{2}$$
$$\text{s.t.} \quad \dot{x}_i^* = 0, \quad i = 1, \dots, n,$$
$$(x^*, u^*) \in \mathcal{E}(m^*),$$
$$m^* \geq 0, \quad x^* \geq 0, \quad 0 \leq u^* \leq 1.$$

Equation 3 provides an equivalent reformulation of Equation 2.
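The ESS-membership constraint $(x^*, u^*) \in \mathcal{E}(m^*)$ amounts to two checks on a candidate solution: ecological equilibrium and no mutant trait with positive fitness. A toy numeric sketch, assuming a hypothetical linear fitness function rather than the paper's cancer model, with grid search standing in for the global maximization:

```python
# Stand-in for the continuous-trait ESS check: at fixed leader decision m and
# resident state (x*, u*), require G(u*, m, x*) ~ 0 (ecological equilibrium
# for an occupied phenotype) and sup_{u in [0,1]} G(u, m, x*) <= tol (no
# mutant trait can grow). G below is a toy fitness, not the paper's model.

def G(u, m, x):
    # Toy fitness: logistic growth, drug kill m*(1-u) mitigated by the
    # resistance trait u, which carries a linear cost g*u.
    r, K, g = 0.5, 100.0, 0.5
    return r * (1.0 - x / K) - m * (1.0 - u) - g * u

def certify(G, m, x_star, u_star, tol=1e-3, steps=1000):
    # Ecological equilibrium: resident fitness must vanish (x* > 0).
    eq = abs(G(u_star, m, x_star)) <= tol
    # Invasion test: no mutant trait u in [0,1] achieves positive fitness.
    g_max = max(G(k / steps, m, x_star) for k in range(steps + 1))
    return eq and g_max <= tol, g_max

# Resident chosen so G(0, m, x*) = 0 and G is maximized at u* = 0.
m, x_star, u_star = 0.3, 40.0, 0.0
ok, g_max = certify(G, m, x_star, u_star)
print(ok, g_max)  # True 0.0
```

Here the cost of resistance exceeds its benefit ($g > m$), so the no-resistance trait $u^* = 0$ is uninvadable and the resident abundance sits exactly at the ecological equilibrium.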
Note that we use the continuous-trait definition of ESS for our setting [16, 25, 11], as opposed to the discrete ESS definition presented previously. Given fixed $m$, a resident state $(x^*, u^*)$ is an ESS iff $x^*$ is an ecological equilibrium (i.e., $\dot{x}^* = 0$) and no rare mutant can grow, i.e., $\sup_{u_i} G_i(u_i, m, x^*) \leq 0$ for all $i$. Recall that $\dot{x} = Gx$, so the constraint $\dot{x}_i^* = 0$ is equivalent to $G_i(u_i^*, m^*, x^*)\, x_i^* = 0$. Recall that $(x^*, u^*) \in \mathcal{E}(m)$ iff $(x^*, u^*)$ is evolutionarily stable in the eco-evolutionary system induced by $m$. This holds iff for all $i$ and $u_i$, a mutation to $u_i$ does not lead to positive fitness (growth rate), i.e., $G_i(u_i, m, x^*) \leq 0$.

$$\max_{m^*, u^*, x^*} \; Q(m^*, u^*, x^*) \tag{3}$$
$$\text{s.t.} \quad x_i^*\, G_i(u_i^*, m^*, x^*) = 0, \quad i = 1, \dots, n,$$
$$G_i(u_i, m^*, x^*) \leq 0 \quad \forall u_i \in [0,1], \quad i = 1, \dots, n,$$
$$m^* \geq 0, \quad x^* \geq 0, \quad 0 \leq u^* \leq 1.$$

III Algorithms

We first present an algorithm for computing OSESS in normal-form games, followed by an algorithm for continuous-trait games. Pseudocode for the main algorithm in the normal-form setting is given in Algorithm 1. We compute an OSESS by enumerating candidate follower supports and, for each support, computing a Stackelberg equilibrium of the game where the follower is restricted to strategies with the given support. This is done using the subroutine given in Algorithm 2. Given the Stackelberg equilibrium $(\sigma, x)$, we next check whether $x$ is an ESS in the subgame induced by the leader's strategy $\sigma$; this is done in Algorithm 3. If the strategies do constitute an ESS, then we have found an evolutionarily stable Stackelberg equilibrium (SESS). We then calculate the objective $\sigma^\top A_L x$, and return the SESS with highest value as an OSESS. The algorithm can be easily adapted to output the first SESS found, all SESSs found, or an SESS that optimizes a different objective.
The values used for all numerical tolerance parameters are given in Table I. Algorithm 2 computes a Stackelberg equilibrium strategy with given support $T$ by solving a quadratically-constrained quadratic program (QCQP). The constraints $x_i \geq \epsilon_s$ ensure that the pure strategies in the support are played with positive probability. The objective ensures that the leader's payoff is maximized, and the final two sets of constraints ensure that the follower is best-responding to the leader's strategy. Our algorithm implicitly assumes that the game is nondegenerate and that there exists at least one Stackelberg equilibrium for each follower support. This is a common assumption in equilibrium-finding algorithms [22, 12, 15], and has been studied recently in the context of ESS computation [10, 9]. If the game is degenerate with an infinite continuum of equilibria, our algorithm will fail to find all of them, and therefore may fail to find all SESSs; however, it may still find one. Algorithm 3 also involves solving a QCQP. In practice we solve nonconvex QCQPs with Gurobi's solver, which can solve small problem instances quickly despite their computational complexity. The QCQP checks whether any mutant strategy $y$ can successfully invade a population following strategy $x$ in the game induced by the leader's strategy $\sigma$. The parameter $\epsilon_p$ allows for a small amount of numerical imprecision, and the final constraint ensures that $y$ is numerically distinct from $x$ via the numerical separation parameter $\delta$.
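To make the enumerate-then-certify structure concrete, here is a brute-force sketch for tiny 2x2 games that replaces the QCQP subroutines with grid search over leader strategies and a sampled mutant check (the payoffs are illustrative, and a finite candidate grid is only an approximate stand-in for the paper's exact support enumeration and invasion QCQP):

```python
# Brute-force OSESS sketch for games with 2 leader actions and 2 phenotypes:
# grid over leader strategies sigma, approximate the ESS set of the induced
# game B_sigma by testing grid candidates against sampled mutants, and keep
# the (sigma, x) pair maximizing the leader's payoff. Payoffs are illustrative.

def payoff(B, x, y):
    return sum(x[i] * B[i][j] * y[j] for i in range(2) for j in range(2))

def resists_invasion(B, xstar, mutants, tol=1e-9):
    a = payoff(B, xstar, xstar)
    for x in mutants:
        if max(abs(xstar[k] - x[k]) for k in range(2)) < 1e-6:
            continue  # skip the resident itself
        b = payoff(B, x, xstar)
        if a > b + tol:
            continue
        if abs(a - b) <= tol and payoff(B, xstar, x) > payoff(B, x, x) + tol:
            continue
        return False
    return True

def grid(steps):
    return [[k / steps, 1.0 - k / steps] for k in range(steps + 1)]

def brute_force_osess(A_F, A_L, steps=20):
    best = (float("-inf"), None, None)
    candidates = grid(steps)
    for sigma in grid(steps):
        # Induced follower payoff matrix B_sigma = sum_l sigma_l * A_F[l].
        B = [[sum(sigma[l] * A_F[l][i][j] for l in range(2))
              for j in range(2)] for i in range(2)]
        for x in candidates:
            if not resists_invasion(B, x, candidates):
                continue  # x is not (approximately) an ESS of B_sigma
            v = sum(sigma[l] * x[i] * A_L[l][i]
                    for l in range(2) for i in range(2))
            if v > best[0]:
                best = (v, sigma, x)
    return best

A_F = [[[2.0, 0.0], [0.0, 1.0]],   # leader action 1: coordination game
       [[0.0, 1.0], [1.0, 0.0]]]   # leader action 2: anti-coordination game
A_L = [[3.0, 0.0], [0.0, 1.0]]
v, sigma, x = brute_force_osess(A_F, A_L)
print(round(v, 3), sigma, x)  # 3.0 [1.0, 0.0] [1.0, 0.0]
```

In this toy instance the leader commits fully to the coordination game, under which the follower's all-phenotype-1 state is an ESS and also maximizes the leader's payoff.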
Algorithm 1 ComputeDiscreteOSESS
Input: follower payoff tensor $A_F \in \mathbb{R}^{m \times n \times n}$; leader payoff matrix $A_L \in \mathbb{R}^{m \times n}$; minimum support mass $\epsilon_s$; ESS tolerances $\epsilon_p, \delta$
Output: OSESS $(\sigma^\star, x^\star)$
1: $v^\star \leftarrow -\infty$
2: $(\sigma^\star, x^\star) \leftarrow (\emptyset, \emptyset)$
3: for all nonempty supports $T \subseteq P$ do
4:   $(status, \sigma, x) \leftarrow$ SolveSESupport$(A_F, A_L, T, \epsilon_s)$
5:   if $status = $ FEASIBLE then
6:     if IsESS$(A_F, \sigma, x, \epsilon_p, \delta)$ then
7:       $v \leftarrow \sigma^\top A_L x$
8:       if $v > v^\star$ then
9:         $v^\star \leftarrow v$
10:        $(\sigma^\star, x^\star) \leftarrow (\sigma, x)$
11:      end if
12:    end if
13:  end if
14: end for
15: return $(\sigma^\star, x^\star)$

Algorithm 2 SolveSESupport
Input: follower payoff tensor $A_F \in \mathbb{R}^{m \times n \times n}$; leader payoff matrix $A_L \in \mathbb{R}^{m \times n}$; support $T \subseteq P$; minimum support mass $\epsilon_s$
Output: $(status, \sigma, x)$
1: Solve the following optimization problem:
   $\max_{\sigma, x, v} \; \sigma^\top A_L x$
   s.t. $\sigma_\ell \geq 0 \;\; \forall \ell \in M$; $\;\sum_{\ell \in M} \sigma_\ell = 1$;
        $x_i \geq \epsilon_s \;\; \forall i \in T$; $\;x_j = 0 \;\; \forall j \notin T$; $\;\sum_{i \in P} x_i = 1$;
        $\sum_{\ell \in M} \sum_{k \in P} \sigma_\ell\, A_F(\ell, i, k)\, x_k = v \;\; \forall i \in T$;
        $\sum_{\ell \in M} \sum_{k \in P} \sigma_\ell\, A_F(\ell, j, k)\, x_k \leq v \;\; \forall j \notin T$.
2: if the problem is feasible then
3:   return (FEASIBLE, $\sigma$, $x$)
4: else
5:   return (INFEASIBLE, $\emptyset$, $\emptyset$)
6: end if

Algorithm 3 IsESS
Input: follower payoff tensor $A_F \in \mathbb{R}^{m \times n \times n}$; leader mixed strategy $\sigma \in \Delta(M)$; follower mixed strategy $x \in \Delta(P)$; tolerances $\epsilon_p, \delta$
Output: TRUE if $x$ is an ESS of the induced follower game; otherwise FALSE
1: Define the induced follower payoff matrix $B \in \mathbb{R}^{n \times n}$: $B_{ij} \leftarrow \sum_{\ell \in M} \sigma_\ell\, A_F(\ell, i, j) \;\; \forall i, j \in P$.
2: $v \leftarrow x^\top B x$
3: Solve the following optimization problem:
   $\max_{y} \; y^\top B y - x^\top B y$
   s.t. $y_j \geq 0 \;\; \forall j \in P$; $\;\sum_{j \in P} y_j = 1$; $\;|y^\top B x - v| \leq \epsilon_p$; $\;\|y - x\|_2^2 \geq \delta$.
4: if the optimal objective value is $\leq \epsilon_p$ then
5:   return TRUE
6: else
7:   return FALSE
8: end if

Pseudocode for the main algorithm in the continuous setting is given in Algorithm 4. The algorithm follows a generate-and-certify structure: first a candidate solution satisfying the KKT conditions of the invasion maximization problems is computed, and then a global optimization check verifies that no mutant phenotype can invade. The main subroutine, Algorithm 5, generates a candidate OSESS by solving the optimization problem presented in Equation 3. The formulation finds a solution that satisfies the KKT optimality conditions, though it is not guaranteed to be globally optimal. To test whether this candidate solution is globally optimal, we perform the ex-post certification procedure described in Algorithm 6, using a numerical tolerance of $\epsilon_{inv}$. We set $\epsilon_{inv} = 10^{-3}$, which is biologically negligible but significantly larger than Gurobi's default tolerance of $10^{-6}$. Note that if the fitness functions $G_i$ are concave in $u_i$ then the KKT conditions are also sufficient. In our experiments we use previously-studied $Q$ and $G_i$ functions, and the algorithm involves solving nonconvex QCQPs.

Algorithm 4 ComputeContinuousOSESS
Input: leader objective $Q(m, u, x)$; fitness functions $\{G_i(u_i, m, x)\}_{i=1}^n$; feasible set $M$; tolerance $\epsilon_{inv}$
Output: OSESS $(m^\star, u^\star, x^\star)$ if one is found
1: $(m^\star, u^\star, x^\star) \leftarrow$ GenOSESS$(Q, \{G_i\}, M)$
2: for $i = 1$ to $n$ do
3:   $G_i^{max} \leftarrow$ Certify$(i, m^\star, x^\star)$
4: end for
5: if $\max_i G_i^{max} \leq \epsilon_{inv}$ then
6:   return $(m^\star, u^\star, x^\star)$
7: else
8:   return FAILURE
9: end if

Algorithm 5 GenOSESS
Input: leader objective $Q(m, u, x)$; fitness functions $\{G_i(u_i, m, x)\}_{i=1}^n$; feasible set $M$
Output: candidate solution $(m^\star, u^\star, x^\star)$
1: Solve the following optimization problem:
   $\max_{m, u, x, u', \lambda^L, \lambda^U} \; Q(m, u, x)$
   s.t.
     $x_i\, G_i(u_i, m, x) = 0, \quad i = 1, \dots, n$;
     $m \in M$; $\;x \geq 0$; $\;0 \leq u \leq 1$; $\;0 \leq u' \leq 1$;
     $G_i(u'_i, m, x) \leq 0, \quad i = 1, \dots, n$;
     $\frac{\partial G_i}{\partial u_i}(u'_i, m, x) + \lambda_i^L - \lambda_i^U = 0, \quad i = 1, \dots, n$;
     $\lambda_i^L u'_i = 0, \quad i = 1, \dots, n$;
     $\lambda_i^U (1 - u'_i) = 0, \quad i = 1, \dots, n$;
     $\lambda_i^L \geq 0, \; \lambda_i^U \geq 0, \quad i = 1, \dots, n$.
2: return optimal solution $(m^\star, u^\star, x^\star)$

Algorithm 6 Certify
Input: phenotype index $i \in \{1, \dots, n\}$; fixed $(m^\star, x^\star)$
Output: $G_i^{max}$
1: Compute a globally optimal solution of: $\max_u \; G_i(u, m^\star, x^\star)$ s.t. $0 \leq u \leq 1$.
2: return the optimal objective $G_i^{max}$

Parameter          Value
$\epsilon_s$       $10^{-4}$
$\epsilon_p$       $10^{-5}$
$\delta$           $10^{-2}$
$\epsilon_{inv}$   $10^{-3}$

TABLE I: Tolerance parameter values for the algorithms.

IV Experiments

We experiment with our continuous algorithm on a previously studied evolutionary cancer game model [18]. Note that prior work for this model has computed a Stackelberg equilibrium (SE), where the leader commits to a strategy that optimizes the quality-of-life function $Q$ while the cancer cell phenotypes optimize their respective fitness functions $G_i$ (subject to the equilibrium conditions of the ecological dynamics being satisfied) [18, 8]. In contrast, we compute an OSESS in this game, which guarantees that the follower cancer cells are playing an ESS in the game induced by the leader's strategy and that no mutant trait can have positive growth (while SE does not preclude the existence of mutants with positive growth). This is fundamentally a different solution concept, so the players' payoffs may increase or decrease. The model is instantiated by the following functional forms for the fitness functions $G_i$ and quality-of-life function $Q$. Note that the model presentation differs slightly between the paper [18] and the implementation in the code repository [17]; we use the model from the code.
$$G_0 = r_{max}\left(1 - \frac{\alpha_{00} x_0 + \alpha_{01} x_1 + \alpha_{02} x_2}{K}\right) - d - \frac{m_1}{k_1} - \frac{m_2}{k_2}$$
$$G_1 = r_{max}\, e^{-g_1 u_1}\left(1 - \frac{\alpha_{10} x_0 + \alpha_{11} x_1 + \alpha_{12} x_2}{K}\right) - d - \frac{m_1}{b_1 u_1 + k_1} - \frac{m_2}{k_2}$$
$$G_2 = r_{max}\, e^{-g_2 u_2}\left(1 - \frac{\alpha_{20} x_0 + \alpha_{21} x_1 + \alpha_{22} x_2}{K}\right) - d - \frac{m_1}{k_1} - \frac{m_2}{b_2 u_2 + k_2}$$
$$Q = Q^{max} - c\left(\frac{x_0 + x_1 + x_2}{K}\right)^2 - w_1 m_1^2 - w_2 m_2^2 - r_1 u_1^2 - r_2 u_2^2$$

The model has several parameters, whose interpretations are summarized in Table II. Note that in the code, additional parameters $a_0, a_1, a_2, a_3$ are defined, with

$$\begin{bmatrix} \alpha_{00} & \alpha_{01} & \alpha_{02} \\ \alpha_{10} & \alpha_{11} & \alpha_{12} \\ \alpha_{20} & \alpha_{21} & \alpha_{22} \end{bmatrix} = \begin{bmatrix} a_0 & a_1 & a_1 \\ a_2 & a_0 & a_3 \\ a_2 & a_3 & a_0 \end{bmatrix}$$

Parameter        Interpretation
$r_{max}$        Maximum cell growth rate
$g_i$            Cost of resistance strategy for cell type $i$
$\alpha_{ij}$    Interaction coefficient between cell types $i$ and $j$
$K$              Carrying capacity
$d$              Natural death rate
$k_i$            Innate resistance that may be present before drug exposure
$b_i$            Benefit of the evolved resistance trait in reducing therapy efficacy
$Q^{max}$        Quality of life of a healthy patient
$w_i$            Toxicity of drug $i$
$r_i$            Effect of resistance rate of cell type $i$
$c$              Weight for impact of tumor burden vs. drug toxicity/treatment-induced resistance rate

TABLE II: Interpretations of model parameters.

We ran experiments comparing the algorithm for computing a Stackelberg equilibrium [8] to our new algorithm for computing OSESS on a problem instance using the same parameter values as prior work [17], which are provided in Table III. Both approaches involve solving QCQPs using Gurobi's nonconvex mixed-integer quadratically-constrained programming (MIQCP) solver, version 11.0.3 build v11.0.3rc0 [13], with Java version 14.0.2. Gurobi's nonconvex MIQCP solver guarantees global optimality (up to a numerical tolerance).
We used Gurobi's default numerical tolerance value of $10^{-6}$. We used Gurobi's default values for all settings other than DualReductions, which we set to 0 since we observed improved performance by disabling aggressive presolve reductions. We used an Intel Core i7-1065G7 processor with a base clock speed of 1.30 GHz (and maximum turbo boost speed of up to 3.9 GHz) with 16 GB of RAM under 64-bit Windows 11 (8 logical cores/threads).

Parameter    Value
$r_{max}$    0.45
$g_1$        0.5
$g_2$        0.5
$a_0$        1
$a_1$        0.15
$a_2$        0.9
$a_3$        0.9
$K$          10,000
$d$          0.01
$k_1$        5
$k_2$        5
$b_1$        10
$b_2$        10
$Q^{max}$    1
$w_1$        0.5
$w_2$        0.2
$r_1$        0.4
$r_2$        0.4
$c$          0.5

TABLE III: Parameter values used in experiments.

The main results from our experiments are given in Table IV. The optimal physician strategies ($m^*$) vary slightly between the two solutions, and the optimal quality of life $Q^*$ is slightly higher for the OSESS solution than for the SE solution. The SE algorithm ran significantly faster than the OSESS algorithm (around 2 seconds versus 2.2 minutes). Both algorithms are based on KKT necessary optimality conditions and require ex-post certification procedures to ensure global optimality. For both algorithms we recompute the values of $Q$ and $G_i$ exactly by calculating them directly from the solutions output by Gurobi (since the values of $Q$ and $G_i$ in the optimizations themselves contain numerical error). For the OSESS algorithm the certification procedure obtains $G_1^{max} = 7.85 \times 10^{-5}$ and $G_2^{max} = 1.41 \times 10^{-4}$, both significantly below $\epsilon_{inv} = 10^{-3}$, so we conclude that our solution is in fact an OSESS. As it turns out, running the same certification procedure on the Stackelberg equilibrium solution obtains $G_1^{max} = 8.00 \times 10^{-5}$ and $G_2^{max} = 1.42 \times 10^{-4}$, which are also both significantly below $\epsilon_{inv} = 10^{-3}$.
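The direct recomputation described above is plain arithmetic. For instance, plugging the reported OSESS solution into the quality-of-life function $Q$ with the parameter values above reproduces the reported $Q^*$; and, reading the drug-kill terms of the fitness functions as $m_j / (b_j u_i + k_j)$ (our reading of the model equations), the resident fitness $G_0$ comes out at its ecological-equilibrium value of zero up to rounding:

```python
# Recompute Q and G_0 from the reported OSESS solution and the experiment's
# parameter values. The m/(b*u + k) form of the drug-kill terms is our
# reading of the model equations.

# Parameter values and the reported OSESS solution.
r_max, K, d = 0.45, 10_000.0, 0.01
a0, a1 = 1.0, 0.15
k1, k2 = 5.0, 5.0
Q_max, c = 1.0, 0.5
w1, w2, r1, r2 = 0.5, 0.2, 0.4, 0.4
m1, m2 = 0.4003, 0.4571
u1, u2 = 0.1827, 0.2828
x0, x1, x2 = 5823.7239, 9.5179, 946.4278

Q = (Q_max - c * ((x0 + x1 + x2) / K) ** 2
     - w1 * m1**2 - w2 * m2**2 - r1 * u1**2 - r2 * u2**2)
print(round(Q, 4))  # 0.6029  (matches the reported Q*)

# Resident fitness of the normal phenotype (no evolved resistance trait):
G0 = r_max * (1.0 - (a0 * x0 + a1 * x1 + a1 * x2) / K) - d - m1 / k1 - m2 / k2
print(abs(G0) < 1e-4)  # True: x_0 > 0 forces G_0 = 0 at ecological equilibrium
```

That $G_0$ vanishes at the reported abundances is a useful sanity check on the solution: the complementarity constraint $x_i^* G_i = 0$ binds for every occupied phenotype.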
So for these particular game parameters the SE solution obtained is also an OSESS, though this will not be the case in general.

               Stackelberg Equilibrium    OSESS
$m_1^*$        0.4105                     0.4003
$m_2^*$        0.4680                     0.4571
$u_1^*$        0.2139                     0.1827
$u_2^*$        0.2856                     0.2828
$x_0^*$        5731.0481                  5823.7239
$x_1^*$        0.0087                     9.5179
$x_2^*$        950.7623                   946.4278
$Q^*$          0.5978                     0.6029
Runtime (s)    2.2070                     134.6910

TABLE IV: Experimental results for the evolutionary cancer game.

V Conclusion

Stackelberg evolutionary games have emerged as a powerful model for studying many types of biological interactions, including cancer therapy, fishery management, and pest control [18]. In these games, a rational human leader first selects their strategy, then evolutionary followers select their strategies conditional on the leader's strategy (possibly subject to ecological population equilibrium constraints). Prior work has assumed that the follower players act to maximize fitness functions $G_i$. In the resulting Stackelberg equilibrium, the leader and followers simultaneously maximize their respective objective functions. However, this solution does not guarantee that the follower strategy is resistant to invasion by rare mutants. To ensure evolutionary stability we must impose the condition that the fitness functions are nonpositive (subject to numerical tolerance) for all possible mutations, so that mutant populations do not continue to grow. We introduced the new solution concept of evolutionarily stable Stackelberg equilibrium (SESS), in which the leader is a rational utility maximizer while the followers employ an evolutionarily stable strategy (ESS) that is resistant to invasion. In an SESS, the leader commits to a strategy that maximizes their objective assuming the followers will play an ESS in the subgame induced by the leader's strategy.
This imposes a stricter stability requirement on the follower strategy than standard Stackelberg equilibrium, which only requires best-response behavior without ensuring evolutionary stability. We defined SESS both for discrete normal-form games and for continuous-trait games, which arise naturally in biological and control applications such as cancer treatment. In our general formulation, the followers play any ESS in the subgame induced by the leader’s strategy. We also considered the special cases where the followers play the ESS that is most beneficial to the leader (OSESS) and the ESS that is worst for the leader (PSESS). OSESS is perhaps the easiest case of SESS to compute since in a sense all players are aligned in maximizing the leader’s payoff (subject to the followers playing an ESS), which simplifies the optimization to a single outer maximization problem. We presented algorithms for computing OSESS in both the discrete and continuous settings. If we view the follower strategies as population frequencies, then our model is applicable to settings with a large number of follower players; e.g., there may be a large number of cancer cells where each cell has one of a small number of possible phenotypes. We implemented our OSESS algorithm on a previously studied evolutionary cancer game model and demonstrated that its runtime is reasonable in practice. It is not surprising that OSESS computation takes longer than SE computation, since OSESS is essentially SE with additional stability constraints imposed and therefore involves solving a more challenging optimization problem. As it turns out, the Stackelberg equilibrium solution for this game ended up being an OSESS as well, though this is not guaranteed in general.
This suggests that for larger games a good approach may be to first compute an SE, which is more tractable than SESS, and test whether it also satisfies evolutionary stability; if not, we can continue on to the more computationally expensive SESS algorithm.

References

[1] Andris Abakuks. Conditions for evolutionarily stable strategies. Journal of Applied Probability, 17(2):559–562, 1980.
[2] Manon Blanc and Kristoffer Arnsfelt Hansen. Computational complexity of multi-player evolutionarily stable strategies. In International Computer Science Symposium in Russia (CSR 2021), volume 12730 of Lecture Notes in Computer Science, pages 44–58. Springer, 2021.
[3] I. M. Bomze. Detecting all evolutionarily stable strategies. Journal of Optimization Theory and Applications, 75:313–329, 1992.
[4] Mark Broom and Jan Rychtář. Game-Theoretical Models in Biology. CRC Press, 2013.
[5] Vincent Conitzer. The exact computational complexity of evolutionarily stable strategies. In Conference on Web and Internet Economics (WINE-13), pages 96–108, 2013.
[6] David Dingli, Francisco A. C. C. Chalub, Francisco C. Santos, Sven Van Segbroeck, and Jorge M. Pacheco. Cancer phenotype as the outcome of an evolutionary game between normal and malignant cells. British Journal of Cancer, 101(7):1130–1136, 2009.
[7] Kousha Etessami and Andreas Lochbihler. The computational complexity of evolutionarily stable strategies. International Journal of Game Theory, 37(1):93–113, 2008.
[8] Sam Ganzfried. Computing Stackelberg equilibrium for cancer treatment. Games, 15(6), 2024.
[9] Sam Ganzfried. Computing evolutionarily stable strategies in imperfect-information games, 2025. arXiv:2512.10279 [cs.GT].
[10] Sam Ganzfried. Computing evolutionarily stable strategies in multiplayer games, 2025. arXiv:2511.20859 [cs.GT].
[11] Stefan A. H. Geritz, Éva Kisdi, Gábor Meszéna, and Johan A. J. Metz. Evolutionarily singular strategies and the adaptive growth and branching of the evolutionary tree. Evolutionary Ecology, 12(1):35–57, 1998.
[12] Srihari Govindan and Robert Wilson. A global Newton method to compute Nash equilibria. Journal of Economic Theory, 110:65–86, 2003.
[13] Gurobi Optimization, LLC. Gurobi Optimizer Reference Manual, 2026.
[14] John Haigh. Game theory and evolution. Advances in Applied Probability, 7(1):8–11, 1975.
[15] P. Jean-Jacques Herings and Ronald Peeters. Homotopy methods to compute equilibria in game theory. Economic Theory, 42(1):119–156, 2010.
[16] Josef Hofbauer and Karl Sigmund. Evolutionary Games and Population Dynamics. Cambridge University Press, Cambridge, UK, 1998.
[17] Maria Kleshnina. https://github.com/kleshnina/SEGopinion, 2023.
[18] Maria Kleshnina, Sabrina Streipert, Joel Brown, and Kateřina Staňková. Game theory for managing evolving systems: Challenges and opportunities of including vector-valued strategies and life-history traits. Dynamic Games and Applications, 13:1–26, 2023.
[19] Michael Maschler, Eilon Solan, and Shmuel Zamir. Game Theory. Cambridge University Press, 2013.
[20] John Maynard Smith. Evolution and the Theory of Games. Cambridge University Press, Cambridge, UK, 1982.
[21] John Maynard Smith and George R. Price. The logic of animal conflict. Nature, 246:15–18, 1973.
[22] Richard D. McKelvey and Andrew McLennan. Computation of equilibria in finite games. In H. Amann, D. Kendrick, and J. Rust, editors, Handbook of Computational Economics, volume 1, pages 87–142. Elsevier, 1996.
[23] John McNamara, James N. Webb, E. J. Collins, Tamás Székely, and Alasdair I. Houston. A general technique for computing evolutionarily stable strategies based on errors in decision-making. Journal of Theoretical Biology, 189:211–225, 1997.
[24] John Nash. Non-cooperative games. Annals of Mathematics, 54(2):286–295, 1951.
[25] William H. Sandholm. Population Games and Evolutionary Dynamics. MIT Press, Cambridge, MA, 2010.