Lai–Robbins lower bound

The Lai–Robbins lower bound gives an asymptotic lower bound on the regret that any uniformly good algorithm must incur in the stochastic [[Multi-armed bandit|multi-armed bandit problem]]. The original result was proved by [[Tze Leung Lai]] and [[Herbert Robbins]] in 1985 for parametric [[Exponential family|exponential families]]. Later work extended the statement to more general classes of distributions.


== Multi-armed bandit problem ==
# One may consider a ''fixed-horizon'' setting (known T) or an ''anytime'' setting (unknown T).


== Lai–Robbins lower bound ==


The theorem characterizes how many times a suboptimal arm k must be pulled in order to distinguish the instance with \nu_k from an alternative instance with \tilde{\nu}_k, where \tilde{\nu}_k is such that \tilde{\mu}_k > \mu^*.
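The bound is commonly stated as follows (a standard formulation; here N_k(T) denotes the number of pulls of arm k up to time T and \Delta_k = \mu^* - \mu_k the suboptimality gap, notation assumed here, and \operatorname{KL}(\nu_k, \nu^*) is, in the original parametric setting, the divergence between the reward distribution of arm k and that of an optimal arm):

```latex
% For every uniformly good algorithm and every suboptimal arm k:
\liminf_{T \to \infty} \frac{\mathbb{E}[N_k(T)]}{\log T}
  \;\ge\; \frac{1}{\operatorname{KL}(\nu_k, \nu^*)},
% which, summing the per-arm contributions, yields the regret bound
\liminf_{T \to \infty} \frac{R_T}{\log T}
  \;\ge\; \sum_{k \,:\, \Delta_k > 0} \frac{\Delta_k}{\operatorname{KL}(\nu_k, \nu^*)}.
```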
Consistency imposes that, for every \nu, the number of pulls of an optimal arm must be large. This means that \mu^* is estimated very accurately. The goal is to determine, for a suboptimal arm k, how many samples are needed to be confident, with the appropriate level of confidence, that \mu_k < \mu^*. To do so, we use what is called the '''most confusing instance''': an instance close to \nu such that arm k is optimal. We define it as \tilde{\nu} such that, for all a \neq k, \tilde{\nu}_a = \nu_a, and \tilde{\nu}_k is chosen so that \tilde{\mu}_k > \mu^*. The objective is to determine how many samples of arm k are required to distinguish whether we are in the instance with \nu_k or with \tilde{\nu}_k in terms of \operatorname{KL} distance.
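For concreteness, a minimal sketch of the resulting sample count in the Bernoulli case, where \operatorname{KL}(\nu_k, \nu^*) has a closed form (the specific means and horizon below are illustrative assumptions, not from the article):

```python
import math

def kl_bernoulli(p, q):
    """KL divergence KL(Ber(p) || Ber(q)), clipped away from 0 and 1."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

# Illustrative instance: suboptimal arm with mean 0.4, optimal mean 0.5.
mu_k, mu_star = 0.4, 0.5
T = 10_000

# Per the lower bound, a uniformly good algorithm must pull arm k
# roughly log(T) / KL(mu_k, mu_star) times over a horizon of T rounds.
pulls_needed = math.log(T) / kl_bernoulli(mu_k, mu_star)
print(round(pulls_needed, 1))
```

The closer \mu_k is to \mu^*, the smaller the divergence and hence the more pulls are needed to tell the two instances apart.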


== Algorithms achieving the Lai–Robbins lower bound ==


Several algorithms are known to achieve the Lai–Robbins asymptotic lower bound
under specific assumptions on the reward distribution class \mathcal{D}.
The following is a non-exhaustive list of algorithms matching the lower bound.
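One well-known example is KL-UCB, which matches the bound for Bernoulli rewards. A minimal sketch of its upper-confidence index (the exploration level log(t) is a simplified choice; practical variants add lower-order terms):

```python
import math

def kl_bernoulli(p, q):
    """KL divergence KL(Ber(p) || Ber(q)), clipped away from 0 and 1."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def klucb_index(mean, n_pulls, t, precision=1e-6):
    """KL-UCB index for a Bernoulli arm: the largest q >= mean such that
    n_pulls * KL(mean, q) <= log(t), found by bisection (KL is increasing
    in q on [mean, 1])."""
    level = math.log(t) / n_pulls
    lo, hi = mean, 1.0
    while hi - lo > precision:
        mid = (lo + hi) / 2
        if kl_bernoulli(mean, mid) <= level:
            lo = mid
        else:
            hi = mid
    return lo

# At each round, KL-UCB pulls the arm with the largest index; indices
# shrink toward the empirical mean as an arm accumulates pulls.
print(klucb_index(0.4, 100, 1000))
```

As n_pulls grows, the index concentrates around the empirical mean, so suboptimal arms are pulled only on the order of log(t) / KL times, matching the lower bound.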