# pr.probability – convergence rate for ergodic Markov chains induced by stable dynamical systems

Consider a deterministic dynamical system on $$mathbb{R}^n$$ defined by the recurrence $$x_{t+1} = f(x_t)$$.

Suppose the dynamical system is stable in the following sense: there exists a $$Q : mathbb{R}^n rightarrow mathbb{R}_{geq 0}$$ and $$lambda in (0, 1)$$ such that for all $$x$$ we have $$Q(f(x)) leq lambda Q(x)$$. Furthermore, for simplicity of this post, let us also assume that (a) $$Q(x) geq psi | x|_2^2$$ for all $$x$$ and (b) $$Q$$ is twice differentiable and its Hessian $$nabla^2 Q(x)$$ is uniformly upper bounded: $$| nabla^2 Q(x) |_{mathrm{op}} leq L$$ for all $$x$$.

Now consider the Markov chain $${X_t}_{t geq 0}$$ given by the transition $$X_{t+1} = f(X_t) + varepsilon_t$$, where each $$varepsilon_t$$ is sampled independently across time from a distribution that is absolutely continuous wrt Lebesgue measure on $$mathbb{R}^n$$. Let $$P$$ denote the transition kernel of this Markov chain.

From standard results in Markov chain theory we know that there exists an invariant measure $$pi$$ and constants $$R > 0$$ and $$r in (0, 1)$$ such that for all measures $$mu$$:
$$| mu P^t – pi |_{mathrm{TV}} leq R r^{t} (1 + mu(Q)), :: mu(Q) = int Q(x) mu(dx), :: t=0,1,2,….$$

The standard way to check this is to verify a Lyapunov drift condition.
For simplicity let us assume that $$varepsilon_t$$ is an isotropic Gaussian in $$mathbb{R}^n$$. Then we use the Lyapunov stability property of $$Q$$, in addition to the assumptions (a) and (b) above, to argue:
begin{align*} mathbb{E}( Q(X_{t+1}) | X_t ) &leq mathbb{E}( Q(f(X_t)) + frac{L}{2} | varepsilon_t |_2^2 |X_t ) \ &leq lambda Q(X_t) + frac{L n}{2} leq frac{1+lambda}{2} Q(X_t) – frac{(1-lambda)psi}{2} |X_t|_2^2 + frac{Ln}{2}. end{align*}
Therefore, we have the drift condition:
begin{align*} &mathbb{E}( Q(X_{t+1}) | X_t ) leq frac{1+lambda}{2} Q(X_t) + frac{Ln}{2} mathbf{1}{X_t in C}, \ &C := left{ x in mathbb{R}^n : |x|_2 leq sqrt{frac{Ln}{psi(1-lambda)}} right}. end{align*}
It remains to check that $$C$$ is small set. In particular, we need to find a probability measure $$nu$$ and positive constant $$eta > 0$$ such that:
$$inf_{x in C} P(x, A) geq eta nu(A) :: forall mathcal{B}(A).$$
The standard way of doing this (see e.g. Section 5.3.5 of Meyn and Tweedie) is to set $$nu(A) = frac{mu(A cap C)}{mu(C)}$$, with $$mu$$ as the $$n$$-dimensional Lebesgue measure.
Writing $$C = { x : |x|_2 leq R }$$ and letting $$q$$ denote the density of an $$n$$-dimensional isotropic Gaussian, we have:
begin{align*} inf_{x in C} P(x, A) &= int_{A} q(f(x) – y) : dy \ &geq int_{A cap C} q(f(x) – y) : dy \ &geq inf_{x in C, y in C} q(f(x) – y) mu(A cap C) \ &= inf_{x in C, y in C} q(f(x) – y) cdot mu(C) cdot nu(A). end{align*}
This means we can take $$eta = inf_{x in C, y in C} q(f(x) – y) cdot mu(C)$$,
and since $$q(cdot)$$ is continuous and $$C$$ compact, we have that $$eta > 0$$, demonstrating $$C$$ is a small set.

This puts us in a position to recover the geometric ergodicty claim.
The problem here though is that the $$eta$$ constant will scale like $$exp{ -R^2 }$$, and since $$R$$ scales as $$sqrt{n}$$, this gives us that $$eta asymp exp{-n}$$.
This in turn tells us that the ergodicity rate $$r$$ will scale like $$1 – e^{-n}$$ (see e.g. Theorem 1.3 of Hairer and Mattingly).

Question: Is this exponential dependence in the ergodicity rate a limitation of the analysis, or is it actually unavoidable in general?