Suppose that for all $n \in \mathbf{N}$, $X_n$ and $Y_n$ are independent random variables with
$$X_n \sim \mathtt{Binomial}(n, 1-q),$$
and
$$Y_n \sim \mathtt{Poisson}(n(q + \epsilon_n)),$$
where $q \in (0,1)$ and $(\epsilon_n)$ is a deterministic sequence such that $\epsilon_n \to 0$ as $n \to \infty$.
Goal:
I'm looking for a way to solve the following "signal extraction / estimation" problem:
For a sequence $s_n \geq 0$ with $n s_n \in \mathbf{N}$, show that as $n \to \infty$,
$$\frac{\mathbf{E}(X_n \mid X_n + Y_n = n s_n)}{n} = 1 - q + O(|s_n - 1|) + O(\epsilon_n).$$
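(As a quick empirical check of what is being claimed, one can crudely sample the conditional law by rejection; this is only an illustration, and the choices of $q$, $\epsilon_n$, and $s_n$ below are arbitrary.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary illustrative parameters.
q, n = 0.3, 200
eps_n = 1.0 / np.sqrt(n)
m = int(round(n * 1.05))  # condition on X_n + Y_n = n*s_n with s_n = 1.05

# Crude rejection sampling of the law of X_n given X_n + Y_n = m.
X = rng.binomial(n, 1 - q, size=2_000_000)
Y = rng.poisson(n * (q + eps_n), size=2_000_000)
hit = X + Y == m

print(X[hit].mean() / n, 1 - q)  # conditional mean over n vs. its limit
```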
Heuristic:
Here's why I think it's true. We know that $n^{-1} X_n$ and $n^{-1} Y_n$ are both approximately Gaussian, and moreover, if $Z_1, Z_2$ are independent Gaussians with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$ respectively, then the law of $Z_1$ given $Z_1 + Z_2 = s$ is also Gaussian and
$$\mathbf{E}(Z_1 \mid Z_1 + Z_2 = s) = \mu_1 + \frac{\sigma_1^2}{\sigma_1^2 + \sigma_2^2}(s - \mu_1 - \mu_2),$$
that is, the discrepancy between the observed statistic and the expectation of the sum is apportioned according to the ratio of the variances.
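For completeness, this is the usual linear projection identity for jointly Gaussian vectors: since $\mathrm{Cov}(Z_1, Z_1 + Z_2) = \sigma_1^2$, we have
$$\mathbf{E}(Z_1 \mid Z_1 + Z_2 = s) = \mu_1 + \frac{\mathrm{Cov}(Z_1, Z_1 + Z_2)}{\mathrm{Var}(Z_1 + Z_2)}\bigl(s - \mathbf{E}(Z_1 + Z_2)\bigr),$$
which is exactly the display above.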
If we naively assume that this property carries over to the Gaussian limits of $n^{-1} X_n$ and $n^{-1} Y_n$, then we can expect that
\begin{align}
\frac{\mathbf{E}(X_n \mid X_n + Y_n = n s_n)}{n} & = \mathbf{E}(n^{-1} X_n \mid n^{-1} X_n + n^{-1} Y_n = s_n) \\
& \approx 1 - q + \frac{q(1-q)}{q(1-q) + q + \epsilon_n}\bigl(s_n - (1-q) - (q + \epsilon_n)\bigr) \\
& = 1 - q + O(|s_n - 1|) + O(\epsilon_n).
\end{align}
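(This heuristic can also be checked numerically by computing the conditional expectation exactly from the two probability mass functions; again, the parameter choices below are arbitrary illustrations.)

```python
import numpy as np
from scipy.stats import binom, poisson

# Arbitrary illustrative parameters.
q, n = 0.3, 2000
eps_n = 1.0 / np.sqrt(n)
m = int(round(n * 1.02))  # n*s_n with s_n = 1.02
s_n = m / n

# Exact conditional expectation E(X_n | X_n + Y_n = m) from the pmfs.
k = np.arange(0, min(n, m) + 1)
w = binom.pmf(k, n, 1 - q) * poisson.pmf(m - k, n * (q + eps_n))
exact = np.sum(k * w) / np.sum(w) / n

# Variance-ratio prediction from the Gaussian heuristic above.
ratio = q * (1 - q) / (q * (1 - q) + q + eps_n)
heuristic = (1 - q) + ratio * (s_n - (1 - q) - (q + eps_n))

print(exact, heuristic, 1 - q)
```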
Attempt(s):
- Via the local limit theorem: my main attempt was a brute-force approach, trying to prove the result directly by approximating the probability mass functions of $X_n$ and $Y_n$ by Gaussian densities via the local limit theorem. That is, we can write
$$\frac{\mathbf{E}(X_n \mid X_n + Y_n = n s_n)}{n} = \frac{1}{n} \sum_{k=0}^{n} k \, \frac{\mathbf{P}(X_n = k)\, \mathbf{P}(Y_n = n s_n - k)}{\mathbf{P}(X_n + Y_n = n s_n)}.$$
Each of the probabilities in the sum can be approximated by a Gaussian density with an error term that is $O(n^{-1/2})$ uniformly in $k$. Carrying this out is, however, extremely messy, and one has to be very careful about how accurately the Riemann sums that appear are approximated by their corresponding integrals. (A numerical version of this Gaussian-density substitution is sketched after this list.)
- Looking for relevant tricks/results under the theme "signal extraction / estimation": essentially, the problem here is to estimate/extract a signal from an observation corrupted by independent additive (and approximately Gaussian) noise. It seems to me that this should be a well-studied problem, but searches on variants of my question above only turn up standard undergraduate results on sums of i.i.d. random variables.
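Here is the numerical version of the Gaussian-density substitution mentioned in the first attempt (a sketch only, with the same arbitrary parameters as before). For moderate $n$ the surrogate weights should give nearly the same conditional mean as the exact pmfs.

```python
import numpy as np
from scipy.stats import binom, poisson, norm

# Same arbitrary illustrative parameters as before.
q, n = 0.3, 2000
eps_n = 1.0 / np.sqrt(n)
m = int(round(n * 1.02))
k = np.arange(0, min(n, m) + 1)

# Exact weights P(X_n = k) P(Y_n = m - k).
w_exact = binom.pmf(k, n, 1 - q) * poisson.pmf(m - k, n * (q + eps_n))

# Local-limit-theorem surrogate: each pmf replaced by its Gaussian density.
f_X = norm.pdf(k, loc=n * (1 - q), scale=np.sqrt(n * q * (1 - q)))
f_Y = norm.pdf(m - k, loc=n * (q + eps_n), scale=np.sqrt(n * (q + eps_n)))
w_gauss = f_X * f_Y

for w in (w_exact, w_gauss):
    print(np.sum(k * w) / np.sum(w) / n)  # both should be close to 1 - q
```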
Specific questions:
- Is there a clever way to use the approximately Gaussian behavior of $X_n$ and $Y_n$ to prove this result without the brute-force local limit theorem approach?
- Are there keywords that can lead me to similar results in the signal extraction / estimation literature?