## monte carlo – Sampling from a distribution for importance sampling

I am currently studying Monte Carlo methods and have some trouble understanding how to use importance sampling to perform Monte Carlo integration. What I don’t understand is probably best shown with an example. Say I have the function f(x)=x(1-x) and I want to estimate its integral between 0 and 1 using importance sampling. I choose p(x)=sin(x), so that Integral = 1/N sum(f(x)/p(x)), but I need to sample my points according to the distribution sin(x) and I am unsure how to do that in practice. What I do now is generate random numbers x uniform on (0,1), evaluate the pdf at them so that my points are points=sin(x), and then evaluate the sum with the points I just sampled. When I do so I get a value very far from the true value of 1/6. The result is much better when I just sample X from a normal distribution and do ordinary Monte Carlo integration. What am I doing wrong, and how should I think about it?
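A minimal Python sketch of one way to draw points "according to sin(x)" is inverse-CDF sampling. It assumes the density is the normalised version p(x) = sin(x)/(1 - cos(1)) on [0, 1], since sin(x) itself does not integrate to 1 there. The function names, sample size and seed are illustrative choices, not part of the question.

```python
import math
import random

def f(x):
    # Integrand from the question; its exact integral on [0, 1] is 1/6.
    return x * (1.0 - x)

# sin(x) integrates to 1 - cos(1) on [0, 1], so it is normalised here
# (assumption: the intended density is p(x) = sin(x) / C).
C = 1.0 - math.cos(1.0)

def p(x):
    return math.sin(x) / C

def sample_p(rng):
    # Inverse-CDF sampling: F(x) = (1 - cos x) / C, so x = arccos(1 - C * u)
    # for u uniform on [0, 1).
    u = rng.random()
    return math.acos(1.0 - C * u)

def weight(x):
    # f(x) / p(x); at x = 0 the limit is C because x / sin(x) -> 1.
    return C if x == 0.0 else f(x) / p(x)

def importance_estimate(n, seed=0):
    rng = random.Random(seed)
    return sum(weight(sample_p(rng)) for _ in range(n)) / n

estimate = importance_estimate(200_000)
print(estimate, "vs exact", 1.0 / 6.0)
```

The difference from the procedure in the question is that the uniform numbers are pushed through the inverse CDF rather than through sin(x) itself, and the weights use the normalised density.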

## algebra precalculus – How to find the distribution of cases in the AstraZeneca/Oxford Phase 3 study interim analysis?

I’m trying to figure out whether it’s possible to find how the 131 cases from an interim analysis of the AstraZeneca/Oxford Phase 3 study are distributed among the two arms of two dosing regimens.

Vaccine Efficacy is abbreviated as VE.

Facts from the AstraZeneca November 23rd Press Release, as I understand them:

• There are two dosing regimens. Let’s call them Regimen A and Regimen B.
• Regimen A (2741 participants) has a VE of 90%
• Regimen B (8895 participants) has a VE of 62%
• The combined VE across the two regimens is 70%.
• There were a total of 131 cases in the two regimens.

This is how the AstraZeneca Phase III Study Protocol defines Vaccine Efficacy (VE):

“VE is calculated as RRR = 100*(1-relative risk), which is the incidence of infection in the vaccine group relative to the incidence of infection in the control group expressed as a percentage.”

Side Note: I think they actually made a mistake with the word “relative risk” inside the formula. I think “relative risk” must be “risk ratio”. I looked up the notion of “relative risk” and it’s far more complicated than their own description of the formula. The word “risk ratio” I got from Lesson 3: Measures of Risk – Principles of Epidemiology in Public Health Practice

Given the definition of VE and the total number of cases, I think it must follow that there were 30 cases in the vaccine group and 101 cases in the placebo group, because this is the only split that results in a vaccine efficacy of 70%.

`1-(30/101) = 0.703 ≈ 0.70`

Now the question is, how are these cases (30 in the vaccine group, 101 in the placebo group) spread along the two regimens?

I tried to write down the formulas, as best as I could. I used the following for the variable names:

• `P`: The number of cases in the placebo group P.
• `V`: The number of cases in the vaccine group V.
• Three suffixes: `t` for Total, `a` for Regimen A, `b` for Regimen B.

`Pt = 101 = Pa + Pb`
`Vt = 30 = Va + Vb`
`(1-Va/Pa) = 0.90` → `Va/Pa = 0.1`
`(1-Vb/Pb) = 0.62` → `Vb/Pb = 0.38`

Does this make sense?
Can this be solved?

If so, is there one solution, or multiple solutions?
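As a rough check, the four equations above can be solved by substitution. The Python sketch below assumes, as the formulas do, that VE within each regimen is simply one minus the ratio of vaccine cases to placebo cases (that is, equally sized vaccine and placebo arms). The variable names mirror the question's `Pa`, `Pb`, `Va`, `Vb`.

```python
# Known totals and case ratios taken from the question.
total_placebo = 101   # Pt = Pa + Pb
total_vaccine = 30    # Vt = Va + Vb
ratio_a = 1 - 0.90    # Va / Pa
ratio_b = 1 - 0.62    # Vb / Pb

# Substitute Va = ratio_a * Pa, Vb = ratio_b * Pb and Pa = Pt - Pb
# into Va + Vb = Vt:  ratio_a * (Pt - Pb) + ratio_b * Pb = Vt.
Pb = (total_vaccine - ratio_a * total_placebo) / (ratio_b - ratio_a)
Pa = total_placebo - Pb
Va = ratio_a * Pa
Vb = ratio_b * Pb

print(f"Pa={Pa:.2f}, Pb={Pb:.2f}, Va={Va:.2f}, Vb={Vb:.2f}")
```

Because the two ratios differ, the system has exactly one real solution under these assumptions. It is not an integer solution, so any whole-number case counts obtained by rounding only reproduce the reported efficacies approximately.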

Consider two probability distributions $$D$$ and $$U$$, over $$n$$-bit strings, where $$U$$ is the uniform distribution. We are not given an explicit description of $$D$$: we are only given black-box access, i.e., we are only given a sampling device that can sample from $$D$$. Consider a sample $$z \in \{0, 1\}^{n}$$, taken from either $$D$$ or $$U$$. We want to know which one is the case, and to do that, we consider polynomial-time algorithms that use the sampling device.

Intuitively, it seems obvious that the best polynomial-time algorithm that distinguishes which distribution the sample $$z$$ came from must have gotten that very $$z$$ as a sample at least once when running the sampling device. How do I mathematically formalize this statement?

Also, does this same intuition hold if we are given a polynomial number of samples as input (taken from either $$D$$ or $$U$$) instead of just one and are also given access to a black-box sampler for $$D$$? That is, the best algorithm for distinguishability has to “see” all the input samples at least once while running the sampler, if it were to decide that the input samples come from the distribution $$D$$?

## probability – Why do we use normal approximation for sample proportions of cases involving a binomial distribution?

I’m in high school and am learning about sample proportions. I have run into a question that I cannot answer myself.

Why do we use normal approximation for sample proportions of cases involving a binomial distribution?

Why approximate? We have a binomial distribution, isn’t it more accurate to just use this?

For example,

A company employs a sales team of 20 people, consisting of 12 men and 8 women. 5 sales people are to be selected at random to attend an important conference. Determine the probability that the proportion of men, in a random sample of 5 selected from the sales team, is greater than 0.7.

To solve this question, my teachers would say to use a normal distribution, X ~ N(3, 0.048), and calculate the probability P(X>0.7) in my calculator.

My problem here is: wouldn’t it make more sense to use the binomial distribution X ~ Bin(5, 0.6) and, since 0.7*5 = 3.5 rounds down to 3 (you can’t have 3.5 men), calculate the probability P(X>3) in my calculator?

Can someone please clarify this? 😀

Thanks
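To make the comparison concrete, here is a small Python sketch that evaluates the example three ways. It assumes the normal approximation is applied to the sample proportion with mean 0.6 and variance 0.6·0.4/5 = 0.048. The hypergeometric line is an extra comparison, not something from the question, included because the 5 people are drawn without replacement from 20.

```python
from math import comb
from statistics import NormalDist

n, p = 5, 0.6  # sample size and proportion of men in the team (12/20)

# Binomial model: treats the 5 picks as independent draws with P(man) = 0.6.
binom = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in (4, 5))

# Hypergeometric model: 5 people drawn without replacement from 12 men and 8 women.
hyper = sum(comb(12, k) * comb(8, n - k) for k in (4, 5)) / comb(20, n)

# Normal approximation for the sample proportion (assumed mean p, variance p(1-p)/n).
sigma = (p * (1 - p) / n) ** 0.5
normal = 1 - NormalDist(mu=p, sigma=sigma).cdf(0.7)

print(f"binomial={binom:.4f}, hypergeometric={hyper:.4f}, normal={normal:.4f}")
```

With a sample as small as 5, the three numbers differ noticeably, which is the kind of gap the question is asking about.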

## real analysis – Proof of Levy-Khintchine formula: Question on the existence of an infinitely divisible distribution defined by the formula

I am reading the proof of the Levy-Khintchine formula from Ken Iti Sato’s Levy Processes. However, I cannot understand a line from the proof that, given a symmetric nonnegative definite $$d \times d$$ matrix $$A$$, a measure $$\nu$$ on $$\mathbb{R}^d$$ with $$\nu(\{0\})=0$$ and $$\int (1 \wedge |x|^2) \nu(dx) < \infty$$, and $$\gamma \in \mathbb{R}^d$$, we get an infinitely divisible distribution $$\mu$$ whose characteristic function is given by the formula.

In the proof below, it states that $$\phi_n$$ is the convolution of a Gaussian and a compound Poisson distribution. A distribution $$\mu$$ on $$\mathbb{R}^d$$ is compound Poisson if there exists $$c>0$$ and a distribution $$\sigma$$ on $$\mathbb{R}^d$$ with $$\sigma(\{0\})=0$$ such that the characteristic function is $$\hat\mu(z)= \exp(c(\hat\sigma(z)-1)).$$ $$D$$ here is the closed unit ball.

But I cannot see from below why we get this form for the inner integral. That is, it seems like we should have $$\int_{|x|>1/n} (1-i\langle z,x\rangle 1_D(x)) \nu(dx)=1$$ from the definition above, but I don’t see how we get this. Why does this integral define the characteristic function of a compound Poisson?

## probability – Distribution of p under alternative hypothesis

Suppose $$X_{1}, \dots, X_{n}$$ are i.i.d. Bernoulli with $$P_{\theta}(X_{i} = 1) = \theta = 1 - P_{\theta}(X_{i} = 0)$$.

Here $$n$$ is large. Consider testing $$H_{0} : \theta = 0.5$$ versus $$H_{1} : \theta > 0.5.$$

Let $$\hat\theta_{n} = \sum \frac{X_{i}}{n}$$ be the MLE of $$\theta$$. Consider the large sample test which rejects when $$2\sqrt{n}(\hat\theta_{n} - 0.5) > z_{1-\alpha}$$, where $$z_{1-\alpha}$$ is the $$1 - \alpha$$ quantile of the standard normal distribution.

(i) If $$\theta > 0.5$$, what is the limiting distribution of $$\hat p_{n}$$?

Here is what I believe I know if $$H_{0}$$ is true:

$$2\sqrt{n}(\hat\theta_{n} - 0.5) \xrightarrow{d} N(0,1)$$

$$\therefore \hat p_{n} \xrightarrow{d} U(0,1)$$

If $$H_{a}$$ is true, here is what I have gotten:

$$2\sqrt{n}(\hat\theta_{n} - (0.5 + \theta_{1} - \theta_{0})) \xrightarrow{d} N(0,1)$$

$$\therefore 2\sqrt{n}(\hat\theta_{n} - 0.5) \xrightarrow{d} N(2\sqrt{n}(\theta_{1} - \theta_{0}), 1)$$

If $$Z = 2\sqrt{n}(\hat\theta_{n} - (0.5 + \theta_{1} - \theta_{0}))$$, then:

$$P(Z \geq z_{1-\alpha} - 2\sqrt{n}(\theta_{1} - \theta_{0}))$$

If $$n \rightarrow \infty$$, then the probability tends to one.

I don’t believe this is right. The p-value should be skewed towards 0, but I am not sure where I went wrong here. Help would be appreciated.
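One way to get a feel for the behaviour is to simulate it. The Python sketch below assumes that $$\hat p_{n}$$ denotes the one-sided p-value $$1 - \Phi(2\sqrt{n}(\hat\theta_{n} - 0.5))$$. The values of $$n$$, the alternative $$\theta = 0.55$$, the number of replications and the seed are arbitrary illustrative choices.

```python
import random
from statistics import NormalDist, mean

def p_value(theta, n, rng):
    # One-sided p-value 1 - Phi(2 * sqrt(n) * (theta_hat - 0.5)) for one simulated sample.
    successes = sum(rng.random() < theta for _ in range(n))
    z = 2 * n ** 0.5 * (successes / n - 0.5)
    return 1 - NormalDist().cdf(z)

def simulate(theta, n=400, reps=1000, seed=1):
    # Repeat the experiment `reps` times and collect the p-values.
    rng = random.Random(seed)
    return [p_value(theta, n, rng) for _ in range(reps)]

null_p = simulate(0.5)    # theta under H0
alt_p = simulate(0.55)    # a hypothetical theta under H1

print("mean p-value under H0:", mean(null_p))
print("mean p-value under H1:", mean(alt_p))
print("share of p-values below 0.05 under H1:", mean(p < 0.05 for p in alt_p))
```

In this sketch the p-values under the null spread out roughly uniformly, while under the alternative they pile up near 0, and more so as `n` grows.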

## geometric probability – Distribution of line segment intersections in random pointsets

Let $$P$$ be a set of $$n$$ points that are uniformly distributed inside the unit square or unit circle, and $$L=\lbrace\ell_{ij}\rbrace := \lbrace \lbrace \alpha p+ (1-\alpha) q \,|\, 0\le\alpha\le 1\rbrace \,|\, p,q\in P\rbrace$$ the set of line segments connecting pairs of points.

How are the numbers $$\operatorname{card}(\lbrace \ell_{hk} \,|\, \lbrace h,k\rbrace\subseteq P\setminus\lbrace p,q\rbrace\rbrace)$$ of line segments that intersect $$\ell_{pq}$$ distributed?

## probability – Trivariate normal distribution with mean 0 and covariance matrix $$\Sigma$$

Consider a trivariate normal vector (X, Y, Z) with mean 0 and covariance matrix $$\Sigma$$. How can I construct a covariance matrix such that X and Y are conditionally independent given Z, but X and Y are not marginally independent?

I think $$\Sigma = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$ can fulfill the requirement, but I don’t know how to show that X and Y are conditionally independent.
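For jointly normal variables, X and Y are conditionally independent given Z exactly when the conditional covariance $$\Sigma_{XY} - \Sigma_{XZ}\Sigma_{ZY}/\Sigma_{ZZ}$$ is zero, so a candidate matrix can be checked directly. The Python sketch below does this for the proposed matrix and for a hypothetical alternative (not from the question) built from X = Z + e1 and Y = Z + e2 with independent unit-variance e1, e2 and Z.

```python
def conditional_cov_xy_given_z(sigma):
    # Cov(X, Y | Z) for a jointly normal (X, Y, Z) with covariance matrix `sigma`,
    # using the index order X = 0, Y = 1, Z = 2.
    return sigma[0][1] - sigma[0][2] * sigma[2][1] / sigma[2][2]

# Matrix proposed in the question.
proposed = [[1, 1, 0],
            [1, 1, 0],
            [0, 0, 1]]

# Hypothetical alternative: X = Z + e1, Y = Z + e2 with independent e1, e2, Z of variance 1.
alternative = [[2, 1, 1],
               [1, 2, 1],
               [1, 1, 1]]

for name, sigma in (("proposed", proposed), ("alternative", alternative)):
    print(name, "Cov(X,Y)=", sigma[0][1], "Cov(X,Y|Z)=", conditional_cov_xy_given_z(sigma))
```

A matrix meets the requirement when the marginal covariance of X and Y is nonzero while the conditional covariance given Z is zero.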

## Extracting the component distribution parameters from a mixed-normal distribution (for n > 2 normal)

I’m trying to extract the distribution parameters of the sub-distributions which comprise a mixed normal distribution.

I’ll give my attempts so far.

First the simulated data:

``````
(* Four zero-mean normal samples with different standard deviations and sizes *)
MixedGaussiaData = Apply[Join, {
   RandomVariate[NormalDistribution[0, 2], 300],
   RandomVariate[NormalDistribution[0, 0.7], 500],
   RandomVariate[NormalDistribution[0, 0.4], 500],
   RandomVariate[NormalDistribution[0, 1], 200]}];
``````

Which when plotted looks like this:

So we have four normal distributions with different $$\sigma$$ values and different numbers of points, but all distributions have a common mean value $$\mu = 0$$.

I define my $$n$$-component mixed-normal distribution as:

``````
(* Mixture of n normals with symbolic weights w[i], means m[i], standard deviations s[i] *)
NMixedGaussian[n_] := MixtureDistribution[Array[w, n],
  MapThread[NormalDistribution[#1, #2] &, {Array[m, n], Array[s, n]}]]
``````

Then using `FindDistributionParameters`

``````
FourMixedNormalMLE = FindDistributionParameters[MixedGaussiaData, NMixedGaussian[4],
  ParameterEstimator -> {"MaximumLikelihood", PrecisionGoal -> 1, AccuracyGoal -> 1}]
``````

If I plot the result, it looks pretty good:

However if we take a look at the results, they’re not that good when compared to the inputs of the simulation:

``````
mMLE = Array[m, 4] /. FourMixedNormalMLE
sMLE = Array[s, 4] /. FourMixedNormalMLE
wMLE = Array[w, 4] /. FourMixedNormalMLE

(* Output: fitted means, standard deviations, and weights *)
{0.0284676, 0.00902554, 0.0930328, -0.470579}
{1.8648, 0.274301, 0.667947, 0.385259}
{0.237727, 0.192302, 0.475281, 0.0946906}
``````

Second attempt:

I tried explicitly defining the mixed-Gaussian function with `ProbabilityDistribution`:

``````
Clear[w, m, s, n];
NMixedGaussian[n_] := MixtureDistribution[Array[w, n],
  MapThread[NormalDistribution[#1, #2] &, {Array[m, n], Array[s, n]}]]

(* Symbolic PDF of the four-component mixture *)
NMixGauss = NMixedGaussian[4];
NMixGaussPDF[z_] = FullSimplify[PDF[NMixGauss, z], DistributionParameterAssumptions[NMixGauss]]

(* Wrap the PDF in an explicit ProbabilityDistribution *)
NMixGaussPD = ProbabilityDistribution[NMixGaussPDF[z], {z, -Infinity, Infinity},
  Assumptions -> DistributionParameterAssumptions[NMixGauss]]

FourMixedNormalPDFMLE = FindDistributionParameters[MixedGaussiaData, NMixGaussPD,
  ParameterEstimator -> {"MaximumLikelihood", PrecisionGoal -> 1, AccuracyGoal -> 1}]
``````

This makes it worse. I think the main issue might be initial values and constraints, but I’m not sure how to best implement this. Does anyone have any suggestions?

One thing I noticed is that the weights produced by `FindDistributionParameters` don’t seem to make sense. They sum to one, but none seem to correspond to weights defined by $$1/\sigma^{2}$$ or $$1/\sigma_{\rm{SE}}^{2}$$.

What I’m trying to achieve is another way of performing a weighted mean. I could divide the simulated data up into chunks/bins, find the $$\mu$$ and $$\sigma$$ for each one and perform a weighted mean. I want to avoid binning if possible, hence this approach.
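On the weights: in a mixture distribution the weights are mixing proportions, that is, the share of points drawn from each component, rather than inverse-variance weights. As a point of comparison, this small Python sketch (Python rather than Wolfram Language, purely for illustration) computes the proportions implied by the simulated data above.

```python
# Component standard deviation -> number of simulated points, as in the data generation above.
counts = {2.0: 300, 0.7: 500, 0.4: 500, 1.0: 200}

total = sum(counts.values())
# Mixing proportion of each component: its share of all simulated points.
weights = {sigma: count / total for sigma, count in counts.items()}

for sigma, weight in weights.items():
    print(f"sigma={sigma}: expected weight {weight:.3f}")
```

These are the values a well-converged fit would be expected to approach, up to sampling noise and the ordering of the components.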

## probability – Geometric distribution question with a small “show that” part

A football team has to leave a tournament as soon as they lose a match. In each match they have a probability $$q$$ of losing, and the matches are independent. Let $$X$$ be the total number of matches they play.

Am I right in saying that the distribution of $$X$$ is the following?

$$X \sim \operatorname{Geo}(q)$$

I then have these follow-up questions:

1. Show that $$\mathbb{P}(X \gt x) = (1-q)^{x}$$ for $$x \in \lbrace 0,1,2,3,\dots\rbrace.$$ Interpret this result.
2. Find a formula for the probability that $$X$$ is even.

For 1, I know that $$(1-q)^x$$ represents the number of times the team will win, but I don’t understand how to show it or how to interpret it properly. For 2, I have no idea how to start.
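A quick numerical check in Python can help here. It assumes $$X$$ counts matches up to and including the first loss, so $$\mathbb{P}(X = k) = (1-q)^{k-1} q$$ for $$k = 1, 2, \dots$$. The closed form $$(1-q)/(2-q)$$ used for the even case comes from summing the geometric series over even $$k$$. The value $$q = 0.3$$ and the truncation length are arbitrary.

```python
def pmf(k, q):
    # P(X = k): win the first k - 1 matches, then lose match k.
    return (1 - q) ** (k - 1) * q

def tail(x, q, terms=10_000):
    # P(X > x), approximated by summing the pmf over a long but finite range.
    return sum(pmf(k, q) for k in range(x + 1, x + 1 + terms))

def prob_even(q, terms=10_000):
    # P(X is even), approximated by summing the pmf over k = 2, 4, 6, ...
    return sum(pmf(k, q) for k in range(2, 2 * terms + 1, 2))

q = 0.3
print(tail(3, q), (1 - q) ** 3)            # tail sum vs (1 - q)^x
print(prob_even(q), (1 - q) / (2 - q))     # even-sum vs closed form
```

The first comparison matches the reading that $$X > x$$ happens exactly when the team wins its first $$x$$ matches.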