## st.statistics – Statistical question – MathOverflow

In a study of ground motion caused by earthquakes, peak speed (in m/s) and peak acceleration (in m/s²) were recorded for five earthquakes. The results are presented in the following table.

| Speed (m/s) | Acceleration (m/s²) |
| --- | --- |
| 1.5 | 7.8 |
| 1.6 | 8.1 |
| 0.9 | 7.4 |
| 1.3 | 7.2 |
| 2.4 | 8.4 |

A. Calculate the correlation coefficient between peak speed and peak acceleration.

B. Find a 90% confidence interval for ρ, the population correlation between peak speed and peak acceleration.

C. Can you conclude that ρ > 0.8?

D. Can you conclude that ρ > 0?
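
One common way to approach parts A and B is the sample Pearson correlation together with a confidence interval from the Fisher z-transformation. The sketch below assumes that method (the question does not name one) and assumes the pairs are roughly bivariate normal. The function names are illustrative only.

```python
import math
from statistics import NormalDist

speed = [1.5, 1.6, 0.9, 1.3, 2.4]   # peak speed, m/s
accel = [7.8, 8.1, 7.4, 7.2, 8.4]   # peak acceleration, m/s^2

def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_ci(r, n, level=0.90):
    """Approximate two-sided CI for rho via the Fisher z-transformation."""
    z = math.atanh(r)                      # transform r to the z scale
    se = 1 / math.sqrt(n - 3)              # approximate standard error of z
    zcrit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - zcrit * se), math.tanh(z + zcrit * se)

r = pearson_r(speed, accel)
lo, hi = fisher_ci(r, len(speed))
print(f"r = {r:.3f}, 90% CI for rho: ({lo:.3f}, {hi:.3f})")
```

With only five observations the approximation is rough and the interval is wide. Under these assumptions it lies above 0 but contains 0.8, which bears on parts C and D.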


## st.statistics – Correlation of several variables

I have a severe skin allergy. This is not a request for medical advice. I am under the ongoing care of a licensed doctor, but we are stumped, so I am broadening our investigation :]

I have a large data set like this:

March 1: bread, butter, cheese, ham. Skin condition: 5/10

March 2: bread, chicken, milk. Skin condition: 3/10

Each day's entry lists all the foods and ingredients I ate, along with a subjective rating of my skin irritation. Each day usually contains between 3 and 10 ingredients. Most ingredients recur several times over the course of the data set.

I'm looking for a way to find correlations between certain ingredients and the skin response – to find out what I'm most likely to be allergic to. I can do the programming part of the work, but I'm having trouble with the math part. Could you give me some suggestions, preferably with examples, on how to carry out such a task?
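
A very simple first pass, sketched below, compares the average skin rating on days that include an ingredient with the average on days that do not. This is only a rough screening heuristic, not a method the post proposes. It ignores delayed reactions and ingredients that are always eaten together, and it assumes a higher rating means worse irritation. The `diary` structure and the `score_gaps` name are hypothetical.

```python
# Each entry: (set of ingredients eaten that day, skin rating out of 10).
diary = [
    ({"bread", "butter", "cheese", "ham"}, 5),   # March 1
    ({"bread", "chicken", "milk"}, 3),           # March 2
]

def score_gaps(diary):
    """Mean rating on days with an ingredient minus mean rating on days without it."""
    ingredients = set().union(*(foods for foods, _ in diary))
    gaps = {}
    for item in ingredients:
        with_item = [score for foods, score in diary if item in foods]
        without = [score for foods, score in diary if item not in foods]
        # An ingredient eaten every day (or never) gives no contrast, so skip it.
        if with_item and without:
            gaps[item] = sum(with_item) / len(with_item) - sum(without) / len(without)
    return gaps

gaps = score_gaps(diary)
for item, gap in sorted(gaps.items(), key=lambda kv: -kv[1]):
    print(f"{item}: {gap:+.2f}")
```

On a real diary with many days, a regression of the rating on 0/1 ingredient indicators (possibly with lagged indicators for the previous day) would be a natural next step, because it estimates each ingredient's contribution while the others are held fixed.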


## st.statistics – Statistical divergence – MathOverflow

Does anyone know of the following statistical divergence?
$$\begin{equation} \text{D}(P \| Q) = \frac{1}{2} \left( \text{KL}(M \| P) + \text{KL}(M \| Q) \right) \end{equation}$$
where $$M = \frac{1}{2}(P + Q)$$.

This divergence is very similar to the Jensen-Shannon divergence $$\text{D}(P \| Q) = \text{KL}(P \| M) + \text{KL}(Q \| M)$$, except that the distributions $$P$$ and $$Q$$ appear as the second argument of each KL divergence.

I would like to know whether such a divergence exists in the literature and what its properties are. Thank you!
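
For discrete distributions the quantity can be computed directly, which may help when probing its properties numerically. The following is a minimal sketch that assumes probability vectors with strictly positive entries. The function names `kl` and `reversed_js` are illustrative, not established terminology.

```python
import math

def kl(p, q):
    """KL(p || q) for discrete distributions; needs q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def reversed_js(p, q):
    """D(P || Q) = 0.5 * (KL(M || P) + KL(M || Q)) with M = (P + Q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * (kl(m, p) + kl(m, q))

p = [0.5, 0.3, 0.2]
q = [0.2, 0.2, 0.6]
print(reversed_js(p, q), reversed_js(q, p))   # the two orderings agree
print(reversed_js(p, p))                      # zero when the arguments coincide
```

Because the mixture sits in the first slot, the value becomes infinite as soon as one distribution assigns zero mass to a point where the other does not. The Jensen-Shannon form, by contrast, stays finite in that case.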


## st.statistics – Stumped: basic linear regression question given the means and variances; need to find the Hessian. Question in the post

A linear regressor with 2 inputs is trained with a squared loss (mean squared error):
$$\mathcal{L}(W) = \frac{1}{N} \sum_{i=1}^N \frac{1}{2} \| Y_i - W^{\top} X_i \|^2$$.

Variable 1 has mean 3 and variance 4, while variable 2 has mean -4 and variance 1 and is uncorrelated with variable 1.

• Write the Hessian of $$\mathcal{L}(W)$$.
• What can you do with the data to make the Hessian equal to the identity matrix?
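
A worked sketch of the Hessian of $$\mathcal{L}(W)$$ follows. It assumes that the model has no separate bias term and that the sample moments equal the stated means and variances, neither of which the question states explicitly. Differentiating the loss twice with respect to $$W$$ gives

$$
\nabla^2 \mathcal{L}(W) = \frac{1}{N} \sum_{i=1}^N X_i X_i^{\top} \approx \mathbb{E}[X X^{\top}] = \Sigma + \mu \mu^{\top}
= \begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix} + \begin{pmatrix} 9 & -12 \\ -12 & 16 \end{pmatrix}
= \begin{pmatrix} 13 & -12 \\ -12 & 17 \end{pmatrix}.
$$

Under the same assumptions, subtracting each variable's mean removes the $$\mu \mu^{\top}$$ term, and dividing each variable by its standard deviation (2 and 1 here) turns $$\Sigma$$ into the identity. Since the two variables are uncorrelated, this standardization is enough. Correlated inputs would need a full whitening transform.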

## st.statistics – How to find the maximum likelihood estimate when the value of the estimate is outside the parameter range

Given that X1, …, Xn are independent and identically distributed Poisson(λ) random variables, we know that the maximum likelihood estimator of λ is X̄.

Suppose that X̄ = 2, but λ can only take values in (0,1). What is the maximum likelihood estimate of λ then?
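
One way to see what happens is to evaluate the log-likelihood on the restricted range. The sketch below uses the per-observation Poisson log-likelihood with the λ-free term dropped. The grid and the function name `avg_loglik` are illustrative choices.

```python
import math

def avg_loglik(lam, xbar):
    """Per-observation Poisson log-likelihood, omitting the term that does not depend on lam."""
    return -lam + xbar * math.log(lam)

xbar = 2.0
grid = [i / 100 for i in range(1, 100)]            # 0.01, 0.02, ..., 0.99
values = [avg_loglik(lam, xbar) for lam in grid]

# The log-likelihood keeps rising across the whole grid because its
# unconstrained peak (at lam = xbar = 2) lies to the right of the interval.
increasing = all(a < b for a, b in zip(values, values[1:]))
best = max(grid, key=lambda lam: avg_loglik(lam, xbar))
print(increasing, best)
```

Because the log-likelihood is strictly increasing on (0,1), no point of the open interval attains the maximum: the supremum is approached as λ tends to 1. If the parameter space includes the endpoint, i.e. (0,1], the constrained estimate is min(X̄, 1) = 1.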


## st.statistics – How to calculate the mode if all the values are unique?

(2795.9579999999996, 3447.288, 1068.495, 673.02, 4064.463, 2762.046, 3754.335, 4058.9055000000003, 4544.568, 3438.7275000000004, 1261.605, 904.485, 3646.149, 981.4350000000001, 1285.14, 4051.5735, 829.125, 2553.48, 1710.09, 2469.0599999999995, 337.125, 3620.2725, 1500.015, 2610.162, 3430.8689999999997, 772.8)

What will be the mode of the above list of values?

Python's `stats.mode()` from the SciPy module returns 337.125 as the mode.

Please help me with this.
Thank you
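
A quick check with the standard library, with the list read as 26 decimal numbers, shows that every value occurs exactly once. In that case all values tie for the highest count, and a routine that returns a single mode has to pick one of them. The value 337.125 is the smallest entry in the list. This sketch deliberately avoids SciPy and only illustrates the tie.

```python
from collections import Counter
from statistics import multimode

values = [
    2795.9579999999996, 3447.288, 1068.495, 673.02, 4064.463, 2762.046,
    3754.335, 4058.9055000000003, 4544.568, 3438.7275000000004, 1261.605,
    904.485, 3646.149, 981.4350000000001, 1285.14, 4051.5735, 829.125,
    2553.48, 1710.09, 2469.0599999999995, 337.125, 3620.2725, 1500.015,
    2610.162, 3430.8689999999997, 772.8,
]

counts = Counter(values)
highest = max(counts.values())   # largest number of repeats of any value
modes = multimode(values)        # every value that attains that count

print(highest)                   # each value appears once
print(len(modes), len(values))   # so every value is a "mode"
print(min(values))
```

For continuous measurements like these, the mode is usually not informative unless the data are first binned (for example with a histogram) or a density estimate is used.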


## st.statistics – Clarification on MAD, standard deviation and upper bound

I have a question about the inequality between the mean absolute deviation (MAD) and the standard deviation.
From "Upper and lower bounds for the sample standard deviation" I read the following relationship:
$$\text{MAD} \leq \sigma \leq \frac{R}{2}$$ with
$$\text{MAD} = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$$
and $$\sigma$$ the standard deviation, but in an example I tried, the relationship does not hold.
In my example I have N multidimensional vectors rather than scalar values, and I apply the usual standard deviation formula to the vectors.
From Popoviciu's inequality, I read the following relation:
$$V(X) \leq \frac{(x_{\max} - x_{\min})^2}{4}$$
which gives an upper bound on the variance.
Since I have vectors of dimension $$d$$, I write:
$$\sigma \leq \sqrt{\frac{d \cdot (x_{\max} - x_{\min})^2}{4}}$$
My questions are:

1. Can I write the last expression with $$d$$?
2. Why is the MAD value not less than the standard deviation?

Thank you very much in advance.
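
For scalar data the chain of inequalities can be checked numerically, which may help locate where a vector-valued computation departs from it. The sketch below assumes the population standard deviation (dividing by n). With the n - 1 version the upper bound can fail, for example for the two points 0 and 1. The sample data and the function name are made up for illustration.

```python
from statistics import fmean, pstdev

def mad_sigma_halfrange(xs):
    """Return (MAD, population standard deviation, half the range) for scalar data."""
    mean = fmean(xs)
    mad = fmean([abs(x - mean) for x in xs])   # mean absolute deviation about the mean
    sigma = pstdev(xs)                         # divides by n, not n - 1
    half_range = (max(xs) - min(xs)) / 2
    return mad, sigma, half_range

data = [1, 2, 4, 7, 10]                        # illustrative values only
mad, sigma, half_range = mad_sigma_halfrange(data)
print(mad, sigma, half_range)
print(mad <= sigma <= half_range)
```

For vectors, one possible reading is that deviations are measured with the Euclidean norm. In that case the squared standard deviation is the sum of the per-coordinate variances, and applying Popoviciu's inequality to each coordinate gives a bound of the form with $$d$$, where the range is taken as the largest per-coordinate range. Whether this matches the computation in the question depends on how the vector MAD and standard deviation were defined there.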


## st.statistics – Cumulative distribution function strictly increasing convolution

I am currently trying to show (and understand) part of a proposition. What I know is:

1: $$X_1$$ and $$X_2$$ are non-negative random variables.
2: $$X_1$$ and $$X_2$$ are absolutely continuous.
3: The cumulative distribution functions of $$X_1$$ and $$X_2$$, denoted $$F_{X_1}$$ and $$F_{X_2}$$, are strictly increasing for all $$x \geq 0$$.

I want to use this information to show that $$F_{X_1 + X_2}$$ is strictly increasing.

I am currently trying to use the fact that, since $$F_{X_1}$$ and $$F_{X_2}$$ are strictly increasing,

$$P(a < X_1 < b) > 0$$ for all $$0 \leq a < b$$. Likewise,
$$P(c < X_2 < d) > 0$$ for all $$0 \leq c < d$$,

and somehow use this to show that

$$P(e < X_1 + X_2 < f) > 0$$ for all $$0 \leq e < f$$.

Any help is welcome, thank you!


## st.statistics – Are there any known results in the literature on empirical and theoretical minima for likelihood estimators?

I'm working on a problem in my research where I want to estimate a parameter $$\theta$$ from samples $$x_i, z_i, \, \{1 \leq i \leq n\}$$. For each $$x_i$$, we also observe the probability $$z_i$$. Here $$l$$ is the likelihood function, so $$l(x_i, \theta)$$ gives us the true probability of seeing $$x_i$$ if our model has $$\theta$$ as the underlying parameter.

Let $$\hat{\theta}_n = \arg\min_{\theta} \sum_{i=1}^n (l(x_i, \theta) - z_i)^2$$ be our estimate of $$\theta$$ from our samples using a squared error (we could use a different error if necessary).

Are there any results in the literature that provide bounds on $$\left| \hat{\theta}_n - \theta \right|$$? Or better yet, some results on the behavior of $$\hat{\theta}_n$$? This is not an M-estimator, so it is not as easy a problem as I originally thought …


## st.statistics – Approximation of the gamma function, decomposition

Is there an approximation of the Gamma function of a sum in which the Gamma function is decomposed into functions of the individual terms of the sum? Example:
$$\Gamma(n_1 + n_2 + \dots + n_N) = f_1(n_1), f_2(n_2), \dots, f_N(n_N)$$ where each comma on the right-hand side can be replaced by any mathematical operation. The elements $$n_1, \dots, n_N$$ are positive integers, while a single element, e.g. $$n_1$$, must be a positive real.
