## functions – Approximation of the convolution operator

I am trying to convolve two functions:

$$f(t) = e^{-h t}$$

$$g(t) = e^{-(e^{-t})^2}$$

$$(f*g)(t) = \int_{0}^{t} f(t-\tau)\,g(\tau)\,d\tau = \int_{0}^{t} e^{-h(t-\tau)}\, e^{-(e^{-\tau})^2}\,d\tau$$

Unfortunately, neither Mathematica nor Maple could evaluate this integral.

Is there a way to approximate the convolution and obtain at least an approximate solution?

I would be grateful for any advice and help.
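One possible numerical route is to evaluate the convolution integral by quadrature at each value of $$t$$. The following is a minimal sketch, assuming $$f(t)=e^{-ht}$$ with an illustrative value $$h=1$$; the function name `convolve_at`, the default number of subintervals and the choice of composite Simpson's rule are assumptions made for this example, not part of the question.

```python
import math

def f(t, h=1.0):
    # f(t) = exp(-h t); h = 1.0 is an illustrative default
    return math.exp(-h * t)

def g(t):
    # g(t) = exp(-(exp(-t))^2)
    return math.exp(-math.exp(-t) ** 2)

def convolve_at(t, h=1.0, n=200):
    """Composite Simpson approximation of int_0^t f(t - tau) g(tau) dtau."""
    if t == 0:
        return 0.0
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    step = t / n
    total = f(t, h) * g(0.0) + f(0.0, h) * g(t)  # endpoint terms
    for k in range(1, n):
        tau = k * step
        weight = 4 if k % 2 else 2
        total += weight * f(t - tau, h) * g(tau)
    return total * step / 3

if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0):
        print(t, convolve_at(t))
```

Since the integrand is smooth on $$[0,t]$$, refining `n` should change the result only slightly, which gives a simple check of the accuracy.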

## approximation theory – Approximating Power Series Coefficients — Why Does a Clearly Illegitimate Method (Sometimes) Work So Well?

For reasons that don’t matter here,
I want to estimate the power series coefficients
$$t_{ij}$$ for the rational function
$$T(x,y)= \frac{(1+x)(1+y)}{1- x y(2+x+y+x y)}=\sum_{i,j} t_{ij}x^iy^j$$

Using a method that I cannot justify, I get
highly accurate estimates when $$i=j$$ and highly inaccurate estimates when
$$|i-j|$$ strays at all far from zero.

My questions are:

Q1) Why does my apparently illegitimate method work so well when $$i=j$$?

Q2) Why does the answer to Q1) not apply when $$i\neq j$$?

(Of course, once the answer to Q1) is known, the answer to Q2) might be
self-evident.)

I’ll first present the method, then explain why I think it shouldn’t work,
then present the evidence that it works anyway when $$i=j$$, and then present
the evidence that it rapidly goes haywire when $$i\neq j$$.

The Apparently Illegitimate Method:

Note that $$t_{ij}=t_{ji}$$, so we can limit ourselves to estimating
$$t_{j+k,j}$$ for $$k\ge 0$$.

I) Define
$$T_k(y)=\sum_j t_{k+j,j}y^j$$
For example, a residue calculation gives

$$T_0(y)= \frac{1-y-\sqrt{1-4y+2y^2+y^4}}{y\sqrt{1-4y+2y^2+y^4}}$$

It turns out that all of the $$T_k$$ share a branch point at $$\zeta\approx .2956$$ and are analytic in the disc $$|y|<\zeta$$.

II) Write
$$L_k=\lim_{y\to \zeta} T_k(y)\sqrt{y-\zeta}$$.
Discover that $$L_0\approx 1.44641$$ and $$L_k=L_0\zeta^{-k/2}$$.

III) Approximate
$$T_k(y)\approx L_k/\sqrt{y-\zeta}$$

IV) Expand the right hand side in a power series around $$y=0$$ and equate
coefficients to get
$$t_{ij}\approx \pm\frac{L_0}{\sqrt{\zeta}}\binom{-1/2}{j}\zeta^{-(i+j)/2} \approx 2.66036 \binom{-1/2}{j}\zeta^{-(i+j)/2}\qquad(E1)$$

Remarks:

1. Obviously one could try to improve this approximation
at Step III by using more terms in the power series for $$T_k$$ at $$y=\zeta$$.
This doesn’t seem to help, except when $$k=0$$, in which case the original approximation is already quite good.

2. For $$k\ge 2$$, $$T_k(y)$$ has a zero of
order $$k-1$$ at the origin. Thus one could modify this method by approximating
$$T_k(y)/y^{k-1}$$ instead of $$T_k(y)$$.
This yields
$$t_{ij}\approx \pm 2.66036\binom{-1/2}{1-i+2j}\zeta^{-(i+j)/2}\qquad(E2)$$
(E2) is (much) better than (E1) in the range $$i\ge 2j+2$$, where the binomial coefficient vanishes and it gets
exactly the correct value, namely zero. Otherwise, it seems neither systematically better nor worse.

Why Nothing Like This Should Work: The expansion of $$T_k(y)$$ at
$$\zeta$$ contains nonzero terms of the form
$$A_{i,j}(\zeta-y)^j$$ for all positive integers $$j$$. (I’m writing $$i=j+k$$ to
match up with the earlier indexing.) The truncation at Step III throws all
these terms away. Therefore the expansion around the origin in Step IV
ignores (among other things) the contribution of $$A_{ij}$$ to the estimate
for $$t_{ij}$$. So unless we can control the sizes of the $$A_{ij}$$, we
have absolutely no control over the quality of the estimate.

And in fact, even when $$k=0$$, the $$A_{j,j}$$ are not small.
For example, $$t_{8,8}=8323$$ and my estimate for $$t_{8,8}$$ is a
respectable $$8962.52$$. But $$A_{8,8}$$, which should have contributed to that
estimate and got truncated away, is equal to $$58035$$. It seems remarkable
that I can throw away multiple terms of that size and have the effects nearly cancel.
I’d like a conceptual explanation for this.
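The numbers quoted above can be checked with a short script. This is only a sketch: it obtains the exact $$t_{ij}$$ from the recurrence implied by multiplying $$T$$ by its denominator, $$t_{ij} = n_{ij} + 2t_{i-1,j-1} + t_{i-2,j-1} + t_{i-1,j-2} + t_{i-2,j-2}$$ (with $$n_{ij}=1$$ for $$i,j\le 1$$ and $$0$$ otherwise), and evaluates the magnitude of (E1) using the rounded constants $$\zeta\approx 0.2956$$ and $$2.66036$$ from the text. The function names are hypothetical.

```python
import math

ZETA = 0.2956        # rounded branch point quoted in the text
PREFACTOR = 2.66036  # rounded value of L_0 / sqrt(zeta) quoted in the text

def series_coefficients(n):
    """Exact t[i][j] for 0 <= i, j <= n, from T * denominator = (1+x)(1+y)."""
    t = [[0] * (n + 1) for _ in range(n + 1)]

    def get(i, j):
        # Coefficients with a negative index are zero
        return t[i][j] if i >= 0 and j >= 0 else 0

    for i in range(n + 1):
        for j in range(n + 1):
            numerator = 1 if i <= 1 and j <= 1 else 0
            t[i][j] = (numerator + 2 * get(i - 1, j - 1) + get(i - 2, j - 1)
                       + get(i - 1, j - 2) + get(i - 2, j - 2))
    return t

def binom_minus_half(j):
    # binom(-1/2, j) = (-1)^j * C(2j, j) / 4^j
    return (-1) ** j * math.comb(2 * j, j) / 4 ** j

def estimate_e1(i, j):
    """Magnitude of the (E1) estimate for t_ij."""
    return PREFACTOR * abs(binom_minus_half(j)) * ZETA ** (-(i + j) / 2)

if __name__ == "__main__":
    t = series_coefficients(8)
    for j in range(9):
        print(j, t[j][j], estimate_e1(j, j))
```

Because the constants are rounded, the diagonal estimates should agree with the figures in the text only to a few significant digits.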

But When $$i=j$$, It Works Anyway:

and these get even better if you truncate just slightly farther out.

Why any explanation can’t be too general:

## Extension of polynomial approximation on a subset

Let $$f$$ be a smooth function on $$(0,1)$$ and $$E$$ a measurable subset of $$(0,1)$$ with $$m(E)>0$$ (positive Lebesgue measure). Suppose that $$p$$ is a polynomial such that $$|p(x)-f(x)|<\delta$$ for $$x\in E$$ and some small $$\delta$$.

What I’d like to say is something like: if the degree of $$p$$ is large enough, then $$|p(x)-f(x)|$$ is small on all of $$(0,1)$$.

Could anyone point me to some references? Thanks!

## Approximation of a function by a polynomial (Chebyshev First Kind, Bernstein, etc…) containing only even degrees and constants in a given Range[a,b]

In Mathematica, how can I create a polynomial function containing only even degrees and constants?

That is, I have a function:

$$f(x)=\frac{\pi^2}{\left(\frac{\pi}{2}-\tan^{-1}(k (x-1))\right)^2}$$

I’m looking for a function that generates such a polynomial approximation on an arbitrary range, using Chebyshev polynomials of the first kind, Bernstein polynomials, or another basis, in the form:

$$p(x) = c_0 + c_2 \cdot x^2+ \dots + c_m \cdot x^m$$

where $$m$$ is the maximum even degree of the polynomial.

```
ClearAll["Global`*"]
(* Parameters: slope k, centre Subscript[\[Omega], 0], upper end \[CapitalDelta] *)
pars = {k = 1, Subscript[\[Omega], 0] = 1, \[CapitalDelta] = 5}
f = (-ArcTan[k (x - Subscript[\[Omega], 0])] + Pi/2)/Pi
Plot[f, {x, 0, 5}, PlotRange -> All]
(* Target function to approximate *)
p[x_] = 1/f^2
Plot[1/f^2, {x, 0, 5}, PlotRange -> All]
(* Interpolate through {0, 0} and four sample points of p *)
P = Collect[
   N[InterpolatingPolynomial[{{0, 0}, {Subscript[\[Omega], 0]/2,
       p[Subscript[\[Omega], 0]/2]}, {Subscript[\[Omega], 0],
       p[Subscript[\[Omega], 0]]}, {\[CapitalDelta]/2,
       p[\[CapitalDelta]/2]}, {\[CapitalDelta], p[\[CapitalDelta]]}},
     x]], x] // Simplify
Plot[{1/f^2, P}, {x, 0, 2}, PlotRange -> All]
```

I read somewhere on the Internet that, to get rid of the odd degrees, the point {0,0} has to be included among the interpolation nodes, which I did with the usual `InterpolatingPolynomial` command.
Despite this, the odd degrees are still present in the polynomial.
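One way to guarantee that only even powers appear is to interpolate in the variable $$u=x^2$$, so that the result is $$p(x)=q(x^2)$$ for some polynomial $$q$$. The sketch below illustrates the idea in Python rather than Mathematica; the node list mirrors the sample points used above (with $$k=1$$), and the names `target` and `even_interpolant` are hypothetical. It uses plain Newton divided differences, not a Chebyshev or Bernstein construction.

```python
import math

def target(x, k=1.0):
    # f(x) = pi^2 / (pi/2 - arctan(k (x - 1)))^2 from the question
    return math.pi ** 2 / (math.pi / 2 - math.atan(k * (x - 1.0))) ** 2

def even_interpolant(nodes, func):
    """Return p with p(x) = q(x**2), where q interpolates func at the nodes.

    The nodes must be non-negative and distinct so that their squares differ.
    """
    us = [x * x for x in nodes]
    coef = [func(x) for x in nodes]
    # Newton divided differences in the variable u = x^2 (computed in place)
    for level in range(1, len(us)):
        for i in range(len(us) - 1, level - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (us[i] - us[i - level])

    def p(x):
        u = x * x
        value = coef[-1]
        # Horner-style evaluation of the Newton form
        for i in range(len(us) - 2, -1, -1):
            value = value * (u - us[i]) + coef[i]
        return value

    return p

if __name__ == "__main__":
    nodes = [0.0, 0.5, 1.0, 2.5, 5.0]
    p = even_interpolant(nodes, target)
    for x in nodes:
        print(x, target(x), p(x))
```

The same substitution should carry over to Mathematica by interpolating the pairs $$(x_i^2, p(x_i))$$ in a variable `u` and then replacing `u` with `x^2`.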

## at.algebraic topology – relationship between “linear approximation” to immersions and formal immersions

Here, I am regarding $$\mathrm{Imm}(-,N)$$ as a presheaf on the open sets of some manifold $$M$$.

Let $$\mathrm{Imm}^f(-,N)$$ be the sheaf of formal immersions (an element is an injective bundle map on the tangent bundles covering an arbitrary map $$g:M \to N$$).

I’ve gathered from context that there is a connection between formal immersions and the sheaf of sections $$\mathcal{F}(U):=\Gamma(V_n(TU) \times_{GL_n} \mathrm{Imm}(\mathbb R^n,N))$$ (defined for an arbitrary open $$\mathbb R^n \to M$$. Please see definition 3.2 and the preceding paragraph in the linked notes for more details.) This is referred to as the linear approximation to the sheaf of immersions, but I don’t know why, although I assume it agrees in the sense of functor calculus.

My questions are the following: is the topological sheaf of formal immersions isomorphic to $$\mathcal{F}$$? If not, is there some relationship? If so, is the scanning map of Segal compatible (via some isomorphism) with the “obvious” maps $$\mathrm{Imm}(U,N) \to \mathrm{Imm}^f(U,N),\quad f \mapsto (df,f)$$?

## matrix – Pade approximation of vector or operator functions

`PadeApproximant` is a very useful Mathematica function that starts with a truncated Taylor series

$$f(x)\approx\sum_{k=0}^{l} c_k (x-x_0)^k,$$

and represents it in a rational form

$$f(x)\approx\frac{\sum_{i=0}^{m} a_i (x-x_0)^i}{\sum_{j=0}^{n} b_j (x-x_0)^j}.$$

Thus, Padé approximation is the map $$c_k \rightarrow (a_i,b_j)$$.

Recently, I needed to perform such an approximation for vector (or even operator) functions $$\hat f(x)$$. The requirement, however, is that all components share a common denominator:

$$\hat f(x)\approx\frac{\sum_{i=0}^{m} \hat a_i (x-x_0)^i}{\sum_{j=0}^{n} b_j (x-x_0)^j}.$$

In other words, the $$b_j$$ are scalars. Because of this, the `PadeApproximant` function cannot be used. I know that such methods exist and are used, for instance, to speed up the computation of matrix exponentials. In fact, the whole of section 8 in Padé Approximants by Baker and Graves-Morris is devoted to this problem. However, the book is written so formally that I cannot extract working formulas from it. I hope that someone here has experience with this method.

The basic example in the documentation is

```
PadeApproximant[Exp[x], {x, 0, {2, 3}}]
```

Here the desired syntax is

```
MatrixPadeApproximant[MatrixExp[{{1, x}, {x, x^2}}], {x, 0, {2, 3}}]
```

Thus, for the matrix Padé approximation, the map $$\hat c_k \rightarrow (\hat a_i,b_j)$$ is required.
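For orientation, the scalar map from Taylor coefficients to Padé coefficients can be written as a small linear solve. The sketch below covers only the ordinary scalar case, normalised so that $$b_0=1$$; it is not the common-denominator method from Baker and Graves-Morris, and the helper names `solve` and `pade` are hypothetical. The denominator follows from requiring $$\sum_{j=0}^{n} b_j c_{k-j}=0$$ for $$k=m+1,\dots,m+n$$, and the numerator from $$a_k=\sum_{j=0}^{\min(k,n)} b_j c_{k-j}$$.

```python
from fractions import Fraction
from math import factorial

def solve(matrix, rhs):
    """Gauss-Jordan elimination over exact fractions (assumes a regular system)."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

def pade(c, m, n):
    """Return (a, b) for the [m/n] Pade approximant of sum c_k x^k, with b[0] = 1."""
    def coef(k):
        return c[k] if k >= 0 else Fraction(0)

    # Denominator: sum_{j=0}^{n} b_j c_{k-j} = 0 for k = m+1 .. m+n
    matrix = [[coef(k - j) for j in range(1, n + 1)]
              for k in range(m + 1, m + n + 1)]
    rhs = [-coef(k) for k in range(m + 1, m + n + 1)]
    b = [Fraction(1)] + solve(matrix, rhs)
    # Numerator: a_k = sum_{j=0}^{min(k, n)} b_j c_{k-j}
    a = [sum(b[j] * coef(k - j) for j in range(min(k, n) + 1))
         for k in range(m + 1)]
    return a, b

if __name__ == "__main__":
    taylor_exp = [Fraction(1, factorial(k)) for k in range(6)]
    print(pade(taylor_exp, 2, 3))
```

For the Taylor coefficients of $$e^x$$ with $$m=2$$, $$n=3$$, this reproduces the approximant $$\frac{1+\frac{2}{5}x+\frac{1}{20}x^2}{1-\frac{3}{5}x+\frac{3}{20}x^2-\frac{1}{60}x^3}$$.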

## When approximating a function using a Taylor series, can your approximation get worse by adding the next term?

I have seen the remainder theorem for the Taylor approximation of a function. I have also seen that, in the limit as n approaches infinity, the remainder goes to zero.

I was wondering whether, when approximating a function using a Taylor series, adding the next term can actually make the approximation worse. Do you have to evaluate the remainder in each individual case to find out, or is there a general rule about it?
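A small numerical experiment shows that this can happen. The sketch below, with a hypothetical function name, tracks the absolute error of the partial sums of the Maclaurin series of $$e^x$$ at $$x=-5$$: the first few added terms increase the error, and only later does the error shrink towards zero.

```python
import math

def taylor_exp_errors(x, terms):
    """Absolute errors of the partial sums of the Maclaurin series of exp at x."""
    exact = math.exp(x)
    partial, term, errors = 0.0, 1.0, []
    for n in range(terms):
        partial += term               # partial sum now includes x^n / n!
        errors.append(abs(partial - exact))
        term *= x / (n + 1)           # next term x^(n+1) / (n+1)!
    return errors

if __name__ == "__main__":
    for n, err in enumerate(taylor_exp_errors(-5.0, 25)):
        print(n, err)
```

This does not contradict the remainder going to zero: convergence is a statement about the limit, not about each individual step.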

## approximation theory – Upper bound for the eigenvalues of a symmetric kernel

Let $$V \in L^2(D \times D)$$ be a symmetric kernel defining the compact, non-negative integral operator
$$\begin{equation} \mathcal{V}: L^{2}(D) \rightarrow L^{2}(D), \quad (\mathcal{V} u)(x) = \int_{D} V\left(x, x^{\prime}\right) u\left(x^{\prime}\right) dx^{\prime} \end{equation}$$

If $$V$$ is piecewise analytic on $$D \times D$$ and $$(\lambda_m)_{m \geq 1}$$ is the sequence of eigenvalues of its associated operator, with $$\lambda_m$$ the $$m$$-th largest, then there are constants $$c_{1,V}, c_{2,V}$$ depending only on $$V$$ such that
$$\begin{equation} 0 \leq \lambda_{m} \leq c_{1,V} \exp(-c_{2,V}\, m^{1/d}), \quad \forall m \geq 1. \end{equation}$$

This is Proposition 2.18 from paper (1). Its proof refers to Proposition 2.17; however, no proof of Proposition 2.17 is given.

I am confused about how the exponential term arises.
Does anyone have an idea?

Reference
(1) Schwab, Christoph and Radu Alexandru Todor. "Karhunen–Loève approximation of random fields by generalized fast multipole methods." Journal of Computational Physics 217.1 (2006): 100-122.

## approximation – fast and stable calculation of x * tanh (log1pexp (x))

$$f(x) = x \tanh(\log(1 + e^x))$$

The function can easily be implemented with a stable log1pexp without any significant loss of precision. Unfortunately, this is computationally expensive.

Is it possible to write a more direct and faster numerically stable implementation?

The context: this is the Mish activation function, which has become popular in deep learning. YOLOv4 uses it. Unfortunately, its first activation step is as slow as the first convolution operation on some GPUs. Who would have thought that an activation could become a performance bottleneck in a neural network?
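One candidate rewrite removes both the logarithm and the hyperbolic tangent. With $$s = 1+e^x$$ we have $$\tanh(\log s) = \frac{s^2-1}{s^2+1} = \frac{n}{n+2}$$, where $$n = e^x(e^x+2)$$. Dividing through by $$e^{2x}$$ gives the equivalent form $$\frac{1+2e^{-x}}{1+2e^{-x}+2e^{-2x}}$$, which avoids overflow for large positive $$x$$. The following is a scalar Python sketch of that idea only; whether it is actually faster on a given GPU is untested here, and the function name is an assumption.

```python
import math

def mish(x):
    """x * tanh(log(1 + exp(x))) using a single exp call and no log or tanh."""
    if x <= 0.0:
        e = math.exp(x)            # lies in [0, 1], so it cannot overflow
        n = e * (e + 2.0)
        return x * n / (n + 2.0)
    e = math.exp(-x)               # lies in [0, 1), so it cannot overflow
    return x * (1.0 + 2.0 * e) / (1.0 + 2.0 * e + 2.0 * e * e)

if __name__ == "__main__":
    for x in (-30.0, -1.0, 0.0, 0.5, 3.0, 20.0):
        reference = x * math.tanh(math.log1p(math.exp(x)))
        print(x, mish(x), reference)
```

Both branches involve only sums and products of non-negative quantities, so no cancellation occurs; in a vectorised setting the branch could be replaced by a select on the sign of $$x$$.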

## probability – Expression of the bound on the error of a Poisson approximation?

I hope someone can help clarify a question I have regarding the Poisson approximation and its applications. My textbook presents the following theorem, which I have trouble understanding. My assumption is that the first capital $$N$$ is supposed to be a lowercase $$n$$, and that every $$N$$ represents a discrete random variable with a binomial distribution:

Theorem 2.5. Consider independent events $$A_i$$, $$i = 1,2, \dots, n$$, with probabilities $$p_i = P(A_i)$$. Let $$N$$ be the number of events that occur, let $$\lambda = p_1 + \cdots + p_n$$, and let $$Z$$ have a Poisson distribution with parameter $$\lambda$$. Then for any set of integers $$B$$, $$\left\vert P(N \in B) - P(Z \in B) \right\vert \leq \sum_{i=1}^n p_i^2 \tag{2.14}$$ We can simplify the right-hand side by noting $$\sum_{i=1}^n p_i^2 \leq \max_i p_i \sum_{i=1}^n p_i = \lambda \max_i p_i$$ This says that if all the $$p_i$$ are small then the distribution of $$N$$ is close to a Poisson with parameter $$\lambda$$. Taking $$B = \{k\}$$, we see that the individual probabilities $$P(N = k)$$ are close to $$P(Z = k)$$, but this result says more. The probabilities of events such as $$P(3 \leq N \leq 8)$$ are close to $$P(3 \leq Z \leq 8)$$ and we have an explicit bound on the error.

The text then refers to an example comparing exact probabilities for the number of double $$6$$s in twelve rolls of a pair of dice with the corresponding Poisson approximation:

Suppose we roll two dice $$12$$ times and we let $$D$$ be the number of times a double $$6$$ appears. Here, $$n = 12$$ and $$p = 1/36$$, so $$np = 1/3$$. We now compare $$P(D = k)$$ with the Poisson approximation for $$k = 0$$.
$$k = 0 \quad \text{exact answer:} \quad P(D = 0) = \left(1- \frac{1}{36}\right)^{12} = 0.7132$$ $$\text{Poisson approximation:} \quad P(D = 0) \approx e^{-1/3} = 0.7165$$

For a concrete situation, consider (the example above), where $$n = 12$$ and all $$p_i = 1/36$$. In this case, the error bound is $$\sum_{i=1}^{12} p_i^2 = 12 \left(\frac{1}{36}\right)^2 = \frac{1}{108} = 0.00926$$ while the error of the approximation for $$k = 1$$ is $$0.0057$$.
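The figures in this example can be reproduced directly. The sketch below, using hypothetical helper names, computes the exact binomial probabilities for $$n=12$$, $$p=1/36$$, the Poisson probabilities with $$\lambda = 1/3$$, and the bound $$\sum_i p_i^2 = np^2$$ from (2.14), so that the pointwise errors can be compared against the bound.

```python
import math

def binomial_pmf(k, n, p):
    # P(D = k) for D ~ Binomial(n, p)
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    # P(Z = k) for Z ~ Poisson(lam)
    return math.exp(-lam) * lam ** k / math.factorial(k)

n, p = 12, 1 / 36
lam = n * p            # lambda = p_1 + ... + p_n = 1/3
bound = n * p ** 2     # sum of p_i^2 = 1/108

if __name__ == "__main__":
    print("bound", bound)
    for k in range(4):
        exact = binomial_pmf(k, n, p)
        approx = poisson_pmf(k, lam)
        print(k, exact, approx, abs(exact - approx))
```

Here the $$p_i$$ are all equal, so $$N$$ is binomial; the theorem itself allows unequal $$p_i$$, in which case $$N$$ is a sum of independent but not identically distributed indicators.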

In this context, I think the error they are referring to is specifically that of the Poisson approximation to the binomial distribution, and not the Poisson approximation to some other type of distribution. Can anyone with a more complete understanding of the Poisson distribution (and its relationship to the binomial) confirm or refute this statement? I would also be curious to know where the proof of this theorem comes from, as my text offers no obvious references.

Source: Elementary Probability for Applications, Rick Durrett.