# Help finding expected value of sum of random variables

I'm very much a Sage newbie, and I'm having trouble solving for the expected value of a discrete summation. I'll admit that I'm well removed from statistics, linear algebra, and econometrics, so it might be that what I'm trying to accomplish is illogical.

Consider the following parameters:

E ~ N(0,1) (i.e., E is a random variable distributed standard normal)

M ~ U(1,m) (i.e., M is a uniformly distributed random variable varying between 1 and m)

A = | Σ E×M | over the interval (1,N) (or the absolute value of the summation of E times M over interval 1,N)

I'd like to find the expected value of A as a function of N (or the limit of A as N goes to infinity, assuming A converges to a real number). Can I use Sage to solve for something like this (assuming it's solvable, which I think it is based on some simulation results)?


What does your summation mean in the definition of A? It is mathematically unclear to me.

There are too (two) many appearances of N, while m is used and then forgotten. It is also delightful to see E and M, the two letters most often used to denote expectation or mean, as the names of two random variables; we may use X and V instead. The two variables should also be independent, else nothing can be computed, and the statement should make this clear. Since I am inside a comment, one more remark is appropriate. Since everything in probability has to go quick and intuitive, we get a lot of "probability theory without probability spaces": a dictionary of concepts (e.g. density) and a phenomenological way to manipulate them, without a solid foundation. In my opinion, Sage and similar computer algebra systems help to see and use the probability space.


The question is not well defined. The best way to "do something" is to guess a related question and answer that one. (The original question must lie in the same circle of ideas, and should be approachable with similar tools.)

Restatement:

Let us fix an integer $K>1$. We consider

• $K$ random variables $Z_1,\dots, Z_K$ which follow the standard normal distribution $N(0,1^2)$,

• and $K$ random variables $V_1,\dots, V_K$ which follow the uniform distribution on the intervals $(1,2),\dots,(1,K+1)$, respectively.

The family of all these variables should be an independent family of random variables defined on the same probability space. Let $\mathbb{E}$ be the expectation, the mean on this space. We build $X(K)=|Z_1V_1+\dots+Z_KV_K|$ and its expectation $f(K)= \mathbb{E} X(K)=\mathbb{E}\Big[\ |Z_1V_1+\dots+Z_KV_K|\ \Big]$ as a function of $K$. The exercise asks for

• heuristic arguments that may lead to an asymptotic $f(K)=O(K^?)$ in big-O notation, and

• a computer simulation that supports the heuristic.

This was the complicated part of the answer. From this point on things are straightforward: the random variable under the modulus has mean zero, since $\mathbb{E} [Z_jV_j] = \mathbb{E} [Z_j] \mathbb{E} [V_j] = 0\cdot \mathbb{E} [V_j]=0$, and the terms have variance

Var$\displaystyle[Z_jV_j] = \mathbb{E} [(Z_jV_j)^2] -\big(\mathbb{E} [Z_jV_j]\big)^2 =\mathbb{E} [Z_j^2]\, \mathbb{E} [V_j^2] -\big(\mathbb{E} [Z_j]\,\mathbb{E} [V_j]\big)^2$

$\qquad\displaystyle = 1\cdot \mathbb{E} [V_j^2]-0 =\frac 1{j}\int_1^{j+1}x^2\; dx=\frac 1{3j}((j+1)^3-1^3)$

and so on.
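As a quick sanity check of the single-term variance, here is a minimal Monte Carlo sketch in plain Python rather than Sage. The function names, the choice $j=4$, the sample size and the seed are illustrative assumptions, not part of the original answer.

```python
import random

def sample_term_variance(j, n_samples=200_000, seed=0):
    """Monte Carlo estimate of Var[Z_j * V_j] with Z_j ~ N(0,1), V_j ~ U(1, j+1)."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for reproducibility
    products = [rng.gauss(0, 1) * rng.uniform(1, j + 1) for _ in range(n_samples)]
    mean = sum(products) / n_samples
    # Unbiased sample variance of the products
    return sum((x - mean) ** 2 for x in products) / (n_samples - 1)

def exact_term_variance(j):
    """Closed form from the derivation: ((j+1)^3 - 1) / (3j)."""
    return ((j + 1) ** 3 - 1) / (3 * j)

j = 4  # illustrative choice
estimate = sample_term_variance(j)
exact = exact_term_variance(j)
print("j = %d: estimate %.3f, closed form %.3f" % (j, estimate, exact))
```

The two numbers should agree up to sampling noise, which shrinks like one over the square root of the sample size.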

We used independence. Further using the independence, the variance of the sum is the sum of the variances and we compute $\displaystyle\sum_{1\le j\le K}\frac 1{3j}((j+1)^3-1^3)$:

    sage: var( 'j,K' );
    sage: latex( sum( 1/3/j * ( (j+1)^3-1^3 ), j, 1, K ).factor() )
    \frac{1}{9} \, {\left(K^{2} + 6 \, K + 14\right)} K
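The closed form can also be cross-checked with exact rational arithmetic. This is a small sketch in plain Python; the helper names `variance_sum` and `closed_form` and the range of tested $K$ are assumptions made for illustration.

```python
from fractions import Fraction

def variance_sum(K):
    """Exact sum over j = 1..K of ((j+1)^3 - 1) / (3j)."""
    return sum(Fraction((j + 1) ** 3 - 1, 3 * j) for j in range(1, K + 1))

def closed_form(K):
    """Factored result reported by Sage: K (K^2 + 6K + 14) / 9."""
    return Fraction(K * (K ** 2 + 6 * K + 14), 9)

# Compare term-by-term sum and closed form for a range of K (range chosen arbitrarily)
for K in range(1, 201):
    assert variance_sum(K) == closed_form(K)

print("closed form agrees with the direct sum for K = 1..200")
```

Using `Fraction` avoids floating-point rounding, so the equality test is exact.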


Then we expect: $\displaystyle \frac{Z_1V_1+\dots+Z_KV_K}{\displaystyle\sqrt{\frac{1}{9} {\left(K^{2} + 6 \, K + 14\right)} K}} \sim N(0,1^2)$ .

(This is an optimistic use of the central limit theorem, as it is applied outside mathematics when we do not have time to check the details.)

For large $K$ we can optimistically replace the quotient above by a normally distributed $Y\sim N(0,1^2)$.

Then $\mathbb{E}|Y|$ is twice the integral over $[0,\infty)$ of $\frac1 {\sqrt{2\pi}}y\exp(-y^2/2)$.
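Spelled out, this integral has a closed form, since an antiderivative of $y\exp(-y^2/2)$ is $-\exp(-y^2/2)$:

$\displaystyle \mathbb{E}|Y| = 2\int_0^\infty \frac 1{\sqrt{2\pi}}\, y\exp(-y^2/2)\; dy = \frac 2{\sqrt{2\pi}}\Big[-\exp(-y^2/2)\Big]_0^\infty = \frac 2{\sqrt{2\pi}} = \sqrt{\frac 2\pi}\approx 0.798$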

Putting all together we get: $\displaystyle f(K)\sim \frac 2{3\sqrt {2\pi}} \sqrt{K\left(K^{2} + 6 K + 14\right)}$ .

That's the maths.

Now we simulate, and we also evaluate the guessed asymptotic for comparison.

The simulation:

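    # For each K, average |Z_1 V_1 + ... + Z_K V_K| over 99 independent experiments.
    for pow in [2, 3, 4, 5]:
        K = 10 ** pow
        SAMPLES = []    # one value of |sum| per experiment
        for experiment in [1..99]:
            # k runs over 0..K-1, so uniform(1, k+2) covers (1,2), ..., (1,K+1)
            SAMPLES.append(abs(sum([gauss(0, 1) * uniform(1, k + 2) for k in range(K)])))
        print("%s -> %s" % (K, mean(SAMPLES)))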


We get

100 -> 294.711735785
1000 -> 8714.8222098
10000 -> 249403.620665
100000 -> 8734793.09067


A rerun will give different numbers, since the samples are random.

And the asymptotic:

    for pow in [2, 3, 4, 5]:
        K = 10 ** pow
        print("%s -> %f" % (K, 2/3/sqrt(2*pi) * sqrt(K * (K^2 + 6*K + 14))))

100 -> 274.004912
1000 -> 8435.694028
10000 -> 266041.315371
100000 -> 8410694.055422
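To judge how close the two tables are, one can compare their ratios with the sampling error expected from only 99 experiments. The sketch below is plain Python and copies the simulated values printed above. It assumes the normal approximation holds, so that $|Y|$ has mean $\sqrt{2/\pi}$ and variance $1-2/\pi$; the variable names are illustrative.

```python
import math

# Simulated means copied from the run shown above (a rerun gives other values)
simulated = {
    100: 294.711735785,
    1000: 8714.8222098,
    10000: 249403.620665,
    100000: 8734793.09067,
}

def asymptotic(K):
    """Guessed asymptotic 2 / (3 sqrt(2 pi)) * sqrt(K (K^2 + 6K + 14))."""
    return 2 / 3 / math.sqrt(2 * math.pi) * math.sqrt(K * (K ** 2 + 6 * K + 14))

# Under the normal approximation: sd(|Y|) / E|Y| = sqrt(pi/2 - 1).
# Averaging n independent experiments divides this by sqrt(n).
n_samples = 99
rel_std_error = math.sqrt(math.pi / 2 - 1) / math.sqrt(n_samples)

for K, value in simulated.items():
    print("%d -> ratio simulated/asymptotic = %.4f" % (K, value / asymptotic(K)))
print("expected relative standard error ~ %.4f" % rel_std_error)
```

Under these assumptions the relative standard error is roughly 7.6%, and all four ratios deviate from 1 by about that much or less, so the simulation is consistent with the heuristic. More experiments per $K$ would tighten the comparison.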


I did not check the details, but the numbers strongly suggest $f(K)\in O(K^{3/2})$; even more, we have

$\displaystyle f(K)\sim\frac 1{\sqrt{2\pi}}\cdot\frac 23\cdot K^{3/2}$.
