# Frequency and numbers - NumPy - mean, histogram and more

Hi experts!

If I have two NumPy arrays, A = grades and B = number of students with each grade (that is, B[j] is the number of students in the class with test grade A[j]):

How can I calculate the mean and standard deviation of the grades, and build a histogram of the grades, using NumPy?

Thanks a lot!


Here is how you would do it, assuming A and B are NumPy arrays:

import matplotlib.pyplot as plt
mean = A.mean()
standard_deviation = A.std()
plt.hist(B)
plt.savefig('histogram.png')


Unfortunately Sage does not yet have good histogram functionality; however, it does have mean and standard deviation, with the benefit that you can get exact output:

sage: std(range(10))
sqrt(55/6)


EDIT: Here's a way to take (integer) frequencies into account:

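def mean2(data, freq):
    # Weighted mean: sum of value * frequency, divided by the total count.
    return sum(map(prod, zip(data, freq))) / sum(freq)

def std2(data, freq):
    # Expand the table so each value appears freq[index] times, then use std.
    data2 = []
    for index, item in enumerate(data):
        data2.extend([item] * freq[index])
    return std(data2)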
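
For readers working in plain NumPy rather than Sage, the same expand-then-compute idea can be sketched with `np.repeat`. The arrays below are made-up example data, not values from the question. Note that the exact result `sqrt(55/6)` shown above for `std(range(10))` corresponds to dividing by N - 1, whereas NumPy's `std` divides by N unless `ddof=1` is passed.

```python
import numpy as np

# Hypothetical example data: grades and how many students got each grade.
A = np.array([4, 5, 6, 7])
B = np.array([2, 3, 4, 1])

# Repeat each grade according to its (integer) frequency,
# giving one entry per student.
expanded = np.repeat(A, B)

mean_grade = expanded.mean()
std_population = expanded.std()        # divides by N (NumPy default, ddof=0)
std_sample = expanded.std(ddof=1)      # divides by N - 1

print(mean_grade, std_population, std_sample)
```

This only works when the frequencies are non-negative integers, and it builds an array with one element per student, which may be wasteful for very large counts.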


Hello Eviatar Bach.

mean = A.mean()
standard_deviation = A.std()


As you wrote it, this does not take into account the frequency (weight) of each grade (array B).

Searching the web, I found the function np.average:

note_mean = np.average(A, weights=B)


I still cannot find a function to calculate the weighted standard deviation. Any ideas?

Thanks a lot!
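
One possible way to get the weighted standard deviation with `np.average` alone is to average the squared deviations from the weighted mean, using the same weights. This is a minimal sketch with made-up example data; the names `variance`, `note_std` and `note_std_sample` are chosen here for illustration. It assumes B holds frequency counts, so the sample version rescales by n / (n - 1) where n is the total count.

```python
import numpy as np

# Hypothetical example data: grades and how many students got each grade.
A = np.array([4.0, 5.0, 6.0, 7.0])
B = np.array([2, 3, 4, 1])

# Weighted mean, as in the comment above.
note_mean = np.average(A, weights=B)

# Weighted average of squared deviations = population variance.
variance = np.average((A - note_mean) ** 2, weights=B)
note_std = np.sqrt(variance)

# Sample standard deviation (divide by n - 1), valid when the
# weights are frequency counts and n is the total number of students.
n = B.sum()
note_std_sample = np.sqrt(variance * n / (n - 1))

print(note_mean, note_std, note_std_sample)
```

Unlike expanding the data with repeated values, this never builds an array with one entry per student.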

I think R will handle working with a frequency table if you set the table up correctly. Otherwise, you might need to write your own code for this.

Sorry, I misunderstood the question. I edited my post with code which I think does what you want.

All that remains is to build the bar graph from these two arrays (data and freq).

Any ideas?

Thanks a lot!

You can do the bar graph using pyplot from matplotlib as follows:

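import matplotlib.pyplot as plt
# Example data: one bar per category, with height equal to its frequency.
categories = [1, 2, 3, 4]
frequencies = [5, 2, 6, 10]
plt.figure()
plt.bar(categories, frequencies, width=1)
# Write the chart to an image file.
plt.savefig('tmp.png')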
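
If the histogram counts themselves are wanted in NumPy (for example to group several grades into one bin), `np.histogram` accepts a `weights` argument, so the frequency array can be used directly. The sketch below uses made-up data, and the bin edges are an assumption chosen for illustration.

```python
import numpy as np

# Hypothetical example data: grades and how many students got each grade.
A = np.array([4, 5, 6, 7])
B = np.array([2, 3, 4, 1])

# One bin per integer grade: edges at 3.5, 4.5, ..., 7.5.
edges = np.arange(A.min() - 0.5, A.max() + 1.5, 1.0)
counts, edges = np.histogram(A, bins=edges, weights=B)

# Coarser bins: grades in [4, 6) and [6, 8].
coarse_counts, coarse_edges = np.histogram(A, bins=[4, 6, 8], weights=B)

print(counts, coarse_counts)
```

The resulting counts and edges could then be drawn with `plt.bar(edges[:-1], counts, width=np.diff(edges), align='edge')`.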
