# Distribution of a list of numbers

Hello! I'd like to know if there is a Sage function to retrieve a distribution of some experimental data. I have a 1D list of measured values and I need to obtain a second list containing the distribution of that data, for example, to fit it with some Gaussian or Poisson curve.

Thanks.



The numpy histogram function will bin values together: http://docs.scipy.org/doc/numpy/refer...

The Sage Timeseries histogram function also bins data: http://www.sagemath.org/doc/reference...
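As a minimal sketch of the NumPy route (the sample data, the seed, and the names `data`, `counts`, `edges`, `centers`, and `distribution` are made up for illustration), `numpy.histogram` returns the per-bin counts together with the bin edges, so the distribution ends up in ordinary arrays rather than in a plot:

```python
import numpy as np

# Hypothetical measurements; replace with your own 1D list of values.
rng = np.random.default_rng(0)
data = rng.normal(loc=200.0, scale=25.0, size=1000).tolist()

# counts[i] is the number of values that fall between edges[i] and edges[i+1].
counts, edges = np.histogram(data, bins=20)

# Bin centers are convenient x-values for plotting or fitting.
centers = 0.5 * (edges[:-1] + edges[1:])

# The distribution as a plain list of (bin center, count) pairs.
distribution = list(zip(centers.tolist(), counts.tolist()))
```

The number of bins (20 here) is an arbitrary choice; `bins` also accepts an explicit sequence of edges if you need a particular binning.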

I use matplotlib for that purpose. Have a look at the first example on the page

http://matplotlib.sourceforge.net/exa...

Edit: To get a list use

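import numpy as np
import pylab as P
from scipy.stats import norm

mu, sigma = 200, 25
x = mu + sigma*P.randn(10000)

# n holds the normalized bin heights, bins holds the 51 bin edges
# (density=True replaces the old normed=1 argument in recent matplotlib)
n, bins, patches = P.hist(x, 50, density=True, histtype='stepfilled')
P.setp(patches, 'facecolor', 'g', 'alpha', 0.75)

# overlay the exact normal density; norm.pdf stands in for P.normpdf,
# which is no longer available in recent matplotlib
y = norm.pdf(bins, mu, sigma)
l = P.plot(bins, y, 'k--', linewidth=1.5)
print(bins)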
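Since the goal is to fit the distribution and not only to draw it, here is a hedged sketch of one way to do that: bin the data with `numpy.histogram` and fit a Gaussian to the bin heights with `scipy.optimize.curve_fit`. The model function `gaussian`, the seed, and the starting guesses `p0` are my own assumptions, and the data are synthetic with the same `mu = 200` and `sigma = 25` as above.

```python
import numpy as np
from scipy.optimize import curve_fit


def gaussian(x, amplitude, mu, sigma):
    """Gaussian curve used as the fit model (hypothetical helper)."""
    return amplitude * np.exp(-0.5 * ((x - mu) / sigma) ** 2)


# Synthetic data standing in for real measurements.
rng = np.random.default_rng(1)
x = 200 + 25 * rng.standard_normal(10000)

# density=True scales the heights so that the histogram has unit area.
heights, edges = np.histogram(x, bins=50, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])

# Starting guesses taken from the sample itself.
p0 = [heights.max(), np.mean(x), np.std(x)]
params, covariance = curve_fit(gaussian, centers, heights, p0=p0)
amplitude_fit, mu_fit, sigma_fit = params
```

The fitted `mu_fit` and `sigma_fit` should come out close to the values used to generate the data. For a Poisson fit the same pattern applies with a different model function.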


Oh, I see. Indeed, it may be a good solution. But what do you import Numpy for in this case? Thanks!

Sorry, I edited an example file from the net and forgot to delete that line.

I also used the hist() function from Matplotlib, but does it allow storing the distribution in a list? The problem is that I need to fit this distribution, not only visualize it.

I have found a way to achieve approximately what I need using the histogram() function from the scipy.stats module.

Something like this:

from scipy.stats import histogram
distribution_list = histogram(data_list, numbins=10, defaultlimits=(90,164))


It gives the values for the bins, the first bin start and the bin width. To obtain the distribution list with the actual values for further work (like fitting, plotting, adding to something, etc.) one can do something like this:

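from scipy.stats import histogram
distribution_list = histogram(data_list, numbins=10, defaultlimits=(90,164))

# distribution_list[0]: counts per bin
# distribution_list[1]: start of the first bin
# distribution_list[2]: bin width
actual_distribution = []
i = 0
for bin in distribution_list[0]:
    # pair the start of bin i with its count
    actual_distribution.append((distribution_list[1]+i*distribution_list[2], bin))
    i+=1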
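As far as I know, `scipy.stats.histogram` was deprecated and then removed in SciPy 1.0, so on a recent installation the same list of (bin start, count) pairs can be sketched with `numpy.histogram` instead. The values in `data_list` below are invented purely so that the example runs; the limits and the number of bins match the call above.

```python
import numpy as np

# Hypothetical measurements standing in for the real data_list.
data_list = [92, 101, 101, 115, 120, 120, 121, 133, 140, 150, 163]

# 10 equal-width bins spanning the limits (90, 164).
counts, edges = np.histogram(data_list, bins=10, range=(90, 164))

# Pair the lower edge of each bin with its count.
actual_distribution = list(zip(edges[:-1].tolist(), counts.tolist()))
```

Values outside `range` are silently ignored by `numpy.histogram`, so it is worth checking that the limits cover the data.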


But maybe there are other (simpler, built-in, or smarter) ways to get a distribution out of a list of numbers?

Thanks.
