
Distribution of a list of numbers

asked 2012-02-03 13:03:20 +0200

v_2e

updated 2014-10-28 21:15:16 +0200

kcrisman

Hello! I'd like to know if there is a Sage function to retrieve a distribution of some experimental data. I have a 1D list of measured values and I need to obtain a second list containing the distribution of that data, for example, to fit it with some Gaussian or Poisson curve.



3 Answers


answered 2012-02-04 09:20:42 +0200

Jason Grout

The numpy histogram function will bin values together:
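A minimal sketch of such a call, assuming a made-up list `data` of measurements (the variable names and values are illustrative only, not taken from the original answer):

```python
import numpy as np

# Hypothetical measurements; replace with your own 1D list.
data = [1.2, 1.9, 2.1, 2.4, 2.5, 3.0, 3.3, 3.8, 4.1, 4.9]

# counts[i] is the number of values falling in [edges[i], edges[i+1]);
# the last bin also includes its right edge.
counts, edges = np.histogram(data, bins=4, range=(1.0, 5.0))

print(counts)  # one count per bin
print(edges)   # bins + 1 edge positions
```

Here `counts` is the distribution as an array with one entry per bin, and `edges` holds the bin boundaries, so nothing has to be plotted to get at the numbers.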

The Sage TimeSeries histogram function also bins data.


answered 2012-02-03 17:37:23 +0200

Shashank

updated 2012-02-04 14:02:19 +0200

I use matplotlib for that purpose. Have a look at the first example on the page

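Edit: To get the values as a list, use

import numpy as np
import pylab as P

mu, sigma = 200, 25
x = mu + sigma*P.randn(10000)

# density=True replaces the normed=1 argument, which newer matplotlib no longer accepts
n, bins, patches = P.hist(x, 50, density=True, histtype='stepfilled')
P.setp(patches, 'facecolor', 'g', 'alpha', 0.75)

# P.normpdf is gone from newer matplotlib, so the Gaussian density is written out
y = P.exp(-(bins - mu)**2 / (2*sigma**2)) / (sigma*P.sqrt(2*P.pi))
l = P.plot(bins, y, 'k--', linewidth=1.5)
print(n)     # bin heights, i.e. the distribution itself
print(bins)  # bin edges (one more entry than n)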
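
If mu and sigma are not known beforehand, they can be estimated by fitting a curve to the binned values. A rough sketch, assuming SciPy's `curve_fit` and synthetic data in place of real measurements; the `gaussian` helper and all variable names are made up for illustration:

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian(x, mu, sigma):
    """Normalised Gaussian probability density."""
    return np.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

# Synthetic data standing in for real measurements (seeded for repeatability).
rng = np.random.default_rng(0)
samples = rng.normal(200, 25, 10000)

# density=True scales the bin heights so the histogram integrates to 1.
heights, edges = np.histogram(samples, bins=50, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])

# Start the optimiser from the sample mean and standard deviation.
p0 = [samples.mean(), samples.std()]
popt, pcov = curve_fit(gaussian, centers, heights, p0=p0)
mu_fit, sigma_fit = popt

print(mu_fit, sigma_fit)
```

The fit uses bin centres rather than bin edges, so that each height is paired with a single x value of the same length.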


Oh, I see. Indeed, it may be a good solution. But why do you import NumPy in this case? Thanks!

v_2e ( 2012-02-05 07:42:59 +0200 )

Sorry, I edited an example file from the net and forgot to delete that line.

Shashank ( 2012-02-05 13:44:03 +0200 )

answered 2012-02-04 08:03:23 +0200

v_2e

I also used the hist() function from Matplotlib, but does it allow storing the distribution in a list? The problem is that I need to fit this distribution, not only visualize it.

I have found a way to achieve approximately what I need using the histogram() function from scipy.stats module.

Something like this:

from scipy.stats import histogram
distribution_list = histogram(data_list, numbins=10, defaultlimits=(90,164))

It gives the values for the bins, the first bin start and the bin width. To obtain the distribution list with the actual values for further work (like fitting, plotting, adding to something, etc.) one can do something like this:

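# Note: scipy.stats.histogram was removed in SciPy 1.0, so this needs an older SciPy
from scipy.stats import histogram
# data_list is the 1D list of measured values
distribution_list = histogram(data_list, numbins=10, defaultlimits=(90,164))

actual_distribution = []
for i, count in enumerate(distribution_list[0]):
    # pair each bin's start (first bin start + i * bin width) with its count
    actual_distribution.append((distribution_list[1]+i*distribution_list[2], count))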

But maybe there are other (simpler, built-in or smarter) ways to get a distribution out of a list of numbers?


