# Pearson correlation

Is there a method in Sage to compute the Pearson correlation of a finite data set? I have data points $(x_1,y_1),\ldots,(x_6,y_6)$ and am wondering whether there is a shorter way to do it than writing out one long expression.

You can call R from within Sage as follows. The r.cor command requires two lists, one holding the x-coordinates and one holding the y-coordinates of your data set. In the example below, I use Python list comprehensions to build them.

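    # Each inner list is one (x, y) data point.
    data = [[1, 3], [1.5, 2], [5, 7]]
    xdata = [a[0] for a in data]
    ydata = [a[1] for a in data]
    # Pearson correlation through Sage's R interface.
    r.cor(xdata, ydata)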

You can also get the regression line easily.

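    # Fit the linear model a*x + b to the data by least squares.
    var('a,b')
    f(x) = a*x + b
    fit = find_fit(data, f)
    # find_fit returns a list of equations for the fitted parameters;
    # take the right-hand sides to build the fitted line.
    g(x) = fit[0].rhs()*x + fit[1].rhs()
    plot(g(x), (x, 0, 10)) + list_plot(data)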
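
For a straight-line model, the least-squares fit also has a closed form, which gives a quick sanity check on the coefficients. The following is only a sketch in plain Python: `least_squares_line` is a hypothetical helper name, not a Sage function, and it assumes that find_fit converges to the ordinary least-squares solution for this linear model.

```python
def least_squares_line(points):
    """Return (slope, intercept) of the ordinary least-squares line.

    Hypothetical helper for illustration; not part of Sage.
    """
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Sums of centered products and centered squares.
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points)
    sxx = sum((p[0] - mean_x) ** 2 for p in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


data = [[1, 3], [1.5, 2], [5, 7]]
slope, intercept = least_squares_line(data)
print(slope, intercept)
```

For the sample data this gives a slope of 11/9.5 (about 1.158) and an intercept of about 1.105, which should agree with the values found by find_fit up to numerical tolerance.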

Instead of two list comprehensions, you could use zip and unpacking, for example x, y = zip(*data). (r.cor(*zip(*data)) would do it in one line, but that looks a little too cute to use.)
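
To make the zip idiom concrete, here is a small plain-Python sketch using the sample data from the answer. Note that zip yields tuples rather than lists; the names `xdata` and `ydata` simply mirror the answer.

```python
data = [[1, 3], [1.5, 2], [5, 7]]

# zip(*data) transposes the list of (x, y) pairs into two tuples.
xdata, ydata = zip(*data)

print(xdata)  # (1, 1.5, 5)
print(ydata)  # (3, 2, 7)
```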

( 2012-10-13 11:19:27 -0500 )

Hmm. I get "Unable to start r". I downloaded Sage from its website and installed it.

( 2012-10-13 12:09:41 -0500 )

I thought R came with the standard Sage installation.

( 2012-10-13 16:06:22 -0500 )

Alternatively, you can use SciPy: import scipy.stats and then call scipy.stats.pearsonr(xdata, ydata). Type help(scipy.stats.pearsonr) after the import for the details.
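
If neither R nor SciPy is usable, the coefficient can also be computed directly from its definition. This is a minimal sketch in plain Python, not a library API: `pearson_r` is a hypothetical helper name, and the first value returned by scipy.stats.pearsonr should match it up to rounding.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Hypothetical helper for illustration; not part of Sage or SciPy.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Centered sums: covariance numerator and the two variance numerators.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


xdata = [1, 1.5, 5]
ydata = [3, 2, 7]
r_value = pearson_r(xdata, ydata)
print(r_value)
```

For the sample data the result is 11/sqrt(133), about 0.954. The sketch divides by zero if either sequence is constant, so guard against that case if it can occur in your data.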

( 2012-10-13 17:10:49 -0500 )

I actually like the plots from scipy's stats better than R's for use in Sage. I'm surprised that r.cor gives the "unable to start r" error. I've seen it when using R's graphics in Sage, but not with the basic stats commands.

( 2012-10-13 17:36:42 -0500 )