# Pearson correlation

Is there a method in Sage to compute the Pearson correlation of a finite set of points? I have data points $(x_1,y_1),\ldots,(x_6,y_6)$ and am wondering whether there is a shorter way to do it than writing out one long expression.


You can call R from within Sage as follows. The command r.cor requires two lists, one with the x-coordinates and one with the y-coordinates of your data set. In the example below, I'll use Python list comprehensions to build them.

data=[[1,3],[1.5,2],[5,7]]
xdata=[a[0] for a in data]  # first coordinate of each point
ydata=[a[1] for a in data]  # second coordinate of each point
r.cor(xdata,ydata)


You can also get the regression line easily.

var('a,b')
f(x)=a*x+b
ans=find_fit(data,f)  # list of equations for the fitted a and b
g(x)=ans[0].rhs()*x+ans[1].rhs()
plot(g(x),(x,0,10))+list_plot(data)


Instead of doing two list comps, you could use zip and unpacking: x, y = zip(*data), for example. (r.cor(*zip(*data)) would do it in one line, but that looks a little too cute to use.)
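As a rough illustration of the zip-and-unpack idea, here is a minimal sketch in plain Python that splits the sample data and computes the coefficient directly from the textbook formula. The helper name `pearson` is hypothetical (it is not a Sage or R function), and the sketch assumes both lists have the same length and neither is constant.

```python
from math import sqrt

data = [[1, 3], [1.5, 2], [5, 7]]

# zip(*data) transposes the list of points into x-values and y-values.
xdata, ydata = zip(*data)

def pearson(xs, ys):
    """Hypothetical helper: Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and the two sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

r_value = pearson(xdata, ydata)
print(r_value)
```

This avoids R entirely, which may help if the R interface will not start.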

Hmm. Unable to start r. I downloaded Sage from its website and installed that.

Alternatively, you can use SciPy: run import scipy.stats and then call scipy.stats.pearsonr(xdata, ydata). Type help(scipy.stats.pearsonr) after the import for the details.
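A minimal sketch of the SciPy route, assuming SciPy is available in your installation and reusing the sample data from the answer above. pearsonr returns the correlation coefficient together with a p-value, so the result is unpacked into two names here.

```python
from scipy.stats import pearsonr

data = [[1, 3], [1.5, 2], [5, 7]]
xdata = [point[0] for point in data]
ydata = [point[1] for point in data]

# pearsonr gives back the correlation coefficient and a two-sided p-value.
r_value, p_value = pearsonr(xdata, ydata)
print(r_value, p_value)
```

With only three points the p-value is not very informative; the coefficient itself is the quantity asked about.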

I actually like the plots from stats in scipy better than those from R for use in Sage. I'm surprised by the "unable to start r" error with r.cor. I've gotten it when using R's graphics in Sage, but not with the basic stats commands.