# Empirical Data Plot

I wish to make a plot of $N$ vs. time, where $N$ is the number of edges in a graph. That is, I want to make an empirical estimate of the time complexity of computing certain things about a graph, as a function of its number of edges.

So I guess I'm wanting to know: how can I plot these data points in Sage and find a polynomial that best fits them?

i.e.
N = [16, 28, 59, 120]
T = [0.135, 0.523, 7.36, 248]   # time measured in seconds

I'm a beginner, so any help will be much appreciated.
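For a first look, a minimal sketch of just plotting these points in Sage (using the data above):

N = [16, 28, 59, 120]
T = [0.135, 0.523, 7.36, 248]
show(list_plot(list(zip(N, T)), pointsize=30))   # draw the (N, T) pairs as points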



It's highly unlikely that you'll want a polynomial that gives best fit; you can get a polynomial with a PERFECT fit if you choose a high enough degree polynomial! You may want to look at general books on data analysis.

( 2013-03-15 08:47:00 +0100 )

That said, you could certainly try fitting to various things like x^2 or x log(x) with find_fit, see http://www.sagemath.org/doc/reference/sage/numerical/optimize.html#sage.numerical.optimize.find_fit

( 2013-03-15 08:47:27 +0100 )
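For instance, a quick sketch of that suggestion (the two models here are purely illustrative):

var('a')
data = list(zip([16, 28, 59, 120], [0.135, 0.523, 7.36, 248]))
find_fit(data, a*x^2, parameters=[a], variables=[x])        # fit T = a*N^2
find_fit(data, a*x*log(x), parameters=[a], variables=[x])   # fit T = a*N*log(N)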

Well, I just want to know the order of growth, i.e. as the number of edges increases, what is the order of growth in the time taken to compute my function?

( 2013-03-15 10:28:19 +0100 )

@kcrisman is right. You can always find a polynomial of degree at most len(N)-1 that perfectly fits the data. Finding such a polynomial takes about the same time as inverting a len(N) x len(N) matrix. There may also be a more "straightforward" (but complicated) closed form, since the inverse of a Vandermonde matrix is known.

( 2013-03-15 10:33:38 +0100 )
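A sketch of that observation, using Sage's Lagrange interpolation rather than an explicit Vandermonde inverse:

R.<x> = RDF[]    # univariate polynomials over floating-point reals
pts = list(zip([16, 28, 59, 120], [0.135, 0.523, 7.36, 248]))
R.lagrange_polynomial(pts)   # degree <= 3, passes through every point exactly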


Here is a sample of what you can do in Sage.

var('a,n')
model(x) = a*x^n                      # power-law model: T = a*N^n
N = [16, 28, 59, 120]
T = [0.135, 0.523, 7.36, 248]
data = list(zip(N, T))                # list() so the data can be reused
@interact
def _(n=[1..5]):
    # fit the coefficient a for the chosen exponent n
    L = find_fit(data, model.subs(n=n))
    show(plot(model.subs(a=L[0].rhs(), n=n), (x, 0, 120))
         + points(data, pointsize=20, color='black'))


See it live and in action here. Note that with four data points, of course the fifth-degree one looks perfect! My guess is that here you may have (yuck) exponential growth, but without more data it's hard to be sure; it could definitely also be $O(x^3)$ or $O(x^4)$ or something.
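One way to weigh those hypotheses is to fit both model families and compare residuals; a sketch (the two-parameter fits may need a reasonable initial guess to converge):

var('a, b')
data = list(zip([16, 28, 59, 120], [0.135, 0.523, 7.36, 248]))
find_fit(data, a*x^b, parameters=[a, b], variables=[x])       # power law: T = a*N^b
find_fit(data, a*exp(b*x), parameters=[a, b], variables=[x])  # exponential: T = a*exp(b*N)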


Actually, a degree 3 polynomial is sufficient to fit the data. See [this interact](http://sagecell.sagemath.org/?z=eJxlkD9PwzAQxfd8iluq3IEV5U8DpVIkmFiiDtCtCshNbGrFjSM7haiI746TbHCLn3zv_c6-HRRwSO5YumH5A0vSuAr201UcJVnO4ihPM3YfZd6w3lRBwwfuu1fV447tKQgeVTcIy-shaIQEKbEt1rQNwNf4ya3zZn_gIRxXTbhCRSCNBQWqA8u7D4EtVTTbz6YRGkfyCXc545w-qApuYHxT_1NLqPRuqbrmXaoBp8exhbN0n16eX73hu_ScSJ8c0hZmbSf9l6lFhyXRD8xZdzJf2Gsz4AyM3OXocAISw5HF06aIbnvj_--WyYtWV1GkMauNNrYIj5rXbUj0C89DZ5Y=&lang=sage). The complexity is polynomial in the number of data points.

( 2013-03-15 12:45:47 +0100 )

Time = [0.135, 0.523, 2.72, 7.36, 23.2, 45.1, 96.3, 248]
Edges = [16.0, 28.0, 47.0, 59.0, 77.0, 84.0, 98.0, 120.0]
Seems like it's more like O(N^5) growth with these additional points?

( 2013-03-15 18:36:18 +0100 )
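A quick sanity check on that exponent is to compute the local slope between consecutive points on a log-log scale; a sketch with the data above:

Time = [0.135, 0.523, 2.72, 7.36, 23.2, 45.1, 96.3, 248]
Edges = [16.0, 28.0, 47.0, 59.0, 77.0, 84.0, 98.0, 120.0]
# local exponent between consecutive points: log(T2/T1) / log(N2/N1)
slopes = [log(Time[i+1]/Time[i]) / log(Edges[i+1]/Edges[i])
          for i in range(len(Time) - 1)]
[s.n(digits=3) for s in slopes]

With these numbers the local exponent starts near 2.4 and hovers (noisily) around 4 to 5 at the larger sizes, so a single global exponent should be read with caution.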

Try [this interact](http://sagecell.sagemath.org/?z=eJxNjs1qwzAQhO96Ct0ktWKR_52CoA8QcsrNuEHxT2KqSsZyW5In78q5dFmGgZlP2h-zcGakY4J8-X6wfBNUU_OyfThyQtckJShJ0zpqXkUtDlGr3dd51MOeJqkC1VJyjpiCJCskVVCkGeJQpYhAVqLPAH1eQIJgCTHN65b0ZjUIPqaZn-RZkPfJrcNiupX0w0gv3OkmBSha8UYozhG74-T6yzitPLJyPx_C9zVg1wmx18Ld__LZ-pX_i40-NqqF5R64kLEr-SaVxPuFeJ09fhyeTz799Bh0qmTnrV80u1rTfTIhQ2fsoJn1N1wm_gCVCliN&lang=sage) with log-log scales; that's often more helpful for determining these things. @ppurka: my model intentionally started with zero edges = zero time, so degree three wouldn't suffice.

( 2013-03-16 12:54:01 +0100 )
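A sketch of the log-log idea: a power law $T = cN^b$ becomes a straight line of slope $b$ on log-log axes, so the slope of a linear fit to the logged data estimates the exponent.

var('a, b')
Time = [0.135, 0.523, 2.72, 7.36, 23.2, 45.1, 96.3, 248]
Edges = [16.0, 28.0, 47.0, 59.0, 77.0, 84.0, 98.0, 120.0]
data = list(zip(Edges, Time))
show(list_plot(data, scale='loglog', pointsize=30))   # straight line <=> power law
logdata = [(log(e), log(t)) for e, t in data]         # move to (log N, log T)
find_fit(logdata, a + b*x, parameters=[a, b], variables=[x])   # slope b = exponent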

Awesome, thanks!

( 2013-03-17 23:14:09 +0100 )