ASKSAGE: Sage Q&A Forum - RSS feed
https://ask.sagemath.org/questions/
Q&A Forum for Sage
Copyright Sage, 2010. Some rights reserved under creative commons license.
Wed, 03 Mar 2021 18:49:09 +0100

Fit model data
https://ask.sagemath.org/question/56001/fit-model-data/
I have a very naive question about the `find_fit` model. Is there some way to choose the "best" one? For example, in this simple situation:
    R = [[0,1],[4,2],[8,4],[12,8],[16,16],[20,32],[24,64],[28,128],[32,256]]
    p = scatter_plot(R)
    p.show()
    a = var('a')  # declare the fit parameter before using it in the model
    model(x) = 2^(a*x)
    find_fit(R,model)
It returns `a == 0.25`. But by choosing the model `2^(a*x)` I have, of course, already assumed the answer. If I choose a more generic model, which makes more sense, such as
    a, b = var('a, b')  # both fit parameters must be declared
    model(x) = a^(b*x)
it returns `a == 1.1017945777232188, b == 1.7875622665869968`. So I was wondering whether there is a way to get the optimized model for it.
Many thanks.

Wed, 03 Mar 2021 15:41:04 +0100

Comment by Emmanuel Charpentier
https://ask.sagemath.org/question/56001/fit-model-data/?comment=56007#post-id-56007
You are trying to reduce to a computation a choice among an infinite (non-enumerable) set of possible models. This is the subject of statistics (among others); it is usually foolish to tackle this question without prior subject-matter knowledge about the origin of the data you are trying to model.
Furthermore, the meaning of "best fit" is far from obvious: smallest error? Smallest squared error? Smallest predictive error?
Do you want to use it for further prediction? If so, can you express your desired performance in terms of (predictive) error metrics? In terms of the distribution of these errors? In terms of maximal error? This is highly dependent on the *goal(s)* you have...
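To illustrate why the choice of metric matters, here is a minimal sketch in plain Python (no Sage required) that scores two candidate models on the data from the question under two of the criteria mentioned above: the sum of squared errors and the maximal absolute error. The helper `error_metrics` and the `straight_line` model are hypothetical names and choices made for this illustration only; they are not part of `find_fit` or of the original thread.

```python
# Data from the question: (x, y) pairs.
R = [[0, 1], [4, 2], [8, 4], [12, 8], [16, 16],
     [20, 32], [24, 64], [28, 128], [32, 256]]


def error_metrics(f, data):
    """Return (sum of squared errors, largest absolute error) of f on data."""
    residuals = [f(x) - y for x, y in data]
    sse = sum(r * r for r in residuals)
    max_abs = max(abs(r) for r in residuals)
    return sse, max_abs


def exponential(x):
    # The one-parameter model from the question with a = 0.25.
    return 2 ** (0.25 * x)


def straight_line(x):
    # Hypothetical competitor: the line through the first and last data points.
    return 1 + (256 - 1) / 32 * x


for name, f in [("exponential", exponential), ("straight line", straight_line)]:
    sse, max_abs = error_metrics(f, R)
    print(name, sse, max_abs)
```

Any other criterion (for example a predictive error on held-out points) can be plugged in the same way; which one is appropriate depends on the goal, as the comment says.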
If you take into account the complexity (or lack of parsimony) of a model, lots of choices have to be made.

Wed, 03 Mar 2021 18:49:09 +0100

Comment by rburing
https://ask.sagemath.org/question/56001/fit-model-data/?comment=56002#post-id-56002
Up to a small numerical difference the results are the same, because for the second set of parameters $a^b \approx 2^{1/4}$. What is the question?

Wed, 03 Mar 2021 17:13:21 +0100

Answer by slelievre
https://ask.sagemath.org/question/56001/fit-model-data/?answer=56003#post-id-56003
To obtain the model with the "best fit" parameters found by `find_fit` substituted in, use `subs`.
If that was not the question, can you clarify it?
Example of applying `subs` to the result of `find_fit`:
    sage: R = [[0, 1], [4, 2], [8, 4], [12, 8], [16, 16],
    ....:      [20, 32], [24, 64], [28, 128], [32, 256]]
    sage: p = scatter_plot(R)
    sage: a = var('a')
    sage: model(x) = 2^(a*x)
    sage: fit = find_fit(R, model)
    sage: f(x) = model(x).subs(fit)
    sage: f
    x |--> 2^(0.25*x)
    sage: a, b = SR.var('a, b')
    sage: model(x) = a^(b*x)
    sage: fit = find_fit(R, model)
    sage: fit
    [a == 1.1017948048163804, b == 1.7875584659241468]
    sage: g(x) = model(x).subs(fit)
    sage: g
    x |--> 1.1017948048163804^(1.7875584659241468*x)
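As a quick sanity check that the two fits describe essentially the same curve, the sketch below (plain Python, not part of the original answer) uses the identity $a^{bx} = (a^b)^x$: the data can only determine the effective base $a^b$, not $a$ and $b$ separately. The variable names are illustrative; the numbers are the parameters printed in the session above.

```python
import math

# Parameters reported by find_fit for the two-parameter model a^(b*x).
a_fit = 1.1017948048163804
b_fit = 1.7875584659241468

# Since a^(b*x) == (a^b)^x, only the effective base a^b matters.
effective_base = a_fit ** b_fit
expected_base = 2 ** 0.25  # base of the one-parameter fit 2^(0.25*x)
print(effective_base, expected_base)

# Equivalent exponent in the 2^(c*x) form: c = b * log2(a).
c = b_fit * math.log2(a_fit)
print(c)
```

Because infinitely many pairs `(a, b)` give the same product `b * log2(a)`, the particular pair returned for the two-parameter model may well depend on the optimizer's starting point, which would explain the small differences between the values in the question and in this answer.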
Wed, 03 Mar 2021 17:17:18 +0100

Comment by phcosta
https://ask.sagemath.org/question/56001/fit-model-data/?comment=56004#post-id-56004
It is exactly the point. Thank you very much for the suggestion.

Wed, 03 Mar 2021 17:31:32 +0100