ASKSAGE: Sage Q&A Forum - Individual question feed
http://ask.sagemath.org/questions/
Q&A Forum for Sage
Copyright Sage, 2010. Some rights reserved under creative commons license.
Mon, 05 Mar 2018 17:28:30 -0600

How to plot similarity of two data sets in Sage?
http://ask.sagemath.org/question/41406/how-to-plot-similarity-of-two-data-sets-in-sage/

I'm performing some simulations, and at the end I have a CSV file with three columns. One column holds the values for the x-axis, which were also the input to the simulation and to the theoretical calculations; the second holds the theoretically expected values; and the third holds the values obtained by the simulation. I was planning to plot something like this:
![image description](https://ask.sagemath.org/upfiles/15194978051374454.png)
But that does not look good in my case: the y-values roughly double from one point to the next and the x-values increase exponentially, so most of the points end up crowded in the lower left part of the plot, near the intersection of the x-axis and the y-axis. I therefore need a different way to plot such data, one that is more visually appealing and shows how close the simulation results are to the theoretically expected ones. For example, some of my values can be seen below:

    x = [2, 4, 8, 16, 32, 64] # partially removed for brevity
    expected = [47.9995, 95.9783, 191.9127, 383.9708, 767.8831] # partially removed for brevity
    simulated = [48, 96, 191.8, 383.8, 767.4] # partially removed for brevity

What is a good way to plot data in which the y-values double and the x-values increase exponentially at every step, and to see how similar the two data sets actually are?

Mon, 05 Mar 2018 09:35:11 -0600

Comment by ninho
http://ask.sagemath.org/question/41406/how-to-plot-similarity-of-two-data-sets-in-sage/?comment=41409#post-id-41409

@j.c. Yes, `expected` and `simulated` correspond to the values on the y-axis, and the integer values that are input to both the simulation and the theoretical calculations populate the x-axis. By "the values double" I mean what you can see in the examples: the values of `expected` and `simulated` are quite close to each other, but each value is roughly twice the previous one, so it was 48, then 96, then 191, then 383, then 767, and so on. The values on the x-axis also increase exponentially, so all these smaller y-values correspond to smaller values of x. On a plot we will therefore have a point or two in the top right corner and all the other points in the lower left corner.

Mon, 05 Mar 2018 11:41:24 -0600

Comment by j.c.
http://ask.sagemath.org/question/41406/how-to-plot-similarity-of-two-data-sets-in-sage/?comment=41407#post-id-41407

I suppose your example figure shows `expected` and `simulated` on the y-axis, with a third table giving the x-values. What do you mean by "the values normally double"? Do you mean that the corresponding values of `expected` and `simulated` are quite close to each other? If you plot the difference of `expected` and `simulated` against the x-values, is that more informative? I also do not understand why points get collected near x=0 and y=0.

Mon, 05 Mar 2018 10:09:27 -0600

Answer by tmonteil
http://ask.sagemath.org/question/41406/how-to-plot-similarity-of-two-data-sets-in-sage/?answer=41413#post-id-41413

To take into account the exponential nature of your data, you can use the `loglog` scale as follows:

    sage: points(list(zip(x,expected)), color='blue', marker='o', scale='loglog') + points(list(zip(x,simulated)), color='red', marker='x', scale='loglog')

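To see why a log-log scale spreads these points out, here is a minimal plain-Python sketch (no Sage needed) using the truncated values from the question. It only checks the spacing of the points after taking logarithms; the names `pairs`, `log_pairs`, `x_gaps` and `y_gaps` are made up for this illustration.

```python
import math

# Values copied from the question (the original lists were truncated for brevity).
x = [2, 4, 8, 16, 32, 64]
expected = [47.9995, 95.9783, 191.9127, 383.9708, 767.8831]

# zip() stops at the shorter list, so the unmatched x = 64 is dropped.
pairs = list(zip(x, expected))

# On a log-log plot, positions are proportional to the logarithms of the coordinates.
log_pairs = [(math.log2(a), math.log2(b)) for a, b in pairs]

# Gaps between consecutive points along each axis, measured in powers of two.
x_gaps = [q[0] - p[0] for p, q in zip(log_pairs, log_pairs[1:])]
y_gaps = [q[1] - p[1] for p, q in zip(log_pairs, log_pairs[1:])]

print("x gaps:", x_gaps)
print("y gaps:", y_gaps)
```

Since both coordinates roughly double at each step, every gap is close to 1, so the points land at nearly equal spacing along a straight line instead of piling up in the lower left corner.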
Now, since the experimental and theoretical data are very close to each other, I would suggest plotting only one of them, together with a second set of points that shows the percentage of error, like:

    sage: error = [abs(a-b)/a*100 for a,b in zip(expected,simulated)]
    sage: points(list(zip(x,expected)), color='blue', marker='o', scale='loglog') + points(list(zip(x,error)), color='red', marker='x', scale='loglog')

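For reference, the percentage-of-error computation can be tried outside Sage as well. The following is a sketch in plain Python with the truncated values from the question; `percent_error` and `has_zero` are hypothetical names introduced only for this example.

```python
# Values copied from the question (the original lists were truncated for brevity).
expected = [47.9995, 95.9783, 191.9127, 383.9708, 767.8831]
simulated = [48, 96, 191.8, 383.8, 767.4]


def percent_error(reference, measured):
    """Absolute error of `measured` relative to `reference`, in percent."""
    return abs(reference - measured) / reference * 100


error = [percent_error(a, b) for a, b in zip(expected, simulated)]

for a, b, e in zip(expected, simulated, error):
    print(f"expected={a}  simulated={b}  error={e:.4f}%")

# A logarithmic axis cannot show a value of zero, so it is worth checking
# for exact matches before plotting the errors on a log scale.
has_zero = any(e == 0 for e in error)
print("contains an exact zero:", has_zero)
```

For these five pairs every error is below 0.1%, and none is exactly zero, so the errors can be drawn on a logarithmic axis.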
Mon, 05 Mar 2018 17:28:30 -0600