
ikol's profile - activity

2018-03-24 04:29:10 -0500 received badge  Popular Question (source)
2016-11-24 04:49:04 -0500 received badge  Nice Question (source)
2016-11-22 19:33:33 -0500 asked a question Parallel Interface to the Sage interpreter 2.0

The parallel Sage interface PSage() (http://doc.sagemath.org/html/en/refer...) works fine with the given example, but I have trouble with a more complex case, which would be a typical application of this very useful feature. The following code works exactly as advertised:

>>> v = [ PSage() for _ in range(5)]
>>> w = [x('factor(2**%s-1)'% randint(250,310)) for x in v]
>>> print w
[127 * 13367 * 164511353 * 17137716527 * 51954390877748655744256192963206220919272895548843817842228913,
 ,
 <<currently executing code>>,
 3 * 5^2 * 11 * 31 * 41 * 53 * 131 * 157 * 521 * 1613 * 2731 * 8191 * 51481 * 409891 * 7623851 * 34110701 * 108140989558681 * 145295143558111,
 ]
[127 * 13367 * 164511353 * 17137716527 * 51954390877748655744256192963206220919272895548843817842228913,
 7 * 73 * 16183 * 34039 * 1437967 * 2147483647 * 833732508401263 * 658812288653553079 * 2034439836951867299888617,
 <<currently executing code>>,
 3 * 5^2 * 11 * 31 * 41 * 53 * 131 * 157 * 521 * 1613 * 2731 * 8191 * 51481 * 409891 * 7623851 * 34110701 * 108140989558681 * 145295143558111,
 ]
[127 * 13367 * 164511353 * 17137716527 * 51954390877748655744256192963206220919272895548843817842228913,
 7 * 73 * 16183 * 34039 * 1437967 * 2147483647 * 833732508401263 * 658812288653553079 * 2034439836951867299888617,
 <<currently executing code>>,
 3 * 5^2 * 11 * 31 * 41 * 53 * 131 * 157 * 521 * 1613 * 2731 * 8191 * 51481 * 409891 * 7623851 * 34110701 * 108140989558681 * 145295143558111,
 7 * 78903841 * 28753302853087 * 618970019642690137449562111 * 24124332437713924084267316537353]
[127 * 13367 * 164511353 * 17137716527 * 51954390877748655744256192963206220919272895548843817842228913,
 7 * 73 * 16183 * 34039 * 1437967 * 2147483647 * 833732508401263 * 658812288653553079 * 2034439836951867299888617,
 131071 * 12761663 * 179058312604392742511009 * 3320934994356628805321733520790947608989420068445023,
 3 * 5^2 * 11 * 31 * 41 * 53 * 131 * 157 * 521 * 1613 * 2731 * 8191 * 51481 * 409891 * 7623851 * 34110701 * 108140989558681 * 145295143558111,
 7 * 78903841 * 28753302853087 * 618970019642690137449562111 * 24124332437713924084267316537353]

Printing w repeatedly shows the progress of the five factorizations running in parallel (monitoring with top shows five sage/python jobs running simultaneously). The following example performs global optimization from five different starting points, using the differential evolution algorithm available in SciPy. The setup is more complex, but still only a single command string is passed to PSage().

>>> v = [ PSage() for _ in range(5)]
>>> w = [x('from scipy.optimize import rosen, differential_evolution; differential_evolution(rosen, [(0,2), (0, 2), (0, 2), (0, 2), (0, 2), (0, 2), (0, 2), (0, 2), (0, 2), (0, 2)])') for x in v]
>>> print w
[Sage, Sage, Sage, Sage, Sage]

Apparently something is wrong here: it doesn't work. But why? Let's see what happens with the serial Sage interpreter Sage().

>>> s = Sage()
>>> t = s('from scipy.optimize import rosen, differential_evolution;differential_evolution(rosen,[(0,2), (0, 2), (0, 2), (0, 2), (0, 2), (0, 2), (0, 2), (0, 2), (0, 2), (0, 2)])')
>>> print t
Sage

It looks like the same problem. However, Sage() can be made to work using the eval() method.

>>> s = Sage()
>>> t = s.eval('from scipy.optimize import rosen, differential_evolution;differential_evolution(rosen,[(0,2), (0, 2), (0, 2), (0, 2), (0, 2), (0, 2), (0, 2), (0, 2), (0, 2), (0, 2)])')
>>> print t
fun: 1.2785524204717224e-18

PSage() also has an eval() method, which AFAIK uses Sage().eval() internally, but unfortunately it doesn't work either.

>>> v = [ PSage() for _ in range(5)]
>>> w = [x.eval('from scipy.optimize import rosen, differential_evolution; differential_evolution(rosen, [(0,2), (0, 2), (0, 2), (0, 2), (0, 2), (0, 2), (0, 2), (0, 2), (0, 2), (0, 2)])') for x in v]
>>> print w
['<<currently executing code>>', '<<currently executing code>>',
 '<<currently executing code>>', '<<currently executing code>>',
 '<<currently executing code>>']

Based on what I see in top, the five optimization jobs do run in parallel, but even after they finish, no matter how many times I print w, all I get is <<currently executing code>>.

The bottom line: without the eval() method, PSage() does not work at all; with eval(), the jobs do run, but the interface seems stuck in a bad internal state and never produces the output. Any comment is highly appreciated; it would be great to get this to work consistently.
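As a stopgap while PSage() misbehaves, the same fan-out can be sketched with Python's standard multiprocessing module. This is my own workaround, not part of the PSage interface, and the objective function below is a cheap stand-in for the differential_evolution call, since the point is only the process pool:

```python
from multiprocessing import Pool

def objective(seed):
    # stand-in for differential_evolution(rosen, ...); any CPU-bound,
    # pickleable function of one argument slots in here
    return sum((seed + i) ** 2 for i in range(5))

if __name__ == '__main__':
    # five workers, analogous to the five PSage() instances above
    with Pool(5) as pool:
        results = pool.map(objective, range(5))
    print(results)   # [30, 55, 90, 135, 190]
```

Unlike PSage(), pool.map blocks until all workers finish, so there is no polling of partially finished results.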

2016-11-22 18:11:43 -0500 received badge  Enthusiast
2016-11-15 10:23:32 -0500 answered a question Parallel Interface to the Sage interpreter

It turns out that the problem has nothing to do with scalar vs. vector variables. The bottom line is that PSage() takes a single STRING, enclosed in quotes, as its parameter. The following (somewhat awkward) modification makes the parallel interface work properly.

from scipy.optimize import rosen

x0 = [1.3, 0.7, 0.8, 1.9, 1.2]
rosen(x0)
848.22000000000003

v = [PSage() for _ in range(3)]
w = [x(eval('rosen(%s)'% str(x0))) for x in v]
w
[848.22, 848.22, 848.22]

Unfortunately, using eval() does not solve the problem. Because the example here is instantaneous to compute, I hadn't realized that the calculations were in fact executed sequentially, not in parallel. After looking at the PSage() code it is clear that the argument must be a single string, and the fact that this example "worked" (although not in parallel) must be an artifact of using an explicit eval(). Mea culpa. HOWEVER, I created a better example and opened a new ticket, because PSage() would be an extremely good tool for this kind of thing.
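To make the single-string requirement concrete, here is a plain-Python sketch of serializing the whole call, list literal included, into the one command string a PSage() instance expects (rosen and the values are just the example above):

```python
x0 = [1.3, 0.7, 0.8, 1.9, 1.2]

# build the entire call as one string; %r embeds the list literal
cmd = 'rosen(%r)' % (x0,)
print(cmd)   # rosen([1.3, 0.7, 0.8, 1.9, 1.2])

# each PSage() instance would then receive cmd as its single string argument
```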

2016-11-15 08:04:56 -0500 received badge  Nice Question (source)
2016-11-14 18:16:14 -0500 asked a question Parallel Interface to the Sage interpreter

I am using the PSage() parallel interpreter as described here: http://doc.sagemath.org/html/en/refer.... The interface works fine for evaluating multiple instances of calls to functions of one or more scalar variables; however, if a function argument is array-like, the interface doesn't seem to work anymore. I wonder if any special quoting is necessary for array-like arguments. Here is a minimal example. For reference, I copy the example from the above page, which works just fine; factor() has a single scalar variable.

v = [PSage() for _ in range(3)]
w = [x('factor(2^%s-1)'% randint(250,310)) for x in v]
w
[4057 * 8191 * 6740339310641 * 3340762283952395329506327023033,
 31 * 13367 * 2940521 * 164511353 * 70171342151 * 3655725065508797181674078959681,
 31 * 13367 * 2940521 * 164511353 * 70171342151 * 3655725065508797181674078959681]

However, the rosen() function with a single array-like/vector argument doesn't seem to work in the parallel interface. (The example below just calculates the same function value three times, but that is not the point here.)

from scipy.optimize import rosen

x0 = [1.3, 0.7, 0.8, 1.9, 1.2]
rosen(x0)
848.22000000000003

v = [PSage() for _ in range(3)]
w = [x('rosen(x0)') for x in v]
w
[Sage, Sage, Sage]

Does anyone have any suggestion?

2016-09-30 01:45:00 -0500 received badge  Notable Question (source)
2016-09-30 01:45:00 -0500 received badge  Popular Question (source)
2016-09-11 09:28:27 -0500 received badge  Popular Question (source)
2016-09-11 09:28:27 -0500 received badge  Famous Question (source)
2016-09-11 09:28:27 -0500 received badge  Notable Question (source)
2016-08-08 02:02:08 -0500 received badge  Famous Question (source)
2016-08-08 02:02:08 -0500 received badge  Notable Question (source)
2016-08-08 02:02:08 -0500 received badge  Popular Question (source)
2016-06-21 23:05:46 -0500 received badge  Necromancer (source)
2016-06-21 22:00:13 -0500 received badge  Scholar (source)
2016-06-21 16:18:16 -0500 answered a question calling a parallel decorated function on an iterator

Niles is right. I also ran into this problem recently but found a straightforward way to solve it. I have some sample code below with detailed explanation. First, let's see a simple comparison between serial and parallel execution of a function.

@parallel(p_iter='multiprocessing', ncpus=3)
def test_parallel(n):
    f = factor(2^128+n)
    return len(f)

t=walltime()
r = range(1000)
p = sorted(list( test_parallel(r)))
print p[-1]
print walltime(t)

t=walltime()
for i in range(1000):
  f = factor(2^128+i)
print f
print walltime(t)

(((999,), {}), 5)
6.359593153
5 * 23 * 383 * 1088533 * 7097431848554855065803619703
17.0849101543

test_parallel is a simple function that takes a nontrivial time to execute, for testing purposes: it returns the number of distinct factors of 2^128+n. The argument of test_parallel is a list created by the range function. Note that this has to be a list; there is currently no alternative, so e.g. xrange cannot be used in place of range, because xrange generates numbers on the fly rather than creating the whole list. This can be a serious (mainly memory) problem, but it can be overcome as shown further below.

test_parallel, like any @parallel-decorated function, returns a special object: an iterator over 2-tuples, and the order of the 2-tuples is entirely random! So the output (((999,), {}), 5) above, representing the last item p[-1] in the calculation, includes the input value n=999, an empty keyword dictionary {}, and the return value 5. It should be noted that in order to parse the output from test_parallel, it should be cast to a sorted list.
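The shape of those 2-tuples can be illustrated in plain Python without running Sage at all; the values below are invented for illustration:

```python
# mock of the ((args, kwargs), value) pairs a @parallel function yields,
# deliberately out of order to mimic the nondeterministic completion order
raw = [(((2,), {}), 4), (((0,), {}), 1), (((1,), {}), 2)]

p = sorted(list(raw))                        # sort by input args to restore input order
values = [val for ((args, kwargs), val) in p]

print(p[-1])     # (((2,), {}), 4) -- last input, its kwargs, its result
print(values)    # [1, 2, 4]
```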

In this particular run the parallel calculation (using 3 cores) took some 6 seconds whereas at the bottom of the sample code the serial equivalent took some 17 seconds to execute (and the result of the last factorization confirms that there were 5 distinct factors).

This is all well, and my great appreciation goes to the developers. Unfortunately, however, a serious problem arises when the list argument to a parallel function grows too big. One solution that has worked very well for me involves chunks, as Niles suggested, plus numpy arrays. The following code is a more robust and significantly more efficient alternative to the naive parallel code above.

%time
import numpy as np

sizof_chunk = 10^3
numof_chunk = 10^2
np_array    = np.zeros((sizof_chunk*numof_chunk,), dtype=np.uint8)

for i in range(numof_chunk):

  beg    = i *   sizof_chunk
  end    = beg + sizof_chunk
  tuples = sorted(list(test_parallel(range(beg,end))))
  values = [ x[1] for x in tuples ]
  np_array[beg:end] = np.fromiter(values, np.uint8)

print np_array

[1 2 3 ..., 6 8 3]
CPU time: 13.88 s,  Wall time: 670.06 s

sizof_chunk is set to the same number 1000 and numof_chunk can be set to anything. If it is set to 1 then the calculation will be the exact same as above (and will take about 6 seconds ...
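As an aside, the chunking idea generalizes to arbitrary (even lazy) iterables with a small helper; this is a generic plain-Python sketch, not part of the timed run above:

```python
from itertools import islice

def chunks(iterable, size):
    """Yield successive lists of at most `size` items from any iterable,
    so a huge (or lazy) input never has to be materialized at once."""
    it = iter(iterable)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block

# each block could then be fed to test_parallel() in turn
for block in chunks(range(10), 4):
    print(block)    # [0, 1, 2, 3] then [4, 5, 6, 7] then [8, 9]
```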

2016-06-17 21:19:21 -0500 commented answer Performance issues with parallel decoration

In version 7.2 I no longer see the load balancing issue.

Istvan

2016-06-17 20:25:10 -0500 received badge  Popular Question (source)
2016-04-22 19:37:10 -0500 received badge  Notable Question (source)
2016-04-22 19:37:10 -0500 received badge  Popular Question (source)
2015-11-17 04:49:28 -0500 received badge  Popular Question (source)
2015-09-04 20:21:27 -0500 commented answer How make Notebook not to write in /tmp

I tried both and SAGENB_TMPDIR doesn't work, but setting TMPDIR does.

Thanks!

2015-09-04 15:03:39 -0500 asked a question How make Notebook not to write in /tmp

I am using Sage Notebook to generate very large files and then process them. The size of the files is tens of gigabytes, but that wouldn't be a problem the way things are set up in my worksheet. However, no matter where the worksheet is stored, the associated files are temporarily kept in /tmp while the worksheet is working. For example,

$ ll /raid/istvan/Playground/Sage.sagenb/home/__store__/2/21/212/2123/admin/17/cells/4/
total 8
drwx------ 2 istvan istvan 4096 Sep  4 15:36 ./
drwxrwxr-x 9 istvan istvan 4096 Sep  4 15:34 ../
lrwxrwxrwx 1 istvan istvan   49 Sep  4 15:36 A.mmap -> /tmp/tmpa6CJmh/A.mmap
lrwxrwxrwx 1 istvan istvan   49 Sep  4 15:36 B.mmap -> /tmp/tmpa6CJmh/B.mmap
lrwxrwxrwx 1 istvan istvan   49 Sep  4 15:36 C.mmap -> /tmp/tmpa6CJmh/C.mmap
lrwxrwxrwx 1 istvan istvan   49 Sep  4 15:36 D.mmap -> /tmp/tmpa6CJmh/D.mmap

$ ll /tmp/tmpa6CJmh
total 68860
drwx------  2 istvan istvan        4096 Sep  4 15:35 ./
drwxrwxrwt 14 root   root         12288 Sep  4 15:35 ../
-rw-rw-r--  1 istvan istvan        1688 Sep  4 15:35 ___code___.py
-rw-rw-r--  1 istvan istvan 10000000000 Sep  4 15:37 A.mmap
lrwxrwxrwx  1 istvan istvan          54 Sep  4 15:35 data -> /raid/istvan/Playground/Sage.sagenb/home/admin/17/data/
-rw-rw-r--  1 istvan istvan 10000000000 Sep  4 15:37 B.mmap
-rw-rw-r--  1 istvan istvan 80000000000 Sep  4 15:37 C.mmap
-rw-rw-r--  1 istvan istvan 10000000000 Sep  4 15:37 D.mmap
-rw-rw-r--  1 istvan istvan        2219 Sep  4 15:35 _sage_input_6.py

The total size of the files is about 110 GB, and I have plenty of room on the /raid partition where the Sage Notebook resides, but my / partition, which includes /tmp, is way too small for that. How can I make the Notebook NOT write to /tmp? Is there an environment variable to set for that?

Thanks for any suggestion,

Istvan
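For what it's worth, Python's standard tempfile module honors the TMPDIR environment variable, which is consistent with TMPDIR being the knob that works here (whether the Notebook goes through tempfile is my assumption); a quick plain-Python check:

```python
import os
import tempfile

scratch = tempfile.mkdtemp()   # stand-in for e.g. /raid/scratch
os.environ["TMPDIR"] = scratch
tempfile.tempdir = None        # force tempfile to re-read TMPDIR

print(tempfile.gettempdir())   # now points at the scratch directory
```

Setting TMPDIR before launching the Notebook would therefore redirect anything created through this machinery.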

2015-04-02 07:09:14 -0500 commented answer Is HDF5 or the Python interface h5py supported in Sage?

Download the package from http://www.hdfgroup.org/downloads/index.html and the installation is straightforward. On Linux:

1) untar the downloaded file
2) cd to the hdf5 directory
3) ./configure --prefix=/where/you/want/hdf5/to/be/installed (in my case it was /usr/local/hdf5)
4) make
5) make check
6) sudo make install (sudo needed if the location is not in your own user area)
7) sudo make check-install
2015-04-01 12:18:23 -0500 received badge  Self-Learner (source)
2015-04-01 01:51:28 -0500 answered a question Cannot add a comment to my own question

I could answer my question, as opposed to commenting on it. It could be my browser, I don't know, but let's consider this ticket closed.

2015-04-01 01:48:10 -0500 answered a question Is HDF5 or the Python interface h5py supported in Sage?

Of course, hdf5 must be installed first, and since it is not a Python package, pip will likely need explicit information about the hdf5 libraries and include files. The following command worked for me:

$ sage -pip install --global-option=build_ext --global-option="-L/usr/local/hdf5/lib" --global-option="-l/usr/local/hdf5/lib" --global-option="-I/usr/local/hdf5/include" --global-option="-R/usr/local/hdf5/lib" h5py

2015-03-31 15:24:14 -0500 asked a question Cannot add a comment to my own question

I am trying to get further help regarding "Is HDF5 or the Python interface h5py supported in Sage?" I added my related question as a comment (while logged in) twice; the site seemed to process it, but it won't show up under the above title. Should I open a new ticket?

UPDATE

This is fixed now :)

2015-03-18 08:06:25 -0500 commented answer Is HDF5 or the Python interface h5py supported in Sage?

Got it. Great, thank you!

Istvan

2015-03-17 12:37:47 -0500 asked a question Is HDF5 or the Python interface h5py supported in Sage?

I am saving very large list objects to disk using the save() command in Sage, which uses Python's pickle module. There is a known deficiency/bug that is unlikely to go away: deep in the compression code of the Python standard library that pickle, and therefore save(), relies on, there are legacy 32-bit integers, which results in a serious limitation when using save() with even moderately large (hundreds of MB) objects. See OverflowError: size does not fit in an int. The h5py package provides a Python interface to the HDF5 library, and HDF5 can deal with multiple terabytes easily. Does anyone know if h5py is or will be supported in Sage? Is there another alternative to save() in Sage for saving objects to disk?

2015-03-17 11:33:14 -0500 commented question How to change the prefix to SAGE_TMP?

I am using v6.4.1. ~/.sage/temp/ is different. If I run a Notebook session, its files are stored under ~/.sage/notebook/..., but while a Sage calculation is running, all the files there are just links to /tmp/something, where /tmp/something is the value of SAGE_TMP in that session. I am saving large objects via the save() command, and that's where I have the problem, because they can't fit in /tmp. There has got to be a way in Sage to set the prefix for SAGE_TMP from /tmp to something else.

2015-03-16 04:27:54 -0500 received badge  Nice Question (source)
2015-03-13 17:43:56 -0500 received badge  Editor (source)
2015-03-13 17:42:18 -0500 asked a question How to change the prefix to SAGE_TMP?

SAGE_TMP looks something like this by default: /tmp/tmpGMP2PR. My /tmp is full but I have plenty of space in a different tmp directory on another disk. How can I change the default prefix in SAGE_TMP from /tmp to, say, /raid/scratch?

Thanks.

2015-03-04 14:15:56 -0500 commented answer Performance issues with parallel decoration

Thanks, Vincent. Of course you are right: this is not an exact apples-to-apples comparison, and your point about the calculation being too fast is valid. I too experimented with very large numbers, and indeed the parallel performance is better. Nonetheless, I do see 6 python jobs starting out, but very quickly four of them finish, and then only two, and then only one, keeps running for quite a while, which means that load balancing is far from ideal. I'll check the source code to see how the list is passed to the function.

Thanks again,

Istvan

2015-03-02 22:39:23 -0500 commented answer How to format questions in this forum

Got it, thank you.

2015-03-02 19:10:08 -0500 asked a question How to format questions in this forum

Sorry, I am new to this forum but already find it very helpful. I noticed that most posts have nicely formatted code snippets, but I couldn't figure out how to do that. When I cut and paste from notebook(), my code looks awful and unformatted.

Thanks for any suggestions.

2015-03-02 19:04:53 -0500 asked a question Performance issues with parallel decoration

Experimenting with @parallel resulted in unexpected performance issues in Sage 6.4.1. Here is a very simple example:

@parallel(p_iter='multiprocessing', ncpus=6)
def f(n):
    return factor(n)
t=walltime()
r = range(1,1000000)
p = sorted(list( f(r)))
print walltime(t)
82.0724880695

t=walltime()
for i in range(1,1000000):
    factor(i)
print walltime(t)
12.1648099422

I have 6 physical cores, yet the serial calculation runs more than 6 times faster, even though I can see 6 instances of python running on my computer. Maybe it is pilot error; in any case I have the following questions:

1) Does Sage require a special way of compiling it in order to take full advantage of @parallel?

2) In this case using 'fork' is even worse: it never completes the calculation.

3) How does @parallel distribute the calculations? Since it generally takes significantly longer for factor() to process larger numbers, assigning n=1,7,13,... to core_0, n=2,8,14,... to core_1, etc., seems sensible; shuffling the original serial list given to f(n) also seems plausible. However, dividing the whole range into 6 contiguous intervals and assigning one to each core would be a bad choice: for most of the time only one or two python processes would be doing anything. Does anyone know which scheme Sage uses?

Thanks for any suggestions.
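The two distribution schemes described in question 3 are easy to sketch in plain Python (the helper names are mine, purely for illustration; this is not Sage's actual scheduler):

```python
def round_robin(items, ncpus):
    """Stripe items across workers: worker k gets items[k], items[k+ncpus], ..."""
    return [items[k::ncpus] for k in range(ncpus)]

def block_split(items, ncpus):
    """Give each worker one contiguous slice of the input."""
    size = -(-len(items) // ncpus)   # ceiling division
    return [items[k * size:(k + 1) * size] for k in range(ncpus)]

work = list(range(1, 13))
print(round_robin(work, 3))  # [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]
print(block_split(work, 3))  # [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
```

With a cost that grows with n, striping mixes cheap and expensive inputs across workers, whereas contiguous blocks concentrate the expensive inputs on the last worker.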

2015-03-01 23:06:17 -0500 commented answer Why does append() overwrite/clobber every existing element of a list with the one that was just appended?

That's very helpful, thanks!

2015-03-01 20:34:35 -0500 asked a question Why does append() overwrite/clobber every existing element of a list with the one that was just appended?
M = []
L = [0 for i in range(10)]
print L
M.append(L)
print M
L[0] += 1
L[7] += 1
print L
M.append(L)
print M

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]
[1, 0, 0, 0, 0, 0, 0, 1, 0, 0]
[[1, 0, 0, 0, 0, 0, 0, 1, 0, 0], [1, 0, 0, 0, 0, 0, 0, 1, 0, 0]]
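For context, the transcript above shows Python's reference semantics rather than a bug in append(): both entries of M are the same list object. A minimal plain-Python sketch of the behavior and the usual fix (appending a copy):

```python
M = []
L = [0] * 3

M.append(L)          # M stores a reference to L, not a snapshot of it
L[0] += 1
print(M)             # [[1, 0, 0]] -- the already-appended entry changed too

M2 = []
L2 = [0] * 3
M2.append(list(L2))  # append a shallow copy instead
L2[0] += 1
print(M2)            # [[0, 0, 0]] -- the stored copy is unaffected
```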
2015-02-22 22:09:05 -0500 received badge  Teacher (source)