Python MultiProcessing module stuck for long outputs

asked 2020-05-24 15:38:40 +0200

dzhy444

updated 2020-05-26 16:38:06 +0200

I am using SageMath 9.0 and tried to run a parallel computation in two ways:

1) the @parallel decorator built into SageMath;

2) the multiprocessing module of Python.

With the @parallel decorator, everything works fine. With the multiprocessing module, on the same problem with the same input, everything works fine when the output is short, but SageMath hangs after the computation when the output is long. I monitored the CPU usage: it peaks at first and then returns to zero, which suggests that the computation is complete. However, the output never appears.

What puzzles me is that the problem depends on the length of the output, not on the computation time. For the same computation, if I add a line that manually sets the output to something short, or extracts a small part of the original output, the computation no longer gets stuck at the end, and that small part agrees with the original answer.

I would like to know whether there is some hidden parameter that prevents the Python multiprocessing module from producing long outputs in SageMath.

P.S. Following the suggestion of @tmonteil, I attached my code with an example.

def f(n):
    return 2 ^ (n ^ n)


def g(n):
    return factor(2 ^ (n ^ n))


def run(function, parameters):
    from multiprocessing import Process, Queue

    def target_function(x, queue):
        queue.put(function(*x))

    results = list()

    if __name__ == "__main__":
        queue = Queue()
        processes = [Process(target=target_function, args=(i, queue)) for i in parameters]
        for p in processes:
            p.start()
        for p in processes:
            p.join()
        results = [queue.get() for p in processes]

    return results

On my computer, the command

run(f,[(3,),(4,),(5,),(6,)])

works fine but the command

run(f,[(7,)])

gets stuck. On the other hand, the command

run(g,[(3,),(4,),(5,),(6,),(7,),(8,),(9,),(10,)])

works fine. Note that the function g does strictly more work than f (it computes the same power and then factors it), but its results are much shorter.
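
The dependence on output size rather than computation time can be reproduced without SageMath. The following is a minimal sketch in plain Python, not the code from the question: the names put_payload and child_exits_before_get are hypothetical, and it assumes a POSIX system where the "fork" start method is available. It checks whether a child process is able to terminate while the result it put on a Queue is still unread, using a bounded join so that the check itself cannot hang.

```python
import multiprocessing

# Assumption: a POSIX system where the "fork" start method is available.
ctx = multiprocessing.get_context("fork")


def put_payload(size, queue):
    """Child process: put `size` bytes on the queue, then return."""
    queue.put(b"x" * size)


def child_exits_before_get(size, timeout=2.0):
    """Return True if the child terminates while its result is still unread."""
    queue = ctx.Queue()
    p = ctx.Process(target=put_payload, args=(size, queue))
    p.start()
    p.join(timeout)            # bounded wait, so this check cannot hang
    exited = not p.is_alive()
    data = queue.get()         # reading the result lets a blocked child finish
    p.join()
    assert len(data) == size
    return exited


if __name__ == "__main__":
    for size in (10, 10_000_000):
        print(size, child_exits_before_get(size))
```

On a typical Linux system one would expect the small payload to report True and the large one to report False: the child does no real computation in either case, so only the size of the result differs.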

Comments

Could you please provide your code, so that someone can reproduce your problem?

tmonteil (2020-05-24 18:02:16 +0200)

@tmonteil Thank you for your suggestion. I have attached my code with an example.

dzhy444 (2020-05-26 16:39:05 +0200)

1 Answer

answered 2020-05-26 18:30:24 +0200

tmonteil

updated 2020-05-26 18:39:12 +0200

I did not investigate the underlying cause (such as a pipe being full), but a quick look at https://docs.python.org/2/library/mul... indicates that queue.get() should be called before p.join(), and doing so seems to fix the issue. Just replace

    for p in processes:
        p.join()
    results = [queue.get() for p in processes]

with

    results = [queue.get() for p in processes]
    for p in processes:
        p.join()
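
For context, the Python multiprocessing documentation notes (under "Joining processes that use queues") that a process which has put items on a queue waits before terminating until all buffered items have been fed to the underlying pipe, so joining it before reading its result can deadlock when the result is large. Below is a minimal sketch of the reordered version in plain Python rather than Sage. It makes several assumptions that are not in the original: the "fork" start method is available, the worker is a module-level function named _worker, and big_power is a hypothetical stand-in for f that uses ** in place of Sage's ^.

```python
import multiprocessing

# Assumption: a POSIX system where the "fork" start method is available.
ctx = multiprocessing.get_context("fork")


def _worker(function, args, queue):
    """Child process: compute one result and put it on the shared queue."""
    queue.put(function(*args))


def run(function, parameters):
    """Apply `function` to each argument tuple in its own process."""
    queue = ctx.Queue()
    processes = [
        ctx.Process(target=_worker, args=(function, args, queue))
        for args in parameters
    ]
    for p in processes:
        p.start()
    # Read every result first, so that no child is left blocked on a full pipe.
    results = [queue.get() for _ in processes]
    # Only then wait for the children to terminate.
    for p in processes:
        p.join()
    return results


def big_power(n):
    """Plain-Python stand-in for f: 2 ** (n ** n)."""
    return 2 ** (n ** n)


if __name__ == "__main__":
    # 2 ** (7 ** 7) has 823,543 bits, i.e. roughly 100 KB once serialized.
    print(len(run(big_power, [(7,)])))
```

Note that queue.get() returns results in the order in which the children finish, which need not be the order of the parameters; if the order matters, one option is to put an (index, result) pair on the queue and sort afterwards.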

Comments

Thank you for your answer! I have since found that someone had the same problem with plain Python. [https://stackoverflow.com/questions/3...]

dzhy444 (2020-05-27 04:36:10 +0200)
