Python’s “multiprocessing” module provides a Pool class that is quite convenient when we want to do parallel processing. The API is simple and fairly straightforward. I wanted to see if I could craft an example out of the official docs, and here’s the code:
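from multiprocessing import Pool
from time import sleep
from random import randint
import os

class AsyncFactory:
    def __init__(self, func, cb_func):
        self.func = func
        self.cb_func = cb_func
        self.pool = Pool()
    def call(self, *args, **kwargs):
        self.pool.apply_async(self.func, args, kwargs, self.cb_func)
    def wait(self):
        self.pool.close()  # no more tasks will be submitted
        self.pool.join()   # block until the workers have exited

def square(x):
    sleep_duration = randint(1, 5)
    print("PID: %d \t Value: %d \t Sleep: %d" % (os.getpid(), x, sleep_duration))
    sleep(sleep_duration)
    return x * x
def cb_func(x):
    print(x)

async_square = AsyncFactory(square, cb_func)
for i in range(10):  # the number of calls is arbitrary
    async_square.call(i)
async_square.wait()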
Let’s see what we’re doing here. We have two functions: the main function “square” and the callback function “cb_func”, which just prints out the result. We want to run the main function asynchronously and, as soon as a result is available, have the callback execute and print it out.
Inside the AsyncFactory we store the function and the callback function on the instance, and then we create a worker pool. By default, the pool starts as many worker processes as the machine has CPUs. We can also pass a number as the “processes” argument to set how many worker processes the pool uses.
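To make the “processes” argument a bit more concrete, here is a minimal sketch. The helper name “run” and the choice of two workers are my own assumptions for illustration, and I use pool.map here only because it keeps the example short. Each task returns its PID so we can check how many workers actually did the work.

```python
from multiprocessing import Pool
import os


def square(x):
    # Runs in a worker process; return the PID alongside the result.
    return os.getpid(), x * x


def run(values, processes=None):
    # Hypothetical helper: processes=None lets Pool pick its default size,
    # an integer fixes the number of worker processes.
    pool = Pool(processes=processes)
    try:
        return pool.map(square, values)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    results = run(range(8), processes=2)
    # With processes=2 there should be at most two distinct worker PIDs.
    print(sorted(set(pid for pid, _ in results)))
    print([value for _, value in results])
```

The main guard is there because, on platforms that start workers by spawning a fresh interpreter, the module gets imported again in each worker and an unguarded Pool() would be created again.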
Then we repeatedly call apply_async on the Pool object, passing the function along with its arguments. Finally, we wait for the pool to close its workers and rest in peace. As soon as a worker returns a value, the callback prints it out. We have also added a print statement to show the process ID, the passed value and the sleep duration.
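The submit, close, join sequence can also be sketched on its own, without the wrapper class. This is only an illustration: “collect_squares” is a name I made up, and instead of printing, the callback appends each result to a list. That works because the callback runs back in the parent process, not in the worker.

```python
from multiprocessing import Pool


def square(x):
    return x * x


def collect_squares(values):
    # Hypothetical helper: gather results through the callback.
    results = []  # filled in by the callback, in the parent process
    pool = Pool()
    for value in values:
        pool.apply_async(square, (value,), callback=results.append)
    pool.close()  # no more tasks will be submitted
    pool.join()   # block until every task has finished
    # Tasks may complete in any order, so sort before returning.
    return sorted(results)


if __name__ == "__main__":
    print(collect_squares(range(5)))  # [0, 1, 4, 9, 16]
```

Note that close() has to come before join(); calling join() on a pool that is still accepting tasks raises an error.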
Isn’t it fun?