How To Use Multiprocessing.pool In An Imported Module?
Solution 1:
The reason you need to guard multiprocessing code in an if __name__ == "__main__" block is that you don't want it to run again in the child process. That can happen on Windows (or anywhere else the spawn start method is used), where the interpreter needs to reload all of its state since there's no fork system call that will copy the parent process's address space. But you only need the guard where code would otherwise run at the top level of the main script. It's not the only way to protect your code.
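If you are not sure whether this applies on your platform, you can ask multiprocessing which start method it is using. This is only a rough sketch: the helper name main_is_reimported is my own invention, not part of the multiprocessing API, and it assumes the three standard start methods ("fork", "spawn", "forkserver").

```python
import multiprocessing as mp


def main_is_reimported(method=None):
    """Return True if workers started with this method import the main module again.

    Hypothetical helper for illustration; not part of multiprocessing.
    """
    if method is None:
        method = mp.get_start_method()
    # "fork" workers inherit a copy of the parent's memory. "spawn" and
    # "forkserver" workers do not, so they import the main module again.
    return method in ("spawn", "forkserver")


if __name__ == "__main__":
    print(mp.get_start_method(), main_is_reimported())
```

If the helper returns True, any unguarded top-level code in the main script will run again in every worker.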
In your specific case, I think you should put the multiprocessing code in a function. That code won't run in the child process, as long as nothing else calls the function when it should not. Your main module can import the module, then call the function (from within an if __name__ == "__main__" block, probably).
It should be something like this:
some_module.py:
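import multiprocessing as mp

def process_males(x):
    ...

def process_females(x):
    ...

args_m = [...]  # these could be defined inside the function below if that makes more sense
args_f = [...]

def do_stuff():
    # cpu_count() - 1 is 0 on a single-core machine, so keep at least one worker
    with mp.Pool(processes=max(1, mp.cpu_count() - 1)) as p:
        males = p.map_async(process_males, args_m)
        females = p.map_async(process_females, args_f)
        # wait for both jobs here: leaving the with block terminates the pool
        males.wait()
        females.wait()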
main.py:
import some_module

if __name__ == "__main__":
    some_module.do_stuff()
In your real code you might want to pass some arguments or get a return value from do_stuff (which should also be given a more descriptive name than the generic one I've used in this example).
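Here is one way that could look, as a minimal sketch rather than the definitive version. The name run_in_pool, its parameters, and the bodies of the two worker functions are placeholders I made up for illustration; the original bodies are elided.

```python
import multiprocessing as mp


def process_males(x):
    # Placeholder work (assumption); the real body is not shown in the answer.
    return x * 2


def process_females(x):
    # Placeholder work (assumption); the real body is not shown in the answer.
    return x + 100


def run_in_pool(args_m, args_f, processes=None):
    """Run both jobs in one pool and return their results as two lists."""
    if processes is None:
        # Keep at least one worker, even on a single-core machine.
        processes = max(1, mp.cpu_count() - 1)
    with mp.Pool(processes=processes) as pool:
        males = pool.map_async(process_males, args_m)
        females = pool.map_async(process_females, args_f)
        # get() blocks until each job is done, so both results are
        # collected before the with block terminates the pool.
        return males.get(), females.get()


if __name__ == "__main__":
    male_results, female_results = run_in_pool([1, 2, 3], [4, 5])
    print(male_results, female_results)
```

In a two-file layout, everything above the guard would live in some_module.py and the guarded call would live in main.py as some_module.run_in_pool(...).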
Solution 2:
The idea of if __name__ == '__main__': is to avoid infinite process spawning.
To run a function defined in your main script, a worker process has to find that function's code, and when it cannot inherit it from the parent (as on Windows), Python imports your main script again, which basically re-runs it. If the code creating the Pool is in the same script and not protected by the "if main" guard, then by trying to import the function, you will launch another Pool that will try to launch another Pool, and so on.
Thus you should separate the function definitions from the actual main script:
from multiprocessing import Pool

# define test functions outside main
# so they can be imported without launching
# a new Pool
def test_func():
    pass

if __name__ == '__main__':
    with Pool(4) as p:
        r = p.apply_async(test_func)
        # ... do stuff
        result = r.get()
Solution 3:
Cannot yet comment on the question, but a workaround I have used, which some have mentioned, is to define the process_males etc. functions in a different module from the one where the processes are spawned. Then import the module containing the multiprocessing spawns.