Author: 钟杰辉_576 | Source: Internet | 2022-12-02 10:35
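Suppose yo = Yo() is a large object with a method double, which returns its argument multiplied by 2.
If I pass yo.double to imap from multiprocessing, it is very slow. I think this is because every function call creates a copy of yo.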
That is, this is slow:
from tqdm import tqdm
from multiprocessing import Pool
import numpy as np

class Yo:
    def __init__(self):
        self.a = np.random.random((10000000, 10))

    def double(self, x):
        return 2 * x

yo = Yo()

with Pool(4) as p:
    for _ in tqdm(p.imap(yo.double, np.arange(1000))):
        pass
Output:
0it [00:00, ?it/s]
1it [00:06, 6.54s/it]
2it [00:11, 6.17s/it]
3it [00:16, 5.60s/it]
4it [00:20, 5.13s/it]
...
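The suspicion that yo travels with every call can be checked without starting a pool, because multiprocessing sends each task to a worker by pickling it. The sketch below is only an illustration under stated assumptions: it uses a simplified Yo whose payload is a bytes object of configurable size (the n_bytes parameter is hypothetical and stands in for the large NumPy array), and it measures how large the pickled bound method is.

```python
import pickle


class Yo:
    def __init__(self, n_bytes):
        # Stand-in payload (assumption): the original post uses a large
        # NumPy array here; plain bytes keep the sketch dependency-free.
        self.a = bytes(n_bytes)

    def double(self, x):
        return 2 * x


small = Yo(1_000)
large = Yo(1_000_000)

# A bound method is pickled as "look up this method name on this instance",
# so the instance and all of its data are serialized along with it.
small_size = len(pickle.dumps(small.double))
large_size = len(pickle.dumps(large.double))

print("small bound method:", small_size, "bytes")
print("large bound method:", large_size, "bytes")
```

If the pickled size grows with the payload, as it should here, then every task that carries yo.double also carries a full serialized copy of yo, which would be consistent with the several seconds per item seen above.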
However, if I wrap yo.double in a function double_wrap and pass that to imap instead, it is essentially instantaneous.
def double_wrap(x):
    return yo.double(x)

with Pool(4) as p:
    for _ in tqdm(p.imap(double_wrap, np.arange(1000))):
        pass
Output:
0it [00:00, ?it/s]
1000it [00:00, 14919.34it/s]
How and why does wrapping the function change the behavior?
I am using Python 3.6.6.
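One way to probe the question is to compare what pickle produces for the two callables, since that is what gets sent to the workers. The following is a minimal sketch, not a definitive explanation: Yo is again simplified to hold a bytes payload (the n_bytes parameter is an assumption made for illustration), and the sizes are measured directly rather than through a pool.

```python
import pickle


class Yo:
    def __init__(self, n_bytes):
        # Stand-in payload (assumption) replacing the large NumPy array.
        self.a = bytes(n_bytes)

    def double(self, x):
        return 2 * x


yo = Yo(1_000_000)


def double_wrap(x):
    # Looks up the module-level yo at call time, in whichever process runs it.
    return yo.double(x)


# A module-level function is pickled by reference (module name plus function
# name), so its pickle stays tiny regardless of how large yo is.
wrap_size = len(pickle.dumps(double_wrap))

# The bound method drags the whole instance into the pickle.
method_size = len(pickle.dumps(yo.double))

print("double_wrap:", wrap_size, "bytes")
print("yo.double:  ", method_size, "bytes")
```

Under this reading, the wrapper is fast because only its name crosses the process boundary. The worker then resolves yo on its own side: with the fork start method the global is inherited from the parent, while with spawn the module would be re-imported and yo rebuilt once per worker.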