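While working with the OpenStack Swift client library, I ran into a problem with Python generators.
I am trying to retrieve a large amount of data (about 7MB) from a specific URL, split the string into smaller chunks, and return a generator that yields one chunk of the string per iteration. In the test suite the data is just a string, which is handed to a monkeypatched class of the swift client for processing.
The monkeypatching is done with the following metaclass: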
    def monkeypatch_class(name, bases, namespace):
        '''Guido's monkeypatch metaclass.'''
        assert len(bases) == 1, "Exactly one base class required"
        base = bases[0]
        # Copy every attribute of the patch class onto the existing base class.
        for name, value in namespace.iteritems():
            if name != "__metaclass__":
                setattr(base, name, value)
        return base
In the test suite, the patched class looks like this:
    from swiftclient import client
    import StringIO
    import utils


    class Connection(client.Connection):
        __metaclass__ = monkeypatch_class

        def get_object(self, path, obj, resp_chunk_size=None, ...):
            contents = None
            headers = {}
            # retrieve content from path and store it in 'contents'
            ...
            if resp_chunk_size is not None:
                # stream the string into chunks
                def _object_body():
                    stream = StringIO.StringIO(contents)
                    buf = stream.read(resp_chunk_size)
                    while buf:
                        yield buf
                        buf = stream.read(resp_chunk_size)
                contents = _object_body()
            return headers, contents
Once the generator object is returned, it is passed back through the stream function in the storage class:
    class SwiftStorage(Storage):

        def get_content(self, path, chunk_size=None):
            path = self._init_path(path)
            try:
                _, obj = self._connection.get_object(
                    self._container,
                    path,
                    resp_chunk_size=chunk_size)
                return obj
            except Exception:
                raise IOError("Could not get content: {}".format(path))

        def stream_read(self, path):
            try:
                return self.get_content(path, chunk_size=self.buffer_size)
            except Exception:
                raise OSError(
                    "Could not read content from stream: {}".format(path))
Finally, in my test suite:
    def test_stream(self):
        filename = self.gen_random_string()
        # test 7MB
        content = self.gen_random_string(7 * 1024 * 1024)
        # `io` is not defined in the snippet as posted; presumably a
        # file-like object wrapping `content`.
        self._storage.stream_write(filename, io)
        io.close()
        # test read / write
        data = ''
        for buf in self._storage.stream_read(filename):
            data += buf
        self.assertEqual(content,
                         data,
                         "stream read failed. output: {}".format(data))
The test output ends with this:
    ======================================================================
    FAIL: test_stream (test_swift_storage.TestSwiftStorage)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/bacongobbler/git/github.com/bacongobbler/docker-registry/test/test_local_storage.py", line 46, in test_stream
        "stream read failed. output: {}".format(data))
    AssertionError: stream read failed. output:
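The empty output with no exception suggests the `for` loop in the test ran zero times. As a rough sketch of two ways a chunking generator can come out empty, neither of which I have confirmed as the cause here: the source string is empty, or the generator was already consumed once. `chunked` is a hypothetical stand-in for `_object_body`, not code from the swift client, and it uses `io.BytesIO` so that it runs on both Python 2 and Python 3.

```python
from io import BytesIO


def chunked(contents, chunk_size):
    """Hypothetical stand-in for _object_body: yield contents in pieces."""
    stream = BytesIO(contents)
    buf = stream.read(chunk_size)
    while buf:
        yield buf
        buf = stream.read(chunk_size)


# Case 1: nothing was stored, so the generator yields no chunks at all.
empty = b"".join(chunked(b"", 4))

# Case 2: a generator is single-pass; a second loop over it sees nothing.
gen = chunked(b"abcdefgh", 4)
first_pass = b"".join(gen)
second_pass = b"".join(gen)

print(repr(empty), repr(first_pass), repr(second_pass))
```

Checking what `get_object` actually holds in `contents` before it builds the generator would distinguish between these cases.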
I tried to isolate the problem with a simple Python script that follows the same flow as the code above, and it ran without any issues:
    def gen_num():
        def _object_body():
            for i in range(10000000):
                yield i
        return _object_body()


    def get_num():
        return gen_num()


    def stream_read():
        return get_num()


    def main():
        num = 0
        for i in stream_read():
            num += i
        print num


    if __name__ == '__main__':
        main()
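One difference between this script and the real code: the script never rebinds a name that the generator reads, while `get_object` does (`contents = _object_body()`). Below is a minimal sketch of that difference, offered as a possible lead and not a confirmed diagnosis. `make_body_rebound` and `make_body_bound_early` are hypothetical names, and the sketch yields the value directly where the real code wraps it in `StringIO`.

```python
def make_body_rebound(contents):
    """Mirrors get_object: the name the generator reads is rebound afterwards."""
    def _object_body():
        # Runs only on first iteration, after `contents` was rebound below.
        yield contents
    contents = _object_body()
    return contents


def make_body_bound_early(contents):
    """Hypothetical variant: capture the string before the name is rebound."""
    def _object_body(data=contents):
        yield data
    contents = _object_body()
    return contents


gen = make_body_rebound("payload")
seen_rebound = next(gen)  # the generator object itself, not "payload"
seen_early = next(make_body_bound_early("payload"))
print(seen_rebound is gen, seen_early)
```

If this applies to `get_object`, then `StringIO.StringIO(contents)` would receive the generator, not the downloaded string. I have not verified that this accounts for the empty output in the traceback.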
Any help with this issue is greatly appreciated :)