Author: 改改我的坏_155 | Source: Internet | 2022-12-06 14:43
I am having a problem running TensorFlow inference on a multi-GPU setup.
Environment: Python 3.6.4; TensorFlow 1.8.0; CentOS 7.3; 2 Nvidia Tesla P4
This is the nvidia-smi output when the system is idle:
Tue Aug 28 10:47:42 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.81                 Driver Version: 384.81                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P4            Off  | 00000000:00:0C.0 Off |                    0 |
| N/A   38C    P0    22W /  75W |      0MiB /  7606MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla P4            Off  | 00000000:00:0D.0 Off |                    0 |
| N/A   39C    P0    23W /  75W |      0MiB /  7606MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
The main statements relevant to my question:
import os

import tensorflow as tf

# Expose both physical GPUs to this process.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"


def get_sess_and_tensor(ckpt_path):
    assert os.path.exists(ckpt_path), "file: {} not exist.".format(ckpt_path)
    graph = tf.Graph()
    with graph.as_default():
        # Load the frozen detection graph from disk.
        od_graph_def = tf.GraphDef()
        with tf.gfile.GFile(ckpt_path, "rb") as fid1:
            od_graph_def.ParseFromString(fid1.read())
            tf.import_graph_def(od_graph_def, name="")
        sess = tf.Session(graph=graph)
    # Look up the input and output tensors under a GPU 1 device scope.
    with tf.device('/gpu:1'):
        tensor = graph.get_tensor_by_name("image_tensor:0")
        boxes = graph.get_tensor_by_name("detection_boxes:0")
        scores = graph.get_tensor_by_name("detection_scores:0")
        classes = graph.get_tensor_by_name('detection_classes:0')

    return sess, tensor, boxes, scores, classes
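So the problem is: when I set the visible devices to '0,1', even though I set tf.device to GPU 1, nvidia-smi shows that only GPU 0 is used during inference (GPU 0's GPU-Util is high, almost 100%, while GPU 1's is 0). Why is GPU 1 not used?
I want to use both GPUs in parallel, but even with the following code, only GPU 0 is used: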
with tf.device('/gpu:0'):
    tensor = graph.get_tensor_by_name("image_tensor:0")
    boxes = graph.get_tensor_by_name("detection_boxes:0")
with tf.device('/gpu:1'):
    scores = graph.get_tensor_by_name("detection_scores:0")
    classes = graph.get_tensor_by_name('detection_classes:0')
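For context on the goal of using both GPUs in parallel: a tf.device scope only affects ops created inside it, and graph.get_tensor_by_name merely looks up tensors that already exist, so the scopes above do not move any computation. One commonly used pattern is to run one worker process per GPU, each seeing a single device through CUDA_VISIBLE_DEVICES, with the input images split between the workers. The following is only a minimal sketch of that bookkeeping under those assumptions, using the standard library alone. The helper names env_for_gpu and shard_round_robin and the image file names are hypothetical and do not come from the code above, and the sketch does not launch TensorFlow itself.

```python
import os


def env_for_gpu(gpu_index, base_env=None):
    """Return a copy of the environment that exposes only one physical GPU.

    Hypothetical helper: the copy could be handed to a worker process so
    that the worker sees exactly one device.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


def shard_round_robin(items, num_shards):
    """Split items into num_shards lists, dealing them out in turn."""
    shards = [[] for _ in range(num_shards)]
    for i, item in enumerate(items):
        shards[i % num_shards].append(item)
    return shards


if __name__ == "__main__":
    # Placeholder file names, purely for illustration.
    images = ["img_{}.jpg".format(i) for i in range(5)]
    for gpu, shard in enumerate(shard_round_robin(images, 2)):
        env = env_for_gpu(gpu)
        # A real launcher would start a worker process with this environment
        # and this shard; here the plan is only printed.
        print(gpu, env["CUDA_VISIBLE_DEVICES"], shard)
```

Inside each such worker, the single visible GPU is addressed as '/gpu:0' regardless of its physical index, because CUDA renumbers the devices listed in CUDA_VISIBLE_DEVICES starting from zero.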
Any suggestions are greatly appreciated.
Thanks.
Wesley