TensorFlow: fixing "InvalidArgumentError: Graph execution error: 2 root error(s) found."

Colab pitfalls

阿正的梦工坊


阿正的梦工坊 · Published 2022-03-14 15:08:14

Contents

The problem
Solution
References

The problem

Platform: Google Colab
Runtime: GPU
While training a deep learning model, I ran into the following error:

---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-56-09c5afe5a363> in <module>()
----> 1 text_vectorization.adapt(text_only_train_ds)
      2 
      3 tfidf_2gram_train_ds = train_ds.map(
      4     lambda x, y: (text_vectorization(x), y),
      5     num_parallel_calls=4)

3 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     53     ctx.ensure_initialized()
     54     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 55                                         inputs, attrs, num_outputs)
     56   except core._NotOkStatusException as e:
     57     if name is not None:

InvalidArgumentError: Graph execution error:

2 root error(s) found.
  (0) INVALID_ARGUMENT:  During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string
	 [[{{node map/TensorArrayUnstack/TensorListFromTensor/_42}}]]
	 [[Func/map/while/body/_1/input/_48/_72]]
  (1) INVALID_ARGUMENT:  During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string
	 [[{{node map/TensorArrayUnstack/TensorListFromTensor/_42}}]]
0 successful operations.
0 derived errors ignored. [Op:__inference_adapt_step_194806]

The error above was produced by the following code:

text_vectorization.adapt(text_only_train_ds)

tfidf_2gram_train_ds = train_ds.map(
    lambda x, y: (text_vectorization(x), y),
    num_parallel_calls=4)
tfidf_2gram_val_ds = val_ds.map(
    lambda x, y: (text_vectorization(x), y),
    num_parallel_calls=4)
tfidf_2gram_test_ds = test_ds.map(
    lambda x, y: (text_vectorization(x), y),
    num_parallel_calls=4)

model = get_model()
model.summary()
callbacks = [
    keras.callbacks.ModelCheckpoint("tfidf_2gram.keras",
                                    save_best_only=True)
]
model.fit(tfidf_2gram_train_ds.cache(),
          validation_data=tfidf_2gram_val_ds.cache(),
          epochs=10,
          callbacks=callbacks)
model = keras.models.load_model("tfidf_2gram.keras")
print(f"Test acc: {model.evaluate(tfidf_2gram_test_ds)[1]:.3f}")

Solution

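My guess was that this needs to run on the CPU rather than the GPU. Sure enough, changing the Colab runtime type (hardware accelerator) to None made the error go away.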
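Switching the runtime type is a notebook-level setting. A rough in-code equivalent, not part of the original fix and shown here only as a sketch, is to hide the GPU from TensorFlow with `tf.config.set_visible_devices`. This assumes the call runs at the very start of the session, before any op has initialized the GPU; otherwise TensorFlow raises a RuntimeError.

```python
import tensorflow as tf

# Hide all GPUs from TensorFlow. This must run before any op has
# initialized the GPU (e.g. right after restarting the runtime).
tf.config.set_visible_devices([], "GPU")

# With no visible GPU, TensorFlow creates no logical GPU devices,
# so every op is placed on the CPU.
gpus = tf.config.list_logical_devices("GPU")
print("Logical GPUs:", gpus)
```

Like setting the runtime type to None, this runs everything on the CPU, including model training, so it trades speed for avoiding the error.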
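The reason is that the data preprocessing here runs on the CPU while the actual model computation runs on the GPU, so the dataset steps should be done on the CPU. The error message is consistent with this: it reports a failed host-to-device copy of a tensor of type string.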
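If the GPU should stay available for training, a commonly suggested alternative is to pin only the string preprocessing to the CPU with `tf.device`. The sketch below is an assumption rather than something verified in the original post: the tiny `text_only_ds` dataset is a hypothetical stand-in for `text_only_train_ds`, and the `TextVectorization` arguments are illustrative guesses at a TF-IDF bigram configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Hypothetical stand-in for text_only_train_ds: a small dataset of raw strings.
text_only_ds = tf.data.Dataset.from_tensor_slices(
    ["the cat sat on the mat", "the dog ate my homework", "cats and dogs"]
).batch(2)

# Illustrative configuration (assumed, not taken from the original post).
text_vectorization = layers.TextVectorization(
    ngrams=2, max_tokens=20000, output_mode="tf_idf")

# Ask TensorFlow to place the vocabulary-building step on the CPU,
# so string tensors are not copied to the GPU.
with tf.device("/CPU:0"):
    text_vectorization.adapt(text_only_ds)

vocab = text_vectorization.get_vocabulary()
print("Vocabulary size:", len(vocab))
```

The subsequent `train_ds.map(...)` calls run inside the tf.data pipeline, which executes on the CPU, so under this approach only the model itself would use the GPU.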

References

https://github.com/tensorflow/tensorflow/issues/28007
