python - Vertex AI - No module named 'google_cloud_pipeline_components.remote' on ModelDeployOp(...)

Question

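I created a simple pipeline that trains a model and deploys it to a Vertex AI endpoint. When I try to deploy the model with the google_cloud_pipeline_components.aiplatform.ModelDeployOp() component, it returns an error.

The documentation for google_cloud_pipeline_components.aiplatform has two entries for ModelDeployOp(). One explains how the original method is converted into a component, and the other documents how to use the ModelDeployOp() method.

The entry on how the method is converted shows the following: ...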

Generates and invokes the following Component:
name: Model-deploy
inputs:
- {name: project, type: String}
- {name: endpoint, type: Artifact}
- {name: model, type: Model}
outputs:
- {name: endpoint, type: Artifact}
implementation:
  container:
    image: gcr.io/sashaproject-1/mb_sdk_component:latest
    command:
    - python3
    - remote_runner.py
    - --cls_name=Model
    - --method_name=deploy
    - --method.deployed_model_display_name=my-deployed-model
    - --method.machine_type=n1-standard-4
    args:
    - --resource_name_output_artifact_path
    - {outputPath: endpoint}
    - --init.project
    - {inputValue: project}
    - --method.endpoint
    - {inputPath: endpoint}
    - --init.model_name
    - {inputPath: model}

Looking at the error returned in my GCP logs:

/usr/local/bin/python3: Error while finding module specification for 'google_cloud_pipeline_components.remote.aiplatform.remote_runner' (ModuleNotFoundError: No module named 'google_cloud_pipeline_components.remote')

This looks like a problem inside the container itself.
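One way to confirm that the module is missing from the image is to run a small check inside the container. The following is only a sketch: the helper name module_available is hypothetical, and it relies solely on the standard library's importlib.util.find_spec, which raises ModuleNotFoundError when a parent package of a dotted name cannot be imported.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be located by the import system.

    Hypothetical helper for illustration only. For dotted names,
    find_spec imports the parent package first and raises
    ModuleNotFoundError if that parent is missing, so we catch it.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        return False


if __name__ == "__main__":
    # Module path taken from the error message in the logs.
    target = "google_cloud_pipeline_components.remote.aiplatform.remote_runner"
    print(target, "available:", module_available(target))
```

If this prints False inside the image the component runs in, the package layout in that image does not match what the component's command expects.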

So... I guess my question is: am I right to assume this is a bug in the library? Is there a workaround?

Thanks in advance.

score 0 · Accepted Answer

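I just ran into a similar issue with Kubeflow Pipelines (same error message, different container). This is the danger of using the :latest tag: what worked a few days ago does not work today. In my case, I fixed it by changing gcr.io/ml-pipeline/google-cloud-pipeline-components:latest to the previous version, gcr.io/ml-pipeline/google-cloud-pipeline-components:0.1.7. The latest tag currently points to the recently deployed 0.1.8, which appears to be missing library dependencies and gives the same "remote module not found" error.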
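As a rough illustration of the pinning step, the sketch below rewrites a :latest reference in a component spec's text to a fixed version. The function name pin_image_tag is hypothetical and not part of kfp or google_cloud_pipeline_components; it only does a text substitution, and it assumes you have the component YAML available as a string.

```python
import re


def pin_image_tag(component_text: str, image: str, version: str) -> str:
    """Replace `<image>:latest` with `<image>:<version>` in spec text.

    Hypothetical helper for illustration; it performs a plain text
    substitution and does not parse or validate the YAML.
    """
    # Match the exact image name followed by ":latest", but not a
    # longer tag such as ":latest-gpu".
    pattern = re.compile(re.escape(image) + r":latest(?![\w.\-])")
    return pattern.sub(lambda _m: f"{image}:{version}", component_text)


spec = "image: gcr.io/ml-pipeline/google-cloud-pipeline-components:latest\n"
pinned = pin_image_tag(
    spec,
    image="gcr.io/ml-pipeline/google-cloud-pipeline-components",
    version="0.1.7",  # the version the answer reports as working
)
print(pinned)
```

Pinning to an explicit version keeps the pipeline reproducible, since a later push to :latest can no longer change the image the component runs.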
