r - Question about parallelism in the h2o.grid() function

Question

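I am trying to do some tuning in R using the h2o.grid() function from the h2o package. When I set the parallelism parameter to a value larger than 1, it always shows this warning:

Some models were not built due to a failure, for more details run `summary(grid_object, show_stack_traces = TRUE)`

The model_ids in the final grid object also include many models ending in _cv_1, _cv_2, etc., and the number of models does not equal the max_models setting in my search_criteria list. I think these are just the models from the CV process, not the final models.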

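[Screenshot: grid result with parallelism set larger than 1]

When I leave parallelism at its default or set it to 1, the result is normal: all the models end with _model_1, _model_2, etc.

[Screenshot: grid result with parallelism left at its default or set to 1]

Here is my code: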

# set the grid
rf_h2o_grid <- list(mtries = seq(3, ncol(train_h2o), 4),
                    max_depth = c(5, 10, 15, 20))

# set the search_criteria
sc <- list(strategy = "RandomDiscrete", 
           seed = 100,
           max_models = 5
           )

# random grid tuning
rf_h2o_grid_tune_random <- h2o.grid(
  algorithm = "randomForest", 
  x = x, 
  y = y,
  training_frame = train_h2o,
  nfolds = 5,                     # use cv to validate the parameters
  fold_assignment = "Stratified",   
  ntrees = 100,
  seed = 100,
  hyper_params = rf_h2o_grid,
  search_criteria = sc
  # parallelism = 6           # when I set it larger than 1, the result always includes some "cv_" models
  )

So how do I use parallelism correctly in h2o.grid()? Thanks for your help!

score 0 · Accepted Answer

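This is an actual bug with parallelism in grid search, previously noticed but not properly reported. Thanks for raising it, we will fix it soon: see https://h2oai.atlassian.net/browse/PUBDEV-7886 if you want to track progress.

Until it is properly fixed, you must avoid using both CV and parallelism in your grid.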
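One way to apply this workaround is to decide up front which of the two features a grid run uses. The sketch below is only an illustration: `grid_args_without_conflict()` is a hypothetical helper (not part of the h2o package), and it assumes that `nfolds` and `fold_assignment` are the only CV-related arguments in play. It either keeps cross-validation and forces `parallelism = 1`, or keeps the requested parallelism and drops the CV arguments.

```r
# Hypothetical helper, not part of the h2o package: returns a copy of `args`
# in which cross-validation and parallelism are never enabled together.
grid_args_without_conflict <- function(args, keep = c("cv", "parallelism")) {
  keep <- match.arg(keep)
  if (keep == "cv") {
    # Keep nfolds as given and build the models one at a time.
    args$parallelism <- 1
  } else {
    # Keep parallelism as given and switch cross-validation off.
    args$nfolds <- NULL
    args$fold_assignment <- NULL
  }
  args
}

# Arguments mirroring the question (the data-related ones are omitted here).
requested <- list(algorithm = "randomForest", nfolds = 5,
                  fold_assignment = "Stratified", ntrees = 100,
                  seed = 100, parallelism = 6)

cv_run  <- grid_args_without_conflict(requested, keep = "cv")
par_run <- grid_args_without_conflict(requested, keep = "parallelism")
```

With a running cluster, the chosen list could then be combined with the remaining arguments from the question, for example `do.call(h2o.grid, c(cv_run, list(x = x, y = y, training_frame = train_h2o, hyper_params = rf_h2o_grid, search_criteria = sc)))`. If cross-validation is dropped, a separate `validation_frame` would presumably be needed to compare the models.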

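Regarding the following error:

Some models were not built due to a failure, for more details run `summary(grid_object, show_stack_traces = TRUE)`

If the error is reproducible, you should get more details by running your grid with verbose=True. Adding the entire error message to the ticket above might also help.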
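To see what a grid actually contains, it can help to separate the cross-validation fold models from the final ones. This is a minimal sketch in base R, assuming (as described in the question) that fold models carry a `_cv_<n>` suffix. `split_model_ids()` and the sample IDs are made up for illustration.

```r
# Hypothetical helper: split model IDs into final models and CV fold models,
# assuming fold models end in "_cv_<number>" as reported in the question.
split_model_ids <- function(ids) {
  ids <- unlist(ids)                   # the model_ids slot of a grid is a list
  is_cv <- grepl("_cv_[0-9]+$", ids)   # TRUE for IDs with a fold suffix
  list(final = ids[!is_cv], cv = ids[is_cv])
}

# Made-up IDs that only imitate the naming pattern from the question.
example_ids <- list("rf_grid_model_1", "rf_grid_model_2",
                    "rf_grid_model_1_cv_1", "rf_grid_model_1_cv_2")

parts <- split_model_ids(example_ids)
parts$final   # IDs without a fold suffix
parts$cv      # IDs of the fold models
```

On a real grid object this would be called as `split_model_ids(rf_h2o_grid_tune_random@model_ids)`, and `summary(rf_h2o_grid_tune_random, show_stack_traces = TRUE)` prints the failure details that the warning refers to.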

score 0 · Accepted Answer

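This is because you set max_models = 5, so your grid will only build 5 models and then stop.

There are three ways to set the early stopping criteria:

"max_models": the maximum number of models to create
"max_runtime_secs": the maximum running time, in seconds
metric-based early stopping, by setting "stopping_rounds", "stopping_metric", and "stopping_tolerance"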
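The three criteria above could be written as `search_criteria` lists along the following lines. The numeric values and the `sc_*` names are illustrative assumptions, not recommendations; only the parameter names come from the answer and the question's code.

```r
# 1. Stop after a fixed number of models (the setting used in the question).
sc_max_models <- list(strategy = "RandomDiscrete",
                      seed = 100,
                      max_models = 5)

# 2. Stop after a time budget, given in seconds (600 is an arbitrary value).
sc_max_runtime <- list(strategy = "RandomDiscrete",
                       seed = 100,
                       max_runtime_secs = 600)

# 3. Metric-based early stopping (all three values are arbitrary examples).
sc_metric <- list(strategy = "RandomDiscrete",
                  seed = 100,
                  stopping_rounds = 5,
                  stopping_metric = "AUTO",
                  stopping_tolerance = 1e-3)
```

Any one of these lists would be passed as `search_criteria = ...` in the `h2o.grid()` call.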
