- OS platform and distribution: Linux Ubuntu 16.04
- Ray installed from (source or binary): binary
- Ray version: 0.6.5
- Python version: 3.6
I am trying to use Ray with TensorFlow by following the tutorial (link), and I get a TuneError:
Error log
Result logdir: ray_results/tune_gan_test
Number of trials: 2 ({'ERROR': 2})
ERROR trials:
- train_gan_0_partition=0: ERROR, 1 failures: ray_results/tune_gan_test/train_gan_0_partition=0_2019-04-05_16-25-5536of9abi/error_2019-04-05_16-26-02.txt
- train_gan_1_partition=1: ERROR, 1 failures: ray_results/tune_gan_test/train_gan_1_partition=1_2019-04-05_16-26-1038hprt_a/error_2019-04-05_16-26-12.txt
== Status ==
Using FIFO scheduling algorithm.
Resources requested: 0/16 CPUs, 0/1 GPUs
Memory usage on this node: 53.0/67.5 GB
Result logdir: ray_results/tune_gan_test
Number of trials: 2 ({'ERROR': 2})
ERROR trials:
- train_gan_0_partition=0: ERROR, 1 failures: ray_results/tune_gan_test/train_gan_0_partition=0_2019-04-05_16-25-5536of9abi/error_2019-04-05_16-26-02.txt
- train_gan_1_partition=1: ERROR, 1 failures: ray_results/tune_gan_test/train_gan_1_partition=1_2019-04-05_16-26-1038hprt_a/error_2019-04-05_16-26-12.txt
Traceback (most recent call last):
  File "train.py", line 142, in <module>
    **gan_spec)
  File "/lib/python3.6/site-packages/ray/tune/tune.py", line 253, in run
    raise TuneError("Trials did not complete", errored_trials)
ray.tune.error.TuneError: ('Trials did not complete', [train_gan_0_partition=0, train_gan_1_partition=1])
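The TuneError above only says that the trials did not complete; the exception raised inside each trial should be in the `error_*.txt` files listed under "ERROR trials". A minimal sketch for printing them, assuming the results directory is `ray_results/tune_gan_test` relative to the working directory, as shown in the log. The helper name `collect_trial_errors` is hypothetical and not part of Ray:

```python
import glob
import os


def collect_trial_errors(logdir):
    """Return {error_file_path: contents} for every error_*.txt one level below logdir."""
    # Each trial gets its own subdirectory; failed trials contain error_<timestamp>.txt.
    pattern = os.path.join(logdir, "*", "error_*.txt")
    errors = {}
    for path in sorted(glob.glob(pattern)):
        with open(path, "r") as f:
            errors[path] = f.read()
    return errors


if __name__ == "__main__":
    # Assumed location: the "Result logdir" printed in the log above.
    for path, text in collect_trial_errors("ray_results/tune_gan_test").items():
        print("==== {} ====".format(path))
        print(text)
```

The contents of these files would likely be more useful for diagnosing the failure than the summary traceback.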
Source code / logs
The Ray-related code:
# !!! Entrypoint for ray.tune !!!
def train(config={'partition': 0}, reporter=None):
    global status_reporter, partition_fn
    status_reporter = reporter
    partition_fn = config['partition']
    tf.app.run(main=main)

# !!! Example of using the ray.tune Python API !!!
if __name__ == "__main__":
    try:
        register_trainable('train_gan', train)
        gan_spec = {
            'stop': {
                'time_total_s': 600,
            },
            'config': {
                'partition': grid_search([0, 1]),
            },
        }
        ray.init()
        tune.run('train_gan',
                 name='tune_gan_test',
                 resources_per_trial={"gpu": 1},
                 raise_on_failed_trial=True,
                 queue_trials=True,
                 with_server=False,
                 **gan_spec)
    except KeyboardInterrupt:
        os._exit(1)
How can I fix this? Thanks for your help :)