I am trying to test a Storm + Kafka + Trident job on a multi-node Storm cluster.

When I run the job on machine 1 alone, it runs and processes records. After I add a second worker and run it again, it also runs without any problems.

The problem starts when I add a third worker to the cluster. I get the following in the worker logs:

2014-07-16 16:47:56 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-cassandra1/10.201.221.139:6701... [29]
2014-07-16 16:47:56 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-cassandra1/10.201.221.139:6703... [30]
2014-07-16 16:47:57 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-cassandra1/10.201.221.139:6702... [30]
2014-07-16 16:47:57 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-cassandra1/10.201.221.139:6700... [29]
2014-07-16 16:47:57 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-cassandra1/10.201.221.139:6701... [30]
2014-07-16 16:47:57 b.s.m.n.Client [INFO] Closing Netty Client Netty-Client-cassandra1/10.201.221.139:6703
2014-07-16 16:47:57 b.s.m.n.Client [INFO] Waiting for pending batchs to be sent with Netty-Client-cassandra1/10.201.221.139:6703..., timeout: 600000ms, pendings: 0
2014-07-16 16:47:58 b.s.m.n.Client [INFO] Closing Netty Client Netty-Client-cassandra1/10.201.221.139:6702
2014-07-16 16:47:58 b.s.m.n.Client [INFO] Waiting for pending batchs to be sent with Netty-Client-cassandra1/10.201.221.139:6702..., timeout: 600000ms, pendings: 0
2014-07-16 16:47:58 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-cassandra1/10.201.221.139:6700... [30]
2014-07-16 16:48:31 s.k.KafkaUtils [INFO] Metrics Tick: Not enough data to calculate spout lag.
2014-07-16 16:48:34 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-172.144.96.66.static.eigbox.net/66.96.144.172:6701... [6]
2014-07-16 16:48:34 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-172.144.96.66.static.eigbox.net/66.96.144.172:6703... [6]

In the supervisor log, I get the following messages:

2014-07-16 16:47:26 b.s.d.supervisor [INFO] 1fdb9a02-1110-458c-b72e-91950fbbc5fd still hasn't started
2014-07-16 16:47:27 b.s.d.supervisor [INFO] 1fdb9a02-1110-458c-b72e-91950fbbc5fd still hasn't started
2014-07-16 16:47:27 b.s.d.supervisor [INFO] 1fdb9a02-1110-458c-b72e-91950fbbc5fd still hasn't started
2014-07-16 16:47:28 b.s.d.supervisor [INFO] 1fdb9a02-1110-458c-b72e-91950fbbc5fd still hasn't started
2014-07-16 16:47:28 b.s.d.supervisor [INFO] 1fdb9a02-1110-458c-b72e-91950fbbc5fd still hasn't started
2014-07-16 16:47:29 b.s.d.supervisor [INFO] 1fdb9a02-1110-458c-b72e-91950fbbc5fd still hasn't started
2014-07-16 16:47:29 b.s.d.supervisor [INFO] 1fdb9a02-1110-458c-b72e-91950fbbc5fd still hasn't started
2014-07-16 16:47:30 b.s.d.supervisor [INFO] 1fdb9a02-1110-458c-b72e-91950fbbc5fd still hasn't started

The job does not run at all. My storm.yaml configuration looks like this:

storm.zookeeper.servers:
- "10.201.32.79"
# 
nimbus.host: "10.201.32.79"
storm.local.dir: "/home/hadoop/stormtmp"
java.library.path: "/opt/java7/lib"
#supervisor.slots.ports:
#    - 6700
#    - 6701
#    - 6702
#    - 6703
worker.childopts: "-Xmx2048m -XX:NewSize=1000m -XX:MaxNewSize=1000m"
nimbus.childopts: "-Xmx1024m -Djava.net.preferIPv4Stack=true"
supervisor.childopts: "-Xmx1024m -Djava.net.preferIPv4Stack=true"
ui.port: 8084
ui.childopts: "-Xmx1024m -Djava.net.preferIPv4Stack=true"

1 Answer

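This basically means the supervisor is unable to start the worker. Look in the supervisor log for a line like:
b.s.d.supervisor [INFO] Launching worker with command: java -server .....
Then copy that command and run it by hand on the supervisor machine to see whether it produces any errors. If it does, you may need to adjust your storm.yaml accordingly.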
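
If the supervisor log is long, a small script can pull out the launch commands for you. This is only a rough sketch: the function name `extract_launch_commands` is made up for illustration, and it assumes the log line has exactly the "Launching worker with command: " wording quoted above, which may differ between Storm versions.

```python
# Marker text taken from the supervisor log line quoted in this answer.
# Assumption: your Storm version logs the command with this exact wording.
LAUNCH_MARKER = "Launching worker with command: "


def extract_launch_commands(log_lines):
    """Return every worker launch command found in the given log lines.

    `log_lines` can be any iterable of strings, for example an open file.
    """
    commands = []
    for line in log_lines:
        idx = line.find(LAUNCH_MARKER)
        if idx != -1:
            # Keep everything after the marker; that is the java command.
            commands.append(line[idx + len(LAUNCH_MARKER):].strip())
    return commands
```

You could call it with `extract_launch_commands(open(path))`, where `path` points at wherever your installation writes the supervisor log, and then paste one of the returned commands into a shell on that supervisor machine.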

Answered 2014-07-17T10:51:16.193