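I am trying to test a Storm + Kafka + Trident job on a multi-node Storm cluster.
When I run the job on a single machine, it runs and processes records. After adding a second worker, the job also runs without any problems.
The problem starts when I add a third worker to the cluster. I get the following in the worker log: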
2014-07-16 16:47:56 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-cassandra1/10.201.221.139:6701... [29]
2014-07-16 16:47:56 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-cassandra1/10.201.221.139:6703... [30]
2014-07-16 16:47:57 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-cassandra1/10.201.221.139:6702... [30]
2014-07-16 16:47:57 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-cassandra1/10.201.221.139:6700... [29]
2014-07-16 16:47:57 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-cassandra1/10.201.221.139:6701... [30]
2014-07-16 16:47:57 b.s.m.n.Client [INFO] Closing Netty Client Netty-Client-cassandra1/10.201.221.139:6703
2014-07-16 16:47:57 b.s.m.n.Client [INFO] Waiting for pending batchs to be sent with Netty-Client-cassandra1/10.201.221.139:6703..., timeout: 600000ms, pendings: 0
2014-07-16 16:47:58 b.s.m.n.Client [INFO] Closing Netty Client Netty-Client-cassandra1/10.201.221.139:6702
2014-07-16 16:47:58 b.s.m.n.Client [INFO] Waiting for pending batchs to be sent with Netty-Client-cassandra1/10.201.221.139:6702..., timeout: 600000ms, pendings: 0
2014-07-16 16:47:58 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-cassandra1/10.201.221.139:6700... [30]
2014-07-16 16:48:31 s.k.KafkaUtils [INFO] Metrics Tick: Not enough data to calculate spout lag.
2014-07-16 16:48:34 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-172.144.96.66.static.eigbox.net/66.96.144.172:6701... [6]
2014-07-16 16:48:34 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-172.144.96.66.static.eigbox.net/66.96.144.172:6703... [6]
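To see at a glance which endpoints the Netty client keeps retrying, the worker log can be condensed with a small script. This is only a sketch: the regular expression is written against the line format shown above, and the function name `summarize_reconnects` and the `SAMPLE` list are made up for illustration.

```python
import re
from collections import defaultdict

# Written against lines of this shape (taken from the log above):
# ... b.s.m.n.Client [INFO] Reconnect started for Netty-Client-cassandra1/10.201.221.139:6701... [29]
RECONNECT = re.compile(
    r"Reconnect started for Netty-Client-"
    r"(?P<host>[^/\s]+)/(?P<ip>[\d.]+):(?P<port>\d+)\.\.\. \[(?P<attempt>\d+)\]"
)


def summarize_reconnects(lines):
    """Return {(host, ip, port): highest reconnect attempt number seen}."""
    summary = defaultdict(int)
    for line in lines:
        match = RECONNECT.search(line)
        if match:
            key = (match["host"], match["ip"], int(match["port"]))
            summary[key] = max(summary[key], int(match["attempt"]))
    return dict(summary)


# Hypothetical sample: three lines copied from the worker log above.
SAMPLE = [
    "2014-07-16 16:47:56 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-cassandra1/10.201.221.139:6701... [29]",
    "2014-07-16 16:47:57 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-cassandra1/10.201.221.139:6701... [30]",
    "2014-07-16 16:48:34 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-172.144.96.66.static.eigbox.net/66.96.144.172:6701... [6]",
]

if __name__ == "__main__":
    for (host, ip, port), attempts in sorted(summarize_reconnects(SAMPLE).items()):
        print(f"{host} ({ip}) port {port}: up to attempt {attempts}")
```

Grouping by host name and IP makes it easier to notice when a worker is being addressed under a name or address that was not expected for that node.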
In the supervisor log, I get the following messages:
2014-07-16 16:47:26 b.s.d.supervisor [INFO] 1fdb9a02-1110-458c-b72e-91950fbbc5fd still hasn't started
2014-07-16 16:47:27 b.s.d.supervisor [INFO] 1fdb9a02-1110-458c-b72e-91950fbbc5fd still hasn't started
2014-07-16 16:47:27 b.s.d.supervisor [INFO] 1fdb9a02-1110-458c-b72e-91950fbbc5fd still hasn't started
2014-07-16 16:47:28 b.s.d.supervisor [INFO] 1fdb9a02-1110-458c-b72e-91950fbbc5fd still hasn't started
2014-07-16 16:47:28 b.s.d.supervisor [INFO] 1fdb9a02-1110-458c-b72e-91950fbbc5fd still hasn't started
2014-07-16 16:47:29 b.s.d.supervisor [INFO] 1fdb9a02-1110-458c-b72e-91950fbbc5fd still hasn't started
2014-07-16 16:47:29 b.s.d.supervisor [INFO] 1fdb9a02-1110-458c-b72e-91950fbbc5fd still hasn't started
2014-07-16 16:47:30 b.s.d.supervisor [INFO] 1fdb9a02-1110-458c-b72e-91950fbbc5fd still hasn't started
The job does not run at all. My storm.yaml configuration is as follows:
storm.zookeeper.servers:
- "10.201.32.79"
#
nimbus.host: "10.201.32.79"
storm.local.dir: "/home/hadoop/stormtmp"
java.library.path: "/opt/java7/lib"
#supervisor.slots.ports:
# - 6700
# - 6701
# - 6702
# - 6703
worker.childopts: "-Xmx2048m -XX:NewSize=1000m -XX:MaxNewSize=1000m"
nimbus.childopts: "-Xmx1024m -Djava.net.preferIPv4Stack=true"
supervisor.childopts: "-Xmx1024m -Djava.net.preferIPv4Stack=true"
ui.port: 8084
ui.childopts: "-Xmx1024m -Djava.net.preferIPv4Stack=true"
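
One check that may help narrow this down is whether the worker ports that appear in the worker log (6700 to 6703) are reachable from the other nodes. The following is a minimal sketch using only the Python standard library; the helper names `can_connect` and `check_slots` are hypothetical, and the host passed in would be whichever node the reconnect messages point at.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_slots(host, ports=(6700, 6701, 6702, 6703)):
    """Return {port: reachable?} for the worker ports seen in the log."""
    return {port: can_connect(host, port) for port in ports}


if __name__ == "__main__":
    # "cassandra1" is the host name taken from the worker log above.
    for port, ok in check_slots("cassandra1").items():
        print(port, "reachable" if ok else "NOT reachable")
```

A port that is not reachable only shows that nothing accepted the connection from this machine; it does not by itself say whether the cause is a worker that never started, a firewall, or a host name resolving to an unexpected address.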