
I am working with Kubernetes 1.3.6, with this part in the deployment file of my application:

    livenessProbe:
      httpGet:
        path: /liveness
        port: 8082
      initialDelaySeconds: 120

When I describe the pod, I get this:

    Liveness: http-get http://:8082/liveness delay=120s timeout=1s period=10s #success=1 #failure=3

My application usually starts in 110-115 seconds, but sometimes it takes longer (due to DB delays, retries against external services, etc.).

The problem I see is that when startup takes more than 130-140 seconds (initialDelaySeconds + period), Kubernetes forces a shutdown and the pod restarts from scratch. With a lot of replicas (50-60), this means a full deployment sometimes takes 10-15 minutes longer than normal. An obvious solution is to increase initialDelaySeconds, but then every deployment will take a lot more time.

I had a look here and there's nothing that seems to solve this problem: http://kubernetes.io/docs/api-reference/v1/definitions/#_v1_probe

Ideally I would like something that works the opposite way: not an "initialDelaySeconds", but a maximum amount of time allowed for the pod to start. If that time passes, Kubernetes forces the pod to shut down and tries again.


2 Answers


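I finally found a good solution, and so far it works well!

I set:

  • readinessProbe.initialDelaySeconds: equal to the minimum startup time of the application
  • livenessProbe.initialDelaySeconds: equal to the maximum startup time of the application plus a few seconds

This way Kubernetes (after readinessProbe.initialDelaySeconds) starts checking the readiness probe so that it can add the pod to the load balancing. Then (after livenessProbe.initialDelaySeconds) it also starts checking the liveness probe, in case the pod needs to be restarted.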
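
As a rough sketch of that setup, using the port and liveness path from the question: the `/readiness` path and the value 180 are assumptions for illustration only (the question does not give a readiness endpoint or a maximum startup time), so substitute your own.

    readinessProbe:
      httpGet:
        path: /readiness          # hypothetical endpoint; /liveness could be reused instead
        port: 8082
      initialDelaySeconds: 110    # roughly the minimum startup time from the question
    livenessProbe:
      httpGet:
        path: /liveness
        port: 8082
      initialDelaySeconds: 180    # assumed maximum startup time plus a few seconds

With values like these, a pod that starts quickly can receive traffic from about 110 seconds on, while a slow one is not restarted by the liveness probe until well after the longest expected startup.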

Answered 2016-09-09T16:04:36.220

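Well, the time you are talking about does seem to exist; it is just not stated explicitly.

The formula for the time you are looking for is

initialDelaySeconds + periodSeconds * (failureThreshold - 1)

(the -1 is because a probe is executed right after initialDelaySeconds). You can tune maximumAmountOfTime (the parameter you want) by changing these 3 values.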
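
To make the arithmetic concrete, here is the formula applied to the values from the question's describe output (delay=120s, period=10s, #failure=3). The times in the comments are approximate and assume the first probe fires right at initialDelaySeconds, as this answer does.

    livenessProbe:
      httpGet:
        path: /liveness
        port: 8082
      initialDelaySeconds: 120    # first probe at about t = 120 s
      periodSeconds: 10           # next probes at about t = 130 s and t = 140 s
      failureThreshold: 3         # third consecutive failure at 120 + 10 * (3 - 1) = 140 s

That 140-second figure is in line with the 130-140 seconds the question reports before the pod is restarted.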

Edit: after the OP's comment, the answer above is wrong; it seems that increasing initialDelaySeconds is the only thing you can do for now.

Answered 2016-09-06T08:02:34.863