Scale down a deployment by removing specific pods (PodDeletionCost) #2255

Open
8 tasks done
ahg-g opened this issue Jan 12, 2021 · 63 comments
Labels
sig/apps Categorizes an issue or PR as relevant to SIG Apps. stage/beta Denotes an issue tracking an enhancement targeted for Beta status

Comments

@ahg-g
Member

ahg-g commented Jan 12, 2021

Enhancement Description

Please keep this description up to date. This will help the Enhancement Team to track the evolution of the enhancement efficiently.

@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jan 12, 2021
@ahg-g
Member Author

ahg-g commented Jan 12, 2021

/sig apps

@k8s-ci-robot k8s-ci-robot added sig/apps Categorizes an issue or PR as relevant to SIG Apps. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jan 12, 2021
@annajung annajung added stage/alpha Denotes an issue tracking an enhancement targeted for Alpha status tracked/yes Denotes an enhancement issue is actively being tracked by the Release Team labels Jan 27, 2021
@annajung annajung added this to the v1.21 milestone Jan 27, 2021
@ahg-g
Member Author

ahg-g commented Feb 3, 2021

@annajung @JamesLaverack James, you mentioned in the sig-apps Slack channel that this enhancement is at risk; can you clarify why? It meets the criteria.

@JamesLaverack
Member

@ahg-g Just to follow up here too, we discussed in Slack and this was due to a delay in reviewing. We've now marked this as "Tracked" on the enhancements spreadsheet for 1.21.

Thank you for getting back to us. :)

@ahg-g ahg-g changed the title Scale down a deployment by removing specific pods Scale down a deployment by removing specific pods (PodDeletionCost) Feb 17, 2021
@JamesLaverack
Member

Hi @ahg-g,

Since your Enhancement is scheduled to be in 1.21, please keep in mind the important upcoming dates:

  • Tuesday, March 9th: Week 9 — Code Freeze
  • Tuesday, March 16th: Week 10 — Docs Placeholder PR deadline
    • If this enhancement requires new docs or modification to existing docs, please follow the steps in the Open a placeholder PR doc to open a PR against k/website repo.

As a reminder, please link all of your k/k PR(s) and k/website PR(s) to this issue so we can track them.

Thanks!

@ahg-g
Member Author

ahg-g commented Feb 26, 2021

done.

@JamesLaverack
Member

Hi @ahg-g

The Enhancements team is currently tracking the following PRs.

As this PR is merged, can we mark this enhancement as complete for code freeze, or do you have other PR(s) being worked on as part of the release?

@ahg-g
Member Author

ahg-g commented Mar 2, 2021

Hi @JamesLaverack, yes, the k/k code is merged; the docs PR is still open though.

@JamesLaverack JamesLaverack added tracked/no Denotes an enhancement issue is NOT actively being tracked by the Release Team and removed tracked/yes Denotes an enhancement issue is actively being tracked by the Release Team labels Apr 25, 2021
@ahg-g
Member Author

ahg-g commented May 5, 2021

/stage beta

@k8s-ci-robot k8s-ci-robot added stage/beta Denotes an issue tracking an enhancement targeted for Beta status and removed stage/alpha Denotes an issue tracking an enhancement targeted for Alpha status labels May 5, 2021
@ahg-g
Member Author

ahg-g commented May 5, 2021

/milestone v1.22

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 27, 2022
@thesuperzapper

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label May 30, 2022
@thesuperzapper

Hey all watching! After thinking more about how we can make pod-deletion-cost GA, I believe I have an idea that will address most of the annotation-related concerns of the current implementation (while still maintaining backward compatibility with annotations, if they are present).

I still need to write up a full proposal and KEP, but my initial thoughts can be found at:

The gist of the idea is that we can make pod-deletion-cost a more transient value (rather than only storing it in annotations) by extending the /apis/apps/v1/namespaces/{namespace}/deployments/{name}/scale API: when a caller sends a PATCH that reduces replicas, they can include the pod-deletion-cost of one or more Pods, and these costs would only affect the current down-scale (unlike the annotations, which must be manually cleared after scaling to remove their effect).
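
To make that concrete, here is a purely illustrative sketch of what such a PATCH body might look like. The podDeletionCosts field and the pod names are hypothetical; nothing like this exists in the current /scale subresource schema, it only shows the shape of the idea.

```python
# Purely illustrative sketch of the proposal above: "podDeletionCosts" is a
# hypothetical field, NOT part of the current Scale subresource, and the pod
# names are placeholders.
scale_patch = {
    "spec": {
        "replicas": 3,  # e.g. scaling the Deployment down from 5 to 3
        "podDeletionCosts": {  # hypothetical, applies only to this down-scale
            "worker-7d4f9c-abcde": -100,  # prefer deleting this pod
            "worker-7d4f9c-fghij": 500,   # keep this pod if possible
        },
    }
}
```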

@remiville

Many thanks for all your work and reflection on this subject. I'm strongly interested in the ability to choose which pods are evicted during scale-in, and I try to follow the corresponding discussions, feature developments, and proposals.

I searched for a long time for a way to achieve this correctly. I was happy with PodDeletionCost, but now I am a little disappointed, as it seems it will stay in beta (please do not remove this feature until an equivalent one is released).
To give my two cents, I will share my understanding of the issue and maybe, I hope, help solve it in a simple and broadly compatible manner (maybe I should post elsewhere; I'm not familiar with your processes).

My need (which may be different from yours) is to selectively evict or replace terminated pods, keeping a dynamic number of fresh pod replicas without terminating pods that are still running (I mean pods whose applications are currently processing something).
It is more or less a pool of pods with minimum and maximum replica counts, a current replica count varying with external demand, and a rule forbidding termination of any pod with activity inside.

I may be wrong, but I think the root cause of the problem is the incompatibility between automatic pod restarts and scale-in.
If the ReplicaSet automatically restarts terminated pods, it gives the application itself no chance to indicate which pod should be evicted during scale-in (I mean without using the API).

Without PodDeletionCost, one known workaround (sketched after this list) is to:

  • stop or delete the ReplicaSet
  • delete selected pods
  • decrease replica count accordingly
  • start or recreate the ReplicaSet.
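
Here is a rough sketch of that workaround using the official Python kubernetes client. The ReplicaSet name worker, the default namespace, and the pod names are assumptions for the example; error handling and waiting for the deletions to complete are omitted.

```python
# Rough sketch of the manual workaround above, using the official Python
# kubernetes client. The ReplicaSet name, namespace, and pod names are
# placeholders; error handling and waiting for deletions are omitted.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a cluster
apps = client.AppsV1Api()
core = client.CoreV1Api()

namespace, rs_name = "default", "worker"           # assumed names
pods_to_remove = ["worker-abcde", "worker-fghij"]  # pods chosen for eviction

# Snapshot the ReplicaSet spec so it can be recreated later.
rs = apps.read_namespaced_replica_set(rs_name, namespace)

# 1. Delete the ReplicaSet but orphan its pods so they keep running for now.
apps.delete_namespaced_replica_set(
    name=rs_name,
    namespace=namespace,
    body=client.V1DeleteOptions(propagation_policy="Orphan"),
)

# 2. Delete only the pods chosen for removal.
for pod_name in pods_to_remove:
    core.delete_namespaced_pod(pod_name, namespace)

# 3. Recreate the ReplicaSet with a correspondingly lower replica count; it
#    re-adopts the surviving orphaned pods that match its selector.
rs.metadata = client.V1ObjectMeta(name=rs_name, labels=rs.metadata.labels)
rs.spec.replicas = rs.spec.replicas - len(pods_to_remove)
rs.status = None
apps.create_namespaced_replica_set(namespace, rs)
```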

For me, this workaround illustrates the incompatibility between the ReplicaSet restart behavior and selecting which pods to evict during scale-in: currently the two cannot work when mixed together.

Also, I think we should avoid having any controller terminate a pod: the application inside the pod should terminate, causing its pod to terminate, and then a controller could evict only already-terminated pods.

Here is my proposal:

  • add an option to ReplicaSet, Deployment, etc. to not restart terminated pods (succeeded and/or failed).
    Currently the restart policy can only be Always.
  • during scale-in, prioritize terminated pods for eviction (maybe that's already the case?)

With these behaviors, scale-in would select pods to evict based on the termination status of the application inside each pod (here Succeeded or Failed) instead of external indicators (sketched below).
If a custom controller is used to maintain a dynamic number of replicas, it would be able to remove or replace terminated pods just by decreasing the replica count or deleting them.
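
To illustrate the shape of this proposal, here is a purely hypothetical Deployment manifest sketched as a Python dict: the restartTerminatedPods field does not exist in any Kubernetes API, and a Deployment's pod template is currently only allowed to use restartPolicy: Always.

```python
# Hypothetical illustration of the proposal above. "restartTerminatedPods" is NOT
# a real Kubernetes field, and Deployments currently reject any restartPolicy
# other than "Always"; this only sketches the desired behavior.
proposed_deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "worker-pool"},
    "spec": {
        "replicas": 5,
        "restartTerminatedPods": False,  # hypothetical: leave Succeeded/Failed pods un-replaced
        "selector": {"matchLabels": {"app": "worker"}},
        "template": {
            "metadata": {"labels": {"app": "worker"}},
            "spec": {
                # Currently rejected for Deployments; shown only to illustrate the proposal.
                "restartPolicy": "OnFailure",
                "containers": [{"name": "worker", "image": "example.com/worker:latest"}],
            },
        },
    },
}
```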

If this proposal is acceptable and can work, it may be achievable with minimal coding effort.

What do you think?

@remiville

Maybe my need is different because I need to automatically replace or delete terminated pods.
I have been able to select which pods to remove from the ReplicaSet by not terminating them but setting the pod-deletion-cost annotation instead; a custom controller then decreases the replica count or deletes pods accordingly.
As evoked in PROPOSAL configurable down-scaling behaviour in ReplicaSets & Deployments, something like a pod-deletion-cost probe would be better than the annotation, letting the application indicate by itself that it should be prioritized for deletion.

I think there are two cases to distinguish during scale-in: the ability to remove terminated pods from the ReplicaSet (without replacing them, which implies a ReplicaSet restartPolicy other than Always), and the ability to remove running pods (using the probe).

@rhockenbury

/milestone clear

@k8s-ci-robot k8s-ci-robot removed this from the v1.22 milestone Oct 1, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 30, 2022
@thockin thockin removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 14, 2023
@thockin
Member

thockin commented Jan 14, 2023

@ahg-g I'm not in love with annotations as APIs. Do we REALLY think this is the best answer?

@ahg-g
Member Author

ahg-g commented Jan 14, 2023

@ahg-g I'm not in love with annotations as APIs. Do we REALLY think this is the best answer?

I think we have a reasonable counter proposal in kubernetes/kubernetes#107598 (comment); can we hold this in its current beta state until that proposal makes progress?

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 14, 2023
@hoerup

hoerup commented Apr 14, 2023

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 14, 2023
@Atharva-Shinde Atharva-Shinde removed the tracked/no Denotes an enhancement issue is NOT actively being tracked by the Release Team label May 14, 2023
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 20, 2024
@thesuperzapper

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 20, 2024
@ddelange

Would the following be an acceptable design pattern? (a minimal sketch follows the list)

  • the Pod gets a sidecar with permission to set its own PodDeletionCost
  • the sidecar polls metrics-server for the current CPU usage of its own Pod
  • the sidecar sets its own PodDeletionCost to the number of millicores returned by metrics-server
    • the next time scale-in happens, the most idle Pods in the ReplicaSet get deleted, and the busy Pods can stay busy
  • the dev tunes the poll interval to the use case and the load it causes on the cluster
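
A minimal sketch of such a sidecar loop is shown below, assuming the official Python kubernetes client, that POD_NAME and POD_NAMESPACE are injected via the Downward API, and that metrics-server exposes the metrics.k8s.io API; the RBAC needed to patch the pod and read pod metrics is not shown.

```python
# Minimal sketch of the sidecar idea above. Assumes the Python kubernetes client,
# POD_NAME / POD_NAMESPACE injected via the Downward API, metrics-server installed,
# and RBAC that allows patching this pod and reading pod metrics (not shown).
import os
import time

from kubernetes import client, config

config.load_incluster_config()
core = client.CoreV1Api()
custom = client.CustomObjectsApi()

pod = os.environ["POD_NAME"]
namespace = os.environ["POD_NAMESPACE"]
POLL_INTERVAL_SECONDS = 60  # tune to the use case / acceptable cluster load


def cpu_millicores(quantity: str) -> int:
    """Convert a CPU quantity string from metrics-server (e.g. '250m', '1',
    '123456789n') into whole millicores."""
    if quantity.endswith("n"):
        return int(quantity[:-1]) // 1_000_000
    if quantity.endswith("u"):
        return int(quantity[:-1]) // 1_000
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


while True:
    # Read this pod's current usage from the metrics.k8s.io API.
    metrics = custom.get_namespaced_custom_object(
        group="metrics.k8s.io",
        version="v1beta1",
        namespace=namespace,
        plural="pods",
        name=pod,
    )
    total_millicores = sum(
        cpu_millicores(c["usage"]["cpu"]) for c in metrics["containers"]
    )

    # Write the usage back as this pod's deletion cost: busier pods get a higher
    # cost, so the ReplicaSet controller prefers deleting idle pods on scale-in.
    core.patch_namespaced_pod(
        name=pod,
        namespace=namespace,
        body={
            "metadata": {
                "annotations": {
                    "controller.kubernetes.io/pod-deletion-cost": str(total_millicores)
                }
            }
        },
    )
    time.sleep(POLL_INTERVAL_SECONDS)
```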

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 19, 2024
@thesuperzapper

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 19, 2024