DRA: control plane controller ("classic DRA") #3063

pohly · 2021-11-30T16:33:44Z

Enhancement Description

One-line enhancement description: dynamic resource allocation
Kubernetes Enhancement Proposal: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/3063-dynamic-resource-allocation
Discussion Link: CNCF TAG Runtime Container Device Interface (COD) Working Group meeting(s)
Primary contact (assignee): @pohly
Responsible SIGs: SIG Node
Enhancement target (which target equals to which milestone):
- Alpha release target (x.y): 1.27
- Beta release target (x.y): 1.29
- Stable release target (x.y): 1.31
Alpha (1.26)
- KEP (k/enhancements) update PR(s):
- Code (k/k) update PR(s): dynamic resource allocation kubernetes#111023
- Docs (k/website) update PR(s):
  - blog: add dynamic resource allocation feature blog post website#37434
  - dynamic resource allocation concepts website#37766
Alpha (1.27)
- KEP (k/enhancements) update PR(s):
  - KEP-3063 Update kep.yaml #3868
  - KEP 3063: Update to reflect changes merged into v1.27 #3802
- Code (k/k) update PR(s):
- Docs (k/website) update PR(s):
  - dynamic resource allocation: update for 1.27 website#40086
Alpha (1.28)
Alpha (1.29)
- KEP (k/enhancements) update PR(s):
  - DRA: handle non graceful node shutdowns #4260
  - KEP-3063: dynamic resource allocation: update kubelet plugin RPC versions supported. #4164
- Code (k/k) update PR(s):
- Docs (k/website) update PR(s):
  - Document impact of DRA on scheduling website#43907
Alpha (1.30)
- KEP (k/enhancements) update PR(s):
  - KEP-3063: DRA: 1.30 update #4181
- Code (k/k) update PR(s):
  - dra: reserve + publish during pod binding kubernetes#121876
  - DRA: resource quotas kubernetes#120611
- Docs (k/website) update PR(s):
  - Add Resource Drivers section to extending Kubernetes documentation website#43742
  - DRA in Kubernetes 1.30: adds structured parameters website#45287
Alpha (1.31)
- KEP (k/enhancements) update PR(s): KEP-4381: DRA update for 1.31 #4709
- Code (k/k) update PR(s): DRA for 1.31 kubernetes#125488
- Docs (k/website) update PR(s): DRA documentation for 1.31 website#46816
Withdrawn (1.32)
- KEP (k/enhancements) update PR(s): KEP-3063: Classic DRA: withdraw the KEP #4904
- Code (k/k) update PR(s): DRA: remove "classic DRA" kubernetes#128003
- Docs (k/website) update PR(s): DRA: remove "classic DRA" website#48289

The text was updated successfully, but these errors were encountered:

pohly · 2021-11-30T16:37:55Z

/assign @pohly
/sig node

ahg-g · 2021-12-20T21:23:56Z

do we have a discussion issue on this enhancement?

pohly · 2022-01-10T07:32:18Z

@ahg-g: with discussion issue you mean a separate issue in some repo (where?) in which arbitrary comments are welcome?

No, not at the moment. I've also not seen that done elsewhere before. IMHO at this point the open KEP PR is a good place to collect feedback and questions. I also intend to come to the next SIG-Scheduling meeting.

ahg-g · 2022-01-10T13:58:09Z

@ahg-g: with discussion issue you mean a separate issue in some repo (where?) in which arbitrary comments are welcome?

Yeah, this is what I was looking for, the issue would be under k/k repo.

No, not at the moment. I've also not seen that done elsewhere before.

That is actually the common practice, one starts a feature request issue where the community discusses initial ideas and the merits of the request (look for issues with label kind/feature). That is what I would expect in the discussion link.

IMHO at this point the open KEP PR is a good place to collect feedback and questions. I also intend to come to the next SIG-Scheduling meeting.

But the community have no idea what this is about yet, so better to have an issue discusses "What would you like to be added?" and "Why is this needed" beforehand. Also, meetings are attended by fairly small groups of contributors, having an issue tracking the discussion is important IMO.

pohly · 2022-01-10T15:31:51Z

In my work in SIG-Storage I've not seen much use of such a discussion issue. Instead I had the impression that the usage of "kind/feature" is discouraged nowadays.

https://github.com/kubernetes/kubernetes/issues/new?assignees=&labels=kind%2Ffeature&template=enhancement.yaml explicitly says

Feature requests are unlikely to make progress as issues. Please consider engaging with SIGs on slack and mailing lists, instead. A proposal that works through the design along with the implications of the change can be opened as a KEP.

This proposal was discussed with various people beforehand, now we are in the formal KEP phase. But I agree, it is hard to provide a good link to those prior discussions.

ahg-g · 2022-01-10T15:50:57Z

We use that in sig-scheduling, and it does serve as a very good place for initial rounds of discussions, discussions on slack and meetings are hard to reference as you pointed out.

I still have no idea what this is proposing, and I may not attend the next sig meeting for example...

gracenng · 2022-01-30T08:59:32Z

Hi @ ! 1.24 Enhancements team here.
Checking in as we approach enhancements freeze in less than a week on 18:00pm PT on Thursday Feb 3rd
Here’s where this enhancement currently stands:

Updated KEP file using the latest template has been merged into the k/enhancements repo. KEP-3063: dynamic resource allocation #3064
KEP status is marked as implementable for this release with latest-milestone: 1.24
KEP has a test plan section filled out.
KEP has up to date graduation criteria.
KEP has a production readiness review that has been completed and merged into k/enhancements.

The status of this enhancement is track as at risk.
Thanks!

gracenng · 2022-02-04T02:21:03Z

The Enhancements Freeze is now in effect and this enhancement is removed from the release.
Please feel free to file an exception.

/milestone clear

gracenng · 2022-03-01T20:58:10Z

Hi Patrick, `tracked/yes` will be applied when the KEP is merged and all requirements for Enhancement Freeze are met. We are definitely keeping an eye on this! Feel free to ping me once it's merged

…

On Tue, Mar 1, 2022 at 11:17 AM Patrick Ohly ***@***.***> wrote: @gracenng <https://github.com/gracenng> : an exception was requested and granted for this enhancement to move to GA in 1.24: https://groups.google.com/g/kubernetes-sig-release/c/sUpd2H1wxnk/m/lL_I6GT-BwAJ Can you label it again as "tracked/yes"? — Reply to this email directly, view it on GitHub <#3063 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AKCRLO46IYCBRTXXZE6EHGTU5ZUM3ANCNFSM5JCIMTQA> . You are receiving this because you were mentioned.Message ID: ***@***.***>

k8s-triage-robot · 2022-05-30T21:20:17Z

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle stale
Mark this issue or PR as rotten with /lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

kerthcet · 2022-05-31T01:18:01Z

/remove-lifecycle stale

marosset · 2022-06-10T19:57:03Z

Hello @pohly 👋, 1.25 Enhancements team here.

Just checking in as we approach enhancements freeze on 18:00 PST on Thursday June 16, 2022.

For note, This enhancement is targeting for stage alpha for 1.25 (correct me, if otherwise)

Here's where this enhancement currently stands:

KEP file using the latest template has been merged into the k/enhancements repo.
KEP status is marked as implementable
KEP has a updated detailed test plan section filled out
KEP has up to date graduation criteria
KEP has a production readiness review that has been completed and merged into k/enhancements.

It looks like #3064 will address everything in this list.

For note, the status of this enhancement is marked as at risk. Please keep the issue description up-to-date with appropriate stages as well. Thank you!

marosset · 2022-06-17T23:09:24Z

Hello @pohly 👋, just a quick check-in again, as we approach the 1.25 enhancements freeze.

Please plan to get #3064 reviewed and merged before enhancements freeze on Thursday, June 23, 2022 at 18:00 PM PT.

For note, the current status of the enhancement is atat-risk. Thank you!

Dynamic resource allocation rethinks how additional resources like accelerators be managed and requested in Kubernetes.

pohly · 2024-10-01T14:40:02Z

classic DRA has some limitations and we invested and decided to move with structured DRA, what is the point of exploring classic DRA?

Structured parameters has other limitations. That's why we are working on additional KEPs for it, and that is likely to continue for a while.

"Exploring" can be done on 1.31 or 1.30 or ... - why do we need to keep it in 1.32?

Perhaps because it's easier to install one version of Kubernetes and then try out different approaches? Just a thought.

can you confirm if this is slated for deprecation in 1.32?

I am not sure whether we have reached a consensus. Deadline for a decision is probably soon enough before KEP freeze so that we can still record a decision to remove it. If we keep it, no updates will be needed.

jenshu · 2024-10-01T15:55:59Z

@pohly ok thank you, I will mark this at risk for enhancement freeze for now, pending your decision.

Please keep in mind that the PRR freeze is coming up on Thursday 3rd October 2024 and the enhancements freeze is on 02:00 UTC Friday 11th October 2024 / 19:00 PDT Thursday 10th October 2024

alculquicondor · 2024-10-01T17:02:11Z

Perhaps because it's easier to install one version of Kubernetes and then try out different approaches? Just a thought.

That's not a strong argument. Unless there is a compelling argument defending the need for classic DRA to stay one more release, I prefer we remove it ASAP. The more we keep it, the more vendors will depend on it and make it harder and harder to remove every passing release.

thockin · 2024-10-01T23:13:00Z

I agree. It also makes people think that we are hedging our bet around structured parameters, and I don't think we are. If there are truly shortcomings with it, and I accept that there are, and we will fix those forward.

pohly · 2024-10-04T10:23:16Z

I've created #4904 to mark the KEP as "withdrawn" and notified folks on #wg-device-management.

kannon92 · 2024-10-07T14:37:30Z

@pohly can you update the PR description to include the latest changes for 1.32?

Much of the PRR text that was originally written for "classic DRA" applies also to "structured parameters". It gets copied from kubernetes#3063 to kubernetes#4381, with some adaptions. The v1beta1 API will be almost identical to the v1alpha3 API, with just some minor tweaks to fix oversights. The kubelet gRPC gets bumped with no changes. Nonetheless, drivers should get updated, which can be done by updating the Go dependencies and optionally changing the API import.

jenshu · 2024-10-11T01:41:24Z

1.32 Enhancements team here. I see the updates have been made to withdraw this KEP, and I've updated the status to tracked for enhancements freeze

rytswd · 2024-10-18T13:47:29Z

Hi @pohly 👋 -- this is Ryota (@rytswd) from the v1.32 Communications Team!

For the v1.32 release, we are currently in the process of collecting and curating a list of potential feature blogs, and we are keen to hear if you would consider writing one for this withdrawal!

As you may be aware, feature blogs are a great way to communicate to users about features which fall into (but not limited to) the following categories:

This introduces some breaking change(s)
This has significant impacts and/or implications to users
...Or this is a long-awaited feature, which would go a long way to cover the journey more in detail 🎉

To opt in to write a feature blog, could you please let us know and open a "Feature Blog placeholder PR" (which can be only a skeleton at first) against the website repository by Wednesday, 30th Oct 2024? For more information about writing a blog, please find the blog contribution guidelines 📚

Tip

Some timeline to keep in mind:

02:00 UTC Wednesday, 30th Oct: Feature blog PR freeze
Monday, 25th Nov: Feature blogs ready for review
You can find more in the release document

Note

In your placeholder PR, use XX characters for the blog date in the front matter and file name. We will work with you on updating the PR with the publication date once we have a final number of feature blogs for this release.

jenshu · 2024-10-28T04:33:44Z

Hello @pohly 👋, v1.32 Enhancements team here.

With all the implementation (code related) PRs merged per the issue description:

DRA: remove "classic DRA" kubernetes#128003

This enhancement is now marked as tracked for code freeze for the v1.32 Code Freeze!

Additionally, please let me know if there are any other PRs in k/k that we should track for this KEP, so that we can maintain accurate status.

rytswd · 2024-10-29T03:00:22Z

Hi @pohly 👋, v1.32 Communications Team here again!

This is a gentle reminder for the feature blog deadline mentioned above, which is 02:00 UTC Wednesday, 30th Oct. To opt in, please let us know and open a Feature Blog placeholder PR against k/website by the deadline. If you have any questions, please feel free to reach out to us!

pohly · 2024-11-19T16:13:49Z

/wg device-management

pohly · 2024-11-19T16:21:45Z

This was withdrawn in 1.32 and the feature gate definition got removed in that same release because it was alpha. There's nothing left to do.

/close

k8s-ci-robot · 2024-11-19T16:21:51Z

@pohly: Closing this issue.

In response to this:

This was withdrawn in 1.32 and the feature gate definition got removed in that same release because it was alpha.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot added the needs-sig label Nov 30, 2021

k8s-ci-robot assigned pohly Nov 30, 2021

k8s-ci-robot added sig/node and removed needs-sig labels Nov 30, 2021

pohly mentioned this issue Nov 30, 2021

KEP-3063: dynamic resource allocation #3064

Merged

gracenng added the tracked/yes label Jan 17, 2022

gracenng added this to the v1.24 milestone Jan 17, 2022

gracenng added tracked/no and removed tracked/yes labels Feb 4, 2022

k8s-ci-robot removed this from the v1.24 milestone Feb 4, 2022

k8s-ci-robot added the lifecycle/stale label May 30, 2022

k8s-ci-robot removed the lifecycle/stale label May 31, 2022

dchen1107 added the stage/alpha label Jun 9, 2022

dchen1107 added this to the v1.25 milestone Jun 9, 2022

Priyankasaggu11929 added tracked/yes and removed tracked/no labels Jun 10, 2022

pohly added a commit to pohly/enhancements that referenced this issue Jun 22, 2022

dynamic resource allocation: add KEP kubernetes#3063

c9abb66

Dynamic resource allocation rethinks how additional resources like accelerators be managed and requested in Kubernetes.

jenshu moved this to At risk for enhancements freeze in 1.32 Enhancements Tracking Oct 1, 2024

pohly mentioned this issue Oct 4, 2024

KEP-3063: Classic DRA: withdraw the KEP #4904

Merged

kannon92 moved this from Considered for release to Tracked in SIG Node 1.32 KEPs planning Oct 7, 2024

jenshu moved this from At risk for enhancements freeze to Tracked for enhancements freeze in 1.32 Enhancements Tracking Oct 11, 2024

This was referenced Oct 11, 2024

DRA: remove "classic DRA" kubernetes/kubernetes#128003

Merged

DRA: remove "classic DRA" kubernetes/website#48289

Merged

jenshu moved this from Tracked for enhancements freeze to Tracked for code freeze in 1.32 Enhancements Tracking Oct 28, 2024

k8s-ci-robot added the wg/device-management label Nov 19, 2024

pohly added this to Dynamic Resource Allocation Nov 19, 2024

pohly moved this to 🆕 New in Dynamic Resource Allocation Nov 19, 2024

k8s-ci-robot closed this as completed Nov 19, 2024

github-project-automation bot moved this from Needs Triage to Closed in SIG Scheduling Nov 19, 2024

github-project-automation bot moved this from Ongoing Enhancements to Done in @klueska's k8s review queue Nov 19, 2024

github-project-automation bot moved this from Implemented to Done in SIG Node 1.32 KEPs planning Nov 19, 2024

dipesh-rawat removed the lead-opted-in label Jan 13, 2025

jenshu removed the tracked/yes label May 17, 2025

DRA: control plane controller ("classic DRA") #3063

DRA: control plane controller ("classic DRA") #3063

Comments

pohly commented Nov 30, 2021 • edited by pacoxu Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Enhancement Description

pohly commented Nov 30, 2021

Uh oh!

ahg-g commented Dec 20, 2021

Uh oh!

pohly commented Jan 10, 2022

Uh oh!

ahg-g commented Jan 10, 2022

Uh oh!

pohly commented Jan 10, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ahg-g commented Jan 10, 2022

Uh oh!

gracenng commented Jan 30, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gracenng commented Feb 4, 2022

Uh oh!

gracenng commented Mar 1, 2022 via email

Uh oh!

k8s-triage-robot commented May 30, 2022

Uh oh!

kerthcet commented May 31, 2022

Uh oh!

marosset commented Jun 10, 2022

Uh oh!

marosset commented Jun 17, 2022

Uh oh!

pohly commented Oct 1, 2024

Uh oh!

jenshu commented Oct 1, 2024

Uh oh!

alculquicondor commented Oct 1, 2024

Uh oh!

thockin commented Oct 1, 2024

Uh oh!

pohly commented Oct 4, 2024

Uh oh!

kannon92 commented Oct 7, 2024

Uh oh!

jenshu commented Oct 11, 2024

Uh oh!

rytswd commented Oct 18, 2024

Uh oh!

jenshu commented Oct 28, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rytswd commented Oct 29, 2024

Uh oh!

pohly commented Nov 19, 2024

Uh oh!

pohly commented Nov 19, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

k8s-ci-robot commented Nov 19, 2024

Uh oh!

pohly commented Nov 30, 2021 •

edited by pacoxu

Loading

pohly commented Jan 10, 2022 •

edited

Loading

gracenng commented Jan 30, 2022 •

edited

Loading

jenshu commented Oct 28, 2024 •

edited

Loading

pohly commented Nov 19, 2024 •

edited

Loading