
Multi-Cluster Services API #1645

Open
JeremyOT opened this issue Mar 30, 2020 · 42 comments
Labels
sig/multicluster Categorizes an issue or PR as relevant to SIG Multicluster.

Comments

@JeremyOT
Member

JeremyOT commented Mar 30, 2020

Enhancement Description

Please keep this description up to date. This will help the Enhancements Team efficiently track the evolution of the enhancement.

@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Mar 30, 2020
@JeremyOT
Member Author

/sig multicluster
/cc @pmorie
@thockin

@k8s-ci-robot k8s-ci-robot added sig/multicluster Categorizes an issue or PR as relevant to SIG Multicluster. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Mar 30, 2020
@johnbelamaric
Member

@JeremyOT Hi Jeremy. I am serving on the enhancements team for 1.19, which means we are tracking which KEPs may be targeted at 1.19. Do you see this moving to alpha in 1.19? If so, there is work to be done on the KEP before enhancements freeze in about two weeks.

@JeremyOT
Member Author

JeremyOT commented May 6, 2020

@johnbelamaric I think this will be tight to make alpha for 1.19 - there are some open questions that may take >2 weeks to resolve, but I'll work on getting the KEP to a complete state.

@johnbelamaric
Member

Thanks, Jeremy. I'll target it for 1.20; let me know if things proceed faster than expected.

/milestone v1.20

@k8s-ci-robot k8s-ci-robot added this to the v1.20 milestone May 8, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 18, 2020
@JeremyOT
Member Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 18, 2020
@kikisdeliveryservice
Member

Hi @JeremyOT!

Enhancements Lead here, do you still intend to go alpha in 1.20?

Thanks!
Kirsten

@JeremyOT
Member Author

Hey @kikisdeliveryservice, we decided to go Alpha out-of-tree at sigs.k8s.io/mcs-api instead. We'll likely come back to the original in-tree plans for Beta, but we don't have a release target yet.

@kikisdeliveryservice kikisdeliveryservice added the tracked/out-of-tree Denotes an out-of-tree enhancement issue, which does not need to be tracked by the Release Team label Sep 18, 2020
@kikisdeliveryservice
Member

Sounds good @JeremyOT just keep us posted! :)

@JeremyOT
Member Author

Will do!

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 17, 2020
@BrendanThompson

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 5, 2021
@annajung annajung removed the tracked/out-of-tree Denotes an out-of-tree enhancement issue, which does not need to be tracked by the Release Team label Jan 7, 2021
@annajung annajung removed this from the v1.20 milestone Jan 7, 2021
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 7, 2021
@JeremyOT
Member Author

JeremyOT commented Apr 7, 2021

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 7, 2021
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 6, 2021
@JeremyOT
Member Author

JeremyOT commented Jul 7, 2021

/remove-lifecycle stale

@JeremyOT
Member Author

This is a fine place to raise concerns - though it might be easier to track them individually as issues against sigs.k8s.io/mcs-api.

> Would it be better to have ServiceImports be non-namespaced? It seems like a shortcoming to require that the namespace exists, and it can clutter a user's cluster by needing to create these unnecessary namespaces.

There are a few reasons why namespaces are beneficial. First, Services are already namespaced. Moving multi-cluster services up to the cluster level changes that characteristic. MCS was designed to follow namespace sameness, which encourages that same-named namespaces have the same owner and use across clusters, so this makes ownership easy to follow across clusters, vs having a separate set of permissions for the multi-cluster version of a service. Further, if we don't follow the existing service ownership model, we'll need to figure out how to extend other service related APIs to fit MCS vs just following existing patterns (e.g. https://github.com/kubernetes/enhancements/tree/master/keps/sig-network/2091-admin-network-policy). Does that make sense?
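
To make that concrete, here is a rough sketch of the namespaced model, assuming the v1alpha1 types from sigs.k8s.io/mcs-api (API group multicluster.x-k8s.io; exact fields may differ from the current repo):

```yaml
# Exporting cluster: the ServiceExport lives in the same namespace
# (and has the same name) as the Service it exports.
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: my-svc
  namespace: my-ns
---
# Consuming clusters: the MCS implementation (not the user) creates a
# ServiceImport in that same namespace, so existing namespace-scoped
# RBAC and policy continue to apply to the multi-cluster service.
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceImport
metadata:
  name: my-svc
  namespace: my-ns
spec:
  type: ClusterSetIP
  ports:
  - name: http
    protocol: TCP
    port: 80
```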

> Why have this restriction? I can think of many use cases where a user would want to target a specific cluster.

MCS supports this, just not with a specific cluster identifier. Instead, we're taking the position that since same-named namespaces are meant to represent the same things across clusters, if a specific cluster's service instance needs to be accessed individually, it may not really be part of the same service. You can create cluster-1-service in cluster 1 and access it by name from any cluster. If both access patterns are needed, the current solution would be to create two services, say a my-svc service in each cluster merged into one and a my-svc-east in your east cluster for cluster-specific access. Admittedly this is a little more config than having the functionality built in, but the thinking was that this makes it specifically opt-in and easier to reason about (there are many cases where cluster-specific access is not desired as well).
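
As a sketch of that workaround (my-svc-east and app: my-app are placeholder names), the east cluster would run a second Service over the same pods and export it alongside the shared one:

```yaml
# East cluster only: a cluster-specific Service over the same pods.
apiVersion: v1
kind: Service
metadata:
  name: my-svc-east
  namespace: my-ns
spec:
  selector:
    app: my-app          # same selector as the shared my-svc Service
  ports:
  - port: 80
    protocol: TCP
---
# Exported like any other service; only the east cluster exports it.
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: my-svc-east
  namespace: my-ns
```

Consumers in any cluster then see the merged endpoints under my-svc and the east-only endpoints under my-svc-east, without the API needing a cluster identifier.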

Digging into your example scenarios:

> Cluster migrations: slowly shifting traffic from 1 cluster to another, particularly for traffic that is not proxied externally, or traffic that is sourced from within the cluster.

Doesn't this work with the shared service? As new pods are brought up in a new cluster, traffic will shift proportionally to that new cluster. As for handling same-cluster source traffic, MCS doesn't make any statements about how traffic is routed behind the VIP, so that implementations have the flexibility to make more intelligent routing decisions. Extensions like https://github.com/kubernetes/enhancements/tree/master/keps/sig-network/2433-topology-aware-hints aim to make that easier to implement.
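
For reference, those hints live on EndpointSlices rather than in the MCS API itself; roughly like this (a sketch of the discovery.k8s.io/v1 shape with made-up names and zones):

```yaml
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: my-svc-abc12              # example generated name
  namespace: my-ns
  labels:
    kubernetes.io/service-name: my-svc
addressType: IPv4
endpoints:
- addresses:
  - 10.0.1.5
  zone: us-east1-b
  hints:
    forZones:
    - name: us-east1-b            # proxies should prefer in-zone traffic
ports:
- name: http
  protocol: TCP
  port: 80
```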

> Locality routing to reduce hairpinning

In cases like this, it seems like you'd either want cluster-specific pods, or even direct pod addressing. It seems like that would be a separate step after initial discovery, wouldn't it? I don't think you need to use the same service instance for that. If it's required that user requests go to specific clusters/pods, doesn't the requester either need to know that ahead of time (separate services) or need some shared global discovery service anyway?

> Complex load balancing scenarios where load balancing is managed outside of kube-proxy

MCS makes no statements about how load is balanced at all - just that there's a VIP. Implementations absolutely should try to do better than kube-proxy's random spreading if they can, but we didn't want to encode more into the KEP than necessary, instead opting for things that are generally applicable across implementations.

@steeling

Thanks for the thorough response, and thanks in advance for hearing out these arguments!

> This is a fine place to raise concerns - though it might be easier to track them individually as issues against sigs.k8s.io/mcs-api.

> Would it be better to have ServiceImports be non-namespaced? It seems like a shortcoming to require that the namespace exists, and it can clutter a user's cluster by needing to create these unnecessary namespaces.

> There are a few reasons why namespaces are beneficial. First, Services are already namespaced. Moving multi-cluster services up to the cluster level changes that characteristic. MCS was designed to follow namespace sameness, which encourages that same-named namespaces have the same owner and use across clusters, so this makes ownership easy to follow across clusters, vs having a separate set of permissions for the multi-cluster version of a service. Further, if we don't follow the existing service ownership model, we'll need to figure out how to extend other service related APIs to fit MCS vs just following existing patterns (e.g. https://github.com/kubernetes/enhancements/tree/master/keps/sig-network/2091-admin-network-policy). Does that make sense?

ACK, it was a minor nit on my end, and I haven't fully thought through the implications of cluster-scoping the resource.

> Why have this restriction? I can think of many use cases where a user would want to target a specific cluster.

> MCS supports this, just not with a specific cluster identifier. Instead, we're taking the position that since same-named namespaces are meant to represent the same things across clusters, if a specific cluster's service instance needs to be accessed individually, it may not really be part of the same service. You can create cluster-1-service in cluster 1 and access it by name from any cluster. If both access patterns are needed, the current solution would be to create two services, say a my-svc service in each cluster merged into one and a my-svc-east in your east cluster for cluster-specific access. Admittedly this is a little more config than having the functionality built in, but the thinking was that this makes it specifically opt-in and easier to reason about (there are many cases where cluster-specific access is not desired as well).

Fair enough, this workaround existing at least unblocks the use case.

> Digging into your example scenarios:

> Cluster migrations: slowly shifting traffic from 1 cluster to another, particularly for traffic that is not proxied externally, or traffic that is sourced from within the cluster.

> Doesn't this work with the shared service? As new pods are brought up in a new cluster, traffic will shift proportionally to that new cluster. As for handling same-cluster source traffic, MCS doesn't make any statements about how traffic is routed behind the VIP, so that implementations have the flexibility to make more intelligent routing decisions. Extensions like https://github.com/kubernetes/enhancements/tree/master/keps/sig-network/2433-topology-aware-hints aim to make that easier to implement.

Yes, although I was also considering the use case of something like SMI's TrafficSplit resource (or future implementations of that which may be tucked inside the Gateway API?)

> Locality routing to reduce hairpinning

> In cases like this, it seems like you'd either want cluster-specific pods, or even direct pod addressing. It seems like that would be a separate step after initial discovery, wouldn't it? I don't think you need to use the same service instance for that. If it's required that user requests go to specific clusters/pods, doesn't the requester either need to know that ahead of time (separate services) or need some shared global discovery service anyway?

Yup, I was thinking about a shared discovery service, and leveraging an address specific to that cluster, which would work with the workaround mentioned above.

> Complex load balancing scenarios where load balancing is managed outside of kube-proxy

> MCS makes no statements about how load is balanced at all - just that there's a VIP. Implementations absolutely should try to do better than kube-proxy's random spreading if they can, but we didn't want to encode more into the KEP than necessary, instead opting for things that are generally applicable across implementations.

Fair enough :)

All of your points above do point out that we don't need DNS resolution specific to each cluster's service export, but another question may be: why not include it?

> the thinking was that this makes it specifically opt-in and easier to reason about (there are many cases where cluster-specific access is not desired as well).

I don't necessarily think it makes it easier to reason about. Intuitively, I would think that since

service.ns.svc.cluster.local
service.ns.svc.cluster
service.ns.svc
service.ns
service
hostname.service.cluster.local
...
hostname.service

are all resolvable via DNS, adding

hostname.clusterid.service.ns.svc.cluster.local

would also intuitively mean that

clusterid.service.ns.svc.cluster.local

resolves as well.

Having it exist is essentially already "opt-in", in that I don't need to use it if I don't want to. IMHO there should be a bit more justification for omitting it, although with the workaround you mentioned it's not a hill I would die on :)

@JeremyOT
Member Author

> Having it exist is essentially already "opt-in", in that I don't need to use it if I don't want to. IMHO there should be a bit more justification for omitting it, although with the workaround you mentioned it's not a hill I would die on :)

I think the biggest issue here is that if it exists we need to reserve another VIP per cluster, which eats up another constrained resource you may not be using and, if you have many clusters, may eat into it quite quickly. I've also been thinking that the opt-in is more about allowing use than use itself. Consumers may not care if an unused VIP sits around for a service, but as a producer I want to control whether or not my consumers have that option. E.g., what if a consumer decides for some reason or another to take an explicit dependency on cluster-a? If I didn't intend to allow cluster-specific access, I might decide to replace cluster-a with -b and/or -c, or move which cluster my service is deployed in. Explicit opt-in to per-cluster exposure lets me decide in advance how to handle that. It also gives me a sort of alias for the per-cluster service, and if I really needed to, I could move cluster-a-svc to cluster-b without impacting consumers. Definitely getting into messy territory here, but the main point I'm getting at is that opt-in, even with this workaround, seems like it introduces less risk and fewer potential side effects.

If you want to discuss further, this might be a good topic for our bi-weekly meetups and a live convo. I definitely appreciate pushback and diving into the API here. We really want to make sure we aren't closing any doors unnecessarily.

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Sep 10, 2022
@steeling

> Having it exist is essentially already "opt-in", in that I don't need to use it if I don't want to. IMHO there should be a bit more justification for omitting it, although with the workaround you mentioned it's not a hill I would die on :)

> I think the biggest issue here is that if it exists we need to reserve another VIP per cluster, which eats up another constrained resource you may not be using and, if you have many clusters, may eat into it quite quickly. I've also been thinking that the opt-in is more about allowing use than use itself. Consumers may not care if an unused VIP sits around for a service, but as a producer I want to control whether or not my consumers have that option. E.g., what if a consumer decides for some reason or another to take an explicit dependency on cluster-a? If I didn't intend to allow cluster-specific access, I might decide to replace cluster-a with -b and/or -c, or move which cluster my service is deployed in. Explicit opt-in to per-cluster exposure lets me decide in advance how to handle that. It also gives me a sort of alias for the per-cluster service, and if I really needed to, I could move cluster-a-svc to cluster-b without impacting consumers. Definitely getting into messy territory here, but the main point I'm getting at is that opt-in, even with this workaround, seems like it introduces less risk and fewer potential side effects.

> If you want to discuss further, this might be a good topic for our bi-weekly meetups and a live convo. I definitely appreciate pushback and diving into the API here. We really want to make sure we aren't closing any doors unnecessarily.

A super late reply here, but I just had a thought which could allow the best of both worlds: why not add a field, exposePerClusterServices, defaulted to false?
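
To sketch what I mean (exposePerClusterServices is purely hypothetical and not part of the API today), it could live on ServiceExport so it stays a producer-side decision:

```yaml
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: my-svc
  namespace: my-ns
spec:
  # Hypothetical field, not part of the current mcs-api: when true,
  # the implementation would also publish a per-cluster name/VIP
  # (e.g. clusterid.my-svc.my-ns.svc.clusterset.local) alongside the
  # merged clusterset service. Defaults to false.
  exposePerClusterServices: true
```

That keeps the default behavior unchanged while letting producers opt in to per-cluster names without creating a second Service per cluster.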

It's pretty cumbersome to ask the user to create a ServiceImport/ServiceExport for each service in each cluster otherwise. I also plan on attending some of the upcoming meetings, so we can chat then, thanks!

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot k8s-ci-robot closed this as not planned (Won't fix, can't repro, duplicate, stale) Oct 15, 2022
@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@JeremyOT
Member Author

/reopen
/remove-lifecycle rotten

@k8s-ci-robot
Contributor

@JeremyOT: Reopened this issue.

In response to this:

/reopen
/remove-lifecycle rotten

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot reopened this Oct 15, 2022
@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Oct 15, 2022
@sftim
Contributor

sftim commented Dec 4, 2022

kubernetes/website#37418 highlighted that MCS is not documented (or, if it is, those docs are too hard to find).

We should aim to document this API. We document features and APIs once they reach alpha, because at that point end users could be opting in to use them.

@sftim
Contributor

sftim commented Dec 4, 2022

(code that is in-project but out of tree still needs docs)

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 4, 2023
@lauralorenz
Contributor

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 6, 2023
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 4, 2023
@thockin thockin removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 5, 2023
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 21, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 20, 2024
@thockin thockin removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Feb 26, 2024