My Kubernetes Nodes Randomly Going Out Again

Service

An abstract way to expose an application running on a set of Pods as a network service.

With Kubernetes you don't need to modify your application to use an unfamiliar service discovery mechanism. Kubernetes gives Pods their own IP addresses and a single DNS name for a set of Pods, and can load-balance across them.

Motivation

Kubernetes Pods are created and destroyed to match the state of your cluster. Pods are nonpermanent resources. If you use a Deployment to run your app, it can create and destroy Pods dynamically.

Each Pod gets its own IP address; however, in a Deployment, the set of Pods running at one moment in time could be different from the set of Pods running that application a moment later.

This leads to a problem: if some set of Pods (call them "backends") provides functionality to other Pods (call them "frontends") inside your cluster, how do the frontends find out and keep track of which IP address to connect to, so that the frontend can use the backend part of the workload?

Enter Services.

Service resources

In Kubernetes, a Service is an abstraction which defines a logical set of Pods and a policy by which to access them (sometimes this pattern is called a micro-service). The set of Pods targeted by a Service is usually determined by a selector. To learn about other ways to define Service endpoints, see Services without selectors.

For example, consider a stateless image-processing backend which is running with 3 replicas. Those replicas are fungible; frontends do not care which backend they use. While the actual Pods that compose the backend set may change, the frontend clients should not need to be aware of that, nor should they need to keep track of the set of backends themselves.

The Service abstraction enables this decoupling.

Cloud-native service discovery

If you're able to use Kubernetes APIs for service discovery in your application, you can query the API server for Endpoints, which get updated whenever the set of Pods in a Service changes.

For non-native applications, Kubernetes offers ways to place a network port or load balancer in between your application and the backend Pods.

Defining a Service

A Service in Kubernetes is a REST object, similar to a Pod. Like all of the REST objects, you can POST a Service definition to the API server to create a new instance. The name of a Service object must be a valid RFC 1035 label name.

For example, suppose you have a set of Pods where each listens on TCP port 9376 and contains a label app=MyApp:

apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: MyApp
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9376

This specification creates a new Service object named "my-service", which targets TCP port 9376 on any Pod with the app=MyApp label.

Kubernetes assigns this Service an IP address (sometimes called the "cluster IP"), which is used by the Service proxies (see Virtual IPs and service proxies below).

The controller for the Service selector continuously scans for Pods that match its selector, and then POSTs any updates to an Endpoint object also named "my-service".

Port definitions in Pods have names, and you can reference these names in the targetPort attribute of a Service. This works even if there is a mixture of Pods in the Service using a single configured name, with the same network protocol available via different port numbers. This offers a lot of flexibility for deploying and evolving your Services. For example, you can change the port numbers that Pods expose in the next version of your backend software, without breaking clients.
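
As a rough illustration of referencing a port by name, the sketch below pairs a Pod that names its container port with a Service whose targetPort uses that name. The Pod name, image, and the port name http-web-svc are hypothetical values chosen for this example and do not come from the text above.

```yaml
# Pod that gives its container port a name (names here are illustrative).
apiVersion: v1
kind: Pod
metadata:
  name: my-app-pod
  labels:
    app: MyApp
spec:
  containers:
    - name: backend
      image: nginx:stable          # placeholder image, assumed for the example
      ports:
        - containerPort: 9376
          name: http-web-svc       # the name the Service refers to
---
# Service that targets the named port instead of a number.
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: MyApp
  ports:
    - protocol: TCP
      port: 80
      targetPort: http-web-svc     # resolved per Pod to its containerPort
```

If a later version of the backend listens on a different containerPort but keeps the same port name, the Service definition should not need to change.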

The default protocol for Services is TCP; you can also use any other supported protocol.

As many Services need to expose more than one port, Kubernetes supports multiple port definitions on a Service object. Each port definition can have the same protocol, or a different one.

Services without selectors

Services most commonly abstract access to Kubernetes Pods, but they can also abstract other kinds of backends. For example:

You want to have an external database cluster in production, but in your test environment you use your own databases.
You want to point your Service to a Service in a different Namespace or on another cluster.
You are migrating a workload to Kubernetes. While evaluating the approach, you run only a portion of your backends in Kubernetes.

In any of these scenarios you can define a Service without a Pod selector. For example:

apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9376

Because this Service has no selector, the corresponding Endpoints object is not created automatically. You can map the Service to the network address and port where it's running by adding an Endpoints object manually:

apiVersion: v1
kind: Endpoints
metadata:
  name: my-service
subsets:
  - addresses:
      - ip: 192.0.2.42
    ports:
      - port: 9376

The name of the Endpoints object must be a valid DNS subdomain name.

Accessing a Service without a selector works the same as if it had a selector. In the example above, traffic is routed to the single endpoint defined in the YAML: 192.0.2.42:9376 (TCP).

An ExternalName Service is a special case of Service that does not have selectors and uses DNS names instead. For more information, see the ExternalName section later in this document.

Over Capacity Endpoints

If an Endpoints resource has more than 1000 endpoints, then a Kubernetes v1.22 (or later) cluster annotates that Endpoints object with endpoints.kubernetes.io/over-capacity: truncated. This annotation indicates that the affected Endpoints object is over capacity and that the endpoints controller has truncated the number of endpoints to 1000.

EndpointSlices

FEATURE STATE: Kubernetes v1.21 [stable]

EndpointSlices are an API resource that can provide a more scalable alternative to Endpoints. Although conceptually quite similar to Endpoints, EndpointSlices allow for distributing network endpoints across multiple resources. By default, an EndpointSlice is considered "full" once it reaches 100 endpoints, at which point additional EndpointSlices will be created to store any additional endpoints.

EndpointSlices provide additional attributes and functionality, which are described in detail in EndpointSlices.
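
To give a feel for the shape of the resource, here is a minimal sketch of an EndpointSlice written by hand. The object name, IP address, hostname, and node name are made up for illustration; in a running cluster the control plane normally creates these objects for Services that have selectors.

```yaml
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: my-service-abc             # hypothetical name
  labels:
    # Associates this slice with the Service named "my-service".
    kubernetes.io/service-name: my-service
addressType: IPv4
ports:
  - name: http
    protocol: TCP
    port: 9376
endpoints:
  - addresses:
      - "10.1.2.3"                 # example Pod IP, assumed
    conditions:
      ready: true
    hostname: pod-1                # illustrative
    nodeName: node-1               # illustrative
```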

Application protocol

FEATURE STATE: Kubernetes v1.20 [stable]

The appProtocol field provides a way to specify an application protocol for each Service port. The value of this field is mirrored by the corresponding Endpoints and EndpointSlice objects.

This field follows standard Kubernetes label syntax. Values should either be IANA standard service names or domain prefixed names such as mycompany.com/my-custom-protocol.
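
A short sketch of how the field might be set on each port of a Service follows. The first value is an IANA service name; the second reuses the domain-prefixed example from the paragraph above. The port names and numbers are assumptions for the example.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: MyApp
  ports:
    - name: web
      protocol: TCP
      appProtocol: http                              # IANA standard service name
      port: 80
      targetPort: 9376
    - name: custom
      protocol: TCP
      appProtocol: mycompany.com/my-custom-protocol  # domain-prefixed name
      port: 9000
      targetPort: 9377
```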

Virtual IPs and service proxies

Every node in a Kubernetes cluster runs a kube-proxy. kube-proxy is responsible for implementing a form of virtual IP for Services of type other than ExternalName.

Why not use round-robin DNS?

A question that pops up every now and then is why Kubernetes relies on proxying to forward inbound traffic to backends. What about other approaches? For example, would it be possible to configure DNS records that have multiple A values (or AAAA for IPv6), and rely on round-robin name resolution?

There are a few reasons for using proxying for Services:

There is a long history of DNS implementations not respecting record TTLs, and caching the results of name lookups after they should have expired.
Some apps do DNS lookups only once and cache the results indefinitely.
Even if apps and libraries did proper re-resolution, the low or zero TTLs on the DNS records could impose a high load on DNS that then becomes difficult to manage.

Later in this page you can read about how the various kube-proxy implementations work. Overall, you should note that, when running kube-proxy, kernel-level rules may be modified (for example, iptables rules might get created), which won't get cleaned up, in some cases until you reboot. Thus, running kube-proxy is something that should only be done by an administrator who understands the consequences of having a low-level, privileged network proxying service on a computer. Although the kube-proxy executable supports a cleanup function, this function is not an official feature and thus is only available to use as-is.

Configuration

Note that the kube-proxy starts up in different modes, which are determined by its configuration.

The kube-proxy's configuration is done via a ConfigMap, and the ConfigMap for kube-proxy effectively deprecates the behaviour for almost all of the flags for the kube-proxy.
The ConfigMap for the kube-proxy does not support live reloading of configuration.
The ConfigMap parameters for the kube-proxy cannot all be validated and verified on startup. For example, if your operating system doesn't allow you to run iptables commands, the standard kernel kube-proxy implementation will not work. Likewise, if you have an operating system which doesn't support netsh, it will not run in Windows userspace mode.
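
The sketch below shows what such a ConfigMap might look like. The ConfigMap name, namespace, and the config.conf key follow the layout commonly used by kubeadm-provisioned clusters; that layout is an assumption here, and other installers may organise it differently.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-proxy                 # assumed kubeadm-style name
  namespace: kube-system
data:
  config.conf: |
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    kind: KubeProxyConfiguration
    # Selects the proxy mode that kube-proxy starts in.
    mode: "iptables"
```

Because the configuration is not reloaded live, a change like this would only take effect once the kube-proxy processes are restarted.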

User space proxy mode

In this (legacy) mode, kube-proxy watches the Kubernetes control plane for the addition and removal of Service and Endpoint objects. For each Service it opens a port (randomly chosen) on the local node. Any connections to this "proxy port" are proxied to one of the Service's backend Pods (as reported via Endpoints). kube-proxy takes the SessionAffinity setting of the Service into account when deciding which backend Pod to use.

Lastly, the user-space proxy installs iptables rules which capture traffic to the Service's clusterIP (which is virtual) and port. The rules redirect that traffic to the proxy port which proxies the backend Pod.

By default, kube-proxy in userspace mode chooses a backend via a round-robin algorithm.

Services overview diagram for userspace proxy

iptables proxy mode

In this mode, kube-proxy watches the Kubernetes control plane for the addition and removal of Service and Endpoint objects. For each Service, it installs iptables rules, which capture traffic to the Service's clusterIP and port, and redirect that traffic to one of the Service's backend sets. For each Endpoint object, it installs iptables rules which select a backend Pod.

By default, kube-proxy in iptables mode chooses a backend at random.

Using iptables to handle traffic has a lower system overhead, because traffic is handled by Linux netfilter without the need to switch between userspace and the kernel space. This approach is also likely to be more reliable.

If kube-proxy is running in iptables mode and the first Pod that's selected does not respond, the connection fails. This is different from userspace mode: in that scenario, kube-proxy would detect that the connection to the first Pod had failed and would automatically retry with a different backend Pod.

You can use Pod readiness probes to verify that backend Pods are working OK, so that kube-proxy in iptables mode only sees backends that test out as healthy. Doing this means you avoid having traffic sent via kube-proxy to a Pod that's known to have failed.
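
As one possible way to set this up, the Pod sketch below adds an HTTP readiness probe to a backend container. The Pod name, image, /healthz path, and timing values are hypothetical; the probe should point at whatever health endpoint your application actually serves.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: backend                    # illustrative name
  labels:
    app: MyApp
spec:
  containers:
    - name: backend
      image: registry.example/backend:1.0   # placeholder image, assumed
      ports:
        - containerPort: 9376
      readinessProbe:
        httpGet:
          path: /healthz           # assumed health endpoint
          port: 9376
        initialDelaySeconds: 5
        periodSeconds: 10
```

While the probe is failing, the Pod is not listed as a ready endpoint of the Service, so kube-proxy does not select it.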

Services overview diagram for iptables proxy

IPVS proxy mode

FEATURE STATE: Kubernetes v1.11 [stable]

In ipvs mode, kube-proxy watches Kubernetes Services and Endpoints, calls the netlink interface to create IPVS rules accordingly, and synchronizes IPVS rules with Kubernetes Services and Endpoints periodically. This control loop ensures that IPVS status matches the desired state. When accessing a Service, IPVS directs traffic to one of the backend Pods.

The IPVS proxy mode is based on a netfilter hook function that is similar to iptables mode, but uses a hash table as the underlying data structure and works in the kernel space. That means kube-proxy in IPVS mode redirects traffic with lower latency than kube-proxy in iptables mode, with much better performance when synchronising proxy rules. Compared to the other proxy modes, IPVS mode also supports a higher throughput of network traffic.

IPVS provides more options for balancing traffic to backend Pods; these are:

rr: round-robin
lc: least connection (smallest number of open connections)
dh: destination hashing
sh: source hashing
sed: shortest expected delay
nq: never queue
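
For illustration, a kube-proxy configuration fragment that selects IPVS mode together with one of the schedulers listed above might look like the following. The choice of lc is arbitrary and only for the example.

```yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  # One of: rr, lc, dh, sh, sed, nq
  scheduler: "lc"
```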

Services overview diagram for IPVS proxy

In these proxy models, the traffic bound for the Service's IP:Port is proxied to an appropriate backend without the clients knowing anything about Kubernetes or Services or Pods.

If you want to make sure that connections from a particular client are passed to the same Pod each time, you can select the session affinity based on the client's IP addresses by setting service.spec.sessionAffinity to "ClientIP" (the default is "None"). You can also set the maximum session sticky time by setting service.spec.sessionAffinityConfig.clientIP.timeoutSeconds appropriately (the default value is 10800, which works out to be 3 hours).
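
A minimal sketch of those two fields on a Service is shown below. The one-hour timeout is an arbitrary value picked for the example.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: MyApp
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 3600         # 1 hour instead of the 10800 default
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9376
```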

Multi-Port Services

For some Services, you need to expose more than one port. Kubernetes lets you configure multiple port definitions on a Service object. When using multiple ports for a Service, you must give all of your ports names so that these are unambiguous. For example:

apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: MyApp
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 9376
    - name: https
      protocol: TCP
      port: 443
      targetPort: 9377

Choosing your own IP address

You can specify your own cluster IP address as part of a Service creation request. To do this, set the .spec.clusterIP field. This is useful, for example, if you already have an existing DNS entry that you wish to reuse, or legacy systems that are configured for a specific IP address and are difficult to re-configure.

The IP address that you choose must be a valid IPv4 or IPv6 address from within the service-cluster-ip-range CIDR range that is configured for the API server. If you try to create a Service with an invalid clusterIP address value, the API server will return a 422 HTTP status code to indicate that there's a problem.
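
The following sketch requests a specific cluster IP. The address 10.96.0.50 is hypothetical and assumes the API server was started with a service-cluster-ip-range that contains it (for example 10.96.0.0/12); substitute an address from your own range.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  clusterIP: 10.96.0.50            # assumed to fall inside the configured range
  selector:
    app: MyApp
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9376
```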

Traffic policies

External traffic policy

You can set the spec.externalTrafficPolicy field to control how traffic from external sources is routed. Valid values are Cluster and Local. Set the field to Cluster to route external traffic to all ready endpoints and Local to only route to ready node-local endpoints. If the traffic policy is Local and there are no node-local endpoints, the kube-proxy does not forward any traffic for the relevant Service.
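
As a small example, a NodePort Service that only routes external traffic to endpoints on the receiving node could be written like this (the Service type is chosen for illustration; the field applies to Services that accept external traffic):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  type: NodePort
  externalTrafficPolicy: Local     # only ready endpoints on the same node
  selector:
    app: MyApp
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9376
```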

Internal traffic policy

FEATURE STATE: Kubernetes v1.22 [beta]

You can set the spec.internalTrafficPolicy field to control how traffic from internal sources is routed. Valid values are Cluster and Local. Set the field to Cluster to route internal traffic to all ready endpoints and Local to only route to ready node-local endpoints. If the traffic policy is Local and there are no node-local endpoints, traffic is dropped by kube-proxy.

Discovering services

Kubernetes supports 2 primary modes of finding a Service: environment variables and DNS.

Environment variables

When a Pod is run on a Node, the kubelet adds a set of environment variables for each active Service. It adds {SVCNAME}_SERVICE_HOST and {SVCNAME}_SERVICE_PORT variables, where the Service name is upper-cased and dashes are converted to underscores. It also supports variables (see makeLinkVariables) that are compatible with Docker Engine's "legacy container links" feature.

For example, the Service redis-master which exposes TCP port 6379 and has been allocated cluster IP address 10.0.0.11, produces the following environment variables:

REDIS_MASTER_SERVICE_HOST=10.0.0.11
REDIS_MASTER_SERVICE_PORT=6379
REDIS_MASTER_PORT=tcp://10.0.0.11:6379
REDIS_MASTER_PORT_6379_TCP=tcp://10.0.0.11:6379
REDIS_MASTER_PORT_6379_TCP_PROTO=tcp
REDIS_MASTER_PORT_6379_TCP_PORT=6379
REDIS_MASTER_PORT_6379_TCP_ADDR=10.0.0.11

DNS

You can (and almost always should) set up a DNS service for your Kubernetes cluster using an add-on.

A cluster-aware DNS server, such as CoreDNS, watches the Kubernetes API for new Services and creates a set of DNS records for each one. If DNS has been enabled throughout your cluster then all Pods should automatically be able to resolve Services by their DNS name.

For example, if you have a Service called my-service in a Kubernetes namespace my-ns, the control plane and the DNS Service acting together create a DNS record for my-service.my-ns. Pods in the my-ns namespace should be able to find the service by doing a name lookup for my-service (my-service.my-ns would also work).

Pods in other namespaces must qualify the name as my-service.my-ns. These names will resolve to the cluster IP assigned for the Service.

Kubernetes also supports DNS SRV (Service) records for named ports. If the my-service.my-ns Service has a port named http with the protocol set to TCP, you can do a DNS SRV query for _http._tcp.my-service.my-ns to discover the port number for http, as well as the IP address.

The Kubernetes DNS server is the only way to access ExternalName Services. You can find more information about ExternalName resolution in DNS Pods and Services.

Headless Services

Sometimes you don't need load-balancing and a single Service IP. In this case, you can create what are termed "headless" Services, by explicitly specifying "None" for the cluster IP (.spec.clusterIP).

You can use a headless Service to interface with other service discovery mechanisms, without being tied to Kubernetes' implementation.

For headless Services, a cluster IP is not allocated, kube-proxy does not handle these Services, and there is no load balancing or proxying done by the platform for them. How DNS is automatically configured depends on whether the Service has selectors defined:

With selectors

For headless Services that define selectors, the endpoints controller creates Endpoints records in the API, and modifies the DNS configuration to return A records (IP addresses) that point directly to the Pods backing the Service.
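
A headless Service with a selector could be declared along these lines; the Service name is hypothetical and the only special part is the clusterIP value:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-headless-service        # illustrative name
spec:
  clusterIP: None                  # makes the Service headless
  selector:
    app: MyApp
  ports:
    - protocol: TCP
      port: 9376
      targetPort: 9376
```

A DNS lookup for this Service name would then be expected to return the IP addresses of the matching Pods rather than a single virtual IP.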

Without selectors

For headless Services that do not define selectors, the endpoints controller does not create Endpoints records. However, the DNS system looks for and configures either:

CNAME records for ExternalName-type Services.
A records for any Endpoints that share a name with the Service, for all other types.

Publishing Services (ServiceTypes)

For some parts of your application (for example, frontends) you may want to expose a Service onto an external IP address, that's outside of your cluster.

Kubernetes ServiceTypes allow you to specify what kind of Service you want. The default is ClusterIP.

Type values and their behaviors are:

ClusterIP: Exposes the Service on a cluster-internal IP. Choosing this value makes the Service only reachable from within the cluster. This is the default ServiceType.
NodePort: Exposes the Service on each Node's IP at a static port (the NodePort). A ClusterIP Service, to which the NodePort Service routes, is automatically created. You'll be able to contact the NodePort Service, from outside the cluster, by requesting <NodeIP>:<NodePort>.
LoadBalancer: Exposes the Service externally using a cloud provider's load balancer. NodePort and ClusterIP Services, to which the external load balancer routes, are automatically created.
ExternalName: Maps the Service to the contents of the externalName field (e.g. foo.bar.example.com), by returning a CNAME record with its value. No proxying of any kind is set up.
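
Since the other types each get a manifest elsewhere on this page, here is a brief sketch of the ExternalName case. It reuses the foo.bar.example.com host from the list above; the namespace is an assumption made for the example.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
  namespace: prod                  # hypothetical namespace
spec:
  type: ExternalName
  externalName: foo.bar.example.com
```

With this in place, a lookup of my-service.prod inside the cluster would be answered with a CNAME record pointing at foo.bar.example.com.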

You can also use Ingress to expose your Service. Ingress is not a Service type, but it acts as the entry point for your cluster. It lets you consolidate your routing rules into a single resource as it can expose multiple services under the same IP address.

Type NodePort

If you set the type field to NodePort, the Kubernetes control plane allocates a port from a range specified by the --service-node-port-range flag (default: 30000-32767). Each node proxies that port (the same port number on every Node) into your Service. Your Service reports the allocated port in its .spec.ports[*].nodePort field.

If you want to specify particular IP(s) to proxy the port, you can set the --nodeport-addresses flag for kube-proxy or the equivalent nodePortAddresses field of the kube-proxy configuration file to particular IP block(s).

This flag takes a comma-delimited list of IP blocks (e.g. 10.0.0.0/8, 192.0.2.0/25) to specify IP address ranges that kube-proxy should consider as local to this node.

For example, if you start kube-proxy with the --nodeport-addresses=127.0.0.0/8 flag, kube-proxy only selects the loopback interface for NodePort Services. The default for --nodeport-addresses is an empty list. This means that kube-proxy should consider all available network interfaces for NodePort. (That's also compatible with earlier Kubernetes releases).
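
Expressed through the configuration file rather than the flag, the loopback-only example might look like the sketch below. Only the nodePortAddresses field is shown; a real configuration would carry your other settings as well.

```yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
# Equivalent of --nodeport-addresses=127.0.0.0/8
nodePortAddresses:
  - 127.0.0.0/8
```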

If you want a specific port number, you can specify a value in the nodePort field. The control plane will either allocate you that port or report that the API transaction failed. This means that you need to take care of possible port collisions yourself. You also have to use a valid port number, one that's inside the range configured for NodePort use.

Using a NodePort gives you the freedom to set up your own load balancing solution, to configure environments that are not fully supported by Kubernetes, or even to expose one or more nodes' IPs directly.

Note that this Service is visible as <NodeIP>:spec.ports[*].nodePort and .spec.clusterIP:spec.ports[*].port. If the --nodeport-addresses flag for kube-proxy or the equivalent field in the kube-proxy configuration file is set, <NodeIP> would be filtered node IP(s).

For example:

apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  type: NodePort
  selector:
    app: MyApp
  ports:
      # By default and for convenience, the `targetPort` is set to the same value as the `port` field.
    - port: 80
      targetPort: 80
      # Optional field
      # By default and for convenience, the Kubernetes control plane will allocate a port from a range (default: 30000-32767)
      nodePort: 30007

Type LoadBalancer

On cloud providers which support external load balancers, setting the type field to LoadBalancer provisions a load balancer for your Service. The actual creation of the load balancer happens asynchronously, and information about the provisioned balancer is published in the Service's .status.loadBalancer field. For example:

apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: MyApp
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9376
  clusterIP: 10.0.171.239
  type: LoadBalancer
status:
  loadBalancer:
    ingress:
    - ip: 192.0.2.127

Traffic from the external load balancer is directed at the backend Pods. The cloud provider decides how it is load balanced.

Some cloud providers allow you to specify the loadBalancerIP. In those cases, the load balancer is created with the user-specified loadBalancerIP. If the loadBalancerIP field is not specified, the load balancer is set up with an ephemeral IP address. If you specify a loadBalancerIP but your cloud provider does not support the feature, the loadBalancerIP field that you set is ignored.

Load balancers with mixed protocol types

FEATURE STATE: Kubernetes v1.20 [alpha]

By default, for LoadBalancer type of Services, when there is more than one port defined, all ports must have the same protocol, and the protocol must be one which is supported by the cloud provider.

If the feature gate MixedProtocolLBService is enabled for the kube-apiserver, it is allowed to use different protocols when there is more than one port defined.
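
With that feature gate enabled, a Service such as the following sketch would be accepted by the API server. The port names and numbers are invented for the example, and whether the load balancer actually gets provisioned still depends on what the cloud provider supports.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  type: LoadBalancer
  selector:
    app: MyApp
  ports:
    - name: dns-tcp                # illustrative port names
      protocol: TCP
      port: 53
      targetPort: 5353
    - name: dns-udp
      protocol: UDP
      port: 53
      targetPort: 5353
```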

Disabling load balancer NodePort allocation

FEATURE STATE: Kubernetes v1.22 [beta]

You can optionally disable node port allocation for a Service of type=LoadBalancer, by setting the field spec.allocateLoadBalancerNodePorts to false. This should only be used for load balancer implementations that route traffic directly to pods as opposed to using node ports. By default, spec.allocateLoadBalancerNodePorts is true and type LoadBalancer Services will continue to allocate node ports. If spec.allocateLoadBalancerNodePorts is set to false on an existing Service with allocated node ports, those node ports will not be de-allocated automatically. You must explicitly remove the nodePorts entry in every Service port to de-allocate those node ports. Your cluster must have the ServiceLBNodePortControl feature gate enabled to use this field. For Kubernetes v1.23, this feature gate is enabled by default, and you can use the spec.allocateLoadBalancerNodePorts field. For clusters running other versions of Kubernetes, check the documentation for that release.
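
A sketch of a new LoadBalancer Service that opts out of node port allocation:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  type: LoadBalancer
  allocateLoadBalancerNodePorts: false   # no node ports are allocated for this Service
  selector:
    app: MyApp
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9376
```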

Specifying class of load balancer implementation

FEATURE STATE: Kubernetes v1.22 [beta]

spec.loadBalancerClass enables you to use a load balancer implementation other than the cloud provider default. Your cluster must have the ServiceLoadBalancerClass feature gate enabled to use this field. For Kubernetes v1.23, this feature gate is enabled by default. For clusters running other versions of Kubernetes, check the documentation for that release. By default, spec.loadBalancerClass is nil and a LoadBalancer type of Service uses the cloud provider's default load balancer implementation if the cluster is configured with a cloud provider using the --cloud-provider component flag. If spec.loadBalancerClass is specified, it is assumed that a load balancer implementation that matches the specified class is watching for Services. Any default load balancer implementation (for example, the one provided by the cloud provider) will ignore Services that have this field set. spec.loadBalancerClass can be set on a Service of type LoadBalancer only. Once set, it cannot be changed. The value of spec.loadBalancerClass must be a label-style identifier, with an optional prefix such as "internal-vip" or "example.com/internal-vip". Unprefixed names are reserved for end-users.
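
For instance, using the prefixed class name quoted above, a Service might select a non-default implementation like this. It assumes that some controller in the cluster is watching for the example.com/internal-vip class; nothing in this manifest provides one.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  type: LoadBalancer
  loadBalancerClass: example.com/internal-vip   # cannot be changed once set
  selector:
    app: MyApp
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9376
```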

Internal load balancer

In a mixed environment it is sometimes necessary to route traffic from Services inside the same (virtual) network address block.

In a split-horizon DNS environment you would need two Services to be able to route both external and internal traffic to your endpoints.

To set an internal load balancer, add one of the following annotations to your Service depending on the cloud Service provider you're using.

    [...]
    metadata:
      name: my-service
      annotations:
        cloud.google.com/load-balancer-type: "Internal"
    [...]

    [...]
    metadata:
      name: my-service
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-internal: "true"
    [...]

    [...]
    metadata:
      name: my-service
      annotations:
        service.beta.kubernetes.io/azure-load-balancer-internal: "true"
    [...]

    [...]
    metadata:
      name: my-service
      annotations:
        service.kubernetes.io/ibm-load-balancer-cloud-provider-ip-type: "private"
    [...]

    [...]
    metadata:
      name: my-service
      annotations:
        service.beta.kubernetes.io/openstack-internal-load-balancer: "true"
    [...]

    [...]
    metadata:
      name: my-service
      annotations:
        service.beta.kubernetes.io/cce-load-balancer-internal-vpc: "true"
    [...]

    [...]
    metadata:
      annotations:
        service.kubernetes.io/qcloud-loadbalancer-internal-subnetid: subnet-xxxxx
    [...]

    [...]
    metadata:
      annotations:
        service.beta.kubernetes.io/alibaba-cloud-loadbalancer-address-type: "intranet"
    [...]

TLS support on AWS

For partial TLS / SSL support on clusters running on AWS, you can add three annotations to a LoadBalancer service:

    metadata:
      name: my-service
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:aws:acm:us-east-1:123456789012:certificate/12345678-1234-1234-1234-123456789012

The first specifies the ARN of the certificate to use. It can be either a certificate from a third party issuer that was uploaded to IAM or one created within AWS Certificate Manager.

    metadata:
      name: my-service
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-backend-protocol: (https|http|ssl|tcp)

The second annotation specifies which protocol a Pod speaks. For HTTPS and SSL, the ELB expects the Pod to authenticate itself over the encrypted connection, using a certificate.

HTTP and HTTPS select layer 7 proxying: the ELB terminates the connection with the user, parses headers, and injects the X-Forwarded-For header with the user's IP address (Pods only see the IP address of the ELB at the other end of its connection) when forwarding requests.

TCP and SSL select layer 4 proxying: the ELB forwards traffic without modifying the headers.

In a mixed-use environment where some ports are secured and others are left unencrypted, you can use the following annotations:

    metadata:
      name: my-service
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-backend-protocol: http
        service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443,8443"

In the above example, if the Service contained three ports, 80, 443, and 8443, then 443 and 8443 would use the SSL certificate, but 80 would be proxied HTTP.
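
Put together, the three-port case could look roughly like the manifest below. It is a sketch under stated assumptions: the port names, the selector label, and the targetPort of 8080 are invented for illustration, and the certificate ARN is the same dummy value used earlier. Because the backend protocol is http, the Pods are assumed to serve plain HTTP on every port while the ELB terminates TLS on 443 and 8443.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
  annotations:
    # Dummy ARN; replace with a real certificate from IAM or ACM.
    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:aws:acm:us-east-1:123456789012:certificate/12345678-1234-1234-1234-123456789012
    # Pods speak plain HTTP; the ELB proxies at layer 7.
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: http
    # Only these listener ports use the certificate.
    service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443,8443"
spec:
  type: LoadBalancer
  selector:
    app: MyApp                  # placeholder label
  ports:
    - name: http                # unencrypted listener
      port: 80
      targetPort: 8080          # assumed container port
    - name: https               # TLS terminated at the ELB
      port: 443
      targetPort: 8080
    - name: https-alt           # TLS terminated at the ELB
      port: 8443
      targetPort: 8080
```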

From Kubernetes v1.9 onwards you can use predefined AWS SSL policies with HTTPS or SSL listeners for your Services. To see which policies are available for use, you can use the aws command line tool:

    aws elb describe-load-balancer-policies --query 'PolicyDescriptions[].PolicyName'

You can then specify any one of those policies using the "service.beta.kubernetes.io/aws-load-balancer-ssl-negotiation-policy" annotation; for example:

    metadata:
      name: my-service
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-ssl-negotiation-policy: "ELBSecurityPolicy-TLS-1-2-2017-01"

PROXY protocol support on AWS

To enable PROXY protocol support for clusters running on AWS, you can use the following service annotation:

    metadata:
      name: my-service
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: "*"

Since version 1.3.0, the use of this annotation applies to all ports proxied by the ELB and cannot be configured otherwise.

ELB Access Logs on AWS

There are several annotations to manage access logs for ELB Services on AWS.

The annotation service.beta.kubernetes.io/aws-load-balancer-access-log-enabled controls whether access logs are enabled.

The annotation service.beta.kubernetes.io/aws-load-balancer-access-log-emit-interval controls the interval in minutes for publishing the access logs. You can specify an interval of either 5 or 60 minutes.

The annotation service.beta.kubernetes.io/aws-load-balancer-access-log-s3-bucket-name controls the name of the Amazon S3 bucket where load balancer access logs are stored.

The annotation service.beta.kubernetes.io/aws-load-balancer-access-log-s3-bucket-prefix specifies the logical hierarchy you created for your Amazon S3 bucket.

    metadata:
      name: my-service
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-access-log-enabled: "true"
        # Specifies whether access logs are enabled for the load balancer
        service.beta.kubernetes.io/aws-load-balancer-access-log-emit-interval: "60"
        # The interval for publishing the access logs. You can specify an interval of either 5 or 60 (minutes).
        service.beta.kubernetes.io/aws-load-balancer-access-log-s3-bucket-name: "my-bucket"
        # The name of the Amazon S3 bucket where the access logs are stored
        service.beta.kubernetes.io/aws-load-balancer-access-log-s3-bucket-prefix: "my-bucket-prefix/prod"
        # The logical hierarchy you created for your Amazon S3 bucket, for example `my-bucket-prefix/prod`

Connection Draining on AWS

Connection draining for Classic ELBs can be managed with the annotation service.beta.kubernetes.io/aws-load-balancer-connection-draining-enabled set to the value of "true". The annotation service.beta.kubernetes.io/aws-load-balancer-connection-draining-timeout can also be used to set the maximum time, in seconds, to keep the existing connections open before deregistering the instances.

    metadata:
      name: my-service
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-connection-draining-enabled: "true"
        service.beta.kubernetes.io/aws-load-balancer-connection-draining-timeout: "60"

Other ELB annotations

There are other annotations to manage Classic Elastic Load Balancers that are described below.

    metadata:
      name: my-service
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "60"
        # The time, in seconds, that the connection is allowed to be idle (no data has been sent over the connection) before it is closed by the load balancer

        service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
        # Specifies whether cross-zone load balancing is enabled for the load balancer

        service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags: "environment=prod,owner=devops"
        # A comma-separated list of key-value pairs which will be recorded as
        # additional tags in the ELB.

        service.beta.kubernetes.io/aws-load-balancer-healthcheck-healthy-threshold: ""
        # The number of successive successful health checks required for a backend to
        # be considered healthy for traffic. Defaults to 2, must be between 2 and 10

        service.beta.kubernetes.io/aws-load-balancer-healthcheck-unhealthy-threshold: "3"
        # The number of unsuccessful health checks required for a backend to be
        # considered unhealthy for traffic. Defaults to 6, must be between 2 and 10

        service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval: "20"
        # The approximate interval, in seconds, between health checks of an
        # individual instance. Defaults to 10, must be between 5 and 300

        service.beta.kubernetes.io/aws-load-balancer-healthcheck-timeout: "5"
        # The amount of time, in seconds, during which no response means a failed
        # health check. This value must be less than the service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval
        # value. Defaults to 5, must be between 2 and 60

        service.beta.kubernetes.io/aws-load-balancer-security-groups: "sg-53fae93f"
        # A list of existing security groups to be configured on the ELB created. Unlike the annotation
        # service.beta.kubernetes.io/aws-load-balancer-extra-security-groups, this replaces all other security groups
        # previously assigned to the ELB and also overrides the creation of a uniquely generated security group for this ELB.
        # The first security group ID on this list is used as a source to permit incoming traffic to target worker nodes
        # (service traffic and health checks).
        # If multiple ELBs are configured with the same security group ID, only a single permit line will be added to the
        # worker node security groups. That means if you delete any of those ELBs, it will remove the single permit line
        # and block access for all ELBs that shared the same security group ID.
        # This can cause a cross-service outage if not used properly

        service.beta.kubernetes.io/aws-load-balancer-extra-security-groups: "sg-53fae93f,sg-42efd82e"
        # A list of additional security groups to be added to the created ELB. This leaves the uniquely generated
        # security group in place, which ensures that every ELB has a unique security group ID and a matching permit
        # line to allow traffic to the target worker nodes (service traffic and health checks).
        # Security groups defined here can be shared between services.

        service.beta.kubernetes.io/aws-load-balancer-target-node-labels: "ingress-gw,gw-name=public-api"
        # A comma separated list of key-value pairs which are used
        # to select the target nodes for the load balancer

Network Load Balancer support on AWS

FEATURE STATE: Kubernetes v1.15 [beta]

To use a Network Load Balancer on AWS, use the annotation service.beta.kubernetes.io/aws-load-balancer-type with the value set to nlb.

    metadata:
      name: my-service
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-type: "nlb"

Unlike Classic Elastic Load Balancers, Network Load Balancers (NLBs) forward the client's IP address through to the node. If a Service's .spec.externalTrafficPolicy is set to Cluster, the client's IP address is not propagated to the end Pods.

By setting .spec.externalTrafficPolicy to Local, the client IP addresses are propagated to the end Pods, but this could result in uneven distribution of traffic. Nodes without any Pods for a particular LoadBalancer Service will fail the NLB Target Group's health check on the auto-assigned .spec.healthCheckNodePort and not receive any traffic.

In order to achieve even traffic, either use a DaemonSet or specify a pod anti-affinity so that Pods are not located on the same node.
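
The combination described above (an NLB, externalTrafficPolicy set to Local, and Pods spread across nodes with anti-affinity) might be sketched as follows. All names, the label app: MyApp, the replica count, the image reference, and the port numbers are hypothetical; the anti-affinity rule shown is one possible way to keep replicas on separate nodes, not the only one.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
  type: LoadBalancer
  # Preserve the client IP; only nodes running a matching Pod
  # pass the NLB health check and receive traffic.
  externalTrafficPolicy: Local
  selector:
    app: MyApp                  # placeholder label
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9376          # placeholder container port
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                  # hypothetical workload backing the Service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: MyApp
  template:
    metadata:
      labels:
        app: MyApp
    spec:
      affinity:
        podAntiAffinity:
          # Never schedule two replicas of this app on the same node.
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: MyApp
              topologyKey: kubernetes.io/hostname
      containers:
        - name: my-app
          image: registry.example/my-app:1.0   # hypothetical image
          ports:
            - containerPort: 9376
```

With a required anti-affinity rule, replicas beyond the number of schedulable nodes stay Pending, so a DaemonSet may be the simpler choice when every node should serve traffic.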

You can also use NLB Services with the internal load balancer annotation.

In order for client traffic to reach instances behind an NLB, the Node security groups are modified with the following IP rules:

Rule	Protocol	Port(s)	IpRange(s)	IpRange Description
Health Check	TCP	NodePort(s) (`.spec.healthCheckNodePort` for `.spec.externalTrafficPolicy = Local`)	Subnet CIDR	kubernetes.io/rule/nlb/health=<loadBalancerName>
Client Traffic	TCP	NodePort(s)	`.spec.loadBalancerSourceRanges` (defaults to `0.0.0.0/0`)	kubernetes.io/rule/nlb/client=<loadBalancerName>
MTU Discovery	ICMP	3,4	`.spec.loadBalancerSourceRanges` (defaults to `0.0.0.0/0`)	kubernetes.io/rule/nlb/mtu=<loadBalancerName>

In order to limit which client IPs can access the Network Load Balancer, specify loadBalancerSourceRanges.

    spec:
      loadBalancerSourceRanges:
        - "143.231.0.0/16"

Further documentation on annotations for Elastic IPs and other common use-cases may be found in the AWS Load Balancer Controller documentation.

Other CLB annotations on Tencent Kubernetes Engine (TKE)

There are other annotations for managing Cloud Load Balancers on TKE as shown below.

    metadata:
      name: my-service
      annotations:
        # Bind Loadbalancers with specified nodes
        service.kubernetes.io/qcloud-loadbalancer-backends-label: key in (value1, value2)

        # ID of an existing load balancer
        service.kubernetes.io/tke-existed-lbid: lb-6swtxxxx

        # Custom parameters for the load balancer (LB), does not support modification of LB type yet
        service.kubernetes.io/service.extensiveParameters: ""

        # Custom parameters for the LB listener
        service.kubernetes.io/service.listenerParameters: ""

        # Specifies the type of Load balancer;
        # valid values: classic (Classic Cloud Load Balancer) or application (Application Cloud Load Balancer)
        service.kubernetes.io/loadbalance-type: xxxxx

        # Specifies the public network bandwidth billing method;
        # valid values: TRAFFIC_POSTPAID_BY_HOUR (bill-by-traffic) and BANDWIDTH_POSTPAID_BY_HOUR (bill-by-bandwidth).
        service.kubernetes.io/qcloud-loadbalancer-internet-charge-type: xxxxxx

        # Specifies the bandwidth value (value range: [1,2000] Mbps).
        service.kubernetes.io/qcloud-loadbalancer-internet-max-bandwidth-out: "10"

        # When this annotation is set, the loadbalancers will only register nodes
        # with pod running on it, otherwise all nodes will be registered.
        # Annotation values must be strings, so the value is quoted.
        service.kubernetes.io/local-svc-only-bind-node-with-pod: "true"

Type ExternalName

Services of type ExternalName map a Service to a DNS name, not to a typical selector such as my-service or cassandra. You specify these Services with the spec.externalName parameter.

This Service definition, for example, maps the my-service Service in the prod namespace to my.database.example.com:

    apiVersion: v1
    kind: Service
    metadata:
      name: my-service
      namespace: prod
    spec:
      type: ExternalName
      externalName: my.database.example.com

When looking up the host my-service.prod.svc.cluster.local, the cluster DNS Service returns a CNAME record with the value my.database.example.com. Accessing my-service works in the same way as other Services but with the crucial difference that redirection happens at the DNS level rather than via proxying or forwarding. Should you later decide to move your database into your cluster, you can start its Pods, add appropriate selectors or endpoints, and change the Service's type.

External IPs

If there are external IPs that route to one or more cluster nodes, Kubernetes Services can be exposed on those externalIPs. Traffic that ingresses into the cluster with the external IP (as destination IP), on the Service port, will be routed to one of the Service endpoints. externalIPs are not managed by Kubernetes and are the responsibility of the cluster administrator.

In the Service spec, externalIPs can be specified along with any of the ServiceTypes. In the example below, "my-service" can be accessed by clients on "80.11.12.10:80" (externalIP:port):

    apiVersion: v1
    kind: Service
    metadata:
      name: my-service
    spec:
      selector:
        app: MyApp
      ports:
        - name: http
          protocol: TCP
          port: 80
          targetPort: 9376
      externalIPs:
        - 80.11.12.10

Shortcomings

Using the userspace proxy for VIPs works at small to medium scale, but will not scale to very large clusters with thousands of Services. The original design proposal for portals has more details on this.

Using the userspace proxy obscures the source IP address of a packet accessing a Service. This makes some kinds of network filtering (firewalling) impossible. The iptables proxy mode does not obscure in-cluster source IPs, but it does still impact clients coming through a load balancer or node-port.

The Type field is designed as nested functionality - each level adds to the previous. This is not strictly required on all cloud providers (e.g. Google Compute Engine does not need to allocate a NodePort to make LoadBalancer work, but AWS does) but the current API requires it.

Virtual IP implementation

The previous information should be sufficient for many people who want to use Services. However, there is a lot going on behind the scenes that may be worth understanding.

Avoiding collisions

One of the primary philosophies of Kubernetes is that you should not be exposed to situations that could cause your actions to fail through no fault of your own. For the design of the Service resource, this means not making you choose your own port number if that choice might collide with someone else's choice. That is an isolation failure.

In order to allow you to choose a port number for your Services, we must ensure that no two Services can collide. Kubernetes does that by allocating each Service its own IP address.

To ensure each Service receives a unique IP, an internal allocator atomically updates a global allocation map in etcd prior to creating each Service. The map object must exist in the registry for Services to get IP address assignments, otherwise creations will fail with a message indicating an IP address could not be allocated.

In the control plane, a background controller is responsible for creating that map (needed to support migrating from older versions of Kubernetes that used in-memory locking). Kubernetes also uses controllers to check for invalid assignments (e.g. due to administrator intervention) and for cleaning up allocated IP addresses that are no longer used by any Services.

Service IP addresses

Unlike Pod IP addresses, which actually route to a fixed destination, Service IPs are not actually answered by a single host. Instead, kube-proxy uses iptables (packet processing logic in Linux) to define virtual IP addresses which are transparently redirected as needed. When clients connect to the VIP, their traffic is automatically transported to an appropriate endpoint. The environment variables and DNS for Services are actually populated in terms of the Service's virtual IP address (and port).

kube-proxy supports three proxy modes—userspace, iptables and IPVS—which each operate slightly differently.

Userspace

As an example, consider the image processing application described above. When the backend Service is created, the Kubernetes control plane assigns a virtual IP address, for example 10.0.0.1. Assuming the Service port is 1234, the Service is observed by all of the kube-proxy instances in the cluster. When a proxy sees a new Service, it opens a new random port, establishes an iptables redirect from the virtual IP address to this new port, and starts accepting connections on it.

When a client connects to the Service's virtual IP address, the iptables rule kicks in, and redirects the packets to the proxy's own port. The "Service proxy" chooses a backend, and starts proxying traffic from the client to the backend.

This means that Service owners can choose any port they want without risk of collision. Clients can connect to an IP and port, without being aware of which Pods they are actually accessing.

iptables

Again, consider the image processing application described above. When the backend Service is created, the Kubernetes control plane assigns a virtual IP address, for example 10.0.0.1. Assuming the Service port is 1234, the Service is observed by all of the kube-proxy instances in the cluster. When a proxy sees a new Service, it installs a series of iptables rules which redirect from the virtual IP address to per-Service rules. The per-Service rules link to per-Endpoint rules which redirect traffic (using destination NAT) to the backends.

When a client connects to the Service's virtual IP address the iptables rule kicks in. A backend is chosen (either based on session affinity or randomly) and packets are redirected to the backend. Unlike the userspace proxy, packets are never copied to userspace, the kube-proxy does not have to be running for the virtual IP address to work, and Nodes see traffic arriving from the unaltered client IP address.

This same basic flow executes when traffic comes in through a node-port or through a load-balancer, though in those cases the client IP does get altered.
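
The choice between random and affinity-based backend selection mentioned above is made per Service. A minimal sketch of the affinity case follows, reusing port 1234 from the example; the Service name, selector label, and targetPort are placeholders, and the timeout shown is the documented default of 10800 seconds (three hours).

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service              # placeholder name
spec:
  selector:
    app: MyApp                  # placeholder label
  # Send all connections from one client IP to the same backend
  # instead of picking a backend at random for each connection.
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800     # how long the affinity is remembered
  ports:
    - protocol: TCP
      port: 1234
      targetPort: 9376          # placeholder container port
```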

IPVS

iptables operations slow down dramatically in large scale clusters, e.g. with 10,000 Services. IPVS is designed for load balancing and is based on in-kernel hash tables, so an IPVS-based kube-proxy can deliver consistent performance across a large number of Services. An IPVS-based kube-proxy also offers more sophisticated load balancing algorithms (least conns, locality, weighted, persistence).
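
Selecting IPVS mode is a kube-proxy setting rather than a Service setting. The fragment below is a sketch of a kube-proxy configuration file that requests IPVS with the least-connection scheduler. How this file reaches kube-proxy (a ConfigMap, a --config flag, or an installer option) depends on how the cluster was set up and is not shown, and the nodes are assumed to have the IPVS kernel modules available.

```yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
# Use in-kernel IPVS hash tables instead of iptables rule chains.
mode: "ipvs"
ipvs:
  # "lc" = least connection; "rr" (round robin) is the usual default.
  scheduler: "lc"
```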

API Object

Service is a top-level resource in the Kubernetes REST API. You can find more details about the API object at: Service API object.

Supported protocols

TCP

You can use TCP for any kind of Service, and it's the default network protocol.

UDP

You can use UDP for most Services. For type=LoadBalancer Services, UDP support depends on the cloud provider offering this facility.

SCTP

FEATURE STATE: Kubernetes v1.20 [stable]

When using a network plugin that supports SCTP traffic, you can use SCTP for most Services. For type=LoadBalancer Services, SCTP support depends on the cloud provider offering this facility. (Most do not).
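
The protocol is chosen per port in the Service spec. A minimal sketch of an SCTP Service is shown here; the Service name, selector label, and port number 7777 are invented for illustration, and the cluster's network plugin is assumed to support SCTP.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-sctp-service         # hypothetical name
spec:
  selector:
    app: MyApp                  # placeholder label
  ports:
    - name: sctp
      protocol: SCTP            # TCP is used when protocol is omitted
      port: 7777                # hypothetical port
      targetPort: 7777
```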

Warnings

Support for multihomed SCTP associations

Windows

Userspace kube-proxy

HTTP

If your cloud provider supports it, you can use a Service in LoadBalancer mode to set up external HTTP / HTTPS reverse proxying, forwarded to the Endpoints of the Service.

PROXY protocol

If your cloud provider supports it, you can use a Service in LoadBalancer mode to configure a load balancer outside of Kubernetes itself, that will forward connections prefixed with PROXY protocol.

The load balancer will send an initial series of octets describing the incoming connection, similar to this example

    PROXY TCP4 192.0.2.202 10.0.42.7 12345 7\r\n

followed by the data from the client.

What's next

Read Connecting Applications with Services
Read about Ingress
Read about EndpointSlices


Source: https://kubernetes.io/docs/concepts/services-networking/service/