EKS Anywhere, targeted scale-down of machines on bare-metal instances

Ambar Hassani
7 min read · Aug 6, 2024


This article is part of my EKS Anywhere series: EKS Anywhere, extending the Hybrid cloud momentum | by Ambar Hassani

Recently, I was tasked with designing a bare-metal deployment of EKS Anywhere in which one of the use cases was to ensure that scale-in operations can target specific worker nodes. The reasons are obvious: decommissioning a faulty node, carrying out hardware maintenance on it, and so on.

At the outset, I felt pretty comfortable, given that Tinkerbell, the underlying infrastructure provider for Cluster API/EKS Anywhere on bare metal, associates the cluster with a hardware.csv file. That file lists the machines, with their MAC and IP addresses, that back the nodes in the cluster.
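
Before changing anything, it helps to see how the rows in hardware.csv surface on the management cluster as Tinkerbell Hardware objects. A quick check, assuming the default eksa-system namespace and the management cluster kubeconfig, looks roughly like this:

kubectl get hardware -n eksa-system --kubeconfig <management-cluster-kubeconfig>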

The assumption was that if we remove the corresponding entry from hardware.csv and upgrade the cluster, EKS Anywhere and Tinkerbell will automatically target that particular node for deletion.

The Problem

When I tested this out, I was surprised to find that the assumption did not hold. Instead, a different, seemingly random node was deleted, which prompted me to investigate whether there were any known bugs. In that stride, I landed upon EKSA bare metal cluster scale-in doesn’t honor new hardware.csv file · Issue #8190 · aws/eks-anywhere (github.com)

I breathed a sigh of relief, expecting the above to be the fix for my woes. I was running EKS Anywhere version 0.19.5, and per the GitHub issue, the problem was fixed in 0.19.7. So I decided to upgrade to 0.20.x, confident that the targeted scale-in issue would now be history. With that confidence, I re-ran the use-case test and landed in a rather embarrassing situation: the same issue still persisted.

The logs below (also posted on the above issue via my user id thecloudgarage) illustrate the nature of the issue, even after upgrading EKS Anywhere to 0.20.x.

EKS Anywhere version:

eksctl anywhere version
Version: v0.20.1
Release Manifest URL: https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml
Bundle Manifest URL: https://anywhere-assets.eks.amazonaws.com/releases/bundles/69/manifest.yaml

Before starting the scale-in, I have two worker nodes (instance-529 and instance-530):

kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
instance-528 Ready control-plane 4h24m v1.29.5-eks-1109419 10.103.15.163 <none> Ubuntu 22.04.4 LTS 5.15.0-113-generic containerd://1.7.18-0-gae71819c4
instance-529 Ready <none> 31m v1.29.5-eks-1109419 10.103.15.165 <none> Ubuntu 22.04.4 LTS 5.15.0-113-generic containerd://1.7.18-0-gae71819c4
instance-530 Ready <none> 3h34m v1.29.5-eks-1109419 10.103.15.182 <none> Ubuntu 22.04.4 LTS 5.15.0-113-generic containerd://1.7.18-0-gae71819c4
instance-531 Ready control-plane 4h7m v1.29.5-eks-1109419 10.103.15.184 <none> Ubuntu 22.04.4 LTS 5.15.0-113-generic containerd://1.7.18-0-gae71819c4
instance-532 Ready control-plane 3h47m v1.29.5-eks-1109419 10.103.15.186 <none> Ubuntu 22.04.4 LTS 5.15.0-113-generic containerd://1.7.18-0-gae71819c4

I then edited my hardware.csv file to remove the instance-530 worker node entry:

cat hardware-targeted-scale-down.csv
hostname,bmc_ip,bmc_username,bmc_password,mac,ip_address,netmask,gateway,nameservers,labels,disk
instance-531,10.204.196.126,root,xxxxxx,XX:XX:XX:XX:XX:XX,10.103.15.184,255.255.252.0,10.103.12.1,10.103.8.12|10.103.12.12,type=cp,/dev/sda
instance-532,10.204.196.127,root,xxxxxx,XX:XX:XX:XX:XX:XX,10.103.15.186,255.255.252.0,10.103.12.1,10.103.8.12|10.103.12.12,type=cp,/dev/sda
instance-528,10.204.196.125,root,xxxxxx,XX:XX:XX:XX:XX:XX,10.103.15.163,255.255.252.0,10.103.12.1,10.103.8.12|10.103.12.12,type=cp,/dev/sda
instance-529,10.204.196.129,root,xxxxxx,XX:XX:XX:XX:XX:XX,10.103.15.165,255.255.252.0,10.103.12.1,10.103.8.12|10.103.12.12,type=worker,/dev/nvme0n1

I also adjusted my cluster config file to scale the worker node count down to 1 (a sketch of that change follows) and then ran the upgrade command:
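
For context, the scale-down itself is just a change to the worker node group count in the EKS Anywhere Cluster spec. A minimal sketch of the relevant section, with hypothetical node-group and machine-config names, would look like this:

spec:
  workerNodeGroupConfigurations:
  - count: 1                              # scaled down from 2
    machineGroupRef:
      kind: TinkerbellMachineConfig
      name: eksa-xxxx-cluster2n-worker    # hypothetical machine config name
    name: md-0                            # hypothetical worker node group name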

eksctl anywhere upgrade cluster -f cluster-config-upgrade-20240730094824-scale-to-1.yaml --hardware-csv hardware-targeted-scale-down.csv --kubeconfig /home/ubuntu/eksanywhere/eksa-xxxx-cluster2n/eksa-xxxx-cluster2n-eks-a-cluster.kubeconfig --skip-validations=pod-disruption
Performing setup and validations
✅ Tinkerbell provider validation
✅ SSH Keys present
✅ Validate OS is compatible with registry mirror configuration
✅ Validate certificate for registry mirror
✅ Control plane ready
✅ Worker nodes ready
✅ Nodes ready
✅ Cluster CRDs ready
✅ Cluster object present on workload cluster
✅ Upgrade cluster kubernetes version increment
✅ Upgrade cluster worker node group kubernetes version increment
✅ Validate authentication for git provider
✅ Validate immutable fields
✅ Validate cluster's eksaVersion matches EKS-Anywhere Version
✅ Validate eksa controller is not paused
✅ Validate eksaVersion skew is one minor version
Ensuring etcd CAPI providers exist on management cluster before upgrade
Pausing GitOps cluster resources reconcile
Upgrading core components
Backing up management cluster's resources before upgrading
Upgrading management cluster
Updating Git Repo with new EKS-A cluster spec
Finalized commit and committed to local repository {"hash": "2d209dbf9ebd2a0f45ff88c8fe1a793f4d11348a"}
Forcing reconcile Git repo with latest commit
Resuming GitOps cluster resources kustomization
Writing cluster config file
🎉 Cluster upgraded!
Cleaning up backup resources

However, EKS Anywhere still did not delete the node that I had removed from hardware.csv. Instead, it started cordoning and deleting the other worker node (instance-529):

kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
instance-528 Ready control-plane 4h26m v1.29.5-eks-1109419 10.103.15.163 <none> Ubuntu 22.04.4 LTS 5.15.0-113-generic containerd://1.7.18-0-gae71819c4
instance-529 Ready,SchedulingDisabled <none> 33m v1.29.5-eks-1109419 10.103.15.165 <none> Ubuntu 22.04.4 LTS 5.15.0-113-generic containerd://1.7.18-0-gae71819c4
instance-530 Ready <none> 3h36m v1.29.5-eks-1109419 10.103.15.182 <none> Ubuntu 22.04.4 LTS 5.15.0-113-generic containerd://1.7.18-0-gae71819c4
instance-531 Ready control-plane 4h9m v1.29.5-eks-1109419 10.103.15.184 <none> Ubuntu 22.04.4 LTS 5.15.0-113-generic containerd://1.7.18-0-gae71819c4
instance-532 Ready control-plane 3h49m v1.29.5-eks-1109419 10.103.15.186 <none> Ubuntu 22.04.4 LTS 5.15.0-113-generic containerd://1.7.18-0-gae71819c4

Per my understanding of the fix, instance-530 should have been deleted, as it was the node removed from hardware.csv. However, after the scale-in upgrade, the other worker node (instance-529) was deleted instead:

kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
instance-528 Ready control-plane 4h32m v1.29.5-eks-1109419 10.103.15.163 <none> Ubuntu 22.04.4 LTS 5.15.0-113-generic containerd://1.7.18-0-gae71819c4
instance-530 Ready <none> 3h43m v1.29.5-eks-1109419 10.103.15.182 <none> Ubuntu 22.04.4 LTS 5.15.0-113-generic containerd://1.7.18-0-gae71819c4
instance-531 Ready control-plane 4h15m v1.29.5-eks-1109419 10.103.15.184 <none> Ubuntu 22.04.4 LTS 5.15.0-113-generic containerd://1.7.18-0-gae71819c4
instance-532 Ready control-plane 3h55m v1.29.5-eks-1109419 10.103.15.186 <none> Ubuntu 22.04.4 LTS 5.15.0-113-generic containerd://1.7.18-0-gae71819c4
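
Incidentally, the node listing only tells half the story. Watching the Cluster API Machine objects during the scale-in (again assuming they live in the eksa-system namespace of the management cluster) makes it easier to see which Machine the MachineSet controller actually picked for deletion; for a specific one, use a hypothetical machine name as a placeholder:

kubectl get machines.cluster.x-k8s.io -n eksa-system
kubectl describe machines.cluster.x-k8s.io <machine-name> -n eksa-system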

Now, the fix!

After discussing with various community members, it seemed as if the behavior originated in Cluster API itself. That led me to investigate a CRD named machinesets.cluster.x-k8s.io.

As I studied the CRD specification, I was intrigued to find a deletion policy in the MachineSet spec that held the key to remediating the issue.

spec:
  clusterName: eksa-xxxx-cluster2q
  deletePolicy: Random
  replicas: 2
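
To confirm what your own MachineSets are set to, something along these lines should work (assuming the Cluster API objects sit in the eksa-system namespace of the management cluster):

kubectl get machinesets.cluster.x-k8s.io -n eksa-system -o custom-columns=NAME:.metadata.name,POLICY:.spec.deletePolicy,REPLICAS:.spec.replicas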

The relevant snippet of the Cluster API specification for this field is given below and can be seen in full detail at MachineSet Controller · The Cluster API Book (k8s.io):

// DeletePolicy defines the policy used to identify nodes to delete when downscaling.
// Defaults to "Random". Valid values are "Random", "Newest", "Oldest"

None of these options satisfied my criterion of targeting a specific node for scale-in.

Finally, as luck would have it, I came across another GitHub issue for Cluster API: add docs about deleting specific machines when downscaling · Issue #10306 · kubernetes-sigs/cluster-api (github.com)

From there I found a workaround that finally steered things in the right direction. The fix was to label the targeted node with cluster.x-k8s.io/delete-machine set to an empty value.

And so, with the commands below, I labeled my targeted node, instance-578 (this was a fresh test run, hence the different instance names):

kubectl get nodes -A
NAME STATUS ROLES AGE VERSION
instance-574 Ready control-plane 81m v1.29.5-eks-1109419
instance-575 Ready <none> 24m v1.29.5-eks-1109419
instance-576 Ready control-plane 48m v1.29.5-eks-1109419
instance-577 Ready control-plane 63m v1.29.5-eks-1109419
instance-578 Ready <none> 6m27s v1.29.5-eks-1109419

kubectl label node instance-578 cluster.x-k8s.io/delete-machine=
node/instance-578 labeled
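
A quick sanity check that the label actually landed on the node:

kubectl get node instance-578 --show-labels | grep delete-machine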

eksctl anywhere upgrade cluster -f cluster-config-upgrade-20240730094824-scale-to-1.yaml --hardware-csv hardware-targeted-scale-down.csv --kubeconfig /home/ubuntu/eksanywhere/eksa-xxxx-cluster2n/eksa-xxxx-cluster2n-eks-a-cluster.kubeconfig --skip-validations=pod-disruption
Performing setup and validations
✅ Tinkerbell provider validation
✅ SSH Keys present
✅ Validate OS is compatible with registry mirror configuration
✅ Validate certificate for registry mirror
✅ Control plane ready
✅ Worker nodes ready
✅ Nodes ready
✅ Cluster CRDs ready
✅ Cluster object present on workload cluster
✅ Upgrade cluster kubernetes version increment
✅ Upgrade cluster worker node group kubernetes version increment
✅ Validate authentication for git provider
✅ Validate immutable fields
✅ Validate cluster's eksaVersion matches EKS-Anywhere Version
✅ Validate eksa controller is not paused
✅ Validate eksaVersion skew is one minor version
Ensuring etcd CAPI providers exist on management cluster before upgrade
Pausing GitOps cluster resources reconcile
Upgrading core components
Backing up management cluster's resources before upgrading
Upgrading management cluster
Updating Git Repo with new EKS-A cluster spec
Finalized commit and committed to local repository
Forcing reconcile Git repo with latest commit
Resuming GitOps cluster resources kustomization
Writing cluster config file
🎉 Cluster upgraded!
Cleaning up backup resources

kubectl get nodes
NAME STATUS ROLES AGE VERSION
instance-574 Ready control-plane 86m v1.29.5-eks-1109419
instance-575 Ready <none> 28m v1.29.5-eks-1109419
instance-576 Ready control-plane 52m v1.29.5-eks-1109419
instance-577 Ready control-plane 67m v1.29.5-eks-1109419

I then ran the EKS Anywhere upgrade with the worker count changed from 2 to 1, as shown above. Thanks to the label, the scale-in correctly targeted instance-578 and removed it from the cluster!
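
As an aside, if my reading of the Cluster API book and issue #10306 is correct, the same marker can also be applied as an annotation directly on the corresponding Machine object instead of labeling the node, for example (hypothetical machine name, eksa-system namespace assumed):

kubectl annotate machines.cluster.x-k8s.io <machine-name> -n eksa-system cluster.x-k8s.io/delete-machine=yes

Either way, the idea is the same: the marked machine is prioritized for deletion on the next scale-in.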

Well, at the end of it all, it feels easy-peasy, but it clocked me almost a week of head-scratching amidst the daily frenzy.

I hope this helps the wider community looking to solve a similar use case, which I feel is one of the most important ones when it comes to operating bare-metal clusters.

cheers,

Ambar@thecloudgarage

#iwork4dell
