Oops, I Deleted the AWS Auth Roles: My EKS Misadventure

Managing an EKS cluster is usually straightforward, but things can still go wrong. In my case, the trouble started with an update to our Terraform configuration. It ended with me locked out of the cluster, working hard to get back in, and learning some important lessons along the way.
The Setup
We were running our Kubernetes cluster on EKS version 1.23 and managing it with Terraform. At first, we used this module to create managed node groups:
    source  = "terraform-aws-modules/eks/aws//modules/eks-managed-node-group"
    version = "19.15.2"
With this submodule, the node groups were created separately from the cluster. Later, we decided to simplify the setup by defining the node groups directly in the main EKS module:
    source  = "terraform-aws-modules/eks/aws"
    version = "19.21.0"
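For context, here is a minimal sketch of what node groups defined directly in the main module can look like. This is not our actual configuration: the cluster name, the `var.vpc_id` and `var.subnet_ids` variables, the node group key `default`, and the instance type and sizes are all placeholders I made up for illustration.

```hcl
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "19.21.0"

  cluster_name    = "example-cluster" # hypothetical name
  cluster_version = "1.23"

  vpc_id     = var.vpc_id     # hypothetical variable
  subnet_ids = var.subnet_ids # hypothetical variable

  # Node groups declared inline instead of through the
  # eks-managed-node-group submodule.
  eks_managed_node_groups = {
    default = { # hypothetical node group
      instance_types = ["m5.large"]
      min_size       = 1
      max_size       = 3
      desired_size   = 2
    }
  }
}
```

The practical difference is that the node groups now live in the same module call, and the same Terraform resource tree, as the cluster itself.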
At the same time, we defined our aws_auth_roles within the EKS module:
    aws_auth_roles = [
      {
        rolearn  = "arn:aws:iam::${var.aws_account_id}:role/rolename"
        username = "username"
        groups   = ["system:masters"]
      },
      ...
    ]
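To show where this list sits, here is a rough sketch of the surrounding module block. It assumes the module is also told to manage the aws-auth ConfigMap through `manage_aws_auth_configmap`, which in the 19.x releases is the input that makes `aws_auth_roles` take effect; whether and how you set it depends on your setup. The other required inputs are left out, and the role name and username are the same placeholders as above.

```hcl
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "19.21.0"

  # Cluster, networking, and node group inputs omitted for brevity.

  # Assumption: the module owns the aws-auth ConfigMap.
  # This also needs a configured kubernetes provider.
  manage_aws_auth_configmap = true

  aws_auth_roles = [
    {
      rolearn  = "arn:aws:iam::${var.aws_account_id}:role/rolename"
      username = "username"
      groups   = ["system:masters"]
    },
  ]
}
```

Once Terraform manages this ConfigMap, a change to the role mappings is a change to who can reach the cluster, so the plan output for this resource deserves a careful read before every apply.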
The Problem
Everything looked simple until Terraform tried to apply these changes. The first thing Terraform did…