SearchSpot Tech Blog Series 1: Karpenter AL2023 Nodes Not Joining EKS 1.29+ (Access Entries + nodeadm Authentication Trap)

Making Karpenter work with Amazon EKS


Migrating from EKS managed node groups to Karpenter on Amazon Linux 2023 (AL2023) is usually a big win: better bin-packing, faster scale-out, and lower idle cost. We made this move at SearchSpot because our workloads are bursty (campaign spikes, new feature launches), and waiting minutes for node groups to scale is a tax.

But there’s a sharp edge that can burn days: AL2023 nodes launched by Karpenter can come up healthy in EC2 and still never register as Kubernetes Nodes on EKS 1.29+, especially when you’re using EKS Access Entries for auth.

This post is the fix we wish existed when we hit it.


TL;DR

AL2023 + Karpenter requires a hybrid authentication setup.

  • Keep an EKS Access Entry of type EC2_LINUX (gives system:nodes)
  • Enable authentication_mode = "API_AND_CONFIG_MAP"
  • Add an aws-auth ConfigMap role mapping that includes both:
    • system:bootstrappers
    • system:nodes

Why: with Access Entries, you cannot attach system:* groups via a STANDARD entry, and the EC2_LINUX entry alone doesn’t cover everything AL2023 bootstrap needs.


When this applies

You’re likely affected if all are true:

  • ✅ Karpenter provisions the instances (NodeClaim created, EC2 instance Running)
  • ✅ AMI family is AL2023
  • ✅ EKS cluster version 1.29+ (we’ve seen it across 1.29–1.33)
  • ✅ You’re relying on EKS Access Entries (not only aws-auth)
  • ✅ Symptom is “nodes launch but never join” (kubectl get nodes never shows them)

If your nodes are joining fine, you don’t need this post.


What you see when it fails

Symptom 1: NodeClaim stuck “Unknown” / “Node not registered with cluster”

kubectl get nodeclaim

Example:

NAME                    TYPE         CAPACITY    ZONE            NODE   READY     AGE
general-purpose-wkg62   t4g.medium   on-demand   us-east-1a            Unknown   8m

  • Karpenter provisions EC2
  • Instance health checks pass
  • But no node appears in kubectl get nodes
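A quick way to correlate a stuck NodeClaim with its EC2 instance is the providerID in the NodeClaim status. A sketch, using the example NodeClaim name above and assuming the AWS CLI and kubectl are configured for this cluster:

```shell
# The providerID has the form aws:///<zone>/<instance-id>; extract the
# instance ID, then pull the instance's console output for bootstrap clues.
IID=$(kubectl get nodeclaim general-purpose-wkg62 -o jsonpath='{.status.providerID}' \
  | awk -F/ '{print $NF}')
aws ec2 get-console-output --instance-id "$IID" --region us-east-1 --latest
```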

Symptom 2: Logs vary (don’t overfit to one signal)

Depending on timing and configuration, you might see any of these:

  • kubelet/nodeadm logs imply auth/bootstrap permission issues
  • CSRs may be missing or stuck (more on this below)
  • EC2 console output sometimes shows a warning like:

aws ec2 get-console-output --instance-id i-xxx --region <region> --latest

cloud-init: Unhandled unknown content-type (application/node.eks.aws)

That message is commonly observed in AL2023/nodeadm contexts, but it’s not the only failure signature. Treat it as “seen in the wild,” not as sole proof.


Why this happens (the real trap)

AL2023 bootstraps with nodeadm, not bootstrap.sh

Amazon Linux 2023 for EKS uses nodeadm as the bootstrap mechanism. This is a meaningful change from AL2’s /etc/eks/bootstrap.sh.
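For context, here is a minimal sketch of the NodeConfig document nodeadm consumes — this is the payload behind the application/node.eks.aws content type mentioned above. All values are illustrative placeholders; on AL2023, Karpenter renders the real document for you.

```yaml
# Illustrative NodeConfig -- do not hand-write this for Karpenter nodes.
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
  cluster:
    name: my-eks-cluster
    apiServerEndpoint: https://EXAMPLE.gr7.us-east-1.eks.amazonaws.com
    certificateAuthority: <base64-encoded cluster CA>
    cidr: 10.100.0.0/16
```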

The authentication sharp edge: Access Entries vs bootstrap groups

To join the cluster, nodes need the right Kubernetes identity groups during bootstrap and normal operation:

  • system:bootstrappers — used for initial registration/bootstrapping flows
  • system:nodes — used for ongoing node permissions

Now combine that with Access Entries constraints:

  • EC2_LINUX Access Entry automatically maps the role as a node identity and grants system:nodes
  • STANDARD Access Entry lets you specify custom groups… but AWS rejects system:* groups (reserved prefix)

So you end up in a catch-22:

  • EC2_LINUX gives you system:nodes, but you still need system:bootstrappers for AL2023 bootstrap to complete reliably
  • STANDARD can’t be used to add system:bootstrappers because system:* is blocked

Managed node groups work because AWS wires up the node auth path for you automatically. Karpenter nodes are self-managed from the cluster’s perspective, so you have to wire it up yourself.
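To make the catch-22 concrete, here is a sketch of the blocked path in the same terraform-aws-modules/eks access-entry syntax (the entry key is made up):

```hcl
# This does NOT work: the EKS API rejects Kubernetes groups with the
# reserved "system:" prefix on STANDARD access entries, so apply fails.
access_entries = {
  karpenter_bootstrappers = {
    principal_arn     = aws_iam_role.karpenter_node.arn
    type              = "STANDARD"
    kubernetes_groups = ["system:bootstrappers"] # rejected by the API
  }
}
```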


Quick decision tree (before you change anything)

  1. If you’re on AL2023 and using custom userData
    Remove custom userData first. AL2023 uses nodeadm; don’t force bootstrap.sh.
  2. If you’re on Access Entries only (authentication_mode = API)
    You likely need the hybrid fix below.
  3. If you’re on AL2 and nodes don’t join
    This is usually networking/security-group/endpoint reachability, not this specific auth trap.

The fix: Hybrid authentication (Access Entry + aws-auth)

Step 0: Set cluster authentication mode to hybrid

You must run EKS auth in API_AND_CONFIG_MAP mode so both Access Entries and aws-auth are honored.

# modules/eks/main.tf
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "20.35.0"

  cluster_name    = "my-eks-cluster"
  cluster_version = "1.33"

  # Required for the hybrid fix
  authentication_mode = "API_AND_CONFIG_MAP"

  # ... rest of EKS config
}

Step 1: Keep the EC2_LINUX Access Entry (don’t remove it)

This continues to provide the node identity mapping and system:nodes.

# modules/eks/main.tf
module "eks" {
  # ...
  access_entries = {
    karpenter_node = {
      principal_arn = aws_iam_role.karpenter_node.arn
      type          = "EC2_LINUX"
    }
  }
}

Step 2: Add aws-auth role mapping with BOTH groups

This is the missing piece for AL2023 bootstrap reliability.

Important: Don’t define the aws-auth resources inside the eks module; doing so can create circular dependencies. Put them at the root (environment) level.

# environments/dev/main.tf
module "aws_auth" {
  source  = "terraform-aws-modules/eks/aws//modules/aws-auth"
  version = "20.35.0"

  manage_aws_auth_configmap = true

  aws_auth_roles = [
    {
      rolearn  = aws_iam_role.karpenter_node.arn
      username = "system:node:{{EC2PrivateDNSName}}"
      groups   = ["system:bootstrappers", "system:nodes"]
    }
  ]

  depends_on = [module.eks]
}

Step 3: Configure the Kubernetes provider (for aws-auth module)

# environments/dev/provider.tf
provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    args        = ["eks", "get-token", "--cluster-name", module.eks.cluster_name]
  }
}

Karpenter correctness checklist (common “not joining” causes)

Even after fixing auth, these can still block registration. We hit multiple during migration, so here’s the tightened list.

1) Don’t use custom userData with AL2023

AL2023 uses nodeadm. Let Karpenter generate the NodeConfig automatically.

# ✅ Correct: omit userData for AL2023
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: general-purpose-nodeclass
spec:
  amiFamily: AL2023
  role: karpenter-node-role
  amiSelectorTerms:
    - alias: al2023@latest
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "my-eks-cluster"
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "my-eks-cluster"

2) spec.role is IAM role name, not instance profile

# ✅ Correct
spec:
  role: karpenter-node-role

3) Subnet tagging: tag only the subnets you actually want Karpenter to use

If you tag public subnets unintentionally, Karpenter can schedule nodes there. Whether that breaks depends on your endpoint mode (public/private), NAT, routing, and egress controls. In many real setups it results in “instance up, node never registers.”

# ✅ Recommended: tag private subnets for Karpenter discovery
private_subnet_tags = {
  "karpenter.sh/discovery" = var.cluster_name
}

# Keep public subnet tags for ELB only
public_subnet_tags = {
  "kubernetes.io/role/elb" = 1
}

IRSA + IAM: keep Karpenter able to create instance profiles

Step 4: Verify the controller ServiceAccount name (version drift)

The controller ServiceAccount name has varied across Karpenter chart versions and installs (the upstream chart default is karpenter, but overrides such as karpenter-controller are common). Make sure your IRSA trust policy matches the ServiceAccount actually deployed.

module "karpenter_irsa" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
  version = "5.48.0"

  role_name                          = "karpenter-controller"
  attach_karpenter_controller_policy = true

  karpenter_controller_cluster_name       = module.eks.cluster_name
  karpenter_controller_node_iam_role_arns = [aws_iam_role.karpenter_node.arn]
  karpenter_sqs_queue_arn                 = aws_sqs_queue.karpenter_interruption.arn

  oidc_providers = {
    ex = {
      provider_arn               = module.eks.oidc_provider_arn
      namespace_service_accounts = ["karpenter:karpenter-controller"]
    }
  }
}

Step 5: Ensure Karpenter has the IAM permissions to manage instance profiles (including PassRole)

This is required for Karpenter to attach the node IAM role to instance profiles.

resource "aws_iam_policy" "karpenter_instance_profile_policy" {
  name        = "karpenter-instance-profile-policy"
  description = "Allow Karpenter to manage instance profiles"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "iam:GetInstanceProfile",
          "iam:CreateInstanceProfile",
          "iam:DeleteInstanceProfile",
          "iam:AddRoleToInstanceProfile",
          "iam:RemoveRoleFromInstanceProfile",
          "iam:TagInstanceProfile",
          "iam:PassRole"
        ]
        Resource = "*"
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "karpenter_controller_instance_profile_policy" {
  policy_arn = aws_iam_policy.karpenter_instance_profile_policy.arn
  role       = module.karpenter_irsa.iam_role_name
}

Verification (copy/paste)

1) aws-auth ConfigMap contains both groups

kubectl get configmap aws-auth -n kube-system -o yaml

Expected snippet:

data:
  mapRoles: |
    - groups:
      - system:bootstrappers
      - system:nodes
      rolearn: arn:aws:iam::ACCOUNT:role/karpenter-node-role
      username: system:node:{{EC2PrivateDNSName}}

2) Access Entry exists (EC2_LINUX)

aws eks describe-access-entry \
  --cluster-name my-eks-cluster \
  --principal-arn arn:aws:iam::ACCOUNT:role/karpenter-node-role

Expected:

{
  "accessEntry": {
    "type": "EC2_LINUX",
    "kubernetesGroups": ["system:nodes"],
    "username": "system:node:{{EC2PrivateDNSName}}"
  }
}

3) Launch a pod to trigger provisioning

# kubectl run dropped the --requests flag in v1.24+; set requests via --overrides
kubectl run test --image=nginx \
  --overrides='{"spec":{"containers":[{"name":"test","image":"nginx","resources":{"requests":{"cpu":"100m","memory":"128Mi"}}}]}}'
kubectl get nodeclaim -w

4) Optional: check CSRs (helpful bootstrap signal)

kubectl get csr
kubectl describe csr <name>

5) Node appears

kubectl get nodes

You should see a Ready node within a couple of minutes.
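Assuming aws and kubectl are configured for the cluster, the three auth pieces above can also be sanity-checked in one pass (cluster name and role ARN are the same placeholders used throughout this post):

```shell
# One-shot check of the hybrid auth setup. Placeholders: my-eks-cluster, ACCOUNT.
CLUSTER=my-eks-cluster
ROLE_ARN=arn:aws:iam::ACCOUNT:role/karpenter-node-role

# 1) Hybrid mode enabled?            expect: API_AND_CONFIG_MAP
aws eks describe-cluster --name "$CLUSTER" \
  --query 'cluster.accessConfig.authenticationMode' --output text

# 2) EC2_LINUX access entry present? expect: EC2_LINUX
aws eks describe-access-entry --cluster-name "$CLUSTER" \
  --principal-arn "$ROLE_ARN" --query 'accessEntry.type' --output text

# 3) aws-auth carries both groups?   expect both group names to print
kubectl get configmap aws-auth -n kube-system -o jsonpath='{.data.mapRoles}' \
  | grep -oE 'system:(bootstrappers|nodes)' | sort -u
```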


Debugging commands (when it still doesn’t join)

# NodeClaim details
kubectl describe nodeclaim <name>

# Karpenter controller logs
kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter --tail=200

# EC2 console output (often useful on AL2023)
aws ec2 get-console-output --instance-id i-xxx --region <region> --latest

# Verify the instance profile attached
aws ec2 describe-instances --instance-ids i-xxx \
  --query 'Reservations[0].Instances[0].IamInstanceProfile.Arn'

Key takeaways

  1. AL2023 + Karpenter + Access Entries can require hybrid auth
    Use API_AND_CONFIG_MAP, keep EC2_LINUX, and add aws-auth mapping with system:bootstrappers + system:nodes.
  2. Don’t specify custom userData on AL2023
    Let Karpenter generate nodeadm config.
  3. Be intentional with subnet discovery tags
    Tag only the subnets you want Karpenter to use.
  4. IRSA + IAM permissions matter
    Make sure the controller SA name is correct and Karpenter can manage instance profiles (including iam:PassRole).

