Installing EKS addons with the AWS CDK
I recently got involved with a project to build the infrastructure for testing different quantitative frameworks using AWS Batch on Amazon EKS (EKS). Quantitative frameworks, like equity long/short and fixed income arbitrage, are frequently used by hedge funds to find inefficiencies in the market, manage risk, and generate returns for their investors. The team previously used the CDK to test these frameworks with AWS Batch on Amazon ECS and now wanted to replicate that setup with EKS.
The biggest hurdles I encountered while working on this problem were:
- Installing EKS Addons
- Identifying which IAM permissions the jobs required and assigning them to the Pods scheduled by AWS Batch
I really didn’t anticipate having trouble with either of these things. I assumed that if I got stuck, I could simply consult the CDK documentation.
Installing Addons with CDK
The first issue occurred when I tried installing the EKS Addon for the EBS CSI Driver. In November 2023, Amazon added support for pod identities, a new way to assign an IAM role to a Kubernetes pod. Although I could have used IRSA, I wanted to use the latest method for assigning roles to Pods. Fortunately, the CDK includes a convenient construct for creating service accounts for IRSA and Pod Identities.
sa = eks.ServiceAccount(self, "ebs-csi-controller-sa",
    name="ebs-csi-controller-sa", namespace="kube-system",
    identity_type=eks.IdentityType.POD_IDENTITY, cluster=cluster,
)
# ServiceAccount takes no policy argument, so attach the managed policy to the role it creates.
sa.role.add_managed_policy(iam.ManagedPolicy.from_managed_policy_arn(self, "ebs-csi-policy",
    managed_policy_arn="arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy"))
This block of code creates a Kubernetes serviceAccount called ebs-csi-controller-sa in the kube-system namespace and automatically creates an identity association with an IAM Role. The role includes the required trust policy for pod identities along with the AmazonEBSCSIDriverPolicy managed policy.
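To make the trust policy concrete, here is a minimal sketch that builds an equivalent document with nothing but the standard library. The helper name pod_identity_trust_policy is my own and is not part of the CDK, and the CDK may well emit the same permissions as separate statements rather than one.

```python
import json

# Service principal that EKS Pod Identity uses to assume roles on behalf of pods.
POD_IDENTITY_PRINCIPAL = "pods.eks.amazonaws.com"


def pod_identity_trust_policy() -> dict:
    """Return an IAM trust policy that lets EKS Pod Identity assume a role.

    Hypothetical helper for illustration only; it is not a CDK API.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": POD_IDENTITY_PRINCIPAL},
                # Pod Identity needs both actions: AssumeRole to obtain
                # credentials and TagSession to attach session tags.
                "Action": ["sts:AssumeRole", "sts:TagSession"],
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(pod_identity_trust_policy(), indent=2))
```

Comparing this against the trust relationship shown for the generated role in the IAM console is a quick way to confirm the identity association was set up as expected.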
The next step is to create the Addon.
ebs_csi_addon = eks.Addon(self, "ebs-csi-addon",
    addon_name="aws-ebs-csi-driver",
    addon_version="v1.35.0-eksbuild.1",
    cluster=cluster,
    preserve_on_delete=False,
)
If you run this code as is, as I did, CloudFormation might install the Addon before the service account is created. By default, the EBS CSI Addon creates a Kubernetes serviceAccount called ebs-csi-controller-sa in the kube-system namespace. This causes an issue because the Lambda function used by CloudFormation to create the Kubernetes serviceAccount uses kubectl create rather than kubectl apply. Using create throws an error because the Kubernetes serviceAccount already exists.
As I learned, there are a couple of things you need to do to fix this. First, you reconfigure how EKS resolves conflicts for the Addon. By setting this to OVERWRITE, EKS will overwrite the current settings with the Amazon EKS default values. Next, you need to instruct CloudFormation to provision the ServiceAccount before the Addon.
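# Reach the underlying L1 CfnAddon to set a property the L2 construct does not expose.
cfnaddon = ebs_csi_addon.node.default_child
cfnaddon.add_property_override("ResolveConflicts", "OVERWRITE")
# Provision the ServiceAccount and the Pod Identity Agent addon before the EBS CSI addon.
cfn_eks_pod_identity_agent_addon = cluster.node.try_find_child("EksPodIdentityAgentAddon")
ebs_csi_addon.node.add_dependency(sa, cfn_eks_pod_identity_agent_addon)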
Putting it all together
Below is an example that employs all of the things I’ve mentioned in this blog so far. I created a small helper function called create_service_account to create a ServiceAccount using Pod Identities or IRSA based on the value of the pod_identity [bool] attribute.
from aws_cdk import Stack
from aws_cdk import aws_eks as eks
from aws_cdk import aws_iam as iam
from constructs import Construct


class CdkStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        masters_role = iam.Role(self, "eks-admin", assumed_by=iam.AccountRootPrincipal())
        # Create an EKS cluster
        cluster = eks.Cluster(self, "quant-eks",
            version=eks.KubernetesVersion.V1_30,
            masters_role=masters_role,
            authentication_mode=eks.AuthenticationMode.API_AND_CONFIG_MAP
        )
        # Create the ServiceAccount (and its role) before the Addon
        sa = self.create_service_account(service_account_name="ebs-csi-controller-sa",
            namespace="kube-system", pod_identity=True,
            cluster=cluster,
            policy=iam.ManagedPolicy.from_managed_policy_arn(self, "ebs-csi-policy",
                managed_policy_arn="arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy")
        )
        # Create the Addon using the L2 construct
        ebs_csi_addon = eks.Addon(self, "ebs-csi-addon",
            addon_name="aws-ebs-csi-driver",
            addon_version="v1.35.0-eksbuild.1",
            cluster=cluster,
            preserve_on_delete=False,
        )
        # Override ResolveConflicts on the underlying L1 CfnAddon
        cfnaddon = ebs_csi_addon.node.default_child
        cfnaddon.add_property_override("ResolveConflicts", "OVERWRITE")
        # cfnaddon.add_property_override("ServiceAccountRoleArn", sa.role.role_arn) uncomment for IRSA
        # Provision the ServiceAccount and the Pod Identity Agent addon first
        cfn_eks_pod_identity_agent_addon = cluster.node.try_find_child("EksPodIdentityAgentAddon")
        ebs_csi_addon.node.add_dependency(sa, cfn_eks_pod_identity_agent_addon)

    def create_service_account(self, service_account_name: str, namespace: str, pod_identity: bool,
                               cluster: eks.Cluster, policy: iam.IManagedPolicy) -> eks.ServiceAccount:
        sa = eks.ServiceAccount(self, f"ServiceAccount-{service_account_name}",
            cluster=cluster,
            name=service_account_name,
            namespace=namespace,
            identity_type=eks.IdentityType.POD_IDENTITY if pod_identity else eks.IdentityType.IRSA
        )
        # Attach the managed policy to the role the construct created
        sa.role.add_managed_policy(policy)
        return sa
Motivation for this blog
My main motivation for publishing this blog is to help others who are using the CDK to manage EKS. I found the examples provided in the CDK documentation to be lacking. I couldn’t find any working examples on Stack Overflow or elsewhere either. That said, if you are going to use the CDK to provision and manage EKS, I strongly recommend looking at EKS Blueprints. Blueprints provides a simple mechanism for installing EKS Addons without having to deal with the built-in Level 1 and Level 2 (L1/L2) constructs.
The old fashioned way
If you need to use IRSA, or you want to do things the old-fashioned way from before there were L2 constructs, you can accomplish something similar by following this example:
from aws_cdk import CfnJson

# eks_cluster is an existing eks.Cluster with an OIDC provider.
# The issuer is a deploy-time token, so the condition map is wrapped in CfnJson.
issuer_hostpath_ebs = CfnJson(
    self, "IssuerHostPathEbs",
    value={
        f"{eks_cluster.open_id_connect_provider.open_id_connect_provider_issuer}:sub":
            "system:serviceaccount:kube-system:ebs-csi-controller-sa"
    }
)
# Trust policy: only the ebs-csi-controller-sa service account may assume the role
ebs_csi_trust_policy = iam.FederatedPrincipal(
    federated=eks_cluster.open_id_connect_provider.open_id_connect_provider_arn,
    conditions={
        "StringEquals": issuer_hostpath_ebs
    },
    assume_role_action="sts:AssumeRoleWithWebIdentity"
)
ebs_csi_driver_role = iam.Role(self, "EbsCsiDriverRole",
    managed_policies=[iam.ManagedPolicy.from_managed_policy_arn(self, "EBSCSIDriverPolicy",
        managed_policy_arn="arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy")],
    assumed_by=ebs_csi_trust_policy,
)
# Install the Addon with the L1 construct and hand it the IRSA role
ebs_csi_addon = eks.CfnAddon(self, "eks-ebs-csi-driver",
    cluster_name=eks_cluster.cluster_name,
    addon_name="aws-ebs-csi-driver",
    addon_version="v1.34.0-eksbuild.1",
    preserve_on_delete=False,
    service_account_role_arn=ebs_csi_driver_role.role_arn
)
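The CfnJson wrapper exists because the OIDC issuer is only resolved at deploy time, yet it has to appear as a key in the condition map. To show what the role's trust policy roughly resolves to, here is a sketch that builds the same document from plain strings. The function name, the account ID, and the issuer value are made up for illustration, and I am assuming the issuer is written without the https:// prefix.

```python
import json


def irsa_trust_policy(oidc_provider_arn: str, issuer: str,
                      namespace: str, service_account: str) -> dict:
    """Build an IRSA trust policy scoped to a single service account.

    Hypothetical helper for illustration; `issuer` is assumed to be the
    OIDC issuer without the https:// prefix.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Federated": oidc_provider_arn},
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # The key embeds the issuer, which is why the CDK
                        # version needs CfnJson to resolve it at deploy time.
                        f"{issuer}:sub": f"system:serviceaccount:{namespace}:{service_account}"
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values, not taken from a real cluster.
    issuer = "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE"
    arn = f"arn:aws:iam::111122223333:oidc-provider/{issuer}"
    print(json.dumps(irsa_trust_policy(arn, issuer, "kube-system", "ebs-csi-controller-sa"), indent=2))
```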
Identifying IAM Permissions
To identify which IAM permissions the Batch jobs required, I used IAM Access Analyzer. IAM Access Analyzer reviews your AWS CloudTrail logs and generates a policy template that contains the permissions that the entity used in your specified date range. You can use the template to create a policy with fine-grained permissions that grant only the permissions that are required to support your specific use case.
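I used the console for this, but the same workflow can be scripted. Below is a rough sketch, assuming a client that behaves like boto3.client("accessanalyzer"); the function name and its parameters are my own, and the ARNs you pass in (the job role, the CloudTrail trail, and a role Access Analyzer can use to read the trail) are placeholders you would supply.

```python
import time
from datetime import datetime, timedelta, timezone


def generate_policy_from_cloudtrail(client, principal_arn: str, trail_arn: str,
                                    access_role_arn: str, days: int = 30,
                                    poll_seconds: float = 5.0) -> list:
    """Ask IAM Access Analyzer for a policy based on a principal's recent activity.

    `client` is expected to behave like boto3.client("accessanalyzer").
    Returns the generated policy documents as JSON strings.
    """
    end = datetime.now(timezone.utc)
    start = end - timedelta(days=days)
    # Start an asynchronous policy generation job over the chosen date range.
    job = client.start_policy_generation(
        policyGenerationDetails={"principalArn": principal_arn},
        cloudTrailDetails={
            "trails": [{"cloudTrailArn": trail_arn, "allRegions": True}],
            "accessRole": access_role_arn,
            "startTime": start,
            "endTime": end,
        },
    )
    # Poll until the job finishes, then return the policy documents.
    while True:
        result = client.get_generated_policy(jobId=job["jobId"])
        status = result["jobDetails"]["status"]
        if status == "SUCCEEDED":
            policies = result["generatedPolicyResult"]["generatedPolicies"]
            return [p["policy"] for p in policies]
        if status in ("FAILED", "CANCELED"):
            raise RuntimeError(f"Policy generation ended with status {status}")
        time.sleep(poll_seconds)
```

Taking the client as an argument keeps the function easy to exercise with a stub before pointing it at a real account. The generated policy is a starting point, so review it before attaching it to the role used by your jobs.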
Update
I discovered another way to get around the issues I encountered while trying to provision EKS addons with the AWS CDK (or perhaps it was there all along). It involves creating the IAM role for the addon yourself, then attaching the appropriate policy and service principal (trust policy) to the role. If you do this instead of creating a ServiceAccount (which automatically creates a role and pod identity association), you can provision the addon with the role you created. When you pass the role_arn and service_account as parameters to PodIdentityAssociationProperty, the association between the service account and the role is created automatically, without conflicts or naming collisions.
Here’s the updated code if that was too hard to follow:
from aws_cdk import Stack
from aws_cdk import aws_eks as eks
from aws_cdk import aws_iam as iam
import aws_cdk.lambda_layer_kubectl_v30 as kubectl
from constructs import Construct


class CdkStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        masters_role = iam.Role(self, "eks-admin", assumed_by=iam.AccountRootPrincipal())
        # Create an EKS cluster
        cluster = eks.Cluster(self, "quant-eks",
            version=eks.KubernetesVersion.V1_30,
            masters_role=masters_role,
            kubectl_layer=kubectl.KubectlV30Layer(self, "KubectlLayer"),
            authentication_mode=eks.AuthenticationMode.API_AND_CONFIG_MAP
        )
        # Role that EKS Pod Identity assumes on behalf of the addon's pods
        pods_trust_policy = iam.ServicePrincipal("pods.eks.amazonaws.com")
        role = iam.Role(self, "EBSCsiDriverRole", assumed_by=pods_trust_policy)
        # These actions belong in the trust policy; an identity policy statement
        # without resources fails CDK validation.
        role.assume_role_policy.add_statements(iam.PolicyStatement(
            effect=iam.Effect.ALLOW,
            actions=["sts:TagSession", "sts:AssumeRole"],
            principals=[pods_trust_policy],
        ))
        role.add_managed_policy(iam.ManagedPolicy.from_managed_policy_arn(self, "CSIDriver",
            managed_policy_arn="arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy")
        )
        # Install the Addon and associate the role with its service account
        eks.CfnAddon(self, "ebs-csi-addon",
            addon_name="aws-ebs-csi-driver",
            addon_version="v1.35.0-eksbuild.1",
            cluster_name=cluster.cluster_name,
            preserve_on_delete=False,
            pod_identity_associations=[eks.CfnAddon.PodIdentityAssociationProperty(
                role_arn=role.role_arn,
                service_account="ebs-csi-controller-sa"
            )]
        )
If you’re unsure of the name of the service account to use or the recommended permissions it needs, you can run the following command:
aws eks describe-addon-configuration \
--query podIdentityConfiguration \
--addon-name aws-ebs-csi-driver \
--addon-version v1.31.0-eksbuild.1
This will output the following:
[
    {
        "serviceAccount": "ebs-csi-controller-sa",
        "recommendedManagedPolicies": [
            "arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy"
        ]
    }
]
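If you want to feed that output into your CDK code rather than copy it by hand, a small parser is enough. This is a sketch using only the standard library; the function name is my own, and it assumes the JSON has the shape shown above.

```python
import json


def parse_pod_identity_configuration(cli_output: str) -> dict:
    """Map each service account to its recommended managed policy ARNs.

    Expects the JSON printed by `aws eks describe-addon-configuration
    --query podIdentityConfiguration`. Hypothetical helper for illustration.
    """
    # The CLI prints null when the queried key is absent, so fall back to [].
    entries = json.loads(cli_output) or []
    return {
        entry["serviceAccount"]: entry.get("recommendedManagedPolicies", [])
        for entry in entries
    }


if __name__ == "__main__":
    sample = """
    [
        {
            "serviceAccount": "ebs-csi-controller-sa",
            "recommendedManagedPolicies": [
                "arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy"
            ]
        }
    ]
    """
    for account, policies in parse_pod_identity_configuration(sample).items():
        print(account, policies)
```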
On November 18th, Amazon published a blog announcing that EKS has simplified providing IAM permissions to EKS add-ons. Although it says that you can directly manage EKS Pod Identities using EKS add-ons operations through the EKS console, CLI, API, eksctl, and IaC tools like AWS CloudFormation, I have yet to see how this affects the experience when using the CDK.