Running Flannel on EKS

Jeremy Cowan
5 min read · Feb 9, 2019


Amazon Elastic Container Service for Kubernetes (EKS) is a managed service from AWS that was launched in 2018. As part of the service, AWS manages the Kubernetes control plane, which consists of a set of master nodes and an etcd database. When you provision a cluster, it comes pre-configured with the AWS VPC Container Networking Interface (CNI) plugin, a Kubernetes networking plugin that assigns IP addresses from your Virtual Private Cloud (VPC) to pods. Using this plugin has several advantages. First, you don’t incur the overhead of encapsulation and de-encapsulation as you do with overlay networks. Second, you can use VPC Flow Logs to capture information about the IP traffic going to and from the pods in your cluster. Third, there’s less contention for network bandwidth because fewer pods share each Elastic Network Interface (ENI). And finally, traffic from the VPC can be routed directly to pods.

The VPC CNI plugin has its own set of challenges, however. The EC2 instance type and size determine the number of pods you can run on an instance, and in some cases attaining higher pod density will force you to over-provision the instance types you use for your worker nodes. Your VPC may also be so IP-constrained that you cannot afford to assign IP addresses from your VPC to your pods, though the VPC CNI custom networking feature attempts to address this by allowing you to specify a separate set of subnets for your pod network.
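To make the pod density limit concrete, the commonly cited formula for the VPC CNI is max pods = ENIs × (IPv4 addresses per ENI − 1) + 2. A quick back-of-the-envelope check, using values for an m5.large (3 ENIs, 10 IPv4 addresses per ENI):

ENIS=3            # ENIs supported by the instance type
IPS_PER_ENI=10    # IPv4 addresses per ENI
echo $(( ENIS * (IPS_PER_ENI - 1) + 2 ))   # prints 29, the pod ceiling for an m5.large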

Despite the VPC CNI’s advantages, folks may still want to use another CNI with EKS. This post explains how to install and configure the flannel CNI with EKS.

Installing flannel

The first step is to create an EKS cluster. I recommend using eksctl because it lets you provision a cluster (and worker nodes) with a single command.

eksctl create cluster --name flannel --ssh-access --nodes 0

When you create an EKS cluster, a daemonset for the VPC CNI plugin, called aws-node, is automatically created. As worker nodes join the cluster, the Kubernetes scheduler schedules an instance of this daemon onto each node. The daemon alters the route table on the instance, affecting its ability to support other network plugins like flannel. Creating a node-less cluster allows you to replace the aws-node daemonset with a different networking plugin before any nodes join the cluster.

The next step is to delete the aws-node daemonset.

kubectl delete ds aws-node -n kube-system
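If you want to confirm the daemonset is gone before proceeding, list the daemonsets in kube-system; aws-node should no longer appear.

kubectl get ds -n kube-system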

Since EKS doesn’t allow you to set the pod CIDR on the API server, we’re going to use an external etcd database to store the network configuration for flannel. To get started with etcd, we first need to install CoreOS’s config transpiler (ct).

brew install coreos-ct

Next, we want to get a token for our single node etcd “cluster”.

export TOKEN=$(curl -sw "\n" 'https://discovery.etcd.io/new?size=1' | cut -d "/" -f 4)
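You can optionally confirm that a discovery token was returned before moving on:

echo $TOKEN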

Execute the following command to create a file named etcd.yaml.

cat << EOF > etcd.yaml
# This config is meant to be consumed by the config transpiler, which will
# generate the corresponding Ignition config. Do not pass this config directly
# to instances of Container Linux.

etcd:
  advertise_client_urls: http://{PUBLIC_IPV4}:2379
  initial_advertise_peer_urls: http://{PRIVATE_IPV4}:2380
  listen_client_urls: http://0.0.0.0:2379
  listen_peer_urls: http://{PRIVATE_IPV4}:2380
  discovery: https://discovery.etcd.io/$TOKEN
EOF

Run the following command to convert the etcd.yaml file into an Ignition configuration. The output will be used to configure CoreOS when it first boots.

ct -platform=ec2 < etcd.yaml >> ec2metadata
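If you have jq installed, you can optionally sanity-check that ct produced valid Ignition JSON before launching the instance; the command below should print the Ignition spec version.

jq .ignition.version ec2metadata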

Launch an instance of CoreOS-stable-1967.4.0. When running this command, replace ami_id, key_name, sg_ids, and subnet_id with values that correspond to the appropriate resources within your AWS environment; the ami_id should be the CoreOS Container Linux stable AMI for your region.

aws ec2 run-instances --image-id <ami_id> --instance-type t2.small --key-name <key_name> --security-group-ids <sg_ids> --subnet-id <subnet_id> --user-data file://ec2metadata

By adding the etcd instance to the worker node security group, you can avoid creating additional security group rules to allow the flannel daemon to read data from your etcd database.
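One way to do this after the instance is launched (the placeholders below are yours to fill in) is to modify the instance’s security group list. Note that --groups replaces the existing list, so include every group the instance still needs.

aws ec2 modify-instance-attribute --instance-id <etcd_instance_id> --groups <worker_node_sg_id> <other_sg_ids>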

After the instance is in a running state, SSH to the instance and execute the following commands:

export ETCDCTL_API=3
etcdctl put /coreos.com/network/config '{"Network":"18.16.0.0/16", "SubnetLen": 24, "Backend": {"Type": "vxlan", "VNI": 1}}'
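You can read the key back to confirm the network configuration was stored:

etcdctl get /coreos.com/network/config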

Log out of the etcd instance and install the flannel CNI. Before running the next command, replace <etcd_ip> with the IP address of your etcd server.

cat << EOF | kubectl apply -f -
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: flannel
rules:
  - apiGroups:
      - ""
    resources:
      - pods
    verbs:
      - get
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - nodes/status
    verbs:
      - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
  - kind: ServiceAccount
    name: flannel
    namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "18.16.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-amd64
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      hostNetwork: true
      nodeSelector:
        beta.kubernetes.io/arch: amd64
      tolerations:
        - operator: Exists
          effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
        - name: install-cni
          image: quay.io/coreos/flannel:v0.10.0-amd64
          command:
            - cp
          args:
            - -f
            - /etc/kube-flannel/cni-conf.json
            - /etc/cni/net.d/10-flannel.conflist
          volumeMounts:
            - name: cni
              mountPath: /etc/cni/net.d
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
      containers:
        - name: kube-flannel
          image: quay.io/coreos/flannel:v0.10.0-amd64
          command:
            - /opt/bin/flanneld
          args:
            - --ip-masq
            - --kube-subnet-mgr=false
            - --etcd-endpoints=http://<etcd_ip>:2379
          resources:
            requests:
              cpu: "100m"
              memory: "50Mi"
            limits:
              cpu: "100m"
              memory: "50Mi"
          securityContext:
            privileged: true
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          volumeMounts:
            - name: run
              mountPath: /run
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
EOF
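Once worker nodes join the cluster (next step), you can confirm that flannel pods are being scheduled onto them; something along these lines should show one kube-flannel pod per node.

kubectl get ds kube-flannel-ds-amd64 -n kube-system
kubectl get pods -n kube-system -l app=flannel -o wide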

Open the EC2 console and increase the desired and maximum count for the autoscaling group that eksctl created for your worker nodes.
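If you prefer the command line to the console, eksctl can scale the nodegroup it created for you; the nodegroup name below is a placeholder you can look up with the first command.

eksctl get nodegroup --cluster flannel
eksctl scale nodegroup --cluster flannel --name <nodegroup_name> --nodes 2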

Testing

Now that you’ve finished configuring flannel, let’s deploy some nginx pods.

kubectl apply -f https://raw.githubusercontent.com/kubernetes/website/master/content/en/examples/application/deployment.yaml

Verify that the pods are getting IP addresses from the CIDR range that you configured.

$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
nginx-deployment-67594d6bf6-hnll6 1/1 Running 0 35s 18.16.117.226 ip-192-168-104-172.us-west-2.compute.internal
nginx-deployment-67594d6bf6-mb76m 1/1 Running 0 34s 18.16.184.97 ip-192-168-60-139.us-west-2.compute.internal

If you followed all the steps correctly, the pod IPs above fall within the 18.16.0.0/16 range you stored in etcd rather than the VPC CIDR, confirming that flannel is handling pod networking.
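As an extra check that traffic actually flows across the overlay, you can curl one of the nginx pods from a throwaway client pod. Replace <pod_ip> with one of the pod IPs from the output above; the pod name and curl image here are just an example. A 200 response means the client pod reached nginx over the flannel network.

kubectl run curl-test --rm -it --restart=Never --image=curlimages/curl -- curl -s -o /dev/null -w "%{http_code}\n" http://<pod_ip>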

Conclusion

The VPC CNI plugin from AWS provides robust networking for Kubernetes pods. Nonetheless, there are situations where using an alternate CNI may be preferable. While this blog outlined the steps to install the flannel CNI on EKS, a similar approach can be used to install other CNIs such as Calico or Cilium.

Jeremy Cowan is a Principal Container Specialist at AWS
