Fedora Atomic, OpenStack, and Kubernetes
While experimenting with Fedora Atomic, I was looking for an elegant way to automatically deploy Atomic into an OpenStack environment and then automatically schedule some Docker containers on the Atomic host. This post describes my solution.
Like many other cloud-targeted distributions, Fedora Atomic runs cloud-init when the system boots. We can take advantage of this to configure the system at first boot by providing a user-data blob to Nova when we boot the instance. A user-data blob can be as simple as a shell script, and while we could arguably mash everything into a single script, it wouldn't be particularly maintainable or flexible in the face of different pod/service/etc. descriptions.
In order to build a more flexible solution, we're going to take advantage of the following features:
- Support for multipart MIME archives. Cloud-init allows you to pass in multiple files via user-data by encoding them as a multipart MIME archive.
- Support for a custom part handler. Cloud-init recognizes a number of specific MIME types (such as text/cloud-config or text/x-shellscript). We can provide a custom part handler that will be used to handle MIME types not intrinsically supported by cloud-init.
A custom part handler for Kubernetes configurations
I have written a custom part handler that knows about the following MIME types:
text/x-kube-pod
text/x-kube-service
text/x-kube-replica
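For orientation, here is a minimal sketch of the shape a handler for these types might take. It is not the handler used in this post. The list_types() and handle_part() functions are the interface cloud-init expects from a part handler, but the kubecfg invocation ("kubecfg -c FILE create RESOURCE"), the KUBE_RESOURCES mapping, and the extra run parameter are my assumptions, added for illustration.

```python
#part-handler
# Hypothetical sketch of a cloud-init part handler; not the handler from this post.
import os
import subprocess
import tempfile

# Assumption: each MIME type maps to a resource name accepted by an
# old-style "kubecfg -c <file> create <resource>" invocation.
KUBE_RESOURCES = {
    "text/x-kube-pod": "pods",
    "text/x-kube-service": "services",
    "text/x-kube-replica": "replicationControllers",
}


def list_types():
    """Return the MIME types cloud-init should route to this handler."""
    return list(KUBE_RESOURCES)


def handle_part(data, ctype, filename, payload, run=subprocess.check_call):
    """Called by cloud-init with '__begin__', then once per part, then '__end__'.

    `run` is an extra hook (not part of the cloud-init interface) so the
    sketch can be exercised without kubecfg installed.
    """
    if ctype == "__begin__":
        # The handler described in this post makes sure Kubernetes is
        # running at this point; that step is omitted from the sketch.
        return
    if ctype not in KUBE_RESOURCES:
        return  # covers "__end__" and anything unexpected

    text = payload.decode("utf-8") if isinstance(payload, bytes) else payload
    # Write the part to a temporary file so it can be passed with -c.
    with tempfile.NamedTemporaryFile("w", suffix=".yaml", delete=False) as tmp:
        tmp.write(text)
    try:
        run(["kubecfg", "-c", tmp.name, "create", KUBE_RESOURCES[ctype]])
    finally:
        os.unlink(tmp.name)
```

The leading #part-handler comment is what lets the MIME-building script tag this file as text/part-handler.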
When the part handler is first initialized, it will ensure that Kubernetes is started. If it is provided with a document matching one of the above MIME types, it will pass it to the appropriate kubecfg command to create the objects in Kubernetes.
Creating multipart MIME archives
I have also created a modified version of the standard write-mime-multipart.py Python script. This script will inspect the first lines of files to determine their content type; in addition to the standard cloud-init types (like #cloud-config for a text/cloud-config type file), this script recognizes:
- #kube-pod for text/x-kube-pod
- #kube-service for text/x-kube-service
- #kube-replica for text/x-kube-replica
That is, a simple pod description might look something like:
#kube-pod
id: dbserver
desiredState:
  manifest:
    version: v1beta1
    id: dbserver
    containers:
      - image: mysql
        name: dbserver
        env:
          - name: MYSQL_ROOT_PASSWORD
            value: secret
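The first-line detection described above can be sketched with Python's standard email package. This is a rough illustration under my own assumptions, not the modified script itself: the MARKERS table, detect_type(), and build_archive() are hypothetical names, and only a few of the standard cloud-init markers are listed.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# First-line markers and the MIME type each one selects. The first three
# are standard cloud-init markers; the kube ones are the additions
# described in this post.
MARKERS = [
    ("#!", "text/x-shellscript"),
    ("#cloud-config", "text/cloud-config"),
    ("#part-handler", "text/part-handler"),
    ("#kube-pod", "text/x-kube-pod"),
    ("#kube-service", "text/x-kube-service"),
    ("#kube-replica", "text/x-kube-replica"),
]


def detect_type(text, default="text/plain"):
    """Pick a MIME type by looking at the first line of the file."""
    first_line = text.split("\n", 1)[0].strip()
    for prefix, ctype in MARKERS:
        if first_line.startswith(prefix):
            return ctype
    return default


def build_archive(parts):
    """Bundle (filename, text) pairs into one multipart MIME document."""
    archive = MIMEMultipart()
    for filename, text in parts:
        subtype = detect_type(text).split("/", 1)[1]
        part = MIMEText(text, subtype)
        part.add_header("Content-Disposition", "attachment", filename=filename)
        archive.attach(part)
    return archive.as_string()
```

Calling build_archive([("dbserver.yaml", open("dbserver.yaml").read())]) would return a string suitable for writing to a userdata file.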
Putting it all together
Assuming that the pod description presented in the previous section is stored in a file named dbserver.yaml, we can bundle that file up with our custom part handler like this:
$ write-mime-multipart.py \
    kube-part-handler.py dbserver.yaml > userdata
We would then launch a Nova instance using the nova boot command, providing the generated userdata file as the argument to the --user-data option:
$ nova boot --image fedora-atomic --key-name mykey \
    --flavor m1.small --user-data userdata my-atomic-server
You would obviously need to substitute values for --image and --key-name that are appropriate for your environment.
Details, details
If you are experimenting with Fedora Atomic 21, you may find that the above example doesn't work: the official mysql image generates an SELinux error. We can switch SELinux to permissive mode by putting the following into a file called disable-selinux.sh:
#!/bin/sh
setenforce 0
sed -i '/^SELINUX=/ s/=.*/=permissive/' /etc/selinux/config
And then including that in our MIME archive:
$ write-mime-multipart.py \
    kube-part-handler.py disable-selinux.sh dbserver.yaml > userdata
A brief demonstration
If we launch an instance as described in the previous section and then log in, we should find that the pod has already been scheduled:
# kubecfg list pods
ID          Image(s)    Host        Labels      Status
----------  ----------  ----------  ----------  ----------
dbserver    mysql       /                       Waiting
At this point, docker needs to pull the mysql image locally, so this step can take a while depending on the state of your local internet connection.
Running docker ps at this point will yield:
# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
3561e39f198c kubernetes/pause:latest "/pause" 46 seconds ago Up 43 seconds k8s--net.d96a64a9--dbserver.etcd--3d30eac0_-_745c_-_11e4_-_b32a_-_fa163e6e92ce--d872be51
The pause image here is a Kubernetes detail that is used to configure the networking for a pod (in the Kubernetes world, a pod is a group of linked containers that share a common network namespace).
After a few minutes, you should eventually see:
# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
644c8fc5a79c mysql:latest "/entrypoint.sh mysq 3 minutes ago Up 3 minutes k8s--dbserver.fd48803d--dbserver.etcd--3d30eac0_-_745c_-_11e4_-_b32a_-_fa163e6e92ce--58794467
3561e39f198c kubernetes/pause:latest "/pause" 5 minutes ago Up 5 minutes k8s--net.d96a64a9--dbserver.etcd--3d30eac0_-_745c_-_11e4_-_b32a_-_fa163e6e92ce--d872be51
And kubecfg should show the pod as running:
# kubecfg list pods
ID          Image(s)    Host        Labels      Status
----------  ----------  ----------  ----------  ----------
dbserver    mysql       127.0.0.1/              Running
Problems, problems
This works and is, I think, a relatively elegant solution. However, there are some drawbacks. In particular, the custom part handler runs fairly early in the cloud-init process, which means that it cannot depend on changes implemented by user-data scripts (because these run much later).
A better solution might be to have the custom part handler simply write the Kubernetes configs into a directory somewhere, and then install a service that launches after Kubernetes and (a) watches that directory for files, then (b) passes the configuration to Kubernetes and deletes (or relocates) the file.
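To make that idea a little more concrete, here is a rough sketch of what such a service might do. I have not built this; the directory arguments, the polling loop, the process_pending() and watch() names, and the kubecfg invocation are all assumptions. It picks the resource type from the same first-line markers used earlier in this post.

```python
import os
import shutil
import subprocess
import time

# Assumption: first-line markers map to resource names accepted by an
# old-style "kubecfg -c <file> create <resource>" invocation.
MARKER_TO_RESOURCE = {
    "#kube-pod": "pods",
    "#kube-service": "services",
    "#kube-replica": "replicationControllers",
}


def process_pending(spool_dir, done_dir, run=subprocess.check_call):
    """Submit every recognized config in spool_dir, then relocate it."""
    os.makedirs(done_dir, exist_ok=True)
    handled = []
    for name in sorted(os.listdir(spool_dir)):
        path = os.path.join(spool_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path) as fh:
            marker = fh.readline().strip()
        resource = MARKER_TO_RESOURCE.get(marker)
        if resource is None:
            continue  # leave unrecognized files alone
        run(["kubecfg", "-c", path, "create", resource])
        # Relocate the file so it is not submitted twice.
        shutil.move(path, os.path.join(done_dir, name))
        handled.append(name)
    return handled


def watch(spool_dir, done_dir, interval=5.0):
    """Poll spool_dir forever; meant to be started after Kubernetes is up."""
    while True:
        process_pending(spool_dir, done_dir)
        time.sleep(interval)
```

Polling keeps the sketch free of extra dependencies; a real service could just as well use inotify or a systemd path unit to react to new files.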