These are notes on deploying the Roundup Docker image on an odroid running the lightweight Kubernetes (k8s) environment provided by k3s.
The image is obtained from: https://hub.docker.com/r/rounduptracker/roundup-development under the multi tag. This image is built for i386, amd64 (aka x86_64), armv7, and the critical arch for this example: arm64.
The odroid c4 computer is running Debian bullseye/sid.
k3s was installed using the quick start method.
I used a demo-mode deployment with SQLite as the database. It has 3 replicas, all sharing the same Persistent Volume (PV) and running on one node. (1)
Getting started
I created (updated 2023-12-26):
- Persistent Volume Claim using the local-path storage class.
- Service of type LoadBalancer
- Deployment of the Roundup image with a volume mounted at /usr/src/app/tracker.
The deployment can also export secrets stored under the roundup-secret Secret to /usr/src/app/tracker/secrets/.... These can be used with file://../secrets/secret_name in config.ini (see the example after this list).
- The root filesystem is mounted read-only. Because the root filesystem is read-only, /tmp is an emptyDir mounted read-write to allow file uploads.
- a horizontal pod autoscaler scaling between 2 and 5 replicas.
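As a sketch of the secrets piece: the secret name roundup-secret matches the YAML below, but the key name session_key and the config.ini option are hypothetical placeholders.

# Create the (optional) secret consumed by the deployment below.
# 'session_key' is a hypothetical key name; each key in the secret
# becomes a file under /usr/src/app/tracker/secrets/.
sudo kubectl create secret generic roundup-secret \
    --from-literal=session_key="$(head -c 32 /dev/urandom | base64)"

# A config.ini option can then point at the exported file, e.g.:
#   some_option = file://../secrets/session_key
# (the path is relative to the tracker home directory)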
Here is the single YAML file roundup-demo-deployment.yaml with the four parts:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: roundup-demo-pvc
  namespace: default
  labels:
    app: roundup-demo
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path
  resources:
    requests:
      storage: 2Gi
---
apiVersion: v1
kind: Service
metadata:
  name: roundup-demo
  labels:
    app: roundup-demo
spec:
  ports:
    - name: "8080"
      port: 8917
      targetPort: 8080
  selector:
    app: roundup-demo
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: roundup-demo
  labels:
    app: roundup-demo
  namespace: default
spec:
  # comment out replicas due to using autoscaling group
  # replicas: 3
  minReadySeconds: 30
  selector:
    matchLabels:
      app: roundup-demo
  #strategy:
  #  type: Recreate
  template:
    metadata:
      labels:
        app: roundup-demo
    spec:
      # add to make secrets files readable to roundup group.
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        runAsGroup: 1000
        fsGroup: 1000
        # I need fsGroup for secrets only. tracker dir can be left alone.
        # Try to prevent the equivalent of a
        #   'find tracker -exec chgrp 1000 \{}'
        # down the tracker subdir that can be deep.
        fsGroupChangePolicy: "OnRootMismatch"
      containers:
        - name: roundup-demo
          image: rounduptracker/roundup-development:multi
          imagePullPolicy: Always
          args: ['demo']
          ports:
            - name: roundup-demo
              containerPort: 8080
          resources:
            # limits:
            #   cpu: 500m
            #   memory: "52428800"
            requests:
              cpu: 500m
              memory: "20971520"
          readinessProbe:
            httpGet:
              path: /demo/
              port: roundup-demo
            failureThreshold: 30
            periodSeconds: 10
            successThreshold: 2
          volumeMounts:
            - name: trackers
              mountPath: /usr/src/app/tracker
            - name: secret-volume
              mountPath: /usr/src/app/tracker/secrets
              readOnly: true
            # required for readOnlyRootFilesystem securityContext
            - name: tmp-scratch
              mountPath: /tmp
          securityContext:
            readOnlyRootFilesystem: true
      volumes:
        - name: trackers
          persistentVolumeClaim:
            claimName: roundup-demo-pvc
        - name: tmp-scratch
          emptyDir: {}
        - name: secret-volume
          secret:
            secretName: roundup-secret
            optional: true
            # octal 0400 -> dec 256; 0440 -> 288
            # for some reason even 256 becomes mode 0440 with
            # the fsGroup security context. Without securityContext
            # it's mode 0400.
            defaultMode: 288
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  labels:
    app: roundup-demo
  name: roundup-demo
spec:
  maxReplicas: 5
  metrics:
    - resource:
        name: cpu
        target:
          averageUtilization: 70
          type: Utilization
      type: Resource
  minReplicas: 2
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: roundup-demo
It was deployed using: sudo kubectl create -f roundup-demo-deployment.yaml.
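Optionally, kubectl can wait for the rollout to finish:

# Block until the deployment's pods are up and ready.
sudo kubectl rollout status deployment/roundup-demo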
Once it was deployed and a pod was active, I used sudo kubectl get pods to get the name of one of the pods. Then I ran sudo kubectl exec -it pod/roundup-demo-578d6c65d8-v2wmr -- sh to get a shell. I edited the tracker's config.ini, replacing localhost with the hostname of the odroid. Then I restarted all the pods using sudo kubectl rollout restart deployment/roundup-demo and watched the pods get recycled:
NAME                            READY   STATUS        RESTARTS   AGE
roundup-demo-7fb6bcb5b9-l6w59   1/1     Running       0          36s
roundup-demo-7fb6bcb5b9-w7sg8   1/1     Running       0          33s
roundup-demo-657796ff66-9jm68   1/1     Running       0          3s
roundup-demo-7fb6bcb5b9-4kj8l   1/1     Terminating   0          30s
roundup-demo-657796ff66-gv7bx   0/1     Pending       0          0s
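For reference, the in-pod config edit amounts to something like this (a sketch; the demo tracker's config path /usr/src/app/tracker/demo/config.ini and the hostname odroid_name are assumptions):

# Inside the pod (via kubectl exec ... -- sh): point the tracker at
# the real host instead of localhost.
sed -i 's/localhost/odroid_name/g' /usr/src/app/tracker/demo/config.ini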
Then I could connect to 'http://odroid_name:8917/demo/' (8917 is the service port defined above) and interact with the tracker.
I also set the same app label on all of the k8s objects, so I could run:
% sudo kubectl get pvc,pv,deployments,pods,service,hpa -l app=roundup-demo
NAME                                     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/roundup-demo-pvc   Bound    pvc-7d5808e0-5ce8-45ee-a794-f5cdc3f1c677   2Gi        RWO            local-path     15h

NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                      STORAGECLASS   REASON   AGE
persistentvolume/pvc-7d5808e0-5ce8-45ee-a794-f5cdc3f1c677   2Gi        RWO            Delete           Bound    default/roundup-demo-pvc   local-path              15h

NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/roundup-demo   3/3     3            3           15h

NAME                                READY   STATUS    RESTARTS   AGE
pod/roundup-demo-657796ff66-9jm68   1/1     Running   0          89s
pod/roundup-demo-657796ff66-gv7bx   1/1     Running   0          86s
pod/roundup-demo-657796ff66-m77wn   1/1     Running   0          83s

NAME                   TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
service/roundup-demo   LoadBalancer   10.43.175.253   172.23.1.28   8917:32564/TCP   15h

NAME                                               REFERENCE                 TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/roundup-demo   Deployment/roundup-demo   2%/70%    2         5         2          5h
to list all of the resources:
- Persistent Volume Claim
- Persistent Volume
- Deployment
- Pods (three replicas)
- Service (LoadBalancer)
- Horizontal Pod Autoscaler
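The shared label also makes teardown a one-liner; a sketch (the bound PV is cleaned up automatically because its reclaim policy is Delete):

# Delete everything created above by label; pods go away with the
# deployment, and the PV is reclaimed when the PVC is deleted.
sudo kubectl delete deployments,service,hpa,pvc -l app=roundup-demo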
At one point, I added interfaces.py to the tracker. It was missing an import. When I restarted the rollout, all three pods errored. Since there is no way to kubectl run an image with attached volumes, I had to create a Pod declaration:
apiVersion: v1
kind: Pod
metadata:
  name: roundup-demo-edit
  labels:
    app: roundup-demo-edit
  namespace: default
spec:
  containers:
    - name: roundup-demo-edit
      image: rounduptracker/roundup-development:multi
      imagePullPolicy: Always
      args: ['shell']
      stdin: true
      tty: true
      volumeMounts:
        - name: trackers
          mountPath: /usr/src/app/tracker
  volumes:
    - name: trackers
      persistentVolumeClaim:
        claimName: roundup-demo-pvc
Note the use of stdin and tty to do the equivalent of docker run -it or kubectl run -it for the deployed pod. Without these settings, the container exits because shell mode requires an interactive tty. Using sudo kubectl create -f demo-edit.yaml I was able to start a running pod with the PV attached. I then ran sudo kubectl attach -it roundup-demo-edit and edited the broken interfaces.py. One of the broken pods was trying to restart and did come up, but the other two were still in a failed state. However, even the one working pod made the demo tracker accessible.
I then restarted the rollout and all three came up.
This shouldn't have been needed: when Roundup crashed, one of the working pods should have been left running. However, at the time I didn't have minReadySeconds: 30 set up. I think k3s restarted the first pod, and it didn't crash until after all of the rest of the pods had been recycled as well. This left me without any working pods 8-(.
I was also missing the readinessProbe at the time. The latest update above fixes this.
Database Integrity (what's that??)
One thing to note is that I am playing fast and loose with the data. SQLite has locks and other mechanisms to prevent data loss when multiple processes access the same database. These work on a single file on disk, and I believe they also work with a local-path volume provider. If I were using NFS or another network/shared-disk provider, I think there would be a higher chance of data corruption. If your Roundup servers are running on multiple systems, you should use journal (DELETE) mode, not the default WAL mode: the shared mmap'ed memory segment required for WAL mode can't be shared across systems.
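For example, checking and switching the journal mode with the sqlite3 CLI (which, as noted below, is not in the image; run it from a debug container or against a copy of the file) might look like this. The db path is an assumption based on the backup command below:

# Show the current journal mode of the demo tracker's database.
sqlite3 /usr/src/app/tracker/demo/db/db 'PRAGMA journal_mode;'

# Switch to rollback-journal (DELETE) mode for shared-disk setups.
sqlite3 /usr/src/app/tracker/demo/db/db 'PRAGMA journal_mode=DELETE;'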
If you run multiple replicas in an HA config across multiple nodes, you should (must) use MySQL or PostgreSQL.
The way that Roundup stores file data is to get a file id number from the db and use that to name the stored file. Each file is written once and then only read. If the db is working, there should never be two Roundup processes trying to overwrite the same file. This is similar to Courier's maildir or MH's mail folder handling, which were designed to work on NFS.
Backups
For backups I have resorted to running kubectl exec against one of the pods. For example:
% sudo kubectl exec roundup-demo-7bfdf97595-2zsfm -- tar -C /usr/src/app -cf - tracker | tar -xvf - --wildcards 'tracker/demo/db/db*'
can create a copy of the db files. The entire tarfile could also just be captured for backup purposes. It would probably be a good idea to use the sqlite3 command (which is not in the docker image 8-() to make a consistent copy of the db files and back those up. See https://www.sqlite.org/backup.html and https://www.sqlite.org/forum/info/2ea989bbe9a6dfc8 for other ideas, including .dump.
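A consistent-copy sketch using sqlite3 (the db path and output locations are placeholder assumptions):

# Online, consistent copy of the live database file.
sqlite3 /usr/src/app/tracker/demo/db/db ".backup /tmp/db.backup"

# Or a portable SQL text dump.
sqlite3 /usr/src/app/tracker/demo/db/db .dump > /tmp/db.sql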
Creating a CronJob that does the export is a future idea; a rough sketch follows.
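A minimal sketch of what such a CronJob might look like, assuming a separate (hypothetical) roundup-demo-backup-pvc to hold the archives. Note that with a ReadWriteOnce volume the job has to land on the same node as the tracker pods:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: roundup-demo-backup
  labels:
    app: roundup-demo
spec:
  schedule: "15 3 * * *"   # daily at 03:15
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: backup
              image: rounduptracker/roundup-development:multi
              # Crude whole-tree archive; a consistent sqlite3 .backup
              # would be better once the CLI is available in the image.
              command: ["tar", "-C", "/usr/src/app", "-czf",
                        "/backup/tracker-backup.tar.gz", "tracker"]
              volumeMounts:
                - name: trackers
                  mountPath: /usr/src/app/tracker
                  readOnly: true
                - name: backup
                  mountPath: /backup
          volumes:
            - name: trackers
              persistentVolumeClaim:
                claimName: roundup-demo-pvc
            - name: backup
              persistentVolumeClaim:
                claimName: roundup-demo-backup-pvc   # hypothetical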
Debugging - running pod using kubectl debug
At one point I needed to check whether the sqlite db was in WAL mode. However, the image is missing the sqlite3 CLI, there is no way to run as root inside a pod container (docker exec -it -u root is not a thing in the k8s world), and the root filesystem was mounted read-only. So I ran a debugging image using:
sudo kubectl-superdebug roundup-demo-7bfdf97595-czckt -t roundup-demo -I alpine:latest
from https://github.com/JonMerlevede/kubectl-superdebug, described at https://medium.com/datamindedbe/debugging-running-pods-on-kubernetes-2ba160c47ef5. This creates an ephemeral container that you attach to. In the ephemeral container you run as root, so you can apk add sqlite 8-). This container shares the pod's resources, which allows additional investigation if needed.
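Stock kubectl can do something similar with an ephemeral debug container. A sketch: the pod name is from above, --target names the app container, and the target's filesystem is reached through /proc because the PV is not mounted in the debug container:

# Attach an ephemeral alpine container sharing the target container's
# process namespace.
sudo kubectl debug -it roundup-demo-7bfdf97595-czckt \
    --image=alpine:latest --target=roundup-demo -- sh

# Inside the debug container: install sqlite and inspect the db via
# the target process's root filesystem. Pid 1 is an assumption; check
# 'ps' for the actual roundup-server pid.
apk add sqlite
sqlite3 /proc/1/root/usr/src/app/tracker/demo/db/db 'PRAGMA journal_mode;'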
You can also spin up a new pod similar to my edit pod above.
Debugging - k3s failing to start
I had issues with k3s not starting on a new system. It reported:
failed to find cpu cgroup (v2)
To try to fix it, I followed https://rootlesscontaine.rs/getting-started/common/cgroup2/#enabling-cpu-cpuset-and-io-delegation:
The relevant steps from that page: if /sys/fs/cgroup/cgroup.controllers is present, you are already using cgroup v2, otherwise v1. Enabling cgroup v2 requires kernel 4.15 or later (5.2 or later is recommended), and delegating controllers to non-root users requires a recent systemd (244 or later is recommended). To boot the host with cgroup v2, add the following string to the GRUB_CMDLINE_LINUX line in /etc/default/grub and then run sudo update-grub:

systemd.unified_cgroup_hierarchy=1

By default, a non-root user can only get the memory and pids controllers delegated:

$ cat /sys/fs/cgroup/user.slice/user-$(id -u).slice/user@$(id -u).service/cgroup.controllers
memory pids

To allow delegation of other controllers such as cpu, cpuset, and io, run:

sudo mkdir -p /etc/systemd/system/user@.service.d
cat <<EOF | sudo tee /etc/systemd/system/user@.service.d/delegate.conf
[Service]
Delegate=cpu cpuset io memory pids
EOF
sudo systemctl daemon-reload

After changing the systemd configuration, re-login or (better) reboot the host.
but this did nothing, probably because I was still running a 4.9 kernel and the cpu controller needs a 4.15 kernel per https://www.man7.org/linux/man-pages/man7/cgroups.7.html.
(1) The PV is set to ReadWriteOnce. This prevents another node from mounting the volume. However, due to a feature/bug in PVs, multiple pods can use the same PV even if it is ReadWriteOnce, as long as they run on the same node.
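Newer Kubernetes releases add a stricter ReadWriteOncePod access mode that closes this loophole by limiting the volume to a single pod. Using it here would (intentionally) break the multi-replica sharing; a sketch of the PVC change:

spec:
  accessModes:
    - ReadWriteOncePod   # stricter mode: only one pod may mount the volume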