Troubleshooting
Something doesn't work as expected? Check out our advice to help you troubleshoot.
CrashLoopBackOff when restarting all pods
Question from a user: "When I restart all Kubernetes pods at once, they get stuck in a CrashLoopBackOff for a number of minutes before eventually resolving. Why does it happen?"
You are likely hitting a startup issue typical of Java applications, which causes the liveness probes to fail. Startup can consume a lot of resources because Java loads many classes at that point. Setting the CPU limit to 0.5 can improve the startup time of the server and should resolve the failing health checks.
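As a minimal sketch of one way to apply that limit, the helper below wraps kubectl set resources. The function name is hypothetical, and the deployment name and namespace are assumptions that depend on your installation. A limit of 0.5 CPU is written as 500m in Kubernetes units.

```shell
# Hypothetical helper: set the CPU limit of a Deployment to 0.5 CPU (500m).
# $1 = deployment name, $2 = namespace (both depend on your installation).
set_cpu_limit() {
  kubectl set resources deployment "$1" --namespace "$2" --limits=cpu=500m
}
```

For example, set_cpu_limit kestra default would target a deployment named kestra in the default namespace (both names assumed here). If the deployment is managed by a Helm chart, a change made this way is likely to be overwritten on the next upgrade, so the same limit should also be set in the chart values.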
Unprocessable execution
Sometimes things go wrong in unexpected ways; in such situations, you can skip an execution that Kestra is not able to process.
To do so, you can start the executor server (or the standalone server if not using a deployment with separate components) with a list of execution identifiers to skip.
kestra server executor --skip-executions 6FSPERUe1JwbYmMmdwRlgV,5iLGjTLOHAVGUGlsesFaMb
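When there are many executions to skip, the comma-separated list can be built from a file instead of typed by hand. The sketch below assumes a hypothetical file with one execution identifier per line; the helper name is also an illustration and not part of Kestra.

```shell
# Join execution IDs (one per line on stdin) into the comma-separated
# form passed to --skip-executions. Blank lines are ignored.
join_execution_ids() {
  grep -v '^[[:space:]]*$' | paste -sd, -
}
```

With the identifiers stored in a file such as ids.txt (an assumed name), the executor could then be started with kestra server executor --skip-executions "$(join_execution_ids < ids.txt)".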
Docker in Docker (DinD)
If you face issues using Docker in Docker, e.g. with Script tasks using the DOCKER runner, start troubleshooting by attaching a terminal: docker run -it --privileged docker:dind sh. Then, use docker logs container_ID to get the container logs.
Also, try docker inspect container_ID to get more information about your Docker container. The output of this command displays details about the container, its environment, network settings, etc. This information can help you identify what might be wrong.
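The full docker inspect output is long, so it can help to print only the state fields first. The following sketch combines both commands in a hypothetical helper; the chosen fields and the 50-line log tail are illustrative choices, not a Kestra recommendation.

```shell
# Hypothetical helper: print the status, exit code, and OOM-kill flag of a
# container, followed by the last 50 lines of its logs.
# $1 = container ID or name.
inspect_container() {
  docker inspect --format '{{.State.Status}} exit={{.State.ExitCode}} oom={{.State.OOMKilled}}' "$1"
  docker logs --tail 50 "$1"
}
```

Call it as inspect_container container_ID. A non-zero exit code, or oom=true, usually narrows down where to look next.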
Docker in Docker using Helm charts
On some Kubernetes deployments, using DinD with our default Helm charts can lead to:
Device "ip_tables" does not exist.
ip_tables 24576 4 iptable_raw,iptable_mangle,iptable_nat,iptable_filter
modprobe: can't change directory to '/lib/modules': No such file or directory
error: attempting to run rootless dockerd but need 'kernel.unprivileged_userns_clone' (/proc/sys/kernel/unprivileged_userns_clone) set to 1
To fix this, use root to launch the DinD container by setting the following values:
dind:
  image:
    tag: dind
  args:
    - --log-level=fatal
  securityContext:
    runAsUser: 0
    runAsGroup: 0

securityContext:
  runAsUser: 0
  runAsGroup: 0
Docker in Docker (DinD) on a Mac with ARM-based Apple silicon chip
If you are getting an error similar to java.io.IOException: com.sun.jna.LastErrorException: [111] Connection refused, it might be related to a Docker in Docker (DinD) issue.
Try using an embedded Docker server as shown below:
tmp directory "No such file or directory"
If you're encountering errors associated with the tmp directory, such as "No such file or directory", there's a good chance your tmp directory isn't mounted correctly in your Kestra configuration.
When configuring your docker-compose.yml, make sure that the tmp-dir has the same path as the volume; otherwise, Kestra won't know which directory to mount for the tmp directory.
kestra:
  tasks:
    tmp-dir:
      path: /home/kestra/tmp
In this example, the volume /home/kestra:/home/kestra matches the tasks tmp-dir field.
volumes:
  - kestra-data:/app/storage
  - /var/run/docker.sock:/var/run/docker.sock
  - /home/kestra:/home/kestra
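The matching rule can be checked mechanically. The sketch below is a hypothetical helper, not part of Kestra, and it assumes (as in the example above) that the volume maps a host directory to the identical path inside the container and that the tmp-dir path must sit inside that directory.

```shell
# Hypothetical helper: return 0 when tmp_path lies inside a bind mount whose
# host and container paths are identical (e.g. /home/kestra:/home/kestra).
# $1 = tmp-dir path, $2 = volume entry in host:container[:mode] form.
tmp_dir_matches_volume() {
  tmp_path=$1
  volume=$2
  host=${volume%%:*}          # text before the first colon
  container=${volume#*:}      # text after the first colon
  container=${container%%:*}  # drop an optional mode suffix such as :ro
  [ "$host" = "$container" ] || return 1
  case "$tmp_path" in
    "$host"|"$host"/*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For instance, tmp_dir_matches_volume /home/kestra/tmp /home/kestra:/home/kestra succeeds, while the same path checked against kestra-data:/app/storage fails because that volume does not expose a matching directory.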