Kubernetes Task Runner
Available on: Enterprise Edition>= 0.18.0
Run tasks as Kubernetes pods.
How to use the Kubernetes task runner
Thr Kubernetes task runner executes tasks in a specified Kubernetes cluster. It is useful to declare resource limits and resource requests.
Here is an example of a workflow with a task running Shell commands in a Kubernetes pod:
id: kubernetes_task_runner
namespace: company.team
description: |
To get the kubeconfig file, run: `kubectl config view --minify --flatten`.
Then, copy the values to the configuration below.
Here is how Kubernetes task runner properties (on the left) map to the kubeconfig file's properties (on the right):
- clientKeyData: client-key-data
- clientCertData: client-certificate-data
- caCertData: certificate-authority-data
- masterUrl: server e.g. https://docker-for-desktop:6443
- username: user e.g. docker-desktop
inputs:
- id: file
type: FILE
tasks:
- id: shell
type: io.kestra.plugin.scripts.shell.Commands
inputFiles:
data.txt: "{{ inputs.file }}"
outputFiles:
- "*.txt"
containerImage: centos
taskRunner:
type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
commands:
- echo "Hello from a Kubernetes task runner!"
- cp data.txt out.txt
File handling
If your script task has inputFiles
or namespaceFiles
configured, an init container will be added to upload files into the main container.
Similarly, if your script task has outputFiles
configured, a sidecar container will be added to download files from the main container.
All containers will use an in-memory emptyDir
volume for file exchange.
Failure scenarios
If a task is resubmitted (e.g. due to a retry or a Worker crash), the new Worker will reattach to the already running (or an already finished) pod instead of starting a new one.
Specifying resource requests for Python scripts
Some Python scripts may require more resources than others. You can specify the resources required by the Python script in the resources
property of the task runner.
id: kubernetes_resources
namespace: company.team
tasks:
- id: python_script
type: io.kestra.plugin.scripts.python.Script
containerImage: ghcr.io/kestra-io/pydata:latest
taskRunner:
type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
namespace: default
pullPolicy: Always
config:
username: docker-desktop
masterUrl: https://docker-for-desktop:6443
caCertData: xxx
clientCertData: xxx
clientKeyData: xxx
resources:
request: # The resources the container is guaranteed to get
cpu: "500m" # Request 1/2 of a CPU (500 milliCPU)
memory: "128Mi" # Request 128 MB of memory
outputFiles:
- "*.json"
script: |
import platform
import socket
import sys
import json
from kestra import Kestra
print("Hello from a Kubernetes runner!")
host = platform.node()
py_version = platform.python_version()
platform = platform.platform()
os_arch = f"{sys.platform}/{platform.machine()}"
def print_environment_info():
print(f"Host's network name: {host}")
print(f"Python version: {py_version}")
print(f"Platform info: {platform}")
print(f"OS/Arch: {os_arch}")
env_info = {
"host": host,
"platform": platform,
"os_arch": os_arch,
"python_version": py_version,
}
Kestra.outputs(env_info)
filename = "environment_info.json"
with open(filename, "w") as json_file:
json.dump(env_info, json_file, indent=4)
if __name__ == '__main__':
print_environment_info()
For a full list of properties available in the Kubernetes task runner, check the Kubernetes plugin documentation or explore the same in the built-in Code Editor in the Kestra UI.
Using plugin defaults to avoid repetition
You can use pluginDefaults
to avoid repeating the same configuration across multiple tasks. For example, you can set the pullPolicy
to Always
for all tasks in a namespace:
id: k8s_taskrunner
namespace: company.team
tasks:
- id: parallel
type: io.kestra.plugin.core.flow.Parallel
tasks:
- id: run_command
type: io.kestra.plugin.scripts.python.Commands
containerImage: ghcr.io/kestra-io/kestrapy:latest
commands:
- pip show kestra
- id: run_python
type: io.kestra.plugin.scripts.python.Script
containerImage: ghcr.io/kestra-io/pydata:latest
script: |
import socket
ip_address = socket.gethostbyname(hostname)
print("Hello from AWS EKS and kestra!")
print(f"Host IP Address: {ip_address}")
pluginDefaults:
- type: io.kestra.plugin.scripts.python
forced: true
values:
taskRunner:
type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes
namespace: default
pullPolicy: Always
config:
username: docker-desktop
masterUrl: https://docker-for-desktop:6443
caCertData: |-
placeholder
clientCertData: |-
placeholder
clientKeyData: |-
placeholder
Was this page helpful?