Different storage options available in Kestra.

Kestra uses internal storage to store data processed by tasks. This includes files from flow inputs and data stored as task outputs.

The default internal storage implementation is local storage, which is not suitable for production because it stores data in a local folder on the host filesystem.

This local storage can be configured as follows:

```yaml
kestra:
  storage:
    type: local
    local:
      base-path: /tmp/kestra/storage/ # your custom path
```
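
Because local storage writes to the host filesystem, you may want to persist that folder outside the container when running Kestra in Docker. Below is a hypothetical docker-compose excerpt assuming the official kestra/kestra image and the base-path configured above; the host path is an example, not a requirement:

```yaml
# Hypothetical docker-compose excerpt: mount a host folder over the
# configured base-path so internal storage survives container restarts.
services:
  kestra:
    image: kestra/kestra:latest
    volumes:
      - ./kestra-storage:/tmp/kestra/storage # host path is an example
```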

Other internal storage types include:

## S3

First, make sure that the S3 storage plugin is installed in your environment. You can install it with the following Kestra command: `./kestra plugins install io.kestra.storage:storage-s3:LATEST`. This command downloads the plugin's jar file into the plugins directory.

Then, enable the storage using the following configuration:

```yaml
kestra:
  storage:
    type: s3
    s3:
      endpoint: "<your-s3-endpoint>" # only needed if you host your own S3 storage
      accessKey: "<your-aws-access-key-id>"
      secretKey: "<your-aws-secret-access-key>"
      region: "<your-aws-region>"
      bucket: "<your-s3-bucket-name>"
      forcePathStyle: "<true|false>" # optional, defaults to false; useful if you require path-style requests, e.g. due to DNS constraints
```

If you are running Kestra on AWS EC2 or EKS, you can use the default credentials provider chain. In this case, you can omit the `accessKey` and `secretKey` options:

```yaml
kestra:
  storage:
    type: s3
    s3:
      region: "<your-aws-region>"
      bucket: "<your-s3-bucket-name>"
```

Additional configuration options can be found in the S3 storage plugin documentation.

### Assume IAM Role - AWS Security Token Service (STS)

You can configure the Amazon S3 storage to use AWS AssumeRole to temporarily assume the permissions of a specific IAM role. When using AssumeRole with S3, you can omit the `accessKey` and `secretKey` options; the default credentials provider chain will instead be used to authenticate with AWS STS and retrieve the temporary security credentials.

```yaml
kestra:
  storage:
    type: s3
    s3:
      region: "<your-aws-region>"
      bucket: "<your-s3-bucket-name>"
      stsRoleArn: "<ARN of the role to assume>"
      # (Optional)
      stsRoleExternalId: "<AWS STS External Id>"
      # (Optional, Default: 'kestra-storage-s3-<timestamp>')
      stsRoleSessionName: "<AWS STS Session name>"
      # (Optional, Default: 15m)
      stsRoleSessionDuration: "<AWS STS Session duration>"
      # (Optional)
      stsEndpointOverride: "<AWS Endpoint to communicate with>"
```

## Minio

If you use Minio or a similar S3-compatible storage option, you can follow the same process as shown above to install the Minio storage plugin. Then, make sure to include Minio's endpoint and port in the storage configuration:

```yaml
kestra:
  storage:
    type: minio
    minio:
      endpoint: "<your-endpoint>"
      port: "<your-port>"
      accessKey: "<your-accessKey>"
      secretKey: "<your-secretKey>"
      region: "<your-region>"
      secure: "<your-secure>"
      bucket: "<your-bucket>"
      partSize: "<your-part-size>" # syntax: <number><unit> (KB, MB, GB), defaults to 5MB
```

Optionally, if the Minio server is configured for it (the `MINIO_DOMAIN=my.domain.com` environment variable set on the Minio server), you can also set the `kestra.storage.minio.vhost: true` property to make the Minio client use the virtual-host syntax.

Please note that the `endpoint` should always be your base domain, even if you use the virtual-host syntax. In the above example, that means `endpoint: my.domain.com` and `bucket: my-bucket`; setting `endpoint: my-bucket.my.domain.com` will lead to failure.
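
Putting these properties together, here is a minimal sketch of a virtual-host-style Minio configuration, reusing the placeholder domain and bucket from the example above (credentials and other options omitted for brevity):

```yaml
kestra:
  storage:
    type: minio
    minio:
      endpoint: "my.domain.com" # base domain only, never the bucket subdomain
      bucket: "my-bucket"
      vhost: true # requires MINIO_DOMAIN to be set on the Minio server
```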

## Azure

First, install the Azure storage plugin. To do that, you can leverage the following Kestra command: `./kestra plugins install io.kestra.storage:storage-azure:LATEST`. This command downloads the plugin's jar file into the plugins directory.

Adjust the storage configuration shown below depending on your chosen authentication method (connection string, shared key, or SAS token); you only need to provide one of them:

```yaml
kestra:
  storage:
    type: azure
    azure:
      endpoint: "https://unittestkt.blob.core.windows.net"
      container: storage
      connectionString: "<connection-string>"
      sharedKeyAccountName: "<shared-key-account-name>"
      sharedKeyAccountAccessKey: "<shared-key-account-access-key>"
      sasToken: "<sas-token>"
```
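
For example, if you authenticate with a connection string alone, the shared key and SAS token options can be omitted entirely. A minimal sketch, with a placeholder storage account name:

```yaml
kestra:
  storage:
    type: azure
    azure:
      endpoint: "https://<your-storage-account>.blob.core.windows.net"
      container: storage
      connectionString: "<connection-string>"
```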

## GCS

You can install the GCS storage plugin using the following Kestra command: `./kestra plugins install io.kestra.storage:storage-gcs:LATEST`. This command downloads the plugin's jar file into the plugins directory.

Then, you can enable the storage using the following configuration:

```yaml
kestra:
  storage:
    type: gcs
    gcs:
      bucket: "<your-bucket-name>"
      project-id: "<project-id or use default projectId>"
      serviceAccount: "<serviceAccount key as JSON or use default credentials>"
```

If you haven't configured the `kestra.storage.gcs.serviceAccount` option, Kestra will use the default service account, which is:

- the service account defined on the cluster (for GKE deployments)
- the service account defined on the compute instance (for GCE deployments)

You can also set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to the path of a JSON file containing a GCP service account key.
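
As a hypothetical example of wiring this up in a Docker-based deployment (the image tag, container path, and host file name below are assumptions, not requirements):

```yaml
# Hypothetical docker-compose excerpt: mount a service account key file
# into the container and point GOOGLE_APPLICATION_CREDENTIALS at it.
services:
  kestra:
    image: kestra/kestra:latest
    environment:
      GOOGLE_APPLICATION_CREDENTIALS: /etc/kestra/gcp-sa.json # example path
    volumes:
      - ./gcp-sa.json:/etc/kestra/gcp-sa.json:ro # host key file location is an example
```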

You can find more details in the GCP documentation.
