Storage
Different storage options available in Kestra.
Kestra uses internal storage to store data processed by tasks. This includes files from flow inputs and data stored as task outputs.
The default internal storage implementation is the local storage which is not suitable for production as it will store data in a local folder on the host filesystem.
This local storage can be configured as follows:
kestra:
storage:
type: local
local:
base-path: /tmp/kestra/storage/ # your custom path
Other internal storage types include:
- Storage S3 for AWS S3
- Storage GCS for Google Cloud Storage
- Storage Minio compatible with others S3 like storage services
- Storage Azure for Azure Blob Storage
S3
First, make sure that the S3 storage plugin is installed in your environment. You can install it with the following Kestra command:
./kestra plugins install io.kestra.storage:storage-s3:LATEST
. This command will download the plugin's jar file into the plugin's directory.
Then, enable the storage using the following configuration:
kestra:
storage:
type: s3
s3:
endpoint: "<your-s3-endpoint>" # Only needed if you host your own S3 storage
accessKey: "<your-aws-access-key-id>"
secretKey: "<your-aws-secret-access-key>"
region: "<your-aws-region>"
bucket: "<your-s3-bucket-name>"
forcePathStyle: "<true|false>" # optional, defaults to false, useful if you mandate domain resolution to be path-based due to DNS for eg.
If you are using an AWS EC2 or EKS instance, you can use the default credentials provider chain. In this case, you can omit the accessKey
and secretKey
options:
kestra:
storage:
type: s3
s3:
region: "<your-aws-region>"
bucket: "<your-s3-bucket-name>"
Additional configurations can be found here.
Assume IAM Role - AWS Security Token Service (STS)
You can configure Amazon S3 storage to utilize AWS AssumeRole to temporarily assume the permissions of a specific IAM role.
When using AssumeRole with S3, you can omit specifying the AccessKeyId
and SecretAccessKey
in your configurations.
Instead, the default credentials provider chain will be used to authenticate with AWS STS and retrieve the temporary security credentials.
kestra:
storage:
type: s3
s3:
region: "<your-aws-region>"
bucket: "<your-s3-bucket-name>"
stsRoleArn: "<ARN of the role to assume>"
# (Optional)
stsRoleExternalId: "<AWS STS External Id>"
# (Optional, Default: 'kestra-storage-s3-<timestamp>')
stsRoleSessionName: "<AWS STS Session name>"
# (Optional, Default: 15m)
stsRoleSessionDuration: "<AWS STS Session duration>"
# (Optional)
stsEndpointOverride: "<AWS Endpoint to communicate with>"
Minio
If you use Minio or similar S3-compatible storage options, you can follow the same process as shown above to install the Minio storage plugin. Then, make sure to include the Minio's endpoint
and port
in the storage configuration:
kestra:
storage:
type: minio
minio:
endpoint: "<your-endpoint>"
port: "<your-port>"
accessKey: "<your-accessKey>"
secretKey: "<your-secretKey>"
region: "<your-region>"
secure: "<your-secure>"
bucket: "<your-bucket>"
partSize: "<your-part-size>" # syntax: <number><unit> (KB, MB, GB), defaults to 5MB
Optionally and if the Minio configured is configured to do so (MINIO_DOMAIN=my.domain.com
environment variable on Minio server), you can also use the kestra.storage.minio.vhost: true
property to make Minio client to use the virtual host syntax.
Please note that the endpoint should always be your base domain (even if you use the virtual host syntax). In the above example, endpoint: my.domain.com
, bucket: my-bucket
. Setting endpoint: my-bucket.my.domain.com
will lead to failure.
Azure
First, install the Azure storage plugin. To do that, you can leverage the following Kestra command:
./kestra plugins install io.kestra.storage:storage-azure:LATEST
. This command will download the plugin's jar file into the plugins directory.
Adjust the storage configuration shown below depending on your chosen authentication method:
kestra:
storage:
type: azure
azure:
endpoint: "https://unittestkt.blob.core.windows.net"
container: storage
connectionString: "<connection-string>"
sharedKeyAccountName: "<shared-key-account-name>"
sharedKeyAccountAccessKey: "<shared-key-account-access-key>"
sasToken: "<sas-token>"
GCS
You can install the GCS storage plugin using the following Kestra command:
./kestra plugins install io.kestra.storage:storage-gcs:LATEST
. This command will download the plugin's jar file into the plugins directory.
Then, you can enable the storage using the following configuration:
kestra:
storage:
type: gcs
gcs:
bucket: "<your-bucket-name>"
project-id: "<project-id or use default projectId>"
serviceAccount: "<serviceAccount key as JSON or use default credentials>"
If you haven't configured the kestra.storage.gcs.serviceAccount
option, Kestra will use the default service account, which is:
- the service account defined on the cluster (for GKE deployments)
- the service account defined on the compute instance (for GCE deployments).
You can also provide the environment variable GOOGLE_APPLICATION_CREDENTIALS
with a path to a JSON file containing GCP service account key.
You can find more details in the GCP documentation.
Was this page helpful?