# EOEPCA+ MinIO Deployment Guide
MinIO is a high-performance object storage system that is compatible with the Amazon S3 API. In the EOEPCA+ ecosystem, MinIO can serve as the object storage backend for various services, including user workspaces, MLOps and other data storage needs. This does not preclude configuring an alternative S3-compatible object storage solution instead of MinIO.
## Introduction
MinIO provides a scalable and high-performance object storage solution that’s compatible with AWS S3 APIs. It’s used within the EOEPCA+ platform to store and manage data securely.
## Scripted Deployment
The MinIO deployment in this guide follows the same Scripted Deployment approach as the EOEPCA+ building blocks.
## Prerequisites
Before you begin, make sure you have the following:
| Component | Requirement | Documentation Link |
|---|---|---|
| Kubernetes | Cluster (tested on v1.28) | Installation Guide |
| Helm | Version 3.5 or newer | Installation Guide |
| kubectl | Configured for cluster access | Installation Guide |
| Ingress | Properly installed | Ingress Controller Setup Guide |
| TLS Certificates | Managed via cert-manager or manually | TLS Certificate Management Guide |
Clone the Deployment Guide Repository:
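For example, assuming the public EOEPCA+ deployment-guide repository location and a `scripts/minio` working directory (adjust the URL and path to match the actual repository layout):

```bash
# Clone the EOEPCA+ Deployment Guide repository and change into the MinIO scripts directory
# (repository URL and directory layout shown here are assumptions - adjust as needed)
git clone https://github.com/EOEPCA/deployment-guide
cd deployment-guide/scripts/minio
```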
Validate your environment:
Run the validation script to ensure all prerequisites are met:
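A minimal sketch of this step, assuming a prerequisite-check script named `check-prerequisites.sh` (the script name is an assumption - run the validation script provided by the repository):

```bash
# Check that Kubernetes, Helm, kubectl, ingress and TLS prerequisites are in place
# (script name is an assumption - substitute the repository's actual script)
bash check-prerequisites.sh
```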
## Deployment Steps
### 1. Configure MinIO
Run the configuration script:
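For example, assuming the configuration script is named `configure-minio.sh` (the script name is an assumption - use the script provided by the repository):

```bash
# Generate the MinIO Helm values and record settings in ~/.eoepca/state
# (script name is an assumption)
bash configure-minio.sh
```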
During the script execution, you will be prompted for:
- `INGRESS_HOST`: Base domain for ingress hosts.
    - Example: `example.com`
- `CLUSTER_ISSUER` (if using `cert-manager`): Name of the ClusterIssuer.
    - Example: `letsencrypt-http01-apisix`
- `STORAGE_CLASS`: Storage class for persistent volumes.
    - Example: `standard`
### 2. Deploy MinIO
Install MinIO using Helm:
```bash
helm repo add minio https://charts.min.io/
helm repo update minio
helm upgrade -i minio minio/minio \
  --version 5.4.0 \
  --values generated-values.yaml \
  --namespace minio \
  --create-namespace
```
Apply additional ingress configuration:
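For example, assuming the configuration step produced an ingress manifest named `generated-ingress.yaml` (the file name is an assumption - apply whichever ingress manifest your configuration step generated):

```bash
# Apply the generated ingress resources for the MinIO API and console
# (file name is an assumption)
kubectl apply -f generated-ingress.yaml -n minio
```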
### 3. Create Access Keys
Access the MinIO Console to create access keys:
- Navigate to `https://console-minio.${INGRESS_HOST}/access-keys/new-account`
- Log in using the MinIO User (`user`) and MinIO Password generated during the configuration step - see file `~/.eoepca/state`.
- Under `Access Keys` select `Create access key +`
- Note down or download the Access Key and Secret Key.
- Click `Create`.
Run the following script to save these keys to your EOEPCA+ state file:
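As a minimal sketch of what that script does, the keys can be appended to the state file as `S3_ACCESS_KEY` and `S3_SECRET_KEY` (the variable names match those used later in this guide; the exact behaviour of the provided script may differ):

```bash
# Append the newly created access keys to the EOEPCA+ state file
# (a sketch of the provided script - the key values below are placeholders)
cat >> ~/.eoepca/state <<EOF
S3_ACCESS_KEY="<your-access-key>"
S3_SECRET_KEY="<your-secret-key>"
EOF
```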
By saving these keys to the EOEPCA+ state file, the credentials will be automatically set during the deployment of S3-integrated Building Blocks.
## Validation
Automated Validation:
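For example, assuming the validation script is named `validation.sh` in the MinIO scripts directory (the name is an assumption - run the script provided by the repository):

```bash
# Run the automated MinIO validation checks
# (script name is an assumption)
bash validation.sh
```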
This script performs several checks to validate your MinIO deployment:
- Pod and Service Checks: Verifies that all MinIO pods are running and services are available.
- Endpoint Checks: Confirms that the MinIO endpoints are accessible and return the expected HTTP status codes.
- Functionality Tests:
    - Creates a test bucket.
    - Uploads a test file to the bucket.
    - Deletes the test file.
    - Deletes the test bucket.
Note: The script uses `s3cmd` to interact with MinIO. Ensure that `s3cmd` is installed and configured on your machine. Please refer to the `s3cmd` Installation Guide.
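As an illustration, `s3cmd` can be pointed at the MinIO instance directly on the command line, assuming the S3 API is exposed at `minio.${INGRESS_HOST}` and that `INGRESS_HOST` is recorded in the state file (both are assumptions - use the endpoint configured by your ingress):

```bash
# List buckets on the MinIO instance with s3cmd
# (endpoint host "minio.${INGRESS_HOST}" is an assumption - use your configured S3 endpoint)
source ~/.eoepca/state
s3cmd ls \
  --access_key="${S3_ACCESS_KEY}" \
  --secret_key="${S3_SECRET_KEY}" \
  --host="minio.${INGRESS_HOST}" \
  --host-bucket="minio.${INGRESS_HOST}"
```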
Manual Validation:
- Check Kubernetes Resources: (see the example commands after this list)
- Access Dashboard:
- Log In:
  Use the credentials generated during the configuration step:
    - Username: `user`
    - Password: (the password generated by the script)
- Verify Buckets:
  You should see the following buckets:
    - cache-bucket
    - cluster-storage
    - eoepca
    - gitlab-backup-storage
    - gitlab-lfs-storage
    - gitlab-tmp-storage
    - mlopbb-mlflow-sharinghub
    - mlopbb-sharinghub
    - openeo-geotrellis-data
- Create a Test Bucket:
  Use the dashboard to create a new bucket and upload a test file.
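As a sketch of the resource checks referenced above (standard `kubectl` commands; the console URL follows the ingress host pattern used earlier in this guide):

```bash
# Verify that the MinIO pods, services and ingress are present and healthy
kubectl get pods -n minio
kubectl get svc -n minio
kubectl get ingress -n minio

# The MinIO Console (dashboard) should then be reachable at:
#   https://console-minio.${INGRESS_HOST}/
```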
## ReadWriteMany Storage using JuiceFS
This is optional. It is presented here in case you need a `ReadWriteMany` storage solution and want to explore providing it through object storage.
As described in the Storage section, JuiceFS offers a `ReadWriteMany` (RWX) storage solution backed by Object Storage. The steps in this section illustrate the approach to create the storage class `eoepca-rw-many` that can be used by BBs requiring `ReadWriteMany` persistence.
The JuiceFS approach is designed to exploit the prevailing object storage solution that is provided by your cloud of choice. Hence, while it is possible to use MinIO as the object storage backend, this is not really the intended use case.
Nevertheless, MinIO provides a convenient way to demonstrate the principles of JuiceFS in a self-contained manner. The approach shown here can be adapted to the object storage of your cloud provider.
### Deploy the JuiceFS CSI Driver
Ref. JuiceFS CSI Driver Documentation
```bash
helm upgrade -i juicefs-csi-driver juicefs-csi-driver \
  --repo https://juicedata.github.io/charts/ \
  --namespace juicefs \
  --create-namespace
```
### `Sidecar` Mount Mode
The default deployment uses a `Mount Pod` for each PVC. This is the recommended approach, but does not suit all situations as it relies upon a `DaemonSet`. The alternative `Sidecar` mode can be used by setting the helm value…

In this case each `namespace` that needs to use the JuiceFS CSI Driver (i.e. whose workloads are using the defined `StorageClass`) must be labelled for webhook injection…
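As an illustration of these two steps, assuming the JuiceFS CSI Driver chart exposes the mount mode via the `mountMode` value and uses the `juicefs.com/enable-injection` namespace label (verify both against the JuiceFS CSI Driver documentation for your chart version):

```bash
# Deploy the CSI driver in Sidecar mode instead of the default Mount Pod mode
# ("mountMode" value name assumed from the JuiceFS CSI Driver chart - verify for your version)
helm upgrade -i juicefs-csi-driver juicefs-csi-driver \
  --repo https://juicedata.github.io/charts/ \
  --namespace juicefs \
  --create-namespace \
  --set mountMode=sidecar

# Label each namespace whose workloads use the JuiceFS StorageClass, so the
# webhook injects the sidecar (label name per the JuiceFS CSI Driver documentation)
kubectl label namespace <target-namespace> juicefs.com/enable-injection=true
```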
### Deploy a Metadata Engine
We will create a StorageClass that uses the JuiceFS CSI Driver to dynamically provision PersistentVolumes. But first we need a metadata engine accessible from all cluster nodes. There are many options, but for simplicity we will use Redis.
Redis is an in-memory key-value database with excellent performance but relatively weak reliability. For production there are better alternatives to Redis - such as TiKV or PostgreSQL - which should be used in preference, aligned with your performance and reliability requirements.
```bash
helm install redis redis \
  --repo https://charts.bitnami.com/bitnami \
  --set architecture=standalone \
  --set auth.enabled=false \
  --namespace juicefs \
  --create-namespace
```
### Create the StorageClass
Now we can create a StorageClass that uses the JuiceFS CSI Driver (provisioner), referencing the Redis metadata engine and an S3-compatible Object Storage solution.
The below configuration uses the bucket `cluster-storage` in the MinIO instance. It is assumed that this bucket already exists - as will be the case if MinIO was deployed as per this guide.
```bash
source ~/.eoepca/state

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: sc-eoepca-rw-many
  namespace: juicefs
type: Opaque
stringData:
  name: eoepca-rw-many                                     # The JuiceFS file system name
  access-key: ${S3_ACCESS_KEY}                             # Object storage credentials
  secret-key: ${S3_SECRET_KEY}                             # Object storage credentials
  metaurl: redis://redis-master.juicefs.svc.cluster.local  # Connection URL for metadata engine
  storage: s3                                              # Object storage type, such as s3, gs, oss
  bucket: ${S3_ENDPOINT}/cluster-storage                   # Bucket URL of object storage
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: eoepca-rw-many
reclaimPolicy: Delete  # Specify "Retain" if you want to retain the data after PVC deletion
provisioner: csi.juicefs.com
parameters:
  csi.storage.k8s.io/provisioner-secret-name: sc-eoepca-rw-many
  csi.storage.k8s.io/provisioner-secret-namespace: juicefs
  csi.storage.k8s.io/node-publish-secret-name: sc-eoepca-rw-many
  csi.storage.k8s.io/node-publish-secret-namespace: juicefs
EOF
```
### Test the StorageClass
We can test the StorageClass and associated provisioner by creating a PersistentVolumeClaim that uses it.
```bash
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: juicefs-test-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: eoepca-rw-many
---
apiVersion: v1
kind: Pod
metadata:
  name: juicefs-test-app
  namespace: default
spec:
  containers:
    - name: app
      image: busybox
      command:
        - /bin/sh
      args:
        - -c
        - while true; do echo $(date -u) >> /data/out.txt; sleep 5; done
      volumeMounts:
        - mountPath: /data
          name: juicefs-test-pv
  volumes:
    - name: juicefs-test-pv
      persistentVolumeClaim:
        claimName: juicefs-test-pvc
EOF
```
Check that the requested PVC is `Bound` by the `eoepca-rw-many` StorageClass:
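For example:

```bash
# The PVC should report STATUS "Bound" with STORAGECLASS "eoepca-rw-many"
kubectl get pvc juicefs-test-pvc -n default
```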
You can inspect the object storage bucket to see the data chunks being written to the JuiceFS `eoepca-rw-many` volume by the test pod.
There may be an initial delay before the first writes are reflected in the MinIO UI.
Run another pod to read the data being written by the test pod:
```bash
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: juicefs-tail
  namespace: default
spec:
  containers:
    - name: tailer
      image: busybox
      command: ["/bin/sh", "-c", "tail -f /data/out.txt"]
      volumeMounts:
        - mountPath: /data
          name: juicefs-test-pv
  volumes:
    - name: juicefs-test-pv
      persistentVolumeClaim:
        claimName: juicefs-test-pvc
  restartPolicy: Never
EOF

kubectl wait --for=condition=Ready pod/juicefs-tail --timeout=60s
kubectl logs -f juicefs-tail
```
Use Ctrl-C to exit the log stream.
Remove the test resources when done:
```bash
kubectl delete pod juicefs-tail
kubectl delete pod juicefs-test-app
kubectl delete pvc juicefs-test-pvc
```
### JuiceFS Summary
We have created a StorageClass that uses the `cluster-storage` bucket in the MinIO instance to create a file-system inside this bucket under the path `eoepca-rw-many/`. This file-system is then used to back `ReadWriteMany` PersistentVolumes that are dynamically provisioned by the JuiceFS CSI Driver.
## Uninstallation
Remove MinIO and associated resources:
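For example, assuming MinIO was installed as the Helm release `minio` in the `minio` namespace as shown above (and that the optional JuiceFS components were deployed as described in this guide):

```bash
# Remove the MinIO Helm release and its namespace
helm uninstall minio -n minio
kubectl delete namespace minio

# If the optional JuiceFS components were deployed, remove them as well
helm uninstall juicefs-csi-driver -n juicefs
helm uninstall redis -n juicefs
kubectl delete namespace juicefs
```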
## Further Reading
## Feedback
If you have any issues or suggestions, please open an issue on the EOEPCA+ Deployment Guide Repository.