Processing - OpenEO ArgoWorkflows with Dask⚓︎
Note: This Building Block is under active development. Some features are still evolving, so use it with caution while updates are rolled out.
OpenEO ArgoWorkflows provides a Kubernetes-native implementation of the OpenEO API specification using Dask for distributed processing. This deployment offers an alternative to the GeoTrellis backend, leveraging Dask’s parallel computing capabilities for Earth observation data processing.
Note: OIDC authentication is configured by default for OpenEO ArgoWorkflows. The deployment integrates with external OIDC providers (e.g., EGI AAI) for authentication. Refer to the IAM Deployment Guide if you need to set up your own OIDC Provider.
Prerequisites⚓︎
Before deploying, ensure your environment meets these requirements:
| Component | Requirement | Documentation Link |
|---|---|---|
| Kubernetes | Cluster (tested on v1.28+) | Installation Guide |
| Helm | Version 3.5 or newer | Installation Guide |
| kubectl | Configured for cluster access | Installation Guide |
| Ingress | Properly installed | Installation Guide |
| Cert Manager | Properly installed | Installation Guide |
| OIDC Provider | Required for authentication | Installation Guide |
| STAC Catalogue | Required for data access | eoAPI Deployment |
Clone the Deployment Guide Repository:
```bash
git clone https://github.com/EOEPCA/deployment-guide
cd deployment-guide/scripts/processing/openeo-argo
```
Validate your environment:
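The deployment-guide repository ships helper scripts alongside the instructions. A prerequisites check can be run from the Building Block directory; the script name and path below are assumptions based on the repository layout, so adjust them to match your checkout:

```bash
# Hypothetical helper script shipped at the deployment-guide scripts/ root;
# verify the exact name and location in your checkout.
bash ../../check-prerequisites.sh
```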
Deployment Steps⚓︎
1. Run the Configuration Script⚓︎
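Run the Building Block's configuration script from the directory you changed into above. It collects the parameters listed below and writes `generated-values.yaml` for the Helm deployment. The script name is an assumption; use the configure script present in your checkout:

```bash
# Hypothetical script name; the deployment-guide ships a configure script
# per Building Block that produces generated-values.yaml.
bash configure-openeo.sh
```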
You’ll be prompted for:
| Parameter | Description | Example |
|---|---|---|
| `INGRESS_HOST` | Base domain for ingress hosts | `example.com` |
| `PERSISTENT_STORAGECLASS` | Kubernetes storage class for persistent volumes | `standard` |
| `CLUSTER_ISSUER` | Cert-manager Cluster Issuer for TLS certificates | `letsencrypt-prod` |
| `OPENEO_ARGO_ENABLE_OIDC` | Enable OIDC authentication (yes/no) | `yes` |
| `OIDC_ISSUER_URL` | OIDC provider URL (if OIDC enabled) | `https://auth.example.com/realms/eoepca` |
| `OIDC_ORGANISATION` | OIDC organisation identifier (if OIDC enabled) | `eoepca` |
| `STAC_CATALOG_ENDPOINT` | STAC catalog URL | `https://eoapi.example.com/stac` |
2. Add Helm Repositories⚓︎
```bash
helm repo add argo https://argoproj.github.io/argo-helm
helm repo add dask https://helm.dask.org
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
```
3. Prepare the Helm Chart⚓︎
Clone the charts repository and build dependencies:
```bash
git clone https://github.com/jzvolensky/charts
helm dependency update charts/eodc/openeo-argo
helm dependency build charts/eodc/openeo-argo
```
4. Deploy OpenEO ArgoWorkflows⚓︎
```bash
helm upgrade -i openeo charts/eodc/openeo-argo \
  --namespace openeo \
  --create-namespace \
  --values generated-values.yaml \
  --timeout 10m
```
5. Deploy Ingress⚓︎
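Apply the ingress manifest produced by the configuration script. The filename is an assumption based on the `generated-values.yaml` naming convention; adjust it to the manifest your configure script actually wrote:

```bash
# Assumed filename produced by the configure script; adjust if your
# checkout generates a different manifest name.
kubectl apply -f generated-ingress.yaml -n openeo
```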
6. Deploy Basic Auth Proxy (if OIDC disabled)⚓︎
If you disabled OIDC authentication during configuration, deploy a basic-auth proxy in front of the API instead, as sketched below.
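A minimal sketch, assuming an htpasswd-based secret consumed by the ingress or auth proxy; the secret name is illustrative, and the credentials match those used in the Validation section below:

```bash
# Create an htpasswd file and an (illustratively named) secret for basic auth.
# Credentials match the curl -u example in the Validation section.
htpasswd -bc auth eoepcauser eoepcapass
kubectl create secret generic openeo-basic-auth --from-file=auth -n openeo
```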
7. Configure OIDC Client (if using custom OIDC)⚓︎
A Keycloak client is required to protect the ingress of the Processing BB's openEO Argo Engine. The client can be created using the Crossplane Keycloak provider via the Client CRD.
```bash
source ~/.eoepca/state

cat <<EOF | kubectl apply -f -
apiVersion: openidclient.keycloak.m.crossplane.io/v1alpha1
kind: Client
metadata:
  name: openeo-argo
  namespace: iam-management
spec:
  forProvider:
    realmId: ${REALM}
    clientId: openeo-argo
    name: openEO Argo Engine
    description: openEO Argo Engine OIDC
    enabled: true
    accessType: PUBLIC
    rootUrl: ${HTTP_SCHEME}://openeo.${INGRESS_HOST}
    baseUrl: ${HTTP_SCHEME}://openeo.${INGRESS_HOST}
    adminUrl: ${HTTP_SCHEME}://openeo.${INGRESS_HOST}
    directAccessGrantsEnabled: true
    standardFlowEnabled: true
    oauth2DeviceAuthorizationGrantEnabled: true
    useRefreshTokens: true
    validRedirectUris:
      - "/*"
      - "https://editor.openeo.org/*"
    webOrigins:
      - "+"
  providerConfigRef:
    name: provider-keycloak
    kind: ProviderConfig
EOF
```
The Client resource should be created and reconciled successfully.

Then, in the Keycloak Admin Console, open Clients → openeo-public → the Client scopes tab, and remove "roles" or any other scopes from "Assigned default client scopes" if they are adding an unwanted audience claim to the token.
Validation⚓︎
Automated Validation⚓︎
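Run the validation script from the Building Block directory. The script name is assumed from the repository's per-Building-Block layout; use the one in your checkout:

```bash
# Hypothetical name; use the validation script in your checkout.
bash validation.sh
```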
This verifies:
- All pods in the openeo namespace are running
- PostgreSQL and Redis are operational
- API endpoints return valid responses
Manual Validation⚓︎
Check pod status:
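All pods in the `openeo` namespace should reach `Running` (or `Completed` for one-off jobs):

```bash
kubectl get pods -n openeo
```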
API Health Check:
```bash
source ~/.eoepca/state

# Without authentication (basic info only)
curl -s https://openeo.${INGRESS_HOST}/openeo/1.1.0 | jq .

# With basic auth (if OIDC disabled)
curl -s -u eoepcauser:eoepcapass https://openeo.${INGRESS_HOST}/openeo/1.1.0 | jq .
```
List available processes:
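The openEO API exposes the available processes at the standard `/processes` endpoint:

```bash
curl -s https://openeo.${INGRESS_HOST}/openeo/1.1.0/processes | jq '.processes[].id'
```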
Check Argo Workflows:
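Batch jobs are executed as Argo Workflows, so listing the Workflow resources shows their execution state (assuming workflows run in the `openeo` namespace):

```bash
kubectl get workflows -n openeo
```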
API Usage⚓︎
Submit and monitor a job:
```bash
# Get an access token from the OIDC provider (resource owner password grant)
ACCESS_TOKEN=$(curl -s -X POST \
  "${OIDC_ISSUER_URL}/protocol/openid-connect/token" \
  -d "grant_type=password" \
  -d "username=${KEYCLOAK_TEST_USER}" \
  -d "password=${KEYCLOAK_TEST_PASSWORD}" \
  -d "client_id=openeo-argo" \
  -d "scope=openid" | jq -r '.access_token')

# openEO expects bearer tokens in the form oidc/<provider-id>/<token>
AUTH_TOKEN="oidc/eoepca/${ACCESS_TOKEN}"

# Create a job; the job ID is returned in the OpenEO-Identifier response header
JOB_ID=$(curl -s -i -X POST "https://openeo.${INGRESS_HOST}/openeo/1.1.0/jobs" \
  -H "Authorization: Bearer ${AUTH_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "process": {
      "process_graph": {
        "load": {
          "process_id": "load_collection",
          "arguments": {
            "id": "your-collection-id",
            "spatial_extent": {"west": -34.0, "south": 38.8, "east": -33.0, "north": 39.5},
            "temporal_extent": ["2025-10-20", "2025-10-31"]
          }
        },
        "save": {
          "process_id": "save_result",
          "arguments": {
            "data": {"from_node": "load"},
            "format": "GTiff"
          },
          "result": true
        }
      }
    },
    "title": "Test Job"
  }' | grep -i "^openeo-identifier:" | cut -d' ' -f2 | tr -d '\r\n')

echo "Created job: ${JOB_ID}"

# Start the job
curl -s -X POST "https://openeo.${INGRESS_HOST}/openeo/1.1.0/jobs/${JOB_ID}/results" \
  -H "Authorization: Bearer ${AUTH_TOKEN}"

# Check status
curl -s "https://openeo.${INGRESS_HOST}/openeo/1.1.0/jobs/${JOB_ID}" \
  -H "Authorization: Bearer ${AUTH_TOKEN}" | jq '{id, status, title}'

# List all jobs
curl -s "https://openeo.${INGRESS_HOST}/openeo/1.1.0/jobs" \
  -H "Authorization: Bearer ${AUTH_TOKEN}" | jq
```
Note: The STAC catalogue must contain collections with data formatted for OpenEO processing. Check the available collections at your STAC endpoint and ensure the spatial/temporal extent matches actual data.