Processing - OGC API Processes Engine⚓︎
Introduction⚓︎
The OGC API Processes Engine lets users deploy, manage, and execute OGC Application Packages through a standardised API. It’s built on the ZOO-Project zoo-project-dru implementation, which supports OGC WPS 1.0.0/2.0.0 and OGC API Processes Parts 1 & 2.
The engine supports multiple execution backends depending on your infrastructure:
| Execution Engine | Backend | Best For |
|---|---|---|
| Calrissian | Kubernetes jobs in dedicated namespaces | Pure Kubernetes environments |
| Toil | HPC batch schedulers (HTCondor, Slurm, PBS, LSF, etc.) | Hybrid Kubernetes + HPC environments |
Both backends use the same OGC API Processes interface - the difference is where the actual computation runs.
Prerequisites⚓︎
Common Requirements⚓︎
| Component | Requirement | Documentation Link |
|---|---|---|
| Kubernetes | Cluster (tested on v1.32) | Installation Guide |
| Helm | Version 3.5 or newer | Installation Guide |
| kubectl | Configured for cluster access | Installation Guide |
| Ingress | Properly installed | Installation Guide |
| TLS Certificates | Managed via cert-manager or manually | TLS Certificate Management Guide |
| Stage-In S3 | Accessible | MinIO Deployment Guide |
| Stage-Out S3 | Accessible | MinIO Deployment Guide |
Calrissian-Specific Requirements⚓︎
No additional requirements beyond the common prerequisites. Calrissian runs CWL workflows as Kubernetes jobs, so everything stays within your cluster.
Toil/HPC-Specific Requirements⚓︎
You’ll need an HPC cluster with:
- Container support (Docker or Apptainer/Singularity)
- Internet access from compute nodes (or local container registries and data repositories)
- A Toil WES service endpoint
Toil supports several batch schedulers: HTCondor, Slurm, PBS/Torque/PBS Pro, LSF, and Grid Engine.
Setting up a Local HTCondor (Development/Testing Only)⚓︎
This section applies only to Toil; if you are using Calrissian, skip it.
Warning: This setup is for development and testing purposes only. Do not use this in production - use your organisation’s HPC infrastructure instead.
If you don’t have access to an HPC cluster and want to test the Toil integration locally, you can install minicondor, a single-node HTCondor package designed for testing.
Install HTCondor using the official script:
# Download and run the HTCondor installer (installs minicondor by default)
curl -fsSL https://get.htcondor.org | sudo /bin/bash -s -- --no-dry-run
# Verify HTCondor is running
condor_status
You should see output showing your local machine as a condor slot. If condor_status returns an error, check that the condor service is running:
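On a systemd-based host, for example:

```bash
# Check (and if necessary start) the condor systemd service
sudo systemctl status condor
sudo systemctl start condor
```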
Configure Docker for HTCondor jobs:
HTCondor needs to run containers for CWL workflows. Add your user to the docker group and create a wrapper to mount /etc/hosts for DNS resolution:
# Add your user to the docker group
sudo usermod -a -G docker $USER
# Create a docker wrapper for DNS resolution in containers
sudo tee /usr/local/bin/docker > /dev/null << 'EOF'
#!/usr/bin/python3
# Forward all arguments to the real docker binary, adding a read-only
# bind-mount of /etc/hosts to "docker run" so containers can resolve
# local host names.
import sys, os
n = sys.argv
n[0] = "/usr/bin/docker"
if "run" in n:
    n.insert(n.index("run") + 1, "-v=/etc/hosts:/etc/hosts:ro")
os.execv(n[0], n)
EOF
sudo chmod +x /usr/local/bin/docker
# Log out and back in for the docker group change to take effect
After logging back in, verify HTCondor can see your machine:
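For example:

```bash
# The local machine should appear as one or more condor slots
condor_status
```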
Setting up Toil WES⚓︎
Already have a Toil WES service? Skip to Clone the Deployment Guide Repository.
If you need to set up Toil WES on your HPC cluster (or local minicondor), follow these steps. The examples use HTCondor, but the process is similar for other schedulers.
Install Toil
Install Toil in a Python virtual environment on storage accessible to all compute nodes:
# Create directories for Toil venv and job storage
mkdir -p ~/toil ~/toil/storage
python3 -m venv --prompt toil ~/toil/venv
# Activate and install Toil with required extras
source ~/toil/venv/bin/activate
python3 -m pip install toil[cwl,htcondor,server,aws] htcondor
Note: Replace `htcondor` with your batch system if different (e.g., `toil[cwl,slurm,server,aws]` for Slurm).
Test the Installation
Run a sample CWL workflow to verify everything works:
source ~/toil/venv/bin/activate
# Download a test application
wget https://github.com/EOEPCA/deployment-guide/raw/refs/heads/main/scripts/processing/oapip/examples/convert-url-app.cwl
# Create test directories and parameters
jobid=$(uuidgen)
mkdir -p ~/toil/storage/test/{work_dir,job_store}
cat <<EOF > ~/toil/storage/test/work_dir/$jobid.params.yaml
fn: resize
url: https://eoepca.org/media_portal/images/logo6_med.original.png
size: 50%
EOF
# Run the test (adjust --batchSystem for your scheduler)
toil-cwl-runner \
--batchSystem htcondor \
--workDir ~/toil/storage/test/work_dir \
--jobStore ~/toil/storage/test/job_store/$jobid \
convert-url-app.cwl#convert-url \
~/toil/storage/test/work_dir/$jobid.params.yaml
If successful, you’ll see JSON output representing a STAC Item. Clean up:
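For example, removing only the test artefacts created above (a minimal sketch; adjust paths if you changed them):

```bash
# Remove the test job store, work directory and the downloaded CWL file
rm -rf ~/toil/storage/test
rm -f convert-url-app.cwl
```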
Start the Toil WES Service
The WES service needs RabbitMQ for job queuing and Celery for queue management.
Start RabbitMQ:
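A simple way to run RabbitMQ locally is in a container. The image tag below is an assumption; the container name matches the one used in the uninstallation steps later in this guide:

```bash
# Run RabbitMQ in Docker, exposing the default AMQP port used by the broker URL below
docker run -d --name toil-wes-rabbitmq -p 5672:5672 rabbitmq:3
```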
Start Celery:
source ~/toil/venv/bin/activate
celery --broker=amqp://guest:guest@127.0.0.1:5672// -A toil.server.celery_app multi start w1 \
--loglevel=INFO --pidfile=$HOME/celery.pid --logfile=$HOME/celery.log
Start the Toil WES server:
source ~/toil/venv/bin/activate
mkdir -p $HOME/toil/storage/workdir $HOME/toil/storage/workflows
TOIL_WES_BROKER_URL=amqp://guest:guest@127.0.0.1:5672// nohup toil server \
--host 0.0.0.0 \
--work_dir $HOME/toil/storage/workflows \
--opt=--batchSystem=htcondor \
--opt=--workDir=$HOME/toil/storage/workdir \
--logFile $HOME/toil.log \
--logLevel INFO \
-w 1 &>$HOME/toil_run.log </dev/null &
echo "$!" > $HOME/toil.pid
sleep 5
Note: Adjust `--batchSystem=htcondor` to match your scheduler.
Verify the WES Service
Query the `service-info` endpoint to confirm the server is responding; you should see JSON service information. Your WES endpoint URL will be of the form `http://<host-ip>:8080/ga4gh/wes/v1/`.
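For example (assuming the default port 8080, as used elsewhere in this guide; replace `<host-ip>` with the address of the machine running Toil WES):

```bash
# Query the WES service-info endpoint
curl -s http://<host-ip>:8080/ga4gh/wes/v1/service-info | jq
```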
For both Calrissian and Toil
Clone the Deployment Guide Repository⚓︎
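If you haven’t already, clone the repository and change to the OAPIP scripts directory (the path is taken from the example URLs used later in this guide):

```bash
git clone https://github.com/EOEPCA/deployment-guide
cd deployment-guide/scripts/processing/oapip
```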
Validate your environment:
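The repository provides a prerequisites check you can run before deploying; the script name below is an assumption, so adjust it to match your checkout:

```bash
# Confirm kubectl, helm and cluster connectivity (script name assumed)
bash check-prerequisites.sh
```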
Deployment⚓︎
Run the Configuration Script⚓︎
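The deployment-guide repository provides a configuration script for this building block that prompts for the parameters described below and writes `generated-values.yaml`; the script name is an assumption, so adjust it to match your checkout:

```bash
# From the OAPIP scripts directory (script name assumed)
bash configure-oapip.sh
```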
Common Configuration Parameters⚓︎
- `INGRESS_HOST`: Base domain for ingress hosts.
  - Example: `example.com`
- `CLUSTER_ISSUER` (if using `cert-manager`): Name of the ClusterIssuer.
  - Example: `letsencrypt-http01-apisix`
- `PERSISTENT_STORAGECLASS`: Storage class for persistent volumes.
  - Example: `standard`
Workspace Integration⚓︎
The engine supports two options for stage-out of processing results:
- With the EOEPCA+ Workspace BB - results go directly to the user’s workspace bucket
- With a dedicated S3 bucket - results go to a pre-configured shared bucket
This is controlled by:
- `USE_WORKSPACE_API`: Set to `true` to integrate with user Workspace storage
If using Workspace integration:
- The Workspace BB must already be deployed
- The username from the JWT Bearer token (or path prefix for open services) determines which workspace bucket to use, following the `ws-<username>` naming convention
Stage-Out S3 Configuration⚓︎
Ensure you have an S3-compatible object store set up. See the MinIO Deployment Guide if needed.
- `S3_ENDPOINT`, `S3_ACCESS_KEY`, `S3_SECRET_KEY`, `S3_REGION`: Credentials for Stage-Out storage
Stage-In S3 Configuration⚓︎
If your input data is hosted separately from output storage:
- `STAGEIN_S3_ENDPOINT`, `STAGEIN_S3_ACCESS_KEY`, `STAGEIN_S3_SECRET_KEY`, `STAGEIN_S3_REGION`
OIDC Configuration⚓︎
Note: The EOEPCA OIDC protection requires the APISIX Ingress Controller. If you’re using a different ingress controller, OIDC will not be available and you can skip this configuration.
If using APISIX, you can enable OIDC authentication during configuration. When prompted for the Client ID, we recommend oapip-engine.
See the IAM Building Block guide for IAM setup, and Enable OIDC with Keycloak below for post-deployment configuration.
Calrissian Configuration⚓︎
When prompted for execution engine, select calrissian. You’ll need to configure:
- `NODE_SELECTOR_KEY`: Determines which nodes run processing workflows
  - Example: `kubernetes.io/os`
  - Read more: Node Selector Documentation
- `NODE_SELECTOR_VALUE`: Value for the node selector
  - Example: `linux`
Toil Configuration⚓︎
When prompted for execution engine, select toil. You’ll need to configure:
- `OAPIP_TOIL_WES_URL`: Your Toil WES endpoint, which must end with `/ga4gh/wes/v1/`
  - Example: `http://192.168.1.100:8080/ga4gh/wes/v1/`
  - Read more: Zoo WES Runner documentation
- `OAPIP_TOIL_WES_USER`: WES service username
  - Example: `test`
- `OAPIP_TOIL_WES_PASSWORD`: WES service password (htpasswd format)
  - Example: `$2y$12$ci.4U63YX83CwkyUrjqxAucnmi2xXOIlEF6T/KdP9824f1Rf1iyNG`
Note: If you set up Toil WES without authentication (as in the setup guide above), use placeholder credentials - they’ll be ignored.
Important: Network Reachability
The WES URL must be reachable from within the Kubernetes cluster.
- If Toil runs on the same machine as Kubernetes, use the host’s IP address (e.g., `http://192.168.1.100:8080/ga4gh/wes/v1/`)
- If Toil runs on a separate HPC system, ensure network routing and firewall rules allow traffic from the Kubernetes pod network to the WES endpoint
You can verify connectivity from within the cluster:
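For example, using a throwaway curl pod and the example WES address from above (substitute your own endpoint):

```bash
# Run a one-off pod that queries the WES service-info endpoint from inside the cluster
kubectl run wes-check --rm -i --restart=Never --image=curlimages/curl --command -- \
  curl -s http://192.168.1.100:8080/ga4gh/wes/v1/service-info
```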
Deploy the Helm Chart⚓︎
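If the zoo-project chart repository is not yet configured on your machine, add it first (repository URL as published by the ZOO-Project; adjust if you mirror charts internally):

```bash
helm repo add zoo-project https://zoo-project.github.io/charts
helm repo update
```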
helm upgrade -i zoo-project-dru zoo-project/zoo-project-dru \
--version 0.9.1 \
--values generated-values.yaml \
--namespace processing \
--create-namespace
Optional: Enable OIDC with Keycloak⚓︎
This requires the APISIX Ingress Controller. If you’re using a different Ingress Controller, skip to Validation.
Skip this section if you don’t need IAM protection right now - the engine will work, just without access restrictions.
To protect OAPIP endpoints with Keycloak tokens and policies, follow these steps after enabling OIDC in the configuration script.
First, ensure you’ve followed the IAM Deployment Guide and have Keycloak running.
Create a Keycloak Client⚓︎
source ~/.eoepca/state
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: ${OAPIP_CLIENT_ID}-keycloak-client
  namespace: iam-management
stringData:
  client_secret: ${OAPIP_CLIENT_SECRET}
---
apiVersion: openidclient.keycloak.m.crossplane.io/v1alpha1
kind: Client
metadata:
  name: ${OAPIP_CLIENT_ID}
  namespace: iam-management
spec:
  forProvider:
    realmId: ${REALM}
    clientId: ${OAPIP_CLIENT_ID}
    name: Processing OAPIP Engine
    description: Processing OAPIP Engine OIDC
    enabled: true
    accessType: CONFIDENTIAL
    rootUrl: ${HTTP_SCHEME}://zoo.${INGRESS_HOST}
    baseUrl: ${HTTP_SCHEME}://zoo.${INGRESS_HOST}
    adminUrl: ${HTTP_SCHEME}://zoo.${INGRESS_HOST}
    serviceAccountsEnabled: true
    directAccessGrantsEnabled: true
    standardFlowEnabled: true
    oauth2DeviceAuthorizationGrantEnabled: true
    useRefreshTokens: true
    authorization:
      - allowRemoteResourceManagement: false
        decisionStrategy: UNANIMOUS
        keepDefaults: true
        policyEnforcementMode: ENFORCING
    validRedirectUris:
      - "/*"
    webOrigins:
      - "/*"
    clientSecretSecretRef:
      name: ${OAPIP_CLIENT_ID}-keycloak-client
      key: client_secret
  providerConfigRef:
    name: provider-keycloak
    kind: ProviderConfig
EOF
Protect the User’s Processing Context⚓︎
The ZOO-Project uses a path prefix to establish user context (e.g., /<username>/ogc-api/processes/...). You can protect this so only the owning user can access it.
This example protects the context for eoepcauser (see Create Test Users):
source ~/.eoepca/state
export OAPIP_USER="${KEYCLOAK_TEST_USER}"
envsubst < protect-oapip-user.yaml | kubectl apply -f -
This creates: eoepcauser-group, eoepcauser-membership, eoepcauser-resource, eoepcauser-policy, eoepcauser-access.
Create APISIX Route Ingress⚓︎
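The configuration script produces an ingress manifest for the APISIX route (the same `generated-ingress.yaml` that is removed again during uninstallation); apply it to the cluster:

```bash
kubectl apply -f generated-ingress.yaml
```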
Confirm Protection⚓︎
Wait for the ingress and TLS to be established first.
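As a quick check, call the protected context with and without a token (a minimal sketch, assuming you have obtained an access token for the test user, e.g. via the IAM guide, in `ACCESS_TOKEN`):

```bash
source ~/.eoepca/state
export OAPIP_USER="${KEYCLOAK_TEST_USER}"

# Without a token the request should be rejected (401 or a redirect to Keycloak)
curl -s -o /dev/null -w "%{http_code}\n" \
  "https://zoo.${INGRESS_HOST}/${OAPIP_USER}/ogc-api/processes"

# With a valid token for the owning user the request should return 200
curl -s -o /dev/null -w "%{http_code}\n" \
  -H "Authorization: Bearer ${ACCESS_TOKEN}" \
  "https://zoo.${INGRESS_HOST}/${OAPIP_USER}/ogc-api/processes"
```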
If you see 401 Authorization errors when using a valid token, check your token and resource protection configuration.
For more detailed testing, see Resource Protection with Keycloak Policies.
Validation⚓︎
Automated Validation⚓︎
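The deployment-guide repository ships helper scripts alongside the configuration script; the validation script name below is an assumption, so adjust it to match your checkout:

```bash
# From the OAPIP scripts directory (script name assumed)
bash validation.sh
```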
Web Endpoints⚓︎
Check these are accessible:
- ZOO-Project Swagger UI - `https://zoo.${INGRESS_HOST}/swagger-ui/oapip/`
- OGC API Processes Landing Page - `https://zoo.${INGRESS_HOST}/ogc-api/processes/`
Expected Kubernetes Resources⚓︎
All pods should be Running with no CrashLoopBackOff or Error states.
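For example, assuming the `processing` namespace used by the Helm install above:

```bash
kubectl get pods -n processing
```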
Using the API⚓︎
This walkthrough covers deploying, executing, monitoring, and retrieving results from a sample application.
Prefer a notebook? Run `../../../notebooks/run.sh` and open the OAPIP Engine Validation notebook at `http://localhost:8888`.
Initialise Environment⚓︎
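The following commands assume you are working from the OAPIP scripts directory of your deployment-guide checkout, with the generated state file and helper script available:

```bash
cd deployment-guide/scripts/processing/oapip   # adjust to your checkout location
source ~/.eoepca/state
source oapip-utils.sh
```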
List Processes⚓︎
source oapip-utils.sh
curl --silent --show-error \
-X GET "${OAPIP_HOST}/${OAPIP_USER}/ogc-api/processes" \
${OAPIP_AUTH_HEADER:+-H "$OAPIP_AUTH_HEADER"} \
-H "Accept: application/json" | jq
Deploy Process convert⚓︎
source oapip-utils.sh
curl --silent --show-error \
-X POST "${OAPIP_HOST}/${OAPIP_USER}/ogc-api/processes" \
${OAPIP_AUTH_HEADER:+-H "$OAPIP_AUTH_HEADER"} \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-d @- <<EOF | jq
{
"executionUnit": {
"href": "https://raw.githubusercontent.com/EOEPCA/deployment-guide/refs/heads/main/scripts/processing/oapip/examples/convert-url-app.cwl",
"type": "application/cwl"
}
}
EOF
Verify it’s deployed:
source oapip-utils.sh
curl --silent --show-error \
-X GET "${OAPIP_HOST}/${OAPIP_USER}/ogc-api/processes/convert-url" \
${OAPIP_AUTH_HEADER:+-H "$OAPIP_AUTH_HEADER"} \
-H "Accept: application/json" | jq
Execute Process convert⚓︎
source oapip-utils.sh
JOB_ID=$(
curl --silent --show-error \
-X POST "${OAPIP_HOST}/${OAPIP_USER}/ogc-api/processes/convert-url/execution" \
${OAPIP_AUTH_HEADER:+-H "$OAPIP_AUTH_HEADER"} \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-H "Prefer: respond-async" \
-d @- <<EOF | jq -r '.jobID'
{
"inputs": {
"fn": "resize",
"url": "https://eoepca.org/media_portal/images/logo6_med.original.png",
"size": "50%"
}
}
EOF
)
echo "JOB ID: ${JOB_ID}"
Check Execution Status⚓︎
source oapip-utils.sh
curl --silent --show-error \
-X GET "${OAPIP_HOST}/${OAPIP_USER}/ogc-api/jobs/${JOB_ID}" \
${OAPIP_AUTH_HEADER:+-H "$OAPIP_AUTH_HEADER"} \
-H "Accept: application/json" | jq
The `status` field shows `running`, `successful`, or `failed`.
Check Execution Results⚓︎
Once the job completes successfully:
source oapip-utils.sh
curl --silent --show-error \
-X GET "${OAPIP_HOST}/${OAPIP_USER}/ogc-api/jobs/${JOB_ID}/results" \
${OAPIP_AUTH_HEADER:+-H "$OAPIP_AUTH_HEADER"} \
-H "Accept: application/json" | jq
Use the MinIO `mc` CLI or the MinIO console (installed as per the MinIO Deployment Guide) to access the output file in the Stage-Out bucket.
source ~/.eoepca/state
if [ "${USE_WORKSPACE_API=}" = "true" ]; then BUCKET_NAME="ws-${OAPIP_USER}"; else BUCKET_NAME="eoepca"; fi
xdg-open "https://console-minio.${INGRESS_HOST}/browser/${BUCKET_NAME}/processing-results/"
Undeploy Process convert⚓︎
source oapip-utils.sh
curl --silent --show-error \
-X DELETE "${OAPIP_HOST}/${OAPIP_USER}/ogc-api/processes/convert-url" \
${OAPIP_AUTH_HEADER:+-H "$OAPIP_AUTH_HEADER"} \
-H "Accept: application/json" | jq
Monitoring Jobs on HPC (Toil only)⚓︎
When using Toil, you can also monitor jobs directly on the HPC cluster:
Toil WES logs:
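The log locations below match the paths configured when starting the services earlier in this guide:

```bash
# Toil WES server log (as configured with --logFile above)
tail -f $HOME/toil.log
# Celery worker log
tail -f $HOME/celery.log
```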
HPC queue status:
For HTCondor:
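```bash
# List jobs in the HTCondor queue
condor_q
```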
For Slurm:
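```bash
# List jobs in the Slurm queue for the current user
squeue -u $USER
```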
Uninstallation⚓︎
Remove the OAPIP Engine⚓︎
source ~/.eoepca/state
export OAPIP_USER="${KEYCLOAK_TEST_USER}"
kubectl delete -f generated-ingress.yaml
envsubst < protect-oapip-user.yaml | kubectl delete -f -
kubectl -n iam-management delete client.openidclient.keycloak.m.crossplane.io ${OAPIP_CLIENT_ID}
kubectl -n iam-management delete secret ${OAPIP_CLIENT_ID}-keycloak-client
helm -n processing uninstall zoo-project-dru
kubectl delete ns processing
Stop Toil WES (Toil only)⚓︎
If you set up Toil WES on your HPC cluster:
# Stop Toil server
kill $(cat $HOME/toil.pid)
# Stop Celery
celery --broker=amqp://guest:guest@127.0.0.1:5672// -A toil.server.celery_app multi stop w1 \
--pidfile=$HOME/celery.pid
# Stop RabbitMQ
docker stop toil-wes-rabbitmq
docker rm toil-wes-rabbitmq
Further Reading⚓︎
General:
- ZOO-Project DRU Helm Chart
- EOEPCA+ Deployment Guide Repository
- OGC API Processes Standards
- Common Workflow Language (CWL)
Calrissian:
Toil:
- Toil Documentation
- Toil WES Server Documentation
- Zoo WES Runner Documentation
- EOEPCA+ Cookiecutter Template (WES)
Feedback⚓︎
If you have any issues or suggestions, please open an issue on the EOEPCA+ Deployment Guide GitHub Repository.