Data models
Process chains
As described in the section on workflow scheduling, Steep
transforms a workflow to one or more process chains. A process chain is a
sequential list of instructions that will be sent to Steep’s remote agents to
execute processing services in a distributed environment.
Some of the properties specified in the table below are only available once
Steep has started executing a process chain (e.g. startTime) or after the
execution has finished (e.g. endTime). Also, the /processchains HTTP endpoint,
which provides a list of process chains, omits some properties although they
are marked as required in the table below (e.g. executables or totalRuns).
If you want to get all required properties, you have to use the
/processchains/:id HTTP endpoint.
Steep records each execution of a process chain in a separate ‘run’. The
property totalRuns specifies how often a process chain has been executed
(including any currently running execution). If a process chain has just been
created and still has the status REGISTERED, totalRuns equals 0. As soon as the
status switches to RUNNING, a new run is created and totalRuns is incremented
to 1. If, for example, a process chain fails and is later retried, another run
is created and totalRuns becomes 2, and so on.
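The counting rule can be illustrated with a small toy model. This is only a sketch of the behaviour described above, not Steep's implementation. The class `ProcessChainModel` and its methods are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProcessChainModel:
    """Hypothetical toy model of the run counter (not Steep's code)."""
    status: str = "REGISTERED"
    total_runs: int = 0

    def start(self) -> None:
        # Switching to RUNNING creates a new run and increments the counter.
        self.status = "RUNNING"
        self.total_runs += 1

    def fail(self) -> None:
        # A failed run does not change the number of recorded runs.
        self.status = "ERROR"


chain = ProcessChainModel()
runs_when_registered = chain.total_runs  # no run has been created yet

chain.start()
runs_after_first_start = chain.total_runs

chain.fail()
chain.start()  # retry: another run is created
runs_after_retry = chain.total_runs
```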
Requesting a process chain through the HTTP endpoints /processchains or
/processchains/:id always renders the latest run. The properties agentId,
startTime, endTime, status, errorMessage, and autoResumeAfter depend on the
actual run rendered (e.g. different runs have different start times; or one run
might have failed, while a newer one might have succeeded or is still running,
so their statuses are different). If you want to list all runs of a process
chain or retrieve information about a specific run, use the
/processchains/:id/runs or /processchains/:id/runs/:runNumber HTTP endpoints,
respectively. The property runNumber from the table below specifies which run
out of totalRuns is rendered.
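As a rough sketch, the four endpoint paths mentioned above could be assembled as follows. The helper names `process_chain_path` and `runs_path` are hypothetical and not part of Steep. The code only builds path strings and makes no assumption about the host or port of a Steep instance.

```python
from typing import Optional
from urllib.parse import quote


def process_chain_path(chain_id: Optional[str] = None) -> str:
    """Path of the list endpoint, or of a single process chain if an ID is given."""
    if chain_id is None:
        return "/processchains"
    # Percent-encode the ID so it is safe to use as a single path segment.
    return f"/processchains/{quote(chain_id, safe='')}"


def runs_path(chain_id: str, run_number: Optional[int] = None) -> str:
    """Path listing all runs of a process chain, or one specific run."""
    base = f"{process_chain_path(chain_id)}/runs"
    if run_number is None:
        return base
    return f"{base}/{run_number}"


# "abc123" is a made-up process chain ID used only for illustration.
latest_run = process_chain_path("abc123")
all_runs = runs_path("abc123")
second_run = runs_path("abc123", 2)
```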
Property | Type | Description |
---|---|---|
id (required) | string | Unique process chain identifier |
executables (required) | array | A list of executable objects that describe what processing services should be called and with which arguments |
submissionId (required) | string | The ID of the submission to which this process chain belongs |
agentId (optional) | string | The ID of the agent that currently executes the process chain (if its status is RUNNING) or has executed it (if it is finished). May be null if the execution has not started yet. |
startTime (optional) | string | An ISO 8601 timestamp denoting the date and time when the process chain execution was started. May be null if the execution has not started yet. |
endTime (optional) | string | An ISO 8601 timestamp denoting the date and time when the process chain execution finished. May be null if the execution has not finished yet. |
status (required) | string | The current status of the process chain |
requiredCapabilities (optional) | array | A set of strings specifying capabilities a host system must provide to be able to execute this process chain. See also setups. |
priority (optional) | number | A priority used during scheduling. Process chains with higher priorities will be scheduled before those with lower priorities. Negative values are allowed. The default value is 0. |
retries (optional) | object | An optional retry policy specifying how often this process chain will be rescheduled in case an error has occurred. |
results (optional) | object | If status is SUCCESS, this property contains the list of process chain result files grouped by their output variable ID. Otherwise, it is null. |
totalRuns (required) | number | The number of times the process chain has been executed (including any currently running execution). |
runNumber (optional) | number | The number of the run currently rendered. May be null if the process chain has not been executed yet. |
autoResumeAfter (optional) | string | If the process chain’s status is PAUSED, this optional property may specify a point in time (as an ISO 8601 timestamp) after which the process chain will be automatically resumed. It is typically only given if a retry policy is configured (see retries property): if a process chain run has failed and there are still attempts left, autoResumeAfter specifies when the next attempt will be performed. |
estimatedProgress (optional) | number | A floating point number between 0.0 (0%) and 1.0 (100%) indicating the current execution progress of this process chain. This property will only be provided if the process chain is currently being executed (i.e. if its status equals RUNNING) and if the progress could actually be estimated. Note that the value is an estimate based on various factors and does not have to represent the real progress. More precise values can be calculated with a progress estimator plugin. Sometimes, progress cannot be estimated at all. In this case, the value will be null. |
errorMessage (optional) | string | If status is ERROR, this property contains a human-readable error message. Otherwise, it is null. |
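
Because the /processchains list endpoint may omit properties that the table marks as required, a client might want to check which ones are missing before relying on them. The following is a minimal sketch of such a check. The function `missing_required` and the sample object are hypothetical. The tuple of required names is taken from the table above.

```python
from typing import Any, Dict, List

# Properties marked "(required)" in the process chain table.
REQUIRED_PROPERTIES = ("id", "executables", "submissionId", "status", "totalRuns")


def missing_required(process_chain: Dict[str, Any]) -> List[str]:
    """Return the required property names that are absent from the object."""
    return [name for name in REQUIRED_PROPERTIES if name not in process_chain]


# Made-up object resembling an entry from the list endpoint, which omits
# some required properties such as executables and totalRuns.
list_entry = {"id": "pc-1", "submissionId": "sub-1", "status": "REGISTERED"}
missing = missing_required(list_entry)
```

If the returned list is not empty, the full object can be requested through the /processchains/:id endpoint.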
Executables
An executable is part of a process chain.
It describes how a processing service should be executed and with which parameters.
Property | Type | Description |
---|---|---|
id (required) | string | An identifier (does not have to be unique). Typically refers to the id of the execute action from which the executable was derived. Possibly suffixed with a dollar sign $ and a number denoting the iteration of an enclosing for-each action (e.g. myaction$1) or nested for-each actions (e.g. myaction$2$1). |
path (required) | string | The path to the binary of the service to be executed. This property is specific to the runtime. For example, for the docker and the kubernetes runtimes, this property refers to the container image. |
serviceId (required) | string | The ID of the processing service to be executed. |
arguments (required) | array | A list of arguments to pass to the service. May be empty. |
runtime (required) | string | The name of the runtime that will execute the service. Built-in runtimes are currently other (for any service that is executable on the target system), docker for Docker containers, and kubernetes for Kubernetes jobs. More runtimes can be added through plugins. |
runtimeArgs (optional) | array | A list of arguments to pass to the runtime. May be empty. |
retries (optional) | object | An optional retry policy specifying how often this executable should be restarted in case of an error. |
maxInactivity (optional) | object | An optional timeout policy that defines how long the executable can run without producing any output (i.e. without writing anything to the standard output and error streams) before it is automatically cancelled or aborted. |
maxRuntime (optional) | object | An optional timeout policy that defines how long the executable can run before it is automatically cancelled or aborted, even if the service regularly writes to the standard output and error streams. |
deadline (optional) | object | An optional timeout policy that defines how long the executable can run at all (including all retries and their associated delays) until it is cancelled or aborted. |
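
The iteration suffix described for the id property can be split off with a few lines of code. This is only a sketch. The function `split_executable_id` is a hypothetical helper, and it assumes that every trailing `$<number>` segment denotes a for-each iteration, which may not hold if an action id itself ends in such a segment.

```python
from typing import List, Tuple


def split_executable_id(executable_id: str) -> Tuple[str, List[int]]:
    """Split an executable id into its base id and trailing iteration numbers."""
    parts = executable_id.split("$")
    iterations: List[int] = []
    # Peel numeric segments off the end; the outermost iteration comes first.
    while len(parts) > 1 and parts[-1].isdecimal():
        iterations.insert(0, int(parts.pop()))
    return "$".join(parts), iterations


plain = split_executable_id("myaction")
single = split_executable_id("myaction$1")
nested = split_executable_id("myaction$2$1")
```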
Arguments
An argument is part of an executable.
Property | Type | Description |
---|---|---|
id (required) | string | An argument identifier |
label (optional) | string | An optional label to use when the argument is passed to the service (e.g. --input). |
variable (required) | object | A variable that holds the value of this argument. |
type (required) | string | The type of this argument. Valid values: input, output |
dataType (required) | string | The type of the argument value. If this property is directory, Steep will create a new directory for the service’s output and recursively search it for result files after the service has been executed. Otherwise, this property can be an arbitrary string. New data types with special handling can be added through output adapter plugins. |
Argument variables
An argument variable holds the value of an argument.
Property | Type | Description |
---|---|---|
id (required) | string | The variable’s unique identifier |
value (required) | string | The variable’s value |
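
To show how arguments and their variables fit together, the sketch below flattens a list of argument objects into command-line tokens: the optional label first, then the variable's value. This ordering is an assumption made for illustration and is not taken from Steep. The function `to_tokens` and all sample values are hypothetical.

```python
from typing import Any, Dict, List


def to_tokens(arguments: List[Dict[str, Any]]) -> List[str]:
    """Flatten argument objects into tokens (assumed order: label, then value)."""
    tokens: List[str] = []
    for argument in arguments:
        label = argument.get("label")  # label is optional
        if label is not None:
            tokens.append(label)
        tokens.append(argument["variable"]["value"])
    return tokens


# Made-up arguments that follow the property tables above.
arguments = [
    {
        "id": "arg1",
        "label": "--input",
        "variable": {"id": "input_file", "value": "/data/in.txt"},
        "type": "input",
        "dataType": "file",
    },
    {
        "id": "arg2",
        "variable": {"id": "output_dir", "value": "/data/out"},
        "type": "output",
        "dataType": "directory",
    },
]
tokens = to_tokens(arguments)
```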
Process chain status
The following table shows the statuses a process chain
can have:
Status | Description |
---|---|
REGISTERED | The process chain has been created but execution has not started yet |
RUNNING | The process chain is currently being executed |
PAUSED | The execution of the process chain is paused |
CANCELLED | The execution of the process chain was cancelled |
SUCCESS | The process chain was executed successfully |
ERROR | The execution of the process chain failed |
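
In client code, the statuses can be modelled as an enumeration. The sketch below is one possible way to do this. The names `ProcessChainStatus`, `FINISHED`, and `is_finished` are hypothetical. Treating CANCELLED, SUCCESS, and ERROR as finished is an assumption: a failed process chain may still be retried in a new run if a retry policy is configured.

```python
from enum import Enum


class ProcessChainStatus(str, Enum):
    REGISTERED = "REGISTERED"
    RUNNING = "RUNNING"
    PAUSED = "PAUSED"
    CANCELLED = "CANCELLED"
    SUCCESS = "SUCCESS"
    ERROR = "ERROR"


# Assumption: these statuses mean the rendered run will not progress further.
FINISHED = {
    ProcessChainStatus.CANCELLED,
    ProcessChainStatus.SUCCESS,
    ProcessChainStatus.ERROR,
}


def is_finished(status: str) -> bool:
    """Return True if the status string denotes a finished run.

    Raises ValueError for strings that are not in the status table.
    """
    return ProcessChainStatus(status) in FINISHED
```

For example, `is_finished("RUNNING")` evaluates to `False`, while `is_finished("SUCCESS")` evaluates to `True`.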