Data models

Workflows

The main components of the workflow model are variables and actions. Use variables to specify input files and parameters for your processing services. Variables for output files must not have a value. The names of output files will be generated by Steep during workflow execution.

| Property | Type | Description |
| --- | --- | --- |
| api (required) | string | The API (or data model) version. Should be 4.7.0. |
| name (optional) | string | An optional human-readable workflow name |
| priority (optional) | number | A priority used during scheduling. Process chains generated from workflows with higher priorities will be scheduled before those with lower priorities. Negative values are allowed. The default value is 0. |
| retries (optional) | object | Default retry policies that should be used within the workflow unless more specific retry policies are defined elsewhere. |
| vars (optional) | array | An array of variables |
| actions (required) | array | An array of actions that make up the workflow |
Example
api: 4.7.0
actions:
  - type: execute
    service: copy
    inputs:
      - id: input_file
        value: example1.txt
    outputs:
      - id: output_file
        var: outputFile1
 
  - type: execute
    service: copy
    inputs:
      - id: input_file
        value: example2.txt
    outputs:
      - id: output_file
        var: outputFile2
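
The optional workflow properties can be combined with the structure shown above. The following is a minimal sketch, not a definitive workflow: the service ID copy and its parameter IDs are taken from the example above, while the workflow name, the priority value, and the variable IDs my_input and my_output are illustrative assumptions.

```yaml
api: 4.7.0
name: Copy a single file     # optional, human-readable
priority: 10                 # higher than the default of 0
vars:
  - id: my_input
    value: example1.txt      # defined variable, used as an input
  - id: my_output            # no value: Steep generates the output file name
actions:
  - type: execute
    service: copy
    inputs:
      - id: input_file
        var: my_input
    outputs:
      - id: output_file
        var: my_output
```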

Variables

A variable holds a value for inputs and outputs of processing services. It can be defined (inputs) or undefined (outputs). Defined values are immutable. Undefined variables will be assigned a value by Steep during workflow execution.

Variables are also used to link two services together and to define the data flow in the workflow graph. For example, if the output parameter of a service A refers to a variable V, and the input parameter of service B refers to the same variable, Steep will first execute A to determine the value of V and then execute B.

| Property | Type | Description |
| --- | --- | --- |
| id (required) | string | A unique variable identifier |
| value (optional) | any | The variable’s value or null if the variable is undefined |
Example
id: input_file
value: /data/input.txt
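
To illustrate how a shared variable links two services, here is a hedged sketch in which the first action writes to intermediate_file and the second action reads from it. The variable IDs and the reuse of the copy service for both steps are assumptions made for illustration only.

```yaml
api: 4.7.0
vars:
  - id: source_file
    value: /data/input.txt
  - id: intermediate_file      # undefined: assigned by Steep when the first action runs
  - id: final_file             # undefined: assigned by Steep when the second action runs
actions:
  # Service A: produces the value of intermediate_file
  - type: execute
    service: copy
    inputs:
      - id: input_file
        var: source_file
    outputs:
      - id: output_file
        var: intermediate_file

  # Service B: consumes intermediate_file, so it has to wait for A
  - type: execute
    service: copy
    inputs:
      - id: input_file
        var: intermediate_file
    outputs:
      - id: output_file
        var: final_file
```

No dependsOn attribute is needed here because the dependency follows from the shared variable.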

Actions

There are three types of actions in a workflow: execute actions, for-each actions, and include actions. They are differentiated by their type attribute.

Execute actions

An execute action instructs Steep to execute a certain service with given inputs and outputs.

| Property | Type | Description |
| --- | --- | --- |
| id (optional) | string | An optional string uniquely identifying the action within the workflow. If not given, a random identifier will be generated. |
| type (required) | string | The type of the action. Must be execute. |
| service (required) | string | The ID of the service to execute |
| inputs (optional) | array | An array of input parameters |
| outputs (optional) | array | An array of output parameters |
| dependsOn (optional) | array | A list of IDs of actions that must finish before this action is ready to be executed. Note that Steep is able to identify dependencies between actions itself based on outputs and inputs, so this attribute is normally not needed. However, it may be useful if a preceding action does not have an output parameter or if the depending action does not have an input parameter. Execute actions may depend on other execute actions but also on for-each actions and include actions and vice versa. |
| retries (optional) | object | An optional retry policy specifying how often this action should be retried in case of an error. Overrides any default retry policy defined in the service metadata. |
| maxInactivity (optional) | duration or object | An optional duration or timeout policy that defines how long the execution of the service can take without producing any output (i.e. without writing anything to the standard output and error streams) before it is automatically cancelled or aborted. Can be combined with maxRuntime and deadline (see below). Overrides any default inactivity timeout defined in the service metadata. Note that a service cancelled due to inactivity is still subject to any configured retry policy, which means its execution may be retried even if one attempt timed out. If you want to cancel a long-running service immediately even if there is a retry policy configured, use a deadline. |
| maxRuntime (optional) | duration or object | An optional duration or timeout policy that defines how long the execution of the service can take before it is automatically cancelled or aborted, even if the service regularly writes to the standard output and error streams. Can be combined with maxInactivity (see above) and deadline (see below). Overrides any default maximum runtime defined in the service metadata. Note that a service cancelled because it exceeded its maximum runtime is still subject to any configured retry policy, which means its execution may be retried even if one attempt timed out. If you want to cancel a long-running service immediately even if there is a retry policy configured, use a deadline. |
| deadline (optional) | duration or object | An optional duration or timeout policy that defines how long the execution of the service can take at all (including all retries and their associated delays) until it is cancelled or aborted. Can be combined with maxInactivity and maxRuntime (see above). Overrides any default deadline defined in the service metadata. |
Example
type: execute
service: my_service
inputs:
  - id: verbose
    var: is_verbose
  - id: resolution
    value: 10
  - id: input_file
    var: my_input_file
outputs:
  - id: output_file
    var: my_output_file
    store: true
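
The optional attributes dependsOn, retries, maxInactivity, maxRuntime, and deadline can be combined on a single execute action. The sketch below is illustrative only: the service my_setup_service and the action IDs are hypothetical, the retry policy attributes are the ones used in the retry policy defaults example in this document, and the timeouts are written as plain durations in the same style as the delay values there.

```yaml
actions:
  - id: prepare
    type: execute
    service: my_setup_service   # hypothetical service without output parameters

  - id: process
    type: execute
    service: my_service
    dependsOn:
      - prepare                 # no shared variable, so the dependency is explicit
    inputs:
      - id: input_file
        var: my_input_file
    outputs:
      - id: output_file
        var: my_output_file
    retries:
      maxAttempts: 3
      delay: 1s
    maxInactivity: 30s          # cancel an attempt that produced no output for 30s
    maxRuntime: 600s            # cancel an attempt that runs longer than 600s
    deadline: 3600s             # overall limit, including retries and their delays
```

In this sketch, an attempt cancelled by maxInactivity or maxRuntime may still be retried, whereas the deadline bounds the total time across all attempts.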

For-each actions

A for-each action has an input, a list of sub-actions, and an output. It clones the sub-actions as many times as there are items in its input, executes the actions, and then collects the results in the output.

Although the action is called ‘for-each’, the execution order of the sub-actions is undefined (i.e. the execution is non-sequential and non-deterministic). Instead, Steep always tries to execute as many sub-actions as possible in parallel.

For-each actions may contain execute actions as well as nested for-each actions.

| Property | Type | Description |
| --- | --- | --- |
| id (optional) | string | An optional string uniquely identifying the action within the workflow. If not given, a random identifier will be generated. |
| type (required) | string | The type of the action. Must be for. |
| input (required) | string | The ID of a variable containing the items to which to apply the sub-actions |
| enumerator (required) | string | The ID of a variable that holds the current value from input for each iteration |
| output (optional) | string | The ID of a variable that will collect output values from all iterations (see yieldToOutput) |
| dependsOn (optional) | array | A list of IDs of actions that must finish before this action is ready to be executed. Note that Steep is able to identify dependencies between actions itself based on outputs and inputs, so this attribute is normally not needed. However, it may be useful if a preceding action does not have an output parameter or if the depending action does not have an input parameter. For-each actions may depend on execute actions and include actions but also on other for-each actions and vice versa. |
| actions (optional) | array | An array of sub-actions to execute in each iteration |
| yieldToOutput (optional) | string | The ID of a sub-action’s output variable whose value should be appended to the for-each action’s output |
| yieldToInput (optional) | string | The ID of a sub-action’s output variable whose value should be appended to the for-each action’s input to generate further iterations |
Example
type: for
input: all_input_files
output: all_output_files
enumerator: i
yieldToOutput: output_file
actions:
  - type: execute
    service: copy
    inputs:
      - id: input
        var: i
    outputs:
      - id: output
        var: output_file
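
To show how the collected output of a for-each action can feed a later step, the following sketch embeds the example above in a complete workflow and passes all_output_files to a subsequent execute action. The service join, its parameter IDs, the variable joined_file, and the list of input files are hypothetical and only serve as an illustration.

```yaml
api: 4.7.0
vars:
  - id: all_input_files
    value:
      - example1.txt
      - example2.txt
actions:
  - type: for
    input: all_input_files
    enumerator: i
    output: all_output_files
    yieldToOutput: output_file    # each iteration appends its output_file
    actions:
      - type: execute
        service: copy
        inputs:
          - id: input
            var: i
        outputs:
          - id: output
            var: output_file

  # Runs after the for-each action because it reads all_output_files
  - type: execute
    service: join                 # hypothetical service
    inputs:
      - id: inputs
        var: all_output_files
    outputs:
      - id: output
        var: joined_file
```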

Include actions

Include actions can be used to include the actions of a macro at a certain point in a workflow. They can also be used in macros to include other macros.

Note that include actions are evaluated in a static pre-processing step during workflow parsing. The pre-processor replaces each include action with the list of actions it specifies and takes care of assigning parameter values as well as renaming IDs and variables to avoid naming collisions.

| Property | Type | Description |
| --- | --- | --- |
| id (optional) | string | An optional string uniquely identifying the action within the workflow. If not given, a random identifier will be generated. |
| type (required) | string | The type of the action. Must be include. |
| macro (required) | string | The ID of the macro to include. |
| inputs (optional) | array | An array of input parameters. |
| outputs (optional) | array | An array of include output parameters. |
| dependsOn (optional) | array | A list of IDs of actions that must finish before this action is ready to be executed. Note that Steep is able to identify dependencies between actions itself based on outputs and inputs, so this attribute is normally not needed. However, it may be useful if a preceding action does not have an output parameter or if the depending action does not have an input parameter. Include actions may depend on execute actions and for-each actions but also on other include actions and vice versa. |
Example
type: include
macro: delayed_docker_hello_world
inputs:
  - id: seconds
    value: 5
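
An include action can also declare outputs so that a later action can use the macro's return value. The sketch below assumes a hypothetical macro my_macro with an input parameter input_file and an output parameter result; none of these names come from an actual macro definition.

```yaml
actions:
  - type: include
    macro: my_macro              # hypothetical macro ID
    inputs:
      - id: input_file           # parameter ID as it would appear in the macro definition
        value: example1.txt
    outputs:
      - id: result               # output parameter of the hypothetical macro
        var: macro_result

  # Uses the macro's return value as an input
  - type: execute
    service: copy
    inputs:
      - id: input_file
        var: macro_result
    outputs:
      - id: output_file
        var: copied_result
```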

Parameters

This data model represents inputs and generic parameters of execute actions as well as inputs of include actions.

| Property | Type | Description |
| --- | --- | --- |
| id (required) | string | The ID of the parameter as defined in the service metadata or the macro definition |
| var (optional) | string | The ID of a variable that holds the value for this parameter (required if value is not given) |
| value (optional) | any | The parameter value (required if var is not given) |

Note: Exactly one of var and value must be given, never both.

Example
id: input
var: i
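
For comparison, the two valid forms of a parameter can appear side by side in an inputs array. This fragment reuses parameter IDs from earlier examples in this document and is meant as an illustration only.

```yaml
inputs:
  - id: input          # value comes from the variable i
    var: i
  - id: resolution     # value is given directly
    value: 10
```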

Output parameters

Output parameters of execute actions have additional properties compared to inputs.

| Property | Type | Description |
| --- | --- | --- |
| id (required) | string | The ID of the parameter as defined in the service metadata |
| var (required) | string | The ID of a variable to which Steep will assign the generated name of the output file. This variable can then be used, for example, as an input parameter of a subsequent action. |
| prefix (optional) | string | An optional string to prepend to the generated name of the output file. For example, if Steep generates the name "name123abc" and the prefix is "my/dir/", the output filename will be "my/dir/name123abc". Note that the prefix must end with a slash if you want to create a directory. The output filename will be relative to the configured temporary directory or output directory (depending on the store property). You may even specify an absolute path: if the generated name is "name456fgh" and the prefix is "/absolute/dir/", the output filename will be "/absolute/dir/name456fgh". |
| store (optional) | boolean | If this property is true, Steep will generate an output filename that is relative to the configured output directory instead of the temporary directory. The default value is false. |
Example
id: output
var: o
prefix: some_directory/
store: false
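
The interplay of prefix and store can be sketched with two output parameters of the same action. The parameter ID log and the variable IDs are hypothetical, and the comments only restate the rules from the table above.

```yaml
outputs:
  - id: output
    var: final_result
    prefix: results/     # trailing slash, so "results" is created as a directory
    store: true          # file name is relative to the configured output directory
  - id: log              # hypothetical second output parameter
    var: scratch_log     # no prefix, store defaults to false:
                         # file name is relative to the temporary directory
```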

Include output parameters

This data model describes output parameters of include actions.

| Property | Type | Description |
| --- | --- | --- |
| id (required) | string | The ID of the parameter as defined in the macro definition |
| var (required) | string | The ID of a variable to which Steep will assign the macro’s return value. This variable can then be used, for example, as an input parameter of a subsequent action. |
Example
id: output
var: o

Retry policy defaults

This data model defines default retry policies that are used within a workflow unless a more specific retry policy is defined elsewhere.

| Property | Type | Description |
| --- | --- | --- |
| processChains (optional) | object | An optional default retry policy that should be applied to every generated process chain |
Example
processChains:
  maxAttempts: 5
  delay: 1s
  exponentialBackoff: 2
  maxDelay: 10s
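
In a workflow, these defaults are placed under the top-level retries property. The following sketch shows the example above in context. The comments on the waiting times rest on an assumption about how the attributes interact (the delay is multiplied by exponentialBackoff after each failed attempt and capped at maxDelay, and maxAttempts counts the initial attempt); they are not taken from this document.

```yaml
api: 4.7.0
retries:
  processChains:
    maxAttempts: 5          # assumed to include the first attempt
    delay: 1s               # wait before the second attempt
    exponentialBackoff: 2   # assumed to double the wait each time: 1s, 2s, 4s, 8s
    maxDelay: 10s           # assumed upper bound for any single wait
actions:
  - type: execute
    service: copy
    inputs:
      - id: input_file
        value: example1.txt
    outputs:
      - id: output_file
        var: outputFile1
```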