Data models

Timeouts and retries

This page describes data models that control how long a workflow execution may take (timeout policies) and what should happen if it runs into an error (retry policies).

Time-based values in Steep’s data models are specified as human-readable durations.

Timeout policies

A timeout policy defines how long a service or an executable may run before it is automatically cancelled or aborted with an error. Timeout policies can be specified with the maxInactivity, maxRuntime and deadline attributes, either per service in the service metadata or per executable action in the workflow.

A timeout policy is either a string or an object. If it is a string, it represents a duration specifying a maximum amount of time until the execution is cancelled.

If specified as an object, the timeout policy has the following properties:

Property | Type | Description
--- | --- | ---
timeout | duration | The maximum amount of time that may pass until the execution is cancelled or aborted.
errorOnTimeout (optional) | boolean | true if an execution that is aborted due to a timeout should lead to an error (i.e. if the process chain’s status should be set to ERROR). false if it should just be cancelled (process chain status CANCELLED). By default, the execution will be cancelled.

Multiple timeout policies can be combined. For example, a service may be cancelled after 5 minutes of inactivity and aborted with an error if its total execution takes longer than 1 hour.

Example
maxInactivity: 5m
deadline:
  timeout: 1h
  errorOnTimeout: true
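
Because a timeout policy can be either a string or an object, a consumer of these models has to handle both forms. The following Python sketch shows one way to bring them into a single representation. The names TimeoutPolicy and normalize_timeout_policy are hypothetical and not part of Steep; the sketch only assumes the properties and defaults described above.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class TimeoutPolicy:
    # Hypothetical container, not a Steep class.
    timeout: str                    # duration string, e.g. "5m"
    error_on_timeout: bool = False  # default: cancel instead of raising an error


def normalize_timeout_policy(value: Union[str, dict]) -> TimeoutPolicy:
    """Turn the string form or the object form into one uniform representation."""
    if isinstance(value, str):
        # String form: only a duration; the execution is cancelled on timeout.
        return TimeoutPolicy(timeout=value)
    if isinstance(value, dict):
        if "timeout" not in value:
            raise ValueError("object form requires a 'timeout' property")
        return TimeoutPolicy(
            timeout=str(value["timeout"]),
            error_on_timeout=bool(value.get("errorOnTimeout", False)),
        )
    raise TypeError(f"unsupported timeout policy: {value!r}")


# The combined example, as it might look after the YAML has been parsed.
policies = {
    "maxInactivity": "5m",
    "deadline": {"timeout": "1h", "errorOnTimeout": True},
}
normalized = {name: normalize_timeout_policy(p) for name, p in policies.items()}
```

Here the maxInactivity entry ends up with error_on_timeout set to False (cancel), while the deadline entry keeps the explicit true (abort with an error).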

Retry policies

A retry policy specifies how often the execution of a workflow action should be retried in case of an error. Retry policies can be specified per service in the service metadata or per executable action in the workflow.

Property | Type | Description
--- | --- | ---
maxAttempts (optional) | number | The maximum number of attempts to perform. This includes the initial attempt. For example, a value of 3 means 1 initial attempt and 2 retries. The default value is 1. A value of -1 means an unlimited (infinite) number of attempts. 0 means there will be no attempt at all (the service or action will be skipped).
delay (optional) | duration | The amount of time that should pass between two attempts. The default is 0, which means the operation will be retried immediately.
exponentialBackoff (optional) | number | A factor for an exponential backoff (see description below).
maxDelay (optional) | duration | The maximum amount of time that should pass between two attempts. Only applies if exponentialBackoff is larger than 1. By default, there is no upper limit.

Exponential backoff:

The exponential backoff factor can be used to gradually increase the delay. The actual delay between two attempts will be calculated as follows, where nAttempt is the number of the attempt that has just failed (starting at 1):

actualDelay = min(delay * pow(exponentialBackoff, nAttempt - 1), maxDelay)

For example, if delay equals 1s, exponentialBackoff equals 2, and maxDelay equals 10s, the following actual delays will apply:

  • Delay after attempt 1:

    min(1s * pow(2, 0), 10s) = 1s

  • Delay after attempt 2:

    min(1s * pow(2, 1), 10s) = 2s

  • Delay after attempt 3:

    min(1s * pow(2, 2), 10s) = 4s

  • Delay after attempt 4:

    min(1s * pow(2, 3), 10s) = 8s

  • Delay after attempt 5:

    min(1s * pow(2, 4), 10s) = 10s

  • Delay after attempt 6:

    min(1s * pow(2, 5), 10s) = 10s

The default value of exponentialBackoff is 1, which means there is no backoff and the actual delay always equals the specified delay.
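
The formula can be checked with a few lines of Python. This is a minimal sketch, not Steep's implementation: the function name actual_delay and the use of milliseconds are assumptions made for illustration. It applies maxDelay only when the factor is larger than 1, as stated in the property description.

```python
import math


def actual_delay(delay_ms: int, n_attempt: int,
                 exponential_backoff: float = 1,
                 max_delay_ms: float = math.inf) -> float:
    """Delay in milliseconds to wait after attempt number n_attempt (1-based)."""
    delay = delay_ms * exponential_backoff ** (n_attempt - 1)
    # maxDelay only applies if the backoff factor is larger than 1.
    if exponential_backoff > 1:
        delay = min(delay, max_delay_ms)
    return delay


# delay = 1s, exponentialBackoff = 2, maxDelay = 10s, attempts 1 to 6
delays = [actual_delay(1000, n, exponential_backoff=2, max_delay_ms=10_000)
          for n in range(1, 7)]
print(delays)  # [1000, 2000, 4000, 8000, 10000, 10000]
```

From attempt 5 onwards the uncapped value (16s, then 32s) exceeds maxDelay, so the delay stays at 10s.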

Example
maxAttempts: 5
delay: 1s
exponentialBackoff: 2
maxDelay: 10s
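
To illustrate how the four properties interact, the following Python sketch runs an action under a retry policy. It is an illustration under assumptions, not Steep's scheduler: run_with_retries, its parameters, and the injectable sleep function are hypothetical names, and delays are given in seconds for brevity.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_retries(action: Callable[[], T], max_attempts: int = 1,
                     delay_s: float = 0, exponential_backoff: float = 1,
                     max_delay_s: float = float("inf"),
                     sleep: Callable[[float], None] = time.sleep) -> Optional[T]:
    """Call action until it succeeds or the allowed attempts are used up."""
    if max_attempts == 0:
        return None  # no attempt at all: the action is skipped
    n_attempt = 1
    while True:
        try:
            return action()
        except Exception:
            # -1 means unlimited attempts; otherwise give up after max_attempts.
            if max_attempts != -1 and n_attempt >= max_attempts:
                raise
            wait = delay_s * exponential_backoff ** (n_attempt - 1)
            if exponential_backoff > 1:
                wait = min(wait, max_delay_s)
            sleep(wait)
            n_attempt += 1


# Usage with the policy from the example: an action that fails four times.
calls = {"count": 0}


def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 5:
        raise RuntimeError("transient failure")
    return "ok"


waits = []  # record the delays instead of actually sleeping
result = run_with_retries(flaky, max_attempts=5, delay_s=1,
                          exponential_backoff=2, max_delay_s=10,
                          sleep=waits.append)
print(result, waits)  # ok [1, 2, 4, 8]
```

With maxAttempts set to 5 there are at most four waits, so maxDelay is never reached in this particular configuration; it would only take effect with more attempts or a larger factor.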

Durations

A duration consists of one or more number/unit pairs possibly separated by whitespace characters. Supported units are:

  • milliseconds, millisecond, millis, milli, ms
  • seconds, second, secs, sec, s
  • minutes, minute, mins, min, m
  • hours, hour, hrs, hr, h
  • days, day, d

Numbers must be positive integers. The default unit is milliseconds.

Examples:
1000ms
3 secs
5m
20mins
10h 30 minutes
1 hour 10minutes 5s
1d 5h
10 days 1hrs 30m 15 secs
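
The rules above can be sketched as a small parser that converts a duration into milliseconds. This is a hypothetical helper (parse_duration_ms is not a Steep function), and it assumes that unit names are case-sensitive and that a number without a unit counts as milliseconds.

```python
import re

# Milliseconds per unit, keyed by every spelling listed above.
_UNIT_MS = {}
for names, factor in [
    ("milliseconds millisecond millis milli ms", 1),
    ("seconds second secs sec s", 1_000),
    ("minutes minute mins min m", 60_000),
    ("hours hour hrs hr h", 3_600_000),
    ("days day d", 86_400_000),
]:
    for name in names.split():
        _UNIT_MS[name] = factor

# One number/unit pair; whitespace around the number is optional.
_PAIR = re.compile(r"\s*(\d+)\s*([A-Za-z]*)")


def parse_duration_ms(text: str) -> int:
    """Parse a duration such as '1 hour 10minutes 5s' into milliseconds."""
    text = text.strip()
    if not text:
        raise ValueError("empty duration")
    pos, total = 0, 0
    while pos < len(text):
        match = _PAIR.match(text, pos)
        if match is None:
            raise ValueError(f"invalid duration: {text!r}")
        number, unit = match.groups()
        if unit and unit not in _UNIT_MS:
            raise ValueError(f"unknown unit: {unit!r}")
        # A missing unit falls back to milliseconds (factor 1).
        total += int(number) * _UNIT_MS.get(unit, 1)
        pos = match.end()
    return total


print(parse_duration_ms("10h 30 minutes"))       # 37800000
print(parse_duration_ms("1 hour 10minutes 5s"))  # 4205000
```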