Loops

In the workflows from the previous tutorials, data flowed in one direction from one action to the next until the final output was produced. However, sometimes it is necessary to feed the results of one action back into a preceding action, or even into the same action, until a certain condition is met.

In general-purpose programming languages, this concept can be modelled with while or for loops. Steep has a similar but much more powerful concept. In the previous tutorial, you learned about for-each actions, which apply a certain set of actions to each item in a list of inputs. This can be used to execute multiple actions in parallel. The list of inputs is also dynamic: new items can be appended during workflow execution, which makes Steep automatically generate more process chains. The loop finishes once there are no more items in the input list.

Although for-each actions can be used to model loops, as described in the previous tutorial, they are not loops per se! In fact, a for-each action just applies a set of other actions to each item in its input. This can be done in parallel and in any order. The fact that you can append more items to its input list does not change this behaviour. Process chains for new items might be scheduled in parallel to existing process chains or even earlier.
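As a rough mental model, a for-each action with a growing input list can be sketched as a work list that is processed until it is empty. The sketch below is an illustration only and not Steep’s scheduler: the names runForEach and countDownStep are made up for this example, and items are handled one after the other, whereas Steep may run them in parallel and in any order.

```javascript
// Conceptual model only, NOT how Steep is implemented.
// Processes a list of items and lets each call append new items to the list.
function runForEach(initialItems, action) {
  const pending = [...initialItems]
  const processed = []
  while (pending.length > 0) {
    // Steep may pick items in any order; this sketch simply takes the first one
    const item = pending.shift()
    processed.push(item)
    // The action returns the items to feed back (possibly none)
    pending.push(...action(item))
  }
  return processed
}

// Hypothetical action: decrease the number and feed it back while it is positive
const countDownStep = (n) => (n - 1 > 0 ? [n - 1] : [])

console.log(runForEach([3], countDownStep)) // [ 3, 2, 1 ]
```

The work list becomes empty as soon as an action stops feeding items back, which is the only thing that ends the processing.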

This tutorial teaches you how to use the yieldToInput keyword to append new items to a for-each action’s input list during workflow execution.

Step 1: Create a countdown service

To demonstrate how loops work in Steep, we create a workflow that counts a number down until it has reached 0. The workflow uses a for-each action to repeatedly call a service that reads a number from a file, decreases it, and then writes the new value to an output file. After each service call, the output is fed back into the for-each action’s input, which makes Steep call the service again and again.

To end the loop, we just need to make sure that no more items are appended to the for-each action’s input list. When the service reaches the value 0, it therefore does not produce a new output file.

To implement the service, we use Node.js. Create a new file countdown.js and paste the following code into it:

countdown.js

javascript

#!/usr/bin/env node
 
const fs = require("fs").promises
 
async function countDown(input, output) {
  // Always pass the radix so the file content is parsed as a decimal number
  let value = parseInt(await fs.readFile(input, "utf-8"), 10)
  console.log(`Old value: ${value}`)
 
  value--
  if (value > 0) {
    console.log(`New value: ${value}`)
    await fs.writeFile(output, "" + value, "utf-8")
  } else {
    console.log("No new value")
  }
}
 
countDown(process.argv[2], process.argv[3]).catch((err) => {
  console.error(err)
  process.exit(1)
})

Note that we could use any programming language to implement this service. We’re just using Node.js here because it makes it very easy to create executable scripts. Read the tutorial on bringing your own service for a more advanced example.
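Before registering the service with Steep, it can be useful to check by hand that it stops producing output files at the right moment. The following sketch feeds each output file back in as the next input, roughly imitating what the workflow will do later. The function name driveLocally, the temporary directory, and the step-N.txt file names are assumptions made for this example. Steep generates its own output paths.

```javascript
const fs = require("fs")
const os = require("os")
const path = require("path")
const { execFileSync } = require("child_process")

// Calls the given countdown script repeatedly, using each output file as the
// next input, until the script no longer writes an output file.
// Returns the number of times the script was called.
function driveLocally(script, startValue) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "countdown-"))
  let input = path.join(dir, "step-0.txt")
  fs.writeFileSync(input, String(startValue), "utf-8")

  let calls = 0
  while (true) {
    const output = path.join(dir, `step-${calls + 1}.txt`)
    // Run through the current Node.js binary, so the script does not
    // have to be executable yet
    execFileSync(process.execPath, [script, input, output], { stdio: "inherit" })
    calls++
    if (!fs.existsSync(output)) {
      break // no new file: nothing left to feed back
    }
    input = output
  }
  return calls
}

// Only run the demo if countdown.js is in the current directory
if (fs.existsSync("countdown.js")) {
  console.log(`Service calls: ${driveLocally("countdown.js", 3)}`)
}
```

With a start value of 3, the script should be called three times: two calls write a new file, and the third one does not.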

Step 2: Make the service executable

The countdown script starts with a shebang line, which tells your system’s program loader to execute it through Node.js.

You just need to make it executable as follows:

Terminal

shell

chmod +x countdown.js

Step 3: Add service metadata

Next, we need to define the metadata for our new service. Open the file conf/services/services.yaml and add the following code:

conf/services/services.yaml

yaml

- id: countdown
  name: Count Down
  description: Read a number, subtract 1, and write the result
  path: ./countdown.js  # change if necessary
  runtime: other
 
  parameters:
    - id: input
      name: Input file
      description: The input file containing the number to decrease
      type: input
      cardinality: 1..1
      dataType: file
 
    - id: output
      name: Output file
      description: The path to the output file
      type: output
      cardinality: 1..1
      dataType: fileOrEmptyList

Change the path attribute to the absolute location of the countdown.js file on your system.

Note that we use the data type fileOrEmptyList for the service’s output parameter. With this special data type, the output resolves either to the generated file or, if the file does not exist, to an empty list. Using the data type file would lead to an error during workflow execution, because the service does not write an output file in its final call.
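The effect of this data type on the loop can be pictured with a small sketch. Both functions below are hypothetical helpers written for this illustration and are not part of Steep. They only model the behaviour described above: an existing file adds one item to an input list, while a missing file adds nothing.

```javascript
const fs = require("fs")

// Hypothetical helper, not part of Steep: models an output of type
// fileOrEmptyList as either the file's path or an empty list.
function resolveFileOrEmptyList(outputPath) {
  return fs.existsSync(outputPath) ? outputPath : []
}

// Appending the resolved value to an input list: a path adds one item,
// an empty list adds nothing.
function appendToInput(inputList, resolved) {
  return inputList.concat(resolved)
}

console.log(appendToInput(["input.txt"], "out-1.txt")) // [ 'input.txt', 'out-1.txt' ]
console.log(appendToInput(["input.txt"], []))          // [ 'input.txt' ]
```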

Don’t forget to restart Steep for the changes to take effect.

Step 4: Create a workflow with a loop

Create a new file loop-workflow.yaml and add the following code:

loop-workflow.yaml

yaml

api: 4.7.0
vars:
  - id: input_file
    value: input.txt
 
actions:
  - type: for
    input: input_file
    enumerator: i
    yieldToInput: output_file   # feed into for-each action's input
    actions:
      - type: execute
        service: countdown
        inputs:
          - id: input
            var: i
        outputs:
          - id: output
            var: output_file

In the first iteration of the for-each action, the service reads from the file input.txt and writes to an output file whose name is generated at runtime. The path of this output file is fed back into the for-each action via yieldToInput. In the second iteration, the service reads from that output file and produces another one. This process continues until the decreased number reaches 0, in which case the service does not write an output file, no new item is appended, and the workflow finishes.

Step 5: Submit the workflow

Create a new file input.txt with the number 10 in it:

Terminal

shell

echo 10 > input.txt

Edit the file loop-workflow.yaml that you’ve created in the previous step and replace the relative path to input.txt with the absolute path to this file on your computer.

Then, submit the workflow:

Terminal

shell

curl -X POST http://localhost:8080/workflows --data-binary @loop-workflow.yaml

Visit the workflow page in Steep’s web UI (http://localhost:8080/workflows) and click on the ID of your submitted workflow. After the workflow has finished, it should display ten completed process chains, one for each iteration.
