Executing Commands

Executing commands using the OS shell is a common way to generate data for a pipe (as input-exec), to process data inside a pipe (as action-exec), and finally to send data to a desired destination (as output-exec).

On Unix-like systems /bin/sh is used; when Windows support is available, cmd.exe will be used. An option to execute commands directly, without invoking the shell, is planned.

The current working directory of a pipe is the directory containing the pipe. As a rule, do not expect any files written to this directory to survive a pipe update: use absolute file paths. This directory is also in PATH.

Input Exec

The command is executed by the default shell, and by default all line feeds are removed. That is, you may lay out a complicated command like this without needing backslashes:

input:
  exec:
    command: |
      complex-command
        --first-flag 1
        --second-flag 2

You can disable this removal with no-strip-linefeeds: true.

By default, each line of the output is considered a fresh event, and is 'quoted':

input:
  exec:
    command: echo ay; echo bee
# {"_raw":"ay"}
# {"_raw":"bee"}

If the output is already JSON, or needs further mangling with action-raw, then raw: true will switch off this quoting.

input:
  exec:
    no-strip-linefeeds: true # (or use semicolons)
    command: |
      echo '{"msg":"hello"}'
      echo '{"msg":"dolly"}'
    raw: true
# {"msg":"hello"}
# {"msg":"dolly"}

By default, the command is run once. This is appropriate for commands like ping, which run continuously and produce output at their own specified interval.
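
For example, a minimal sketch of a run-once command that streams output indefinitely (the target address and interval are purely illustrative):

input:
  exec:
    command: ping -i 5 8.8.8.8
# each line of ping's output arrives as its own {"_raw": ...} event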

Scheduling Commands

Other commands can be scheduled to run at regular intervals - the easiest way is to specify a value for interval.

With scheduled commands, you can choose to process all of the output from each invocation as a single event with ignore-line-breaks: true:

input:
  exec:
    command:  echo ay ; echo bee
    ignore-line-breaks: true
    interval: 2s
# {"_raw":"ay\nbee"}

You can specify that such a command executes precisely once with count: 1. Currently this is needed since ignore-line-breaks only works with scheduled inputs.
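
A minimal sketch, reusing the earlier command:

input:
  exec:
    command: echo ay ; echo bee
    ignore-line-breaks: true
    interval: 1s
    count: 1
# runs exactly once, producing a single event
# {"_raw":"ay\nbee"}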

We need everything as one blob of text when an event corresponds to multiple lines of output. For example:

pipes$ openssl s_client -connect google.com:443 < /dev/null 2>/dev/null | openssl x509 -fingerprint -dates
SHA1 Fingerprint=95:3A:FF:D9:19:64:D9:09:40:8D:EE:DA:40:48:0E:FF:5E:DA:52:8C
notBefore=Sep  3 06:36:33 2020 GMT
notAfter=Nov 26 06:36:33 2020 GMT
-----BEGIN CERTIFICATE-----
MIIJcDCCCFigAwIBAgIRAM6z8MoewgyIAgAAAAB6SxEwDQYJKoZIhvcNAQELBQAw
QjELMAkGA1UEBhMCVVMxHjAcBgNVBAoTFUdvb2dsZSBUcnVzdCBTZXJ2aWNlczET
MBEGA1UEAxMKR1RTIENBIDFPMTAeFw0yMDA5MDMwNjM2MzNaFw0yMDExMjYwNjM2
MzNaMGYxCzAJBgNVBAYTAlVTMRMwEQYDVQQIEwpDYWxpZm9ybmlhMRYwFAYDVQQH
Ew1Nb3VudGFpbiBWaWV3MRMwEQYDVQQKEwpHb29nbGUgTExDMRUwEwYDVQQDDAwq
...
cn4CwbVgtP2Hjrqsq2r9a/rY54APyENt56JswP7XSeFGNF3OeCudKNhAybeZNZ7g
QvwtGTOj3hzC7Qv2oEjM3oLgepk9FOkAcMWnt2afC2ICnB/EcFP7l72T0yo+UPnq
4FLcf4CTrDJng6vGSCzVkkqSjj0=
-----END CERTIFICATE-----

This is a job for expand-key-value, except the delimiter is a line feed. And we need to strip out the certificate:

name: ssl2
input:
  exec:
    command: openssl s_client -connect google.com:443 < /dev/null 2>/dev/null | openssl x509 -fingerprint -dates
    ignore-line-breaks: true
    interval: 1s
    count: 1
actions:
  # pull out everything up to the first '-'
  - raw:
      extract:
        input-field: _raw
        pattern: '^([^\-]+)'
  # and expand key values using line-feed as a delimiter
  - expand:
      input-field: _raw
      remove: true
      delim: '\n'
      key-value:
        key-value-delim: '='
output:
  write: console
# {"SHA1 Fingerprint":"95:3A:FF:D9:19:64:D9:09:40:8D:EE:DA:40:48:0E:FF:5E:DA:52:8C","notBefore":"Sep  3 06:36:33 2020 GMT","notAfter":"Nov 26 06:36:33 2020 GMT"}

Handling Errors

So far, we have only handled the happy case, where the command executes successfully. The result field allows standard output, standard error and the exit status to be optionally captured as custom fields:

input:
  exec:
    command: echo hello && foo
    result:
      stdout-field: out
      stderr-field: err
      status-field: status
# {"err":"sh: 1: foo: not found\n","out":"hello\n","status":127}

(So the default is just stdout-field: _raw)

A command may not execute correctly every time, but often with networking issues 'try again later' is a sound strategy.

input:
  exec:
    command: echo 'hi there' | nc -N -v 127.0.0.1 3030
    retry:
      forever: true
      pause: 2s  

This will try forever to write to local TCP port 3030, waiting 2s before each retry:

$ hotrod pipes run -f exec1.yml 
nc: connect to 127.0.0.1 port 3030 (tcp) failed: Connection refused
[ERROR] exec: failed Exited(1) input-exec step 0
LINE: 
nc: connect to 127.0.0.1 port 3030 (tcp) failed: Connection refused
[ERROR] exec: failed Exited(1) input-exec step 0
LINE: 
Connection to 127.0.0.1 3030 port [tcp/*] succeeded!
{"_raw":""}

Especially for a scheduled command, you would not want to try forever, so count: 3 instead of forever: true will give "Three Strikes and you're Out" behaviour.
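
For example, a sketch of the earlier nc input with bounded retries (the interval is illustrative):

input:
  exec:
    command: echo 'hi there' | nc -N -v 127.0.0.1 3030
    interval: 30s
    retry:
      count: 3
      pause: 2s
# give up after three failed attempts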

Action Exec

Invoking a command in a series of actions can be useful. action-exec allows you to simply run a command for its side-effects, execute commands conditionally and merge the output of another command into an event.

By default, data passes through action-exec unmodified:

# {"msg":"hello"}
actions:
- exec:
    command: echo ${msg} > temp.txt

As with any other action, field expansions can be used.

$ hotrod pipes run -f exec2.yml 
{"msg":"hello"}
scratch$ cat temp.txt
hello

In addition, the event itself is passed as the command's standard input:

actions:
- exec:
    command: cat > temp.txt

With the previous input, temp.txt contains {"msg":"hello"} as expected!

By specifying an input field, we can pass exactly what we want to a command's input:

actions:
- exec:
    input-field: msg
    command: cat > temp.txt

Afterwards, temp.txt contains "hello".

NOTE Writing to a pipe's own directory (or to the directory of any other pipe) is not recommended. Pipe directories are managed, and there is no guarantee that such files will survive a pipe restart or update.

action-exec does not complain about a missing input-field by default, resulting in a useful feature: conditional execution of commands.

# {"msg":"hello"}
# {"greeting":"bye"}
actions:
- exec:
    input-field: msg
    command: nc -N -v 127.0.0.1 3030
# {"msg":"hello"}
# {"greeting":"bye"}    

If we have a TCP server listening on that port, it will receive "hello" as expected when msg is an existing field. In fact, input-field does not even have to be a string - if it is not, we just pass the whole event.

The output of a command can be merged into the current event using result, which works as in input-exec:

# {"msg":"hello"}
actions:
- exec:
    command: whoami
    result:
      stdout-field: who
# {"msg":"hello","who":"steve"}      

A silly example, but a powerful way to combine various views of a system into a single event.

For instance, here is a little pipe we used to monitor Linux load average and CPU usage extracted from mpstat:

name: mpstat
input:
  exec:
    command: mpstat | tail -n1
    interval: 5s
actions:
- exec:
    command: uptime
    result:
      stdout-field: uptime
# extract and convert mpstat fields
- expand:
    input-field: _raw
    remove: true
    delim: ' '
    csv:
      fields:
      - time: str
      - CPU: str
      - usr: num
      - nice: num
      - sys: num
      - iowait: num
      - irq: num
      - soft: num
      - steal: num
      - guest: num
      - gnice: num
      - idle: num
# likewise for uptime
- extract:
    input-field: uptime
    remove: true
    pattern: 'load average: ([^,]+),'
    output-fields: [avg1]
- convert:
  - avg1: num
output:
  write: console

Output Exec

Generally, you should use as specific an output as possible, but it's always possible to use output-exec.

Like action-exec, you may use field expansions ("dollar-curly") and pass all or part of an event as input to a command.

However, there are restrictions. By default, the command is started, and thereafter, all events are written to its input.

output:
  exec:
    command: cat

In this case, it would be better just to say write: console, but this illustrates that the command is meant to be long-lived and act like the end of an OS pipeline.
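
A slightly more realistic sketch streams every event to a local TCP service, reusing the nc example from earlier (the address is illustrative):

output:
  exec:
    command: nc -N 127.0.0.1 3030
# nc stays running, and every event is written to its standard input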

You cannot use field expansions in this situation, since the data is passed through directly. In fact, if you use field expansions then the command is implicitly run for each event.

output:
  exec:
    command: echo ${msg} >> /var/log/mypipe.log

You can explicitly force the command to be run once per event with streaming: false. You cannot force streaming if there are any field expansions.
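
For example, a sketch that restarts the command for each event even though there are no field expansions (the log path is illustrative):

output:
  exec:
    streaming: false
    command: cat >> /var/log/mypipe.log
# the command runs once per event, with that event as its input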

Also like action-exec, input-field can be provided. Prior to 2.4, this implied streaming: false, but you can force streaming in this case with streaming: true.
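
A sketch, assuming each event carries a msg field:

output:
  exec:
    input-field: msg
    streaming: true
    command: nc -N 127.0.0.1 3030
# only the msg field of each event is streamed to the long-lived command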

As with input-exec, retry allows you to modify the default retry behaviour, which is 3 times with a pause of 300ms.
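
For instance, a hedged sketch that retries more patiently than the default:

output:
  exec:
    command: nc -N 127.0.0.1 3030
    retry:
      count: 5
      pause: 1s
# retry up to 5 times with a 1s pause, instead of 3 times with 300ms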