Scheduling Inputs

Many inputs, such as exec and http-poll, are scheduled by time. (Others, such as tcp, wait for external data to arrive.)

Scheduling with Intervals

With interval, the command runs immediately on startup and then again at equal intervals. If we start at 12:30:15 with a one-minute interval, the next run is at 12:31:15, and so on.
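The timing described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the scheduler's actual code; the function name interval_runs is made up for the example.

```python
from datetime import datetime, timedelta
from itertools import islice

def interval_runs(first_run, interval):
    """Yield run times: the first one immediately, then one per interval."""
    t = first_run
    while True:
        yield t
        t += interval

# Start at 12:30:15 with a one-minute interval, as in the text.
start = datetime(2019, 6, 6, 12, 30, 15)
runs = [t.strftime("%H:%M:%S")
        for t in islice(interval_runs(start, timedelta(minutes=1)), 3)]
print(runs)  # ['12:30:15', '12:31:15', '12:32:15']
```

The run times drift with the start time: a pipe started at 12:30:15 never fires on a whole minute.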

If you need polls to happen at precise clock times, use cron scheduling instead.

name: cron
input:
    exec:
        command: date
        cron: '0  *  *  *  *  *  *'
output:
    write: console
# output:
{"_raw":"Thu Jun  6 12:55:00 SAST 2019"}
{"_raw":"Thu Jun  6 12:56:00 SAST 2019"}
{"_raw":"Thu Jun  6 12:57:00 SAST 2019"}

The command runs at the start of every minute.
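To contrast this with interval scheduling, here is a small Python sketch of how the first run time falls on a clock boundary regardless of when the pipe was started. It handles only the "every minute at second 0" case and is not a cron parser; next_minute_boundary is a hypothetical helper.

```python
from datetime import datetime, timedelta

def next_minute_boundary(now):
    """Return the first whole minute strictly after `now`."""
    return now.replace(second=0, microsecond=0) + timedelta(minutes=1)

# A pipe started at 12:54:37 first fires at 12:55:00.
started = datetime(2019, 6, 6, 12, 54, 37)
first = next_minute_boundary(started)
print(first.strftime("%H:%M:%S"))  # 12:55:00
```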

See this guide for more. Note the special pre-defined values like '@hourly', '@daily', etc.

Because cron commands run at precise clock times, it is often necessary to offset them by a random interval so that, for instance, a thousand targets do not overwhelm an endpoint by polling it at the same moment.

input:
    exec:
        command: date
        interval: 1m  # on every minute (16:00:00, 16:01:00, etc)
        random-offset: 5s  # so we'll actually run at 16:00:02, and so forth
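The effect of random-offset can be pictured with the following Python sketch. It assumes the delay is drawn uniformly between zero and the configured value and is only ever added (never subtracted); the source does not state the distribution, so treat both points as assumptions. The name jittered is hypothetical.

```python
import random
from datetime import datetime, timedelta

def jittered(scheduled, max_offset, rng=random):
    """Delay a scheduled time by a random amount in [0, max_offset].

    Assumption: uniform distribution, delay only (never earlier).
    """
    delay = rng.uniform(0, max_offset.total_seconds())
    return scheduled + timedelta(seconds=delay)

scheduled = datetime(2019, 6, 6, 16, 0, 0)
actual = jittered(scheduled, timedelta(seconds=5), random.Random(42))
print(actual.time())
```

With many targets each picking its own delay, the polls spread over the five-second span instead of all landing on 16:00:00.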

Windowing

Windowing is useful when you query a resource that requires a time window to be specified.

input:
    exec:
        command: 'echo ${start_time_secs} ${end_time_secs}'
        cron: '0  *  *  *  *  *  *'
        window:
            size: 1m

The end time is now, expressed as a Unix timestamp, and the start time is now minus 60s.

window also has an offset, for when you would like to wait a little for the data to settle (for instance, for Elasticsearch to ingest some real-time data). It shifts the (start_time, end_time) interval back by the offset, which is given in the same units as interval and size.
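The window arithmetic is simple enough to sketch in Python. This is an illustration of the description above, working in epoch seconds; the function name window and the sample timestamp are chosen for the example only.

```python
def window(now_secs, size_secs, offset_secs=0):
    """Return (start, end) in epoch seconds for a window ending at now - offset."""
    end = now_secs - offset_secs
    return end - size_secs, end

now = 1559818500  # an arbitrary Unix timestamp

# size: 1m, no offset: the window ends now and starts 60s earlier.
print(window(now, 60))       # (1559818440, 1559818500)

# size: 1m, offset: 10s: the whole window moves back by 10s.
print(window(now, 60, 10))   # (1559818430, 1559818490)
```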

If you provide interval, then window.size is automatically set.

In addition to windowing intervals, cron supports a start-time, which lets the windowing start at a specified time. This time has to be in the following format: 2019-07-10 18:45:00.000 +02:00. A highwatermark-file can be specified so that cron tracks the last successful window and resumes from the high water mark.

So, if a pipe using a highwatermark-file is stopped and restarted, the scheduler will "catch up" and run for any intervals that were missed.
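As a rough picture of the catch-up behaviour, the Python sketch below parses a start time in the documented format and lists the windows that would still be pending. How the high water mark is stored on disk is not described here, so the sketch simply takes a resume point as a datetime; missed_windows and START_FORMAT are hypothetical names.

```python
from datetime import datetime, timedelta

# Matches the documented format, e.g. 2019-07-10 18:45:00.000 +02:00
START_FORMAT = "%Y-%m-%d %H:%M:%S.%f %z"

def missed_windows(resume_from, now, size):
    """List every whole (start, end) window between resume_from and now."""
    windows = []
    start = resume_from
    while start + size <= now:
        windows.append((start, start + size))
        start += size
    return windows

start_time = datetime.strptime("2019-07-10 18:45:00.000 +02:00", START_FORMAT)
now = start_time + timedelta(minutes=3, seconds=20)

pending = missed_windows(start_time, now, timedelta(minutes=1))
print(len(pending))  # 3
```

After a restart, the resume point would be the stored high water mark rather than the start time, and the same loop yields the windows that were skipped while the pipe was down.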

Note: start-time will be ignored unless combined with highwatermark-file. start-time causes offset to be ignored.

Special Variables

Special variables are available, including support for custom time formats. For example, ${start_time_fmt %F %T} gives a local date-time like "2019-09-28 15:36:02".

  • now_time_secs => current time in seconds since epoch,
  • now_time_msecs => current time in milliseconds since epoch,
  • now_time_iso => current time in "%Y-%m-%dT%H:%M:%SZ" format,
  • start_time_secs => start time in seconds since epoch,
  • start_time_msecs => start time in milliseconds since epoch,
  • start_time_iso => start time in "%Y-%m-%dT%H:%M:%SZ" format,
  • end_time_secs => end time in seconds since epoch,
  • end_time_msecs => end time in milliseconds since epoch,
  • end_time_iso => end time in "%Y-%m-%dT%H:%M:%SZ" format,
  • now_time_fmt <fmt> => e.g. ${now_time_fmt %F %T}
  • start_time_fmt <fmt> => e.g. ${start_time_fmt %Y-%m-%dT%H:%M:%SZ} (same as _iso above)
  • end_time_fmt <fmt> => e.g. ${end_time_fmt %s} (same as _secs above)
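To make the relationship between the _secs, _msecs and _iso variants concrete, here is a Python sketch that derives all three from one moment. It assumes the _iso form is rendered in UTC, which the trailing Z suggests but the list above does not state; time_vars is a hypothetical helper, not part of the tool.

```python
from datetime import datetime, timezone

ISO = "%Y-%m-%dT%H:%M:%SZ"

def time_vars(prefix, moment):
    """Build the _secs, _msecs and _iso values for one timezone-aware moment."""
    millis = int(moment.timestamp() * 1000)
    return {
        f"{prefix}_time_secs": millis // 1000,
        f"{prefix}_time_msecs": millis,
        # Assumption: the ISO form is expressed in UTC.
        f"{prefix}_time_iso": moment.astimezone(timezone.utc).strftime(ISO),
    }

end = datetime(2019, 9, 28, 13, 36, 2, tzinfo=timezone.utc)
values = time_vars("end", end)
print(values["end_time_iso"])  # 2019-09-28T13:36:02Z
```

The _fmt variables accept strftime-style patterns like the ISO string above, so ${end_time_fmt %Y-%m-%dT%H:%M:%SZ} and ${end_time_iso} describe the same layout.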