Detecting CPU Changes

The following section assumes that you already have Hotrod up and running (possibly deployed on some Bboxes). For more information on getting set up, see our Getting Started Guide.

It is very straightforward to extract columnar output from `ps`:

name: ps
input:
    exec:
        command: ps ax --no-header -o pid,cputime,rss,comm
        interval: 2s
actions:
- extract:
    input-field: _raw
    remove: true
    pattern: '(\S+)\s+(\S+)\s+(\S+)\s+(.+)'
    output-fields: [PID,TIME,RSS,CMD]
- filter:
    exclude:
    - RSS: '0'
output:
    write: console

We pull out the columns with the extract pattern, and use filter exclude to drop kernel workers and any other processes that report an RSS of 0.
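
To make the extract and filter steps concrete, here is a minimal Python sketch of the same logic outside Hotrod. It reuses the pattern and field names from the pipe; the function name `parse_ps_line` and the two sample lines are made up for illustration, and the sketch does not claim to reproduce how Hotrod applies the pattern internally.

```python
import re

# Same pattern as the extract step: three whitespace-separated columns,
# then everything else as the command name.
PS_LINE = re.compile(r'(\S+)\s+(\S+)\s+(\S+)\s+(.+)')
FIELDS = ["PID", "TIME", "RSS", "CMD"]


def parse_ps_line(line):
    """Return a dict of PID, TIME, RSS and CMD, or None if the line is dropped."""
    m = PS_LINE.search(line)
    if m is None:
        return None
    event = dict(zip(FIELDS, m.groups()))
    # Mirror the filter step: exclude events whose RSS is the string '0'.
    if event["RSS"] == "0":
        return None
    return event


# Hypothetical sample lines, not taken from a real system.
user_proc = parse_ps_line("   1234 00:01:07 20480 firefox")
kernel_thread = parse_ps_line("      2 00:00:00     0 kthreadd")
```

At this stage every field is still a string, which is why the exclude rule compares RSS against '0' rather than the number 0.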

The command runs every 2 seconds, which produces a lot of data. The rest of this section shows how to respond only to changes in values.

The first issue is that cputime has the form (DAY-)HOUR:MIN:SEC, so we extract these parts. The 'day' part may be empty!

- extract:
    input-field: TIME
    pattern: '^(?:(\d+)-)?(\d+):(\d+):(\d+)'
    output-fields: [day,hour,min,sec]
- add:
    output-fields:
    - day: 0
- convert:
  - RSS: num
  - day: num
  - hour: num
  - min: num
  - sec: num

If the 'day' part is empty, the extract step does not produce a day field. In this case the add step provides the default value 0. Because add never overwrites existing fields by default, a day that was captured is left untouched.
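
The optional day part can be illustrated with a short Python sketch. This is an illustration of the idea rather than Hotrod's implementation: the function name `split_cputime` is hypothetical, and the sketch puts a non-capturing group around the hyphen so that the day capture holds digits only.

```python
import re

# Optional "DAY-" prefix, then HOUR:MIN:SEC.
CPUTIME = re.compile(r'^(?:(\d+)-)?(\d+):(\d+):(\d+)')


def split_cputime(text):
    """Split a ps cputime string into numeric day, hour, min and sec fields."""
    m = CPUTIME.match(text)
    if m is None:
        raise ValueError(f"not a cputime value: {text!r}")
    day, hour, minute, sec = m.groups()
    return {
        # The day group is None when there is no "DAY-" prefix,
        # so fall back to 0, as the add step does in the pipe.
        "day": int(day) if day is not None else 0,
        "hour": int(hour),
        "min": int(minute),
        "sec": int(sec),
    }


short_running = split_cputime("00:01:07")
long_running = split_cputime("2-03:04:05")
```

Here `short_running` has a day of 0 because nothing was captured, while `long_running` has a day of 2.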

From these numbers, we can work out the total number of CPU seconds used by a process. We also convert RSS, which ps reports in KiB, to MiB. (Like add, script does not overwrite existing fields by default, so we set overwrite: true.)

- script:
    overwrite: true
    let:
    - sec: sec + 60*(min + 60*(hour + 24*day))
    - RSS: round(RSS/1024)  # in MiB
- remove: [hour,min,day,TIME]
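
As a quick check of the arithmetic in the script step, the same two expressions can be written as plain Python functions. The function names are hypothetical, and the RSS conversion assumes that ps reports rss in KiB.

```python
def total_cpu_seconds(day, hour, minute, sec):
    """Collapse day/hour/min/sec into a single count of CPU seconds."""
    return sec + 60 * (minute + 60 * (hour + 24 * day))


def rss_mib(rss_kib):
    """Convert a resident set size from KiB to whole MiB."""
    return round(rss_kib / 1024)


# 1 day, 2 hours, 3 minutes, 4 seconds:
# 86400 + 7200 + 180 + 4 = 93784 seconds.
seconds = total_cpu_seconds(1, 2, 3, 4)

# 20480 KiB is exactly 20 MiB.
memory = rss_mib(20480)
```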

Now for the interesting part: we don't want to see a process unless it has used at least one more second of CPU time. stream delta looks for changes in the field it is "watching", here sec. With only-changes: true, the elapsed field will contain the time in milliseconds since the value last actually changed.

- stream:
    operation: delta
    group-by: PID
    elapsed-field: 'elapsed'
    only-changes: true
    watch: sec
- filter:
    condition: delta > 0

Once the initial data passes, you will only get events when a process has used at least one more second of CPU time.
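
The idea behind watching a field per group can be sketched in Python as follows. This is only a rough model under stated assumptions, not Hotrod's implementation: the class `DeltaTracker` is hypothetical, and in particular the choice to treat the first sighting of a PID as a silent baseline is an assumption of this sketch.

```python
class DeltaTracker:
    """Per-key change detector, loosely modelled on the stream delta step."""

    def __init__(self):
        # key -> (last value seen, timestamp in ms of the last change)
        self._last = {}

    def update(self, key, value, now_ms):
        """Return (delta, elapsed_ms) if the value changed for key, else None."""
        previous = self._last.get(key)
        if previous is None:
            # Assumption: the first sighting only establishes a baseline.
            self._last[key] = (value, now_ms)
            return None
        last_value, changed_at = previous
        if value == last_value:
            # Unchanged: keep the old change timestamp so that elapsed
            # keeps growing until the next real change.
            return None
        self._last[key] = (value, now_ms)
        return value - last_value, now_ms - changed_at


tracker = DeltaTracker()
results = [
    tracker.update("1234", 67, 0),     # baseline for PID 1234
    tracker.update("1234", 67, 2000),  # same value, nothing to report
    tracker.update("1234", 68, 4000),  # one more CPU second after 4000 ms
]
```

Only the third call reports anything: a delta of 1 second with 4000 ms elapsed since the value last changed.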

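The effective CPU utilization for that process can now be calculated from delta and elapsed. Since delta is in seconds and elapsed is in milliseconds, the fraction of one CPU used over the interval is delta * 1000 / elapsed.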
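
A minimal Python sketch of that calculation, assuming delta is in CPU seconds and elapsed is in milliseconds as described above. The function name is hypothetical, and the result is relative to a single CPU, so a multi-threaded process can exceed 100.

```python
def cpu_utilization_percent(delta_sec, elapsed_ms):
    """Percentage of one CPU used: CPU time consumed over wall-clock time."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    # Convert delta to milliseconds so both quantities share a unit.
    return 100.0 * (delta_sec * 1000.0) / elapsed_ms


# One CPU second consumed over 4000 ms of wall-clock time is 25% of a CPU.
light = cpu_utilization_percent(1, 4000)

# Two CPU seconds consumed over 2000 ms means one CPU was fully busy.
busy = cpu_utilization_percent(2, 2000)
```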

The full pipe definition:

name: ps
input:
    exec:
        command: ps ax --no-header -o pid,cputime,rss,comm
        interval: 2s
actions:
- extract:
    input-field: _raw
    remove: true
    pattern: '(\S+)\s+(\S+)\s+(\S+)\s+(.+)'
    output-fields: [PID,TIME,RSS,CMD]
- filter:
    exclude:
    - RSS: '0'
- extract:
    input-field: TIME
    pattern: '^(?:(\d+)-)?(\d+):(\d+):(\d+)'
    output-fields: [day,hour,min,sec]
- add:
    output-fields:
    - day: 0
- convert:
  - RSS: num
  - day: num
  - hour: num
  - min: num
  - sec: num
- script:
    overwrite: true
    let:
    - sec: sec + 60*(min + 60*(hour + 24*day))
    - RSS: round(RSS/1024)  # in MiB
- remove: [hour,min,day,TIME]
- stream:
    operation: delta
    group-by: PID
    elapsed-field: 'elapsed'
    only-changes: true
    watch: sec
- filter:
    condition: delta > 0
    
output:
    write: console