Safe data processing with "pflock" and "safewrite"

data
metadata
etl

#1

"pflock" and "safewrite" are two small tools that were written for our cron-based job scheduler on "eltiempo"; see also the thread "Erneuerung der Luftdatenpumpe". This post briefly introduces them.

  • pflock prevents multiple invocations of the same program
  • safewrite writes stdin to a file only when stdin actually yields data

#2

pflock/unpflock

Usage

pflock \
    luftdatenpumpe stations --reverse-geocode \
    --target=postgresql://username:password@localhost/weatherbase

Sources

pflock

Prevent multiple invocations of the same program.

#!/bin/bash
# "pflock" prevents multiple invocations of the same program

# Choose the error channel
#echoerr() { echo "$@" 1>&2; }
echoerr() { /usr/bin/logger "$@"; }

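program=$1
name=$(basename "$program")
shift
echoerr "pflock is running command: ${program} $*"
# Take an exclusive, non-blocking lock and run the program with its
# arguments passed through unchanged (no re-parsing by a shell).
# If the lock is already held, flock exits immediately with a non-zero status.
flock -xn "/var/lock/program-${name}.pflock" "$program" "$@"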
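
To illustrate the locking behaviour pflock relies on, here is a minimal sketch. It assumes flock(1) from util-linux is installed; the temporary lock file and the variable names are made up for this demo and are not part of pflock. While one process holds the exclusive lock, a second non-blocking attempt is rejected right away, and once the holder has exited the lock can be taken again.

```shell
#!/bin/bash
# Demo only: exclusive, non-blocking locking with flock(1).
lockfile="$(mktemp)"   # hypothetical lock path, used just for this demo

# First holder: keeps the lock for two seconds in the background.
flock -xn "$lockfile" sleep 2 &
sleep 1                # give the background job time to acquire the lock

# Second attempt while the lock is held: fails immediately instead of waiting.
if flock -xn "$lockfile" true; then
  second="acquired"
else
  second="rejected"
fi

wait                   # the holder exits, which releases the lock

# Third attempt after the holder has finished: succeeds.
if flock -xn "$lockfile" true; then
  third="acquired"
else
  third="rejected"
fi

echo "second=${second} third=${third}"
rm -f "$lockfile"
```

The lock is tied to the running process, not to the existence of the file, so a crashed job does not leave a stale lock behind.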

unpflock

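Remove all pflock lock files. flock releases a lock automatically when the locked program exits, so this is only needed for cleanup.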

#!/bin/bash

# -f: do not fail when no lock files exist
rm -f /var/lock/program-*.pflock

#3

safewrite

Write stdin to a file only when stdin yields data. When stdin is empty, nothing is written, so an existing output file is not truncated. The whole input is buffered in a shell variable before writing.

Usage

pflock luftdatenpumpe stations --reverse-geocode --progress \
    | jq '[ .[] | {key: .station_id | tostring, name: .name} ]' \
    | safewrite "/var/lib/grafana-metadata-api/json/ldi-stations.json"

Source

#!/bin/bash
# "safewrite" writes stdin to file only when yielding data

output=$1
shift

buffer="$(cat -)"

# Choose the error channel
#echoerr() { echo "$@" 1>&2; }
echoerr() { /usr/bin/logger "$@"; }

# Debugging
#echo "buffer: ${buffer}"

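# Write stdin to output file if not empty
if [[ -n "$buffer" ]]; then
  echoerr "Writing to $output"
  # printf avoids echo misreading content such as "-n" as an option
  if printf '%s\n' "$buffer" > "$output"; then
    echoerr "Writing to $output succeeded"
  else
    echoerr "Writing to $output failed"
    exit 1
  fi
else
  echoerr "Not writing to $output: stdin was empty"
fi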
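
A possible variation, not what the script above does: stream stdin into a temporary file next to the target and rename it into place only if it is non-empty. The function name safewrite_atomic and the demo paths are hypothetical. This is a sketch under the assumption that the target directory is writable and that mktemp(1) is available.

```shell
#!/bin/bash
# Hypothetical variant of safewrite; not part of the original script.
safewrite_atomic() {
  local output=$1 tmp
  # Create the temp file in the same directory so that mv is a rename.
  tmp="$(mktemp "${output}.XXXXXX")" || return 1
  cat > "$tmp"
  if [[ -s "$tmp" ]]; then
    # Non-empty input: replace the target in a single step.
    mv -- "$tmp" "$output"
  else
    # Empty input: discard the temp file and leave the target untouched.
    rm -f -- "$tmp"
  fi
}

# Demo with a throwaway directory.
demo_dir="$(mktemp -d)"
target="${demo_dir}/out.json"
printf '[1, 2]\n' | safewrite_atomic "$target"
printf '' | safewrite_atomic "$target"   # empty input, target keeps its content
cat "$target"
```

Readers of the target file then see either the old or the new content, never a half-written file. Note that mktemp creates the file with mode 0600, so permissions may need adjusting if another service reads the output.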