InfluxDB performance tuning

Andreas · April 26, 2018, 9:37pm

We experienced general memory greediness and several high-load conditions with InfluxDB 1.4.3. They caused erratic behavior especially on our data acquisition server for harvesting open weather data (see "Open weather data", post #6 by mois), which processes high-cardinality data feeds from DWD. To relax the stress level of the system, we amended some configuration parameters in the [data] section of /etc/influxdb/influxdb.conf:

# The amount of time that a write will wait before fsyncing.  A duration
# greater than 0 can be used to batch up multiple fsync calls.  This is useful for slower
# disks or when WAL write contention is seen.  A value of 0s fsyncs every write to the WAL.
# Values in the range of 0-100ms are recommended for non-SSD disks.
wal-fsync-delay = "1s"

# The type of shard index to use for new shards.  The default is an in-memory index that is
# recreated at startup.  A value of "tsi1" will use a disk based index that supports higher
# cardinality datasets.
index-version = "tsi1"

# CacheSnapshotMemorySize is the size at which the engine will
# snapshot the cache and write it to a TSM file, freeing up memory
# Valid size suffixes are k, m, or g (case insensitive, 1024 = 1k).
# Values without a size suffix are in bytes.
cache-snapshot-memory-size = "128m"
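
To make the size notation from the comment above concrete, here is a minimal Python sketch that converts a value like "128m" into bytes using the stated rule (1k = 1024 bytes, case-insensitive suffix, no suffix means bytes). The helper name parse_size is made up for illustration; this is not InfluxDB's own parser, only the arithmetic the comment describes.

```python
import re

# Multipliers follow the config comment: 1k = 1024 bytes.
# (Illustrative table, not taken from the InfluxDB source code.)
_SIZE_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(value: str) -> int:
    """Convert a size string such as "128m" into a number of bytes."""
    match = re.fullmatch(r"(\d+)([kmg]?)", value.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a valid size: {value!r}")
    number, suffix = match.groups()
    return int(number) * _SIZE_UNITS[suffix.lower()]


print(parse_size("128m"))  # 134217728
```

So with cache-snapshot-memory-size = "128m", the cache is snapshotted to a TSM file once it reaches 134217728 bytes.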

The InfluxDB documentation on the GOMAXPROCS environment variable also says:

You can override this value to be less than the maximum value, which can be useful in cases where you are running InfluxDB along with other processes on the same machine and want to ensure that the database doesn’t completely starve those processes.

So we tamed InfluxDB a bit to use only 3 of the 5 available CPU cores:

cat /etc/default/influxdb
GOMAXPROCS=3
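
To check that the setting actually reached the running process, one option on Linux is to read the process environment from /proc. The following is a minimal sketch, assuming a Linux host and that /etc/default/influxdb is handed to the service as an environment file; the helper names parse_environ and process_env are hypothetical.

```python
from pathlib import Path


def parse_environ(blob: bytes) -> dict:
    """Parse the NUL-separated KEY=VALUE format used by /proc/<pid>/environ."""
    env = {}
    for entry in blob.split(b"\0"):
        if b"=" not in entry:
            continue  # skips the empty entry after the trailing NUL
        key, _, value = entry.partition(b"=")
        env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def process_env(pid: int) -> dict:
    """Return the environment a Linux process was started with."""
    return parse_environ(Path(f"/proc/{pid}/environ").read_bytes())
```

Calling process_env(pid).get("GOMAXPROCS") with the PID of the influxd process should then return "3". Reading another user's /proc entry usually requires running as that user or as root.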

Andreas · May 12, 2018, 2:16pm

We recently increased the main memory of the machine harvesting Open weather data to 16 GB, as it was still occasionally unstable. The system has been behaving well since then.

Enjoy https://weather.hiveeyes.org/ and thanks again to @wtf and @weef for doing all the programming over there.

Andreas · June 5, 2018, 10:28pm

Just a little update on this: on our data acquisition machine eltiempo, we configured GOMAXPROCS=3 for InfluxDB to prevent processes from starving each other (documentation above updated) and also increased the available main memory to 24 GB.