
Visualization

From dankwiki

My current home metrics + visualization stack is self-hosted Grafana OSS atop a Prometheus time series database. Prometheus is fed by mqtt2prometheus, which reads Mosquitto-brokered metrics from my bespoke scripts and various IoT devices. Prometheus also ingests SNMP data via its own snmp_exporter. I was originally using Prometheus's node_exporter, but found it horribly heavyweight.
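For a sense of what the script end of that pipeline can look like, here is a minimal sketch that hands a JSON reading to the Mosquitto broker by shelling out to mosquitto_pub (its -h, -t and -m options set host, topic and message). It is not one of my actual scripts: the helper names, the topic and the field name are hypothetical, and it assumes mqtt2prometheus is configured to pull metrics out of JSON payload fields by name.

```python
import json
import subprocess


def mosquitto_pub_argv(topic, readings, host="localhost"):
    """Build the mosquitto_pub command line for one JSON-encoded reading.

    `readings` is a dict of field name -> value; the field names must match
    whatever the mqtt2prometheus configuration expects (assumption).
    """
    payload = json.dumps(readings, sort_keys=True)
    return ["mosquitto_pub", "-h", host, "-t", topic, "-m", payload]


def publish(topic, readings, host="localhost"):
    """Publish one reading; raises CalledProcessError if mosquitto_pub fails."""
    subprocess.run(mosquitto_pub_argv(topic, readings, host), check=True)
```

Something like publish("home/office/sensor", {"temperature": 21.5}) would then send one reading to a broker on localhost; the topic and field are made up for illustration.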

At Microsoft, I use Grafana atop an otherwise custom stack. At Google, I used an entirely custom stack because hey, SWEs gotta get promoted, and the authors of Monarch weren't gonna demonstrate complexity by adapting standard open source tooling.

Prometheus

By default, your databases will be in /var/lib/prometheus. The default retention size and time, at least on Debian, are ridiculously small. Edit /etc/default/prometheus and add something like --storage.tsdb.retention.size=10GB --storage.tsdb.retention.time=10y to ARGS. I don't know whether Prometheus gets slow when it hits these sizes/ages, but I do know it sucks to go look at your data and find out you're only retaining two months' worth. I currently (2023-06) have ~600MB across six months, and have seen no issues.
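To keep an eye on how close the database is getting to the size cap, you can total the files under the data directory. This is a minimal sketch, not anything Prometheus ships: the function names are made up, and treating 10GB as 10 * 1024**3 bytes is an assumption about how the size flag is interpreted.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under `path`, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full) and not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def fraction_of_cap(used_bytes, cap_bytes):
    """Return how much of the retention size cap is in use, as a fraction."""
    return used_bytes / cap_bytes


# Assumed interpretation of --storage.tsdb.retention.size=10GB.
RETENTION_CAP_BYTES = 10 * 1024**3
```

Running fraction_of_cap(dir_size_bytes("/var/lib/prometheus"), RETENTION_CAP_BYTES) as a user who can read that directory gives a rough idea of the remaining headroom.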

By default, Prometheus provides a data explorer UI and API on port 9090. Exporters (e.g. snmp_exporter) usually run their own UIs/APIs on their own ports (e.g. 9116 for snmp_exporter).
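The API on port 9090 is easy to poke at from a script. Here is a minimal sketch using only the Python standard library against Prometheus's instant-query endpoint, /api/v1/query. The helper names are made up, and the localhost base URL assumes you run it on the Prometheus host itself.

```python
import json
import urllib.parse
import urllib.request


def query_url(base, promql):
    """Build the instant-query URL for a PromQL expression."""
    params = urllib.parse.urlencode({"query": promql})
    return base.rstrip("/") + "/api/v1/query?" + params


def parse_vector(body):
    """Turn an instant-vector JSON response into (labels, value) pairs."""
    doc = json.loads(body)
    if doc.get("status") != "success":
        raise RuntimeError(doc.get("error", "query failed"))
    data = doc["data"]
    if data["resultType"] != "vector":
        raise ValueError("expected a vector, got " + data["resultType"])
    # Each sample's "value" is [unix_timestamp, "value-as-string"].
    return [(s["metric"], float(s["value"][1])) for s in data["result"]]


def instant_query(promql, base="http://localhost:9090", timeout=5):
    """Run one instant query against a live Prometheus (needs network access)."""
    with urllib.request.urlopen(query_url(base, promql), timeout=timeout) as resp:
        return parse_vector(resp.read())
```

instant_query("up") should return one pair per scrape target, with a value of 1.0 for targets whose last scrape succeeded.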