Problem:
The number of series in an InfluxDB database is capped by the max-series-per-database config parameter.
In our case it is 1 million.
The InfluxDB documentation recommends keeping the number of series low.
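A rough way to see how fast the limit can be reached: a series is a unique combination of measurement and tag set, so the worst case for one measurement is the product of the distinct values of each tag. The sketch below is only an estimate; the tag names and counts are made up for illustration and are not taken from our data.

```python
from math import prod

# Limit from the note above (max-series-per-database = 1 million).
MAX_SERIES_PER_DATABASE = 1_000_000


def worst_case_series(tag_value_counts: dict[str, int]) -> int:
    """Upper bound on series for one measurement: product of distinct values per tag."""
    return prod(tag_value_counts.values())


# Hypothetical tag layout for a CI measurement (assumed numbers).
ci_builds_tags = {"project": 50, "branch": 400, "status": 4, "runner": 20}

estimate = worst_case_series(ci_builds_tags)
over_limit = estimate > MAX_SERIES_PER_DATABASE
print(estimate, over_limit)
```

Real cardinality is usually lower than this bound because not every tag combination occurs, but a high-cardinality tag such as a branch name dominates the product.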
Ways to keep the number of series low:
- Database per measurement type (per table). If that turns out to be too many tables, then plan B: a database for each project (nitro-web, milano)
- Retention Policy
- Shard Group Durations (mainly for query performance)
- Downsampling with Continuous Queries (CQs) functions:
  - Average
  - Sum
  - Minimum/Maximum
  - First/Last
  - Count
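To make the downsampling functions concrete, here is a plain-Python sketch of what a CQ does conceptually: group points into fixed time windows and keep one aggregate per window. It does not talk to InfluxDB; the sample points and the `downsample` helper are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, bucket_seconds):
    """Group (timestamp, value) points into fixed windows and aggregate each window."""
    buckets = defaultdict(list)
    for ts, value in sorted(points):
        # Align each timestamp to the start of its window.
        buckets[ts - ts % bucket_seconds].append(value)
    return {
        start: {
            "mean": mean(values),
            "sum": sum(values),
            "min": min(values),
            "max": max(values),
            "first": values[0],
            "last": values[-1],
            "count": len(values),
        }
        for start, values in buckets.items()
    }


# Hypothetical build durations in seconds, keyed by Unix timestamp.
points = [(0, 120), (3600, 90), (7200, 150), (86400, 60), (90000, 100)]
daily = downsample(points, bucket_seconds=86400)
print(daily)
```

Five raw points become two daily rows, which is the whole point of downsampling: fewer points to store and to scan at query time.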
Scenario: Calculate CI builds on a daily basis
Solution:
- Create measurement 'ci_builds_daily'.
- Create Continuous Queries into 'ci_builds_weekly', 'ci_builds_monthly', 'ci_builds_yearly' and aggregate data into these tables (this is done automatically).
- For each measurement, create a Shard Group Duration equal to its interval, for query performance.
- For each measurement, create a Retention Policy equal to the table's range period + 1. For example, 'ci_builds_daily' will have a Retention Policy of 2 days, and so on.
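The steps above could be scripted roughly as follows. This is a sketch that only builds InfluxQL statement strings; it does not execute them. Assumptions not in the note: the database name `ci_build`, the `rp_` and `cq_` naming prefixes, a field called `value`, `REPLICATION 1` (single node), and a 2-week retention for the weekly table.

```python
def retention_policy_statement(db, measurement, retention, shard_duration):
    """Build a CREATE RETENTION POLICY statement with an explicit shard group duration."""
    return (
        f'CREATE RETENTION POLICY "rp_{measurement}" ON "{db}" '
        f"DURATION {retention} REPLICATION 1 SHARD DURATION {shard_duration}"
    )


def continuous_query_statement(db, source, target, interval, select):
    """Build a CQ that aggregates `source` into `target` once per `interval`."""
    return (
        f'CREATE CONTINUOUS QUERY "cq_{target}" ON "{db}" BEGIN '
        f"SELECT {select} "
        f'INTO "{db}"."rp_{target}"."{target}" '
        f'FROM "{db}"."rp_{source}"."{source}" '
        f"GROUP BY time({interval}), * END"
    )


DB = "ci_build"  # assumed database name

statements = [
    # Daily table: 1 day range + 1 = 2 days retention, 1 day shard groups.
    retention_policy_statement(DB, "ci_builds_daily", retention="2d", shard_duration="1d"),
    # Weekly table: retention of 2 weeks is an assumption following the same rule.
    retention_policy_statement(DB, "ci_builds_weekly", retention="2w", shard_duration="1w"),
    # Roll daily values up into the weekly table.
    continuous_query_statement(
        DB, "ci_builds_daily", "ci_builds_weekly", "1w", 'sum("value") AS "value"'
    ),
]

for statement in statements:
    print(statement)
```

Two things to check before relying on this: a CQ can only aggregate points that are still retained in its source, so the source retention has to cover the CQ window; and InfluxQL durations have no month or year unit, so the monthly and yearly tables would need day-based approximations.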
Most of these operations and optimizations could be done in a rake task or in migrations. If so, we will only need a simple StatefulSet service with an InfluxDB setting.
Another way is to create a CRD with the configuration and apply it to the cluster. The configuration might look like:
kind: InfluxDbConfiguration
spec:
  measurement:
    - db: ci_build
      downsampling:
        interval: "1d 1w 1m 1y"
        command: `SELECT MEAN(value)` / `SELECT SUM(amount)` / `SELECT COUNT(*)`
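Whatever applies this configuration would have to translate the interval tokens before using them in InfluxQL, because InfluxQL reads "1m" as one minute and has no month or year unit. A possible translation, as a sketch: the 30-day month and 365-day year are assumptions, and the helper names are hypothetical.

```python
import re

# Assumed mapping from the config's units to days (month and year are approximations).
DAYS_PER_UNIT = {"d": 1, "w": 7, "m": 30, "y": 365}


def to_influx_duration(token: str) -> str:
    """Convert a config token such as '1m' (one month) to an InfluxQL day duration."""
    match = re.fullmatch(r"(\d+)([dwmy])", token)
    if match is None:
        raise ValueError(f"unsupported interval token: {token!r}")
    count, unit = int(match.group(1)), match.group(2)
    return f"{count * DAYS_PER_UNIT[unit]}d"


def expand_intervals(spec: str) -> list[str]:
    """Split the space-separated interval string and convert every token."""
    return [to_influx_duration(token) for token in spec.split()]


intervals = expand_intervals("1d 1w 1m 1y")
print(intervals)  # ['1d', '7d', '30d', '365d']
```

Note that windows built from 30d or 365d are fixed-length and will not line up with calendar months or years.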