@rurkss
Last active June 5, 2023 19:23

Problem: The number of series in an InfluxDB database is capped by the max-series-per-database config parameter, which in our case is 1 million. The InfluxDB documentation advises against having too many series.

Ways to keep the series count low:

  • Database per measurement type (per table). If that leads to too many tables, then plan B: a database for each project (nitro-web, milano)
  • Retention Policy
  • Shard Group Durations (mainly for query performance)
  • Downsampling with Continuous Queries (CQs), using aggregate functions:
      1. Average
      2. Sum
      3. Minimum/Maximum
      4. First/Last
      5. Count
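As a rough illustration of the downsampling item above, the sketch below maps each listed aggregation to its InfluxQL function (MEAN, SUM, MIN, MAX, FIRST, LAST, COUNT) and builds a `CREATE CONTINUOUS QUERY` statement. The helper name `continuous_query` and all database, measurement and field names passed to it are hypothetical; only the InfluxQL 1.x statement shape is taken from InfluxDB itself.

```python
# InfluxQL function for each aggregation listed above.
AGGREGATES = {
    "average": "MEAN",
    "sum": "SUM",
    "minimum": "MIN",
    "maximum": "MAX",
    "first": "FIRST",
    "last": "LAST",
    "count": "COUNT",
}


def continuous_query(name, database, aggregate, field, source, target, interval):
    """Build a CREATE CONTINUOUS QUERY statement that downsamples
    `source` into `target` once per `interval` (e.g. "1h", "1d", "1w").

    All names are supplied by the caller; nothing here talks to InfluxDB.
    """
    func = AGGREGATES[aggregate]
    return (
        f'CREATE CONTINUOUS QUERY "{name}" ON "{database}" BEGIN '
        f'SELECT {func}("{field}") AS "{field}" '
        f'INTO "{target}" FROM "{source}" '
        f'GROUP BY time({interval}) END'
    )
```

Grouping only by time aggregates across all series of the source measurement, so the target ends up with fewer series than the source; adding tag keys to the GROUP BY clause would preserve them at the cost of more series.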

Scenario: Calculate CI builds on a daily basis

Solution:

  1. Create the measurement 'ci_builds_daily'.
  2. Create Continuous Queries into ci_builds_weekly, ci_builds_monthly and ci_builds_yearly that aggregate the data into these tables (this is done automatically).
  3. For each measurement, set the Shard Group Duration equal to its interval, for query performance.
  4. For each measurement, create a Retention Policy equal to the table's range period + 1. For example, ci_builds_daily will have a Retention Policy of 2 days, and so on.
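Steps 3 and 4 can be sketched as generated `CREATE RETENTION POLICY` statements. This is only a sketch under several assumptions: the database name `ci` and the `rp_` naming scheme are hypothetical, "+ 1" is read as one extra period of the same length (so a week becomes two weeks), and because InfluxQL durations have no month or year unit, a month is approximated as 30 days and a year as 365 days.

```python
from dataclasses import dataclass

DATABASE = "ci"  # hypothetical database name


@dataclass(frozen=True)
class Tier:
    measurement: str
    period_days: int  # length of the measurement's interval, in days


# Assumption: month = 30 days, year = 365 days (InfluxQL has no such units).
TIERS = [
    Tier("ci_builds_daily", 1),
    Tier("ci_builds_weekly", 7),
    Tier("ci_builds_monthly", 30),
    Tier("ci_builds_yearly", 365),
]


def retention_policy(tier, database=DATABASE):
    """Retention = range period + 1 period; shard group duration = the period."""
    keep_days = tier.period_days * 2
    return (
        f'CREATE RETENTION POLICY "rp_{tier.measurement}" ON "{database}" '
        f'DURATION {keep_days}d REPLICATION 1 '
        f'SHARD DURATION {tier.period_days}d'
    )


def all_retention_policies():
    return [retention_policy(tier) for tier in TIERS]
```

In InfluxDB the shard group duration is a property of the retention policy, which is why a single statement per measurement covers both steps. Points then have to be written to (and queried from) the matching policy, for example `"ci"."rp_ci_builds_daily"."ci_builds_daily"`.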

Most of these operations and optimizations can be done in a rake task or in migrations; if so, we will need a simple StatefulSet service with an InfluxDB configuration.

Another way is to create a CRD with the configuration and apply it to the cluster. The configuration might look like:

kind: InfluxDbConfiguration
spec:
  measurement:
  - db: ci_build
    downsampling:
      interval: "1d 1w 1m 1y"
      command: "SELECT MEAN(value)"  # or "SELECT SUM(amount)" / "SELECT COUNT(*)"
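How a controller or migration might interpret such a configuration is not specified here; the following is a minimal sketch of one possibility. It expands the space-separated `interval` string into one downsampling step per interval. The dict layout, the function names and the unit mapping are all assumptions: in InfluxQL `m` means minutes and there is no month or year unit, so `1m` and `1y` are translated to 30 and 365 days.

```python
# Hypothetical in-memory form of one entry of the configuration above.
ENTRIES = [
    {
        "db": "ci_build",
        "downsampling": {
            "interval": "1d 1w 1m 1y",
            "command": "SELECT SUM(amount)",
        },
    }
]

# Assumed meaning of the unit letters used in the configuration, in days.
# Note: this differs from InfluxQL, where "m" would mean minutes.
DAYS_PER_UNIT = {"d": 1, "w": 7, "m": 30, "y": 365}


def to_influx_duration(token):
    """Turn a config token such as "1w" into an InfluxQL duration in days."""
    count, unit = int(token[:-1]), token[-1]
    return f"{count * DAYS_PER_UNIT[unit]}d"


def expand(entries):
    """Return one downsampling step per (entry, interval) pair."""
    plan = []
    for entry in entries:
        downsampling = entry["downsampling"]
        for token in downsampling["interval"].split():
            plan.append(
                {
                    "db": entry["db"],
                    "interval": token,
                    "select": downsampling["command"],
                    "group_by": f"GROUP BY time({to_influx_duration(token)})",
                }
            )
    return plan
```

Each resulting step carries the pieces needed to assemble a continuous query for that interval; how the target measurements are named is left open, since the configuration does not say.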