@nachiket-lab
Created May 10, 2021 04:31
# Bucket Selector Query

Suppose we need to detect multiple users logging in from a single source IP. The query has to do the following:
1. Filter for the event ID you need
2. Aggregate the results by IP address
3. Within each IP address bucket, aggregate by user
4. Keep only the IP buckets whose user count crosses our threshold (in our case, more than 10)
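
To make the four steps concrete, here is a minimal sketch of the same logic in plain Python over a few in-memory events. It is not how Elasticsearch runs the query. The field names (`event_id`, `src_ip`, `user`), the event ID value, and the sample data are all made up for illustration.

```python
from collections import defaultdict

# Illustrative value only; use whatever event ID you actually need.
WANTED_EVENT_ID = 4624


def ips_with_many_users(events, event_id=WANTED_EVENT_ID, threshold=10):
    """Return {ip: sorted users} for IPs seen with more than `threshold` distinct users."""
    users_by_ip = defaultdict(set)
    for event in events:
        # Step 1: keep only the event ID we care about.
        if event["event_id"] != event_id:
            continue
        # Steps 2 and 3: bucket by IP, then collect the distinct users per IP.
        users_by_ip[event["src_ip"]].add(event["user"])
    # Step 4: keep only the IP buckets whose user count crosses the threshold.
    return {
        ip: sorted(users)
        for ip, users in users_by_ip.items()
        if len(users) > threshold
    }


# Hypothetical sample data: one IP with 12 users, one with a single user,
# and one with many users but a different event ID.
events = [{"event_id": 4624, "src_ip": "10.0.0.5", "user": f"user{i:02d}"} for i in range(12)]
events += [{"event_id": 4624, "src_ip": "10.0.0.9", "user": "alice"}] * 3
events += [{"event_id": 4625, "src_ip": "10.0.0.7", "user": f"user{i:02d}"} for i in range(20)]

flagged = ips_with_many_users(events)
print(sorted(flagged))  # prints ['10.0.0.5']
```

The comparison is strict, so an IP with exactly 10 users is not flagged, matching the `params.val>10` script in the query.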

The following query, written for Dev Tools, implements these steps:
```
GET <index_name>/_search
{
  "query": {
    ....
  },
  "size": 0,
  "aggs": {
    "ip": {
      "terms": {
        "field": "src.ip",
        "size": 1000,
        ## Partitioning splits the IP terms into groups so that all of them can be covered.
        ## Run the query once per partition (0 to 9 here), and try to keep
        ## `doc_count_error_upper_bound` and `sum_other_doc_count` as low as possible.
        "include": {
          "partition": 0,
          "num_partitions": 10
        }
      },
      "aggs": {
        "user": {
          "terms": {
            "field": "src.user.keyword",
            "size": 100
          }
        },
        ## Note that this sits next to `user` inside the `aggs` of the `ip` bucket,
        ## so each `ip` bucket is kept or dropped based on its number of `user` buckets.
        "user_selector": {
          "bucket_selector": {
            "buckets_path": {
              "val": "user._bucket_count"
            },
            "script": "params.val>10"
          }
        }
      }
    }
  }
}
```
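
The query above only covers partition 0, so it has to be repeated for each of the 10 partitions. Below is a rough sketch of how the request bodies could be generated and how the flagged IPs could be read out of a response. `build_body` and `flagged_ips` are hypothetical helper names, the `match_all` query is only a placeholder for the filter elided above, and the sample response is hand-written to follow the usual shape of a terms aggregation response.

```python
def build_body(partition, num_partitions=10, threshold=10, query=None):
    """Build the search body for one partition of the `ip` terms aggregation."""
    return {
        # Placeholder: substitute the real event filter here.
        "query": query or {"match_all": {}},
        "size": 0,
        "aggs": {
            "ip": {
                "terms": {
                    "field": "src.ip",
                    "size": 1000,
                    "include": {
                        "partition": partition,
                        "num_partitions": num_partitions,
                    },
                },
                "aggs": {
                    "user": {"terms": {"field": "src.user.keyword", "size": 100}},
                    # Sibling of `user`: filters `ip` buckets by their user bucket count.
                    "user_selector": {
                        "bucket_selector": {
                            "buckets_path": {"val": "user._bucket_count"},
                            "script": f"params.val>{threshold}",
                        }
                    },
                },
            }
        },
    }


def flagged_ips(response):
    """Map each surviving `ip` bucket key to the list of its user keys."""
    buckets = response["aggregations"]["ip"]["buckets"]
    return {b["key"]: [u["key"] for u in b["user"]["buckets"]] for b in buckets}


# One body per partition, 0 through 9.
bodies = [build_body(p) for p in range(10)]

# Hand-written sample response with a single surviving bucket (assumed shape).
sample_response = {
    "aggregations": {
        "ip": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "10.0.0.5",
                    "doc_count": 36,
                    "user": {
                        "buckets": [
                            {"key": f"user{i:02d}", "doc_count": 3} for i in range(12)
                        ]
                    },
                }
            ],
        }
    }
}

print(len(flagged_ips(sample_response)["10.0.0.5"]))  # prints 12
```

Each body would be sent to `<index_name>/_search` and the per-partition results merged. The client call is left out because it depends on your setup. Because the `user` aggregation has `"size": 100`, `_bucket_count` can never exceed 100, so this sketch only makes sense for thresholds below that.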