@deepns
Created April 23, 2021 00:35
Revisions

  1. deepns created this gist Apr 23, 2021.
    169 changes: 169 additions & 0 deletions jq_examples.md
    # Some examples of using jq

    - [Simple filter](#simple-filter)
    - [Access objects](#access-objects)
    - [Access lists/arrays](#access-listsarrays)
    - [Combine filters with pipe](#combine-filters-with-pipe)
    - [Raw output](#raw-output)
    - [Transform](#transform)
    - [Feed into multiple filters](#feed-into-multiple-filters)


    [jq](https://stedolan.github.io/jq/) is an excellent command-line tool for working with JSON data. I have been using it to process, filter and transform JSON objects so that the data is easier to inspect. Noting down some commonly used operations for my later reference.

    - **Syntax** - `jq [options] <filter> [file...]`. Reads from stdin when no file is given.
    - The filter specifies the expression to apply to the JSON data.
    - `.` - identity filter, output is the same as the input (pretty-printed by default).
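
    As a minimal sketch of the two ways to supply input, the snippet below runs the identity filter once on stdin and once on a file argument. The temporary file and its contents are made up for illustration; `-c` (`--compact-output`) is used only to keep each result on one line.

    ```shell
    # Hypothetical sample data, written to a temporary file for illustration
    tmpfile=$(mktemp)
    printf '%s\n' '{"quota_max":300,"quota_remaining":219}' > "$tmpfile"

    # Input supplied on stdin
    from_stdin=$(jq -c '.' < "$tmpfile")

    # Input supplied as a file argument after the filter
    from_file=$(jq -c '.' "$tmpfile")

    echo "$from_stdin"
    echo "$from_file"
    rm -f "$tmpfile"
    ```

    Both commands should print the same compact object, since only the source of the input differs.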

    ## Simple filter

    `~ % curl -s --compressed "https://api.stackexchange.com/2.2/tags?site=stackoverflow&pagesize=2" | jq '.'`

    ```json
    {
      "items": [
        {
          "has_synonyms": true,
          "is_moderator_only": false,
          "is_required": false,
          "count": 2204785,
          "name": "javascript"
        },
        {
          "has_synonyms": true,
          "is_moderator_only": false,
          "is_required": false,
          "count": 1770006,
          "name": "java"
        }
      ],
      "has_more": true,
      "quota_max": 300,
      "quota_remaining": 219
    }
    ```

    ## Access objects

    - `.object` - access the value stored under the key `object` in the current input. Use `.object1,.object2` to output several values, one after the other.

    ```text
    ~ % curl -s --compressed "https://api.stackexchange.com/2.2/sites" | jq '.quota_max,.quota_remaining'
    300
    216
    ```

    - `.parent.child` - access the `child` key of the `parent` object. Equivalent to the `.parent["child"]` syntax. A key that does not exist yields `null`.
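
    A small sketch of nested access, using a made-up object (the `site`, `styling` and `link_color` keys here are illustrative and not a claim about the API response):

    ```shell
    # Hypothetical nested object for illustration
    sample='{"site":{"name":"Server Fault","styling":{"link_color":"#10456A"}}}'

    # Dotted path and bracket syntax reach the same value
    dotted=$(printf '%s' "$sample" | jq '.site.styling.link_color')
    bracketed=$(printf '%s' "$sample" | jq '.site["styling"]["link_color"]')

    # A key that is absent from the object yields null rather than an error
    missing=$(printf '%s' "$sample" | jq '.site.no_such_key')

    echo "$dotted"
    echo "$bracketed"
    echo "$missing"
    ```

    The bracket form is mostly useful when a key contains characters that the dotted form cannot express, such as spaces or dashes.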

    ## Access lists/arrays

    Arrays are accessed using the `[]` operator.

    - `.[]` - iterate over all items in the array (e.g. `input | jq '.items[]'`)
    - `.[i]` - access the item at index `i`, counting from 0 (e.g. `input | jq '.items[1].name'`)
    - `.[i:j]` - slice the array from index `i` (inclusive) to index `j` (exclusive).

    ```text
    ~ % cat stackexchange_sites | jq '.items[1].name'
    "Server Fault"
    ```
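
    The indexing and slicing forms can be tried on a small inline array. This is a sketch with made-up sample data shaped like the `items` list; the negative index is an extra not covered in the list above, and it counts from the end of the array.

    ```shell
    # Hypothetical sample shaped like the "items" array of the sites response
    sample='{"items":[{"name":"Stack Overflow"},{"name":"Server Fault"},{"name":"Super User"},{"name":"Meta Stack Exchange"}]}'

    # Index 1 is the second element
    second=$(printf '%s' "$sample" | jq -c '.items[1].name')

    # Negative indices count from the end of the array
    last=$(printf '%s' "$sample" | jq -c '.items[-1].name')

    # Slice [1:3] keeps indices 1 and 2; the result is still an array
    middle=$(printf '%s' "$sample" | jq -c '.items[1:3]')

    echo "$second"
    echo "$last"
    echo "$middle"
    ```

    Note that a slice returns an array, while `.[]` unpacks an array into separate outputs.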

    ## Combine filters with pipe

    Filters can be combined using the pipe operator `|`, which feeds each output of the filter on its left into the filter on its right. Whitespace around `|` is optional.

    e.g. `~ % cat stackexchange_sites | jq '.items[] | .api_site_parameter'` (`api_site_parameter` is the site identifier to pass as the `site` parameter in Stack Exchange API requests.)

    ```json
    "stackoverflow"
    "serverfault"
    "superuser"
    "meta"
    "webapps"
    "webapps.meta"
    "gaming"
    "gaming.meta"
    "webmasters"
    "webmasters.meta"
    "cooking"
    "cooking.meta"
    "gamedev"
    "gamedev.meta"
    "photo"
    "photo.meta"
    "stats"
    "stats.meta"
    "math"
    "math.meta"
    "diy"
    "diy.meta"
    "meta.superuser"
    "meta.serverfault"
    "gis"
    "gis.meta"
    "tex"
    "tex.meta"
    "askubuntu"
    "meta.askubuntu"
    ```
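
    To make the pipe behaviour concrete, here is a sketch on a two-item sample (made-up data that reuses the `api_site_parameter` key). It compares the piped form with and without spaces, and with the equivalent chained path.

    ```shell
    # Hypothetical two-item sample for illustration
    sample='{"items":[{"api_site_parameter":"stackoverflow"},{"api_site_parameter":"serverfault"}]}'

    # Pipe with surrounding spaces
    spaced=$(printf '%s' "$sample" | jq -c '.items[] | .api_site_parameter')

    # Same pipe without spaces
    unspaced=$(printf '%s' "$sample" | jq -c '.items[]|.api_site_parameter')

    # Chained path expression, no explicit pipe
    chained=$(printf '%s' "$sample" | jq -c '.items[].api_site_parameter')

    echo "$spaced"
    ```

    All three variables should hold the same two lines, one JSON string per item.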

    ## Raw output

    The `--raw-output` / `-r` option writes string results as plain text, without the surrounding JSON quotes and escaping. This comes in handy for applying further operations on the data using shell commands.

    e.g. list the Stack Exchange sites whose names start with S, in sorted order.
    `~ % cat stackexchange_sites | jq --raw-output '.items[] | .name' | sort | grep "^S"`

    ```text
    Seasoned Advice
    Seasoned Advice Meta
    Server Fault
    Stack Overflow
    Super User
    ```
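
    The difference between quoted and raw output is easiest to see side by side. The sketch below uses a three-item sample invented for illustration.

    ```shell
    # Hypothetical sample for illustration
    sample='{"items":[{"name":"Super User"},{"name":"Stack Overflow"},{"name":"Ask Ubuntu"}]}'

    # Default output keeps the JSON string quotes
    quoted=$(printf '%s' "$sample" | jq '.items[0].name')

    # -r strips the quotes so shell tools see the bare text
    raw=$(printf '%s' "$sample" | jq -r '.items[0].name')

    # With raw output, "^S" matches the first letter of the name
    s_sites=$(printf '%s' "$sample" | jq -r '.items[] | .name' | grep '^S' | sort)

    echo "$quoted"
    echo "$raw"
    echo "$s_sites"
    ```

    Without `-r`, every line would start with a double quote, so the `grep '^S'` pattern would match nothing.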

    ## Transform

    We can also transform one JSON stream into another by writing a filter in the form of an object, `{ key : value }`, where `key` is the name to use in the output and `value` is a filter that extracts the value from the input.

    e.g. extracting the name and `site_url` of the first five sites from the Stack Exchange sites list: `~ % cat stackexchange_sites | jq '.items[0:5] | .[] | { "name" : .name, "site" : .site_url}'`

    ```json
    {
      "name": "Stack Overflow",
      "site": "https://stackoverflow.com"
    }
    {
      "name": "Server Fault",
      "site": "https://serverfault.com"
    }
    {
      "name": "Super User",
      "site": "https://superuser.com"
    }
    {
      "name": "Meta Stack Exchange",
      "site": "https://meta.stackexchange.com"
    }
    {
      "name": "Web Applications",
      "site": "https://webapps.stackexchange.com"
    }
    ```
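
    Object construction also has a shorthand worth knowing: a bare key such as `name` stands for `name: .name`. The sketch below, on a one-item sample made up for illustration, compares the explicit and shorthand forms and wraps the results in `[ ... ]` to collect them into a single array.

    ```shell
    # Hypothetical one-item sample for illustration
    sample='{"items":[{"name":"Stack Overflow","site_url":"https://stackoverflow.com","site_type":"main_site"}]}'

    # Explicit keys and values, as in the example above
    explicit=$(printf '%s' "$sample" | jq -c '.items[] | { "name" : .name, "site" : .site_url }')

    # Shorthand: a bare key name copies the field of the same name
    shorthand=$(printf '%s' "$sample" | jq -c '.items[] | { name, site: .site_url }')

    # Square brackets collect the stream of objects into one JSON array
    as_array=$(printf '%s' "$sample" | jq -c '[ .items[] | { name, site: .site_url } ]')

    echo "$explicit"
    echo "$as_array"
    ```

    Collecting into an array is useful when the result has to be a single valid JSON document rather than a stream of separate objects.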

    ## Feed into multiple filters

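    The `,` operator can be used to feed the same input into multiple filters (similar to the `tee` utility in Linux). The outputs are concatenated in order: all results of the first filter, followed by all results of the second. Comes in handy for sequential processing.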

    e.g. `~ % cat stackexchange_sites | jq '.items[1:5] | .[].name,.[].site_url'`

    ```json
    "Server Fault"
    "Super User"
    "Meta Stack Exchange"
    "Web Applications"
    "https://serverfault.com"
    "https://superuser.com"
    "https://meta.stackexchange.com"
    "https://webapps.stackexchange.com"
    ```
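
    Where the `,` is placed relative to the array iteration changes the order of the output. The sketch below uses a two-item sample made up for illustration. It contrasts iterating inside each branch of the comma (names first, then URLs) with iterating once and applying the comma per item (name and URL interleaved).

    ```shell
    # Hypothetical two-item sample for illustration
    sample='{"items":[{"name":"Server Fault","site_url":"https://serverfault.com"},{"name":"Super User","site_url":"https://superuser.com"}]}'

    # Each branch iterates the array itself: all names, then all URLs
    grouped=$(printf '%s' "$sample" | jq -r '.items | .[].name,.[].site_url')

    # Iterate once, then apply both filters to each item: name, URL, name, URL
    paired=$(printf '%s' "$sample" | jq -r '.items[] | .name,.site_url')

    echo "$grouped"
    echo "---"
    echo "$paired"
    ```

    The first form matches the output shown above. The second keeps each site's fields together, which is often easier to consume line by line.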