@kylebrandt
Last active May 13, 2025
    ![image](https://gist.github.com/user-attachments/assets/4f7bf485-9258-48d5-8993-535dc8be3b3f)

    # The queries

    ## Panel: Main Sheet
    SQL:
```sql
    SELECT title, author, copyright, href, image, rating, series
    FROM main
    ```

    ## Panel: Genre Stats
    SQL:
```sql
WITH genre_counts AS (
    SELECT asin, COUNT(*) AS genre_count
    FROM genres
    GROUP BY asin
),
joined AS (
    SELECT
        g.asin,
        g.genre,
        m.runtimeLengthMin AS total_minutes,
        m.runtimeLengthMin / gc.genre_count AS weighted_minutes
    FROM main m
    JOIN genres g ON m.asin = g.asin
    JOIN genre_counts gc ON m.asin = gc.asin
)
SELECT
    genre,
    COUNT(DISTINCT asin) AS book_count,
    ROUND(SUM(total_minutes) / 60.0, 1) AS total_hours_unweighted,
    ROUND(SUM(weighted_minutes) / 60.0, 1) AS total_hours_weighted
FROM joined
GROUP BY genre
HAVING book_count > 3
ORDER BY total_hours_unweighted DESC
```
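This query splits each book's runtime evenly across its genres, so a multi-genre book is not double-counted in the weighted column (the author stats query uses the same pattern). Note that depending on the SQL engine, `runtimeLengthMin / genre_count` may be integer division. A minimal Python sketch of the same weighted/unweighted arithmetic, using made-up sample data (the ASINs and genres below are hypothetical):

```python
from collections import defaultdict

# Hypothetical sample rows mirroring the `main` and `genres` tables:
# runtime in minutes per ASIN, and genre memberships per ASIN.
runtime_min = {"B001": 600, "B002": 300}
genres = [("B001", "Fantasy"), ("B001", "Sci-Fi"), ("B001", "Epic"),
          ("B002", "Fantasy")]

# genre_counts CTE: how many genres each book belongs to
genre_count = defaultdict(int)
for asin, _ in genres:
    genre_count[asin] += 1

unweighted = defaultdict(float)  # full runtime counted once per genre
weighted = defaultdict(float)    # runtime split evenly across the book's genres
for asin, genre in genres:
    unweighted[genre] += runtime_min[asin] / 60.0
    weighted[genre] += runtime_min[asin] / genre_count[asin] / 60.0

print(round(unweighted["Fantasy"], 1))  # 600/60 + 300/60 = 15.0
print(round(weighted["Fantasy"], 1))    # 600/3/60 + 300/1/60 = 8.3
```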

    ## Panel: Author Stats
    SQL:
```sql
WITH author_counts AS (
    SELECT asin, COUNT(*) AS author_count
    FROM authors
    GROUP BY asin
),
joined AS (
    SELECT
        a.asin,
        a.author,
        m.runtimeLengthMin AS total_minutes,
        m.runtimeLengthMin / ac.author_count AS weighted_minutes
    FROM main m
    JOIN authors a ON m.asin = a.asin
    JOIN author_counts ac ON m.asin = ac.asin
)
SELECT
    author,
    COUNT(DISTINCT asin) AS book_count,
    ROUND(SUM(total_minutes) / 60.0, 1) AS total_hours_unweighted,
    ROUND(SUM(weighted_minutes) / 60.0, 1) AS total_hours_weighted
FROM joined
GROUP BY author
HAVING book_count > 3
ORDER BY total_hours_unweighted DESC
```

    ## Panel: Hours Per Copyright Year
    SQL:

```sql
SELECT
    CAST(CONCAT(
        COALESCE(
            copyright,
            EXTRACT(YEAR FROM releaseDate)
        ), '-01-01') AS DATETIME) AS time,
    ROUND(SUM(runtimeLengthMin) / 60.0, 1) AS total_hours
FROM main
GROUP BY time
ORDER BY time
```
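The `COALESCE` here buckets each book under its copyright year when one is present, falling back to the year of its release date, and anchors the bucket at January 1 so it plots as a time series. The same fallback logic, sketched in Python with hypothetical values:

```python
from datetime import datetime

def copyright_year_bucket(copyright_year, release_date):
    """Mirror of the SQL COALESCE: prefer the copyright year,
    fall back to the release date's year, as a Jan-1 timestamp."""
    year = copyright_year if copyright_year is not None else release_date.year
    return datetime(int(year), 1, 1)

print(copyright_year_bucket(2019, datetime(2021, 6, 3)))  # 2019-01-01 00:00:00
print(copyright_year_bucket(None, datetime(2021, 6, 3)))  # 2021-01-01 00:00:00
```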

    ## Panel: Series Stats

    SQL:
```sql
SELECT
    m.seriesPrimary_name AS series_name,
    COUNT(DISTINCT m.asin) AS book_count,
    ROUND(SUM(m.runtimeLengthMin) / 60.0, 1) AS total_hours,
    GROUP_CONCAT(DISTINCT a.author ORDER BY a.author SEPARATOR ', ') AS authors
FROM main m
JOIN authors a ON m.asin = a.asin
WHERE m.seriesPrimary_asin IS NOT NULL
GROUP BY series_name
ORDER BY book_count DESC, total_hours DESC
```
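`GROUP_CONCAT(DISTINCT … ORDER BY … SEPARATOR ', ')` is MySQL-style syntax for collapsing a group's values into one sorted, de-duplicated string. The equivalent aggregation in Python (the series and author names below are hypothetical sample data):

```python
# Hypothetical sample: (series_name, author) pairs after the join,
# including a duplicate that DISTINCT would remove.
rows = [("The Expanse", "James S. A. Corey"),
        ("The Expanse", "Daniel Abraham"),
        ("The Expanse", "James S. A. Corey")]

by_series = {}
for series, author in rows:
    by_series.setdefault(series, set()).add(author)  # set => DISTINCT

# Mirrors GROUP_CONCAT(DISTINCT a.author ORDER BY a.author SEPARATOR ', ')
authors = ", ".join(sorted(by_series["The Expanse"]))
print(authors)  # Daniel Abraham, James S. A. Corey
```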

    ## Panel: Just the sheets
    (No SQL expression; shows raw named ranges from Google Sheets directly.)



    # 📚 Audible Library Dashboard – Project Summary

    ## ✅ 1. Scrape Audible Library
    Used browser DevTools console JavaScript to extract:
    - Titles
    - ASINs
    - Authors, Series, and Links
    …directly from the Audible "My Library" HTML.

    ## ✅ 2. Enrich Data via API
    Wrote a Python script to enrich each ASIN by calling the **Audnexus API**:
    - Added runtime, genres, narrators, release date, images, ISBN, and summary
    - Output: a clean `audible_lib_enriched.json` file

    ## ✅ 3. Export to Multi-Table CSV
    Used Python to:
    - Flatten array fields (authors, narrators, genres) into separate CSVs
    - Normalize fields (e.g., `seriesPrimary_name`)
    - Output multiple CSVs for import into **Google Sheets** with named ranges

    ## ✅ 4. Build a Grafana Dashboard Using SQL Expressions
    - Backed by **Google Sheets** as the data source
    - Powered by **SQL Expressions (GMS engine)**:
    - Time series by release or copyright year
    - Runtime and book count by genre, author, and series
    - Weighted calculations for multi-author/genre relationships
    - CSV-style field aggregation with `GROUP_CONCAT`
    - Clean handling of joins and string normalization
## cons.js (DevTools console scraper)
```js
const books = Array.from(document.querySelectorAll('.adbl-library-content-row')).map(row => {
    const link = row.querySelector('a.bc-link.bc-color-base');
    const title = link?.querySelector('span.bc-size-headline3')?.textContent.trim() || '';
    const href = link?.href || '';

    // Audible product ASINs are 10-character alphanumeric IDs embedded in the URL.
    const asinMatch = href.match(/\/([A-Z0-9]{10})(?:[/?]|$)/);
    const asin = asinMatch ? asinMatch[1] : '';

    const author = row.querySelector('.authorLabel')?.parentElement.querySelector('a.bc-link')?.textContent.trim() || '';
    const series = row.querySelector('.seriesLabel')?.parentElement.querySelector('a.bc-link')?.textContent.trim() || '';

    return { title, asin, href, author, series };
}).filter(book => book.title);

console.table(books);
// copy() is a DevTools console helper; it puts the JSON on the clipboard.
copy(JSON.stringify(books, null, 2));
```
## enrich_audiobook.py
```python
import json
import time

import requests

INPUT_FILE = "audible_lib.json"
OUTPUT_FILE = "audible_lib_enriched.json"


def load_books(path):
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def enrich_book(book):
    asin = book.get("asin")
    if not asin:
        book["audnexus_error"] = "Missing ASIN"
        return book

    url = f"https://api.audnex.us/books/{asin}"
    try:
        response = requests.get(url, timeout=10)
        if response.status_code == 200:
            data = response.json()
            # Merge API fields over the scraped record (API fields win on conflicts).
            return {**book, **data}
        else:
            book["audnexus_error"] = f"HTTP {response.status_code}"
            return book
    except Exception as e:
        book["audnexus_error"] = str(e)
        return book


def save_books(path, books):
    with open(path, "w", encoding="utf-8") as f:
        json.dump(books, f, indent=2, ensure_ascii=False)


def main():
    books = load_books(INPUT_FILE)
    enriched = []

    for book in books:
        enriched.append(enrich_book(book))
        time.sleep(0.2)  # 200 ms pause between requests to go easy on the API

    save_books(OUTPUT_FILE, enriched)
    print(f"Enriched {len(enriched)} books saved to {OUTPUT_FILE}")


if __name__ == "__main__":
    main()
```
## make_audible_csv.py
```python
import csv
import json

INPUT = "audible_lib_enriched.json"

# Load the enriched JSON
with open(INPUT, "r", encoding="utf-8") as f:
    books = json.load(f)

main_rows = []
author_rows = []
narrator_rows = []
genre_rows = []

for book in books:
    asin = book.get("asin")

    # Main entry: everything except arrays and nested series objects
    main = {
        k: v for k, v in book.items()
        if k not in ("authors", "narrators", "genres", "seriesPrimary", "seriesSecondary")
    }

    # Flatten seriesPrimary and seriesSecondary into scalar columns
    for key in ["seriesPrimary", "seriesSecondary"]:
        series = book.get(key)
        if isinstance(series, dict):
            main[f"{key}_name"] = series.get("name")
            main[f"{key}_asin"] = series.get("asin")
            main[f"{key}_order"] = series.get("order") or series.get("position")
        else:
            main[f"{key}_name"] = None
            main[f"{key}_asin"] = None
            main[f"{key}_order"] = None

    main_rows.append(main)

    for a in book.get("authors", []):
        author_rows.append({"asin": asin, "author": a.get("name", "")})

    for n in book.get("narrators", []):
        narrator_rows.append({"asin": asin, "narrator": n.get("name", "")})

    for g in book.get("genres", []):
        genre_rows.append({"asin": asin, "genre": g.get("name", "")})


# CSV-writing helper: the union of keys across all rows becomes the header
def write_csv(filename, rows):
    if not rows:
        print(f"[skip] No rows to write for {filename}")
        return

    all_keys = set()
    for row in rows:
        all_keys.update(row.keys())
    fieldnames = sorted(all_keys)

    with open(filename, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

    print(f"[done] Wrote {len(rows)} rows to {filename}")


# Output files
write_csv("audible_lib_main.csv", main_rows)
write_csv("audible_lib_authors.csv", author_rows)
write_csv("audible_lib_narrators.csv", narrator_rows)
write_csv("audible_lib_genres.csv", genre_rows)

# Debug summary
print(f"authors: {len(author_rows)} | narrators: {len(narrator_rows)} | genres: {len(genre_rows)}")
```