@flexchar
Created April 26, 2024 14:07

Revisions

  1. flexchar created this gist Apr 26, 2024.
    86 changes: 86 additions & 0 deletions video-to-json.md
    ### Prerequisites

    - This script assumes you have an `r2` alias set up for `s5cmd`, pointing at your Cloudflare R2 endpoint:

    ```warp-runnable-command
    alias r2='s5cmd --credentials-file ~/.config/s3/r2-config.cfg --endpoint-url https://{your account id}.r2.cloudflarestorage.com'
    ```

    - This script uses `s5cmd` as an S3 client, which can be installed with `brew install s5cmd` or the equivalent on other systems.
    - Provide your Replicate API token:

    ```warp-runnable-command
    export REPLICATE_API_TOKEN=
    ```
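
Before running the steps below, it can help to confirm that the prerequisites are in place. The following is a minimal sketch and not part of the original workflow; `require_cmd` and `require_env` are hypothetical helper names introduced here for illustration.

```shell
# Hypothetical helper: succeed only if the named command is on PATH.
require_cmd() {
  command -v "$1" >/dev/null 2>&1 || {
    echo "missing command: $1" >&2
    return 1
  }
}

# Hypothetical helper: succeed only if the named variable is set and non-empty.
require_env() {
  eval "_value=\${$1:-}"
  [ -n "$_value" ] || {
    echo "missing environment variable: $1" >&2
    return 1
  }
}

# Example usage (prints what is missing, if anything):
#   require_cmd s5cmd && require_cmd curl && require_cmd jq \
#     && require_env REPLICATE_API_TOKEN
```

The checks only report problems; they do not install or configure anything.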

    ### Process

    1. Upload the video to a Cloudflare R2 bucket through its S3-compatible API. This lets us create a web-accessible link, so we don't have to struggle with uploading the file directly. The bucket's files are not publicly browsable, so they stay private.

    ```warp-runnable-command
    S3_PATH="s3://tmp/7d/{{destination_name}}"
    r2 cp --sp {{source_path}} $S3_PATH
    ```

    A couple of notes: our `tmp` bucket is configured with two lifecycle rules that delete files whose keys start with the `1d` or `7d` prefix, respectively. Each rule's retention period matches its prefix. As the bucket name implies, it shouldn't be used for anything that needs to persist. Each file is wiped on its 30-day birthday, i.e. it gets removed.
    See the Object lifecycle rules section in your bucket settings.
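
The prefix convention can be captured in a small helper so that the retention period is chosen explicitly. This is a sketch that assumes the `tmp` bucket and the `1d`/`7d` prefixes described above; `tmp_s3_path` is a hypothetical name, not an `s5cmd` feature.

```shell
# Hypothetical helper: build a destination path in the `tmp` bucket.
# $1 = retention prefix (1d or 7d), $2 = destination file name.
tmp_s3_path() {
  case "$1" in
    1d|7d)
      printf 's3://tmp/%s/%s\n' "$1" "$2"
      ;;
    *)
      echo "unknown retention prefix: $1 (expected 1d or 7d)" >&2
      return 1
      ;;
  esac
}

# Example usage:
#   S3_PATH=$(tmp_s3_path 7d my-video.mp4)
```

Rejecting unknown prefixes avoids uploading to a key that no lifecycle rule with a matching prefix would clean up.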

    2. After uploading, get a publicly accessible link by presigning the URL.

    ```warp-runnable-command
    SIGNED_URL=$(r2 presign --expire 1h $S3_PATH)
    echo $SIGNED_URL
    ```

    3. Submit to Replicate

    ```warp-runnable-command
    REPLICATE_RES=$(curl --location 'https://api.replicate.com/v1/predictions' \
    --header 'Content-Type: application/json' \
    --header "Authorization: Token $REPLICATE_API_TOKEN" \
    --data @- <<EOF
    {
    "version": "4f41e90243af171da918f04da3e526b2c247065583ea9b757f2071f573965408",
    "input": {
    "url": "$SIGNED_URL",
    "task": "transcribe",
    "timestamp": "chunk",
    "batch_size": 64,
    "language": "en"
    }
    }
    EOF
    )
    echo "$REPLICATE_RES" | jq .
    WHISPER_ID=$(echo "$REPLICATE_RES" | jq -r '.id')
    echo "$WHISPER_ID"
    ```

    This is a cold-started ML endpoint, so the prediction may take up to 3 minutes to complete.

    4. Fetch the result using the `id` from the response.

    ```warp-runnable-command
    OUTPUT_FILE="{{output_filename}}.json"
    echo $OUTPUT_FILE
    curl --location "https://api.replicate.com/v1/predictions/$WHISPER_ID" \
    --header "Authorization: Token $REPLICATE_API_TOKEN" | jq . > "$OUTPUT_FILE"
    ```
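
Because the prediction may still be running when this request is made, the saved file can contain an unfinished result. One way to handle that is to poll until the prediction reaches a final state. The sketch below assumes the response carries a `status` field with the values `starting`, `processing`, `succeeded`, `failed`, and `canceled`; `is_final_status` and `wait_for_prediction` are hypothetical helper names, and the 5-second default interval is arbitrary.

```shell
# Hypothetical helper: succeed if the status will not change any more.
is_final_status() {
  case "$1" in
    succeeded|failed|canceled) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: poll a prediction until it is final, then print its JSON.
# $1 = prediction id, $2 = seconds between polls (default 5).
# Returns 0 only if the final status is "succeeded".
wait_for_prediction() {
  while :; do
    _res=$(curl --silent --show-error --location \
      "https://api.replicate.com/v1/predictions/$1" \
      --header "Authorization: Token $REPLICATE_API_TOKEN") || return 1
    _status=$(printf '%s' "$_res" | jq -r '.status')
    case "$_status" in
      ""|null)
        # No status field: most likely an error response, so stop polling.
        echo "unexpected response: $_res" >&2
        return 1
        ;;
    esac
    if is_final_status "$_status"; then
      printf '%s\n' "$_res"
      [ "$_status" = succeeded ]
      return
    fi
    sleep "${2:-5}"
  done
}

# Example usage:
#   wait_for_prediction "$WHISPER_ID" | jq . > "$OUTPUT_FILE"
```

The loop stops on any final status, so a failed or canceled prediction is still printed for inspection instead of being retried forever.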

    5. Get the complete transcript

    ```warp-runnable-command
    jq -r '.output.text' "$OUTPUT_FILE"
    ```

    ### Use AI to summarize

    This is an example of how we could use AI to extract key points.
    I am currently using the [Fabric by Daniel Miessler](https://github.com/danielmiessler/fabric) repository. In short, it provides a set of convenience prompts that are submitted together with the input text, which can be piped in, all without leaving the terminal.

    One of my favorites is `extract_wisdom`, which was designed for YouTube videos.

    ```warp-runnable-command
    jq -r '.output.text' "$OUTPUT_FILE" | fabric -p extract_wisdom -s
    ```