@dannguyen
Last active October 18, 2025 15:51

Revisions

  1. dannguyen revised this gist Jan 25, 2019. 2 changed files with 6917 additions and 0 deletions.
    20 changes: 20 additions & 0 deletions cardib-politics-talk-transcribe.md
    @@ -115,6 +115,26 @@ Plaintext of the transcription, with leading/trailing words trimmed, and the rep
    > I haven't. I haven't heard the statement, but I do understand. And perhaps you should have said it differently. Local people know who they are when they go for groceries and everything else. And I think what Wilbur is probably trying to say is that they will work along. I know banks have working along. If you have mortgages, the mortgages and mortgage, the folks collecting the interest and all of those things, they work along. And that's what happens in time like this. They know the people. They've been dealing with them for years, and they work along the grocery store. And I think that's probably what Wilbur Ross. But I haven't seen a statement, but he's done a great job, I will tell you that.

    ## Update 2019-01-24 Senate debate Sen. Michael Bennet

    Thought it'd be worth trying the multiple-speaker identification on this other political video that's floating around Twitter today:

    https://twitter.com/AuthorFarrah/status/1088565656327458817

    The invocation follows [the API's requirement](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/transcribe.html#TranscribeService.Client.start_transcription_job) that the maximum number of possible speakers be specified (I chose 4):

    ```sh
    aws transcribe start-transcription-job --language-code 'en-US' --media-format mp3 \
    --settings '{"ShowSpeakerLabels": true, "MaxSpeakerLabels": 4}' \
    --transcription-job-name "$FNAME" \
    --media "{\"MediaFileUri\": \"s3://data.danwin.com/tmp/${FNAME}.mp3\"}" \
    --output-bucket-name 'data.danwin.com'
    ```
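For comparison, here is a rough sketch of the same request made from Python with boto3's `start_transcription_job`. The helper names `build_job_request` and `start_job` are made up for illustration, and the bucket and path simply mirror the CLI call above; treat it as a starting point rather than a tested deployment script.

```python
def build_job_request(job_name, bucket="data.danwin.com", max_speakers=4):
    """Build keyword arguments that mirror the CLI flags used above."""
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": "en-US",
        "MediaFormat": "mp3",
        "Media": {"MediaFileUri": f"s3://{bucket}/tmp/{job_name}.mp3"},
        "OutputBucketName": bucket,
        # MaxSpeakerLabels must accompany ShowSpeakerLabels.
        "Settings": {"ShowSpeakerLabels": True, "MaxSpeakerLabels": max_speakers},
    }


def start_job(job_name):
    """Submit the job; needs boto3 installed and AWS credentials configured."""
    # Imported lazily so build_job_request works without boto3 installed.
    import boto3

    client = boto3.client("transcribe")
    return client.start_transcription_job(**build_job_request(job_name))
```

Keeping the request dictionary separate from the network call makes it easy to inspect or log the parameters before anything is billed.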

    Resulting [JSON output](#file-transcript-senate-bennett-json)
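To turn speaker-labeled output into readable turns, something like the sketch below might work. It assumes the response's `results` object has a `speaker_labels.segments[].items[]` list whose entries carry `start_time` and `speaker_label`, alongside the usual `items` array; that layout is an assumption here, since the Bennet JSON is not shown in this diff. The function name `speaker_turns` is hypothetical.

```python
def speaker_turns(results):
    """Group transcribed words into consecutive (speaker, text) turns."""
    # Map each word's start_time to the speaker label assigned to it.
    speaker_at = {}
    for segment in results["speaker_labels"]["segments"]:
        for item in segment["items"]:
            speaker_at[item["start_time"]] = item["speaker_label"]

    turns = []  # each entry is [speaker, text]
    for item in results["items"]:
        content = item["alternatives"][0]["content"]
        if item["type"] == "punctuation":
            # Punctuation has no timestamp; attach it to the current turn.
            if turns:
                turns[-1][1] += content
            continue
        speaker = speaker_at.get(item["start_time"], "unknown")
        if turns and turns[-1][0] == speaker:
            turns[-1][1] += " " + content
        else:
            turns.append([speaker, content])
    return [(speaker, text) for speaker, text in turns]
```

Matching on `start_time` is a simple join key for a sketch; words without a matching label fall back to `"unknown"` rather than raising.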



    ## Other stuff

    - [Amazon's official guide to Getting Started with AWS CLI and Transcribe](https://docs.aws.amazon.com/transcribe/latest/dg/getting-started.html)
    6,897 changes: 6,897 additions & 0 deletions transcript-senate-bennett.json
    6,897 additions, 0 deletions not shown because the diff is too large. Please use a local Git client to view these changes.
  2. dannguyen revised this gist Jan 25, 2019. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion cardib-politics-talk-transcribe.md
    @@ -106,7 +106,7 @@ Ran a Transcribe job on [President Trump's presser today](https://twitter.com/Ma

    - [original tweet/clip](https://twitter.com/ManInTheHoody/status/1088578862127013888)
    - [mp3](http://data.danwin.com/trump-grocery.mp3)
    - [full JSON response](#file-transcript-trump-presser)
    - [full JSON response](#file-transcript-trump-presser-json)

    Plaintext of the transcription, with leading/trailing words trimmed, and the reporter's question in italics -- it's pretty good, all things considered.

  3. dannguyen revised this gist Jan 25, 2019. 2 changed files with 1904 additions and 0 deletions.
    15 changes: 15 additions & 0 deletions cardib-politics-talk-transcribe.md
    @@ -100,6 +100,21 @@ For convenience's sake, here's a [screenshot of a transcription tweet that @Jord
    **How much did it cost?** AWS Transcribe [charges $0.0004 per second](https://aws.amazon.com/transcribe/pricing/). This clip was 57 seconds. Not counting the S3 upload/storage fee, the price for transcription comes out to about **2.3 cents**.
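The arithmetic behind that estimate, as a small sketch. The rate is the per-second price quoted above; `transcribe_cost_usd` is just an illustrative helper name, and S3 charges are left out.

```python
from decimal import Decimal

# Per-second price quoted in the text above (S3 fees not included).
RATE_PER_SECOND = Decimal("0.0004")


def transcribe_cost_usd(seconds):
    """Return the transcription cost in US dollars for a clip length in seconds."""
    return RATE_PER_SECOND * Decimal(seconds)


cost = transcribe_cost_usd(57)
print(f"${cost} (about {cost * 100:.1f} cents)")  # $0.0228 (about 2.3 cents)
```

`Decimal` is used instead of floats so the small per-second rate multiplies exactly.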


    ### Update [2019-01-24]

    Ran a Transcribe job on [President Trump's presser today](https://twitter.com/ManInTheHoody/status/1088578862127013888), regarding the shutdown and something about working with groceries and banks. Here's how Transcribe does with multiple speakers (e.g. Trump, and the reporter):

    - [original tweet/clip](https://twitter.com/ManInTheHoody/status/1088578862127013888)
    - [mp3](http://data.danwin.com/trump-grocery.mp3)
    - [full JSON response](#file-transcript-trump-presser)

    Plaintext of the transcription, with leading/trailing words trimmed, and the reporter's question in italics -- it's pretty good, all things considered.

    > *Ross said that he doesn't understand. What federal workers, we help getting food. You Can you understand that?*
    > I haven't. I haven't heard the statement, but I do understand. And perhaps you should have said it differently. Local people know who they are when they go for groceries and everything else. And I think what Wilbur is probably trying to say is that they will work along. I know banks have working along. If you have mortgages, the mortgages and mortgage, the folks collecting the interest and all of those things, they work along. And that's what happens in time like this. They know the people. They've been dealing with them for years, and they work along the grocery store. And I think that's probably what Wilbur Ross. But I haven't seen a statement, but he's done a great job, I will tell you that.
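The plaintext above can also be rebuilt from the `items` array of the JSON response, which makes it easy to see which words Transcribe was unsure about. Below is a minimal sketch that relies only on the fields visible in `transcript-trump-presser.json` (`type`, `start_time`, and `alternatives[0]` with `content` and `confidence`). The function names and the 0.7 threshold are arbitrary choices for illustration.

```python
def rebuild_transcript(items):
    """Join word and punctuation items back into a single string."""
    text = ""
    for item in items:
        content = item["alternatives"][0]["content"]
        # Punctuation attaches to the previous word; words get a leading space.
        if item["type"] == "punctuation" or not text:
            text += content
        else:
            text += " " + content
    return text


def low_confidence_words(items, threshold=0.7):
    """Return (word, start_time) pairs whose confidence is below the threshold."""
    flagged = []
    for item in items:
        if item["type"] != "pronunciation":
            continue  # punctuation items have a null confidence
        best = item["alternatives"][0]
        if float(best["confidence"]) < threshold:
            flagged.append((best["content"], item["start_time"]))
    return flagged
```

To run it on the real file, load the response with `json.load` and pass `data["results"]["items"]` to either function.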

    ## Other stuff

    - [Amazon's official guide to Getting Started with AWS CLI and Transcribe](https://docs.aws.amazon.com/transcribe/latest/dg/getting-started.html)
    1,889 changes: 1,889 additions & 0 deletions transcript-trump-presser.json
    @@ -0,0 +1,1889 @@
    {
    "jobName": "trump-grocery",
    "accountId": "263510883111",
    "results": {
    "transcripts": [
    {
    "transcript": "Brother over. Ross said that he doesn't understand. What federal workers, we help getting food. You Can you understand that? I haven't. I haven't heard the statement, but I do understand. And perhaps you should have said it differently. Local people know who they are when they go for groceries and everything else. And I think what Wilbur is probably trying to say is that they will work along. I know banks have working along. If you have mortgages, the mortgages and mortgage, the folks collecting the interest and all of those things, they work along. And that's what happens in time like this. They know the people. They've been dealing with them for years, and they work along the grocery store. And I think that's probably what Wilbur Ross. But I haven't seen a statement, but he's done a great job, I will tell you that. Yes, you were"
    }
    ],
    "items": [
    {
    "start_time": "0.47",
    "end_time": "0.79",
    "alternatives": [
    {
    "confidence": "0.6249",
    "content": "Brother"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "1.0",
    "end_time": "1.34",
    "alternatives": [
    {
    "confidence": "0.9933",
    "content": "over"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "1.34",
    "end_time": "1.76",
    "alternatives": [
    {
    "confidence": "0.9979",
    "content": "Ross"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "1.76",
    "end_time": "1.93",
    "alternatives": [
    {
    "confidence": "0.9933",
    "content": "said"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "1.93",
    "end_time": "2.03",
    "alternatives": [
    {
    "confidence": "0.9960",
    "content": "that"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "2.03",
    "end_time": "2.13",
    "alternatives": [
    {
    "confidence": "0.9951",
    "content": "he"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "2.14",
    "end_time": "2.34",
    "alternatives": [
    {
    "confidence": "0.6273",
    "content": "doesn't"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "2.34",
    "end_time": "2.85",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "understand"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "2.85",
    "end_time": "2.97",
    "alternatives": [
    {
    "confidence": "0.8847",
    "content": "What"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "2.98",
    "end_time": "3.27",
    "alternatives": [
    {
    "confidence": "0.9580",
    "content": "federal"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "3.28",
    "end_time": "3.66",
    "alternatives": [
    {
    "confidence": "0.9974",
    "content": "workers"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": ","
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "3.67",
    "end_time": "3.86",
    "alternatives": [
    {
    "confidence": "0.7832",
    "content": "we"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "3.86",
    "end_time": "4.11",
    "alternatives": [
    {
    "confidence": "0.8871",
    "content": "help"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "4.11",
    "end_time": "4.47",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "getting"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "4.48",
    "end_time": "4.85",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "food"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "4.86",
    "end_time": "5.0",
    "alternatives": [
    {
    "confidence": "0.6305",
    "content": "You"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "5.0",
    "end_time": "5.13",
    "alternatives": [
    {
    "confidence": "0.9996",
    "content": "Can"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "5.13",
    "end_time": "5.25",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "you"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "5.25",
    "end_time": "5.69",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "understand"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "5.69",
    "end_time": "5.81",
    "alternatives": [
    {
    "confidence": "0.6436",
    "content": "that"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "?"
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "5.81",
    "end_time": "5.86",
    "alternatives": [
    {
    "confidence": "0.9949",
    "content": "I"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "5.86",
    "end_time": "6.24",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "haven't"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "6.25",
    "end_time": "6.35",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "I"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "6.35",
    "end_time": "6.69",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "haven't"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "6.69",
    "end_time": "6.85",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "heard"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "6.85",
    "end_time": "6.93",
    "alternatives": [
    {
    "confidence": "0.9972",
    "content": "the"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "6.93",
    "end_time": "7.3",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "statement"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": ","
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "7.3",
    "end_time": "7.43",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "but"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "7.43",
    "end_time": "7.75",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "I"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "7.76",
    "end_time": "8.07",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "do"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "8.08",
    "end_time": "8.81",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "understand"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "8.81",
    "end_time": "8.92",
    "alternatives": [
    {
    "confidence": "0.8735",
    "content": "And"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "8.92",
    "end_time": "9.32",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "perhaps"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "9.32",
    "end_time": "9.4",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "you"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "9.4",
    "end_time": "9.58",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "should"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "9.58",
    "end_time": "9.67",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "have"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "9.67",
    "end_time": "9.86",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "said"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "9.86",
    "end_time": "9.94",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "it"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "9.94",
    "end_time": "10.46",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "differently"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "12.44",
    "end_time": "12.88",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "Local"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "12.88",
    "end_time": "13.17",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "people"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "13.17",
    "end_time": "13.65",
    "alternatives": [
    {
    "confidence": "0.5406",
    "content": "know"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "14.24",
    "end_time": "14.47",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "who"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "14.47",
    "end_time": "14.65",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "they"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "14.65",
    "end_time": "14.93",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "are"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "14.94",
    "end_time": "15.08",
    "alternatives": [
    {
    "confidence": "0.9695",
    "content": "when"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "15.08",
    "end_time": "15.28",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "they"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "15.29",
    "end_time": "15.73",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "go"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "15.73",
    "end_time": "15.89",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "for"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "15.89",
    "end_time": "16.39",
    "alternatives": [
    {
    "confidence": "0.7787",
    "content": "groceries"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "16.39",
    "end_time": "16.49",
    "alternatives": [
    {
    "confidence": "0.9985",
    "content": "and"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "16.49",
    "end_time": "16.87",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "everything"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "16.87",
    "end_time": "17.18",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "else"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "17.18",
    "end_time": "17.3",
    "alternatives": [
    {
    "confidence": "0.9869",
    "content": "And"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "17.3",
    "end_time": "17.35",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "I"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "17.35",
    "end_time": "17.51",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "think"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "17.52",
    "end_time": "17.69",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "what"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "17.69",
    "end_time": "17.99",
    "alternatives": [
    {
    "confidence": "0.7347",
    "content": "Wilbur"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "17.99",
    "end_time": "18.11",
    "alternatives": [
    {
    "confidence": "0.7762",
    "content": "is"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "18.11",
    "end_time": "18.54",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "probably"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "18.54",
    "end_time": "18.82",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "trying"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "18.82",
    "end_time": "18.91",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "to"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "18.91",
    "end_time": "19.19",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "say"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "19.19",
    "end_time": "19.32",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "is"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "19.32",
    "end_time": "19.55",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "that"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "20.03",
    "end_time": "20.36",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "they"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "20.36",
    "end_time": "20.93",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "will"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "21.06",
    "end_time": "21.37",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "work"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "21.37",
    "end_time": "21.79",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "along"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "21.79",
    "end_time": "21.83",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "I"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "21.83",
    "end_time": "22.01",
    "alternatives": [
    {
    "confidence": "0.9351",
    "content": "know"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "22.01",
    "end_time": "22.32",
    "alternatives": [
    {
    "confidence": "0.9950",
    "content": "banks"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "22.32",
    "end_time": "22.44",
    "alternatives": [
    {
    "confidence": "0.5503",
    "content": "have"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "22.44",
    "end_time": "22.73",
    "alternatives": [
    {
    "confidence": "0.9973",
    "content": "working"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "22.73",
    "end_time": "23.19",
    "alternatives": [
    {
    "confidence": "0.9998",
    "content": "along"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "23.66",
    "end_time": "24.17",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "If"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "24.18",
    "end_time": "24.34",
    "alternatives": [
    {
    "confidence": "0.8498",
    "content": "you"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "24.34",
    "end_time": "24.44",
    "alternatives": [
    {
    "confidence": "0.9321",
    "content": "have"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "24.45",
    "end_time": "25.01",
    "alternatives": [
    {
    "confidence": "0.9785",
    "content": "mortgages"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": ","
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "25.01",
    "end_time": "25.15",
    "alternatives": [
    {
    "confidence": "0.9988",
    "content": "the"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "25.16",
    "end_time": "25.8",
    "alternatives": [
    {
    "confidence": "0.9399",
    "content": "mortgages"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "25.8",
    "end_time": "25.89",
    "alternatives": [
    {
    "confidence": "0.8193",
    "content": "and"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "25.89",
    "end_time": "26.25",
    "alternatives": [
    {
    "confidence": "0.9820",
    "content": "mortgage"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": ","
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "26.94",
    "end_time": "27.18",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "the"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "27.19",
    "end_time": "27.73",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "folks"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "28.03",
    "end_time": "28.83",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "collecting"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "28.84",
    "end_time": "29.32",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "the"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "29.33",
    "end_time": "29.92",
    "alternatives": [
    {
    "confidence": "0.9697",
    "content": "interest"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "29.92",
    "end_time": "30.12",
    "alternatives": [
    {
    "confidence": "0.9990",
    "content": "and"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "30.12",
    "end_time": "30.37",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "all"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "30.37",
    "end_time": "30.47",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "of"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "30.47",
    "end_time": "30.71",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "those"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "30.71",
    "end_time": "30.97",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "things"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": ","
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "30.97",
    "end_time": "31.07",
    "alternatives": [
    {
    "confidence": "0.9775",
    "content": "they"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "31.07",
    "end_time": "31.29",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "work"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "31.29",
    "end_time": "31.67",
    "alternatives": [
    {
    "confidence": "0.9967",
    "content": "along"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "31.92",
    "end_time": "32.06",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "And"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "32.06",
    "end_time": "32.24",
    "alternatives": [
    {
    "confidence": "0.9996",
    "content": "that's"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "32.24",
    "end_time": "32.38",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "what"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "32.38",
    "end_time": "32.78",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "happens"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "32.78",
    "end_time": "32.9",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "in"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "32.9",
    "end_time": "33.18",
    "alternatives": [
    {
    "confidence": "0.5632",
    "content": "time"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "33.18",
    "end_time": "33.39",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "like"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "33.39",
    "end_time": "33.54",
    "alternatives": [
    {
    "confidence": "0.9908",
    "content": "this"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "33.54",
    "end_time": "33.64",
    "alternatives": [
    {
    "confidence": "0.9999",
    "content": "They"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "33.64",
    "end_time": "33.83",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "know"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "33.83",
    "end_time": "33.94",
    "alternatives": [
    {
    "confidence": "0.8501",
    "content": "the"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "33.95",
    "end_time": "34.23",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "people"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "34.23",
    "end_time": "34.36",
    "alternatives": [
    {
    "confidence": "0.9666",
    "content": "They've"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "34.36",
    "end_time": "34.48",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "been"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "34.48",
    "end_time": "34.77",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "dealing"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "34.77",
    "end_time": "34.89",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "with"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "34.89",
    "end_time": "35.02",
    "alternatives": [
    {
    "confidence": "0.8215",
    "content": "them"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "35.02",
    "end_time": "35.14",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "for"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "35.14",
    "end_time": "35.54",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "years"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": ","
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "35.78",
    "end_time": "35.93",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "and"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "35.93",
    "end_time": "36.01",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "they"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "36.01",
    "end_time": "36.22",
    "alternatives": [
    {
    "confidence": "0.9980",
    "content": "work"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "36.22",
    "end_time": "36.5",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "along"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "36.5",
    "end_time": "36.6",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "the"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "36.6",
    "end_time": "37.05",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "grocery"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "37.05",
    "end_time": "37.45",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "store"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "37.96",
    "end_time": "38.13",
    "alternatives": [
    {
    "confidence": "0.9997",
    "content": "And"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "38.13",
    "end_time": "38.17",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "I"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "38.17",
    "end_time": "38.39",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "think"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "38.39",
    "end_time": "38.59",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "that's"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "38.59",
    "end_time": "39.06",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "probably"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "39.06",
    "end_time": "39.27",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "what"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "39.27",
    "end_time": "39.62",
    "alternatives": [
    {
    "confidence": "0.9602",
    "content": "Wilbur"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "39.62",
    "end_time": "40.14",
    "alternatives": [
    {
    "confidence": "0.9989",
    "content": "Ross"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "40.53",
    "end_time": "40.73",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "But"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "40.73",
    "end_time": "40.77",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "I"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "40.77",
    "end_time": "41.06",
    "alternatives": [
    {
    "confidence": "0.9988",
    "content": "haven't"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "41.06",
    "end_time": "41.24",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "seen"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "41.24",
    "end_time": "41.3",
    "alternatives": [
    {
    "confidence": "0.9408",
    "content": "a"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "41.3",
    "end_time": "41.78",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "statement"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": ","
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "41.79",
    "end_time": "42.0",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "but"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "42.0",
    "end_time": "42.13",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "he's"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "42.13",
    "end_time": "42.27",
    "alternatives": [
    {
    "confidence": "0.9986",
    "content": "done"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "42.27",
    "end_time": "42.34",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "a"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "42.34",
    "end_time": "42.6",
    "alternatives": [
    {
    "confidence": "0.9954",
    "content": "great"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "42.6",
    "end_time": "43.09",
    "alternatives": [
    {
    "confidence": "0.9954",
    "content": "job"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": ","
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "43.41",
    "end_time": "43.51",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "I"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "43.51",
    "end_time": "43.65",
    "alternatives": [
    {
    "confidence": "0.9997",
    "content": "will"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "43.65",
    "end_time": "43.84",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "tell"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "43.84",
    "end_time": "43.97",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "you"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "43.97",
    "end_time": "44.26",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "that"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "44.27",
    "end_time": "44.63",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "Yes"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": ","
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "46.28",
    "end_time": "46.5",
    "alternatives": [
    {
    "confidence": "0.6760",
    "content": "you"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "46.51",
    "end_time": "46.73",
    "alternatives": [
    {
    "confidence": "0.3080",
    "content": "were"
    }
    ],
    "type": "pronunciation"
    }
    ]
    },
    "status": "COMPLETED"
    }
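The per-word `confidence` values in the JSON above make it easy to flag words worth double-checking by ear. What follows is a minimal sketch, not part of the original gist. It assumes the transcript file keeps its word list under `.results.items[]`, with each entry carrying a `type` and an `alternatives[0].confidence` stored as a string. The function name `low_confidence_words` and the 0.7 default threshold are hypothetical choices for illustration.

```sh
# List words whose top alternative scored below a confidence threshold.
# Reads a Transcribe result JSON on stdin; prints "word<TAB>confidence".
# (low_confidence_words and its 0.7 default are illustrative, not from the gist.)
low_confidence_words() {
  threshold="${1:-0.7}"
  jq --raw-output --arg t "$threshold" '
    .results.items[]
    | select(.type == "pronunciation")      # punctuation has a null confidence
    | .alternatives[0]
    | select((.confidence | tonumber) < ($t | tonumber))
    | "\(.content)\t\(.confidence)"
  '
}
```

Usage would look something like `low_confidence_words 0.7 < transcript.json`, where `transcript.json` is whatever local copy of the result file you have. Among the items visible above, that threshold would flag only the trailing "you" (0.6760) and "were" (0.3080).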
  4. dannguyen revised this gist Jan 18, 2019. 1 changed file with 4 additions and 4 deletions.
    8 changes: 4 additions & 4 deletions cardib-politics-talk-transcribe.md
    @@ -31,10 +31,10 @@ The best way to learn-and-use the command-line is to practice the [UNIX philosop

    - Find a tweet containing a video you like
    - Get that tweet's URL, e.g. https://twitter.com/JordanUhl/status/1085669288051175424
    - Use **youtube-dl** to download the video from that tweet and save it to disk, e.g. `cardib.mp4`
    - Because AWS Transcribe requires we send it an audio file, use **ffmpeg** to extract the audio from `cardib.mp4` and save it to `cardib.mp3`
    - Because AWS Transcribe only works on audio files stored on AWS S3, we use `awscli` to upload `cardib.mp3` to an online S3 bucket, e.g. http://data.danwin.com/tmp/cardib.mp3
    - Use `awscli` to access the AWS Transcribe API and [start a transcription job](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/transcribe.html?highlight=transcribe#TranscribeService.Client.start_transcription_job)
    - Use **youtube-dl** to download the video from that tweet and save it to disk, e.g. `cardib-shutdown.mp4`
    - Because AWS Transcribe requires we send it an audio file, use **ffmpeg** to extract the audio from `cardib-shutdown.mp4` and save it to `cardib-shutdown.mp3`
    - Because AWS Transcribe only works on audio files stored on AWS S3, we use **awscli** to upload `cardib-shutdown.mp3` to an online S3 bucket, e.g. http://data.danwin.com/tmp/cardib-shutdown.mp3
    - Use **awscli** to access the AWS Transcribe API and [start a transcription job](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/transcribe.html?highlight=transcribe#TranscribeService.Client.start_transcription_job)
    - Wait a couple of minutes, and/or use **awscli** to occasionally [get the details](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/transcribe.html?highlight=transcribe#TranscribeService.Client.get_transcription_job) of the transcription job to see if it's finished ([sample response JSON from the get-transcription-job endpoint](#file-get-transcript-job-json))
    - Use **curl** to download the transcript data from the expected URL, e.g. http://data.danwin.com/cardib-shutdown.json (see [pretty preview here](#file-transcript-file-json))
    - Use **jq** to process the [transcript data](#file-transcript-file-json) and extract the `transcript` value, which contains the transcription text as a single string.
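The "wait a couple of minutes and check occasionally" step in the list above can be automated with a small polling loop. This is a rough sketch rather than anything from the original script. It assumes the `get-transcription-job` response exposes the state at `TranscriptionJob.TranscriptionJobStatus` (with `COMPLETED` and `FAILED` as terminal values) and relies on the AWS CLI's general `--query` and `--output text` options. The function name `wait_for_transcription` and the 15-second default interval are made up for illustration.

```sh
# Poll a Transcribe job until it reaches a terminal state.
# Usage: wait_for_transcription JOB_NAME [SECONDS_BETWEEN_POLLS]
# (wait_for_transcription and the 15s default are illustrative names/values.)
wait_for_transcription() {
  job_name="$1"
  interval="${2:-15}"
  while :; do
    # Ask only for the status field, as plain text
    status=$(aws transcribe get-transcription-job \
      --transcription-job-name "$job_name" \
      --query 'TranscriptionJob.TranscriptionJobStatus' \
      --output text) || return 1
    case "$status" in
      COMPLETED) echo "$status"; return 0 ;;
      FAILED)    echo "$status" >&2; return 1 ;;
      *)         sleep "$interval" ;;   # still queued or in progress
    esac
  done
}
```

With that defined, the last two steps could be chained as `wait_for_transcription cardib-shutdown && curl -s http://data.danwin.com/cardib-shutdown.json | jq --raw-output '.results.transcripts[0].transcript'`.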
  5. dannguyen revised this gist Jan 18, 2019. 1 changed file with 4 additions and 1 deletion.
    5 changes: 4 additions & 1 deletion cardib-politics-talk-transcribe.md
    @@ -92,7 +92,10 @@ Here's what Cardi B said, according to AWS Transcribe, which you can read along
    > I feel that we need to take some action. I don't know what type of actual base because it is not what I do, but I'm scared. This is crazy. And I really feel bad for these people. They got to go to fucking work, to not get motherfucking paid.
    **The verdict?** Not bad! You can see the word-by-word confidence in the [full transcript JSON](#file-transcript-file-json), but I'm impressed with the simple text output, which contains capitalization of proper nouns (e.g. "Obama") and guesses at where sentences begin, never mind a pretty good understanding of Cardi B's Bronx accent. It stumbles on very fast cuss words -- "yall mother fuckas" is "your mother focus" and "check that pussy" becomes "take a piss". But it also manages to accurately transcribe fast and unusual phrases like "gynecologist with no motherfucking problem".

    For convenience's sake, here's a [screenshot of a transcription tweet that @JordanUhl sent out later](https://twitter.com/JordanUhl/status/1085669288051175424):

    **The verdict?** Not bad! You can see the word-by-word confidence in the [full transcript JSON](#file-transcript-file-json), but I'm impressed with the simple text output, which contains capitalization of proper nouns (e.g. "Obama") and guesses at where sentences begin, never mind a pretty good understanding of Cardi B's Bronx accent. It stumbles on very fast cuss words -- *"yall mother fuckas"* is *"your mother focus"* and *"check that pussy"* becomes *"take a piss"*. But it also manages to accurately transcribe fast and unusual phrases like *"in the gynecologist with no motherfucking problem"*.

    **How much did it cost?** AWS Transcribe [charges $0.0004 per second](https://aws.amazon.com/transcribe/pricing/). This clip was 57 seconds. Not counting the S3 upload/storage fee, the price for transcription comes out to about **2.3 cents**.
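The arithmetic behind that figure is just seconds times rate. Here is a small sketch, assuming nothing more than the $0.0004-per-second rate quoted above, and ignoring any billing minimums, free tier or later price changes. The helper name `transcribe_cost_cents` is hypothetical.

```sh
# Estimate the Transcribe charge, in cents, for a clip of a given length.
# The per-second rate defaults to the $0.0004 quoted in the text.
# (transcribe_cost_cents is an illustrative helper, not part of any AWS tool.)
transcribe_cost_cents() {
  seconds="$1"
  rate_per_second="${2:-0.0004}"
  awk -v s="$seconds" -v r="$rate_per_second" \
    'BEGIN { printf "%.2f\n", s * r * 100 }'
}

transcribe_cost_cents 57   # 57 s * $0.0004/s = $0.0228, i.e. 2.28 cents
```

Passing a second argument lets you plug in a different rate if the pricing page has changed since this was written.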

  6. dannguyen revised this gist Jan 18, 2019. 1 changed file with 6 additions and 1 deletion.
    7 changes: 6 additions & 1 deletion cardib-politics-talk-transcribe.md
    @@ -94,4 +94,9 @@ Here's what Cardi B said, according to AWS Transcribe, which you can read along
    **The verdict?** Not bad! You can see the word-by-word confidence in the [full transcript JSON](#file-transcript-file-json), but I'm impressed with the simple text output, which contains capitalization of proper nouns (e.g. "Obama") and guesses at where sentences begin, never mind a pretty good understanding of Cardi B's Bronx accent. It stumbles on very fast cuss words -- "yall mother fuckas" is "your mother focus" and "check that pussy" becomes "take a piss". But it also manages to accurately transcribe fast and unusual phrases like "gynecologist with no motherfucking problem".

    **How much did it cost?** AWS Transcribe [charges $0.0004 per second](https://aws.amazon.com/transcribe/pricing/). This clip was 57 seconds. Not counting the S3 upload/storage fee, the price for transcription comes out to about **2.3 cents**.
    **How much did it cost?** AWS Transcribe [charges $0.0004 per second](https://aws.amazon.com/transcribe/pricing/). This clip was 57 seconds. Not counting the S3 upload/storage fee, the price for transcription comes out to about **2.3 cents**.


    ## Other stuff

    - [Amazon's official guide to Getting Started with AWS CLI and Transcribe](https://docs.aws.amazon.com/transcribe/latest/dg/getting-started.html)
  7. dannguyen revised this gist Jan 18, 2019. 1 changed file with 3 additions and 2 deletions.
    5 changes: 3 additions & 2 deletions cardib-politics-talk-transcribe.md
    @@ -13,9 +13,10 @@ See the [transcribed text here](#transcription-results), and the [full prettifie

    ## Requirements

    Sign up for Amazon Web Services: https://aws.amazon.com
    - Sign up for Amazon Web Services: https://aws.amazon.com
    - [Create an S3 bucket](https://docs.aws.amazon.com/AmazonS3/latest/user-guide/create-bucket.html) (the example I use in my code is `data.danwin.com`)

    Install:
    And install the following tools (using homebrew, pip, and what-have-you)

    - [youtube-dl](https://rg3.github.io/youtube-dl/) - for fetching video files from social media services
    - [awscli](https://aws.amazon.com/cli/) - for accessing various AWS services, specifically S3 (for storing the video and its processed transcription) and Transcribe
  8. dannguyen revised this gist Jan 18, 2019. 1 changed file with 5 additions and 5 deletions.
    10 changes: 5 additions & 5 deletions cardib-politics-talk-transcribe.md
    @@ -6,7 +6,7 @@ Inspired by the following exchange on Twitter, in [which someone captures and po

    The instructions and code below show how to use command-line tools/scripting and [Amazon's Transcribe service](https://aws.amazon.com/transcribe/) to transcribe the audio from online video. tl;dr: AWS Transcribe is a pretty amazing service!

    **tl;dr** [AWS Transcribe](https://aws.amazon.com/transcribe/) is surprisingly accurate and efficient. It took about **2 minutes** for it to process a **57-second clip** at a cost of less than **2.5 cents**.
    **tl;dr** [AWS Transcribe](https://aws.amazon.com/transcribe/) is surprisingly accurate and efficient. It took about **2 minutes** for it to process a **57-second clip** at a cost of less than **2.5 cents**. It beats the pants off of what I remember [IBM Watson was capable of doing](https://github.com/dannguyen/watson-word-watcher) (albeit, from a few years ago).

    See the [transcribed text here](#transcription-results), and the [full prettified JSON response here](#file-transcript-file-json).

    @@ -85,12 +85,12 @@ Here's what Cardi B said, according to AWS Transcribe, which you can read along

    > Hey. Yeah. I just want to remind you because there's been a little bit over three weeks, okay? It's been a little bit over three weeks. Trump is now ordering as his some missing federal government workers to go back to work without getting paid.
    > Now, I don't want to hear your mother focus talking about all but Obama Shut down the government for seventeen days year bitch for health care. So your grandma could check her blood pressure and your business to go take a piss in the gynecologist with no motherfucking problem. Now, I know a lot of guys don't care because I don't work for the government or your partner.
    > Now, I don't want to hear your mother focus talking about all but Obama Shut down the government for seventeen days year bitch for health care. So your grandma could check her blood pressure and your business to go take a piss in the gynecologist with no motherfucking problem.
    > They have a job, but this shit is really fucking serious, bro. This city is crazy. Like a country is in a hell hole right now. All for fucking war. And we really need to take this serious. I feel that we need to take some action. I don't know what type of actual base because it is not what I do, but I'm scared.
    > Now, I know a lot of guys don't care because I don't work for the government or your partner. They have a job, but this shit is really fucking serious, bro. This city is crazy. Like a country is in a hell hole right now. All for fucking war. And we really need to take this serious.
    > This is crazy. And I really feel bad for these people. They got to go to fucking work, to not get motherfucking paid.
    > I feel that we need to take some action. I don't know what type of actual base because it is not what I do, but I'm scared. This is crazy. And I really feel bad for these people. They got to go to fucking work, to not get motherfucking paid.
    **The verdict?** Not bad! You can see the word-by-word confidence in the [full transcript JSON](#file-transcript-file-json), but I'm impressed with the simple text output, which contains capitalization of proper nouns (e.g. "Obama") and guesses at where sentences begin, never mind a pretty good understanding of Cardi B's Bronx accent.
    **The verdict?** Not bad! You can see the word-by-word confidence in the [full transcript JSON](#file-transcript-file-json), but I'm impressed with the simple text output, which contains capitalization of proper nouns (e.g. "Obama") and guesses at where sentences begin, never mind a pretty good understanding of Cardi B's Bronx accent. It stumbles on very fast cuss words -- "yall mother fuckas" is "your mother focus" and "check that pussy" becomes "take a piss". But it also manages to accurately transcribe fast and unusual phrases like "gynecologist with no motherfucking problem".

    **How much did it cost?** AWS Transcribe [charges $0.0004 per second](https://aws.amazon.com/transcribe/pricing/). This clip was 57 seconds. Not counting the S3 upload/storage fee, the price for transcription comes out to about **2.3 cents**.
  9. dannguyen revised this gist Jan 18, 2019. 1 changed file with 4 additions and 3 deletions.
    7 changes: 4 additions & 3 deletions cardib-politics-talk-transcribe.md
    @@ -81,13 +81,14 @@ curl -s http://data.danwin.com/cardib-shutdown.json \

    ## Transcription results

    Here's what Cardi B said, according to AWS Transcribe, which you can read along with the [audio](http://data.danwin.com/tmp/cardib-shutdown.mp3) or the [original tweet video](https://twitter.com/JordanUhl/status/1085669288051175424):

    (I've added some paragraph breaks for easier reading)
    Here's what Cardi B said, according to AWS Transcribe, which you can read along with the [audio](http://data.danwin.com/tmp/cardib-shutdown.mp3) or the [original tweet video](https://twitter.com/JordanUhl/status/1085669288051175424). I've added some paragraph breaks for easier reading, but the period/sentence-breaks are all from the AWS Transcribe service:

    > Hey. Yeah. I just want to remind you because there's been a little bit over three weeks, okay? It's been a little bit over three weeks. Trump is now ordering as his some missing federal government workers to go back to work without getting paid.
    > Now, I don't want to hear your mother focus talking about all but Obama Shut down the government for seventeen days year bitch for health care. So your grandma could check her blood pressure and your business to go take a piss in the gynecologist with no motherfucking problem. Now, I know a lot of guys don't care because I don't work for the government or your partner.
    > They have a job, but this shit is really fucking serious, bro. This city is crazy. Like a country is in a hell hole right now. All for fucking war. And we really need to take this serious. I feel that we need to take some action. I don't know what type of actual base because it is not what I do, but I'm scared.
    > This is crazy. And I really feel bad for these people. They got to go to fucking work, to not get motherfucking paid.
    **The verdict?** Not bad! You can see the word-by-word confidence in the [full transcript JSON](#file-transcript-file-json), but I'm impressed with the simple text output, which contains capitalization of proper nouns (e.g. "Obama") and guesses at where sentences begin, never mind a pretty good understanding of Cardi B's Bronx accent.
  10. dannguyen revised this gist Jan 18, 2019. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion cardib-politics-talk-transcribe.md
    @@ -1,4 +1,4 @@
    # Transcribing Cardi B's political speech with command-line tools
    # Transcribing Cardi B's political speech with AWS Transcribe and command-line tools

    Inspired by the following exchange on Twitter, in [which someone captures and posts a valuable video](https://twitter.com/JordanUhl/status/1085669288051175424) onto Twitter, but doesn't have the resources to [easily transcribe it for the hearing-impaired](https://twitter.com/riotpedestrian/status/1085753671726452736), I thought it'd be fun to try out Amazon's [AWS Transcribe service](https://aws.amazon.com/transcribe/) to help with this problem, and to see if I could do it all from the bash command-line like a Unix dork.

  11. dannguyen revised this gist Jan 18, 2019. 1 changed file with 8 additions and 5 deletions.
    13 changes: 8 additions & 5 deletions cardib-politics-talk-transcribe.md
    @@ -1,13 +1,14 @@
    # Transcribing Cardi B's political speech with command-line tools

    Inspired by the following exchange on Twitter, in [which someone captures and posts a valuable video](https://twitter.com/JordanUhl/status/1085669288051175424) onto Twitter, but doesn't have the resources to [easily transcribe it for the hearing-impaired](https://twitter.com/riotpedestrian/status/1085753671726452736):

    Inspired by the following exchange on Twitter, in [which someone captures and posts a valuable video](https://twitter.com/JordanUhl/status/1085669288051175424) onto Twitter, but doesn't have the resources to [easily transcribe it for the hearing-impaired](https://twitter.com/riotpedestrian/status/1085753671726452736), I thought it'd be fun to try out Amazon's [AWS Transcribe service](https://aws.amazon.com/transcribe/) to help with this problem, and to see if I could do it all from the bash command-line like a Unix dork.

    ![Screencap of @jordanuhl's video tweet, followed by a request for a transcript](https://user-images.githubusercontent.com/121520/51369686-23e59b00-1aba-11e9-87db-e145f8d2cda5.png)

    The instructions and code below show how to use command-line tools/scripting and [Amazon's Transcribe service](https://aws.amazon.com/transcribe/) to transcribe the audio from online video. tl;dr: AWS Transcribe is a pretty amazing service!

    **tl;dr** [AWS Transcribe](https://aws.amazon.com/transcribe/) is surprisingly accurate and efficient. It took about **2 minutes** for it to process a **57-second clip** at a cost of less than **2.5 cents**. See the transcribed text here, and the full prettified JSON response here.
    **tl;dr** [AWS Transcribe](https://aws.amazon.com/transcribe/) is surprisingly accurate and efficient. It took about **2 minutes** for it to process a **57-second clip** at a cost of less than **2.5 cents**.

    See the [transcribed text here](#transcription-results), and the [full prettified JSON response here](#file-transcript-file-json).


    ## Requirements
    @@ -25,6 +26,8 @@ Install:

    ## The steps

    The best way to learn-and-use the command-line is to practice the [UNIX philosophy](https://en.wikipedia.org/wiki/Unix_philosophy) of **do one thing and do it well**, which requires breaking the process down into individual steps:

    - Find a tweet containing a video you like
    - Get that tweet's URL, e.g. https://twitter.com/JordanUhl/status/1085669288051175424
    - Use **youtube-dl** to download the video from that tweet and save it to disk, e.g. `cardib.mp4`
    @@ -33,12 +36,12 @@ Install:
    - Use `awscli` to access the AWS Transcribe API and [start a transcription job](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/transcribe.html?highlight=transcribe#TranscribeService.Client.start_transcription_job)
    - Wait a couple of minutes, and/or use **awscli** to occasionally [get the details](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/transcribe.html?highlight=transcribe#TranscribeService.Client.get_transcription_job) of the transcription job to see if it's finished ([sample response JSON from the get-transcription-job endpoint](#file-get-transcript-job-json))
    - Use **curl** to download the transcript data from the expected URL, e.g. http://data.danwin.com/cardib-shutdown.json (see [pretty preview here](#file-transcript-file-json))
    - Use **jq** to process the [transcript data](#file-transcript-file-json) and extract the `transcript` value, which contains the transcription text as a single string
    - Use **jq** to process the [transcript data](#file-transcript-file-json) and extract the `transcript` value, which contains the transcription text as a single string.


    ## The script

    So obviously you should **not** do this as a big ol bash script (or even bash/CLI at all). But I wrote this example up for a talk on how you can learn the CLI by messing around for fun, and this is an elaborate example of the pain you can put yourself through.
    So obviously you should **not** do this as a big ol bash script (or even bash/CLI at all). But I wrote this example up for a talk on how you can learn the CLI by messing around for fun, and this is an elaborate example of the pain you can put yourself through. Maybe later I'll show how to approach it as a novice but this is what it looks like if you're trying to not care too much, but also not wanting it to be too painful:


    ~~~sh
  12. dannguyen revised this gist Jan 18, 2019. 1 changed file with 18 additions and 1 deletion.
    19 changes: 18 additions & 1 deletion cardib-politics-talk-transcribe.md
    @@ -5,7 +5,9 @@ Inspired by the following exchange on Twitter, in [which someone captures and po

    ![Screencap of @jordanuhl's video tweet, followed by a request for a transcript](https://user-images.githubusercontent.com/121520/51369686-23e59b00-1aba-11e9-87db-e145f8d2cda5.png)

    The instructions and code below show how to use command-line tools/scripting and [Amazon's Transcribe service](https://console.aws.amazon.com/transcribe) to transcribe the audio from online video. tl;dr: AWS Transcribe is a pretty amazing service!
    The instructions and code below show how to use command-line tools/scripting and [Amazon's Transcribe service](https://aws.amazon.com/transcribe/) to transcribe the audio from online video. tl;dr: AWS Transcribe is a pretty amazing service!

    **tl;dr** [AWS Transcribe](https://aws.amazon.com/transcribe/) is surprisingly accurate and efficient. It took about **2 minutes** for it to process a **57-second clip** at a cost of less than **2.5 cents**. See the transcribed text here, and the full prettified JSON response here.


    ## Requirements
    @@ -73,3 +75,18 @@ aws transcribe get-transcription-job \
    curl -s http://data.danwin.com/cardib-shutdown.json \
    | jq '.results.transcripts[0].transcript' --raw-output
    ~~~

    ## Transcription results

    Here's what Cardi B said, according to AWS Transcribe, which you can read along with the [audio](http://data.danwin.com/tmp/cardib-shutdown.mp3) or the [original tweet video](https://twitter.com/JordanUhl/status/1085669288051175424):

    (I've added some paragraph breaks for easier reading)

    > Hey. Yeah. I just want to remind you because there's been a little bit over three weeks, okay? It's been a little bit over three weeks. Trump is now ordering as his some missing federal government workers to go back to work without getting paid.
    > Now, I don't want to hear your mother focus talking about all but Obama Shut down the government for seventeen days year bitch for health care. So your grandma could check her blood pressure and your business to go take a piss in the gynecologist with no motherfucking problem. Now, I know a lot of guys don't care because I don't work for the government or your partner.
    > They have a job, but this shit is really fucking serious, bro. This city is crazy. Like a country is in a hell hole right now. All for fucking war. And we really need to take this serious. I feel that we need to take some action. I don't know what type of actual base because it is not what I do, but I'm scared.
    > This is crazy. And I really feel bad for these people. They got to go to fucking work, to not get motherfucking paid.
    **The verdict?** Not bad! You can see the word-by-word confidence in the [full transcript JSON](#file-transcript-file-json), but I'm impressed with the simple text output, which contains capitalization of proper nouns (e.g. "Obama") and guesses at where sentences begin, never mind a pretty good understanding of Cardi B's Bronx accent.

    **How much did it cost?** AWS Transcribe [charges $0.0004 per second](https://aws.amazon.com/transcribe/pricing/). This clip was 57 seconds. Not counting the S3 upload/storage fee, the price for transcription comes out to about **2.3 cents**.
  13. dannguyen revised this gist Jan 18, 2019. 1 changed file with 8 additions and 5 deletions.
    13 changes: 8 additions & 5 deletions cardib-politics-talk-transcribe.md
    @@ -38,8 +38,9 @@ Install:

    So obviously you should **not** do this as a big ol bash script (or even bash/CLI at all). But I wrote this example up for a talk on how you can learn the CLI by messing around for fun, and this is an elaborate example of the pain you can put yourself through.

    ```sh
    # fetch that video and save it to the working directory

    ~~~sh
    # Fetch that video and save it to the working directory
    # as `cardib-shutdown.mp4`
    youtube-dl --output cardib-shutdown.mp4 \
    https://twitter.com/JordanUhl/status/1085669288051175424
    @@ -53,8 +54,11 @@ ffmpeg -i cardib-shutdown.mp4 \
    aws s3 cp --acl public-read \
    cardib-shutdown.mp3 s3://data.danwin.com/tmp/cardib-shutdown.mp3

    # Start the transcription job and specify that the transcription result data
    # be saved to a given bucket, e.g. data.danwin.com
    aws transcribe start-transcription-job \
    --language-code 'en-US' --media-format 'mp3' \
    --language-code 'en-US' \
    --media-format 'mp3' \
    --transcription-job-name 'cardib-shutdown' \
    --media '{"MediaFileUri": "s3://data.danwin.com/tmp/cardib-shutdown.mp3"}' \
    --output-bucket-name 'data.danwin.com'
    @@ -68,5 +72,4 @@ aws transcribe get-transcription-job \
    # and spit it out as raw text
    curl -s http://data.danwin.com/cardib-shutdown.json \
    | jq '.results.transcripts[0].transcript' --raw-output
    ```

    ~~~
  14. dannguyen revised this gist Jan 18, 2019. 1 changed file with 41 additions and 5 deletions.
    46 changes: 41 additions & 5 deletions cardib-politics-talk-transcribe.md
    @@ -7,8 +7,6 @@ Inspired by the following exchange on Twitter, in [which someone captures and po

    The instructions and code below show how to use command-line tools/scripting and [Amazon's Transcribe service](https://console.aws.amazon.com/transcribe) to transcribe the audio from online video. tl;dr: AWS Transcribe is a pretty amazing service!

    (note: you should obviously not do this as one big old bash script, but I wrote this up as an example of what CLI can do if you have some weird elaborate needs)


    ## Requirements

    @@ -31,6 +29,44 @@ Install:
    - Because AWS Transcribe requires we send it an audio file, use **ffmpeg** to extract the audio from `cardib.mp4` and save it to `cardib.mp3`
    - Because AWS Transcribe only works on audio files stored on AWS S3, we use `awscli` to upload `cardib.mp3` to an online S3 bucket, e.g. http://data.danwin.com/tmp/cardib.mp3
    - Use `awscli` to access the AWS Transcribe API and [start a transcription job](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/transcribe.html?highlight=transcribe#TranscribeService.Client.start_transcription_job)
    - Wait a couple of minutes, and/or use **awscli** to occasionally [get the details](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/transcribe.html?highlight=transcribe#TranscribeService.Client.get_transcription_job) of the transcription job to see if it's finished ([sample response JSON from the get-transcription-job endpoint](#file-get-transcript-job-json))
    - Use **curl** to download the transcript data from the expected URL, e.g. http://data.danwin.com/cardib-shutdown.json (see [pretty preview here](#file-transcript-file-json))
    - Use **jq** to process the [transcript data](#file-transcript-file-json) and extract the `transcript` value, which contains the transcription text as a single string
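
The last step only pulls out the flattened `transcript` string, but the same file also carries a per-word `confidence` score under `results.items`. As a rough sketch (not part of the original script), here is a jq filter that lists the words Transcribe was least sure about. The function name `low_confidence_words` and the default cutoff of `0.5` are made up for illustration:

```sh
# Hypothetical helper (not in the original gist): list words whose
# confidence score falls below a cutoff.
# Usage: low_confidence_words TRANSCRIPT_JSON [CUTOFF]
low_confidence_words() {
  jq -r --argjson cutoff "${2:-0.5}" '
    .results.items[]
    | select(.type == "pronunciation")   # punctuation items have a null confidence
    | .alternatives[0]
    | select((.confidence | tonumber) < $cutoff)   # confidence is stored as a string
    | "\(.confidence)\t\(.content)"
  ' "$1"
}
```

Assuming the transcript was saved locally as `cardib-shutdown.json`, running `low_confidence_words cardib-shutdown.json 0.4` would include words such as "mother" (0.3523) and "focus" (0.3799), which is a decent hint about where the transcription went wrong.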


    ## The script

    So obviously you should **not** do this as a big ol bash script (or even bash/CLI at all). But I wrote this example up for a talk on how you can learn the CLI by messing around for fun, and this is an elaborate example of the pain you can put yourself through.

    ```sh
    # fetch that video and save it to the working directory
    # as `cardib-shutdown.mp4`
    youtube-dl --output cardib-shutdown.mp4 \
    https://twitter.com/JordanUhl/status/1085669288051175424

    # extract the audio as an mp3 file
    ffmpeg -i cardib-shutdown.mp4 \
    -acodec libmp3lame cardib-shutdown.mp3

    # upload the mp3 file to an S3 bucket
    # (and optionally make it publicly readable)
    aws s3 cp --acl public-read \
    cardib-shutdown.mp3 s3://data.danwin.com/tmp/cardib-shutdown.mp3

    aws transcribe start-transcription-job \
    --language-code 'en-US' --media-format 'mp3' \
    --transcription-job-name 'cardib-shutdown' \
    --media '{"MediaFileUri": "s3://data.danwin.com/tmp/cardib-shutdown.mp3"}' \
    --output-bucket-name 'data.danwin.com'

    # optionally: use this to check the status of the job before attempting
    # to download the transcript
    aws transcribe get-transcription-job \
    --transcription-job-name cardib-shutdown

    # Download the JSON at the expected S3 URL, parse it with jq
    # and spit it out as raw text
    curl -s http://data.danwin.com/cardib-shutdown.json \
    | jq '.results.transcripts[0].transcript' --raw-output
    ```
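
Instead of waiting a couple of minutes and hoping, the status check can be wrapped in a polling loop. This is only a sketch: the function name `wait_for_transcription_job` and the 15-second default interval are my own assumptions, while `--query` and `--output text` are awscli's generic options for pulling a single field out of a response.

```sh
# Hypothetical helper (not in the original gist): block until the named
# transcription job reaches COMPLETED, or give up if it reports FAILED.
# Usage: wait_for_transcription_job JOB_NAME [SECONDS_BETWEEN_CHECKS]
wait_for_transcription_job() {
  job_name="$1"
  interval="${2:-15}"
  while true; do
    # Ask only for the status string, e.g. IN_PROGRESS or COMPLETED
    status="$(aws transcribe get-transcription-job \
      --transcription-job-name "$job_name" \
      --query 'TranscriptionJob.TranscriptionJobStatus' \
      --output text)" || return 1
    case "$status" in
      COMPLETED) return 0 ;;
      FAILED) echo "transcription job $job_name failed" >&2; return 1 ;;
      *) sleep "$interval" ;;   # still queued or in progress; check again later
    esac
  done
}
```

With that defined, something like `wait_for_transcription_job cardib-shutdown && curl -s http://data.danwin.com/cardib-shutdown.json` would only attempt the download once the job is done.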

  15. dannguyen revised this gist Jan 18, 2019. 1 changed file with 20 additions and 0 deletions.
    20 changes: 20 additions & 0 deletions get-transcript-job.json
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,20 @@
    {
    "TranscriptionJob": {
    "TranscriptionJobName": "cardib-shutdown",
    "TranscriptionJobStatus": "COMPLETED",
    "LanguageCode": "en-US",
    "MediaSampleRateHertz": 44100,
    "MediaFormat": "mp3",
    "Media": {
    "MediaFileUri": "s3://data.danwin.com/tmp/cardib-shutdown.mp3"
    },
    "Transcript": {
    "TranscriptFileUri": "https://s3.amazonaws.com/data.danwin.com/cardib-shutdown.json"
    },
    "CreationTime": 1547795428.734,
    "CompletionTime": 1547795570.152,
    "Settings": {
    "ChannelIdentification": false
    }
    }
    }
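
A response like the one above can also tell you where the transcript ended up, rather than guessing the URL. A minimal sketch, assuming the response was saved to a local file; the function name `transcript_uri` is hypothetical:

```sh
# Hypothetical helper (not in the original gist): print the transcript URL
# from a saved get-transcription-job response, but only if the job completed.
# Usage: transcript_uri RESPONSE_JSON
transcript_uri() {
  jq -r '
    .TranscriptionJob
    | select(.TranscriptionJobStatus == "COMPLETED")
    | .Transcript.TranscriptFileUri
  ' "$1"
}
```

For a job that is still running, the `select` drops the record and nothing is printed, so an empty result means "not ready yet".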
  16. dannguyen revised this gist Jan 18, 2019. 1 changed file with 2450 additions and 0 deletions.
    2,450 changes: 2,450 additions & 0 deletions transcript-file.json
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,2450 @@
    {
    "jobName": "cardib-shutdown",
    "accountId": "263510883111",
    "results": {
    "transcripts": [
    {
    "transcript": "Hey. Yeah. I just want to remind you because there's been a little bit over three weeks, okay? It's been a little bit over three weeks. Trump is now ordering as his some missing federal government workers to go back to work without getting paid. Now, I don't want to hear your mother focus talking about all but Obama Shut down the government for seventeen days year bitch for health care. So your grandma could check her blood pressure and your business to go take a piss in the gynecologist with no motherfucking problem. Now, I know a lot of guys don't care because I don't work for the government or your partner. They have a job, but this shit is really fucking serious, bro. This city is crazy. Like a country is in a hell hole right now. All for fucking war. And we really need to take this serious. I feel that we need to take some action. I don't know what type of actual base because it is not what I do, but I'm scared. This is crazy. And I really feel bad for these people. They got to go to fucking work, to not get motherfucking paid."
    }
    ],
    "items": [
    {
    "start_time": "0.09",
    "end_time": "0.38",
    "alternatives": [
    {
    "confidence": "0.9769",
    "content": "Hey"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "0.38",
    "end_time": "0.77",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "Yeah"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "0.78",
    "end_time": "0.89",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "I"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "0.89",
    "end_time": "1.09",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "just"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "1.09",
    "end_time": "1.24",
    "alternatives": [
    {
    "confidence": "0.4934",
    "content": "want"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "1.24",
    "end_time": "1.3",
    "alternatives": [
    {
    "confidence": "0.9752",
    "content": "to"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "1.31",
    "end_time": "1.74",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "remind"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "1.74",
    "end_time": "1.82",
    "alternatives": [
    {
    "confidence": "0.9990",
    "content": "you"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "1.83",
    "end_time": "2.2",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "because"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "2.2",
    "end_time": "2.33",
    "alternatives": [
    {
    "confidence": "0.9905",
    "content": "there's"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "2.33",
    "end_time": "2.52",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "been"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "2.52",
    "end_time": "2.59",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "a"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "2.59",
    "end_time": "2.81",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "little"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "2.81",
    "end_time": "2.97",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "bit"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "2.97",
    "end_time": "3.16",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "over"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "3.16",
    "end_time": "3.44",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "three"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "3.44",
    "end_time": "4.11",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "weeks"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": ","
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "4.18",
    "end_time": "4.72",
    "alternatives": [
    {
    "confidence": "0.5594",
    "content": "okay"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "?"
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "4.73",
    "end_time": "4.91",
    "alternatives": [
    {
    "confidence": "0.9994",
    "content": "It's"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "4.91",
    "end_time": "5.06",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "been"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "5.06",
    "end_time": "5.13",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "a"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "5.13",
    "end_time": "5.32",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "little"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "5.32",
    "end_time": "5.43",
    "alternatives": [
    {
    "confidence": "0.5651",
    "content": "bit"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "5.43",
    "end_time": "5.59",
    "alternatives": [
    {
    "confidence": "0.9845",
    "content": "over"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "5.6",
    "end_time": "5.82",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "three"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "5.82",
    "end_time": "6.15",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "weeks"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "6.54",
    "end_time": "7.15",
    "alternatives": [
    {
    "confidence": "0.9779",
    "content": "Trump"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "7.64",
    "end_time": "7.86",
    "alternatives": [
    {
    "confidence": "0.7704",
    "content": "is"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "7.87",
    "end_time": "8.3",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "now"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "8.31",
    "end_time": "8.94",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "ordering"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "8.94",
    "end_time": "9.11",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "as"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "9.11",
    "end_time": "9.28",
    "alternatives": [
    {
    "confidence": "0.6055",
    "content": "his"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "9.28",
    "end_time": "9.54",
    "alternatives": [
    {
    "confidence": "0.9152",
    "content": "some"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "9.54",
    "end_time": "10.1",
    "alternatives": [
    {
    "confidence": "0.9541",
    "content": "missing"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "10.43",
    "end_time": "10.86",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "federal"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "10.86",
    "end_time": "11.37",
    "alternatives": [
    {
    "confidence": "0.6420",
    "content": "government"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "11.37",
    "end_time": "11.72",
    "alternatives": [
    {
    "confidence": "0.9982",
    "content": "workers"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "11.72",
    "end_time": "11.82",
    "alternatives": [
    {
    "confidence": "0.9979",
    "content": "to"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "11.82",
    "end_time": "12.08",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "go"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "12.08",
    "end_time": "12.39",
    "alternatives": [
    {
    "confidence": "0.9311",
    "content": "back"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "12.39",
    "end_time": "12.53",
    "alternatives": [
    {
    "confidence": "0.9311",
    "content": "to"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "12.53",
    "end_time": "13.05",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "work"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "13.73",
    "end_time": "14.24",
    "alternatives": [
    {
    "confidence": "0.9964",
    "content": "without"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "14.24",
    "end_time": "14.52",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "getting"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "14.52",
    "end_time": "14.95",
    "alternatives": [
    {
    "confidence": "0.5376",
    "content": "paid"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "15.14",
    "end_time": "15.64",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "Now"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": ","
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "15.65",
    "end_time": "15.77",
    "alternatives": [
    {
    "confidence": "0.9997",
    "content": "I"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "15.77",
    "end_time": "15.91",
    "alternatives": [
    {
    "confidence": "0.9940",
    "content": "don't"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "15.91",
    "end_time": "16.04",
    "alternatives": [
    {
    "confidence": "0.9951",
    "content": "want"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "16.04",
    "end_time": "16.1",
    "alternatives": [
    {
    "confidence": "0.9992",
    "content": "to"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "16.1",
    "end_time": "16.21",
    "alternatives": [
    {
    "confidence": "0.9996",
    "content": "hear"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "16.21",
    "end_time": "16.37",
    "alternatives": [
    {
    "confidence": "0.6916",
    "content": "your"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "16.37",
    "end_time": "16.56",
    "alternatives": [
    {
    "confidence": "0.3523",
    "content": "mother"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "16.56",
    "end_time": "16.84",
    "alternatives": [
    {
    "confidence": "0.3799",
    "content": "focus"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "16.84",
    "end_time": "17.08",
    "alternatives": [
    {
    "confidence": "0.9541",
    "content": "talking"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "17.08",
    "end_time": "17.25",
    "alternatives": [
    {
    "confidence": "0.9243",
    "content": "about"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "17.26",
    "end_time": "17.54",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "all"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "17.54",
    "end_time": "17.71",
    "alternatives": [
    {
    "confidence": "0.9833",
    "content": "but"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "17.71",
    "end_time": "18.13",
    "alternatives": [
    {
    "confidence": "0.9066",
    "content": "Obama"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "18.13",
    "end_time": "18.34",
    "alternatives": [
    {
    "confidence": "0.9986",
    "content": "Shut"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "18.34",
    "end_time": "18.51",
    "alternatives": [
    {
    "confidence": "0.9986",
    "content": "down"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "18.51",
    "end_time": "18.6",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "the"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "18.6",
    "end_time": "19.0",
    "alternatives": [
    {
    "confidence": "0.8789",
    "content": "government"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "19.0",
    "end_time": "19.08",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "for"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "19.08",
    "end_time": "19.46",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "seventeen"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "19.46",
    "end_time": "19.73",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "days"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "19.87",
    "end_time": "20.36",
    "alternatives": [
    {
    "confidence": "0.6234",
    "content": "year"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "20.36",
    "end_time": "20.92",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "bitch"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "21.18",
    "end_time": "21.42",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "for"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "21.42",
    "end_time": "21.75",
    "alternatives": [
    {
    "confidence": "0.7029",
    "content": "health"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "21.76",
    "end_time": "22.25",
    "alternatives": [
    {
    "confidence": "0.7029",
    "content": "care"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "22.32",
    "end_time": "22.53",
    "alternatives": [
    {
    "confidence": "0.9736",
    "content": "So"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "22.53",
    "end_time": "22.66",
    "alternatives": [
    {
    "confidence": "0.8489",
    "content": "your"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "22.66",
    "end_time": "23.06",
    "alternatives": [
    {
    "confidence": "0.7744",
    "content": "grandma"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "23.06",
    "end_time": "23.25",
    "alternatives": [
    {
    "confidence": "0.9692",
    "content": "could"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "23.25",
    "end_time": "23.47",
    "alternatives": [
    {
    "confidence": "0.9988",
    "content": "check"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "23.47",
    "end_time": "23.55",
    "alternatives": [
    {
    "confidence": "0.7296",
    "content": "her"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "23.55",
    "end_time": "23.89",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "blood"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "23.89",
    "end_time": "24.51",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "pressure"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "24.67",
    "end_time": "24.79",
    "alternatives": [
    {
    "confidence": "0.6264",
    "content": "and"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "24.79",
    "end_time": "24.92",
    "alternatives": [
    {
    "confidence": "0.6246",
    "content": "your"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "24.92",
    "end_time": "25.2",
    "alternatives": [
    {
    "confidence": "0.9947",
    "content": "business"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "25.2",
    "end_time": "25.31",
    "alternatives": [
    {
    "confidence": "0.8231",
    "content": "to"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "25.31",
    "end_time": "25.48",
    "alternatives": [
    {
    "confidence": "0.9360",
    "content": "go"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "25.48",
    "end_time": "25.73",
    "alternatives": [
    {
    "confidence": "0.8723",
    "content": "take"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "25.73",
    "end_time": "25.85",
    "alternatives": [
    {
    "confidence": "0.8261",
    "content": "a"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "25.85",
    "end_time": "26.04",
    "alternatives": [
    {
    "confidence": "0.7526",
    "content": "piss"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "26.04",
    "end_time": "26.15",
    "alternatives": [
    {
    "confidence": "0.9690",
    "content": "in"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "26.15",
    "end_time": "26.23",
    "alternatives": [
    {
    "confidence": "0.9797",
    "content": "the"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "26.23",
    "end_time": "26.85",
    "alternatives": [
    {
    "confidence": "0.7954",
    "content": "gynecologist"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "26.85",
    "end_time": "26.94",
    "alternatives": [
    {
    "confidence": "0.4121",
    "content": "with"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "26.94",
    "end_time": "27.07",
    "alternatives": [
    {
    "confidence": "0.9952",
    "content": "no"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "27.07",
    "end_time": "27.56",
    "alternatives": [
    {
    "confidence": "0.9913",
    "content": "motherfucking"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "27.57",
    "end_time": "28.25",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "problem"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "28.6",
    "end_time": "29.21",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "Now"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": ","
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "29.22",
    "end_time": "29.39",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "I"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "29.39",
    "end_time": "29.6",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "know"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "29.6",
    "end_time": "29.65",
    "alternatives": [
    {
    "confidence": "0.9913",
    "content": "a"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "29.65",
    "end_time": "29.82",
    "alternatives": [
    {
    "confidence": "0.9913",
    "content": "lot"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "29.82",
    "end_time": "29.88",
    "alternatives": [
    {
    "confidence": "0.9764",
    "content": "of"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "29.88",
    "end_time": "30.02",
    "alternatives": [
    {
    "confidence": "0.6201",
    "content": "guys"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "30.03",
    "end_time": "30.25",
    "alternatives": [
    {
    "confidence": "0.9986",
    "content": "don't"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "30.25",
    "end_time": "30.46",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "care"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "30.46",
    "end_time": "30.8",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "because"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "30.81",
    "end_time": "30.85",
    "alternatives": [
    {
    "confidence": "0.8321",
    "content": "I"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "30.85",
    "end_time": "30.98",
    "alternatives": [
    {
    "confidence": "0.9692",
    "content": "don't"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "30.98",
    "end_time": "31.16",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "work"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "31.16",
    "end_time": "31.25",
    "alternatives": [
    {
    "confidence": "0.9996",
    "content": "for"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "31.25",
    "end_time": "31.35",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "the"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "31.35",
    "end_time": "31.77",
    "alternatives": [
    {
    "confidence": "0.5647",
    "content": "government"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "31.77",
    "end_time": "31.87",
    "alternatives": [
    {
    "confidence": "0.9998",
    "content": "or"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "31.87",
    "end_time": "32.03",
    "alternatives": [
    {
    "confidence": "0.9808",
    "content": "your"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "32.03",
    "end_time": "32.41",
    "alternatives": [
    {
    "confidence": "0.7594",
    "content": "partner"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "32.41",
    "end_time": "32.5",
    "alternatives": [
    {
    "confidence": "0.8450",
    "content": "They"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "32.5",
    "end_time": "32.66",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "have"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "32.66",
    "end_time": "32.75",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "a"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "32.75",
    "end_time": "33.15",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "job"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": ","
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "33.34",
    "end_time": "33.51",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "but"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "33.51",
    "end_time": "33.67",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "this"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "33.67",
    "end_time": "33.82",
    "alternatives": [
    {
    "confidence": "0.8953",
    "content": "shit"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "33.82",
    "end_time": "33.96",
    "alternatives": [
    {
    "confidence": "0.9805",
    "content": "is"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "33.96",
    "end_time": "34.19",
    "alternatives": [
    {
    "confidence": "0.8380",
    "content": "really"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "34.2",
    "end_time": "34.49",
    "alternatives": [
    {
    "confidence": "0.9950",
    "content": "fucking"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "34.5",
    "end_time": "35.0",
    "alternatives": [
    {
    "confidence": "0.9943",
    "content": "serious"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": ","
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "35.0",
    "end_time": "35.35",
    "alternatives": [
    {
    "confidence": "0.9517",
    "content": "bro"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "35.69",
    "end_time": "35.91",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "This"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "35.91",
    "end_time": "36.11",
    "alternatives": [
    {
    "confidence": "0.5930",
    "content": "city"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "36.11",
    "end_time": "36.21",
    "alternatives": [
    {
    "confidence": "0.9769",
    "content": "is"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "36.21",
    "end_time": "36.74",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "crazy"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "36.75",
    "end_time": "37.02",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "Like"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "37.03",
    "end_time": "37.42",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "a"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "37.43",
    "end_time": "38.05",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "country"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "38.84",
    "end_time": "39.04",
    "alternatives": [
    {
    "confidence": "0.8721",
    "content": "is"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "39.04",
    "end_time": "39.16",
    "alternatives": [
    {
    "confidence": "0.8711",
    "content": "in"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "39.16",
    "end_time": "39.23",
    "alternatives": [
    {
    "confidence": "0.9954",
    "content": "a"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "39.23",
    "end_time": "39.41",
    "alternatives": [
    {
    "confidence": "0.8528",
    "content": "hell"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "39.41",
    "end_time": "39.6",
    "alternatives": [
    {
    "confidence": "0.8528",
    "content": "hole"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "39.61",
    "end_time": "39.8",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "right"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "39.8",
    "end_time": "40.1",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "now"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "40.11",
    "end_time": "40.4",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "All"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "40.4",
    "end_time": "40.67",
    "alternatives": [
    {
    "confidence": "0.4697",
    "content": "for"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "40.67",
    "end_time": "40.98",
    "alternatives": [
    {
    "confidence": "0.9511",
    "content": "fucking"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "40.98",
    "end_time": "41.31",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "war"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "41.69",
    "end_time": "41.81",
    "alternatives": [
    {
    "confidence": "0.8176",
    "content": "And"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "41.81",
    "end_time": "41.89",
    "alternatives": [
    {
    "confidence": "0.9436",
    "content": "we"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "41.89",
    "end_time": "42.12",
    "alternatives": [
    {
    "confidence": "0.8468",
    "content": "really"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "42.12",
    "end_time": "42.27",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "need"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "42.27",
    "end_time": "42.34",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "to"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "42.34",
    "end_time": "42.63",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "take"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "42.63",
    "end_time": "42.79",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "this"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "42.79",
    "end_time": "43.38",
    "alternatives": [
    {
    "confidence": "0.7554",
    "content": "serious"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "43.63",
    "end_time": "43.77",
    "alternatives": [
    {
    "confidence": "0.9593",
    "content": "I"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "43.77",
    "end_time": "43.92",
    "alternatives": [
    {
    "confidence": "0.9835",
    "content": "feel"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "43.92",
    "end_time": "44.04",
    "alternatives": [
    {
    "confidence": "0.5382",
    "content": "that"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "44.04",
    "end_time": "44.14",
    "alternatives": [
    {
    "confidence": "0.9993",
    "content": "we"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "44.14",
    "end_time": "44.26",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "need"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "44.26",
    "end_time": "44.32",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "to"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "44.33",
    "end_time": "44.51",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "take"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "44.51",
    "end_time": "44.69",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "some"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "44.69",
    "end_time": "45.1",
    "alternatives": [
    {
    "confidence": "0.9994",
    "content": "action"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "45.22",
    "end_time": "45.51",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "I"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "45.51",
    "end_time": "45.69",
    "alternatives": [
    {
    "confidence": "0.9950",
    "content": "don't"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "45.69",
    "end_time": "45.87",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "know"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "45.87",
    "end_time": "46.17",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "what"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "46.17",
    "end_time": "46.41",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "type"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "46.41",
    "end_time": "46.48",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "of"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "46.49",
    "end_time": "46.83",
    "alternatives": [
    {
    "confidence": "0.7523",
    "content": "actual"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "46.83",
    "end_time": "47.07",
    "alternatives": [
    {
    "confidence": "0.2273",
    "content": "base"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "47.07",
    "end_time": "47.55",
    "alternatives": [
    {
    "confidence": "0.9998",
    "content": "because"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "48.24",
    "end_time": "48.37",
    "alternatives": [
    {
    "confidence": "0.8595",
    "content": "it"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "48.37",
    "end_time": "48.55",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "is"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "48.55",
    "end_time": "48.76",
    "alternatives": [
    {
    "confidence": "0.9915",
    "content": "not"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "48.76",
    "end_time": "48.89",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "what"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "48.89",
    "end_time": "49.01",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "I"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "49.02",
    "end_time": "49.34",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "do"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": ","
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "49.49",
    "end_time": "50.06",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "but"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "50.56",
    "end_time": "50.71",
    "alternatives": [
    {
    "confidence": "0.9841",
    "content": "I'm"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "50.71",
    "end_time": "51.25",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "scared"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "52.04",
    "end_time": "52.34",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "This"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "52.34",
    "end_time": "52.47",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "is"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "52.47",
    "end_time": "52.95",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "crazy"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "52.95",
    "end_time": "53.15",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "And"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "53.15",
    "end_time": "53.23",
    "alternatives": [
    {
    "confidence": "0.9940",
    "content": "I"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "53.24",
    "end_time": "53.5",
    "alternatives": [
    {
    "confidence": "0.9821",
    "content": "really"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "53.5",
    "end_time": "53.73",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "feel"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "53.73",
    "end_time": "54.15",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "bad"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "54.3",
    "end_time": "54.45",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "for"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "54.45",
    "end_time": "54.6",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "these"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "54.6",
    "end_time": "54.81",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "people"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "54.81",
    "end_time": "54.9",
    "alternatives": [
    {
    "confidence": "0.7858",
    "content": "They"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "54.9",
    "end_time": "55.03",
    "alternatives": [
    {
    "confidence": "0.7654",
    "content": "got"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "55.03",
    "end_time": "55.09",
    "alternatives": [
    {
    "confidence": "0.7615",
    "content": "to"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "55.1",
    "end_time": "55.24",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "go"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "55.24",
    "end_time": "55.32",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "to"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "55.32",
    "end_time": "55.67",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "fucking"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "55.68",
    "end_time": "56.02",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "work"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": ","
    }
    ],
    "type": "punctuation"
    },
    {
    "start_time": "56.2",
    "end_time": "56.35",
    "alternatives": [
    {
    "confidence": "0.6794",
    "content": "to"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "56.35",
    "end_time": "56.57",
    "alternatives": [
    {
    "confidence": "0.9940",
    "content": "not"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "56.57",
    "end_time": "56.75",
    "alternatives": [
    {
    "confidence": "1.0000",
    "content": "get"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "56.75",
    "end_time": "57.32",
    "alternatives": [
    {
    "confidence": "0.9937",
    "content": "motherfucking"
    }
    ],
    "type": "pronunciation"
    },
    {
    "start_time": "57.33",
    "end_time": "57.65",
    "alternatives": [
    {
    "confidence": "0.9516",
    "content": "paid"
    }
    ],
    "type": "pronunciation"
    },
    {
    "alternatives": [
    {
    "confidence": null,
    "content": "."
    }
    ],
    "type": "punctuation"
    }
    ]
    },
    "status": "COMPLETED"
    }
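One way to put this per-word output to use is to list the words the service was least sure about. The following is a minimal sketch with `jq`, not part of the original write-up. It assumes the job output has been saved to a local file and that the word entries live under `.results.items`, as in Transcribe's standard output. The function name `low_confidence_words` and the default threshold of 0.8 are made up for illustration.

```sh
#!/usr/bin/env bash
# Hypothetical helper: print "start_time<TAB>confidence<TAB>word" for every
# spoken word whose confidence falls below a threshold (default 0.8).
# Assumes the word entries are under .results.items in the output JSON.
low_confidence_words() {
  local file="$1" threshold="${2:-0.8}"
  jq -r --argjson t "$threshold" '
    .results.items[]
    | select(.type == "pronunciation")        # skip punctuation entries
    | .start_time as $start
    | .alternatives[0]
    | select((.confidence | tonumber) < $t)   # confidence is stored as a string
    | [$start, .confidence, .content]
    | @tsv
  ' "$file"
}
```

Called as `low_confidence_words transcript.json 0.6` (the filename is a placeholder), this would flag "for" (0.4697), "that" (0.5382), and "base" (0.2273) among the entries shown above, which are good candidates for a manual listen.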
  17. dannguyen revised this gist Jan 18, 2019. 1 changed file with 3 additions and 3 deletions.
    6 changes: 3 additions & 3 deletions start-transcript-job-response.json
    @@ -1,12 +1,12 @@
    {
    "TranscriptionJob": {
    - "TranscriptionJobName": "cardib-politics",
    + "TranscriptionJobName": "cardib-shutdown",
    "TranscriptionJobStatus": "IN_PROGRESS",
    "LanguageCode": "en-US",
    "MediaFormat": "mp3",
    "Media": {
    - "MediaFileUri": "s3://data.danwin.com/tmp/cardib.mp3"
    + "MediaFileUri": "s3://data.danwin.com/tmp/cardib-shutdown.mp3"
    },
    - "CreationTime": 1547794736.719
    + "CreationTime": 1547795428.734
    }
    }
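The response above reports `IN_PROGRESS`, so the job has to be checked again later. As a rough sketch (not something from the original script), the wait can be automated by polling `get-transcription-job` until the status changes. The function name `wait_for_job` and the 15-second default interval are assumptions for illustration. `--query` and `--output text` are the AWS CLI's general-purpose options for pulling one field out of a response.

```sh
#!/usr/bin/env bash
# Hypothetical helper: poll a Transcribe job until it leaves the queue.
# Prints the final status and returns 0 only when it is COMPLETED.
wait_for_job() {
  local job_name="$1" interval="${2:-15}" status
  while true; do
    status="$(aws transcribe get-transcription-job \
      --transcription-job-name "$job_name" \
      --query 'TranscriptionJob.TranscriptionJobStatus' \
      --output text)" || return 1
    case "$status" in
      COMPLETED) echo "$status"; return 0 ;;
      FAILED)    echo "$status" >&2; return 1 ;;
      *)         sleep "$interval" ;;   # e.g. IN_PROGRESS or QUEUED
    esac
  done
}
```

For the job shown here, `wait_for_job cardib-shutdown` would block until the transcription finishes or fails.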
  18. dannguyen revised this gist Jan 18, 2019. 1 changed file with 12 additions and 0 deletions.
    12 changes: 12 additions & 0 deletions start-transcript-job-response.json
    @@ -0,0 +1,12 @@
    {
    "TranscriptionJob": {
    "TranscriptionJobName": "cardib-politics",
    "TranscriptionJobStatus": "IN_PROGRESS",
    "LanguageCode": "en-US",
    "MediaFormat": "mp3",
    "Media": {
    "MediaFileUri": "s3://data.danwin.com/tmp/cardib.mp3"
    },
    "CreationTime": 1547794736.719
    }
    }
  19. dannguyen revised this gist Jan 18, 2019. 1 changed file with 19 additions and 0 deletions.
    19 changes: 19 additions & 0 deletions cardib-politics-talk-transcribe.md
    @@ -3,6 +3,13 @@
    Inspired by the following exchange on Twitter, in which [someone captures and posts a valuable video](https://twitter.com/JordanUhl/status/1085669288051175424) but doesn't have the resources to [easily transcribe it for the hearing-impaired](https://twitter.com/riotpedestrian/status/1085753671726452736):


    ![Screencap of @jordanuhl's video tweet, followed by a request for a transcript](https://user-images.githubusercontent.com/121520/51369686-23e59b00-1aba-11e9-87db-e145f8d2cda5.png)

    The instructions and code below show how to use command-line tools/scripting and [Amazon's Transcribe service](https://console.aws.amazon.com/transcribe) to transcribe the audio from an online video. tl;dr: AWS Transcribe is a pretty amazing service!

    (note: you should obviously not do this as one big old bash script, but I wrote this up as an example of what the CLI can do if you have some weird elaborate needs)


    ## Requirements

    Sign up for Amazon Web Services: https://aws.amazon.com
    @@ -15,3 +22,15 @@ Install:
    - [jq](https://stedolan.github.io/jq/) - for parsing JSON data
    - [ffmpeg](https://www.ffmpeg.org/) - for media file conversion, e.g. extracting mp3 audio from video


    ## The steps

    - Find a tweet containing a video you like
    - Get that tweet's URL, e.g. https://twitter.com/JordanUhl/status/1085669288051175424
    - Use **youtube-dl** to download the video from that tweet and save it to disk, e.g. `cardib.mp4`
    - Because AWS Transcribe requires we send it an audio file, use **ffmpeg** to extract the audio from `cardib.mp4` and save it to `cardib.mp3`
    - Because AWS Transcribe only works on audio files stored on AWS S3, we use `awscli` to upload `cardib.mp3` to an online S3 bucket, e.g. http://data.danwin.com/tmp/cardib.mp3
    - Use `awscli` to access the AWS Transcribe API and [start a transcription job](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/transcribe.html?highlight=transcribe#TranscribeService.Client.start_transcription_job)
    - Wait a couple of minutes
    - Use **awscli** to [get the details](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/transcribe.html?highlight=transcribe#TranscribeService.Client.get_transcription_job) of the initiated (and hopefully, finished) transcription job
    - Use **jq**
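
The steps above can be sketched as a handful of small shell functions. This is an illustrative outline under stated assumptions, not the author's actual script: the function names are made up, the bucket and key are passed in by the caller, and `ffmpeg` is left to pick the mp3 encoder from the output file's extension. It stops at fetching the job details.

```sh
#!/usr/bin/env bash
# A sketch of the steps as small functions. Nothing runs until a function
# is called. All function names here are hypothetical.

# Download the video attached to a tweet and save it to disk.
fetch_video() {      # usage: fetch_video TWEET_URL OUT.mp4
  youtube-dl --output "$2" "$1"
}

# Extract the audio track, dropping the video stream (-vn).
extract_audio() {    # usage: extract_audio IN.mp4 OUT.mp3
  ffmpeg -i "$1" -vn "$2"
}

# Build the s3:// URI that Transcribe expects for an uploaded file.
s3_uri() {           # usage: s3_uri BUCKET KEY
  printf 's3://%s/%s\n' "$1" "$2"
}

# Upload the audio file to the S3 bucket.
upload_audio() {     # usage: upload_audio FILE.mp3 BUCKET KEY
  aws s3 cp "$1" "$(s3_uri "$2" "$3")"
}

# Start a transcription job for the uploaded audio.
start_job() {        # usage: start_job JOB_NAME BUCKET KEY
  aws transcribe start-transcription-job \
    --language-code 'en-US' \
    --media-format mp3 \
    --transcription-job-name "$1" \
    --media "MediaFileUri=$(s3_uri "$2" "$3")"
}

# Get the details of a started (and hopefully finished) job.
job_details() {      # usage: job_details JOB_NAME
  aws transcribe get-transcription-job --transcription-job-name "$1"
}
```

With the names used in this write-up, a run might look like `fetch_video https://twitter.com/JordanUhl/status/1085669288051175424 cardib.mp4`, then `extract_audio cardib.mp4 cardib.mp3`, `upload_audio cardib.mp3 data.danwin.com tmp/cardib.mp3`, and `start_job cardib-politics data.danwin.com tmp/cardib.mp3`.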
  20. dannguyen revised this gist Jan 18, 2019. 1 changed file with 3 additions and 1 deletion.
    4 changes: 3 additions & 1 deletion cardib-politics-talk-transcribe.md
    Original file line number Diff line number Diff line change
    @@ -1,9 +1,12 @@
    # Transcribing Cardi B's political speech with command-line tools

    Inspired by the following exchange on Twitter, in which [someone captures and posts a valuable video](https://twitter.com/JordanUhl/status/1085669288051175424) but doesn't have the resources to [easily transcribe it for the hearing-impaired](https://twitter.com/riotpedestrian/status/1085753671726452736):


    ## Requirements

    Sign up for Amazon Web Services: https://aws.amazon.com

    Install:

    - [youtube-dl](https://rg3.github.io/youtube-dl/) - for fetching video files from social media services
    @@ -12,4 +15,3 @@ Install:
    - [jq](https://stedolan.github.io/jq/) - for parsing JSON data
    - [ffmpeg](https://www.ffmpeg.org/) - for media file conversion, e.g. extracting mp3 audio from video

    Sign up for Amazon Web Services: https://aws.amazon.com
  21. dannguyen created this gist Jan 18, 2019.
    15 changes: 15 additions & 0 deletions cardib-politics-talk-transcribe.md
    @@ -0,0 +1,15 @@
    # Transcribing Cardi B's political speech with command-line tools



    ## Requirements

    Install:

    - [youtube-dl](https://rg3.github.io/youtube-dl/) - for fetching video files from social media services
    - [awscli](https://aws.amazon.com/cli/) - for accessing various AWS services, specifically S3 (for storing the video and its processed transcription) and Transcribe
    - [curl](https://curl.haxx.se/) - for downloading from URLs
    - [jq](https://stedolan.github.io/jq/) - for parsing JSON data
    - [ffmpeg](https://www.ffmpeg.org/) - for media file conversion, e.g. extracting mp3 audio from video

    Sign up for Amazon Web Services: https://aws.amazon.com