@mccutchen
Last active July 19, 2023 19:58

Revisions

  1. mccutchen revised this gist Jul 19, 2023. 1 changed file with 5 additions and 1 deletion.
    6 changes: 5 additions & 1 deletion thresholderbot_twitter_usage_analysis.md
    @@ -9,7 +9,7 @@ reads.

    ### Tweets read by Thresholderbot every 30 days over the past year

    ![graph image](https://p.datadoghq.com/snapshot/view/dd-snapshots-prod/org_17809/2023-07-19/089b133dc8c63b974c034d11a837ad0e013f2f92.png)
    ![graph image](https://p.datadoghq.com/snapshot/view/dd-snapshots-prod/org_17809/2023-07-19/4185a9fdb4ec5a7ca590e26015a580297e436d21.png)


    ## Details
    @@ -46,6 +46,10 @@ Our AWS spend in particular could be optimized by simplifying Thresholderbot's
    hilariously over-engineered, overly complex codebase and architecture, but
    … that effort would be wasted now!

    ### Emails sent to Thresholderbot users every 30 days over the past year

    ![graph image](https://p.datadoghq.com/snapshot/view/dd-snapshots-prod/org_17809/2023-07-19/77a413384fb900677bbcfef2426550190afd8f85.png)

    [^1]: Note: that 90s interval was specifically chosen to stay well under the
    [rate limits][] for that API endpoint to make sure we're not abusing Twitter's
    systems!
  2. mccutchen revised this gist Jul 19, 2023. 1 changed file with 5 additions and 2 deletions.
    7 changes: 5 additions & 2 deletions thresholderbot_twitter_usage_analysis.md
    @@ -25,7 +25,7 @@ API request returns 100 tweets from a user's timeline, that would count as 100
    reads against the allowance. For bonus points, every _retweet_ returned in a
    user's timeline counts as _two reads_ according to [this forum post][forum].

    Given that 2x penalty for retweets, the 15M tweets per month reported above is
    Given that 2x penalty for retweets[^2], the 15M tweets per month reported above is
    actually _much lower_ than what we'd be charged for if we paid for API access,
    because a significant proportion of tweets in the average user's timeline are
    retweets.
    @@ -39,7 +39,7 @@ number of emails sent in each month. That cost breaks down like so:
    | Service | Cost per month | Notes |
    | --- | --- | --- |
    | AWS (Amazon Web Services) | $200 | Infrastructure provider (EC2, RDS, S3) |
    | Mailgun | $12-20 | Outbound email to users |
    | Mailgun | $12-20 | Outbound email to users (35,000 – 45,000 per month) |
    | Google | $12 | G Suite for @thresholderbot.com email accounts |

    Our AWS spend in particular could be optimized by simplifying Thresholderbot's
    @@ -50,6 +50,9 @@ hilariously over-engineered, overly complex codebase and architecture, but
    [rate limits][] for that API endpoint to make sure we're not abusing Twitter's
    systems!

    [^2]: On a technical level, I understand why they would charge 2x views for
    each retweet, since each retweet is literally returned as two tweets in a
    trenchcoat.

    [pricing]: https://developer.twitter.com/en/portal/petition/essential/basic-info
    [caps]: https://developer.twitter.com/en/docs/twitter-api/tweet-caps
  3. mccutchen revised this gist Jul 19, 2023. 1 changed file with 4 additions and 4 deletions.
    8 changes: 4 additions & 4 deletions thresholderbot_twitter_usage_analysis.md
    @@ -2,10 +2,10 @@

    ## Summary
    Twitter's [new API pricing starts at $5,000 per month][pricing] to read 1
    million tweets per month. According to our metrics, [Thresholderbot] has read
    more than **15 million tweets per month** on average over the last 12 full
    months of activity — and that estimate may be far too low if retweets are
    counted as 2 reads.
    million tweets per month. According to our metrics, [Thresholderbot] read more
    than **15 million tweets per month** on average over the last 12 full months of
    activity — and that estimate may be far too low if retweets are counted as 2
    reads.

    ### Tweets read by Thresholderbot every 30 days over the past year

  4. mccutchen renamed this gist Jul 19, 2023. 1 changed file with 0 additions and 0 deletions.
  5. mccutchen revised this gist Jul 19, 2023. No changes.
  6. mccutchen created this gist Jul 19, 2023.
    58 changes: 58 additions & 0 deletions thresholderbot_api_usage_analysis.md
    @@ -0,0 +1,58 @@
    # Quick and dirty Twitter API pricing analysis for [Thresholderbot]

    ## Summary
    Twitter's [new API pricing starts at $5,000 per month][pricing] to read 1
    million tweets per month. According to our metrics, [Thresholderbot] has read
    more than **15 million tweets per month** on average over the last 12 full
    months of activity — and that estimate may be far too low if retweets are
    counted as 2 reads.

    ### Tweets read by Thresholderbot every 30 days over the past year

    ![graph image](https://p.datadoghq.com/snapshot/view/dd-snapshots-prod/org_17809/2023-07-19/089b133dc8c63b974c034d11a837ad0e013f2f92.png)


    ## Details

    Our implementation fetches each Thresholderbot user's "reverse chronological"
    timeline every ~90 seconds. That's one Twitter API request per user per 90
    second interval[^1], but each one of those API requests may return 0 or more
    individual tweets (depending on how many accounts each user follows, how
    frequently those accounts tweet, etc.).
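
To put the polling cadence in concrete terms, here is a minimal sketch of the request arithmetic. It assumes the interval is exactly 90 seconds (the text says roughly 90) and a 30-day window to match the graphs; the function and parameter names are illustrative and not taken from Thresholderbot's code.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def requests_per_user(window_days: int = 30, poll_interval_s: int = 90) -> int:
    """Timeline requests made for one user over the window.

    Assumes one request per user per polling interval, as described above.
    """
    return (window_days * SECONDS_PER_DAY) // poll_interval_s


if __name__ == "__main__":
    print(requests_per_user())
```

Under those assumptions, that works out to 28,800 requests per user per 30 days, each of which may return zero or more tweets.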

    According to [the Twitter API "Tweet Caps" documentation][caps], if a single
    API request returns 100 tweets from a user's timeline, that would count as 100
    reads against the allowance. For bonus points, every _retweet_ returned in a
    user's timeline counts as _two reads_ according to [this forum post][forum].

    Given that 2x penalty for retweets, the 15M tweets per month reported above is
    actually _much lower_ than what we'd be charged for if we paid for API access,
    because a significant proportion of tweets in the average user's timeline are
    retweets.
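
A rough sketch of how the retweet penalty would inflate the count, assuming each retweet is billed as two reads per the forum post. The retweet share is not measured in this analysis, so `retweet_pct` below is a hypothetical input, and the names are illustrative.

```python
def billed_reads(tweets_returned: int, retweet_pct: int) -> int:
    """Reads charged against the cap if every retweet counts as two reads.

    retweet_pct is the (hypothetical) percentage of returned tweets that
    are retweets, given as a whole number from 0 to 100.
    """
    if not 0 <= retweet_pct <= 100:
        raise ValueError("retweet_pct must be between 0 and 100")
    retweets = tweets_returned * retweet_pct // 100
    # Every tweet costs one read; each retweet costs one extra read.
    return tweets_returned + retweets


if __name__ == "__main__":
    # 30 is a made-up retweet share, used only for illustration.
    print(billed_reads(15_000_000, 30))
```

For example, if 30% of returned tweets were retweets (a made-up figure), 15M tweets would be billed as 19.5M reads.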


    ## Our own monthly costs

    For the record, Thresholderbot costs about $230/month to run, depending on the
    number of emails sent in each month. That cost breaks down like so:

    | Service | Cost per month | Notes |
    | --- | --- | --- |
    | AWS (Amazon Web Services) | $200 | Infrastructure provider (EC2, RDS, S3) |
    | Mailgun | $12-20 | Outbound email to users |
    | Google | $12 | G Suite for @thresholderbot.com email accounts |

    Our AWS spend in particular could be optimized by simplifying Thresholderbot's
    hilariously over-engineered, overly complex codebase and architecture, but
    … that effort would be wasted now!
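
The breakdown in the table above can be sanity-checked with a quick sum. This sketch uses only the figures from the table; the variable and function names are illustrative.

```python
# (low, high) monthly cost in USD for each service, from the table above.
MONTHLY_COSTS = {
    "AWS": (200, 200),
    "Mailgun": (12, 20),
    "Google": (12, 12),
}


def total_range(costs: dict[str, tuple[int, int]]) -> tuple[int, int]:
    """Return the (low, high) total across all services."""
    low = sum(lo for lo, _ in costs.values())
    high = sum(hi for _, hi in costs.values())
    return low, high


if __name__ == "__main__":
    print(total_range(MONTHLY_COSTS))
```

The total lands between $224 and $232 per month, consistent with the "about $230" figure.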

    [^1]: Note: that 90s interval was specifically chosen to stay well under the
    [rate limits][] for that API endpoint to make sure we're not abusing Twitter's
    systems!


    [pricing]: https://developer.twitter.com/en/portal/petition/essential/basic-info
    [caps]: https://developer.twitter.com/en/docs/twitter-api/tweet-caps
    [forum]: https://twittercommunity.com/t/basic-api-10k-read-limit-rate-cap-question/191900/2
    [rate limits]: https://mashable.com/article/twitter-rate-limit-exceeded-elon-musk
    [Thresholderbot]: https://thresholderbot.com/