@ihfbib
Forked from landsman/facebook_crawler.sh
Created February 19, 2024 01:10

Revisions

  1. @landsman landsman revised this gist Aug 22, 2023. 1 changed file with 4 additions and 6 deletions.
    10 changes: 4 additions & 6 deletions facebook_crawler.sh
    @@ -1,9 +1,7 @@
    #!/bin/bash

    #
    # README: run this script like this: sh facebook_crawler.sh > fb_ips.csv
    #
    -# doc: https://developers.facebook.com/docs/sharing/webmasters/crawler/
    +# see: https://developers.facebook.com/docs/sharing/webmasters/crawler/
    #

    # Run the whois command and store the output in a variable
    @@ -12,8 +10,8 @@ whois_output=$(whois -h whois.radb.net -- '-i origin AS32934')
     # Use grep to extract lines starting with "route"
     routes=$(echo "$whois_output" | grep '^route')

    -# Use grep and regular expression to extract IPv4 addresses
    -ipv4only=$(echo "$routes" | grep -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}')
    +# Use grep and regular expression to extract lines with IPv4 addresses and ranges
    +ipv4only=$(echo "$routes" | grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}(/[0-9]{1,2})?')

     # Print the formatted IPv4 addresses
    -echo "$ipv4only"
    +echo "$ipv4only"
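
For readers who would rather post-process saved whois output in JavaScript, the two grep steps in this revision can be sketched roughly as below. This is a minimal sketch and not part of the gist: the function name `extractIpv4Routes` and the sample route lines (documentation-range addresses) are assumptions made for illustration. Only the regular expression is taken from the revised script.

```javascript
// Made-up sample in the style of RADb output; not real query results.
const sampleWhoisOutput = [
  "route:      192.0.2.0/24",
  "descr:      example network",
  "origin:     AS32934",
  "route6:     2001:db8::/32",
  "route:      198.51.100.0/25",
].join("\n");

// Same pattern as the revised grep -Eo expression:
// a dotted quad followed by an optional /prefix-length.
const ipv4Cidr = /([0-9]{1,3}\.){3}[0-9]{1,3}(\/[0-9]{1,2})?/g;

// Hypothetical helper: keep lines starting with "route" (like grep '^route'),
// then collect every IPv4 address or range found on those lines.
function extractIpv4Routes(whoisOutput) {
  return whoisOutput
    .split("\n")
    .filter((line) => line.startsWith("route"))
    .flatMap((line) => line.match(ipv4Cidr) ?? []);
}

console.log(extractIpv4Routes(sampleWhoisOutput).join("\n"));
```

As with `grep '^route'`, the `route6` lines also pass the filter, but they contribute nothing here because the pattern only matches dotted-quad IPv4 notation.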
  2. @landsman landsman revised this gist Aug 22, 2023. 1 changed file with 19 additions and 0 deletions.
    19 changes: 19 additions & 0 deletions facebook_crawler.sh
    @@ -0,0 +1,19 @@
    #!/bin/bash

    #
    # README: run this script like this: sh facebook_crawler.sh > fb_ips.csv
    #
    # doc: https://developers.facebook.com/docs/sharing/webmasters/crawler/
    #

    # Run the whois command and store the output in a variable
    whois_output=$(whois -h whois.radb.net -- '-i origin AS32934')

    # Use grep to extract lines starting with "route"
    routes=$(echo "$whois_output" | grep '^route')

    # Use grep and regular expression to extract IPv4 addresses
    ipv4only=$(echo "$routes" | grep -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}')

    # Print the formatted IPv4 addresses
    echo "$ipv4only"
  3. @landsman landsman revised this gist Aug 17, 2023. 2 changed files with 127 additions and 0 deletions.
    File renamed without changes.
    127 changes: 127 additions & 0 deletions google_bot_crawlers.csv
    @@ -0,0 +1,127 @@
    192.178.5.0/27
    34.100.182.96/28
    34.101.50.144/28
    34.118.254.0/28
    34.118.66.0/28
    34.126.178.96/28
    34.146.150.144/28
    34.147.110.144/28
    34.151.74.144/28
    34.152.50.64/28
    34.154.114.144/28
    34.155.98.32/28
    34.165.18.176/28
    34.175.160.64/28
    34.176.130.16/28
    34.22.85.0/27
    34.64.82.64/28
    34.65.242.112/28
    34.80.50.80/28
    34.88.194.0/28
    34.89.10.80/28
    34.89.198.80/28
    34.96.162.48/28
    35.247.243.240/28
    66.249.64.0/27
    66.249.64.128/27
    66.249.64.160/27
    66.249.64.192/27
    66.249.64.224/27
    66.249.64.32/27
    66.249.64.64/27
    66.249.64.96/27
    66.249.65.0/27
    66.249.65.128/27
    66.249.65.160/27
    66.249.65.192/27
    66.249.65.224/27
    66.249.65.32/27
    66.249.65.64/27
    66.249.65.96/27
    66.249.66.0/27
    66.249.66.128/27
    66.249.66.160/27
    66.249.66.192/27
    66.249.66.32/27
    66.249.66.64/27
    66.249.66.96/27
    66.249.68.0/27
    66.249.68.32/27
    66.249.68.64/27
    66.249.69.0/27
    66.249.69.128/27
    66.249.69.160/27
    66.249.69.192/27
    66.249.69.224/27
    66.249.69.32/27
    66.249.69.64/27
    66.249.69.96/27
    66.249.70.0/27
    66.249.70.128/27
    66.249.70.160/27
    66.249.70.192/27
    66.249.70.224/27
    66.249.70.32/27
    66.249.70.64/27
    66.249.70.96/27
    66.249.71.0/27
    66.249.71.128/27
    66.249.71.160/27
    66.249.71.192/27
    66.249.71.224/27
    66.249.71.32/27
    66.249.71.64/27
    66.249.71.96/27
    66.249.72.0/27
    66.249.72.128/27
    66.249.72.160/27
    66.249.72.192/27
    66.249.72.224/27
    66.249.72.32/27
    66.249.72.64/27
    66.249.72.96/27
    66.249.73.0/27
    66.249.73.128/27
    66.249.73.160/27
    66.249.73.192/27
    66.249.73.224/27
    66.249.73.32/27
    66.249.73.64/27
    66.249.73.96/27
    66.249.74.0/27
    66.249.74.128/27
    66.249.74.32/27
    66.249.74.64/27
    66.249.74.96/27
    66.249.75.0/27
    66.249.75.128/27
    66.249.75.160/27
    66.249.75.192/27
    66.249.75.224/27
    66.249.75.32/27
    66.249.75.64/27
    66.249.75.96/27
    66.249.76.0/27
    66.249.76.128/27
    66.249.76.160/27
    66.249.76.192/27
    66.249.76.224/27
    66.249.76.32/27
    66.249.76.64/27
    66.249.76.96/27
    66.249.77.0/27
    66.249.77.128/27
    66.249.77.160/27
    66.249.77.192/27
    66.249.77.32/27
    66.249.77.64/27
    66.249.77.96/27
    66.249.78.0/27
    66.249.79.0/27
    66.249.79.128/27
    66.249.79.160/27
    66.249.79.192/27
    66.249.79.224/27
    66.249.79.32/27
    66.249.79.64/27
    66.249.79.96/27
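
A list of CIDR ranges like the one above is typically used to check whether a client address belongs to a crawler. The following is a minimal sketch of that check, assuming IPv4 only. The helper names `ipv4ToInt`, `inCidr`, and `matchesAnyRange` are hypothetical, and only three ranges from the list are copied in to keep the example short.

```javascript
// Three ranges copied from the list above; a real check would load the whole file.
const ranges = ["192.178.5.0/27", "34.100.182.96/28", "66.249.64.0/27"];

// Convert a dotted-quad IPv4 string to an unsigned 32-bit integer.
function ipv4ToInt(ip) {
  return ip
    .split(".")
    .reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

// Return true if `ip` falls inside the range written as "base/prefixLength".
function inCidr(ip, cidr) {
  const [base, bits] = cidr.split("/");
  const prefixLength = Number(bits);
  // /0 is handled separately because JavaScript shift counts wrap at 32.
  const mask =
    prefixLength === 0 ? 0 : (0xffffffff << (32 - prefixLength)) >>> 0;
  return ((ipv4ToInt(ip) & mask) >>> 0) === ((ipv4ToInt(base) & mask) >>> 0);
}

// Return true if `ip` falls inside at least one of the listed ranges.
function matchesAnyRange(ip, cidrList) {
  return cidrList.some((cidr) => inCidr(ip, cidr));
}

console.log(matchesAnyRange("66.249.64.5", ranges)); // true
console.log(matchesAnyRange("34.100.182.112", ranges)); // false
```

The second address is just past the end of `34.100.182.96/28`, which covers `.96` through `.111`.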
  4. @landsman landsman created this gist Aug 11, 2023.
    104 changes: 104 additions & 0 deletions google_crawlers.csv
    @@ -0,0 +1,104 @@
    192.178.17.0/27
    209.85.238.0/27
    209.85.238.128/27
    209.85.238.160/27
    209.85.238.192/27
    209.85.238.224/27
    209.85.238.32/27
    209.85.238.64/27
    209.85.238.96/27
    66.249.87.0/27
    66.249.87.128/27
    66.249.87.160/27
    66.249.87.192/27
    66.249.87.224/27
    66.249.87.32/27
    66.249.87.64/27
    66.249.87.96/27
    66.249.89.0/27
    66.249.89.128/27
    66.249.89.160/27
    66.249.89.192/27
    66.249.89.224/27
    66.249.89.32/27
    66.249.89.64/27
    66.249.89.96/27
    66.249.90.0/27
    66.249.90.128/27
    66.249.90.160/27
    66.249.90.192/27
    66.249.90.224/27
    66.249.90.32/27
    66.249.90.64/27
    66.249.90.96/27
    66.249.91.0/27
    66.249.91.128/27
    66.249.91.160/27
    66.249.91.192/27
    66.249.91.224/27
    66.249.91.32/27
    66.249.91.64/27
    66.249.91.96/27
    66.249.92.0/27
    66.249.92.128/27
    66.249.92.160/27
    66.249.92.192/27
    66.249.92.32/27
    66.249.92.64/27
    66.249.92.96/27
    72.14.199.0/27
    72.14.199.128/27
    72.14.199.160/27
    72.14.199.192/27
    72.14.199.224/27
    72.14.199.32/27
    72.14.199.64/27
    72.14.199.96/27
    74.125.148.0/27
    74.125.148.128/27
    74.125.148.160/27
    74.125.148.192/27
    74.125.148.224/27
    74.125.148.32/27
    74.125.148.64/27
    74.125.148.96/27
    74.125.149.0/27
    74.125.149.128/27
    74.125.149.160/27
    74.125.149.192/27
    74.125.149.224/27
    74.125.149.32/27
    74.125.149.64/27
    74.125.149.96/27
    74.125.150.0/27
    74.125.150.32/27
    74.125.150.64/27
    74.125.151.0/27
    74.125.151.128/27
    74.125.151.160/27
    74.125.151.192/27
    74.125.151.224/27
    74.125.151.32/27
    74.125.151.64/27
    74.125.151.96/27
    74.125.216.0/27
    74.125.216.128/27
    74.125.216.160/27
    74.125.216.192/27
    74.125.216.224/27
    74.125.216.32/27
    74.125.216.64/27
    74.125.216.96/27
    74.125.217.0/27
    74.125.217.128/27
    74.125.217.32/27
    74.125.217.64/27
    74.125.217.96/27
    74.125.218.0/27
    74.125.218.128/27
    74.125.218.160/27
    74.125.218.192/27
    74.125.218.32/27
    74.125.218.64/27
    74.125.218.96/27
    74.125.219.0/27
    9 changes: 9 additions & 0 deletions parse_ips.js
    @@ -0,0 +1,9 @@
    // Paste the JSON object from https://developers.google.com/search/docs/crawling-indexing/verifying-googlebot in place of {}
    const crawlers = {};

    const prefixes = crawlers.prefixes || []; // empty until the JSON is pasted in
    prefixes.forEach(i => {
      if (i.ipv4Prefix) {
        console.log(i.ipv4Prefix);
      }
    });
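
The same extraction can be done without pasting the JSON into the source file. The sketch below parses the JSON text at run time and returns the IPv4 prefixes. It is a hedged illustration rather than the gist's method: the function name `listIpv4Prefixes` is hypothetical, the sample entries are placeholders, and the `ipv6Prefix` key is an assumption about what the other entries in the published file look like. Only `prefixes` and `ipv4Prefix` come from the script above.

```javascript
// Placeholder sample with the same shape the script above expects.
const sampleJson = `{
  "prefixes": [
    { "ipv6Prefix": "2001:db8::/64" },
    { "ipv4Prefix": "192.178.17.0/27" },
    { "ipv4Prefix": "209.85.238.0/27" }
  ]
}`;

// Hypothetical helper: parse the JSON text and return only the IPv4 prefixes.
function listIpv4Prefixes(jsonText) {
  const data = JSON.parse(jsonText);
  // Tolerate a missing or malformed "prefixes" field instead of throwing.
  const prefixes = Array.isArray(data.prefixes) ? data.prefixes : [];
  return prefixes
    .filter((entry) => typeof entry.ipv4Prefix === "string")
    .map((entry) => entry.ipv4Prefix);
}

// One prefix per line, matching the layout of the CSV files in this gist.
console.log(listIpv4Prefixes(sampleJson).join("\n"));
```

In practice `sampleJson` would be replaced by the contents of the downloaded file, for example read with `fs.readFileSync` in Node.js.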