@geerlingguy
Created May 7, 2014 18:48

Revisions

  1. geerlingguy created this gist May 7, 2014.
    35 changes: 35 additions & 0 deletions crawler_detect.php
    @@ -0,0 +1,35 @@
    <?php

    /**
     * Check whether the given user agent string belongs to a crawler, spider, or bot.
     *
     * @param string $user_agent
     *   A user agent string (e.g. Googlebot/2.1 (+http://www.google.com/bot.html))
     *
     * @return bool
     *   TRUE if the user agent is a bot, FALSE if not.
     */
    function smart_ip_detect_crawler($user_agent) {
      // Use a lowercase copy of the passed-in string for comparison.
      $user_agent = strtolower($user_agent);

      // Substrings commonly found in the user agent strings of bots and crawlers.
      $bot_identifiers = array(
        'bot',
        'slurp',
        'crawler',
        'spider',
        'curl',
        'facebook',
        'fetch',
      );

      // See if one of the identifiers is in the UA string.
      foreach ($bot_identifiers as $identifier) {
        if (strpos($user_agent, $identifier) !== FALSE) {
          return TRUE;
        }
      }

      return FALSE;
    }
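
A minimal, self-contained sketch of how a check like this might be called. The function name `example_is_crawler` is hypothetical and not part of the gist; it reuses the same identifier list but relies on `stripos()` for case-insensitive matching. Treating a missing `User-Agent` header as "not a bot" is an assumption, not something the gist specifies.

```php
<?php

/**
 * Hypothetical variant for illustration only; not the gist's function.
 */
function example_is_crawler($user_agent) {
  // Assumption: a missing or empty user agent is treated as "not a bot".
  if (!is_string($user_agent) || $user_agent === '') {
    return FALSE;
  }

  // Same identifier list as in the gist.
  $bot_identifiers = array('bot', 'slurp', 'crawler', 'spider', 'curl', 'facebook', 'fetch');

  foreach ($bot_identifiers as $identifier) {
    // stripos() matches case-insensitively, so no strtolower() call is needed.
    if (stripos($user_agent, $identifier) !== FALSE) {
      return TRUE;
    }
  }

  return FALSE;
}

// The header can be absent (for example on the command line), so fall back to ''.
$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
var_dump(example_is_crawler($ua));

// Fixed sample strings: the first contains "bot", the second none of the identifiers.
var_dump(example_is_crawler('Googlebot/2.1 (+http://www.google.com/bot.html)')); // bool(true)
var_dump(example_is_crawler('Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0')); // bool(false)
```

Substring matching of this kind is a heuristic: a client can send any user agent string it likes, so the result is best used for things like analytics filtering rather than access control.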