# Bot-Buster™
Tracks nefarious activity on a website and manages it accordingly.
The requesting entity:
- declares its user-agent as wget, curl, webcopier, etc. - it's probably a bot.
- requests details -> details -> details -> details ad nauseam - it's probably a bot.
- requests the HTML, but not the .css, .js or site furniture - it's probably a bot.
- generates a large number of HTTP error codes >= 400 (e.g. 401, 403, 404 and 500) - it's probably a bot.
- originates from an unlikely human traffic source (e.g. Amazon AWS) - it's probably a bot.
- sends no user-agent (or one matching a pattern of known bad ones) - it's probably a bot.
- sends no cookie, and won't honor a set cookie - it's probably a bot.
- sends no referrer, ever - it's probably a bot.
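
The request-level heuristics above could be sketched roughly as follows. This is only an illustration: the `session` object shape, the `botSignals` name, the user-agent pattern and the error-ratio threshold are all assumptions, not part of Bot-Buster itself.

```javascript
// Hypothetical pattern of user-agents that declare themselves as tools.
const BAD_AGENT = /wget|curl|webcopier/i;
// Assumed threshold: more than 30% error responses counts as "a large number".
const ERROR_RATIO = 0.3;

// Returns the list of bot signals found in a (hypothetical) session summary:
// { userAgent, returnedCookie, requests: [{ type, status, referrer }] }
function botSignals(session) {
  const signals = [];
  const ua = session.userAgent || "";
  const requests = session.requests || [];

  if (ua === "") signals.push("no user-agent");
  else if (BAD_AGENT.test(ua)) signals.push("automation user-agent");

  // HTML fetched, but never any .css, .js or other site furniture.
  const pages = requests.filter((r) => r.type === "html");
  const furniture = requests.filter((r) => r.type !== "html");
  if (pages.length > 0 && furniture.length === 0) {
    signals.push("no site furniture requested");
  }

  // Share of responses with a status code of 400 or above.
  const errors = requests.filter((r) => r.status >= 400).length;
  if (requests.length > 0 && errors / requests.length > ERROR_RATIO) {
    signals.push("high error rate");
  }

  if (requests.length > 0 && requests.every((r) => !r.referrer)) {
    signals.push("no referrer, ever");
  }
  if (!session.returnedCookie) signals.push("cookie not honored");

  return signals;
}
```

A session with several signals is a candidate for the challenge page; a single signal on its own (a missing referrer, say) is weak evidence.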

Probable bots will be presented with a CAPTCHA-style page: humans can confirm their cognisance, while bots will be trapped.

This will work at the top of the stack, using the ZTM to "manage" the offender.

One more environment to consider is the corporate network. There you are likely to find many dozens or hundreds of users with the exact same OS, browser, plugins, fonts, etc. Their IP addresses are also likely to be the same if the users are behind a corporate firewall, so neither a fingerprint nor an IP address alone reliably identifies a single client.

## JavaScript Detection:
The following client-side properties give away common automation tools:
- `window._phantom` or `window.callPhantom` (or `navigator.onLine === false && navigator.plugins.length === 0`): PhantomJS
- `window.__phantomas`: phantomas, a PhantomJS-based web performance metrics and monitoring tool
- `window.Buffer`: Node.js
- `window.emit`: CouchJS
- `window.spawn`: Rhino
- `window.webdriver`: Selenium
- `window.domAutomation` or `window.domAutomationController`: Chromium-based automation driver
- `window.outerWidth === 0 && window.outerHeight === 0`: headless browser
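
As a rough sketch, the checks above could be gathered into one function. The name `detectAutomation` and the returned labels are hypothetical. The function takes a window-like object so that it can be exercised outside a browser; in a page you would pass the real `window`. The `navigator.webdriver` check is an addition here (it is the flag defined by the W3C WebDriver specification) and is not taken from the list above.

```javascript
// Hypothetical helper: returns the labels of all automation markers found
// on a window-like object.
function detectAutomation(win) {
  const nav = win.navigator || {};
  const checks = {
    phantomjs: () =>
      Boolean(win._phantom || win.callPhantom) ||
      (nav.onLine === false && Boolean(nav.plugins) && nav.plugins.length === 0),
    phantomas: () => Boolean(win.__phantomas),
    nodejs: () => Boolean(win.Buffer),
    couchjs: () => Boolean(win.emit),
    rhino: () => Boolean(win.spawn),
    // window.webdriver is from the list; navigator.webdriver is an added assumption.
    selenium: () => Boolean(win.webdriver || nav.webdriver),
    chromiumAutomation: () =>
      Boolean(win.domAutomation || win.domAutomationController),
    headless: () => win.outerWidth === 0 && win.outerHeight === 0,
  };
  // Keep only the labels whose check fired.
  return Object.keys(checks).filter((name) => checks[name]());
}
```

An empty array means none of the markers were found. It does not prove the client is human, since all of these properties can be hidden or spoofed.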

## Create fingerprint, and store forever:

## Set bitmap:
| X-Bot | X-BotBitMap | Threat |
|---|---|---|
| 1 | 0000000000000001 | Unlikely Human Traffic Source (AWS, Azure, etc) |
| 2 | 0000000000000010 | Known Evasively Tricky Source Country |
| 4 | 0000000000000100 | Browser Integrity (Not requesting furniture) |
| 8 | 0000000000001000 | User Agent Spoof (Headers don't match User-Agent String) |
| 16 | 0000000000010000 | Unlikely Human Behaviour |
| 32 | 0000000000100000 | Honeytrap Access |
| 64 | 0000000001000000 | No Referrer |
| 128 | 0000000010000000 | Session Length Exceeded |
| 256 | 0000000100000000 | Pages Per Session Exceeded |
| 512 | 0000001000000000 | Bad User Agent |
| 1024 | 0000010000000000 | No Cookie |
| 2048 | 0000100000000000 | Generates many HTTP errors (e.g. 404s) |
| 4096 | 0001000000000000 | No JavaScript |
| 8192 | 0010000000000000 | JavaScript validation Failed |
| 16384 | 0100000000000000 | Fingerprint Validation Error |
| 32768 | 0100000000000000 | Known Automation (curl, wget, Selenium/WebDriver, PhantomJS) |
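
Because each threat occupies its own bit, several threats can be combined into a single `X-Bot` value with a bitwise OR and tested individually with a bitwise AND. The sketch below assumes that is how the two columns relate; the constant and function names are hypothetical, and only a few of the flags from the table are listed.

```javascript
// Flag values copied from the X-Bot column of the table; names are hypothetical.
const THREAT = Object.freeze({
  UNLIKELY_SOURCE: 1,
  NO_REFERRER: 64,
  BAD_USER_AGENT: 512,
  NO_COOKIE: 1024,
  NO_JAVASCRIPT: 4096,
  KNOWN_AUTOMATION: 32768,
});

// OR the flags together to get the numeric X-Bot value.
function combine(...flags) {
  return flags.reduce((acc, flag) => acc | flag, 0);
}

// Render the value as the 16-character X-BotBitMap string.
function toBitMap(value) {
  return value.toString(2).padStart(16, "0");
}

// Test whether a particular threat bit is set.
function hasThreat(value, flag) {
  return (value & flag) !== 0;
}

const xBot = combine(THREAT.NO_REFERRER, THREAT.NO_COOKIE);
console.log(xBot, toBitMap(xBot)); // 1088 0000010001000000
```

Storing the combined value keeps the full set of reasons a client was flagged, rather than just the most recent one.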