Bot & Datacenter Detection Database

Identify automated traffic from bots, scrapers, and datacenter infrastructure. This is a downloadable database of datacenter IP ranges, cloud providers, and known bot signatures that you query locally, with no network round trip per lookup. It pairs with reCAPTCHA and other challenge-response flows: use IP intelligence to pre-filter obvious bot traffic before putting real users through a CAPTCHA.

Datacenter bot coverage: ~85%
False positive rate: <2%
Database updates: Monthly

How bot detection works

Most bots operate from datacenter and cloud infrastructure rather than residential ISPs. By identifying the source network of each incoming request, you can separate likely automated scripts from genuine users before running any heavier checks.
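
As a rough illustration of what "identifying the source network" means in practice, the sketch below checks whether an address falls inside a list of CIDR ranges using Python's standard `ipaddress` module. The provider names and ranges are hypothetical placeholders drawn from documentation-only networks, not real allocations and not the database's actual contents.

```python
import ipaddress
from typing import Optional

# Hypothetical sample ranges for illustration only (reserved documentation
# networks, not real provider allocations).
DATACENTER_RANGES = {
    "example-cloud": ["203.0.113.0/24", "2001:db8:1::/48"],
    "example-hosting": ["198.51.100.0/25"],
}

def find_provider(ip: str) -> Optional[str]:
    """Return the provider whose range contains `ip`, or None if nothing matches."""
    addr = ipaddress.ip_address(ip)
    for provider, cidrs in DATACENTER_RANGES.items():
        for cidr in cidrs:
            # An IPv4 address is never "in" an IPv6 network (and vice versa),
            # so mixed lists are safe to scan.
            if addr in ipaddress.ip_network(cidr):
                return provider
    return None
```

A linear scan like this is fine for a handful of ranges; a full database would normally sit behind an indexed store or a sorted in-memory structure.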

1. Datacenter IP Range Mapping

We map IP ranges belonging to major cloud providers (AWS, GCP, Azure, DigitalOcean, Linode, Vultr, OVH, Hetzner) and hundreds of smaller hosting companies. Traffic from these ranges is almost never legitimate end-user traffic.

2. Known Bot Signature Database

We maintain a database of IP ranges used by known bots, both good (Googlebot, Bingbot, Applebot) and bad (SEO scrapers, content thieves, vulnerability scanners). Each entry is classified so you can allow legitimate crawlers while blocking bad actors.

3. Cloud Provider Identification

When a request comes from an AWS EC2 instance or a Google Cloud VM, that is a strong signal that the request is automated. Our database identifies the specific cloud provider so you can make granular decisions about which sources to trust.

4. Good Bot vs. Bad Bot Classification

Not all bots are harmful. Search engine crawlers, uptime monitors, and feed readers are beneficial. Our database marks each known bot as good or bad (the is_good_bot field) so you can build nuanced access policies instead of blocking all automated traffic.
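
The four signals above can be combined into a single access decision. The following is a minimal sketch, assuming a lookup result is available as a dict with the `is_good_bot` and `risk_level` fields; the `decide` function, the three action names, and the thresholds are illustrative choices, not part of the product.

```python
from typing import Optional

def decide(record: Optional[dict]) -> str:
    """Map a lookup result to 'allow', 'challenge', or 'block'."""
    if record is None:
        # Address is not in the database: treat as ordinary traffic.
        return "allow"
    if record.get("is_good_bot"):
        # Legitimate crawlers and monitors pass through.
        return "allow"
    if record.get("risk_level") in ("high", "critical"):
        return "block"
    # Datacenter traffic with low or medium risk gets a challenge instead.
    return "challenge"
```

Keeping the decision in one pure function makes the thresholds easy to adjust and to unit test separately from the lookup itself.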

What traffic sources we identify

Major Cloud Providers

AWS, GCP, Azure, Oracle Cloud, IBM Cloud

VPS & Hosting

DigitalOcean, Linode, Vultr, OVH, Hetzner

Search Engine Crawlers

Googlebot, Bingbot, Applebot, YandexBot

SEO Tool Bots

AhrefsBot, SemrushBot, MJ12bot, DotBot

Monitoring Services

UptimeRobot, Pingdom, StatusCake

Content Scrapers

Known scraping infrastructure and IP pools

What's included in the database

The Bot & Datacenter database is delivered as CSV and JSON files. Each record maps an IP range to its hosting provider, bot classification, and risk level.

| Field | Type | Description |
|-------|------|-------------|
| ip_start | string | Start of the IP range (IPv4 or IPv6) |
| ip_end | string | End of the IP range |
| provider | string | Name of the hosting/cloud provider (e.g., AWS, DigitalOcean, OVH) |
| type | enum | Classification: datacenter, cloud, hosting, crawler, known_bot |
| bot_name | string | Name of the known bot if identified (e.g., Googlebot, AhrefsBot, SemrushBot) |
| is_good_bot | boolean | Whether this is a legitimate crawler (search engines, monitoring tools) |
| country | string | Country code of the datacenter or hosting provider (ISO 3166-1 alpha-2) |
| risk_level | enum | Risk classification: low, medium, high, critical |
| last_seen | date | Date the IP was last confirmed as belonging to a datacenter or bot |
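
To make the field list concrete, here is a sketch that parses one CSV row into a typed record. The column order, the `"true"`/`"false"` boolean encoding, the ISO date format, and the sample row itself are assumptions for illustration; check them against the actual file before relying on this.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass(frozen=True)
class BotRecord:
    ip_start: str
    ip_end: str
    provider: str
    type: str
    bot_name: Optional[str]
    is_good_bot: bool
    country: str
    risk_level: str
    last_seen: date

def parse_row(row: dict) -> BotRecord:
    """Convert one CSV row (all strings) into a typed BotRecord."""
    return BotRecord(
        ip_start=row["ip_start"],
        ip_end=row["ip_end"],
        provider=row["provider"],
        type=row["type"],
        bot_name=row["bot_name"] or None,  # empty cell means no known bot
        # Assumption: booleans are serialized as "true"/"false" or "1"/"0".
        is_good_bot=row["is_good_bot"].strip().lower() in ("true", "1"),
        country=row["country"],
        risk_level=row["risk_level"],
        # Assumption: dates are ISO 8601 (YYYY-MM-DD).
        last_seen=date.fromisoformat(row["last_seen"]),
    )

# Hypothetical sample data using a documentation-only IP range.
SAMPLE_CSV = """ip_start,ip_end,provider,type,bot_name,is_good_bot,country,risk_level,last_seen
203.0.113.0,203.0.113.255,ExampleCloud,cloud,,false,DE,high,2024-01-15
"""

records = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
```
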
Data Formats: CSV & JSON
Update Frequency: Monthly
IP Version Support: IPv4 & IPv6

Use cases for bot detection

Automated traffic accounts for nearly half of all web traffic. Knowing which requests come from bots lets you protect your content, infrastructure, and revenue.

Content Scraping Defense

Detect and block scrapers running on cloud infrastructure that steal your pricing data, product catalogs, articles, or proprietary content for competitive advantage.

Credential Stuffing Prevention

Identify login attempts originating from datacenter IPs - a strong signal that automated tools are being used to test stolen username/password combinations against your authentication system.

Form Spam Filtering

Block automated form submissions from bots that pollute your contact forms, comment sections, and registration flows with spam content and phishing links.

Ad Fraud & Click Fraud Detection

Filter out non-human traffic from your analytics and advertising platforms. Ensure that ad clicks, impressions, and conversion events come from real users, not bots.

Inventory & Checkout Protection

Prevent scalper bots from hoarding limited-edition products, concert tickets, or flash-sale inventory. Detect datacenter-origin traffic before it reaches your checkout flow.

Good Bot Management

Distinguish between legitimate crawlers (Googlebot, Bingbot) and malicious bots. Allow search engines through while blocking everything else from datacenter IPs.
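
These use cases usually call for different strictness per endpoint: a login or checkout route can afford to block datacenter traffic outright, while a content page might only log it. The sketch below shows one way to express that; the route prefixes, action names, and `action_for` helper are hypothetical and would need to match your own application.

```python
from typing import Optional

# Hypothetical route prefixes and actions; adjust to your own application.
ROUTE_POLICY = {
    "/login": "block",
    "/checkout": "block",
    "/contact": "challenge",
}
DEFAULT_ACTION = "log"

def action_for(path: str, record: Optional[dict]) -> str:
    """Pick an action for a request path given its lookup result (or None)."""
    if record is None or record.get("is_good_bot"):
        # Unknown addresses and legitimate crawlers are never restricted here.
        return "allow"
    for prefix, action in ROUTE_POLICY.items():
        if path.startswith(prefix):
            return action
    return DEFAULT_ACTION
```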

Detect bots with a local database lookup

Import the datacenter and bot database into your preferred data store. A single IP lookup tells you whether traffic is from a datacenter, and whether it's a known good or bad bot.
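
One possible import step, sketched in Python with SQLite: addresses are converted to integers so that a range check becomes two indexed comparisons. This sketch handles IPv4 only (128-bit IPv6 values do not fit SQLite's 64-bit INTEGER type), and the `bot_ips` table layout, helper names, and sample rows are assumptions rather than a documented schema.

```python
import ipaddress
import sqlite3

def ip_to_int(ip: str) -> int:
    """Convert a dotted IPv4 string to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(ip))

def build_db(rows, path=":memory:") -> sqlite3.Connection:
    """Create the bot_ips table and load IPv4 rows into it."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE bot_ips (
               ip_start INTEGER NOT NULL,
               ip_end INTEGER NOT NULL,
               provider TEXT, type TEXT, bot_name TEXT,
               is_good_bot INTEGER, risk_level TEXT)"""
    )
    conn.executemany(
        "INSERT INTO bot_ips VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            (ip_to_int(r["ip_start"]), ip_to_int(r["ip_end"]), r["provider"],
             r["type"], r["bot_name"], int(r["is_good_bot"]), r["risk_level"])
            for r in rows
        ],
    )
    # Index the range start so lookups do not scan the whole table.
    conn.execute("CREATE INDEX idx_bot_ips_start ON bot_ips (ip_start)")
    conn.commit()
    return conn

def lookup(conn: sqlite3.Connection, ip: str):
    """Return the matching row as a tuple, or None if no range contains `ip`."""
    ip_int = ip_to_int(ip)
    return conn.execute(
        """SELECT provider, type, bot_name, is_good_bot, risk_level
           FROM bot_ips
           WHERE ip_start <= ? AND ip_end >= ?
           ORDER BY ip_start DESC
           LIMIT 1""",
        (ip_int, ip_int),
    ).fetchone()

# Hypothetical sample rows using documentation-only IP ranges.
SAMPLE_ROWS = [
    {"ip_start": "203.0.113.0", "ip_end": "203.0.113.255", "provider": "ExampleCloud",
     "type": "cloud", "bot_name": None, "is_good_bot": False, "risk_level": "high"},
    {"ip_start": "198.51.100.0", "ip_end": "198.51.100.127", "provider": "ExampleSearch",
     "type": "crawler", "bot_name": "ExampleBot", "is_good_bot": True, "risk_level": "low"},
]
```

With the table built, `lookup(build_db(SAMPLE_ROWS), "203.0.113.9")` returns the ExampleCloud row, and an address outside both ranges returns `None`.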

PHP - Bot Detection with Good Bot Allowlist
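<?php
// Look up an IP in the local bot/datacenter table; returns the matching row or null.
// Assumes ip_start/ip_end were imported as VARBINARY(16) via INET6_ATON(), so the
// range bounds compare bytewise against the converted client address.
function detect_bot(PDO $pdo, string $ip): ?array {
    $stmt = $pdo->prepare(
        "SELECT provider, type, bot_name, is_good_bot, risk_level
         FROM bot_ips
         WHERE ip_start <= INET6_ATON(:ip_lo)
           AND ip_end   >= INET6_ATON(:ip_hi)
         LIMIT 1"
    );
    // Two distinct placeholders: PDO rejects a repeated named parameter
    // when emulated prepares are disabled.
    $stmt->execute(['ip_lo' => $ip, 'ip_hi' => $ip]);
    return $stmt->fetch(PDO::FETCH_ASSOC) ?: null;
}

// $pdo is an existing PDO connection to the MySQL database holding bot_ips.
$ip = $_SERVER['REMOTE_ADDR'];
$bot = detect_bot($pdo, $ip);

if ($bot) {
    if ($bot['is_good_bot']) {
        // Allow legitimate crawlers (Googlebot, Bingbot, etc.)
        // Optionally serve a simplified page for faster crawling
        header('X-Bot-Detected: good');
    } else {
        // Datacenter or bad bot traffic
        if ($bot['risk_level'] === 'critical') {
            http_response_code(403);
            exit('Automated access is not permitted.');
        }
        // Anything below critical: serve a CAPTCHA challenge
        header('X-Bot-Detected: suspicious');
        // require_captcha() is a placeholder for your own challenge flow.
        require_captcha();
    }
}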
Python - FastAPI Middleware
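import ipaddress
import sqlite3
from contextlib import closing

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()
DB_PATH = "antiproxies_bots.db"

def check_bot(ip: str) -> dict | None:
    """Look up an IP in the local bot/datacenter database."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Not a parseable address (for example a test client's placeholder host)
        return None
    if addr.version != 4:
        # This example assumes ip_start/ip_end are stored as INTEGER, which only
        # fits IPv4; 128-bit IPv6 values overflow SQLite's 64-bit integers and
        # need a different encoding (for example 16-byte BLOBs).
        return None
    ip_int = int(addr)

    # closing() releases the connection even if the query raises
    with closing(sqlite3.connect(DB_PATH)) as conn:
        conn.row_factory = sqlite3.Row
        row = conn.execute(
            """SELECT provider, type, bot_name, is_good_bot, risk_level
               FROM bot_ips
               WHERE ip_start <= ? AND ip_end >= ?
               LIMIT 1""",
            (ip_int, ip_int),
        ).fetchone()
    return dict(row) if row else None

@app.middleware("http")
async def bot_detection_middleware(request: Request, call_next):
    # request.client can be None, so guard before reading the host
    bot = check_bot(request.client.host) if request.client else None

    if bot and not bot["is_good_bot"]:
        if bot["risk_level"] in ("high", "critical"):
            # Return the response directly: an HTTPException raised inside
            # middleware bypasses FastAPI's exception handlers and yields a 500.
            return JSONResponse(
                status_code=403,
                content={"detail": "Automated access blocked."},
            )
        # Flag suspicious traffic for logging
        request.state.bot_detected = True

    return await call_next(request)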

Want to see what's in the database?

Download once, query as many times as you need. €99/year for all 22 databases, unlimited servers, and a full year of monthly updates. No usage limits, no per-query fees, no data leaving your servers.

30-day money-back guarantee
All databases included
Monthly updates