Ki-Ki

Web foundations for SMEs

Bot mitigation for public interest and watchdog sites

Public interest sites are targeted from both ends. You need search engines and genuine readers, but you also get scrapers, scanners, and automated nonsense, sometimes from very interested networks.

I design Cloudflare based bot mitigation for advocacy, watchdog, and anti corruption projects so automated traffic is controlled, evidence is captured, and real people still get through.

Bot mitigation · Scraping control · Cloudflare rules · High risk work · Fingerprint aware

For the full public interest picture, see Advocacy, campaign, and public interest sites, Secure static sites, and Evidence grade logging.

Who bot mitigation work fits best

This is for people who care less about abstract bot scores, and more about whether their infrastructure holds up under pressure.

  • Investigative and anti corruption sites

    Projects that expect repeated scanning and scraping from organisations whose work is being examined.

  • Campaign and mobilisation platforms

    Campaigns that hit social media cycles, get brigaded, and need to stop their contact routes from drowning in junk.

  • Whistleblower and disclosure platforms

    Sites that publish guidance and host secure contact routes, where automated access is a risk, not just an annoyance.

  • NGOs and community watchdogs

    Groups that cannot afford to lose limited bandwidth and attention to constant scraping and opportunistic tools.

If you regularly see strange spikes in your logs or mysterious referrers, bot mitigation is not a luxury, it is basic hygiene.

What usually goes wrong with bots on public interest sites

Most sites either ignore bots completely or overreact and end up blocking the wrong things. Neither is great when your work is sensitive.

Scrapers treated as harmless curiosity

Automated tools quietly cloning content and hitting sensitive pages, with no attempt to distinguish them from normal readership.

Legitimate crawlers accidentally blocked

Overly aggressive rules that knock out search engines, social previews, or accessibility tools, then never get noticed until traffic drops.

Rate limits that work against real people

Limits tuned on guesswork that throttle live events, media coverage, or genuine spikes more than they slow hostile scans.

No clear picture of who is probing the site

Logs that show noise, but do not make it clear which networks are behind the most persistent or targeted automation.

Contact forms overwhelmed during disputes

Simple forms that get flooded whenever a story lands or someone encourages followers to pile on, rendering them almost useless.

Defences that nobody understands six months later

A pile of Cloudflare rules added in a rush, with nobody quite sure what they do now or what will break if they are changed.

Bot mitigation for public interest sites should be deliberate and documented, not a trail of half remembered tweaks in the hope that something helps.

How I approach bot mitigation for serious work

The target is simple. Let genuine people in, keep search engines happy, and make hostile automation pay a higher price.

  • Map your real world risks, such as known institutional interest, previous pile ons, or specific pages that attract scrutiny.
  • Tune Cloudflare to separate known good crawlers, unknown automated tools, and clearly abusive or evasive behaviour, as sketched just after this list.
  • Use rate limits and managed challenges in a way that protects critical routes without punishing normal readers.
  • Integrate evidence grade logging so that automated patterns can be documented, not just blocked. See Evidence grade logging.
  • Optionally layer lawful fingerprinting on your own domain so repeated automated devices stand out clearly over time. See Fingerprinting and Edge Tracker.
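
To make the rule layering concrete, here is a minimal sketch of the kind of custom rules involved, written as plain Python data so the logic can be reviewed before anything is deployed. The expressions use documented Cloudflare Rules language fields (cf.client.bot for verified crawlers, http.request.uri.path, ip.geoip.asnum), but the specific paths, the placeholder ASN, and the choice of pushing these through the dashboard, Terraform, or the Rulesets API are assumptions for illustration, not a drop-in configuration.

  # Sketch only: three traffic classes expressed as Cloudflare custom rules,
  # evaluated in order. Field names and actions come from the Cloudflare
  # Rules language; the paths and the ASN are placeholders, not recommendations.
  CUSTOM_RULES = [
      {
          # Verified search and social crawlers are left alone so indexing
          # and link previews keep working.
          "description": "Verified crawlers bypass bot challenges",
          "expression": "cf.client.bot",
          "action": "skip",
      },
      {
          # Unknown automation on sensitive routes gets a managed challenge
          # rather than an outright block, so false positives can recover.
          "description": "Challenge unverified clients on sensitive routes",
          "expression": '(not cf.client.bot) and '
                        '(http.request.uri.path in {"/contact" "/upload" "/documents"})',
          "action": "managed_challenge",
      },
      {
          # Behaviour you have already documented as hostile, for example a
          # network repeatedly hammering the site, is blocked outright.
          "description": "Block a documented hostile network",
          "expression": "ip.geoip.asnum eq 64496",  # ASN from the documentation range
          "action": "block",
      },
  ]

  if __name__ == "__main__":
      for rule in CUSTOM_RULES:
          print(f'{rule["action"]:>17}  {rule["description"]}')

The ordering matters: the rule that lets verified crawlers through sits first, so the later challenges never touch search engines or link previews.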

What that looks like in practice

Clear class separation

Known good crawlers, unknown automation, and obvious junk are handled differently, not thrown into one generic bucket.

Critical route protection

Contact forms, upload routes, and sensitive content paths receive extra scrutiny and challenge logic compared with public static pages.
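
As an illustration of the tuning decisions behind that, here is a hedged sketch of a rate limit scoped to a contact route, written as Python data roughly in the shape of a Cloudflare rate limiting rule. The route, the counting characteristics, and every number are placeholders to be tuned against real traffic, and the exact field names should be checked against the current Rulesets API documentation rather than copied as-is.

  # Sketch only: a rate limit scoped to contact form submissions, counted
  # per client IP. The thresholds are illustrative; real values come from
  # watching what genuine users actually do on that route.
  CONTACT_RATE_LIMIT = {
      "description": "Slow bursts against the contact form",
      # Only count POSTs to the contact endpoint, not normal page views.
      "expression": 'http.request.uri.path eq "/contact" and http.request.method eq "POST"',
      # On breach, challenge rather than hard block, so a real person caught
      # in a spike can still get through.
      "action": "managed_challenge",
      "ratelimit": {
          "characteristics": ["ip.src", "cf.colo.id"],  # per IP, per datacentre
          "period": 60,                # counting window in seconds
          "requests_per_period": 10,   # generous for humans, tight for scripts
          "mitigation_timeout": 600,   # how long the mitigation persists
      },
  }

During a legitimate spike, the challenge action matters more than the exact numbers: a supporter answers one challenge and carries on, while a script hammering the form keeps running into it.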

Evidence, not just blocking

When something is blocked or challenged, it is logged in a way that can be used later in complaints, internal reviews, or regulator correspondence.
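
To show what that can look like day to day, here is a small Python sketch that summarises blocked and challenged requests from exported firewall event logs. It assumes newline delimited JSON in the shape of Cloudflare's firewall_events Logpush dataset; the field names (Action, ClientASN, ClientRequestPath) and the action strings are my assumptions about that export and should be checked against a sample of your own logs.

  # Sketch only: summarise which networks and paths attract the most
  # blocks and challenges, from newline delimited JSON firewall events.
  import json
  import sys
  from collections import Counter

  INTERVENTIONS = {"block", "managed_challenge", "js_challenge", "challenge"}

  def summarise(lines):
      by_asn, by_path = Counter(), Counter()
      for line in lines:
          line = line.strip()
          if not line:
              continue
          event = json.loads(line)
          # Count only events where Cloudflare actually intervened; check
          # the exact action strings against your own export.
          if event.get("Action") in INTERVENTIONS:
              by_asn[event.get("ClientASN", "unknown")] += 1
              by_path[event.get("ClientRequestPath", "unknown")] += 1
      return by_asn, by_path

  if __name__ == "__main__":
      by_asn, by_path = summarise(sys.stdin)
      print("Networks behind blocked or challenged requests:")
      for asn, count in by_asn.most_common(10):
          print(f"  AS{asn}: {count}")
      print("Most challenged paths:")
      for path, count in by_path.most_common(10):
          print(f"  {path}: {count}")

Feed it an exported log file on standard input and the vague sense that "we are being scanned" becomes a short table you can drop into a complaint or an internal review.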

This is not about winning a game against bots. It is about shifting the balance so your limited human capacity is not being wasted on preventable noise.

Bots, monitoring, and institutional reality

Not all bots are random. Some automated access comes from institutions, contractors, or firms that have a direct interest in your work.

Bot mitigation for public interest sites should assume that some automated traffic is part of institutional monitoring, not just generic scanning. The answer is not to panic. The answer is to capture it, limit its impact, and be able to describe it calmly if challenged.

Proper logging and fingerprinting turn vague suspicions about who is watching your work into patterns that can be explained in plain language.

Boundaries for bot and fingerprint work

Power needs limits. That includes defensive tools.

  • Ki-Ki configures bot mitigation and logging for your own site only. I do not help anyone target or interfere with other people’s infrastructure.
  • I do not build systems that aim to secretly identify individuals by name or track them beyond your own domain.
  • Bot mitigation rules are designed to be compatible with a clear privacy position and with search engine indexing, not to create dark patterns.
  • I expect you to disclose your cookies, analytics, and fingerprinting accurately in your policies.
  • You must not misrepresent what your logs or bot controls show. They reveal patterns and behaviour, not personal motives.
  • If a proposed rule set is clearly excessive or likely to catch vulnerable users unfairly, I will say so and may refuse the work.

The wider position sits in the Cookies, analytics, and fingerprinting policy and the Neutral infrastructure policy.

Questions people usually ask about bots

Will bot mitigation stop all scraping and automation?

No. Nothing will. What it can do is make scraping slower, noisier, and easier to see in logs, while keeping the site stable for real people.

Can we still be indexed by search engines?

Yes, that is part of the job. Known search crawlers are handled separately from unknown tools, so you can protect sensitive routes without vanishing from search.

Will challenges annoy our supporters?

Occasional challenges are normal when protections are tuned for hostile traffic. The aim is to keep them rare for genuine users and heavier for suspicious patterns.

Can this help with denial of service attempts?

Cloudflare can absorb a huge amount of junk traffic when configured properly. Bot mitigation and rate limits are part of that, but they are not a full replacement for dedicated DDoS protection at higher tiers.

How does this link to evidence grade logging?

Every interesting bot rule or challenge should have a logging story. If something is blocked or slowed, you should be able to see that in your evidence logs and explain it later.

Do we need fingerprinting as well as bot controls?

Not always. Fingerprinting is best reserved for higher risk projects facing repeated targeted or institutional attention. Bot controls and logging alone already solve a lot of pain for many sites.

Start the conversation

Tell me what kind of traffic you are seeing, what worries you most, and what your current Cloudflare setup looks like. I will tell you what is realistic to fix and how heavy that work is likely to be.

No mailing lists. NDA available for sensitive projects.