Field notes

What we learn running scrapers at scale — anti-bot arms races, browser internals, compliance, and the occasional war story.


May 12, 2026 · 4 min read

TLS fingerprints are the new User-Agent

Spoofing the User-Agent header stopped working years ago. Here's why your HTTP client gets blocked before it sends a single byte of HTTP, and what to do about it.

anti-bot · http · tls

April 28, 2026 · 4 min read

When to reach for a real browser (and when not to)

A headless browser is the most expensive way to fetch a page. Here's a practical decision tree for picking the lightest engine that actually works on your target.

engines · performance · browser

March 19, 2026 · 4 min read

Scraping and the GDPR: what a French SaaS actually has to do

Public doesn't mean free-for-all. A grounded look at where web scraping sits under the GDPR, written by a company that has to answer the questionnaire.

compliance · gdpr · legal