Field notes
What we learn running scrapers at scale — anti-bot arms races, browser internals, compliance, and the occasional war story.
4 min read
TLS fingerprints are the new User-Agent
Spoofing the User-Agent header stopped working years ago. Here's why your HTTP client gets blocked before it sends a single byte of HTTP, and what to do about it.
anti-bot, http, tls
4 min read
When to reach for a real browser (and when not to)
A headless browser is the most expensive way to fetch a page. Here's a practical decision tree for picking the lightest engine that actually works on your target.
engines, performance, browser
4 min read
Scraping and the GDPR: what a French SaaS actually has to do
Public doesn't mean free-for-all. A grounded look at where web scraping sits under the GDPR, written by a company that has to answer the questionnaire.
compliance, gdpr, legal
