feat(crawl): browser-like headers, HTTP/2, curl_cffi TLS fingerprint fallback

- get_headers(url): Referer, Sec-Fetch-*, sec-ch-ua, API vs HTML Accept
- httpx AsyncClient and sync Client with optional HTTP/2 (httpx[http2] extra, which installs h2)
- On 403/429/503/520-523/525/567 or request errors, retry via curl_cffi with impersonate="chrome124"
- POST: Origin, Referer, Content-Type for form posts
- kuaidaili/ip3366: forward get_headers(url=...)
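
The changes listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: the `api` flag, the `fetch` helper, the concrete header values, and the same-origin Referer are assumptions made for the example. Only the status-code list, the `get_headers(url)` name, and the `chrome124` impersonation target come from the commit message.

```python
from urllib.parse import urlsplit

# Status codes the commit message lists as triggers for the curl_cffi retry.
FALLBACK_STATUSES = frozenset({403, 429, 503, 520, 521, 522, 523, 525, 567})


def should_fallback(status_code):
    """Return True when the response status suggests the request was blocked."""
    return status_code in FALLBACK_STATUSES


def get_headers(url, api=False):
    """Build browser-like headers; `api` (hypothetical flag) picks the Accept style."""
    parts = urlsplit(url)
    origin = f"{parts.scheme}://{parts.netloc}"
    headers = {
        # Illustrative Chrome 124 values, assumed for this sketch.
        "User-Agent": (
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
            "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
        ),
        "Referer": origin + "/",
        "sec-ch-ua": '"Chromium";v="124", "Google Chrome";v="124", "Not-A.Brand";v="99"',
        "sec-ch-ua-mobile": "?0",
        "sec-ch-ua-platform": '"Windows"',
        "Sec-Fetch-Site": "same-origin",
    }
    if api:
        # XHR/fetch-style request to a JSON endpoint.
        headers.update({
            "Accept": "application/json, text/plain, */*",
            "Sec-Fetch-Mode": "cors",
            "Sec-Fetch-Dest": "empty",
        })
    else:
        # Top-level page navigation.
        headers.update({
            "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
            "Sec-Fetch-Mode": "navigate",
            "Sec-Fetch-Dest": "document",
            "Sec-Fetch-User": "?1",
        })
    return headers


def fetch(url, timeout=10.0):
    """Try httpx over HTTP/2 first, then retry once through curl_cffi."""
    # Third-party imports are kept local so the helpers above work without them.
    import httpx
    from curl_cffi import requests as curl_requests

    headers = get_headers(url)
    try:
        # http2=True needs the h2 package (installed by httpx[http2]).
        with httpx.Client(http2=True, headers=headers, timeout=timeout,
                          follow_redirects=True) as client:
            resp = client.get(url)
        if not should_fallback(resp.status_code):
            return resp.status_code, resp.text
    except httpx.RequestError:
        pass  # connection or protocol error: fall through to curl_cffi

    # Retry with a Chrome 124 TLS fingerprint.
    resp = curl_requests.get(url, headers=headers, impersonate="chrome124",
                             timeout=timeout)
    return resp.status_code, resp.text
```

Calling `fetch("https://example.com/")` would return a `(status, body)` tuple from whichever client succeeded. Keeping `should_fallback` separate from the network code makes the retry policy easy to test without sending requests.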

Made-with: Cursor
祀梦
2026-04-05 14:40:36 +08:00
parent ce667dba13
commit 07248ff4ee
4 changed files with 234 additions and 29 deletions

@@ -6,4 +6,5 @@ aiohttp-socks==0.9.1
 beautifulsoup4==4.12.3
 lxml==5.1.0
 pydantic-settings==2.8.1
-httpx==0.27.0
+httpx[http2]==0.27.0
+curl-cffi>=0.7.0