Mirror of a627388 but for the forum image path. The same image is often
re-uploaded under different filenames across pages/posts, so an existsSync
check on the target name can't catch content duplicates. After fetching the
buffer, hash the first 64KB and compare against existing same-size files
in the target folder (same md5+size signature as gallery's duplicate
scanner). Confirmed against a known dani-speegle-2 pair:
    skip IMG_79695f8914f20ce38b07.jpg — same content as
    72759c89-7e53-4976-839a-7d952c444579.jpg
The size index from buildSizeIndex is built once per job in runForumScrape
and threaded through scrapeForumPage → downloadImage; the hash cache is
amortized across all pages in the job.
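For reference, a minimal sketch of the size + head-hash check described
above. Only buildSizeIndex is a name from this change, and its body here is
a guess at the shape, not the committed code; headHash, fileHeadHash and
findContentDuplicate are hypothetical helper names used for illustration.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');
const path = require('node:path');

const HEAD_BYTES = 64 * 1024; // only the first 64KB is hashed

// md5 of the first 64KB of an in-memory buffer
function headHash(buf) {
  return crypto.createHash('md5').update(buf.subarray(0, HEAD_BYTES)).digest('hex');
}

// Assumed shape: Map<byteSize, string[]> of regular files in the target folder
function buildSizeIndex(dir) {
  const index = new Map();
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (!entry.isFile()) continue;
    const file = path.join(dir, entry.name);
    const { size } = fs.statSync(file);
    if (!index.has(size)) index.set(size, []);
    index.get(size).push(file);
  }
  return index;
}

// Head hash of a file on disk, memoized so each file is read at most once per job
function fileHeadHash(file, hashCache) {
  if (hashCache.has(file)) return hashCache.get(file);
  const fd = fs.openSync(file, 'r');
  try {
    const buf = Buffer.alloc(HEAD_BYTES);
    const bytesRead = fs.readSync(fd, buf, 0, HEAD_BYTES, 0);
    const hash = headHash(buf.subarray(0, bytesRead));
    hashCache.set(file, hash);
    return hash;
  } finally {
    fs.closeSync(fd);
  }
}

// Returns the path of an existing file with the same size and head hash, or null
function findContentDuplicate(buffer, sizeIndex, hashCache) {
  const candidates = sizeIndex.get(buffer.length) || [];
  if (candidates.length === 0) return null; // no same-size file, nothing to hash
  const wanted = headHash(buffer);
  for (const file of candidates) {
    if (fileHeadHash(file, hashCache) === wanted) return file;
  }
  return null;
}
```

In a sketch like this, a newly saved file would also need to be pushed into
the index so that later pages in the same job can match against it.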
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
DDoS-Guard now binds session cookies to the issuing browser's fingerprint, so
a direct Node fetch returns 403 even with valid cookies. Page HTML for any
forum_site with stored cookies is now fetched via a FlareSolverr browser
session opened once per scrape job.
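A rough sketch of the once-per-job session pattern, assuming FlareSolverr's
JSON API (sessions.create / request.get / sessions.destroy posted to /v1).
The endpoint URL, the 60s timeout, and the helper names callFlareSolverr and
withBrowserSession are illustrative assumptions, not the committed code.

```javascript
// Assumed endpoint; adjust to the actual FlareSolverr deployment
const FLARESOLVERR_URL = process.env.FLARESOLVERR_URL || 'http://localhost:8191/v1';

// POST one command to FlareSolverr and return the parsed JSON reply
async function callFlareSolverr(payload, fetchImpl = fetch) {
  const res = await fetchImpl(FLARESOLVERR_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(payload),
  });
  const data = await res.json();
  if (data.status !== 'ok') throw new Error(`FlareSolverr: ${data.message}`);
  return data;
}

// Open one browser session, run the whole job against it, then tear it down
async function withBrowserSession(job, fetchImpl = fetch) {
  const { session } = await callFlareSolverr({ cmd: 'sessions.create' }, fetchImpl);
  const getHtml = async (url) => {
    const data = await callFlareSolverr(
      { cmd: 'request.get', url, session, maxTimeout: 60000 },
      fetchImpl,
    );
    return data.solution.response; // rendered page HTML
  };
  try {
    return await job(getHtml);
  } finally {
    // Always release the browser, even if the job throws
    await callFlareSolverr({ cmd: 'sessions.destroy', session }, fetchImpl);
  }
}
```

Every page in the job would then go through getHtml, so the browser
fingerprint that DDoS-Guard sees stays constant for the whole scrape.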
- Hybrid cookie refresh: FlareSolverr clears the DDoS-Guard captcha, its
  cookies seed undetected_chromedriver, Turnstile auto-solves in the real
  browser, the login form is submitted, and the final cookies and browser UA
  are persisted to forum_sites
- Per-site user_agent column so subsequent scraper requests match the UA the
  cookies were issued for (DDoS-Guard rejects UA mismatches)
- XenForo search rewritten as a proper CSRF POST to /search/search followed by
  a results-page parse, replacing the broken ?q=... GET that only returned the
  search form
- Pagination regex fallback in detectMaxPage catches XenForo pages that
  cheerio's class-based selectors miss
- New scrapers/turbo.js handles turbo.cr /embed/ and /a/ URLs by rendering
  the page via FlareSolverr and grabbing the signed mp4 from the resolved
  <video src> attribute (gallery-dl can't extract these because of
  obfuscated WASM)
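The pagination fallback can be pictured roughly as below. This is a sketch
only: it assumes XenForo-style /page-N links in the raw HTML, and
maxPageFromHtml is a hypothetical name, not the actual code inside
detectMaxPage.

```javascript
// Scan raw HTML for "/page-N" link fragments and return the highest N found.
// Falls back to 1 when no paginated links are present.
function maxPageFromHtml(html) {
  let max = 1;
  for (const match of html.matchAll(/\/page-(\d+)/g)) {
    const n = Number.parseInt(match[1], 10);
    if (n > max) max = n;
  }
  return max;
}
```

Because it works on the raw markup, a fallback of this kind does not depend
on any particular theme's class names, which is the case the selector-based
path misses.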
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- JWT-based app authentication with user roles, folder/route access control
- Dashboard with storage stats, health checks, and recent activity
- Auto-download/scrape scheduler (12h interval) with per-user and per-job configs
- Video upload, tagging, HLS transcoding, and detail pages
- New scrapers: LeakGallery, Mega (megajs), yt-dlp
- FlareSolverr integration for Cloudflare-protected sites
- Gallery: advanced filtering (date, size, search), sort modes, equal-mix shuffle
- Forum sites management with stored cookies/auth
- GridWall/GridCell components for responsive media grid
- Media API with folder-access permissions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- DRM video download pipeline with pywidevine subprocess for Widevine key acquisition
- Scraper system: forum threads, Coomer/Kemono API, and MediaLink (Fapello) scrapers
- SQLite-backed media index for instant gallery loads with startup scan
- Duplicate detection and gallery filtering/sorting
- HLS video component, log viewer, and scrape management UI
- Dockerfile updated for Python/pywidevine, docker-compose volume for CDM
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>