- Remove old Rails code (branch `remove-rails-app`; deleted 232 files: app/config/test/public/db/bin/etc. + Gemfile. Kept `infra/`, `static-site/`, `docs/`. Clears 70/71 Dependabot alerts.)
- Explore the 8 archived Heroku apps in more detail and decide what (if anything) is worth reviving or mirroring to GitHub. See `docs/heroku-source-archive.md`.
Phase 1 is live on https://pixieengine.com (cutover 2026-06-01). See docs/deploy.md, infra/, static-site/.
- Delete the staging bucket `pixieengine-com-site` now that content lives in `pixieengine-static`. (Verified no CloudFront dist used it as an origin; deleted via CloudShell 2026-06-02.)
- Delete the `pixieengine-audit` IAM user (discovery complete). See `docs/aws-inventory.md` §6. (Deferred: this user has come in handy while exploring.)
- Confirm `www.pixieengine.com` still redirects to the apex (separate dist `E1YLL4NVXDIDA5`, untouched by the cutover).
- Deployed attribution + moderation (2026-06-02). `generate.mjs` → `aws s3 sync --delete` (full, then `--size-only` delta: 3,217 up / 400 del) → CloudFront invalidation (`E2QQUW2BPHXXNP`, `/*`). Live-verified: removed → 404, adult → gate + noindex, attribution `rel=author` live, normal → 200, sitemap → 200. This was also the first deploy of the attribution rollout (creator links + 11,876 profile pages). Gotchas logged: don't pipe the sync to `tail` (it truncates the count log), and re-run `generate` after any further moderation before syncing. `--size-only` is unsafe for paginated listings: when the page count changes but keeps the same number of digits (e.g. last page `4115` → `4108`), the page's byte size is identical, so `--size-only` skips it and leaves stale "Last »" links. Use a default (mtime) sync after a regen, or the gallery/tag/profile pagination desyncs.
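The `--size-only` gotcha above can be reproduced without touching S3. The following is a minimal sketch, assuming a made-up footer template and URL shape (`renderFooter` and `/sprites/page/N/` are hypothetical, not taken from `generate.mjs`); it only shows that two different pages can have the same byte length.

```javascript
// Hypothetical pagination footer; the real markup in generate.mjs may differ.
function renderFooter(lastPage) {
  return `<a href="/sprites/page/${lastPage}/">Last »</a>`;
}

// Byte length as an object store would report it (UTF-8 encoded).
function byteLength(text) {
  return new TextEncoder().encode(text).length;
}

// A size-only comparison: upload only when the byte sizes differ.
function sizeOnlyWouldUpload(localText, remoteText) {
  return byteLength(localText) !== byteLength(remoteText);
}

const remote = renderFooter(4115); // what is already in the bucket
const local = renderFooter(4108);  // what the regen just produced

console.log(local !== remote);                   // true: the content changed
console.log(sizeOnlyWouldUpload(local, remote)); // false: same size, so it is skipped
```

A comparison that also looks at modification time (or a content hash) would catch this case, which is why the default sync is the safer choice after a regen.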
- Not a bug: `.../original.` images (~3,166). The S3 object is literally named `original.` (trailing dot, no extension) and returns 200 image/png; the live site displays these correctly. The review tool showed "no image" because the server hardcoded `original.png`; fixed to use the dataset's real `img` field (+ a `thumb.png` fallback). These are likely uploads whose source file had no extension.
- User attribution + profile pages (2026-06-02). `/<display_name>/` profile pages (×11,896, paginated) with avatar + bio + their sprites/tunes; sprite/tune cards now link to the creator. ~104,989 sprites attributed (the rest are post-2017 S3-only → Anonymous). Display_name + public bio + CDN-verified avatar only; no email/hash/token. Revises architecture decision #4 (was "no listing"); now consistent with the published recovered comment handles. Source: `static-site/extract-attribution.sh` → `build/{users.ndjson,sprite_owners.tsv,tune_owners.tsv}`. No `infra/functions/rewrite.js` change (profiles ride the dir → index.html rule). Not yet deployed: needs `aws s3 sync` (most HTML changed) + a broad CloudFront invalidation.
- Submit `https://pixieengine.com/sitemap.xml` to Google Search Console. (sitemap index → 6 children, 252,566 URLs, all 200/valid; robots.txt advertises it)
- Recover titles for the ~135k still-untitled sprites (more CF-log / Wayback mining) → re-merge via `static-site/build-dataset.mjs`, regenerate, re-sync. See `docs/wayback-recovery.md`, `docs/cloudfront-log-recovery.md`.
- (Optional) Return a true 410 for missing sprites via Lambda@Edge. Currently 404 (CloudFront custom-error responses can't emit 410; both de-index cleanly).
- Drop the `id % 4` CDN domain sharding. `build-dataset.mjs:36`, `generate.mjs:236`, and the client loader (`generate.mjs:528`) spread image/avatar URLs across `0–3.pixiecdn.com`, an HTTP/1.1-era trick for more parallel connections. All four hostnames already serve HTTP/2 (dist `E30UBGU2BPKA0U`) and point at the same distribution/origin, so sharding now costs up to 4× the DNS+TCP+TLS handshakes and breaks HTTP/2 connection reuse + HPACK sharing. Fix is generator-only (regenerate HTML, no image migration): collapse `${id % 4}` to a single existing host (e.g. always `0.pixiecdn.com`); reusing one of the four avoids any new alias/ACM cert. Then re-`generate` + `s3 sync` + invalidate.
- Enable HTTP/3 on the HTTP/2-only dists. Static site `E2QQUW2BPHXXNP` and image CDN `E30UBGU2BPKA0U` are `HTTP2` only; others in the account already run `HTTP2and3`. Static dist = one CDK line in `infra/lib/pixie-static-stack.civet` (`httpVersion: cf.HttpVersion.HTTP2_AND_3` in the `Distribution` props). Image dist is legacy/not in CDK → bump via console or `update-distribution`.
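For the sharding item above, the change amounts to swapping the host-selection function. This is an illustrative sketch only: `shardedHost`, `singleHost`, and `distinctHosts` are hypothetical names, and the ids are made up; the real code lives at the `build-dataset.mjs` / `generate.mjs` locations listed in that item.

```javascript
// Current behaviour as described above: pick one of four hosts by id.
function shardedHost(id) {
  return `${id % 4}.pixiecdn.com`;
}

// Proposed behaviour: every asset uses one existing hostname.
const SINGLE_HOST = "0.pixiecdn.com";
function singleHost(_id) {
  return SINGLE_HOST;
}

// Count how many distinct hostnames a page of assets would touch.
// Each distinct hostname means a separate DNS lookup and connection setup.
function distinctHosts(ids, hostFor) {
  return new Set(ids.map(hostFor)).size;
}

const pageIds = [101, 102, 103, 104, 105, 106]; // made-up ids for illustration
console.log(distinctHosts(pageIds, shardedHost)); // 4
console.log(distinctHosts(pageIds, singleHost));  // 1
```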
- Dependabot: all 71 open alerts (incl. all 3 critical / 17 high) are in the Rails `Gemfile.lock` and auto-resolve once `remove-rails-app` merges. The lone npm alert (#163, `brace-expansion` 5.0.5, GHSA-jxxr-4gwj-5jf2, moderate) was dismissed (`tolerable_risk`): bundled inside `aws-cdk-lib@2.257.0` (latest) via minimatch; `overrides`/`audit fix` can't rewrite a bundled dep, no upstream patch yet, build-time-only CDK dep with no real exposure. Will auto-resolve on `npm install` once AWS bundles `brace-expansion ≥ 5.0.6`.
- Rotate exposed prod secrets (see `docs/aws-inventory.md` §6):
  - `stay-pegged` AWS keys (local `[default]`): rotated.
  - GitHub token: verified dead (legacy 40-char classic PAT, returns 401; auto-expired/revoked). No action needed.
  - `ADMIN_CODE`: moot. Heroku app is at 0 dynos with no DB, and nothing in the static archive reads it. Dies with the app.
- Phase 2 (whimsy progressive enhancement): Cognito login + favorites/comments via `api-whimsy-space` + Briefcase S3. See `docs/architecture.md`.
- Design the incremental gallery / baking system, so adding or removing a sprite doesn't re-bake every gallery page (offset-pagination cascade: a front-insert shifts all ~4,100 pages). This is for the event-driven re-bake loop (DynamoDB Streams → render Lambda → S3).
- Exploration step (do first): brainstorm options beyond the starting set (list is NOT exhaustive), pin evaluation criteria (re-bake fan-out, SEO/URL stability, UX, build/per-event cost, data needs), spike the leaders against real data + the Streams→Lambda loop, then decide.
- Starting options: (A) fixed-boundary stable pages (`floor(seq/60)`, no repack, oldest=page-1, browse newest-first via Newest/Older, relative `/sprites/` landing); (B) bake page-1 + cursor-fed deep gallery (Dynamo `Query` + `LastEvaluatedKey`; sitemap is the crawl path); (D) year/month date-based buckets (`/sprites/2024/09/`, immutable past months); + others to brainstorm. Tag/profile listings already localized. Full writeup + tradeoffs: `docs/incremental-baking.md`.
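To make the re-bake fan-out concrete, here is a small simulation comparing offset pagination with option (A). It is a sketch under stated assumptions: the helper names (`bakeOffset`, `bakeStable`, `countChanged`) are hypothetical, a page is modelled as just the list of sequence numbers it contains, and whether displayed page numbers start at 0 or 1 is left open.

```javascript
const PAGE_SIZE = 60;

// Option (A): a sprite's page is fixed by its sequence number.
function stablePage(seq) {
  return Math.floor(seq / PAGE_SIZE);
}

// Offset pagination, newest first: page n is the n-th slice of the sorted list.
function bakeOffset(seqs) {
  const newestFirst = [...seqs].sort((a, b) => b - a);
  const pages = [];
  for (let i = 0; i < newestFirst.length; i += PAGE_SIZE) {
    pages.push(newestFirst.slice(i, i + PAGE_SIZE).join(","));
  }
  return pages;
}

// Fixed-boundary pagination: group sprites by stablePage(seq).
function bakeStable(seqs) {
  const pages = [];
  for (const seq of seqs) {
    const p = stablePage(seq);
    pages[p] = pages[p] ? `${pages[p]},${seq}` : String(seq);
  }
  return pages;
}

// Number of pages whose contents differ between two bakes.
function countChanged(before, after) {
  let changed = 0;
  for (let i = 0; i < Math.max(before.length, after.length); i++) {
    if (before[i] !== after[i]) changed++;
  }
  return changed;
}

const before = Array.from({ length: 300 }, (_, i) => i); // seq 0..299
const after = [...before, 300];                           // one new sprite

console.log(countChanged(bakeOffset(before), bakeOffset(after))); // 6
console.log(countChanged(bakeStable(before), bakeStable(after))); // 1
```

With offset pagination every page shifts by one item, so all pages need re-baking; with fixed boundaries only the page that receives the new sprite changes.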
- Moderation: reactive takedown process for the ~105k post-2017 sprites.
- Removal mechanism (2026-06-02). `static-site/removed.tsv` (tracked, source of truth) + `remove.mjs` appender; `generate.mjs` enforces it: removed sprites/tunes get no page (S3 404 → `/410` "Gone") and drop from every gallery/tag/profile/comment/sitemap surface; a `user` row removes the profile, de-attributes their art to Anonymous, and scrubs their comment handle. Verified end-to-end. Runbook (incl. urgent CDN-image deletion + NCMEC note): `static-site/README.md` "Removing content".
- Per-comment removal (2026-06-02). `comment` type in `removed.tsv`, addressed by `<spriteId>#<hash>` (stable content hash via `comment-key.mjs`, shared by generator + tool so it can't drift). `node remove.mjs comment <spriteId>` lists a thread's keys; the removal drops one comment's body + handle while the rest of the thread survives. Closes the gap where a `user` removal scrubbed only the handle, not an abusive comment body. Verified end-to-end.
- Intake: abuse@ address + per-page "Report" link (mailto / reuse the Feedback form).
- Proactive scan of the unreviewed content:
- Text-signal scan (2026-06-02). `moderation/scan-text.mjs` + `moderation/terms.tsv` → `build/scan-candidates.tsv` (ranked review queue). First run flagged 320 items (sev3=5 csai, sev2=289 sexual/hate, sev1=26), incl. comment-handle/body hits. Covers the 119,666 sprites (48%) with any title/tags/description, incl. 87% of the 105,542 unreviewed (CF-log-recovered titles). 5 sev-3 to action first: sprites 139734, 153693 ("pedo bear"), 185865, 185897, 258085 ("loli").
- Image classifier scan: REQUIRED for the 128,509 sprites (52%) with no text signal (incl. 13,744 unreviewed). Plan: NSFWJS (tfjs-node, local/free) PoC on the text-flagged set + a random blind-set sample to gauge accuracy on 64×64 pixel art, then full run. Caveat: NSFW classifiers are photo-trained; pixel-art recall is unproven. Output appends to `build/scan-candidates.tsv` → flows into the same review tool.
- CSAM hash-matching: the real legal exposure; a generic NSFW model does NOT detect it. Needs PhotoDNA / NCMEC / Cloudflare CSAM tool (authorized enrollment). Separate track.
- Replay scan for uploads/empties (2026-06-02). `moderation/scan-replay.mjs` fetches replay.json and counts ops (v0 stroke-array / v1 `{history}`); flags `empty` (0 ops), `single-stroke` / `single-op` / `single-resize-upload` (one paste), `large-canvas` (>256px), `no-replay`. No image decode needed; resumable cache (`build/replay-scan-cache.ndjson`); output → `build/replay-candidates.tsv` → review with `node moderation/review.mjs build/replay-candidates.tsv`. PoC (600 ids): 284 drawn, 245 pre-replay, 71 flagged (~20% of replay-era sprites). Precision strong on large uploads (1190×1540, 1280×720…) + empties; small single-op lower-confidence (review-gated). Covers the post-2017 (id ≳110k) upload-prone set.
  - Full replay scan done (2026-06-02). 149,446 ids; 121,474 drawn. 25,601 flagged: empty 3,884 (v0, truly blank, sev2 trash), no-edits 5,927 (v1 history=0; may be uploads, sev1), large-canvas 5,885, single-resize-upload 5,491, single-op 2,484, single-stroke 1,747, no-replay 183. Key finding: v1 `history=0` ≠ blank (initialState may hold an upload), so only v0-empty is auto-trash. Formats + this distinction documented in `docs/replay-format.md`. Output: `build/replay-candidates.tsv` → `node moderation/review.mjs build/replay-candidates.tsv`.
  - Review the replay candidates: `upload`/`no-edits` on merit (uploads ≠ bad).
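The replay flags above can be sketched as a small classifier. This is not the code in `moderation/scan-replay.mjs`; the replay shapes (v0 as a plain array of strokes, v1 as an object with a `history` array), the check order, the mapping of `single-stroke` to v0 and `single-op` to v1, and taking canvas size as separate arguments are all assumptions. `single-resize-upload` is left out because the source does not say how it is detected.

```javascript
const LARGE_CANVAS_PX = 256;

// Assumed shapes: v0 = array of strokes, v1 = { history: [...] }.
function countOps(replay) {
  if (Array.isArray(replay)) return { version: 0, ops: replay.length };
  if (replay && Array.isArray(replay.history)) {
    return { version: 1, ops: replay.history.length };
  }
  return null; // missing or unrecognised replay
}

function classifyReplay(replay, width, height) {
  const counted = countOps(replay);
  if (counted === null) return "no-replay";
  if (Math.max(width, height) > LARGE_CANVAS_PX) return "large-canvas";
  if (counted.ops === 0) {
    // Key distinction from the full scan: only v0 with zero ops is "empty";
    // v1 with an empty history may still hold an upload in its initial state.
    return counted.version === 0 ? "empty" : "no-edits";
  }
  if (counted.ops === 1) {
    return counted.version === 0 ? "single-stroke" : "single-op";
  }
  return "drawn";
}

console.log(classifyReplay([], 32, 32));              // "empty"
console.log(classifyReplay({ history: [] }, 32, 32)); // "no-edits"
console.log(classifyReplay(null, 32, 32));            // "no-replay"
```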
- Pixel-decode emptiness verify (2026-06-02). The replay `empty` signal (v0 ops=0) over-flags badly: pixel decode showed only 643 of 3,884 (17%) are truly empty (512 transparent + 131 uniform); 3,241 (83%) actually have content. Added zero-dep PNG decoder `moderation/png-decode.mjs` (node:zlib, 8-bit non-interlaced, color types 0/2/3/4/6) + `moderation/check-empty.mjs` → `build/empty-verified.tsv` (real empties) + `build/empty-content.tsv` (false empties → merit review). Lesson: replay signals need pixel confirmation before any "trash" claim. Decoder is reusable for the broader image-decode pass.
- Review-tool scroll-jump fixed (2026-06-02): `render()` only resets scroll on page/filter change and preserves position after a decision.
- Image-decode pass (optional follow-up): for scribble detection + the pre-replay (<110k) set + photo pixel-distribution. Zero-dep PNG decoder (node:zlib) → coverage % / distinct colors / continuous-tone. Heavier (decode ~248k). Decisions: dep (hand-roll vs sharp/pngjs) + scope. Lower priority now that replays cover uploads/empties.
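The transparent/uniform split used in the pixel-decode check above could look roughly like this once an image has been decoded. It is a sketch, not the logic in `moderation/check-empty.mjs`: the function name `classifyEmptiness` is hypothetical, and it assumes the decoder hands back a flat RGBA byte array (4 bytes per pixel), which the source does not confirm.

```javascript
// Classify decoded pixels as "transparent", "uniform", or "content".
// rgba: array-like of bytes, 4 per pixel (red, green, blue, alpha).
function classifyEmptiness(rgba) {
  if (rgba.length === 0 || rgba.length % 4 !== 0) {
    throw new RangeError("expected a non-empty RGBA buffer");
  }
  let allTransparent = true; // every pixel has alpha 0
  let allSame = true;        // every pixel equals the first pixel
  for (let i = 0; i < rgba.length; i += 4) {
    if (rgba[i + 3] !== 0) allTransparent = false;
    for (let c = 0; c < 4; c++) {
      if (rgba[i + c] !== rgba[c]) allSame = false;
    }
    if (!allTransparent && !allSame) return "content"; // early exit
  }
  return allTransparent ? "transparent" : "uniform";
}

console.log(classifyEmptiness(new Uint8Array(8)));                      // "transparent"
console.log(classifyEmptiness([255, 0, 0, 255, 255, 0, 0, 255]));       // "uniform"
console.log(classifyEmptiness([255, 0, 0, 255, 0, 0, 255, 255]));       // "content"
```

Treating fully transparent pixels as empty regardless of their RGB values is a design choice in this sketch; a stricter check could require the colour channels to match as well.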
- Review tool (2026-06-02). `moderation/review.mjs` (local server) + `review.html`: 100-at-a-time grid, arrow-key paging, integer-scaled native pixel art, comment/user views, Valid removal / 🔞 Adult / False positive buttons + multi-select & bulk (filter → select-all-shown → bulk-decide). Valid → `removed.tsv`; adult → `adult.tsv`; every decision → `moderation/reviewed.tsv` (FP allowlist, so dismissed items don't resurface). Lists `aws s3 rm` for valid sprite removals. Loads the scan queue or any id list. Shared `removed-list.mjs` / `comment-key.mjs` keep formats from drifting. First session: 269 removals, 50 FPs.
- Adult (18+) gating (2026-06-02). "Tasteful NSFW" allowed but age-restricted. `adult.tsv` (tracked, shares the list machinery); `generate.mjs` keeps the sprite page but adds `noindex` + a fail-closed self-attestation interstitial + 🔞 badge, drops `og:image`/JSON-LD, and excludes it from galleries/tags/profiles/sitemap (listable). Verified end-to-end. Static stopgap only: real "logged-in + over-18" enforcement needs the Phase-2 dynamic layer (Cognito); `adult.tsv` is the input that will drive it. Open follow-up: decide whether adult shows blurred-in-gallery vs fully hidden (currently hidden).
- Triage/review tool (page through candidates by CF-log traffic → write `removed.tsv`).
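The removal rules in this section (a sprite row drops the sprite everywhere; a `user` row de-attributes that user's art to Anonymous) can be sketched as a filter over the dataset. Everything about the data shapes here is an assumption: the two-column `type<TAB>id` layout, the `owner` field, and the helper names `parseRemoved` / `applyRemovals` are illustrative and may not match `removed.tsv` or `generate.mjs`.

```javascript
// Parse a removal list. Assumed layout: one "type<TAB>id" row per line.
function parseRemoved(tsv) {
  const removed = new Map([
    ["sprite", new Set()],
    ["user", new Set()],
  ]);
  for (const line of tsv.split("\n")) {
    const [type, id] = line.trim().split("\t");
    // Unknown types and blank lines are ignored in this sketch.
    removed.get(type)?.add(id);
  }
  return removed;
}

// Drop removed sprites, then de-attribute art owned by removed users.
function applyRemovals(sprites, removed) {
  return sprites
    .filter((s) => !removed.get("sprite").has(String(s.id)))
    .map((s) =>
      removed.get("user").has(s.owner) ? { ...s, owner: "Anonymous" } : s
    );
}

// Made-up sample data for illustration.
const removedList = parseRemoved("sprite\t12\nuser\tmallory\n");
const sprites = [
  { id: 12, owner: "alice" },
  { id: 13, owner: "mallory" },
  { id: 14, owner: "bob" },
];
console.log(applyRemovals(sprites, removedList));
```

Applying the list at generation time, rather than editing the dataset, keeps the tracked TSV as the single source of truth, which matches how the section describes it.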