Context
I operate a portfolio of four niche content sites as a long-term play — small revenue today, compounding SEO authority over time. None of them is my primary work; they sit in the background while I focus on client engagements.
The portfolio had two problems eating attention.
Problem 1: stale infrastructure. Three of the four were running on WordPress, with the usual costs — hosting fees, plugin maintenance, periodic SEO regressions from theme updates, security patches.
Problem 2: decision tax. Even with AI-assisted drafting, each post needed me to pick a topic, judge draft quality, decide ship-vs-fix, publish, then check Search Console for what worked. Multiplied by four properties, the math worked out to ~2-3h/week scattered across the calendar. The naive solution — “use AI to write the posts” — missed the real bottleneck. The bottleneck was the decision tax, not the drafting.
I solved both in two phases.
Phase 1 — WordPress → Astro migrations (3 sites, zero regression)
Three properties migrated to Astro + Cloudflare static hosting, with a disciplined process built for repeatability:
- WXR-driven content migration. Python tool converts WordPress export to Astro content collections, preserving slugs, dates, categories, and Yoast SEO metadata. URL preservation is non-negotiable — these sites have indexed authority worth keeping.
- Affiliate guard. Automated scan/baseline/verify on monetization links. Critical for revenue continuity — affiliate-monetized content can’t survive a quiet broken-link disaster.
- Image migration with rate-limit handling. The migration tool treats the origin's HTTP 415 responses as a transient backoff signal; browser-style headers plus exponential backoff recovered 181 of 187 attachments on the largest property without manual intervention.
- Pre-cutover SEO audit. Scans the built `dist/` for cross-site template contamination (caught a `siteUrl` hardcoded to the wrong domain in a cloned layout), title/description length compliance, canonical and OG tag presence.
- Staging-first deploy. Cloudflare Workers `*.workers.dev` subdomain validates the build under real Cloudflare conditions before DNS cutover. Rollback path: recreate the original A record.
- DNS snapshot. Pre-cutover state captured as a rollback reference before the old WP A record is deleted.
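The image-migration retry behaviour from the list above can be sketched roughly as follows. This is a minimal sketch, not the actual tool: the function names, header values, retry count, and delay schedule are all assumptions. The only details taken from the text are that HTTP 415 is treated as transient and that browser-style headers plus exponential backoff are used. The fetcher is injected so the retry logic can be exercised without network access.

```python
import time
import urllib.error
import urllib.request

# Assumed header set; the original tool's exact headers are not given.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
    "Accept": "image/avif,image/webp,image/*,*/*;q=0.8",
}

# 415 is treated as transient per the text; 429 and 503 are assumed additions.
TRANSIENT_STATUSES = {415, 429, 503}


def http_get(url):
    """Return (status, body) for a GET sent with browser-style headers."""
    request = urllib.request.Request(url, headers=BROWSER_HEADERS)
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as err:
        return err.code, b""


def fetch_with_backoff(url, get=http_get, max_attempts=5,
                       base_delay=1.0, sleep=time.sleep):
    """Retry transient statuses with exponentially growing delays.

    Returns the body on success, or None when the download is given up on.
    """
    for attempt in range(max_attempts):
        status, body = get(url)
        if status == 200:
            return body
        if status not in TRANSIENT_STATUSES:
            return None  # permanent failure: leave for manual follow-up
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    return None
```

Attachments that still return `None` after the last attempt would be the ones left for manual handling.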
The most recent migration (a property doing ~4.8k clicks / 678k impressions per quarter, with 8 high-volume URLs in the 5-73k impression range) cut over with all 8 GSC-priority URLs returning 200 immediately, no redirect chains, no metadata regressions. Sitemap re-submitted to GSC the same day. Hosting bill went to zero — replaced by Cloudflare’s free tier.
Three migrations in a row sharpened the playbook. Each iteration codified new edge cases (Astro 5’s glob loader, Elementor thumbnail paths, WordPress image size suffixes, cross-site template contamination). What started as a one-off port is now a productizable migration service.
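As an illustration of one of those edge cases, WordPress names resized image copies with a `-WIDTHxHEIGHT` suffix before the extension (for example `photo-300x200.jpg`). One plausible way to map such a reference back to the original upload is sketched below. The function name `strip_size_suffix` is hypothetical, and whether the actual playbook rewrites references this way is an assumption.

```python
import re

# Matches a trailing "-<width>x<height>" that sits directly before the extension.
SIZE_SUFFIX = re.compile(r"-\d+x\d+(?=\.[A-Za-z0-9]+$)")


def strip_size_suffix(path):
    """Return the path with a WordPress-style size suffix removed, if present."""
    return SIZE_SUFFIX.sub("", path)


print(strip_size_suffix("wp-content/uploads/2021/03/photo-300x200.jpg"))
# wp-content/uploads/2021/03/photo.jpg
```

A caveat with this approach: an original file that is genuinely named something like `banner-800x600.jpg` would also be rewritten, so checking that the stripped path exists in the export before substituting it seems prudent.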
Phase 2 — Autonomous content operations with a critic gate
With the infrastructure clean, I built an autonomous loop on top of it. The shape:
- Scout — pulls candidate work from Google Search Console, Reddit question threads, and Serper SERP difficulty scoring. Outputs a ranked queue.
- Picker — picks the next item from the queue, respecting cooldowns so the same slug doesn’t get touched twice in a week.
- Producer — Claude generates the artifact (a title + meta rewrite, or a full post draft).
- Critic gate — Claude scores the output against a deterministic rubric with hard pass thresholds.
  - PASS → apply the change, take a backup snapshot, `git commit + push` → Cloudflare auto-deploys → live in production within ~90 seconds.
  - FAIL → park the output in `needs-human/` for review. Loop continues to the next item.
- Weekly digest — Sunday 9am email summarizing what shipped, what’s parked (with critic verdicts), queue depth, and GSC week-over-week CTR on rewritten slugs.
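The picker's cooldown rule from the list above might look something like this. It is a sketch under assumptions: the queue is taken to be a best-first list of dicts with a `slug` key, `last_touched` is a hypothetical mapping from slug to the time of the last applied change, and the cooldown is read as seven days.

```python
from datetime import datetime, timedelta

COOLDOWN = timedelta(days=7)  # "twice in a week" read as a 7-day window (assumed)


def pick_next(queue, last_touched, now):
    """Return the first ranked item whose slug is outside its cooldown, else None.

    queue        -- list of dicts with a "slug" key, ranked best-first (assumed shape)
    last_touched -- dict mapping slug -> datetime of the last applied change
    """
    for item in queue:
        touched = last_touched.get(item["slug"])
        if touched is None or now - touched >= COOLDOWN:
            return item
    return None


queue = [{"slug": "best-widgets"}, {"slug": "widget-care"}]
last_touched = {"best-widgets": datetime(2024, 1, 8, 9, 0)}
print(pick_next(queue, last_touched, now=datetime(2024, 1, 11, 9, 0)))
# {'slug': 'widget-care'}
```

Passing `now` explicitly instead of calling `datetime.now()` inside the function keeps the rule easy to test.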
Two loop variants share that shape:
- SEO autopilot loop (3 established blogs, Mon + Thu 9am): scout finds underperforming pages on GSC (position 5-15 AND CTR ≤ 2% AND impressions ≥ 100), Claude rewrites title + meta description, critic scores 5 dimensions, PASS commits to git → Cloudflare deploys live.
- Content loop (1 young blog, Mon 9am): scout merges GSC almost-ranking queries + Reddit question threads + Serper SERP difficulty, Claude drafts a full post, critic scores 7 dimensions, PASS publishes.
Scheduled via local macOS launchd.
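The SEO autopilot scout's filter (position 5-15 AND CTR ≤ 2% AND impressions ≥ 100) is simple enough to sketch directly. The `PageStats` shape and function names are hypothetical, CTR is assumed to be stored as a fraction rather than a percentage, and ranking the queue by impressions is an assumption, since the text does not say how the queue is ordered.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    slug: str
    position: float   # average search position
    ctr: float        # fraction: 0.02 means 2% (assumed representation)
    impressions: int


def is_underperforming(page):
    """Apply the three thresholds named in the text."""
    return (
        5 <= page.position <= 15
        and page.ctr <= 0.02
        and page.impressions >= 100
    )


def build_queue(pages):
    """Keep underperforming pages, highest impressions first (assumed ranking)."""
    candidates = [p for p in pages if is_underperforming(p)]
    return sorted(candidates, key=lambda p: p.impressions, reverse=True)


pages = [
    PageStats("ranks-well", position=2.1, ctr=0.11, impressions=900),
    PageStats("almost-there", position=8.4, ctr=0.012, impressions=1500),
    PageStats("too-quiet", position=9.0, ctr=0.01, impressions=40),
]
print([p.slug for p in build_queue(pages)])
# ['almost-there']
```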
The critic is the entire game
The system runs unattended because of one design choice: Claude judges Claude’s output against a deterministic rubric, with hard thresholds deciding ship vs. park.
Rubric for SEO rewrites (0-10 each):
- `keyword_preserved`: does the new title still rank for the original term?
- `length_compliance`: title ≤ 60 chars, description 130-155 chars (mechanical, not opinion)
- `ctr_uplift_plausibility`: is the new version actually more clickable?
- `brand_voice`: matches the property’s established tone?
- `factual_safety`: any hallucinated claims? Any unprovable promises?
Pass requires every dimension ≥ 7, with stricter floors of `length_compliance` ≥ 9 and `factual_safety` ≥ 8. The thresholds aren’t fungible: a high score on one dimension can’t offset a low score on another. Length is mechanical and absolute; a 65-character title is broken, no opinion needed. Factual safety is strict because one of the properties is YMYL (Your Money or Your Life), and health-adjacent content can’t ship hallucinated claims.
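The pass rule can be expressed as a small deterministic gate applied to whatever scores the critic returns. This is a sketch: the function names and the shape of the `scores` dict are assumptions, while the dimension names and thresholds come from the rubric above. A missing dimension is treated as a failure, which is also an assumption.

```python
DIMENSIONS = (
    "keyword_preserved",
    "length_compliance",
    "ctr_uplift_plausibility",
    "brand_voice",
    "factual_safety",
)
DEFAULT_MIN = 7
STRICT_MIN = {"length_compliance": 9, "factual_safety": 8}


def length_ok(title, description):
    """Mechanical length check: title <= 60 chars, description 130-155 chars."""
    return len(title) <= 60 and 130 <= len(description) <= 155


def gate(scores):
    """Return ("PASS", []) or ("FAIL", [names of failing dimensions]).

    Each dimension is checked against its own floor, so a high score on one
    dimension cannot compensate for a low score on another.
    """
    failures = [
        dim for dim in DIMENSIONS
        if scores.get(dim, 0) < STRICT_MIN.get(dim, DEFAULT_MIN)
    ]
    return ("PASS" if not failures else "FAIL", failures)


scores = {
    "keyword_preserved": 9,
    "length_compliance": 8,   # above the default floor, below the strict one
    "ctr_uplift_plausibility": 8,
    "brand_voice": 7,
    "factual_safety": 9,
}
print(gate(scores))
# ('FAIL', ['length_compliance'])
```

Keeping the threshold comparison in plain code, outside the model call, means the ship-vs-park decision itself stays deterministic even though the scores come from a model.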
The critic is stricter than I would be manually. That’s deliberate. The system bias should be “park questionable work for human review,” not “ship anything that looks fine.” First smoke test: 1 of 3 proposals parked (33%) — critic caught a real grammar bug and unprovable claims that I’d likely have shipped manually.
Engineering decisions worth flagging
- Local launchd over remote agents. GSC OAuth and wrangler credentials are cached locally; moving them remote meant either committing tokens or rewriting the auth flow. Local launchd uses what’s already there. Tradeoff: jobs fire only when the laptop is awake. For Mon/Thu 9am on a daily-use machine, fine.
- Explicit `git add` paths, never `git add .` A monorepo full of unrelated WIP means one wrong glob sweeps unrelated work into autonomous commits. Every orchestrator stages exact files by path.
- Three-level kill switch. `touch .tmp/PAUSE` halts one property. A bulk helper halts all jobs across the portfolio. Auto-pause triggers if `needs-human/` accumulates more than 3 parked items.
- Weekly digest as the only inbox-touching surface. No per-run pings, no per-park alerts. One Sunday email per property. If I never open the project repo, I still know what happened.
- Copy-paste over premature abstraction. Four nearly-identical helpers (one per blog). No shared package. Refactor when there’s a 5th property, not before.
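The per-property side of the kill switch could be checked at the top of each run along these lines. This is a sketch assuming that `.tmp/PAUSE` and `needs-human/` both live under each property's directory; the function name `should_run` and that layout are hypothetical, and the bulk helper is not shown.

```python
from pathlib import Path

MAX_PARKED = 3  # auto-pause once more than 3 items are parked


def should_run(property_dir):
    """Return False if the property is paused manually or by parked-item count."""
    root = Path(property_dir)

    # Manual pause: created with `touch .tmp/PAUSE`.
    if (root / ".tmp" / "PAUSE").exists():
        return False

    # Auto-pause: too many outputs waiting for human review.
    parked_dir = root / "needs-human"
    if parked_dir.is_dir():
        parked = [p for p in parked_dir.iterdir() if p.is_file()]
        if len(parked) > MAX_PARKED:
            return False

    return True
```

An orchestrator would call `should_run(...)` before the scout step and exit quietly when it returns `False`, so a paused property produces no commits.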
Results
- Time freed: ~2-3h/week scattered across the calendar → ~5 min reading the Sunday email.
- First live shipment: a category page deployed within 90 seconds of critic PASS — title shortened from 63 to 45 chars, description rewritten to a concrete benefit-driven phrasing.
- First parked item: the critic caught a grammar agreement error plus unprovable claims that I’d have shipped manually.
- Code: ~14 Python files, 8 launchd plists, 4 shell orchestrators, ~3,000 lines total.
- Cost ceiling: ~$2-3/month total across 4 properties. The system trivially scales to 10× volume before hitting Resend or Claude free-tier limits.
Why this engagement matters
Two things compound here.
Infrastructure that earns its keep. The migration playbook is repeatable now — three sites in a row, each iteration codifying new edge cases. Productizable as a service offering. Validated on a property with meaningful organic traffic, not a sandbox.
A reusable agentic pattern, not a blog-specific hack. Most “AI content automation” is producer-only and ships whatever the model generates. The critic gate is what makes the system safe to leave running unattended. The same shape works for cold-email outbound, sales follow-up sequences, content moderation queues, lead enrichment QA — anywhere you have (1) a deterministic queue source, (2) a Claude producer with a clear contract, (3) a Claude critic with a rigorous rubric and hard thresholds, (4) a kill switch, (5) a weekly digest you actually read.
Skip any of those and the system either does nothing useful or quietly ships garbage. With all five, it runs while you sleep.