From WordPress migration to autonomous content operations across 4 properties

Context

I operate a portfolio of four niche content sites as a long-term play — small revenue today, compounding SEO authority over time. None of them is my primary work; they sit in the background while I focus on client engagements.

The portfolio had two problems eating attention.

Problem 1: stale infrastructure. Three of the four were running on WordPress, with the usual costs — hosting fees, plugin maintenance, periodic SEO regressions from theme updates, security patches.

Problem 2: decision tax. Even with AI-assisted drafting, each post needed me to pick a topic, judge draft quality, decide ship-vs-fix, publish, then check Search Console for what worked. Across four properties, that came to ~2-3h/week scattered across the calendar. The naive solution (“use AI to write the posts”) missed the real bottleneck, which was the decision tax, not the drafting.

I solved both in two phases.

Phase 1 — WordPress → Astro migrations (3 sites, zero regression)

Three properties migrated to Astro + Cloudflare static hosting, with a disciplined process built for repeatability:

WXR-driven content migration. A Python tool converts the WordPress export (WXR) to Astro content collections, preserving slugs, dates, categories, and Yoast SEO metadata. URL preservation is non-negotiable because these sites have indexed authority worth keeping.
Affiliate guard. Automated scan/baseline/verify on monetization links. Critical for revenue continuity — affiliate-monetized content can’t survive a quiet broken-link disaster.
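Image migration with rate-limit handling. The origin uses HTTP 415 (nominally Unsupported Media Type) as its throttling response, so the tool treats it as a transient backoff signal rather than a permanent failure; browser-style headers plus exponential backoff recovered 181 of 187 attachments on the largest property without manual intervention.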
Pre-cutover SEO audit. Scans the built dist/ for cross-site template contamination (it caught a siteUrl hardcoded to the wrong domain in a cloned layout) and checks title/description length compliance plus canonical and OG tag presence.
Staging-first deploy. Cloudflare Workers *.workers.dev subdomain validates the build under real Cloudflare conditions before DNS cutover. Rollback path: recreate the original A record.
DNS snapshot. Pre-cutover state captured as a rollback reference before the old WP A record is deleted.
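
The retry behavior in the image-migration step can be sketched roughly as below. This is an illustration under stated assumptions, not the actual tool: the function name fetch_with_backoff, the injected fetch callable, the header values, the retry count, and the extra retryable codes (429, 503) are all hypothetical. Only the ideas of sending browser-style headers, treating HTTP 415 as transient, and backing off exponentially come from the description above.

```python
import time
from typing import Callable, Optional, Tuple

# Hypothetical browser-style headers; the real values are not given in the text.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
    "Accept": "image/avif,image/webp,image/*,*/*;q=0.8",
}

# Status codes treated as "try again later". 415 is included because this
# origin used it that way; 429 and 503 are assumed, conventional additions.
RETRYABLE = {415, 429, 503}


def fetch_with_backoff(
    url: str,
    fetch: Callable[[str, dict], Tuple[int, bytes]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[bytes]:
    """Return the body on HTTP 200, or None once retries are exhausted."""
    for attempt in range(max_attempts):
        status, body = fetch(url, BROWSER_HEADERS)
        if status == 200:
            return body
        if status not in RETRYABLE:
            return None  # permanent failure such as 404: do not retry
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    return None
```

Passing fetch and sleep in as parameters keeps the retry logic testable without touching the network; in a real run, fetch would wrap whatever HTTP client the migration tool already uses.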

The most recent migration (a property doing ~4.8k clicks / 678k impressions per quarter, with 8 high-volume URLs in the 5k-73k impression range) cut over with all 8 GSC-priority URLs returning 200 immediately, no redirect chains, no metadata regressions. Sitemap re-submitted to GSC the same day. Hosting bill went to zero, replaced by Cloudflare’s free tier.

Three migrations in a row sharpened the playbook. Each iteration codified new edge cases (Astro 5’s glob loader, Elementor thumbnail paths, WordPress image size suffixes, cross-site template contamination). What started as a one-off port is now a productizable migration service.
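
The cross-site contamination check is the kind of edge case that is easy to express in code. The sketch below is a hypothetical, per-page version of the pre-cutover audit using only the standard library. The names audit_page and _HeadCollector are illustrative, and the length limits (title ≤ 60, description 130-155) are borrowed from the critic rubric later in this write-up on the assumption that the audit uses the same numbers.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class _HeadCollector(HTMLParser):
    """Collects <title>, meta description, canonical, and og:* tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self.canonical = None
        self.og = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and a.get("name") == "description":
            self.description = a.get("content") or ""
        elif tag == "meta" and (a.get("property") or "").startswith("og:"):
            self.og[a["property"]] = a.get("content") or ""
        elif tag == "link" and a.get("rel") == "canonical":
            self.canonical = a.get("href") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit_page(html: str, expected_host: str) -> list:
    """Return a list of human-readable issues for one built page."""
    head = _HeadCollector()
    head.feed(html)
    issues = []
    if head.canonical is None:
        issues.append("missing canonical")
    elif urlparse(head.canonical).hostname != expected_host:
        # A canonical on another domain is the contamination signature.
        issues.append(f"canonical points at {urlparse(head.canonical).hostname}")
    if "og:title" not in head.og:
        issues.append("missing og:title")
    if len(head.title.strip()) > 60:
        issues.append("title over 60 chars")
    if head.description is None:
        issues.append("missing meta description")
    elif not 130 <= len(head.description) <= 155:
        issues.append("description outside 130-155 chars")
    return issues
```

A full audit would call audit_page on every .html file under dist/ and fail the cutover if any page returns a non-empty list.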

Phase 2 — Autonomous content operations with a critic gate

With the infrastructure clean, I built an autonomous loop on top of it. The shape:

Scout — pulls candidate work from Google Search Console, Reddit question threads, and Serper SERP difficulty scoring. Outputs a ranked queue.
Picker — picks the next item from the queue, respecting cooldowns so the same slug doesn’t get touched twice in a week.
Producer — Claude generates the artifact (a title + meta rewrite, or a full post draft).
Critic gate — Claude scores the output against a deterministic rubric with hard pass thresholds.
- PASS → apply the change, take a backup snapshot, git commit + push → Cloudflare auto-deploys → live in production within ~90 seconds.
- FAIL → park the output in needs-human/ for review. Loop continues to the next item.
Weekly digest — Sunday 9am email summarizing what shipped, what’s parked (with critic verdicts), queue depth, and GSC week-over-week CTR on rewritten slugs.
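
One pass through that shape can be sketched as follows. This is a minimal outline, not the real orchestrator: pick_next, run_once, the dict-based queue items, and the injected produce / critique / ship / park callables are all hypothetical names. The 7-day cooldown follows the picker rule above; starting the cooldown for parked items as well as shipped ones is an assumption.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, List, Optional

COOLDOWN = timedelta(days=7)  # same slug is not touched twice in a week


def pick_next(queue: List[dict], last_touched: Dict[str, datetime],
              now: datetime) -> Optional[dict]:
    """Return the highest-ranked item whose slug is outside its cooldown."""
    for item in queue:  # queue is assumed to be sorted best-first by the scout
        touched = last_touched.get(item["slug"])
        if touched is None or now - touched >= COOLDOWN:
            return item
    return None


def run_once(queue: List[dict], last_touched: Dict[str, datetime],
             now: datetime,
             produce: Callable[[dict], str],
             critique: Callable[[dict, str], bool],
             ship: Callable[[dict, str], None],
             park: Callable[[dict, str], None]) -> str:
    """One pass: picker -> producer -> critic gate -> ship or park."""
    item = pick_next(queue, last_touched, now)
    if item is None:
        return "idle"
    artifact = produce(item)
    if critique(item, artifact):
        ship(item, artifact)   # apply, snapshot, commit, push
        outcome = "shipped"
    else:
        park(item, artifact)   # e.g. write into needs-human/
        outcome = "parked"
    # Assumption: the cooldown starts whether the item shipped or parked.
    last_touched[item["slug"]] = now
    return outcome
```

Keeping the producer and critic behind plain callables means the same loop body can serve both a title rewrite and a full post draft; only the callables change.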

Two loop variants share that shape:

SEO autopilot loop (3 established blogs, Mon + Thu 9am): scout finds underperforming pages on GSC (position 5-15 AND CTR ≤ 2% AND impressions ≥ 100), Claude rewrites title + meta description, critic scores 5 dimensions, PASS commits to git → Cloudflare deploys live.
Content loop (1 young blog, Mon 9am): scout merges GSC almost-ranking queries + Reddit question threads + Serper SERP difficulty, Claude drafts a full post, critic scores 7 dimensions, PASS publishes.
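
The SEO autopilot filter is simple enough to show directly. The sketch below assumes Search Console rows have already been fetched and flattened into a small dataclass; PageStats, its field names, and the impressions-first ordering are illustrative choices, while the three thresholds are the ones stated above.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class PageStats:
    """One row of Search Console page data (field names are illustrative)."""
    url: str
    position: float    # average position
    ctr: float         # fraction, so 0.02 means 2%
    impressions: int


def underperformers(rows: Iterable[PageStats]) -> List[PageStats]:
    """Pages ranking 5-15 with CTR <= 2% and at least 100 impressions."""
    hits = [
        r for r in rows
        if 5 <= r.position <= 15 and r.ctr <= 0.02 and r.impressions >= 100
    ]
    # Assumed ranking: most impressions first, since that is where a CTR
    # improvement is worth the most clicks.
    return sorted(hits, key=lambda r: r.impressions, reverse=True)
```

All three conditions must hold at once: a page in position 3 is already performing, and a page with 40 impressions has too little data to justify a rewrite.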

Both loops are scheduled via local macOS launchd.

The critic is the entire game

The system runs unattended because of one design choice: Claude judges Claude’s output against a deterministic rubric, with hard thresholds deciding ship vs. park.

Rubric for SEO rewrites (0-10 each):

keyword_preserved — does the new title still rank for the original term?
length_compliance — title ≤ 60 chars, description 130-155 chars (mechanical, not opinion)
ctr_uplift_plausibility — is the new version actually more clickable?
brand_voice — matches the property’s established tone?
factual_safety — any hallucinated claims? Any unprovable promises?

Pass requires all dimensions ≥ 7, length_compliance ≥ 9, factual_safety ≥ 8. The thresholds aren’t fungible: a high score on one dimension cannot buy back a low score on another. Length is mechanical and absolute; a 65-character title is broken, no opinion needed. Factual safety is strict because one of the properties is YMYL (“Your Money or Your Life”), and health-adjacent content can’t ship hallucinated claims.
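
The pass rule reduces to a few lines once the critic has returned its scores. The sketch below assumes the scores arrive as a dict of integers keyed by dimension name; gate, lengths_ok, and the constant names are hypothetical. Treating a missing dimension as a failure is an assumption, chosen to match the "park when in doubt" bias.

```python
from typing import Dict

# Floors from the rubric: 7 everywhere, with two stricter dimensions.
DEFAULT_FLOOR = 7
STRICT_FLOORS = {"length_compliance": 9, "factual_safety": 8}
DIMENSIONS = (
    "keyword_preserved",
    "length_compliance",
    "ctr_uplift_plausibility",
    "brand_voice",
    "factual_safety",
)


def lengths_ok(title: str, description: str) -> bool:
    """Mechanical length check: title <= 60 chars, description 130-155."""
    return len(title) <= 60 and 130 <= len(description) <= 155


def gate(scores: Dict[str, int]) -> bool:
    """True means PASS (ship); False means park for human review."""
    for dim in DIMENSIONS:
        # A missing dimension counts as a failure rather than a free pass.
        if scores.get(dim, -1) < STRICT_FLOORS.get(dim, DEFAULT_FLOOR):
            return False
    return True
```

Because length is purely mechanical, a check like lengths_ok could also run in plain code as a backstop to the model's own length_compliance score; whether the real system does this is not stated here.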

The critic is stricter than I would be manually. That’s deliberate. The system bias should be “park questionable work for human review,” not “ship anything that looks fine.” First smoke test: 1 of 3 proposals parked (33%) — critic caught a real grammar bug and unprovable claims that I’d likely have shipped manually.

Engineering decisions worth flagging

Local launchd over remote agents. GSC OAuth and wrangler credentials are cached locally; moving them remote meant either committing tokens or rewriting the auth flow. Local launchd uses what’s already there. Tradeoff: jobs fire only when the laptop is awake. For Mon/Thu 9am on a daily-use machine, fine.
Never “git add .”, always explicit git add paths. A monorepo full of unrelated WIP means one wrong glob sweeps unrelated work into autonomous commits. Every orchestrator stages exact files by path.
Three-level kill switch. touch .tmp/PAUSE halts one property. A bulk helper halts all jobs across the portfolio. Auto-pause triggers if needs-human/ accumulates more than 3 parked items.
Weekly digest as the only inbox-touching surface. No per-run pings, no per-park alerts. One Sunday email per property. If I never open the project repo, I still know what happened.
Copy-paste over premature abstraction. Four nearly-identical helpers (one per blog). No shared package. Refactor when there’s a 5th property, not before.
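
The per-property levels of the kill switch can be sketched as a single pre-flight check that each orchestrator runs before doing anything else. This is a hypothetical outline: should_run is an invented name, and it assumes .tmp/PAUSE and needs-human/ both live under the property's directory with one file per parked item.

```python
from pathlib import Path

MAX_PARKED = 3  # auto-pause once needs-human/ holds more than this


def should_run(property_dir: Path) -> bool:
    """False if the property is paused manually or by parked-item backlog."""
    if (property_dir / ".tmp" / "PAUSE").exists():
        return False  # manual pause: someone ran `touch .tmp/PAUSE`
    parked_dir = property_dir / "needs-human"
    if parked_dir.is_dir():
        parked = [p for p in parked_dir.iterdir() if p.is_file()]
        if len(parked) > MAX_PARKED:
            return False  # backlog too deep: wait for a human
    return True
```

Under this layout, the portfolio-wide helper would only need to create the PAUSE file in every property directory, so all three levels converge on the same check.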

Results

Time freed: ~2-3h/week scattered across the calendar → ~5 min reading the Sunday email.
First live shipment: a category page deployed within 90 seconds of critic PASS, with the title shortened from 63 to 45 chars and the description rewritten with concrete, benefit-driven phrasing.
First parked item: the critic caught a grammar agreement error plus unprovable claims that I’d have shipped manually.
Code: ~14 Python files, 8 launchd plists, 4 shell orchestrators, ~3,000 lines total.
Cost ceiling: ~$2-3/month total across 4 properties. The system trivially scales to 10× volume before hitting Resend or Claude free-tier limits.

Why this engagement matters

Two things compound here.

Infrastructure that earns its keep. The migration playbook is repeatable now — three sites in a row, each iteration codifying new edge cases. Productizable as a service offering. Validated on a property with meaningful organic traffic, not a sandbox.

A reusable agentic pattern, not a blog-specific hack. Most “AI content automation” is producer-only and ships whatever the model generates. The critic gate is what makes the system safe to leave running unattended. The same shape works for cold-email outbound, sales follow-up sequences, content moderation queues, lead enrichment QA — anywhere you have (1) a deterministic queue source, (2) a Claude producer with a clear contract, (3) a Claude critic with a rigorous rubric and hard thresholds, (4) a kill switch, (5) a weekly digest you actually read.

Skip any of those and the system either does nothing useful or quietly ships garbage. With all five, it runs while you sleep.