AI Planner: Bring Your Own AI, Pay Nothing to PSW#

What it is#

The AI planner is what decides how your apps map onto your hardware:

  • how many LXC targets to create,
  • which apps go on which target,
  • how many cores, how much RAM, and how big a rootdisk each target gets,
  • which target gets GPU passthrough,
  • which target gets which USB dongle,
  • where bulk data lives (local ZFS vs NFS from a storage node),
  • what the storage layout looks like on each Proxmox node.

PSW does not call any AI itself. The planner is a two-step copy/paste protocol:

  1. PSW renders a prompt. You click Copy (or run psw node plan) and get a self-contained Markdown prompt with everything the AI needs: your hardware inventory, your app selection, the storage rules, the placement rules, and the exact JSON schema the reply must match.
  2. You paste it into any AI. ChatGPT, Claude.ai, Gemini, a local Ollama model, a paid subscription you already have, whatever. The AI returns one JSON object. You paste that back into PSW.

PSW validates the JSON against the schema and writes the plan atomically. If something is wrong, you get a pointer-and-message list you can paste back to your AI for a corrected reply.
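The atomic write matters: a crash mid-write must never leave a half-valid deployment-plan.yml on disk. A minimal sketch of the usual write-then-rename pattern (PSW's actual implementation may differ):

```python
import os
import tempfile

def write_atomically(path: str, text: str) -> None:
    """Write text via a same-directory temp file plus os.replace, which
    is atomic on POSIX: readers see either the old file or the new one,
    never a partial write."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())  # make the bytes durable before renaming
        os.replace(tmp, path)     # atomic rename over the target
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)        # clean up the temp file on failure
        raise
```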

Why it’s built this way#

  1. No vendor lock-in. Your AI, your choice — today’s best model, tomorrow’s different one. PSW doesn’t care.
  2. No API keys in PSW. The Start screen collects no LLM credentials. The fewer secrets PSW holds, the smaller its blast radius.
  3. No LLM SDK dependency. anthropic, openai, google-generativeai are not in PSW’s pyproject.toml.
  4. Good failure mode. When the AI gets it wrong, you re-ask your AI. The round-trip is interactive; you know your AI better than PSW could.

The round-trip#

┌──────────────┐  (1) show prompt   ┌──────────────┐  (2) paste response  ┌──────────────┐
│  PSW wizard  │ ─────────────────▶ │  your AI     │ ───────────────────▶ │  PSW wizard  │
│     or       │                    │ (any vendor) │                      │     or       │
│   psw CLI    │                    │              │                      │   psw CLI    │
└──────────────┘                    └──────────────┘                      └──────────────┘
       ▲                                                                          │
       │                                                                          │
       └──────────────────── (3a) validation errors ─────────────────────────────┘
                              (paste back to AI, iterate)
       ┌──────────────────── (3b) valid → write files, advance ──────────────────┐
       │                                                                         ▼
       │                                                       deployment-plan.yml
       │                                                       nodes/<n>/storage.yml

What’s in the prompt#

Six sections, always in this order:

  1. Role & tone. Tells the AI it’s emitting a machine-readable plan.
  2. Inputs. JSON-ified: your user_apps + each node’s hardware.yml.
  3. Storage rules. The full contents of PSW’s storage-ai-guide.md, verbatim. This is the authoritative ZFS guide — ashift selection, recordsize per workload, compression choices, SLOG/L2ARC rules, cluster NFS patterns, replication planning.
  4. Placement rules. The shapes table (small box → 2 targets, big box → full category split), the GPU ladder (reserve AI-class GPUs for AI even if no AI apps selected), USB pinning, NFS overlays, resource sizing formulas, consolidation strategy.
  5. App catalog. The placement:, resources:, and declared storage[] entries for every app you picked.
  6. Output contract. Strict rules (“return ONE JSON object, no prose, no fences”) followed by the JSON schema the reply must match — generated at render time from PSW’s pydantic models.

Total prompt size: ~70 KB. Every capable AI can read it in one shot.
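Assembling those sections is plain string concatenation; a minimal sketch (the build_prompt helper and section bodies are hypothetical, but the ordering matches the list above):

```python
# Fixed section order, as documented above.
SECTION_ORDER = [
    "Role & tone",
    "Inputs",
    "Storage rules",
    "Placement rules",
    "App catalog",
    "Output contract",
]

def build_prompt(sections: dict[str, str]) -> str:
    """Render the six sections, always in the same order, as one
    self-contained Markdown document."""
    missing = [name for name in SECTION_ORDER if name not in sections]
    if missing:
        raise ValueError(f"missing sections: {missing}")
    parts = [f"## {name}\n\n{sections[name]}" for name in SECTION_ORDER]
    return "\n\n".join(parts)

prompt = build_prompt({name: f"({name} body)" for name in SECTION_ORDER})
```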

What the AI returns#

One JSON object shaped like PlannerOutput:

{
  "plan": {
    "user_apps": ["jellyfin", "postgres", "..."],
    "targets": {
      "core": {"node": "homelab", "apps": ["..."], "resources": {...}, "devices": {...}, "rationale": "..."},
      "media": {"node": "homelab", "apps": ["..."], "resources": {...}, "devices": {"gpu": {...}, "usb": [...]}, "rationale": "..."}
    },
    "cluster": {"roles": {"homelab": "single"}, "reserved_gpus": []},
    "generator_version": "v1"
  },
  "storage": {
    "homelab": {
      "zfs_pools": {"rpool": {...}},
      "nfs_mounts": [...]
    }
  },
  "warnings": ["No GPU class 'ai' available — dropped localai from the selection."]
}

NFS exports + mounts live in storage.yml, not in the plan — the planner emits them under storage.<node>.zfs_pools[].datasets[].nfs_export and storage.<node>.nfs_mounts[]. Each target’s GPU + USB classification lives in targets.<name>.devices (gpu.render_device, usb[].type); the planner picks the class from each node’s hardware.yml, which only carries identification (PCI address, vid:pid, by-id name) — see hardware.
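To make the split concrete, here is how a consumer would pull device pins and NFS mounts out of a parsed response. The field paths mirror the PlannerOutput shape above; the concrete values, and the inner fields of each nfs_mounts entry, are invented for illustration:

```python
import json

# Invented example values; only the field paths follow the documented shape.
response = json.loads("""
{
  "plan": {"targets": {"media": {
      "node": "homelab",
      "devices": {
        "gpu": {"render_device": "/dev/dri/renderD128"},
        "usb": [{"type": "zigbee", "by_id_name": "usb-dongle-example"}]
      }}}},
  "storage": {"homelab": {"nfs_mounts": [
      {"server": "storage01", "export": "/tank/media", "mountpoint": "/mnt/media"}
  ]}}
}
""")

# GPU + USB classification lives under plan.targets.<name>.devices.
media = response["plan"]["targets"]["media"]
gpu = media["devices"]["gpu"]["render_device"]
usb_types = [u["type"] for u in media["devices"]["usb"]]

# NFS mounts live under storage.<node>.nfs_mounts, not in the plan.
mounts = response["storage"][media["node"]]["nfs_mounts"]
```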

Validation#

Before PSW writes anything to disk, the response parser runs:

  1. Fence strip. ```json wrappers are peeled. Prose before/after the top-level {...} is tolerated.
  2. json.loads. JSON-level errors (unbalanced braces, missing commas) surface as root: ....
  3. Schema validation. Pydantic validates every field. Errors come back with JSON pointers like plan.targets.media.resources.memory_mb.
  4. Cross-field checks.
    • generator_version must match PSW’s current version.
    • Every target.node must name a node with a hardware.yml.
    • Every target name is snake_case (lowercase letters, digits, underscores). Hyphenated names are auto-normalized on load (home-automation → home_automation) — they end up as Ansible inventory groups and Jinja variable fragments downstream, both of which need Python-identifier characters.
    • Every USB pinned to a target must reference a by_id_name that exists in the node’s hardware.yml. The type (one of zigbee, zwave, coral, serial) is the planner’s classification of that dongle and must match a usb_class entry in the consuming app’s placement.hardware ordered preference list. (coral covers the Google Coral USB Accelerator — Edge TPU for on-device neural-net inference, listed by Frigate as the fallback path when no AI-class GPU is available.)
    • Every node’s storage must have exactly one OS pool (purpose: os) with a non-null hdsize — the Proxmox auto-installer reads it verbatim.
    • Summed target.memory_mb per node ≤ node RAM × 0.9.
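Steps 1–2 and the memory-budget check can be sketched in a few lines. This is a simplification: the real parser validates with pydantic, and the brace counter below ignores braces inside JSON strings:

```python
import json
import re

def extract_json_object(raw: str) -> dict:
    """Fence strip + json.loads: peel ``` wrappers, tolerate prose
    around the object, extract the balanced top-level {...}."""
    raw = re.sub(r"```(?:json)?", "", raw)
    start = raw.index("{")
    depth = 0
    for i, ch in enumerate(raw[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return json.loads(raw[start:i + 1])
    raise ValueError("root: no balanced top-level JSON object")

def check_memory_budget(plan: dict, node_ram_mb: dict) -> list[dict]:
    """Cross-field check: summed memory_mb per node must stay within
    90% of that node's physical RAM."""
    totals: dict[str, int] = {}
    for target in plan["targets"].values():
        node = target["node"]
        totals[node] = totals.get(node, 0) + target["resources"]["memory_mb"]
    errors = []
    for node, total in totals.items():
        budget = int(node_ram_mb[node] * 0.9)
        if total > budget:
            errors.append({"pointer": f"plan.targets (node {node})",
                           "message": f"{total} MiB requested, budget {budget} MiB"})
    return errors
```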

Any failure surfaces as a list of {pointer, message} entries you can copy and paste back to your AI.
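A hypothetical helper showing how such a list could be rendered into the paste-back snippet (the wording is illustrative, not PSW's actual output):

```python
def format_for_reask(errors: list[dict]) -> str:
    """Render {pointer, message} entries as a snippet to paste back to
    the AI along with a request for a corrected reply."""
    lines = [
        "Your previous reply failed validation. Return the full corrected",
        "JSON object (no prose, no fences), fixing:",
        "",
    ]
    lines += [f"- {e['pointer']}: {e['message']}" for e in errors]
    return "\n".join(lines)
```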

CLI#

psw node plan
    Step 1. Print the prompt to stdout. Also writes
    <project>/.psw/plan-prompt.md so you can diff across runs.

psw node plan --incremental
    Step 1 variant. Prompt includes the previous plan so the AI
    minimises diff. Used internally by `psw app plan`.

psw node plan --apply <response.json>
    Step 2. Validate, then atomically write:
      - deployment-plan.yml
      - nodes/<n>/storage.yml (per node)
      - nodes/<n>/proxmox-storage.yml (installer-focused, per node)

psw app plan <name>...
    Append apps to user_apps and emit an incremental prompt.

psw app plan-remove <name>...
    Inverse: strip apps and emit an incremental prompt.

Wizard equivalent#

The Plan step of the web wizard exposes the same protocol: Select apps → Show prompt → Paste response → Review. The prompt and the response parser are the same Python code the CLI uses — nothing UI-specific about the planner itself.

Non-determinism is fine#

Different AIs (and different re-asks of the same AI) produce different plans. That is by design. The round-trip is cheap; just regenerate.

What PSW never does#

  • Call an LLM API. Not Anthropic, not OpenAI, not Gemini, not Ollama-over-HTTP.
  • Ask for an AI API key anywhere — wizard, CLI, env var, nothing.
  • Ship an LLM SDK in its dependency tree.
  • Auto-retry a malformed response. You re-ask; PSW doesn’t.
  • Repair broken JSON. The parser either extracts a balanced top-level object or errors out.

Key Concepts Referenced#