Storage#
What Is It?#
“Storage” in PSW means the disks and filesystems your apps store their data on. It’s split across three places:
- ZFS pools — the big shared disks living directly on your Proxmox node, carved into datasets (like labeled drawers)
- Container root disks — a small per-target filesystem where the operating system and local-only app data live
- Podman-managed volumes — for app data that should stay local to one container, not shared
An AI planner sits at the front of all this: you tell it what apps you want and it designs the whole layout — pool configuration, datasets with the right tuning, rootdisk sizes per target — based on your actual hardware and the apps you picked. PSW renders the prompt; you paste it into whatever AI you already use (Claude.ai, ChatGPT, Gemini, a local LLM, etc.); you paste the reply back into PSW. PSW validates it and writes nodes/<name>/storage.yml. No API keys collected anywhere in PSW.
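To make the paste-back step concrete, here is a minimal sketch of what accepting the reply could look like. It is illustrative only: the function name is invented, and a bare key check stands in for the full StorageRecommendation validation PSW actually performs.

```python
# Illustrative sketch, not PSW code: parse the JSON pasted back from the AI,
# sanity-check it, and persist it as nodes/<name>/storage.yml.
import json
from pathlib import Path

import yaml  # PyYAML


def accept_ai_reply(reply_text: str, node_name: str) -> Path:
    plan = json.loads(reply_text)           # the AI's reply is plain JSON
    if "zfs_pools" not in plan:             # stand-in for full schema validation
        raise ValueError("reply is missing the zfs_pools section")
    out = Path("nodes") / node_name / "storage.yml"
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(yaml.safe_dump(plan, sort_keys=False))
    return out
```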
Why Not Just One Big Disk?#
Different kinds of data want different things. A media library wants maximum compression efficiency and large record sizes (good for big files). A database wants small record sizes tuned to its page size and absolutely no compression surprises. A download folder wants to be fast but doesn’t care about deduplication. Throwing everything into one filesystem with one set of tunables means every workload gets an average setting that’s wrong for all of them.
ZFS solves this by letting you create multiple datasets inside one pool, each with its own compression, recordsize, atime, and other settings — while still sharing the same free space across all of them. That’s what the AI planner designs for you.
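If you have not used ZFS before, the mechanism looks like this at the command line. This is raw ZFS shown through Python's subprocess purely for illustration, not PSW code; the pool and dataset names are examples:

```python
# Raw ZFS illustration: two datasets in the same pool, each with its own
# tunables, both drawing on the pool's shared free space.
import subprocess

subprocess.run(
    ["zfs", "create", "-o", "recordsize=1M", "-o", "compression=zstd",
     "tank/media"],
    check=True,
)
subprocess.run(
    ["zfs", "create", "-o", "recordsize=16K", "-o", "compression=lz4",
     "-o", "atime=off", "tank/databases"],
    check=True,
)
# `zfs list tank` would show both datasets reporting the same AVAIL figure.
```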
The Layout On Disk#
When the planner runs, it writes nodes/<name>/storage.yml with a structure like:
```yaml
zfs_pools:
  rpool:                          # dict key = on-disk zpool name
    raid_level: mirror
    devices: [WD-ABC123, WD-DEF456]
    purpose: os
    hdsize: 120
    datasets: [ ... ]
  tank:
    raid_level: raidz1
    devices: [ST-XYZ789, ...]
    purpose: media
    datasets:
      - name: media
        recordsize: 1M
        compression: zstd
        workload_profile: media
      - name: downloads
        recordsize: 128K
        compression: lz4
      - name: databases
        recordsize: 16K
        compression: lz4
        atime: off
```

Per-target rootdisk sizes live on the deployment plan itself (plan.targets[name].resources.root_disk_gb), not in storage.yml — one source of truth for “how big is this LXC’s rootfs”.
Pool purpose is one of a small set of known values — os, vms, nas, backups, media, databases — declared in psw-proxmox-installer/src/psw_proxmox_installer/storage/enums.py. The planner uses it to pick sensible defaults and to match pools against apps that ask for that kind of storage.
RAID level is one of single, mirror, raidz1, raidz2, raidz3 — determined by how many disks you have and how much redundancy you asked for.
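As a rough illustration of that mapping (the real choice is made by the AI planner from your hardware inventory and stated redundancy goals, not by code like this):

```python
# Rough illustration only; the planner, not this function, makes the real call.
def pick_raid_level(disk_count: int, failures_to_tolerate: int) -> str:
    if disk_count == 1 or failures_to_tolerate == 0:
        return "single"
    if disk_count == 2:
        return "mirror"                      # only redundant layout for two disks
    if failures_to_tolerate >= 3 and disk_count >= 4:
        return "raidz3"
    if failures_to_tolerate == 2 and disk_count >= 3:
        return "raidz2"
    return "raidz1"                          # survives one disk failure
```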
How Apps Ask for Storage#
Each app declares what storage kinds it needs in its meta.yml:
```yaml
# sonarr/meta.yml
storage:
  - type: config
    path: /config
    local: true        # stays in a Podman volume on this container
  - type: media
    path: /media
    mode: rw           # bind-mounted from the shared ZFS dataset
    required: true
  - type: downloads
    path: /downloads
    mode: rw
    required: true
```

The type is one of a handful of well-known names in use across the catalog: config, data, media, downloads, database, cache, metadata, letsencrypt, models. The models class is for AI/ML model weights (Ollama, future vLLM / ComfyUI / Whisper) — see ai.md § Where the models live for the ZFS tuning the planner applies (fastest local NVMe, recordsize=1M, compression=off, never NFS, backup-excluded by default).
Three flags control how each one is resolved:
| Flag | What it does |
|---|---|
| local: true | Store in a Podman named volume on this container’s rootdisk. Nothing else sees it |
| local: false (or absent) | Bind-mount from a shared ZFS dataset on the host. Multiple apps on multiple targets can share it |
| mode: rw / mode: ro | Container gets read-write or read-only access. E.g. Jellyfin mounts /media as ro because it only reads the library; Sonarr mounts it rw because it writes new episodes |
| required: true / required: false | Should deployment fail if this storage can’t be resolved? Defaults to true |
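Put together, one storage entry carries these fields with these defaults. The field names mirror the meta.yml keys above, but the class itself is a hypothetical sketch, not PSW's actual model:

```python
# Hypothetical sketch of a single storage: entry and its defaults.
from dataclasses import dataclass


@dataclass
class StorageEntry:
    type: str                 # config, media, downloads, models, ...
    path: str                 # mount point inside the container
    local: bool = False       # absent means shared (ZFS dataset bind mount)
    mode: str = "rw"          # "rw" or "ro"
    required: bool = True     # deployment fails if this can't be resolved
```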
See the storage resolver in psw-lib/src/psw_lib/deploy/storage.py for the exact logic — everything below is what it actually does.
“Local” vs “Shared”: The Rule#
This is the single most important idea in PSW storage:
Local storage lives on the target’s own rootdisk, inside a Podman named volume. It’s invisible to everything else. Good for: per-app config files, a cache, anything that’s pointless to share. The rootdisk was sized by the AI planner to have room for all the local volumes of every app on that target.
Shared storage is a ZFS dataset that lives on the Proxmox host, outside any container. PSW bind-mounts it into each container that wants access. Multiple apps on multiple targets can read and write to the same underlying files. Good for: the media library (Sonarr writes, Jellyfin reads), the download staging area, any big dataset you don’t want duplicated.
The storage resolver looks at local: to make the decision. If local: true it creates a Podman volume. If local: false (or absent), it finds the matching ZFS dataset on the host, pct sets an LXC bind mount pointing at it, and the container sees it at the declared container path. If the dataset doesn’t exist yet, the resolver creates it on the fly with sensible defaults (rpool/data → /mnt/<type>) — but the AI planner almost always creates them upfront with the right tuning for the workload.
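A simplified sketch of that decision, using the commands described above (podman volume create, zfs create, pct set). The helper name, the mount-point index handling, and the fallback dataset path are illustrative assumptions; the exact behaviour lives in psw-lib/src/psw_lib/deploy/storage.py.

```python
# Simplified sketch of the local-vs-shared decision; not the real resolver.
import subprocess


def resolve_mount(vmid: int, app: str, entry: dict, mp_index: int) -> None:
    if entry.get("local", False):
        # Local: a Podman named volume on the target's own rootdisk.
        subprocess.run(
            ["pct", "exec", str(vmid), "--",
             "podman", "volume", "create", f"{app}-{entry['type']}"],
            check=True,
        )
        return

    # Shared: make sure a host-side ZFS dataset exists, then bind-mount it in.
    dataset = f"rpool/data/{entry['type']}"           # fallback default
    host_path = f"/mnt/{entry['type']}"
    subprocess.run(
        ["zfs", "create", "-p", "-o", f"mountpoint={host_path}", dataset],
        check=True,                                   # -p: ok if it already exists
    )
    ro = ",ro=1" if entry.get("mode", "rw") == "ro" else ""
    subprocess.run(
        ["pct", "set", str(vmid), f"-mp{mp_index}",
         f"{host_path},mp={entry['path']}{ro}"],
        check=True,
    )
```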
How Storage Planning Feeds the Install#
The unified AI planner (entry point in psw-proxmox-installer/src/psw_proxmox_installer/commands/plan/) takes two inputs:
- Hardware inventory — from nodes/<name>/hardware.yml: disks by serial, their sizes, RAM, block sizes
- The deployment plan — which apps the user picked (targets / resources / mounts are planner output, not user input)
Plus the authoritative storage-ai-guide.md embedded verbatim in the prompt, and each app’s resources.storage_estimate_gb from app metadata.
PSW does not call any AI directly — the prompt is rendered for the user to paste into whatever AI they prefer. The JSON response is validated against the StorageRecommendation schema. See AI Planner. What lands in storage.yml:
- zfs_pools — the actual pool + dataset layout, with compression, recordsize, atime, etc. tuned per workload
Per-target LXC rootdisk sizes are NOT written here — they live in deployment-plan.yml (plan.targets[name].resources.root_disk_gb) so there’s exactly one place to read or edit them.
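A small sketch of reading both sources; the file locations and key paths follow the description above, and the exact loader code in PSW may differ:

```python
# Hedged sketch: rootdisk size comes from the deployment plan, dataset layout
# from storage.yml. Paths are assumptions based on the layout described here.
import yaml


def describe_target(node: str, target: str) -> None:
    with open(f"nodes/{node}/deployment-plan.yml") as f:   # path assumed
        plan = yaml.safe_load(f)
    with open(f"nodes/{node}/storage.yml") as f:
        storage = yaml.safe_load(f)

    size_gb = plan["targets"][target]["resources"]["root_disk_gb"]
    datasets = [
        f"{pool}/{ds['name']}"
        for pool, cfg in storage["zfs_pools"].items()
        for ds in cfg.get("datasets", [])
    ]
    print(f"{target}: rootfs {size_gb} GB; shared datasets: {', '.join(datasets)}")
```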
The Proxmox auto-installer reads storage.yml and builds the exact pool + dataset layout during Install. After install, the psw node apply-storage step creates the datasets and sets their tunables. From then on, every deploy resolver call finds the right dataset waiting.
The Three Disks of an LXC Target#
When you deploy an app to a managed target, its data lands in up to three places:
```text
┌─ LXC rootdisk (sized by plan.targets[*].resources) ─────┐
│ OS packages, container images, systemd journal          │
│  ┌─ Podman volumes (local: true) ───────────┐           │
│  │ sonarr-config.volume   → /config         │           │
│  │ jellyfin-cache.volume  → /cache          │           │
│  └──────────────────────────────────────────┘           │
└─────────────────────────────────────────────────────────┘
            │ LXC bind mounts (pct set ... -mp)
            ▼
┌─ Proxmox host: ZFS datasets ────────────────────────────┐
│ tank/media      (shared)  ← /media in Sonarr + Jellyfin │
│ tank/downloads  (shared)  ← /downloads in qBittorrent   │
│ tank/databases  (shared)  ← /var/lib/postgresql         │
└─────────────────────────────────────────────────────────┘
```

The rootdisk keeps the OS and truly-local stuff tight (30–50 GB is typical). Everything with mass goes to the shared pool, where it lives once and everyone who needs it mounts it in. This is what lets you run a 5 TB media library without inflating every container by 5 TB.
Key Ideas#
- AI designs it, you review it — the storage planner reads your hardware and your deployment plan and produces a complete layout; you don’t tune recordsize by hand
- One pool, many datasets — ZFS lets every workload get its own tunables while sharing the same free space
- local: is the big switch — per-app isolated data (Podman volume) vs. cross-target shared data (ZFS dataset bind-mount)
- Rootdisks stay small — because big data lives on the pool, not inside containers
- It’s all declarative — the storage: block in an app’s meta.yml is the contract; the resolver makes it real at deploy time