TrueNAS ZFS Pool Design

Lay out vdevs, pick the right redundancy, and avoid the mistakes you can't undo without destroying the pool.

The One Rule You Cannot Unwind

ZFS pool topology is mostly fixed at creation. You can add vdevs later, but you cannot remove a raidz vdev, you cannot change a mirror into a raidz, and you cannot shrink a raidz. (OpenZFS can remove a mirror or single-disk vdev, but only from a pool that contains no raidz vdevs.) Plan the layout before you click create; anything else means destroying the pool and starting over.

Vdevs Are the Real Unit

A ZFS pool is a stripe across one or more vdevs. Redundancy lives at the vdev level, not the pool level. Lose a vdev, lose the pool. Everything else — the dataset hierarchy, snapshots, replication — sits on top of this.

  • Single disk: No redundancy. Fine for scratch space, never for data you care about.
  • Mirror: Two or more disks, each a full copy. Best random read performance, fastest resilver, simple to expand. Loses 50% of raw capacity for a 2-way mirror.
  • Raidz1: One disk's worth of parity, spread across the vdev. Survives one failure. Avoid for disks larger than ~4TB: resilver time stretches into days and a second failure during resilver kills the vdev.
  • Raidz2: Two disks' worth of parity. The default recommendation for bulk storage. Survives two simultaneous failures.
  • Raidz3: Three disks' worth of parity. For very large disks or wide vdevs where long resilvers compound failure risk.
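
The capacity cost of each layout is simple arithmetic. The following is a minimal sketch, assuming every disk in the vdev is the same size and ignoring padding, metadata, and TB/TiB rounding. `usable_tb` is a hypothetical helper written for this guide, not a ZFS command.

```shell
# usable_tb LAYOUT DISKS DISK_TB
# Prints the approximate usable capacity (in TB) of one vdev.
# Assumption: equal-size disks; real pools lose a little more to overhead.
usable_tb() {
  layout=$1; disks=$2; size_tb=$3
  case "$layout" in
    single) echo "$size_tb" ;;                     # one disk, no redundancy
    mirror) echo "$size_tb" ;;                     # every disk holds a full copy
    raidz1) echo $(( (disks - 1) * size_tb )) ;;   # one disk's worth of parity
    raidz2) echo $(( (disks - 2) * size_tb )) ;;   # two disks' worth of parity
    raidz3) echo $(( (disks - 3) * size_tb )) ;;   # three disks' worth of parity
    *) echo "unknown layout: $layout" >&2; return 1 ;;
  esac
}

usable_tb mirror 2 8    # prints 8
usable_tb raidz1 4 8    # prints 24
usable_tb raidz2 6 8    # prints 32
```

Pool capacity is the sum over its vdevs, so two 6-disk raidz2 vdevs of 8TB disks come to roughly 64TB before overhead.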

Choosing a Topology

Match the layout to the workload, not just the disk count:

  • VM and database storage: Mirrors. IOPS scale with vdev count, and random I/O on raidz is painful because each block is spread across the disks in the vdev, so a raidz vdev delivers roughly the random IOPS of a single disk.
  • Media library, backups, archives: Raidz2 with 6–10 disks per vdev. Sequential I/O is what raidz does well.
  • Mixed workload: Multiple pools. A small mirror pool for VMs, a wide raidz2 pool for bulk. Do not try to make one pool do everything.
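
On TrueNAS, pools should be created from the web UI so the system tracks them. Still, seeing the underlying command makes the "stripe of vdevs" structure concrete. This sketch only prints commands and never runs them. `mirror_pool_cmd`, the pool names, and the disk names are all placeholders chosen for illustration.

```shell
# mirror_pool_cmd POOL DISK...
# Prints (does not run) a zpool create command that pairs the given
# disks into 2-way mirror vdevs. An odd leftover disk is ignored.
mirror_pool_cmd() {
  pool=$1; shift
  cmd="zpool create $pool"
  while [ "$#" -ge 2 ]; do
    cmd="$cmd mirror $1 $2"
    shift 2
  done
  echo "$cmd"
}

# Four disks become two mirror vdevs striped together (VM pool).
mirror_pool_cmd fast sda sdb sdc sdd
# prints: zpool create fast mirror sda sdb mirror sdc sdd

# A bulk pool here is a single 6-disk raidz2 vdev, so no pairing is needed.
echo "zpool create bulk raidz2 sde sdf sdg sdh sdi sdj"
```

Each `mirror` or `raidz2` keyword starts a new vdev; everything after it, up to the next keyword, belongs to that vdev.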

ashift: Set It Right, Once

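ashift is the sector size exponent: ZFS treats each disk in the vdev as having sectors of 2^ashift bytes. An ashift smaller than the physical sector size causes write amplification, and it cannot be changed after the vdev is created. Modern disks (almost all spinners and SSDs since ~2012) use 4K physical sectors: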

  • ashift=12 (4K sectors): correct for 99% of modern drives
  • ashift=13 (8K sectors): some SSDs whose flash pages are 8K; check the vendor's guidance before choosing it
  • ashift=9 (512B sectors): only true legacy 512n drives

TrueNAS sets ashift=12 by default. Verify with zpool get ashift tank after creation.
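
Since ashift is an exponent, the sector size it implies is 2 raised to that value. A small sketch of the arithmetic follows. `ashift_bytes` is a hypothetical helper, and the commented commands assume a pool named tank on a Linux-based TrueNAS system.

```shell
# ashift_bytes ASHIFT
# Prints the sector size in bytes implied by an ashift value (2^ashift).
ashift_bytes() {
  echo $(( 1 << $1 ))
}

ashift_bytes 9     # prints 512
ashift_bytes 12    # prints 4096
ashift_bytes 13    # prints 8192

# On the NAS itself:
#   zpool get ashift tank               # what the pool was created with
#   lsblk -o NAME,PHY-SEC,LOG-SEC       # physical vs logical sector size per disk
```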

Datasets vs Zvols

  • Datasets are filesystems. Use for SMB/NFS shares, snapshot/replication targets, anything served over a network share.
  • Zvols are block devices. Use for iSCSI targets and VM disks when you want raw block access.
  • For Proxmox VM storage you have a choice: NFS export of a dataset (simpler, file-level snapshots) or iSCSI to a zvol (lower overhead, block-level). Mirrors handle either well; do not put VMs on raidz.
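
The difference shows up directly in how each is created. This is a sketch under assumptions: the pool is named tank, the dataset and zvol names are made up, and the 16K volblocksize is only an example value. `run` is a hypothetical dry-run helper that prints each command instead of executing it.

```shell
# Dry-run helper: prints the command instead of executing it.
# Change the body to "$@" to run for real on a system with a pool named tank.
run() { echo "+ $*"; }

# A dataset is a filesystem: share it over SMB/NFS, snapshot it, replicate it.
run zfs create tank/shares

# A zvol is a block device: -V sets its size, -s makes it sparse (thin).
run zfs create -s -V 100G -o volblocksize=16K tank/vm-101-disk-0
```

On Linux-based systems the zvol then appears under /dev/zvol/, ready to be exported over iSCSI.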

Recordsize Tuning

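Default recordsize is 128K. Tune per-dataset for the workload; a new value applies only to blocks written after the change:

  • Databases: 16K for MySQL/InnoDB to match its 16K page size. PostgreSQL uses 8K pages, so 8K or 16K are both reasonable starting points
  • VM images: 64K or 128K depending on guest filesystem (this applies to image files on a dataset; zvols use volblocksize instead)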
  • Media files, backups: 1M for better compression ratio and less metadata
  • General file shares: Leave at 128K

    zfs set recordsize=1M tank/media
    zfs set recordsize=16K tank/postgres

Compression: Always On

Use lz4 as the default — it is essentially free on modern CPUs and even slightly faster than uncompressed for compressible data because of reduced I/O. For archival datasets, zstd-3 trades a bit of CPU for noticeably better ratios.

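Already-compressed data (video, music, encrypted backups) gains nothing from compression. lz4 gives up early on incompressible blocks, so leaving it on costs little, but heavier algorithms such as zstd burn CPU for no gain. Set compression=off per-dataset where needed.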
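
Compression is a per-dataset property that children inherit, so one pool-level default plus a few overrides is usually enough. This is a sketch assuming a pool named tank; the child dataset names are invented for illustration. `run` is a hypothetical dry-run helper that only prints the commands.

```shell
# Dry-run helper: prints the command instead of executing it.
# Change the body to "$@" to run for real.
run() { echo "+ $*"; }

run zfs set compression=lz4 tank              # pool-wide default, inherited by children
run zfs set compression=zstd-3 tank/archive   # better ratio for cold data
run zfs set compression=off tank/video        # already-compressed content

# Check what each dataset uses and what ratio it actually achieves.
run zfs get -r compression,compressratio tank
```

A compressratio near 1.00x on a dataset is a good hint that compression is doing nothing useful there.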

The SLOG and L2ARC Trap

Adding a SLOG or L2ARC is the most common over-engineered ZFS decision:

  • SLOG only accelerates synchronous writes (NFS clients, which request sync writes by default; datasets set to sync=always; some databases; iSCSI). For async writes, the SLOG does nothing. If you add one, it must be power-loss-protected (Intel Optane, datacenter-grade SSDs with PLP).
  • L2ARC only helps if your working set exceeds RAM. More RAM is almost always a better investment than L2ARC because L2ARC itself consumes RAM for its index (on the order of 70 to 100 bytes per cached record in current OpenZFS, so small records cost the most).
  • Special vdev stores metadata and, if special_small_blocks is set, small file blocks. Useful for media pools where directory listings are slow. Must be redundant (mirror) because losing it kills the pool.
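
Before buying hardware, it can help to check how the pool is actually being used, and to see what adding each device class would look like. This is a sketch only: the pool name tank and the NVMe device names are placeholders, and on TrueNAS these vdevs would normally be added from the web UI. `run` is a hypothetical dry-run helper that prints instead of executing.

```shell
# Dry-run helper: prints the command instead of executing it.
# Change the body to "$@" to run for real.
run() { echo "+ $*"; }

# Which datasets force or disable sync writes?
run zfs get -r sync tank

# Per-vdev activity every 5 seconds; a log vdev shows up as its own row.
run zpool iostat -v tank 5

# Adding a mirrored SLOG and a mirrored special vdev (device names are examples).
run zpool add tank log mirror nvme0n1 nvme1n1
run zpool add tank special mirror nvme2n1 nvme3n1
```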

Snapshots and Retention

Snapshots are nearly free in ZFS and are your first line of defense against ransomware and accidental deletion. TrueNAS Periodic Snapshot Tasks make this easy:

  • Hourly: Keep 24. Catches "I deleted that an hour ago".
  • Daily: Keep 14.
  • Weekly: Keep 8.
  • Monthly: Keep 6 or 12 depending on space.

Snapshots are not backups — they live on the same pool. Pair them with zfs send replication to a second host (and offsite) for a real 3-2-1 setup. See the Backup Strategies guide for the broader picture.
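
TrueNAS Replication Tasks handle this from the UI, but the underlying mechanism is an incremental send piped to a receive on the other host. The sketch below only prints that pipeline for review. `incremental_send_cmd`, the snapshot names, the host name backup-host, and the target dataset are all hypothetical.

```shell
# incremental_send_cmd DATASET OLD_SNAP NEW_SNAP HOST TARGET
# Prints (does not run) a pipeline that sends only the blocks changed
# between two snapshots and receives them, unmounted, on another host.
incremental_send_cmd() {
  echo "zfs send -i $1@$2 $1@$3 | ssh $4 zfs receive -u $5"
}

incremental_send_cmd tank/docs daily-01 daily-02 backup-host backup/docs
# prints: zfs send -i tank/docs@daily-01 tank/docs@daily-02 | ssh backup-host zfs receive -u backup/docs
```

The older snapshot must already exist on the target, which is why the first replication is always a full send.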

Capacity Planning

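  • Never fill past 80%. ZFS is copy-on-write, so as free space shrinks and fragments, the allocator has to work harder to place every write; performance degrades noticeably above this threshold and sharply as the pool approaches full.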
  • Reserve dataset: Create an empty dataset with a refreservation as emergency headroom. Delete it to free space if the pool ever hits the wall.
  • Expand by adding vdevs. A pool with two raidz2 vdevs of 6 disks each is better than one wide 12-disk vdev — more IOPS, faster resilver, less risk during rebuilds.
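
The 80% rule is easy to turn into a check. This is a minimal sketch: `check_fill` is a hypothetical helper, and the commented commands assume a pool named tank and an arbitrary 50G reservation.

```shell
# check_fill PERCENT_USED [LIMIT]
# Prints a status line and returns non-zero once the pool reaches the limit.
check_fill() {
  used=${1%"%"}      # accept "85%" or "85"
  limit=${2:-80}     # default threshold from the rule above
  if [ "$used" -ge "$limit" ]; then
    echo "WARN: pool at ${used}% (limit ${limit}%)"
    return 1
  fi
  echo "OK: pool at ${used}%"
}

check_fill 42%            # prints OK: pool at 42%
check_fill 85% || true    # prints WARN: pool at 85% (limit 80%)

# On the NAS itself:
#   check_fill "$(zpool list -H -o capacity tank)"
#   zfs create -o refreservation=50G tank/reserved   # emergency headroom
```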

Scrubs and SMART

  • Monthly scrubs on consumer disks, quarterly on enterprise. TrueNAS schedules these by default.
  • SMART short tests daily, long tests weekly. Configure under Data Protection → S.M.A.R.T. Tests.
  • Alerts: Configure email or webhook alerts so you hear about a degraded pool from an alert, not from users reporting that a share is down.

Common Mistakes

  • Raidz1 on small disk counts: 3-disk raidz1 is worse in nearly every dimension than a 3-way mirror or 4-disk raidz2.
  • Mixing vdev types in one pool: ZFS will let you, but performance becomes unpredictable. Keep vdevs uniform.
  • Using deduplication: Almost always a mistake. Dedup costs ~5GB of RAM per 1TB of pool, and turning it off only affects new writes: already-deduplicated blocks keep their table entries until the data is rewritten or deleted. Use compression instead.
  • No spare capacity for replacement: Resilvering a degraded raidz2 while at 95% full is a recipe for a second failure.
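
To see why dedup rarely pays off, plug a pool size into the rule of thumb above. The sketch uses the ~5GB per 1TB figure from this guide as given. `dedup_ram_gb` is a hypothetical helper, and the commented command assumes a pool named tank.

```shell
# dedup_ram_gb POOL_TB
# Rough RAM needed for the dedup table, using ~5GB per 1TB of pool.
dedup_ram_gb() {
  echo $(( $1 * 5 ))
}

dedup_ram_gb 8     # prints 40
dedup_ram_gb 40    # prints 200

# To estimate what dedup would actually save before enabling it:
#   zdb -S tank     # simulates dedup on existing data; slow and memory-hungry
```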

Validation Checklist

  • zpool status shows ONLINE with no errors
  • zpool get ashift returns 12 (or correct value for your disks)
  • Compression is enabled on appropriate datasets (zfs get compression)
  • Snapshot tasks are scheduled and running
  • Replication target exists and is current
  • SMART tests are scheduled
  • Email/webhook alerts are configured and tested
  • Pool is under 80% capacity
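
Part of this checklist can be scripted. This is a sketch assuming a pool named tank. `is_healthy_msg` is a hypothetical helper that recognizes the one-line summary `zpool status -x` prints for a healthy pool.

```shell
# is_healthy_msg TEXT
# Succeeds if TEXT is the summary `zpool status -x` prints when nothing is wrong.
is_healthy_msg() {
  case "$1" in
    "all pools are healthy") return 0 ;;
    "pool '"*"' is healthy") return 0 ;;
    *) return 1 ;;
  esac
}

is_healthy_msg "pool 'tank' is healthy" && echo "pool OK"    # prints pool OK

# On the NAS itself:
#   is_healthy_msg "$(zpool status -x tank)" || echo "tank needs attention"
#   zpool get -H -o value ashift tank
#   zfs get -r -t filesystem compression tank
#   zpool list -H -o capacity tank
```

Anything other than the healthy summary, such as a DEGRADED state or checksum errors, makes the helper fail, so it works as a guard in a cron job or monitoring hook.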

- Crafted by Axiom|Spectre