Storage Architecture

Reliable, scalable storage is the foundation of the homelab. TrueNAS Scale provides centralized storage with ZFS for data integrity, iSCSI/NFS for flexibility, and automated snapshots for protection.

TrueNAS Scale

Overview

TrueNAS Scale is the primary storage platform, serving both block (iSCSI) and file (NFS/SMB) storage to the entire infrastructure.

Key Features:

  • ZFS File System – Copy-on-write, checksums, data integrity
  • Storage Pools – Aggregated disks with redundancy (RAID-Z)
  • Snapshots – Point-in-time recovery with minimal overhead
  • Replication – Automated sync to backup targets
  • Apps/VMs – Built-in Kubernetes for containerized apps

Hardware Configuration

  • Disk Array – Multiple pools with different redundancy levels
  • Hot Spares – Automatic replacement on drive failure
  • SSD Cache – L2ARC read cache and SLOG write log for performance
  • 10GbE Networking – High-speed connectivity for storage traffic

Storage Types

iSCSI (Block Storage)

Used for VM disks requiring high performance and low latency.

Use Cases:

  • Proxmox VM disk storage (shared LUNs)
  • Database servers (MySQL, PostgreSQL)
  • High-performance application storage

Configuration:

# iSCSI LUN for Proxmox VMs
Pool: tank-ssd
LUN Size: 500GB
Target: iqn.2024-01.com.truenas:proxmox-vms
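
On the Proxmox side, the target is attached as an iSCSI storage entry; a minimal sketch with pvesm, where the portal IP and storage ID are illustrative:

# Attach the TrueNAS target to Proxmox (portal IP and ID are placeholders)
pvesm add iscsi truenas-vms --portal 10.0.20.10 \
  --target iqn.2024-01.com.truenas:proxmox-vms --content none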

Benefits:

  • Block-level access for direct I/O
  • Supports live migration of VMs
  • Compatible with clustering and HA

NFS (Network File Storage)

Used for shared file access across multiple systems.

Use Cases:

  • Media libraries (Jellyfin, Plex)
  • Backup storage (Proxmox Backup Server)
  • Container persistent volumes
  • ISO images and templates

Configuration:

# NFS share for media
Path: /mnt/tank-hdd/media
Permissions: maproot=root
Networks: 10.0.20.0/24, 10.0.40.0/24
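
A client on one of the allowed networks would mount the share along these lines (the server IP is illustrative):

# Mount the media share over NFSv4 with large transfer sizes
mount -t nfs -o vers=4,rsize=1048576,wsize=1048576 \
  10.0.20.10:/mnt/tank-hdd/media /mnt/media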

Benefits:

  • Simple file-level sharing
  • Cross-platform compatibility
  • Ideal for bulk/archival storage

SMB/CIFS (Windows Shares)

Used for Windows clients and workstations.

Use Cases:

  • User home directories
  • Shared documents
  • Windows VM storage
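
As a quick sanity check, shares can be browsed from any client with smbclient; the host, share, and user names below are illustrative:

# List exported shares, then browse one
smbclient -L //truenas -U alice
smbclient //truenas/documents -U alice -c 'ls'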

ZFS Features

Data Integrity

  • Checksums – Every block verified on read
  • Self-Healing – Automatic repair with redundancy
  • Scrubbing – Regular integrity checks
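
Scrubs can also be started and checked by hand; this sketch uses the tank-hdd pool from the pools section below:

# Kick off a scrub and watch its progress
zpool scrub tank-hdd
zpool status tank-hdd    # shows scan progress and any repaired errors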

Snapshots

  • Instant Point-in-Time Copies – Minimal space usage
  • Rollback Capability – Restore to any snapshot
  • Clone Support – Writable snapshot copies

Snapshot Schedule:

Hourly:  Keep 24 (1 day)
Daily:   Keep 7 (1 week)
Weekly:  Keep 4 (1 month)
Monthly: Keep 12 (1 year)
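
Outside the schedule, snapshots can be taken, rolled back, and cloned manually; the dataset names here are illustrative:

# Snapshot before a risky change, roll back if needed, or clone for testing
zfs snapshot tank-hdd/media@before-cleanup
zfs rollback tank-hdd/media@before-cleanup
zfs clone tank-hdd/media@before-cleanup tank-hdd/media-test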

Replication

  • Automated Sync – Periodic replication to backup server
  • Incremental Transfers – Only changed blocks sent
  • Disaster Recovery – Full pool reconstruction from replicas
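
Under the hood this is a zfs send/receive pipeline; a minimal incremental sketch, assuming a backup-nas host and snapshot names of this form:

# Send only the blocks changed since the previous replicated snapshot
zfs snapshot -r tank-hdd@repl-new
zfs send -R -i tank-hdd@repl-old tank-hdd@repl-new | \
  ssh backup-nas zfs receive -F backup/tank-hdd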

Backup Strategy

3-2-1 Backup Rule

  • 3 Copies – Production + local backup + offsite
  • 2 Media Types – Disk + cloud/tape
  • 1 Offsite – Protection against local disasters

Backup Targets

Proxmox Backup Server

  • VM/Container Backups – Daily automated backups
  • Retention – 7 daily, 4 weekly, 6 monthly
  • Deduplication – Space-efficient incremental backups
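
On the Proxmox VE side, the datastore and retention policy might be declared like this in /etc/pve/storage.cfg; the server address, datastore, and storage ID are illustrative:

# /etc/pve/storage.cfg (sketch)
pbs: pbs-backups
    server 10.0.40.10
    datastore homelab
    username backup@pbs
    prune-backups keep-daily=7,keep-weekly=4,keep-monthly=6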

TrueNAS Snapshots

  • Local Snapshots – Hourly/daily/weekly/monthly
  • Instant Recovery – Rollback to any snapshot
  • Replication – Synced to secondary TrueNAS instance

Cloud Backup

  • Critical Data Only – Vaultwarden, documents, configs
  • Encrypted Sync – rclone to cloud provider
  • Offsite Protection – Fire/flood/theft recovery
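
A minimal sketch of the sync job, assuming an rclone crypt remote named crypt-remote has already been configured:

# Encrypted one-way sync of critical data to the cloud remote
rclone sync /mnt/tank-ssd/critical crypt-remote:homelab \
  --checksum --transfers 4 --log-file /var/log/rclone-offsite.log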

Storage Pools

tank-ssd (Performance Pool)

  • Disks: 4x NVMe SSD in RAID-Z1
  • Purpose: VM disks, databases, hot data
  • Performance: High IOPS, low latency

tank-hdd (Capacity Pool)

  • Disks: 8x SATA HDD in RAID-Z2
  • Purpose: Media, backups, archives
  • Performance: High throughput, cost-effective

tank-nvme (Cache Devices)

  • Disks: 2x NVMe in mirror
  • Purpose: L2ARC (read cache) and SLOG (synchronous write log), attached as vdevs to the data pools rather than forming a standalone pool
  • Performance: Accelerates hot reads and synchronous writes
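
As a sketch, this layout might be built like so; device names are placeholders, and the two cache NVMe drives are partitioned so the SLOG can be mirrored while the remaining space serves as L2ARC:

# Illustrative pool creation (device names are placeholders)
zpool create tank-ssd raidz1 nvme0n1 nvme1n1 nvme2n1 nvme3n1
zpool create tank-hdd raidz2 sda sdb sdc sdd sde sdf sdg sdh
zpool add tank-hdd log mirror nvme4n1p1 nvme5n1p1   # small mirrored SLOG partitions
zpool add tank-hdd cache nvme4n1p2 nvme5n1p2        # remaining space as L2ARC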

Monitoring & Alerts

Health Monitoring

  • SMART Data – Disk health metrics
  • Scrub Status – Regular integrity checks
  • Pool Capacity – Space usage alerts
  • Replication Status – Backup job monitoring
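
The same checks can be run by hand from the TrueNAS shell:

# Quick manual health checks
zpool status -x                       # reports only unhealthy pools
zpool list -o name,capacity,health    # space usage and state per pool
smartctl -H /dev/sda                  # overall SMART verdict for one disk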

Alerts

  • Email Notifications – Critical events (disk failure, scrub errors)
  • Checkmk Integration – Comprehensive monitoring dashboard
  • Graylog – Storage-related log aggregation

Performance Optimization

Caching

  • L2ARC – SSD read cache for frequently accessed data
  • SLOG – Dedicated SSD for synchronous writes
  • ARC – RAM cache (automatically managed by ZFS)
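
Cache effectiveness can be inspected with the standard OpenZFS utilities:

# Observe ARC/L2ARC behavior
arc_summary      # detailed ARC and L2ARC size/hit-rate breakdown
arcstat 5        # live hit-ratio samples every 5 seconds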

Network Optimization

  • Jumbo Frames – MTU 9000 for storage VLAN
  • 10GbE Links – High bandwidth for iSCSI/NFS
  • LACP Bonding – Link aggregation for redundancy
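
Jumbo frames are set per interface and should be verified end to end; the interface name and target IP below are illustrative:

# Enable MTU 9000 and confirm the path passes unfragmented 9000-byte frames
ip link set enp3s0 mtu 9000
ping -M do -s 8972 10.0.20.10    # 8972 payload + 28 bytes of headers = 9000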

Tuning

# Example ZFS tuning parameters
recordsize=128k      # For media files
compression=lz4      # Transparent compression
atime=off           # Disable access time updates
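
Applied to the media dataset from the NFS section, that looks like:

# Set the properties on the dataset (inherited by child datasets)
zfs set recordsize=128k compression=lz4 atime=off tank-hdd/media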

Best Practices

  1. Regular Scrubs – Run monthly ZFS scrubs to detect/repair corruption
  2. Monitor Disk Health – Check SMART data and replace failing disks proactively
  3. Test Restores – Regularly verify backup integrity with test restores
  4. Capacity Planning – Keep pools under 80% usage for optimal performance
  5. Snapshot Retention – Balance space usage with recovery needs
  6. Encryption – Use ZFS native encryption for sensitive data
  7. Documentation – Maintain records of pool configs, shares, and dependencies

Disaster Recovery

Recovery Scenarios

Failed Disk

  1. ZFS detects failure and alerts
  2. Replace disk with hot spare
  3. Automatic resilver restores redundancy
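
If the spare does not engage automatically, the swap can be done by hand (device names are illustrative):

# Replace the failed disk and monitor the resilver
zpool replace tank-hdd sdc sdj
zpool status tank-hdd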

Corrupted Data

  1. ZFS scrub detects checksum error
  2. Self-healing restores from redundant copy
  3. Alert sent if repair fails

Pool Failure

  1. Restore from Proxmox Backup Server
  2. Recreate pool and import snapshots
  3. Replicate from offsite TrueNAS backup
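
Step 3 is the replication pipeline run in reverse; host, dataset, and snapshot names here are placeholders:

# Pull the full dataset tree back from the offsite replica
ssh backup-nas zfs send -R backup/tank-hdd@latest | zfs receive -F tank-hdd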

Resources