Storage Architecture¶
Reliable, scalable storage is the foundation of the homelab. TrueNAS Scale provides centralized storage with ZFS for data integrity, iSCSI/NFS for flexibility, and automated snapshots for protection.
TrueNAS Scale¶
Overview¶
TrueNAS Scale is the primary storage platform, serving both block (iSCSI) and file (NFS/SMB) storage to the entire infrastructure.
Key Features:

- ZFS File System – Copy-on-write, checksums, data integrity
- Storage Pools – Aggregated disks with redundancy (RAID-Z)
- Snapshots – Point-in-time recovery with minimal overhead
- Replication – Automated sync to backup targets
- Apps/VMs – Built-in Kubernetes for containerized apps
Hardware Configuration¶
- Disk Array – Multiple pools with different redundancy levels
- Hot Spares – Automatic replacement on drive failure
- SSD Cache – Read/write cache for performance
- 10GbE Networking – High-speed connectivity for storage traffic
Storage Types¶
iSCSI (Block Storage)¶
Used for VM disks requiring high performance and low latency.
Use Cases:

- Proxmox VM disk storage (shared LUNs)
- Database servers (MySQL, PostgreSQL)
- High-performance application storage
Configuration:
```
# iSCSI LUN for Proxmox VMs
Pool: tank-ssd
LUN Size: 500GB
Target: iqn.2024-01.com.truenas:proxmox-vms
```
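On the initiator side, a Proxmox node can attach this LUN with open-iscsi. A minimal sketch, assuming the TrueNAS portal is reachable at 10.0.20.10 (the IP is an illustrative placeholder, not from this document):

```shell
# Discover targets exposed by the TrueNAS portal (portal IP is an assumed example)
iscsiadm -m discovery -t sendtargets -p 10.0.20.10

# Log in to the target defined above
iscsiadm -m node -T iqn.2024-01.com.truenas:proxmox-vms -p 10.0.20.10 --login

# Make the session persist across reboots
iscsiadm -m node -T iqn.2024-01.com.truenas:proxmox-vms -p 10.0.20.10 \
  -o update -n node.startup -v automatic
```

Once logged in, the LUN appears as a local block device that Proxmox can use as LVM or shared storage.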
Benefits:

- Block-level access for direct I/O
- Supports live migration of VMs
- Compatible with clustering and HA
NFS (Network File Storage)¶
Used for shared file access across multiple systems.
Use Cases:

- Media libraries (Jellyfin, Plex)
- Backup storage (Proxmox Backup Server)
- Container persistent volumes
- ISO images and templates
Configuration:
```
# NFS share for media
Path: /mnt/tank-hdd/media
Permissions: maproot=root
Networks: 10.0.20.0/24, 10.0.40.0/24
```
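A client on one of the allowed networks mounts the share like this; the hostname `truenas` is an assumed example:

```shell
# Mount the media export over NFSv4 (hostname "truenas" is an assumed example)
mount -t nfs -o vers=4,rw truenas:/mnt/tank-hdd/media /mnt/media

# Equivalent /etc/fstab entry for a persistent mount
# truenas:/mnt/tank-hdd/media  /mnt/media  nfs  vers=4,rw  0  0
```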
Benefits:

- Simple file-level sharing
- Cross-platform compatibility
- Ideal for bulk/archival storage
SMB/CIFS (Windows Shares)¶
Used for Windows clients and workstations.
Use Cases:

- User home directories
- Shared documents
- Windows VM storage
ZFS Features¶
Data Integrity¶
- Checksums – Every block verified on read
- Self-Healing – Automatic repair with redundancy
- Scrubbing – Regular integrity checks
Snapshots¶
- Instant Point-in-Time Copies – Minimal space usage
- Rollback Capability – Restore to any snapshot
- Clone Support – Writable snapshot copies
Snapshot Schedule:

```
Hourly:  Keep 24  (1 day)
Daily:   Keep 7   (1 week)
Weekly:  Keep 4   (1 month)
Monthly: Keep 12  (1 year)
```
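TrueNAS manages this schedule through periodic snapshot tasks, but the underlying operations map to plain `zfs` commands. A sketch using the pool names from this document (the snapshot name is an assumed example):

```shell
# Take a recursive manual snapshot of the media dataset
zfs snapshot -r tank-hdd/media@pre-migration

# List snapshots to confirm
zfs list -t snapshot -r tank-hdd/media

# Roll the dataset back to the snapshot (discards changes made since)
zfs rollback tank-hdd/media@pre-migration

# Or create a writable clone instead of rolling back
zfs clone tank-hdd/media@pre-migration tank-hdd/media-restore
```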
Replication¶
- Automated Sync – Periodic replication to backup server
- Incremental Transfers – Only changed blocks sent
- Disaster Recovery – Full pool reconstruction from replicas
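Under the hood this is `zfs send`/`zfs receive` over SSH. A hedged sketch; the backup hostname `backup-nas`, the dataset, and the remote pool layout are assumed examples:

```shell
# Initial full replication of a dataset to the backup host
zfs send -R tank-ssd/vms@daily-2024-01-01 | \
  ssh backup-nas zfs receive -F backup/tank-ssd/vms

# Later runs are incremental: only blocks changed between the
# two snapshots cross the wire
zfs send -i @daily-2024-01-01 tank-ssd/vms@daily-2024-01-02 | \
  ssh backup-nas zfs receive backup/tank-ssd/vms
```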
Backup Strategy¶
3-2-1 Backup Rule¶
- 3 Copies – Production + local backup + offsite
- 2 Media Types – Disk + cloud/tape
- 1 Offsite – Protection against local disasters
Backup Targets¶
Proxmox Backup Server¶
- VM/Container Backups – Daily automated backups
- Retention – 7 daily, 4 weekly, 6 monthly
- Deduplication – Space-efficient incremental backups
TrueNAS Snapshots¶
- Local Snapshots – Hourly/daily/weekly/monthly
- Instant Recovery – Rollback to any snapshot
- Replication – Synced to secondary TrueNAS instance
Cloud Backup¶
- Critical Data Only – Vaultwarden, documents, configs
- Encrypted Sync – rclone to cloud provider
- Offsite Protection – Fire/flood/theft recovery
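With rclone, client-side encryption comes from layering a `crypt` remote over the cloud provider remote. A sketch, assuming a crypt remote named `crypt:` has already been configured (the remote and paths are illustrative):

```shell
# Sync critical datasets to the encrypted cloud remote
# ("crypt:" is an assumed rclone crypt remote wrapping the provider)
rclone sync /mnt/tank-ssd/vaultwarden crypt:vaultwarden --transfers 4

# Preview what would change without transferring anything
rclone sync /mnt/tank-ssd/documents crypt:documents --dry-run
```

Because the crypt remote encrypts file names and contents before upload, the cloud provider only ever sees ciphertext.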
Storage Pools¶
tank-ssd (Performance Pool)¶
- Disks: 4x NVMe SSD in RAID-Z1
- Purpose: VM disks, databases, hot data
- Performance: High IOPS, low latency
tank-hdd (Capacity Pool)¶
- Disks: 8x SATA HDD in RAID-Z2
- Purpose: Media, backups, archives
- Performance: High throughput, cost-effective
tank-nvme (Cache Pool)¶
- Disks: 2x NVMe in mirror
- Purpose: ZFS L2ARC (read cache) and SLOG (separate intent log for synchronous writes)
- Performance: Accelerates pool access
Monitoring & Alerts¶
Health Monitoring¶
- SMART Data – Disk health metrics
- Scrub Status – Regular integrity checks
- Pool Capacity – Space usage alerts
- Replication Status – Backup job monitoring
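These checks map to a few standard commands, useful for ad-hoc verification alongside the dashboards (device names are assumed examples):

```shell
# Report only pools that are not healthy (prints nothing extra when all is well)
zpool status -x

# Overall SMART health verdict for a disk
smartctl -H /dev/sda

# Pool capacity at a glance
zpool list
```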
Alerts¶
- Email Notifications – Critical events (disk failure, scrub errors)
- Checkmk Integration – Comprehensive monitoring dashboard
- Graylog – Storage-related log aggregation
Performance Optimization¶
Caching¶
- L2ARC – SSD read cache for frequently accessed data
- SLOG – Dedicated SSD for synchronous writes
- ARC – RAM cache (automatically managed by ZFS)
Network Optimization¶
- Jumbo Frames – MTU 9000 for storage VLAN
- 10GbE Links – High bandwidth for iSCSI/NFS
- LACP Bonding – Link aggregation for redundancy
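Jumbo frames only help if every hop on the storage VLAN accepts them, so verify end-to-end after enabling. The interface name and target IP below are assumed examples:

```shell
# Enable jumbo frames on the storage interface (name is an assumed example)
ip link set dev eth1 mtu 9000

# Verify end-to-end: 8972 = 9000 - 20 (IP header) - 8 (ICMP header),
# with fragmentation forbidden so an undersized hop fails loudly
ping -M do -s 8972 -c 3 10.0.20.10
```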
Tuning¶
```
# Example ZFS tuning parameters
recordsize=128k   # For media files
compression=lz4   # Transparent compression
atime=off         # Disable access time updates
```
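These are per-dataset properties, applied with `zfs set` (the dataset name follows the pools described above):

```shell
# Apply the tuning parameters to the media dataset
zfs set recordsize=128k tank-hdd/media
zfs set compression=lz4 tank-hdd/media
zfs set atime=off tank-hdd/media

# Confirm the effective values
zfs get recordsize,compression,atime tank-hdd/media
```

Note that `recordsize` only affects blocks written after the change; existing data keeps its original record size until rewritten.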
Best Practices¶
- Regular Scrubs – Run monthly ZFS scrubs to detect/repair corruption
- Monitor Disk Health – Check SMART data and replace failing disks proactively
- Test Restores – Regularly verify backup integrity with test restores
- Capacity Planning – Keep pools under 80% usage for optimal performance
- Snapshot Retention – Balance space usage with recovery needs
- Encryption – Use ZFS native encryption for sensitive data
- Documentation – Maintain records of pool configs, shares, and dependencies
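TrueNAS schedules scrubs per pool in the UI, but the equivalent commands are worth knowing for manual runs and checks:

```shell
# Kick off a scrub manually
zpool scrub tank-hdd

# Watch progress and results in the "scan" line
zpool status tank-hdd

# Reference cron line for a monthly scrub on a plain ZFS host
# 0 3 1 * * /sbin/zpool scrub tank-hdd
```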
Disaster Recovery¶
Recovery Scenarios¶
Failed Disk¶
1. ZFS detects the failure and raises an alert
2. The hot spare activates automatically (or the failed disk is replaced manually)
3. Resilvering restores full redundancy
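When no hot spare is available, the replacement is a manual `zpool replace`; device names below are assumed examples:

```shell
# Identify the failed device
zpool status tank-hdd

# Swap it for the new disk (device names are assumed examples)
zpool replace tank-hdd /dev/sdc /dev/sdj

# Resilver progress appears in the "scan" line of zpool status
zpool status tank-hdd
```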
Corrupted Data¶
1. ZFS scrub detects a checksum error
2. Self-healing restores the data from a redundant copy
3. An alert is sent if the repair fails
Pool Failure¶
1. Recreate the pool on replacement hardware
2. Replicate datasets and snapshots back from the offsite TrueNAS backup
3. Restore VMs and containers from Proxmox Backup Server