The first entry in this shiny new blog is about network attached storage. I'm pretty fond of it, and a write-up seems as good a place as any to start. Finding means to reliably store and quickly retrieve ever-growing datasets was an interesting challenge at work beginning in the 1980s. That became a hobby and, I think, maybe an obsession. It allows me to indulge a tendency to hoard without the mess. Folks' eyes glaze over whenever I talk about it. If you're interested in this stuff you may already know exactly what I mean.
In brief, this is a FreeBSD system using ZFS on an array of five 6TB drives with a 120GB SSD cache and two partitioned 40GB SSD log/boot drives running from an ASRock C2550D4I Mini ITX Server Motherboard which I first assembled on March 19, 2015.
|amazon 2015.03.05||WD Green 6TB WD60EZRX 3.5" hard drive x 5||$1344.12|
|newegg 2015.03.02||Crucial CT102472BD160B 8GB DDR3 1600 ECC memory x 4||$363.96|
|newegg 2015.03.03||ASRock C2550D4I Mini ITX Server Motherboard x 1||$288.98|
|newegg 2015.03.19||SEASONIC X650 GOLD SS-650KM PSU x 1||$143.02|
|newegg 2015.02.28||Lian Li PC-Q35A Black Aluminum Mini-ITX Tower Case x 1||$135.98|
|amazon 2015.03.11||Intel 320 Series 40 GB 2.5" SATA SSD x 2||$104.90|
|newegg 2015.03.01||ICY DOCK MB155SP-B FatCage 5x3.5" x 1||$92.99|
|newegg 2015.03.05||Intel Pro 2500 SSDSC2BF120H501 2.5" 120GB SSD x 1||$80.98|
|newegg 2015.03.01||ICY DOCK MB153SP-B 3 in 2 SATA Internal Backplane x 1||$64.99|
Over two years in and all is well with these drives. I have no complaints. They were about 10% cheaper than the WD Reds. The Reds include TLER (time-limited error recovery), a feature that would be useful in a hardware RAID environment but isn't required by this software RAID. The Reds spin down after 300 seconds of inactivity while the Greens park their heads after only 8 seconds, but neither behavior has been an issue here. It could be in other use scenarios, like for a web server. The warranty period was 2 years on these versus 3 years on the Reds, and I'm right in the middle of the window between the two. I'll keep my fingers crossed another 6 months before considering it money saved.
ZFS loves RAM, really loves it. I threw 32GB at it. These were among the modules recommended by ASRock for use with the motherboard. They're fine.
In theory I absolutely love this board. In practice it has not been flawless. The first board I received lasted 2 days. I found that an EVGA 500W 80PLUS PSU purchased for the build wouldn't reliably deliver enough power to the 8 drives, so I had Newegg send me the Seasonic PSU next-day air since I was itching to finish the build. I then installed FreeBSD on the already set up zpool, powered the machine down, moved it into some cabinetry, and it never booted again. I could access the IPMI but nothing else. Newegg RMA'd the board and I had another in a couple of days. The second board lasted until May 10, 2017, almost 26 months, when I found the NAS offline. Again there was no POST at power-on, but I could still access the management console. The board has a 3-year manufacturer's warranty. This time ASRock provided the RMA and the process wasn't terrible. Swapping the motherboard didn't cause any issues with the zpools and was a snap: pretty much just pull, replace, and reboot. In both instances I spoke with a gentleman named William at ASRock who was very professional, knowledgeable, and helpful.
I've been contemplating buying two more of these boards: one to build as an off-site backup, replacing an aging former NAS box which is currently doing that duty, and the other to have on hand if or when either fails. I've read conflicting reports regarding the cause of the failures in these boards. I've also read that the reliability issue has been resolved, and I hope that's true, because this board is so very close to being great. I don't know of another board with this feature set in this form factor. There are some that come close, but at 3 times the price.
The price for this was $109.99 plus shipping. It's good. It's quiet. I like it.
What arrived was a silver, natural-aluminum case rather than black, but I liked it just as well so I kept it without complaint. It's a nice case.
Two of these are partitioned and mirrored to serve dual roles as boot / OS drives and as a mirrored ZIL (ZFS Intent Log), which is essentially a log of in-flight synchronous writes moved onto faster media. They are working as intended.
The spinning media lives here. I thought it would be nice to remove and replace drives without opening the case but haven't had occasion to do that. It has a fan to cool the drives. It also allows for powering five drives from three 15 pin SATA power connectors.
This drive serves as a read cache between main memory and the disks and performs as expected.
All three SSD's live here. They are powered by two 15 pin SATA connectors. That puts all eight drives in the five 5.25" bays provided by the Lian Li case.
Much of what follows is adapted for this home server from documentation provided about the Hyades Supercomputer dedicated to Computational Astrophysics research at the University of California, Santa Cruz (UCSC).
FreeBSD-10.1 was originally installed but I've since upgraded to FreeBSD-11. I'll try to catch any changes required in this recipe to start from FreeBSD-11. There should be few if any deviations.
To create a bootable flash drive from a FreeBSD system, download the appropriate image then copy it to the flash drive like this.
$ dd if=/dev/zero of=/dev/da0 bs=64k count=10
$ dd if=FreeBSD-10.1-RELEASE-amd64-memstick.img of=/dev/da0 bs=64k
Do the initial bsdinstall setup. There will be dialog boxes for setting the time, connecting your network, giving a hostname, and creating user accounts and passwords. When you get to the partitioning tool, choose the Shell option.
The output below is current. The Marvell Console is a RAID configuration utility. There is a tool available from ASRock to turn it off; it isn't needed for just a bunch of disks. I did that before, but not after, replacing the motherboard in May. I've noticed no difference except that the console appears in the device list.
# camcontrol devlist
<WDC WD60EZRX-00MVLB1 80.00A80>    at scbus0 target 0 lun 0 (pass0,ada0)
<WDC WD60EZRX-00MVLB1 80.00A80>    at scbus1 target 0 lun 0 (pass1,ada1)
<WDC WD60EZRX-00MVLB1 80.00A80>    at scbus2 target 0 lun 0 (pass2,ada2)
<WDC WD60EZRX-00MVLB1 80.00A80>    at scbus3 target 0 lun 0 (pass3,ada3)
<WDC WD60EZRX-00MVLB1 80.00A80>    at scbus4 target 0 lun 0 (pass4,ada4)
<INTEL SSDSC2BF120A5 TG20>         at scbus5 target 0 lun 0 (pass5,ada5)
<Marvell Console 1.01>             at scbus9 target 0 lun 0 (pass6)
<INTEL SSDSA2CT040G3 4PC10362>     at scbus14 target 0 lun 0 (pass7,ada6)
<INTEL SSDSA2CT040G3 4PC10362>     at scbus15 target 0 lun 0 (pass8,ada7)
If you had drives on a different interface, for example USB, you'd see them listed as da0, da1, etc., instead of ada0, ada1, and so on with the SATA drives.
Set a sysctl variable forcing ZFS to choose 4k disk blocks. Some drives use 4k sectors but lie by reporting 512-byte sectors to allow interoperability with 32-bit Windows. This step may increase the speed of the array by 5% to 20% at the cost of wasted space. If you have a huge number of small ( <4k ) files this could be a bad idea.
# sysctl vfs.zfs.min_auto_ashift=12
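As an aside on where the 12 comes from: ashift is the base-2 logarithm of the sector size ZFS will use, so the two values that appear in this recipe work out to:

```shell
# ashift is log2 of the sector size ZFS uses for the vdev:
echo $((1 << 12))   # ashift=12 -> 4096-byte (4k) sectors
echo $((1 << 9))    # ashift=9  -> 512-byte sectors
```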
Remove any extant partitioning.
# gpart destroy -F ada0
# gpart destroy -F ada1
# gpart destroy -F ada2
# gpart destroy -F ada3
# gpart destroy -F ada4
# gpart destroy -F ada5
# gpart destroy -F ada6
# gpart destroy -F ada7
Assign to the drives a GUID Partition Table scheme. If you have some odd hardware or are a masochist you could choose instead BSD disklabel, MBR or one of several other schemes.
# gpart create -s gpt ada0
# gpart create -s gpt ada1
# gpart create -s gpt ada2
# gpart create -s gpt ada3
# gpart create -s gpt ada4
# gpart create -s gpt ada5
# gpart create -s gpt ada6
# gpart create -s gpt ada7
# gpart add -a 4k -t freebsd-zfs -l hdd0 ada0
# gpart add -a 4k -t freebsd-zfs -l hdd1 ada1
# gpart add -a 4k -t freebsd-zfs -l hdd2 ada2
# gpart add -a 4k -t freebsd-zfs -l hdd3 ada3
# gpart add -a 4k -t freebsd-zfs -l hdd4 ada4
# gpart add -t freebsd-zfs -l cache0 ada5
The cache drive (natively 512-byte sectors) could not be 4k aligned due to an error resolved by a later update. The work-around at the time was to revert the modified sysctl variable and make another sysctl change before adding the drive, then restore both variables afterward.
# sysctl vfs.zfs.min_auto_ashift=9
# sysctl vfs.zfs.max_auto_ashift=9
# gpart add -t freebsd-zfs -l cache0 ada5
# sysctl vfs.zfs.min_auto_ashift=12
# sysctl vfs.zfs.max_auto_ashift=12
I wanted ZIL redundancy, so I set up a pair of drives to be mirrored. The operating system will be mirrored also. The boot sectors and swap space are duplicated. If one of these fails the other will, in theory, be available to boot and the pool will operate normally until the drive can be replaced. I bought 4 of the Intel 320 Series drives at a discounted price. The two extra drives remain unused.
# gpart add -s 222 -a 4k -t freebsd-boot -l boot0 ada6
# gpart add -s 1g -a 4k -t freebsd-swap -l swap0 ada6
# gpart add -s 16g -a 4k -t freebsd-zfs -l ssd0 ada6
# gpart add -s 16g -a 4k -t freebsd-zfs -l log0 ada6
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada6
# gpart add -s 222 -a 4k -t freebsd-boot -l boot1 ada7
# gpart add -s 1g -a 4k -t freebsd-swap -l swap1 ada7
# gpart add -s 16g -a 4k -t freebsd-zfs -l ssd1 ada7
# gpart add -s 16g -a 4k -t freebsd-zfs -l log1 ada7
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada7
The 222-block freebsd-boot partition size was, I think, copied from the guide I was following at the time. It's fine, but I've been trying to recall the logic behind that number and can't. At 512 bytes per block that comes to about 113k. The bootcode for FreeBSD-10.1 was under 64k; I updated the bootcode during the update to FreeBSD-11 and it is now 87k. There was, and may still be, a 512k limit imposed by the loader on the size of the boot partition. I think next time I'll use all of it for future-proofing.
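The arithmetic behind those figures, assuming the usual 512-byte blocks gpart counts in:

```shell
# 222 blocks of 512 bytes each for the freebsd-boot partition
echo $((222 * 512))   # 113664 bytes, i.e. the "about 113k" mentioned above
```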
# kldload zfs
# zpool create -o altroot=/mnt -O canmount=off -m none zroot mirror /dev/gpt/ssd0 /dev/gpt/ssd1
# zfs set checksum=fletcher4 zroot
# zfs set atime=off zroot
# zfs create -o mountpoint=none zroot/ROOT
# zfs create -o mountpoint=/ zroot/ROOT/default
# zfs create -o mountpoint=/home -o setuid=off zroot/home
# zfs create -o mountpoint=/tmp -o compression=lz4 -o setuid=off zroot/tmp
# chmod 1777 /mnt/tmp
# zfs create -o mountpoint=/usr zroot/usr
# zfs create zroot/usr/local
# zfs create zroot/usr/obj
# zfs create -o compression=lz4 -o setuid=off zroot/usr/ports
# zfs create -o compression=off -o exec=off -o setuid=off zroot/usr/ports/distfiles
# zfs create -o compression=off -o exec=off -o setuid=off zroot/usr/ports/packages
# zfs create -o compression=lz4 -o exec=off -o setuid=off zroot/usr/src
# zfs create -o mountpoint=/var zroot/var
# zfs create -o compression=lz4 -o exec=off -o setuid=off zroot/var/crash
# zfs create -o exec=off -o setuid=off zroot/var/db
# zfs create -o compression=lz4 -o exec=on -o setuid=off zroot/var/db/pkg
# zfs create -o exec=off -o setuid=off zroot/var/empty
# zfs set readonly=on zroot/var/empty
# zfs create -o compression=lz4 -o exec=off -o setuid=off zroot/var/log
# zfs create -o compression=gzip -o exec=off -o setuid=off zroot/var/mail
# zfs create -o exec=off -o setuid=off zroot/var/run
# zfs create -o compression=lz4 -o exec=on -o setuid=off zroot/var/tmp
# chmod 1777 /mnt/var/tmp
# zpool set bootfs=zroot/ROOT/default zroot
# cat << EOF > /tmp/bsdinstall_etc/fstab
#Device          Mountpoint  FStype  Options  Dump  Pass#
/dev/gpt/swap0   none        swap    sw       0     0
/dev/gpt/swap1   none        swap    sw       0     0
EOF
When the installation is complete, choose Exit from the main menu. The next dialog will offer the option to 'open a shell in the new system to make any final manual modifications'. Select Yes.
# mount -t devfs devfs /dev
# echo 'zfs_enable="YES"' >> /etc/rc.conf
# echo 'zfs_load="YES"' >> /boot/loader.conf
# echo 'vfs.zfs.vdev.cache.size="32M"' >> /boot/loader.conf
# echo "vfs.zfs.min_auto_ashift=12" >> /etc/sysctl.conf
# echo "vfs.zfs.max_auto_ashift=12" >> /etc/sysctl.conf
# zfs set readonly=on zroot/var/empty
Exit the shell, remove the flash drive and reboot.
Create a zpool over the five 6TB drives and give it a file system. Here I named it simply nas and chose a raidz1 configuration.
# zpool create -m none nas raidz1 \
    /dev/gpt/hdd0 /dev/gpt/hdd1 /dev/gpt/hdd2 /dev/gpt/hdd3 /dev/gpt/hdd4 \
    cache /dev/gpt/cache0 log mirror /dev/gpt/log0 /dev/gpt/log1
# zfs set checksum=fletcher4 nas
# zfs set atime=off nas
# zfs set mountpoint=/export/nas nas
# zfs set setuid=off nas
# chmod 1777 /export/nas
$ zpool status
  pool: nas
 state: ONLINE
  scan: none requested
config:

        NAME            STATE     READ WRITE CKSUM
        nas             ONLINE       0     0     0
          raidz1-0      ONLINE       0     0     0
            gpt/hdd0    ONLINE       0     0     0
            gpt/hdd1    ONLINE       0     0     0
            gpt/hdd2    ONLINE       0     0     0
            gpt/hdd3    ONLINE       0     0     0
            gpt/hdd4    ONLINE       0     0     0
        logs
          mirror-1      ONLINE       0     0     0
            gpt/log0    ONLINE       0     0     0
            gpt/log1    ONLINE       0     0     0
        cache
          gpt/cache0    ONLINE       0     0     0

errors: No known data errors

  pool: zroot
 state: ONLINE
  scan: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        zroot         ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            gpt/ssd0  ONLINE       0     0     0
            gpt/ssd1  ONLINE       0     0     0

errors: No known data errors
$ zpool iostat nas 5
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
nas          291G  27.0T      0    136      0  11.6M
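A side note on the numbers: zpool iostat reports raw pool space, parity included, in binary (TiB) units, which is why five 6 TB (decimal) drives show up as roughly 27T. A quick sketch of the accounting, using awk for the floating-point math:

```shell
# Five 6 TB drives converted from decimal bytes to binary TiB, the units
# zfs reports; raidz1 then spends one drive's worth of space on parity.
awk 'BEGIN {
    raw = 5 * 6e12 / (1024 ^ 4)          # raw pool space in TiB
    printf "raw:    %.1f TiB\n", raw     # matches the ~27T free column above
    printf "usable: %.1f TiB\n", raw * 4 / 5
}'
```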
After installation of the FreeBSD-10.3 upgrade on April 29, 2016 I noticed the cache had lost its GPT label. Deleting and recreating the partition fixed it. I attempted to format it 4k-aligned again, even though it's not a 4k drive, and it worked this time. The option to offset the partition's starting block by 2048 is something I read about somewhere; it's a common alignment convention, and while I can't recall exactly where past self picked it up, it does work.
# zpool remove nas gpt/cache0
or maybe it was
# zpool remove nas diskid/DISK-WHATEVERTHEDISKIDWAS
# gpart delete -i 1 ada5
# gpart add -b 2048 -a 4k -t freebsd-zfs -l cache ada5
# zpool add nas cache gpt/cache
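For what it's worth, gpart's -b argument is a starting sector rather than a byte count, so -b 2048 places the partition 1 MiB into the disk, an offset that is also a clean multiple of 4k:

```shell
# 2048 sectors of 512 bytes each = a 1 MiB offset from the start of the disk
echo $((2048 * 512))          # 1048576 bytes
echo $((2048 * 512 / 4096))   # 256 whole 4k blocks, so 4k alignment holds
```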
Excepting the issue with the motherboard, which upon further reading I believe is caused by a flaw in the C2000 series processor and not the motherboard design, this setup has been remarkably stable. It lives behind an APC BackUPS 1500 and I have a small Honda generator for the occasions when power outages last longer than a few minutes. There were 376 days of uninterrupted operation between the reboot during the FreeBSD-10.3 update and the motherboard / processor failure in May. Today, Thursday August 24, 2017, uptime is 90 days.
I'm happy with the performance. The heaviest workload for this server was the initial zfs send/receive operation which copied to it approximately 8TB of data. Right now it is serving up a 1080p video, receiving bittorrent traffic via NFS, and receiving files copied from a LAN-connected computer over SSH via the scp command. Here are the current iostats. The first output line is an average over time and the second line shows the operations and bandwidth during the previous one-second interval.
$ zpool iostat nas 1 2
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
nas         12.4T  14.8T     13     22  1.54M   327K
nas         12.4T  14.8T    105    759  13.1M  43.0M
There are a couple Windows machines used for watching video in other rooms via Samba over wifi. They can be and often are going simultaneously. None of this is processor intensive so I also use the box for some transcoding.
Here's the latest zpool status showing the results of a recent scrub. There haven't been any read, write, or checksum errors with this build.
$ zpool status
  pool: nas
 state: ONLINE
  scan: scrub repaired 0 in 10h30m with 0 errors on Wed Aug 23 20:56:12 2017
config:

        NAME           STATE     READ WRITE CKSUM
        nas            ONLINE       0     0     0
          raidz1-0     ONLINE       0     0     0
            gpt/hdd0   ONLINE       0     0     0
            gpt/hdd1   ONLINE       0     0     0
            gpt/hdd2   ONLINE       0     0     0
            gpt/hdd3   ONLINE       0     0     0
            gpt/hdd4   ONLINE       0     0     0
        logs
          mirror-1     ONLINE       0     0     0
            gpt/log0   ONLINE       0     0     0
            gpt/log1   ONLINE       0     0     0
        cache
          gpt/cache    ONLINE       0     0     0

errors: No known data errors

  pool: zroot
 state: ONLINE
  scan: scrub repaired 0 in 0h1m with 0 errors on Wed Aug 23 10:24:40 2017
config:

        NAME          STATE     READ WRITE CKSUM
        zroot         ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            gpt/ssd0  ONLINE       0     0     0
            gpt/ssd1  ONLINE       0     0     0

errors: No known data errors
Much of this section takes advantage of instruction provided by the invaluable FreeBSD Handbook. The exception is the NFSv4 configuration, which uses the sharenfs feature of ZFS and is adapted from the configuration of the Hyades Supercomputer dedicated to Computational Astrophysics research at the University of California, Santa Cruz (UCSC).
An example is given here for the hostname and a commonly used private IP address. The interface name igb0 identifies the ethernet interface: igb is the name of the FreeBSD driver for the Intel(R) Gigabit Ethernet adapter used on this motherboard, suffixed with the interface number. Unbound, a local caching DNS resolver, is enabled, as are SSH, NTP, and crash dumps. The line which enables ZFS on startup was added in Part 2 and is present as well.
# cat /etc/rc.conf
hostname="yourhostname.example.com"
defaultrouter="192.168.0.1"
ifconfig_igb0="DHCP"
ifconfig_igb0_ipv6="inet6 accept_rtadv"
local_unbound_enable="YES"
sshd_enable="YES"
ntpd_enable="YES"
# Set dumpdev to "AUTO" to enable crash dumps, "NO" to disable
dumpdev="AUTO"
zfs_enable="YES"
The rpcbind utility is a server that converts RPC program numbers into universal addresses and is required by NFS. The optional rpc.lockd provides file and record locking services in the NFS environment, and rpc.statd cooperates with rpc.statd daemons on other hosts to provide status monitoring. While rpc.lockd and rpc.statd are optional, some applications require file locking to operate correctly. This machine will serve NFS over both UDP and TCP transports using 32 daemons.
# echo 'rpcbind_enable="YES"' >> /etc/rc.conf
# echo 'rpc_lockd_enable="YES"' >> /etc/rc.conf
# echo 'rpc_statd_enable="YES"' >> /etc/rc.conf
# echo 'nfs_server_enable="YES"' >> /etc/rc.conf
# echo 'nfs_server_flags="-u -t -n 32"' >> /etc/rc.conf
# service nfsd start
Export the ZFS filesystem to the private subnet.
# zfs set sharenfs="-maproot=root -network=192.168.0.0/24" nas
The share is exported to the private subnet with no_root_squash (-maproot=root). One might prefer to squash root privileges on the share, in which case you could map root to another user or to nobody (-maproot=nobody).
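A sketch of that more conservative alternative, assuming the same pool name and subnet used throughout this build:

```shell
# Squash root on the share by mapping remote root to nobody instead.
zfs set sharenfs="-maproot=nobody -network=192.168.0.0/24" nas
```

This is a config fragment only; it needs a live pool named nas to actually run against.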
The export list is written to /etc/zfs/exports rather than FreeBSD's standard location at /etc/exports, and the share is exported immediately.
# cat /etc/exports
cat: /etc/exports: No such file or directory
# cat /etc/zfs/exports
# !!! DO NOT EDIT THIS FILE MANUALLY !!!
/nas	-maproot=root -network=192.168.0.0/24
# showmount -e
Exports list on localhost:
/nas                               192.168.0.0
# cd /usr/ports/sysutils/zfstools/
# make install clean
Installing zfstools-0.3.6_1...
To enable automatic snapshots, place lines such as these into /etc/crontab:

PATH=/etc:/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin
15,30,45 * * * * root /usr/local/sbin/zfs-auto-snapshot frequent  4
0        * * * * root /usr/local/sbin/zfs-auto-snapshot hourly   24
7 0      * * *   root /usr/local/sbin/zfs-auto-snapshot daily     7
14 0     * * 7   root /usr/local/sbin/zfs-auto-snapshot weekly    4
28 0     1 * *   root /usr/local/sbin/zfs-auto-snapshot monthly  12

This will keep 4 15-minutely snapshots, 24 hourly snapshots, 7 daily snapshots,
4 weekly snapshots and 12 monthly snapshots. Any resulting zero-sized snapshots
will be automatically cleaned up.

Enable snapshotting on a dataset or top-level pool with:

zfs set com.sun:auto-snapshot=true DATASET

Children datasets can be disabled for snapshot with:

zfs set com.sun:auto-snapshot=false DATASET

Or for specific intervals:

zfs set com.sun:auto-snapshot:frequent=false DATASET

See website and command usage output for further details.
===> Cleaning for zfstools-0.3.6_1
A check of zfs properties and a look into /etc/crontab will show those instructions were followed.
# zfs get com.sun:auto-snapshot nas
NAME  PROPERTY               VALUE  SOURCE
nas   com.sun:auto-snapshot  true   local
# cat /etc/crontab
# /etc/crontab - root's crontab for FreeBSD
#
# $FreeBSD: releng/11.0/etc/crontab 194170 2015-03-15 13:06:12Z ayylmao $
#
SHELL=/bin/sh
PATH=/etc:/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin
15,30,45 * * * * root /usr/local/sbin/zfs-auto-snapshot frequent 4
0 * * * * root /usr/local/sbin/zfs-auto-snapshot hourly 24
7 0 * * * root /usr/local/sbin/zfs-auto-snapshot daily 7
14 0 * * 7 root /usr/local/sbin/zfs-auto-snapshot weekly 4
28 0 1 * * root /usr/local/sbin/zfs-auto-snapshot monthly 12
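A quick sanity check of the retention policy those crontab lines implement, per the counts in the package message:

```shell
# frequent + hourly + daily + weekly + monthly snapshots retained per dataset
echo $((4 + 24 + 7 + 4 + 12))   # 51 snapshots at steady state
```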
This attempt to find where the snapshots were stored provided little information.
# zfs get snapdir nas
NAME  PROPERTY  VALUE   SOURCE
nas   snapdir   hidden  default
# zfs list -t snapshot
NAME                                          USED  AVAIL  REFER  MOUNTPOINT
nas@20150324                                  128K      -   153K  -
nas@zfs-auto-snap_monthly-2016-09-01-00h28   9.64G      -  8.44T  -
nas@zfs-auto-snap_monthly-2016-10-01-00h28   6.18G      -  8.50T  -
nas@zfs-auto-snap_monthly-2016-11-01-00h28   2.69G      -  8.64T  -
nas@zfs-auto-snap_monthly-2016-12-01-00h28   9.70G      -  8.74T  -
nas@zfs-auto-snap_monthly-2017-01-01-00h28   15.0M      -  8.79T  -
nas@zfs-auto-snap_monthly-2017-02-01-00h28   15.3M      -  8.86T  -
nas@zfs-auto-snap_monthly-2017-03-01-00h28   1022M      -  8.94T  -
nas@zfs-auto-snap_monthly-2017-04-01-00h28   16.9G      -  9.01T  -
nas@zfs-auto-snap_monthly-2017-05-01-00h28   3.43G      -  9.04T  -
nas@zfs-auto-snap_monthly-2017-06-01-00h28   9.77G      -  9.07T  -
nas@zfs-auto-snap_monthly-2017-07-01-00h28   63.2G      -  9.25T  -
nas@zfs-auto-snap_weekly-2017-07-30-00h14    4.90G      -  9.19T  -
nas@zfs-auto-snap_monthly-2017-08-01-00h28   22.7G      -  9.27T  -
nas@zfs-auto-snap_weekly-2017-08-06-00h14    30.5G      -  9.42T  -
nas@zfs-auto-snap_weekly-2017-08-13-00h14    4.94G      -  9.38T  -
nas@zfs-auto-snap_daily-2017-08-20-00h07      428K      -  9.31T  -
nas@zfs-auto-snap_weekly-2017-08-20-00h14     377K      -  9.31T  -
nas@zfs-auto-snap_daily-2017-08-21-00h07      479K      -  9.32T  -
nas@zfs-auto-snap_daily-2017-08-22-00h07      639K      -  9.33T  -
nas@zfs-auto-snap_daily-2017-08-23-00h07      230K      -  9.31T  -
nas@zfs-auto-snap_daily-2017-08-24-00h07      153K      -  9.31T  -
nas@zfs-auto-snap_hourly-2017-08-24-14h00    8.27M      -  9.31T  -
nas@zfs-auto-snap_hourly-2017-08-24-15h00    10.9M      -  9.31T  -
nas@zfs-auto-snap_hourly-2017-08-24-16h00    16.5M      -  9.31T  -
nas@zfs-auto-snap_hourly-2017-08-24-17h00    1.65M      -  9.31T  -
nas@zfs-auto-snap_hourly-2017-08-24-18h00    1.65M      -  9.32T  -
nas@zfs-auto-snap_hourly-2017-08-24-19h00    1.65M      -  9.32T  -
nas@zfs-auto-snap_hourly-2017-08-24-20h00    1.65M      -  9.32T  -
nas@zfs-auto-snap_hourly-2017-08-24-21h00    1.65M      -  9.32T  -
nas@zfs-auto-snap_hourly-2017-08-24-22h00    1.65M      -  9.32T  -
nas@zfs-auto-snap_hourly-2017-08-24-23h00    1.65M      -  9.32T  -
nas@zfs-auto-snap_hourly-2017-08-25-00h00    1.65M      -  9.32T  -
nas@zfs-auto-snap_daily-2017-08-25-00h07     1.65M      -  9.32T  -
nas@zfs-auto-snap_hourly-2017-08-25-01h00    1.65M      -  9.32T  -
nas@zfs-auto-snap_hourly-2017-08-25-02h00    1.65M      -  9.32T  -
nas@zfs-auto-snap_hourly-2017-08-25-03h00    1.65M      -  9.32T  -
nas@zfs-auto-snap_hourly-2017-08-25-04h00    1.65M      -  9.32T  -
nas@zfs-auto-snap_hourly-2017-08-25-05h00    1.65M      -  9.32T  -
nas@zfs-auto-snap_hourly-2017-08-25-06h00    1.65M      -  9.32T  -
nas@zfs-auto-snap_hourly-2017-08-25-07h00    1.65M      -  9.32T  -
nas@zfs-auto-snap_hourly-2017-08-25-08h00    1.65M      -  9.32T  -
nas@zfs-auto-snap_hourly-2017-08-25-09h00    1.65M      -  9.32T  -
nas@zfs-auto-snap_hourly-2017-08-25-10h00    1.65M      -  9.32T  -
nas@zfs-auto-snap_frequent-2017-08-25-10h15  1.65M      -  9.32T  -
nas@zfs-auto-snap_frequent-2017-08-25-10h30  1.52M      -  9.32T  -
nas@zfs-auto-snap_frequent-2017-08-25-10h45  1.52M      -  9.32T  -
nas@zfs-auto-snap_hourly-2017-08-25-16h00     115K      -  9.32T  -
nas@zfs-auto-snap_hourly-2017-08-25-20h00     115K      -  9.32T  -
nas@zfs-auto-snap_daily-2017-08-26-00h07         0      -  9.31T  -
nas@zfs-auto-snap_hourly-2017-08-26-14h00        0      -  9.31T  -
nas@zfs-auto-snap_frequent-2017-08-26-14h15      0      -  9.31T  -
zroot@first                                      0      -    96K  -
Then it was just a matter of choosing the snapshot I wanted and issuing the following command. We'll pretend the dates are back then instead of now.
# zfs rollback nas@zfs-auto-snap_frequent-2017-08-26-14h15
Not bad, not bad at all.
I installed the smartmontools package which contains two utility programs (smartctl and smartd) to control and monitor storage systems using the Self-Monitoring, Analysis and Reporting Technology System (S.M.A.R.T.).
# cd /usr/ports/sysutils/smartmontools/
# make install clean
# echo 'smartd_enable="YES"' >> /etc/rc.conf
# echo 'daily_status_smart_devices="/dev/ada0 /dev/ada1 /dev/ada2 /dev/ada3 /dev/ada4 /dev/ada5 /dev/ada6 \
  /dev/ada7"' >> /etc/periodic.conf
Google published some research on drive failures which isn't comforting with regard to SMART monitoring being predictive. The authors state "We find, for example, that after their first scan error, drives are 39 times more likely to fail within 60 days than drives with no such errors. First errors in reallocations, offline reallocations, and probational counts are also strongly correlated to higher failure probabilities. Despite those strong correlations, we find that failure prediction models based on S.M.A.R.T. parameters alone are likely to be severely limited in their prediction accuracy, given that a large fraction of our failed drives have shown no S.M.A.R.T. error signals whatsoever." So while it's better than nothing, it's no panacea. I look at the daily status reports, and if I see anything negative there I'll swap out a drive, if I'm lucky enough to get that warning. Still, with raidz1 the array can lose one drive without data loss, and if the worst happens and a second drive fails before the first is replaced, I have recourse to the offsite backup. There is now more data on this array than the offsite backup can contain, but everything important is backed up, leaving only easily replaced files at risk should two or more drives fail at nearly the same time.
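Rather than eyeball the whole daily report, the attributes that paper singles out can be pulled out of smartctl output with awk. A minimal sketch, run here against a canned, hypothetical excerpt of `smartctl -A` output rather than a live drive:

```shell
# Hypothetical two-line excerpt of `smartctl -A /dev/ada0` output, canned
# for illustration; on the real box you'd pipe smartctl straight into awk.
report='  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0'

# Print the raw value (last field) of each attribute worth watching;
# anything nonzero here is reason to line up a replacement drive.
echo "$report" | awk '$2 ~ /Reallocated_Sector_Ct|Current_Pending_Sector/ {print $2, $NF}'
```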
I thought Samba was running on the server but it isn't. Samba serves the Windows computers from one of the NFS clients instead.
The server is normally headless with no monitor or keyboard attached. The tmux terminal multiplexer is installed, along with ffmpeg for transcoding video. I often ssh in to set transcoding tasks which run in tmux sessions. This doesn't seem to interfere with NFS.
You may wonder what in the world anyone would want with so much storage. The answer is mostly for linux distributions, but there's much other stuff. If I want to break a video into stills for editing and reassembly, no problem. I have before and may again build a search engine. Mine wasn't Google sized, but it was fast and worked well. I recall bumping into some file system limitations before switching to PostgreSQL. Those just don't exist with ZFS. Permutations are fun!
You may also wonder why I didn't use a ready-made software solution like FreeNAS or NAS4Free. I just never have is all. I don't know if they're better or worse than my setup. From what I've read those seem like a fine way to go about this.