Building our own high performance NAS / SAN

We have been using the Sun / Oracle ZFS Storage 7000 series for about 3 years now. We have grown pretty fond of it. Now it is time to make use of what we have learned.

We want to build our own NAS appliance, and a high-performing one at that. It will serve as storage for our primary Microsoft DAG nodes. That way the secondary DAG nodes can live on commercially backed FT storage (our Oracle ZFS Storage), and we still achieve IOPS you would not have dreamed of just a few years back !

A lot of thought has gone into sizing it, but without further ado – here is what we have bought :

Server : Dell PE R715
CPU : 2 X AMD 8 cores @3.4 GHz
Memory : 256 GB
HBA : LSI Logic 9207i-8 & LSI Logic 9207e-8
Spindles : 14 X Seagate Savvio 10.5K 600 GB SAS-2
ReadSSD / L2ARC : 5 X Seagate Pulsar 2, 200 GB SAS-2 ( MLC / eMLC )
WriteSSD / Log : 3 X Seagate Pulsar XT 2, 100 GB SAS-2 ( SLC )
JBOD : Dell MD1220

We want 10 spindles in a mirrored configuration, which gives us 5 two-way mirror vdevs for the ZFS pool. That should be fine, as most of our data will over time end up in the ARC and L2ARC. We “only” have about 750 GB of data that needs to be accelerated.

2 spindles are marked as spares and we keep the last 2 in the “drawer”.

4 X ReadSSDs in a striped configuration and 1 in the “drawer”.

2 X WriteSSDs in a striped configuration and 1 in the “drawer”.
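To make the planned layout concrete, here is a rough sketch that assembles the matching `zpool create` command line and does the capacity arithmetic. The device names (`disk0`, `rssd0`, `wssd0`, ...) and the pool name `tank` are made up for illustration, and the sizes are the nominal ones from the parts list, so treat the numbers as ballpark figures rather than what the pool will actually report.

```python
# Hypothetical device names; substitute whatever names your OS reports.
SPINDLES = [f"disk{i}" for i in range(12)]   # 10 data disks + 2 hot spares
READ_SSDS = [f"rssd{i}" for i in range(4)]   # L2ARC (cache) devices
WRITE_SSDS = [f"wssd{i}" for i in range(2)]  # log devices

SPINDLE_GB = 600    # nominal size of one spindle
READ_SSD_GB = 200   # nominal size of one read SSD
HOT_DATA_GB = 750   # data we want accelerated


def build_zpool_create(pool, data, spares, logs, caches):
    """Return the argument list for a pool made of two-way mirrors."""
    if len(data) % 2:
        raise ValueError("a mirrored layout needs an even number of data disks")
    cmd = ["zpool", "create", pool]
    # Pair the data disks up: each pair becomes one mirror vdev.
    for i in range(0, len(data), 2):
        cmd += ["mirror", data[i], data[i + 1]]
    cmd += ["spare", *spares]
    # Log and cache devices listed without "mirror" are used side by side.
    cmd += ["log", *logs]
    cmd += ["cache", *caches]
    return cmd


data_disks, spare_disks = SPINDLES[:10], SPINDLES[10:]
command = build_zpool_create("tank", data_disks, spare_disks, WRITE_SSDS, READ_SSDS)

mirror_vdevs = command.count("mirror")
usable_gb = mirror_vdevs * SPINDLE_GB        # one disk's worth per mirror
l2arc_gb = len(READ_SSDS) * READ_SSD_GB

print(" ".join(command))
print(f"{mirror_vdevs} mirror vdevs, about {usable_gb} GB before ZFS overhead")
print(f"L2ARC: {l2arc_gb} GB for {HOT_DATA_GB} GB of hot data")
```

The point of the arithmetic: 4 X 200 GB of L2ARC is nominally a bit more than the ~750 GB we want accelerated, which is why only 5 mirror vdevs of spindles should be enough.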

Initially we are hoping that the above configuration will give us ~ 100K OPS on reads and ~ 50K OPS on writes – both random on the “hot” data in ARC & L2ARC. As we will see later, our hopes were more than met 🙂

We are looking at the following contenders as the NAS OS / appliance software :

– Nexenta / Nexentastor
– SmartOS / OpenIndiana with napp-it.
– NAS4Free
– FreeNAS
– ZFS-guru

All of the above have ZFS built into the kernel, and not “just” the ZFS pool version 28 / file version 5 from the last Open Source Solaris 2009Q4. They all draw from the Illumos project, which aims at maintaining the OpenSolaris code and, most important for us, the ZFS code. Although the BSD derivatives are not directly attached to the Illumos codebase, their ZFS is.

Nexenta and napp-it are commercial products, so if your hands are already shaking a bit… you can opt for those. They fully back almost any HW you may choose, and make available both HW FT ( 2 active / active heads ) and a bunch of plugins… But if you want to use the NAS the way we are planning to, the extra safety of a commercially backed product is of no use. If our nodes go down, so what? The secondaries will stay online. At this point we are only trying to get a feel for it, not migrating all our storage needs.

Of the above, FreeNAS has some commercial backing via iXsystems, which bought the code a few years back while still keeping the OS free, thanks for that. As I am running FreeNAS on my home storage appliance, it was my first choice. Unfortunately the underlying FreeBSD is only at 8.3 and was not able to boot on the Dell PE R715 – comments on this are very welcome.

I also tried a clean FreeBSD 9.1 STABLE, but it was too much work to set up a ZFS-on-root boot environment and force ZFS to take advantage of the newer 4K-optimized disks. At least for me. More FreeBSD-savvy people would most likely succeed.
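For context on the 4K part: ZFS records the sector size of a vdev as its `ashift`, the base-2 logarithm of the sector size in bytes, so getting ZFS to treat a disk as a 4K disk comes down to ending up with an ashift of 12 rather than 9. A minimal sketch of that relationship (the function name is made up for illustration):

```python
def ashift_for(sector_bytes):
    """Return the ZFS ashift (log2 of the sector size) for a sector size in bytes."""
    # Sector sizes are powers of two; reject anything else.
    if sector_bytes <= 0 or sector_bytes & (sector_bytes - 1):
        raise ValueError("sector size must be a power of two")
    return sector_bytes.bit_length() - 1


print(ashift_for(512))   # legacy 512-byte sectors
print(ashift_for(4096))  # 4K sectors
```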

NAS4Free was the next one we tried. It is built on FreeBSD 9.1, and they take a more aggressive approach to keeping the software current than FreeNAS does. Me like !

NAS4Free has a very nice and clean UI. Not always very logical / intuitive, but nevertheless functional. We ran NAS4Free for 2 days without glitches and it is still a contender for the “win”.

Next we registered for a Nexenta Enterprise trial and installed it as ZFS-on-root on two built-in SAS drives. Works like a charm. The UI is a bit confusing at times, especially if you are used to the very intuitive and very slick UI of the Oracle ZFS appliances. It is also not nearly as responsive – but it works !

Nexenta has some advantages over its competitors, e.g. plugins for backing up your ZFS snapshots / ZFS clones to Amazon S3 – that is pretty sweet. For that alone, it is still a contender for the win.

For now we will be running Nexenta until the trial runs out, and hopefully FreeNAS will be ready with a version built on FreeBSD 9.1 in early June 2013.

So are you waiting for some benchmarks ?

We spent 3 days pounding the Nexenta box from a VM which reads and writes extensively on the same 200 GB DB file. Now, I know the difference between real-life performance and a synthetic benchmark, but the goal here was to see whether we could achieve 100K read OPS and 50K write OPS.

Running the query ( show max statistics ) on the Nexenta box gives me the following :

Peak Read OPS : 472,400 /sec.
Peak Write OPS : 320,800 /sec.

So, is that a lot? Yep !
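To put the peaks next to what we were hoping for, here is a small sketch of the arithmetic. It assumes the reported peaks are whole operations per second (472 thousand and 320 thousand, not fractions), and the names are only for illustration.

```python
TARGET_OPS = {"read": 100_000, "write": 50_000}   # what we hoped for
PEAK_OPS = {"read": 472_400, "write": 320_800}    # reported peaks


def headroom(peak, target):
    """Return how many times the target the measured peak is."""
    return peak / target


for kind in ("read", "write"):
    ratio = headroom(PEAK_OPS[kind], TARGET_OPS[kind])
    print(f"{kind}: {ratio:.1f}x the target")
```

That works out to roughly 4.7 times the read target and 6.4 times the write target. Keep in mind these are peak values, not sustained averages.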

Please comment if you feel I have left something out or you have some ideas on how we could improve the above setup.

Thanks for reading.

ZFS. All your disk are belong to you.