
post Equallogic Auto-Snapshot Manager for VMware

September 13th, 2008 @ 2:03 am

For a couple of months now, I’ve been hearing about the upcoming 4.0 firmware and Auto-Snapshot Manager, VMware edition for the Equallogic PS series SAN. This new snapshot provider would allow us to coordinate snapshots between VirtualCenter and the SAN, and, according to Dell/Equallogic, allow easy restoration of a single Virtual Machine from the SAN-based snapshots.

I had the opportunity on Thursday night to watch a pre-recorded demo as well as to attend a live webinar on Friday morning. I must say that after these demos and seeing exactly what this product does, I am stuck somewhere between excitement and disappointment. It’s a really cool concept, but I believe it still needs a lot of polishing, especially on the recovery side of things.

There are lots of awesome features in the new snapshot provider. We will have the ability to automatically, in a single click or scheduled task, trigger an ESX snapshot, including memory dump, then snapshot the SAN volume, followed by removing the ESX snapshot. This eliminates the journaling effect and associated performance hit and disk requirements of the ESX snapshots. This is all handled through a nice web interface, and the VirtualCenter folder tree is carried over, allowing snapshot schedules to be applied to groups of Virtual Machines.
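The ordering described above (ESX snapshot first, SAN volume snapshot second, ESX snapshot removed last) can be sketched roughly as follows. This is only an illustration of the sequence: `esx_snapshot` and `protect` are hypothetical names, and the functions just record steps in a log rather than calling any real VMware or Equallogic API.

```python
from contextlib import ExitStack, contextmanager


@contextmanager
def esx_snapshot(vm, log):
    """Hold a temporary ESX snapshot of `vm` for the duration of the block.

    Hypothetical stand-in: it only logs what a real tool would do.
    """
    log.append(f"create ESX snapshot: {vm}")
    try:
        yield
    finally:
        # Removing the ESX snapshot right away is what avoids the long-lived
        # journaling overhead on the host.
        log.append(f"remove ESX snapshot: {vm}")


def protect(vms, volume):
    """Snapshot the selected VMs, then the SAN volume, then clean up."""
    log = []
    with ExitStack() as stack:
        for vm in vms:
            stack.enter_context(esx_snapshot(vm, log))
        # Every selected VM is now in a consistent state on disk.
        log.append(f"snapshot SAN volume: {volume}")
    return log


if __name__ == "__main__":
    for step in protect(["web01", "db01"], "vol1"):
        print(step)
```

The SAN snapshot always lands between the ESX snapshot creation and removal steps, which is the property the feature relies on.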

There are, however, some catches. Only the selected VMs are triggered for ESX snapshots, but the entire SAN volume, which may contain many other VMs, is snapshotted. This makes the ability to group VMs using VirtualCenter folders less than useful. Let’s say I have four VMs split between two volumes and want one machine on each volume to be snapshotted every 12 hours. Then, I want the other machine on each volume to be snapshotted every 24 hours. In this scenario, I will actually end up with, at the SAN level, two snapshots per day of both entire volumes and all the VMs, since the entire volume is snapshotted each time. So, in my opinion, snapshotting VMs by any grouping other than an entire SAN volume isn’t going to be practical without a lot of wasted disk space.
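To make the arithmetic in that scenario concrete, here is a small sketch. The VM and volume names are made up, and it assumes both schedules start at the same hour so that coinciding triggers count as one snapshot time; if the schedules were offset, each volume would see three snapshot times per day instead of two.

```python
from collections import defaultdict

# Hypothetical layout: VM name -> (volume, requested snapshot interval in hours)
schedule = {
    "vm1": ("volA", 12),
    "vm2": ("volA", 24),
    "vm3": ("volB", 12),
    "vm4": ("volB", 24),
}


def volume_snapshot_hours(schedule, day_hours=24):
    """Return, per volume, the hours of the day at which a SAN snapshot fires."""
    fired = defaultdict(set)
    for volume, interval in schedule.values():
        fired[volume].update(range(0, day_hours, interval))
    return {volume: sorted(hours) for volume, hours in fired.items()}


if __name__ == "__main__":
    per_volume = volume_snapshot_hours(schedule)
    for vm, (volume, interval) in schedule.items():
        wanted = 24 // interval
        actual = len(per_volume[volume])
        print(f"{vm} on {volume}: wanted {wanted}/day, captured {actual}/day")
```

The VMs that only needed one snapshot a day end up captured in every snapshot of their volume, which is where the wasted space comes from.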

On the recovery side, I think there is a lot of room for improvement. It is very easy to revert an entire volume and all the VMs it contains. Beyond that, restoring a single VM, for example, becomes a somewhat lengthy process. Basically, it involves going back to the Equallogic Group Manager, setting the snapshot online, going to ESX and mounting the snapshot as a new volume, deleting the damaged VM, copying it manually from the snapshot to the production volume, adding it to inventory, booting it up, and then unmounting the snapshot. Alternatively, the VM can be booted from the snapshot volume, then migrated back to the production volume using Storage VMotion. Storage VMotion, however, requires accessing the ESX command line.

It is my hope that, in a future release, Dell will automate some of the recovery process using the VMware APIs. Currently, there are lots of improvements in creating the snapshot, but no real change in the process of recovering a VM.

I am looking forward to getting the Auto-Snapshot Manager, VMware edition installed and actually seeing it in action in our production environment. Expect another post in the future with more details once I get this up and running.

post Equallogic SAN Expansion

August 3rd, 2008 @ 2:42 pm

Filed under: Storage

Last December, we implemented 7TB of Equallogic storage as the backbone of our VMware Virtual Infrastructure implementation, as well as to serve as primary storage for our file, Exchange, and SQL servers.  Little did we know that 6 months later we’d be near capacity and shopping for more storage.

Thanks to James at EIS, we now have another 16TB of raw storage online.  Combined with our existing array and considering RAID overhead, we now have just under 15TB of usable iSCSI storage.  I’m excited to have this done!
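As a quick sanity check on those numbers, the sketch below adds up the raw capacity and compares it with the usable figure. The 14.82TB value is the usable total reported later in this post; the post doesn't say which RAID level or how many spares are in use, so the script only reports the overall ratio rather than trying to break the overhead down.

```python
existing_raw_tb = 7    # original array, per the post
new_raw_tb = 16        # new array: 16 drives x 1TB
usable_tb = 14.82      # usable capacity reported after the expansion

total_raw_tb = existing_raw_tb + new_raw_tb
usable_fraction = usable_tb / total_raw_tb

print(f"raw: {total_raw_tb} TB, usable: {usable_tb} TB "
      f"({usable_fraction:.1%} of raw)")
```

So roughly two thirds of the raw capacity ends up usable once RAID overhead and everything else is accounted for.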

I absolutely love our Equallogic SAN!  In less than 30 minutes, the new storage array was configured and added into the cluster.  The volumes were automatically distributed and network traffic load balanced among the arrays.  The only complaint I have is that their rack rail system could use some improvement.  Getting the array installed in the rack is the most time consuming part of the entire implementation.

Check it out:

Array was sitting in my office when I arrived on Friday:

Unpacked and ready to be installed.  It has 16 drives with a capacity of 1TB each:

Racked next to its little brother.  This is a total of 30 disks and 23TB of raw storage capacity:

Total storage capacity of 14.82TB:

The performance and raw throughput of the Equallogic gear amaze me.  Orion was reporting 534Mbps of traffic between the arrays during setup:

Hopefully we will have plenty of space for a while now, although that may change as we move toward implementing Final Cut Server and centralized storage for digital video.

post Successful SAN and VMware Upgrades

May 28th, 2008 @ 12:04 am

While everyone was away for the holiday Monday, I took the opportunity to upgrade our SAN and ESX servers.  Everything went surprisingly well.

What was really impressive is how fast the Equallogic SAN reboots.  The firmware upgrade was the first reboot since it was installed.  They claimed you could reboot it “live” without causing any problems with the servers, but I had never tested that theory until now.  I was sending it one ping per second during the entire process.  I dropped a total of 12 pings during the reboot and the servers never knew the storage had just rebooted.  Pretty impressive!  Check this out (I did it from home, hence the 12-15ms latency):
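For a rough sense of what 12 dropped pings means in time, here is a small sketch that turns a ping trace into an outage estimate. The trace is made up for illustration, and it assumes the 12 drops were consecutive, which the numbers above don't actually confirm.

```python
def longest_outage(replies, interval_s=1.0):
    """Return (dropped_count, longest_gap_seconds) for a ping trace.

    `replies` is an iterable of booleans, True when that ping got a reply.
    """
    dropped = 0
    longest = 0
    current = 0
    for ok in replies:
        if ok:
            current = 0
        else:
            dropped += 1
            current += 1
            longest = max(longest, current)
    return dropped, longest * interval_s


# Hypothetical trace: replies, then 12 consecutive drops, then replies again.
trace = [True] * 30 + [False] * 12 + [True] * 30

if __name__ == "__main__":
    dropped, gap = longest_outage(trace)
    print(f"dropped {dropped} pings, longest gap about {gap:.0f}s")
```

At one ping per second, 12 consecutive drops works out to an interruption of about 12 seconds.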

I also migrated all of our ESX servers from version 3.0.2 to 3.5.  For some reason, the HA agent had to be reconfigured on a couple of them, and the ESX firewall decided to block outbound iSCSI traffic on every box after the upgrade.  Other than that, the ESX upgrades went great!

Our first diskless ESX server is now online also.  The QLogic HBA initially wouldn’t connect to our SAN using jumbo frames.  QLogic’s response was to send me their “Beta” or “Limited Release” firmware, which scares me a little.  I have several production VMs running on that host with no issues though.  I hope to do some benchmarks on VMware Server vs ESX with software iSCSI vs ESX with hardware iSCSI.  Stay tuned for details on that!

I love it when a project goes as planned!

post Networking for iSCSI

May 25th, 2008 @ 3:34 am

Filed under: Networking, Servers, Storage

I’ve received several comments and questions on my post from a few days ago, “iSCSI Slow? I Think Not.”  The network hardware is critical for peak iSCSI performance.  I think a brief follow-up with some details on our network configuration is in order.

We are using a Cisco Catalyst 6506 switch at the core of our network, which handles all of our iSCSI traffic.  The current configuration looks like this:

  • (1)  WS-X6K-SUP2-MSFC2 with PFC2 supervisor module
  • (2)  WS-X6148A-GE-TX gigabit modules (connects all server and iSCSI devices)
  • (1)  WX-X6414-GBIC fiber module (backbone to all of our IDF’s)

All SAN ports are configured for Jumbo Frames and Flow Control.

The servers are HP DL360 G5s with NC360T NICs.  I just deployed a new ESX server with a QLogic iSCSI HBA, but I don’t have any benchmarks on that yet.  I’ll post details once I run some.  I’m interested in whether there will be a big performance increase over the ESX software iSCSI initiator.

post iSCSI Slow? I Think Not

May 22nd, 2008 @ 1:36 am

Filed under: Networking, Storage

People love to talk bad about iSCSI, especially “Those Other SAN Vendors” (i.e., The Fibre Channel People).  I’ve had a couple of vendors tell me iSCSI is not an enterprise solution and I’d never see over 250Mbps of throughput.  I love proving them wrong.

Check out the images below.  Note that the two transfers below were happening SIMULTANEOUSLY to a single Equallogic PS300 SAN.  That’s a combined throughput of 1.06Gbps! iSCSI rocks!  The key is the network really.  High-end switches with big port buffers, jumbo frames, and flow control are a must.
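To put some numbers behind the jumbo frames point, here is a back-of-the-envelope sketch of how much of the wire carries TCP payload at a standard 1500-byte MTU versus a 9000-byte jumbo MTU. It assumes plain IPv4 and TCP headers with no options and ignores iSCSI's own PDU headers, so treat it as an approximation rather than a measurement of our network.

```python
# Per-frame Ethernet overhead in bytes:
# preamble + SFD (8) + header (14) + FCS (4) + inter-frame gap (12)
ETH_OVERHEAD = 38
# IPv4 header (20) + TCP header (20), assuming no options
IP_TCP_HEADERS = 40


def tcp_payload_efficiency(mtu):
    """Fraction of on-the-wire bytes that carry TCP payload for a given MTU."""
    return (mtu - IP_TCP_HEADERS) / (mtu + ETH_OVERHEAD)


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: {tcp_payload_efficiency(mtu):.1%} payload")

    combined_gbps = 1.06  # combined throughput quoted in the post
    print(f"{combined_gbps} Gbps is about {combined_gbps * 1000 / 8:.1f} MB/s")
```

The on-the-wire gain is only a few percent (roughly 95% versus 99%), so much of the practical benefit of jumbo frames likely comes from handling about six times fewer frames for the same amount of data.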

post iSCSI for Video Editing/Archiving

May 20th, 2008 @ 7:42 am

Filed under: Macs, Storage, Strategy

I have a LOT of really cool and unique projects either in the works or in the planning stages. I can’t believe I get to have this much fun at work! I had a nice chat with John in our media area yesterday about how we can improve our storage, archiving, and workflow in video world.

We produce a LOT of videos. Most of the raw footage these days gets shot directly to hard disk, and archiving and managing all of that digital footage is becoming a big problem. It’s on local disks in edit stations, on removable hard drives, on volumes on our Equallogic SAN - it’s everywhere - and it’s all full or quickly filling up. Then there are the management and workflow issues. How do we find a specific clip or project? How do we allow multiple people to work on the same project simultaneously?

We’ve pretty well decided Final Cut Server is the solution to the content and workflow management portion of the project. It will allow us to group and organize clips with thumbnails and previews, drag and drop directly into Final Cut, share and collaborate on projects, and even allow Windows machines to view the catalog and watch clips.

Now for the fun part - storing all of that data. How much data are we ultimately talking about? 1TB? 10TB? 100TB? I really don’t know the exact answer to that, but I can tell you this: It’s certainly way more than 1TB and probably way more than 10TB.

The obvious answer is Apple’s XSAN. I’ve definitely explored this, and have implemented and used XSAN in the past. It’s a nice product, but I’m not sure it’s the best solution for our needs. With the Fibre Channel switches, associated cabling, and metadata controllers, the initial implementation cost is high, and, let’s face it: Fibre Channel, although it probably has a few years left, is a dying technology.

Here’s what I believe I’ve settled on:

Studio Network Solutions has a product called SANmp that allows multiple machines, across platforms, to access iSCSI volumes at the block level. With direct block level iSCSI to each edit station and appropriate network infrastructure in place (Catalyst 6500 series at the core and probably an HP 2810 series at the edge), I should be able to achieve transfer speeds approaching that of Fibre Channel for a fraction of the cost.

Promise has a line of iSCSI SATA arrays that seems like the ultimate solution for our scenario. Their 16 bay unit, loaded with 1TB disks, will give us 16TB of raw storage for a very reasonable price.

The networking side will require pulling a few additional gigE drops and replacing one switch, but most of the network infrastructure is already in place.

For the media asset management side of things, Final Cut Server will run on top of the above infrastructure on an Apple Xserve.

I’m curious if anyone else out there has implemented a similar solution. If so, I’d love to hear from you.

post Cool New Equallogic Exchange Snapshots

April 29th, 2008 @ 7:15 am

Filed under: Storage

Jason Powell beat me to it.  Check out his post here. And the official Dell/Equallogic press release here.  I’m looking forward to seeing what this can do.  We already use application aware SQL snapshots on our SAN, and it’s great to be able to recover a single database from a volume snapshot.

I’m curious what level of granularity this new tool will allow.  I question how useful it would be to restore an entire datastore (which I think is what this tool does), since I already have methods to do that.  I guess even when restoring at the datastore level, it would be a plus to have it integrated with the SAN.
