
post DR Test - Kind Of

May 29th, 2008 @ 11:59 am

Filed under: Applications

It’s always a bit scary when someone asks you to recover really important data from a week or two ago.  Did the backup run that day successfully?  Did it copy to tape successfully?  (We do disk to disk to tape backup)  Will the restore work?

A little over a week ago, our membership people found about 13,000 “Unassociated” records in the Shelby database.  Under Shelby’s guidance, I did a database backup and we deleted the orphaned records.  Shelby assured us it wouldn’t affect any “good” records.

Well, here we are a week later and apparently a few people are missing, so I’m uploading the current backup and the backup before the pruning so that they can figure out how to restore the deleted people.  Unreliable software is really annoying and Shelby is moving higher and higher up my “Unreliable List.”

post Successful SAN and VMware Upgrades

May 28th, 2008 @ 12:04 am

While everyone was away for the holiday Monday, I took the opportunity to upgrade our SAN and ESX servers.  Everything went surprisingly well.

What was really impressive is how fast the Equallogic SAN reboots.  The firmware upgrade was the first reboot since it was installed.  They claimed you could reboot it “live” without causing any problems with the servers, but I had never tested that theory until now.  I was sending it a ping every second during the entire process.  I dropped a total of 12 pings during the reboot and the servers never knew the storage had just rebooted.  Pretty impressive!  Check this out (I did it from home, hence the 12-15ms latency):

I also migrated all of our ESX servers from version 3.0.2 to 3.5.  For some reason, the HA agent had to be reconfigured on a couple of them, and the ESX firewall decided to block outbound iSCSI traffic on every box after the upgrade.  Other than that, the ESX upgrades went great!

Our first diskless ESX server is now online also.  The QLogic HBA initially wouldn’t connect to our SAN using jumbo frames.  QLogic’s response was to send me their “Beta” or “Limited Release” firmware, which scares me a little.  I have several production VMs running on that host with no issues though.  I hope to do some benchmarks on VMware Server vs ESX with software iSCSI vs ESX with hardware iSCSI.  Stay tuned for details on that!

I love it when a project goes as planned!

post Networking for iSCSI

May 25th, 2008 @ 3:34 am

Filed under: Networking, Servers, Storage

I’ve received several comments and questions on my post from a few days ago, “iSCSI Slow? I Think Not.”  The network hardware is critical for peak iSCSI performance.  I think a brief follow-up with some details on our network configuration is in order.

We are using a Cisco Catalyst 6506 switch at the core of our network, which handles all of our iSCSI traffic.  The current configuration looks like this:

  • (1)  WS-X6K-SUP2-MSFC2 with PFC2 supervisor module
  • (2)  WS-X6148A-GE-TX gigabit modules (connect all servers and iSCSI devices)
  • (1)  WX-X6414-GBIC fiber module (backbone to all of our IDFs)

All SAN ports are configured for Jumbo Frames and Flow Control.

The servers are HP DL360 G5s with NC360T NICs.  I just deployed a new ESX server with a QLogic iSCSI HBA, but I don’t have any benchmarks on that yet.  I’ll post details once I run some.  I’m interested in whether there will be a big performance increase over the ESX software iSCSI initiator.

post Got My New Xserve

May 23rd, 2008 @ 2:00 pm

Filed under: Macs, Servers

Our new Xserve arrived yesterday.  I got all the initial configuration done and got it racked.  Apple definitely makes some “Pretty” servers.

Over the next few weeks, I’ll be getting Open Directory and Update Services configured and rolled out to all of our Mac workstations.  At some point, we’ll also be installing Final Cut Server.  I’ll be posting updates as we get all of this configured.  In the meantime, here are a few pics:

post iSCSI Slow? I Think Not

May 22nd, 2008 @ 1:36 am

Filed under: Networking, Storage

People love to talk bad about iSCSI, especially “Those Other SAN Vendors” (i.e., The Fibre Channel People).  I’ve had a couple of vendors tell me iSCSI is not an enterprise solution and I’d never see over 250Mbps of throughput.  I love proving them wrong.

Check out the images below.  Note that the two transfers below were happening SIMULTANEOUSLY to a single Equallogic PS300 SAN.  That’s a combined throughput of 1.06Gbps! iSCSI rocks!  The key is the network really.  High-end switches with big port buffers, jumbo frames, and flow control are a must.

post Church Management System Discussion

May 21st, 2008 @ 11:19 am

Filed under: Applications, Strategy

Yesterday, we had the opportunity to meet with Jill, our new communications director, about how we manage our membership data.  How do we communicate with our members?  Where does the data come from?  What are the problem areas?

We were able to identify at least 8 different types of “Databases” in use other than our Church Management System (Shelby).  Yikes!  The next steps are to identify why we are using so many disconnected databases and develop a solution that will meet the needs of the church long term.  It’s going to be a lot of work, but should be fun.

post iSCSI for Video Editing/Archiving

May 20th, 2008 @ 7:42 am

Filed under: Macs, Storage, Strategy

I have a LOT of really cool and unique projects either in the works or in the planning stages. I can’t believe I get to have this much fun at work! I had a nice chat with John in our media area yesterday about how we can improve our storage, archiving, and workflow in video world.

We produce a LOT of videos. Most of the raw footage these days gets shot directly to hard disk, and archiving and managing all of that digital footage is becoming a big problem. It’s on local disks in edit stations, on removable hard drives, on volumes on our Equallogic SAN - it’s everywhere - and it’s all full or quickly filling up. Then there are the management and workflow issues. How do we find a specific clip or project? How do we allow multiple people to work on the same project simultaneously?

We’ve pretty well decided Final Cut Server is the solution to the content and workflow management portion of the project. It will allow us to group and organize clips with thumbnails and previews, drag and drop directly into Final Cut, share and collaborate on projects, and even allow Windows machines to view the catalog and watch clips.

Now for the fun part - storing all of that data. How much data are we ultimately talking about? 1TB? 10TB? 100TB? I really don’t know the exact answer to that, but I can tell you this: It’s certainly way more than 1TB and probably way more than 10TB.

The obvious answer is Apple’s XSAN. I’ve definitely explored this, and have implemented and used XSAN in the past. It’s a nice product, but I’m not sure it’s the best solution for our needs. With the Fibre Channel switches, associated cabling, and metadata controllers, the initial implementation cost is high, and, let’s face it: Fibre Channel, although it probably has a few years left, is a dying technology.

Here’s what I believe I’ve settled on:

Studio Network Solutions has a product called SANmp that allows multiple machines, across platforms, to access iSCSI volumes at the block level. With direct block level iSCSI to each edit station and appropriate network infrastructure in place (Catalyst 6500 series at the core and probably an HP 2810 series at the edge), I should be able to achieve transfer speeds approaching that of Fibre Channel for a fraction of the cost.

Promise has a line of iSCSI SATA arrays that seems like the ultimate solution for our scenario. Their 16 bay unit, loaded with 1TB disks, will give us 16TB of raw storage for a very reasonable price.

The networking side will require pulling a few additional gigE drops and replacing one switch, but most of the network infrastructure is already in place.

For the media asset management side of things, Final Cut Server will run on top of the above infrastructure on an Apple Xserve.

I’m curious if anyone else out there has implemented a similar solution. If so, I’d love to hear from you.

post Wells Fargo Blocks Firefox?

May 18th, 2008 @ 10:30 pm

Filed under: Finances, General

I was trying to log into Wells Fargo to check the progress on my House Pay-Off Spectacular and was greeted with this:

Now seriously, what kind of customer service is that?  Who still uses Netscape?  To make matters worse, all of their download links just redirect you back to the same page.

Looks like I have yet another reason to put my Pay-Off Spectacular into high gear so I can fire Wells Fargo.

post Runaway Clock in Virtual Linux Servers

May 18th, 2008 @ 2:45 pm

Filed under: Servers, Virtualization

If you run any Linux guests under VMware, you’ve probably had issues with the clock in the VM drifting or just totally running away.

The Linux clock works by counting timer interrupts. In older kernels, this was usually done at a rate of 100Hz, or 100 times per second. Beginning with the 2.6 kernel, the interrupt timer is now set at 1000Hz, so interrupts are counted 10 times as often.

Because VMware divides the host up into “time slots” for each guest OS, interrupts are often missed in the guest machines, depending on the system load. The more often the guest kernel counts interrupts, the more apparent these “missed” interrupts become, and the result is clock skew in the guest machine. VMware Tools has the ability to sync the guest clock with the host, but this only occurs once per minute, and it can only advance the clock; it can’t slow it down. Generally, the VMware Tools clock sync alone is not enough.
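To see why a faster tick rate makes the problem worse, here is a minimal Python sketch of a tick-counting clock. The pause model is an assumption made purely for illustration (the function name, `gap_ms`, and `gaps_per_second` are made up, and this is not how VMware's scheduler actually works): the guest is paused a fixed number of times per second, and all ticks that fire during one pause collapse into a single pending interrupt. It only models a clock that falls behind, not one that runs away.

```python
def guest_clock_seconds(hz, real_seconds, gap_ms, gaps_per_second):
    """Seconds reported by a guest that keeps time by counting timer ticks.

    Simplified, assumed model: the guest is paused `gaps_per_second` times
    each second for `gap_ms` milliseconds. Ticks that fire during a pause
    collapse into one pending interrupt, so all but one of them are lost.
    """
    ticks_per_gap = gap_ms * hz // 1000        # ticks that fire during one pause
    lost_per_gap = max(ticks_per_gap - 1, 0)   # one pending tick is still delivered
    total_ticks = hz * real_seconds
    lost_ticks = lost_per_gap * gaps_per_second * real_seconds
    return (total_ticks - lost_ticks) / hz


# One real hour, with the guest paused 20 times per second for 5 ms each time.
print(guest_clock_seconds(100, 3600, 5, 20))   # 3600.0 (no ticks lost at 100Hz)
print(guest_clock_seconds(1000, 3600, 5, 20))  # 3312.0 (288 seconds behind at 1000Hz)
```

Under these assumed numbers, the 100Hz guest never has more than one tick outstanding during a pause, while the 1000Hz guest loses four ticks per pause, which is the intuition behind lowering the guest's interrupt frequency.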

Here are the steps needed to keep the clock skew under control (these apply to VMware Server running on a Linux host - in my case, CentOS). The guest OS changes also apply to ESX:

  • VMware Server needs to be told what clock speed the CPU(s) run at. This can be found by running “cat /proc/cpuinfo”, which will return all kinds of information about the CPUs, including the clock speed. You’ll need to edit /etc/vmware/config and add the following lines, where host.cpukHz is the host CPU speed in kHz (2.8GHz in my example below):

    host.cpukHz = 2800000
    host.noTSC = TRUE
    ptsc.noTSC = TRUE

  • VMware Tools needs to be installed in the guest OS. VMware provides instructions on how to install VMware Tools in a Linux guest here.
  • VMware Tools time synchronization needs to be enabled. This is done by editing the VMX file in the virtual machine directory and adding the following line:

    tools.syncTime = "TRUE"

    Note that the host should use NTP to sync to an outside time source, while NTP should be disabled in each guest.

  • Now, we need to lower the interrupt frequency in the guest kernel. Generally, this will require installing the kernel source, modifying the CONFIG_HZ parameter to a rate of 100Hz, and then recompiling the kernel. CentOS has made this easy for us by releasing a “VM Optimized” kernel for CentOS 5. Although perfectly stable, this kernel is presently in the “Testing” repository. Here’s how to install the VM Kernel using yum in a CentOS 5 system:

    Add the “Testing” repo as follows:

    cd /etc/yum.repos.d
    wget http://dev.centos.org/centos/5/CentOS-Testing.repo

    Now, install the VM Optimized kernel:

    yum --enablerepo=c5-testing install kernel-vm kernel-vm-devel

  • Now, we need to make sure Grub is set to boot the new kernel, and also add the “clock=pit” parameter to the kernel boot options. We do that by editing /etc/grub.conf and making the following changes:

    default=0

    Where “0” is the first kernel listed. If the VM Kernel is not the first item, you’ll need to adjust the value accordingly. For example, if it’s second in the list, you’d use “default=1”.

    Now, add the clock=pit parameter to the kernel boot options. That section of the grub.conf file will look something like this:

    title CentOS (2.6.18-53.1.19.el5)
    root (hd0,0)
    kernel /vmlinuz-2.6.18-53.1.19.el5 ro root=LABEL=/ clock=pit
    initrd /initrd-2.6.18-53.1.19.el5.img
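If it isn't obvious which number to use for “default”, the counting can be sketched in a few lines of Python. This is only an illustration: the function name and the sample grub.conf text (including the older kernel entry) are made up, and it assumes each boot entry starts with a `title` line, counted from zero in file order.

```python
def find_default_index(grub_conf_text, kernel_marker):
    """Return the zero-based position of the first `title` entry whose
    line contains kernel_marker, or None if no entry matches."""
    index = 0
    for line in grub_conf_text.splitlines():
        words = line.split()
        if words and words[0] == "title":
            if kernel_marker in line:
                return index
            index += 1
    return None


# Hypothetical grub.conf contents, for illustration only.
sample = """default=0
timeout=5
title CentOS (2.6.18-53.1.14.el5)
    root (hd0,0)
title CentOS (2.6.18-53.1.19.el5)
    root (hd0,0)
"""

print(find_default_index(sample, "2.6.18-53.1.19.el5"))  # 1
```

In this made-up file the wanted kernel is the second entry, so the value to use would be “default=1”.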

Once all of the above changes are made, reboot the guest, and you should see significantly better clock performance. I had some VMs where the time would drift by hours, and after making these changes, they stay within a few seconds.
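For the first step above, the host.cpukHz value can be derived from the “cpu MHz” line in /proc/cpuinfo by multiplying by 1000. Here is a rough Python sketch of that conversion. The function names and sample text are hypothetical, and it assumes the reported “cpu MHz” reflects the CPU's rated speed, which may not hold if frequency scaling is active on the host.

```python
def cpu_khz_from_cpuinfo(cpuinfo_text):
    """Return the first 'cpu MHz' value in the text converted to kHz, or None."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "cpu MHz":
            return round(float(value) * 1000)
    return None


def vmware_config_lines(khz):
    """Build the three lines from the first step for /etc/vmware/config."""
    return [
        f"host.cpukHz = {khz}",
        "host.noTSC = TRUE",
        "ptsc.noTSC = TRUE",
    ]


# Hypothetical excerpt of /proc/cpuinfo for a 2.8GHz CPU.
sample_cpuinfo = "processor\t: 0\nmodel name\t: Example CPU\ncpu MHz\t\t: 2800.000\n"

khz = cpu_khz_from_cpuinfo(sample_cpuinfo)
print("\n".join(vmware_config_lines(khz)))
```

On a real host, the text would come from reading /proc/cpuinfo rather than from the sample string, and the printed lines would still need to be added to /etc/vmware/config by hand.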

post ActiveSync + ISA Server

May 16th, 2008 @ 5:53 pm

Filed under: Email, Security, Servers

We worked for a while yesterday to get Bob’s Windows Mobile phone to sync with Exchange (Bob just joined our IT team - welcome Bob!), without much luck.  Bob is our first user with Windows Mobile.  Everyone else uses Blackberry devices.

We use an ISA 2006 server in the DMZ with RADIUS authentication as a front-end server to Exchange.  I initially added the Microsoft-Server-ActiveSync virtual directory to the list of paths in the existing ISA rule.  We got errors about not having the correct privileges to do ActiveSync, which we obviously did have.  After messing with this for a little while, I realized I needed to create a separate rule for the ActiveSync path and place it above my OWA redirect rule.  I have a rule that allows the user to type in just http://webmail.jfbc.org and get automatically redirected to https://webmail.jfbc.org/owa.  It seems that this rule was also redirecting the ActiveSync directory.  Here’s what the “Correct” setup looks like in ISA server:

Apparently, that wasn’t the only issue.  Next problem: It kept complaining about an incorrect username or password.  Obviously, the username and password were correct.  Some monitoring in ISA server revealed the authentication didn’t seem to be happening.  All of the requests were marked as “anonymous.”

You won’t believe how simple this was.  On the handheld, there are 3 boxes: username, password, and domain.  We run split DNS, with JFBC.ORG as the internal domain name, so that’s what we entered.  Turns out that ISA server wants the NetBIOS name instead, which is simply JFBC.  It’s amazing how something so simple can create such a big issue.
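The rule-ordering problem from earlier in this post is easier to see with a toy model. The Python sketch below assumes rules are checked top to bottom and the first match wins; the rule names and path prefixes are made up for illustration and are a simplification, not ISA Server's actual rule engine or configuration.

```python
def first_matching_rule(rules, path):
    """Return the name of the first rule whose path prefix matches, or None."""
    for name, prefix in rules:
        if path.lower().startswith(prefix.lower()):
            return name
    return None


activesync_path = "/Microsoft-Server-ActiveSync"

# Hypothetical rule lists: a catch-all redirect and a specific ActiveSync rule.
redirect_first = [
    ("OWA redirect", "/"),
    ("ActiveSync", "/Microsoft-Server-ActiveSync"),
]
activesync_first = [
    ("ActiveSync", "/Microsoft-Server-ActiveSync"),
    ("OWA redirect", "/"),
]

print(first_matching_rule(redirect_first, activesync_path))    # OWA redirect
print(first_matching_rule(activesync_first, activesync_path))  # ActiveSync
```

With the catch-all rule on top, the ActiveSync request never reaches its own rule, which matches the symptom described above.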
