
post Bakbone NetVault?

August 6th, 2008 @ 11:51 am

Filed under: Backup/DR, Planning, Strategy

I had a conversation yesterday with Bakbone about their NetVault product.  As we’ve moved heavily into virtualization (90% of our infrastructure is virtualized at this point), backup and DR have become a growing challenge.  Ideally, we need to be able to back up entire virtual machines directly from the SAN, with the ability to restore an entire VM, or individual files within a VM.  In addition, properly protecting Active Directory, SQL Server, and Exchange are high priorities.  The ability to do message level restore in Exchange is also somewhat important.

Our aging Backup Exec installation seems to become more and more cumbersome and problematic, and seems to have the common problem of one product trying to do way too much and not doing any one thing exceptionally well. I think it’s time to move into a more enterprise-class product - something more closely tuned to our needs.  NetVault initially seems like a potentially good fit.  If anyone has any experiences with NetVault or has any other recommendations, I’d love to hear from you.

post Thoughts on Sonicwall Roadshow and E-Class Firewalls

June 3rd, 2008 @ 7:35 am

Last week, I attended the Sonicwall Roadshow in Atlanta for a look at their E-Class firewalls and other products. I’m looking to implement a new firewall solution in the very near future and have pretty much narrowed it down to the Sonicwall E-Class boxes or Cisco. Here’s a few random thoughts about the Sonicwall products:

  • The integration with Active Directory and ability to apply firewall rules and policies to users is really cool. As far as I’m aware, the Cisco ASA doesn’t do this. The example they showed was rate limiting YouTube to 30kbps for a group of users. It has a coolness factor to it, but, is it practical? Am I really going to rate limit YouTube for specific users? Most likely not.
  • I really like the application level inspection and filtering. Their example was searching for an embedded watermark in a confidential document and preventing it from leaving the network. Examples included looking for it in SMTP and FTP traffic as well as HTTP uploads. I could definitely see a use for this in some businesses. In our environment, it’s not really useful.
  • I’m still not a huge fan of their big confusing web interface - although I will say it has been improved. I’m sure some users prefer the GUI, but I’d much rather have an easy to use command line.
  • We recently phased out our Sonicwall wireless solution in favor of Xirrus WiFi arrays. The biggest problem we had with the Sonicwall access points is once you got several clients connected, clients would randomly get disconnected and RF strength would fluctuate. I observed the exact same thing happening on the presenter’s laptop at the roadshow. I saw the “Now Connected” dialog pop up 3 times during a 30 minute or so presentation - exact same problem we had. This is not really related to the firewall itself, but thought it was worth mentioning.
  • Their VPN client is nice, but there are better VPN clients on the market. We currently use Cisco VPN and I am very happy with it.
  • It’s expensive - way more than the Cisco, which is the opposite of what I expected.

The E-Class/NSA series boxes are a huge improvement over the previous generation firewalls. But, comparing their features to our needs, and looking at the cost/performance/features ratio, I’m just not convinced they’re right for our environment. Anyone have any further thoughts?

post Information Lifecycle/Storage/Backup Stuff

June 2nd, 2008 @ 7:49 pm

Last week, I met up with several other Church IT guys from the Atlanta area for a discussion on Information Lifecycle Management and backup with Veristor.  We raised a lot of questions and white boarded a pretty scary diagram of how data gets archived and backed up.

In the end, we determined that we need to identify a couple of key time frames:

  • RPO, or Recovery Point Objective:  How much data can we afford to lose?
  • RTO, or Recovery Time Objective:  How long can we wait to have our data back online?

This is going to take a lot of work from various departments, but I’ve got some initial thoughts.  First, what are our critical apps?  For us, they would be email - communication between our staff and members is critical.  Next would be our Accounting, Payroll, and Membership systems, which are all handled by the same app (Shelby).

So, how long can we be without them?  And what is reasonable given a limited budget?  As much as I’d like to say we can’t lose any data and we need to be back online 10 minutes after a disaster, that is simply not reasonable given our limited financial resources.  We probably could lose a day or so of data on the email and accounting systems and still survive.  Maybe a week on file shares and everything else.  A recovery time of 2 days for accounting and 1 week on everything else is probably reasonable.
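
To make those rough numbers easier to argue about, here is a minimal Python sketch that records them as targets and checks a proposed backup plan against them. The targets are my reading of the paragraph above (email is lumped in with "everything else" for recovery time, which may well get tightened), and the names `Objective`, `OBJECTIVES`, and `missed_targets` are made up for illustration. It assumes the worst case for data loss is one full backup interval.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Objective:
    rpo: timedelta  # Recovery Point Objective: how much data we can afford to lose
    rto: timedelta  # Recovery Time Objective: how long we can wait to be back online


# Rough targets from the discussion above; these are starting points, not decisions.
OBJECTIVES = {
    "accounting": Objective(rpo=timedelta(days=1), rto=timedelta(days=2)),
    # Assumption: email recovery time falls under "everything else" (1 week).
    "email": Objective(rpo=timedelta(days=1), rto=timedelta(weeks=1)),
    "file shares": Objective(rpo=timedelta(weeks=1), rto=timedelta(weeks=1)),
}


def missed_targets(system, backup_interval, estimated_restore):
    """Return which objectives ("RPO", "RTO") a proposed plan would miss.

    Worst-case data loss is taken to be one full backup interval, since a
    failure can happen just before the next backup runs.
    """
    target = OBJECTIVES[system]
    missed = []
    if backup_interval > target.rpo:
        missed.append("RPO")
    if estimated_restore > target.rto:
        missed.append("RTO")
    return missed


if __name__ == "__main__":
    # Hypothetical plan: nightly backups, one day to restore each system.
    for name in OBJECTIVES:
        print(name, missed_targets(name, timedelta(days=1), timedelta(days=1)))
```

The backup interval and restore estimate in the example run are placeholders; the real values would come out of the conversations with each department.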

I’ll be evaluating this further, as well as talking to other departments to develop some concrete objectives so that we can get a better DR plan in place.

post Remote Access Followup

May 3rd, 2008 @ 8:19 pm

Tony made an interesting comment on my Remote Access Post from a few days ago. He has a good point, and I think it’s worth visiting. Who do we give remote access to, and from what computers? Is it a good idea to allow them access from their personal computers? I have had the same thought, and that is the primary reason I’m not already doing it. Here’s a few thoughts:

Generally, only a user with a church supplied laptop would be given VPN rights. If the user has been granted the right to log in via VPN, though, can I really control what machine they do it from? I really can’t. All they need is the Cisco VPN client and a few configuration details, so, in theory, a tech-savvy user could access the VPN from any computer.

  • Is it any different from a rogue computer being physically plugged into the network? No, it’s really not. Now, random machines being attached to the network is definitely not something I promote or desire. But, it would take very extreme measures and a lot of expense to stop it. 802.1x is a possibility, but, beyond that, it would require some sort of centralized MAC Address based authentication. This exists from a few vendors, but isn’t cheap. Bottom line is, it’s not easy or cheap to keep rogue machines out completely.
  • Putting costs and implementation issues aside, what impact would it have on ministry to implement the above? Are there legitimate reasons for someone to attach a “rogue” machine to the network? In general, no, but there are some exceptions.
  • We live in an increasingly “connected” and mobile society. Ministry is no exception. Increasingly, being on the cutting edge of technology is a requirement of our ministry. It is absolutely critical that we enable our staff to perform their duties without being physically present in the office.

So, with the above in mind, I’ve placed a greater focus on keeping our internal defenses in line. Here’s a few actions I’ve taken or plan to take:

  • Windows firewall is enabled on all workstations via group policy and no programs are allowed to create exceptions. There are only a handful of ports allowed.
  • This is a no-brainer, but centrally managed anti-virus is in place on all internal machines.
  • SMTP is not allowed outbound from anywhere on the network, with the exception of our exchange servers. This limits the scope of damage should a machine with a mass-mailing worm show up on the network.
  • Access to network file shares is very carefully controlled. User accounts do not have access to anything they do not specifically need to access for their job function.

After putting a lot of thought into it, I’ve come to the conclusion that the benefits of allowing VPN access outweigh the potential risks. I would rather allow limited access via other methods if possible, which is why I’m exploring Terminal Services Web Access combined with RemoteApp. But, if for some reason the Terminal Services solution does not work out, I believe VPN is an acceptable fall-back.

post Startup/Shutdown Procedures

April 23rd, 2008 @ 11:31 am

Over the last year or so, we’ve been moving full force on cleanup and building an enterprise-class infrastructure.  There’s still a ways to go, but now it’s time to develop a solid IT strategy.  I’ll be making several updates along that journey, but one of the big things that’s come up is: what do we do when things don’t go right?  A big issue is: What do we do if there’s a major power outage and we have to shut everything down and bring it back up?

Now, last year, we had the opportunity to install a 15 kVA UPS that’s capable of running our entire server farm, network core, and phone system for about 2 hours.  We rarely have an outage that long, so we haven’t had to do a full shutdown/startup in quite a while.  During the last year or so, as we’ve implemented technologies such as SAN storage and virtualization, our infrastructure has gotten considerably more complex, with lots of systems depending on each other.

Here’s a short list of dependencies that come to mind:

  • Exchange, Blackberry Server, SQL Server, Virtual Center, and VPN all require Active Directory to be up.
  • Virtual Center obviously needs the ESX servers up to function.
  • Virtual Center and Blackberry Server use the SQL server.
  • Blackberry Server depends on Exchange being up.
  • The ESX cluster, Exchange, and SQL server all require SAN storage.
  • The SAN requires the core switches be functional before it comes online.

As you can see, the dependencies quickly get complex.  If you have to shut down everything, what order do you do it in?  How do you bring it back up?  I’ll be documenting all of this over the coming weeks and hopefully actually doing a test at some point.  More updates to come.
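
One way to start answering the ordering question is to treat the list above as a dependency graph and sort it. Here is a rough Python sketch using the standard library's `graphlib` module (Python 3.9 or later). The `DEPENDS_ON` table is only my transcription of the short list above, so it is certainly incomplete, and the function names are made up for illustration.

```python
from graphlib import TopologicalSorter

# Each system maps to the set of systems that must already be up before it starts.
# Transcribed from the short list above; a real inventory would have more entries.
DEPENDS_ON = {
    "SAN": {"Core switches"},
    "ESX cluster": {"SAN"},
    "SQL Server": {"Active Directory", "SAN"},
    "Exchange": {"Active Directory", "SAN"},
    "Virtual Center": {"Active Directory", "ESX cluster", "SQL Server"},
    "Blackberry Server": {"Active Directory", "SQL Server", "Exchange"},
    "VPN": {"Active Directory"},
}


def startup_order(depends_on):
    """Return one valid startup order, with every dependency before its dependents.

    Raises graphlib.CycleError if the dependencies contain a loop.
    """
    return list(TopologicalSorter(depends_on).static_order())


def shutdown_order(depends_on):
    """Shutdown is simply startup in reverse: dependents go down first."""
    return list(reversed(startup_order(depends_on)))


if __name__ == "__main__":
    print("Startup: ", " -> ".join(startup_order(DEPENDS_ON)))
    print("Shutdown:", " -> ".join(shutdown_order(DEPENDS_ON)))
```

There is usually more than one valid order, so this only gives a starting point for the written procedure. A side benefit is that a circular dependency shows up as a `CycleError` instead of being discovered in the middle of an outage.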

post Software Upgrades - The Never-ending Cycle

April 12th, 2008 @ 11:14 pm

Filed under: General, Planning

It seems like every application or OS upgrade creates at least one compatibility issue requiring another upgrade… Sometimes, it seems like it just never ends. What’s worse is it seems like no matter how much testing is done, there’s still some show-stopping issue that pops up in the middle of deployment. Anyone else feel the same way?

We began rolling out Windows Vista back in February. Of course, I spent quite a while testing various things, and even ran it on my machine for a few weeks before the first user got it. All of our important apps - mainly Shelby and EMS - seemed to work fine.

The first group was at a remote site, which accesses Shelby and EMS via terminal services. Everything worked great there. A couple of weeks later, we rolled out a few machines at our main site and the users began complaining about EMS not working. Turns out you could log in and navigate all the screens fine, but as soon as you tried to actually do much, there were lots of nasty errors. A quick look at the change logs online revealed we simply needed to upgrade from 10.0 to 10.1 - no big deal.

Fast forward to yesterday. It’s spring break, things are pretty slow around the office, so it’s a good time to take EMS down and do the upgrade. Everything went fine until we tried to run some custom reports and were greeted with a nasty error about DLLs being the wrong version. The reports in question are actually a custom DLL that the software vendor developed for us. A quick call to support revealed that they would need to re-compile the custom app for version 10.1, which will take 7 business days. So, we ended up having to revert back to 10.0 for now.

Isn’t new technology great? :-\
