rulururu

post Runaway Clock in Virtual Linux Servers

May 18th, 2008 @ 2:45 pm

Filed under: Servers, Virtualization

If you run any Linux guests under VMware, you’ve probably had issues with the clock in the VM drifting or just totally running away.

The Linux clock works by counting timer interrupts. In older kernels, this was usually done at a rate of 100Hz, or 100 times per second. Beginning with the 2.6 kernel, the interrupt timer is now set at 1000Hz, so interrupts are counted 10 times as often.

Due to the fact that VMware divides the host up into “time slots” for each guest OS, and depending on the system load, interrupts are often missed in the guest machines. The more often the guest kernel counts interrupts, the more apparent these “missed” interrupts become and the result clock skew in the gust machine. VMware Tools has the ability to sync the guest clock with the host, but this only occurs once per minute, and can only advance the clock, it can’t slow it down. Generally, the VMware Tools clock sync alone is not enough.

Here’s the steps that are needed in order to keep the clock skew under control (these apply to VMware Server running on a Linux host - in my case, CentOS). The guest OS changes will also apply to ESX.:

  • VMware server needs to be told what clock speed the CPU(s) run at. This can be found by running “cat /proc/cpuinfo”, which will return all kinds of information about the CPU’s, including the clockspeed. You’ll need to edit /etc/vmware/config and add the following lines (where host.cpukHz is the host CPU speek in KHz (2.8GHz in my example below)

    host.cpukHz = 2800000
    host.noTSC = TRUE
    ptsc.noTSC = TRUE

  • VMware Tools needs to be installed in the guest OS. VMware provides instructions on how to install VMware Tools in a Linux guest here.
  • VMware Tools time synchronization needs to be enabled. This is done by editing the VMX file in the virtual machine directory and adding the following line:

    tools.syncTime = “TRUE”

    Note that the host should use NTP to sync to an outside time source, while NTP should be disabled in each guest

  • Now, we need to lower the interrupt frequency in the guest kernel. Generally, this will require installing the kernel source, modifying the CONFIG_HZ parameter to a rate of 100Hz, and then recompiling the kernel. CentOS has made this easy for us by releasing a “VM Optimized” kernel for CentOS 5. Although perfectly stable, this kernel is presently in the “Testing” repository. Here’s how to install the VM Kernel using yum in a CentOS 5 system:Add the “Testing” repo as follows:

    cd /etc/yum.repos.d
    wget http://dev.centos.org/centos/5/CentOS-Testing.repo

    Now, install the VM Optimized kernel:

    yum enablerepo=c5-testing install kernel-vm kernel-vm-devel

  • Now, we need to make sure Grub is set to boot the new kernel, and also add the “clock=pit” parameter to the kernel boot options. We do that by editing /etc/grub.conf and making the following changes:

    default=0

    Where “0″ is the first kernel listed. If the VM Kernel is not the first item, you’ll need to adjust the value accordingly. For example, if it’s second in the list, you’d use “default=1″Now, add the clock=pit parameter to the kernel boot options. That section of the grub.conf file will look something like this:

    title CentOS (2.6.18-53.1.19.el5) root (hd0,0)
    kernel /vmlinuz-2.6.18-53.1.19.el5 ro root=LABEL=/ clock=pit
    initrd /initrd-2.6.18-53.1.19.el5.img

Once all of the above changes are made, reboot the guest, and you should see significantly better clock performance. I had some VM’s where the time would drift by hours, and after making these changes, they stay within a few seconds.

No Comments »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a comment

ruldrurd