
Application Availability: Migrate Physical Linux Server to VMware (P2V)

As part of my work helping companies better protect their data and applications, we also look at the availability of those applications.  One great way to improve application availability is to move the application to a virtualization platform, such as VMware Virtual Infrastructure.

VMware, like other virtualization platforms, provides a great deal of flexibility to drastically reduce downtime for your critical applications.  This is accomplished by abstracting the application away from a specific piece of physical hardware and placing this abstracted “block” (a Virtual Machine) onto a “private cloud” of hardware.  In this configuration, if a piece of hardware fails or needs to be taken offline for maintenance, the application’s Virtual Machine (VM) can simply be moved to another physical server.  If you have the means to deploy VMware’s enterprise features, this can happen automatically when hardware problems occur, without any human intervention, even for legacy applications that are not “cluster aware.”

That’s great, so how do we move our physical servers over to VMware?  Well, for Windows there are built-in plugins and standalone applications to do this (search for VMware Converter/VMware Converter Standalone).  It appears they’ve recently added some Linux support, but I’ve constructed my own method, which has one large advantage: being able to “pre-seed” large data sets before switch-over.  Put another way, if you have a lot of data on the server you wish to move, this method allows you to copy that data slowly over days or weeks while the server is still online and available.  Then, once the data is fully replicated, you can quickly cut over to the new server, usually in less than an hour.

This is a heavy task and should not be taken lightly.  Make sure your Linux skills are sharp and that you have good backups in place should something go wrong.  You’ve been warned!

Let’s walk through the process.  At a high level, here is what we need to do:

  1. Create the target VM on VMware
  2. Seed data
  3. Configure boot code — this can be tricky!
  4. Cutover

Let’s take a look at each of these steps in a little more detail.

Create Target VM

  1. Create the VM in VMware and boot it from a rescue CD provided by your distribution of choice (I prefer System Rescue CD)
  2. Create the needed partitions, file systems, and LVM volumes to match the source system.  Note: the new file systems need not be the same size as the source; they just need enough space for the actual data to be moved.

 

Start off by inspecting the existing configuration on the source server.  Look at the output of mount, fdisk -l, pvs, vgs, lvs, and the content of /etc/fstab.
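
For reference, here is the set of inspection commands I run on the source server (df -h is an extra I like, since the new volumes only need to be big enough for the data actually in use):

mount                # what is mounted where
cat /etc/fstab       # persistent mount configuration
sudo fdisk -l        # physical partition tables
sudo pvs             # LVM physical volumes
sudo vgs             # LVM volume groups
sudo lvs             # LVM logical volumes
df -h                # how much space each filesystem actually uses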

For example, take this fstab from a server I recently converted (Ubuntu 8.04):

# /etc/fstab: static file system information.
#
# <file system>    <mount point>   <type>       <options>                    <dump>  <pass>
proc               /proc           proc         defaults                     0       0
/dev/system/root   /               ext3         relatime,errors=remount-ro   0       1
/dev/sda1          /boot           ext3         relatime                     0       2
/dev/system/home   /home           ext3         relatime                     0       2
/dev/system/tmp    /tmp            ext3         relatime                     0       2
/dev/system/usr    /usr            ext3         relatime                     0       2
/dev/system/var    /var            ext3         relatime                     0       2
/dev/system/opt    /opt            ext3         relatime                     0       2
/dev/system/swap   none            swap         sw                           0       0
/dev/scd0          /media/cdrom0   udf,iso9660  user,noauto,exec,utf8        0       0

On the new VM, I need to mirror these filesystems.  I could choose to consolidate them into one large root volume, but I’ll just stick with what is there.  If you choose to change the layout, make sure to save your new version of /etc/fstab on the new server and protect it with an --exclude=/etc/fstab on your rsync command for copying data.  To replicate the current setup, I:

First, set up the needed physical partitions using fdisk, creating a boot partition (/dev/sda1) and an LVM physical volume (/dev/sda2):

sudo fdisk /dev/sda
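
The exact keystrokes inside fdisk are interactive, but the rough sequence I follow is below (the 200MB boot partition size is just an example, not a value taken from the original server):

# n  -> new primary partition 1, about 200MB, for /boot (/dev/sda1)
# n  -> new primary partition 2, the rest of the disk, for LVM (/dev/sda2)
# t  -> change partition 2's type to 8e (Linux LVM)
# a  -> toggle the bootable flag on partition 1
# w  -> write the partition table and exit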

Next I create the needed LVM physical volume, group, and logical volumes:

pvcreate /dev/sda2
vgcreate system /dev/sda2
lvcreate -n root -L500M /dev/system
lvcreate -n home -L2G /dev/system
lvcreate -n tmp -L2G /dev/system
lvcreate -n usr -L4G /dev/system
lvcreate -n var -L4G /dev/system
lvcreate -n opt -L4G /dev/system
lvcreate -n swap -L4G /dev/system

Next create the filesystems and swap partition:

mkswap /dev/system/swap
mkfs.ext3 /dev/sda1
for fs in root home tmp usr var opt
do
   mkfs.ext3 /dev/system/${fs}
   echo DONE creating ${fs}
done

Now that the filesystem structure exists, we can create mount points and mount them up:

mkdir /mnt/new-system
# mount the root volume first!
mount /dev/system/root /mnt/new-system
cd /mnt/new-system
for fs in home tmp usr var opt
do
   mkdir /mnt/new-system/${fs}
   mount /dev/system/${fs} /mnt/new-system/${fs}
done 

Seed Data

Now that we have the new system-root in place at /mnt/new-system, we can do our initial copy using rsync over ssh:

rsync --progress --exclude=/proc --exclude=/sys --exclude=/etc/fstab --bwlimit=4096 -ave ssh root@old-server:/ /mnt/new-system 

The above command replicates everything from the root of the old server to /mnt/new-system on our newly created VM.  You may add additional --exclude statements to skip files that are no longer needed.  Adjust the --bwlimit argument to the number of KB per second you wish to copy; in this case we limit the transfer to roughly 4MB/s so as not to interfere with production operations.
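
Because rsync only transfers changes, you can simply re-run the same command every so often to keep the seed copy fresh until cutover day.  A rough sketch, run from the rescue environment on the new VM (the once-a-day interval is just an example):

# repeat the seed copy once a day until cutover
while true
do
   rsync --progress --exclude=/proc --exclude=/sys --exclude=/etc/fstab --bwlimit=4096 -ave ssh root@old-server:/ /mnt/new-system
   sleep 86400   # wait 24 hours before the next pass
done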

Configure the Bootloader

Great, now we have a new VM with all file systems copied over.  The next step is the trickiest part to get right: installing the GRUB (or another) bootloader.  The bootloader is the program that starts your computer right after the BIOS, and it needs to know just enough about our VM to load the kernel and get going.

Below is what has worked for me, but in my experience every conversion I’ve done has required some last minute changes to make things work.

On Ubuntu (tested most recently on 8.04), load the mptscsih module in your recovery environment so that the rescue system can see your new VMware SCSI disks:

modprobe mptscsih

Next change the current root to be our new /mnt/new-system directory and build a new initial ram disk with the proper drivers for our new system:

chroot /mnt/new-system /bin/bash
# Use your current kernel version here (find it with uname -a on the source system)
update-initramfs -u -k '2.6.24-16-server'

Hit CTRL+D to exit the chroot environment.  Next, we install GRUB.

grub-install --root-directory=/mnt/new-system /dev/sda

Now inspect /mnt/new-system/boot/grub/menu.lst (or wherever your GRUB config file is; it could be in boot/grub/grub.conf or another file) and ensure that the root device and kernel paths are correct.  You can infer the correct device names from boot/grub/device.map.  If the wrong device is used, update menu.lst or grub.conf with the correct entry (such as (hd0,0)) and re-run grub-install.
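
For reference, here is roughly what those files might look like for a system laid out like the example above (your kernel version, devices, and volume names will differ, so treat this only as a sketch):

# boot/grub/device.map
(hd0)   /dev/sda

# A typical menu.lst entry; paths are relative to the separate /boot partition
title   Ubuntu 8.04, kernel 2.6.24-16-server
root    (hd0,0)
kernel  /vmlinuz-2.6.24-16-server root=/dev/mapper/system-root ro
initrd  /initrd.img-2.6.24-16-server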

With RedHat or CentOS (I’ve done this with RHEL 4/5 and CentOS 4/5) the steps are essentially the same; however, I’ve found it easiest to rebuild the initial ram disk on the source server with these options:

sudo mkinitrd -v -f --with=mptscsih --with=mptspi /boot/initrd-`uname -r`-VM.img `uname -r`

Then copy this initrd image ending in “-VM” to the new VM in place of the original initrd, and install GRUB as above.
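
For example, the copy could look something like this (the new VM’s hostname is a placeholder; the backticks expand to the source server’s running kernel version):

# run on the source server; adjust the initrd name to match your kernel
scp /boot/initrd-`uname -r`-VM.img root@new-vm:/mnt/new-system/boot/initrd-`uname -r`.img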

The next step is to reboot and test that your boot code works.  Disconnect the VM’s virtual network adapters to avoid conflicts with the existing server, and reboot.  Watch the boot for any errors you may need to correct; at this point I often discover one of the following issues:

  1. I messed up the mptscsih module and the root volume can’t be found, resulting in a kernel panic
  2. There was a typo in /etc/fstab or a volume was missed
  3. Some network dependent services fail or take forever to start — this is normal as the network is down for this VM.

If you get to a login screen, congratulations, the hard part is over!  If not, fix the noted issues and try again.

Cutover

Now that your new VM is booting, it’s time to schedule a cutover date and time.  When that time arrives:

  1. Power down the current production system and boot it from the Rescue CD
  2. Mount its existing filesystems under /mnt, much like we did for the new VM (see the sketch after this list)
  3. Boot the VM from the Rescue CD and re-mount its partitions under /mnt/new-system
  4. Initiate the final data copy, grabbing files changed since the seed operation (and previously open/inconsistent files such as MySQL databases)
  5. Reboot, cross your fingers, and test thoroughly
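
For step 2, a minimal sketch of mounting the old server’s volumes from its rescue environment might look like the following (this assumes the same “system” volume group as the earlier example; adjust to your layout):

# run on the old server after booting the rescue CD
vgchange -ay system            # activate the LVM volume group
mount /dev/system/root /mnt
for fs in home tmp usr var opt
do
   mount /dev/system/${fs} /mnt/${fs}
done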

For step 4 I run the rsync command below on the new VM (don’t use --xattrs if your system doesn’t have SELinux capabilities, but you must use it if SELinux is in use):

rsync --delete --exclude=/boot --exclude=/proc --exclude=/etc/fstab --exclude=/sys --xattrs -ave ssh root@old-server:/mnt/ /mnt/new-system

The --delete option removes files from the destination (in our case, the VM) that no longer exist on the physical server.  If you are afraid of messing up this command, you can add the --dry-run argument to see which files will be deleted.
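
For example, a dry run of the final copy lists what would be transferred or deleted without touching anything:

rsync --dry-run --delete --exclude=/boot --exclude=/proc --exclude=/etc/fstab --exclude=/sys --xattrs -ave ssh root@old-server:/mnt/ /mnt/new-system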

It is very important that the final rsync be run from the new VM and that the order of hosts is SOURCE then DESTINATION (the destination is the new VM in this case); if you mess this up, you could easily destroy your installation.

After this rsync finishes, power off the old physical server and reboot the VM with the virtual network adapters connected.

Hopefully everything comes up and runs fine at this point; you can test by accessing the services installed on this server.  A lot can still go wrong here, however.  Here are some things I’ve encountered and how to deal with them.

On Ubuntu, the Ethernet device name will change when moving to VMware, for example from eth0 to eth1.  This can confuse installed applications configured to use a specific adapter, and static IPs used on the old server will no longer be applied.

This can be corrected by editing /etc/udev/rules.d/70-persistent-net.rules: comment out the old rule for eth0 and change eth1 in the new line to eth0.  Here is my file after the change:

# PCI device 0x8086:0x100e (e1000)
#SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:11:11:19:3e:df", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

# PCI device 0x1022:0x2000 (pcnet32)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:50:56:9e:00:06", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

I’ve also seen the loopback adapter (lo, 127.0.0.1), which many applications rely on, fail after a move on Ubuntu.  I fixed this by adding back the line below in /etc/hosts (don’t ask me how it disappeared):

127.0.0.1       localhost

That pretty much sums it up.  If I were you, I’d keep the previous server around for a week or two; if some service is discovered that did not come back properly, you’ll have the option to review the old server.  I’d also recommend pulling the network and power cables from the old server at this point, just to make sure someone doesn’t accidentally power it back on.

I hope you’ve found this article helpful.  Feel free to leave me feedback on typos or other snags you’ve run into so we can make this article better.
