It takes a lot of technology to power a 24×7 call center operation, and because of the innovative solutions we provide here at CMS, we require more technical resources than most call centers.

Recently, we invested in new hardware and software technologies that allow us to better manage our servers and the applications they run, and to troubleshoot potential issues.

The hardware technology is a network-based storage solution called a Storage Area Network (SAN), which gives our servers highly available access to data, including the operating system files the servers run from. The software technology is virtualization, which allows us to run multiple virtual servers on two physical servers connected by a highly redundant link.

First Off – What is a SAN?

A SAN is a Storage Area Network: network-accessible storage designed to be highly redundant in almost every way. It's basically a server with no operating system (it has firmware with embedded management features) and many disks. It communicates over a specialized network protocol called iSCSI, which carries standard SCSI storage commands over an IP network (SCSI is the direct-attached version found between the motherboard and hard drives in most servers).
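To make the iSCSI idea concrete, here is a minimal sketch of how a Linux host might discover and attach a SAN volume using the standard open-iscsi tools. The portal address is a placeholder, and our actual SAN setup may differ:

```python
import subprocess

# Hypothetical SAN portal address -- substitute your own.
PORTAL = "192.168.1.50:3260"

def attach_san_volumes(portal: str) -> None:
    """Discover iSCSI targets on a SAN portal and log in to each one.

    After a successful login, each SAN volume appears to the OS as an
    ordinary local block device (e.g. /dev/sdb), which is what lets a
    server boot and run from SAN-hosted storage.
    """
    # Ask the SAN which targets (exported volumes) it offers.
    discovery = subprocess.run(
        ["iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", portal],
        capture_output=True, text=True, check=True,
    )
    for line in discovery.stdout.splitlines():
        # Each line looks like: "192.168.1.50:3260,1 iqn.2001-05.com.example:vol0"
        target_iqn = line.split()[-1]
        subprocess.run(
            ["iscsiadm", "-m", "node", "-T", target_iqn, "-p", portal, "--login"],
            check=True,
        )

if __name__ == "__main__":
    attach_san_volumes(PORTAL)
```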

Our particular SAN has the following features:

  • A RAID 50 storage array that preserves data integrity through disk failures while keeping read/write performance fast.
  • 14TB of raw storage (9TB usable) for virtual machines; a rough capacity sketch follows this list.
  • Dual network interface cards that automatically fail over, at both the port and the card level, when a link goes down.
  • Dual disk controllers that automatically fail over.
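The difference between raw and usable space comes from RAID 50's parity overhead, plus hot spares and vendor formatting reserve. Our array's exact disk layout isn't spelled out here, so the numbers in this sketch are purely illustrative:

```python
def raid50_usable_tb(disk_tb: float, disks_per_group: int, groups: int) -> float:
    """Usable capacity of a RAID 50 array, ignoring spares and formatting.

    RAID 50 stripes (RAID 0) across multiple RAID 5 groups, and each
    RAID 5 group gives up one disk's worth of capacity to parity.
    """
    data_disks_per_group = disks_per_group - 1  # one parity disk per group
    return data_disks_per_group * groups * disk_tb

# Illustrative only: 14 x 1TB disks arranged as two 7-disk RAID 5 groups
# would leave 12TB for data; hot spares and vendor formatting overhead
# then shrink that further toward the advertised usable figure.
print(raid50_usable_tb(disk_tb=1.0, disks_per_group=7, groups=2))  # 12.0
```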

Looking at Performance Impact

One of the main benefits of virtualized servers is that they share and allocate resources based on need. We were curious to see how this would affect our SQL (database) server, which is one of our most heavily used servers.

If a software program requests information (for us, that means loading scripts, updating our CRM, or loading instructions for Enframed, our internet overlay software), there's a good chance it has to go to our SQL server to get it.

To see how virtualization would affect SQL's performance, our Programming and Project Manager (and data analysis guru), Phillip Johnson, ran the same complicated query 200 times before the SQL server was virtualized and 200 times after.
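We haven't published Phillip's exact harness, but the idea is simple: run the same query repeatedly and record how long each run takes. Here's a minimal sketch in Python; the connection string and query are placeholders, not our production values:

```python
import time
import pyodbc  # assumes an ODBC driver for SQL Server is installed

# Placeholder connection string and query -- not our production values.
CONN_STR = "DRIVER={SQL Server};SERVER=sqlhost;DATABASE=crm;Trusted_Connection=yes"
QUERY = "SELECT id, body FROM scripts WHERE client_id = 42"  # stand-in query

def benchmark(conn_str: str, query: str, runs: int = 200) -> list:
    """Run the same query `runs` times and return per-run latency in milliseconds."""
    conn = pyodbc.connect(conn_str)
    try:
        cursor = conn.cursor()
        timings = []
        for _ in range(runs):
            start = time.perf_counter()
            cursor.execute(query)
            cursor.fetchall()  # pull the full result set so transfer time is counted
            timings.append((time.perf_counter() - start) * 1000.0)
        return timings
    finally:
        conn.close()

timings = benchmark(CONN_STR, QUERY)
```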

The graph below shows how many of those information requests (queries) took a given number of milliseconds to complete:

SQL Performance

You can see that our old server (blue) is tightly concentrated around 1300-1400 milliseconds. This is because a non-virtualized server always has access to exactly the same amount of memory and processing power.

In contrast, the red line shows our non-optimized virtualized server. Here the distribution is much broader: the same query typically completed anywhere in the 750-2000 millisecond range. This is because for some of those queries the virtualization system allocated more resources to the SQL server, and for others it allocated fewer.
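One way to quantify "broader" is to compare summary statistics for the two sets of timings, for example with Python's standard library. Here the two timing lists would come from a harness like the sketch above:

```python
import statistics

def summarize(label: str, timings_ms: list) -> None:
    """Print central tendency and spread for one set of query latencies."""
    q1, median, q3 = statistics.quantiles(timings_ms, n=4)  # quartiles
    print(f"{label}: mean={statistics.mean(timings_ms):.0f}ms  "
          f"median={median:.0f}ms  IQR={q3 - q1:.0f}ms  "
          f"stdev={statistics.stdev(timings_ms):.0f}ms")

# Usage with two timing lists collected before and after virtualization:
# a small IQR/stdev matches the dedicated server's tight cluster, while a
# large one reflects the virtualized server's variable resource allocation.
# summarize("physical SQL server", timings_before)
# summarize("virtual SQL server", timings_after)
```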

Better, More Reliable Information

Now that we have successfully virtualized our internal environment, we have more reliable information about our physical and virtual hosts: a centralized management console lets us pinpoint the server that is causing an issue.

Below is an example report that includes general statistics on the two physical servers running our 22 virtual servers.

Sample Report - Memory Consumed

Sample Report - CPU Usage

Sample Report - Sum of VMs' CPU Utilization - Per Host

Sample Report - Sum of VMs' Memory Utilization - Per Host
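The post doesn't name our virtualization platform, so purely as an illustration: assuming a VMware vSphere environment, per-host rollups like the ones in the report could be pulled with the pyvmomi SDK. The vCenter address and credentials below are placeholders:

```python
import ssl
from collections import defaultdict
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Placeholder vCenter address and credentials.
si = SmartConnect(host="vcenter.example.com", user="admin", pwd="secret",
                  sslContext=ssl._create_unverified_context())
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)

    cpu_mhz = defaultdict(int)  # sum of VM CPU usage per physical host
    mem_mb = defaultdict(int)   # sum of VM memory usage per physical host
    for vm in view.view:
        stats = vm.summary.quickStats
        host = vm.runtime.host.name
        cpu_mhz[host] += stats.overallCpuUsage or 0   # MHz
        mem_mb[host] += stats.guestMemoryUsage or 0   # MB

    for host in sorted(cpu_mhz):
        print(f"{host}: {cpu_mhz[host]} MHz CPU, {mem_mb[host]} MB memory")
finally:
    Disconnect(si)
```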

Although it was a long and tedious process, this investment was necessary to accommodate our growth and ensure consistent, reliable uptime for our partners. I'm happy that we're able to mark this project off our checklist and begin sharing the results. It has already proven extremely beneficial.

Server Photo courtesy of getButterfly on Flickr