Many SysAdmins have been in this boat before me, and many will follow. We have
all been the new face at a company. You feel like you’re ready to take on the
world and make a difference. It’s your time to make the business run faster
with fewer resources, and to lower its operating costs while increasing
productivity and resources for internal and external customers.
You want everything to be streamlined, fast and with all the bells and whistles
to make everyone happy, much like James Bond when he hops into his Aston Martin
Vanquish. The problem is Q ran out of resources and gave you a 1989 K-Car wagon.
Sure, it’s running and pretty good on gas, but plastic wood paneling has no
place on a car in a chase across the Arctic, or in your server room.
When I started at The Code Project everything was, for the most part, running
perfectly fine. Servers were chugging along on a mix of new and old, fast and
slow hardware. So why fix it if it’s not broken? Well, broken is in the eye of
the beholder, and just because you put a new Bluetooth radio in your K-Car,
it’s still… well… a K-Car.
I will be spending some time writing a series of articles that will hopefully
guide you through some of our successes and failures. Over the last year and a
bit we went from an old-school, non-virtualized, non-clustered “buy a server
off the shelf, slam it in the rack and pop in the Windows DVD” setup to a mix
of Linux and Windows, virtualized servers, SANs, SQL failover clusters, load
balancers and more. In the end we reduced our electricity requirements by two
thirds, cut the number of physical servers to less than half, and reduced our
monthly licensing budget by 30 percent, yet we increased the number of
services while maintaining or improving performance on all fronts.
Sounds great right? Well fasten your seatbelt. It’s a long and winding road
ahead. Things are going to break, services will go down, staff will complain and
you will encounter resistance. With proper planning, outages and other concerns
can be mitigated.
The first thing you need to do is to make sure you have your team and everyone
involved aware that there will be struggles but the end goal is well worth it.
Once you have everyone’s blessing (or those who can complain are safely out of
the country… aka Chris in Australia) you are ready to dig in.
Where to Start
So what was my vision? From the earliest days at The Code Project I quickly
realized that we needed two key things.
- A SQL Failover Cluster. Being woken by an emergency phone call at 2am
because our SQL server lost a piece of hardware, while Time Zone X is about to
wake up, is not acceptable in my world of happy nap time.
- Virtualization. With a quick poke around during production hours I saw
that many of our servers were running on old Pentium 4s and first-gen Xeons,
yet were using only a small fraction of the CPUs they had. Although the
services they were running were happy on this hardware, the servers were
starting to fail due to basic hardware failures like fans and hard drives.
These failures were causing service outages that were, again, interfering
with my beauty sleep.
BTW, no matter what ring tone you use, it sucks waking up in the middle of the
night.
With these two targets in mind, I knew I would need some sort of shared
storage. Yes, this means a SAN. I wasn’t yet sure what type, model or
capacity I needed, just that I needed one. I also knew I wanted a local
network fast enough that multiple services could all share lines in a virtual
environment without slowing down.
Onto the Inventory
You need a complete and accurate inventory of what you have. You don’t need to
detail every single piece of everything but there are certain pieces of hardware
that can save lots of money. Some pieces of hardware haven’t really changed too
much in the last few years and some can still be reused in your new vision.
Take note of the following key items in your inventory.
- You first need a physical inventory of all server hardware. List every
server with its CPU type, quantity and number of cores; how much memory is
installed; the hard drives with makes and models (take specific note of
whether each is single port, dual port, SAS, SATA, etc.), form factor (2.5”
or 3.5”), spindle speed (RPM) and capacity (in GB); and the NICs in your
servers, how many per server, and whether they are integrated or add-on
cards.
- You will also need the soft information from each machine: map out if and
when your individual servers are busy or idle, how much CPU and memory each
uses during peak and non-peak times, when backups or other resource-intensive
jobs run, how much hard drive space is allocated to which partitions, and how
much free space you have or need.
- You should also make a list of network gear: switches, routers, hubs
(GASP!!!), firewalls, etc. Is it gigabit throughout? What about your cables?
Are they all Cat 5E or higher?
- The last piece of information is the software info: OSes, services,
applications, service pack levels, and any key dependencies, e.g. Web Servers
1-6 use databases on SQL Servers 1 and 2.
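Even a simple script beats a pile of sticky notes for keeping all of this straight. Here’s a minimal sketch of recording the inventory as structured data you can sort and filter later; the field names and sample servers are my own invention, purely illustrative:

```python
import csv
from dataclasses import dataclass, asdict

# Hypothetical inventory record; fields are illustrative, not a standard schema.
@dataclass
class ServerRecord:
    name: str
    cpu_model: str
    sockets: int
    cores: int
    ram_gb: int
    disk_form_factor: str   # "2.5in" or "3.5in"
    disk_interface: str     # e.g. "SATA", "SAS dual-port"
    disk_rpm: int
    disk_gb: int
    nics: int
    peak_cpu_pct: int       # observed during busy hours
    peak_ram_gb: int

# Made-up examples of the kind of entries you'd collect.
servers = [
    ServerRecord("web1", "Pentium 4", 1, 1, 4, "3.5in", "SATA", 7200, 160, 2, 15, 2),
    ServerRecord("sql1", "Xeon", 2, 8, 32, "2.5in", "SAS dual-port", 15000, 300, 4, 70, 28),
]

# Dump to CSV so the inventory survives in something other than your memory.
with open("inventory.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(asdict(servers[0])))
    writer.writeheader()
    for s in servers:
        writer.writerow(asdict(s))
```

The exact fields matter less than having one consistent record per box that you can query when the planning starts.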
Ideally much of your hardware is from one vendor; if it is, this can save you a
fair bit on the various bits and pieces as you build and spec out your new
hardware plan. Most of our newer hardware was from one vendor, with the older
stuff being from another.
Once you have all of this information, you will need to come up with your own
plan of what can be amalgamated and what can’t. I pretty much decided anything
running on less than a Xeon would be the first to be virtualized, mostly
because this was the hardware giving us issues, and since it was still running
on that old hardware it should virtualize fairly easily. From there I looked
at which services are not heavy in disk IO, as this can be a major bottleneck.
I made my list in very light pencil, and pressed on.
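That light-pencil shortlist can be expressed as a simple filter: pre-Xeon boxes with light CPU and disk IO load float to the top. A rough sketch, with made-up server data and thresholds that you would tune to your own environment:

```python
# Hypothetical candidate selection; the servers and thresholds below are
# illustrative guesses, not a recommendation.
servers = [
    # (name, cpu_model, peak_cpu_pct, disk_io_mb_s)
    ("dns1",  "Pentium 4", 5,  1),
    ("web3",  "Pentium 4", 20, 4),
    ("sql1",  "Xeon",      70, 120),
    ("file1", "Xeon",      15, 90),
]

def virtualization_candidates(servers, max_cpu=25, max_io=10):
    """Old, idle, low-IO boxes are the easiest first wave to virtualize."""
    return [name for name, cpu, peak, io in servers
            if "Xeon" not in cpu and peak <= max_cpu and io <= max_io]

print(virtualization_candidates(servers))  # ['dns1', 'web3']
```

Heavy-IO machines like the SQL and file servers fail the filter here, which matches the intuition: disk IO, not CPU, is usually what bites first in a virtual environment.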
I tend not to virtualize domain controllers. The number one reason is that in
a panic situation like a power outage, if the domain controllers aren’t up
first or down last, a lot of stuff won’t work properly, and you can end up
chasing your tail rebooting everything while you sit and stare at the Windows
login screen as it tries to connect to a domain controller that isn’t up yet.
That 5 to 20 minute timeout is agonizing with angry people tapping their feet
behind you. If you are familiar with virtualization, you can set up
priorities, startup delays and startup orders for virtual machines, but
chances are not everything will be virtualized, so this isn’t always going to
work.
Don’t Forget about Growth
This is where your company’s 3-year plan comes in handy. You have that in your
hip pocket, right? No? You missed that meeting because you were fixing
something? I know, I know, you don’t have it, but get it. Or make one up. Sit
down with the key decision makers and ask them if they foresee any mass
hiring, service changes, new products, or whatever else tends to shape your
server farm. The biggest mistake you can make when virtualizing is going too
small or buying hardware that isn’t expandable enough for the future.
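A crude sanity check when sizing the new hosts is to total current peak demand, pad it with a growth factor from that 3-year plan, and keep a spare host for failures. The numbers and the N+1 policy below are illustrative assumptions, not a formula to follow blindly:

```python
import math

# Illustrative capacity check: sum per-VM peak demand, add growth headroom,
# and see how many hosts of a given size that implies. All numbers made up.
peak_vcpus = [2, 2, 4, 1, 1, 8]      # per-VM peak vCPU demand
peak_ram_gb = [4, 4, 8, 2, 2, 16]    # per-VM peak RAM demand
growth_factor = 1.5                  # e.g. 50% growth over 3 years
n_plus_one = True                    # keep one spare host for failures

def hosts_needed(demand, per_host, growth=growth_factor, ha=n_plus_one):
    """Hosts required to carry total demand with growth and N+1 headroom."""
    required = sum(demand) * growth
    hosts = math.ceil(required / per_host)
    return hosts + 1 if ha else hosts

print(hosts_needed(peak_vcpus, per_host=16))   # 16 cores per host -> 3
print(hosts_needed(peak_ram_gb, per_host=64))  # 64 GB RAM per host -> 2
```

Take the larger answer across CPU, RAM and disk, and remember the point above: buying hosts you can’t expand later is the expensive mistake.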
If you have made it this far, you should now have a fairly good idea of what
you want to virtualize and what you may be able to consolidate. In our case we
had roughly 40 physical servers: 5 or 6 SQL servers, a dozen or so web
servers, a half dozen mail servers, and the rest split between DNS, domain
controllers and a mess of miscellaneous servers.
My initial plan was to buy two big honkin’ new SQL servers for a SQL cluster,
use the existing SQL servers as virtual hosts, and connect them all to a SAN,
then convert as many of the other physical servers to virtual servers as I
could. I saw 2 or possibly 3 of the old SQL servers as prime candidates for
virtual hosts; they had lots of cores and lots of RAM.
Disks are a very expensive part of putting in a SAN, so my main goal for cost
cutting was to re-use as many disks as I could. The SQL servers and various
file servers had lots of small, fast disks. Most of our servers were using
various 2.5” disks: some single port, some dual port, some 10k and some 15k
RPM, but all from the same manufacturer.
I also figured we could move some services to a cheaper means of storage
(e.g. iSCSI NASes) and re-use those more expensive disks for the SAN as well.
In all I was able to scavenge 8x300GB 10k disks, 10x148GB 10k dual-port SAS
disks, and numerous 72GB 15k disks. In the end we had a cheap, moderately
performing NAS for mass storage such as backups, and a nicely powered, fast
SAN for high-performance tasks such as SQL storage.
For the SQL server side of things, I needed to know how much physical space I
needed for logs and data, as these would need to reside on shared storage.
Since a lot of SysAdmins wear many hats, one of the first things you need to
know is that SQL databases don’t always use as much space as their files
occupy on disk, so finding this out from Windows Explorer isn’t going to cut
it.
Here is a little SQL query that you can run on your SQL servers to see what you
have in terms of space required and used.
DECLARE @command varchar(max) = 'use [?];
select DBName = db_name(),
    FileName = left(a.NAME,40),
    [Size MB] = convert(decimal(12,2),a.size/128.0),
    [Used MB] = convert(decimal(12,2),fileproperty(a.name,''SpaceUsed'')/128.0),
    [Free Space MB] = convert(decimal(12,2),(a.size-fileproperty(a.name,''SpaceUsed''))/128.0),
    FullPath = left(a.FILENAME,70)
from dbo.sysfiles a'
EXEC sp_MSforeachdb @command
This will show you how much space each log and data file is using, their
paths and names, and how much free space each has. When building a SQL server
you want to avoid autogrowth as much as possible, but you also don’t want to
allocate so much space that you’re wasting it. Make your plans accordingly.
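Once you know the used space per file, sizing the shared-storage volumes is just arithmetic: pad the used space with enough headroom that autogrowth rarely kicks in, and round up to a tidy volume size. A sketch with invented database names, an assumed 30% headroom, and rounding to 10GB boundaries:

```python
import math

# Illustrative volume sizing: used MB per database file (from the query
# above), padded with headroom and rounded up. All numbers are made up.
used_mb = {"AppData": 48_000, "AppLog": 6_500, "Reports": 12_000}

def volume_gb(used_mb, headroom=0.30, round_to_gb=10):
    """Pad used space by `headroom`, round up to a multiple of round_to_gb."""
    gb = (used_mb * (1 + headroom)) / 1024
    return math.ceil(gb / round_to_gb) * round_to_gb

for name, mb in used_mb.items():
    print(name, volume_gb(mb))  # AppData 70, AppLog 10, Reports 20
```

Put logs and data on separate volumes when you do this; their growth patterns, and usually their disks, are different.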
After running the numbers from my inventory of how much space I needed to
store the VMs and the SQL data/log files, it all worked out surprisingly well
with the disks we already had. I’ll go into the nitty-gritty details of
what’s going where with regard to the disks in later articles.
Off I went on my merry way, happily unplugging things, shutting things down,
moving things around, installing SQL clusters, and going through test phases
with Hyper-V, VMware and XenServer. Eventually we ended up with two virtual
hosts running about 20 VMs in production (one recycled from an old SQL server
and one a brand new server), both running XenServer 6.0; a pair of SQL
Servers in a failover cluster (one original, one new); all four of these
connected to a DAS SAN (reusing disks from other servers); and only 14 other
physical servers remaining, several of which are slated to be virtualized in
the next few months.
In the end I have found myself with a bit more time on my hands, able to
concentrate more on productivity gains, helping staff make better use of
their resources, improving functionality, tweaking the servers for peak
performance… and, of course, writing this lovely little article.
I'll follow up with specifics on what I did with the SQL servers, how we
consolidated the web servers, our storage balancing act, our network, and, of
course, more.
In the words of Little Nicky…