Wednesday, June 2, 2010

Do you know where your vCenter server is?


vCenter is one of those servers that at first was not a good candidate for virtualization, but with vSphere 4 able to handle much higher loads than VI3 or older versions of VMware, it has become more and more common to find vCenter virtualized. VMware now even recommends virtualizing it, and it makes for good dog food: we cannot expect our tier-one app admins to take the plunge if we won’t do it ourselves, now can we?

One reason I have heard for not virtualizing vCenter is “if something goes wrong, how do I get to it?” I have to admit, having the environment down and having to hunt for the vCenter server is not a good time. I resolved this by creating a new alarm that alerts me whenever vCenter migrates from one host to another and sends me an email.

Here is how I did it.
Step 1. Go to the alarms tab and click the definitions button.

Step 2. Right-click and select New Alarm from the context menu.

This will bring up the Alarm settings dialog box.

Step 3. Give it a good name and description. Also select the second radio button.

Step 4. Select the triggers tab and click Add. This will populate a generic Event.

Step 5. Click on the Event and select both the “VM migrated” and “DRS VM migrated” options.
VM migrated tells you when someone initiated a vMotion, and DRS VM migrated tells you when the system moved it.

Step 6. Click on Advanced under the Conditions heading and click Add.

This will add a new event argument. At this point click on the dropdown for the Argument and select “VM name”.


Then select the value and enter the name of your vCenter server; in this case it is vCenter1. When this is done, click OK.
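The trigger-plus-condition logic above can be sketched in plain Python. This is purely an illustration, not an API call: the event type strings mirror vSphere's VmMigratedEvent/DrsVmMigratedEvent event names, and "vCenter1" is just this example's server name.

```python
# Sketch of the alarm logic: fire only when a migration event
# involves the watched VM (illustrative, not a vSphere API).
MIGRATION_EVENTS = {"VmMigratedEvent", "DrsVmMigratedEvent"}

def should_alert(event_type, vm_name, watched_vm="vCenter1"):
    # Trigger: the event is a manual vMotion or a DRS-initiated move.
    # Condition: the "VM name" argument matches our vCenter server.
    return event_type in MIGRATION_EVENTS and vm_name == watched_vm

# A DRS move of vCenter1 alerts; a move of any other VM does not.
print(should_alert("DrsVmMigratedEvent", "vCenter1"))   # True
print(should_alert("VmMigratedEvent", "FileServer01"))  # False
```

The “VM name” condition is what keeps this alarm quiet for every other VM in the cluster; without it you would get an email for every vMotion.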

Step 7. Click on the Actions tab and click Add.

This will populate a new Action, which by default is “Send a notification email” — exactly what we want.
Next, change the alert level transition from “Yellow -> Red” to “Green -> Yellow”.

The last step in this screen is to add an email address. Then click OK.

Step 8. Now it is time to test it. To do this just migrate the vCenter VM from one host to another and you should get an email.

This setup lets you always know where your vCenter server is, so you never have to hunt for it on a bad day.



Monday, May 10, 2010

Storage Testing: getting the most out of your disk array with multiple database servers.

Recently we at work became the proud new parents of an HP EVA4400 with 72 15K drives. This is significantly more spindles than the 12 disks we are currently running on. Now, like most environments, we need to put 20 pounds of stuff in a 1 pound bag and get the most out of the hardware we can, because there is simply no more money in our budget for more hardware. With that idea in mind, I test drove several different best practice ideas from HP and VMware.


First we will review the best practices from HP. (I tried to find a link to the EVA best practices document but have had no luck; the only copy I have is the one that came on the CD with the EVA itself.) HP’s best practice comes in two flavors: the cost-effective approach, which is to put all the disks in one large disk group, and the best-availability-and-speed approach, which is to run Vraid 1. Note that Vraid 1 on an EVA is the equivalent of traditional RAID 1+0, that is, mirroring plus striping. That setup holds up from an availability standpoint, and even from a raw performance standpoint, but it has a few shortcomings. The first is the amount of lost disk space: the disks in our EVA are 146 GB each, for a total of 10512 GB, or about 10 TB raw. With RAID 10 we would only get about 5 TB usable, and that simply will not fit our needs, so we will be looking at RAID 5 and RAID 6 setups. Additionally, we have 2 very busy database servers and 2 busy Exchange mail store servers, and all four can generate lots of disk I/O from transaction logs and database reads and writes. HP’s EVA best practice document does note that when running database servers, or anything else with large amounts of I/O, the one large disk group might not be the best fit. With that said, and with running the whole thing as RAID 10 eating up too much space, I turned to VMware’s best practice and white papers for setting up storage.
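The capacity argument above can be checked with a quick calculation. This is simplified arithmetic on the numbers in the paragraph; the RAID 5/6 figure assumes classic whole-disk parity overhead, which is a simplification of how the EVA actually distributes Vraid redundancy.

```python
# Rough usable-capacity math for 72 x 146 GB disks (simplified;
# EVA Vraid distributes redundancy differently in practice).
disks, size_gb = 72, 146
raw_gb = disks * size_gb        # 10512 GB, about 10 TB raw
raid10_gb = raw_gb / 2          # mirroring halves capacity: ~5 TB

# Classic RAID 5 / RAID 6 overhead for a group of n disks:
def usable(n_disks, parity_disks, size):
    return (n_disks - parity_disks) * size

# Example: a 13-disk group as RAID 6 loses two disks' worth to parity.
print(raw_gb, raid10_gb, usable(13, 2, size_gb))
```

This is why RAID 10 was a non-starter for us on capacity alone, before even considering workload placement.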

Considering that we have 4 very active database servers, which happen to be our tier one apps, I wanted to be sure these servers had the I/O they need to meet the application and business requirements. There were 2 white papers from VMware I had read some time back: one about MS SQL (which can be found here http://www.vmware.com/files/pdf/perf_vsphere_sql_scalability.pdf) and the other about SAP (which can be found here http://www.vmware.com/files/pdf/whitepaper_SAP_bestpractice_jan08.pdf). In the MS SQL document VMware created one disk group for the database, then took 2 other disk groups, laid RAID 0 arrays on them for the T-logs, and had the VM mirror the two. Note that this was an all-out speed test of MS SQL Server on vSphere, and the number of disks they used was much larger than anything we have, but it did offer a possible configuration for our needs. As for the SAP document, even though we are not running SAP, it addresses running more than one database server and how to configure the disks to handle the load.
 
Now for the fun part: test one, the SAP configuration.
 
In the SAP document VMware sliced out one disk group of 8-12 disks for T-logs and used the other disk groups for data and OS drives. For my part, I created 1 disk group of 8 disks for T-logs and put 2 vdisks (LUNs) on it, each at Vraid 10. The reason I created 2 LUNs was to reduce LUN locking between the hosts. I then created 4 disk groups of 13 disks each for data and OS drives. If you’re doing the math, 13 * 4 + 8 equals 60, not 72; that is because all good admins keep a little bit of their resources unused and in their back pocket for a rainy day. On each of the 4 disk groups I first created a Vraid 6 LUN for the database drive, and then I created another LUN for data. The breakdown looks like this.
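The disk accounting for this first layout works out as follows — just a sanity check of the numbers in the paragraph above:

```python
# Test one (SAP-style) layout: one 8-disk T-log group plus
# four 13-disk data/OS groups, out of 72 disks total.
total_disks = 72
tlog_group = 8                   # Vraid 10, split into 2 LUNs
data_groups = 4 * 13             # one Vraid 6 LUN + one data LUN each
used = tlog_group + data_groups  # 60 disks allocated
spares = total_disks - used      # 12 disks held back for a rainy day
print(used, spares)
```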
 
Next I created the test VMs: 4 servers to simulate the 4 database servers we run now. Each of these VMs is a Windows 2008 server with 1 vCPU, 2 GB RAM, and 2 drives. The first drive is for the OS and the second is for T-logs. Each drive is on its own dedicated SCSI channel to improve access during simultaneous reads and writes. The servers were set up on the datastores as:
 
 I did not spread the OS drives over all 4 disk groups because my focus was to provide the I/O for the T-logs, which have the highest write counts. To test the I/O I installed Iometer (which can be downloaded here http://www.iometer.org/doc/downloads.html) on each of the servers. I set up the test to run for 10 minutes on all 4 servers at the exact same time. The results were quite impressive considering that there were only 8 disks split into 2 LUNs serving 4 servers running a very disk-intensive test. The results look like this:

The Second Test was the MS SQL setup by VMware.

In the MS SQL test, VMware created 1 large disk group for the database and 2 disk groups, each at raid level 0, for the T-logs. Since raid 0 does not provide any resiliency against disk failure, they mirrored the disks at the server level. This was a setup for speed testing one SQL server, but it seemed like it had the potential to meet our needs; even though I am not a big fan of software RAID, I am willing to do what it takes to get the job done. For the second test I set up the EVA with 4 disk groups of 15 disks each, then created two LUNs on each disk group: one Vraid 6 LUN for data and OS drives and one Vraid 0 LUN for T-logs. The configuration looked like this:
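This second layout consumes the disks differently — again just arithmetic on the figures above. The LUN size used below is a hypothetical number for illustration; the point is that two RAID 0 LUNs mirrored in the guest behave like RAID 10, so usable T-log space is half the raw space of the pair.

```python
# Test two (MS SQL-style) layout: four 15-disk groups, each carrying
# a Vraid 6 LUN (data/OS) and a Vraid 0 LUN (T-logs).
total_disks = 72
used = 4 * 15                    # 60 disks allocated
spares = total_disks - used      # 12 disks in reserve, as before

# Guest-level mirror over two RAID 0 T-log LUNs: half the raw pair.
lun_gb = 100                     # hypothetical LUN size, illustration only
mirrored_usable_gb = (lun_gb * 2) / 2
print(used, spares, mirrored_usable_gb)
```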


The final test setup looked like this:


This is definitely a more complicated setup than the first test and will be more prone to misconfiguration, but at this point it is only a test. Next came setting up the software mirrors. After the mirrors were all in place, I set up Iometer to test the T-log drives, and the C: drives as well, since in the final configuration the T-log drives would be on disk groups used by other servers too. I ran the exact same test for 10 minutes, just like the first round, and the results looked like this:




Saturday, May 1, 2010

Test

This is a test

Yes, the rule is test, test, test, and test again, and when you think you’re done testing, test it one more time. Then have some others test it as well.

Obviously this applies to more than just virtualization technologies, and it can have some good outcomes if done right and some not-so-good outcomes if not done at all. This has become a rule I follow whenever I think I have found a fix to a problem or am building some new system. Life and Murphy have taught me to test everything every way I can think of and then have some others test it as well.

Here are a few links to some testing methodologies:

http://www.satisfice.com/testmethod.shtml

Yes, I know it’s a wiki, but it’s a start:
http://en.wikipedia.org/wiki/Software_testing