VAAI Train Running Late Today

Here’s another one of those posts that most likely won’t affect most of you, but there is a known issue when the vStorage APIs for Array Integration (VAAI) are mixed with EMC VMAX storage arrays. My best bud, Chad Sakac, wrote about this last week (over here).

Long story short, if you’re running a VMAX with Enginuity 5875.135.91 or 5875.139.93 along with ESX(i) 4.1 hosts, you may see some slowness when doing things like Storage vMotion, Deploy from Template, etc. Basically, anything that leverages HardwareAcceleratedMove.

So how do you fix it? First, disable HardwareAcceleratedMove; instructions on how to do this can be found in VMware KB1033665. Next, contact EMC Support and have the ePack that engineering released for this problem installed on your VMAX. Finally, contact VMware support for their hotfix; rumor has it that a VMware support bundle should be released sometime soon. Once you have the patches for both VMAX and vSphere, you should have no problem turning HardwareAcceleratedMove back on.
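For anyone who wants to script the check, here is a minimal sketch of the logic above. It only tests whether an Enginuity/ESX combination matches the versions named in this post and builds the command string for the toggle. The `esxcfg-advcfg` line reflects my reading of KB1033665, so treat the KB as the authoritative source; the function names `is_affected` and `xcopy_command` are made up for illustration, and nothing here actually touches a host.

```python
# Versions named in this post; a snapshot, not a complete or authoritative list.
AFFECTED_ENGINUITY = {"5875.135.91", "5875.139.93"}
AFFECTED_ESX = ["4", "1"]

# Advanced setting for XCOPY offload (0 = disabled, 1 = enabled).
# Assumption: option path as documented in VMware KB1033665.
XCOPY_OPTION = "/DataMover/HardwareAcceleratedMove"


def is_affected(enginuity: str, esx_version: str) -> bool:
    """Return True if the pair matches the combination described in this post."""
    major_minor = esx_version.strip().split(".")[:2]
    return enginuity.strip() in AFFECTED_ENGINUITY and major_minor == AFFECTED_ESX


def xcopy_command(enable: bool) -> str:
    """Build (but do not run) the command that sets HardwareAcceleratedMove."""
    return f"esxcfg-advcfg -s {1 if enable else 0} {XCOPY_OPTION}"


if __name__ == "__main__":
    if is_affected("5875.135.91", "4.1.0"):
        # Print the command so it can be reviewed before being run on a host.
        print(xcopy_command(enable=False))
```

Once both patches are in place, `xcopy_command(enable=True)` gives the line for turning the feature back on.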

So to wrap this up: there is a VAAI issue when mixed with VMAX, but it doesn’t affect all VAAI functionality, just XCOPY (HardwareAcceleratedMove), and even then it doesn’t affect every single operation. Get the patches, get them installed and get back on schedule!

Posted under Storage, vSphere

This post was written by Rick Scherer on June 14, 2011


NetXen HP NC522SFP Network Flooding

I had a very fun weekend. It started at 4am Saturday with a migration of ~125 virtual machines from an old AMD-based environment to a new Intel Nehalem-based environment. Who could’ve known that within a few hours all hell would break loose?

Enter the problem: network flooding from the NetXen-based, HP-branded NC522SFP. Because all of the 10GbE ports from the nine new ESXi servers were generating thousands of pause frames on the Cisco Nexus 5020 switches, I originally thought the issue was on the switch. Talks with Cisco revealed nothing. We tried disconnecting one of the connected ports (each ESXi host is dual-connected into a pair of N5Ks using vPC) to rule out a potential spanning tree loop. No dice.

A reboot of the host resolved the problem. Things appeared to be running normally, so we decided to let it be and wait until Monday.

Ten hours go by; it is now Sunday morning and the problem returns. The first host loses storage (we’re doing NFS over 10GbE here), then two more, until all 9 in this cluster are pretty much toast. I decide to open a ticket with VMware. Wouldn’t you know, there is a known bug and a potential resolution.

Bug 496013

Description: Some NetXen-based 10GbE cards using the unm_nic and nx_nic drivers sometimes flood the network with pause frames, causing the port to become disabled.

Resolution: NetXen believes upgrading the firmware to version 4.0.516 will resolve the problem.

I’ve gone ahead and patched 4 of the hosts with this new firmware, and so far it has been stable (knock on wood). I’ll let you know if something happens.

Checking which version of the firmware you’re running is simple. From a command line (ESX, or the hidden CLI on ESXi), type ethtool -i <vmnic#> (replace <vmnic#> with the name of the vmnic you’d like to check). You should see output similar to:

driver: nx_nic

version: 4.0.301

firmware-version: 4.0.406

bus-info: 0000:07:00.0
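If you have more than a handful of hosts to check, the comparison can be scripted. Below is a rough sketch that parses text in the `ethtool -i` format shown above and compares the firmware version against 4.0.516, the version named in the bug resolution. The helper names are hypothetical, and it assumes the version strings are plain dotted integers; a version with a non-numeric suffix would need extra handling.

```python
def parse_ethtool_info(text: str) -> dict:
    """Turn 'key: value' lines from `ethtool -i` output into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values like a PCI address survive.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def version_tuple(version: str) -> tuple:
    """Convert '4.0.406' into (4, 0, 406) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def firmware_needs_update(text: str, minimum: str = "4.0.516") -> bool:
    """Return True if the reported firmware is older than the minimum."""
    info = parse_ethtool_info(text)
    return version_tuple(info["firmware-version"]) < version_tuple(minimum)


if __name__ == "__main__":
    sample = """driver: nx_nic
version: 4.0.301
firmware-version: 4.0.406
bus-info: 0000:07:00.0"""
    print(firmware_needs_update(sample))
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as "4.0.1000" sorting before "4.0.516".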

Update - Utility CD with firmware patch now included…

As you can see above, the firmware is out of date. To update it, you will need to boot from a Linux utility CD that has the appropriate driver and then run a firmware update utility provided by HP. To make this process easy, I have created a bootable SLAX utility CD with the drivers pre-loaded. You can download the ISO from here (file temporarily removed). Once booted, run the installer located in the root filesystem (e.g. ./CP011471.scexe).

Let me know if you have any questions.

Posted under ESX 3.5 Tips, ESXi 3.5 Tips, Networking, Storage, vSphere

This post was written by Rick Scherer on January 11, 2010
