Tuesday, May 20, 2014

VMware cluster datastore path dead

Symptoms:

  1. There are 4 ESXi 5.1 hosts (esxi01-04) in the VMware cluster. VMs on two specific datastores cannot start up on the esxi01 host.
  2. Checked the iSCSI network adapter; it should be fine, as only two datastores have the issue.
  3. Checked the LUN mapping; it is fine.
  4. Rescanning and refreshing the datastores didn't solve the issue.
  5. Disconnecting and re-mounting the datastore on esxi01 didn't solve the problem.
  6. To resume service ASAP, migrated all VMs to a TEMP datastore, then unmounted and detached the datastore on all ESXi hosts.
  7. Unmapped the LUN on the storage side and rescanned the datastores on the VMware cluster; the problem LUN should disappear.
  8. Remapped the LUN to the ESXi hosts, added it to the datastore cluster again, resignatured and formatted the LUN, and put a VM on it for testing.
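
Steps 6 and 7 have to be repeated on every host, so it can help to write the command list out before touching anything. Below is a minimal Python sketch that only builds the per-host esxcli commands as text; it does not connect to any host. The datastore label and NAA ID are placeholders, the function names are made up for this example, and the esxcli syntax should be checked against KB 2004605 for your ESXi build before use.

```python
import shlex

def unmount_detach_plan(hosts, datastore_label, naa_id):
    """Return (host, command) pairs: unmount the VMFS volume, then detach the device."""
    per_host = [
        # Unmount the datastore by its label
        "esxcli storage filesystem unmount -l " + shlex.quote(datastore_label),
        # Detach the backing device (sets its state to off)
        "esxcli storage core device set --state=off -d " + shlex.quote(naa_id),
    ]
    return [(host, cmd) for host in hosts for cmd in per_host]

def rescan_plan(hosts):
    """Return (host, command) pairs for the rescan after the LUN is unmapped on the array."""
    return [(host, "esxcli storage core adapter rescan --all") for host in hosts]

if __name__ == "__main__":
    hosts = ["esxi01", "esxi02", "esxi03", "esxi04"]
    # DS_PLACEHOLDER and naa.PLACEHOLDER are stand-ins, not real identifiers
    for host, cmd in unmount_detach_plan(hosts, "DS_PLACEHOLDER", "naa.PLACEHOLDER"):
        print(host + ": " + cmd)
    for host, cmd in rescan_plan(hosts):
        print(host + ": " + cmd)
```

Printing the plan first makes it easy to confirm that no host is skipped before the LUN is unmapped on the storage side.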

Searching through the ESXi logs gave the results below:

(standard input)-2014-05-xxT03:09:31.297Z cpu6:8198)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.60a9800037536d72502444xxxxxxxxb" state in doubt; requested fast path state update...
(standard input)-2014-05-xxT03:09:31.594Z cpu4:8401)ALERT: NMP: vmk_NmpVerifyPathUID:1167:The physical media represented by device naa.60a9800037536d72502444xxxxxx (path vmhba32:C0:T0:L13) has changed. If this is a data LUN, this is a critical error. Detect
(standard input)-2014-05-xxT03:09:31.594Z cpu4:8401)WARNING: ScsiDevice: 1422: Device :naa.60a9800037536d72502444xxxxxxx has been removed or is permanently inaccessible.
(standard input):2014-05-xxT03:09:32.410Z cpu4:8196)WARNING: HBX: 1548: HB failed due to no connectivity on [HB state abcdef02 offset 4059136 gen 17 stampUS 1847607014728 uuid 535950b3-29e2ce5d-488a-0025b501011f jrnl <FB 70200> drv 14.58] on vol 'XXXXXXX'
(standard input)-2014-05-xxT03:10:50.461Z cpu10:8202)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.60a9800037536d72502xxxxxxxx" state in doubt; requested fast path state update...
(standard input)-2014-05-xxT03:16:42.380Z cpu15:10515)WARNING: Vol3: 1717: Failed to refresh FS 535daf66-4b90cce6-fa7c-0025b50100ee descriptor: Device is permanently unavailable
(standard input)-2014-05-xxT03:16:42.661Z cpu15:10515)WARNING: Vol3: 1717: Failed to refresh FS 535daf66-4b90cce6-fa7c-0025b50100ee descriptor: Device is permanently unavailable
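
To see quickly which devices are involved in messages like these, the lines can be grouped by message type. The following is a rough Python sketch, not a VMware tool: the substrings are copied from the excerpt above, while the short labels and the function name are invented for this example.

```python
import re
from collections import defaultdict

# NAA IDs as they appear in the warnings; the masked IDs in this post contain
# "x" characters, so any run of letters and digits is accepted.
NAA_RE = re.compile(r"naa\.[0-9A-Za-z]+")

# Message substrings from the log excerpt, mapped to made-up short labels.
MARKERS = {
    "state in doubt": "path_in_doubt",
    "physical media represented by device": "media_changed",
    "removed or is permanently inaccessible": "permanent_device_loss",
    "HB failed due to no connectivity": "heartbeat_failed",
    "Device is permanently unavailable": "fs_refresh_failed",
}

def classify(lines):
    """Return {label: set of NAA IDs}; "-" stands for a line that names no device."""
    found = defaultdict(set)
    for line in lines:
        for needle, label in MARKERS.items():
            if needle in line:
                ids = NAA_RE.findall(line)
                found[label].update(ids or ["-"])
    return dict(found)

if __name__ == "__main__":
    import sys
    # Example: grep WARNING vmkernel.log | python classify_log.py
    for label, ids in sorted(classify(sys.stdin).items()):
        print(label + ": " + ", ".join(sorted(ids)))
```

A device that shows up under both "state in doubt" and "permanently inaccessible" is a good first candidate for the unmount-and-detach procedure.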

I'm not sure whether this KB describes the exact issue, but the symptoms are at least similar:

VMFS Resignature causes thrashing between multiple VMware ESXi 4.x/5.x and ESX 4.x hosts (1026710)

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1026710

Unmounting a LUN or detaching a datastore/storage device from multiple VMware ESXi 5.x hosts (2004605)
This article provides steps to unmount a LUN from an ESXi 5.x host, which includes unmounting the file system and detaching the device. These steps must be performed for each ESXi host.
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2004605
