Windows Guest OS encounters BSOD after Virtual Machine is vMotioned to ESXi 7.0+

This issue has been fixed in vSphere 7.0 Update 3i (December 2022)

Incident

After completing the VMware vCenter Server Maintenance on May 14, 2022 (https://eis-vss.atlassian.net/wiki/spaces/VSSPublic/blog/2022/04/22/1073512459/VMware+vCenter+Server+Maintenance+May+14+2022+Sat+08+00AM+-+08+00PM), the VSS Team has spent the last month upgrading ESXi hosts on the ITS Private Cloud to vSphere 7.0U2. These upgrades have been performed without impacting end users, thanks to VMware Live vMotion technologies.

Today, with ~90% of hosts upgraded, we have received an increasing number of reports of Windows guest operating systems (Windows Server 2012 through 2019) running on cluster FD4 encountering Blue Screen of Death (BSOD) errors after being vMotioned to an upgraded host.

Symptoms

  • Windows crashes with a BSOD, and/or

  • Windows cannot boot and shows the message "OS failed to boot with no operating system found".

Cause

We appear to be hitting a vSphere 7.0 bug that VMware documented recently (published June 1, 2022) in the KB article "Windows Guest OS encounters BSOD after Virtual Machine is vMotioned to ESXi version 7.0U2 or higher" (88516).

Virtual Machine File System (VMFS), under certain use cases, uses address optimization logic to avoid disk reads while resolving logical addresses to physical addresses. This address optimization helps read performance and is used only in certain cases, depending on how the virtual disk is allocated, the alignment of address allocation, and the usage of Large File Blocks (LFBs) for allocation. Due to a bug in this address optimization logic, VMFS can return zeros during a read, causing the above issue. (VMware 88516)

Workaround

Virtual Machine

There is evidence that a power cycle (power off, then power on) clears the BSOD errors. A power cycle may also help avoid memory errors by starting a fresh instance of the operating system with the new vSphere settings.

If one of your VMs is currently in a crashed state:

  1. Power off the virtual machine.

    vss-cli compute vm set <id> off
  2. Power on the virtual machine.

    vss-cli compute vm set <id> on
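
The two steps above can be wrapped in a small shell function when several VMs need a power cycle. This is only a sketch: it reuses the vss-cli commands exactly as written above, while the function name power_cycle_vm, the VSS_CLI override variable, and the 30-second pause between power off and power on are assumptions made for illustration, not part of vss-cli or of VMware's guidance.

```shell
# Sketch: power-cycle a single VM with the two commands listed above.
# power_cycle_vm, VSS_CLI and the default pause are hypothetical helpers.
power_cycle_vm() {
    local vm_id="$1"
    local pause_seconds="${2:-30}"    # assumed settle time between off and on
    local cli="${VSS_CLI:-vss-cli}"   # override only for dry runs or testing

    if [ -z "$vm_id" ]; then
        echo "usage: power_cycle_vm <id> [pause_seconds]" >&2
        return 2
    fi

    # Step 1: power off the virtual machine; stop if the command fails.
    "$cli" compute vm set "$vm_id" off || return 1

    # Wait before powering back on so the power-off can complete.
    sleep "$pause_seconds"

    # Step 2: power on the virtual machine.
    "$cli" compute vm set "$vm_id" on || return 1
}
```

For example, `power_cycle_vm <id>` runs both steps for one VM, and `power_cycle_vm <id> 60` uses a longer pause. The function returns a non-zero status if either command fails, so the power-on step is never attempted after a failed power-off.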

Virtual Machine

We strongly recommend that all VM admins upgrade VMware Tools to the latest version to ensure Guest Operating System stability.

ESXi Host

2022-06-20 01:20 PM: We are assessing the impact of the workaround suggested in the KB article with VMware Support. As soon as we understand the full impact of the change, we will implement it to fix the issues.

Solution

At the time of writing, there was no resolution to the issue (VMware).

University of Toronto - Since 1827