One of our ESXi hosts failed with a "Purple Screen of Death" (PSOD). The analysis below describes what we found to be the root cause of the failure.
The host was running vSphere 5.5 at an older patch level (build 30xxxxx). We could not identify any hardware failure or any error related to the server hardware, and I can confirm that the host was configured with the correct drivers.
This is part of the error log we found on the failed ESXi host.
This was identified as the root cause: the PCPU becomes so busy logging correctable error messages that it cannot perform its routine background tasks, which leads ESXi to assume that the PCPU is unresponsive.
Steps to correct the error: to fix this PSOD we had to update the host to ESXi 5.5 patch build 3568722. Note, however, that the latest patch build available for 5.5 is 5230635.
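To check which hosts are still below the fixed build, the comparison can be scripted. The following is only a minimal sketch: it assumes the version string has the form "VMware ESXi 5.5.0 build-NNNNNNN" (the format printed by `vmware -v` on the host), and the function names `parse_build` and `needs_patch` as well as the sample build 1234567 are hypothetical and used purely for illustration.

```python
import re

# Build that resolved the PSOD in this incident (taken from the text above).
FIXED_BUILD = 3568722


def parse_build(version_string):
    """Return the integer build number from a string such as
    'VMware ESXi 5.5.0 build-3568722', or None if no build is present."""
    match = re.search(r"build-(\d+)", version_string)
    return int(match.group(1)) if match else None


def needs_patch(version_string, fixed_build=FIXED_BUILD):
    """Return True if the reported build is older than the fixed build."""
    build = parse_build(version_string)
    if build is None:
        raise ValueError("no build number found in: %r" % version_string)
    return build < fixed_build


if __name__ == "__main__":
    # The first sample build is a made-up placeholder, not the real host build.
    samples = [
        "VMware ESXi 5.5.0 build-1234567",
        "VMware ESXi 5.5.0 build-3568722",
        "VMware ESXi 5.5.0 build-5230635",
    ]
    for line in samples:
        status = "needs patch" if needs_patch(line) else "ok"
        print(line, "->", status)
```

Comparing build numbers as integers only makes sense within the same ESXi release line, so this check should be limited to 5.5 hosts.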
You can read more about these patches in the KB articles below:
- VMware ESXi 5.5, Patch Release ESXi550-201602001 (2144353)
- VMware ESXi 5.5, Patch ESXi-5.5.0-20160204001-standard (2144001)