You may already be aware of the L1 Terminal Fault (L1TF) vulnerability in VMware; if not, you can read my previous post here. Completing the mitigation of the L1TF vulnerability CVE-2018-3646 requires enabling the ESXi Side-Channel-Aware Scheduler in the ESXi hypervisor. Once enabled, this scheduler uses only a single logical processor on each Hyper-Threading-enabled core. Because of this, running virtual machine workloads can be impacted.
Before enabling the ESXi Side-Channel-Aware Scheduler, you need to assess the current environment to understand the impact of the change. VMware has released a tool called “HTAware Mitigation Tool” that analyzes your environment and gives you a clear picture of that impact. VMware has published a Knowledge Base article on its usage; this post covers my personal experience with the tool.
First of all, download the tool from the VMware Knowledge Base article (56931). You can find the .zip file in the “Attachments” section of the article.
The HTAware tool performs the following checks in your environment:
- Scans the virtual infrastructure for CPU utilization across Clusters, Hosts, and VMs to identify heavily utilized resources
- Identifies VMs which may be unable to run on their current host after the mitigation is applied
- Identifies hosts that are likely safe candidates for mitigation. This list of hosts can be provided as input to the second stage of the tool to enable the HTAware Mitigation
These are the key features of the HTAware Mitigation Tool:
- Collect and output historical CPU utilization information stored by vCenter for the Cluster and Host
- Identify the load impact of enabling the HTAware Mitigation on the scanned hosts. The tool also considers the load impact of reduced host capacity during rolling cluster upgrades
- Identify VMs whose total count of vCPUs is greater than the number of physical cores on the running host. Such VMs will be too “wide” to run on that host when the HTAware Mitigation is enabled
- Identify VMs which utilize the vCPU pinning feature. The PCPU (physical CPU) numbers may no longer be valid once the scheduler is enabled
- Provide automation functionality to apply the HTAware Mitigation across vSphere clusters and/or individual hosts
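The “wide VM” check from the list above is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not the tool's actual implementation: the function find_wide_vms, the Host and VM classes, and the sample inventory are all hypothetical, and real data would come from vCenter.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    physical_cores: int  # physical cores, not hyper-threads


@dataclass
class VM:
    name: str
    vcpus: int
    host: str  # name of the host the VM currently runs on


def find_wide_vms(hosts, vms):
    """Return VMs with more vCPUs than physical cores on their current host."""
    cores_by_host = {h.name: h.physical_cores for h in hosts}
    return [vm for vm in vms if vm.vcpus > cores_by_host[vm.host]]


# Hypothetical sample inventory, for illustration only
hosts = [Host("esx01", 16), Host("esx02", 8)]
vms = [
    VM("db01", 24, "esx01"),
    VM("web01", 4, "esx01"),
    VM("app01", 8, "esx02"),
]

wide = find_wide_vms(hosts, vms)
for vm in wide:
    print(f"{vm.name}: {vm.vcpus} vCPUs exceeds the physical cores on {vm.host}")
```

A VM with exactly as many vCPUs as the host has physical cores is not flagged here, because the check described above is “greater than”.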
HTAware Mitigation Tool Usage
The prerequisites for using the HTAware Mitigation Tool are:
- PowerShell 3.0 or greater
- PowerCLI 6.3 (supported on Windows, Linux & MacOS)
- Get-* commands indicated below require System.View privilege
- Set-* commands indicated below require privileges of Get-* commands plus Host.Config.AdvancedConfig privilege on the host being modified
HTAware Tool Installation
Download the .zip file and extract the “HTAwareMitigation” tool (at the time of writing, the tool version is 1.0.0.9).
Run “Import-Module .\HTAwareMitigation.psd1” to import the module (I changed my working directory to the extracted folder first).
Note: If you get a digital signature error, set your execution policy to “Unrestricted”; in my case it did not work with the RemoteSigned execution policy.
To view the available functions and verify that the tool was imported successfully, run the “Get-Command -Module HTAwareMitigation” command.
To analyse an ESXi cluster, run the command: Get-HTAwareMitigationAnalysis -ClusterName "<Name of the cluster>"
Once the tool is done with the analysis, it generates files similar to these:
- VC_Name.json.gz – Raw Collected Data
- output.csv – Processed results in CSV format
- output.html – Detailed report
- output.json.gz – Processed raw data
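The two .json.gz files are gzip-compressed, so they have to be decompressed before you can look inside. Here is a minimal Python sketch for that, assuming each file holds a single UTF-8 JSON document. I am not documenting the tool's schema here, so the sketch only loads the data; the demo file name and its keys are made up for illustration.

```python
import gzip
import json
import os
import tempfile


def load_json_gz(path):
    """Decompress a gzip file and parse its contents as JSON."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return json.load(fh)


# Demo with a stand-in file. The keys below are hypothetical and are
# NOT the tool's real schema.
sample = {"cluster": "demo-cluster", "hosts": ["esx01", "esx02"]}
demo_path = os.path.join(tempfile.mkdtemp(), "demo.json.gz")
with gzip.open(demo_path, "wt", encoding="utf-8") as fh:
    json.dump(sample, fh)

data = load_json_gz(demo_path)
print(type(data).__name__, sorted(data))
```

To inspect a real result, call load_json_gz with the path to output.json.gz in the folder where the tool wrote its files, then explore the returned object interactively.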
I opened the “output.html” report to review the results.
Any issues found by the analysis are flagged in this report.
To view the cluster configuration details related to L1 Terminal Fault, run the command: Get-HTAwareMitigationConfig -ClusterName "<Cluster Name>"
You can apply the mitigation settings in one step from the generated file using the “Set-HTAwareMitigationConfig -InputFile .\output.csv -Enable” command (add the -Confirm:$false argument to skip the manual confirmation prompt).
After the above command completes, the ESXi hosts must be rebooted to apply the changes. If you re-run the configuration check (Get-HTAwareMitigationConfig) before rebooting, the hosts show a “Reboot required” status.
After the reboot, check the configuration status of the hosts by running the same command again. You can suppress the warning for a cluster or host with the command Set-HTAwareMitigationSuppression -ClusterName "<Cluster Name>" or Set-HTAwareMitigationSuppression -VMHostname "<ESXi Host Name>". You can also do this from the ESXi host's Summary tab in vCenter Server.
Limitations of the HTAware Mitigation Tool:
- The assumptions used by the Tool to designate hosts as Green, Yellow or Red are based on collection triggers and CPU usage. Individual workloads may follow more conservative (or less conservative) rules. It is important that the guidance provided by the Tool is used as additional input into the analysis of desired infrastructure capacity and utilization
- Host utilization does not take DRS or manual load balancing into account. This may cause the Tool to be conservative or miss load spikes
- The Tool estimates compute throughput only. Hyper-Threading does help responsiveness, and the impact on responsiveness cannot be estimated
If you found this post useful, please rate it and share it!
Ryan Smith
November 6, 2018
Aruna,
Great article. It looks like the link to the download KB is broken. Here it is for reference for anyone. https://kb.vmware.com/s/article/56931
Aruna Lakmal
November 6, 2018
Thank you for the comment and sharing the link. 🙂