Nutanix Cluster Software & Firmware Upgrade Order

I am often asked in what specific order a Nutanix cluster should be upgraded. This covers all software components that make up a Nutanix cluster, such as Nutanix AOS, AHV, NCC and Foundation, but also VMware ESXi (and Hyper-V) and even the BIOS, firmware and other hardware device drivers. I thought it was about time to share my way of working with you in this blog post. I hope it helps you with your own Nutanix cluster upgrade activities.

First Some Prerequisites

Before listing the upgrade order, I want to highlight some prerequisite steps that are a key part of my Nutanix cluster upgrade activity.

“Before anything else, preparation is the key to success.” Alexander Graham Bell.

Keep Nutanix NCC & Foundation Up-To-Date

I will not go into the details of the workings of Nutanix Cluster Check (‘NCC’) and Foundation, but I want to stress that these two services should always be kept up-to-date as part of regular administration on a Nutanix cluster. It only takes a couple of minutes to download and apply the available updates on the Nutanix Controller VMs (CVMs) in the cluster. The CVMs do not require a reboot and there is no negative impact on the workloads running on the cluster.

Review the Nutanix Compatibility Matrix

Always check the Nutanix Compatibility Matrix before you go ahead and start downloading the latest upgrade packages from Nutanix and VMware (and Microsoft). By checking that matrix, you confirm that your envisioned upgrades of Nutanix AOS and VMware ESXi (or Microsoft Hyper-V) are qualified with your underlying server model and type. If your specific combination is not listed in the matrix, it does not mean that it does not work. However, it is not a qualified combination, which can result in limited or no support in case you run into issues later on. Always keep in mind that you can reach out to Nutanix Support in case you have questions when using the compatibility matrix.

Review the Nutanix Hardware Compatibility Lists

Always check the Nutanix Hardware Compatibility Lists (‘HCL’) in case you are also going to perform hardware-level upgrades touching the BIOS, firmware and device drivers of your servers. This is specific to non-Nutanix Appliance nodes such as HPE ProLiant and Dell PowerEdge. For example, when you have HPE ProLiant DL3x0 Gen10 servers, you can see on the relevant HCL that HPE SPP 2020.09.0 is the latest service pack supported with the Nutanix platform (at the time of writing this post). This information is key because you cannot simply upgrade your hardware drivers to the latest versions available from the vendor.
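To make the HCL check a bit more concrete, here is a minimal Python sketch that compares a candidate service pack version against the latest supported one. It only uses the SPP 2020.09.0 value from the example above; the function names and the candidate versions are my own illustrations, and the assumption that "not newer than the latest supported version" is good enough is a simplification. The real HCL lists specific qualified versions, so always read the list itself.

```python
def parse_version(text):
    """Turn a dotted version string such as '2020.09.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_not_newer_than_supported(candidate, latest_supported):
    """Return True when the candidate is not newer than the latest supported version."""
    return parse_version(candidate) <= parse_version(latest_supported)


# Value taken from the HPE example above (correct at the time of writing only).
LATEST_SUPPORTED_SPP = "2020.09.0"

# Candidate versions below are illustrative inputs, not HCL entries.
for candidate in ("2020.03.0", "2021.05.0"):
    ok = is_not_newer_than_supported(candidate, LATEST_SUPPORTED_SPP)
    print(candidate, "OK to consider" if ok else "newer than the HCL allows")
```

Comparing tuples of integers avoids the trap of comparing version strings as text, where "2020.10.0" would sort before "2020.9.0".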

Check the Upgrade Paths

As with any upgrade, do check the upgrade paths for your Nutanix and hypervisor software versions. Nutanix has an Upgrade Paths page where you can confirm that your current versions of AOS and Prism Central (but also Calm and Files) can be upgraded to your envisioned target release version. VMware has its own Product Interoperability Matrices where you can also confirm the available upgrade paths for ESXi, vCenter Server Appliance and all other VMware products. Do ensure that you adhere to the available paths to avoid any trouble whilst performing the various upgrade activities. Additionally, note that upgrade paths also apply to hardware service packs, as is the case with HPE SPPs, so always check the HPE SPP Update Compatibility information.
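Checking an upgrade path boils down to asking whether you can get from your current version to the target version through supported hops only. The sketch below shows that idea in Python with a breadth-first search. The version labels (v1 to v4) and the table of supported hops are entirely made up for illustration; the real data has to come from the vendor's upgrade paths pages.

```python
from collections import deque

# Hypothetical table: each version maps to the versions it may be upgraded to directly.
SUPPORTED_HOPS = {
    "v1": ["v2"],
    "v2": ["v3", "v4"],
    "v3": ["v4"],
    "v4": [],
}


def find_upgrade_path(current, target, hops=SUPPORTED_HOPS):
    """Return the shortest list of versions from current to target, or None."""
    queue = deque([[current]])
    seen = {current}
    while queue:
        route = queue.popleft()
        if route[-1] == target:
            return route
        for next_version in hops.get(route[-1], []):
            if next_version not in seen:
                seen.add(next_version)
                queue.append(route + [next_version])
    return None  # no supported route exists


print(find_upgrade_path("v1", "v4"))
print(find_upgrade_path("v4", "v1"))
```

A result with more than two entries means an intermediate upgrade is required before you can reach the target release; a result of None means there is no supported route at all.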

Upgrade Order

I hope that I did not lose you by throwing all that important preparation information at you. 🙂

Without any further ado, please see the following upgrade order for a Nutanix cluster:

    1. Upgrade Nutanix Prism Central (‘PC’)
      • Upgrade NCC (if newer version is available)
      • Upgrade PC
      • Run a new NCC and remedy any newly found issues
    2. Upgrade NCC and Foundation in Nutanix Prism Element (‘PE’)
      • Upgrade NCC (if newer version is available)
      • Upgrade Foundation (if newer version is available)
      • Run a new NCC and remedy any newly found issues
    3. Upgrade VMware vCenter Server Appliance
      • Upgrade VCSA (only applicable if you are using VMware vSphere)
      • Run a new Skyline Health Check and remedy any newly found issues
        • This feature does require participation in the Customer Experience Improvement Program (‘CEIP’) from VMware
    4. Update Nutanix Lifecycle Manager (‘LCM’) in Nutanix PE
      • Perform a new LCM inventory, which updates the LCM framework to the latest software version
        • This step first updates the LCM framework and, subsequently, performs a complete inventory: no reboots or upgrades to servers will be done
    5. Upgrade Nutanix AOS in Nutanix PE
      • In case of Nutanix AHV as hypervisor, do not immediately upgrade AHV right after the AOS upgrade has been completed
      • Run a new NCC and remedy any newly found issues
    6. Upgrade Server Firmware via LCM in Nutanix PE
      • In case you are using VMware ESXi, ensure that the relevant vSphere cluster HA Admission Control is disabled and DRS is set to Fully Automated
        • Additionally, ensure that no powered-on virtual machines are ‘pinned’ to ESXi hosts through either affinity rules or shared PCI devices (e.g. graphics cards)
      • LCM can only be used for Nutanix Appliance servers such as NX (Supermicro) and DX (HPE). Check the entire list here.
        • As of recently, it has become possible to use LCM for HPE DL380 Gen10 servers (dependent on using the latest version of Nutanix Foundation)
        • In case you are using non-Appliance servers, you need to use the upgrade method advised by the hardware vendor. Do note that these upgrades are not “1-click upgrades” and require manual actions.
    7. Upgrade Hypervisor in Nutanix PE
      • For Nutanix AHV, you can proceed by following the message in Nutanix PE and use the AHV version as packaged within the AOS upgrade
      • For VMware ESXi, use the offline bundle as downloaded from VMware
        • Note that you should use the vendor-specific version of ESXi where this is required, for example with HPE and Dell
      • Run a new NCC and remedy any newly found issues
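If you like to keep a runbook, the order above can be captured as a small Python helper that returns the steps for a given cluster. This is only a sketch of my list, not a Nutanix tool: the function name, the parameters and the step wording are my own, and it does not call any Nutanix or VMware API.

```python
def build_upgrade_plan(hypervisor, lcm_supported_hardware):
    """Return the ordered upgrade steps for one cluster.

    hypervisor: "ahv" or "esxi" (hypothetical labels used by this sketch).
    lcm_supported_hardware: True when LCM can upgrade the server firmware.
    """
    uses_esxi = hypervisor == "esxi"
    steps = [
        "Prism Central: upgrade NCC, upgrade PC, run NCC",
        "Prism Element: upgrade NCC and Foundation, run NCC",
    ]
    if uses_esxi:
        # Step 3 only applies when VMware vSphere is in use.
        steps.append("vCenter Server Appliance: upgrade VCSA, run Skyline Health")
    steps.append("Prism Element: run LCM inventory (updates the LCM framework)")
    steps.append("Prism Element: upgrade AOS, run NCC")
    if lcm_supported_hardware:
        steps.append("Prism Element: upgrade server firmware via LCM")
    else:
        steps.append("Servers: upgrade firmware with the hardware vendor's method")
    if uses_esxi:
        steps.append("Prism Element: upgrade ESXi with the offline bundle, run NCC")
    else:
        steps.append("Prism Element: upgrade AHV as packaged with AOS, run NCC")
    return steps


for number, step in enumerate(build_upgrade_plan("esxi", False), start=1):
    print(f"{number}. {step}")
```

Keeping the order in one place like this makes it easy to print a checklist per cluster and tick off each step during the maintenance window.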

Improvements in Nutanix LCM

Nutanix has already made great improvements to the Nutanix Lifecycle Manager service, ensuring that the above steps require a minimal amount of administrator effort.

Some of these improvements and new features are listed below:

    • Consolidating software and firmware component upgrades into a unified control plane
    • An infrastructure inventory shows software and firmware versions running across the environment, including any new versions available for deployment
    • When an IT administrator selects an update package, LCM automatically adds all corresponding upgrades that are a part of the dependency chain, ensuring that, with 1-click, everything required will be installed
    • Streamlined upgrade workflows with minimized infrastructure maintenance windows due to how LCM intelligently calculates the most optimal plan for all upgrade packages to be deployed
    • Based on the bundle of components (software or firmware) that are selected for upgrade, LCM creates a plan that ensures the updates are deployed with the minimal number of service or host restarts
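To illustrate the dependency chain idea from the list above, here is a small Python sketch that expands a selection with everything it depends on and then puts the result in a valid install order. This is not how LCM is implemented; the component names and the dependency map are hypothetical, and the sketch relies on graphlib from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical map: each component lists what must be installed before it.
DEPENDS_ON = {
    "hypervisor": {"aos"},
    "aos": {"lcm_framework"},
    "firmware": {"lcm_framework"},
    "lcm_framework": set(),
}


def expand_selection(selected, depends_on=DEPENDS_ON):
    """Return the selection plus everything it transitively depends on."""
    needed = set()
    stack = list(selected)
    while stack:
        item = stack.pop()
        if item not in needed:
            needed.add(item)
            stack.extend(depends_on.get(item, ()))
    return needed


def install_order(selected, depends_on=DEPENDS_ON):
    """Return the expanded selection with prerequisites listed first."""
    needed = expand_selection(selected, depends_on)
    graph = {item: depends_on.get(item, set()) & needed for item in needed}
    return list(TopologicalSorter(graph).static_order())


print(install_order(["hypervisor"]))
```

Selecting only the hypervisor in this toy example pulls in the two components it depends on, which is the same "select one package, get the whole chain" behaviour described above.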

 

The improvements to LCM are still continuing. It will not take long before all the above steps are fully automated! 🙂

Read more on: https://www.nutanix.com/products/life-cycle-manager.

Thank you

Thank you for reading this post and please do not hesitate to leave a comment below.
