SDDC-Manager tips&tricks

My VCF deployment started its life in June 2024, based on the 5.0 BOM. I managed to install most of the Aria Suite components and added NSX-ALB (AVI) to the mix, while steadily upgrading to the latest VCF releases and intermediate async patches. Eventually, I even converted the very same Management WLD from OSA to NFS to ESA and got it to work with vLCM images (both not supported, btw!). Needless to say, I encountered many challenges keeping this installation alive, and I had to restore SDDC-Manager on several occasions. This blog describes some of the tools and procedures I used to reanimate SDDC-Manager and keep it healthy.

These topics will be discussed:

  • (expired) Password Management
  • SDDC-Manager (out-of-sync) situations

Password Management

I assume many of you know the dreaded red-banner password warnings that can appear in SDDC-Manager. Most of the time this means SDDC-Manager was unable to contact the endpoint to check the password status (the same goes for certificates!). So, in most cases it’s best to wait and see whether the error persists.

In other cases, a restored SDDC-Manager is not aware of passwords that were rotated in the meantime. The next challenge is then to find out what the actual password is before we can remediate it. The official procedure is to use the lookup_passwords tool on SDDC-Manager.

root@sddc-manager # lookup_passwords

Password lookup operation requires ADMIN user credentials. Please refer VMware Cloud Foundation Administration Guide for setting up ADMIN user.

Supported entity types: ESXI VCENTER PSC NSX_MANAGER NSX_CONTROLLER NSXT_MANAGER NSXT_EDGE VRSLCM VRLI VROPS VRA WSA BACKUP VXRAIL_MANAGER AD
Enter an entity type from above list: NSXT_MANAGER
Enter page number (optional): 
Enter page size (optional, default=50): 
Enter Username: administrator@vsphere.local
Enter Password: 
        NSXT_MANAGER
        identifiers: 192.168.1.66,m01-nsx01.vmw.local
        workload: m01
                username: admin
                password: VMware1!VMware1!
                type: API
                account type: SYSTEM

        NSXT_MANAGER
        identifiers: 192.168.1.66,m01-nsx01.vmw.local
        workload: m01
                username: root
                password: VMware1!VMware1!
                type: SSH
                account type: SYSTEM

While this is useful for the regular accounts, it becomes a challenge when one of the service accounts is impacted. Also, once you have enabled password rotation, it can be very hard to know the actual password of an account. Therefore, we can leverage two APIs to retrieve all passwords for regular and service accounts (and store them in a local password key-store of some kind). To gain access to the API, we first need to retrieve an access token, as shown below.

root@sddc-manager # curl -d '{"username" : "administrator@vsphere.local", "password" : "VMware1!"}' -H "Content-Type: application/json" -X POST localhost/v1/tokens -k | jq 
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2232    0  2163  100    69  10672    340 --:--:-- --:--:-- --:--:-- 11049
{
  "accessToken": "eyJhbGciOiJIUzI1NiJ9.eyJqdGkiOiI4M2ZkOTFkZi1lYWE5LTRjMmYtODQ5Ni04MjRjMWZlM2M5MmUiLCJpYXQiOjE3MjY3NDMxMDcsInN1YiI6ImFkbWluaXN0cmF0b3JAdnNwaGVyZS5sb2NhbCIsImlzcyI6InZjZi1hdXRoIiwiYXVkIjoic2RkYy1zZXJ2aWNlcyIsIm5iZiI6MTcyNjc0MzEwNywiZXhwIjoxNzI2NzQ2NzA3LCJ1c2VyIjoiYWRtaW5pc3RyYXRvckB2c3BoZXJlLmxvY2FsIiwibmFtZSI6ImFkbWluaXN0cmF0b3JAdnNwaGVyZS5sb2NhbCIsInNjb3BlIjpbIlJFU09VUkNFX0ZVTkNUSU9OQUxJVFlfV1JJVEUiLCJMSUNFTlNJTkdfSU5GT19SRUFEIiwiU0REQ19GRURFUkFUSU9OX1dSSVRFIiwiQVZOX1dSSVRFIiwiU0REQ19NQU5BR0VSX1JFQUQiLCJDRVJUX1dSSVRFIiwiQUxCX0NMVVNURVJfUkVBRCIsIkxJQ0VOU0VfS0VZX1JFQUQiLCJFREdFX0NMVVNURVJfV1JJVEUiLCJVU0VSX1JFQUQiLCJDT01QTElBTkNFX1dSSVRFIiwiQ1JFREVOVElBTF9XUklURSIsIkJBQ0tVUF9DT05GSUdfUkVBRCIsIkNMVVNURVJfV1JJVEUiLCJBVk5fUkVBRCIsIlZBU0FfUFJPVklERVJfUkVBRCIsIkRPTUFJTl9XUklURSIsIkNFSVBfUkVBRCIsIlNPU19XUklURSIsIlNERENfTUFOQUdFUl9XUklURSIsIlJBX1JFQUQiLCJOVFBfV1JJVEUiLCJUQUdfV1JJVEUiLCJERVBPVF9DT05GSUdfV1JJVEUiLCJTWVNURU1fUkVBRCIsIkRFUE9UX0NPTkZJR19SRUFEIiwiSE9TVF9XUklURSIsIlJFU09VUkNFX0xPQ0tfV1JJVEUiLCJCQUNLVVBfUkVTVE9SRV9SRUFEIiwiQ0VSVF9SRUFEIiwiVVNFUl9XUklURSIsIkNPTVBMSUFOQ0VfUkVBRCIsIlVQR1JBREVfUkVBRCIsIk9USEVSX1JFQUQiLCJMSUNFTlNJTkdfV1JJVEUiLCJTT1NfUkVBRCIsIkVWRU5UX1dSSVRFIiwiU0VDVVJJVFlfQ09ORklHX1JFQUQiLCJDUkVERU5USUFMX1JFQUQiLCJIT1NUX1JFQUQiLCJBTEJfQ0xVU1RFUl9XUklURSIsIlZFUlNJT05fU1lOQ19XUklURSIsIkNFSVBfV1JJVEUiLCJSRVNPVVJDRV9MT0NLX1JFQUQiLCJPVEhFUl9XUklURSIsIkxJQ0VOU0VfS0VZX1dSSVRFIiwiUkVTT1VSQ0VfRlVOQ1RJT05BTElUWV9SRUFEIiwiQ0FfUkVBRCIsIlRBR19SRUFEIiwiTElDRU5TSU5HX1JFQUQiLCJORVRXT1JLX1BPT0xfV1JJVEUiLCJXQ1BfUkVBRCIsIkxJQ0VOU0lOR19JTkZPX1dSSVRFIiwiQkFDS1VQX1JFU1RPUkVfV1JJVEUiLCJOVFBfUkVBRCIsIkVER0VfQ0xVU1RFUl9SRUFEIiwiRVZFTlRfUkVBRCIsIkJBQ0tVUF9DT05GSUdfV1JJVEUiLCJXQ1BfV1JJVEUiLCJTRVJWSUNFX0FDQ09VTlRfV1JJVEUiLCJORVRXT1JLX1BPT0xfUkVBRCIsIkNBX1dSSVRFIiwiQ0xVU1RFUl9SRUFEIiwiVkFTQV9QUk9WSURFUl9XUklURSIsIkROU19XUklURSIsIlNZU1RFTV9XUklURSIsIlZSU0xDTV9XUklURSIsIkROU19SRUFEIiwiU0VSVklDRV9BQ0NPVU5UX1JFQUQiLCJTRERDX0ZFREVSQVRJT05fUkVBRCIsIkRPTUFJTl9SRUFEIiwiVlJTT
ENNX1JFQUQiLCJVUEdSQURFX1dSSVRFIl0sInJvbGUiOlsiQURNSU4iXX0.hQZphpc0IP6FHnTQEVrC-hU1rmPovnD0A_4seDWK2Lo",
  "refreshToken": {
    "id": "600263f4-d459-4ea5-9ff9-e606e895165e"
  }
}
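To see what the returned token actually contains (for example when it expires), the payload can be decoded locally. The snippet below is a minimal sketch, assuming the accessToken is a standard three-part JWT as the output above suggests; the function name `decode_jwt_payload` is hypothetical, and the signature is not verified, so use this for inspection only.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Return the claims of a JWT without verifying its signature."""
    # A JWT has the form header.payload.signature; each part is
    # base64url-encoded with the trailing '=' padding stripped.
    parts = token.strip().split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated JWT segments")
    payload = parts[1]
    # Restore the padding so the standard decoder accepts the segment.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))
```

With the token stored in a variable, `claims = decode_jwt_payload(token)` followed by `claims["exp"] - claims["iat"]` gives the token lifetime in seconds. For the sample token above, the two claims appear to be one hour apart.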

The accessToken can then be passed as a Bearer token in the Authorization header of two GET API calls (I used Postman for this).
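For those who prefer a script over Postman, here is a rough sketch of such an authenticated GET. Note that the `/v1/credentials` path and the `accountType` query parameter are my assumption of what a credentials lookup in the VCF API looks like, not a copy of the original calls; check the VCF API reference for your release. The helper names `build_credentials_request` and `fetch_json` are hypothetical.

```python
import json
import ssl
import urllib.parse
import urllib.request


def build_credentials_request(base_url: str, token: str, account_type: str = "") -> urllib.request.Request:
    """Build (but do not send) a GET request carrying the Bearer token."""
    path = "/v1/credentials"  # assumption: credentials endpoint of the VCF API
    if account_type:
        # assumption: accountType filters on e.g. USER, SYSTEM or SERVICE
        path += "?" + urllib.parse.urlencode({"accountType": account_type})
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
        method="GET",
    )


def fetch_json(request: urllib.request.Request, verify_tls: bool = True) -> dict:
    """Send the request and parse the JSON response body."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Equivalent of curl -k; only sensible in a lab.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)
```

Something like `fetch_json(build_credentials_request("http://localhost", token, "SERVICE"))` would then be run on the SDDC-Manager appliance itself. Keep in mind the response contains clear-text passwords, so store it somewhere safe.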

Please be aware that as of VCF 5.2.1, password (and certificate) management is also available in the vSphere Client.

Fixing SDDC-Manager out-of-sync situations

You might wonder why we need to fix the out-of-sync situations SDDC-Manager can end up in. Think of restoring SDDC-Manager from a backup file where the restored state is older than the actual state of the WLDs, or of changes that were made outside of SDDC-Manager. One thing is certain: SDDC-Manager will complain until you fix it. In my experience, some of these situations can be solved with tools that are already available.

First of all, it is good practice to run the SoS tool (a VDT-like tool for SDDC-Manager). The snippet below shows the health-check results.

root@sddc-manager # /opt/vmware/sddc-support/sos --health-check
Welcome to Supportability and Serviceability(SoS) utility!
Performing SoS operation for m01 domain components
----
Version Check Status : YELLOW                                                                                                                                                                      
+-----+--------------------------------+---------------------------+-----------------------+-----------------------+--------+
| SL# |           Component            | BOM Version (lcmManifest) |    Running version    | VCF Inventory Version | State  |
+-----+--------------------------------+---------------------------+-----------------------+-----------------------+--------+
|  1  |      ESXI: esxi55.vcf.lan      |       8.0.3-24280767      |     8.0.3-24414501    |     8.0.3-24414501    | YELLOW |
|  2  |      ESXI: esxi56.vcf.lan      |       8.0.3-24280767      |     8.0.3-24414501    |     8.0.3-24414501    | YELLOW |
|  3  |      ESXI: esxi57.vcf.lan      |       8.0.3-24280767      |     8.0.3-24414501    |     8.0.3-24414501    | YELLOW |
|  4  |      ESXI: esxi58.vcf.lan      |       8.0.3-24280767      |     8.0.3-24414501    |     8.0.3-24414501    | YELLOW |
|  5  | NSX_MANAGER: m01-nsx01.vcf.lan |     4.2.1.0.0-24304122    | Failed to get version |   4.2.1.1.0-24405893  | YELLOW |
|  6  |   SDDC: sddc-manager.vcf.lan   |          5.2.1.1          |        5.2.1.1        |        5.2.1.1        | GREEN  |
|  7  |   VCENTER: m01-vc01.vcf.lan    |    8.0.3.00300-24305161   |  8.0.3.00400-24322831 |  8.0.3.00400-24322831 | YELLOW |
+-----+--------------------------------+---------------------------+-----------------------+-----------------------+--------+
---
Connectivity : RED
+-----+----------------------------------------------------+----------------------------+--------+
| SL# |                        Area                        |           Title            | State  |
+-----+----------------------------------------------------+----------------------------+--------+
|  1  |               ESXi : esxi55.vcf.lan                |        Ping status         | GREEN  |
|     |                                                    |  API Connectivity status   | GREEN  |
|     |                                                    | ** SSH status is disabled. |        |
|  2  |               ESXi : esxi56.vcf.lan                |        Ping status         | GREEN  |
|     |                                                    |  API Connectivity status   | GREEN  |
|     |                                                    | ** SSH status is disabled. |        |
|  3  |               ESXi : esxi57.vcf.lan                |        Ping status         | GREEN  |
|     |                                                    |  API Connectivity status   | GREEN  |
|     |                                                    | ** SSH status is disabled. |        |
|  4  |               ESXi : esxi58.vcf.lan                |        Ping status         | GREEN  |
|     |                                                    |  API Connectivity status   | GREEN  |
|     |                                                    | ** SSH status is disabled. |        |
|  5  |                  NSX Ping Status                   |      NSX Ping Status       |  RED   |
|  6  |               NSX: m01-nsx01.vcf.lan               |  API Connectivity status   |  RED   |
|  7  | VMware Aria Operations : m01-vrops-master.vcf.lan  |        Ping status         | GREEN  |
|     |                                                    | ** SSH status is enabled.  |        |
|  8  | VMware Aria Operations : m01-vrops-replica.vcf.lan |        Ping status         | GREEN  |
|     |                                                    | ** SSH status is enabled.  |        |
|  9  |  VMware Aria Suite Lifecycle : m01-vrslcm.vcf.lan  |        Ping status         | GREEN  |
|     |                                                    | ** SSH status is enabled.  |        |
|  10 |     Workspace ONE Access : m01-vidm01.vcf.lan      |        Ping status         | GREEN  |
|     |                                                    | ** SSH status is enabled.  |        |
|  11 |     Workspace ONE Access : m01-vidm02.vcf.lan      |        Ping status         | YELLOW |
|     |                                                    | ** SSH status is disabled. |        |
|  12 |     Workspace ONE Access : m01-vidm03.vcf.lan      |        Ping status         | GREEN  |
|     |                                                    | ** SSH status is enabled.  |        |
|  13 |             vCenter : m01-vc01.vcf.lan             |        Ping status         | GREEN  |
|     |                                                    | ** SSH status is enabled.  |        |
+-----+----------------------------------------------------+----------------------------+--------+

As with password errors, the SoS health-check results do not always mean something is actually broken. In many cases, as in my example above, it is just a connectivity/availability issue that can easily be resolved by (re)starting the component or checking the NIC/switch/IP/VLAN connectivity.
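The health-check output gets long quickly, so it can help to filter out only the rows that need attention. Below is a minimal sketch, assuming the four-column Connectivity table layout shown above (SL#, Area, Title, State) has been saved as text; the function name `non_green_rows` is hypothetical and the parser is not part of the SoS tool.

```python
def non_green_rows(table_text: str) -> list[tuple[str, str, str]]:
    """Return (area, title, state) for every table row whose State is not GREEN."""
    findings = []
    current_area = ""
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +----+ borders and free text
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 4 or cells[0] == "SL#":
            continue  # skip the header and tables with another layout
        _, area, title, state = cells
        if area:
            current_area = area  # continuation rows leave the Area cell blank
        if state and state != "GREEN":
            findings.append((current_area, title, state))
    return findings
```

Applied to the Connectivity table above, this would leave only the two NSX rows (RED) and the m01-vidm02 ping row (YELLOW), which is exactly where to start looking.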

For the out-of-sync situations there are two tools available that helped save my lab environment.

  • Make use of the recently introduced brownfield-import tool with the sync option
  • Trigger the version-sync API

As of VCF 5.2.0, the vcf-brownfield-import utility (found as a ‘solution’ within the VCF downloads) was introduced with the main purpose of importing existing vSphere environments into VCF as a workload domain. This tool executes several checks to see whether the environment to be imported meets the requirements. It also has a sync option that can fix issues such as changes made outside of SDDC-Manager (configuration drift).

root@sddc-manager # python3 vcf_brownfield.py -h
[2025-01-02 10:58:20,282] [INFO] vcf_brownfield: Brownfield Import main version: 5.2.1.1-24418436
usage: vcf_brownfield.py [-h] [-v] {convert,check,import,sync,deploy-nsx,precheck} ...

Brownfield Import main script, version: 5.2.1.1-24418436

options:
  -h, --help            show this help message and exit
  -v, --version         Display Brownfield Import scripts version

Available operations:
  {convert,check,import,sync,deploy-nsx,precheck}
    convert             Convert Management Domain into SDDC Manager
    check               Check whether vCenter is suitable to be imported as a Virtual Infrastructure (VI) 
    import              Import vCenter as a Virtual Infrastructure (VI) domain into SDDC Manager
    sync                Sync an already imported Virtual Infrastructure (VI) domain
    deploy-nsx          Deploy NSX Clusters as a Standalone operation
    precheck            Run prechecks on vCenter

root@sddc-manager # python3 vcf_brownfield.py sync --domain-name m01
[2025-01-02 11:00:10,953] [INFO] vcf_brownfield: Brownfield Import main version: 5.2.1.1-24418436
Enter SDDC Manager local admin password: 

<< discovery and checks of several configurations and services >>

[2025-01-02 11:02:23,879] [INFO] check_domain: For more details, please, check:
        Failed guardrails YML: /root/vcf-brownfield-import-5.2.1.1-24418436/vcf-brownfield-toolset/output/guardrails_report_m01-vc01.vcf.lan.yml
        Failed guardrails CSV: /root/vcf-brownfield-import-5.2.1.1-24418436/vcf-brownfield-toolset/output/guardrails_report_m01-vc01.vcf.lan.csv
        All guardrails CSV: /root/vcf-brownfield-import-5.2.1.1-24418436/vcf-brownfield-toolset/output/guardrails_report_m01-vc01.vcf.lan_all.csv
[2025-01-02 11:02:23,880] [INFO] vcf_brownfield: Inventory sync for domain m01 completed successfully
[2025-01-02 11:02:23,880] [INFO] vcf_brownfield: Operation sync completed on target: m01 with status: PASS in 116.59s
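If you run the sync from a script, the final log line is the part worth capturing. The following is a small sketch that extracts operation, target, status and duration from it. It assumes the line format shown above stays the same; the function name `parse_sync_result` is hypothetical, and I only know the PASS status from my own run, so treat any other value as something to investigate.

```python
import re
from typing import Optional

# Matches the closing line of the transcript above, e.g.
# "Operation sync completed on target: m01 with status: PASS in 116.59s"
SYNC_RESULT = re.compile(
    r"Operation (?P<operation>\S+) completed on target: (?P<target>\S+) "
    r"with status: (?P<status>\S+) in (?P<seconds>[\d.]+)s"
)


def parse_sync_result(log_text: str) -> Optional[dict]:
    """Return the last 'Operation ... completed' summary found in the log, if any."""
    matches = list(SYNC_RESULT.finditer(log_text))
    if not matches:
        return None
    result = matches[-1].groupdict()
    result["seconds"] = float(result["seconds"])
    return result
```

A wrapper script could then stop early when `parse_sync_result(output)` returns `None` or a status other than `PASS`, and point you to the guardrails reports listed in the log.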

Review the resulting reports and check the SDDC-Manager GUI for health and for correct version reporting. If this did not give the expected result, there is one other way to sync versions between SDDC-Manager and vCenter. The following API sequence triggers a sync task in SDDC-Manager.

root@sddc-manager # TOKEN=$(curl -d '{"username" : "administrator@vsphere.local", "password" : "VMware1!"}' -H "Content-Type: application/json" -X POST localhost/v1/tokens -k | jq -r '.accessToken')
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2232    0  2163  100    69   6221    198 --:--:-- --:--:-- --:--:--  6413

root@sddc-manager [ ~ ]# curl -X POST -H 'Content-type: application/json' -H 'Accept: application/json' -H "Authorization: Bearer $TOKEN" http://localhost/v1/resources/version-syncs -d '{"resourceType":"SYSTEM"}'
{"id":"13592e74-061a-4b7d-9d71-5e235fa06475","name":"Synchronize Inventory Versions","status":"IN_PROGRESS","creationTimestamp":"2025-01-02T11:32:04.054Z","isCancellable":false,"isRetryable":false}

Have a look in SDDC-Manager to check whether the task is running, what its output is, and whether it actually fixed your issue.
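The JSON returned by the version-sync call can also be picked apart in a script, for instance to keep the task id for later. This is a minimal sketch based on the response shown above; the names `parse_task` and `task_url` are hypothetical, and both the set of in-flight statuses and the `/v1/tasks/<id>` path are assumptions on my side to verify against the VCF API reference.

```python
import json

# assumption: statuses that mean the task has not finished yet
IN_FLIGHT_STATUSES = {"IN_PROGRESS", "PENDING"}


def parse_task(response_body: str) -> dict:
    """Reduce a task response to the fields needed to follow it up."""
    task = json.loads(response_body)
    return {
        "id": task["id"],
        "name": task.get("name", ""),
        "status": task["status"],
        "finished": task["status"] not in IN_FLIGHT_STATUSES,
    }


def task_url(base_url: str, task_id: str) -> str:
    """Build the URL to poll a task (assumed endpoint: /v1/tasks/<id>)."""
    return f"{base_url.rstrip('/')}/v1/tasks/{task_id}"
```

Polling `task_url("http://localhost", task["id"])` with the same Bearer token until `finished` is true should, under these assumptions, tell you the same as the task pane in the GUI.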

This concludes this SDDC-Manager tips&tricks blog. Hope you find it useful!

Marco Baaijen
