SDDC-Manager tips&tricks
My VCF deployment started its life in June 2024, based on the 5.0 BOM. I managed to install most of the Aria Suite components and added NSX-ALB (AVI) to the mix, while steadily upgrading to the latest VCF releases and intermediate async patches. Eventually, I even converted the very same Management-WLD from OSA-to-NFS-to-ESA and got it to work with vLCM images (neither of which is supported, btw!). Needless to say, I encountered many challenges keeping this installation alive, and I had to restore SDDC-Manager on several occasions. This blog describes some of the tools and procedures I used to reanimate SDDC-Manager and keep it healthy.
These topics will be discussed:
- (expired) Password Management
- SDDC-manager (out-of-sync) situations
Password Management
I assume many of you know the dreaded red banner password warnings that can appear in SDDC-manager. Most of the time this means SDDC-Manager was unable to contact the endpoint to check for the password status (same for certificates!). So, in most cases it’s best to wait and see if the error is persistent.
In other cases, a restored SDDC-Manager may not be aware of passwords that were rotated in the meantime. The next challenge is to find out what the actual password is before we can remediate it. The official procedure is to use the SDDC-Manager lookup_passwords tool.
root@sddc-manager # lookup_passwords
Password lookup operation requires ADMIN user credentials. Please refer VMware Cloud Foundation Administration Guide for setting up ADMIN user.
Supported entity types: ESXI VCENTER PSC NSX_MANAGER NSX_CONTROLLER NSXT_MANAGER NSXT_EDGE VRSLCM VRLI VROPS VRA WSA BACKUP VXRAIL_MANAGER AD
Enter an entity type from above list: NSXT_MANAGER
Enter page number (optional):
Enter page size (optional, default=50):
Enter Username: administrator@vsphere.local
Enter Password:
NSXT_MANAGER
identifiers: 192.168.1.66,m01-nsx01.vmw.local
workload: m01
username: admin
password: VMware1!VMware1!
type: API
account type: SYSTEM
NSXT_MANAGER
identifiers: 192.168.1.66,m01-nsx01.vmw.local
workload: m01
username: root
password: VMware1!VMware1!
type: SSH
account type: SYSTEM
While this is useful for the regular accounts, it can become a challenge if one of the service accounts is impacted. Also, when password rotation is enabled, it can be very hard to know the actual password of an account. Therefore, we can leverage two APIs to retrieve all passwords for regular and service accounts (and store them in a local password key-store of some kind). To gain access to the API, we first need to retrieve an access token, as shown below.
root@sddc-manager # curl -d '{"username" : "administrator@vsphere.local", "password" : "VMware1!"}' -H "Content-Type: application/json" -X POST localhost/v1/tokens -k | jq
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 2232 0 2163 100 69 10672 340 --:--:-- --:--:-- --:--:-- 11049
{
"accessToken": "eyJhbGciOiJIUzI1NiJ9.eyJqdGkiOiI4M2ZkOTFkZi1lYWE5LTRjMmYtODQ5Ni04MjRjMWZlM2M5MmUiLCJpYXQiOjE3MjY3NDMxMDcsInN1YiI6ImFkbWluaXN0cmF0b3JAdnNwaGVyZS5sb2NhbCIsImlzcyI6InZjZi1hdXRoIiwiYXVkIjoic2RkYy1zZXJ2aWNlcyIsIm5iZiI6MTcyNjc0MzEwNywiZXhwIjoxNzI2NzQ2NzA3LCJ1c2VyIjoiYWRtaW5pc3RyYXRvckB2c3BoZXJlLmxvY2FsIiwibmFtZSI6ImFkbWluaXN0cmF0b3JAdnNwaGVyZS5sb2NhbCIsInNjb3BlIjpbIlJFU09VUkNFX0ZVTkNUSU9OQUxJVFlfV1JJVEUiLCJMSUNFTlNJTkdfSU5GT19SRUFEIiwiU0REQ19GRURFUkFUSU9OX1dSSVRFIiwiQVZOX1dSSVRFIiwiU0REQ19NQU5BR0VSX1JFQUQiLCJDRVJUX1dSSVRFIiwiQUxCX0NMVVNURVJfUkVBRCIsIkxJQ0VOU0VfS0VZX1JFQUQiLCJFREdFX0NMVVNURVJfV1JJVEUiLCJVU0VSX1JFQUQiLCJDT01QTElBTkNFX1dSSVRFIiwiQ1JFREVOVElBTF9XUklURSIsIkJBQ0tVUF9DT05GSUdfUkVBRCIsIkNMVVNURVJfV1JJVEUiLCJBVk5fUkVBRCIsIlZBU0FfUFJPVklERVJfUkVBRCIsIkRPTUFJTl9XUklURSIsIkNFSVBfUkVBRCIsIlNPU19XUklURSIsIlNERENfTUFOQUdFUl9XUklURSIsIlJBX1JFQUQiLCJOVFBfV1JJVEUiLCJUQUdfV1JJVEUiLCJERVBPVF9DT05GSUdfV1JJVEUiLCJTWVNURU1fUkVBRCIsIkRFUE9UX0NPTkZJR19SRUFEIiwiSE9TVF9XUklURSIsIlJFU09VUkNFX0xPQ0tfV1JJVEUiLCJCQUNLVVBfUkVTVE9SRV9SRUFEIiwiQ0VSVF9SRUFEIiwiVVNFUl9XUklURSIsIkNPTVBMSUFOQ0VfUkVBRCIsIlVQR1JBREVfUkVBRCIsIk9USEVSX1JFQUQiLCJMSUNFTlNJTkdfV1JJVEUiLCJTT1NfUkVBRCIsIkVWRU5UX1dSSVRFIiwiU0VDVVJJVFlfQ09ORklHX1JFQUQiLCJDUkVERU5USUFMX1JFQUQiLCJIT1NUX1JFQUQiLCJBTEJfQ0xVU1RFUl9XUklURSIsIlZFUlNJT05fU1lOQ19XUklURSIsIkNFSVBfV1JJVEUiLCJSRVNPVVJDRV9MT0NLX1JFQUQiLCJPVEhFUl9XUklURSIsIkxJQ0VOU0VfS0VZX1dSSVRFIiwiUkVTT1VSQ0VfRlVOQ1RJT05BTElUWV9SRUFEIiwiQ0FfUkVBRCIsIlRBR19SRUFEIiwiTElDRU5TSU5HX1JFQUQiLCJORVRXT1JLX1BPT0xfV1JJVEUiLCJXQ1BfUkVBRCIsIkxJQ0VOU0lOR19JTkZPX1dSSVRFIiwiQkFDS1VQX1JFU1RPUkVfV1JJVEUiLCJOVFBfUkVBRCIsIkVER0VfQ0xVU1RFUl9SRUFEIiwiRVZFTlRfUkVBRCIsIkJBQ0tVUF9DT05GSUdfV1JJVEUiLCJXQ1BfV1JJVEUiLCJTRVJWSUNFX0FDQ09VTlRfV1JJVEUiLCJORVRXT1JLX1BPT0xfUkVBRCIsIkNBX1dSSVRFIiwiQ0xVU1RFUl9SRUFEIiwiVkFTQV9QUk9WSURFUl9XUklURSIsIkROU19XUklURSIsIlNZU1RFTV9XUklURSIsIlZSU0xDTV9XUklURSIsIkROU19SRUFEIiwiU0VSVklDRV9BQ0NPVU5UX1JFQUQiLCJTRERDX0ZFREVSQVRJT05fUkVBRCIsIkRPTUFJTl9SRUFEIiwiVlJTTEN
NX1JFQUQiLCJVUEdSQURFX1dSSVRFIl0sInJvbGUiOlsiQURNSU4iXX0.hQZphpc0IP6FHnTQEVrC-hU1rmPovnD0A_4seDWK2Lo",
"refreshToken": {
"id": "600263f4-d459-4ea5-9ff9-e606e895165e"
}
}
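The accessToken is a JWT, so its payload can be inspected locally to see when it expires and which role it carries. Below is a minimal Python sketch of that idea. The function names are my own (hypothetical), and the signature is not verified, so treat it as a debugging aid only. In the sample token above, the exp claim appears to sit one hour after iat.

```python
import base64
import json
import time


def decode_jwt_payload(token: str) -> dict:
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a token of the form header.payload.signature")
    payload = parts[1]
    # base64url data in a JWT is unpadded; restore the padding before decoding
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def seconds_until_expiry(token: str, now=None) -> float:
    """Seconds left before the 'exp' claim is reached (negative if expired)."""
    claims = decode_jwt_payload(token)
    current = time.time() if now is None else now
    return claims["exp"] - current
```

Calling seconds_until_expiry(token) before a longer scripted run tells you whether it is worth requesting a fresh token first.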
The accessToken can then be used to GET the following two API endpoints (for example with Postman):
- https://<sddc-manager>/v1/system/credentials
- https://<sddc-manager>/v1/system/credentials/service
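If you prefer a script over Postman, the two calls could be sketched in Python as shown here. This is a rough sketch under assumptions: the helper names are hypothetical, the host name is a placeholder, I assume both endpoints return JSON, and certificate verification is switched off by default to mirror the -k flag in the curl example. The response is returned as-is, because I do not want to guess its exact structure.

```python
import json
import ssl
import urllib.request

# The two endpoints named in this post
ENDPOINTS = ("/v1/system/credentials", "/v1/system/credentials/service")


def build_request(host: str, path: str, token: str) -> urllib.request.Request:
    """Prepare an authenticated GET request for one endpoint."""
    return urllib.request.Request(
        f"https://{host}{path}",
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
        method="GET",
    )


def fetch_json(request: urllib.request.Request, verify_tls: bool = False):
    """Send the request and return the decoded JSON body."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Lab shortcut, same effect as curl -k; enable verification in production
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context, timeout=30) as resp:
        return json.load(resp)


def fetch_all_credentials(host: str, token: str) -> dict:
    """Return {endpoint path: decoded JSON response} for both endpoints."""
    return {p: fetch_json(build_request(host, p, token)) for p in ENDPOINTS}
```

The output contains clear-text passwords, so write it only to a protected location such as your password key-store.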
Please be aware that as of VCF 5.2.1, password (and certificate) management is also available in the vSphere Client.
Fixing SDDC-Manager out-of-sync situations
You might wonder why we need to fix the out-of-sync situations SDDC-Manager might end up in. Think about restoring SDDC-Manager from a backup file where the restored state is older than the actual state of the WLDs, or a situation where changes were made outside of SDDC-Manager. One thing is certain: SDDC-Manager will complain until you fix it. In my experience, I was able to solve some of these situations with tools that are already available.
First of all, it is good practice to run the sos tool (a VDT-like tool for SDDC-Manager). The snippet below shows the health-check results.
root@sddc-manager # /opt/vmware/sddc-support/sos --health-check
Welcome to Supportability and Serviceability(SoS) utility!
Performing SoS operation for m01 domain components
----
Version Check Status : YELLOW
+-----+--------------------------------+---------------------------+-----------------------+-----------------------+--------+
| SL# | Component | BOM Version (lcmManifest) | Running version | VCF Inventory Version | State |
+-----+--------------------------------+---------------------------+-----------------------+-----------------------+--------+
| 1 | ESXI: esxi55.vcf.lan | 8.0.3-24280767 | 8.0.3-24414501 | 8.0.3-24414501 | YELLOW |
| 2 | ESXI: esxi56.vcf.lan | 8.0.3-24280767 | 8.0.3-24414501 | 8.0.3-24414501 | YELLOW |
| 3 | ESXI: esxi57.vcf.lan | 8.0.3-24280767 | 8.0.3-24414501 | 8.0.3-24414501 | YELLOW |
| 4 | ESXI: esxi58.vcf.lan | 8.0.3-24280767 | 8.0.3-24414501 | 8.0.3-24414501 | YELLOW |
| 5 | NSX_MANAGER: m01-nsx01.vcf.lan | 4.2.1.0.0-24304122 | Failed to get version | 4.2.1.1.0-24405893 | YELLOW |
| 6 | SDDC: sddc-manager.vcf.lan | 5.2.1.1 | 5.2.1.1 | 5.2.1.1 | GREEN |
| 7 | VCENTER: m01-vc01.vcf.lan | 8.0.3.00300-24305161 | 8.0.3.00400-24322831 | 8.0.3.00400-24322831 | YELLOW |
+-----+--------------------------------+---------------------------+-----------------------+-----------------------+--------+
---
Connectivity : RED
+-----+----------------------------------------------------+----------------------------+--------+
| SL# | Area | Title | State |
+-----+----------------------------------------------------+----------------------------+--------+
| 1 | ESXi : esxi55.vcf.lan | Ping status | GREEN |
| | | API Connectivity status | GREEN |
| | | ** SSH status is disabled. | |
| 2 | ESXi : esxi56.vcf.lan | Ping status | GREEN |
| | | API Connectivity status | GREEN |
| | | ** SSH status is disabled. | |
| 3 | ESXi : esxi57.vcf.lan | Ping status | GREEN |
| | | API Connectivity status | GREEN |
| | | ** SSH status is disabled. | |
| 4 | ESXi : esxi58.vcf.lan | Ping status | GREEN |
| | | API Connectivity status | GREEN |
| | | ** SSH status is disabled. | |
| 5 | NSX Ping Status | NSX Ping Status | RED |
| 6 | NSX: m01-nsx01.vcf.lan | API Connectivity status | RED |
| 7 | VMware Aria Operations : m01-vrops-master.vcf.lan | Ping status | GREEN |
| | | ** SSH status is enabled. | |
| 8 | VMware Aria Operations : m01-vrops-replica.vcf.lan | Ping status | GREEN |
| | | ** SSH status is enabled. | |
| 9 | VMware Aria Suite Lifecycle : m01-vrslcm.vcf.lan | Ping status | GREEN |
| | | ** SSH status is enabled. | |
| 10 | Workspace ONE Access : m01-vidm01.vcf.lan | Ping status | GREEN |
| | | ** SSH status is enabled. | |
| 11 | Workspace ONE Access : m01-vidm02.vcf.lan | Ping status | YELLOW |
| | | ** SSH status is disabled. | |
| 12 | Workspace ONE Access : m01-vidm03.vcf.lan | Ping status | GREEN |
| | | ** SSH status is enabled. | |
| 13 | vCenter : m01-vc01.vcf.lan | Ping status | GREEN |
| | | ** SSH status is enabled. | |
+-----+----------------------------------------------------+----------------------------+--------+
As with password errors, the sos health-check results do not always mean something is actually broken. In many cases, as in my example above, it is just a connectivity/availability issue that can easily be resolved by (re)starting the component or checking the NIC/switch/IP/VLAN connectivity.
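To avoid scanning long health-check output by eye, the tables can be filtered for rows that are not GREEN. Below is a minimal Python sketch. It assumes the sos output was saved to a text file and that the tables keep the pipe-delimited layout shown above. The function names are my own (hypothetical) and are not part of the sos tool.

```python
def parse_sos_tables(text: str) -> list:
    """Turn the pipe-delimited sos tables into a list of dicts keyed by column title."""
    rows, header, last_name = [], None, ""
    for raw in text.splitlines():
        line = raw.strip()
        # Only table rows start and end with a pipe; borders start with '+'
        if not (line.startswith("|") and line.endswith("|")):
            continue
        cells = [c.strip() for c in line[1:-1].split("|")]
        if cells[0] == "SL#":
            # A header row starts a new table
            header, last_name = cells, ""
            continue
        if header is None or len(cells) != len(header):
            continue
        row = dict(zip(header, cells))
        name_key = header[1]  # 'Component' or 'Area' in the tables above
        if row[name_key]:
            last_name = row[name_key]
        else:
            # Continuation line: inherit the name from the row above
            row[name_key] = last_name
        rows.append(row)
    return rows


def non_green(rows: list) -> list:
    """Keep only rows whose State column is filled in and is not GREEN."""
    return [r for r in rows if r.get("State") not in ("GREEN", "", None)]
```

For example, non_green(parse_sos_tables(open("sos-health.txt").read())) would list only the YELLOW and RED rows, where sos-health.txt is a placeholder file name.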
For the out-of-sync situations there are two tools available that helped save my lab environment.
- Make use of the recently introduced brownfield-import tool with the sync option
- Trigger the version-sync API
As of VCF version 5.2.0, the vcf-brownfield-import utility (found as a ‘solution’ within the VCF downloads) was introduced with the main purpose of importing existing vSphere environments into VCF as a workload domain. This tool executes several checks to see if the to-be-imported environment meets the requirements. It also has a sync option that can fix issues such as changes (configuration drift) made outside of SDDC-Manager.
root@sddc-manager # python3 vcf_brownfield.py -h
[2025-01-02 10:58:20,282] [INFO] vcf_brownfield: Brownfield Import main version: 5.2.1.1-24418436
usage: vcf_brownfield.py [-h] [-v] {convert,check,import,sync,deploy-nsx,precheck} ...
Brownfield Import main script, version: 5.2.1.1-24418436
options:
-h, --help show this help message and exit
-v, --version Display Brownfield Import scripts version
Available operations:
{convert,check,import,sync,deploy-nsx,precheck}
convert Convert Management Domain into SDDC Manager
check Check whether vCenter is suitable to be imported as a Virtual Infrastructure (VI)
import Import vCenter as a Virtual Infrastructure (VI) domain into SDDC Manager
sync Sync an already imported Virtual Infrastructure (VI) domain
deploy-nsx Deploy NSX Clusters as a Standalone operation
precheck Run prechecks on vCenter
root@sddc-manager # python3 vcf_brownfield.py sync --domain-name m01
[2025-01-02 11:00:10,953] [INFO] vcf_brownfield: Brownfield Import main version: 5.2.1.1-24418436
Enter SDDC Manager local admin password:
<< discovery and checks of several configurations and services >>
[2025-01-02 11:02:23,879] [INFO] check_domain: For more details, please, check:
Failed guardrails YML: /root/vcf-brownfield-import-5.2.1.1-24418436/vcf-brownfield-toolset/output/guardrails_report_m01-vc01.vcf.lan.yml
Failed guardrails CSV: /root/vcf-brownfield-import-5.2.1.1-24418436/vcf-brownfield-toolset/output/guardrails_report_m01-vc01.vcf.lan.csv
All guardrails CSV: /root/vcf-brownfield-import-5.2.1.1-24418436/vcf-brownfield-toolset/output/guardrails_report_m01-vc01.vcf.lan_all.csv
[2025-01-02 11:02:23,880] [INFO] vcf_brownfield: Inventory sync for domain m01 completed successfully
[2025-01-02 11:02:23,880] [INFO] vcf_brownfield: Operation sync completed on target: m01 with status: PASS in 116.59s
Review the resulting reports and check the SDDC-Manager GUI for health and correct version reporting. If this does not give the expected result, there is one other way to sync versions between SDDC-Manager and vCenter. The following API sequence triggers a sync task in SDDC-Manager.
root@sddc-manager # TOKEN=$(curl -d '{"username" : "administrator@vsphere.local", "password" : "VMware1!"}' -H "Content-Type: application/json" -X POST localhost/v1/tokens -k | jq -r '.accessToken')
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 2232 0 2163 100 69 6221 198 --:--:-- --:--:-- --:--:-- 6413
root@sddc-manager # curl -X POST -H 'Content-type: application/json' -H 'Accept: application/json' -H "Authorization: Bearer $TOKEN" http://localhost/v1/resources/version-syncs -d '{"resourceType":"SYSTEM"}'
{"id":"13592e74-061a-4b7d-9d71-5e235fa06475","name":"Synchronize Inventory Versions","status":"IN_PROGRESS","creationTimestamp":"2025-01-02T11:32:04.054Z","isCancellable":false,"isRetryable":false}
Have a look in SDDC-Manager to check if the task is running, what the output is and if it actually fixed your issue.
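The same check can be scripted by polling the task until it leaves the IN_PROGRESS state. The Python sketch below rests on assumptions: I assume the task id returned above can be queried with GET /v1/tasks/<id>, and I only treat IN_PROGRESS (the one status visible in the output above) as "still running". The helper names are hypothetical.

```python
import json
import time
import urllib.request


def make_http_fetcher(token: str, base_url: str = "http://localhost"):
    """Return a function that fetches one task as a dict (endpoint is an assumption)."""
    def fetch(task_id: str) -> dict:
        req = urllib.request.Request(
            f"{base_url}/v1/tasks/{task_id}",
            headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
        )
        with urllib.request.urlopen(req, timeout=30) as resp:
            return json.load(resp)
    return fetch


def wait_for_task(fetch_task, task_id, interval=10.0, timeout=1800.0, sleep=time.sleep):
    """Poll until the task status is no longer IN_PROGRESS, then return the task."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_task(task_id)
        if task.get("status") != "IN_PROGRESS":
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still IN_PROGRESS after {timeout}s")
        sleep(interval)
```

A call such as wait_for_task(make_http_fetcher(token), "13592e74-061a-4b7d-9d71-5e235fa06475") would return the final task document, whose status field tells you how the sync ended.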
This concludes this SDDC-Manager tips&tricks blog. Hope you find it useful!