vmware: no healthy upstream
After a hard reset, a greeting of:
no healthy upstream
appears in the vSphere Client. Log in as root and run:
service-control --start vmware-vpxd
to see if there's any additional information about this error. You may or may not get more detail. Next, check the time and NTP settings, since there's a good chance the clock is out of sync:
date
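If the output of date alone is hard to judge, the sync flag can be read from timedatectl. This is a minimal sketch, assuming the appliance ships systemd's timedatectl (not shown in the steps above); clock_synced is a hypothetical helper name.

```shell
# Hypothetical helper: read `timedatectl` output on stdin and print the
# sync flag ("yes"/"no"), or "unknown" if no such line is present.
# Older systemd prints "NTP synchronized:", newer "System clock synchronized:".
clock_synced() {
    awk -F': *' '
        /(System clock|NTP) synchronized/ { print $2; found = 1 }
        END { if (!found) print "unknown" }
    '
}

# Usage on the appliance (assumes timedatectl is installed):
#   timedatectl | clock_synced
```

A "no" here points at the time problem described above.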
Log in to the management console. For example:
https://vcsa01.nix.mds.xyz:5480/#/login
If the login fails with:
Unable to login
check for a full filesystem with:
# df -h |grep 100
/dev/mapper/log_vg-log 9.8G 9.5G 0 100% /storage/log
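Grepping for 100 also matches sizes such as 100G, so a threshold on the Use% column is a little more precise. A minimal sketch, where full_filesystems is a hypothetical helper and the 90 percent cutoff is an arbitrary assumption:

```shell
# Hypothetical helper: read `df -P` output on stdin and print every mount
# point whose usage is at or above the threshold given as $1 (percent).
full_filesystems() {
    awk -v limit="$1" '
        NR > 1 {
            use = $5
            sub(/%$/, "", use)          # "100%" -> "100"
            if (use + 0 >= limit + 0) print $6, use "%"
        }
    '
}

# Usage: list filesystems that are at least 90% full
#   df -P | full_filesystems 90
```

The -P flag keeps each filesystem on one line, which makes the column positions predictable. Mount points containing spaces are not handled.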
Clear space by removing old log files, for example:
root@vcsa01 [ ~ ]# df -h |grep 100
/dev/mapper/log_vg-log 9.8G 9.5G 0 100% /storage/log
root@vcsa01 [ ~ ]# cd /storage/log
root@vcsa01 [ /storage/log ]# du -sh *|grep G
9.4G vmware
root@vcsa01 [ /storage/log ]# cd vmware/
root@vcsa01 [ /storage/log/vmware ]# du -sh *|grep G
1.6G eam
2.7G lookupsvc
root@vcsa01 [ /storage/log/vmware ]# cd lookupsvc/
root@vcsa01 [ /storage/log/vmware/lookupsvc ]# du -sh *|grep G
2.6G tomcat
root@vcsa01 [ /storage/log/vmware/lookupsvc ]# cd tomcat/
root@vcsa01 [ /storage/log/vmware/lookupsvc/tomcat ]#
root@vcsa01 [ /storage/log/vmware/lookupsvc/tomcat ]# rm -rf localhost_access.2021*
root@vcsa01 [ /storage/log/vmware/lookupsvc/tomcat ]# df -h .
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/log_vg-log 9.8G 8.0G 1.4G 86% /storage/log
root@vcsa01 [ /storage/log/vmware/lookupsvc/tomcat ]#
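The rm above picks files by name. A slightly more cautious variant, sketched here, selects by age and prints the candidates before anything is removed. list_old_logs is a hypothetical helper and the 30-day cutoff is an arbitrary assumption.

```shell
# Hypothetical helper: print regular files under directory $1 whose name
# matches pattern $2 and whose modification time is more than $3 days old.
list_old_logs() {
    find "$1" -type f -name "$2" -mtime +"$3" -print
}

# Usage: review the list before deleting anything.
#   list_old_logs /storage/log/vmware/lookupsvc/tomcat 'localhost_access*' 30
```

Once the list looks right, the same find expression with -delete in place of -print removes the files.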
After cleaning up the space in a few more folders, reboot the appliance:
reboot
You should now be able to log in once the space is freed. If the UI still doesn't show up, log in to the management console (port 5480) again and disable, then re-enable, Time Synchronization. If that login fails too, issue:
root@vcsa01 [ ~ ]# service-control --start applmgmt
Run it from an SSH session on the appliance. If you get a certificate expiration failure:
Exception in invoking authentication handler [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1076)
Renew the self-signed or CA-issued certificate. Use this page:
https://kb.vmware.com/s/article/76719
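Before regenerating anything, it can be useful to confirm how far past expiry a certificate is. A minimal sketch, assuming GNU date and a certificate already exported to a PEM file (cert.pem is a placeholder name); days_until_expiry is a hypothetical helper that takes the notAfter= line printed by openssl x509 -noout -enddate.

```shell
# Hypothetical helper: print the whole days left until a certificate expires.
#   $1 = "notAfter=..." line as printed by `openssl x509 -noout -enddate`
#   $2 = optional reference time (defaults to the current time)
# Relies on GNU date's -d parsing; a negative result means already expired.
days_until_expiry() {
    end=$(date -u -d "${1#notAfter=}" +%s) || return 1
    now=$(date -u -d "${2:-now}" +%s) || return 1
    echo $(( (end - now) / 86400 ))
}

# Usage (cert.pem is a placeholder for an exported certificate):
#   days_until_expiry "$(openssl x509 -noout -enddate -in cert.pem)"
```

The optional second argument only exists so the calculation can be checked against a fixed date.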
Example output of the fixsts.sh script from that article:
root@vcsa01 [ /tmp ]# ./fixsts.sh
NOTE: This works on external and embedded PSCs
This script will do the following
1: Regenerate STS certificate
What is needed?
1: Offline snapshots of VCs/PSCs
2: SSO Admin Password
IMPORTANT: This script should only be run on a single PSC per SSO domain
==================================
Resetting STS certificate for vcsa01.nix.mds.xyz started on Mon Aug 1 04:23:46 UTC 2022
Detected DN: cn=vcsa01.nix.mds.xyz,ou=Domain Controllers,dc=vsphere,dc=local
Detected PNID: vcsa01.nix.mds.xyz
Detected PSC: vcsa01.nix.mds.xyz
Detected SSO domain name: vsphere.local
Detected Machine ID: 310ae9cb-82a9-4fa4-bcd4-d34b054d0090
Detected IP Address: 192.168.0.33
Domain CN: dc=vsphere,dc=local
==================================
==================================
Detected Root's certificate expiration date: 2030 Jun 3
Detected today's date: 2022 Aug 1
==================================
Exporting and generating STS certificate
Status : Success
Using config file : /tmp/vmware-fixsts/certool.cfg
Status : Success
Enter password for administrator@vsphere.local:
Highest tenant credentials index : 1
Exporting tenant 1 to /tmp/vmware-fixsts
Deleting tenant 1
Highest trusted cert chains index: 1
Exporting trustedcertchain 1 to /tmp/vmware-fixsts
Deleting trustedcertchain 1
Applying newly generated STS certificate to SSO domain
adding new entry "cn=TenantCredential-1,cn=vsphere.local,cn=Tenants,cn=IdentityManager,cn=Services,dc=vsphere,dc=local"
adding new entry "cn=TrustedCertChain-1,cn=TrustedCertificateChains,cn=vsphere.local,cn=Tenants,cn=IdentityManager,cn=Services,dc=vsphere,dc=local"
Replacement finished - Please restart services on all vCenters and PSCs in your SSO domain
==================================
IMPORTANT: In case you're using HLM (Hybrid Linked Mode) without a gateway, you would need to re-sync the certs from Cloud to On-Prem after following this procedure
==================================
==================================
root@vcsa01 [ /tmp ]#
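The script output asks for a service restart on all vCenters and PSCs in the SSO domain. One way to do that on each node is sketched below with the same service-control tool used earlier. The DRY_RUN switch and the function name are assumptions added for illustration; by default the commands are only printed.

```shell
# Hypothetical wrapper: stop, then start, all services on this node.
# With DRY_RUN unset or set to 1 the commands are echoed, not executed.
restart_all_services() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        prefix="echo"      # print the commands only
    else
        prefix=""          # run them for real
    fi
    $prefix service-control --stop --all &&
        $prefix service-control --start --all
}

# Usage:
#   restart_all_services             # show what would run
#   DRY_RUN=0 restart_all_services   # actually restart the services
```

Starting all services can take several minutes, so give the UI some time before retrying the login.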
Try to log in again. Another symptom of this error is:
[500] An error occurred while fetching identity providers. Try again. If problem persists, contact your administrator.
Cheers,
Tom