While replacing the SSL certificates across the Management and Workload Domains of a customer’s VCF 3.10 environment, I hit an issue: certificate replacement failed for both the NSX-V Manager and the SDDC Manager. The vCenter, PSC and Log Insight certificate replacements succeeded, even though all components were using signed certificates from the same Intermediate and Root CA servers. Replacing the certificates in a Workload Domain (WLD) where just vCenter and NSX-V Manager were deployed gave the same pattern: success for vCenter and failure for NSX.
I was following the documented process at https://docs.vmware.com/en/VMware-Cloud-Foundation/3.10/com.vmware.vcf.admin.doc_310/GUID-2A1E7307-84EA-4345-9518-198718E6A8A6.html.
The SDDC Manager GUI did not give much information about what was wrong with the request, so I went digging in the log files. On the SDDC Manager appliance, in the /var/log/vmware/vcf/commonsvcs/vcf-commonsvcs.log file, I saw messages about “could not get certificates from input stream…. Header and footer do not match: -----BEGIN CERTIFICATE----- -----END CERTIFICATE----------BEGIN CERTIFICATE-----”. Note that there is a space between the first BEGIN and END markers, but nothing between that END marker and the second BEGIN marker. The log also contained the message “Problems parsing certificate….. Could not parse certificate from input string”.
Here was the first hint that it didn’t like the certificate format. On the NSX-V Manager I generated a support bundle and then went looking through the logs for further clues.
In the \logs\appliance_mgmt\vsmvam.log file I saw a message “Invalid chain certificates”.
Again, it was complaining about the certificate format.
I checked my certificates. They were saved as .crt files, with each WLD in its own subfolder, and each folder name matched the Workload Domain name displayed in SDDC Manager. My root and intermediate CA certificates had been combined into a single .crt file with the intermediate CA listed first, followed by the root CA.
There were no white spaces, extra characters or blank lines at the end of the certificates. Everything looked fine.
Next I used OpenSSL commands on the SDDC Manager to verify the certificate chain and also the individual components in the chain.
openssl verify -CAfile rootca.crt nsxmanagercert.crt
It reported that the chain was valid, which is what I had expected since the PSC, vCenter and Log Insight had accepted the certificates. Each check I performed on the individual files also reported no issues.
At this point I was stuck and opened a support case with VMware GSS.
With their help the root cause was identified as an encoding error in the certificates. My customer had emailed the certificate contents as text rather than as files, and when the text was saved via Notepad++ into .crt files, the files were set to use the Windows (DOS) format for EOL encoding. While the files looked valid, the engineer told me this has been known to cause issues, so the first recommendation was to convert them to UNIX line endings. The second recommendation was to have an end of line character after the -----END CERTIFICATE----- at the end of the file: effectively, press Enter and leave the cursor at the start of the next line down.
To fix the formatting we opened the individual NSX Manager and SDDC Manager certificates and the combined root CA certificate file in Notepad++. For each file we pressed Enter to add a new line at the very end, then used the EOL Conversion option on the Edit menu to set it to UNIX/OSX Format. After saving the files we repeated the steps in the documentation to create a .tar.gz file with the certificates for the Management WLD and initiated the Install Certificate process again.
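Both problems GSS described, and the fused markers seen in the SDDC Manager log, are invisible in most editors but easy to spot at the byte level. Here is a minimal Python sketch of such a check, assuming the certificate is read as raw bytes. The function name inspect_pem is hypothetical and not part of any VMware tooling.

```python
import re

# Matches a PEM certificate boundary marker.
MARKER = re.compile(rb"-----(?:BEGIN|END) CERTIFICATE-----")


def inspect_pem(data: bytes) -> list:
    """Return a list of formatting problems found in raw PEM bytes."""
    problems = []
    # Windows/DOS files use CRLF, so any CR byte is suspicious here.
    if b"\r" in data:
        problems.append("contains CR characters (Windows/DOS line endings)")
    # The last END marker should be followed by an end of line character.
    if not data.endswith(b"\n"):
        problems.append("no newline after the final line")
    for line in data.splitlines():
        # A BEGIN or END marker must be the only thing on its line.
        if MARKER.search(line) and not MARKER.fullmatch(line.strip()):
            problems.append(
                "marker shares a line with other text: "
                + line.decode("ascii", "replace")
            )
    return problems
```

Calling it on the contents of a file, for example inspect_pem(open("nsxmanagercert.crt", "rb").read()), returns an empty list when none of these three issues is present. It does not validate the certificate itself; openssl verify still does that job.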
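The same two changes can be scripted when there are many files to fix. This is a rough Python sketch of the equivalent conversion, not the method GSS used. The names to_unix_pem and convert_file are hypothetical, and the files are handled in binary mode so that Python does not translate line endings itself.

```python
from pathlib import Path


def to_unix_pem(data: bytes) -> bytes:
    """Convert CRLF or lone CR to LF and end the data with exactly one LF."""
    unix = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    # Drop any trailing newlines, then add back a single one.
    return unix.rstrip(b"\n") + b"\n"


def convert_file(path: str) -> None:
    """Rewrite a certificate file in place with UNIX line endings."""
    target = Path(path)
    target.write_bytes(to_unix_pem(target.read_bytes()))
```

For example, convert_file("nsxmanagercert.crt") would rewrite that file in place. Keeping a backup copy of the originals before converting is a sensible precaution.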
This time the process completed without issue and all certificates were replaced successfully.
So why did it work without these changes for vCenter, the PSCs and Log Insight (vRLI)? Apparently a different approach is used to replace the certificates on each component. NSX-V uses an API-based method, while the PSC, for example, copies the file, which would have converted the encoding automatically and eliminated that issue.