Troubleshooting
The following error conditions and general troubleshooting tips may be helpful in resolving issues with the Keyfactor orchestrators. Issues are often related to trust of root and intermediate certificates, firewall restrictions, or insufficient permissions for the service account running the orchestrator service.
Things to check in the Management Portal include:
- Is the last seen time for the orchestrator on the Orchestrator Management page in the Management Portal within the last few minutes (see Orchestrator Management in the Keyfactor Command Reference Guide)? Most orchestrators send a heartbeat to Keyfactor Command every 5 minutes, so this time should be no more than 5 minutes old if the orchestrator is operating correctly.
Tip: Orchestrator control targets for the Keyfactor Bash Orchestrator do not appear on the Orchestrator Management page, so for a remote server that's not operating as expected, check the entry for the orchestrator that is controlling the target.
Figure 598: Orchestrator Management for a Keyfactor Bash Orchestrator
- Has the orchestrator been approved on the Orchestrator Management page in the Management Portal (see Orchestrator Management in the Keyfactor Command Reference Guide)?
- Is there a sync schedule set to run frequently for the orchestrator (SSH), remote control target (SSH), or certificate store? Sync schedules for certificate stores are automatically disabled if inventory jobs are failing.
- For the Keyfactor Bash Orchestrator:
Has the server record for the orchestrator or remote control target been created under SSH Server Manager on the Servers tab in the Management Portal (see SSH Servers in the Keyfactor Command Reference Guide)?
Figure 599: Orchestrator Management for a Keyfactor Bash Orchestrator
- Does the server record for the orchestrator or remote control target in the Management Portal have the correct hostname or IP address? If the name or IP address is incorrect, sync jobs will fail.
- Is the server record for the remote control target in the Management Portal associated with the correct orchestrator? If the control target is associated with the wrong orchestrator, you may be looking at the wrong log files (see Debug Logging and Error Messages) for troubleshooting information.
It is often helpful to enable debug logging on the orchestrator. For information on configuring this, see the specific orchestrator chapters.
Once the logging is set at debug or trace level, it can be helpful to watch the logs live while activity is going on. On Linux, you can do this with tail (or a similar tool) to watch the log in real time. For example:
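A minimal sketch of watching a log on Linux, assuming a log file at a placeholder path (the real location depends on the orchestrator and its logging configuration). The only_levels helper is a hypothetical name, not part of any Keyfactor tooling; it narrows the stream to the bracketed severity tags used in the Universal Orchestrator and Bash Orchestrator examples in this section.

```shell
# Hypothetical helper: keep only log lines whose bracketed level tag
# (for example [Error] or [Warn]) matches one of the given levels.
# Reads log lines from standard input.
only_levels() {
    # Build an alternation such as "Error|Warn" from the arguments.
    pattern=$(printf '%s|' "$@")
    pattern=${pattern%|}
    grep -E "\[(${pattern})\]"
}

# Usage (placeholder path; substitute your orchestrator's log file):
#   tail -n 50 -F /path/to/orchestrator.log
#   tail -n 50 -F /path/to/orchestrator.log | only_levels Error Warn
```

The -F option of tail keeps following the file by name, which helps if the log is rotated while you are watching it.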
On Windows, there are also tools with tail-like functionality. Notepad++, for example, has this functionality built in.
Some messages in the Keyfactor Universal Orchestrator log include a correlation ID that helps to identify log messages that originated from the same request. The correlation ID is a randomly generated GUID that often appears just after the date in the log entry (B0C4946E-DB3B-4404-8080-79AFF260DE4E in the following example) and is the same for all log messages for the given request until the request completes.
2023-09-15 18:19:00.3780 B0C4946E-DB3B-4404-8080-79AFF260DE4E 230398 Keyfactor.Orchestrators.JobExecutors.OrchestratorJobExecutor [Debug] - Running job extension for job with Id 'b0c4946e-db3b-4404-8080-79aff260de4e'
2023-09-15 18:19:00.3780 B0C4946E-DB3B-4404-8080-79AFF260DE4E 230398 Keyfactor.Extensions.Orchestrator.WindowsCertStore.WinCert.Inventory [Trace] - Entered 'ProcessJob' method.
2023-09-15 18:19:00.3780 B0C4946E-DB3B-4404-8080-79AFF260DE4E 230398 Keyfactor.Extensions.Orchestrator.WindowsCertStore.WinCert.Inventory [Trace] - {"JobCancelled":false,"ServerError":null,"JobHistoryID":230398,"RequestStatus":1,"ServerUserName":"keyexample\\svc_kyforch","ServerPassword":"**********","JobConfigurationProperties":{"spnwithport":false,"WinRm Protocol":"https","WinRm Port":"5986","ServerUsername":"keyexample\\svc_kyforch","ServerUseSsl":true,"sniflag":0},"UseSSL":true,"JobTypeID":"00000000-0000-0000-0000-000000000000","JobID":"b0c4946e-db3b-4404-8080-79aff260de4e","Capability":"CertStores.WinCert.Inventory","LastInventory":[],"CertificateStoreDetails":{"ClientMachine":"websrvr93.keyexample.com","StorePath":"My","StorePassword":"**********","Type":117}}
2023-09-15 18:19:00.3780 B0C4946E-DB3B-4404-8080-79AFF260DE4E 230398 Keyfactor.Extensions.Orchestrator.WindowsCertStore.WinCert.Inventory [Trace] - Establishing runspace on client machine: websrvr93.keyexample.com
2023-09-15 18:19:00.3780 B0C4946E-DB3B-4404-8080-79AFF260DE4E 230398 Keyfactor.Extensions.Orchestrator.WindowsCertStore.PsHelper [Trace] - Entered 'GetClientPsRunspace' method.
2023-09-15 18:19:00.3780 B0C4946E-DB3B-4404-8080-79AFF260DE4E 230398 Keyfactor.Extensions.Orchestrator.WindowsCertStore.PsHelper [Trace] - Creating remote session at: https://websrvr93.keyexample.com:5986/wsman
2023-09-15 18:19:00.3780 B0C4946E-DB3B-4404-8080-79AFF260DE4E 230398 Keyfactor.Extensions.Orchestrator.WindowsCertStore.PsHelper [Trace] - Credentials Specified
[Messages removed for clarity]
2023-09-15 18:19:00.7389 B0C4946E-DB3B-4404-8080-79AFF260DE4E 230398 Keyfactor.Extensions.Orchestrator.WindowsCertStore.WinCert.Inventory [Trace] - Connecting to remote server websrvr93.keyexample.com failed with the following error message : acquiring creds with username only failed No credentials were supplied, or the credentials were unavailable or inaccessible SPNEGO cannot find mechanisms to negotiate For more information, see the about_Remote_Troubleshooting Help topic.
   at System.Management.Automation.Runspaces.AsyncResult.EndInvoke()
   at System.Management.Automation.Runspaces.Internal.RunspacePoolInternal.EndOpen(IAsyncResult asyncResult)
   at System.Management.Automation.Runspaces.Internal.RemoteRunspacePoolInternal.Open()
   at System.Management.Automation.Runspaces.RunspacePool.Open()
   at System.Management.Automation.RemoteRunspace.Open()
   at Keyfactor.Extensions.Orchestrator.WindowsCertStore.WinCert.Inventory.PerformInventory(InventoryJobConfiguration config, SubmitInventoryUpdate submitInventory)
2023-09-15 18:19:00.7389 B0C4946E-DB3B-4404-8080-79AFF260DE4E 230398 Keyfactor.Extensions.Orchestrator.WindowsCertStore.WinCert.Inventory [Warn] - Inventory job failed for Site 'My' on server 'websrvr93.keyexample.com' with error: 'Connecting to remote server websrvr93.keyexample.com failed with the following error message : acquiring creds with username only failed No credentials were supplied, or the credentials were unavailable or inaccessible SPNEGO cannot find mechanisms to negotiate For more information, see the about_Remote_Troubleshooting Help topic.
   at System.Management.Automation.Runspaces.AsyncResult.EndInvoke()
   at System.Management.Automation.Runspaces.Internal.RunspacePoolInternal.EndOpen(IAsyncResult asyncResult)
   at System.Management.Automation.Runspaces.Internal.RemoteRunspacePoolInternal.Open()
   at System.Management.Automation.Runspaces.RunspacePool.Open()
   at System.Management.Automation.RemoteRunspace.Open()
   at Keyfactor.Extensions.Orchestrator.WindowsCertStore.WinCert.Inventory.PerformInventory(InventoryJobConfiguration config, SubmitInventoryUpdate submitInventory)
2023-09-15 18:19:00.7389 B0C4946E-DB3B-4404-8080-79AFF260DE4E 230398 Keyfactor.Orchestrators.JobExecutors.OrchestratorJobExecutor [Debug] - Finished running job extension for job with Id 'b0c4946e-db3b-4404-8080-79aff260de4e'
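To pull every message for one request out of a busy log, the correlation ID can be used as a search key. The following is a sketch only: lines_for_id is a hypothetical helper name and the log path is a placeholder. The match ignores case because, as in the example above, the same GUID can appear in upper case as the correlation ID and in lower case as the job ID.

```shell
# Hypothetical helper: print every line of a log file that carries the
# given correlation ID. -F treats the ID as a literal string and -i
# ignores case.
#   $1 = log file, $2 = correlation ID (GUID)
lines_for_id() {
    grep -iF -- "$2" "$1"
}

# Usage (placeholder path):
#   lines_for_id /path/to/orchestrator.log B0C4946E-DB3B-4404-8080-79AFF260DE4E
```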
Some messages to look for include:
- This message (or similar; the text varies slightly from orchestrator to orchestrator) indicates that the orchestrator has not yet been approved in the Keyfactor Command Management Portal:
2021-07-29 09:01:28.5957 Keyfactor.Orchestrators.JobEngine.SessionJobExecutor [Info] - Agent has not yet been registered with CMS. Trying again in 30 minutes.
After approving the orchestrator in the Management Portal, you can restart the orchestrator service to avoid waiting 30 minutes for the next automated retry.
- Some log messages spell out the problem clearly. For example, this message from the Java Agent log:
2021-07-29 09:00:02.437 [Scheduler_Worker-1] ERROR com.css_security.cms.JksUtilities - Keystore /opt/apps/myapp.jks loaded as type JKS but the provided password is incorrect
In this case, the certificate store configuration in the Management Portal is not using the correct password for the store.
- This series of messages in the Java Agent log indicates that the stored credentials file for the Java Agent is no longer usable:
2021-07-01 11:24:59.292 [Scheduler_Worker-1] ERROR com.css_security.cms.apache.http.HttpClientFactory - Given final block not properly padded. Such issues can arise if a bad key is used during decryption.
2021-07-01 11:24:59.313 [Scheduler_Worker-1] ERROR com.css_security.cms.apache.http.HttpClientFactory - Could not decrypt credentials file at config/install.creds
2021-07-01 11:24:59.313 [Scheduler_Worker-1] INFO com.css_security.cms.apache.http.HttpClientFactory - Your machine key may have changed. Reencrypt credentials using local machine key.
2021-07-01 11:24:59.313 [Scheduler_Worker-1] INFO com.css_security.cms.apache.http.HttpClientFactory - Generate new credentials by running included cms-credential-encryptor utility
2021-07-01 11:24:59.313 [Scheduler_Worker-1] INFO com.css_security.cms.apache.http.HttpClientFactory - Try 1. Trying again in 30 seconds
The credentials file can be recreated to return the Java Agent to functionality (see Appendix A—Generate New Credentials for the Java Agent).
- This series of messages indicates that the Keyfactor Command server is unreachable:
2021-07-29 11:59:02.1003 Keyfactor.Orchestrators.JobEngine.SessionClient [Error] - Unable to heartbeat:
2021-07-29 11:59:02.1003 Keyfactor.Orchestrators.JobEngine.SessionClient [Trace] - Leaving CMSSessionClient.Heartbeat
2021-07-29 11:59:02.1006 Keyfactor.Orchestrators.JobEngine.SessionJobExecutor [Debug] - Heartbeat success: Unreachable
2021-07-29 11:59:02.1006 Keyfactor.Orchestrators.JobEngine.SessionJobExecutor [Warn] - Heartbeat endpoint unreachable. Trying again later
This could indicate a network or firewall issue.
- A series of messages similar to this for the Universal Orchestrator can indicate a problem retrieving the certificate revocation list (CRL) for the certificate used to secure the Keyfactor Command server if you've chosen to connect to Keyfactor Command over SSL:
2022-09-14 11:15:06.1830 Keyfactor.Orchestrators.JobEngine.SessionJobExecutor [Error] - Error in SessionManager: Unable to register session. The SSL connection could not be established, see inner exception. The remote certificate is invalid because of errors in the certificate chain: RevocationStatusUnknown, OfflineRevocation
Confirm that the CRLs for the CA that issued the certificate and the remaining CAs in the chain are valid. Confirm that they are available in a location that is accessible to the orchestrator server (e.g. a location other than LDAP if the orchestrator is installed on a server not joined to a domain in the forest where they were issued). If you're using delta CRLs and hosting them on an IIS website using the default CRL suffix as a naming convention (+), be sure to enable double escaping in IIS to allow the orchestrator to retrieve the CRL files containing a plus sign in the file name.
- Messages that look like errors during SSL scanning are common as attempts are made to connect to TLS endpoints and connections fail or are refused. This is part of the process of testing whether an SSL endpoint is responding and then whether there is a certificate there. Most of these messages are logged at TRACE level, so monitoring at DEBUG rather than TRACE level will eliminate them if they become overwhelming. For example:
2022-09-12 10:56:32.3948 EE033BD9-421A-44CA-89BC-10C86949B506 166937 Tls13Probe [Trace] - Endpoint 192.168.216.87:443 returned status 'ExceptionDownloading' with exception 'System.ArgumentException': The specified nonce is not a valid size for this algorithm. (Parameter 'nonce')
2022-09-12 10:56:39.0567 EE033BD9-421A-44CA-89BC-10C86949B506 166937 Tls13Probe [Trace] - Endpoint 192.168.216.158:443 returned status 'ConnectionRefused' with exception 'System.Net.Sockets.SocketException': An existing connection was forcibly closed by the remote host.
2022-09-12 10:57:23.4727 EE033BD9-421A-44CA-89BC-10C86949B506 166937 Tls13Probe [Trace] - Connection to 192.168.216.87:443 failed
2022-09-12 10:57:24.3345 EE033BD9-421A-44CA-89BC-10C86949B506 166937 a [Trace] - Endpoint 192.168.216.211:443 returned status 'ExceptionDownloading' with exception 'Keyfactor.Orchestrators.SSL.Pipeline.Exceptions.ConnectionGoneException': Read zero bytes on a blocking read
2022-09-12 10:57:57.9505 EE033BD9-421A-44CA-89BC-10C86949B506 166937 b [Trace] - Endpoint 192.168.216.96:443 returned status 'SslRefused' with exception 'Keyfactor.Orchestrators.SSL.Pipeline.Exceptions.TlsAlertException': Got TLS alert during TLS handshake: Alert level 2, Alert description 70
You should see a heartbeat message similar to the following in the log every 5 minutes:
- Keyfactor Universal Orchestrator on Windows:
2023-09-12 11:01:16.4598 Keyfactor.Orchestrators.JobEngine.SessionJobExecutor [Debug] - Existing session found. Heartbeating...
- Keyfactor Bash Orchestrator:
Tue Aug 11 18:06:02 UTC 2023 [Debug]: Performing orchestrator heartbeat...
- Keyfactor Java Agent on Linux:
2023-07-30 00:52:11.662 [Scheduler_Worker-1] DEBUG com.css_security.cms.agents.jobs.SessionManager - Existing session found. Heartbeating...
This is the orchestrator checking in with the Keyfactor Command server to see if there are any jobs. If this message is missing, it could indicate that the heartbeat service is not running.
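One quick way to judge whether heartbeats have stopped is to find the most recent heartbeat entry and compare its timestamp with the current time. The sketch below assumes the log contains one of the heartbeat phrases shown in the examples above; last_heartbeat is a hypothetical helper name and the log path is a placeholder.

```shell
# Hypothetical helper: print the most recent log line that records a
# heartbeat, using the phrases from the sample messages in this section.
#   $1 = log file
last_heartbeat() {
    grep -E 'Heartbeating\.\.\.|Performing orchestrator heartbeat' "$1" | tail -n 1
}

# Usage (placeholder path):
#   last_heartbeat /path/to/orchestrator.log
#   date    # compare with the timestamp on the line printed above
```

If the newest heartbeat line is much more than 5 minutes old, or nothing is printed at all, the heartbeat is probably not running.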
If you're running the Keyfactor Bash Orchestrator, you can see the heartbeat service as a separate entity. Execute this command on the orchestrator in the command shell as root:
Output from this command should look something like that shown in Figure 600: Status for the Keyfactor Bash Orchestrator Service. If you don't see heartbeat.sh in the output, the heartbeat service is not running.
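Another way to look for heartbeat.sh, sketched under the assumption that it runs as an ordinary process visible to ps, is to filter a process listing. The find_heartbeat helper is a hypothetical name; it reads the listing from standard input so that it can be combined with whichever ps options your system supports.

```shell
# Hypothetical helper: read a process listing on standard input and
# print any entries whose command line runs heartbeat.sh.
find_heartbeat() {
    grep -E '(^|[/[:space:]])heartbeat\.sh([[:space:]]|$)'
}

# Usage (run as root, as for the status command described above):
#   ps -eo pid,user,args | find_heartbeat
```

No output means no process with heartbeat.sh in its command line was found.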
For other orchestrators, check to see if the orchestrator service as a whole is running (see details in the specific orchestrator chapters). Start the service if it is not running or restart it if it is running and check again for a heartbeat after a few minutes.
At a very basic level, the orchestrator needs to be able to communicate with the Keyfactor Command server(s) on either port 80 or port 443, depending on the configuration option you've chosen for this connection (see the orchestrator-specific chapters).
The ports needed for the Keyfactor Universal Orchestrator depend on the functions enabled for the orchestrator. For example, IIS certificate store management uses remote PowerShell (default TCP 5985 and 5986). For SSL discovery and management, any ports configured for scanning need to be open.
The Keyfactor Bash Orchestrator communicates with any remote control targets on port 22 or the alternative port you have configured for SSH. If you are using a non-standard port for SSH, you need to be sure to configure this on both the Keyfactor Command side (see Adding SSH Servers in the Keyfactor Command Reference Guide) and in the SSH configuration on the orchestrator and remote control targets (sshd_config).
For more information about the firewall ports needed in a Keyfactor Command environment, see Firewall Considerations in the Keyfactor Command Server Installation Guide.
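A basic reachability test from the orchestrator server can help separate firewall problems from configuration problems. The following sketch assumes bash and the coreutils timeout command are installed; port_open is a hypothetical helper name, and the hostnames in the usage lines are the sample names used elsewhere in this section.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port can be
# opened within 5 seconds. It relies on bash's /dev/tcp redirection, so
# no additional network tools are required.
#   $1 = host, $2 = port
port_open() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage:
#   port_open keyfactor.keyexample.com 443 && echo "Keyfactor Command reachable"
#   port_open appsrvr80.keyexample.com 22 || echo "SSH port blocked or closed"
```

A successful connection only shows that the port is open; it does not confirm that authentication or certificate validation will succeed.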
The Keyfactor Bash Orchestrator has two possible configurations: local and remote. The troubleshooting steps differ depending on whether the server that's not operating as expected is running the orchestrator software (a local installation) or is a control target for the orchestrator (a remote installation). In either case, the best place to start troubleshooting is in the Keyfactor Command Management Portal, to confirm that things seem correct on that side of the communication, and then to configure debug logging on the orchestrator and review those logs.
In this snippet you see a successful inventory showing keys found for the Linux users ginag and svc_greenchicken and a logon found for the Linux user zadams with no key found. You see that the server is configured in inventory and publish policy mode, since after performing the inventory the server went through the steps of publishing logons and keys. Details about these are not written to the log.
Tue Aug 11 18:07:45 UTC 2020 [Debug]: Sending request to 'https://keyfactor.keyexample.com/KeyfactorAgents/SshSync/1/Configure' with payload '{"SessionToken": "5451f7aa-4fd5-4bf5-a563-2e4f7bd3ed3f", "JobId": "b835bde8-8174-447a-b351-810e582148c0"}'
Tue Aug 11 18:07:45 UTC 2020 [Debug]: Configure Response for job with id 'b835bde8-8174-447a-b351-810e582148c0': {"Hostname":"appsrvr79.keyexample.com","InventoryCompleteEndpoint":"/SshSync/1/InventoryComplete", "Port":22,"AuditId":7642,"JobCancelled":false,"Result":{"Status":1,"Error":null}}
Tue Aug 11 18:07:46 UTC 2020 [Debug]: Using sshd_config file '/etc/ssh/sshd_config' on server 'appsrvr79.keyexample.com' for job with id 'b835bde8-8174-447a-b351-810e582148c0'
Tue Aug 11 18:07:46 UTC 2020 [Info]: Beginning local inventory job on server 'appsrvr79.keyexample.com' for job with id 'b835bde8-8174-447a-b351-810e582148c0'
Tue Aug 11 18:07:49 UTC 2020 [Debug]: Sending request to 'https://keyfactor.keyexample.com/KeyfactorAgents/SshSync/1/InventoryComplete' with payload '{"Status":2,"Results": [{ "user": "ginag", "lastlogon": "", "keys": [ "ssh-rsa AAAAB3NzaC1yc2EAAAA[truncated for display purposes]9M5vl6f Gina G. Gant" ] },{ "user": "zadams", "lastlogon": "", "keys": [] },{ "user": "svc_greenchicken", "lastlogon": "", "keys": [ "ssh-rsa AAAAB3NzaC1yc2EAAAAD[truncated for display purposes]vicWhZOd John W. Smith" ] }],"SessionToken": "5451f7aa-4fd5-4bf5-a563-2e4f7bd3ed3f","JobId": "b835bde8-8174-447a-b351-810e582148c0"}'
Tue Aug 11 18:07:49 UTC 2020 [Debug]: Inventory Complete Response for job with id 'b835bde8-8174-447a-b351-810e582148c0' on server 'appsrvr79.keyexample.com': {"SshDesiredState":[{"Username":"ginag","Keys":["ssh-rsa AAAAB3NzaC1yc2EAAAA[truncated for display purposes]9M5vl6f Gina G. Gant"]},{"Username":"zadams","Keys":[]},{"Username":"svc_greenchicken","Keys":["ssh-rsa AAAAB3NzaC1yc2EAAAAD[truncated for display purposes]vicWhZOd John W. Smith"]}],"Result":{"Status":1,"Error":null}}
Tue Aug 11 18:07:49 UTC 2020 [Info]: Enforcing publish policy on server 'appsrvr79.keyexample.com' for job with id 'b835bde8-8174-447a-b351-810e582148c0'
Tue Aug 11 18:07:52 UTC 2020 [Info]: Publishing logons on local server 'appsrvr79.keyexample.com' for job with id 'b835bde8-8174-447a-b351-810e582148c0'
Tue Aug 11 18:07:52 UTC 2020 [Info]: Published logons successfully on server 'appsrvr79.keyexample.com' for job 'b835bde8-8174-447a-b351-810e582148c0'
Tue Aug 11 18:07:52 UTC 2020 [Info]: Publishing keys on local server 'appsrvr79.keyexample.com' for job with id 'b835bde8-8174-447a-b351-810e582148c0'
Tue Aug 11 18:07:54 UTC 2020 [Info]: Published keys successfully on server 'appsrvr79.keyexample.com' for job 'b835bde8-8174-447a-b351-810e582148c0'
Tue Aug 11 18:07:54 UTC 2020 [Debug]: Sending request to 'https://keyfactor.keyexample.com/KeyfactorAgents/SshSync/1/Complete' with payload '{"SessionToken": "5451f7aa-4fd5-4bf5-a563-2e4f7bd3ed3f", "JobId": "b835bde8-8174-447a-b351-810e582148c0", "Status": 2}'
Tue Aug 11 18:07:54 UTC 2020 [Info]: Execution of 'b835bde8-8174-447a-b351-810e582148c0' on server 'appsrvr79.keyexample.com' complete.
During installation of the orchestrator, a local Linux user account should be created automatically as an identity under which the orchestrator service will operate. This allows the orchestrator to run as a non-root user. On servers on which you install the orchestrator directly, the following Linux user account is created:
keyfactor-bash
On servers configured as remote control targets, the following Linux user account is created:
keyfactor-bash-orchestrator-svc
You can validate that the user has been created and has the correct configuration by reviewing the /etc/passwd file.
In a command shell, output the content of the /etc/passwd file to the screen:
In the output from this command, look for the entry for the keyfactor-bash or keyfactor-bash-orchestrator-svc user. It will look similar to one of these:
keyfactor-bash-orchestrator-svc:x:112:65534::/opt/keyfactor-bash-orchestrator-client:/bin/bash
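As a sketch of this check, the whole file can be printed with cat, or the listing can be narrowed to the two account names mentioned above. The find_service_user helper is a hypothetical name; it accepts an optional file argument so that it can be tried against a copy of the file.

```shell
# Hypothetical helper: print the passwd entries for the orchestrator
# service accounts named in this section. The file defaults to
# /etc/passwd when no argument is given.
find_service_user() {
    grep -E '^(keyfactor-bash|keyfactor-bash-orchestrator-svc):' "${1:-/etc/passwd}"
}

# Usage:
#   cat /etc/passwd        # full listing
#   find_service_user      # only the orchestrator accounts
```

No output from find_service_user suggests the account was not created during installation.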
On the remote control target server, you should find an entry in the sshd_config file that directs the service account logon over to the install path for the client to find the authorized_keys file for the service account user, like so:
AuthorizedKeysFile /opt/keyfactor-bash-orchestrator-client/authorized_keys
On both the orchestrator and remote control target servers, you should find a file in the /etc/sudoers.d directory named for the service name of the orchestrator or remote control target user (keyfactor-bash or keyfactor-bash-orchestrator-svc) and containing a list of commands the orchestrator is allowed to execute as root. For example:
keyfactor-bash appsrvr79.keyexample.com = (root) NOPASSWD: /bin/cat
keyfactor-bash appsrvr79.keyexample.com = (root) NOPASSWD: /usr/bin/test
keyfactor-bash appsrvr79.keyexample.com = (root) NOPASSWD: /bin/rm
keyfactor-bash appsrvr79.keyexample.com = (root) NOPASSWD: /usr/bin/tee
keyfactor-bash appsrvr79.keyexample.com = (root) NOPASSWD: /bin/touch
keyfactor-bash appsrvr79.keyexample.com = (root) NOPASSWD: /bin/chmod
keyfactor-bash appsrvr79.keyexample.com = (root) NOPASSWD: /bin/chown
keyfactor-bash appsrvr79.keyexample.com = (root) NOPASSWD: /usr/bin/gpasswd
keyfactor-bash appsrvr79.keyexample.com = (root) NOPASSWD: /usr/sbin/usermod
keyfactor-bash appsrvr79.keyexample.com = (root) NOPASSWD: /bin/sed
keyfactor-bash appsrvr79.keyexample.com = (root) NOPASSWD: /usr/bin/flock
keyfactor-bash appsrvr79.keyexample.com = (root) NOPASSWD: /bin/mkdir
keyfactor-bash appsrvr79.keyexample.com = (root) NOPASSWD: /usr/sbin/adduser
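Because the entries list absolute paths, a command installed in a different location on a particular distribution will not be permitted. The following sketch reads a file in the format shown above and reports any listed command that is missing or not executable. The missing_sudo_commands helper is a hypothetical name, and the file name in the usage line assumes the standard /etc/sudoers.d directory and a file named for the service account.

```shell
# Hypothetical helper: read a sudoers fragment in the format shown above
# and print each permitted command that is missing or not executable.
#   $1 = sudoers fragment file
missing_sudo_commands() {
    # Extract the command path that follows "NOPASSWD:" on each line.
    sed -n 's/.*NOPASSWD:[[:space:]]*\([^[:space:],]*\).*/\1/p' "$1" |
    while read -r cmd; do
        [ -x "$cmd" ] || printf '%s\n' "$cmd"
    done
}

# Usage (as root, since sudoers files are not world-readable):
#   missing_sudo_commands /etc/sudoers.d/keyfactor-bash
```

No output means every listed command exists at the expected path.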
The orchestrator connects to the remote control targets it is managing using SSH with a public key pair. On the orchestrator, the key pair is stored in the .ssh directory under the directory where the orchestrator is installed. By default, this is:
Both the private key (id_rsa) and public key (id_rsa.pub) are found here.
In a command shell, output the content of the public key file to the screen:
On the remote control target, the public key of the key pair is stored in the authorized_keys file for the remote control target service account, which is found in the remote control install path. By default, this is:
/opt/keyfactor-bash-orchestrator-client
In a command shell, output the content of the authorized_keys file to the screen:
Compare the public key string from the remote control target authorized_keys file to the public key string from the orchestrator id_rsa.pub file. They should match exactly. If they do not match, the remote control target is not using the correct public key, which will cause connection attempts made to it from the orchestrator to fail.
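Comparing long key strings by eye is error-prone, so the comparison can be scripted. The sketch below assumes a copy of the orchestrator's id_rsa.pub has been placed on the remote control target (a public key is safe to copy). The key_is_authorized helper is a hypothetical name. Note that it compares only the key type and key body and ignores the trailing comment, which is slightly looser than an exact string match.

```shell
# Hypothetical helper: succeed if the key on the first line of a .pub
# file appears in an authorized_keys file. Only the key type and the
# base64 key body are compared; the trailing comment is ignored.
#   $1 = public key file, $2 = authorized_keys file
key_is_authorized() {
    key=$(awk 'NR == 1 { print $1, $2 }' "$1")
    awk -v want="$key" '
        {
            # Options may precede the key type, so test each adjacent
            # pair of fields on the line.
            for (i = 1; i < NF; i++)
                if ($i " " $(i + 1) == want) found = 1
        }
        END { exit !found }
    ' "$2"
}

# Usage (first path is a placeholder for the copied public key):
#   key_is_authorized /path/to/id_rsa.pub \
#       /opt/keyfactor-bash-orchestrator-client/authorized_keys \
#       && echo "keys match" || echo "keys differ"
```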
If the orchestrator is managing more than one server (remote control targets), it can be difficult to interpret the logs, because the orchestrator operates in a multi-threaded manner and log messages for jobs with different servers will be mixed together. Find a message related to the job you're interested in and look for the ID for that job. Then look for all other messages referencing that ID.
Look for error messages in the log. These should appear with the word Error in brackets just after the date like so:
Tue Aug 11 19:14:33 UTC 2020 [Error]: Error occurred during job with id 'b835bde8-8174-447a-b351-810e582148c0' on server 'appsrvr79.keyexample.com': An error occurred attempting to configure the job 'b835bde8-8174-447a-b351-810e582148c0'
This particular message doesn't tell you very much except that this job was unable to complete for some reason. If you look at the debug messages that appear immediately before and after the error message, they may provide more information.
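To see the surrounding messages without scrolling through the whole log, each error line can be printed together with a few lines of context. This is a sketch only: errors_with_context is a hypothetical helper name, the log path is a placeholder, and the -C option assumes a grep that supports context output (GNU and BSD grep both do).

```shell
# Hypothetical helper: show each [Error] line with N lines of context
# before and after it (default 5), prefixed with line numbers.
#   $1 = log file, $2 = optional number of context lines
errors_with_context() {
    grep -n -C "${2:-5}" -F '[Error]' "$1"
}

# Usage (placeholder path):
#   errors_with_context /path/to/orchestrator.log 10
```

In the output, matching lines are marked with a colon after the line number and context lines with a hyphen.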
This message indicates that the orchestrator was unable to make an SSH connection to the remote control target named in the message:
Mon Aug 10 23:36:10 UTC 2020 [Error]: Error occurred during job with id '3f04f552-05fd-4c90-b3b1-edeec70878bb' on server 'appsrvr80.ubuntu.keyexample.com': Unable to connect to 'appsrvr80.ubuntu.keyexample.com' on port '22' via SSH
This could happen for a number of reasons. Perhaps the hostname configured for the remote target is incorrect. Perhaps the public key on the remote target is incorrect. It can be helpful in this case to check the Linux syslog on the orchestrator for more context on the message. For example, this set of messages from the Linux syslog reveals that the public key on the target is invalid in some fashion:
Aug 11 13:03:04 appsrvr158 keyfactor-bash[29417]: Testing 'keyfactor-bash-orchestrator-svc' on server 'appsrvr80.keyexample.com' via SSH for job with id 'eeabd541-b9d2-46d2-a215-9cb99fed4adc'...
Aug 11 13:03:04 appsrvr158 keyfactor-bash-orchestrator.sh[932]: keyfactor-bash-orchestrator-svc@appsrvr80.keyexample.com: Permission denied (publickey).
Aug 11 13:03:30 appsrvr158 keyfactor-bash[29486]: Error occurred during job with id 'eeabd541-b9d2-46d2-a215-9cb99fed4adc' on server 'appsrvr80.keyexample.com': Unable to connect to 'appsrvr80.keyexample.com' on port '22' via SSH
For information on troubleshooting public key issues with remote control targets, see Validate Remote Control Target Public Key. For more information on troubleshooting remote control target issues in general, see Remote Control Target Logs. For information on what successful inventory and publish policy log messages look like, see Successful Inventory and Policy Publishing.
Unlike on the orchestrator itself, where you can enable debug logging to see a more detailed picture of what's going on when the orchestrator attempts to connect or run a job, the only logs available on a remote control target are the SSH logs. These show attempts by the orchestrator to make a remote connection into the target and then the commands the orchestrator runs from an SSH perspective. These logs are found in the Linux system log where SSH logs are consolidated. The name and location of this log vary by operating system, but it is often found in /var/log by default (auth.log or secure is common). A successful connection for inventory, or for inventory and policy publishing, generates a large number of entries, so the logs can be difficult to interpret.
In these logs you can check to see if the orchestrator is successfully making an SSH connection. If it isn't, you may see some messages that will help determine why it isn't. If it's successfully making the initial connection but then failing further along in the process, this log may also help reveal that. Perhaps one of the commands that the service account needs to run isn't in the expected path, for example.
When the orchestrator first connects to the remote control target, the log entries on the target will look something like:
Aug 11 17:36:51 appsrvr80 sshd[95543]: Accepted publickey for keyfactor-bash-orchestrator-svc from 10.4.3.158 port 47778 ssh2: RSA SHA256:u5zNB4UEoPNcax5p4fBbkkWaoiWq6AcEkA65XdzUkM4
Aug 11 17:36:51 appsrvr80 sshd[95543]: pam_unix(sshd:session): session opened for user keyfactor-bash-orchestrator-svc by (uid=0)
Aug 11 17:36:51 appsrvr80 systemd-logind[656]: New session 13019 of user keyfactor-bash-orchestrator-svc.
Aug 11 17:36:51 appsrvr80 systemd: pam_unix(systemd-user:session): session opened for user keyfactor-bash-orchestrator-svc by (uid=0)
Aug 11 17:36:51 appsrvr80 sudo: keyfactor-bash-orchestrator-svc : TTY=unknown ; PWD=/opt/keyfactor-bash-orchestrator-client ; USER=root ; COMMAND=/bin/cat /etc/ssh/sshd_config
An inventory of an authorized_keys file for a user will appear as a series of entries, something like:
Aug 11 18:11:28 appsrvr164 sudo: keyfactor-bash-orchestrator-svc : TTY=unknown ; PWD=/opt/keyfactor-bash-orchestrator-client ; USER=root ; COMMAND=/bin/test -f /home/jsmith/.ssh/authorized_keys Aug 11 18:11:28 appsrvr164 sudo: keyfactor-bash-orchestrator-svc : TTY=unknown ; PWD=/opt/keyfactor-bash-orchestrator-client ; USER=root ; COMMAND=/bin/ls -l /home/jsmith/.ssh/authorized_keys Aug 11 18:11:28 appsrvr164 sudo: keyfactor-bash-orchestrator-svc : TTY=unknown ; PWD=/opt/keyfactor-bash-orchestrator-client ; USER=root ; COMMAND=/bin/cat /home/jsmith/.ssh/authorized_keys
Removal of a rogue key A rogue key, in the context of Keyfactor Command, is an SSH public key that appears in an authorized_keys file on a server managed by the SSH orchestrator without authorization. on a remote control target under management (in inventory and publish policy mode) will appear as a series of entries where the authorized_keys file is removed, recreated and repopulated with any valid keys (none in this case), like:
Aug 12 09:01:24 appsrvr80 sudo: keyfactor-bash-orchestrator-svc : TTY=unknown ; PWD=/opt/keyfactor-bash-orchestrator-client ; USER=root ; COMMAND=/usr/bin/test -f /home/jsmith/.ssh/authorized_keys
Aug 12 09:01:24 appsrvr80 sudo: keyfactor-bash-orchestrator-svc : TTY=unknown ; PWD=/opt/keyfactor-bash-orchestrator-client ; USER=root ; COMMAND=/bin/rm /home/jsmith/.ssh/authorized_keys
Aug 12 09:01:25 appsrvr80 sudo: keyfactor-bash-orchestrator-svc : TTY=unknown ; PWD=/opt/keyfactor-bash-orchestrator-client ; USER=root ; COMMAND=/usr/bin/test -f /home/jsmith/.ssh/authorized_keys
Aug 12 09:01:25 appsrvr80 sudo: keyfactor-bash-orchestrator-svc : TTY=unknown ; PWD=/opt/keyfactor-bash-orchestrator-client ; USER=root ; COMMAND=/usr/bin/test -d /home/jsmith/.ssh
Aug 12 09:01:25 appsrvr80 sudo: keyfactor-bash-orchestrator-svc : TTY=unknown ; PWD=/opt/keyfactor-bash-orchestrator-client ; USER=root ; COMMAND=/usr/bin/touch /home/jsmith/.ssh/authorized_keys
Aug 12 09:01:25 appsrvr80 sudo: keyfactor-bash-orchestrator-svc : TTY=unknown ; PWD=/opt/keyfactor-bash-orchestrator-client ; USER=root ; COMMAND=/bin/chmod 640 /home/jsmith/.ssh/authorized_keys
Aug 12 09:01:25 appsrvr80 sudo: keyfactor-bash-orchestrator-svc : TTY=unknown ; PWD=/opt/keyfactor-bash-orchestrator-client ; USER=root ; COMMAND=/bin/chown jsmith: /home/jsmith/.ssh/authorized_keys
Aug 12 09:01:25 appsrvr80 sudo: keyfactor-bash-orchestrator-svc : TTY=unknown ; PWD=/opt/keyfactor-bash-orchestrator-client ; USER=root ; COMMAND=/usr/bin/flock /home/jsmith/.ssh/authorized_keys echo
Aug 12 09:01:25 appsrvr80 sudo: keyfactor-bash-orchestrator-svc : TTY=unknown ; PWD=/opt/keyfactor-bash-orchestrator-client ; USER=root ; COMMAND=/usr/bin/tee -a /home/jsmith/.ssh/authorized_keys
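On a busy target, the sudo entries for the orchestrator service account can be hard to pick out by eye. The following is a minimal Python sketch, not part of the Keyfactor tooling, that extracts the commands run by the service account from log lines in the syslog format shown in the examples above. The function name sudo_commands and the regular expression are illustrative assumptions; the log location and exact line format vary by distribution, so the pattern may need adjusting.

```python
import re

# Matches sudo entries in the syslog layout used in the examples above:
#   <Mon> <day> <HH:MM:SS> <host> sudo: <user> : ... ; COMMAND=<command>
SUDO_LINE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+sudo:\s+(?P<user>\S+)\s+:\s+"
    r".*?COMMAND=(?P<command>.+)$"
)


def sudo_commands(lines, user="keyfactor-bash-orchestrator-svc"):
    """Return (timestamp, host, command) tuples for sudo entries logged for `user`.

    Lines that are not sudo entries (sshd, systemd, and so on) are skipped.
    """
    results = []
    for line in lines:
        match = SUDO_LINE.match(line.strip())
        if match and match.group("user") == user:
            results.append(
                (match.group("timestamp"), match.group("host"), match.group("command"))
            )
    return results
```

Passing the lines of the system's authentication log to sudo_commands returns the commands in the order they were logged, which makes it easier to spot the point at which a sequence stops, for example if a command is not at the expected path.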
Below are some possible errors you might encounter, along with suggested troubleshooting tips or solutions.
Here are examples of very similar errors you might see when trying to connect to a target machine to inventory a certificate store or execute a management or discovery job on a certificate store:
An error occurred while sending the request.
Unable to connect to the remote server
A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond 192.168.12.12:443 (80131500)
Messages of this type are generally the result of the target server being inaccessible. This might happen if the server was turned off or in maintenance mode. Perhaps there is a network problem routing to that server. If the certificate store has never worked in Keyfactor Command, perhaps there is a typo in the server name configuration.
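To tell these causes apart, it can help to test name resolution and the TCP connection separately from the server running the orchestrator. The following is a minimal Python sketch using only the standard library; the function name check_target and the default port and timeout are assumptions chosen for illustration, and the result only reflects reachability from the machine where the script runs.

```python
import socket


def check_target(host, port=443, timeout=5.0):
    """Return a short diagnosis for a TCP connection attempt to host:port."""
    # Step 1: name resolution. A failure here often points to a typo in the
    # configured server name or to a DNS problem.
    try:
        socket.getaddrinfo(host, port)
    except socket.gaierror as exc:
        return f"name resolution failed: {exc}"

    # Step 2: TCP connection. A timeout suggests the server is down or a
    # firewall is dropping traffic; a refusal suggests nothing is listening.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except socket.timeout:
        return "timed out"
    except ConnectionRefusedError:
        return "connection refused"
    except OSError as exc:
        return f"connection failed: {exc}"
```

For the error shown above, calling check_target("192.168.12.12", 443) from the orchestrator server would indicate whether the target is reachable at all before any Keyfactor-specific configuration is examined.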
You may encounter this error when doing an inventory of an IIS certificate store:
This indicates that the certificate store you are inventorying contains more certificates (or, more precisely, a greater total number of bytes of certificate data) than IIS on the Keyfactor Command server is configured to accept. To resolve this, adjust the values on the IIS server that control the upload limits, such as maxAllowedContentLength. For more information, see the discussion of fine tuning SSL monitoring in Monitoring Network Scan Jobs with View Scan Details in the Keyfactor Command Reference Guide.
You may receive a 403.16 error while trying to authenticate an orchestrator to Keyfactor Command using certificate authentication. On the face of it, this error indicates that the chain for the certificate you're using to authenticate is not trusted by the Keyfactor Command server. First, check to be sure that your certificate is trusted by the Keyfactor Command server. But if your certificate is fully trusted and you're still getting this error, what then?
This error can indicate that the trusted root store on the Keyfactor Command server contains a certificate that is not a root certificate (for example, an intermediate certificate that was accidentally placed in the root store). To check this, open the Local Computer certificates MMC on the Keyfactor Command server, drill down to Certificates under Trusted Root Certification Authorities, and scan for any certificates where the Issued To does not match the Issued By. Remove any certificates you find like this.
Figure 601: Certificate Incorrectly in the Trusted Root Certificate Store
If you receive an error similar to the following:
This may indicate that the Keyfactor Universal Orchestrator was installed without the Microsoft Visual C++ Redistributable x64 required to manage certificates from remote Microsoft CAs (see System Requirements).
If you receive an error similar to the following (some portions of message removed for clarity):
2023-02-15 11:54:27.6600 Keyfactor.Orchestrators.JobEngine.SessionJobExecutor [Error] - Error in SessionManager: Unable to register session. The SSL connection could not be established, see inner exception. The remote certificate is invalid because of errors in the certificate chain: RevocationStatusUnknown, OfflineRevocation
This may indicate that the Keyfactor Universal Orchestrator cannot access the CRL(s) for the SSL certificate used to secure the Keyfactor Command server (see System Requirements).
To check this:
- Enable at least debug level logging (see Configure Logging for the Universal Orchestrator).
- Either wait for the orchestrator to attempt to register again, or restart the orchestrator service (see Start the Universal Orchestrator Service) to force an immediate attempt to register.
- Look in the logs for a log message similar to the following (referencing your Keyfactor Command server name):
2023-02-15 12:08:14.6076 Keyfactor.Orchestrators.Core.Http.KeyfactorHttpClient [Debug] - Sending request to 'https://keyfactor.keyexample.com/KeyfactorAgents/Session/Register'
- Visit the referenced URL (https://keyfactor.keyexample.com/KeyfactorAgents/Session/Register) in a browser on the orchestrator server. This should give you a response of:
The requested resource does not support http method 'GET'.
- In the browser, view details for the certificate (the exact method for this will vary depending on the browser) and check the CRL Distribution Points field in the certificate.
Figure 602: Find the Certificate for the Keyfactor Command Web Site
- In the same browser on the orchestrator server, attempt to browse to the URL for the CRL (assuming it's a URL).
- If the CRL downloads without error, then likely CRL access is not the issue. Open the CRL and check the Next update date to see if it's in the past (indicating the CRL is out of date).
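The CRL download step above can also be scripted, which is useful on servers without a browser. The following is a minimal Python sketch using only the standard library; the function name check_crl_url is an assumption for illustration. It only confirms that the CRL can be retrieved over HTTP from the orchestrator server and does not parse the CRL or check its Next update date. Proxy settings for the account running the script may also differ from those of the orchestrator service account.

```python
import urllib.error
import urllib.request


def check_crl_url(url, timeout=10.0):
    """Try to download the CRL at `url` and return (ok, detail)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read()
            return True, f"HTTP {response.status}, {len(body)} bytes"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 404.
        return False, f"HTTP error {exc.code}"
    except urllib.error.URLError as exc:
        # DNS failure, refused connection, or similar.
        return False, f"could not reach server: {exc.reason}"
    except OSError as exc:
        # For example, a timeout while reading the response body.
        return False, f"connection problem: {exc}"
```

Call check_crl_url with the URL copied from the CRL Distribution Points field of the certificate. A result of False suggests that CRL access from the orchestrator server is worth investigating further.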
- Test the connection from the orchestrator server to the remotely managed Windows server:
Test-NetConnection -ComputerName <target> -Port 5986
or
Test-NetConnection -ComputerName <target> -Port 5985
- Test a PowerShell session from the orchestrator server to the remotely managed server:
Enter-PSSession -ComputerName <target>
- On the remotely managed server, check which WinRM listeners are configured:
winrm enumerate winrm/config/listener
- Enable secure WinRM:
winrm quickconfig -transport:https
- Check the secure WinRM port certificate:
gci -path cert:\localmachine\my | ft -property thumbprint,subject,NotBefore,NotAfter