---
title: Moveworks Agent Troubleshooting Guide
excerpt: ''
deprecated: false
hidden: false
metadata:
  title: ''
  description: ''
  robots: index
next:
  description: ''
---
# **Common Procedures**

## Updating the Active Directory Password

1. Log into the agent VM and navigate to the **Moveworks** folder.
2. Within the Moveworks folder, enter `ls` to see its contents. You should see the conf, certs, and logs folders.
3. Change into the conf folder by entering `cd conf`.
4. Enter `ls` to see the contents of the conf folder. You should see a file named **agent_conf.yml**.
5. Enter `vi agent_conf.yml` to open the agent_conf file in a text editor.
   1. Try `vim agent_conf.yml` if `vi` does not work.
6. Type `a` to enter “insert”/edit mode, which allows you to edit the file.
7. Under **ldap_service_password**, change the text “encrypted_value” to just “value”. Keep the indentation the same. On the same line, remove the value after “encrypted_value:” and replace it with the new password.

   ![](https://files.readme.io/c73d03121f2d28e34b5dad1170199f4dac14cec399b56275232b1bd2e77e4a27-image.png)
8. Press the escape key to exit “insert”/edit mode.
9. Type `:` to enter command mode, then type `wq` and press Enter to save and exit the file.
10. Once the file is saved, change back to the top-level directory by entering `cd ..` twice. Enter `ls` and you should see the moveworks folder and the setup_agent.sh script.
11. Enter `./setup_agent.sh --stop` to stop any running agents.
    1. If using docker, use `sudo ./setup_agent.sh --stop`
12. Enter `./setup_agent.sh --start` to start a new agent.
    1. If using docker, use `sudo ./setup_agent.sh --start`
13. Now, navigate back to the conf folder by entering `cd moveworks` and then `cd conf`.
14. Enter `vi agent_conf.yml` to review the file. You should now see that “value” has been changed back to “encrypted_value” and that the new password is encrypted.
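
The edit in step 7 can also be done without opening an editor. The sketch below is one hedged way to do it, assuming the file contains a single `encrypted_value:` line and that the new password needs no YAML quoting. `set_plain_password` is a hypothetical helper name, not part of the agent tooling. It prints the edited file to standard output so the result can be reviewed before anything is overwritten.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the agent): prints a copy of the config
# in which the first "encrypted_value:" line becomes "value: <new password>",
# keeping the original indentation as step 7 requires.
set_plain_password() {
  conf_file="$1"
  # The password travels through the environment so that awk does not
  # interpret any backslashes it may contain.
  NEW_PASSWORD="$2" awk '
    !replaced && /^[ \t]*encrypted_value:/ {
      match($0, /^[ \t]*/)                      # RLENGTH = width of the indentation
      printf "%svalue: %s\n", substr($0, 1, RLENGTH), ENVIRON["NEW_PASSWORD"]
      replaced = 1
      next
    }
    { print }
  ' "$conf_file"
}
```

For example, `set_plain_password conf/agent_conf.yml 'NewPassword' > /tmp/agent_conf.yml` writes a reviewable copy. Steps 10 to 14 (restart the agent and confirm the value was re-encrypted) still apply.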
## Configuring LDAP over SSL aka LDAPS (636)

### **Prerequisites**

First, determine whether you need a certificate file to connect to the LDAP server over SSL. If so, you will need to supply a base64-encoded ASCII certificate file, which is typically a **.pem** server cert file. This certificate is used to verify that the LDAP server we are communicating with is who it says it is. The content of the cert should be a “cert chain” that looks like the example below, and the file should live in the `./certs` directory.

Wildcard certs are not accepted per Microsoft: <https://docs.microsoft.com/en-US/troubleshoot/windows-server/identity/enable-ldap-over-ssl-3rd-certification-authority>

The file should look like the example below and be a **.pem** server cert file.

```text PlainText
-----BEGIN CERTIFICATE-----
content of your domain certificate
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
content of any intermediate CA certificate
-----END CERTIFICATE-----
```

If the exported certificate is a `.p7b` file, you will need to use the following command to convert it from `.p7b` to `.pem` instead of simply renaming the file:

```shell bash
openssl pkcs7 -print_certs -in certnew.p7b -out ca_chain.pem
```

**More details here:** <https://knowledge.digicert.com/solution/SO21448.html>

### **Verifying the cert is valid:**

Run the following command to verify that the certificate is signed by a known certificate authority. A certificate that is signed only by the client will not work.
`openssl x509 -in <cert_file> -text -noout`

The Active Directory fully qualified domain name of the domain controller should appear in one of the following locations:

* The common name (CN) in the Subject field
* The Subject Alternative Name (SAN) extension in the DNS entry

### **Pulling the Cert on Your Own**

You may be able to pull the cert by connecting to the LDAPS server:

```shell shell
openssl s_client -connect <ldap_host>:<port> -showcerts </dev/null 2>/dev/null | openssl x509 -outform PEM > cert.pem
```

### **Configuration Changes/ Updates:**

**RECOMMENDED:** Run the setup script again, and choose the SSL option to supply the LDAP certificate path.

You can also make the following changes manually (after shutting down the agent).

**I. Modify** `/conf/agent_config.yml`

1. You must use the LDAP hostname, not an IP address.
2. Use `port: 636`
3. Add `use_ssl: true`
4. If using a cert that is not well-known, you must set the container path of the certificate. The directory path is static, as it refers to the path within the docker image. The certificate file name depends on what you’ve named it locally.
   * `path_to_cert: /home/moveworks/agent/certs/mycert.pem`

**II. Modify** `./start_agent.sh`

* Add this additional argument to the script. The first directory should be the absolute directory path on the local VM.
  * `-v "/home/svcmoveworks/certs:/home/moveworks/agent/certs" \`

### Example agent_config.yml with LDAPS enabled

**Example config when using a cert:**

```yaml
ldap_config:
  enabled: true
  host: ldaps.customer.com
  port: 636
  service_password:
    encrypted_value:
  service_user:
  use_ssl: true
  path_to_cert: /home/moveworks/agent/certs/mycert.pem
```

**Example config when the CA is well-known:**

**NOTE:** port # is set to **636** instead of 389, **use_ssl** is set to **true**, but no cert is supplied in the config

```yaml
ldap_config:
  enabled: true
  host: ldaps.customer.com
  port: 636
  service_password:
    encrypted_value:
  service_user:
  use_ssl: true
```

**Example config when switching to LDAP**

```yaml
ldap_config:
  enabled: true
  host: ldap.customer.com
  port: 389
  service_password:
    encrypted_value:
  service_user:
```

### TLS Skip Verify

Use for **self signed certificates**

```yaml
ldap_config:
  enabled: true
  host: ldaps.customer.com
  port: 636
  service_password:
    encrypted_value:
  service_user:
  use_ssl: true
  path_to_cert: /home/moveworks/agent/certs/mycert.pem
  tls_skip_verify: true
```

## Validating LDAP Service Account

### Verifying Access via AD Explorer

To test the access level of the LDAP service account, we can use AD Explorer (downloaded from [here](https://docs.microsoft.com/en-us/sysinternals/downloads/adexplorer)) to validate whether the service account has access to the LDAP server.

1. First, grab the host, service account username, and password.
2. Open AD Explorer, and connect to the LDAP host. Note that you may need to connect to VPN in order to connect successfully.
   1. ![](https://files.readme.io/cc74be5-image.png)
3. Once logged in successfully, you can navigate to groups and validate as needed.
   1. ![](https://files.readme.io/fc445a0-image.png)
4. Select one of the distribution groups. On the right side, you can see the full details.

Note: we've run into issues where a service password with a special character might cause problems.
If all else fails, you can try resetting the password to something without special characters.

### Verifying Access via Bash

1. To find the **LDAP host name or IP address**, you can run the command:

   ```shell
   nslookup -q=SRV _ldap._tcp.<domain>.com
   ```
2. For **Active Directory**, the LDAP service user is formatted as the **NetBIOS domain name**, a backslash, and then the service account name (e.g. WHS\svc_moveworks).
3. **To test and validate the service account credentials**, you can run these commands:
   1. Operation to request the user's authzid

      ```shell
      ldapwhoami -x -H <ldap_uri> -w <password> -D "<netbios_domain>\<service_account>"
      ```

      `e.g. - u:WHS/svc_moveworks`
   2. Operation to search for an object

      ```shell
      ldapsearch -x -H ldaps://<ldap_host>:636 -W -D "<netbios_domain>\<service_account>" -b "<base_dn>"
      ```

      **Note:** For **RedHat OS**, you will need to install _openldap to run the above commands_
      1. To install, run:

         ```shell
         yum install openldap*
         ```
   3. Another LDAPS search example (returns users):

      ```shell
      ldapsearch -x -LLL -H ldaps://ldap.custom.com:636 -D "username_here" -w "password_here" -b "dc=customer,dc=com" -s sub "(objectClass=user)" givenName
      ```
4. After completing the configuration process, an _**agent_config.yml**_ file will be generated in ./config. Please verify that the contents of _./config/agent_config.yml_ are as expected.
5. Run the following command to get the NAT Gateway IP that needs to be whitelisted on our end. When doing this, it is best to also confirm with the customer.
   1. If the VM is internet accessible, you can run the following to get the IP address:

      ```shell
      wget -qO- http://ipecho.net/plain ; echo
      ```
6. Whitelist IP
   1. Enter the IP in this file `infra/terraform/agent/prod/variables.t`

## Confirming NET_BIOS Domain for AD Service Account

1. Sometimes a NETBIOS_DOMAIN must appear in the username of the service account, in the format NETBIOS_DOMAIN\username
   1. **e.g:** `MOVEWORKS\svc_ad_moveworks`
2. To check, open ‘Active Directory Users and Computers’.
3. Click on the ‘Find Objects in Active Directory Domain Services’ icon on the toolbar.
   (It looks like a folder with a magnifying glass.)
4. Search for the service account.

   ![](https://files.readme.io/ae0b4e0-image.png)
5. Double-click on the account.
6. Select ‘Account’ in the pop-up. If there is a value in the ‘User logon name’ field, it should be used as the `netbios_domain`.

   ![](https://files.readme.io/70c5147-CleanShot_2024-05-16_at_16.21.422x.png)
7. Update the service account username in the config to `netbios_domain\\username`. For the example above, it would be **MOVEWORKS\svc_ad_moveworks**.

# **Known Issues and their Fixes**

### 🔴 config: Moveworks access secret is invalid

This is exactly what it sounds like: your secret is the problem.

![](https://files.readme.io/f28e5e7-image.png)

Sometimes, this means that the account is locked out. Look for the `data` error code in the log line.

**52e**: Invalid credentials. Password expiry could be one of the main reasons. Please reset the service account password on call, lock and unlock the account, and try again after updating the new credentials in the agent_config.yml file.

**775**: Account is locked out.

![](https://files.readme.io/e78f96a-image.png)

### 🔴 **Permission Denied when using start agent script**

```shell
open /home/moveworks/agent/conf/agent_config.yml: permission denied
```

This happens when you try to start the agent configuration script with a user whose group ID does not match what is expected in our script. Run through the steps again and ensure you use the correct group ID listed in our setup guide. We cannot use any other ID, since the agent is looking for a specific one.

### 🔴 **Permission Denied when accessing certificate file**

Often when installing updated certificate files, we SCP the new cert into the desired machine’s cert folder. However, this new cert does not have the same file permissions as the previous cert our agent was using. To quickly copy file permissions, we can use the `chmod --reference=reference_file file` command.
```shell
# navigate to your certs directory
cd /home/moveworks/certs

# view file permissions for each file in the directory
ls -ltr

# locate the file that you want to copy file permissions from (likely the old certificate)
# apply the file permissions from your old cert to your new cert
sudo chmod --reference=oldCERT.pem newCERT.pem
ls -l newCERT.pem
```

***

### 🔴 **Agent cannot communicate with Moveworks**

```text
bond server reply: rest_pool: Get \"https://{{customer_itsm}}/rest/servicedeskapi/servicedesk/1\": Proxy Authentication Required",
```

This happens when you have a proxy for one system _(e.g. AD)_, but your internal ITSM system or knowledge system needs the Agent to bypass the proxy. In such cases, please add the following line to the config of the system that needs to bypass the proxy (at the same level as `enabled: true`):

```yaml
do_not_use_rest_proxy: true
```

***

### 🔴 **Agent cannot communicate with customer’s system (e.g. Confluence)**

```text
bond server reply: rest_pool: Get \"https:///rest/api/content?limit=25&spaceKey=AHLP\": Forbidden
```

This happens when you have set up a proxy for outbound connections but don’t require a proxy for your internal system. In such cases, please add the following line to the config of the system that needs to bypass the proxy (at the same level as `enabled: true`):

```yaml
do_not_use_rest_proxy: true
```

***

### 🔴 **Failure when authenticating to Moveworks**

```text
retry: Received status code 401 while trying to authenticate (moveworks_user=ORGNAME)
```

1. _Invalid Access Secret (token expires if not used)_ - if the token is **not used within 30 days of generating**, it will expire. Please generate a new token in the **MW Setup Agent Page**.
**To validate whether a token works, you can use the following command:**

```curl
curl --header "Content-Type: application/json" --request POST --data '{"access_secret":"{ADD YOUR TOKEN HERE}", "access_key": "mykey"}' https://agent.moveworks.com/api/v1/auth
```

Config fail:

```curl
curl --header "Content-Type: application/json" --header "Authorization: " --request POST https://agent.moveworks.com/api/v1/config
```

**Firewall / Whitelisting tests:**

1. Run _**ping google.com**_ to see if outgoing traffic is allowed
2. Try to _**curl https://agent.moveworks.com/api/v1/web**_ to see if our agent service can be hit

***

### 🔴 **Agent fails to start with “Operation Not Permitted” Error**

![](https://files.readme.io/fdb982a-image.png)

**Issue:** This error is caused by a lack of executable permissions for the user that you are trying to run the container as. Work with your sys admin to get this resolved at the VM level. If that is not possible, removing this line from `start_agent.sh` will allow the container to start.

* `--security-opt=no-new-privileges`

***

### 🔴 **I/O Timeout in Agent Logs**

```bash
[ERROR] [2020-09-11T00:43:20Z] [moveworks/golang/utils/retry/retry.go:177] retry: Post "https://agent.moveworks.com/***********": dial tcp: lookup agent.moveworks.com on 10.255.255.6:53: read udp 172.17.0.2:47416->10.255.255.6:53: i/o timeout (moveworks_user=ORGNAME)
```

**Issue:** This error shows that the agent residing inside docker is unable to make an outbound connection to our agent system. This can have several causes, such as a firewall on your side, whitelisting on our side, or potentially a docker networking issue.
Try running the agent in `host network mode` [(see below **Docker Networking issue - ‘No Route to Host’ Error**)](/docs/moveworks-agent-troubleshooting#-docker-networking-issue---no-route-to-host-error)

***

### 🔴 **CA Bundle Certificate missing**

**Error:** `retry: Post " *******": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (moveworks_user=ORGNAME)`

**Issue:** CA Bundle certs or agent certs are not being recognized

**Potential Fix:**

1. Find where the CA bundle is hosted on the machine by running `openssl version -d`

   Sample output: `OPENSSLDIR: "/etc/pki/tls"`
2. Go to the path and find the CA bundle cert

   Sample path: `/etc/pki/tls/certs/ca-bundle.crt`
3. Mount the volume in the start_agent.sh
   1. Sample parameter: `-v /etc/pki/:/etc/pki/`
4. Map the path in `conf/agent_config.yml`

   Sample config:

   ```yaml
   moveworks_config:
     path_to_cert: /etc/pki/tls/certs/ca-bundle.crt
   ```

***

### 🔴 **Docker Networking issue - ‘No Route to Host’ Error**

1. Restart the docker daemon on the machine by running `sudo systemctl restart docker` and then bring up a new agent

If you run into a no route to host error, try using telnet on the server to see if the server can even access the LDAP Domain Controller:

If you see something like below, then there is a networking issue on the customer’s side

![](https://files.readme.io/fc3d214-image.png)

If not, then you can try the following steps as a workaround to use the host machine’s networking configuration.

1. Edit the `start_agent.sh` and add the `--net=host` flag to enable `host network mode`
   1. More information on that setting here: <https://stackoverflow.com/questions/43316376/what-does-net-host-option-in-docker-command-really-do>
2. Restart the agent
3. Note: The logs under the logs folder will no longer have the image ID in the title and will instead use the machine name.
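
If telnet is not installed on the server, the reachability check described above can be approximated with bash alone. This is a minimal sketch, assuming bash (for its `/dev/tcp` pseudo-device) and the coreutils `timeout` command are available; `can_reach` is a hypothetical helper name, and the 5-second limit is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 if a TCP connection to host:port opens within
# 5 seconds, and non-zero otherwise (connection refused, no route, or timeout).
can_reach() {
  # Inside `bash -c`, $0 is the host and $1 is the port passed after the script.
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}
```

For example, `can_reach ldaps.customer.com 636 && echo reachable || echo unreachable` run on the host tells you whether the problem already exists outside Docker. A failure there points to the network rather than to the container.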
***

### 🔴 **Docker container is shut down upon server restart**

* Run `sudo systemctl enable docker` after installing the agent to ensure docker is restarted whenever the server is.

**Docker complains about the log size constraints**

When running the config script you get the following error:

`docker: Error response from daemon: unknown log opt ‘max-size’/‘max-file’/etc for journald log driver`

Solution: Edit the config script in vim and add the following line: `--log-driver json-file`

***

### 🔴 **Docker ps says image is in ‘Restarting’ state, logs might not be generating**

This is likely a permission issue on either the config file or the certs file. When copying or creating files, the strictest permissions are applied. Run `chmod 777` on the config file and on the certificate file if you have one. Restart the Agent. If it is still an issue, run it on all of the agent-related files.

***

### 🔴 **Certificate Issues**

If root CA certs don’t work, try cat-ing them to see if they’re in **base64**.
It should look something like:

```text
-----BEGIN CERTIFICATE-----
MIIFbTCCA1WgAwIBAgIJAN338vEmMtLsMA0GCSqGSIb3DQEBCwUAME0xCzAJBgNV
BAYTAlVLMRMwEQYDVQQIDApUZXN0LVN0YXRlMRUwEwYDVQQKDAxHb2xhbmcgVGVz
dHMxEjAQBgNVBAMMCXRlc3QtZmlsZTAeFw0xNzAyMDEyMzUyMDhaFw0yNzAxMzAy
MzUyMDhaME0xCzAJBgNVBAYTAlVLMRMwEQYDVQQIDApUZXN0LVN0YXRlMRUwEwYD
VQQKDAxHb2xhbmcgVGVzdHMxEjAQBgNVBAMMCXRlc3QtZmlsZTCCAiIwDQYJKoZI
GKj0lGpnLfGqwhs2/s3jpY7+pcvVQxEpvVTId5byDxu1ujP4HjO/VTQ2P72rE8Ft
r05pE3PdHn9JrCl4iWdVlgtiI9BoPtQyDfa/OEFaScE8KYR8LxaAgdgp3zYncWls
BpwQ6Y/A2wIkhlD9eEp5Ib2hz7isXOs9UwjdriKqrBXqcIAE5M+YIk3+KAQKxAtd
4YsK3CSJ010uphr12YKqlScj4vuKFjuOtd5RyyMIxUG3lrrhAu2AzCeKCLdVgA8+
75FrYMApUdvcjp4uzbBoED4XRQlx9kdFHVbYgmE/+yddBYJM8u4YlgAL0hW2/D8p
z9JWIfxVmjJnBnXaKGBuiUyZ864A3PJndP6EMMo7TzS2CDnfCYuJjvI0KvDjFNmc
rQA04+qfMSEz3nmKhbbZu4eYLzlADhfH8tT4GMtXf71WLA5AUHGf2Y4+HIHTsmHG
vQ==
-----END CERTIFICATE-----
```

***

### 🔴 **x509: certificate signed by unknown authority when connecting to the Moveworks agent API**

If you see this error in the agent log when attempting to connect to [agent.moveworks.com](http://agent.moveworks.com), it is likely caused by not having a local cert signed by a known authority.

_Note: This error is not related to the certificate provided to connect to the customer’s on-premise servers (.pem file); it’s the system cert used to connect to our API._

```text
[ERROR] [2021-12-23T14:40:10Z] [moveworks/golang/utils/retry/retry.go:177] retry: Post "https://agent.moveworks.com/***********": x509: certificate signed by unknown authority (moveworks_user=ORGNAME)
```

You can verify this using this curl command. (If you run this command on your local machine, you should see the expected output.)

```curl
curl -vs https://agent.moveworks.com/api/v1/web
```

The output will show you the handshake process step-by-step:

```text
* Trying 54.149.4.60:443...
* TCP_NODELAY set
* Connected to agent.moveworks.com (54.149.4.60) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (OUT), TLS alert, unknown CA (560):
* SSL certificate problem: unable to get local issuer certificate
* Closing connection 0
```

Things to note:

* `CApath: /etc/ssl/certs` shows you where the certs are stored locally.
* `CAfile: /etc/ssl/certs/ca-certificates.crt` shows you the specific cert being used in this call.
* `TLSv1.3 (OUT), TLS alert, unknown CA (560):` means the cert is signed by an unknown authority.
* `SSL certificate problem: unable to get local issuer certificate` means we were unable to find an acceptable cert.

**Solution**: Ask the customer’s admins to install a cert from a known trusted certificate authority and reference it

**Last resort:** Copy the combined cert file (sometimes in `/etc/ssl/certificates/ca-certificates.crt`) to the agent’s `/cert` folder, mount it in `./start_agent.sh`, and then refer to it under `moveworks_config`

```shell
#!/bin/bash
docker run \
  -d \
  --read-only \
  --security-opt=no-new-privileges \
  --restart=unless-stopped \
  --log-driver=json-file \
  --log-opt max-size=10m \
  --log-opt max-file=5 \
  -v "$(pwd)/conf":/home/moveworks/agent/conf \
  -v "$(pwd)/logs":/var/log/moveworks \
  -v "$(pwd)/certs":/home/moveworks/agent/certs \
  moveworks_agent
```

```yaml
ldap_config:
  enabled: true
  host: #hostname here
  port: 636
  service_password: #password here
  service_user: #username here
  use_ssl: true
  path_to_cert: /home/moveworks/agent/certs/ca-certificates.crt # This is the cert used to connect to LDAP
moveworks_config:
  access_key: #orgname here
  access_secret: #org secret here
  auth_url: https://agent.moveworks.com/api/v1/auth
  config_url: https://agent.moveworks.com/api/v1/config
  proxy_url_enc: #proxy secret here
  path_to_cert: /home/moveworks/agent/certs/ca-certificates.crt # < -------- This is the cert used to connect to moveworks API
```

You can also test specific certs using curl to see if they work:

```bash
curl -vs --cacert {cert_path} https://agent.moveworks.com/api/v1/web
```

A successful connection looks like this:

```text
* Trying 54.149.4.60:443...
* TCP_NODELAY set
* Connected to agent.moveworks.com (54.149.4.60) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: CN=*.moveworks.com
*  start date: Jan 1 07:40:57 2022 GMT
*  expire date: Jan 15 07:40:57 2022 GMT
*  subjectAltName: host "agent.moveworks.com" matched cert's "*.moveworks.com"
*  issuer: C=US; ST=California; O=Zscaler Inc.; OU=Zscaler Inc.; CN=Zscaler Intermediate Root CA (zscalertwo.net) (t)
*  SSL certificate verify ok.
> GET /api/v1/web HTTP/1.1
> Host: agent.moveworks.com
> User-Agent: curl/7.68.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 401 Unauthorized
< Date: Tue, 04 Jan 2022 18:25:12 GMT
< Content-Type: text/plain; charset=utf-8
< Content-Length: 59
< Connection: keep-alive
< Set-Cookie: AWSALB=A/1VlVGMWS42786wbIPJjuulX0B4VWnBkO+lOo4QOhwlfQnUfwh57m86me/84Lavi909lCF2nKW9FezGKypi/vF+ns6a2fckfhAx8Z54Z1vwt6a8SHl8QCEDPXea; Expires=Tue, 11 Jan 2022 18:25:12 GMT; Path=/
< Set-Cookie: AWSALBCORS=A/1VlVGMWS42786wbIPJjuulX0B4VWnBkO+lOo4QOhwlfQnUfwh57m86me/84Lavi909lCF2nKW9FezGKypi/vF+ns6a2fckfhAx8Z54Z1vwt6a8SHl8QCEDPXea; Expires=Tue, 11 Jan 2022 18:25:12 GMT; Path=/; SameSite=None; Secure
< X-Content-Type-Options: nosniff
<
auth: token is not valid: you must provide a jwt to verify
* Connection #0 to host agent.moveworks.com left intact
```

***

### 🔴 **x509: certificate signed by unknown authority error when connecting to the LDAP service**

You will see an error in the agent log referring to an LDAP error code. This means the cert used to connect to LDAP is signed by an unknown authority. This may mean that you are using a custom CA and need the full cert chain for this to work. Double check that the cert `.pem` file being used has the whole chain.

```text
[ERROR] [2022-01-04T18:18:14Z] [moveworks/golang/utils/retry/retry.go:177] retry: DialURL: LDAP Result Code 200 "Network Error": x509: certificate signed by unknown authority (moveworks_user=ORGNAME)
```

Try pulling the cert on your own, then adding the cert to the `ldap_config`:

```yaml
ldap_config:
  enabled: true
  host: # host here
  port: 636
  service_password: # password
  service_user: # user
  use_ssl: true
  path_to_cert: /home/moveworks/agent/certs/cert.pem # <----- this is the ldap cert
```

***

### 🔴 **No such File or Directory: TLS config: TLS cert path: open /home/moveworks/agent/certs/cert.pem**

> _The cert wasn’t showing up in the container, it was only on the local machine and not in the container.
So adding a volume that maps the location on the local machine to the location in the container works._

This is solved by adding the volume `/home/moveworks/agent/certs` to `start_agent.sh`:

```
docker run \
  -d \
  --read-only \
  --security-opt=no-new-privileges \
  --restart=unless-stopped \
  --log-driver=json-file \
  --log-opt max-size=10m \
  --log-opt max-file=5 \
  -v "$(pwd)/conf":/home/moveworks/agent/conf \
  -v "$(pwd)/agent/certs":/home/moveworks/agent/certs \
  -v "$(pwd)/logs":/var/log/moveworks \
  moveworks_agent
```

***

### 🔴 **x509: certificate signed by unknown authority when connecting to an On-Prem REST API service (Confluence, Jira, BMC Remedy, Custom APIs, etc)**

**Self-signed certificate which is untrusted**

```text
bond server reply: rest_pool: Get "https://{{instance_name}}/rest/api/latest/issue/{{ticket_id}}\": x509: certificate signed by unknown authority
```

Use the `ca_cert_path` parameter to map the cert path. This cert path is the path on the **container**, not the **host machine**. Remember to check your cert mount path on the container to ensure that the cert is accessible to the container. Most of the time, this value will be: `ca_cert_path: /home/moveworks/agent/certs/myCert.crt`

**Certificate on the on-prem Jira instance is expired.**

```text
bond server reply: rest_pool: Post "https://{{instance_name}}/rest/servicedeskapi/request": x509: certificate has expired or is not yet valid
```

Any of the above x509 certificate errors could mean that the on-prem REST instance we are connecting to has invalid certs for HTTPS. You can allow the Agent to ignore self-signed certificate errors by adding the following config to the `agent_config.yml` file under `rest_configs`:

```yaml
tls_skip_verify: true
```

:warning: Using the config `tls_skip_verify: true` is **not recommended.**
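
Because `ca_cert_path` (like `path_to_cert`) must be the path inside the container, it can help to translate a host path through the `-v` mount before writing it into the config. The following is a minimal sketch, assuming the `host_dir:container_dir` mount syntax used in `start_agent.sh`; `to_container_path` is a hypothetical helper and not part of the agent scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given a docker -v mount spec and a file path on the
# host, print the path that the same file has inside the container.
# Returns non-zero if the file is not under the mounted host directory.
to_container_path() {
  mount_spec="$1"
  host_file="$2"
  host_dir="${mount_spec%%:*}"            # text before the first colon
  container_dir="${mount_spec#*:}"        # text after the first colon
  container_dir="${container_dir%%:*}"    # drop trailing options such as :Z or :ro
  case "$host_file" in
    "$host_dir"/*) printf '%s/%s\n' "$container_dir" "${host_file#"$host_dir"/}" ;;
    *) return 1 ;;
  esac
}
```

For example, `to_container_path "/home/svcmoveworks/certs:/home/moveworks/agent/certs" /home/svcmoveworks/certs/myCert.crt` prints the value to use for `ca_cert_path`. A non-zero exit means the cert sits outside the mounted directory and will not be visible to the container.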
### 🔴 **Unknown Certificate Authority**

If you see the following error: `Unknown Certificate Authority`, it could be because the certificate chain the customer exported is in a format other than `.pem`.

If it is a `.p7b` file, convert it from `.p7b` to `.pem` instead of simply renaming the file:

```shell
openssl pkcs7 -print_certs -in certnew.p7b -out ca_chain.pem
```

**Source:** <https://knowledge.digicert.com/solution/SO21448.html>

If it is a `.cer` file, convert it from `.cer` to `.pem` instead of simply renaming the file:

```shell
openssl x509 -inform der -in MTCAD-root.cer -out MTCAD-root.pem
```

**Source:** <https://www.sslshopper.com/article-most-common-openssl-commands.html>

***

### 🔴 **Certificate Error: Certificate Relies on Legacy Common Name Field**

`x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0`

This means that the certificate only includes a CN and there is no SAN. This type of cert has been deprecated. As a workaround, enable `tls_skip_verify: true` in the agent config.

***

### 🔴 **404 Not Found for config request**

```bash
[ERROR] [2021-12-07T02:09:03Z] [moveworks/golang/utils/retry/retry.go:177] retry: received non-200 response status status [404 Not Found] for config request (moveworks_user=ORGNAME)
```

This error can occur if the `config_url` is incorrect.

***

### 🔴 **Unable to read LDAP Response Packet: connection reset by peer**

**Solution:** This indicates that the LDAP server is denying the connection, which is likely due to a bad cert or port issues. In this case, try opening port 389 and connecting via LDAP as a workaround. This rules out network issues; from there, you can work with the relevant teams to pull an appropriate cert for the LDAPS connection.
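
Since a bad cert is one of the likely causes, a quick sanity check on the file you supplied can save time before changing ports. The hypothetical helper below only counts PEM certificate blocks; it does not validate them. A count of 0 suggests the file is not PEM at all (for example a renamed `.p7b` or a DER-encoded `.cer`), and a count of 1 suggests the intermediate CA may be missing from the chain.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints how many PEM certificate blocks a file contains.
# grep exits non-zero when the count is 0, which callers can use as a signal.
count_pem_certs() {
  grep -c -e '-----BEGIN CERTIFICATE-----' "$1"
}
```

For example, `count_pem_certs certs/mycert.pem` on a domain cert plus one intermediate should print 2.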
***

### 🔴 **TLS handshake error: first record does not look like a TLS handshake**

**Solution:** This typically happens when you are using an `HTTP_PROXY` or `HTTPS_PROXY` and the configured scheme does not match what the proxy supports: either the proxy supports HTTPS and you have configured HTTP, or it supports only HTTP and you have configured HTTPS.

***

### 🔴 **After starting the agent, the container does not show up when we run podman ps or docker ps**

**Solution:** Check the `agent_config.yml` file, as there might be typos.

***

### 🔴 **404 Not Found for authentication request**

```bash
[ERROR] [2021-1-18T02:09:01Z] [moveworks/golang/utils/retry/retry.go:177] retry: Received status code 404 while trying to authenticate (moveworks_user=ORGNAME)
```

This error occurs when the `auth_url` is incorrect in the config file on your server. The screenshot shows that the `auth_url` has a space and single quotes, which caused the error.

![](https://files.readme.io/89f2f56-image.png)

**SOLUTION:** Fix `auth_url` by replacing it with `https://agent.moveworks.com/api/v1/auth`, without quotes.

***

### 🔴 **LDAP Result Code 200 Certificate is Valid Error**

```bash
[ERROR] [2022-1-18T21:00:44Z] [moveworks/golang/utils/retry/retry.go:177] retry: DialURL: LDAP Result Code 200 "Network Error": x509: certificate is valid for NMH-DC01.ADNMHC.NMMC.com
[ERROR] [2022-1-18T21:00:55Z] [moveworks/golang/utils/retry/retry.go:177] retry: DialURL: LDAP Result Code 200 "Network Error": x509: certificate is valid for NMH-DC02.ADNMHC.NMMC.com
[ERROR] [2022-1-18T21:00:60Z] [moveworks/golang/utils/retry/retry.go:177] retry: DialURL: LDAP Result Code 200 "Network Error": x509: certificate is valid for NMH-DC03.ADNMHC.NMMC.com
```

This error occurs when we are pointing to a domain such as `abc.xyz.com` that is actually served by 3 LDAP servers, any of which can handle requests for that domain, and the certificate was issued for the individual server instances rather than for the domain. In this case, there were 3 servers associated with 1 domain.
**SOLUTION:** In the agent config on the customer's server, change the `host` from the domain (which could resolve to 3 different servers) to 1 specific server:

```yaml
ldap_config:
  host: abc.xyz.com
```

to:

```yaml
ldap_config:
  host: nmh-dc02.abc.xyz.com
```

***

### 🔴 **LDAP Result Code 200 Network Error “server misbehaving”**

```text
LDAP Result Code 200 "Network Error": dial tcp: lookup d1.d2.mw.com on 172.22.160.252:53: server misbehaving
```

Where `d1.d2.mw.com` is the LDAP host address configured in the `agent_config.yml` file.

**SOLUTION:**

1. Check `/etc/resolv.conf` on the host machine as well as in the Docker container to ensure the DNS resolver configurations are the same

   ```shell
   cat /etc/resolv.conf
   ```
2. Next, stop and remove all containers (in all states)

   ```
   docker stop $(docker ps -a -q)
   docker rm $(docker ps -a -q)
   ```
3. Restart the docker service and spin up a new agent container

   ```
   sudo systemctl restart docker
   ./start_agent.sh
   docker update --restart unless-stopped $(docker ps -q)
   ```

**ALTERNATIVE SOLUTION:** Enable host networking mode in the start agent script as indicated [earlier](/docs/moveworks-agent-troubleshooting#-docker-networking-issue---no-route-to-host-error).
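
For the resolver comparison in step 1 of the first solution, it is usually enough to compare the `nameserver` entries rather than the whole file. A small sketch, where `list_nameservers` is a hypothetical helper and `<container_id>` is a placeholder you would fill in from `docker ps`; it also assumes the container image provides `cat`:

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the nameserver addresses from a resolv.conf-style
# file, one per line. Reads standard input when no file is given.
list_nameservers() {
  awk '$1 == "nameserver" { print $2 }' "$@"
}
```

Run `list_nameservers /etc/resolv.conf` on the host and `docker exec <container_id> cat /etc/resolv.conf | list_nameservers` for the container. If the two lists differ, the container is not using the same DNS servers as the host.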
***

### 🔴 **Podman User Namespaces not enabled error**

`Podman run error in non-root mode: "user namespaces are not enabled in /proc/sys/user/max_user_namespaces"`

**SOLUTION:** Documented here: <https://github.com/containers/podman/issues/7704>

CentOS 7 requires running the following as root:

```bash
echo "user.max_user_namespaces=10000" > /etc/sysctl.d/42-rootless.conf
sysctl --system
```

***

### 🔴 **LDAP Result Code 200 Cert is Expired**

```text
[ERROR] [2022-01-20T02:56:06Z] [moveworks/golang/utils/retry/retry.go:177] retry: DialURL: LDAP Result Code 200 "Network Error": x509: certificate has expired or is not yet valid: current time 2022-01-20T02:56:06Z is after 2022-01-04T23:54:14Z (moveworks_user=ORGNAME)
```

This error can mean two different things, so both need to be checked:

1. **CERT ON THE VM IS EXPIRED:**

   **Use the following command to verify whether the cert is expired:**

   `openssl x509 -enddate -noout -in cert.pem`

   If the cert is a cert chain, you may need to split it into **2 or 3** `.pem` files and run the command for each individual part. This will help you narrow down which cert is expired.
2. **YOUR LDAP HOST CERT IS EXPIRED:**

   Get your cert and check the expiration date:

   👇 Replace `domain.host.com` with the host domain configured in `agent_config.yml`

   ```shell
   # reach out to the LDAP host and download the certificate
   openssl s_client -connect domain.host.com:636 -showcerts </dev/null 2>/dev/null | openssl x509 -outform PEM > cert.pem

   # use keytool to read the contents of the certificate
   keytool -printcert -v -file cert.pem
   ```

***

### 🔴 **LDAP connection reset by peer**

This implies the port is closed. If you are connecting to port 389, you will likely need to use port 636 with a cert and configure LDAPS.

***

### 🔴 **Common LDAP Error Codes**

Some common LDAP errors to help with troubleshooting when running the `ldapwhoami` or `ldapsearch` command.

| Error Code | Error | Description |
| ---------- | ----- | ----------- |
| 525 | User not found | Returned when an invalid username is supplied. |
| 52e | Invalid credentials | Returned when a valid username is supplied but an invalid password/credential is supplied. If this error is received, it will prevent most other errors from being displayed. |
| 530 | Not permitted to logon at this time | Returned when a valid username and password/credential are supplied during times when login is restricted. |
| 531 | Not permitted to logon from this workstation | Returned when a valid username and password/credential are supplied, but the user is restricted from using the workstation where the login was attempted. |
| 532 | Password expired | Returned when a valid username is supplied, and the supplied password is valid but expired. |
| 533 | Account disabled | Returned when a valid username and password/credential are supplied but the account has been disabled. |
| 701 | Account expired | Returned when a valid username and password/credential are supplied but the account has expired. |
| 773 | User must reset password | Returned when a valid username and password/credential are supplied, but the user must change their password immediately (before logging in for the first time, or after the password was reset by an administrator). |
| 775 | Account locked out | Returned when a valid username is supplied, but the account is locked out. Note that this error will be returned regardless of whether or not the password is invalid. |

***

### 🔴 **Permission Denied for config file**

When starting the config script, you may run into:

```bash
open /home/moveworks/agent/conf/agent_config.yml: permission denied
```

This is what the directory permissions for the agent should look like after running the setup guide. Make sure the group ID is `17540`.

![](https://files.readme.io/456bad8-image.png)

SELinux can also cause this error; see [Using Docker volumes on SELinux-enabled servers](https://prefetch.net/blog/2017/09/30/using-docker-volumes-on-selinux-enabled-servers/#:~:text=To%20allow%20a%20docker%20container,Z%20to%20the%20volume%20mount).

***

### 🔴 **Failing to connect to [agent.moveworks.com](http://agent.moveworks.com) TLS timing out on client hello**

This is possibly related to the **MTU** (maximum transmission unit) being too small.

<https://linuxhint.com/how-to-change-mtu-size-in-linux/>

```shell
# show the current MTU of each interface
ifconfig | grep mtu
# set a new MTU; <interface> and <size> are placeholders
ifconfig <interface> mtu <size> up
```

***

### 🔴 **No Such File or Directory for `start_agent.sh`**

If you see the following error:

```text
open /home/moveworks/agent/scripts/start_agent.sh: no such file or directory
```

This might mean that the command used to run the configuration script is incorrect (check for stray newline characters).

***

### 🔴 **[401 Forbidden] when trying to pull Agent image onto the machine**

When using `wget` to pull the agent image from the S3 URL, you may get this error. This means the VM may not have outbound network connectivity enabled.

***

### 🔴 **LDAP Result Code 200 “Network Error”: dial tcp:**

The error will reference the name of the server that is erroring out. Verify that the FQDN of the LDAP server is correct as specified in the `agent_config.yml` file. It’s possible the server was renamed or decommissioned.
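
The FQDN check above can be sketched as a small script. This is an illustration only: the function names are hypothetical, it assumes `host` is an unquoted value indented under a top-level `ldap_config:` key (as in the YAML example earlier in this guide), and it relies on `getent`, which is available on typical Linux hosts.

```shell
# Hypothetical helpers; not part of the agent scripts.

# Print the first `host:` value found under the top-level `ldap_config:` key.
ldap_host_from_config() {
  awk '
    /^ldap_config:/ { in_ldap = 1; next }   # entered the ldap_config block
    /^[^ \t#]/      { in_ldap = 0 }         # any other top-level key ends it
    in_ldap && $1 == "host:" { print $2; exit }
  ' "$1"
}

# Report whether the configured LDAP host resolves through the system resolver.
check_ldap_host() {
  host=$(ldap_host_from_config "$1")
  if [ -z "$host" ]; then
    echo "no ldap_config host found in $1"
    return 2
  fi
  if getent hosts "$host" > /dev/null; then
    echo "$host resolves"
  else
    echo "$host does not resolve"
    return 1
  fi
}
```

For example, `check_ldap_host /home/moveworks/agent/conf/agent_config.yml` prints whether the configured host resolves. A host that no longer resolves is consistent with a renamed or decommissioned server.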

***

### 🔴 **LDAP Result Code 8 “Strong Auth Required”**

Typically, this means you are attempting to use port 389 with LDAP, but the server expects LDAPS with port 636 and a certificate. You will need to reconfigure the agent with a cert using LDAPS.

See this Stack Overflow post for more info: <https://stackoverflow.com/questions/24385929/stronger-authentication-required>

***

### 🔴 **`moveworks/agent/certs/agent_key.pem` permission denied**

The key doesn’t exist in this case because it couldn’t be written there. The permissions of the directory need to be updated. Try running `sudo chown -R 17540:17540 .` in the root of the agent directory.

***

### 🔴 **ITSM_DISCONNECTED error message**

If your configuration is leveraging an on-prem agent for ticketing (ServiceNow, Jira, etc.) and you are seeing consistent timeouts within the bot when opening/closing tickets, then please check the following:

* There are no WAFs that are bottlenecking the connection from the agent to your ITSM system
* Typically, lower environment ITSM systems have lower CPU/storage due to low traffic. Please verify that the timeout is not caused by throttling of the ITSM system. You can increase the CPU/storage to resolve this.

***

# **Handy Docker Commands when Debugging**

1. Tail docker logs and ensure no startup errors
   1. Follow the logs: `docker logs -f containerName`
2. Tail INFO logs outputted by agent
   1. Navigate to the logs directory
   2. `tail -f *.INFO.log`
3. Access the command line inside the docker container: `docker exec -it $(docker ps --format '{{.Names}}') /bin/bash`
4. Kill all running docker containers: `docker kill $(docker ps -q)`
5. Remove all stopped docker containers: `docker rm $(docker ps -aq)`
6. Remove all docker images (note: requires you to [reload the agent image](https://www.notion.so/Agent-Setup-Guide-Internal-DEPRECATED-a1ce380d137043278ef44c07de845cc9?pvs=21)): `docker rmi $(docker images -q)`
   1. **Warning:** This removes all docker images; don’t do this if the customer has other docker images loaded on this machine.
   2. Instead do `docker rmi {docker image id here}`
      1. Get the image ID from `docker images`
      2. Verify the agent build image is the date expected

## Other Tips

* For terminal commands on the host machine, you may need to run `sudo /bin/bash` before performing any commands.
* If scripts don't work on the host machine, try `cat