On-Prem Agent Troubleshooting
Installation Errors
These errors occur while running setup_agent.sh during initial setup or upgrade. For errors after the agent is running, see Diagnosing Common Issues below.
🔴 Docker service fails to start
There may be an issue with the Docker installation. To confirm:
- Run docker logs --follow
- If you see exec /bin/sh: operation not permitted, uninstall Docker and reinstall using the Docker convenience script
🔴 Docker or Podman will not install
The script can only auto-install Docker or Podman on select Linux distributions. Manually install Docker or Podman and rerun the script with the same options.
🔴 Podman repository error (Ubuntu earlier than 22.04)
Ubuntu versions earlier than 22.04 do not support Podman via the official repository. Manually install Docker or Podman and rerun the script.
🔴 yum-utils missing (RHEL)
RHEL-based distributions require yum-utils to install Docker. Manually install Docker and rerun the script.
🔴 loginctl not installed (Podman)
If loginctl is not installed, the script continues. This is not an issue for Docker-based containers, but for Podman the agent may stop running when the user session ends.
🔴 systemd not installed
Without systemctl, the Docker or Podman container may not restart automatically on boot.
🔴 Docker or Podman version too old
Upgrade Docker or Podman to meet the minimum version requirement, then rerun the script.
🔴 systemd directory missing (Podman)
Manually create the directory and rerun the script with the same options.
🔴 Podman container naming or systemd unit error
Clean up old containers, networks, and volumes: podman system prune
🔴 systemd daemon reload error (Podman)
Verify that systemd is correctly installed and functional.
🔴 Failed to pull agent image from ECR
Verify the version tag is correct at gallery.ecr.aws/moveworks/agent. If the VM cannot reach ECR due to firewall restrictions, see Fetching the Agent Image Without ECR in the Installation Guide.
Diagnosing Common Issues
This page covers the most common configuration and runtime issues encountered when running the Moveworks On-Prem Agent.
Configuring LDAP over SSL (LDAPS, Port 636)
Prerequisites
Determine whether you need a certificate file to connect to the LDAP server over SSL. If so, you will need a base64-encoded .pem certificate file (typically the root cert or full cert chain). Wildcard certs are not accepted — see Microsoft’s guidance.
The certificate file should look like this and live in the ./certs directory:
If the exported certificate is a .p7b file, convert it to .pem:
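A minimal sketch of the conversion, assuming the OpenSSL command-line tool is installed on the VM. The function name p7b_to_pem and the file names in the usage line are illustrative and not part of the agent tooling.

```shell
# Convert a PKCS#7 (.p7b) bundle into a PEM file holding every certificate in it.
# Usage: p7b_to_pem INPUT.p7b OUTPUT.pem [PEM|DER]
p7b_to_pem() {
    local in_file="$1" out_file="$2" in_format="${3:-PEM}"
    # -print_certs writes each certificate in the bundle as its own PEM block
    openssl pkcs7 -inform "$in_format" -print_certs -in "$in_file" -out "$out_file"
}
```

For example, p7b_to_pem mycert.p7b ./certs/mycert.pem. If OpenSSL reports that it cannot load the PKCS7 object, the export is probably binary, so pass DER as the third argument.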
Verifying the certificate is valid
The Active Directory fully qualified domain name of the domain controller should appear in either the CN of the Subject field or the DNS entry in the Subject Alternative Name extension.
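One way to inspect those fields, sketched here under the assumption that the certificate is a PEM file and OpenSSL is available. The helper name cert_names is hypothetical.

```shell
# Print the Subject line and any DNS entries from the Subject Alternative Name
# extension of a PEM certificate.
cert_names() {
    local cert_file="$1"
    openssl x509 -in "$cert_file" -noout -text | grep -E 'Subject:|DNS:'
}
```

Running cert_names ./certs/mycert.pem should list the domain controller's fully qualified domain name in at least one of the printed lines. If it appears in neither, the certificate was likely issued for a different host.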
Pulling the certificate directly from the server
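A possible approach, assuming the VM can reach the LDAP server on port 636 and OpenSSL is installed. The function name fetch_ldaps_cert and the hostname in the usage note are placeholders.

```shell
# Print the certificates presented by an LDAPS server as PEM blocks.
# Usage: fetch_ldaps_cert HOST [PORT]
fetch_ldaps_cert() {
    local host="$1" port="${2:-636}"
    # </dev/null closes stdin so s_client exits once the handshake completes
    openssl s_client -connect "${host}:${port}" -showcerts </dev/null 2>/dev/null |
        sed -n '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/p'
}
```

For example, fetch_ldaps_cert dc01.example.com > ./certs/mycert.pem. The output contains only what the server sends during the handshake, which often omits the root certificate, so compare it with the chain your AD team provides before relying on it.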
Applying the configuration
The recommended approach is to re-run the setup script and select the SSL option to supply the LDAP certificate path:
To apply manually, edit /home/moveworks/agent/conf/agent_config.yml with the agent stopped:
- Use the LDAP hostname, not IP address
- Set port: 636
- Set use_ssl: true
- Set the certificate path: path_to_cert: /home/moveworks/agent/certs/mycert.pem
Example — with certificate:
Example — with well-known CA (no cert needed):
Example — LDAP without SSL (port 389):
Example — self-signed certificate (TLS skip verify):
tls_skip_verify: true disables certificate verification. Do not use this in production.
Validating the LDAP Service Account
Via AD Explorer
AD Explorer lets you verify service account access to the LDAP server.
- Open AD Explorer and connect using the host, service account username, and password. You may need to connect to VPN first.
- Once connected, navigate to groups to validate access as needed.
Passwords containing special characters can cause authentication failures. If other steps check out, try resetting the password to one without special characters.
Via command line
Find the LDAP hostname:
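One common way to do this is a DNS SRV lookup, which Active Directory domains normally publish for their domain controllers. The sketch below assumes dig is installed (bind-utils or dnsutils). The function name find_ldap_hosts is illustrative.

```shell
# List the hosts advertised in the _ldap._tcp SRV records of a domain.
find_ldap_hosts() {
    local domain="$1"
    # SRV answers look like "0 100 389 dc01.example.com."; field 4 is the host
    dig +short SRV "_ldap._tcp.${domain}" | awk '{ sub(/\.$/, "", $4); print $4 }'
}
```

For example, find_ldap_hosts example.com prints one hostname per line. An empty result may simply mean the resolver on the VM cannot see the AD DNS zone.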
Test service account credentials:
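A minimal sketch using the OpenLDAP ldapsearch client, assuming LDAPS on port 636. The function name test_ldap_bind, the hostname, and the base DN in the usage note are placeholders for your own values.

```shell
# Attempt a simple bind over LDAPS and read only the base entry.
# Usage: test_ldap_bind HOST BIND_DN BASE_DN
test_ldap_bind() {
    local host="$1" bind_dn="$2" base_dn="$3"
    # -x: simple authentication; -W: prompt for the password instead of passing it
    # on the command line; -s base: read just the base entry, not the whole tree
    ldapsearch -x -H "ldaps://${host}:636" -D "$bind_dn" -W \
        -b "$base_dn" -s base '(objectClass=*)' dn
}
```

For example, test_ldap_bind dc01.example.com 'MOVEWORKS\svc_ad_moveworks' 'DC=example,DC=com'. A bind failure usually surfaces as result code 49 (invalid credentials). If the server certificate is not trusted by the VM, the OpenLDAP client can be pointed at a CA file through the LDAPTLS_CACERT environment variable.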
On Red Hat OS, install openldap first: yum install openldap*
Confirming the NETBIOS Domain for an AD Service Account
Some environments require the service account username in the format NETBIOS_DOMAIN\username (e.g. MOVEWORKS\svc_ad_moveworks).
To find the NETBIOS domain:
- Open Active Directory Users and Computers
- Click the Find Objects icon (folder with magnifying glass)
- Search for the service account and double-click it
- Select the Account tab — the value in User logon name is the NETBIOS domain
Update the agent config to use: NETBIOS_DOMAIN\username
Known Issues and Fixes
🔴 Moveworks access secret is invalid
Your org access secret is incorrect or has expired. Tokens expire if not used within 30 days of generation.
Generate a new secret in Moveworks Setup > Core Platform > On-Premise Agents > Generate Secret.
To validate a token manually:
🔴 Permission denied when starting the agent
The user running the script has a group ID that does not match the expected value. Ensure all agent files have read and write permissions for user ID 17540 / group ID 17540:
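A sketch of one way to apply this, assuming the agent lives in /home/moveworks/agent (the path used elsewhere on this page) and that you have sudo rights. The function name fix_agent_ownership is hypothetical.

```shell
# Give UID/GID 17540 ownership of the agent directory with read/write access.
fix_agent_ownership() {
    local agent_dir="${1:-/home/moveworks/agent}"
    # Capital X adds execute only on directories (and files already executable),
    # so directories stay traversable without marking config files executable
    sudo chown -R 17540:17540 "$agent_dir" &&
        sudo chmod -R u+rwX,g+rwX "$agent_dir"
}
```

Call fix_agent_ownership with no arguments to use the default path, or pass a different directory if the agent was installed elsewhere.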
🔴 Permission denied when accessing a certificate file
When copying a new cert into the certs directory via SCP, the new file may not inherit the correct permissions from the previous cert. Copy permissions from the old cert:
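A minimal sketch, assuming GNU coreutils (the --reference option is not available on every platform). The function name copy_cert_permissions and the file names in the usage note are illustrative.

```shell
# Make NEW_CERT carry the same owner, group, and mode as OLD_CERT.
copy_cert_permissions() {
    local old_cert="$1" new_cert="$2"
    chown --reference="$old_cert" "$new_cert" &&
        chmod --reference="$old_cert" "$new_cert"
}
```

For example, copy_cert_permissions certs/old.pem certs/new.pem. If the old certificate is owned by a different user, the chown step needs to run with sudo.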
🔴 Agent cannot communicate — proxy authentication required
The proxy is required for one system but must be bypassed for another. Add do_not_use_rest_proxy: true to the REST config entry that needs to bypass the proxy.
🔴 Agent cannot communicate with an on-prem system — Forbidden
Same cause as above. Add do_not_use_rest_proxy: true to the relevant REST config entry.
🔴 401 Unauthorized when authenticating to Moveworks
The access secret is invalid or expired (tokens expire after 30 days without use). Generate a new one in Moveworks Setup and reconfigure the agent:
Firewall tests:
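One way to check outbound access is to request the Moveworks endpoints with curl and look at whether any HTTP response comes back. The sketch below uses the US Commercial URLs named elsewhere on this page; other regions use different hosts. The function names are hypothetical.

```shell
# Print the HTTP status returned by a URL; a non-zero exit means no response at all.
check_endpoint() {
    local url="$1"
    curl -sS -o /dev/null -w '%{http_code}\n' --max-time 10 "$url"
}

# Probe the US Commercial auth and config endpoints.
check_moveworks_endpoints() {
    check_endpoint "https://agent.moveworks.com/api/v1/auth"
    check_endpoint "https://agent.moveworks.com/api/v1/config"
}
```

Any printed status code, even 401 or 404, shows that the connection passed the firewall. A curl timeout or connection error with no status code points to a network block, proxy issue, or DNS problem instead.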
🔴 Agent fails to start — “Operation Not Permitted”
The container user lacks executable permissions. Work with your sysadmin to resolve at the VM level. As a workaround, remove --security-opt=no-new-privileges from the start script.
🔴 I/O timeout in agent logs
The agent container cannot make an outbound connection to Moveworks. Common causes: firewall rules, DNS issues, or Docker networking.
Try running the agent in host network mode — see the Docker networking issue section below.
🔴 CA bundle certificate missing
CA bundle certs are not being recognized by the container.
- Find the CA bundle path: openssl version -d (typically /etc/pki/tls/certs/ca-bundle.crt)
- Mount the volume in the start script: -v /etc/pki/:/etc/pki/
- Reference it in agent_config.yml:
🔴 Docker networking issue — No Route to Host
- Restart the Docker daemon: sudo systemctl restart docker
- Use telnet to verify the server can reach the LDAP domain controller
- If networking is the issue, enable host network mode by adding --net=host to the start script
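Where telnet is not installed, the reachability check in the steps above can be sketched with bash's built-in /dev/tcp redirection. This assumes bash was built with network redirection support and that the timeout utility is present. The function name port_open and the hostname in the usage note are placeholders.

```shell
# Return 0 if a TCP connection to HOST:PORT succeeds within 5 seconds.
port_open() {
    local host="$1" port="$2"
    # Opening /dev/tcp/HOST/PORT makes bash attempt a TCP connection
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}
```

For example, port_open dc01.example.com 636 && echo reachable. Running the same check on the host and again inside the container helps separate a firewall problem from a Docker networking problem.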
🔴 Docker container shuts down on server restart
🔴 Docker log size error
Edit the start script and add --log-driver json-file.
🔴 Container in “Restarting” state, logs not generating
Likely a permissions issue on the config or certs directory:
Restart the agent. If the issue persists, run chmod 777 on all agent-related files.
🔴 x509: certificate signed by unknown authority — connecting to Moveworks API
The system cert used to connect to the Moveworks API is not from a known CA. Ask your admins to install a cert from a trusted certificate authority.
As a last resort, copy the system CA bundle (/etc/ssl/certs/ca-certificates.crt) to the agent’s /certs folder, mount it, and reference it:
Test with:
🔴 x509: certificate signed by unknown authority — connecting to LDAP
The cert used to connect to LDAP does not include the full chain. Ensure the .pem file contains all intermediate certificates, then add it to ldap_config:
🔴 No such file or directory: TLS cert path
The cert exists on the host but is not mounted into the container. Add the volume to the start script:
🔴 x509: certificate signed by unknown authority — connecting to an on-prem REST service
Use ca_cert_path in the REST config to point to the cert inside the container:
If the cert is expired, you can temporarily disable verification (not recommended for production):
🔴 Unknown certificate authority — wrong file format
If the cert is a .p7b file:
If the cert is a .cer file:
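A .cer file can be either binary (DER) or already PEM text with a different extension, so a conversion sketch needs to handle both. The helper below assumes OpenSSL is installed. The function name cer_to_pem is illustrative.

```shell
# Write a PEM copy of a .cer file, which may be DER (binary) or already PEM (text).
cer_to_pem() {
    local in_file="$1" out_file="$2"
    if grep -q -- '-----BEGIN CERTIFICATE-----' "$in_file"; then
        # Already PEM; only the file extension differs
        cp "$in_file" "$out_file"
    else
        openssl x509 -inform DER -in "$in_file" -out "$out_file"
    fi
}
```

For example, cer_to_pem mycert.cer ./certs/mycert.pem. A .p7b bundle needs a different command (openssl pkcs7 -print_certs), because it can hold several certificates.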
🔴 Certificate relies on legacy Common Name field
The certificate only has a CN with no SAN, which is deprecated. Set tls_skip_verify: true in the agent config as a workaround, or request a new cert with a SAN from your CA.
🔴 404 for config request
The config_url in agent_config.yml is incorrect. Verify it matches your region — for US Commercial: https://agent.moveworks.com/api/v1/config.
🔴 404 for authentication request
The auth_url is incorrect. Ensure there are no spaces or quotes around the value. For US Commercial: https://agent.moveworks.com/api/v1/auth.
🔴 Unable to read LDAP response packet — connection reset by peer
The LDAP server is denying the connection. Likely a bad cert or wrong port. Try port 389 as a workaround to rule out network issues, then work with your team to obtain the correct LDAPS cert.
🔴 TLS handshake error — first record does not look like a TLS handshake
Typically caused by a proxy configured for the wrong protocol (HTTP vs HTTPS). Check whether your HTTP_PROXY or HTTPS_PROXY setting matches the protocol the agent is using.
🔴 Container not visible in docker ps / podman ps after start
Check agent_config.yml for typos. Even a single indentation error will prevent the container from starting.
🔴 LDAP Result Code 200 — certificate valid for wrong host
The agent is connecting to a load-balanced domain that resolves to multiple servers, but the cert is issued for a specific server. Point directly to one server in agent_config.yml:
🔴 LDAP Result Code 200 — server misbehaving (DNS)
DNS resolver mismatch between host and container.
- Compare cat /etc/resolv.conf on the host and inside the container
- Stop and remove all containers:
- Restart Docker and start a new agent:
Alternatively, enable host networking mode in the start script.
🔴 Podman — user namespaces not enabled
On CentOS 7, run as root:
🔴 LDAP Result Code 200 — certificate expired
Check if the cert on the VM is expired:
Check if the LDAP host cert is expired:
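Both checks can be sketched with OpenSSL, assuming the certificate on the VM is a PEM file and the LDAP host accepts connections on port 636. The function names and the hostname in the usage note are hypothetical.

```shell
# Print the expiry date of a PEM certificate file.
# Exit status is non-zero if the certificate has already expired.
cert_file_expiry() {
    local cert_file="$1"
    openssl x509 -in "$cert_file" -noout -enddate
    # -checkend 0 fails when the certificate is no longer valid right now
    openssl x509 -in "$cert_file" -noout -checkend 0 >/dev/null
}

# Print the validity dates of the certificate presented by an LDAPS host.
ldaps_host_expiry() {
    local host="$1" port="${2:-636}"
    openssl s_client -connect "${host}:${port}" </dev/null 2>/dev/null |
        openssl x509 -noout -dates
}
```

For example, cert_file_expiry ./certs/mycert.pem and ldaps_host_expiry dc01.example.com. If the file on the VM is still valid but the host's certificate has been renewed, the file may also need replacing so that it matches the new chain.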
🔴 LDAP connection reset by peer
The port is likely closed. If using port 389, switch to port 636 with a cert and configure LDAPS.
🔴 LDAP Result Code 8 — Strong Auth Required
The server requires LDAPS (port 636 with a cert), but the agent is configured for plain LDAP (port 389). Reconfigure using LDAPS.
🔴 ITSM_DISCONNECTED — consistent timeouts for ticketing
- Check for WAFs throttling the connection between the agent and your ITSM
- Lower environments often have reduced CPU/storage — verify the timeout is not caused by resource limits on the ITSM instance itself
Common LDAP Error Codes
Debugging Tips
Check agent logs
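A small sketch for scanning recent log output, assuming the agent runs under Docker and you know the container name (docker ps lists it). The function name agent_log_errors and the keyword list are illustrative choices, not an official filter.

```shell
# Show recent agent log lines that mention common failure keywords.
# Usage: agent_log_errors CONTAINER [LINES]
agent_log_errors() {
    local container="$1" lines="${2:-200}"
    # docker logs replays the container's stderr on stderr, so merge both streams
    docker logs --tail "$lines" "$container" 2>&1 |
        grep -iE 'error|x509|timeout|denied|unauthorized'
}
```

On Podman hosts the same idea should work with podman logs in place of docker logs. If the filter returns nothing, read the unfiltered log, since not every failure contains one of these keywords.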
Useful Docker commands
Test connectivity
See Also
- Operations and Health — common commands, health monitoring, credential rotation
- Installation Guide — fresh install or reinstall steps