Edge Agent Troubleshooting
This guide covers common issues with the Edge Agent and how to resolve them.
Agent Not Starting
Symptom: The agent service fails to start or crashes immediately.
Check the service status:
sudo systemctl status atlasai-agent
Check the agent logs:
sudo journalctl -u atlasai-agent --no-pager -n 50
Common causes:
| Issue | Solution |
|---|---|
| Configuration file missing | Run the installer again or create /etc/atlasai/agent.yaml manually |
| Invalid YAML syntax | Validate with atlas-agent validate-config |
| Port already in use | Check if another process is using the metrics port (9100) |
| Permission denied | Ensure the agent runs as root or a user with appropriate permissions |
| Binary not found | Verify /usr/local/bin/atlas-agent exists and is executable |
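The two file-related rows in the table above can be checked in one pass. This is a minimal sketch, not part of the agent tooling: preflight is a hypothetical helper name, and its default paths are the ones mentioned in this guide.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the agent): checks that the config
# file exists and that the binary is executable. Paths default to the ones
# named in this guide and can be overridden as arguments.
preflight() {
  local config="${1:-/etc/atlasai/agent.yaml}"
  local binary="${2:-/usr/local/bin/atlas-agent}"
  local rc=0

  if [ ! -f "$config" ]; then
    echo "missing config: $config"
    rc=1
  fi
  if [ ! -x "$binary" ]; then
    echo "binary missing or not executable: $binary"
    rc=1
  fi
  if [ "$rc" -eq 0 ]; then
    echo "preflight ok"
  fi
  return "$rc"
}

# Usage: preflight
# For the port row, a listener on the metrics port can be listed with:
#   sudo ss -ltnp 'sport = :9100'
```

The function only reports what is missing; it does not attempt to repair anything, so it is safe to run repeatedly.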
Agent Not Connecting to Tenant Plane
Symptom: Agent is running but shows as “Disconnected” in the UI.
Diagnose connectivity:
curl -v https://<tenant-plane-url>:8443/api/health
Common causes:
| Issue | Solution |
|---|---|
| Wrong Tenant URL | Verify tenant.url in /etc/atlasai/agent.yaml |
| Invalid API key | Regenerate the API key in Settings → Edge Agents |
| Firewall blocking | Ensure outbound access to the Tenant Plane on port 8443 |
| TLS certificate error | Install the CA cert. For debugging only, set tls_skip_verify: true temporarily, then revert it, since it disables certificate verification |
| Proxy required | Set HTTPS_PROXY environment variable in the systemd service file |
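The exit code of the curl command above often narrows down which row of the table applies. The sketch below maps curl's documented exit codes to likely causes. The function name explain_curl_exit is hypothetical, and the mapping to table rows is a heuristic rather than a definitive diagnosis.

```shell
#!/usr/bin/env bash
# Hypothetical helper: maps a curl exit code to the most likely cause from
# the table above. The exit codes are curl's documented values; pairing
# them with specific causes is a heuristic.
explain_curl_exit() {
  case "$1" in
    0)     echo "reachable: check the API key and tenant.url" ;;
    5)     echo "proxy hostname could not be resolved: check HTTPS_PROXY" ;;
    6)     echo "DNS failure: check tenant.url" ;;
    7)     echo "connection refused or blocked: check firewall rules for port 8443" ;;
    28)    echo "timed out: check firewall or proxy requirements" ;;
    35|60) echo "TLS problem: install the CA cert" ;;
    *)     echo "unrecognized curl exit code $1: see the curl manual" ;;
  esac
}

# Usage (replace the placeholder with your Tenant Plane host):
#   curl -sS -o /dev/null --max-time 10 "https://<tenant-plane-url>:8443/api/health"
#   explain_curl_exit $?
```

An exit code of 0 only shows that the endpoint answered; an invalid API key would still leave the agent disconnected.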
To add proxy settings:
sudo systemctl edit atlasai-agent
Add:
[Service]
Environment="HTTPS_PROXY=http://proxy.corp.example.com:8080"
Then restart:
sudo systemctl daemon-reload
sudo systemctl restart atlasai-agent
High Resource Usage
Symptom: Agent consuming more CPU or memory than expected.
Normal resource usage:
| Resource | Expected |
|---|---|
| CPU | < 1% average, < 5% during collection bursts |
| Memory | 30–60 MB RSS |
| Disk I/O | Minimal (log file reads + buffer writes) |
If usage is higher:
- Reduce collection frequency: increase the collector intervals in agent.yaml:
    collectors:
      system:
        interval: 30s
      process:
        interval: 60s
- Limit log paths: narrow logs.paths to only the files you need
- Reduce process count: lower process.top_n from 20 to 10
- Enable debug logging temporarily: set logging.level: debug to identify which collector is consuming resources, then revert
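Before changing any settings, it helps to confirm that memory usage is actually outside the expected range. The following sketch compares the agent's resident set size against the 60 MB upper bound from the table above. Both function names are hypothetical, and the sketch assumes the service's main process is the one to measure and that 1 MB is 1024 KiB.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classifies an RSS value given in KiB (the unit that
# `ps -o rss=` prints) against the 60 MB upper bound from the table above.
rss_status() {
  local rss_kib="$1"
  local limit_kib=$((60 * 1024))
  if [ "$rss_kib" -le "$limit_kib" ]; then
    echo "ok: $((rss_kib / 1024)) MB RSS"
  else
    echo "high: $((rss_kib / 1024)) MB RSS"
  fi
}

# Hypothetical helper: reads the service's main PID from systemd and
# reports the current RSS of that process.
agent_rss_status() {
  local pid
  pid=$(systemctl show -p MainPID --value atlasai-agent) || return 1
  # systemd reports MainPID=0 when the service is not running.
  [ "$pid" -gt 0 ] 2>/dev/null || { echo "agent is not running"; return 1; }
  rss_status "$(ps -o rss= -p "$pid" | tr -d ' ')"
}

# Usage: agent_rss_status
```

A single reading can land on a collection burst, so take a few samples some seconds apart before concluding that usage is high.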
Missing Metrics
Symptom: Some expected metrics are not appearing in the Tenant Plane.
Verify collectors are enabled:
atlas-agent status
This shows the state of each collector (running, stopped, error) and the last collection timestamp.
Common causes:
| Issue | Solution |
|---|---|
| Collector disabled | Enable the collector in agent.yaml |
| Permission denied | The agent needs read access to /proc, /sys, and log files |
| Mount point not listed | Add the mount point to disk.mount_points or leave empty for all |
| Network interface not detected | Verify interface names match what the OS reports via ip link |
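The permission and interface rows in the table above can be checked from a shell. This is a rough sketch with hypothetical helper names. It tests access for the user running the script, so run it as the user the agent runs as for the result to be meaningful. It also assumes a Linux host, where each network interface appears as an entry under /sys/class/net.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints every given path that the current user cannot
# read, and fails if any path is unreadable.
unreadable_paths() {
  local rc=0 p
  for p in "$@"; do
    if [ ! -r "$p" ]; then
      echo "not readable: $p"
      rc=1
    fi
  done
  return "$rc"
}

# Hypothetical helper: succeeds if the named interface is known to the
# kernel. The second argument overrides the sysfs directory (for testing).
iface_exists() {
  local name="$1"
  local sysfs="${2:-/sys/class/net}"
  [ -e "$sysfs/$name" ]
}

# Usage:
#   unreadable_paths /proc/stat /sys/class/net /var/log/syslog
#   iface_exists eth0 || echo "eth0 is not reported by the OS"
```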
Log Forwarding Issues
Symptom: Logs not appearing in the AtlasAI Logs module.
Check log collector status:
atlas-agent status --collector logs
Common causes:
| Issue | Solution |
|---|---|
| File not matching glob | Verify the file path matches a pattern in logs.paths |
| File permissions | Agent needs read access to the log files |
| File excluded | Check logs.exclude_paths |
| Multiline misconfigured | Verify multiline.pattern matches your log format |
| Buffer full | If the agent was offline, the buffer may be full — check transport.buffer_size |
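A quick way to test the first row of the table above is to match a file path against a pattern in the shell. The sketch below uses shell pattern rules, which may differ from the agent's own glob implementation. In particular, a shell * also matches /, which many glob libraries do not allow. Treat a match here as a hint, not a guarantee. The helper name matches_glob is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the path matches the pattern under shell
# pattern-matching rules. The agent's glob rules may differ.
matches_glob() {
  local path="$1" pattern="$2"
  # $pattern is deliberately unquoted so that case treats it as a pattern
  # rather than as a literal string.
  case "$path" in
    $pattern) return 0 ;;
    *)        return 1 ;;
  esac
}

# Usage:
#   matches_glob /var/log/app/error.log '/var/log/app/*.log' && echo "matches"
```

Quote the pattern when calling the function so that the shell does not expand it against the filesystem first.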
Runbook Execution Failures
Symptom: Runbook steps fail when executed on the agent.
Check execution logs:
sudo journalctl -u atlasai-agent -g "runbook" --no-pager -n 50
Common causes:
| Issue | Solution |
|---|---|
| Command blocked | Check runbook_executor.blocked_commands in the config |
| Timeout | Increase runbook_executor.timeout for long-running commands |
| Permission denied | The agent’s runbook executor runs as the agent user — verify permissions |
| Executor disabled | Set runbook_executor.enabled: true |
Getting Help
If these steps don’t resolve your issue:
- Collect a diagnostic bundle:
  atlas-agent diagnostics --output /tmp/atlas-diag.tar.gz
- Open a support ticket at support.atlastechlab.com
- Attach the diagnostic bundle to the ticket
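Before attaching the bundle, it can be worth confirming that the archive was written completely. The following is a small sketch, assuming the bundle is a gzip-compressed tar archive as its file extension suggests. The helper name bundle_ok is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the file is non-empty and can be listed
# as a gzip-compressed tar archive. Defaults to the path used in this guide.
bundle_ok() {
  local bundle="${1:-/tmp/atlas-diag.tar.gz}"
  [ -s "$bundle" ] && tar -tzf "$bundle" > /dev/null 2>&1
}

# Usage:
#   bundle_ok && echo "bundle looks complete"
#   tar -tzf /tmp/atlas-diag.tar.gz   # review the file list before sending
```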