Troubleshooting
Fix agent offline errors, SSL provisioning failures, DNS misconfigurations, WAF blocks, and HAProxy issues. Diagnostic steps and solutions for every problem.
This page covers every common issue you may encounter with Lumos Gate and how to resolve it. Each section includes diagnostic steps, root causes, and solutions.
1. Agent Shows Offline in Dashboard
The most common issue. The agent appears as "Offline" in the dashboard even though the VPS is running.
Check Agent Service Status
systemctl status lumos-agent
If the service is not running, start it:
systemctl start lumos-agent
If it fails to start, check the logs:
journalctl -u lumos-agent -n 50 --no-pager
Verify the Connection Token
The agent authenticates with the WS server using the token provided during agent installation. If the token is incorrect, was regenerated, or the server was decommissioned and recreated, the agent cannot connect.
# Check for auth errors in the logs
journalctl -u lumos-agent | grep -i "auth\|token\|unauthorized\|401"
Solution: If the token is invalid, decommission the server in the dashboard, create a new one, and reinstall the agent with the new token:
curl -fsSL https://get.lumosgate.com/install | LUMOS_TOKEN=NEW_TOKEN LUMOS_FORCE=1 bash
Check Outbound Connectivity
The agent makes an outbound WebSocket (WSS) connection to the Lumos WS server on port 443. Ensure your VPS firewall allows outbound HTTPS connections.
# Test basic connectivity to the WS server
curl -v https://lumosgate.com/health
# Test WebSocket upgrade (should return 101 or connection upgrade headers)
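# --http1.1 and the Sec-WebSocket-* headers are required for a valid handshake
curl -v --http1.1 -H "Connection: Upgrade" -H "Upgrade: websocket" -H "Sec-WebSocket-Version: 13" -H "Sec-WebSocket-Key: $(openssl rand -base64 16)" https://lumosgate.com/ws
If outbound connections are blocked, check your VPS firewall rules: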
# Check iptables rules
iptables -L -n
# Check UFW status (if using UFW)
ufw status
# Ensure outbound HTTPS is allowed
ufw allow out 443/tcp
Check for Multiple Agent Instances
Ensure only one instance of the agent is running:
ps aux | grep lumos-agent
If multiple instances are running, stop them all and restart the service:
systemctl stop lumos-agent
pkill -f lumos-agent
systemctl start lumos-agent
Check DNS Resolution from the VPS
The agent needs to resolve the Lumos WS server hostname:
# Verify DNS resolution works
dig lumosgate.com +short
nslookup lumosgate.com
If DNS resolution fails, check /etc/resolv.conf and ensure a valid nameserver is configured. You can temporarily add a public DNS:
echo "nameserver 8.8.8.8" >> /etc/resolv.conf
Agent Crashed or OOM Killed
Check if the agent was killed by the kernel's OOM killer:
dmesg | grep -i "oom\|lumos\|killed"
journalctl -u lumos-agent | grep -i "signal\|kill\|exit"
If the agent is being OOM-killed, your VPS may not have enough RAM. See Supported OS -- Memory for requirements.
2. Agent Won't Connect (WebSocket Issues)
The agent is running but cannot establish a WebSocket connection to the Lumos WS server.
Check the WS Server URL
The agent's configuration includes the WS server URL. If this is misconfigured, the agent will fail to connect.
journalctl -u lumos-agent | grep -i "ws\|websocket\|connect\|dial"
Firewall or NAT Issues
Some VPS providers or corporate networks drop long-lived WebSocket connections. The agent sends periodic heartbeats to keep the connection alive.
# Check if there is aggressive connection tracking
conntrack -L 2>/dev/null | wc -l
# Check conntrack timeout for established connections
sysctl net.netfilter.nf_conntrack_tcp_timeout_established 2>/dev/null
If the timeout is very low (under 300 seconds), idle WebSocket connections may be dropped. The agent handles reconnection automatically, but frequent disconnections indicate a network-level issue.
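The 300-second rule of thumb can be turned into a quick check. This is a minimal sketch: the function name check_conntrack_timeout is hypothetical, and the only threshold used is the 300 seconds quoted above.

```shell
# Hypothetical helper: classify a conntrack established-timeout (in seconds)
# against the 300 s threshold mentioned in the text.
check_conntrack_timeout() {
  timeout=$1
  if [ -z "$timeout" ]; then
    echo "unknown: conntrack is not loaded or the value is unavailable"
  elif [ "$timeout" -lt 300 ]; then
    echo "low: ${timeout}s, idle WebSocket connections may be dropped"
  else
    echo "ok: ${timeout}s"
  fi
}
```

To use it on the VPS, pass the raw value: check_conntrack_timeout "$(sysctl -n net.netfilter.nf_conntrack_tcp_timeout_established 2>/dev/null)".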
Proxy or Content Filter
Some VPS providers route traffic through a transparent proxy that interferes with WebSocket upgrades. Check with your provider if WebSocket connections are supported.
WS Server Down
If all agents across all your servers go offline simultaneously, the issue is likely with the Lumos WS server, not your agents. The agents will automatically reconnect with exponential backoff once the server is available again.
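To give a feel for what exponential backoff means for reconnect timing, here is a small sketch of a doubling delay schedule. The base delay of 1 second and the cap of 60 seconds are illustrative assumptions, not the agent's actual settings, and backoff_delay is a hypothetical name.

```shell
# Hypothetical backoff schedule: the delay doubles on each attempt and is
# capped at a maximum. The defaults are assumed values for illustration.
backoff_delay() {
  attempt=$1          # 0-based attempt number
  base=${2:-1}        # first delay in seconds (assumed)
  cap=${3:-60}        # upper bound in seconds (assumed)
  delay=$base
  i=0
  while [ "$i" -lt "$attempt" ] && [ "$delay" -lt "$cap" ]; do
    delay=$((delay * 2))
    i=$((i + 1))
  done
  if [ "$delay" -gt "$cap" ]; then
    delay=$cap
  fi
  echo "$delay"
}
```

With these assumed defaults, attempts 0 through 7 wait 1, 2, 4, 8, 16, 32, 60, and 60 seconds, so an agent that lost its connection retries quickly at first and then settles at roughly one attempt per minute.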
3. SSL Certificate Not Provisioning
SSL certificates are provisioned automatically via Let's Encrypt using the ACME HTTP-01 challenge. If a certificate is stuck in "provisioning" state, check the following.
DNS Must Point to the Shield VPS
The ACME HTTP-01 challenge requires that the domain resolves to the shield VPS IP address. Let's Encrypt will make an HTTP request to http://your-domain.com/.well-known/acme-challenge/... and it must reach the agent.
# Check where the domain resolves
dig +short example.com
# This should return your shield VPS IP, not your origin IP
If DNS is not pointing to the shield yet, update your DNS records first. See DNS Setup.
Port 80 Must Be Open
The HTTP-01 challenge uses port 80. Ensure it is not blocked by a firewall and HAProxy is listening:
# Check if port 80 is listening
ss -tlnp | grep :80
# Test from outside (run this from a different machine or use an online tool)
curl -v http://your-domain.com/.well-known/acme-challenge/test
Common reasons port 80 is blocked:
- VPS provider firewall (check provider dashboard/control panel)
- iptables or ufw rules blocking inbound port 80
- Another service (Apache, Nginx) occupying port 80
Check Agent Logs for ACME Errors
journalctl -u lumos-agent | grep -i "acme\|ssl\|certificate\|letsencrypt\|challenge"
Common ACME errors and solutions:
| Error | Cause | Solution |
|---|---|---|
| DNS not resolving | Domain does not point to shield IP | Update A record to shield VPS IP |
| Rate limited | Too many certificate requests (50/domain/week) | Wait 1 week and retry |
| Port 80 blocked | Firewall blocking inbound HTTP | Open port 80 in firewall |
| Invalid domain | Domain is internal/reserved or does not exist | Use a valid public domain |
| Challenge failed | HTTP-01 verification request could not reach the agent | Check firewall, DNS, and HAProxy status |
| Authorization expired | Challenge took too long | Retry the SSL provisioning |
Cloudflare Proxy Interference
If your domain is behind Cloudflare with the orange cloud (proxy) enabled, Cloudflare intercepts the ACME challenge request. Solutions:
- Temporarily disable Cloudflare proxy (grey cloud / DNS-only) while provisioning the certificate
- Use DNS-only mode permanently -- Lumos Gate itself is the proxy layer, so Cloudflare proxy is redundant
- Set Cloudflare SSL mode to Full (Strict) if you must keep the orange cloud
See DNS Setup for Cloudflare-specific guidance.
Manual Certificate Check
If a certificate was provisioned but seems invalid or expired:
# Check the certificate details
echo | openssl s_client -connect your-domain.com:443 -servername your-domain.com 2>/dev/null | openssl x509 -noout -dates -subject
# Check certificate chain
echo | openssl s_client -connect your-domain.com:443 -servername your-domain.com 2>/dev/null | openssl x509 -noout -issuer
SSL Certificate Expiry Warning
The system checks for certificates expiring within 7 days during the health check cycle and sends an ssl_expiring notification. The agent automatically renews certificates before they expire. If you receive an expiry warning, check the agent logs for renewal errors.
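If you want to reproduce the 7-day window yourself, the arithmetic is simple. The sketch below is not the system's own check; cert_days_left and cert_expiring_soon are hypothetical helpers that work on Unix epoch seconds.

```shell
# Whole days remaining until expiry, given the expiry time and (optionally)
# the current time as Unix epoch seconds. Integer division truncates.
cert_days_left() {
  expiry_epoch=$1
  now_epoch=${2:-$(date +%s)}
  echo $(( (expiry_epoch - now_epoch) / 86400 ))
}

# Succeeds when fewer than 7 whole days remain (the warning window above).
cert_expiring_soon() {
  [ "$(cert_days_left "$1" "$2")" -lt 7 ]
}
```

The expiry time can be read from the notAfter line printed by openssl x509 -noout -enddate and converted with GNU date -d "<notAfter value>" +%s. As an alternative, openssl x509 -noout -checkend 604800 exits non-zero when the certificate expires within the next 7 days.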
4. Domain Not Working After Adding
After adding a domain in Lumos Gate, it does not resolve or returns errors.
Check DNS Propagation
DNS changes can take time to propagate. Check current DNS records from multiple sources:
# Check current DNS records
dig example.com A +short
# Check from specific DNS servers
dig example.com A @8.8.8.8 +short # Google DNS
dig example.com A @1.1.1.1 +short # Cloudflare DNS
dig example.com A @9.9.9.9 +short # Quad9
# Check TTL (lower TTL = faster propagation)
dig example.com A +noall +answer
If the result does not show your shield VPS IP, DNS has not propagated yet or the records are incorrect.
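Comparing several resolvers by eye gets tedious, so the check can be scripted. This is a sketch only: check_propagation is a hypothetical helper that reads "resolver ip" pairs and reports which resolvers do not yet return the expected shield IP.

```shell
# Hypothetical helper: read "resolver ip" pairs on stdin and compare each
# answer with the expected shield IP. Returns 1 if any resolver is stale.
check_propagation() {
  expected=$1
  stale=0
  while read -r resolver ip; do
    if [ -z "$resolver" ]; then
      continue
    fi
    if [ "$ip" = "$expected" ]; then
      echo "$resolver ok"
    else
      echo "$resolver stale (${ip:-no answer})"
      stale=1
    fi
  done
  return "$stale"
}

# Example usage on a machine with dig installed:
#   for r in 8.8.8.8 1.1.1.1 9.9.9.9; do
#     echo "$r $(dig example.com A @"$r" +short | head -n 1)"
#   done | check_propagation <SHIELD_VPS_IP>
```

Only the first answer per resolver is compared, which is enough for a single A record.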
Tip: Set TTL to 300 (5 minutes) before changing DNS records. This ensures faster propagation. Once everything is working, you can increase the TTL.
Verify DNS Records
Ensure you have the correct A record:
Type Name Value TTL
A example.com <SHIELD_VPS_IP> 300
A www.example.com <SHIELD_VPS_IP> 300 (if using www)
If you use a CNAME for a subdomain, it must ultimately resolve to the shield VPS IP.
Note: If you are using Cloudflare proxy (orange cloud), the IP returned by dig will be Cloudflare's IP, not your shield IP. For Lumos Gate to work correctly, either disable the Cloudflare proxy (grey cloud / DNS-only) or configure SSL mode to Full (Strict). See DNS Setup.
Check Config Push Reached the Agent
After adding a domain in the dashboard, a config push is sent to the WebSocket server, which forwards it to the agent. Verify the domain was added to HAProxy:
# Check if the domain exists in HAProxy config
grep "example.com" /etc/haproxy/haproxy.cfg
If the domain is not in the config, the config push may have failed. Check the agent logs:
journalctl -u lumos-agent | grep -i "config\|domain\|push\|example.com"
If the agent was offline when the config push was sent, reconnecting will trigger a full config sync. Restart the agent to force a reconnect:
systemctl restart lumos-agent
Test Direct Connection (Bypass DNS)
Bypass DNS and test directly against the shield VPS to isolate whether the issue is DNS or proxy configuration:
# Test HTTP directly against the shield VPS
curl -H "Host: example.com" http://<SHIELD_VPS_IP>/
# Test HTTPS directly (with SNI)
curl -H "Host: example.com" --resolve example.com:443:<SHIELD_VPS_IP> https://example.com/
If this returns your site content, the proxy is working and the issue is DNS-only. If it returns a 503 or connection error, the proxy configuration or origin is the problem.
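The rule of thumb for reading the direct test can be written down as a small lookup. This is a sketch: interpret_direct_test is a hypothetical name, and treating 3xx redirects as "proxy working" is an assumption. curl reports the status 000 when it received no HTTP response at all.

```shell
# Hypothetical helper: map the HTTP status of a direct test against the
# shield to the likely fault domain described above.
interpret_direct_test() {
  case "$1" in
    2??|3??) echo "proxy working: the problem is DNS" ;;
    503|000) echo "proxy configuration or origin problem" ;;
    *)       echo "unexpected status $1: check the agent and HAProxy logs" ;;
  esac
}
```

To feed it a real result, capture only the status code: interpret_direct_test "$(curl -s -o /dev/null -w '%{http_code}' -H "Host: example.com" http://<SHIELD_VPS_IP>/)".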
Check Origin Connectivity
From the shield VPS, verify the origin server is reachable:
# Test origin from the shield VPS
curl -v http://<ORIGIN_IP>:<ORIGIN_PORT>/
If the origin is unreachable from the shield, check:
- Origin firewall -- ensure the shield VPS IP is whitelisted
- WireGuard tunnel -- if using encrypted origin, ensure the tunnel is up
- Origin server is actually running and listening on the expected port
5. WAF Blocking Legitimate Traffic
Real users or legitimate services are being blocked by the WAF.
Check WAF Events Log
Navigate to Dashboard -> WAF and review the blocked requests log. Each entry shows:
- Source IP address
- Request path and method
- Block reason (rate limit, IP blacklist, OWASP pattern, bot detection)
- Domain
- Timestamp
Identify the Block Reason and Fix
| Block Reason | What Triggered It | Solution |
|---|---|---|
| Rate limit exceeded | Too many requests from a single IP in the configured window | Increase the rate limit threshold for the domain |
| IP blacklisted | The client IP is in your IP blacklist | Remove the IP from the blacklist |
| OWASP pattern match | Request matched a SQL injection, XSS, or path traversal pattern | Review the specific request; if it is a false positive, lower the WAF level from "high" to "medium" or "low" |
| Bot challenge failed | Client did not pass the JavaScript challenge | Ensure the client supports JavaScript. API clients and bots will fail JS challenges -- see below |
Adjust WAF Level
The WAF level controls sensitivity. If you are getting false positives:
- Navigate to Dashboard -> WAF
- Find the affected domain
- Change the WAF level:
- High -- Strictest, more false positives possible
- Medium -- Balanced (recommended for most sites)
- Low -- Minimal blocking, only obvious attacks
API Clients and Bot Protection
Bot protection uses a JavaScript challenge that requires a browser environment. API clients, webhooks, monitoring services, and legitimate bots (like payment processors or CI/CD systems) will fail the JS challenge.
Solutions:
- Add trusted IPs to the whitelist so they bypass all WAF rules
- Disable bot protection for API-only domains
- Use a separate domain for API endpoints without bot protection enabled
Whitelist Trusted IPs
Add trusted IPs to the whitelist so they bypass WAF rules:
- Navigate to Dashboard -> WAF -> IP Management
- Add IP addresses or CIDR ranges that should be whitelisted
Warning: Only whitelist IPs you trust, such as your office network, monitoring services, known API clients, or payment processor webhook IPs.
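Before whitelisting a CIDR range, it helps to confirm exactly which addresses it covers. The sketch below only illustrates IPv4 CIDR membership in plain shell arithmetic; it says nothing about how Lumos Gate matches ranges internally, and ip_to_int and in_cidr are hypothetical helpers.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed when IPv4 address $1 falls inside CIDR range $2.
# A bare IP without a prefix length is treated as /32.
in_cidr() {
  addr=$1
  case "$2" in
    */*) net=${2%/*}; bits=${2#*/} ;;
    *)   net=$2; bits=32 ;;
  esac
  mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
  [ $(( $(ip_to_int "$addr") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}
```

For example, in_cidr 203.0.113.7 203.0.113.0/24 succeeds, while in_cidr 203.0.114.7 203.0.113.0/24 fails. A /24 covers 256 addresses, so prefer the narrowest range that includes the clients you trust.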
Disable WAF for a Domain
If you need to quickly stop blocking while you investigate:
- Navigate to Dashboard -> WAF
- Toggle WAF off for the specific domain
WAF is toggled per-domain, so disabling it for one domain does not affect others. Re-enable it once you have adjusted the rules.
6. HAProxy Not Reloading
The agent generates HAProxy configurations and reloads the process. If reloads fail, your latest domain or WAF changes will not take effect.
Check Agent Logs
journalctl -u lumos-agent | grep -i "reload\|haproxy\|error\|rollback"
Common reload errors:
| Error | Cause | Solution |
|---|---|---|
| Configuration syntax error | Generated config has an issue | Agent auto-rolls back; check logs for the specific syntax error |
| Port already in use | Another process on port 80 or 443 | Find and stop the conflicting process (see section 11) |
| Permission denied | Agent lost root privileges | Check agent service user configuration |
| File not found | HAProxy binary missing | Reinstall HAProxy: apt install -y haproxy |
Verify HAProxy Status
# Check HAProxy service status
systemctl status haproxy
# Test the current config for syntax errors
haproxy -c -f /etc/haproxy/haproxy.cfg
Automatic Rollback
The agent implements automatic config rollback:
- Current config is backed up in memory
- New config is written to /etc/haproxy/haproxy.cfg
- HAProxy reload is attempted
- If reload fails, the backup is restored and HAProxy is reloaded with the old config
- A haproxy_reload_failed error is reported to the dashboard via notification
All config writes and reloads are serialized under a single mutex to prevent race conditions. If you see repeated reload failures, check the agent logs for the specific HAProxy error message.
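The rollback sequence can be sketched in shell to make the order of operations concrete. This is an illustration, not the agent's code: the agent keeps its backup in memory and reloads HAProxy, whereas this hypothetical apply_with_rollback uses a temporary file and runs whatever check command the caller passes.

```shell
# Sketch of write, check, and roll back on failure.
# Usage: apply_with_rollback TARGET CANDIDATE CHECK_COMMAND [ARGS...]
apply_with_rollback() {
  target=$1
  candidate=$2
  shift 2
  backup=$(mktemp) || return 2
  # Keep a copy of the current (last known good) config
  cp "$target" "$backup" || { rm -f "$backup"; return 2; }
  # Write the new config, then run the caller's check against it
  if cp "$candidate" "$target" && "$@"; then
    rm -f "$backup"
    return 0
  fi
  # Check failed: restore the previous config
  cp "$backup" "$target"
  rm -f "$backup"
  return 1
}
```

A manual dry run could use HAProxy's own validator as the check, for example apply_with_rollback /etc/haproxy/haproxy.cfg /tmp/new.cfg haproxy -c -f /etc/haproxy/haproxy.cfg. Note that this validates syntax only; it does not reload HAProxy.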
Manual Config Validation
# Validate the current config
haproxy -c -f /etc/haproxy/haproxy.cfg
# If invalid, check what was written
head -100 /etc/haproxy/haproxy.cfg
Manual Restart (Last Resort)
As a last resort, you can manually restart HAProxy:
systemctl restart haproxy
Warning: Restarting HAProxy (as opposed to reloading) causes a brief interruption in active connections. HAProxy reload is zero-downtime; restart is not. Only restart if reload is not working.
7. HAProxy Health Check Failures
The agent monitors HAProxy health every 10 seconds. If HAProxy crashes, the agent automatically restarts it and sends a haproxy_crash notification.
Check for Repeated Crashes
journalctl -u lumos-agent | grep -i "crash\|restart\|health\|haproxy.*down"
journalctl -u haproxy -n 50 --no-pager
Common Crash Causes
| Cause | Solution |
|---|---|
| Out of memory | Upgrade VPS RAM or reduce concurrent connections |
| Too many open files | Check ulimit -n; edge-setup should have raised this |
| Corrupted config | Agent will auto-rollback; check logs |
| HAProxy binary updated externally | Avoid running apt upgrade haproxy independently |
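For the "Too many open files" case, keep in mind that ulimit -n reports the limit of your current shell, not of the running HAProxy process. The sketch below reads the limit that applies to a given process from /proc instead; both function names are hypothetical.

```shell
# Percentage of the open-file limit in use.
fd_usage_percent() {
  used=$1
  limit=$2
  echo $(( used * 100 / limit ))
}

# Hypothetical helper: compute the percentage for a running process by PID,
# using its descriptor directory and the soft "Max open files" limit.
fd_usage_for_pid() {
  pid=$1
  used=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l)
  limit=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits" 2>/dev/null)
  case "$limit" in
    ''|0|*[!0-9]*) echo "limit unavailable"; return 1 ;;
  esac
  fd_usage_percent "$used" "$limit"
}
```

Run it as root with the HAProxy PID, for example fd_usage_for_pid "$(pgrep -o -x haproxy)". A value that stays close to 100 suggests the limit is the bottleneck.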
Check HAProxy Resource Usage
# Check HAProxy memory usage
ps aux | grep haproxy
# Check open file descriptors
# (-o selects the oldest haproxy process so that only one PID is substituted)
ls /proc/$(pgrep -o -x haproxy)/fd 2>/dev/null | wc -l
# Check connection count
ss -s
8. High Latency Through Proxy
Traffic through the shield VPS has noticeably higher latency than direct connections.
Check Origin Server Response Time
The shield adds a network hop, but most latency usually comes from the origin:
# Measure time through the shield
curl -o /dev/null -s -w "Total: %{time_total}s\nConnect: %{time_connect}s\nTTFB: %{time_starttransfer}s\n" https://example.com
# Measure time direct to origin (from the shield VPS itself)
curl -o /dev/null -s -w "Total: %{time_total}s\nConnect: %{time_connect}s\nTTFB: %{time_starttransfer}s\n" http://<ORIGIN_IP>:<ORIGIN_PORT>
If the origin TTFB is high, the issue is not with the proxy.
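Subtracting the two TTFB values gives a rough figure for what the shield path adds. A minimal sketch, with added_latency_ms as a hypothetical helper; note that the first measurement is taken from a client and the second from the shield itself, so the difference also includes the client-to-shield network time.

```shell
# Added latency in milliseconds: TTFB through the shield minus TTFB direct
# to the origin. Inputs are seconds, as printed by %{time_starttransfer}.
added_latency_ms() {
  awk -v shield="$1" -v origin="$2" \
    'BEGIN { printf "%.0f\n", (shield - origin) * 1000 }'
}
```

For example, added_latency_ms 0.250 0.200 prints 50. If the origin value alone accounts for most of the total, tune the origin before looking at the proxy.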
Consider VPS Location
The physical distance between user, shield, and origin affects latency:
Good: User (EU) -> Shield (EU) -> Origin (EU) ~5ms added
OK: User (EU) -> Shield (EU) -> Origin (US) ~100ms added
Bad: User (EU) -> Shield (US) -> Origin (EU) ~200ms added (unnecessary round trip)
Place your shield VPS in the same region as the majority of your users, or as close to the origin as possible. See VPS Providers for providers with multiple regions and Multiple Servers for multi-region setups.
WireGuard Overhead
If you are using WireGuard to encrypt traffic between the shield and origin, expect approximately 3-5% overhead due to encryption and encapsulation. This is generally negligible for most workloads.
Check VPS Resources
Ensure your shield VPS has enough resources:
# CPU usage
top -bn1 | head -10
# Memory usage
free -h
# Network throughput
iftop -t -s 5 2>/dev/null || echo "Install iftop: apt install iftop"
# Check if the VPS is swapping (bad for performance)
swapon --show
vmstat 1 5
If the VPS is resource-constrained, consider upgrading the VPS tier or distributing traffic across multiple servers.
Check Kernel Tuning
If you installed with LUMOS_NO_TUNE=1, kernel tuning was skipped. This can cause performance issues under load:
# Check if BBR is active
sysctl net.ipv4.tcp_congestion_control
# Should output: net.ipv4.tcp_congestion_control = bbr
# Check connection tracking limits
sysctl net.netfilter.nf_conntrack_max 2>/dev/null
You can re-run the edge setup script to apply tuning:
curl -fsSL https://get.lumosgate.com/edge-setup.sh | bash
See Supported OS -- Edge Setup for details.
9. Bot Protection Blocking Real Users
The bot protection system uses a JavaScript challenge with HMAC cookie verification. Some legitimate users or clients may fail this challenge.
Who Gets Blocked
- Users with JavaScript disabled in their browser
- Very old browsers that do not support modern JS
- API clients making direct HTTP requests (no browser environment)
- Automated monitoring tools (Pingdom, UptimeRobot, etc.)
- Payment processor webhooks (Stripe, PayPal, etc.)
- Search engine crawlers (though major crawlers are usually whitelisted by user agent)
Solutions
- Whitelist known IPs -- Add the IP addresses of your monitoring services, API clients, and webhook sources to the IP whitelist. Whitelisted IPs bypass all WAF and bot protection checks.
- Disable bot protection per domain -- If a domain serves primarily API traffic, disable bot protection for that domain. You can still keep WAF rules active.
- Separate API and web domains -- Use api.example.com for API traffic (no bot protection) and example.com for web traffic (with bot protection).
Verifying Bot Protection Is the Issue
Check the WAF events log in the dashboard. If the block reason is "Bot challenge failed", bot protection is the cause. The blocked request entry will show the IP and request path.
You can also test from the command line:
# This will fail bot protection (no JS engine)
curl -v https://example.com/
# Check if you get a 403 or a JS challenge page
10. Account Frozen
Your dashboard shows a frozen account banner and you cannot make configuration changes.
Why It Happens
The account is frozen when the automatic billing deduction fails due to insufficient credit balance. The system attempted to deduct your plan's monthly price and your balance was too low.
Your Sites Are Still Online
Existing proxy configurations continue to work. HAProxy on your shield servers keeps running with the last known good configuration. Your sites remain online and accessible. No stop signal is sent to your agents.
How to Unfreeze
- Navigate to Dashboard -> Settings -> Billing (you can still access this while frozen)
- Click Deposit
- Select an amount and complete the USDT payment
- Once the payment confirms on-chain, your balance updates
- If the new balance >= your plan's monthly price, the account unfreezes automatically
- All mutation operations are re-enabled within seconds
Note: Auto-unfreeze happens as soon as the blockchain transaction confirms your deposit. No manual action is needed beyond sending the payment.
Cannot Deposit While Frozen?
If the deposit button does not appear or the billing tab is not loading, try:
- Clear your browser cache and reload the dashboard
- Try a different browser
- Check browser console for JavaScript errors (F12 -> Console)
The deposit endpoint is accessible even while frozen, so it should work. If you still cannot deposit, contact support.
Emergency Domain Changes While Frozen
You can still change origin IP addresses on existing domains while frozen. This is intentionally allowed for emergency situations (for example, if an origin server goes down and you need to redirect traffic). Navigate to the domain detail page and update the origin servers.
See Credits -- Account Freezing and Account -- Frozen Accounts for complete details.
11. HAProxy Won't Start
HAProxy fails to start, blocking all proxy traffic.
Port Conflict
The most common cause is another service occupying ports 80 or 443:
# Find what is using port 80
ss -tlnp | grep :80
# Find what is using port 443
ss -tlnp | grep :443
Common conflicting services:
| Service | How to Stop |
|---|---|
| Apache2 | systemctl stop apache2 && systemctl disable apache2 |
| Nginx | systemctl stop nginx && systemctl disable nginx |
| Caddy | systemctl stop caddy && systemctl disable caddy |
| Another HAProxy | pkill haproxy then restart via systemd |
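To see at a glance which service from the table is holding a port, the process names can be pulled out of the ss output. This is a sketch: port_owners is a hypothetical helper, and it assumes the usual ss -tlnp layout in which owners appear as users:(("name",pid=N,fd=M)).

```shell
# Hypothetical helper: read `ss -tlnp` output on stdin and print the unique
# process names listening on the given port.
port_owners() {
  port=$1
  awk -v port="$port" '
    $1 == "LISTEN" && $4 ~ (":" port "$") {
      line = $0
      # Walk through every ("name" occurrence in the Process column
      while (match(line, /\("[^"]+"/)) {
        name = substr(line, RSTART + 2, RLENGTH - 3)
        if (!(name in seen)) { seen[name] = 1; print name }
        line = substr(line, RSTART + RLENGTH)
      }
    }'
}
```

Run it as root so that ss can show process names: ss -tlnp | port_owners 80. The match is anchored to the end of the local address, so port 80 does not match 8080.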
Config Syntax Error
# Validate config
haproxy -c -f /etc/haproxy/haproxy.cfg
# The error output will show the exact line and issue
If the config is corrupted, the agent's automatic rollback should have restored the previous working config. If it did not, you can check if a backup exists:
# Look for backup configs
ls -la /etc/haproxy/haproxy.cfg*
Missing HAProxy Binary
which haproxy
haproxy -v
If HAProxy is not installed, install it:
apt update && apt install -y haproxy
systemctl enable haproxy
systemctl restart lumos-agent
Permissions Issue
# Check HAProxy config file permissions
ls -la /etc/haproxy/haproxy.cfg
# Should be readable by haproxy user/group
# Fix if needed
chmod 644 /etc/haproxy/haproxy.cfg
chown root:root /etc/haproxy/haproxy.cfg
12. Config Push Not Working
You make changes in the dashboard (add domain, change WAF rules, etc.) but the changes do not reach the agent.
Verify the Config Push Chain
The config push chain is: Dashboard API -> WebSocket Server -> Agent WebSocket -> HAProxy reload
A failure at any point breaks the chain.
Check Agent Connection
First, verify the agent is connected (appears online in dashboard). If offline, see section 1.
Force a Config Sync
Restart the agent to force a full config sync on reconnect:
systemctl restart lumos-agent
The agent requests the full configuration from the WS server upon every reconnect, so a restart effectively forces a fresh config sync.
Check Agent Logs for Config Updates
journalctl -u lumos-agent | grep -i "config\|push\|update\|received"
If you see "config received" but no HAProxy reload, the issue is in HAProxy config generation or reload. See section 6.
13. Cannot Delete a Domain
You try to delete a domain but get an error.
Account Frozen
If your account is frozen, all mutation operations (including deletion) are blocked. Deposit credits to unfreeze first.
API Error
Check the browser console (F12 -> Network tab) for the specific error response from the API. Common errors:
| HTTP Status | Meaning | Solution |
|---|---|---|
| 403 | Account frozen | Deposit credits to unfreeze |
| 404 | Domain not found | Refresh the page; it may already be deleted |
| 500 | Server error | Try again; check server logs if it persists |
Domain Still in Use
If the domain has active traffic or pending SSL provisioning, the deletion should still work. There is no "in use" block. If deletion fails, try again after a few seconds.
14. Agent Installation Fails
The installation script exits with an error.
Check OS Requirements
The agent requires Debian 12+ or Ubuntu 24.04+:
cat /etc/os-release
If you are running a different distribution, it is not currently supported. See Supported OS.
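The OS requirement can also be checked programmatically, for example before running the installer on many servers. This sketch is not the installer's own logic; is_supported_os is a hypothetical helper that reads the standard ID and VERSION_ID keys from an os-release file.

```shell
# Succeed when the os-release file describes Debian 12+ or Ubuntu 24.04+.
is_supported_os() {
  file=${1:-/etc/os-release}
  # Source the file in a subshell so its variables do not leak
  (
    . "$file" || exit 1
    major=${VERSION_ID%%.*}
    case "$ID" in
      debian) [ "${major:-0}" -ge 12 ] ;;
      ubuntu) [ "${major:-0}" -ge 24 ] ;;
      *)      false ;;
    esac
  )
}
```

Called without arguments it checks the local system: is_supported_os && echo supported. Releases without a VERSION_ID, such as Debian testing, are reported as unsupported by this sketch.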
Check Root Access
The installer must run as root:
whoami
# Should output: root
If not root, use sudo:
curl -fsSL https://get.lumosgate.com/install | LUMOS_TOKEN=YOUR_TOKEN sudo -E bash
Check curl
The installer requires curl:
curl --version
If not installed:
apt update && apt install -y curl
Existing HAProxy Detected
If HAProxy is already installed, the installer shows the existing configuration statistics (number of frontends, backends, lines) and asks for confirmation. To skip the interactive prompt:
curl -fsSL https://get.lumosgate.com/install | LUMOS_TOKEN=YOUR_TOKEN LUMOS_FORCE=1 bash
The LUMOS_FORCE=1 flag bypasses the confirmation prompt. The existing HAProxy configuration is still backed up before any changes are made. After installation, you can import existing sites via Detected Sites.
Network Errors
If the installer cannot download the agent binary:
# Test connectivity to the download server
curl -v https://get.lumosgate.com/
# Check DNS resolution
dig get.lumosgate.com +short
Disk Full
df -h /
If less than 100 MB is free, clear space before installing.
Package Lock (apt)
If another apt process is running:
# Check for running apt processes
ps aux | grep apt
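# Wait for it to finish. If it is stuck, find the process holding the lock
# (the lock files are empty and do not contain a PID; fuser is in the psmisc package):
fuser -v /var/lib/dpkg/lock-frontend
# Stop that process, then clear the stale locks and repair the dpkg state
kill <PID>
rm -f /var/lib/dpkg/lock-frontend /var/lib/dpkg/lock /var/cache/apt/archives/lock
dpkg --configure -a
15. Agent Update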
How to update the Lumos Gate agent to the latest version.
Automatic Updates
The agent does not auto-update. You must manually update when a new version is available.
Update Procedure
Re-run the installation script with the LUMOS_FORCE=1 flag. This downloads the latest binary and restarts the service while preserving your configuration:
curl -fsSL https://get.lumosgate.com/install | LUMOS_TOKEN=YOUR_TOKEN LUMOS_FORCE=1 bash
Note: The LUMOS_FORCE=1 flag is required when the agent is already installed. It skips the existing HAProxy confirmation prompt. Your encrypted agent configuration and HAProxy config are preserved.
Verify the Update
# Check the agent is running
systemctl status lumos-agent
# Check agent logs for the new version
journalctl -u lumos-agent -n 20 --no-pager
16. Connection Drops / Agent Keeps Reconnecting
The agent disconnects and reconnects frequently.
Check VPS Network Stability
# Test network stability with continuous ping
ping -c 100 lumosgate.com
# Check for packet loss
ping -c 50 -q lumosgate.com
If you see packet loss above 1-2%, the VPS network may be unstable. Contact your VPS provider.
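If you want to log packet loss over time rather than read it off the screen, the percentage can be extracted from the ping summary line. A sketch, assuming the Linux iputils summary format ("... 2% packet loss ..."); both helper names are hypothetical, and the default threshold of 2 follows the guidance above.

```shell
# Extract the packet loss percentage from ping's summary output on stdin.
packet_loss_percent() {
  sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Succeed when the loss is above the threshold (default 2 percent).
loss_is_high() {
  awk -v loss="$1" -v max="${2:-2}" 'BEGIN { exit !(loss > max) }'
}
```

For example: loss=$(ping -c 50 -q lumosgate.com | packet_loss_percent); loss_is_high "$loss" && echo "unstable network: ${loss}% loss".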
Check Agent Reconnection Logs
journalctl -u lumos-agent | grep -i "connect\|disconnect\|reconnect\|backoff"
The agent has built-in automatic reconnection with exponential backoff. Occasional disconnections are normal (network blips, WS server restarts during deployments). Frequent disconnections (more than a few per hour) indicate a persistent network issue.
Aggressive NAT/Firewall Timeout
Some networks drop idle TCP connections. The agent sends periodic heartbeats, but if the timeout is very aggressive (under 60 seconds), connections may still drop. This is common on some budget VPS providers.
17. DNS Failover Not Working
DNS failover is configured but does not trigger when the primary server goes down.
Check Plan
DNS failover requires the Pro or Enterprise plan and at least 2 servers.
Check Health Check Status
The WebSocket server triggers health checks every 5 minutes. Check if health checks are running:
- Navigate to Dashboard -> Servers and check server status indicators
- Check notifications for server_down alerts
Check DNS Provider Configuration
DNS failover requires a configured DNS provider (Cloudflare). Verify in Dashboard -> Domains -> [domain] -> DNS that the DNS provider is connected.
Timing
Failover is not instant. The health check runs every 5 minutes, so in the worst case it takes up to 5 minutes to detect a failure, plus DNS propagation time (typically 1-5 minutes with low TTL).
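As a worked example of this timing, the worst case is roughly one health-check interval plus one record TTL. The sketch below assumes resolvers honor the TTL and ignores the time needed to apply the DNS update; worst_case_failover is a hypothetical helper.

```shell
# Rough upper bound on failover time in seconds: one full health-check
# interval to notice the outage plus one TTL for cached records to expire.
worst_case_failover() {
  check_interval=${1:-300}   # 5-minute health check described above
  ttl=${2:-300}              # TTL of the A record, in seconds
  echo $(( check_interval + ttl ))
}
```

With a 300-second TTL this prints 600, so plan for up to about 10 minutes. Lowering the TTL to 60 (worst_case_failover 300 60) brings the bound down to 360 seconds.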
18. Detected Sites Not Showing
After installing the agent on a server with existing HAProxy configuration, the Detected Sites page shows no sites.
Agent Must Send Backup Config
The agent sends the existing HAProxy configuration to the WS server upon first connection. The dashboard parses this to find existing sites.
- Ensure the agent has connected at least once
- Check agent logs for backup config upload:
journalctl -u lumos-agent | grep -i "backup\|existing\|config"
- If the agent was installed with LUMOS_FORCE=1 on a fresh system (no existing HAProxy config), there are no sites to detect
Already Managed Domains
Sites that you have already added as domains in Lumos Gate are marked as "already managed" and will appear differently in the detected sites list.
Logs and Diagnostics
Agent Logs
The primary diagnostic tool. Most issues are diagnosable from agent logs:
# Recent logs (last 50 lines)
journalctl -u lumos-agent -n 50 --no-pager
# Follow logs in real-time
journalctl -u lumos-agent -f
# Logs from a specific time range
journalctl -u lumos-agent --since "1 hour ago"
# Filter for errors only
journalctl -u lumos-agent -p err --no-pager
HAProxy Logs
# HAProxy service logs
journalctl -u haproxy -n 50 --no-pager
# HAProxy access logs (if configured to syslog)
tail -100 /var/log/haproxy.log 2>/dev/null
HAProxy Config Validation
# Validate current config
haproxy -c -f /etc/haproxy/haproxy.cfg
# Show current config
cat /etc/haproxy/haproxy.cfg
System Diagnostics
# Full system overview
systemctl status lumos-agent
systemctl status haproxy
ss -tlnp | grep -E ':80|:443'
free -h
df -h /
uname -r
cat /etc/os-release
Dashboard Notifications
Error events from the agent are reported to the dashboard via the notification system. Check Dashboard -> Notifications for:
- server_down -- Agent disconnected
- server_error -- HAProxy crash, config update failed, reload failed
- ssl_expiring -- Certificate expiring within 7 days
Collecting Diagnostics for Support
When reporting an issue to support (Pro/Enterprise plans), include the output of these commands:
echo "=== Agent Status ==="
systemctl status lumos-agent
echo "=== Agent Logs (last 100 lines) ==="
journalctl -u lumos-agent -n 100 --no-pager
echo "=== HAProxy Status ==="
systemctl status haproxy
echo "=== HAProxy Config Validation ==="
haproxy -c -f /etc/haproxy/haproxy.cfg
echo "=== HAProxy Version ==="
haproxy -v
echo "=== OS Info ==="
cat /etc/os-release
echo "=== Kernel ==="
uname -r
echo "=== Memory ==="
free -h
echo "=== Disk ==="
df -h /
echo "=== Ports ==="
ss -tlnp | grep -E ':80|:443'
echo "=== Architecture ==="
uname -m
19. Agent Binary Not Found (404)
The agent installer downloads the binary from get.lumosgate.com. If the download returns a 404 error, the binary is not available for your platform.
Causes
- CDN not configured -- The get.lumosgate.com CDN endpoint has not been set up yet, or the binary has not been published for the current release.
- Unsupported architecture -- Agent binaries are built for linux-amd64 and linux-arm64 only. Other architectures (e.g., armv7, i386) are not supported.
Check Your Architecture
uname -m
# Expected: x86_64 (amd64) or aarch64 (arm64)
Workaround: Build from Source
If the CDN binary is not available, you can build the agent from source on any machine with Go installed:
cd agent
GOOS=linux GOARCH=amd64 go build -o lumos-agent ./cmd/lumos-agent/
For ARM servers:
GOOS=linux GOARCH=arm64 go build -o lumos-agent ./cmd/lumos-agent/
Then transfer the binary to your VPS and place it at /usr/local/bin/lumos-agent.
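To avoid building for the wrong target, the GOARCH value can be derived from the uname -m output of the VPS. A minimal sketch; goarch_for is a hypothetical helper covering only the two supported architectures.

```shell
# Map `uname -m` output from the VPS to the matching GOARCH value.
goarch_for() {
  case "$1" in
    x86_64)  echo amd64 ;;
    aarch64) echo arm64 ;;
    *)       echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}
```

Run uname -m on the VPS, then on the build machine use the result, for example GOOS=linux GOARCH="$(goarch_for aarch64)" go build -o lumos-agent ./cmd/lumos-agent/.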
See Supported OS for the full list of supported architectures and operating systems.
Quick Reference
| Symptom | Most Likely Cause | First Step |
|---|---|---|
| Agent offline | Service stopped or token invalid | systemctl status lumos-agent |
| SSL stuck provisioning | DNS not pointing to shield or port 80 blocked | dig example.com +short |
| Domain not working | DNS not propagated or config push failed | dig example.com @8.8.8.8 +short |
| WAF blocking users | Rate limit too low or false positive | Check WAF events in dashboard |
| HAProxy not reloading | Config syntax error | haproxy -c -f /etc/haproxy/haproxy.cfg |
| High latency | Origin slow or VPS too far from users | Test origin directly from shield |
| Account frozen | Insufficient credit balance | Deposit via Settings -> Billing |
| Bot protection blocking | API client or old browser | Whitelist the IP address |
| Installation fails | Wrong OS or not root | cat /etc/os-release && whoami |
| Config changes not applying | Agent offline or server issue | Restart agent to force sync |
| Port 80/443 in use | Apache/Nginx still running | `ss -tlnp |
Next Steps
- Agent Installation -- Installation guide and flags
- Agent CLI -- Agent commands including uninstall
- DNS Setup -- DNS configuration guide
- SSL/TLS -- SSL certificate management
- WAF -- WAF configuration and tuning
- Notifications -- Set up alerts for error events
- Supported OS -- System requirements and compatibility
- Credits and Billing -- Understanding billing and frozen accounts
- Architecture -- How all components work together