Fix agent offline errors, SSL provisioning failures, DNS misconfigurations, WAF blocks, and HAProxy issues. Diagnostic steps and solutions for every problem.

Troubleshooting

This page covers every common issue you may encounter with Lumos Gate and how to resolve it. Each section includes diagnostic steps, root causes, and solutions.

1. Agent Shows Offline in Dashboard

The most common issue. The agent appears as "Offline" in the dashboard even though the VPS is running.

Check Agent Service Status

systemctl status lumos-agent

If the service is not running, start it:

systemctl start lumos-agent

If it fails to start, check the logs:

journalctl -u lumos-agent -n 50 --no-pager

Check the Agent Token

The agent authenticates with the WS server using the token provided during agent installation. If the token is incorrect, was regenerated, or the server was decommissioned and recreated, the agent cannot connect.

# Check for auth errors in the logs
journalctl -u lumos-agent | grep -i "auth\|token\|unauthorized\|401"

Solution: If the token is invalid, decommission the server in the dashboard, create a new one, and reinstall the agent with the new token:

curl -fsSL https://get.lumosgate.com/install | LUMOS_TOKEN=NEW_TOKEN LUMOS_FORCE=1 bash

Check Outbound Connectivity

The agent makes an outbound WebSocket (WSS) connection to the Lumos WS server on port 443. Ensure your VPS firewall allows outbound HTTPS connections.

# Test basic connectivity to the WS server
curl -v https://lumosgate.com/health

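# Test WebSocket upgrade (101 Switching Protocols means the upgrade path works)
# --http1.1 is required because the Upgrade handshake does not exist in HTTP/2; the Sec-WebSocket-* headers complete the handshake
curl -v --http1.1 --max-time 10 -H "Connection: Upgrade" -H "Upgrade: websocket" -H "Sec-WebSocket-Version: 13" -H "Sec-WebSocket-Key: $(openssl rand -base64 16)" https://lumosgate.com/ws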
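To act on the handshake result in a script, the status code can be mapped to a likely cause. The sketch below is illustrative only: the function name ws_status_hint is hypothetical, and the hints reflect general HTTP semantics rather than documented Lumos Gate responses.

```shell
# ws_status_hint CODE: map the HTTP status of a WebSocket handshake attempt
# to a likely meaning. CODE is what curl prints for -w '%{http_code}';
# curl prints 000 when it received no HTTP response at all.
ws_status_hint() {
  case "$1" in
    101)     echo "upgrade accepted" ;;
    000)     echo "no HTTP response (DNS, firewall, or TLS problem)" ;;
    401|403) echo "server reached, request rejected (possibly the token)" ;;
    400|426) echo "server reached, handshake headers rejected" ;;
    *)       echo "unexpected status $1" ;;
  esac
}

# Example (on the VPS), reusing the curl test from this section:
#   code=$(curl -s -o /dev/null --http1.1 --max-time 10 -w '%{http_code}' \
#            -H "Connection: Upgrade" -H "Upgrade: websocket" \
#            -H "Sec-WebSocket-Version: 13" \
#            -H "Sec-WebSocket-Key: $(openssl rand -base64 16)" \
#            https://lumosgate.com/ws)
#   ws_status_hint "$code"
```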

If outbound connections are blocked, check your VPS firewall rules:

# Check iptables rules
iptables -L -n

# Check UFW status (if using UFW)
ufw status

# Ensure outbound HTTPS is allowed
ufw allow out 443/tcp

Check for Multiple Agent Instances

Ensure only one instance of the agent is running:

# The [l] bracket keeps grep from matching its own process
ps aux | grep '[l]umos-agent'

If multiple instances are running, stop them all and restart the service:

systemctl stop lumos-agent
pkill -f lumos-agent
systemctl start lumos-agent

Check DNS Resolution from the VPS

The agent needs to resolve the Lumos WS server hostname:

# Verify DNS resolution works
dig lumosgate.com +short
nslookup lumosgate.com

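If DNS resolution fails, check /etc/resolv.conf and ensure a valid nameserver is configured. You can temporarily add a public DNS resolver. An appended entry is only tried after the entries above it fail, and on systems where /etc/resolv.conf is managed by systemd-resolved or a DHCP client the change may be overwritten: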

echo "nameserver 8.8.8.8" >> /etc/resolv.conf
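Before editing the file, it can help to see which resolvers are configured and in what order. A minimal sketch, assuming a standard resolv.conf layout; list_nameservers is a hypothetical helper name.

```shell
# list_nameservers [FILE]: print the nameserver addresses from a
# resolv.conf-style file (default /etc/resolv.conf), in the order
# the resolver tries them.
list_nameservers() {
  awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Example: list_nameservers
```

An address of 127.0.0.53 in the output means lookups go through the local systemd-resolved stub, so the upstream servers are configured elsewhere (see `resolvectl status`).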

Agent Crashed or OOM Killed

Check if the agent was killed by the kernel's OOM killer:

dmesg | grep -i "oom\|lumos\|killed"
journalctl -u lumos-agent | grep -i "signal\|kill\|exit"

If the agent is being OOM-killed, your VPS may not have enough RAM. See Supported OS -- Memory for requirements.
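To see exactly which processes the OOM killer chose, the kernel log can be reduced to process names and PIDs. This is a sketch that assumes the "Out of memory: Killed process PID (name)" wording printed by recent Linux kernels; the wording can differ between kernel versions, and oom_victims is a hypothetical helper name.

```shell
# oom_victims: read kernel log text on stdin and print "name pid"
# for every process the OOM killer terminated.
oom_victims() {
  sed -nE 's/.*Out of memory: Kill(ed)? process ([0-9]+) \(([^)]+)\).*/\3 \2/p'
}

# Example (on the VPS): dmesg | oom_victims
```

If lumos-agent or haproxy appears in the output repeatedly, memory pressure is the likely cause of the offline status.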

2. Agent Won't Connect (WebSocket Issues)

The agent is running but cannot establish a WebSocket connection to the Lumos WS server.

Check the WS Server URL

The agent's configuration includes the WS server URL. If this is misconfigured, the agent will fail to connect.

journalctl -u lumos-agent | grep -i "ws\|websocket\|connect\|dial"

Firewall or NAT Issues

Some VPS providers or corporate networks drop long-lived WebSocket connections. The agent sends periodic heartbeats to keep the connection alive.

# Check if there is aggressive connection tracking
conntrack -L 2>/dev/null | wc -l

# Check conntrack timeout for established connections
sysctl net.netfilter.nf_conntrack_tcp_timeout_established 2>/dev/null

If the timeout is very low (under 300 seconds), idle WebSocket connections may be dropped. The agent handles reconnection automatically, but frequent disconnections indicate a network-level issue.

Proxy or Content Filter

Some VPS providers route traffic through a transparent proxy that interferes with WebSocket upgrades. Check with your provider if WebSocket connections are supported.

WS Server Down

If all agents across all your servers go offline simultaneously, the issue is likely with the Lumos WS server, not your agents. The agents will automatically reconnect with exponential backoff once the server is available again.

3. SSL Certificate Not Provisioning

SSL certificates are provisioned automatically via Let's Encrypt using the ACME HTTP-01 challenge. If a certificate is stuck in "provisioning" state, check the following.

DNS Must Point to the Shield VPS

The ACME HTTP-01 challenge requires that the domain resolves to the shield VPS IP address. Let's Encrypt will make an HTTP request to http://your-domain.com/.well-known/acme-challenge/... and it must reach the agent.

# Check where the domain resolves
dig +short example.com

# This should return your shield VPS IP, not your origin IP

If DNS is not pointing to the shield yet, update your DNS records first. See DNS Setup.

Port 80 Must Be Open

The HTTP-01 challenge uses port 80. Ensure it is not blocked by a firewall and HAProxy is listening:

# Check if port 80 is listening (the filter matches port 80 exactly; grep :80 would also match :8080)
ss -tlnp 'sport = :80'

# Test from outside (run this from a different machine or use an online tool)
curl -v http://your-domain.com/.well-known/acme-challenge/test

Common reasons port 80 is blocked:

VPS provider firewall (check provider dashboard/control panel)
iptables or ufw rules blocking inbound port 80
Another service (Apache, Nginx) occupying port 80

Check Agent Logs for ACME Errors

journalctl -u lumos-agent | grep -i "acme\|ssl\|certificate\|letsencrypt\|challenge"

Common ACME errors and solutions:

Error	Cause	Solution
DNS not resolving	Domain does not point to shield IP	Update A record to shield VPS IP
Rate limited	Too many certificates issued (Let's Encrypt allows 50 per registered domain per week)	Wait up to 1 week and retry
Port 80 blocked	Firewall blocking inbound HTTP	Open port 80 in firewall
Invalid domain	Domain is internal/reserved or does not exist	Use a valid public domain
Challenge failed	HTTP-01 verification request could not reach the agent	Check firewall, DNS, and HAProxy status
Authorization expired	Challenge took too long	Retry the SSL provisioning

Cloudflare Proxy Interference

If your domain is behind Cloudflare with the orange cloud (proxy) enabled, Cloudflare intercepts the ACME challenge request. Solutions:

Temporarily disable Cloudflare proxy (grey cloud / DNS-only) while provisioning the certificate
Use DNS-only mode permanently -- Lumos Gate itself is the proxy layer, so Cloudflare proxy is redundant
Set Cloudflare SSL mode to Full (Strict) if you must keep the orange cloud

See DNS Setup for Cloudflare-specific guidance.

Manual Certificate Check

If a certificate was provisioned but seems invalid or expired:

# Check the certificate details
echo | openssl s_client -connect your-domain.com:443 -servername your-domain.com 2>/dev/null | openssl x509 -noout -dates -subject

# Check certificate chain
echo | openssl s_client -connect your-domain.com:443 -servername your-domain.com 2>/dev/null | openssl x509 -noout -issuer

SSL Certificate Expiry Warning

The system checks for certificates expiring within 7 days during the health check cycle and sends an ssl_expiring notification. The agent automatically renews certificates before they expire. If you receive an expiry warning, check the agent logs for renewal errors.
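To compare a certificate against the 7-day warning threshold yourself, the notAfter date printed by openssl can be converted into days remaining. A minimal sketch, assuming GNU date (as shipped on Debian and Ubuntu); cert_days_left is a hypothetical helper name.

```shell
# cert_days_left NOT_AFTER [NOW_EPOCH]: print the whole days remaining until
# NOT_AFTER (the date string from `openssl x509 -noout -enddate`).
# NOW_EPOCH defaults to the current time.
cert_days_left() {
  local not_after now
  not_after=$1
  now=${2:-$(date +%s)}
  echo $(( ( $(date -d "$not_after" +%s) - now ) / 86400 ))
}

# Example:
#   not_after=$(echo | openssl s_client -connect your-domain.com:443 \
#                 -servername your-domain.com 2>/dev/null \
#               | openssl x509 -noout -enddate | cut -d= -f2)
#   cert_days_left "$not_after"
```

A result of 7 or less matches the window in which the ssl_expiring notification is sent.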

4. Domain Not Working After Adding

After adding a domain in Lumos Gate, it does not resolve or returns errors.

Check DNS Propagation

DNS changes can take time to propagate. Check current DNS records from multiple sources:

# Check current DNS records
dig example.com A +short

# Check from specific DNS servers
dig example.com A @8.8.8.8 +short     # Google DNS
dig example.com A @1.1.1.1 +short     # Cloudflare DNS
dig example.com A @9.9.9.9 +short     # Quad9

# Check TTL (lower TTL = faster propagation)
dig example.com A +noall +answer

If the result does not show your shield VPS IP, DNS has not propagated yet or the records are incorrect.
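The per-resolver checks above can be wrapped in a loop that compares each answer with the expected shield IP. This is a sketch only: check_propagation and resolve_a are hypothetical helper names, and the default resolver list simply mirrors the three public resolvers used above.

```shell
# resolve_a NAME RESOLVER: print the A records that RESOLVER returns for NAME.
resolve_a() {
  dig +short A "$1" "@$2"
}

# check_propagation NAME EXPECTED_IP [RESOLVER...]
# Prints one line per resolver and returns non-zero if any resolver
# does not return EXPECTED_IP.
check_propagation() {
  local name expected resolver status
  name=$1
  expected=$2
  status=0
  shift 2
  [ "$#" -gt 0 ] || set -- 8.8.8.8 1.1.1.1 9.9.9.9
  for resolver in "$@"; do
    if resolve_a "$name" "$resolver" | grep -qxF "$expected"; then
      echo "$resolver: ok"
    else
      echo "$resolver: mismatch"
      status=1
    fi
  done
  return "$status"
}

# Example: check_propagation example.com <SHIELD_VPS_IP>
```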

Tip: Set TTL to 300 (5 minutes) before changing DNS records. This ensures faster propagation. Once everything is working, you can increase the TTL.

Verify DNS Records

Ensure you have the correct A record:

Type    Name              Value              TTL
A       example.com       <SHIELD_VPS_IP>    300
A       www.example.com   <SHIELD_VPS_IP>    300  (if using www)

If you use a CNAME for a subdomain, it must ultimately resolve to the shield VPS IP.

Note: If you are using Cloudflare proxy (orange cloud), the IP returned by dig will be Cloudflare's IP, not your shield IP. For Lumos Gate to work correctly, either disable the Cloudflare proxy (grey cloud / DNS-only) or configure SSL mode to Full (Strict). See DNS Setup.

Check Config Push Reached the Agent

After adding a domain in the dashboard, a config push is sent to the WebSocket server, which forwards it to the agent. Verify the domain was added to HAProxy:

# Check if the domain exists in HAProxy config
grep "example.com" /etc/haproxy/haproxy.cfg

If the domain is not in the config, the config push may have failed. Check the agent logs:

journalctl -u lumos-agent | grep -i "config\|domain\|push\|example.com"

If the agent was offline when the config push was sent, reconnecting will trigger a full config sync. Restart the agent to force a reconnect:

systemctl restart lumos-agent

Test Direct Connection (Bypass DNS)

Bypass DNS and test directly against the shield VPS to isolate whether the issue is DNS or proxy configuration:

# Test HTTP directly against the shield VPS
curl -H "Host: example.com" http://<SHIELD_VPS_IP>/

# Test HTTPS directly (with SNI)
curl -H "Host: example.com" --resolve example.com:443:<SHIELD_VPS_IP> https://example.com/

If this returns your site content, the proxy is working and the issue is DNS-only. If it returns a 503 or connection error, the proxy configuration or origin is the problem.

Check Origin Connectivity

From the shield VPS, verify the origin server is reachable:

# Test origin from the shield VPS
curl -v http://<ORIGIN_IP>:<ORIGIN_PORT>/

If the origin is unreachable from the shield, check:

Origin firewall -- ensure the shield VPS IP is whitelisted
WireGuard tunnel -- if using encrypted origin, ensure the tunnel is up
Origin server is actually running and listening on the expected port

5. WAF Blocking Legitimate Traffic

Real users or legitimate services are being blocked by the WAF.

Check WAF Events Log

Navigate to Dashboard -> WAF and review the blocked requests log. Each entry shows:

Source IP address
Request path and method
Block reason (rate limit, IP blacklist, OWASP pattern, bot detection)
Domain
Timestamp

Identify the Block Reason and Fix

Block Reason	What Triggered It	Solution
Rate limit exceeded	Too many requests from a single IP in the configured window	Increase the rate limit threshold for the domain
IP blacklisted	The client IP is in your IP blacklist	Remove the IP from the blacklist
OWASP pattern match	Request matched a SQL injection, XSS, or path traversal pattern	Review the specific request; if it is a false positive, lower the WAF level from "high" to "medium" or "low"
Bot challenge failed	Client did not pass the JavaScript challenge	Ensure the client supports JavaScript. API clients and bots will fail JS challenges -- see below

Adjust WAF Level

The WAF level controls sensitivity. If you are getting false positives:

Navigate to Dashboard -> WAF
Find the affected domain
Change the WAF level:
- High -- Strictest, more false positives possible
- Medium -- Balanced (recommended for most sites)
- Low -- Minimal blocking, only obvious attacks

API Clients and Bot Protection

Bot protection uses a JavaScript challenge that requires a browser environment. API clients, webhooks, monitoring services, and legitimate bots (like payment processors or CI/CD systems) will fail the JS challenge.

Solutions:

Add trusted IPs to the whitelist so they bypass all WAF rules
Disable bot protection for API-only domains
Use a separate domain for API endpoints without bot protection enabled

Whitelist Trusted IPs

Add trusted IPs to the whitelist so they bypass WAF rules:

Navigate to Dashboard -> WAF -> IP Management
Add IP addresses or CIDR ranges that should be whitelisted

Warning: Only whitelist IPs you trust, such as your office network, monitoring services, known API clients, or payment processor webhook IPs.
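Before adding a CIDR range, it is worth confirming that it covers the client IP you saw in the WAF events log and nothing broader than intended. The sketch below performs that check locally for IPv4; ip_to_int and ip_in_cidr are hypothetical helper names and are not part of Lumos Gate.

```shell
# ip_to_int A.B.C.D: convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  local old_ifs
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# ip_in_cidr IP CIDR: succeed if IP falls inside CIDR (IPv4 only).
# A bare address without a prefix length is treated as /32.
ip_in_cidr() {
  local ip net bits mask
  ip=$1
  net=${2%/*}
  bits=${2#*/}
  if [ "$bits" = "$2" ]; then bits=32; fi
  if [ "$bits" -eq 0 ]; then
    mask=0
  else
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  fi
  [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

# Example: ip_in_cidr 203.0.113.25 203.0.113.0/24 && echo "covered"
```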

Disable WAF for a Domain

If you need to quickly stop blocking while you investigate:

Navigate to Dashboard -> WAF
Toggle WAF off for the specific domain

WAF is toggled per-domain, so disabling it for one domain does not affect others. Re-enable it once you have adjusted the rules.

6. HAProxy Not Reloading

The agent generates HAProxy configurations and reloads the process. If reloads fail, your latest domain or WAF changes will not take effect.

Check Agent Logs

journalctl -u lumos-agent | grep -i "reload\|haproxy\|error\|rollback"

Common reload errors:

Error	Cause	Solution
Configuration syntax error	Generated config has an issue	Agent auto-rolls back; check logs for the specific syntax error
Port already in use	Another process on port 80 or 443	Find and stop the conflicting process (see section 11)
Permission denied	Agent lost root privileges	Check agent service user configuration
File not found	HAProxy binary missing	Reinstall HAProxy: `apt install -y haproxy`

Verify HAProxy Status

# Check HAProxy service status
systemctl status haproxy

# Test the current config for syntax errors
haproxy -c -f /etc/haproxy/haproxy.cfg

Automatic Rollback

The agent implements automatic config rollback:

Current config is backed up in memory
New config is written to /etc/haproxy/haproxy.cfg
HAProxy reload is attempted
If reload fails, the backup is restored and HAProxy is reloaded with the old config
A haproxy_reload_failed error is reported to the dashboard via notification

All config writes and reloads are serialized under a single mutex to prevent race conditions. If you see repeated reload failures, check the agent logs for the specific HAProxy error message.
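The rollback sequence can be reproduced by hand when testing a config change outside the agent. The following is a sketch of the same steps, not the agent's implementation: it keeps the backup in a temporary file rather than in memory, does no locking, and the function names are hypothetical.

```shell
# reload_haproxy CFG: validate CFG, then ask systemd to reload HAProxy.
reload_haproxy() {
  haproxy -c -f "$1" >/dev/null && systemctl reload haproxy
}

# apply_haproxy_config NEW_CFG [LIVE_CFG]
# Installs NEW_CFG as the live config and rolls back if the reload fails.
apply_haproxy_config() {
  local new_cfg live_cfg backup
  new_cfg=$1
  live_cfg=${2:-/etc/haproxy/haproxy.cfg}
  backup=$(mktemp) || return 1
  cp "$live_cfg" "$backup" || return 1    # 1. back up the current config
  cp "$new_cfg" "$live_cfg"               # 2. write the new config
  if reload_haproxy "$live_cfg"; then     # 3. attempt the reload
    rm -f "$backup"
    echo "applied"
    return 0
  fi
  cp "$backup" "$live_cfg"                # 4. restore the backup
  reload_haproxy "$live_cfg"              #    and reload the old config
  rm -f "$backup"
  echo "rolled back"
  return 1
}

# Example (as root): apply_haproxy_config /root/candidate.cfg
```

Because the agent also writes this file, only run a manual apply while the agent is stopped, or the two writers may overwrite each other.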

Manual Config Validation

# Validate the current config
haproxy -c -f /etc/haproxy/haproxy.cfg

# If invalid, check what was written
head -100 /etc/haproxy/haproxy.cfg

Manual Restart (Last Resort)

As a last resort, you can manually restart HAProxy:

systemctl restart haproxy

Warning: Restarting HAProxy (as opposed to reloading) causes a brief interruption in active connections. HAProxy reload is zero-downtime; restart is not. Only restart if reload is not working.

7. HAProxy Health Check Failures

The agent monitors HAProxy health every 10 seconds. If HAProxy crashes, the agent automatically restarts it and sends a haproxy_crash notification.

Check for Repeated Crashes

journalctl -u lumos-agent | grep -i "crash\|restart\|health\|haproxy.*down"
journalctl -u haproxy -n 50 --no-pager

Common Crash Causes

Cause	Solution
Out of memory	Upgrade VPS RAM or reduce concurrent connections
Too many open files	Check `ulimit -n`; edge-setup should have raised this
Corrupted config	Agent will auto-rollback; check logs
HAProxy binary updated externally	Avoid running `apt upgrade haproxy` independently

Check HAProxy Resource Usage

# Check HAProxy memory usage
ps aux | grep haproxy

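# Check open file descriptors (HAProxy usually runs a master and one or more workers, so print one line per PID)
for pid in $(pgrep -x haproxy); do echo "$pid: $(ls /proc/$pid/fd 2>/dev/null | wc -l)"; done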

# Check connection count
ss -s

8. High Latency Through Proxy

Traffic through the shield VPS has noticeably higher latency than direct connections.

Check Origin Server Response Time

The shield adds a network hop, but most latency usually comes from the origin:

# Measure time through the shield
curl -o /dev/null -s -w "Total: %{time_total}s\nConnect: %{time_connect}s\nTTFB: %{time_starttransfer}s\n" https://example.com

# Measure time direct to origin (from the shield VPS itself)
curl -o /dev/null -s -w "Total: %{time_total}s\nConnect: %{time_connect}s\nTTFB: %{time_starttransfer}s\n" http://<ORIGIN_IP>:<ORIGIN_PORT>

If the origin TTFB is high, the issue is not with the proxy.
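To put a number on the difference, subtract the origin TTFB from the TTFB measured through the shield. A minimal sketch; added_latency_ms is a hypothetical helper name, and both arguments are the %{time_starttransfer} values (in seconds) from the two curl commands above.

```shell
# added_latency_ms VIA_SHIELD_TTFB DIRECT_TTFB
# Prints the difference in whole milliseconds.
added_latency_ms() {
  awk -v via="$1" -v direct="$2" 'BEGIN { printf "%.0f\n", (via - direct) * 1000 }'
}

# Example: added_latency_ms 0.250 0.200
```

The result includes the client-to-shield network path and the TLS handshake, so it is an upper bound on what the proxy itself adds.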

Consider VPS Location

The physical distance between user, shield, and origin affects latency:

Good:  User (EU) -> Shield (EU) -> Origin (EU)       ~5ms added
OK:    User (EU) -> Shield (EU) -> Origin (US)        ~100ms added
Bad:   User (EU) -> Shield (US) -> Origin (EU)        ~200ms added (unnecessary round trip)

Place your shield VPS in the same region as the majority of your users, or as close to the origin as possible. See VPS Providers for providers with multiple regions and Multiple Servers for multi-region setups.

WireGuard Overhead

If you are using WireGuard to encrypt traffic between the shield and origin, expect approximately 3-5% overhead due to encryption and encapsulation. This is generally negligible for most workloads.

Check VPS Resources

Ensure your shield VPS has enough resources:

# CPU usage
top -bn1 | head -10

# Memory usage
free -h

# Network throughput
iftop -t -s 5 2>/dev/null || echo "Install iftop: apt install iftop"

# Check if the VPS is swapping (bad for performance)
swapon --show
vmstat 1 5

If the VPS is resource-constrained, consider upgrading the VPS tier or distributing traffic across multiple servers.

Check Kernel Tuning

If you installed with LUMOS_NO_TUNE=1, kernel tuning was skipped. This can cause performance issues under load:

# Check if BBR is active
sysctl net.ipv4.tcp_congestion_control
# Should output: net.ipv4.tcp_congestion_control = bbr

# Check connection tracking limits
sysctl net.netfilter.nf_conntrack_max 2>/dev/null

You can re-run the edge setup script to apply tuning:

curl -fsSL https://get.lumosgate.com/edge-setup.sh | bash

See Supported OS -- Edge Setup for details.

9. Bot Protection Blocking Real Users

The bot protection system uses a JavaScript challenge with HMAC cookie verification. Some legitimate users or clients may fail this challenge.

Who Gets Blocked

Users with JavaScript disabled in their browser
Very old browsers that do not support modern JS
API clients making direct HTTP requests (no browser environment)
Automated monitoring tools (Pingdom, UptimeRobot, etc.)
Payment processor webhooks (Stripe, PayPal, etc.)
Search engine crawlers (though major crawlers are usually whitelisted by user agent)

Solutions

Whitelist known IPs -- Add the IP addresses of your monitoring services, API clients, and webhook sources to the IP whitelist. Whitelisted IPs bypass all WAF and bot protection checks.
Disable bot protection per domain -- If a domain serves primarily API traffic, disable bot protection for that domain. You can still keep WAF rules active.
Separate API and web domains -- Use api.example.com for API traffic (no bot protection) and example.com for web traffic (with bot protection).

Verifying Bot Protection Is the Issue

Check the WAF events log in the dashboard. If the block reason is "Bot challenge failed", bot protection is the cause. The blocked request entry will show the IP and request path.

You can also test from the command line:

# This will fail bot protection (no JS engine)
curl -v https://example.com/

# Check if you get a 403 or a JS challenge page

10. Account Frozen

Your dashboard shows a frozen account banner and you cannot make configuration changes.

Why It Happens

The account is frozen when the automatic billing deduction fails due to insufficient credit balance. The system attempted to deduct your plan's monthly price and your balance was too low.

Your Sites Are Still Online

Existing proxy configurations continue to work. HAProxy on your shield servers keeps running with the last known good configuration. Your sites remain online and accessible. No stop signal is sent to your agents.

How to Unfreeze

Navigate to Dashboard -> Settings -> Billing (you can still access this while frozen)
Click Deposit
Select an amount and complete the USDT payment
Once the payment confirms on-chain, your balance updates
If the new balance is at least your plan's monthly price, the account unfreezes automatically
All mutation operations are re-enabled within seconds

Note: Auto-unfreeze happens as soon as the blockchain transaction confirms your deposit. No manual action is needed beyond sending the payment.

Cannot Deposit While Frozen?

If the deposit button does not appear or the billing tab is not loading, try:

Clear your browser cache and reload the dashboard
Try a different browser
Check browser console for JavaScript errors (F12 -> Console)

The deposit endpoint is accessible even while frozen, so it should work. If you still cannot deposit, contact support.

Emergency Domain Changes While Frozen

You can still change origin IP addresses on existing domains while frozen. This is intentionally allowed for emergency situations (for example, if an origin server goes down and you need to redirect traffic). Navigate to the domain detail page and update the origin servers.

See Credits -- Account Freezing and Account -- Frozen Accounts for complete details.

11. HAProxy Won't Start

HAProxy fails to start, blocking all proxy traffic.

Port Conflict

The most common cause is another service occupying ports 80 or 443:

# Find what is using port 80 (the filter matches the port exactly; grep :80 would also match :8080)
ss -tlnp 'sport = :80'

# Find what is using port 443
ss -tlnp 'sport = :443'

Common conflicting services:

Service	How to Stop
Apache2	`systemctl stop apache2 && systemctl disable apache2`
Nginx	`systemctl stop nginx && systemctl disable nginx`
Caddy	`systemctl stop caddy && systemctl disable caddy`
Another HAProxy	`pkill haproxy` then restart via systemd
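To pull just the process name and PID out of the ss output, for example to decide which service from the table to stop, the users:(("name",pid=N,...)) column can be parsed. A sketch assuming the usual `ss -tlnp` output layout; listeners_on is a hypothetical helper name.

```shell
# listeners_on PORT: read `ss -tlnp` output on stdin and print "process pid"
# for the first process attached to each socket listening on PORT.
listeners_on() {
  awk -v port="$1" '
    $1 == "LISTEN" && $4 ~ (":" port "$") {
      if (match($0, /\(\("[^"]+",pid=[0-9]+/)) {
        s = substr($0, RSTART + 3, RLENGTH - 3)   # name",pid=1234
        split(s, parts, /",pid=/)
        print parts[1], parts[2]
      }
    }'
}

# Example (as root, so that process names are visible):
#   ss -tlnp | listeners_on 80
```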

Config Syntax Error

# Validate config
haproxy -c -f /etc/haproxy/haproxy.cfg

# The error output will show the exact line and issue

If the config is corrupted, the agent's automatic rollback should have restored the previous working config. If it did not, you can check if a backup exists:

# Look for backup configs
ls -la /etc/haproxy/haproxy.cfg*

Missing HAProxy Binary

which haproxy
haproxy -v

If HAProxy is not installed, install it:

apt update && apt install -y haproxy
systemctl enable haproxy
systemctl restart lumos-agent

Permissions Issue

# Check HAProxy config file permissions
ls -la /etc/haproxy/haproxy.cfg

# Should be readable by haproxy user/group
# Fix if needed
chmod 644 /etc/haproxy/haproxy.cfg
chown root:root /etc/haproxy/haproxy.cfg

12. Config Push Not Working

You make changes in the dashboard (add domain, change WAF rules, etc.) but the changes do not reach the agent.

Verify the Config Push Chain

The config push chain is: Dashboard API -> WebSocket Server -> Agent WebSocket -> HAProxy reload

A failure at any point breaks the chain.

Check Agent Connection

First, verify the agent is connected (appears online in dashboard). If offline, see section 1.

Force a Config Sync

Restart the agent to force a full config sync on reconnect:

systemctl restart lumos-agent

The agent requests the full configuration from the WS server upon every reconnect, so a restart effectively forces a fresh config sync.

Check Agent Logs for Config Updates

journalctl -u lumos-agent | grep -i "config\|push\|update\|received"

If you see "config received" but no HAProxy reload, the issue is in HAProxy config generation or reload. See section 6.

13. Cannot Delete a Domain

You try to delete a domain but get an error.

Account Frozen

If your account is frozen, all mutation operations (including deletion) are blocked. Deposit credits to unfreeze first.

API Error

Open the browser developer tools (F12 -> Network tab) and inspect the error response from the API. Common errors:

HTTP Status	Meaning	Solution
403	Account frozen	Deposit credits to unfreeze
404	Domain not found	Refresh the page; it may already be deleted
500	Server error	Try again; check server logs if it persists

Domain Still in Use

If the domain has active traffic or pending SSL provisioning, the deletion should still work. There is no "in use" block. If deletion fails, try again after a few seconds.

14. Agent Installation Fails

The installation script exits with an error.

Check OS Requirements

The agent requires Debian 12+ or Ubuntu 24.04+:

cat /etc/os-release

If you are running a different distribution, it is not currently supported. See Supported OS.
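The version requirement can be checked in a script before running the installer. This is a sketch based only on the requirement stated above (Debian 12+ or Ubuntu 24.04+); os_supported is a hypothetical helper name, the installer's own checks may differ, and `sort -V` assumes GNU coreutils.

```shell
# os_supported [FILE]: succeed if FILE (default /etc/os-release) describes
# Debian 12 or newer, or Ubuntu 24.04 or newer.
os_supported() {
  local file id version
  file=${1:-/etc/os-release}
  id=$(. "$file" && echo "$ID")
  version=$(. "$file" && echo "$VERSION_ID")
  case "$id" in
    debian) [ "${version%%.*}" -ge 12 ] 2>/dev/null ;;
    # sort -C succeeds only if 24.04 sorts at or before the installed version
    ubuntu) printf '%s\n%s\n' "24.04" "$version" | sort -V -C ;;
    *)      return 1 ;;
  esac
}

# Example: os_supported && echo "supported" || echo "not supported"
```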

Check Root Access

The installer must run as root:

whoami
# Should output: root

If not root, use sudo:

curl -fsSL https://get.lumosgate.com/install | LUMOS_TOKEN=YOUR_TOKEN sudo -E bash

Check curl

The installer requires curl:

curl --version

If not installed:

apt update && apt install -y curl

Existing HAProxy Detected

If HAProxy is already installed, the installer shows the existing configuration statistics (number of frontends, backends, lines) and asks for confirmation. To skip the interactive prompt:

curl -fsSL https://get.lumosgate.com/install | LUMOS_TOKEN=YOUR_TOKEN LUMOS_FORCE=1 bash

The LUMOS_FORCE=1 flag bypasses the confirmation prompt. The existing HAProxy configuration is still backed up before any changes are made. After installation, you can import existing sites via Detected Sites.

Network Errors

If the installer cannot download the agent binary:

# Test connectivity to the download server
curl -v https://get.lumosgate.com/

# Check DNS resolution
dig get.lumosgate.com +short

Disk Full

df -h /

If less than 100 MB is free, clear space before installing.

Package Lock (apt)

If another apt process is running:

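# Check for running apt or dpkg processes
ps aux | grep -E '[a]pt|[d]pkg'

# Wait for it to finish. If it is stuck, find the process holding the lock and stop it
# (the lock files do not contain a PID; fuser is part of the psmisc package):
fuser -v /var/lib/dpkg/lock-frontend
kill <PID>

# Only when no apt or dpkg process remains, remove stale locks and repair the package state:
rm -f /var/lib/dpkg/lock-frontend /var/lib/dpkg/lock /var/cache/apt/archives/lock
dpkg --configure -a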

15. Agent Update

How to update the Lumos Gate agent to the latest version.

Automatic Updates

The agent does not auto-update. You must manually update when a new version is available.

Update Procedure

Re-run the installation script with the LUMOS_FORCE=1 flag. This downloads the latest binary and restarts the service while preserving your configuration:

curl -fsSL https://get.lumosgate.com/install | LUMOS_TOKEN=YOUR_TOKEN LUMOS_FORCE=1 bash

Note: The LUMOS_FORCE=1 flag is required when the agent is already installed. It skips the existing HAProxy confirmation prompt. Your encrypted agent configuration and HAProxy config are preserved.

Verify the Update

# Check the agent is running
systemctl status lumos-agent

# Check agent logs for the new version
journalctl -u lumos-agent -n 20 --no-pager

16. Connection Drops / Agent Keeps Reconnecting

The agent disconnects and reconnects frequently.

Check VPS Network Stability

# Send 100 probes and watch for latency spikes or gaps
ping -c 100 lumosgate.com

# Print only the summary (packet loss and round-trip statistics)
ping -c 50 -q lumosgate.com

If you see packet loss above 1-2%, the VPS network may be unstable. Contact your VPS provider.
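For repeated checks (for example from cron), the loss percentage can be extracted from the ping summary line. A sketch assuming the summary format printed by the iputils ping found on Debian and Ubuntu; packet_loss_pct is a hypothetical helper name.

```shell
# packet_loss_pct: read ping output on stdin and print the packet loss
# percentage from the summary line (without the % sign).
packet_loss_pct() {
  sed -nE 's/.* ([0-9.]+)% packet loss.*/\1/p'
}

# Example: ping -c 50 -q lumosgate.com | packet_loss_pct
```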

Check Agent Reconnection Logs

journalctl -u lumos-agent | grep -i "connect\|disconnect\|reconnect\|backoff"

The agent has built-in automatic reconnection with exponential backoff. Occasional disconnections are normal (network blips, WS server restarts during deployments). Frequent disconnections (more than a few per hour) indicate a persistent network issue.
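When reading reconnection logs, it helps to know what an exponential backoff schedule looks like: the wait doubles after each failed attempt until it reaches a ceiling. The sketch below illustrates the general technique only; the 1-second base and 60-second cap are assumptions, not the agent's actual parameters, and backoff_delay is a hypothetical name.

```shell
# backoff_delay ATTEMPT [BASE] [CAP]
# Prints the seconds to wait before retry number ATTEMPT (0-based):
# BASE * 2^ATTEMPT, limited to CAP.
backoff_delay() {
  local attempt base cap delay
  attempt=$1
  base=${2:-1}
  cap=${3:-60}
  # Clamp the exponent so the shift cannot overflow during long outages.
  if [ "$attempt" -gt 20 ]; then attempt=20; fi
  delay=$(( base << attempt ))
  if [ "$delay" -gt "$cap" ]; then delay=$cap; fi
  echo "$delay"
}

# Example: for n in 0 1 2 3 4 5 6 7; do backoff_delay "$n"; done
```

With these assumed values the waits are 1, 2, 4, 8, 16, 32 seconds and then 60 seconds for every later attempt, so reconnect log entries that spread out over time are expected during an outage.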

Aggressive NAT/Firewall Timeout

Some networks drop idle TCP connections. The agent sends periodic heartbeats, but if the timeout is very aggressive (under 60 seconds), connections may still drop. This is common on some budget VPS providers.

17. DNS Failover Not Working

DNS failover is configured but does not trigger when the primary server goes down.

Check Plan

DNS failover requires the Pro or Enterprise plan and at least 2 servers.

Check Health Check Status

The WebSocket server triggers health checks every 5 minutes. Check if health checks are running:

Navigate to Dashboard -> Servers and check server status indicators
Check notifications for server_down alerts

Check DNS Provider Configuration

DNS failover requires a configured DNS provider (Cloudflare). Verify in Dashboard -> Domains -> [domain] -> DNS that the DNS provider is connected.

Timing

Failover is not instant. The health check runs every 5 minutes, so in the worst case it takes up to 5 minutes to detect a failure, plus DNS propagation time (typically 1-5 minutes with low TTL).

18. Detected Sites Not Showing

After installing the agent on a server with existing HAProxy configuration, the Detected Sites page shows no sites.

Agent Must Send Backup Config

The agent sends the existing HAProxy configuration to the WS server upon first connection. The dashboard parses this to find existing sites.

Ensure the agent has connected at least once
Check agent logs for backup config upload: journalctl -u lumos-agent | grep -i "backup\|existing\|config"
If the agent was installed with LUMOS_FORCE=1 on a fresh system (no existing HAProxy config), there are no sites to detect

Already Managed Domains

Sites that you have already added as domains in Lumos Gate are marked as "already managed" and will appear differently in the detected sites list.

Logs and Diagnostics

Agent Logs

Agent logs are the primary diagnostic tool; most issues can be diagnosed from them:

# Recent logs (last 50 lines)
journalctl -u lumos-agent -n 50 --no-pager

# Follow logs in real-time
journalctl -u lumos-agent -f

# Logs from a specific time range
journalctl -u lumos-agent --since "1 hour ago"

# Filter for errors only
journalctl -u lumos-agent -p err --no-pager

HAProxy Logs

# HAProxy service logs
journalctl -u haproxy -n 50 --no-pager

# HAProxy access logs (if configured to syslog)
tail -100 /var/log/haproxy.log 2>/dev/null

HAProxy Config Validation

# Validate current config
haproxy -c -f /etc/haproxy/haproxy.cfg

# Show current config
cat /etc/haproxy/haproxy.cfg

System Diagnostics

# Full system overview
systemctl status lumos-agent
systemctl status haproxy
ss -tlnp | grep -E ':(80|443)[[:space:]]'
free -h
df -h /
uname -r
cat /etc/os-release

Dashboard Notifications

Error events from the agent are reported to the dashboard via the notification system. Check Dashboard -> Notifications for:

server_down -- Agent disconnected
server_error -- HAProxy crash, config update failed, reload failed
ssl_expiring -- Certificate expiring within 7 days

Collecting Diagnostics for Support

When reporting an issue to support (Pro/Enterprise plans), include the output of these commands:

echo "=== Agent Status ==="
systemctl status lumos-agent

echo "=== Agent Logs (last 100 lines) ==="
journalctl -u lumos-agent -n 100 --no-pager

echo "=== HAProxy Status ==="
systemctl status haproxy

echo "=== HAProxy Config Validation ==="
haproxy -c -f /etc/haproxy/haproxy.cfg

echo "=== HAProxy Version ==="
haproxy -v

echo "=== OS Info ==="
cat /etc/os-release

echo "=== Kernel ==="
uname -r

echo "=== Memory ==="
free -h

echo "=== Disk ==="
df -h /

echo "=== Ports ==="
ss -tlnp | grep -E ':(80|443)[[:space:]]'

echo "=== Architecture ==="
uname -m
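To capture everything in one file without the run stopping at the first missing command, each section can go through a small wrapper. This is a sketch; run_section is a hypothetical helper name and lumos-diagnostics.txt is an arbitrary output file name.

```shell
# run_section TITLE COMMAND [ARGS...]
# Prints a header, runs the command with stderr included, and keeps going
# if the command fails or is not installed.
run_section() {
  local title
  title=$1
  shift
  echo "=== $title ==="
  "$@" 2>&1 || echo "(command failed or not available: $*)"
  echo
}

# Example (on the VPS):
#   {
#     run_section "Agent Status" systemctl status lumos-agent --no-pager
#     run_section "Agent Logs (last 100 lines)" journalctl -u lumos-agent -n 100 --no-pager
#     run_section "HAProxy Config Validation" haproxy -c -f /etc/haproxy/haproxy.cfg
#     run_section "Memory" free -h
#     run_section "Architecture" uname -m
#   } > lumos-diagnostics.txt
```

Review the file before sending it, since logs and configs can contain IP addresses and domain names.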

19. Agent Binary Not Found (404)

The agent installer downloads the binary from get.lumosgate.com. If the download returns a 404 error, the binary is not available for your platform.

Causes

CDN not configured -- The get.lumosgate.com CDN endpoint has not been set up yet, or the binary has not been published for the current release.
Unsupported architecture -- Agent binaries are built for linux-amd64 and linux-arm64 only. Other architectures (e.g., armv7, i386) are not supported.

Check Your Architecture

uname -m
# Expected: x86_64 (amd64) or aarch64 (arm64)
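The mapping from `uname -m` output to the binary names listed above can be scripted. A minimal sketch; agent_arch is a hypothetical helper name and only the two architectures named on this page are handled.

```shell
# agent_arch [MACHINE]: map a `uname -m` value (default: this machine)
# to the agent build name, or fail for unsupported architectures.
agent_arch() {
  local machine
  machine=${1:-$(uname -m)}
  case "$machine" in
    x86_64)  echo amd64 ;;
    aarch64) echo arm64 ;;
    *)       echo "unsupported: $machine" >&2; return 1 ;;
  esac
}

# Example: agent_arch || echo "no prebuilt agent binary for this machine"
```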

Workaround: Build from Source

If the CDN binary is not available, you can build the agent from source on any machine with Go installed:

cd agent
GOOS=linux GOARCH=amd64 go build -o lumos-agent ./cmd/lumos-agent/

For ARM servers:

GOOS=linux GOARCH=arm64 go build -o lumos-agent ./cmd/lumos-agent/

Then transfer the binary to your VPS and place it at /usr/local/bin/lumos-agent.

See Supported OS for the full list of supported architectures and operating systems.

Quick Reference

Symptom	Most Likely Cause	First Step
Agent offline	Service stopped or token invalid	`systemctl status lumos-agent`
SSL stuck provisioning	DNS not pointing to shield or port 80 blocked	`dig example.com +short`
Domain not working	DNS not propagated or config push failed	`dig example.com @8.8.8.8 +short`
WAF blocking users	Rate limit too low or false positive	Check WAF events in dashboard
HAProxy not reloading	Config syntax error	`haproxy -c -f /etc/haproxy/haproxy.cfg`
High latency	Origin slow or VPS too far from users	Test origin directly from shield
Account frozen	Insufficient credit balance	Deposit via Settings -> Billing
Bot protection blocking	API client or old browser	Whitelist the IP address
Installation fails	Wrong OS or not root	`cat /etc/os-release && whoami`
Config changes not applying	Agent offline or server issue	Restart agent to force sync
Port 80/443 in use	Apache/Nginx still running	`ss -tlnp` (see section 11)

Next Steps

Agent Installation -- Installation guide and flags
Agent CLI -- Agent commands including uninstall
DNS Setup -- DNS configuration guide
SSL/TLS -- SSL certificate management
WAF -- WAF configuration and tuning
Notifications -- Set up alerts for error events
Supported OS -- System requirements and compatibility
Credits and Billing -- Understanding billing and frozen accounts
Architecture -- How all components work together
