Troubleshooting & FAQ
Common issues, what causes them, and how to fix them.
Ingest returns 429 Too Many Requests
The writer queue is full. This is backpressure working as intended: the compatibility shims return 429, while the native /api/ingest blocks instead. The client should retry with backoff. If the 429s are sustained, raise HELIOS_BLOCK_QUEUE_CAP and restart, or investigate why flushes are slow. See Performance tuning.
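The retry advice above can be sketched as a small client-side helper. This is a minimal illustration, not part of HeliosLogs: send_with_backoff, its parameters, and the choice of exponential backoff with full jitter are all assumptions, and send stands in for whatever function your shipper uses to post a batch and return the HTTP status.

```python
import random
import time
from typing import Callable


def send_with_backoff(
    send: Callable[[], int],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns a status other than 429 or attempts run out.

    `send` is a hypothetical callable that posts one batch and returns the
    HTTP status code. The last status seen is returned to the caller.
    """
    status = 429
    for attempt in range(max_attempts):
        status = send()
        if status != 429:
            return status
        if attempt < max_attempts - 1:
            # Exponential backoff with full jitter, capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, delay))
    return status
```

If the helper still returns 429 after the final attempt, the queue is staying full and the server-side causes above are the next thing to check.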
Ingest returns 400 "unknown environment"
The target environment must exist first. Create it under Admin → Environments, or ingest into default. Note that environments starting with _ are reserved.
Events don't show up in search
- Time range: events route by their event timestamp; widen the search window to where the data actually falls.
- Wrong env/index: confirm ?env= / ?index= (or the token's pinned env) match what you're searching. Try * | stats count by index.
- Parse errors: check the ingest response's errors count; malformed lines are counted, not ingested.
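For the parse-error check, a shipper-side guard might look like the sketch below. The errors field name comes from the text above, but the rest is assumed: that the ingest response body is a JSON object, and that a missing field means zero. The accepted key in the usage comment is purely illustrative.

```python
import json


def count_rejected_lines(response_body: str) -> int:
    """Return the number of malformed lines reported by an ingest response.

    Assumes the body is a JSON object carrying an integer `errors` field;
    the real response shape may differ.
    """
    payload = json.loads(response_body)
    return int(payload.get("errors", 0))


# Hypothetical usage:
# rejected = count_rejected_lines('{"accepted": 3, "errors": 2}')
# if rejected:
#     print(f"{rejected} line(s) were counted as malformed and not ingested")
```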
Logged out / 401 after a restart
The JWT secret was regenerated or isn't shared. If it isn't persisted (or, in a cluster, isn't identical on every node), tokens stop validating. Pin HELIOS_JWT_SECRET_PATH to a stable, shared file.
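One way to confirm that every node reads the same secret is to compare a short fingerprint of the file on each node, without printing the secret itself. The helper below is a sketch under that assumption; secret_fingerprint is a hypothetical name, and the path you pass is whatever HELIOS_JWT_SECRET_PATH points to on that node.

```python
import hashlib
from pathlib import Path


def secret_fingerprint(path: str) -> str:
    """Return a truncated SHA-256 digest of the file at `path`.

    Identical files give identical fingerprints, so the output can be
    compared across nodes without exposing the secret's contents.
    """
    digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    return digest[:16]
```

Run it on each node against the configured path; any node whose fingerprint differs is not sharing the secret.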
Multi-node: data or settings not converging
- Sync interval: cross-node visibility is eventual (≈ HELIOS_BLOCK_SYNC_SECS, default 10s). Give it a moment.
- Mismatched secrets: every node needs the same control key and JWT secret, and the same HELIOS_CONTROL_ENCRYPTION setting. A control-key mismatch means a node can't decrypt the shared control plane. See Multi-node.
Retention isn't deleting old data
- No retention is configured: set a global default (HELIOS_RETENTION_DEFAULT_DAYS or Admin → General) or a per-env override. Unset = keep forever.
- The sweep runs on an interval (HELIOS_RETENTION_SWEEP_SECS, default hourly). To force it now, call POST /api/admin/gc. See Indexes & retention.
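Forcing a sweep could be scripted roughly as follows. Only the POST /api/admin/gc route comes from the text above; the base URL, the use of a bearer token for admin authentication, and the helper name build_gc_request are assumptions to adapt to your deployment.

```python
import urllib.request


def build_gc_request(base_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the retention sweep endpoint.

    Bearer-token auth is an assumption here; use whatever credentials
    your admin API actually expects.
    """
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/admin/gc",
        data=b"",
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )


# Hypothetical usage (host, port, and token are placeholders):
# req = build_gc_request("http://localhost:8080", "ADMIN_TOKEN")
# with urllib.request.urlopen(req, timeout=30) as resp:
#     print(resp.status)
```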
FIPS binary won't start
A FIPS build aborts if the validated module can't load. On macOS, set DYLD_FALLBACK_LIBRARY_PATH to the directory of libaws_lc_fips_*crypto.dylib. On Linux/Docker this is handled for you. Confirm via the crypto provider in Admin → General.
The agent or AI monitors don't work
The LLM provider isn't configured or enabled. Set a provider, enter credentials, run Test connection, and enable the agent. Check the _helioslogs self-log for provider errors.
An MCP client can't connect
- Use the right URL: POST /mcp.
- Include Authorization: Bearer <token> if an MCP auth token is set.
- Check the env/index and tool allowlists; a hidden tool is also rejected if called.

Smoke-test with the tools/list curl from the MCP page.
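For scripted checks, the same smoke test can be approximated in Python. This sketch builds a JSON-RPC 2.0 tools/list request, the message shape MCP uses, against the POST /mcp route named above. The base URL, the Accept header value, and the helper name are assumptions, and some MCP servers expect an initialize handshake before tools/list; the curl on the MCP page remains the reference.

```python
import json
import urllib.request
from typing import Optional


def build_tools_list_request(
    base_url: str, token: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC tools/list call to /mcp."""
    body = json.dumps(
        {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
    ).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
    }
    if token:
        # Only needed when an MCP auth token is set.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/mcp", data=body, headers=headers, method="POST"
    )


# Hypothetical usage (host, port, and token are placeholders):
# req = build_tools_list_request("http://localhost:8080", "MCP_TOKEN")
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.status, resp.read()[:200])
```

A 401 here points at the token, while a successful response that omits an expected tool points at the allowlists.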
SAML login fails
The user sees a generic error; the real reason is logged. Search _helioslogs for saml_* events. Common causes: certificate/audience mismatch, an expired or replayed assertion, or no matching HeliosLogs user (SAML is match-only — create the user first).
Forgot the admin password
Set HELIOS_ADMIN_RESET=1 together with HELIOS_ADMIN_PASSWORD and restart; the admin password is reset and sessions revoked. Unset it again afterward. See First steps.
FAQ
Do I need a database? No — HeliosLogs is a single binary; the control plane is encrypted JSON on disk (or in the shared store).
How do I scale out / get HA? Point multiple nodes at one shared store and share the secret files.
How do I back up? Replicate the data dir / shared store and back up the two secret files. See Upgrades & backups.
Can existing shippers send to HeliosLogs? Yes — it accepts Elasticsearch, Splunk HEC, Loki, OTLP, and syslog. See Compatibility endpoints.