Performance tuning
HeliosLogs runs well at its defaults — start there and change knobs only when you have a reason. This page explains what each storage and query knob does and when to touch it. Exact names and defaults are in the Configuration reference; the configuration model explains precedence (env var > control setting > default) and which knobs are live vs. restart-only.
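Most knobs below are editable live in Admin → General → Server tunables (an env var, if set, wins and locks the field). The exceptions are startup-fixed and need a restart: query threads (HELIOS_QUERY_THREADS), the ingest queue depth (HELIOS_BLOCK_QUEUE_CAP), and flush concurrency (HELIOS_BLOCK_FLUSH_CONCURRENCY).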
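The precedence rule can be pictured with a minimal sketch. The function name `resolve_knob` and the dictionaries are hypothetical illustrations, not part of HeliosLogs; only the ordering (env var, then control setting, then default) and the locking behavior come from this page.

```python
import os


def resolve_knob(name, control_settings, defaults, environ=None):
    """Return (value, source, locked) for one knob.

    Hypothetical helper: mirrors the documented precedence
    env var > control setting > default.
    """
    environ = os.environ if environ is None else environ
    if name in environ:
        # An env var wins and locks the field in the admin UI.
        return environ[name], "env", True
    if name in control_settings:
        return control_settings[name], "control", False
    return defaults[name], "default", False


defaults = {"HELIOS_BLOCK_COMPACT_SECS": "30"}
control = {"HELIOS_BLOCK_COMPACT_SECS": "60"}

# No env var set: the control setting applies and stays editable.
print(resolve_knob("HELIOS_BLOCK_COMPACT_SECS", control, defaults, environ={}))

# Env var set: it overrides the control setting and locks the field.
print(resolve_knob("HELIOS_BLOCK_COMPACT_SECS", control, defaults,
                   environ={"HELIOS_BLOCK_COMPACT_SECS": "120"}))
```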
Storage engine
The engine writes immutable blocks and merges small ones into larger ones in the background. The knobs trade write amplification against query cost:
| Knob | Default | Raise it to… | Lower it to… |
|---|---|---|---|
| HELIOS_BLOCK_TARGET_MB | 64 MB | Fewer, larger blocks (better scan locality) | Smaller blocks |
| HELIOS_BLOCK_MIN_COMPACT_MB | 5 MB | Compact less aggressively | Merge smaller groups sooner |
| HELIOS_BLOCK_MAX_SMALL_BLOCKS | 100 | Tolerate more tiny files before forcing a merge | Force merges sooner (fewer files) |
| HELIOS_BLOCK_COMPACT_SECS | 30 s | Compact less often (lower background CPU) | Compact more eagerly |
Symptoms and responses:
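- Lots of tiny blocks / slow listings: lower the small-block waiver (HELIOS_BLOCK_MAX_SMALL_BLOCKS) or the compaction floor (HELIOS_BLOCK_MIN_COMPACT_MB) so they merge sooner.
- Compaction CPU spikes: raise the compaction interval (HELIOS_BLOCK_COMPACT_SECS) and/or the target size (HELIOS_BLOCK_TARGET_MB).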
Ingest throughput
Buffered ingest batches rows into blocks. These control batching and backpressure:
| Knob | Default | Effect |
|---|---|---|
| HELIOS_BLOCK_FLUSH_ROWS | 50 000 | Rows buffered before a block is flushed. |
| HELIOS_BLOCK_FLUSH_SECS | 5 s | Time-based flush so low-volume streams still land promptly. |
| HELIOS_BLOCK_QUEUE_CAP | 100 000 | Ingest queue depth (startup-fixed). When full, ingest returns 429 instead of growing memory unbounded. |
| HELIOS_BLOCK_FLUSH_CONCURRENCY | 2 | Concurrent block flushes (startup-fixed). Each in-flight flush builds a full block in memory, so this caps the buffered-memory ceiling. |
If ingest clients see 429s under a firehose, that's backpressure working — the client should retry with backoff. If you have memory headroom and want to absorb bigger bursts, raise HELIOS_BLOCK_QUEUE_CAP (restart required). See Ingestion overview.
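On the client side, retrying with backoff could look roughly like the sketch below. It assumes a hypothetical `send` callable that posts one batch and returns the HTTP status code; the function name, delays, and attempt limit are illustrative choices, not HeliosLogs client settings.

```python
import random
import time


def send_with_backoff(send, payload, max_attempts=6, base_delay=0.5,
                      max_delay=30.0, sleep=time.sleep, rng=random.random):
    """Call send(payload) until it stops returning 429 or attempts run out.

    `send` is a hypothetical callable returning an HTTP status code.
    Returns the last status seen.
    """
    for attempt in range(max_attempts):
        status = send(payload)
        if status != 429:
            return status
        if attempt == max_attempts - 1:
            break  # out of attempts; do not sleep after the final try
        # Exponential backoff with full jitter, capped at max_delay.
        delay = min(max_delay, base_delay * (2 ** attempt))
        sleep(delay * rng())
    return 429


# Demo with a fake sender: two 429s, then success. Sleeping is stubbed out.
responses = iter([429, 429, 200])
status = send_with_backoff(lambda _p: next(responses), b"log line",
                           sleep=lambda _d: None)
print(status)
```

Jitter spreads retries from many clients over time, so a burst that filled the queue is not replayed all at once when the clients wake up.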
Query
| Knob | Default | Notes |
|---|---|---|
| HELIOS_QUERY_THREADS | 4 | Thread pool for query fan-out: the total query-CPU ceiling shared across requests. Restart-only. Scale toward available cores. |
| HELIOS_QUERY_CACHE_MB | 1024 MB | Caches row-match bitmaps per (block, filter), so repeated, paginated, and time-zoomed queries skip re-scanning. 0 disables it. |
| HELIOS_AGG_MAX_PARTITIONS | 96 | Partitions an approximate aggregation scans exactly before it switches to stride-sampling. Raise for more accuracy over very wide time ranges (more CPU); lower for faster, rougher aggregates. |
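To build intuition for HELIOS_AGG_MAX_PARTITIONS, here is one plausible reading of "stride-sampling" as a sketch. This page does not describe the engine's actual sampling scheme, so the function `partitions_to_scan`, the stride formula, and the scale-up factor are all assumptions made for illustration.

```python
import math


def partitions_to_scan(partitions, max_exact=96):
    """Return (selected_partitions, scale_factor).

    Assumed behavior: at or below max_exact, every partition is scanned
    (scale 1.0). Above it, every stride-th partition is scanned and the
    aggregate is scaled up by total / scanned.
    """
    partitions = list(partitions)
    n = len(partitions)
    if n <= max_exact:
        return partitions, 1.0
    # Smallest stride that keeps the scanned count at or below max_exact.
    stride = math.ceil(n / max_exact)
    selected = partitions[::stride]
    return selected, n / len(selected)


# 200 hourly partitions with the default limit of 96.
selected, scale = partitions_to_scan(range(200), max_exact=96)
print(len(selected), round(scale, 3))
```

Under this reading, raising the knob shrinks the stride, so more partitions are read and the scaled estimate tracks the true aggregate more closely at a higher CPU cost.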
Memory
HeliosLogs uses jemalloc tuned to return freed memory to the OS quickly, so RSS doesn't ratchet up after compaction or large-query spikes. The main memory levers an operator controls are the query cache size, the ingest queue/flush concurrency (buffered-block memory), and query threads (peak concurrent scan memory). Size the host with headroom above steady-state for these spikes.
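For host sizing, the levers above can be combined into a rough back-of-envelope estimate. The sketch below is not a HeliosLogs formula: `avg_row_bytes` and `scan_mb_per_thread` are hypothetical workload figures you would need to measure, and it assumes the queue depth is counted in rows.

```python
MIB = 1024 * 1024


def peak_memory_estimate_mb(query_cache_mb=1024, flush_concurrency=2,
                            flush_rows=50_000, queue_cap=100_000,
                            query_threads=4, avg_row_bytes=512,
                            scan_mb_per_thread=128):
    """Rough peak-memory estimate in MiB from the operator-controlled levers.

    avg_row_bytes and scan_mb_per_thread are assumed workload numbers,
    not HeliosLogs defaults.
    """
    parts = {
        "query_cache": float(query_cache_mb),
        # Each in-flight flush holds one full block of buffered rows.
        "flush_buffers": flush_concurrency * flush_rows * avg_row_bytes / MIB,
        # Assumes the queue depth is measured in rows.
        "ingest_queue": queue_cap * avg_row_bytes / MIB,
        "query_scans": float(query_threads * scan_mb_per_thread),
    }
    parts["total"] = sum(parts.values())
    return parts


for name, mb in peak_memory_estimate_mb().items():
    print(f"{name:>14}: {mb:8.1f} MiB")
```

Treat the total as a floor for the spike budget and add headroom on top, since allocator overhead and compaction are not modeled here.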
A tuning workflow
- Observe before changing anything: HeliosLogs indexes its own request latencies and errors into the _system env. See Self-observability.
- Change one knob in Admin → General; live knobs apply within seconds.
- Verify the effect in the self-logs (latency, 429 rate, compaction frequency) before changing the next.