Benchmarks

Benchmarked on i5-10400F, Go 1.24:

| Operation | Latency | Allocs/op |
|---|---|---|
| Lex (simple query) | 304 ns/op | 2 |
| Lex (full query) | 945 ns/op | 2 |
| Parse (simple query) | 477 ns/op | 4 |
| Parse (full query) | 1,470 ns/op | 8 |
Design Decisions
Lexer — O(1) Keyword Lookup
The lexer uses a stack-allocated buffer and an O(1) keyword table lookup. It never heap-allocates to identify keywords, so allocations stay flat (2 per call in the benchmarks above) as queries grow; lexing time itself still scales with input length.
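The following is a minimal sketch of that idea, not the library's actual lexer. `TokenKind`, the keyword set, `maxKeywordLen`, and `lookupKeyword` are hypothetical names chosen for illustration. It relies on the Go compiler's optimization that a `[]byte`-to-`string` conversion used directly as a map index does not allocate.

```go
package main

import "fmt"

// TokenKind and the keyword set below are hypothetical, for illustration only.
type TokenKind int

const (
	Ident TokenKind = iota
	KwQuery
	KwFrom
	KwLimit
)

// maxKeywordLen bounds the scratch buffer; longer words cannot be keywords.
const maxKeywordLen = 8

var keywords = map[string]TokenKind{
	"QUERY": KwQuery,
	"FROM":  KwFrom,
	"LIMIT": KwLimit,
}

// lookupKeyword upper-cases word into a fixed-size array that lives on the
// stack, then performs a single map lookup on the result.
func lookupKeyword(word []byte) TokenKind {
	if len(word) == 0 || len(word) > maxKeywordLen {
		return Ident
	}
	var buf [maxKeywordLen]byte
	for i, c := range word {
		if 'a' <= c && c <= 'z' {
			c -= 'a' - 'A' // ASCII upper-case
		}
		buf[i] = c
	}
	// The conversion inside the index expression does not allocate.
	if kind, ok := keywords[string(buf[:len(word)])]; ok {
		return kind
	}
	return Ident
}

func main() {
	fmt.Println(lookupKeyword([]byte("from")) == KwFrom) // true
	fmt.Println(lookupKeyword([]byte("docs")) == Ident)  // true
}
```

Rejecting over-long words before touching the buffer keeps the scratch array small and means ordinary identifiers of any length never trigger a copy.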
Parser — Zero-Allocation Comparison
The Pratt parser uses byte-level asciiEqual / asciiEqualLower comparisons instead of strings.EqualFold or reflect-based logic. This skips Unicode case folding and avoids allocations in the hot path.
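A byte-level comparison of this kind could look like the sketch below. Only the name `asciiEqualLower` comes from the text above; the signature and the requirement that the second argument is already lower-case are assumptions made for this example.

```go
package main

import "fmt"

// asciiEqualLower reports whether s equals lower when ASCII letters in s are
// folded to lower-case. lower must already be lower-case.
// The signature is an assumption; only the function name appears in the docs.
func asciiEqualLower(s, lower string) bool {
	if len(s) != len(lower) {
		return false
	}
	for i := 0; i < len(s); i++ {
		c := s[i]
		if 'A' <= c && c <= 'Z' {
			c += 'a' - 'A'
		}
		if c != lower[i] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(asciiEqualLower("LIMIT", "limit"))  // true
	fmt.Println(asciiEqualLower("LIMIT", "limits")) // false
}
```

Because the loop indexes bytes rather than decoding runes, it only folds ASCII letters, which is sufficient when every keyword in the grammar is ASCII.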
Filters — Type-Switch, No Reflect
Filter predicate evaluation uses explicit type-switch dispatch. The reflect package is never called in the filter conversion path, which keeps the per-query filter cost predictable.
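As a rough illustration of type-switch dispatch, the sketch below converts a filter value into a simplified condition. The `Condition` type, its fields, and `toCondition` are hypothetical and do not reflect the library's or Qdrant's real types.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// Condition is a hypothetical, simplified stand-in for a filter condition.
type Condition struct {
	Field string
	Kind  string // "keyword", "integer", "double", or "bool"
	Value string // rendered as text only to keep the sketch small
}

// toCondition dispatches on the concrete type of v with a type switch,
// so no reflection is needed to decide how to encode the value.
func toCondition(field string, v any) (Condition, error) {
	switch x := v.(type) {
	case string:
		return Condition{Field: field, Kind: "keyword", Value: x}, nil
	case int:
		return Condition{Field: field, Kind: "integer", Value: strconv.Itoa(x)}, nil
	case int64:
		return Condition{Field: field, Kind: "integer", Value: strconv.FormatInt(x, 10)}, nil
	case float64:
		return Condition{Field: field, Kind: "double", Value: strconv.FormatFloat(x, 'g', -1, 64)}, nil
	case bool:
		return Condition{Field: field, Kind: "bool", Value: strconv.FormatBool(x)}, nil
	default:
		return Condition{}, errors.New("unsupported filter value type for field " + field)
	}
}

func main() {
	c, err := toCondition("age", 42)
	fmt.Println(c.Kind, c.Value, err) // integer 42 <nil>
}
```

Each supported type has an explicit case, so an unsupported value fails with an error instead of being silently coerced.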
Sparse BM25 — Atomic Cache
BM25 parameters (k1, b, avgdl) are cached with atomic.Pointer so concurrent queries never block on a mutex to read embedding configuration.
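One way to build such a cache with `atomic.Pointer` (available since Go 1.19) is sketched below. The `bm25Params` and `paramCache` types and the parameter values are assumptions for illustration, not the library's actual definitions.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// bm25Params is a hypothetical immutable snapshot of the BM25 parameters.
type bm25Params struct {
	K1, B, AvgDL float64
}

// paramCache publishes snapshots through an atomic pointer, so readers
// never take a lock.
type paramCache struct {
	p atomic.Pointer[bm25Params]
}

// Load returns the current snapshot, or false if none has been stored yet.
func (c *paramCache) Load() (bm25Params, bool) {
	if p := c.p.Load(); p != nil {
		return *p, true
	}
	return bm25Params{}, false
}

// Store publishes a new snapshot. Each call stores a pointer to a fresh
// copy, so readers never observe a partially updated struct.
func (c *paramCache) Store(p bm25Params) {
	c.p.Store(&p)
}

func main() {
	var cache paramCache
	_, ok := cache.Load()
	fmt.Println(ok) // false

	// Illustrative values only.
	cache.Store(bm25Params{K1: 1.2, B: 0.75, AvgDL: 180})
	p, ok := cache.Load()
	fmt.Println(ok, p.K1, p.B, p.AvgDL) // true 1.2 0.75 180
}
```

Updates replace the whole snapshot rather than mutating fields in place, which is what makes lock-free reads safe here.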
Pipeline — Cached Options
The pipeline caches buildDocumentOptions across requests. The embedding client uses http.Client{Timeout: 30 * time.Second} instead of http.DefaultClient, which has no timeout, so a stalled embedding request cannot hang indefinitely.
Network Efficiency
BatchQuery — Single Round-Trip
When running multiple QUERY statements, use BatchQuery to send them all in a single QueryBatchPoints call to Qdrant:

```go
// Error handling omitted for brevity.
results, _ := qql.BatchQuery(ctx, client, []string{
    "QUERY 'emergency triage' FROM docs LIMIT 5",
    "QUERY 'cardiac arrest' FROM docs LIMIT 5",
    "QUERY 'neurological assessment' FROM docs LIMIT 5",
})
// All 3 queries in one round-trip
```

This is 3–5× faster than sequential execution for pure QUERY batches.
Gateway — Auto-Detect Batch
The gateway detects when every statement in an ExecBatch call is a pure QUERY statement and routes the batch through Qdrant's native QueryBatch API.
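The routing decision can be sketched as an "all statements are QUERY" check. The helpers `isQuery` and `allQueries` are hypothetical, and the leading-keyword test is a stand-in: a real gateway would more plausibly inspect the parsed statement type. The `DELETE` statement in the example is also invented, purely to show a mixed batch.

```go
package main

import (
	"fmt"
	"strings"
)

// isQuery is a stand-in check that looks only at the leading keyword.
func isQuery(stmt string) bool {
	fields := strings.Fields(stmt)
	return len(fields) > 0 && strings.EqualFold(fields[0], "QUERY")
}

// allQueries reports whether a batch is non-empty and consists solely of
// QUERY statements, in which case it could be sent as one batch request.
func allQueries(stmts []string) bool {
	if len(stmts) == 0 {
		return false
	}
	for _, s := range stmts {
		if !isQuery(s) {
			return false
		}
	}
	return true
}

func main() {
	pure := []string{
		"QUERY 'emergency triage' FROM docs LIMIT 5",
		"QUERY 'cardiac arrest' FROM docs LIMIT 5",
	}
	// The second statement is hypothetical syntax used only for contrast.
	mixed := []string{
		"QUERY 'emergency triage' FROM docs LIMIT 5",
		"DELETE FROM docs WHERE id = 1",
	}
	fmt.Println(allQueries(pure))  // true
	fmt.Println(allQueries(mixed)) // false
}
```

Treating an empty batch as non-batchable keeps the fast path from issuing an empty request.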