From Hashmod to Jump Consistent Hash — stream-metrics-route Hash Algorithm Upgrade
Introduction
In the previous article, we reviewed the three-year evolution of stream-metrics-route and mentioned that the “dual hashmod scheduling” is the core scheduling mechanism of the entire gateway. However, during continuous production operation, one fatal flaw of hashmod became increasingly obvious—every scaling operation triggers full data redistribution.
This article documents the complete decision process of migrating from hash % N (hashmod) to Jump Consistent Hash: which candidate algorithms were evaluated, why Jump Hash was ultimately chosen, and the specific impact before and after migration.
This article’s technical details are based on stream-metrics-route’s evaluation document.
Problem: Hashmod’s Scaling Disaster
stream-metrics-route’s original implementation used the most basic hash % N for assigning metrics to backend nodes:
| |
This algorithm is so simple it barely needs explanation, but it has a fatal flaw: when the number of nodes N changes, almost all key assignments change.
flowchart TB
subgraph Before["Before scaling: 3 nodes N=3"]
direction LR
K1["key:A hash%3=0"] --> N1["Node 0"]
K2["key:B hash%3=1"] --> N2["Node 1"]
K3["key:C hash%3=2"] --> N3["Node 2"]
K4["key:D hash%3=0"] --> N1
K5["key:E hash%3=1"] --> N2
K6["key:F hash%3=2"] --> N3
end
subgraph After["After scaling: 4 nodes N=4 — 100% redistribution!"]
direction LR
K1b["key:A hash%4=0"] --> N1b["Node 0"]
K2b["key:B hash%4=1"] --> N2b["Node 1"]
K3b["key:C hash%4=2"] --> N3b["Node 2"]
K4b["key:D hash%4=0"] --> N1b
K5b["key:E hash%4=1"] --> N2b
K6b["key:F hash%4=2"] --> N4b["Node 3 NEW"]
end
Before ---|"Add one node"| After
style N4b fill:#ffcdd2,stroke:#f44336
style After fill:#fff8e1,stroke:#ff9800What does this mean in a metric routing scenario?
- Complete reassignment of all time series: Every metric is sent to a different
stream_task_id - Downstream vmagent aggregation state is destroyed: Stream aggregation’s in-memory state depends on
stream_task_idfor deduplication; when the ID changes, aggregation counters reset to zero - Load imbalance during migration: A large number of time series scatter simultaneously
- Unnecessary network storm: All backend nodes receive a completely new dataset simultaneously
When your backend scales from 100 to 101 nodes, 100% of metrics are rerouted—this isn’t a “move a bit more” problem; it’s “scrap everything and start over.”
sequenceDiagram
autonumber
participant GW as Gateway
participant V0 as vmagent-0
participant V1 as vmagent-1
participant V2 as vmagent-2
participant V3 as vmagent-3 (New)
Note over GW: N=3 to N=4 scaling triggered
rect rgba(244,67,54,0.1)
Note over GW,V3: 100% metric redistribution
GW->>V0: key_A was on node 0
GW->>V1: key_B was on node 1
GW->>V2: key_C was on node 2
GW->>V3: key_D might route to new node
GW->>V0: key_E may have changed target
Note over V0,V3: Most keys' target nodes changed
end
rect rgba(255,152,0,0.1)
Note over V0,V3: vmagent stream aggregation state all invalidated
V0-->>V0: Internal state cleared, re-accumulating
V1-->>V1: Internal state cleared, re-accumulating
V2-->>V2: Internal state cleared, re-accumulating
V3-->>V3: Starting from zero
endFor production environments, this “tear everything down for every scale-up” behavior is unacceptable. We need a consistent hashing algorithm—where node changes cause minimal key remapping.
Candidate Algorithm Evaluation
We evaluated five mainstream sharding algorithms, comparing across four dimensions:
| Algorithm | Time Complexity | Memory Usage | Balance Quality | Migration Cost (N→N+1) |
|---|---|---|---|---|
| hash % N (hashmod) | O(1) | O(1) | Good | 100% reshuffled |
| Ring Consistent Hash | O(log N) | O(N) | Medium (needs virtual nodes) | ~K/N |
| Rendezvous (HRW) Hash | O(N) | O(1) | Good | ~K/N |
| Jump Consistent Hash | O(1) | O(1) | Excellent | ~K/(N+1) |
| Maglev Hash | O(1) | O(N*M) | Good | Depends on table |
Below we analyze each algorithm’s suitability for the stream-metrics-route scenario.
Ring Consistent Hash (Ring Hash)
Ring consistent hash is the most classic consistent hashing implementation, widely used in Dynamo, Cassandra, and other systems.
flowchart LR
subgraph Ring["Hash Ring 0 to 2^32-1"]
direction TB
N1["Node0 at 30deg"]
N2["Node1 at 150deg"]
N3["Node2 at 270deg"]
end
K1["key:A hash=20deg"] -->|"Closest clockwise"| N1
K2["key:B hash=100deg"] -->|"Closest clockwise"| N2
K3["key:C hash=200deg"] -->|"Closest clockwise"| N3
style N1 fill:#ffcdd2,stroke:#e53935
style N2 fill:#c8e6c9,stroke:#43a047
style N3 fill:#bbdefb,stroke:#1e88e5
style Ring fill:#fafafa,stroke:#999Pros: Good scalability, new nodes only affect keys in the adjacent range.
Cons: Bare rings have poor balance—with few nodes, some nodes may get far more than 1/N of the key space. The solution is Virtual Nodes, where each physical node has multiple virtual nodes on the ring. But this brings two problems:
- Memory overhead: 100 backends × 150 virtual nodes = 15000 ring nodes to maintain
- Tuning complexity: Virtual node count needs tuning based on backend count and key distribution; there’s no “set it and forget it” parameter
For a lightweight gateway like stream-metrics-route, introducing a ring structure that needs tuning and maintenance is too heavy.
Rendezvous Hash (HRW)
Rendezvous hash, also known as Highest Random Weight (HRW), has a very intuitive approach: calculate a weight for each (key, node) pair, select the node with the highest weight.
| |
Pros: Excellent balance, simple implementation, no virtual nodes needed.
Cons: Each lookup requires computing weights with all nodes—O(N) complexity. When the number of nodes N is large (e.g., hundreds of vmagent instances), each routing requires N hash calculations.
In our scenario, the gateway processes hundreds of thousands of metrics per second, each requiring two routing operations (task partition + node selection). O(N) complexity means performance is linearly inversely correlated with node count. This is far from ideal for a low-latency gateway.
Maglev Hash
Maglev hash was designed by Google and used in the Maglev load balancer. It achieves O(1) routing lookup through a precomputed lookup table.
flowchart TB
K["Input key"] --> H["hash key mod M"]
H --> L0["slot 0: Node2"]
H --> L1["slot 1: Node0"]
H --> L2["slot 2: Node1"]
H --> L3["slot 3: ..."]
H --> LM["slot M-1: Node0"]
L0 --> N["Target node"]
L1 --> N
L2 --> N
L3 --> N
LM --> N
style H fill:#e3f2fd,stroke:#2196f3
style N fill:#e8f5e9,stroke:#4caf50Pros: Lookup is indeed O(1), widely used internally at Google.
Cons: The lookup table size M is typically much larger than N (usually M » 100*N), meaning:
- Memory overhead: O(N*M) space, very large with many nodes
- Table rebuild: Node changes require regenerating the entire lookup table
- Minimal failure domain: Table quality depends on node permutation generation; some node combinations may produce poor distribution
For stream-metrics-route, we don’t need Maglev’s “minimal failure domain” feature (which matters for network load balancers), but we’d have to bear its memory and complexity costs.
Jump Consistent Hash
Finally, let’s look at the algorithm we ultimately chose. Jump Consistent Hash comes from a 2014 Google paper: “A Fast, Minimal Memory, Consistent Hash Algorithm”.
Its core idea can be described intuitively: Imagine you’re rolling a special die to decide where each key “jumps” to which bucket. When the number of buckets increases, existing keys have only a small probability of being “jumped” to a new bucket.
| |
About 15 lines of code, zero memory allocation, no lookup table, no virtual nodes.
Key features:
- Minimal migration: N → N+1, only about 1/(N+1) of keys need remapping
- O(1) time complexity: Same speed as hashmod
- Excellent distribution: Mathematically proven uniform distribution across buckets
- Zero configuration: No parameters to tune
This is exactly what we need—hashmod’s performance and simplicity, plus consistent hashing’s minimal migration.
Algorithm Decision Flow
The following flowchart shows our decision logic for evaluating candidate algorithms:
flowchart TD
Start["Need consistent hashing"] --> Q1{"Need O(1) lookup?"}
Q1 -->|"No"| Ring["Ring Consistent Hash<br/>O(log N), needs virtual nodes"]
Q1 -->|"Yes"| Q2{"Can accept O(N) memory?"}
Q2 -->|"Yes"| Q3{"Need minimal failure domain?"}
Q3 -->|"Yes"| Maglev["Maglev Hash<br/>O(N*M) memory"]
Q3 -->|"No"| Q4{"Can accept O(N) time?"}
Q4 -->|"Yes"| HRW["Rendezvous (HRW)<br/>O(N) time, O(1) memory"]
Q4 -->|"No"| Jump["✅ Jump Consistent Hash<br/>O(1) time, O(1) memory"]
Q2 -->|"No"| Jump
style Jump fill:#c8e6c9,stroke:#4caf50,stroke-width:3px
style Ring fill:#fff3e0,stroke:#ff9800
style HRW fill:#fff3e0,stroke:#ff9800
style Maglev fill:#fff3e0,stroke:#ff9800For stream-metrics-route’s scenario—a high-performance gateway processing hundreds of thousands of metrics per second, requiring minimal migration and zero configuration—Jump Consistent Hash is the only choice that satisfies all constraints simultaneously.
Jump Consistent Hash Deep Dive
Algorithm Intuition
Jump Consistent Hash’s core idea can be understood through the following process:
- Start from bucket 0
- Continuously “jump” to higher bucket numbers
- The probability of each jump decreases as the bucket count increases
- The bucket you land on is the result
sequenceDiagram
autonumber
participant Key as Key hash value
participant B0 as Bucket 0
participant B1 as Bucket 1
participant B2 as Bucket 2
participant B3 as Bucket 3
Note over Key: numBuckets = 4
Key->>B0: b=-1 j=0 initialization
Key->>Key: key = key * constant + 1
Key->>B1: j = 1*(2^31/high_key) = 2
Note over B0,B1: Skipped bucket 0 b=0
Key->>Key: key = key * constant + 1
Key->>B2: j = 3*(2^31/high_key) = 3
Note over B1,B2: Skipped bucket 1 b=2
Key->>Key: key = key * constant + 1
Key->>B3: j = 4*(2^31/high_key) = 7
Note over B2,B3: Skipped bucket 2 b=3
Note over Key,B3: j=7 > numBuckets=4 stop
Note over B3: Result: Bucket 3The key point: when the number of buckets increases from N to N+1, only that one “jump” landing at j=N changes the existing allocation result. This mathematically guarantees approximately 1/(N+1) migration rate.
Scaling Impact Comparison
The diagram below intuitively shows the behavior difference between hashmod and Jump Hash during scaling:
flowchart LR
subgraph Hashmod["Hashmod: 100 to 101 nodes"]
direction TB
HM_Before["100 keys each assigned to nodes 0-99"]
HM_After["Scaling to 101 nodes<br/>hash_101 ≠ hash_100<br/>100% of keys reassigned"]
HM_Before --> HM_After
end
subgraph Jump["Jump Hash: 100 to 101 nodes"]
direction TB
JH_Stay["~99 keys maintain assignment<br/>99.01% unchanged"]
JH_Move["~1 key moves to new node<br/>~0.99% migration"]
JH_Stay
JH_Move
end
Hashmod ---|"Same scenario"| Jump
style HM_After fill:#ffcdd2,stroke:#f44336
style JH_Stay fill:#c8e6c9,stroke:#4caf50
style JH_Move fill:#fff9c4,stroke:#fdd835In concrete numbers:
- Hashmod: 100 → 101 nodes, 100% of
stream_task_idvalues change - Jump Hash: 100 → 101 nodes, only ~0.99% of
stream_task_idvalues change
This means during scaling operations, vmagent’s stream aggregation state under Jump Hash is almost unaffected—99% of aggregation windows continue normal accumulation, with only about 1% needing reinitialization.
Migration Implementation
Code Changes
The change was very focused—replacing only the hash function, with the routing architecture and dual-hash logic entirely unchanged.
Old version (hashmod):
| |
New version (Jump Consistent Hash):
| |
Note that the FNV-32a hash calculation in SortLabelsHashKey, the filterLabels function, circuit breaker, and retry logic—all remain unchanged. The change is limited to the final routing function call.
flowchart LR
subgraph Pipeline["Metric Processing Pipeline"]
A["Receive metrics"] --> B["relabel rules"]
B --> C["filterLabels"]
C --> D["FNV-32a hash"]
D --> E["Routing function"]
E --> F["Async dispatch"]
end
Note1["Only this step changed<br/>hashmod to Jump Hash"] -.-> E
style E fill:#fff9c4,stroke:#fdd835,stroke-width:3px
style A fill:#e8f5e9,stroke:#4caf50
style B fill:#e8f5e9,stroke:#4caf50
style C fill:#e8f5e9,stroke:#4caf50
style D fill:#e8f5e9,stroke:#4caf50
style F fill:#e8f5e9,stroke:#4caf50Breaking Change Assessment
The migration has one minor breaking change: the stream_task_id value for all metrics will change.
But the impact of this change is limited:
stream_task_idis a transparent label used by vmagent for aggregation deduplication—its specific value doesn’t matter as long as it’s consistent within the same dimension- Static YAML configuration means node changes already require a restart
- A one-time switch can be done during a maintenance window, no need for gradual migration
flowchart TB
subgraph Migration["Migration Process"]
direction TB
M1["1. Confirm maintenance window"]
M2["2. Stop stream-metrics-route"]
M3["3. Update binary with Jump Hash"]
M4["4. Start service"]
M5["5. Verify metric routing distribution"]
M1 --> M2 --> M3 --> M4 --> M5
end
subgraph Impact["Impact Scope"]
I1["stream_task_id values all change"]
I2["vmagent aggregation state reinitialized"]
I3["relabel/filterLabels logic unchanged"]
I4["Circuit breaker/retry logic unchanged"]
end
Migration -.->|"One-time impact"| Impact
style I1 fill:#fff9c4,stroke:#fdd835
style I2 fill:#fff9c4,stroke:#fdd835
style I3 fill:#e8f5e9,stroke:#4caf50
style I4 fill:#e8f5e9,stroke:#4caf50Behavior After Migration
After migration, daily scaling operations become very elegant:
| Operation | Hashmod Behavior | Jump Hash Behavior |
|---|---|---|
| 3 → 4 nodes | 100% redistribution | ~25% migration |
| 10 → 11 nodes | 100% redistribution | ~9.1% migration |
| 100 → 101 nodes | 100% redistribution | ~0.99% migration |
| 100 → 100 nodes (no change) | No impact | No impact |
As cluster size grows, Jump Hash’s advantage becomes more pronounced—which is exactly where our production environment sits (hundreds of vmagent instances).
Performance Analysis
Benchmark Characteristics
| Metric | Hashmod | Jump Hash |
|---|---|---|
| Time complexity | O(1) | O(1) |
| Memory allocation | None | None |
| Lines of code | ~5 lines | ~15 lines |
| Lookup table | Not needed | Not needed |
| Deterministic | ✅ Same input → Same output | ✅ Same input → Same output |
Although Jump Hash has an internal for loop, due to the jump property, the expected number of loop iterations is O(ln N). For our node scale (usually < 200), the actual execution count is minimal. Compared to hashmod, there is no observable additional CPU overhead.
Production Benefits
flowchart LR
subgraph Before["Before migration: Hashmod"]
direction TB
B1["Scaling triggered"]
B2["100% redistribution"]
B3["Aggregation state all invalidated"]
B4["Data accuracy drops"]
B5["Monitoring alert false positives"]
B1 --> B2 --> B3 --> B4 --> B5
end
subgraph After["After migration: Jump Hash"]
direction TB
A1["Scaling triggered"]
A2["~1% redistribution"]
A3["99% aggregation state unaffected"]
A4["Data accuracy maintained"]
A5["Frictionless scaling"]
A1 --> A2 --> A3 --> A4 --> A5
end
Before ---|"Same operation"| After
style B5 fill:#ffcdd2,stroke:#f44336
style A5 fill:#c8e6c9,stroke:#4caf50Summary of production advantages from Jump Hash:
- Graceful scaling: Adding backends incrementally doesn’t cause full rebalancing
- Stable aggregation: Maintains vmagent aggregation state continuity during scaling operations
- Reduced load: Minimizes network traffic and processing overhead during migration
- Predictable behavior: Distribution quality is mathematically guaranteed, no “unlucky distribution imbalance” situations
Theoretical Foundation
Jump Consistent Hash isn’t invented out of thin air; it has rigorous mathematical proof:
“A Fast, Minimal Memory, Consistent Hash Algorithm” by John Lamping and Eric Veach, Google, 2014 http://arxiv.org/abs/1406.2294
The paper proves two key properties:
- Uniform distribution: Each bucket gets approximately 1/N of keys, distribution quality no worse than hashmod
- Minimal migration: When N → N+1, exactly K/(N+1) keys need remapping, which is the theoretical optimum
This algorithm has been used internally at Google across multiple systems (including Spanner’s tablet routing), with large-scale production verification.
Summary
This migration from hashmod to Jump Consistent Hash can be summarized in one sentence:
Same O(1) performance, migration cost reduced from 100% to 1%.
| Dimension | Conclusion |
|---|---|
| Performance impact | No additional overhead, O(1) unchanged |
| Code change scope | Minimal, only replaced hash function calls |
| Migration breaking change | One-time stream_task_id remapping, done within maintenance window |
| Long-term benefit | Scaling operations nearly frictionless for production |
For any metric routing system using hash % N for sharding that faces dynamic node scaling requirements, Jump Consistent Hash is a worthwhile upgrade path—zero-cost performance with huge operational benefits.