From Hashmod to Jump Consistent Hash — stream-metrics-route Hash Algorithm Upgrade

Introduction

In the previous article, we reviewed the three-year evolution of stream-metrics-route and mentioned that the “dual hashmod scheduling” is the core scheduling mechanism of the entire gateway. However, during continuous production operation, one fatal flaw of hashmod became increasingly obvious—every scaling operation triggers full data redistribution.

This article documents the complete decision process of migrating from hash % N (hashmod) to Jump Consistent Hash: which candidate algorithms were evaluated, why Jump Hash was ultimately chosen, and the specific impact before and after migration.

This article’s technical details are based on stream-metrics-route’s evaluation document.


Problem: Hashmod’s Scaling Disaster

stream-metrics-route’s original implementation used the most basic hash % N for assigning metrics to backend nodes:

go
1
2
3
4
5
6
func hashMod(mode int, hash uint32) int {
    if mode <= 1 {
        return 0
    }
    return int(hash % uint32(mode))
}

This algorithm is so simple it barely needs explanation, but it has a fatal flaw: when the number of nodes N changes, almost all key assignments change.

mermaid
flowchart TB
    subgraph Before["Before scaling: 3 nodes N=3"]
        direction LR
        K1["key:A hash%3=0"] --> N1["Node 0"]
        K2["key:B hash%3=1"] --> N2["Node 1"]
        K3["key:C hash%3=2"] --> N3["Node 2"]
        K4["key:D hash%3=0"] --> N1
        K5["key:E hash%3=1"] --> N2
        K6["key:F hash%3=2"] --> N3
    end

    subgraph After["After scaling: 4 nodes N=4 — 100% redistribution!"]
        direction LR
        K1b["key:A hash%4=0"] --> N1b["Node 0"]
        K2b["key:B hash%4=1"] --> N2b["Node 1"]
        K3b["key:C hash%4=2"] --> N3b["Node 2"]
        K4b["key:D hash%4=0"] --> N1b
        K5b["key:E hash%4=1"] --> N2b
        K6b["key:F hash%4=2"] --> N4b["Node 3 NEW"]
    end

    Before ---|"Add one node"| After

    style N4b fill:#ffcdd2,stroke:#f44336
    style After fill:#fff8e1,stroke:#ff9800

What does this mean in a metric routing scenario?

  • Complete reassignment of all time series: Every metric is sent to a different stream_task_id
  • Downstream vmagent aggregation state is destroyed: Stream aggregation’s in-memory state depends on stream_task_id for deduplication; when the ID changes, aggregation counters reset to zero
  • Load imbalance during migration: A large number of time series scatter simultaneously
  • Unnecessary network storm: All backend nodes receive a completely new dataset simultaneously

When your backend scales from 100 to 101 nodes, 100% of metrics are rerouted—this isn’t a “move a bit more” problem; it’s “scrap everything and start over.”

mermaid
sequenceDiagram
    autonumber
    participant GW as Gateway
    participant V0 as vmagent-0
    participant V1 as vmagent-1
    participant V2 as vmagent-2
    participant V3 as vmagent-3 (New)

    Note over GW: N=3 to N=4 scaling triggered

    rect rgba(244,67,54,0.1)
        Note over GW,V3: 100% metric redistribution
        GW->>V0: key_A was on node 0
        GW->>V1: key_B was on node 1
        GW->>V2: key_C was on node 2
        GW->>V3: key_D might route to new node
        GW->>V0: key_E may have changed target
        Note over V0,V3: Most keys' target nodes changed
    end

    rect rgba(255,152,0,0.1)
        Note over V0,V3: vmagent stream aggregation state all invalidated
        V0-->>V0: Internal state cleared, re-accumulating
        V1-->>V1: Internal state cleared, re-accumulating
        V2-->>V2: Internal state cleared, re-accumulating
        V3-->>V3: Starting from zero
    end

For production environments, this “tear everything down for every scale-up” behavior is unacceptable. We need a consistent hashing algorithm—where node changes cause minimal key remapping.


Candidate Algorithm Evaluation

We evaluated five mainstream sharding algorithms, comparing across four dimensions:

AlgorithmTime ComplexityMemory UsageBalance QualityMigration Cost (N→N+1)
hash % N (hashmod)O(1)O(1)Good100% reshuffled
Ring Consistent HashO(log N)O(N)Medium (needs virtual nodes)~K/N
Rendezvous (HRW) HashO(N)O(1)Good~K/N
Jump Consistent HashO(1)O(1)Excellent~K/(N+1)
Maglev HashO(1)O(N*M)GoodDepends on table

Below we analyze each algorithm’s suitability for the stream-metrics-route scenario.

Ring Consistent Hash (Ring Hash)

Ring consistent hash is the most classic consistent hashing implementation, widely used in Dynamo, Cassandra, and other systems.

mermaid
flowchart LR
    subgraph Ring["Hash Ring 0 to 2^32-1"]
        direction TB
        N1["Node0 at 30deg"]
        N2["Node1 at 150deg"]
        N3["Node2 at 270deg"]
    end

    K1["key:A hash=20deg"] -->|"Closest clockwise"| N1
    K2["key:B hash=100deg"] -->|"Closest clockwise"| N2
    K3["key:C hash=200deg"] -->|"Closest clockwise"| N3

    style N1 fill:#ffcdd2,stroke:#e53935
    style N2 fill:#c8e6c9,stroke:#43a047
    style N3 fill:#bbdefb,stroke:#1e88e5
    style Ring fill:#fafafa,stroke:#999

Pros: Good scalability, new nodes only affect keys in the adjacent range.

Cons: Bare rings have poor balance—with few nodes, some nodes may get far more than 1/N of the key space. The solution is Virtual Nodes, where each physical node has multiple virtual nodes on the ring. But this brings two problems:

  • Memory overhead: 100 backends × 150 virtual nodes = 15000 ring nodes to maintain
  • Tuning complexity: Virtual node count needs tuning based on backend count and key distribution; there’s no “set it and forget it” parameter

For a lightweight gateway like stream-metrics-route, introducing a ring structure that needs tuning and maintenance is too heavy.

Rendezvous Hash (HRW)

Rendezvous hash, also known as Highest Random Weight (HRW), has a very intuitive approach: calculate a weight for each (key, node) pair, select the node with the highest weight.

go
1
2
3
4
5
// Pseudocode: calculate weight for each node
for _, node := range nodes {
    weight[hash(key + node) % MAX_INT]
}
// Select the node with the largest weight

Pros: Excellent balance, simple implementation, no virtual nodes needed.

Cons: Each lookup requires computing weights with all nodes—O(N) complexity. When the number of nodes N is large (e.g., hundreds of vmagent instances), each routing requires N hash calculations.

In our scenario, the gateway processes hundreds of thousands of metrics per second, each requiring two routing operations (task partition + node selection). O(N) complexity means performance is linearly inversely correlated with node count. This is far from ideal for a low-latency gateway.

Maglev Hash

Maglev hash was designed by Google and used in the Maglev load balancer. It achieves O(1) routing lookup through a precomputed lookup table.

mermaid
flowchart TB
    K["Input key"] --> H["hash key mod M"]
    H --> L0["slot 0: Node2"]
    H --> L1["slot 1: Node0"]
    H --> L2["slot 2: Node1"]
    H --> L3["slot 3: ..."]
    H --> LM["slot M-1: Node0"]

    L0 --> N["Target node"]
    L1 --> N
    L2 --> N
    L3 --> N
    LM --> N

    style H fill:#e3f2fd,stroke:#2196f3
    style N fill:#e8f5e9,stroke:#4caf50

Pros: Lookup is indeed O(1), widely used internally at Google.

Cons: The lookup table size M is typically much larger than N (usually M » 100*N), meaning:

  • Memory overhead: O(N*M) space, very large with many nodes
  • Table rebuild: Node changes require regenerating the entire lookup table
  • Minimal failure domain: Table quality depends on node permutation generation; some node combinations may produce poor distribution

For stream-metrics-route, we don’t need Maglev’s “minimal failure domain” feature (which matters for network load balancers), but we’d have to bear its memory and complexity costs.

Jump Consistent Hash

Finally, let’s look at the algorithm we ultimately chose. Jump Consistent Hash comes from a 2014 Google paper: “A Fast, Minimal Memory, Consistent Hash Algorithm”.

Its core idea can be described intuitively: Imagine you’re rolling a special die to decide where each key “jumps” to which bucket. When the number of buckets increases, existing keys have only a small probability of being “jumped” to a new bucket.

go
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
func JumpConsistentHash(key uint64, numBuckets int) int {
    if numBuckets <= 1 {
        return 0
    }
    var b int64 = -1
    var j int64 = 0
    for j < int64(numBuckets) {
        b = j
        key = key*2862933555777941757 + 1
        j = int64(float64(b+1) * (float64(int64(1)<<31) / float64((key>>33)+1)))
    }
    return int(b)
}

About 15 lines of code, zero memory allocation, no lookup table, no virtual nodes.

Key features:

  1. Minimal migration: N → N+1, only about 1/(N+1) of keys need remapping
  2. O(1) time complexity: Same speed as hashmod
  3. Excellent distribution: Mathematically proven uniform distribution across buckets
  4. Zero configuration: No parameters to tune

This is exactly what we need—hashmod’s performance and simplicity, plus consistent hashing’s minimal migration.


Algorithm Decision Flow

The following flowchart shows our decision logic for evaluating candidate algorithms:

mermaid
flowchart TD
    Start["Need consistent hashing"] --> Q1{"Need O(1) lookup?"}
    Q1 -->|"No"| Ring["Ring Consistent Hash<br/>O(log N), needs virtual nodes"]
    Q1 -->|"Yes"| Q2{"Can accept O(N) memory?"}
    Q2 -->|"Yes"| Q3{"Need minimal failure domain?"}
    Q3 -->|"Yes"| Maglev["Maglev Hash<br/>O(N*M) memory"]
    Q3 -->|"No"| Q4{"Can accept O(N) time?"}
    Q4 -->|"Yes"| HRW["Rendezvous (HRW)<br/>O(N) time, O(1) memory"]
    Q4 -->|"No"| Jump["✅ Jump Consistent Hash<br/>O(1) time, O(1) memory"]
    Q2 -->|"No"| Jump

    style Jump fill:#c8e6c9,stroke:#4caf50,stroke-width:3px
    style Ring fill:#fff3e0,stroke:#ff9800
    style HRW fill:#fff3e0,stroke:#ff9800
    style Maglev fill:#fff3e0,stroke:#ff9800

For stream-metrics-route’s scenario—a high-performance gateway processing hundreds of thousands of metrics per second, requiring minimal migration and zero configuration—Jump Consistent Hash is the only choice that satisfies all constraints simultaneously.


Jump Consistent Hash Deep Dive

Algorithm Intuition

Jump Consistent Hash’s core idea can be understood through the following process:

  1. Start from bucket 0
  2. Continuously “jump” to higher bucket numbers
  3. The probability of each jump decreases as the bucket count increases
  4. The bucket you land on is the result
mermaid
sequenceDiagram
    autonumber
    participant Key as Key hash value
    participant B0 as Bucket 0
    participant B1 as Bucket 1
    participant B2 as Bucket 2
    participant B3 as Bucket 3

    Note over Key: numBuckets = 4
    Key->>B0: b=-1 j=0 initialization
    Key->>Key: key = key * constant + 1
    Key->>B1: j = 1*(2^31/high_key) = 2
    Note over B0,B1: Skipped bucket 0 b=0
    Key->>Key: key = key * constant + 1
    Key->>B2: j = 3*(2^31/high_key) = 3
    Note over B1,B2: Skipped bucket 1 b=2
    Key->>Key: key = key * constant + 1
    Key->>B3: j = 4*(2^31/high_key) = 7
    Note over B2,B3: Skipped bucket 2 b=3
    Note over Key,B3: j=7 > numBuckets=4 stop
    Note over B3: Result: Bucket 3

The key point: when the number of buckets increases from N to N+1, only that one “jump” landing at j=N changes the existing allocation result. This mathematically guarantees approximately 1/(N+1) migration rate.

Scaling Impact Comparison

The diagram below intuitively shows the behavior difference between hashmod and Jump Hash during scaling:

mermaid
flowchart LR
    subgraph Hashmod["Hashmod: 100 to 101 nodes"]
        direction TB
        HM_Before["100 keys each assigned to nodes 0-99"]
        HM_After["Scaling to 101 nodes<br/>hash_101 ≠ hash_100<br/>100% of keys reassigned"]
        HM_Before --> HM_After
    end

    subgraph Jump["Jump Hash: 100 to 101 nodes"]
        direction TB
        JH_Stay["~99 keys maintain assignment<br/>99.01% unchanged"]
        JH_Move["~1 key moves to new node<br/>~0.99% migration"]
        JH_Stay
        JH_Move
    end

    Hashmod ---|"Same scenario"| Jump

    style HM_After fill:#ffcdd2,stroke:#f44336
    style JH_Stay fill:#c8e6c9,stroke:#4caf50
    style JH_Move fill:#fff9c4,stroke:#fdd835

In concrete numbers:

  • Hashmod: 100 → 101 nodes, 100% of stream_task_id values change
  • Jump Hash: 100 → 101 nodes, only ~0.99% of stream_task_id values change

This means during scaling operations, vmagent’s stream aggregation state under Jump Hash is almost unaffected—99% of aggregation windows continue normal accumulation, with only about 1% needing reinitialization.


Migration Implementation

Code Changes

The change was very focused—replacing only the hash function, with the routing architecture and dual-hash logic entirely unchanged.

Old version (hashmod):

go
1
2
3
4
5
hash := common.SortLabelsHashKey(ts.Labels)
dime := common.hashMod(r.dimension, hash)     // hash % N → task partition ID
// ...
hashnode = common.SortLabelsHashKey(tmpLabels)
tmpch := common.hashMod(r.uplen, hashnode)     // hash % N → node selection

New version (Jump Consistent Hash):

go
1
2
3
4
5
hash := common.SortLabelsHashKey(ts.Labels)
dime := common.JumpConsistentHash(uint64(hash), r.dimension)  // stream_task_id
// ...
hashnode = common.SortLabelsHashKey(tmpLabels)
tmpch := common.JumpConsistentHash(uint64(hashnode), r.uplen) // node selection

Note that the FNV-32a hash calculation in SortLabelsHashKey, the filterLabels function, circuit breaker, and retry logic—all remain unchanged. The change is limited to the final routing function call.

mermaid
flowchart LR
    subgraph Pipeline["Metric Processing Pipeline"]
        A["Receive metrics"] --> B["relabel rules"]
        B --> C["filterLabels"]
        C --> D["FNV-32a hash"]
        D --> E["Routing function"]
        E --> F["Async dispatch"]
    end

    Note1["Only this step changed<br/>hashmod to Jump Hash"] -.-> E

    style E fill:#fff9c4,stroke:#fdd835,stroke-width:3px
    style A fill:#e8f5e9,stroke:#4caf50
    style B fill:#e8f5e9,stroke:#4caf50
    style C fill:#e8f5e9,stroke:#4caf50
    style D fill:#e8f5e9,stroke:#4caf50
    style F fill:#e8f5e9,stroke:#4caf50

Breaking Change Assessment

The migration has one minor breaking change: the stream_task_id value for all metrics will change.

But the impact of this change is limited:

  1. stream_task_id is a transparent label used by vmagent for aggregation deduplication—its specific value doesn’t matter as long as it’s consistent within the same dimension
  2. Static YAML configuration means node changes already require a restart
  3. A one-time switch can be done during a maintenance window, no need for gradual migration
mermaid
flowchart TB
    subgraph Migration["Migration Process"]
        direction TB
        M1["1. Confirm maintenance window"]
        M2["2. Stop stream-metrics-route"]
        M3["3. Update binary with Jump Hash"]
        M4["4. Start service"]
        M5["5. Verify metric routing distribution"]
        M1 --> M2 --> M3 --> M4 --> M5
    end

    subgraph Impact["Impact Scope"]
        I1["stream_task_id values all change"]
        I2["vmagent aggregation state reinitialized"]
        I3["relabel/filterLabels logic unchanged"]
        I4["Circuit breaker/retry logic unchanged"]
    end

    Migration -.->|"One-time impact"| Impact

    style I1 fill:#fff9c4,stroke:#fdd835
    style I2 fill:#fff9c4,stroke:#fdd835
    style I3 fill:#e8f5e9,stroke:#4caf50
    style I4 fill:#e8f5e9,stroke:#4caf50

Behavior After Migration

After migration, daily scaling operations become very elegant:

OperationHashmod BehaviorJump Hash Behavior
3 → 4 nodes100% redistribution~25% migration
10 → 11 nodes100% redistribution~9.1% migration
100 → 101 nodes100% redistribution~0.99% migration
100 → 100 nodes (no change)No impactNo impact

As cluster size grows, Jump Hash’s advantage becomes more pronounced—which is exactly where our production environment sits (hundreds of vmagent instances).


Performance Analysis

Benchmark Characteristics

MetricHashmodJump Hash
Time complexityO(1)O(1)
Memory allocationNoneNone
Lines of code~5 lines~15 lines
Lookup tableNot neededNot needed
Deterministic✅ Same input → Same output✅ Same input → Same output

Although Jump Hash has an internal for loop, due to the jump property, the expected number of loop iterations is O(ln N). For our node scale (usually < 200), the actual execution count is minimal. Compared to hashmod, there is no observable additional CPU overhead.

Production Benefits

mermaid
flowchart LR
    subgraph Before["Before migration: Hashmod"]
        direction TB
        B1["Scaling triggered"]
        B2["100% redistribution"]
        B3["Aggregation state all invalidated"]
        B4["Data accuracy drops"]
        B5["Monitoring alert false positives"]
        B1 --> B2 --> B3 --> B4 --> B5
    end

    subgraph After["After migration: Jump Hash"]
        direction TB
        A1["Scaling triggered"]
        A2["~1% redistribution"]
        A3["99% aggregation state unaffected"]
        A4["Data accuracy maintained"]
        A5["Frictionless scaling"]
        A1 --> A2 --> A3 --> A4 --> A5
    end

    Before ---|"Same operation"| After

    style B5 fill:#ffcdd2,stroke:#f44336
    style A5 fill:#c8e6c9,stroke:#4caf50

Summary of production advantages from Jump Hash:

  1. Graceful scaling: Adding backends incrementally doesn’t cause full rebalancing
  2. Stable aggregation: Maintains vmagent aggregation state continuity during scaling operations
  3. Reduced load: Minimizes network traffic and processing overhead during migration
  4. Predictable behavior: Distribution quality is mathematically guaranteed, no “unlucky distribution imbalance” situations

Theoretical Foundation

Jump Consistent Hash isn’t invented out of thin air; it has rigorous mathematical proof:

“A Fast, Minimal Memory, Consistent Hash Algorithm” by John Lamping and Eric Veach, Google, 2014 http://arxiv.org/abs/1406.2294

The paper proves two key properties:

  1. Uniform distribution: Each bucket gets approximately 1/N of keys, distribution quality no worse than hashmod
  2. Minimal migration: When N → N+1, exactly K/(N+1) keys need remapping, which is the theoretical optimum

This algorithm has been used internally at Google across multiple systems (including Spanner’s tablet routing), with large-scale production verification.


Summary

This migration from hashmod to Jump Consistent Hash can be summarized in one sentence:

Same O(1) performance, migration cost reduced from 100% to 1%.

DimensionConclusion
Performance impactNo additional overhead, O(1) unchanged
Code change scopeMinimal, only replaced hash function calls
Migration breaking changeOne-time stream_task_id remapping, done within maintenance window
Long-term benefitScaling operations nearly frictionless for production

For any metric routing system using hash % N for sharding that faces dynamic node scaling requirements, Jump Consistent Hash is a worthwhile upgrade path—zero-cost performance with huge operational benefits.

References