BBR Congestion Control Algorithm Deep Dive

BBR (Bottleneck Bandwidth and Round-trip propagation time), developed by Neal Cardwell, Yuchung Cheng, and others at Google, is a widely deployed model-based congestion control algorithm. Unlike traditional loss-based algorithms (Reno, CUBIC), BBR explicitly models the network path by measuring bottleneck bandwidth and propagation delay, then paces data at the bottleneck bandwidth while keeping roughly one BDP (Bandwidth-Delay Product) of data in flight.

BBR’s core insight is that packet loss does not equal congestion. On wireless links, loss can be caused by channel noise rather than genuine link saturation; on deep-buffered links (bufferbloat), queues can grow for a long time before any packet is dropped, so loss arrives too late to be a useful signal. BBR actively measures bandwidth and latency to control the sending rate, rather than passively waiting for loss signals.

BBR’s Core Model

BBR builds its network model on two key measurements:

mermaid
flowchart LR
    subgraph Network Measurements
        BW["BtlBw<br/>Bottleneck Bandwidth<br/>Max delivery rate"]
        RTT["RTprop<br/>Round-trip propagation<br/>Min RTT"]
    end

    BW --> BDP["BDP = BtlBw × RTprop<br/>Bandwidth-Delay Product"]
    RTT --> BDP
    BDP --> OP["Operating Point<br/>inflight ≈ BDP"]
    BDP --> RATE["Send Rate<br/>pacing_rate = BtlBw × gain"]

    style BW fill:#4CAF50,color:#fff
    style RTT fill:#2196F3,color:#fff
  • BtlBw (Bottleneck Bandwidth): The maximum delivery rate of the path, obtained by measuring the maximum delivered bandwidth over a time window. It represents the upper limit of data transmission capacity for the narrowest link on the path.
  • RTprop (Round-trip propagation time): The minimum round-trip time of the path, obtained by measuring the minimum RTT over a time window. It represents the shortest time for a signal to travel through the physical medium, excluding queuing delay.
  • BDP (Bandwidth-Delay Product): The product BtlBw × RTprop, describing the maximum amount of data that can be “in flight” in the pipe without forming a queue.

BBR’s core goal is to keep exactly one BDP of data in flight at the bottleneck: more than that queues in the buffer (increasing latency), while less underutilizes the bandwidth. This equilibrium point is called the Optimal Operating Point.
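
To make the BDP concrete, here is a minimal sketch of the arithmetic. The path figures (100 Mbit/s, 40 ms) and the 1500-byte packet size are illustrative assumptions, not measurements from the article.

```python
def bdp_bytes(btlbw_bits_per_sec: float, rtprop_sec: float) -> float:
    """Bandwidth-delay product in bytes: BtlBw x RTprop."""
    return btlbw_bits_per_sec / 8 * rtprop_sec


# Hypothetical path: 100 Mbit/s bottleneck, 40 ms propagation RTT.
bdp = bdp_bytes(100e6, 0.040)
packets = bdp / 1500  # assuming 1500-byte packets
print(f"BDP = {bdp:.0f} bytes (~{packets:.0f} packets)")
```

On this assumed path the pipe holds about 500 kB. Anything the sender keeps in flight beyond that sits in the bottleneck queue.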

BBR’s Measurement Mechanism

What sets BBR apart from traditional congestion control algorithms is its measurement mechanism. Instead of inferring congestion from loss signals, BBR actively measures two inherent properties of the network path: bottleneck bandwidth (BtlBw) and propagation delay (RTprop).

BtlBw Measurement: Max Filter

BtlBw measurement is based on a key observation: when the sender’s data pipe is fully filled, the rate of ACK returns equals the bottleneck link’s service rate. BBR calculates the delivery rate at the end of each RTT round:

plaintext
# At the end of each RTT round:
delivery_rate = bytes_acked_in_round / time_elapsed
BtlBw = max(delivery_rate samples from the last 10 rounds)  # Windowed max filter

This value passes through a windowed max filter, typically with a 10-RTT window. The design rationale for the windowed max filter:

  1. Eliminate measurement noise: A single delivery rate sample can be affected by ACK compression, receive window limitations, or other transient effects
  2. Track bandwidth changes: A 10-RTT window is short enough to respond quickly to actual bandwidth increases, yet long enough to filter out transient fluctuations
  3. Conservative increase: The max filter ensures BtlBw estimates only increase within the window, until the window expires and a new measurement cycle begins

When no higher delivery_rate appears within one window period (10 RTTs), BBR lowers its BtlBw estimate to respond to an actual decrease in bottleneck bandwidth.
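
The windowed max filter described above can be sketched as follows. This is an illustrative exact sliding-window maximum built on a monotonic deque; the class and method names are hypothetical, and real implementations may use a cheaper approximation that tracks only a few best samples.

```python
from collections import deque


class WindowedMaxFilter:
    """Maximum of the samples seen in the last `window` rounds."""

    def __init__(self, window: int = 10):
        self.window = window
        self._samples = deque()  # (round_no, value), values strictly decreasing

    def update(self, round_no: int, value: float) -> float:
        # Older samples that are not larger than the new one can never
        # be the maximum again, so drop them.
        while self._samples and self._samples[-1][1] <= value:
            self._samples.pop()
        self._samples.append((round_no, value))
        # Expire samples that have fallen out of the window.
        while self._samples[0][0] <= round_no - self.window:
            self._samples.popleft()
        return self._samples[0][1]


btlbw = WindowedMaxFilter(window=10)
for rnd, rate in enumerate([100.0] + [80.0] * 10):
    estimate = btlbw.update(rnd, rate)
    print(rnd, estimate)
```

In this run the estimate holds at 100 for rounds 0 to 9 and drops to 80 at round 10, once the old peak has aged out of the 10-round window.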

RTprop Measurement: Min Filter

RTprop measurement leverages TCP ACK timestamp information. Upon receiving each ACK, BBR calculates that packet’s RTT:

plaintext
# Each time an ACK is received:
RTT = now - packet_send_time
RTprop = min(RTT samples from the last 10 seconds)  # Windowed min filter

This value passes through a windowed min filter, typically with a 10-second window. The design rationale for the windowed min filter:

  1. Filter out queuing delay: Only when the buffer is completely empty does the measured RTT equal the true propagation delay. The min filter automatically selects these “clean” samples.
  2. Adapt to route changes: A 10-second window is long enough to capture queue-draining opportunities, yet short enough to respond to path latency changes caused by IP rerouting.
  3. Robust to noise: Latency noise (scheduling delays, interrupt handling) only increases RTT and will never be selected by the min filter.
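
A minimal sketch of the time-windowed minimum, assuming timestamps in seconds. The `RTpropEstimator` name and its list-based bookkeeping are hypothetical and chosen for readability rather than efficiency.

```python
class RTpropEstimator:
    """Minimum RTT over the samples from the last `window_sec` seconds."""

    def __init__(self, window_sec: float = 10.0):
        self.window_sec = window_sec
        self.samples = []  # (ack_time, rtt)

    def on_ack(self, send_time: float, ack_time: float) -> float:
        rtt = ack_time - send_time
        self.samples.append((ack_time, rtt))
        # Keep only samples that are still inside the time window.
        cutoff = ack_time - self.window_sec
        self.samples = [(t, r) for t, r in self.samples if t >= cutoff]
        return min(r for _, r in self.samples)


est = RTpropEstimator()
print(est.on_ack(send_time=0.000, ack_time=0.040))    # clean 40 ms sample
print(est.on_ack(send_time=4.945, ack_time=5.000))    # queued sample is ignored
print(est.on_ack(send_time=10.445, ack_time=10.500))  # old minimum has expired
```

The second sample carries queuing delay and does not change the estimate. After the 40 ms sample is more than 10 seconds old, the estimate rises to the best recent value, which is how the filter follows a route change.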
mermaid
flowchart TD
    subgraph BtlBw Measurement
        BW1["Send data<br/>Fill the pipe"] --> BW2["Receive ACK<br/>Compute delivery rate"]
        BW2 --> BW3["Max Filter<br/>10 RTT window"]
        BW3 --> BW4["BtlBw estimate"]
    end

    subgraph RTprop Measurement
        RT1["Send data<br/>Record timestamp"] --> RT2["Receive ACK<br/>Compute RTT"]
        RT2 --> RT3["Min Filter<br/>10 sec window"]
        RT3 --> RT4["RTprop estimate"]
    end

    BW4 --> OP["Optimal Operating Point<br/>BDP = BtlBw × RTprop"]
    RT4 --> OP

    style BW4 fill:#4CAF50,color:#fff
    style RT4 fill:#2196F3,color:#fff
    style OP fill:#FF9800,color:#fff

Why Use Windowed Filters?

Network conditions change dynamically. Using windowed filters instead of global max/min strikes a balance between measurement accuracy and adaptability:

  • An unbounded max filter would keep BtlBw at its historical maximum forever, unable to respond to bandwidth decreases
  • An unbounded min filter would keep RTprop at its historical minimum forever, unable to respond to route changes
  • Window lengths are empirically derived: 10 RTTs trades responsiveness to bandwidth changes against rejection of transient samples, and 10 seconds trades responsiveness to route changes against the throughput cost of re-probing RTprop

BBR’s Four-State Machine

BBR cycles through four states to probe and maintain the optimal operating point:

mermaid
flowchart TD
    STARTUP["STARTUP<br/>Gain 2.77<br/>Exponential BW probing"]
    DRAIN["DRAIN<br/>Gain 0.35<br/>Drain startup queue"]
    PROBE_BW["PROBE_BW<br/>Gain cycle 1.25/0.75/1<br/>Periodic BW probing"]
    PROBE_RTT["PROBE_RTT<br/>inflight cut to ~4 packets (v1)<br/>Drain queue<br/>Re-measure RTprop"]

    STARTUP -->|"BW growth slows<br/>or packet loss detected"| DRAIN
    DRAIN -->|"inflight ≤ BDP"| PROBE_BW
    PROBE_BW -->|"10s since last PROBE_RTT"| PROBE_RTT
    PROBE_RTT -->|"RTprop updated<br/>or 200ms elapsed"| PROBE_BW

    style STARTUP fill:#f44336,color:#fff
    style DRAIN fill:#FF9800,color:#fff
    style PROBE_BW fill:#4CAF50,color:#fff
    style PROBE_RTT fill:#2196F3,color:#fff

STARTUP Phase

BBR probes for the bottleneck bandwidth exponentially, much like the slow start phase of traditional congestion control. The send rate doubles each RTT; a pacing gain of about 2.77 (2.89 ≈ 2/ln 2 in v1) is what sustains this doubling while the bandwidth estimate itself is still growing. Detailed STARTUP behavior:

  • Gain is set to 2.77, which allows the delivery rate to double (grow by about 100%) each RTT
  • Continuously monitors bandwidth growth rate; exits when bandwidth growth is below 25% for three consecutive RTTs, indicating the bottleneck bandwidth limit is near
  • If packet loss is detected, it also immediately exits STARTUP
  • Upon exiting STARTUP, the current bandwidth estimate is recorded as BtlBw
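
The exit rule above (three consecutive rounds without 25% bandwidth growth) can be sketched as follows. The class name and the sample bandwidth series are hypothetical, and the loss-based exit is left out for brevity.

```python
class FullPipeDetector:
    """Declare the pipe full after 3 rounds without >= 25% bandwidth growth."""

    GROWTH_THRESHOLD = 1.25
    ROUNDS_REQUIRED = 3

    def __init__(self):
        self.baseline_bw = 0.0
        self.stalled_rounds = 0
        self.full = False

    def on_round_end(self, max_bw: float) -> bool:
        if self.full:
            return True
        if max_bw >= self.baseline_bw * self.GROWTH_THRESHOLD:
            # Still growing quickly: record a new baseline, reset the count.
            self.baseline_bw = max_bw
            self.stalled_rounds = 0
        else:
            self.stalled_rounds += 1
            self.full = self.stalled_rounds >= self.ROUNDS_REQUIRED
        return self.full


detector = FullPipeDetector()
# Hypothetical per-round bandwidth estimates (Mbit/s) on a ~100 Mbit/s path.
for rnd, bw in enumerate([10, 20, 40, 80, 95, 98, 99]):
    print(rnd, bw, detector.on_round_end(bw))
```

In this series the estimate stops growing by 25% after it reaches 80, so the detector reports a full pipe three rounds later and STARTUP would hand over to DRAIN.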

DRAIN Phase

The exponential growth in STARTUP inevitably creates a queue in the buffer. The purpose of DRAIN is to empty this queue:

  • Gain is set to 0.35, drastically reducing the send rate
  • Continuously monitors inflight data until it drops back to BDP level
  • Once the queue is drained, the measured RTT reflects the true RTprop without queuing delay

PROBE_BW Phase

This is BBR’s steady-state phase, accounting for the vast majority of operating time (roughly 98%, since PROBE_RTT takes about 200 ms out of every 10 seconds at most). A periodic gain cycle continuously probes for small changes in available bandwidth.

PROBE_RTT Phase

If more than 10 seconds have elapsed since the last PROBE_RTT, BBR temporarily reduces the send rate to ensure the queue is fully drained, obtaining an accurate RTprop measurement.
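
The four phases and their transitions can be summarized in a small transition function. This is a simplified sketch that follows the state diagram in this article; the function and parameter names are hypothetical, and a real implementation tracks considerably more state.

```python
from enum import Enum, auto


class State(Enum):
    STARTUP = auto()
    DRAIN = auto()
    PROBE_BW = auto()
    PROBE_RTT = auto()


PROBE_RTT_INTERVAL = 10.0  # seconds between RTprop refreshes
PROBE_RTT_DURATION = 0.2   # seconds spent in PROBE_RTT


def next_state(state, *, pipe_full=False, inflight=0.0, bdp=0.0,
               rtprop_age=0.0, probe_rtt_elapsed=0.0):
    """Return the state that follows `state` under the given conditions."""
    if state is State.STARTUP and pipe_full:
        return State.DRAIN
    if state is State.DRAIN and inflight <= bdp:
        return State.PROBE_BW
    if state is State.PROBE_BW and rtprop_age > PROBE_RTT_INTERVAL:
        return State.PROBE_RTT
    if state is State.PROBE_RTT and probe_rtt_elapsed >= PROBE_RTT_DURATION:
        return State.PROBE_BW
    return state


state = State.STARTUP
state = next_state(state, pipe_full=True)                        # DRAIN
state = next_state(state, inflight=400e3, bdp=500e3)             # PROBE_BW
state = next_state(state, inflight=500e3, bdp=500e3, rtprop_age=10.5)
print(state.name)
```

Keeping the transitions in one pure function makes each arrow of the diagram easy to check in isolation.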

PROBE_BW Gain Cycle

The PROBE_BW phase uses an 8-phase fixed gain cycle to continuously probe network conditions:

mermaid
flowchart LR
    P1["Gain 1.0<br/>Steady send"] --> P2["Gain 1.0<br/>Steady send"]
    P2 --> P3["Gain 1.0<br/>Steady send"]
    P3 --> P4["Gain 1.25<br/>↑ Probe bandwidth"]
    P4 --> P5["Gain 0.75<br/>↓ Drain queue"]
    P5 --> P6["Gain 1.0<br/>Steady send"]
    P6 --> P7["Gain 1.0<br/>Steady send"]
    P7 --> P8["Gain 1.0<br/>Steady send"]
    P8 --> P1

    style P4 fill:#f44336,color:#fff
    style P5 fill:#2196F3,color:#fff

Why 1.25 and 0.75?

The gain values are chosen so that the queue built while probing can be drained in the phase that follows:

1.25 gain (Probe Phase): The send rate temporarily exceeds the estimated bottleneck bandwidth by 25%. If the path has spare capacity, the measured delivery rate rises and BtlBw is updated; if not, the delivery rate stays flat, the estimate is unchanged, and the excess forms a small queue at the bottleneck (about a quarter of a BDP after one RTT).

0.75 gain (Drain Phase): The send rate is reduced to 75% of the bottleneck bandwidth, allowing the queue to drain at 25% of the bottleneck rate. The rationale for choosing 0.75 over a lower value:

  • Excessively low gain would cause significant bandwidth underutilization
  • 0.75 is sufficient to fully drain the queue before the next probe phase begins
  • Queue drain time ≈ (excess data sent) / (bottleneck_rate × 0.25), typically completing in one or two RTTs

1.0 gain (Steady Phase): The remaining 6 phases send at exactly the estimated BtlBw, neither building a queue nor wasting bandwidth. These 6 phases occupy most of the cycle, ensuring that overall bandwidth utilization stays near 100%.

The mathematical expectation of the entire cycle gain is exactly 1.0, meaning BBR does not create persistent queue buildup at the bottleneck over the long term.
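
A short sketch can check both claims: that the cycle averages to 1.0, and that the queue built by the 1.25 phase is gone after the 0.75 phase. It uses a simple fluid model that assumes the BtlBw estimate is exact, the path has no spare capacity, and each phase lasts one RTprop; the 500 kB BDP is an illustrative value.

```python
# Gain cycle in the order shown in the diagram above.
GAIN_CYCLE = [1.0, 1.0, 1.0, 1.25, 0.75, 1.0, 1.0, 1.0]

average_gain = sum(GAIN_CYCLE) / len(GAIN_CYCLE)


def queue_trace(bdp_bytes: float) -> list:
    """Bottleneck queue (bytes) at the end of each phase of one cycle."""
    queue = 0.0
    trace = []
    for gain in GAIN_CYCLE:
        # Per phase the sender emits gain * BDP while the bottleneck
        # forwards one BDP; the difference is added to the queue.
        queue = max(0.0, queue + (gain - 1.0) * bdp_bytes)
        trace.append(queue)
    return trace


print("average gain:", average_gain)
print("queue per phase:", queue_trace(500_000.0))
```

Under these assumptions the queue peaks at a quarter of a BDP during the probe phase and returns to zero one phase later, so nothing accumulates from one cycle to the next.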

BBR v1 vs v2 vs v3

Since its initial release in 2016, BBR has undergone three major iterations. The table below compares key differences across versions:

| Feature | BBR v1 (2016) | BBR v2 (2019) | BBR v3 (2023) |
|---|---|---|---|
| Congestion signals | BW + RTT only | BW + RTT + ECN + Loss | BW + RTT + ECN + Loss |
| Loss tolerance | ~15% | ~15% | ~15% |
| ECN support | No | Yes | Yes |
| Reno/CUBIC coexistence fairness | Poor | Significantly improved | Good |
| STARTUP cwnd gain | 2.89 | 2.89 | 2.77 |
| STARTUP pacing gain | 2.89 | 2.89 | 2.77 |
| Bandwidth convergence issues | Present | Partially fixed | Fully fixed |
| PROBE_RTT strategy | inflight=4 (fixed) | inflight=4 (fixed) | Dynamic adjustment |
| Congestion detection trigger | 8 loss events/RTT | 8 loss events/RTT | 6 loss events/RTT |
| Multi-flow fairness | Below average | Good | Excellent |

Key BBRv3 Improvements

BBR v3 is the most mature version to date, with the following major improvements:

  1. Bandwidth convergence fix: Resolved instability in bandwidth estimation when multiple flows share a bottleneck. In v1 and v2, the gain cycle could cause severe bandwidth allocation oscillations under multi-flow competition. v3 achieves smooth convergence through an improved gain phase synchronization mechanism.
  2. STARTUP gain adjustment: Reduced from 2.89 to 2.77, decreasing startup queuing depth and packet loss probability at the bottleneck.
  3. More aggressive congestion detection: The loss event threshold per RTT to trigger STARTUP exit is reduced from 8 to 6, responding to congestion earlier.
  4. Dynamic PROBE_RTT: Instead of always reducing inflight to 4 data segments, BBRv3 dynamically adjusts the probe amount based on the deviation between current RTT and RTprop, reducing unnecessary throughput fluctuations.
  5. Performance tuning: Overall reduced queuing delay and packet loss rate, with approximately 20% latency reduction in standard tests.

Enabling BBR in Linux

BBR has been integrated into the mainline Linux kernel since version 4.9. Below are the complete steps to enable BBR on a Linux system:

Check Kernel Support

bash
# Check current kernel version (requires ≥ 4.9)
uname -r

# Verify BBR module availability
sysctl net.ipv4.tcp_available_congestion_control

# Check if BBR module is loaded
lsmod | grep tcp_bbr
# If not loaded, load it manually
sudo modprobe tcp_bbr

Enable Temporarily

bash
# Set TCP congestion control algorithm to BBR
sudo sysctl -w net.ipv4.tcp_congestion_control=bbr

# Also recommended: switch to fq (Fair Queuing) qdisc
# fq provides better support for BBR's pacing mechanism
sudo sysctl -w net.core.default_qdisc=fq

Permanent Configuration

bash
# Write configuration to sysctl config file
echo 'net.ipv4.tcp_congestion_control=bbr' | sudo tee -a /etc/sysctl.conf
echo 'net.core.default_qdisc=fq' | sudo tee -a /etc/sysctl.conf

# Apply immediately
sudo sysctl -p

Verify

bash
# Confirm the currently active congestion control algorithm
sysctl net.ipv4.tcp_congestion_control

# Check individual TCP connection algorithm
ss -ti | head -20

Why Is fq (Fair Queuing) Needed?

BBR relies on precise pacing (rate shaping) to control the send rate. The fq qdisc provides:

  • Per-flow pacing: Automatically delivers precise rate control for each TCP flow
  • Fair bandwidth allocation: Distributes bandwidth fairly among multiple flows
  • Reduced burstiness: Smooths data transmission, reducing the impact of traffic bursts on bottleneck buffers

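Kernels from 4.13 onward include an internal TCP pacing fallback, so BBR still paces its traffic under pfifo_fast or other qdiscs without pacing support. fq remains the commonly recommended choice because it paces per flow inside the qdisc, which is generally smoother and cheaper than the fallback.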
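
Besides the system-wide sysctl, Linux lets an application choose the congestion control algorithm per socket through the `TCP_CONGESTION` socket option. The sketch below is Linux-only; the `use_bbr` helper is a hypothetical name, and it falls back to the system default when BBR is not loaded or not permitted for unprivileged users.

```python
import socket


def use_bbr(sock: socket.socket) -> str:
    """Try to select BBR for one socket; return the algorithm in use."""
    try:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, b"bbr")
    except OSError:
        # Module not loaded or not in tcp_allowed_congestion_control:
        # keep whatever the system default is.
        pass
    raw = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16)
    return raw.split(b"\0", 1)[0].decode()


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    print("congestion control for this socket:", use_bbr(s))
```

This is useful for trying BBR on a single service without changing the default for every connection on the host.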

BBR and QUIC

QUIC (Quick UDP Internet Connections), designed by Google, has become the transport layer foundation for HTTP/3. BBR plays a central role in QUIC’s congestion control.

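Why QUIC Deployments Choose BBR

The QUIC specification does not mandate a congestion controller (RFC 9002 describes a NewReno-style algorithm as its baseline), but many deployments pair QUIC with BBR for several reasons: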

  1. Model-based advantage: QUIC runs over UDP, and the operating system kernel does not provide congestion control for QUIC. BBR is model-based rather than loss-based, and does not depend on the kernel’s loss detection mechanisms (such as duplicate ACK counting), making it easier to implement precisely in userspace.
  2. Better mobile network performance: Mobile networks frequently experience handovers (WiFi ↔ cellular), signal fluctuations, and other issues causing significant non-congestion loss. Since BBR does not rely on loss signals, it significantly outperforms CUBIC in these scenarios.
  3. Low latency requirements: QUIC’s HTTP/3 traffic is sensitive to the first byte latency. BBR’s PROBE_BW steady state minimizes queuing delay, making it better suited for real-time interactive scenarios than CUBIC’s window-grow-loss-halve cycle.

BBR Support for QUIC Connection Migration

QUIC’s Connection Migration is a key advantage on mobile networks. When a client switches networks (e.g., from WiFi to cellular):

plaintext
Before migration: Path A (WiFi) — BtlBw_A, RTprop_A, BDP_A
After migration:  Path B (Cellular) — BtlBw_B, RTprop_B, BDP_B

BBR’s advantages in this scenario:

  • Fast re-estimation: After connection migration, the new path’s BtlBw and RTprop are completely different. BBR’s STARTUP phase re-probes the new path exponentially, so the number of round trips needed to find its bandwidth limit grows only logarithmically with the BDP.
  • No new handshake: QUIC connection migration keeps the existing connection and only validates the new path, so BBR can begin probing the new path without waiting for a full handshake.
  • Avoids CUBIC’s “cold start” problem: CUBIC requires a lengthy window probing process on a new path, while BBR converges quickly through active measurement.
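
One way to picture the re-estimation step is a connection that discards its path model when traffic moves to a new path. The sketch below is a simplified illustration: `PathModel`, `Connection`, and the string path identifiers are hypothetical names, and real QUIC stacks also perform path validation before trusting the new path.

```python
from dataclasses import dataclass


@dataclass
class PathModel:
    btlbw: float = 0.0             # bytes/sec, 0 means "not measured yet"
    rtprop: float = float("inf")   # seconds, inf means "not measured yet"
    state: str = "STARTUP"

    def bdp(self) -> float:
        if self.btlbw == 0.0 or self.rtprop == float("inf"):
            return 0.0
        return self.btlbw * self.rtprop


class Connection:
    def __init__(self):
        self.path_id = None
        self.model = PathModel()

    def on_packet_from(self, path_id: str) -> None:
        if path_id != self.path_id:
            # The old estimates describe a different bottleneck, so start
            # over in STARTUP with an empty model.
            self.path_id = path_id
            self.model = PathModel()


conn = Connection()
conn.on_packet_from("wifi")
conn.model = PathModel(btlbw=5e6, rtprop=0.020, state="PROBE_BW")
conn.on_packet_from("cellular")
print(conn.model)
```

After the switch the model is back in STARTUP with no bandwidth or RTT estimate, so the first measurements on the new path are not mixed with stale ones.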

Current Deployment Status

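Google’s QUIC implementation supports BBR, and Google has described using it for its own services. Defaults in other stacks vary: because RFC 9002 specifies a NewReno-style controller as its baseline, check the documentation of the specific server or library (for example Caddy or Nginx) rather than assuming BBR is the default. Measurement studies have estimated that BBR and BBR-derived algorithms carry a large share of Internet traffic by volume, on the order of 40%, though such figures depend heavily on methodology.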

BBR in Practice

Ideal Scenarios

mermaid
flowchart TD
    subgraph Ideal Scenarios
        WL["Wireless Networks<br/>WiFi/4G/5G<br/>1-5% random loss"]
        LL["Long-Fat Networks<br/>High BW × High RTT<br/>e.g., submarine cables"]
        BF["Bufferbloat<br/>Deep buffers<br/>Severe queuing delay"]
        RTI["Real-time Interactive<br/>Video conferencing/Cloud gaming<br/>Latency-sensitive"]
    end

    subgraph Considerations
        F1["Coexistence with CUBIC<br/>May dominate bandwidth<br/>Needs rate limiting"]
        F2["Shallow buffers<br/>Gain cycle causes<br/>periodic loss"]
        F3["Short flows (<10 RTT)<br/>BBR hasn't reached<br/>steady state yet"]
        F4["CPU overhead<br/>Precise pacing and<br/>ACK processing cost"]
    end

    WL --> B["BBR<br/>High Performance & Low Latency"]
    LL --> B
    BF --> B
    RTI --> B

BBR excels in the following scenarios:

  1. Wireless networks (WiFi/4G/5G): Traditional algorithms misinterpret random wireless loss as congestion, causing frequent rate reduction. BBR is unaffected by 1-5% random loss, maintaining high throughput on unstable mobile networks.
  2. Long-fat networks: High BDP on transoceanic links makes CUBIC’s window growth extremely slow. BBR converges quickly to the optimal rate through active bandwidth probing, without requiring a lengthy AIMD process.
  3. Bufferbloat scenarios: Under deep buffers, CUBIC continuously fills the buffer causing hundreds of milliseconds of queuing delay. BBR’s model-driven approach operates at the BDP point, avoiding excessive buffer occupancy.
  4. Real-time interactive applications: BBR’s low queuing delay characteristic significantly improves latency-sensitive applications such as video conferencing and cloud gaming.

Considerations

  1. Coexistence with CUBIC: BBR tends to claim unused bandwidth, potentially gaining a disproportionate share when competing with CUBIC flows. In production, rate shaping or fair queuing scheduling is recommended for BBR flows.
  2. Shallow buffers: BBR’s gain cycle (1.25 → 0.75) can cause periodic packet loss with shallow buffers, affecting short connections and small file transfers. Mainline BBR v1 offers no runtime knob for this; v2 and v3 address it by reacting to loss and ECN signals.
  3. Short flows: BBR requires at least a few RTTs to complete STARTUP and reach steady state. For request-response pattern small file transfers (<10 RTT), BBR’s advantages are less obvious, and the aggressive STARTUP probing may even increase latency.
  4. CPU overhead: BBR’s pacing mechanism and ACK processing are more complex than CUBIC. At very high bandwidth (>100Gbps), CPU overhead becomes non-negligible.
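
For the coexistence concern in item 1, one per-socket form of rate shaping on Linux is the `SO_MAX_PACING_RATE` socket option, which caps a flow's pacing rate in bytes per second. The sketch below is Linux-only and rests on two assumptions: Python's `socket` module may not export the constant, so the common value 47 is used as a fallback (it differs on a few architectures), and `cap_pacing_rate` is a hypothetical helper name.

```python
import socket

# Assumption: 47 is the value on most Linux architectures.
SO_MAX_PACING_RATE = getattr(socket, "SO_MAX_PACING_RATE", 47)


def cap_pacing_rate(sock: socket.socket, bytes_per_sec: int) -> bool:
    """Ask the kernel to pace this socket at no more than bytes_per_sec."""
    try:
        sock.setsockopt(socket.SOL_SOCKET, SO_MAX_PACING_RATE, bytes_per_sec)
        return True
    except OSError:
        return False  # option not supported on this platform


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    # Cap this flow at 100 Mbit/s (12.5 MB/s), an illustrative figure.
    print("pacing cap applied:", cap_pacing_rate(s, 12_500_000))
```

A cap like this bounds how much bandwidth a single BBR flow can take from competing CUBIC flows, at the cost of leaving capacity unused when the link is otherwise idle.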

Tuning Recommendations

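For production environments requiring fine-grained control, note that mainline tcp_bbr hard-codes its gains and windows as constants and, as far as the upstream source shows, registers no net.ipv4.tcp_bbr_* sysctls. Out-of-tree BBRv2/v3 builds may expose module parameters instead, so check what the running kernel actually offers before writing any configuration:

bash
# Look for BBR sysctls (prints nothing on stock mainline kernels)
sysctl -a 2>/dev/null | grep tcp_bbr

# Look for BBR module parameters (present only on some out-of-tree builds)
ls /sys/module/tcp_bbr*/parameters/ 2>/dev/null

# Only persist a parameter that the commands above actually list;
# an unknown key in /etc/sysctl.conf makes `sysctl -p` report an error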

Actual results vary by network conditions; A/B testing in the target environment is recommended to determine optimal configuration.

References