TCP Congestion Control Evolution: From Reno to BBR

The previous articles in this series focused on the architecture and implementation of P2P networks, but the performance bottleneck of P2P applications often lies not in the protocol layer itself, but in the underlying transport protocol — TCP congestion control. A single BitTorrent node may maintain hundreds of concurrent TCP connections, each independently performing congestion control. Choosing the wrong congestion control algorithm can severely degrade bandwidth utilization, especially on high-latency links or links with random packet loss. Understanding the evolution of congestion control not only helps P2P developers optimize transport performance but is also essential for gaining a deep understanding of how the internet transport layer operates.

TCP congestion control is the cornerstone of internet stability. Since Van Jacobson’s seminal 1988 paper at SIGCOMM, congestion control algorithms have evolved over nearly four decades — from heuristic loss-based methods to precise model-based measurement.

Fundamentals of Congestion Control

All TCP congestion control algorithms work around the Congestion Window (CWND) — the maximum amount of unacknowledged data the sender can have in flight:

1
Throughput ≈ min(CWND, Receive Window) / RTT

This formula reveals two constraints on TCP throughput: the sender’s window (CWND) and the receiver’s window (Receiver Window), with the smaller of the two determining actual throughput. CWND is dynamically adjusted by the congestion control algorithm, while the receive window is determined by the receiver’s buffer size.

The core differences between algorithms manifest in three dimensions:

  1. CWND growth rate (slow start vs congestion avoidance) — determines connection startup and bandwidth probing efficiency
  2. Congestion detection signal (packet loss vs delay vs bandwidth measurement) — determines how the algorithm perceives network state
  3. Rate reduction strategy (multiplicative decrease vs model-driven adjustment) — determines recovery speed after congestion events

TCP Reno (1988)

Van Jacobson’s classic AIMD (Additive Increase Multiplicative Decrease) model laid the foundation for congestion control. Its core philosophy can be summarized in one sentence: increase linearly under light load, back off exponentially when congestion is detected.

mermaid
flowchart TD
    subgraph Slow Start
        SS["CWND = IW (≈10 segs)<br/>Per ACK: CWND += 1 MSS<br/>Exponential growth"]
    end

    subgraph Congestion Avoidance
        CA["CWND ≥ ssthresh<br/>Per RTT: CWND += 1 MSS<br/>Linear growth"]
    end

    subgraph Loss Handling
        LOSS["3 duplicate ACKs detected"]
        FR["Fast Recovery<br/>ssthresh = CWND/2<br/>CWND = ssthresh"]
        TO["Timeout<br/>ssthresh = CWND/2<br/>CWND = 1 MSS<br/>Restart slow start"]
    end

    SS -->|"CWND ≥ ssthresh"| CA
    CA -->|"Packet loss"| LOSS
    LOSS --> FR
    LOSS --> TO
    FR -->|"New ACK"| CA
    TO --> SS

During slow start, Reno exponentially probes available bandwidth — each ACK received increments CWND by one MSS (Maximum Segment Size), effectively doubling it every RTT. When CWND reaches the slow start threshold (ssthresh), it transitions to congestion avoidance, incrementing by only 1 MSS per RTT for linear growth. Upon detecting packet loss (three duplicate ACKs), it immediately halves ssthresh to CWND/2, sets CWND to ssthresh, and enters fast recovery.

Reno’s limitation: Conservative on high-bandwidth high-latency paths. Consider a link with 100ms RTT and 1Gbps bandwidth — BDP is approximately 12.5MB. After a single packet loss, CWND halves and requires thousands of RTTs to recover to its original size, meaning several minutes of recovery time with severe bandwidth waste.

TCP CUBIC (2005)

CUBIC is currently the default congestion control algorithm in the Linux kernel, proposed by Injong Rhee and colleagues, specifically designed to address Reno’s inefficiency in high BDP (Bandwidth-Delay Product) networks.

Cubic Growth Function

CUBIC’s most fundamental innovation is using a cubic function to drive window growth:

$$W(t) = C \times (t - K)^3 + W_{max}$$

Where $t$ is the time elapsed since the last congestion event, $W_{max}$ is the window size at the time of congestion, $C$ is a scaling constant (default 0.4), and $K = \sqrt[3]{W_{max} \times \beta / C}$ is the time needed for the window to recover to $W_{max}$.

mermaid
flowchart TD
    A["Time since last<br/>congestion event t"] --> B["CUBIC Window Function"]
    B --> C["W(t) = C × (t - K)³ + Wmax"]
    C --> D{"Window Position"}
    D -->|"Below Wmax<br/>Concave growth"| E["Fast recovery<br/>Catch up quickly"]
    D -->|"Near Wmax<br/>Slow probing"| F["Careful search<br/>Avoid re-congestion"]
    D -->|"Above Wmax<br/>Convex growth"| G["Active probing<br/>New bandwidth ceiling"]
    E --> H{Packet loss}
    F --> H
    G --> H
    H -->|"Yes"| I["CWND × 0.7<br/>Record new Wmax"]
    I --> A

Three Key Properties of CUBIC

CUBIC’s core idea is to use time rather than ACK arrivals to drive window growth, making the growth rate independent of RTT:

  • Concave growth: Fast recovery far below $W_{max}$. After a loss event, CUBIC’s window function rises sharply, quickly pulling CWND back to pre-congestion levels — precisely where Reno performs worst.
  • Convex growth: Active probing above $W_{max}$. Once CWND exceeds the previous congestion window value, the function curve accelerates, actively searching for new bandwidth ceilings.
  • RTT fairness: Flows with different RTTs achieve similar window increments over a fixed time period. In Reno, short-RTT flows grow their congestion windows faster and can “starve” long-RTT flows. CUBIC eliminates this RTT bias.

Why CUBIC Replaced BIC

CUBIC’s predecessor was BIC (Binary Increase Congestion Control), which used binary search to probe bandwidth boundaries — it performed well in high-bandwidth networks but was too aggressive in low-bandwidth scenarios, and its window function shape was not smooth. CUBIC replaced BIC’s piecewise binary search with a smooth cubic function, preserving high BDP efficiency while improving friendliness in low-bandwidth networks.

BBR’s Paradigm Revolution (2016)

Google’s 2016 release of BBR (Bottleneck Bandwidth and Round-trip propagation time) fundamentally changed the traditional paradigm of congestion control. The core insight: Packet loss is not equivalent to congestion.

In wireless networks, cellular networks, and similar environments, packets may be lost due to signal interference or channel attenuation — not because the network link is congested. Traditional loss-based algorithms mistakenly reduce their sending rate in such cases, causing severe bandwidth waste.

mermaid
flowchart LR
    subgraph Traditional Loss-based
        L["Reno/CUBIC<br/>Signal: packet loss"]
        L1["Assumption: loss = congestion"]
        L2["Action: loss → slow down"]
        L --> L1 --> L2
    end

    subgraph BBR Model-based
        M["BBR<br/>Signal: bandwidth + RTT"]
        M1["Model: measure BtlBw + RTprop"]
        M2["Action: operate at BDP point"]
        M --> M1 --> M2
    end

    L2 -.->|"On lossy links<br/>performance degrades"| CMP["Comparison"]
    M2 -.->|"On lossy links<br/>maintains throughput"| CMP

BBR’s Key Innovations

  • Loss-independent: Maintains high throughput even with up to 15% packet loss, while CUBIC’s throughput drops dramatically at just 1% loss.
  • Proactive BDP estimation: By measuring bottleneck bandwidth (BtlBw) and propagation delay (RTprop), BBR calculates the optimal amount of data for the link and operates at the BDP point — neither queuing nor wasting capacity.
  • Bufferbloat avoidance: BBR does not fill up buffers even when they are large, avoiding excessive queuing delay — critical for latency-sensitive applications such as real-time audio/video and online gaming.

Linux Practical: View and Switch Algorithms

Now that we understand the theory, let’s explore how to work with congestion control algorithms on a real Linux system. The following commands let you view and switch the algorithm:

bash
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
# View the currently active congestion control algorithm
sysctl net.ipv4.tcp_congestion_control

# View available algorithms supported by the kernel
sysctl net.ipv4.tcp_available_congestion_control

# Temporarily switch to BBR (lost on reboot)
sudo sysctl -w net.ipv4.tcp_congestion_control=bbr

# Make it permanent: write to sysctl configuration
echo "net.ipv4.tcp_congestion_control=bbr" | sudo tee -a /etc/sysctl.conf

# Check kernel version (BBR requires Linux kernel ≥ 4.9)
uname -r

Example output:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
# Check current algorithm
$ sysctl net.ipv4.tcp_congestion_control
net.ipv4.tcp_congestion_control = cubic

# Check available algorithms
$ sysctl net.ipv4.tcp_available_congestion_control
net.ipv4.tcp_available_congestion_control = reno cubic bbr

# Check kernel version (must be ≥ 4.9 for BBR)
$ uname -r
5.15.0-91-generic

If bbr does not appear in the output, the BBR kernel module has not been loaded. You can load it manually:

bash
1
2
3
4
5
# Load the BBR module
sudo modprobe tcp_bbr

# Confirm the module is loaded
lsmod | grep tcp_bbr

Note that changing the congestion control algorithm is a system-wide operation — all new TCP connections will use the new algorithm. Established connections continue using the algorithm they were created with and are unaffected by the switch.

Performance Comparison

The performance gap between congestion control algorithms becomes stark on lossy links. The following table shows throughput comparison under identical network conditions (100ms RTT, 100Mbps bandwidth):

Packet Loss RateRenoCUBICBBR
0%100%100%100%
0.1%~50%~85%~99%
1%~10%~30%~95%
3%~3%~12%~88%
5%~1%~5%~80%
10%~0.5%~2%~60%

The data reveals a clear picture:

  • At 0.1% loss rate, CUBIC maintains about 85% throughput, but Reno has already lost half its bandwidth.
  • At 1% loss rate, CUBIC drops to ~30%, while BBR still maintains over 95% throughput — this is the typical loss range for wireless and cellular networks.
  • At 5% loss rate, Reno and CUBIC are practically unusable, while BBR still delivers about 80% throughput.

This explains why P2P applications often suffer from poor transport performance on wireless networks — it is not just the P2P protocol at fault; the underlying TCP congestion control algorithm plays an equally critical role.

BBR Version Evolution

Since its initial release, BBR has gone through three major versions:

VersionYearCore Improvements
BBR v12016First model-based congestion control, achieving high throughput with low latency; suffered from poor coexistence fairness with Reno/CUBIC and high packet loss during STARTUP
BBR v22019Added ECN signal support, improved fairness when coexisting with Reno/CUBIC, reduced STARTUP phase packet loss
BBR v32023Fixed bandwidth convergence bugs, reduced STARTUP gain from 2.89 to 2.77, further decreased queuing delay, optimized short-flow performance

Key Drivers of Version Evolution

BBR’s version evolution reflects the deepening understanding of “ideal congestion control” in both academia and industry:

  1. Coexistence fairness: BBR v1 was too aggressive when sharing a bottleneck link with other algorithms,抢占 CUBIC flows’ bandwidth. v2 and v3 progressively improved this through more conservative contention strategies.
  2. STARTUP oscillation: The initial version’s exponential gain during STARTUP was too high (2.89), causing frequent over-probing and packet loss. v3 reduced the gain to 2.77, significantly reducing startup oscillation.
  3. Bandwidth convergence: In multi-flow bottleneck sharing scenarios, v1 and v2 suffered from unstable bandwidth allocation. v3 achieved faster and more stable convergence through algorithmic optimization.

The next article will dive deep into BBR’s internal mechanisms and implementation details, including the complete workflow of its four-state machine (STARTUP/DRAIN/PROBE_BW/PROBE_RTT).

References

  • Jacobson, V. (1988). Congestion avoidance and control. SIGCOMM ‘88.
  • Ha, S., Rhee, I., & Xu, L. (2008). CUBIC: a new TCP-friendly high-speed TCP variant. ACM SIGOPS.
  • Cardwell, N., et al. (2016). BBR: Congestion-based congestion control. ACM Queue.
  • Cardwell, N., et al. (2022). BBR Congestion Control (IETF Draft). https://datatracker.ietf.org/doc/draft-cardwell-iccrg-bbr-congestion-control/
  • Xu, L., Harfoush, K., & Rhee, I. (2004). Binary increase congestion control (BIC) for fast long-distance networks. INFOCOM 2004.