Retiring Rack::BodyProxy: Post-Response Hooks with rack.response_finished
The Problem: When Middleware Finishes But Response is Still Streaming
The Rack specification defines a response as a triplet [status, headers, body], but in the real world, significant time can pass between returning this triplet and the actual delivery of the last byte to the client. This is particularly critical when streaming large files or Server-Sent Events.
The Real Problem
Middleware can free resources (close DB connections, clear caches) long before the client receives the complete response. This leads to inaccurate metrics, premature log cleanup, and potential race conditions.
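The gap can be demonstrated without a real server. In this minimal sketch (the names `app` and `events` are illustrative, not part of Rack), a naive middleware cleans up as soon as the app returns its triplet, but the lazy body only streams afterwards:

```ruby
events = []

# A lazy streaming body: the block runs only when the server iterates it
streaming_body = Enumerator.new do |yielder|
  events << :chunk_streamed
  yielder << "chunk"
end

app = ->(env) { [200, {}, streaming_body] }

status, headers, body = app.call({})
events << :middleware_cleanup   # a naive middleware cleans up here...
body.each { |chunk| }           # ...but the server streams only now

events # => [:middleware_cleanup, :chunk_streamed]
```

The cleanup event lands before the first chunk is ever produced, which is exactly the ordering bug described above.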
Historically, this problem was addressed using Rack::BodyProxy — a wrapper around the body object that allowed middleware to register callbacks for response closure. However, this solution introduced new problems.
Architectural Flaws of Rack::BodyProxy
Proxy Object Chains
Each middleware that needed a completion callback would create its own BodyProxy. In a typical Rails application, this could result in chains of 5-10 nested proxies:
```ruby
# Typical chain in Rails applications
original_body = ["Hello World"]
body = Rack::BodyProxy.new(original_body) { logger.info "Request finished" }
body = Rack::BodyProxy.new(body) { metrics.record_latency }
body = Rack::BodyProxy.new(body) { cleanup_thread_locals }
body = Rack::BodyProxy.new(body) { close_db_connections }
# ... and so on
```
Timing Uncertainty
The #close method was called by the server, but the specification didn't guarantee this happened exactly after all data was sent to the client. Depending on the server and buffering settings, callbacks could fire:
- Before data transmission begins to the client
- During transmission (in parallel)
- After transmission but before acknowledgment
- After complete connection termination
Exception Handling Issues
If an exception occurred during body iteration, callbacks in BodyProxy might not execute at all, leading to resource leaks:
```ruby
class ProblematicBody
  def each
    yield "Part 1"
    raise "Something went wrong" # BodyProxy#close might not be called
    yield "Part 2"               # Never reached
  end
end
```
Memory and Performance Impact Analysis
Additional Allocations
Each BodyProxy is an additional object in memory. On high-traffic sites with tens of thousands of requests per second, even small additional allocations create GC pressure.
More critically, these objects can live longer than usual due to closures in callbacks, complicating generational GC work and potentially promoting objects to older generations.
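This lifetime extension can be observed directly. In the following sketch (the `make_callback` helper is illustrative), a callback's closure captures a large request-scoped string; `WeakRef` lets us check that the string stays reachable through the closure even after a GC pass:

```ruby
require "weakref"

def make_callback
  request_buffer = "x" * 1_000_000          # large request-scoped object
  weak = WeakRef.new(request_buffer)
  callback = -> { request_buffer.bytesize } # closure captures the buffer
  [callback, weak]
end

callback, weak = make_callback
GC.start
weak.weakref_alive? # => true: the closure still pins the buffer
```

As long as any proxy in the chain holds the callback, everything the callback's closure references survives with it.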
Method Dispatch Overhead
Every body object method (especially #each) had to pass through the proxy chain. For large streams, this added measurable overhead:
```ruby
# Each call goes through the entire chain
def each(&block)
  @body.each(&block) # Delegate to next in chain
ensure
  @callback.call     # Then execute callback
end
```
Measurable Effect
According to Rack issue tracker data, some applications saw reduced object counts per request and improved tail latency when switching to response_finished, due to more predictable GC behavior.
rack.response_finished: Design of the New Solution
Design Philosophy
The new mechanism follows "convention over configuration" principles. Instead of multiple proxy objects, it uses one standard key in the env hash — "rack.response_finished" — containing an array of callbacks.
```ruby
# Middleware registers callback
def call(env)
  callbacks = env["rack.response_finished"] ||= []
  callbacks << lambda do |env, status, headers, error|
    # Code executes AFTER complete response delivery
    cleanup_resources
    log_metrics(status, headers)
  end

  @app.call(env)
end
```
Execution Guarantees
Unlike BodyProxy#close, new callbacks are guaranteed to be called by the server in three cases:
- Successful completion: after all data is sent to client
- Application exception: even if the body never started iterating
- Server exception: during network or I/O problems
Key Improvement
Callbacks execute even during exceptions, solving the resource leak problem characteristic of BodyProxy.
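The guarantee is easiest to see from the server's side. In this sketch, `serve_request` is a simplified stand-in for a real server loop (not a Rack API): the callbacks fire from an `ensure` block, so they run no matter how the request ended:

```ruby
def serve_request(app, env)
  env["rack.response_finished"] = []
  status = headers = error = nil
  status, headers, body = app.call(env)
  body.each { |chunk| } # "deliver" the body
rescue => e
  error = e
ensure
  # Callbacks fire regardless of how the request ended
  env["rack.response_finished"].each do |cb|
    cb.call(env, status, headers, error)
  end
end

calls = []
failing_app = lambda do |env|
  env["rack.response_finished"] << ->(_env, s, _h, err) { calls << [s, err&.message] }
  raise "boom" # app dies before returning a triplet
end

serve_request(failing_app, {})
calls # => [[nil, "boom"]]
```

Because the app raised before producing a triplet, the callback receives `nil` status and headers plus the error object — matching the "application exception" scenario above.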
Implementation Details and API Contract
Callback Signature
Each callback receives four parameters: (env, status, headers, error). The logic for populating them depends on the completion scenario:
```ruby
# Successful completion
callback.call(env, 200, {"content-type" => "text/html"}, nil)

# Application exception
callback.call(env, nil, nil, StandardError.new("App error"))

# Server exception (e.g., connection reset)
callback.call(env, 500, {}, IOError.new("Connection reset"))
```
Error Handling in Callbacks
Exceptions in one callback shouldn't affect execution of others. Servers typically log such errors but don't interrupt processing of remaining callbacks:
```ruby
# Example of safe callback execution in server
callbacks.each do |callback|
  begin
    callback.call(env, status, headers, error)
  rescue => callback_error
    logger.error "Response finished callback failed: #{callback_error}"
  end
end
```
Thread Safety
Callbacks execute in the same thread as the main request. This simplifies work with thread-local variables but requires caution with blocking operations:
```ruby
# Good pattern: fast cleanup
callbacks << lambda do |env, status, headers, error|
  Thread.current[:request_id] = nil
  ActiveRecord::Base.clear_active_connections!
  Rails.cache.clear if Rails.env.test?
end

# Bad pattern: slow I/O operations
callbacks << lambda do |env, status, headers, error|
  # Don't do this in callbacks!
  send_email_notification(status) # Can take seconds
  upload_logs_to_s3(env[:logs])   # Blocking network operation
end
```
Production Migration Strategies
Phased Approach
Safe migration requires supporting both mechanisms during the transition period. Here's a proven pattern for production-ready middleware:
```ruby
class SafeMigrationMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)

    # Try to use the new API
    if env["rack.response_finished"]
      register_new_callback(env)
    else
      # Fall back to the old mechanism
      body = wrap_with_proxy(body)
    end

    [status, headers, body]
  end

  private

  def register_new_callback(env)
    env["rack.response_finished"] << method(:cleanup_resources)
  end

  def wrap_with_proxy(body)
    Rack::BodyProxy.new(body) { cleanup_resources }
  end

  def cleanup_resources(*) # Accepts any number of arguments
    # Your cleanup logic
    logger.info "Request completed"
  end
end
```
Migration Monitoring
To track migration progress, add metrics showing what percentage of requests use the new API:
```ruby
def call(env)
  status, headers, body = @app.call(env)

  if env["rack.response_finished"]
    StatsD.increment('middleware.response_finished.new_api')
    register_new_callback(env)
  else
    StatsD.increment('middleware.response_finished.fallback')
    body = wrap_with_proxy(body)
  end

  [status, headers, body]
end
```
Rails and Ecosystem Integration
ActionDispatch::Executor
One key improvement in Rails is more reliable operation of ActionDispatch::Executor, which can now clear thread-local variables exactly after response completion.
Popular Gems
Many popular solutions have already added support for the new API or plan to do so:
- rack-timeout: more accurate execution time measurement
- newrelic_rpm: improved performance metrics
- skylight: more precise request tracing
- sentry-ruby: correlating errors with completed requests
Monitoring and Problem Diagnostics
Key Metrics
When migrating to the new API, monitor these metrics:
- Memory: reduction in objects per request
- GC metrics: major GC frequency, pause times
- Latency distribution: especially tail latency (p95, p99)
- Callback errors: exceptions during cleanup
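To catch slow callbacks before they become a problem, each callback can be wrapped with timing. This is a sketch where `record_timing` stands in for your metrics client (StatsD, Datadog, etc.):

```ruby
timings = []
record_timing = ->(name, seconds) { timings << [name, seconds] }

timed_callback = lambda do |env, status, headers, error|
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  sleep 0.01 # stand-in for the real cleanup work
  record_timing.call(
    "response_finished.cleanup",
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  )
end

timed_callback.call({}, 200, {}, nil)
```

Using a monotonic clock avoids skew from system clock adjustments, which matters when you alert on these durations.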
Common Problem
If callbacks take too long to execute, this can block worker processes. Move heavy operations to background jobs.
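One way to keep callbacks fast is to enqueue slow work instead of performing it inline. The in-process `Queue`-backed worker below is a stand-in for a real job system such as ActiveJob or Sidekiq; the message text is illustrative:

```ruby
require "thread"

jobs = Queue.new
worker = Thread.new do
  while (job = jobs.pop)
    job.call
  end
end

results = []
callback = lambda do |env, status, headers, error|
  # Fast: enqueue only; the slow upload happens off the request thread
  jobs << -> { results << "uploaded logs for status #{status}" }
end

callback.call({}, 200, {}, nil)
jobs << nil  # shut the sketch worker down
worker.join
results # => ["uploaded logs for status 200"]
```

The callback itself only pushes a lambda onto the queue, so the request thread is released almost immediately.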
Useful Resources and Further Reading
Official Documentation
- Rack SPEC.rdoc — complete Rack 3.x specification
- Pull Request #1619 — main PR implementing response_finished
- Rails PR #44560 — integration in Rails 7.1+
In-Depth Analysis
- Friendship Ended with Rack::BodyProxy — excellent Rails at Scale article with detailed technical analysis
- Issue #1093: BodyProxy memory concerns — discussion of BodyProxy performance issues
Implementation Examples
- Puma implementation — how Puma implements response_finished support
- Falcon server support — support in Falcon async server
Migration Tools
- ruby-prof — for profiling performance before and after migration
- memory_profiler — analyzing changes in memory allocations
- Datadog Ruby metrics — monitoring GC metrics in production
Learning Recommendation
Start by reading the original Rails at Scale article and studying the Rack pull request. This will give you better understanding of the technical reasons and trade-offs in developing the new API.
Practical Steps
For starting migration, I recommend this action plan:
- Audit existing middleware using BodyProxy
- Update Rack to version 3.x in development environment
- Implement support for both APIs in critical middleware
- Add monitoring to track new API usage
- Test in staging with various completion scenarios
- Gradually deploy to production with metrics monitoring
- Remove BodyProxy fallback after stabilization
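The audit in the first step can be partially automated. This sketch scans a directory tree for files that still reference Rack::BodyProxy; the root path and the search pattern are assumptions to adapt to your repository:

```ruby
require "find"

def body_proxy_usages(root)
  hits = []
  Find.find(root) do |path|
    next unless path.end_with?(".rb")
    File.foreach(path).with_index(1) do |line, lineno|
      hits << [path, lineno] if line.include?("Rack::BodyProxy")
    end
  end
  hits
end

root = ARGV.first || "app"
if Dir.exist?(root)
  body_proxy_usages(root).each do |path, lineno|
    puts "#{path}:#{lineno} still uses Rack::BodyProxy"
  end
end
```

A plain text search misses dynamic uses, but it is usually enough to build the initial migration checklist.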
Conclusion: Path to More Reliable Rack Applications
The transition from Rack::BodyProxy to env["rack.response_finished"] is not just a technical improvement, but a fundamental step toward more predictable and efficient request completion handling in the Rack ecosystem.
The new API solves real production application problems: reduces GC pressure, eliminates race conditions during resource cleanup, and provides execution guarantees even during exceptions. This is especially important for high-throughput applications where every extra allocation and millisecond of delay matters.
The key lesson from this evolution is the importance of proper abstractions in foundational libraries. BodyProxy seemed like an elegant solution, but in practice created more problems than it solved. The new approach is conceptually simpler yet more reliable and efficient.
Key Takeaway
Start migration now with support for both APIs. This will give you experience with the new mechanism without production risk, and you'll be ready for full transition when your stack is updated.