Graceful Shutdown
When a deployment rolls or a pod is rescheduled, your application receives a SIGTERM and is expected to stop. If it stops instantly, any requests being processed are dropped and clients see errors. Graceful shutdown keeps the application alive just long enough to finish in-flight requests while refusing new ones, turning a noisy redeploy into a clean one. Spring Boot makes this a one-line setting.
Enabling graceful shutdown
Set the web server to drain requests on shutdown and give it a timeout budget.
server:
  shutdown: graceful  # default is 'immediate' before Spring Boot 3.4, 'graceful' from 3.4 onward
spring:
  lifecycle:
    timeout-per-shutdown-phase: 30s  # max time to wait for in-flight work
When the JVM receives SIGTERM, Spring Boot:
- Stops the web server from accepting new connections.
- Lets in-flight requests run to completion, up to timeout-per-shutdown-phase.
- Closes the application context (datasources, executors, message listeners).
- Exits.
Output (console on SIGTERM):
2026-06-13T10:40:12.118 INFO o.s.b.w.e.tomcat.GracefulShutdown : Commencing graceful shutdown. Waiting for active requests to complete
2026-06-13T10:40:13.402 INFO o.s.b.w.e.tomcat.GracefulShutdown : Graceful shutdown complete
If requests are still running when the timeout elapses, the server proceeds with shutdown anyway and logs that some requests were not completed.
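The drain-then-stop sequence above can be illustrated without Spring, using a plain ExecutorService. This is a minimal sketch of the general pattern under simplifying assumptions, not Spring Boot's implementation; the class name DrainSketch and the method drain are hypothetical.

```java
import java.time.Duration;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

public class DrainSketch {

    /**
     * Stops intake, waits up to the timeout for running tasks, then forces a stop.
     * Returns true only if every in-flight task finished within the budget.
     */
    public static boolean drain(ExecutorService pool, Duration timeout) {
        pool.shutdown(); // refuse new work; already-submitted tasks keep running
        boolean drained = false;
        try {
            drained = pool.awaitTermination(timeout.toMillis(), TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        if (!drained) {
            // Budget exhausted: interrupt whatever is still running and move on.
            List<Runnable> neverStarted = pool.shutdownNow();
            System.out.println("Drain timed out; " + neverStarted.size() + " queued task(s) never started");
        }
        return drained;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(() -> sleepQuietly(200)); // simulated in-flight request
        System.out.println("drained=" + drain(pool, Duration.ofSeconds(2)));
        try {
            pool.submit(() -> sleepQuietly(1)); // arrives after shutdown began
        } catch (RejectedExecutionException e) {
            System.out.println("new work rejected after shutdown");
        }
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The timeout path mirrors the behaviour described above: once the budget is spent, shutdown proceeds even though some work did not complete.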
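Note: Graceful shutdown is supported on all four embedded servers: Tomcat, Jetty, Undertow, and Reactor Netty (for WebFlux). The property is the same for each, but the way new requests are refused differs. Tomcat, Jetty, and Reactor Netty stop accepting requests at the network layer, while Undertow keeps accepting connections and answers new requests immediately with 503 Service Unavailable.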
Understanding the timeout
spring.lifecycle.timeout-per-shutdown-phase is the grace budget for each lifecycle phase, not just for the web server, so total shutdown time can exceed it when more than one phase is slow to stop. Set it longer than your slowest legitimate request but shorter than the platform’s hard-kill window, or the orchestrator will SIGKILL the process mid-drain.
| Property | Default | Role |
|---|---|---|
| server.shutdown | immediate before Spring Boot 3.4, graceful from 3.4 onward | graceful enables request draining |
| spring.lifecycle.timeout-per-shutdown-phase | 30s | max wait for each shutdown phase |
SIGTERM handling in containers
In a container, the JVM is usually PID 1, and the runtime delivers SIGTERM on docker stop or a Kubernetes pod termination. Spring Boot registers a JVM shutdown hook that triggers the graceful sequence, so as long as SIGTERM reaches the Java process, draining happens automatically.
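The mechanism underneath is the standard JVM shutdown hook, which runs on SIGTERM and SIGINT but never on SIGKILL. The following is a minimal JDK-only sketch of that mechanism, not Spring Boot's own hook; the class ShutdownHookSketch, the method buildHook, and the shuttingDown flag are hypothetical names used for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ShutdownHookSketch {

    /** Set as soon as the hook starts, so other threads can stop taking new work. */
    public static final AtomicBoolean shuttingDown = new AtomicBoolean(false);

    /** Builds (but does not register) the hook thread, so the cleanup logic can be exercised directly. */
    public static Thread buildHook(Runnable cleanup) {
        return new Thread(() -> {
            shuttingDown.set(true);
            cleanup.run();
        }, "shutdown-hook-sketch");
    }

    public static void main(String[] args) throws InterruptedException {
        // The JVM starts registered hook threads when it receives SIGTERM or SIGINT.
        Runtime.getRuntime().addShutdownHook(
                buildHook(() -> System.out.println("Shutdown requested: draining and closing resources")));
        System.out.println("Running; send SIGTERM (kill <pid>) or press Ctrl+C to stop");
        Thread.sleep(Long.MAX_VALUE); // stand-in for the application's real work
    }
}
```

Running this and sending `kill <pid>` prints the cleanup message, while `kill -9 <pid>` does not, which is why the signal must actually reach the Java process.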
A common mistake is launching the app through a shell (ENTRYPOINT java -jar app.jar written in shell form), which makes the shell PID 1 and may not forward SIGTERM to Java. Use the exec form so the JVM is PID 1:
# Correct: exec form — java is PID 1 and receives SIGTERM directly
ENTRYPOINT ["java", "-jar", "app.jar"]
# Risky: shell form — the shell is PID 1 and may swallow the signal
# ENTRYPOINT java -jar app.jar
See Dockerizing Spring Boot for the full image setup.
Kubernetes coordination
Graceful shutdown alone isn’t enough on Kubernetes because of a race: when a pod is deleted, Kubernetes concurrently (a) sends SIGTERM to the container and (b) removes the pod from Service endpoints. Endpoint removal takes time to propagate, so for a brief moment kube-proxy or an ingress controller may still route new traffic to a pod that has already stopped accepting connections, producing dropped requests.
The standard fix is a small preStop sleep that delays SIGTERM until endpoint removal has propagated, plus a terminationGracePeriodSeconds larger than the Spring shutdown timeout.
spec:
  terminationGracePeriodSeconds: 45  # > preStop sleep (5s) + timeout-per-shutdown-phase (30s)
  containers:
    - name: order-service
      lifecycle:
        preStop:
          exec:
            command: ["sh", "-c", "sleep 5"]  # let endpoint removal propagate
      readinessProbe:
        httpGet: { path: /actuator/health/readiness, port: 8080 }
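Spring Boot complements this: once shutdown starts (after SIGTERM arrives, and therefore after the preStop sleep), it switches the readiness state to REFUSING_TRAFFIC, which the readiness health group reports as OUT_OF_SERVICE. A probing Kubernetes then treats the pod as not ready for the rest of the drain. The readiness flip does not happen during the preStop sleep, because the application has not yet been signalled. Spring Boot enables the availability probes automatically when it detects a Kubernetes environment; to enable them explicitly: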
management:
  endpoint:
    health:
      probes:
        enabled: true
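To make the readiness idea concrete, here is a minimal JDK-only sketch of an endpoint that answers 200 while the application accepts traffic and 503 once shutdown has begun. It is not how Spring Boot Actuator is implemented; the class ReadinessSketch, the /ready path, and port 8081 are hypothetical choices for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicBoolean;

public class ReadinessSketch {

    private final AtomicBoolean acceptingTraffic = new AtomicBoolean(true);
    private final HttpServer server;

    /** Starts an HTTP server on the given port (0 picks a free port) with a /ready endpoint. */
    public ReadinessSketch(int port) throws IOException {
        server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/ready", exchange -> {
            boolean ready = acceptingTraffic.get();
            byte[] body = (ready ? "UP" : "OUT_OF_SERVICE").getBytes(StandardCharsets.UTF_8);
            // A probe treats any 2xx/3xx as ready and anything else as not ready.
            exchange.sendResponseHeaders(ready ? 200 : 503, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
    }

    public int port() {
        return server.getAddress().getPort();
    }

    /** Called when shutdown begins: the endpoint reports not-ready from now on. */
    public void beginShutdown() {
        acceptingTraffic.set(false);
    }

    public void stop() {
        server.stop(0);
    }

    public static void main(String[] args) throws IOException {
        ReadinessSketch app = new ReadinessSketch(8081);
        System.out.println("Probe with: curl -i http://localhost:" + app.port() + "/ready");
    }
}
```

The point of the design is that readiness is a separate signal from liveness: the process keeps running and finishing work while telling the platform to send nothing new.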
Warning: Make sure terminationGracePeriodSeconds (Kubernetes) is comfortably larger than timeout-per-shutdown-phase (Spring) plus the preStop sleep, because the grace period starts counting when the preStop hook begins. If the platform’s kill window is shorter, Kubernetes SIGKILLs the JVM mid-drain and the graceful logic is wasted.
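The arithmetic behind this warning can be written down as a small check. This is a sketch under one simplifying assumption: only a single shutdown phase uses its full budget. The class ShutdownBudgetCheck and the method headroom are hypothetical names; the values are the ones used in this section.

```java
import java.time.Duration;

public class ShutdownBudgetCheck {

    /**
     * Time left between the expected end of draining and the SIGKILL deadline.
     * A negative result means Kubernetes could kill the JVM mid-drain.
     */
    public static Duration headroom(Duration terminationGracePeriod,
                                    Duration preStopSleep,
                                    Duration springTimeout) {
        // The grace period clock starts when preStop starts, so both costs count against it.
        return terminationGracePeriod.minus(preStopSleep).minus(springTimeout);
    }

    public static void main(String[] args) {
        Duration headroom = headroom(
                Duration.ofSeconds(45),  // terminationGracePeriodSeconds
                Duration.ofSeconds(5),   // preStop sleep
                Duration.ofSeconds(30)); // timeout-per-shutdown-phase
        System.out.println("headroom=" + headroom.getSeconds() + "s"); // prints headroom=10s
    }
}
```

With the values from this section the drain has 10 seconds to spare; raising the preStop sleep or the Spring timeout without raising terminationGracePeriodSeconds eats into that margin.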
Timeline of a clean shutdown
t=0s   Pod marked Terminating; preStop sleep starts
t=0s   Kubernetes begins removing the pod from Service endpoints
t=5s   preStop finishes; SIGTERM delivered to the JVM (PID 1)
t=5s   Readiness -> OUT_OF_SERVICE; Tomcat stops accepting new connections; in-flight requests continue
t≤35s  All in-flight requests complete; context closes; JVM exits (exit code 143 after SIGTERM)
t=45s  (terminationGracePeriodSeconds) hard SIGKILL; never reached if the drain finished
Best Practices
- Set server.shutdown=graceful in every deployed service.
- Tune timeout-per-shutdown-phase to your slowest real request, not an arbitrary value.
- Launch the JVM with the Dockerfile exec form so it is PID 1 and gets SIGTERM.
- On Kubernetes, add a preStop sleep and set terminationGracePeriodSeconds above the Spring timeout.
- Enable availability probes so readiness flips to OUT_OF_SERVICE during shutdown.