Performance Tuning
Spring Boot is fast out of the box, but production workloads benefit from deliberate tuning across three dimensions: startup time (matters for autoscaling and serverless), throughput (requests per second under load), and memory footprint (cost per instance). This page collects the highest-leverage knobs — JVM flags, pool sizing, lazy initialization, virtual threads, and native images — with guidance on when each pays off.
JVM heap and GC flags
The JVM, not Spring, governs memory and garbage collection. In a container, the most important thing is that the JVM sees the container's limits: modern JDKs (10 and later, plus 8u191 and later) detect cgroup memory limits automatically.
java -XX:MaxRAMPercentage=75.0 \
-XX:+UseG1GC \
-XX:+HeapDumpOnOutOfMemoryError \
-XX:HeapDumpPath=/tmp \
-jar app.jar
| Flag | Effect |
|---|---|
| -XX:MaxRAMPercentage=75.0 | cap heap at 75% of container memory (prefer over fixed -Xmx) |
| -XX:+UseG1GC | the default low-pause collector for most server apps |
| -XX:+UseZGC | ultra-low-pause GC for large heaps / latency-sensitive apps |
| -XX:+HeapDumpOnOutOfMemoryError | capture a heap dump when memory runs out |
Tip: Prefer -XX:MaxRAMPercentage over a hard-coded -Xmx so the same image adapts to whatever memory limit the orchestrator assigns. Leave ~25% headroom for thread stacks, metaspace, and off-heap buffers.
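To make the 75% / 25% split concrete, here is a minimal sketch of the arithmetic, assuming a 1 GiB container limit. The class and method names (HeapBudget, heapBytes, headroomBytes) are hypothetical, and the JVM may round the real heap size slightly, so treat the figures as approximate.

```java
// Sketch: how a container memory limit splits into heap and headroom
// under -XX:MaxRAMPercentage. HeapBudget and its methods are hypothetical names.
final class HeapBudget {

    /** Approximate max heap for a container limit and a MaxRAMPercentage value. */
    static long heapBytes(long containerLimitBytes, double maxRamPercentage) {
        return (long) (containerLimitBytes * (maxRamPercentage / 100.0));
    }

    /** What remains for thread stacks, metaspace, and off-heap buffers. */
    static long headroomBytes(long containerLimitBytes, double maxRamPercentage) {
        return containerLimitBytes - heapBytes(containerLimitBytes, maxRamPercentage);
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024;
        long limit = 1024L * mib; // assumed 1 GiB container limit
        System.out.println("heap budget MiB: " + heapBytes(limit, 75.0) / mib);
        System.out.println("headroom MiB: " + headroomBytes(limit, 75.0) / mib);
        // What this JVM actually resolved as its max heap (depends on flags and limits).
        System.out.println("this JVM max heap MiB: " + Runtime.getRuntime().maxMemory() / mib);
    }
}
```

Running the same class inside the container, with the flags from the command above, is a quick way to confirm that the last line tracks the memory limit rather than the host's RAM.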
Connection pool sizing
The database connection pool is the most common throughput bottleneck. Bigger is not better — an oversized pool overwhelms the database and adds contention. A practical starting point is around 10 connections, tuned from monitoring.
spring:
  datasource:
    hikari:
      maximum-pool-size: 10
      minimum-idle: 10
Full guidance, the sizing formula, and how to read pool metrics live in Connection Pooling.
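As a rough cross-check on a starting value, Little's Law relates the average number of connections in use to request rate and hold time. The sketch below is a generic queueing estimate, not the sizing formula from the Connection Pooling page; the names (PoolEstimate, averageInUse, suggestedPoolSize), the workload figures, and the burst factor are all assumptions to replace with measured values.

```java
// Sketch: back-of-envelope pool sizing via Little's Law (L = arrival rate x hold time).
// All names and numbers here are illustrative assumptions.
final class PoolEstimate {

    /** Average connections in use = requests/second x seconds each request holds a connection. */
    static double averageInUse(double requestsPerSecond, double holdTimeMillis) {
        return (requestsPerSecond * holdTimeMillis) / 1000.0;
    }

    /** Scale the average by a burst factor and round up to a whole connection. */
    static int suggestedPoolSize(double requestsPerSecond, double holdTimeMillis, double burstFactor) {
        return (int) Math.ceil(averageInUse(requestsPerSecond, holdTimeMillis) * burstFactor);
    }

    public static void main(String[] args) {
        // Assumed workload: 200 requests/s, each holding a connection for 25 ms.
        System.out.println("average in use: " + averageInUse(200, 25));
        System.out.println("suggested size: " + suggestedPoolSize(200, 25, 2.0));
    }
}
```

With these assumed inputs the estimate lands near the starting point of 10 given above; if monitoring shows much longer hold times, shortening the queries is usually a better fix than enlarging the pool.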
Lazy initialization
By default Spring creates every bean at startup. With lazy initialization, beans are created only when first used, which can cut startup time noticeably for large applications — useful for dev iteration and fast-scaling environments.
spring:
  main:
    lazy-initialization: true
Warning: Lazy init hides startup errors (a misconfigured bean only fails on first use, possibly under live traffic) and shifts that cost to the first request. Use it in development freely; in production, prefer it only when startup time is critical, and consider @Lazy on specific heavyweight beans instead of the global switch.
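The failure mode in the warning can be shown without Spring. The sketch below is a plain-Java analogy, not Spring's implementation: Lazy is a hypothetical memoizing holder and misconfiguredClient stands in for a bean with a bad configuration.

```java
import java.util.function.Supplier;

// Sketch: why lazy creation moves a configuration error from startup to first use.
// Lazy and misconfiguredClient are hypothetical; this is an analogy, not Spring code.
final class LazyFailureDemo {

    /** Minimal memoizing holder: runs the factory on the first get(), then caches the result. */
    static final class Lazy<T> implements Supplier<T> {
        private final Supplier<T> factory;
        private T value;
        private boolean created;

        Lazy(Supplier<T> factory) {
            this.factory = factory;
        }

        @Override
        public synchronized T get() {
            if (!created) {
                value = factory.get(); // a broken factory throws here, not at construction
                created = true;
            }
            return value;
        }
    }

    /** Stand-in for a bean whose configuration is invalid. */
    static String misconfiguredClient() {
        throw new IllegalStateException("missing required property: api.url");
    }

    public static void main(String[] args) {
        Lazy<String> client = new Lazy<>(LazyFailureDemo::misconfiguredClient);
        System.out.println("startup finished without errors");
        try {
            client.get(); // first use, e.g. the first request that needs the client
        } catch (IllegalStateException e) {
            System.out.println("failed on first use: " + e.getMessage());
        }
    }
}
```

An eager equivalent would call the factory in the constructor and fail before the application ever reported itself ready, which is the safety net the global lazy switch removes.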
Startup time
Beyond lazy init, reduce startup cost by trimming work the application does at boot:
- Limit classpath scanning: fewer starters and narrower @ComponentScan base packages mean less to scan.
- Defer non-essential work: move warm-ups and cache priming into an @Async ApplicationRunner so they don't block readiness.
- Use AppCDS / class data sharing to share parsed class metadata across restarts.
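Spring Boot logs the total startup time. For a per-step breakdown, expose the startup Actuator endpoint; it only returns data when the application is configured with a BufferingApplicationStartup (for example via SpringApplication.setApplicationStartup):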
management:
  endpoints:
    web:
      exposure:
        include: startup
2026-06-13T10:50:02.330 INFO Application : Started Application in 2.41 seconds (process running for 2.83)
Virtual threads (Java 21+)
Traditional servlet apps use a bounded thread pool, so under high concurrency requests queue waiting for a thread. Virtual threads (Project Loom) are cheap, JVM-managed threads — you can have millions — letting a thread-per-request model scale to high concurrency for I/O-bound workloads without an async rewrite.
spring:
  threads:
    virtual:
      enabled: true # requires Java 21+
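With this enabled, Tomcat serves each request on a virtual thread, and @Async/@Scheduled tasks also run on virtual threads. There is no request thread pool to size, although downstream limits such as the database connection pool still bound concurrency.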
Note: Virtual threads shine for I/O-bound work (database calls, HTTP clients). They give little benefit to CPU-bound work. On JDK 21 to 23, code that holds a synchronized block during blocking I/O can pin a carrier thread, so prefer ReentrantLock in hot paths (JDK 24 removes most of this pinning). Requires Java 21 or newer.
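Outside Spring, the same model can be tried with the JDK alone. The sketch below assumes Java 21 or newer and uses a hypothetical VirtualThreadDemo class: each simulated request blocks for 50 ms on its own virtual thread, and shared state is guarded with a ReentrantLock rather than synchronized, in line with the note above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantLock;

// Sketch (Java 21+): one virtual thread per simulated request.
// VirtualThreadDemo, handleRequest and run are hypothetical names.
final class VirtualThreadDemo {
    private static final ReentrantLock LOCK = new ReentrantLock();
    private static int completed = 0; // guarded by LOCK

    /** Simulated blocking I/O followed by a short guarded update. */
    static void handleRequest() {
        try {
            Thread.sleep(50); // stands in for a database or HTTP call
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }
        LOCK.lock(); // a virtual thread waiting on a ReentrantLock does not pin its carrier
        try {
            completed++;
        } finally {
            LOCK.unlock();
        }
    }

    /** Runs the given number of simulated requests and returns how many completed. */
    static int run(int requests) {
        LOCK.lock();
        try {
            completed = 0;
        } finally {
            LOCK.unlock();
        }
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < requests; i++) {
                executor.execute(VirtualThreadDemo::handleRequest);
            }
        } // close() waits for all submitted tasks to finish
        LOCK.lock();
        try {
            return completed;
        } finally {
            LOCK.unlock();
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int done = run(10_000);
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(done + " requests completed in " + elapsedMillis + " ms");
    }
}
```

Because the sleeping threads release their carriers, total wall time stays far below the sum of the individual waits. Replacing the sleep with a CPU-heavy loop removes that advantage, which is the I/O-bound versus CPU-bound distinction made in the note.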
GraalVM native image
For the fastest possible startup and lowest memory, compile to a GraalVM native image with Spring Boot’s AOT support. Startup drops from seconds to milliseconds and RSS shrinks dramatically — ideal for serverless and scale-to-zero.
./mvnw -Pnative native:compile
# or build a native container image:
./mvnw -Pnative spring-boot:build-image
The trade-offs: a longer, memory-hungry build; reflection and dynamic proxies need hints (Spring provides many automatically); and without a JIT compiler profiling the running code, peak throughput can be lower than on the JVM for long-running, high-throughput services.
| Mode | Startup | Memory | Peak throughput | Build time |
|---|---|---|---|---|
| JVM (JIT) | ~1–3 s | higher | highest (after warmup) | fast |
| Native image | ~50 ms | lowest | good, no warmup curve | slow |
Profiling endpoints
Actuator exposes endpoints that help you find the bottleneck before tuning blindly:
management:
  endpoints:
    web:
      exposure:
        include: metrics, threaddump, heapdump, startup
- /actuator/metrics/jvm.memory.used and jvm.gc.pause: heap pressure and GC behaviour.
- /actuator/metrics/http.server.requests: endpoint latency percentiles (see Metrics & Micrometer).
- /actuator/threaddump: spot blocked threads and lock contention.
- /actuator/heapdump: download a dump for offline analysis.
Warning: heapdump and threaddump expose internal state and can be expensive; keep them behind authentication and off public ports. See securing Actuator.
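For a quick local sanity check without Actuator, comparable figures can be read from the JDK's standard java.lang.management beans. The sketch below is an assumption-laden stand-in, not how Actuator or Micrometer are implemented; JvmProbe and its method names are hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadInfo;

// Sketch: reading heap, GC, and thread-state figures from the JDK's management beans.
// JvmProbe is a hypothetical helper, not part of Spring Boot or Actuator.
final class JvmProbe {

    /** Bytes of heap currently in use. */
    static long heapUsedBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Approximate accumulated GC time across all collectors, in milliseconds. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Number of live threads currently blocked waiting for a monitor. */
    static long blockedThreadCount() {
        long blocked = 0;
        for (ThreadInfo info : ManagementFactory.getThreadMXBean().dumpAllThreads(false, false)) {
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                blocked++;
            }
        }
        return blocked;
    }

    public static void main(String[] args) {
        System.out.println("heap used MiB: " + heapUsedBytes() / (1024 * 1024));
        System.out.println("total GC time ms: " + totalGcTimeMillis());
        System.out.println("blocked threads: " + blockedThreadCount());
    }
}
```

A steadily rising heap figure together with growing GC time points at memory pressure, while a persistent count of blocked threads points at lock contention; either is a cue to pull the fuller picture from the endpoints listed above.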
Best Practices
- Measure first with Actuator metrics; tune the proven bottleneck, not a guess.
- Use -XX:MaxRAMPercentage so heap tracks the container limit.
- Right-size the connection pool from pending/timeout metrics, not intuition.
- Use lazy init for dev startup; be cautious enabling it globally in production.
- On Java 21+, enable virtual threads for I/O-bound services; consider native image for fast-scaling/serverless.