Performance Tuning

Spring Boot is fast out of the box, but production workloads benefit from deliberate tuning across three dimensions: startup time (matters for autoscaling and serverless), throughput (requests per second under load), and memory footprint (cost per instance). This page collects the highest-leverage knobs — JVM flags, pool sizing, lazy initialization, virtual threads, and native images — with guidance on when each pays off.

JVM heap and GC flags

The JVM, not Spring, governs memory and garbage collection. In a container, the most important thing is that the JVM sizes itself from the container’s limits rather than the host’s. Modern JDKs (10 and later, and 8u191 and later) are container-aware by default and read the cgroup memory and CPU limits.

java -XX:MaxRAMPercentage=75.0 \
     -XX:+UseG1GC \
     -XX:+HeapDumpOnOutOfMemoryError \
     -XX:HeapDumpPath=/tmp \
     -jar app.jar

Flag	Effect
`-XX:MaxRAMPercentage=75.0`	cap heap at 75% of container memory (prefer over fixed `-Xmx`)
`-XX:+UseG1GC`	low-pause collector, the default on server-class machines (in very small containers the JVM may pick the serial collector, so set it explicitly)
`-XX:+UseZGC`	ultra-low-pause GC for large heaps / latency-sensitive apps
`-XX:+HeapDumpOnOutOfMemoryError`	capture a heap dump when memory runs out

Tip: Prefer -XX:MaxRAMPercentage over a hard-coded -Xmx so the same image adapts to whatever memory limit the orchestrator assigns. Leave ~25% headroom for thread stacks, metaspace, and off-heap buffers.
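
To confirm what the JVM actually picked up inside the container, a small check like the following can help. It is a minimal sketch using only the JDK; the class name JvmLimits is illustrative and not part of Spring Boot.

```java
final class JvmLimits {

    /** Converts a byte count to whole mebibytes (rounded down). */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Largest heap the JVM will try to use, after -Xmx or -XX:MaxRAMPercentage is applied. */
    static long maxHeapMiB() {
        return toMiB(Runtime.getRuntime().maxMemory());
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MiB): " + maxHeapMiB());
        // On container-aware JDKs this reflects the container CPU limit, not the host's core count.
        System.out.println("Available processors: " + Runtime.getRuntime().availableProcessors());
    }
}
```

Run it with the same flags and memory limit as the application. With a 1 GiB limit and -XX:MaxRAMPercentage=75.0, the reported max heap should land near 768 MiB; the exact figure varies a little by collector.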

Connection pool sizing

The database connection pool is the most common throughput bottleneck. Bigger is not better — an oversized pool overwhelms the database and adds contention. A practical starting point is around 10 connections, tuned from monitoring.

spring:
  datasource:
    hikari:
      maximum-pool-size: 10
      minimum-idle: 10

Full guidance, the sizing formula, and how to read pool metrics live in Connection Pooling.
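
As a rough starting point before monitoring data exists, a widely cited heuristic from the HikariCP pool-sizing notes is connections = (database cores × 2) + effective spindle count. The sketch below only encodes that arithmetic. Whether it matches the formula on the Connection Pooling page is an assumption, and the class and method names are illustrative.

```java
final class PoolSizing {

    /**
     * Heuristic starting size for a connection pool.
     *
     * @param dbCoreCount           CPU cores on the database server (not the app server)
     * @param effectiveSpindleCount roughly the number of disks serving the data; 0 if it fits in cache
     */
    static int suggestedPoolSize(int dbCoreCount, int effectiveSpindleCount) {
        if (dbCoreCount < 1 || effectiveSpindleCount < 0) {
            throw new IllegalArgumentException("core count must be >= 1 and spindle count >= 0");
        }
        return dbCoreCount * 2 + effectiveSpindleCount;
    }

    public static void main(String[] args) {
        // A 4-core database server with one disk: (4 * 2) + 1
        System.out.println(suggestedPoolSize(4, 1)); // prints 9
    }
}
```

Treat the result as an initial value for maximum-pool-size and adjust it from observed pending and timeout metrics. Note that it is derived from the database server’s resources, so it is shared across all application instances.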

Lazy initialization

By default Spring creates every bean at startup. With lazy initialization, beans are created only when first used, which can cut startup time noticeably for large applications — useful for dev iteration and fast-scaling environments.

spring:
  main:
    lazy-initialization: true

Warning: Lazy init hides startup errors (a misconfigured bean only fails on first use, possibly under live traffic) and shifts that cost to the first request. Use it in development freely; in production, prefer it only when startup time is critical, and consider @Lazy on specific heavyweight beans instead of the global switch.
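
The failure mode in the warning can be shown without Spring. The following is a plain-Java analogy, not Spring’s actual mechanism: a hypothetical LazyBean holder defers construction to the first get() call, so a broken factory goes unnoticed until then. The property name in the error message is invented for illustration.

```java
import java.util.function.Supplier;

final class LazyInitDemo {

    /** Hypothetical holder that creates its value on first use, not at construction. */
    static final class LazyBean<T> {
        private final Supplier<T> factory;
        private T instance;
        private boolean created;

        LazyBean(Supplier<T> factory) {
            this.factory = factory;
        }

        /** Runs the factory on the first call; later calls return the cached instance. */
        synchronized T get() {
            if (!created) {
                instance = factory.get(); // a misconfiguration surfaces here, not at startup
                created = true;
            }
            return instance;
        }

        synchronized boolean isCreated() {
            return created;
        }
    }

    public static void main(String[] args) {
        // The factory is broken, but nothing fails while the holder is being set up.
        LazyBean<String> reportService = new LazyBean<String>(() -> {
            throw new IllegalStateException("missing property: app.report.dir");
        });
        System.out.println("Startup finished, created = " + reportService.isCreated());

        try {
            reportService.get(); // first use, possibly under live traffic
        } catch (IllegalStateException e) {
            System.out.println("Failed on first use: " + e.getMessage());
        }
    }
}
```

With eager initialization the same exception would abort startup, which is usually the safer place to find it.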

Startup time

Beyond lazy init, reduce startup cost by trimming work the application does at boot:

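Limit classpath scanning: fewer starters and narrower @ComponentScan base packages mean less to scan.
Defer non-essential work: move warm-ups and cache priming into an @Async ApplicationRunner so they don’t block readiness. This only works when async support is switched on with @EnableAsync; without it the runner executes synchronously during startup.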
Use AppCDS / class data sharing to share parsed class metadata across restarts.

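Spring Boot logs the total startup time in its "Started ..." line (sample below), and the startup Actuator endpoint breaks that time down into individual steps. The endpoint only has data to report when the application is created with a BufferingApplicationStartup (for example via SpringApplication.setApplicationStartup), in addition to being exposed: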

management:
  endpoints:
    web:
      exposure:
        include: startup

2026-06-13T10:50:02.330  INFO  Application : Started Application in 2.41 seconds (process running for 2.83)

Virtual threads (Java 21+)

Traditional servlet apps use a bounded thread pool, so under high concurrency requests queue waiting for a thread. Virtual threads (Project Loom) are cheap, JVM-managed threads — you can have millions — letting a thread-per-request model scale to high concurrency for I/O-bound workloads without an async rewrite.

spring:
  threads:
    virtual:
      enabled: true   # requires Java 21+ and Spring Boot 3.2+

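With this enabled, Tomcat serves each request on a virtual thread, and @Async/@Scheduled tasks also run on virtual threads. There is no request thread pool to size, although downstream limits such as the database connection pool still cap how much work runs concurrently.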

Note: Virtual threads shine for I/O-bound work (database calls, HTTP clients) and give little benefit to CPU-bound work. On JDK 21 to 23, code that holds a synchronized block during blocking I/O pins its carrier thread, so prefer ReentrantLock in hot paths; JDK 24 removes most of this pinning (JEP 491).
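
A small JDK-only sketch of the thread-per-request style on virtual threads follows. It does not involve Spring or Tomcat; the class name, the 50 ms sleep standing in for blocking I/O, and the request count are all illustrative assumptions. Shared state is guarded with ReentrantLock rather than synchronized. It needs Java 21 or newer.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.locks.ReentrantLock;

final class VirtualThreadDemo {
    private final ReentrantLock lock = new ReentrantLock();
    private int completed;

    /** Simulates one request: blocking "I/O", then a short critical section. */
    private void handleRequest() throws InterruptedException {
        Thread.sleep(Duration.ofMillis(50)); // stand-in for a database or HTTP call
        lock.lock();
        try {
            completed++;
        } finally {
            lock.unlock();
        }
    }

    /** Runs each simulated request on its own virtual thread and returns how many completed. */
    int runRequests(int count) {
        List<Future<Void>> futures = new ArrayList<>();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < count; i++) {
                futures.add(executor.submit(() -> {
                    handleRequest();
                    return null;
                }));
            }
        } // close() waits for all submitted tasks to finish
        try {
            for (Future<Void> future : futures) {
                future.get(); // rethrows any failure from a task
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while collecting results", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("request failed", e.getCause());
        }
        return completed;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int done = new VirtualThreadDemo().runRequests(10_000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(done + " requests completed in " + elapsedMs + " ms");
    }
}
```

Because each task spends its time blocked rather than on a CPU, the 10,000 tasks overlap and finish in far less than the roughly 500 seconds a single platform thread would need to run them one after another.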

GraalVM native image

For the fastest possible startup and lowest memory, compile to a GraalVM native image with Spring Boot’s AOT support. Startup drops from seconds to milliseconds and RSS shrinks dramatically — ideal for serverless and scale-to-zero.

./mvnw -Pnative native:compile
# or build a native container image:
./mvnw -Pnative spring-boot:build-image

The trade-offs: the build is slower and needs much more memory; reflection and dynamic proxies need hints (Spring provides many automatically); and because the code is compiled ahead of time with no runtime JIT, peak throughput can be lower than on the JVM for long-running, high-throughput services.

Mode	Startup	Memory	Peak throughput	Build time
JVM (JIT)	~1–3 s	higher	highest (after warmup)	fast
Native image	~50 ms	lowest	good, no warmup curve	slow

Profiling endpoints

Actuator exposes endpoints that help you find the bottleneck before tuning blindly:

management:
  endpoints:
    web:
      exposure:
        include: metrics, threaddump, heapdump, startup

/actuator/metrics/jvm.memory.used and /actuator/metrics/jvm.gc.pause: heap pressure and GC behaviour.
/actuator/metrics/http.server.requests: request count, total time, and maximum latency, filterable by tags such as uri. Latency percentiles have to be enabled separately (see Metrics & Micrometer).
/actuator/threaddump: spot blocked threads and lock contention.
/actuator/heapdump: download a dump for offline analysis.

Warning: heapdump and threaddump expose internal state and can be expensive; keep them behind authentication and off public ports — see securing Actuator.
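
When the HTTP endpoint is not reachable, similar per-thread state can be inspected in-process through the JDK’s ThreadMXBean. The sketch below lists threads currently in the BLOCKED state together with the lock they are waiting on. The class and method names are illustrative, and this is not a reproduction of how Actuator builds its response.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

final class BlockedThreads {

    /** Returns one summary line per platform thread that is blocked waiting for a monitor. */
    static List<String> blockedThreadSummaries() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Only request lock details the running JVM says it can provide.
        ThreadInfo[] dump = threads.dumpAllThreads(
                threads.isObjectMonitorUsageSupported(),
                threads.isSynchronizerUsageSupported());

        List<String> summaries = new ArrayList<>();
        for (ThreadInfo info : dump) {
            if (info.getThreadState() == Thread.State.BLOCKED) {
                summaries.add(info.getThreadName()
                        + " is waiting for " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return summaries;
    }

    public static void main(String[] args) {
        List<String> blocked = blockedThreadSummaries();
        System.out.println("Blocked threads: " + blocked.size());
        blocked.forEach(System.out::println);
    }
}
```

Many threads blocked on the same lock owner is the usual signature of contention worth investigating before changing any pool or GC settings.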

Best Practices

Measure first with Actuator metrics; tune the proven bottleneck, not a guess.
Use -XX:MaxRAMPercentage so heap tracks the container limit.
Right-size the connection pool from pending/timeout metrics, not intuition.
Use lazy init for dev startup; be cautious enabling it globally in production.
On Java 21+, enable virtual threads for I/O-bound services; consider native image for fast-scaling/serverless.