Spring Boot Reactive API

 Spring Boot 2: Using Spring WebFlux & Tomcat for Reactive APIs


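Spring WebFlux is a relatively new reactive web framework, introduced with Spring Framework 5 in 2017 and supported since Spring Boot 2.0. For I/O-heavy workloads it can serve far more concurrent requests with the same number of threads than the traditional blocking stack. To quote the Spring documentation itself: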

Why was Spring WebFlux created?
Part of the answer is the need for a non-blocking web stack to handle concurrency with a small number of threads and scale with fewer hardware resources. Servlet 3.1 did provide an API for non-blocking I/O. However, using it leads away from the rest of the Servlet API, where contracts are synchronous (Filter, Servlet) or blocking (getParameter, getPart). This was the motivation for a new common API to serve as a foundation across any non-blocking runtime. That is important because of servers (such as Netty) that are well-established in the async, non-blocking space.


Further reading: https://docs.spring.io/spring-framework/docs/current/reference/html/web-reactive.html


In summary, Spring WebFlux is Spring's answer to reactive web programming, which has been steadily gaining popularity. Instead of dedicating one thread to each request, it serves requests on a small number of non-blocking threads and parks pending work until the I/O it is waiting for completes. This avoids the thread exhaustion that many blocking web servers run into. However, a reactive stack does not produce a performance gain in every case, as we shall explore in the scenarios below.

Setup:

  • Spring Boot's embedded Tomcat is reconfigured to use only 10 worker threads, so that we can clearly see how Spring WebFlux scales better than Spring Web MVC
  • Gatling is configured to simulate 30 concurrent requests with a 30000 millisecond request timeout
  • Internally, each API calls a standalone WireMock server configured with a 5000 millisecond response delay

I have put together an updated, self-contained demo of Spring WebFlux; you can download the code here: TBD

Scenario 1: Performance gain when there is BLOCKING due to WAITING

There is a performance gain if:

  • The number of concurrent requests exceeds the number of server threads, and
  • A significant part of the processing time is spent waiting for something

Spring Web MVC

With Spring Web MVC, when the number of concurrent requests exceeds the number of server threads, the excess requests have to wait until a thread becomes available. The total response time of these queued requests is therefore wait time + process time. The following is the result of using Spring MVC with RestTemplate:

// Controller
@GetMapping(path = "studentsresttemplate", produces = MediaType.APPLICATION_JSON_VALUE)
public Result getStudentsRestTemplate() throws Exception {
    return studentService.getStudentsRestTemplate(createResultObject());
}

// Service: getForObject blocks the calling thread until the response arrives
public Result getStudentsRestTemplate(Result result) {
    result.setStudents(Arrays.asList(restTemplate.getForObject("http://localhost:9090/customerinfo", Student[].class)));
    result.setEndDtm(LocalDateTime.now());
    return result;
}



As we can see from the Gatling report above, the first 10 requests have a response time of ~5s, the next 10 requests ~10s and the last 10 requests ~15s. This can be seen in detail in the logs:


This behaviour can be fully described as the following sequence diagram:
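The 5s / 10s / 15s pattern follows directly from the setup. Here is a minimal back-of-the-envelope sketch in plain Java, assuming all requests arrive at the same moment and each one holds a worker thread for the whole downstream delay. The class and method names are my own and are not part of the demo project.

```java
import java.util.Arrays;

// Simple model, not a benchmark: it ignores network and framework overhead.
class BlockingWaveModel {

    // Expected response time in seconds for the request at position `index`
    // (0-based) when every request arrives at the same moment.
    static long responseSeconds(int index, int workerThreads, long delaySeconds) {
        int wave = index / workerThreads + 1; // wave 1 is served immediately
        return wave * delaySeconds;           // earlier waves must finish first
    }

    public static void main(String[] args) {
        long[] times = new long[30];
        for (int i = 0; i < times.length; i++) {
            times[i] = responseSeconds(i, 10, 5); // 10 threads, 5s WireMock delay
        }
        System.out.println(Arrays.toString(times));
    }
}
```

With 10 worker threads and a 5 second delay, the model gives 5s for requests 1 to 10, 10s for requests 11 to 20 and 15s for requests 21 to 30, which matches the measured response times described above.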

Spring WebFlux

With Spring WebFlux, the server keeps processing requests in a non-blocking fashion, so no request thread sits blocked while it waits for the downstream call. The following is the result of using the Spring WebFlux WebClient the right way:

// Controller
@GetMapping(path = "studentsreactive", produces = MediaType.APPLICATION_JSON_VALUE)
public Mono<Result> getStudentsReactive() throws Exception {
    return studentService.getStudentsReactive(createResultObject());
}

// Service: returns immediately; the map step runs once the response arrives
public Mono<Result> getStudentsReactive(Result result) {
    return client.get().uri("/customerinfo")
            .retrieve()
            .bodyToMono(Student[].class)
            .map(p -> {
                result.setStudents(Arrays.asList(p));
                result.setEndDtm(LocalDateTime.now());
                return result;
            });
}


As we can see from the Gatling report above, all 30 concurrent requests have a response time of ~5s, ranging from 5.4s to 5.5s. This can be seen in detail in the logs:


This can be summarized by the following sequence diagram:
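To make the idea concrete without any Spring dependency, here is a rough JDK-only analogy. It is a sketch of the principle, not of how WebFlux or WebClient work internally: a single timer thread stands in for the non-blocking I/O layer, and each simulated downstream call is a CompletableFuture that gets completed later. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class NonBlockingWaitSketch {

    // Returns a future that the timer thread completes after delayMs.
    // No thread sleeps while the simulated downstream call is pending.
    static CompletableFuture<String> delayedResponse(
            ScheduledExecutorService timer, int id, long delayMs) {
        CompletableFuture<String> response = new CompletableFuture<>();
        timer.schedule(() -> { response.complete("response-" + id); },
                delayMs, TimeUnit.MILLISECONDS);
        return response;
    }

    // Starts all simulated calls at once and returns the elapsed time in
    // milliseconds until every one of them has completed.
    static long runAll(int requests, long delayMs) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        try {
            long start = System.nanoTime();
            List<CompletableFuture<String>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                pending.add(delayedResponse(timer, i, delayMs));
            }
            // Only this caller waits, and only so that we can measure the total.
            CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } finally {
            timer.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("30 requests took ~" + runAll(30, 500) + " ms");
    }
}
```

Because the waits overlap instead of queueing behind each other, the total time stays close to a single delay regardless of how many requests are in flight, which is the same effect the Gatling report shows for the reactive endpoint.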




As a bonus, here is the WRONG way to use Spring WebFlux:

// Controller
@GetMapping(path = "studentsreactiveblocking", produces = MediaType.APPLICATION_JSON_VALUE)
public Result getStudentsWebClientBlocking() throws Exception {
    return studentService.getStudentsWebClientBlocking(createResultObject());
}

// Service: block() parks the calling thread until the response arrives
public Result getStudentsWebClientBlocking(Result result) {
    Result block = client.get().uri("/customerinfo")
            .retrieve()
            .bodyToMono(Student[].class)
            .map(p -> {
                result.setStudents(Arrays.asList(p));
                result.setEndDtm(LocalDateTime.now());
                return result;
            }).block();
    result.setStudents(block.getStudents());
    result.setEndDtm(LocalDateTime.now());
    return result;
}

And the result of the wrong way:


As we can see, with a blocking WebClient call, WebFlux is effectively downgraded to an overly complicated equivalent of a RestTemplate call.
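The same effect can be reproduced with plain JDK classes. In this sketch, which is an analogy under my own assumptions rather than the demo code, each worker thread starts an asynchronous call and then immediately waits for it with join(), much like block() does. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class BlockingOnFutureSketch {

    // Runs `requests` simulated calls on a pool of `workers` threads and
    // returns the elapsed time in milliseconds until all have finished.
    static long runAll(int requests, int workers, long delayMs) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        ExecutorService workerPool = Executors.newFixedThreadPool(workers);
        try {
            long start = System.nanoTime();
            List<CompletableFuture<String>> results = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final int id = i;
                results.add(CompletableFuture.supplyAsync(() -> {
                    CompletableFuture<String> response = new CompletableFuture<>();
                    timer.schedule(() -> { response.complete("response-" + id); },
                            delayMs, TimeUnit.MILLISECONDS);
                    // Waiting here ties up the worker thread for the full delay.
                    return response.join();
                }, workerPool));
            }
            results.forEach(CompletableFuture::join);
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } finally {
            workerPool.shutdown();
            timer.shutdown();
        }
    }

    public static void main(String[] args) {
        // 30 requests, 10 workers, 500 ms delay: roughly three waves of 500 ms.
        System.out.println("took ~" + runAll(30, 10, 500) + " ms");
    }
}
```

Although the downstream call itself is asynchronous, every worker is occupied while it waits, so the requests are served in waves again, just as with RestTemplate.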


Scenario 2: No performance gain when the WAITING is due to CPU intensive process

There will not be any performance gain if:

  • The number of concurrent requests exceeds the number of server threads, and
  • A significant part of the processing time is spent on CPU-intensive work

Spring Web MVC

In this scenario, the long-running, CPU-intensive process calculates PI through 1 million iterations. The result of the non-reactive code is as follows:
// Controller
@GetMapping(path = "calcpi", produces = MediaType.APPLICATION_JSON_VALUE)
public Result calcPi(@RequestParam Long itr) throws Exception {
    return studentService.calculatePi(itr, createResultObject());
}

// Service: the calculation runs on the request thread
public Result calculatePi(final long iterations, Result result) {
    result.setPiResult(calcPi(iterations));
    result.setEndDtm(LocalDateTime.now());
    return result;
}


As we can see from the Gatling report above, the first 10 requests have a response time of ~13s, the next 10 requests ~26s, and the last 10 requests exceeded the timeout period and failed.
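The calcPi helper called by the service is not shown in this post. As an illustration only, a hypothetical version based on the Leibniz series could look like the sketch below. The real implementation in the demo may use a different algorithm and return type, and may be considerably more expensive per iteration.

```java
// Hypothetical stand-in for the calcPi helper; not the demo's actual code.
class PiSketch {

    // Leibniz series: pi = 4 * (1 - 1/3 + 1/5 - 1/7 + ...),
    // summed over the first `iterations` terms.
    static double calcPi(long iterations) {
        double sum = 0.0;
        double sign = 1.0;
        for (long k = 0; k < iterations; k++) {
            sum += sign / (2.0 * k + 1.0);
            sign = -sign; // terms alternate between plus and minus
        }
        return 4.0 * sum;
    }

    public static void main(String[] args) {
        System.out.println(calcPi(1_000_000L));
    }
}
```

The important property for this scenario is that the loop never waits for anything: the thread running it is busy from start to finish.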

Spring WebFlux

Meanwhile, the results of the Spring WebFlux code are much the same, if not slightly worse:

// Controller
@GetMapping(path = "calcpireactive", produces = MediaType.APPLICATION_JSON_VALUE)
public Mono<Result> calcPiReactive(@RequestParam Long itr) throws Exception {
    return studentService.calculatePiReactive(itr, createResultObject());
}

// Service: map still runs the calculation on the subscribing thread
public Mono<Result> calculatePiReactive(final Long iterations, Result result) {
    return Mono.just(iterations).map(i -> {
        result.setPiResult(calcPi(i));
        return result;
    });
}




The simple reason there is no performance gain is that the work is CPU-bound: a thread calculating PI is never idle, so there is no waiting time for a non-blocking framework to reclaim. In this scenario we have 10 threads trying to calculate the value of PI on only 4 CPU cores, which also adds expensive context-switching overhead.

A reactive implementation of the PI calculation does not help, because Mono.just(...).map(...) still runs the calculation on the thread that subscribes, which here is a server request thread. We remain limited by the number of CPU cores and exposed to the same context-switching and scheduling overhead.
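A small model makes the core limit explicit. This is a sketch under simplifying assumptions: all tasks start together, need the same amount of CPU time, and context-switching overhead is ignored, so real numbers will be worse. The 2.0 seconds of CPU time per task is a made-up input, not a measurement from the demo.

```java
// Idealised model of CPU-bound work sharing a fixed number of cores.
class CpuBoundModel {

    // Best-case wall-clock seconds until all tasks finish. Once there are
    // more tasks than cores, each task only gets a share of a core.
    static double idealSeconds(int tasks, int cores, double cpuSecondsPerTask) {
        double slowdown = Math.max(1.0, (double) tasks / cores);
        return slowdown * cpuSecondsPerTask;
    }

    public static void main(String[] args) {
        System.out.println("cores on this machine: "
                + Runtime.getRuntime().availableProcessors());
        // 10 busy threads on 4 cores, hypothetical 2.0 CPU-seconds per task.
        System.out.println(idealSeconds(10, 4, 2.0));
    }
}
```

Nothing in the formula depends on whether the framework is blocking or non-blocking, which is why switching to WebFlux cannot shorten these response times.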


Conclusion

For applications that spend much of their time waiting, typically on databases, external APIs, messaging queues and so on, Spring WebFlux offers a significant performance gain thanks to its non-blocking request handling. For CPU-bound work, however, a non-blocking framework offers no advantage and may even be a disadvantage, because throughput is capped by the number of CPU cores and extra threads only add context-switching overhead.




