Spring Boot 2: Spring WebFlux for Reactive APIs

Spring WebFlux is a relatively new web framework that ships with Spring Boot 2 and can deliver a big improvement in throughput over traditional synchronous (blocking) calls. To quote the Spring documentation itself:
Why was Spring WebFlux created?
Part of the answer is the need for a non-blocking web stack to handle concurrency with a small number of threads and scale with fewer hardware resources. Servlet 3.1 did provide an API for non-blocking I/O. However, using it leads away from the rest of the Servlet API, where contracts are synchronous (Filter, Servlet) or blocking (getParameter, getPart). This was the motivation for a new common API to serve as a foundation across any non-blocking runtime. That is important because of servers (such as Netty) that are well-established in the async, non-blocking space.

Further reading: https://docs.spring.io/spring/docs/current/spring-framework-reference/web-reactive.html

In short, Spring WebFlux is Spring's answer to the reactive web movement that has been generating so much buzz recently. By serving requests from a small number of non-blocking event-loop threads and queueing pending work instead of parking a thread per request, Spring WebFlux avoids the bottleneck of running out of threads that most synchronous web stacks hit once calls start blocking.
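
To get a feel for the difference, here is a minimal sketch (my own illustration with placeholder URLs, not code from the demo repo) of a blocking call next to its non-blocking WebClient equivalent: the blocking variant parks its thread for the whole round trip, while the WebClient variant hands back a Mono immediately and lets the event loop do the waiting.

import org.springframework.web.client.RestTemplate;
import org.springframework.web.reactive.function.client.WebClient;

import reactor.core.publisher.Mono;

public class BlockingVsNonBlocking {

    public static void main(String[] args) {
        // Blocking: this thread is parked for the whole HTTP round trip.
        String blocking = new RestTemplate()
                .getForObject("https://example.org/slow", String.class);
        System.out.println(blocking);

        // Non-blocking: bodyToMono returns immediately; nothing runs until the Mono
        // is subscribed, and the waiting is done by the event loop, not this thread.
        Mono<String> nonBlocking = WebClient.create("https://example.org")
                .get()
                .uri("/slow")
                .retrieve()
                .bodyToMono(String.class);

        nonBlocking.subscribe(System.out::println);
    }
}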

I created a quick, self-contained demo of how performant Spring WebFlux can be; you can download the code here: https://github.com/overtakerx/webfluxdemo

To run the code:
./gradlew clean build
./gradlew bootRun
./gradlew gatlingRun (run this in a separate terminal session from bootRun)



Case Study:
Gatling simulates 3000 users ramping up linearly over 30 seconds (100 users per second), with each user calling the endpoint 30 times, against three endpoints: the standard blocking call, a wrongly written reactive call, and a properly written reactive call.

Outcome:
The Gatling output shows that the normal method call and the wrong WebFlux approach are both far slower than the right WebFlux approach, with over 80% of their calls taking more than 1200 ms.
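
For context on what "wrong" versus "right" means here, the sketch below is my own illustration of the usual patterns (the actual endpoints live in the demo repository linked above, so names, paths and timings are placeholders): the wrong version returns a Mono but still performs blocking work on the event-loop thread when it is subscribed, while the right version never blocks at all.

import java.time.Duration;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

import reactor.core.publisher.Mono;

@RestController
public class ComparisonController {

    // "Wrong" WebFlux: it looks reactive, but the sleep runs on the event-loop
    // thread at subscription time, so the event loop stalls under load.
    @GetMapping("/wrong-reactive")
    public Mono<String> wrongReactive() {
        return Mono.fromCallable(() -> {
            Thread.sleep(300);   // blocking work hidden inside the pipeline
            return "done";
        });
    }

    // "Right" WebFlux: the 300 ms "work" is scheduled on a timer, so no thread
    // is held while waiting and the event loop stays free.
    @GetMapping("/right-reactive")
    public Mono<String> rightReactive() {
        return Mono.just("done").delayElement(Duration.ofMillis(300));
    }
}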


Normal Method Call
Request count: 90000
Min response time (ms): 300
Max response time (ms): 3263
Mean response time (ms): 2405
Std deviation (ms): 939
Response time 50th percentile (ms): 3010
Response time 75th percentile (ms): 3047
Response time 95th percentile (ms): 3091
Response time 99th percentile (ms): 3159
Mean requests/sec: 608.108

Response Time Distribution (count / %)
t < 800 ms: 10466 (12%)
800 ms < t < 1200 ms: 4364 (5%)
t > 1200 ms: 75170 (84%)
Failed: 0 (0%)



Wrong WebFlux Method Call
Request count: 90000
Min response time (ms): 301
Max response time (ms): 3132
Mean response time (ms): 2447
Std deviation (ms): 902
Response time 50th percentile (ms): 3021
Response time 75th percentile (ms): 3051
Response time 95th percentile (ms): 3080
Response time 99th percentile (ms): 3099
Mean requests/sec: 604.027

Response Time Distribution (count / %)
t < 800 ms: 8668 (10%)
800 ms < t < 1200 ms: 4288 (5%)
t > 1200 ms: 77044 (86%)
Failed: 0 (0%)



Right WebFlux Method Call
Request count: 90000
Min response time (ms): 301
Max response time (ms): 2854
Mean response time (ms): 847
Std deviation (ms): 461
Response time 50th percentile (ms): 781
Response time 75th percentile (ms): 1157
Response time 95th percentile (ms): 1685
Response time 99th percentile (ms): 2127
Mean requests/sec: 873.786

Response Time Distribution (count / %)
t < 800 ms: 46151 (51%)
800 ms < t < 1200 ms: 23295 (26%)
t > 1200 ms: 20554 (23%)
Failed: 0 (0%)

Bonus Case Study:
Gatling simulates 100 users ramping up linearly over 30 seconds (roughly 3 users per second), with each user calling the endpoint 30 times, against the DeferredResult and Callable endpoints.

Bonus Outcome:
The Gatling output shows that the Callable method call struggles badly even with only 3000 requests in total, and the DeferredResult method call fares even worse.
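
For reference, Callable and DeferredResult are Spring MVC's own async return types, and handlers for them commonly look like the sketch below (again my own illustration with placeholder paths and timings, not the demo's exact code): the Callable is executed on Spring's async task executor, while the DeferredResult is completed later by some other thread.

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.context.request.async.DeferredResult;

@RestController
public class AsyncMvcController {

    private final ExecutorService executor = Executors.newFixedThreadPool(4);

    // Callable: the servlet thread is released, and Spring runs the Callable on its
    // async task executor before writing the response.
    @GetMapping("/callable")
    public Callable<String> callable() {
        return () -> {
            Thread.sleep(300);   // simulated slow work, now blocking an executor thread instead
            return "done";
        };
    }

    // DeferredResult: the servlet thread is released immediately; the request only
    // completes once some other thread calls setResult.
    @GetMapping("/deferred")
    public DeferredResult<String> deferred() {
        DeferredResult<String> result = new DeferredResult<>();
        executor.submit(() -> {
            try {
                Thread.sleep(300);   // simulated slow work on our own pool
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            result.setResult("done");
        });
        return result;
    }
}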

Callable Method Call
Request count: 3000
Min response time (ms): 302
Max response time (ms): 2519
Mean response time (ms): 1719
Std deviation (ms): 731
Response time 50th percentile (ms): 2152
Response time 75th percentile (ms): 2270
Response time 95th percentile (ms): 2366
Response time 99th percentile (ms): 2433
Mean requests/sec: 23.622

Response Time Distribution (count / %)
t < 800 ms: 583 (19%)
800 ms < t < 1200 ms: 177 (6%)
t > 1200 ms: 2240 (75%)
Failed: 0 (0%)



DeferredResult Method Call
Request count: 3000
Min response time (ms): 302
Max response time (ms): 26483
Mean response time (ms): 7515
Std deviation (ms): 5341
Response time 50th percentile (ms): 6677
Response time 75th percentile (ms): 11697
Response time 95th percentile (ms): 15143
Response time 99th percentile (ms): 23963
Mean requests/sec: 9.615

Response Time Distribution (count / %)
t < 800 ms: 139 (5%)
800 ms < t < 1200 ms: 75 (3%)
t > 1200 ms: 2786 (93%)
Failed: 0 (0%)




Using WebFlux


Do note that WebFlux pushes you toward a functional style: instead of plain imperative code, you compose Reactor operators (flatMap, map and so on) that take lambdas, so the code ends up maximizing the use of lambdas. Here is an example of a chain that fetches content from a website, stores it in a database, transforms it and returns the transformed response:


return WebClient.create(hostUrl).get()
    .uri(uri)
    .retrieve()
    .bodyToMono(Response.class)          // fetch the remote content as a Mono<Response>
    .flatMap(resp -> {
        resp.setQueryData(queryData);
        return dbRepository.save(resp);  // reactive repository: save returns Mono<Response>
    })
    .map(resp -> NewResponse.builder()   // transform into the response sent back to the caller
        .id(resp.getId())
        .build());
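
Returned straight from a handler method, WebFlux itself subscribes to a chain like this, so nothing executes until a request arrives and no request thread is ever blocked while it runs. A minimal sketch of that wiring, in which the controller, the path and the FetchAndStoreService holding the chain above are my own placeholders rather than the demo's actual classes:

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import reactor.core.publisher.Mono;

@RestController
public class FetchAndStoreController {

    private final FetchAndStoreService service;   // assumed service wrapping the WebClient chain shown above

    public FetchAndStoreController(FetchAndStoreService service) {
        this.service = service;
    }

    // WebFlux subscribes to the returned Mono and writes the NewResponse out once
    // the fetch, the reactive save and the mapping have all completed.
    @GetMapping("/fetch")
    public Mono<NewResponse> fetch(@RequestParam String queryData) {
        return service.fetchAndStore(queryData);
    }
}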


