More thorough testing & error handling

Stress testing with ApacheBench and a little bit of penetration testing

Rakha Kanz Kautsar
Scrum.ai

--

As we strive to offer Scrum.ai as Software-as-a-Service, we need to be ready for everything, including scaling and security.

One way to test whether our app can handle a large volume of requests is through stress testing. For this purpose, I came across a handy tool called ab, or ApacheBench. It’s easy to use and gives fairly good metrics to evaluate.

Let’s start with testing our landing page.

ab -n 1000 -c 50 -s 3 https://scrum-ai-test.herokuapp.com/

Here we define the number of requests we want to send in the stress test, 1000 (the landing page just serves static content and shouldn’t get that much traffic). Then we set the concurrency level to 50, simulating 50 users hitting the landing page at the same time. Finally, we set the timeout, i.e. how long to wait for a response before closing the connection, to 3 seconds.

After waiting about half a minute or so, we get the result.

Server Software:        Cowboy
Server Hostname: scrum-ai-test.herokuapp.com
Server Port: 443
SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-AES128-GCM-SHA256,2048,128
TLS Server Name: scrum-ai-test.herokuapp.com
Document Path: /
Document Length: 503 bytes
Concurrency Level: 50
Time taken for tests: 25.238 seconds
Complete requests: 1000
Failed requests: 0
Total transferred: 671000 bytes
HTML transferred: 503000 bytes
Requests per second: 39.62 [#/sec] (mean)
Time per request: 1261.917 [ms] (mean)
Time per request: 25.238 [ms] (mean, across all concurrent requests)
Transfer rate: 25.96 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 751 968 243.6 919 2274
Processing: 245 275 24.2 273 472
Waiting: 244 273 19.9 273 444
Total: 1019 1243 245.7 1194 2562
Percentage of the requests served within a certain time (ms)
50% 1194
66% 1224
75% 1245
80% 1258
90% 1325
95% 1388
98% 2398
99% 2455
100% 2562 (longest request)

We can see that the mean time per request is 1261 ms, of which about 1000 ms goes to connection setup (from Indonesia to US West, as US West is the only region available on Heroku’s Free plan). We can also see that every request finishes within 3 seconds (the longest took 2.5 s).
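As a quick sanity check, ab’s summary numbers are consistent with each other: the reported throughput follows directly from the concurrency level and the mean time per request.

```typescript
// Cross-checking ab's summary: throughput should equal the concurrency level
// divided by the mean time per request (numbers taken from the output above).
const concurrency = 50;
const meanTimePerRequestMs = 1261.917; // "Time per request (mean)"
const requestsPerSecond = concurrency / (meanTimePerRequestMs / 1000);
console.log(requestsPerSecond.toFixed(2)); // 39.62, matching "Requests per second"
```

The same relation explains the total run time: 1000 requests at ~39.6 requests per second is the ~25 seconds ab reports.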

What can we learn from these numbers? For one, Heroku provides a reasonably stable connection to their servers, and our app handles a traffic spike just fine. But we could further improve the connection time with something like a CDN, which would also reduce the amount of traffic reaching our server.

One more endpoint to stress test is the Slack events endpoint. It will easily be the most frequently hit endpoint in our small server: all Slack events arrive there, and if we serve Scrum.ai as a SaaS, every team’s Slack workspace will hit that endpoint at the same time with a large volume of data.
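We don’t show our real handler here, but a minimal sketch of the shape such an endpoint takes (handleSlackEvent is an illustrative name, not our actual code; the url_verification challenge is part of Slack’s Events API):

```typescript
// Hypothetical sketch of a Slack events endpoint's request handling. Slack's
// Events API sends a url_verification challenge on setup and real events as
// JSON; an empty body, like the bare POSTs ab sends, gets rejected.
function handleSlackEvent(rawBody: string): { status: number; body: string } {
  let payload: any;
  try {
    payload = JSON.parse(rawBody);
  } catch {
    return { status: 400, body: "bad request" }; // what an empty POST hits
  }
  if (payload.type === "url_verification") {
    return { status: 200, body: payload.challenge }; // echo the challenge back
  }
  return { status: 200, body: "" }; // acknowledge fast, process asynchronously
}

console.log(handleSlackEvent("").status); // 400
```

This also hints at what to expect from the stress test: ab sends no body, so every response should be a non-2xx.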

ab -n 10000 -c 40 -s 3 -m post https://scrum-ai-test.herokuapp.com/slack/events

We add -m post to define the HTTP method used to hit the URL. Note that this sends an empty request body; to POST a realistic payload, ab can read one from a file with -p and set its content type with -T. Here’s the result.

Server Software:        Cowboy
Server Hostname: scrum-ai-test.herokuapp.com
Server Port: 443
SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-AES128-GCM-SHA256,2048,128
TLS Server Name: scrum-ai-test.herokuapp.com
Document Path: /slack/events
Document Length: 506 bytes
Concurrency Level: 40
Time taken for tests: 290.743 seconds
Complete requests: 10000
Failed requests: 0
Non-2xx responses: 10000
Total transferred: 6910000 bytes
HTML transferred: 5060000 bytes
Requests per second: 34.39 [#/sec] (mean)
Time per request: 1162.970 [ms] (mean)
Time per request: 29.074 [ms] (mean, across all concurrent requests)
Transfer rate: 23.21 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 739 870 163.4 849 8839
Processing: -179 291 131.9 281 11417
Waiting: 0 289 131.2 280 11417
Total: 984 1161 215.8 1133 12248
Percentage of the requests served within a certain time (ms)
50% 1133
66% 1165
75% 1191
80% 1219
90% 1272
95% 1313
98% 1441
99% 1977
100% 12248 (longest request)

We can see an outlier here: a 12-second request. Most likely this is the time Heroku takes to warm up the server (a cold start), since we use the Free plan. Note also the Non-2xx responses line: ab reports 0 failed requests because every connection completed, but all 10,000 responses were non-2xx, which is expected since we POSTed an empty body rather than a valid Slack event payload. Aside from that, it seems normal enough for 10,000 requests. But if we want to scale, we might want to implement an event queue like I mentioned in an earlier blog post.
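An event queue decouples receiving events from processing them, which keeps the endpoint fast under load. A minimal in-memory sketch (names are hypothetical, and a real deployment would use a durable broker such as Redis instead of process memory):

```typescript
// Minimal in-memory event queue sketch (hypothetical, not our actual code).
type SlackEvent = { team: string; type: string };

class EventQueue {
  private items: SlackEvent[] = [];

  // Called by the HTTP handler: store the event and return immediately,
  // so the endpoint can acknowledge Slack within its 3-second window.
  enqueue(event: SlackEvent): void {
    this.items.push(event);
  }

  // Called by a background worker: process everything that has piled up.
  drain(handler: (event: SlackEvent) => void): number {
    let processed = 0;
    while (this.items.length > 0) {
      handler(this.items.shift()!);
      processed++;
    }
    return processed;
  }
}

const queue = new EventQueue();
queue.enqueue({ team: "T1", type: "message" });
queue.enqueue({ team: "T2", type: "reaction_added" });
const processed = queue.drain((e) => console.log(`handling ${e.type} from ${e.team}`));
console.log(processed); // 2
```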

Now, let’s do a little bit of penetration testing. Here, I’ll do white-box testing for injection, specifically SQL injection. We use TypeORM with PostgreSQL, so we want to test whether TypeORM can handle a SQL injection attack.

Here, we try to add a task whose name contains an apostrophe. The apostrophe delimits string literals in SQL, so if input isn’t handled correctly, injecting one should produce an error.
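To see why the apostrophe is dangerous, compare naively concatenating it into SQL with passing it as a bound parameter, which is how TypeORM builds its queries (illustrative strings only, no database involved):

```typescript
// Illustrative only: why an apostrophe in user input matters.
const taskName = "it's done";

// Naive concatenation: the apostrophe closes the SQL string literal early,
// producing broken (and injectable) SQL.
const naive = `INSERT INTO task (name) VALUES ('${taskName}')`;
console.log(naive); // INSERT INTO task (name) VALUES ('it's done')

// Parameter binding: the SQL text and the value travel separately, so the
// driver escapes the value safely.
const bound = { text: "INSERT INTO task (name) VALUES ($1)", values: [taskName] };
console.log(bound.text, bound.values);
```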

But, well, the bot doesn’t respond, which is not good. Still, it’s a good thing we’re doing this penetration testing before the app goes public. Let’s look at the database.

The task is saved into the database correctly, apostrophe included, so it seems TypeORM handles it properly. But the Papertrail integration alerts us to an error in the application:

So the error is in the application itself while handling the request, not in TypeORM’s handling of the SQL injection. It’s a good thing we set up Papertrail in advance to alert us to errors and exceptions.

Anyway, it’s not good at all that our bot didn’t respond. In the future, we must handle these kinds of errors and report them to the user as well.
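A sketch of what that could look like, assuming a hypothetical safeHandle wrapper around each event handler (safeHandle and notifyUser are illustrative names, not our actual code):

```typescript
// Report failures back to the user instead of going silent.
function safeHandle(handler: () => string, notifyUser: (msg: string) => void): void {
  try {
    notifyUser(handler());
  } catch (err) {
    // Surface the failure to the user (Papertrail still logs the exception).
    notifyUser(`Sorry, something went wrong: ${(err as Error).message}`);
  }
}

// Simulate the apostrophe bug: the handler throws while formatting its reply.
const sent: string[] = [];
safeHandle(
  () => { throw new Error("reply formatting failed"); },
  (msg) => sent.push(msg)
);
console.log(sent[0]); // Sorry, something went wrong: reply formatting failed
```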

--


Rakha Kanz Kautsar
Scrum.ai

React Native developer excited about performance and system designs. https://rakha.dev/