Did Amazon CloudFront CDN make my site faster?

December 28th, 2011

Overview

When I was deploying my website, I ran into a slow page load problem. One of the pages had 9 non-interlaced screenshot PNG images, each about 700 x 500 pixels and between 40 KB and 350 KB in size.

I wondered if deploying these images on Amazon CloudFront would improve response times. Amazon CloudFront is the Content Delivery Network (CDN) offering from Amazon and is one of the cloud services that constitute Amazon Web Services (AWS). 

CDNs are supposed to improve response times by replicating resources across multiple servers around the world, and serving a requested resource from the server closest to the requesting client. The implicit assumption is that the root cause of latency is geographical distance (the greater the distance, the more routers are involved in between), and so serving files from a server that is physically closer should reduce latency.

Since my site was already hosted on Amazon’s EC2, it made sense to try their CloudFront CDN, rather than some other vendor’s CDN. Though this was not a performance critical page, it did provide the opportunity to experiment with CloudFront for a realistic scenario, and the knowledge gained may prove useful in future. So I started experimenting…

 

Setup

I decided to use Amazon S3 as the origin server for CloudFront (the origin server is the server from which CloudFront picks up the resources to replicate). I opted for the "reduced redundancy storage" setting instead of "standard redundancy" for the S3 bucket to minimize costs, and also because these images are already available on my development machine and web server. Standard redundancy makes more sense for user content or critical backups.

 

Evaluation criteria

Better response times would be great.

Even if there was no improvement in response times, a CDN would still reduce the load on my rather underpowered EC2 micro instance web server, and spare me some more connections for more dynamic content, like my SaaS products. So I was already somewhat biased towards using Cloudfront or some other file server before evaluating them.

But CloudFront, like other AWS services, is a metered service. So the evaluation also needed to keep costs in mind.

 

Performance measurements

For response time measurements, I decided to use different tools to get a complete picture:

  • The first set of measurements was taken using browsers. All 3 major browsers (Chrome, Firefox and IE) provide excellent profiling tools for developers.
  • However, browser measurements are not enough. The system should also be tested for scalability. What happens to response times when dozens of concurrent connections request the page? Can the page be served to all those users without much increase in response times? With a single web server on an underpowered machine, this is clearly not possible. But putting a CDN in the mix should shift at least some of the load from my puny, unscalable single web server to Amazon’s mammoth, scalable delivery network. I used the Apache JMeter and Apache Bench (ab) tools to load the server.

Browser measurements

Methodology

Chrome’s developer tools network tab, Firefox Firebug network tab, and IE’s developer tools Network tab provide profiling information.

Chrome and Firefox (via Firebug and Firebug NetExport plugin) can export profiling data to JSON format files called .har files.

IE exports to XML files which have a similar schema to the JSON .har files but expressed in XML.

 

Each browser was tested 5 times with a complete cache cleanup in between. The cache cleanup ensured that all images were downloaded in each test. However, cache cleanup does not clear the browsers’ DNS caches, which means DNS lookup timings are usually manifested only in the first test.

 

A Python script parsed these files, calculated the averages and produced the HTML table of averages below.
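The script itself is not reproduced here. As a rough sketch of the idea, assuming the exports follow the standard HAR layout (a log.entries list where each entry has a request.url, a total time, and a timings object with wait and receive values in milliseconds), the averaging could look something like this. The names average_timings and load_har_files and the file name pattern are hypothetical, and IE's XML exports would need converting to the same structure first.

```python
import json
import posixpath
from collections import defaultdict
from pathlib import Path
from urllib.parse import urlparse


def average_timings(har_logs):
    """Average T (total), R (receive) and W (wait) per image across HAR logs.

    har_logs is an iterable of already-parsed HAR documents (dicts).
    Returns {file_name: {"T": ms, "R": ms, "W": ms}}.
    """
    samples = defaultdict(lambda: {"T": [], "R": [], "W": []})
    for har in har_logs:
        for entry in har["log"]["entries"]:
            name = posixpath.basename(urlparse(entry["request"]["url"]).path)
            if not name.endswith(".png"):
                continue  # only the screenshot images are of interest
            samples[name]["T"].append(entry["time"])
            samples[name]["R"].append(entry["timings"]["receive"])
            samples[name]["W"].append(entry["timings"]["wait"])
    return {
        name: {key: sum(values) / len(values) for key, values in metrics.items()}
        for name, metrics in samples.items()
    }


def load_har_files(pattern="chrome-ownserver-*.har"):
    """Load every HAR file matching a (hypothetical) naming pattern."""
    return [json.loads(p.read_text(encoding="utf-8")) for p in Path(".").glob(pattern)]
```

Calling average_timings(load_har_files()) once per browser and per serving method would give one column of the table each.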

 

Results

Legend to the table:

1st column => the image file name

"OwnServer" => tests in which images were downloaded from my Apache web server running on EC2 and EBS

"Cloudfront" => tests in which images were downloaded from Cloudfront distribution with S3 as origin server

T => Total time for request and response (including time blocked, connect, send request, wait, receive response)

R => Total time for just receiving all the data

W => Time spent waiting before response started

All figures are in milliseconds

                                   Chrome      Chrome      Mozilla     Mozilla     IE          IE
                                   OwnServer   Cloudfront  OwnServer   Cloudfront  OwnServer   Cloudfront
corporatesearch.png             T  14897       7913        9924        13010       9038        11600
158KB                           R  14525       7721        9476        12792       8567        11406
                                W  369         141         447         177         386         153
jobsearch.png                   T  15516       15982       19716       16061       19394       17225
353KB                           R  14795       15666       18328       15856       18629       16997
                                W  359         265         367         162         393         187
p-and-f-charting.png            T  5600        6520        7728        9085        5934        7098
59KB                            R  4869        6218        6341        8842        4683        6246
                                W  365         250         363         199         399         811
s-and-r-charting.png            T  4048        5301        6102        3788        5456        3151
40KB                            R  3313        5006        4347        3550        3700        2614
                                W  366         243         363         177         973         496
dialogs.png                     T  15074       12620       14032       11809       11830       12483
114KB                           R  4492        5056        4737        4402        5288        4346
                                W  315         316         314         180         299         318
candidateshortlist.png          T  11319       7246        9932        10277       7694        9235
160KB                           R  10951       7082        9491        10108       7310        9044
                                W  366         121         440         127         380         147
mainscreen.png                  T  16810       13498       18673       14714       16115       13815
166KB                           R  10758       8243        11210       7201        11113       8579
                                W  309         138         323         1516        337         252
technical-analysis-signals.png  T  11550       13209       12731       9232        11668       8031
112KB                           R  7339        8886        7460        6055        6421        5615
                                W  307         155         351         227         336         221
homepage.png                    T  15834       14030       16209       12099       17235       15956
266KB                           R  15400       13797       14827       11927       16470       15609
                                W  357         181         364         134         396         305

 

Analysis of browser results

The metrics to pay attention to here are R (the average receive times) and W (the wait times).

I didn’t pay much attention to T (the average total times) because I felt they are misleading. The problem is that browsers download embedded resources like <img>s using a small number of connections. When there are more resources than there are connections, the extra resources are blocked until some connections are freed. These blocked times manifest in the T values, but they are not deterministic and are also not similar across browsers since connection implementations differ. Hence, total times should be ignored in my opinion.

What can we observe from the R(eceive) and W(ait) times?

  • Chrome: For 4 out of 9 images, R(eceive) times from CloudFront are lower than from my own server. For the other 5 images, receive times from CloudFront are higher. So it’s almost a tie. However, W(ait) times are consistently lower for CloudFront (dialogs.png is effectively a tie). So, CloudFront leads.
  • Firefox: For 6 out of 9 images, R(eceive) times from CloudFront are lower. W(ait) times are also consistently lower, except in one case, which seems to be an anomaly. CloudFront leads again.
  • IE: For 6 out of 9 images, R(eceive) times from CloudFront are lower. W(ait) times are also consistently lower, except in 2 cases, which seem to be anomalies. CloudFront leads again.
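These tallies can be reproduced from the results table. The following is only a small sketch: the R values are copied from the table above, while the helper name cloudfront_wins and the tuple layout are my own choices for illustration.

```python
# Average R(eceive) times in ms, copied from the results table.
# Each tuple is (Chrome own, Chrome CDN, Mozilla own, Mozilla CDN, IE own, IE CDN).
RECEIVE_MS = {
    "corporatesearch.png": (14525, 7721, 9476, 12792, 8567, 11406),
    "jobsearch.png": (14795, 15666, 18328, 15856, 18629, 16997),
    "p-and-f-charting.png": (4869, 6218, 6341, 8842, 4683, 6246),
    "s-and-r-charting.png": (3313, 5006, 4347, 3550, 3700, 2614),
    "dialogs.png": (4492, 5056, 4737, 4402, 5288, 4346),
    "candidateshortlist.png": (10951, 7082, 9491, 10108, 7310, 9044),
    "mainscreen.png": (10758, 8243, 11210, 7201, 11113, 8579),
    "technical-analysis-signals.png": (7339, 8886, 7460, 6055, 6421, 5615),
    "homepage.png": (15400, 13797, 14827, 11927, 16470, 15609),
}
BROWSERS = ("Chrome", "Mozilla", "IE")


def cloudfront_wins(table):
    """Count, per browser, the images whose CDN receive time beat the own server."""
    wins = {}
    for index, browser in enumerate(BROWSERS):
        own, cdn = 2 * index, 2 * index + 1
        wins[browser] = sum(1 for row in table.values() if row[cdn] < row[own])
    return wins


for browser, count in cloudfront_wins(RECEIVE_MS).items():
    print(f"{browser}: Cloudfront faster for {count} of {len(RECEIVE_MS)} images")
```

The same function works for the W values if the tuples are swapped for the wait times.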

Browser Conclusions

CloudFront does make the site faster, but not as consistently or drastically as I expected, at least in my tests (I’m in India and my nearest edge locations seem to be Singapore or Hong Kong).

One possible factor is that resources may need to get lots of hits before CloudFront caches and serves them effectively. I’m not sure about this, but the CloudFront documentation does seem to hint that more popular resources benefit more.

 

Load measurements using apache bench (ab)

Methodology

ab is incapable of downloading a web page and all its embedded resources. So I ran ab requests on just one of the image files – the biggest one at 350KB.

I set different values for -n and -c options. -k was enabled to simulate browser behaviour by keeping connections alive.
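As a sketch of how such a series of runs could be scripted, the snippet below builds one ab command line per load level. The -n, -c and -k options are the ones described above and the load levels are those listed in the results; the URL is a placeholder and the helper names are hypothetical, not the commands I actually typed.

```python
import subprocess

# (total requests, concurrent users) pairs, as listed in the results
LOAD_LEVELS = [(50, 1), (50, 5), (50, 10), (50, 25), (80, 40), (100, 50)]


def build_ab_command(url, total_requests, concurrency):
    """Return the ab argument list for one run, with keep-alive (-k) enabled."""
    return ["ab", "-k", "-n", str(total_requests), "-c", str(concurrency), url]


def run_all(url):
    """Run ab once per load level and print its report (requires ab on PATH)."""
    for total_requests, concurrency in LOAD_LEVELS:
        command = build_ab_command(url, total_requests, concurrency)
        completed = subprocess.run(command, capture_output=True, text=True, check=True)
        print(completed.stdout)


if __name__ == "__main__":
    # Placeholder URL; substitute the own-server or CDN address of the image.
    run_all("http://example.com/images/jobsearch.png")
```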

 

Results

 

                                      Own server   Cloudfront
50 total requests, 1 user
  Total time                          292.69s      214s
  Mean time / request                 5.85s        4.28s
  Max time taken by 90% of requests   8.4s         4.9s
  Max time taken by 50% of requests   5.35s        4.26s
  Data transferred (bytes)            18051951     18070081
50 total requests, 5 concurrent users
  Total time                          211.97s      215.24s
  Mean time / request                 4.24s        4.30s
  Max time taken by 90% of requests   31.4s        35.3s
  Max time taken by 50% of requests   19.3s        18.7s
  Data transferred (bytes)            18737815     18826276
50 total requests, 10 concurrent users
  Total time                          217.34s      218.7s
  Mean time / request                 4.35s        4.37s
  Max time taken by 90% of requests   58.8s        67.4s
  Max time taken by 50% of requests   40s          27.5s
  Data transferred (bytes)            19460082     19705381
50 total requests, 25 concurrent users
  Total time                          227.57s      239.53s
  Mean time / request                 4.55s        4.79s
  Max time taken by 90% of requests   142.6s       130.6s
  Max time taken by 50% of requests   61.9s        46.7s
  Data transferred (bytes)            20218419     21540782
80 total requests, 40 concurrent users
  Total time                          337.93s      412.69s
  Mean time / request                 4.22s        5.16s
  Max time taken by 90% of requests   235.3s       239.3s
  Max time taken by 50% of requests   82.4s        84s
  Data transferred (bytes)            28945837     34307057
100 total requests, 50 concurrent users
  Total time                          477.91s      477.13s
  Mean time / request                 4.78s        4.77s
  Max time taken by 90% of requests   260.1s       201.7s
  Max time taken by 50% of requests   137.1s       53.3s
  Data transferred (bytes)            34238460     43105119

 

Analysis of ab results

The results are so all over the place that I found it difficult to draw any conclusion!

The 50th percentile results in some tests clearly favour Cloudfront, but not consistently.

 

I also found it hard to understand some of the raw values (not shown here). For example, in the last test with 100 requests across 50 concurrent users, the total time was 477.1s but the longest single request took 454s! How that can be is beyond me. I’m guessing that a request sent fairly early never got a response. It’s possible that this was because the load was too much for my puny 512 kbps bandwidth.

Another thing to notice is that the data volume with CloudFront is noticeably higher at higher loads: about 19% more than my own server at 40 concurrent users and about 26% more at 50. I’m guessing that this is because of TCP retransmissions, though why it is so much more pronounced when communicating with CloudFront is not clear.
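The extra data volume can be worked out directly from the "Data transferred" rows of the ab results. This is just a small sketch of that arithmetic; the byte counts are copied from the results, and the function name extra_data_percent is mine.

```python
# Bytes transferred, from the ab results: (own server, Cloudfront)
DATA_TRANSFERRED = {
    "50 requests, 1 user": (18051951, 18070081),
    "50 requests, 5 users": (18737815, 18826276),
    "50 requests, 10 users": (19460082, 19705381),
    "50 requests, 25 users": (20218419, 21540782),
    "80 requests, 40 users": (28945837, 34307057),
    "100 requests, 50 users": (34238460, 43105119),
}


def extra_data_percent(own_bytes, cdn_bytes):
    """How much more data (in percent) the CDN run transferred than the own-server run."""
    return (cdn_bytes - own_bytes) / own_bytes * 100.0


for label, (own, cdn) in DATA_TRANSFERRED.items():
    print(f"{label}: {extra_data_percent(own, cdn):+.1f}%")
```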

 

Conclusion

I’m reluctant to draw any concrete conclusion from ab results except that 50% of requests seem to be faster most of the time when using Cloudfront.

 

Load measurements using Apache JMeter

Methodology

JMeter was used to test the following loads:

  • 50 total requests with 1 user. Embedded resources were retrieved using a pool of 9 threads (9 because the page had 9 images).
  • 50 total requests across 5 concurrent users. Embedded resources were retrieved using pools of 5 threads each. I used only 5 because JMeter creates a separate pool for each virtual user, which means 5 users x 5 threads = 25 threads would be created, and I was afraid that larger pool sizes might make bandwidth contention a factor in the timings.

Results

 

                                     Ownserver   Cloudfront
50 total requests, 1 user, 9 downloading threads
  avg                                22.8s       20.01s
  90% of requests                    24.4s       25.7s
50 total requests, 5 concurrent users, 5 downloading threads per user
  avg                                75s         48s
  90% of requests                    110.5s      60s

Notes on the test with 5 concurrent users:

  • avg: The Cloudfront average was actually 77s overall, but only 48s if 7 anomalous measurements were removed. Ownserver actually never finished 50 requests, probably because of socket timeouts.
  • 90% of requests: The Cloudfront figure was actually 232.6s, but 34 out of 43 (80%) were within 60s.

 

Analysis of results

When simulating a single user, using Cloudfront didn’t show any major improvement in speed.

But when simulating 5 concurrent users with 5 resource downloading threads per user, I saw interesting results. 7 results timed out with extremely high times like 270 seconds. These I put down as anomalies, possibly because I was overloading my bandwidth.

Without those anomalies included, the average time per request was just 48 seconds when using CloudFront, compared to 75 seconds when not. Also, 80% of the remaining CloudFront timings completed within 60 seconds, whereas the 90th percentile for my own server was 110.5 seconds.

 

Conclusion

So load testing with JMeter shows that Cloudfront is better at higher loads.

 

Measurements using www.webpagetest.org

Methodology

www.webpagetest.org provides automated testing for websites, from client locations around the world.

5 tests were conducted from each location for each method of serving images.

 

Results

Its results come out as follows:

            Served from own server   Cloudfront
New York    8.772 s                  8.911 s
London      8.791 s                  8.703 s

 

Conclusion

Doesn’t look like Cloudfront has improved page speeds.


 

Cost analysis

If the choice is between storing content on an EC2 EBS drive and serving it from EC2 web server, vs. storing it in S3 and serving it via Cloudfront, the following cost components are relevant (as of Dec 2011):

Assume ‘B’ GBs is the size of content (for simplicity, I’ll assume just 1 file of ‘B’ GBs) being stored.

Assume 1 user requests this file each and every second every day, which comes to 86400 requests/day or 2,592,000 requests/month.

via EBS and EC2:

  • EBS storage per GB = $0.10B
  • EBS cost per 1 million IO requests = $0.10 x 2.592 = $0.2592B
  • Data transfer through elastic IP = $0.01B
  • Total: $0.3692B. If that file is 1GB in size, this comes to $0.37

via S3 and Cloudfront:

  • S3 reduced redundancy storage = $0.093B (ignoring S3 IO request costs by assuming this file will be stored just once, and then always served via Cloudfront)
  • Cloudfront data transfer = $0.19B. But as we have seen with the ab tests, at higher loads, more data is transferred due to TCP retransmissions. Assuming 20% extra data is transferred, this will come to $0.228B
  • Cloudfront cost per 10000 HTTP requests = $0.009 x 2592000/10000 = $2.3328
  • Total: $2.3328 + $0.321B. If that file is 1GB in size, this comes to $2.65

So cost-wise too, CloudFront comes out costlier than serving off EBS and EC2. It’s really only its HTTP request costs that tilt the choice away from CloudFront.
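The comparison above can be written out as a short calculation. This sketch simply mirrors the simplified model and the December 2011 prices quoted in this post (including treating the EBS IO request cost as proportional to B); the function names and the overhead parameter are my own for illustration.

```python
# One request per second over a 30-day month, as assumed above.
REQUESTS_PER_MONTH = 86400 * 30


def ebs_ec2_cost(size_gb):
    """Monthly cost in USD of serving a file of size_gb from EBS via EC2."""
    storage = 0.10 * size_gb
    io_requests = 0.10 * (REQUESTS_PER_MONTH / 1_000_000) * size_gb
    transfer = 0.01 * size_gb
    return storage + io_requests + transfer


def s3_cloudfront_cost(size_gb, retransmission_overhead=0.20):
    """Monthly cost in USD of storing the file in S3 and serving it via the CDN."""
    storage = 0.093 * size_gb
    # 20% extra data assumed, based on the ab observations.
    transfer = 0.19 * size_gb * (1 + retransmission_overhead)
    http_requests = 0.009 * REQUESTS_PER_MONTH / 10_000
    return storage + transfer + http_requests


if __name__ == "__main__":
    for size_gb in (1, 5, 10):
        print(f"{size_gb} GB: EBS/EC2 ${ebs_ec2_cost(size_gb):.2f}, "
              f"S3/CDN ${s3_cloudfront_cost(size_gb):.2f}")
```

The per-request charge is a fixed amount that does not depend on B, which is why it dominates the total for small files.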

 


 

Final conclusion

In my case, my website is not a high traffic site. I also didn’t observe any drastic improvement in page speeds, except possibly at high loads (shown by the JMeter results). And cost wise, it’s indeed cheaper to stick with EBS and EC2.

So, should I use Cloudfront or not? I think it’s not needed for my site at the moment.

Categories: AWS