Did Amazon CloudFront CDN make my site faster?
Overview
When I was deploying my website, I ran into a slow page load problem. One of the pages had 9 non-interlaced PNG screenshots, each about 700 x 500 pixels and between 40 KB and 350 KB in file size.
I wondered if deploying these images on Amazon CloudFront would improve response times. Amazon CloudFront is the Content Delivery Network (CDN) offering from Amazon and is one of the cloud services that constitute Amazon Web Services (AWS).
CDNs are supposed to improve response times by replicating resources across multiple servers around the world, and serving a requested resource from the server closest to the requesting client. The implicit assumption is that the root cause of latency is geographical distance (the greater the distance, the more routers sit in between), and so serving files from a server that is physically closer should reduce latency.
Since my site was already hosted on Amazon’s EC2, it made sense to try their CloudFront CDN, rather than some other vendor’s CDN. Though this was not a performance-critical page, it did provide the opportunity to experiment with CloudFront in a realistic scenario, and the knowledge gained may prove useful in the future. So I started experimenting…
Setup
I decided to use Amazon S3 as the origin server for Cloudfront (the origin server is the server from which Cloudfront picks up the resources to replicate). I opted for the "reduced redundancy storage" setting instead of "standard redundancy" for the S3 bucket to minimize costs, and also because these images are already available to me from my development machine and web server. Standard redundancy makes more sense for user content data or critical backups.
Evaluation criteria
Better response times would be great.
Even if there was no improvement in response times, a CDN would still reduce the load on my rather underpowered EC2 micro instance web server, and spare me some more connections for more dynamic content, like my SaaS products. So I was already somewhat biased towards using Cloudfront or some other file server before evaluating them.
But CloudFront, like other AWS services, is a metered service. So the evaluation also needed to keep costs in mind.
Performance measurements
For response time measurements, I decided to use different tools to get a complete picture:
- The first set of measurements was taken using browsers. All 3 major browsers – Chrome, Firefox and IE – provide excellent profiling tools for developers.
- However, browser measurements are not enough. The system should also be tested for scalability. What happens to response times when there are dozens of concurrent connections requesting the page? Can the page be rendered to all those users without much increase in response times? With a single web server on an underpowered machine, this is clearly not possible. But putting a CDN in the mix should shift at least some of the load from my puny, unscalable single web server to Amazon’s scalable mammoth delivery network. I used the Apache JMeter and Apache Bench (ab) tools to load the server.
Browser measurements
Methodology
Chrome’s developer tools network tab, Firefox Firebug network tab, and IE’s developer tools Network tab provide profiling information.
Chrome and Firefox (via Firebug and Firebug NetExport plugin) can export profiling data to JSON format files called .har files.
IE exports to XML files which have a similar schema to the JSON .har files but expressed in XML.
Each browser was tested 5 times with a complete cache cleanup in between. The cache cleanup ensured that all images were downloaded in each test. However, cache cleanup does not clear the browsers’ DNS caches, which means DNS lookup timings usually show up only in the first test.
A Python script was used to parse these files, calculate averages and produce the table of averages below.
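The averaging step could look roughly like the sketch below. It is not the original script: it assumes the standard HAR layout (`log.entries[]`, each with `request.url`, `time` and a `timings` object containing `wait` and `receive`), and the function names, the example URL and the two sample runs are made up for illustration.

```python
import json
from collections import defaultdict
from statistics import mean


def load_har(path):
    """Read one exported .har file (JSON) into a dict."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def average_timings(har_logs):
    """Average T (total), R (receive) and W (wait) in ms per file across runs."""
    samples = defaultdict(lambda: {"T": [], "R": [], "W": []})
    for har in har_logs:
        for entry in har["log"]["entries"]:
            # Key each resource by the last path segment of its URL.
            name = entry["request"]["url"].rsplit("/", 1)[-1]
            samples[name]["T"].append(entry["time"])
            samples[name]["R"].append(entry["timings"]["receive"])
            samples[name]["W"].append(entry["timings"]["wait"])
    return {
        name: {key: round(mean(values)) for key, values in metrics.items()}
        for name, metrics in samples.items()
    }


# Two made-up runs for a hypothetical URL, only to show the expected shape.
run1 = {"log": {"entries": [{
    "request": {"url": "http://example.com/img/homepage.png"},
    "time": 1000, "timings": {"wait": 300, "receive": 600}}]}}
run2 = {"log": {"entries": [{
    "request": {"url": "http://example.com/img/homepage.png"},
    "time": 2000, "timings": {"wait": 100, "receive": 1800}}]}}

averages = average_timings([run1, run2])
print(averages)
```

With real exports, the list passed to `average_timings` would come from calling `load_har` on each of the 5 files for a browser. IE's XML export would need its own parser.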
Results
Legend to the table:
1st column => the image file name
"OwnServer" => tests in which images were downloaded from my Apache web server running on EC2 and EBS
"Cloudfront" => tests in which images were downloaded from Cloudfront distribution with S3 as origin server
T => Total time for request and response (including time blocked, connect, send request, wait, receive response)
R => Total time for just receiving all the data
W => Time spent waiting before response started
All figures are in milliseconds
Image | Chrome OwnServer | Chrome Cloudfront | Mozilla OwnServer | Mozilla Cloudfront | IE OwnServer | IE Cloudfront |
corporatesearch.png 158KB | T:14897 R:14525 W:369 | T:7913 R:7721 W:141 | T:9924 R:9476 W:447 | T:13010 R:12792 W:177 | T:9038 R:8567 W:386 | T:11600 R:11406 W:153 |
jobsearch.png 353KB | T:15516 R:14795 W:359 | T:15982 R:15666 W:265 | T:19716 R:18328 W:367 | T:16061 R:15856 W:162 | T:19394 R:18629 W:393 | T:17225 R:16997 W:187 |
p-and-f-charting.png 59KB | T:5600 R:4869 W:365 | T:6520 R:6218 W:250 | T:7728 R:6341 W:363 | T:9085 R:8842 W:199 | T:5934 R:4683 W:399 | T:7098 R:6246 W:811 |
s-and-r-charting.png 40KB | T:4048 R:3313 W:366 | T:5301 R:5006 W:243 | T:6102 R:4347 W:363 | T:3788 R:3550 W:177 | T:5456 R:3700 W:973 | T:3151 R:2614 W:496 |
dialogs.png 114KB | T:15074 R:4492 W:315 | T:12620 R:5056 W:316 | T:14032 R:4737 W:314 | T:11809 R:4402 W:180 | T:11830 R:5288 W:299 | T:12483 R:4346 W:318 |
candidateshortlist.png 160KB | T:11319 R:10951 W:366 | T:7246 R:7082 W:121 | T:9932 R:9491 W:440 | T:10277 R:10108 W:127 | T:7694 R:7310 W:380 | T:9235 R:9044 W:147 |
mainscreen.png 166KB | T:16810 R:10758 W:309 | T:13498 R:8243 W:138 | T:18673 R:11210 W:323 | T:14714 R:7201 W:1516 | T:16115 R:11113 W:337 | T:13815 R:8579 W:252 |
technical-analysis-signals.png 112KB | T:11550 R:7339 W:307 | T:13209 R:8886 W:155 | T:12731 R:7460 W:351 | T:9232 R:6055 W:227 | T:11668 R:6421 W:336 | T:8031 R:5615 W:221 |
homepage.png 266KB | T:15834 R:15400 W:357 | T:14030 R:13797 W:181 | T:16209 R:14827 W:364 | T:12099 R:11927 W:134 | T:17235 R:16470 W:396 | T:15956 R:15609 W:305 |
Analysis of browser results
The metrics to pay attention to here are R (the average receive times) and W (the average wait times).
I didn’t pay much attention to T (the average total times) because I felt they are misleading. The problem is that browsers download embedded resources like <img>s using a small number of connections. When there are more resources than there are connections, the extra resources are blocked until some connections are freed. These blocked times manifest in the T values, but they are not deterministic and are also not similar across browsers since connection implementations differ. Hence, total times should be ignored in my opinion.
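One rough way to see this in the table is to subtract R and W from T. What is left is the time spent in the remaining phases (blocked, connect, send). The sketch below does that for three Chrome/OwnServer rows from the results table. Attributing most of the remainder to blocking is my assumption, since the table does not break it down further.

```python
# (T, R, W) in ms for three Chrome/OwnServer rows from the results table.
rows = {
    "corporatesearch.png": (14897, 14525, 369),
    "dialogs.png": (15074, 4492, 315),
    "mainscreen.png": (16810, 10758, 309),
}


def other_time(total, receive, wait):
    """Time not spent waiting for or receiving the response (blocked, connect, send)."""
    return total - receive - wait


remainders = {name: other_time(*timing) for name, timing in rows.items()}
for name, remainder in remainders.items():
    print(f"{name}: {remainder} ms outside wait/receive")
```

Files with similar receive times can end up with very different totals this way, which is why T is a poor basis for comparison.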
What can we observe from the R(eceive) and W(ait) times?
- Chrome: For 4 out of 9 images, R(eceive) times from Cloudfront are lower than from my own server. For the other 5 images, receive times from Cloudfront are higher. So it’s almost a tie. However, W(ait) times are lower for Cloudfront for every image except dialogs.png, where the two are within 1 ms of each other. So, Cloudfront leads.
- Firefox: For 6 out of 9 images, R(eceive) times from Cloudfront are lower. W(ait) times are also consistently lower, except in one case, which seems to be an anomaly. Cloudfront leads again.
- IE: For 6 out of 9 images, R(eceive) times from Cloudfront are lower. W(ait) times are also consistently lower, except in 2 cases, which seem to be anomalies. Cloudfront leads again.
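The per-browser tallies can be recomputed from the table. The sketch below copies the average R values as (OwnServer, Cloudfront) pairs, in the same row order as the table, and counts the images for which Cloudfront was faster. The names are mine, and the "Mozilla" columns are labelled Firefox here.

```python
# Average receive times R (ms) from the results table, as (OwnServer, Cloudfront).
receive = {
    "Chrome": [(14525, 7721), (14795, 15666), (4869, 6218), (3313, 5006),
               (4492, 5056), (10951, 7082), (10758, 8243), (7339, 8886),
               (15400, 13797)],
    "Firefox": [(9476, 12792), (18328, 15856), (6341, 8842), (4347, 3550),
                (4737, 4402), (9491, 10108), (11210, 7201), (7460, 6055),
                (14827, 11927)],
    "IE": [(8567, 11406), (18629, 16997), (4683, 6246), (3700, 2614),
           (5288, 4346), (7310, 9044), (11113, 8579), (6421, 5615),
           (16470, 15609)],
}


def cloudfront_wins(pairs):
    """Number of images whose Cloudfront receive time is lower than OwnServer's."""
    return sum(1 for own, cdn in pairs if cdn < own)


wins = {browser: cloudfront_wins(pairs) for browser, pairs in receive.items()}
for browser, count in wins.items():
    print(f"{browser}: Cloudfront faster for {count} of {len(receive[browser])} images")
```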
Browser Conclusions
Cloudfront does make the site faster…but not as consistently or drastically as expected, at least in my tests (I’m in India and my nearest edge locations seem to be Singapore or Hong Kong).
One possible factor is that resources may need lots of hits before Cloudfront caches and serves them effectively. I’m not sure about this, but the Cloudfront documentation does seem to hint that more popular resources benefit more.
Load measurements using apache bench (ab)
Methodology
ab is incapable of downloading a web page and all its embedded resources. So I ran ab requests on just one of the image files – the biggest one at 350KB.
I set different values for -n and -c options. -k was enabled to simulate browser behaviour by keeping connections alive.
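For reference, the runs can be described as a list of ab command lines. The sketch below only builds the argument lists. The image URL is a placeholder, not the real one, and the (-n, -c) pairs are taken from the load levels reported in the results.

```python
def build_ab_command(url, total_requests, concurrency, keep_alive=True):
    """Return the argv list for one ab run (-n requests, -c concurrency, -k keep-alive)."""
    command = ["ab", "-n", str(total_requests), "-c", str(concurrency)]
    if keep_alive:
        command.append("-k")
    command.append(url)
    return command


# (-n, -c) pairs matching the load levels in the results.
LOAD_LEVELS = [(50, 1), (50, 5), (50, 10), (50, 25), (80, 40), (100, 50)]
IMAGE_URL = "http://example.com/images/jobsearch.png"  # hypothetical URL

commands = [build_ab_command(IMAGE_URL, n, c) for n, c in LOAD_LEVELS]
for command in commands:
    print(" ".join(command))
```

Each list can be passed to `subprocess.run` on a machine where ab is installed.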
Results
Metric | Own server | Cloudfront |
50 total requests, 1 user | | |
Total time | 292.69 s | 214 s |
Mean time / request | 5.85 s | 4.28 s |
Max time taken by 90% of requests | 8.4 s | 4.9 s |
Max time taken by 50% of requests | 5.35 s | 4.26 s |
Data transferred (bytes) | 18051951 | 18070081 |
50 total requests, 5 concurrent users | | |
Total time | 211.97 s | 215.24 s |
Mean time / request | 4.24 s | 4.30 s |
Max time taken by 90% of requests | 31.4 s | 35.3 s |
Max time taken by 50% of requests | 19.3 s | 18.7 s |
Data transferred (bytes) | 18737815 | 18826276 |
50 total requests, 10 concurrent users | | |
Total time | 217.34 s | 218.7 s |
Mean time / request | 4.35 s | 4.37 s |
Max time taken by 90% of requests | 58.8 s | 67.4 s |
Max time taken by 50% of requests | 40 s | 27.5 s |
Data transferred (bytes) | 19460082 | 19705381 |
50 total requests, 25 concurrent users | | |
Total time | 227.57 s | 239.53 s |
Mean time / request | 4.55 s | 4.79 s |
Max time taken by 90% of requests | 142.6 s | 130.6 s |
Max time taken by 50% of requests | 61.9 s | 46.7 s |
Data transferred (bytes) | 20218419 | 21540782 |
80 total requests, 40 concurrent users | | |
Total time | 337.93 s | 412.69 s |
Mean time / request | 4.22 s | 5.16 s |
Max time taken by 90% of requests | 235.3 s | 239.3 s |
Max time taken by 50% of requests | 82.4 s | 84 s |
Data transferred (bytes) | 28945837 | 34307057 |
100 total requests, 50 concurrent users | | |
Total time | 477.91 s | 477.13 s |
Mean time / request | 4.78 s | 4.77 s |
Max time taken by 90% of requests | 260.1 s | 201.7 s |
Max time taken by 50% of requests | 137.1 s | 53.3 s |
Data transferred (bytes) | 34238460 | 43105119 |
Analysis of ab results
The results are so all over the place that I found it difficult to draw any conclusion!
The 50th percentile results in some tests clearly favour Cloudfront, but not consistently.
I also found it hard to understand some of the raw values (not shown here). For example, in the last test with 100 requests across 50 concurrent users, total time was 477.1s but the longest request was 454s! How that can be is beyond me. I’m guessing that a request sent fairly early never got a response. It’s possible that this was because the load was too much for my puny 512 kbps bandwidth.
Another thing to notice is that the data volume with Cloudfront is noticeably higher at higher loads: about 19% more with 40 concurrent users and about 26% more with 50. I’m guessing that this is because of TCP retransmissions, though why it appears only when communicating with Cloudfront is not clear.
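The extra data volume can be worked out directly from the "Data transferred" rows of the results table. A small sketch, with labels and function name of my own choosing:

```python
# Bytes transferred per ab test, from the results table: (own server, Cloudfront).
transferred = {
    "50 requests, 1 user": (18051951, 18070081),
    "50 requests, 5 users": (18737815, 18826276),
    "50 requests, 10 users": (19460082, 19705381),
    "50 requests, 25 users": (20218419, 21540782),
    "80 requests, 40 users": (28945837, 34307057),
    "100 requests, 50 users": (34238460, 43105119),
}


def extra_percent(own, cdn):
    """How much more data (in %) was transferred via Cloudfront than via own server."""
    return (cdn - own) / own * 100


extra = {test: extra_percent(own, cdn) for test, (own, cdn) in transferred.items()}
for test, pct in extra.items():
    print(f"{test}: {pct:.1f}% more data via Cloudfront")
```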
Conclusion
I’m reluctant to draw any concrete conclusion from the ab results, except that the median request (the 50% figure) was faster with Cloudfront in 5 of the 6 tests.
Load measurements using Apache JMeter
Methodology
JMeter was used to test the following loads:
- 50 total requests with 1 user. Retrieve embedded resources using a pool of 9 threads (9 because the page had 9 images)
- 50 total requests across 5 concurrent users. Retrieve embedded resources using pools of 5 threads (only 5 because JMeter creates a separate pool for each virtual user, which means 5 users x 5 threads = 25 threads would be created. I was afraid that larger pool sizes might make bandwidth contention a factor in the timings)
Results
Test | Ownserver | Cloudfront | Notes |
50 total requests, | | | |
avg | 22.8 s | 20.01 s | |
90% of requests | 24.4 s | 25.7 s | |
50 total requests, | | | |
avg | 75 s | 48 s | Actually overall 77s, but only 48s if 7 anomalous measurements were removed. |
90% of requests | 110.5 s | 60 s | Actually 232.6s |
Analysis of results
When simulating a single user, using Cloudfront didn’t show any major improvement in speed.
But when simulating 5 concurrent users with 5 resource downloading threads per user, I saw interesting results. 7 results timed out with extremely high times like 270 seconds. These I put down as anomalies, possibly because I was overloading my bandwidth.
Without those anomalies included, the average time per request was just 48 seconds when using cloudfront, compared to 75 seconds when not. Also, 80% of the remaining timings completed within 60 seconds when using cloudfront, compared to 110.5 seconds when not.
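The effect of dropping timed-out samples on an average can be illustrated with a small sketch. The timings below are made up purely for illustration and are NOT the measurements from this test. The 200-second cutoff is likewise an assumed threshold, not one taken from the JMeter runs.

```python
from statistics import mean


def summarize(times, cutoff):
    """Compare the mean of all samples with the mean of samples at or below cutoff."""
    kept = [t for t in times if t <= cutoff]
    return {
        "mean_all": mean(times),
        "mean_kept": mean(kept),
        "dropped": len(times) - len(kept),
    }


# Made-up timings in seconds, NOT the real data: five normal requests, two timeouts.
sample = [40, 45, 50, 55, 60, 270, 270]
summary = summarize(sample, cutoff=200)
print(summary)
```

A handful of timeouts is enough to pull the overall mean far above the typical request time, which is why the two averages are reported separately.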
Conclusion
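So load testing with JMeter suggests that Cloudfront is better at higher loads, provided the 7 anomalous timings are excluded. With them included, its overall average (77s) was slightly worse than my own server’s (75s).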
Measurements using www.webpagetest.org
Methodology
www.webpagetest.org provides automated testing for websites, from client locations around the world.
5 tests were conducted for each location and each method of serving images.
Results
Its results come out as follows:
Location | Served from own server | Cloudfront |
New York | 8.772 s | 8.911 s |
London | 8.791 s | 8.703 s |
Conclusion
Doesn’t look like Cloudfront has improved page speeds.
Cost analysis
If the choice is between storing content on an EC2 EBS drive and serving it from EC2 web server, vs. storing it in S3 and serving it via Cloudfront, the following cost components are relevant (as of Dec 2011):
Assume ‘B’ GBs is the size of content (for simplicity, I’ll assume just 1 file of ‘B’ GBs) being stored.
Assume 1 user requests this file every second of every day, which comes to 86,400 requests/day or 2,592,000 requests over a 30-day month.
via EBS and EC2 | via S3 and Cloudfront |
EBS storage per GB = $0.10B | S3 reduced redundancy storage = $0.093B (ignoring S3 IO request costs by assuming this file will be stored just once, and then always served via Cloudfront) |
EBS cost per 1 million IO requests = $0.10 x 2.592 = $0.2592B | Cloudfront data transfer = $0.19B. But as we have seen with the ab tests, at higher loads more data is transferred due to TCP retransmissions. Assuming 20% extra data is transferred, this comes to $0.228B |
Data transfer through elastic IP = $0.01B | Cloudfront cost per 10000 HTTP requests = $0.009 x 2592000/10000 = $2.3328 |
Total: $0.3692B. If that file is 1 GB in size, this comes to $0.37 | Total: $2.3328 + $0.321B. If that file is 1 GB in size, this comes to $2.65 |
So cost-wise too, Cloudfront comes out costlier than serving off EBS and EC2. It’s really only its HTTP request costs that tilt the choice away from Cloudfront.
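The arithmetic in the table can be reproduced with a short sketch. It mirrors the table's formulas exactly as written, including treating the EBS IO term as proportional to B, and uses the Dec 2011 prices quoted there. The function and constant names are mine.

```python
REQUESTS_PER_MONTH = 86400 * 30  # one request per second over a 30-day month


def ebs_ec2_cost(size_gb):
    """Monthly cost in USD of serving from EBS + EC2, per the table's formulas."""
    storage = 0.10 * size_gb
    io_requests = 0.10 * (REQUESTS_PER_MONTH / 1_000_000) * size_gb
    transfer = 0.01 * size_gb
    return storage + io_requests + transfer


def s3_cloudfront_cost(size_gb, extra_transfer=0.20):
    """Monthly cost in USD of S3 + Cloudfront, assuming 20% extra data transfer."""
    storage = 0.093 * size_gb
    transfer = 0.19 * size_gb * (1 + extra_transfer)
    http_requests = 0.009 * (REQUESTS_PER_MONTH / 10_000)
    return storage + transfer + http_requests


ebs_total = ebs_ec2_cost(1)
cloudfront_total = s3_cloudfront_cost(1)
print(f"EBS + EC2:       ${ebs_total:.2f}")
print(f"S3 + Cloudfront: ${cloudfront_total:.2f}")
```

Because the HTTP request charge does not depend on B in this model, it dominates the Cloudfront total for small files.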
Final conclusion
In my case, my website is not a high-traffic site. I also didn’t observe any drastic improvement in page speeds, except possibly at high loads (shown by the JMeter results). And cost-wise, it’s indeed cheaper to stick with EBS and EC2.
So, should I use Cloudfront or not? I think it’s not needed for my site at the moment.