Online video content using Amazon S3 for TAXtv

Reynold Greenlaw

The conventional approach to making videos or other data available online is to buy a server and host it at an ISP. This incurs up-front capital costs for the server, and requires developing systems to handle backups, redundant copies, and scaling up as your needs increase. An alternative approach is to use a storage web service, such as Amazon's Simple Storage Service, or Amazon S3 for short. Your files are stored on the same highly scalable, reliable, fast, inexpensive data storage infrastructure that Amazon uses to run its own global network of web sites. Amazon charges only for the storage you use (starting at 15¢ per gigabyte-month), so the costs start low and ramp up as you gain traffic. This frees up development time that you can spend on the more interesting aspects of your application.

TAXtv (http://taxtv.co.uk/) is a monthly video of tax news and information, viewable by subscription only. The editing and production of the video files are contracted out to Video Inn Productions. OCC's task was to provide a convenient way for Video Inn to upload the files to the web site and for the editors of the site to obtain a unique URL for the episode to include in the monthly mail-out. To do this we used Amazon S3.

The Trouble with Videos

If a picture is worth a thousand words, a video is worth a million: less poetically, an episode of TAXtv weighs in at about 200 MB. You generally want to offer users a choice of sizes, so we could reckon on an episode occupying 500 MB of storage across all its versions. In the case of TAXtv, we can delete old episodes; a site with videos that do not expire would need to budget for a lot more storage.

There are two other limits we may bump into: monthly download caps, and maximum data rate. Assuming subscribers watch the video once each, we can multiply the number of subscribers by 200 MB to get the monthly download requirement. Someone watching in real time needs to fetch the video data at around 1 Mb/s to avoid pauses for buffering. With a monthly subscription, the majority of views can be expected to fall in the two days after an episode is released; worse than that, we can expect them to fall mostly in office hours. So we can estimate the necessary data rate by dividing the monthly bandwidth by 16 hours (two days of eight office hours each).

Subscriber count           100   1,000   10,000   100,000   1,000,000
Monthly bandwidth (GB)      20     200    2,000    20,000     200,000
Data rate (Mb/s)             3      28      280     2,800      28,000

These are just rough figures, but they show that 1,000 subscribers would be pushing the limits of a shared host, 10,000 subscribers exceed the capacity of a dedicated server with 100 Mb/s Ethernet, and 100,000 exceed Gigabit Ethernet: that needs a server farm, not just one server.
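The arithmetic behind these estimates can be sketched in a few lines of Python. This is only an illustration of the assumptions stated in the text (one 200 MB view per subscriber per month, all views falling within 16 peak hours, decimal units); the function names are made up for the example.

```python
# Rough capacity estimate under the assumptions in the text:
# one 200 MB view per subscriber per month, all within 16 peak hours.
EPISODE_MB = 200
PEAK_HOURS = 16


def monthly_bandwidth_gb(subscribers):
    """Monthly download volume in GB (decimal units: 1 GB = 1000 MB)."""
    return subscribers * EPISODE_MB / 1000


def peak_data_rate_mbps(subscribers):
    """Average rate in Mb/s if every view falls inside the peak hours."""
    megabits = subscribers * EPISODE_MB * 8  # 8 bits per byte
    return megabits / (PEAK_HOURS * 3600)    # 3600 seconds per hour


for n in (100, 1_000, 10_000, 100_000, 1_000_000):
    print(f"{n:>9,} subscribers: "
          f"{monthly_bandwidth_gb(n):>9,.0f} GB/month, "
          f"{peak_data_rate_mbps(n):>8,.0f} Mb/s")
```

The computed rates come out slightly below the rounded table values for the larger subscriber counts (for example about 278 Mb/s rather than 280), which is within the precision of an estimate like this.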

Obviously you don't expect 100,000 subscribers on day one. The headaches start because you need to worry about future requirements while deploying even the small-scale version. This entails buying server hardware before you have the subscriber numbers to pay for it. You also need to plan ahead for when you need to scale up to a server farm rather than a single server.

Amazon S3 and Amazon CloudFront

Obviously Amazon have had to solve this problem for their own massive datacentres, and they are willing to sell their solution as a service, Amazon S3. The idea is to pay just for the capacity you use, without needing to know the details of hardware and configuration.

S3 objects can be downloaded via normal URLs, or for greater performance we can use Amazon's content-delivery network, Amazon CloudFront. This gives each person downloading the file the lowest-latency, highest-bandwidth connection to the data, using Amazon's edge servers. CloudFront also supports streaming, using the protocol understood by Flash video players.

We developed a small web app that presents a form for uploading video files directly to Amazon S3. This is great because it saves us the trouble of handling the uploaded files and then copying them to S3 ourselves. The form includes a digital signature which means we can allow Video Inn to upload videos without giving them the keys to the S3 account. Our app stores the URLs of the uploaded files against the episode number, and then produces a URL for inclusion in the monthly mail-out.
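A signed upload form of this kind can be sketched as follows. This is not the TAXtv app itself, only a minimal illustration of the idea: the server writes a policy document describing what may be uploaded, signs it with the account's secret key, and embeds the policy and signature in the form, so the uploader never sees the key. The sketch uses the HMAC-SHA1 signing that S3 browser uploads used under Signature Version 2; a newer set-up would normally use Signature Version 4 instead. The bucket name, key prefix, expiry date, size limit and secret key are all placeholders invented for the example.

```python
import base64
import hashlib
import hmac
import json


def sign_upload_policy(secret_key, bucket, key_prefix, expires_iso, max_bytes):
    """Return (policy_b64, signature_b64) for an S3 browser upload form.

    Legacy Signature Version 2 style: the signature is the base64-encoded
    HMAC-SHA1 of the base64-encoded policy document.
    """
    policy = {
        "expiration": expires_iso,
        "conditions": [
            {"bucket": bucket},                      # only this bucket
            ["starts-with", "$key", key_prefix],     # only under this prefix
            ["content-length-range", 0, max_bytes],  # cap the upload size
        ],
    }
    policy_b64 = base64.b64encode(json.dumps(policy).encode("utf-8"))
    digest = hmac.new(secret_key.encode("utf-8"), policy_b64,
                      hashlib.sha1).digest()
    return policy_b64.decode("ascii"), base64.b64encode(digest).decode("ascii")


# All values below are hypothetical placeholders.
policy_b64, signature_b64 = sign_upload_policy(
    secret_key="EXAMPLE-SECRET-KEY",
    bucket="example-video-bucket",
    key_prefix="episodes/",
    expires_iso="2030-01-01T00:00:00Z",
    max_bytes=1024 ** 3,
)
print("policy:", policy_b64)
print("signature:", signature_b64)
```

The two values would go into hidden `policy` and `signature` fields of the HTML form, alongside `AWSAccessKeyId`, `key` and the file input. Because only the signature leaves the server, the upload permission can be limited by prefix, size and expiry without sharing the account credentials.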

The episode page uses the same embedded video player as before, Flowplayer 3.1. The difference is that we activate its streaming plug-in, and supply URLs referring to our distribution on Amazon CloudFront instead of on our own server.

Bottom Line

Uploading and downloading the videos during development cost us about 30¢, which is a lot easier to pay up-front than £5000 for a dedicated server and hosting. Not only that, but the development time was reduced by eliminating the need for us to plan and deploy a scalable solution, or even to purchase and deploy the streaming server. The existing shared hosting will suffice for the episode page and the editing app.

As the number of subscribers increases we would expect the Amazon CloudFront cost to increase accordingly:

Subscriber count           100   1,000   10,000   100,000   1,000,000
Monthly bandwidth (GB)      20     200    2,000    20,000     200,000
Monthly CloudFront cost     $3     $30     $300    $2,500     $17,000

These figures are approximate and ignore a few dollars spent on S3; the important thing is that you don't have to pay for 1,000 subscribers' usage until you have 1,000 subscribers.
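To make the volume discount visible, the effective price per gigabyte implied by the table can be worked out with a short Python sketch. It uses only the approximate figures from the table, not Amazon's actual price list, and the names are illustrative.

```python
# Approximate figures copied from the table above:
# monthly bandwidth in GB -> monthly CloudFront cost in US dollars.
monthly_cost_usd = {
    20: 3,
    200: 30,
    2_000: 300,
    20_000: 2_500,
    200_000: 17_000,
}


def effective_rate(bandwidth_gb):
    """Average cost per GB implied by the table, in US dollars."""
    return monthly_cost_usd[bandwidth_gb] / bandwidth_gb


for gb in monthly_cost_usd:
    print(f"{gb:>8,} GB/month: about ${effective_rate(gb):.3f} per GB")
```

On these figures the cost per gigabyte stays flat at the smaller volumes and falls as traffic grows, so the bill rises more slowly than the subscriber count.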