Wednesday, October 30, 2019

A comparision of AWS S3 Glacier Deep Archive region pricing

I'm considering using S3 for personal backups.  They recently introduced a new tier of storage called "S3 Glacier Deep Archive" which is intended for storing files that you will likely never, or perhaps once need to read.  Every geographic region AWS offers storage in has its own pricing.  I couldn't find a nice table with all the prices compared so I found the price to store 1 TB for 1 year in each region:

Using their tool:

If you're considering this keep in mind there are some important caveats.  First you pay for each request, which means if you're storing 1,000,000 files you will pay $50 just for the requests.  Doesn't matter if each file is 1 MB, or 1 KB, or even 1 byte each, it's $0.50 per 1000 PUT requests.  You will then also pay storage fees every month on top of that.  As far as I can tell, you don't pay for the bandwidth to upload the files.

Retrieving the files has more caveats.  First you need to pick a speed, standard or bulk.  Standard takes up to 12 hours, and bulk is up to 48 hours.  Standard also costs about 10x as much as bulk.  And here you pay for the individual requests, the data retrieved, and (I believe) bandwidth to download from S3.

So if you're storing many smallish files (documents) you're probably much better off combing them all into a single zip file, to reduce the number of requests you have to do.  On the other hand if you're storing large files (videos), you'd probably be better off leaving them on their own so that ideally you just need to recover one or two, and then don't have to pay for the bandwidth to download them all.

I made this table to compare some scenarios.  The first 3 rows shows the costs to retrieve 1 TB split across either 1, 1024, or1048576 file.  The less file scenarios are cheaper, but not by a ton, and keep in mind if you only needed a few of those files it'd be much cheaper to just grab those individual files if they weren't zipped together.

The bottom 2 rows shows the cost to get 1 GB of files, either as 1 file or 1024 files.  Here the cost is negligible, pretty much however you store and access it.

So it seems in any case the bandwidth is the biggest cost.  Still, since you generally only pay for bandwidth out of S3 and not in to it, you should never really have to pay this, unless you're recovering from a pretty major disaster.  There is also the option to use AWS Snowball, where they will mail you a physical drive which you keep for up to 10 days then mail back.  That works out to be $200 + $0.03 per GB vs just $0.09 per GB for bandwidth.  So you need to be transferring 10s of TBs before it makes sense.