
Disk drive costs have fallen so far that archiving data to tape is no longer necessary. Instead, you can move such data offline to the lowest-cost disk available or move it to the cloud.
Amazon Glacier is the leader in this area. It offers long-term data archival for as little as $0.01 USD per GB per month. To put that into perspective, a sample Dell data-center-quality 10 TB storage device costs $10,000 USD. Since 10 TB = 10,000 GB, Amazon Glacier would charge you $100 per month to store the same amount of data.
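To make that comparison concrete, here is a small sketch of the arithmetic in Python. The function names are made up for illustration, and the figures are the ones quoted above. It deliberately ignores power, administration, hardware replacement, and any retrieval or request fees, so treat the result as a rough guide rather than a full cost model.

```python
GB_PER_TB = 1_000  # the article's rounding: 10 TB = 10,000 GB


def monthly_cost_usd(terabytes, cents_per_gb_month=1):
    """Monthly storage bill in USD at a flat per-GB rate (1 cent = $0.01)."""
    return terabytes * GB_PER_TB * cents_per_gb_month / 100


def break_even_months(device_price_usd, terabytes, cents_per_gb_month=1):
    """Months of cloud storage you could buy for the price of the device."""
    return device_price_usd / monthly_cost_usd(terabytes, cents_per_gb_month)


print(monthly_cost_usd(10))           # 100.0
print(break_even_months(10_000, 10))  # 100.0
```

At these prices, the cloud bill catches up with the purchase price of the device after 100 months, a little over eight years.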
The word “glacier” in Amazon Glacier refers to the process by which Amazon “thaws out” your data. Since the data is archived, it is not online. So you submit a retrieval request, and 5 to 6 hours later Amazon puts it online.
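In API terms, the thaw-out step is a retrieval job: you start the job, wait for it to complete, then download the output. The sketch below assumes the `boto3` Glacier client (`boto3.client("glacier")`) and its `initiate_job`, `describe_job`, and `get_job_output` calls. The function name, the 15-minute polling interval, and the injectable `sleep` are illustrative choices, not anything Amazon prescribes.

```python
import time


def retrieve_archive(client, vault_name, archive_id, dest_path,
                     poll_seconds=900, sleep=time.sleep):
    """Request an archive, wait until it is ready, and save it to dest_path."""
    # Ask Glacier to start thawing the archive.
    job = client.initiate_job(
        vaultName=vault_name,
        jobParameters={"Type": "archive-retrieval", "ArchiveId": archive_id},
    )
    job_id = job["jobId"]

    # Poll until the job is finished; this is the multi-hour wait.
    while not client.describe_job(vaultName=vault_name, jobId=job_id)["Completed"]:
        sleep(poll_seconds)

    # Download the thawed data. read() pulls the whole archive into memory,
    # which is fine for a sketch but not for very large archives.
    output = client.get_job_output(vaultName=vault_name, jobId=job_id)
    with open(dest_path, "wb") as dest:
        dest.write(output["body"].read())
    return job_id
```

Polling is the simplest approach to show here. A long-running script like this would need error handling and a timeout before being used for real.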
You cannot simply upload archives to Amazon Glacier. You can create vaults using the console, but to write data there you have to use the REST API and write a program. This is how Amazon's S3 object storage works too, but one wonders why you would need to write a program just to load an archive there. However, such a program could be a few simple lines of code, modified for the purposes at hand or written to take the file to upload as a command-line argument.
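As a rough idea of how small such a program can be, here is a sketch in Python. It assumes the third-party `boto3` SDK is installed, AWS credentials are already configured, and the vault already exists. The names `upload_file` and `main` are illustrative. The `upload_archive` call sends the file in a single request, which only suits archives up to a few gigabytes; larger ones need Glacier's multipart upload calls.

```python
import argparse


def upload_file(client, vault_name, path, description=None):
    """Upload one file as a single Glacier archive and return its archive ID."""
    with open(path, "rb") as body:
        response = client.upload_archive(
            vaultName=vault_name,
            archiveDescription=description or path,
            body=body,
        )
    # Keep this ID somewhere safe: it is needed to retrieve the archive later.
    return response["archiveId"]


def main(argv=None):
    """Parse the vault and file names from the command line, then upload."""
    parser = argparse.ArgumentParser(description="Upload a file to a Glacier vault")
    parser.add_argument("vault", help="name of an existing vault")
    parser.add_argument("file", help="path of the file to archive")
    args = parser.parse_args(argv)

    import boto3  # imported here so upload_file can be tried without the SDK

    client = boto3.client("glacier")
    print(upload_file(client, args.vault, args.file))
```

To use it as a script, add the usual `if __name__ == "__main__": main()` guard and run something like `python upload.py my-vault backup.tar`, where the script, vault, and file names are placeholders.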
Taking a shot at Dropbox and Amazon, Google Drive has lowered its storage costs to $100 per month for 10 TB. However, company auditors, bank regulators, and others might be less comfortable storing important, albeit old, data in something whose image would not at first glance seem as secure as an enterprise offering like Amazon. “Secure” in this case means there is some kind of replication and support. Certainly Google uses replication. They say they offer telephone support, which is surprising for those of us used to Google Drive for personal use. For personal users, Google does not answer the phone.
What kind of data do you need to archive? SOX, SEC and stock exchange rules, litigation, and forensics requirements are all reasons why you need to store data, email, and documents for many years. Performance is another reason. For example, a security system would run slower as more data is added, so to speed it up, you would need to archive certain tablespaces or files. That is true for accounting and other ERP systems as well. No one is going to routinely check forensics data that is 7 years old, but bank regulators and certainly lawyers in a lawsuit are going to want to dig that far into the past. Financial applications too would need transaction-level detail from many years in the past, as stipulated by regulators.
Panzura is another company that offers archival in the cloud. It offers a CIFS and NFS interface, meaning you could mount their drive on your existing storage device. They say their Active Archive cloud solution offers online access to your data. That would mean it is not an archive at all, but who would complain about that?
Most computing is moving to the cloud because of low costs, so it is only natural to look to the cloud to archive data as well.