Google expanded on an idea first proposed at Carnegie Mellon University, which we now call object storage. EMC and other companies have made various implementations of this, but what they have in common is REST web services to write and access data. REST web services work much like ordinary web pages, since both run over HTTP, so they are conceptually easy to understand.
Object storage replaces the traditional hierarchical directory structure by giving each file a unique identifier instead of a directory location. So you do not refer to a file by a path such as \\mountpoint\a\b\c\d\file.suffix. Instead you refer to it by its name or object ID and, in the case of S3, its bucket.
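To make the contrast with a directory tree concrete, here is a minimal sketch of a flat object namespace in Python. It is a toy in-memory model, not how S3 is implemented; the ObjectStore class and its method names are made up for illustration.

```python
# Toy in-memory model of an object store: one flat mapping, no directory tree.
# ObjectStore and its methods are hypothetical names used only for illustration.
import uuid


class ObjectStore:
    def __init__(self):
        # A single flat dictionary keyed by (bucket, key); no nested folders.
        self._objects = {}

    def put(self, bucket, key, data):
        """Store data under (bucket, key) and return a generated object ID."""
        object_id = uuid.uuid4().hex
        self._objects[(bucket, key)] = (object_id, data)
        return object_id

    def get(self, bucket, key):
        """Return the stored bytes; raises KeyError if the object is missing."""
        return self._objects[(bucket, key)][1]

    def delete(self, bucket, key):
        """Remove the object if it exists."""
        self._objects.pop((bucket, key), None)


store = ObjectStore()
# The slashes are just characters in the key, not directories to walk through.
store.put("mybucket", "a/b/c/d/file.suffix", b"hello")
print(store.get("mybucket", "a/b/c/d/file.suffix"))
```

The point of the sketch is that a lookup is a single step on the whole name, rather than a walk down a chain of folders.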
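You use REST web services to create, read, and delete objects in object storage. (There is no update function; to change an object, you write a whole new copy over it.) As mentioned above, they work like ordinary web pages. To read a web page, the browser sends the HTTP command GET. To send data to a web site, such as a form in which you typed your account number, it typically sends POST; PUT is the closely related command for storing data at a given address.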
The same is true for REST. In REST, you use the command DELETE to delete a file, PUT to write one, and GET to read one. REST, like web pages, uses the HTTP protocol. That means it uses TCP port 80, or port 443 when the connection is encrypted with HTTPS, so there is usually no need for special port forwarding or firewall rules: those ports are already open wherever ordinary web traffic is allowed.
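The three commands can be tried without an Amazon account. The sketch below starts a throwaway web server on your own machine that keeps objects in a dictionary, then sends it PUT, GET, and DELETE requests using only the Python standard library. It illustrates the HTTP verbs only; the ObjectHandler class and the request helper are made-up names, and nothing here talks to Amazon S3 or performs any authentication.

```python
# A local stand-in for an object store, plus a client that uses PUT/GET/DELETE.
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

objects = {}  # maps request path -> stored bytes


class ObjectHandler(BaseHTTPRequestHandler):
    def _reply(self, status, body=b""):
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_PUT(self):
        # The object's contents arrive in the request body.
        length = int(self.headers.get("Content-Length", 0))
        objects[self.path] = self.rfile.read(length)
        self._reply(200)

    def do_GET(self):
        if self.path in objects:
            self._reply(200, objects[self.path])
        else:
            self._reply(404)

    def do_DELETE(self):
        objects.pop(self.path, None)
        self._reply(204)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def request(port, method, path, body=None):
    """Send one HTTP request to the local server; return (status, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request(method, path, body=body)
    response = conn.getresponse()
    data = response.read()
    conn.close()
    return response.status, data


server = HTTPServer(("127.0.0.1", 0), ObjectHandler)  # port 0 = any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

put_result = request(port, "PUT", "/puppy.jpg", b"fake image bytes")
get_result = request(port, "GET", "/puppy.jpg")
delete_result = request(port, "DELETE", "/puppy.jpg")
missing_result = request(port, "GET", "/puppy.jpg")  # object is gone now

server.shutdown()
server.server_close()
print(put_result, get_result, delete_result, missing_result)
```

Note that there is no separate update command in the sketch either: a second PUT to the same path simply replaces what was stored there.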
The best way to see how the Amazon S3 REST API works is to look at some examples.
The ones below are taken from Amazon's documentation, slightly modified for simplicity.
This first example, shown below, deletes a file named puppy.jpg from the mybucket bucket. You can create a bucket manually using the AWS Management Console or with code. A bucket is like a top-level folder: a named container that holds objects. The Authorization header identifies you to Amazon S3; it carries your access key ID followed by a signature computed from the request and your secret key. The default domain name is s3.amazonaws.com, but you can customize that.
DELETE /puppy.jpg HTTP/1.1
Host: mybucket.s3.amazonaws.com
Authorization: AWS AKIAIOSFODNN7EXAMPLE:k3nL7gH3+PadhTEVn5EXAMPLE
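The value after "AWS" in the Authorization header is not a password. In the older signing scheme this header format belongs to (AWS Signature Version 2, as far as I understand it), it is the access key ID, a colon, and a Base64-encoded HMAC-SHA1 of a few request fields, keyed with your secret key. The sketch below shows the idea under those assumptions; the sign_v2 helper, the secret key, and the date are placeholders, and current AWS SDKs use the newer Signature Version 4 and do the signing for you.

```python
# Sketch of how a legacy "AWS access_key:signature" header could be computed.
# sign_v2, the secret key, and the date are placeholders for illustration.
import base64
import hashlib
import hmac


def sign_v2(access_key, secret_key, method, resource, date,
            content_md5="", content_type=""):
    # String to sign: verb, MD5, content type, date, then the resource path,
    # separated by newlines (this simple case has no x-amz-* headers).
    string_to_sign = "\n".join(
        [method, content_md5, content_type, date, resource]
    )
    digest = hmac.new(
        secret_key.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha1,
    ).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return "AWS {}:{}".format(access_key, signature)


header = sign_v2(
    access_key="AKIAIOSFODNN7EXAMPLE",
    secret_key="not-a-real-secret-key",      # placeholder, never a real key
    method="DELETE",
    resource="/mybucket/puppy.jpg",          # bucket name plus object name
    date="Tue, 27 Mar 2007 21:20:26 +0000",  # must match the request's date
)
print("Authorization:", header)
```

Because the signature depends on the verb, the resource, and the date, a header captured from one request cannot simply be reused for a different one.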
Here is how to read the same file. Only the request line is shown; the headers are the same kind as in the DELETE example.
GET /puppy.jpg HTTP/1.1
Here is how to write the file. Again only the request line is shown; the file's contents are sent as the body of the request.
PUT /puppy.jpg HTTP/1.1
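For errors, and for operations such as listing a bucket, the response sent back from Amazon is an XML message; a successful GET returns the object itself. Amazon S3 has also supported the older, more complicated way of working with its object storage, which is to send XML documents in both directions. That is called SOAP web services. REST and SOAP are both kinds of web services, which fits the name Amazon gives its collective services: “Amazon Web Services (AWS).”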
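When S3 does answer with XML, for instance to report an error, the message is small enough to read with Python's standard library. The sketch below parses a sample error document. The element names follow the error layout as I recall it from Amazon's documentation, the values are invented, and parse_error is a made-up helper name.

```python
# Parse a sample S3-style XML error message with the standard library.
import xml.etree.ElementTree as ET

# Sample response body; the values are invented for illustration.
sample_error = b"""<?xml version="1.0" encoding="UTF-8"?>
<Error>
  <Code>NoSuchKey</Code>
  <Message>The specified key does not exist.</Message>
  <Resource>/mybucket/puppy.jpg</Resource>
  <RequestId>EXAMPLEREQUESTID</RequestId>
</Error>"""


def parse_error(xml_bytes):
    """Turn the child elements of the root into a plain dictionary."""
    root = ET.fromstring(xml_bytes)
    return {child.tag: child.text for child in root}


error = parse_error(sample_error)
print(error["Code"], "-", error["Message"])
```

In practice the SDKs do this parsing for you and raise an exception that carries the same fields.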
Amazon provides SDKs for its REST API for Java, Python, and other programming languages. Here is a Python example that uses the boto library (its successor, boto3, is the SDK Amazon currently maintains). As you can see, it is very simple, which is one reason many programmers prefer Python to Java.
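The code has six lines, one for each of the steps discussed below.
import boto
s3 = boto.connect_s3()  # credentials come from the environment or boto config
bucket = s3.create_bucket('media.yourdomain.com')  # bucket names must be unique
key = bucket.new_key('examples/first_file.csv')
key.set_contents_from_filename('first_file.csv')  # local file name is assumed
key.set_acl('public-read')  # let anyone read the object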
The program works like this:
- Import boto, a Python library for Amazon Web Services, so you can access its functions.
- Connect to Amazon S3.
- Create an empty bucket called “media.yourdomain.com.”
- Generate a key, which is the name of the new object inside the bucket.
- Upload the object (first_file.csv).
- Set the object's permissions to allow reading.
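To try this out, you can sign up for a free account at Amazon AWS. All you need is to download Python and Boto, both of which are free, and to give Boto your AWS access key and secret key, for example through the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables.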
Now you have an overview of how programmers interact with Amazon S3 object storage.