Amazon says their biggest customers are themselves and Netflix. Microsoft mentions only its own products: Skype, Office 365, Bing, and Xbox, although I am sure they have many large corporate customers, especially as something like 45% of all apps are written in .NET.
Windows Azure is Microsoft's cloud, and it is not just for companies that use Windows. Azure supports Linux and Apache, but at least one of its storage APIs, the one for its caching mechanism, is written for .NET languages only. Microsoft also has object and Hadoop storage like other cloud vendors, which are platform independent, although, in Microsoft tradition, they have given those their own names.
Here is some of what Microsoft offers (and what they call it).
HDInsight is Apache Hadoop.
Hadoop is a technology that grew out of work at Google and Yahoo on storing their vast search engine data. It stores data across a network of computers in what is called the Hadoop Distributed File System (HDFS). This is a hierarchical file system rather than an object one. That means files are referenced by their directory and file name rather than by object ID.
One of their competitors for that would be Cloudera.
Here again Microsoft is using a name different from everyone else's. The rest of us would call this “object storage.”
Object storage is the approach taken by, for example, the open source project OpenStack Swift. The basic idea is that it replaces the hierarchical file system. You no longer refer to files like this: \\mountpoint\a\b\c\d\file.txt. Instead each file is given an object ID. These objects can be spread across a very large number of nodes, which lets the store scale almost without limit. Hadoop, too, is spread across nodes.
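To make the difference from a directory tree concrete, here is a minimal sketch of a flat, ID-addressed store. It is a toy that lives in memory, not how Azure or any real product is implemented; the class name `ToyObjectStore` and its methods are made up for illustration.

```python
import hashlib
import uuid


class ToyObjectStore:
    """In-memory stand-in for an object store: a flat namespace spread over nodes."""

    def __init__(self, node_count):
        # Each "node" is just a dict here; a real store would use separate machines.
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, object_id):
        # Hash the ID so that objects spread roughly evenly across the nodes.
        digest = hashlib.sha256(object_id.encode("utf-8")).digest()
        return self.nodes[int.from_bytes(digest[:8], "big") % len(self.nodes)]

    def put(self, data):
        # There is no directory or file name: the store hands back an object ID.
        object_id = uuid.uuid4().hex
        self._node_for(object_id)[object_id] = data
        return object_id

    def get(self, object_id):
        # The ID alone is enough to find the node and the data.
        return self._node_for(object_id)[object_id]


store = ToyObjectStore(node_count=4)
oid = store.put(b"contents of file.txt")
```

Calling `store.get(oid)` returns the same bytes. Adding capacity in this model means adding nodes, not deepening a directory tree.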
To write data to object storage you use REST web services. Conceptually these work the same way as the HTTP requests behind web pages. You add data with the PUT or POST method. You read it with a GET. To delete, use DELETE. When you view a web page in a browser, fill in a field, and press Enter, that is typically a POST. Reading a web page in the browser is a GET. There is no update function with Hadoop or object storage, unless you are working with data inside files (i.e., structured data).
Transitioning from traditional NFS (Unix) and SMB/CIFS (Windows) storage requires a data conversion. You would have to write a program to do that, buy a tool, or maybe use the cloud's tool. One interesting thing Microsoft offers is that you can back up your data onto a disk drive, ship the disk to their facility, and they will load it. That is not an SMB-to-object conversion. It just reduces the problem of transferring so much data over the network. Other object storage vendors are Cleversafe and Amazon S3.
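As a rough illustration of those verbs, the sketch below builds (but does not send) the three kinds of request using Python's standard library. The URL is hypothetical, and a real service would also require authentication headers, which are left out here.

```python
import urllib.request

# Hypothetical object URL, for illustration only; not a real endpoint.
OBJECT_URL = "https://storage.example.com/mycontainer/report.txt"


def build_requests(url, payload):
    """Return the write, read, and delete requests for one object."""
    write = urllib.request.Request(url, data=payload, method="PUT")
    read = urllib.request.Request(url, method="GET")
    delete = urllib.request.Request(url, method="DELETE")
    return [write, read, delete]


requests_for_object = build_requests(OBJECT_URL, b"quarterly numbers")
methods = [req.get_method() for req in requests_for_object]
```

Each request could be sent with `urllib.request.urlopen(req)` against a real endpoint. Note there is no "update" request in the list: changing an object means writing a new copy with PUT.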
This is Microsoft’s name for what everyone else calls HBase, which is a way to provide real-time access to Hadoop data. Hadoop data sets are otherwise processed in batch jobs called MapReduce. One thing to note here is that HBase data is structured data, meaning it fits into rows and columns like a traditional database. Hadoop itself can store structured or unstructured data (like log files, pictures, and video).
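For readers who have not seen a MapReduce job, here is a minimal single-machine sketch of the three phases (map, shuffle, reduce) counting words in a few log lines. It only shows the shape of the computation; real Hadoop distributes each phase across nodes, and the function names here are made up.

```python
from collections import defaultdict


def map_phase(lines):
    # Emit a (key, 1) pair for every word in every input line.
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def shuffle(pairs):
    # Group all values that share a key, as the framework does between phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    # Collapse each key's values into a single total.
    return {key: sum(values) for key, values in grouped.items()}


log_lines = ["GET /index.html", "GET /about.html", "POST /login"]
counts = reduce_phase(shuffle(map_phase(log_lines)))
```

The whole input is read before any answer comes out, which is why this style suits batch work and why something like HBase is needed for real-time lookups.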
This is a combination of RAM (memory), NVRAM (non-volatile memory), and SSD (solid state storage) plus a caching mechanism for .NET applications. The idea is to provide extremely fast access to data. Memory and flash drive (SSD) data access is much faster than hard disk drive access, because hard disks depend on mechanical parts (rotating platters and moving heads), while memory and SSDs are solid state (meaning there are no moving parts).
Microsoft offers backup services as well. They encrypt your data as they back it up. You can administer this with graphical tools or with Windows PowerShell, which is a command-line shell and scripting language.
Microsoft prices their storage differently depending on whether you want local redundancy (i.e., redundancy in one Microsoft data center) or local and geographical redundancy (meaning data centers in different locations), and on how much data you store. Geographic redundancy would be suitable for very large operations that want both the additional safety of geographical redundancy and the ability to locate data closer to their customers to reduce network latency.
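The general idea of a cache in front of slow storage can be sketched in a few lines. This is plain Python and is not Microsoft's .NET caching API; the class `ReadThroughCache` and the function `slow_disk_read` are hypothetical stand-ins.

```python
import time


class ReadThroughCache:
    """Keep recently read values in memory so repeat reads skip the slow store."""

    def __init__(self, load_from_disk):
        self._load = load_from_disk
        self._memory = {}
        self.slow_reads = 0  # how many times we had to go to the slow store

    def get(self, key):
        if key not in self._memory:
            # Cache miss: pay the slow read once and remember the result.
            self.slow_reads += 1
            self._memory[key] = self._load(key)
        return self._memory[key]


def slow_disk_read(key):
    time.sleep(0.01)  # stand-in for hard disk latency
    return f"value-for-{key}"


cache = ReadThroughCache(slow_disk_read)
first = cache.get("user:42")   # goes to the slow store
second = cache.get("user:42")  # served from memory
```

Only the first read pays the disk penalty. A production cache would also need an eviction policy and a way to invalidate stale entries, which this sketch leaves out.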
| |$0.07 per GB
|Next 49 TB |$0.065 per GB
|And so on…
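Tiered pricing like this is easy to get wrong by hand, so here is a small sketch of the arithmetic. The two per-GB rates and the "Next 49 TB" tier come from the table; the size of the first tier (1 TB) and the conversion of 1 TB to 1024 GB are assumptions made for the example, and tiers past the ones shown are not modeled.

```python
GB_PER_TB = 1024  # assumption: binary terabytes

# (tier size in GB, price per GB in dollars)
TIERS = [
    (1 * GB_PER_TB, 0.07),    # first-tier size of 1 TB is an assumption; the rate is from the table
    (49 * GB_PER_TB, 0.065),  # "Next 49 TB" at $0.065 per GB, from the table
]


def storage_cost(gb_stored):
    """Charge each slice of the stored data at the rate of the tier it falls in."""
    remaining = gb_stored
    total = 0.0
    for tier_size_gb, price_per_gb in TIERS:
        billed = min(remaining, tier_size_gb)
        total += billed * price_per_gb
        remaining -= billed
        if remaining <= 0:
            return total
    # The table continues ("And so on…") but those tiers are not shown.
    raise ValueError("amount exceeds the tiers listed here")


cost_for_5_tb = storage_cost(5 * GB_PER_TB)
```

Under these assumptions, 5 TB is billed as 1024 GB at $0.07 plus 4096 GB at $0.065, which comes to about $337.92.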