Amazon has again shown why it is regarded as one of the driving forces in the IT sector with the launch of Amazon Redshift, a new service that sells data warehousing as a service.
Redshift is a huge headache for traditional data warehousing vendors that combine hardware and software, such as IBM and Oracle. It targets large companies that feel they are paying too much for data warehousing, as well as small companies that do not have the budget for it in the first place. Both groups end up simply discarding some of their data.
Redshift lets customers significantly improve query performance when analyzing datasets that range in size from hundreds of gigabytes to a petabyte or more, using the same SQL-based business intelligence tools they already have.
According to Amazon, running a decent-sized data warehouse in-house usually costs a firm between $19,000 and $25,000 per terabyte per year. With Amazon Redshift, the same capabilities cost about $1,000 per terabyte per year.
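To make the comparison concrete, here is a minimal Python sketch of the arithmetic. It assumes the Redshift figure is quoted per terabyte per year, like the in-house figures, and that costs scale linearly with data size. The function and constant names are illustrative only and are not part of any Amazon tool.

```python
# Per-terabyte, per-year figures quoted in the article (US dollars).
IN_HOUSE_LOW = 19_000
IN_HOUSE_HIGH = 25_000
REDSHIFT = 1_000  # assumption: also per terabyte per year


def annual_cost(terabytes, cost_per_tb_year):
    """Yearly cost for a warehouse of the given size, assuming linear scaling."""
    return terabytes * cost_per_tb_year


def savings_range(terabytes):
    """Return (low, high) yearly savings of Redshift versus running in-house."""
    redshift = annual_cost(terabytes, REDSHIFT)
    low = annual_cost(terabytes, IN_HOUSE_LOW) - redshift
    high = annual_cost(terabytes, IN_HOUSE_HIGH) - redshift
    return low, high


if __name__ == "__main__":
    low, high = savings_range(10)
    print(f"10 TB: estimated savings between ${low:,} and ${high:,} per year")
```

Under these assumptions, a 10 TB warehouse would save roughly $180,000 to $240,000 per year. Real bills depend on node types, usage, and discounts, so treat this as a back-of-the-envelope estimate.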
The advantages of Amazon Redshift go beyond mere cost savings. The service also streamlines a company’s manpower and operations, since it frees up administrators who would otherwise be tied down with monitoring, tuning, backups, and software patching, all of which are now handled automatically through the AWS Management Console. Amazon also states that the service is highly scalable, with configurations starting at hundreds of gigabytes at the low end and scaling up to more than a petabyte.
Since Redshift is based on relational database technology and uses SQL as its query language, it is compatible with existing business intelligence tools. It has especially close ties to ParAccel, and for good reason: Amazon is an investor in the company and has acknowledged licensing key technology from it.
Redshift should not be confused with the previously available Amazon Relational Database Service (RDS). Whereas Redshift is built exclusively for warehousing and analytics, RDS is aimed at transactional database workloads. Redshift is also capable of big data scale, while RDS is based on Microsoft SQL Server, MySQL, and Oracle, none of which is designed to handle petabyte-scale data warehousing.
An active Amazon Redshift instance is called a “Data Warehouse Cluster,” or simply a cluster. The service supports both single-node and multi-node clusters. A single-node cluster can store up to 2 TB of data and can easily be upgraded to a multi-node cluster if the need arises.
Despite the potential for big data analysis, Amazon is keen to emphasize Redshift’s appeal to small and midsize companies that want to get into data warehousing but do not have the budget for conventional solutions. A 2 TB cluster costs $8.50 per 10 hours, and further discounts can bring that down to as low as $2.28 per 10 hours. This puts it well within the reach of even small, budget-strapped companies in need of data warehousing.
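The per-10-hour prices above can be turned into yearly figures with a short Python sketch. It assumes the cluster runs around the clock for a 365-day year and that the quoted prices cover the full 2 TB. The helper names are made up for illustration.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: cluster runs continuously, non-leap year

# Prices quoted in the article for 2 TB, in US dollars per 10 hours.
STANDARD_PER_10_HOURS = 8.50
DISCOUNTED_PER_10_HOURS = 2.28
CLUSTER_TB = 2


def yearly_cost(price_per_10_hours, hours=HOURS_PER_YEAR):
    """Convert a per-10-hour price into a yearly cost, rounded to cents."""
    return round(price_per_10_hours / 10 * hours, 2)


def yearly_cost_per_tb(price_per_10_hours, terabytes=CLUSTER_TB):
    """Yearly cost divided evenly across the cluster's terabytes."""
    return round(yearly_cost(price_per_10_hours) / terabytes, 2)


if __name__ == "__main__":
    for label, price in [("standard", STANDARD_PER_10_HOURS),
                         ("discounted", DISCOUNTED_PER_10_HOURS)]:
        print(f"{label}: ${yearly_cost(price):,.2f} per year, "
              f"${yearly_cost_per_tb(price):,.2f} per TB per year")
```

Under these assumptions, the discounted rate works out to just under $1,000 per terabyte per year, which lines up with the annual figure Amazon quotes for the service.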
Amazon has a knack for disrupting markets and offering things that nobody else has been able to offer before. Redshift continues that trend: it disrupts the market with a new service, this time from a cost-value perspective as well.