July 3, 2018

Snowflake democratizes DW in the cloud

We had the privilege of visiting Snowflake last week during the 27th IT Press Tour, and we were impressed by the team. Something big is happening there: it was one of the top 3 sessions of the tour for sure, a great moment.

Founded in 2012 by Thierry Cruanes, Benoit Dageville and Marcin Zukowski with the mission to develop the first real data warehouse (DW) built for the cloud, Snowflake Computing grew rapidly and recruited the well-known senior executive Bob Muglia to lead the company. He arrived in 2014, in step with the first customer deliveries, and his arrival marked a new era for Snowflake. The company has so far raised nearly half a billion dollars, $472.9M to be exact, in 6 rounds from top-tier VCs, for a valuation estimated at around $2.5B today.

The original idea came from the convergence of 2 facts: more and more data is stored in the cloud, and no data warehouse had been designed for the cloud to leverage and harness this fast-growing presence. The past has demonstrated that solutions from Oracle and Teradata, which are very expensive, did not address well the need for high volumes of semi-structured data. On the other side, Hadoop provides a new cost-effective way to ingest large volumes of data but relies on batch techniques, and this lack of real-time capabilities limits enterprise adoption. DW was really a batch exercise, and even Hadoop failed to deliver the real-time capability that has become an enterprise must-have and would boost market adoption. These elements were a real trigger for the founding team, who set out to bring a few fundamental elements together:
  • build and design a new architecture to sustain new needs, especially on size, with for instance 10PB tables but essentially unlimited data sizes,
  • leverage the cloud for elasticity and scalability of resources, in other words think and design for the cloud as a cloud-native solution,
  • be fast, with real-time capabilities and high concurrency thanks to massively parallel processing techniques,
  • rely on SQL, the preferred, established, recognized and standard query language,
  • give access to all users from one location,
  • provide a very secure environment that can be shared within an ecosystem,
  • sell at a fraction of the "classic" DW cost with a pay-per-use model charged per second.
The result is a solution built for the cloud that could be called a Cloud Analytic Database, or a Data Warehouse as-a-Service, and that is pretty unique on the market.
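
To make the pay-per-use point concrete, here is a minimal sketch of per-second billing arithmetic compared with a cluster that is paid for around the clock. The hourly rate, the workload figures and the function names are all hypothetical and chosen only for illustration; they are not Snowflake's actual prices or API.

```python
# Hypothetical hourly rate, for illustration only (not an actual Snowflake price).
HOURLY_RATE_USD = 2.00


def per_second_cost(seconds_used: int, hourly_rate: float = HOURLY_RATE_USD) -> float:
    """Cost when only the seconds actually used are charged."""
    return seconds_used * hourly_rate / 3600


def always_on_cost(hours_provisioned: int, hourly_rate: float = HOURLY_RATE_USD) -> float:
    """Cost of a cluster that is paid for whether it is busy or idle."""
    return hours_provisioned * hourly_rate


# Assumed workload: a 90-second job, run 40 times a day, over a 30-day month.
busy_seconds = 90 * 40 * 30

print(f"per-second billing: ${per_second_cost(busy_seconds):.2f}")
print(f"always-on cluster:  ${always_on_cost(24 * 30):.2f}")
```

With these assumed numbers the gap comes entirely from idle time: the bursty workload is busy for 30 hours in the month, while the always-on cluster is billed for 720.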

The product separates compute and storage resources. It was first available on AWS, with EC2 instances for compute and S3 blob storage for data, and we heard that Azure will be offered soon. EC2 hosts the processing instances, grouped in dedicated clusters of compute nodes that run isolated workloads; each such cluster is called a virtual warehouse. S3 is used for durability, ubiquity, cost and simplicity reasons. Data is compressed and the price is about $30/TB/month, so really compelling. The database design is column-oriented, for better compression and reduced I/O, and built on micro-partitions for better isolation. Metadata, critical in such a system, relies on FoundationDB, a startup acquired by Apple in March 2015 and met by the IT Press Tour in June 2014, before that acquisition. The whole tour team was impressed by their product at the time, so it is no surprise that Apple picked them up and Snowflake selected them.
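
The I/O benefit of a column-oriented layout split into micro-partitions can be sketched as follows. This is a toy model, not Snowflake's implementation: the `MicroPartition` class, the per-column min/max statistics and the function names are assumptions made for illustration. The idea shown is that small per-partition metadata lets a query skip partitions that cannot contain matching rows.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MicroPartition:
    """Hypothetical micro-partition: values stored per column, plus min/max stats."""
    columns: Dict[str, List[int]]
    stats: Dict[str, Tuple[int, int]]


def write_partition(columns: Dict[str, List[int]]) -> MicroPartition:
    # Statistics are computed once, at write time, and kept as metadata.
    stats = {name: (min(values), max(values)) for name, values in columns.items()}
    return MicroPartition(columns=columns, stats=stats)


def scan_greater_than(partitions: List[MicroPartition], column: str, threshold: int):
    """Return matching values and the number of partitions actually read."""
    matches: List[int] = []
    partitions_read = 0
    for part in partitions:
        _, high = part.stats[column]
        if high <= threshold:
            continue  # metadata proves no row can match, so skip reading the data
        partitions_read += 1
        # Only the requested column is touched, not the whole row.
        matches.extend(v for v in part.columns[column] if v > threshold)
    return matches, partitions_read


partitions = [
    write_partition({"order_id": [1, 2, 3], "amount": [10, 25, 40]}),
    write_partition({"order_id": [4, 5, 6], "amount": [55, 70, 90]}),
    write_partition({"order_id": [7, 8, 9], "amount": [15, 20, 30]}),
]
matches, partitions_read = scan_greater_than(partitions, "amount", 50)
print(matches, partitions_read)
```

In this example only one of the three partitions is read, and within it only the `amount` column.
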
Concurrency is a key aspect here: all users always see a consistent view of the data, and write operations never block readers.
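
One common way to get "writers never block readers" is to treat every committed version of the data as immutable and to publish a new version with a single reference swap. The sketch below illustrates that general technique under simplifying assumptions; the `SnapshotTable` class is hypothetical and is not a description of how Snowflake implements it.

```python
import threading
from typing import Tuple


class SnapshotTable:
    """Toy table where each committed version is an immutable tuple of rows."""

    def __init__(self) -> None:
        self._current: Tuple[dict, ...] = ()
        self._write_lock = threading.Lock()  # serializes writers only

    def snapshot(self) -> Tuple[dict, ...]:
        # Readers take no lock: they just grab a reference to the current version.
        return self._current

    def append(self, *rows: dict) -> None:
        with self._write_lock:
            # Build a new version, then publish it with one reference assignment.
            self._current = self._current + tuple(rows)


table = SnapshotTable()
table.append({"id": 1}, {"id": 2})

view = table.snapshot()   # a reader starts its query here
table.append({"id": 3})   # a writer commits in the meantime

print(len(view), len(table.snapshot()))
```

The reader that took its snapshot before the second write keeps seeing two rows for the whole duration of its query, while any new reader sees three.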

The company is also a big promoter of data sharing, which it calls the data sharehouse. The goal is to enable collaboration on data without copying or moving it and without any lengthy process, and it is included in the product at no extra cost.

Snowflake had early successes with media, adtech, gaming and software companies, and more recently with large corporations. We've been told that the company has passed the 1,000-customer mark, which represents a huge achievement. Competition is also pretty active, from big players such as IBM, Microsoft or Oracle but also from Amazon, Google, HPE, Teradata or Actian. So again, having raised this huge amount of money pushes them to build, develop and act rapidly, very rapidly.

Finally, if you need to sum up Snowflake in a few words, you could think of it as the "Salesforce of the Data Warehouse", a real shake-up of the industry...
