Nov 15, 2016

Weka.IO, the new High Performance File Storage

Weka.IO is a new gem we visited in Tel Aviv last week during the 20th edition of The IT Press Tour. Once again we met a strong, talented team, with several people coming from XIV. I wrote a very short post last June when the company announced a Series B, for a total of $32M, but not much information was available at the time. Things are a bit different now and, believe me, Weka.IO will shake up the file storage market when the product becomes available.
First, the background of the team is strong, super strong, with Michael Raam (CEO), Liran Zvibel, Omri Palmon and Barbara Murphy, coming from EMC, IBM, Intel, NetApp, Panasas, SandForce and XIV.
Second, and very important, even though the product is not yet available and the web site is limited, the product is already used by a highly visible web 2.0 company, and 2 POCs are running at major M&E and genomics research sites where IO and file access are very demanding. In one of these configurations, Weka.IO is partnering with the best object storage on the market, Cleversafe, or I should say IBM COS.
Weka.IO is developing a new High Performance File Storage product, fully software-based, addressing the need for very fast IO and high throughput. By high performance, we mean latency in the range of 300 to 500 usec. The product is a distributed parallel file system designed for flash, running across VMs on a hypervisor farm. As applications require industry-standard file sharing protocols, Weka.IO also offers NFS v3 (SMB is on the roadmap), but the best performance is achieved with their kernel file storage module, which operates in parallel and is fully POSIX compliant. By parallel we mean, and it's the only acceptable definition, that the file is split at the client level and chunks are sent in parallel to multiple file storage heads or servers. Here the chunk size is 1MB. As the name suggests (weka stands for 10^30), there is no practical capacity limit, and performance is the whole point of the product.
The design considers two levels of storage: a first level, full flash and local to every VM, for hot data, and an optional secondary level with external object storage for cold data. An important point: Weka.IO is not developing the object storage layer but partnering with players offering this kind of secondary storage, and most of the time an S3 API is used between the two storage entities through the Weka.IO object connector. Policies have to be defined to select the data to be moved to secondary storage, so Weka.IO can also be seen as an integrated HSM/tiering file storage solution.
Of course the company has several patents pending, especially around data protection, and the hot storage layer (SSD in VMs) relies on an advanced distributed erasure coding technique invented at Weka.IO. Data durability relies on a 16+2 scheme working at 4kB, which is largely enough for hot data; remember that data won't stay for years at this tier. Performance scales linearly, starting from a minimum of 6 nodes (due to the EC design), and depends on the Weka.IO hosts rather than on the storage entities connected to the Weka service.
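To make the "parallel" definition and the 16+2 figure above more concrete, here is a minimal Python sketch. It is not Weka.IO code: the round-robin placement, the function names and the node names are assumptions made purely for illustration, and the real product almost certainly uses a smarter distribution. Only the 1MB chunk size, the 6-node minimum and the 16+2 layout come from the presentation.

```python
CHUNK_SIZE = 1024 * 1024  # 1MB chunk size, as quoted in the post


def split_into_chunks(file_size, chunk_size=CHUNK_SIZE):
    """Return (offset, length) pairs that cover a file of file_size bytes."""
    return [
        (offset, min(chunk_size, file_size - offset))
        for offset in range(0, file_size, chunk_size)
    ]


def place_chunks(chunks, servers):
    """Spread chunks over servers round-robin (hypothetical placement policy)."""
    placement = {server: [] for server in servers}
    for index, chunk in enumerate(chunks):
        placement[servers[index % len(servers)]].append(chunk)
    return placement


def usable_fraction(data_blocks=16, parity_blocks=2):
    """Share of raw capacity holding user data in a data+parity stripe."""
    return data_blocks / (data_blocks + parity_blocks)


if __name__ == "__main__":
    servers = [f"node{i}" for i in range(6)]  # 6 = minimum cluster size in the post
    chunks = split_into_chunks(10 * CHUNK_SIZE + 4096)  # a file just over 10MB
    for server, assigned in place_chunks(chunks, servers).items():
        print(server, len(assigned), "chunks")
    print(f"16+2 usable capacity: {usable_fraction():.1%}")
```

The client-side split is what lets every storage node work on a different part of the same file at the same time. With 16 data blocks and 2 parity blocks per stripe, about 89% of the raw flash holds user data while two simultaneous losses can be tolerated.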

The product is able to deliver, on the configuration tested, ~25k read ops/core, ~250MB/sec/core and ~500usec latency. A rapid comparison with GPFS, with 300 Weka.IO nodes and 600TB of SSD, gives a 10x performance boost for bandwidth (75GB/s) and 4x for IOPS (7.5M IOPS), deployed in less than a day, with a TCO of $505/year. Weka.IO will rapidly gain market share, no doubt: their breakthrough approach, their design and their brilliant team will play a top role in the primary file storage space. Weka.IO has also made two brilliant hires: Richard Dyke, VP Sales, coming from Hedvig, and Christine Kerschbaum, sales executive, coming from DDN. Clearly the company has an obvious strategy around capacity and performance profiles. Now it's about execution.
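As a quick sanity check on the numbers above, the aggregate figures can be divided by the node count. This is just arithmetic on the values quoted in the presentation, using decimal units (1GB = 1000MB); the function and constant names are mine.

```python
NODES = 300            # Weka.IO nodes in the GPFS comparison
BANDWIDTH_GB_S = 75    # aggregate bandwidth quoted
TOTAL_IOPS = 7_500_000 # aggregate IOPS quoted
SSD_TB = 600           # total SSD capacity quoted


def per_node(total, nodes=NODES):
    """Divide an aggregate figure evenly across the nodes."""
    return total / nodes


if __name__ == "__main__":
    print("MB/s per node:", per_node(BANDWIDTH_GB_S * 1000))
    print("IOPS per node:", per_node(TOTAL_IOPS))
    print("SSD TB per node:", per_node(SSD_TB))
```

That gives 250MB/s, 25,000 IOPS and 2TB of SSD per node, and the first two line up with the per-core figures quoted above. Whether this means one core per node is dedicated to Weka.IO in that configuration is my reading; the presentation did not confirm it.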
