Jun 24, 2016

NooBaa: super simple scalable S3 storage for the masses

NooBaa (www.noobaa.com), a new Israeli storage ISV, continues to build its visibility in the market. I had the privilege to meet Yuval Dimnik, CEO and co-founder, and Mike Davis, CMO, a few days ago in Palo Alto for a briefing. My first impression was very positive, as they plan to introduce a genuinely new storage animal to the market. Let me summarize what I learned without revealing any secrets, as the web site is already live. First, NooBaa has strong DNA in storage and networking, with several leaders having worked at Exanet, among them the CEO and Guy Margalit, CTO. The company, today with approximately 15 people, an HQ in Boston and an R&D center in Rehovot, Israel, was founded in 2013 and is backed by JVP and OurCrowd for an undisclosed total; OurCrowd has put in $934,504. The product they design and develop belongs to the object storage category with key differentiators: a full SDS philosophy dedicated to unstructured data.


The NooBaa product is a pure software approach, offering only an S3 interface, running on any compute resource available anywhere and supporting heterogeneous network-attached and shared storage entities. The product also offers deduplication with a sliding-window technique, compression, encryption and replication to protect data, with 3 copies by default. Erasure coding is on the roadmap. NooBaa uses an asymmetric model with dedicated metadata servers and data servers acting as chunk servers; the metadata servers use MongoDB. The demo the team gave was super efficient, showing how easy and fast the product is to deploy, run and operate: in just 15 minutes we were playing with the system, ingesting some data and streaming a video. The pricing has 2 modes: freemium, then a pay-per-use subscription based on real used capacity (capacity for one data copy). The product is in beta and should be GA in early 2017. A great first feeling with the team and a super promising product, with lots of potential and really interesting ideas.
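NooBaa has not published the details of its sliding-window deduplication, but the general technique it names, content-defined chunking, can be sketched in a few lines of Python. The window size and boundary mask below are arbitrary illustration values, not NooBaa's:

```python
import hashlib

def chunks(data: bytes, window: int = 16, mask: int = 0x3FF):
    """Content-defined chunking: declare a chunk boundary wherever the
    hash of the last `window` bytes matches a bit pattern. Identical
    content then yields identical chunks even when an insertion shifts
    byte offsets, which is what makes variable-size dedup effective."""
    out, start = [], 0
    for i in range(window, len(data)):
        h = int.from_bytes(hashlib.md5(data[i - window:i]).digest()[:4], "big")
        if h & mask == 0:  # on average ~1 boundary per 1024 positions
            out.append(data[start:i])
            start = i
    out.append(data[start:])
    return out

def dedup(chunk_list):
    """Store each unique chunk only once, keyed by its fingerprint."""
    store = {}
    for c in chunk_list:
        store[hashlib.sha256(c).hexdigest()] = c
    return store
```

A real implementation would use a rolling hash (e.g. Rabin fingerprinting) instead of re-hashing the window at every offset, and would enforce minimum and maximum chunk sizes.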

Jun 14, 2016

New iteration in Data Reduction

Ascava Inc., with no web site or even a logo yet, is making progress in its mission to change the data reduction landscape. The company, founded in 2007 and incorporated in Delaware, raised $1.59M in Nov. 2014; the team is currently located in Los Altos Hills, CA. I had the privilege to meet 2 key executives of the company at the SNIA DSI conference: Harsh Sharangpani, CEO and CTO, and Rajesh Patil, VP Business Development and Operations. LinkedIn shows 5 people, in addition to Rajesh, who is not listed under Ascava.
As the title of this post suggests, the product is about data size reduction, which the company calls Data Distillation, with ratios 1.5 to 2x better than the best dedup + compression ratios existing on the market. Ascava has made several innovations in this area and is not ready to announce anything yet, but development is in sync with the plan.
The product is software running as a separate standalone application that you can operate on any node, able to shrink file size and return a highly optimized reduced file. The outcome is obvious: less consumed space for a fraction of additional CPU cycles. Of course you need a "reader" able to unpack the file and give you back the original content before any application reads it. We can imagine multiple usages such as data archiving, data analysis and file tiering/HSM/migration. It reminds me a bit of what Ocarina Networks did in the past, even if Ascava is doing radically new things to achieve this reduction ratio. For those who wish to read a few things about Ocarina, I wrote a few posts, in French at the time, available here: Aug. 2007, Apr. 2008 and July 2010. Ascava was a good meeting, even a surprise, and illustrates that some initiatives and companies continue to dig in that space. CPUs and memory continue to get faster and more affordable, so we can imagine that some innovators wish to move forward and propose something new; Ascava is one of them. Good luck.
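Ascava's actual Data Distillation algorithms are unpublished, but the contract described above - a standalone reducer plus a "reader" that restores the exact original bytes - is the familiar lossless round trip. Here is a minimal sketch using stdlib zlib purely as a stand-in for the real algorithm:

```python
import zlib

def distill(original: bytes) -> bytes:
    """Stand-in for the reducer: shrink the payload losslessly.
    (Ascava's real algorithm is proprietary; zlib only illustrates
    the input/output contract, not their reduction ratios.)"""
    return zlib.compress(original, level=9)

def read_back(reduced: bytes) -> bytes:
    """Stand-in for the 'reader': restore the exact original bytes
    before any application consumes them."""
    return zlib.decompress(reduced)
```

Whatever the internals, any scheme claiming 1.5 to 2x better ratios must still satisfy `read_back(distill(x)) == x` for every input, byte for byte.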

Jun 10, 2016

Lenovo powered by Cloudian

Lenovo (www.lenovo.com | HKSE:992) has clearly jumped into the SDS wave and has made a super choice with Cloudian HyperStore, which finally confirms what everyone should know: Amazon S3, the de facto cloud storage reference, is the winner. End of story. For all the object storage vendors who have tried to compete by saying "Hey, we have a better API": c'mon, there is only one rule: "The market is always right!!"
So Lenovo has recognized that and once again picks the best Amazon S3-compatible object storage for on-premises deployment. Lenovo markets its StorSelect appliance product line with 2 products; for cloud use cases the model is the DX8200C - C for Cloudian - with GA expected in Q3 this year. The DX8200C is a 2U appliance, users must buy 3 nodes to start, and the company delivers level 1 and 2 support, with Cloudian as level 3 of course. Again, Lenovo has made a super choice. We'll learn more in 2 weeks during The IT Press Tour.

Jun 8, 2016

New large scale primary storage approach from Weka.io

Weka.io (www.weka.io) is a storage startup founded in 2013 in Israel, with many employees coming from XIV. The company just announced a $32M series B from previous VCs such as NVP and Gemini, and new ones including Qualcomm Ventures and WRV II, to name a few. Weka.io plays in the Software-Defined Storage aka SDS space with a scale-out storage solution ready for the cloud-based datacenter. We expect more information soon... and it would be a perfect candidate for The IT Press Tour.

Jun 7, 2016

Big Data Services reinvented

Iguaz.io (www.iguaz.io) takes advantage of the Spark Summit this week in San Francisco to unveil the result of at least 2 years of active development. I had the privilege yesterday to discuss their mission and solution approach with Asaf Somekh, founder and CEO, and Yaron Haviv, founder and CTO.
First, a few details about the company: Iguaz.io was founded in 2014 in Tel-Aviv by Asaf Somekh, Yaron Haviv and Yaron Segev, founder of XtremIO, and has raised $15M so far in a Series A from Magma Venture Partners and Jerusalem Venture Partners. Today the company has 40+ employees, all with strong backgrounds in fast networks and storage, with experience at XIV, XtremIO, Voltaire and Mellanox to name a few. As for the name, think of the Iguazu Falls and you get the idea of the data deluge challenge the company wishes to address and solve.
The founders, having worked in demanding environments, realized that big data services suffer from a lack of simplicity: they are always built on classic layers that were never optimized for big data processing, resulting in complex data pipelines. Data and data tools are siloed, and users and vendors have simply unified and glued them together without rethinking the I/O stack or considering a radically new approach to remove the performance barriers and bottlenecks at various points in the architecture. Iguaz.io took this challenge super seriously and set out to design, build and develop a new approach, aligned with big data challenges and able to integrate with all the famous big data processing tools and products. In fact, some users have started to tackle this challenge internally, with lots of difficulties, especially around people and skills, and others refuse to deal with all this horizontal complexity of integrating data pipelining solutions (data movement, multiple copies, ETL and security). So finally 2 options exist. The obvious first one is the Amazon AWS or Microsoft Azure approach: everything is externalized, but it remains complex, with unpredictable performance, not optimized, with multiple copies that impact timing, and above all very expensive. The second is Iguaz.io, which redefines all the layers with only one copy and super fast pipelining capability, with cost optimization in mind. Iguaz.io realized that many computing components have made significant progress over the last decade, but the storage software stack is still based on assumptions inherited from the HDD world. And above all, at different layers you find pieces of software doing pretty similar things, each consuming lots of CPU cycles.


First, Iguaz.io defines a common data repository with a set of data services that sit above this data lake. The solution is fully storage agnostic and provides multiple data access methods (file, objects, streams, HDFS, KV, records, new APIs...) that integrate with now-classic big data products such as Hadoop, Spark or Elasticsearch. You can see the Iguaz.io product as a super fast, scalable, universal and highly resilient access and data layer between consumers (big data applications and users) and back-end storage.


This is what Iguaz.io calls a 3-layered architecture with Application & APIs, Data Services, and Media:
  • The Applications & APIs layer is stateless, so failure resistance comes by nature; the model is extensible and elastic, and it commits all updates to a zero-latency, concurrent storage layer. It is responsible for mapping and virtualizing standard file, object, stream, or NoSQL APIs onto the common data services. Also, key to the Iguaz.io approach and to the big data world, changes are immediately visible in a consistent fashion.
  • The second layer provides key data processing - inspection, indexing, compression and storage - in an intelligent and efficient way on low-latency non-volatile memory or fast NVMe flash drives. Data can then be moved to the appropriate storage tier. Iguaz.io has introduced a data container notion to store objects, providing consistency, durability and availability.
  • The last layer is the Media one, with Iguaz.io's application-aware K/V API mapped directly to different types of storage, including NV memory, flash, block, file, or cloud.
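Iguaz.io's implementation is proprietary, so the class below is only a toy illustration of the "one copy, multiple access methods" idea the layers describe: a single key/value container exposed through both an object-style and a record-style API over the same stored bytes. All names here are hypothetical, not Iguaz.io's actual API:

```python
import json

class DataContainer:
    """Toy data container: one underlying K/V store, two access methods,
    no second copy of the data (illustrative only)."""
    def __init__(self):
        self._kv = {}  # the single copy of the data

    # object-style access: opaque bytes
    def put_object(self, key: str, data: bytes) -> None:
        self._kv[key] = data

    def get_object(self, key: str) -> bytes:
        return self._kv[key]

    # record-style access serialized over the SAME stored bytes
    def put_record(self, key: str, record: dict) -> None:
        self._kv[key] = json.dumps(record, sort_keys=True).encode()

    def get_record(self, key: str) -> dict:
        return json.loads(self._kv[key])
```

A write through one API is immediately visible through the other, which mirrors the "changes are immediately visible in a consistent fashion" property claimed for the Applications & APIs layer.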
In terms of adoption model, Iguaz.io promotes a self-service approach that allows rapid deployment of production-ready systems. Iguaz.io has made significant progress for the industry and takes an immediate leadership position in a segment without real competition. Congratulations to the team.
To see a live introduction of this approach, Yaron Haviv, CTO of Iguaz.io, will be interviewed on theCUBE this afternoon, June 7, at 3pm PST.

May 25, 2016

New blood in storage with Symbolic IO

Symbolic IO (www.symbolicio.com), founded in 2012 by Brian Ignomirello, former CTO of HP's global storage team, is now out of stealth. The company claims to offer the first computational defined storage solution. It introduces IRIS, for Intensified RAM Intelligent Server, with the goal of being media-agnostic and data driven. IRIS is delivered in 3 flavors - IRIS Compute, IRIS Vault and IRIS Store - available as 2U units powered by Intel CPUs. The team has built a radically new approach to storing data, able to reduce the storage footprint dramatically, promoting a super dense 2U box that is also super fast, working with RAM coupled with non-volatile entities. And when they say super dense storage, who would have imagined that 1PB could be stored in just 2U? Without telling us exactly how it works, of course, Symbolic IO describes a new way to provide data virtualization, with what the company calls "materializing" and "dematerializing" data in real time, living in memory without any physical reality on storage.
In terms of performance, Brian Ignomirello claims that IRIS is 10x faster than 3D XPoint and 10,000x faster than NAND flash. Wow, impressive, and probably unbeatable at the time of the product's release in Q4 2016. The company has worked super hard for 4 years developing the OS, ASIC and algorithms to reach that level of performance. The product should appear on the market before the end of the year and will start at $80,000. From what we know today, IRIS is pretty unique and deserves a look and a test, no doubt. Symbolic IO would be a perfect candidate for a future edition of The IT Press Tour.

May 24, 2016

Datameer New Generation of Analytics

Datameer (www.datameer.com), one of the leaders in data analytics preparation and a good surprise of the 18th IT Press Tour in March 2016, just introduced its 6.0 release. This major product iteration offers 3 new key elements:
  • a new user interface to radically change how users interact with the platform,
  • an extension to Datameer's Smart Execution with support for the Spark processing engine, which simplifies data mining and machine learning,
  • and a continuous data discovery mechanism.
The outcome shows up in user behavior: with visualization at each step of the process, the workflow is now more fluid. No need to wait for the final phase to obtain a result and only then discover mismatches or inefficient sub-phases; each step feeds the process and offers visualization. The goal of 6.0 is to simplify, reduce processing time and improve the user experience. The battle is on for data preparation, and Datameer clearly occupies a leading position among a limited number of players. Congratulations to the team.

May 23, 2016

AtScale raised a new round

AtScale (www.atscale.com), a leader in BI on Hadoop and a new company met during the 17th IT Press Tour, recently announced an $11M series B. The total raised now touches $20M, which is interesting but at the same time not that large for accelerating growth. It seems the team wishes to keep control of its destiny and, as a result, could trade some growth speed against the competition. What is sure is that AtScale has delivered rapid growth over the last 12 months, with 5-fold revenue expansion - ok, from a very small base - and adoption by famous customers and brands. Among the various investors, Comcast Ventures has joined the pack, and Amr Awadallah, founder and CTO of Cloudera, is now a board member. Congrats to the whole team, and we expect to see you again at a future edition of The IT Press Tour.

May 19, 2016

Primary Data extends the data lake

Primary Data (www.primarydata.com), the data virtualization company extending pNFS, just announced new targets as data servers within the data lake built and controlled by its DataSphere software. With this new release, Primary Data now supports EMC ScaleIO, EMC Isilon and Amazon S3. The company is able to support multiple back-ends - block, file and object, local or remote - and unify them within a single namespace. For those who want some background on Primary Data, I invite you to read a post I wrote 2 years ago, available here. Good progress.

May 16, 2016

When object storage promoters realize the role of files

It's fun, really fun, to see in some videos posted on YouTube, taken at the OpenStack Summit in Austin last month, SwiftStack CPO Joe Arnold recognizing that object APIs are for developers and finally accepting the ubiquity of file-based applications, especially commercial ones that don't speak object at all. And now these guys who serve as ambassadors of all things object promote file access via ProxyFS; wow, what a conversion. They used to refuse file access, then they embedded the Maldivica gateway as the market asked for such access, and now they speak about native file access. In fact, there is nothing new in the market - oh sorry, yes, it's new for Swift: ProxyFS allows file and object access to the same content. Again, really a new feature for Swift, but pretty common among other object storage products in the industry. I love these videos and if you can't find them, here they are:
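SwiftStack has not detailed ProxyFS internals in these videos, but the dual-access idea itself is easy to sketch: map the object namespace onto a directory tree so that a file written through the POSIX API is readable through the object API, and vice versa. Everything below (class and method names) is illustrative only, not ProxyFS's real design:

```python
import os

class DualAccessStore:
    """Sketch: one copy of the data, reachable by object key or file path."""
    def __init__(self, root: str):
        self.root = root

    def path_for(self, bucket: str, key: str) -> str:
        """The filesystem view of an object: <root>/<bucket>/<key>."""
        return os.path.join(self.root, bucket, key)

    def put_object(self, bucket: str, key: str, data: bytes) -> None:
        path = self.path_for(bucket, key)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(data)

    def get_object(self, bucket: str, key: str) -> bytes:
        with open(self.path_for(bucket, key), "rb") as f:
            return f.read()
```

The hard part a real gateway must solve, which this sketch ignores, is keeping object semantics (eventual consistency, whole-object PUTs) coherent with in-place POSIX writes to the same data.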