October 2, 2015

OpenIO, ready to take off

OpenIO (www.openio.io), a French object storage vendor, is slowly emerging from stealth mode. Founded in 2015 with offices in Lille, France, and San Francisco, the company was launched by 7 co-founders, all veterans of the object storage space. The original idea dates back to 2006, when the team was working on large projects at Orange and Atos Worldline, especially storage for email systems. The first production-ready version followed in 2008, and in 2012 the solution was released to the open source community. This history reminds me of a few other object storage leaders launched in the same period, such as Cleversafe, the clear and obvious leader, founded in 2004, or Caringo, launched in 2005 and also a pioneer, as its founders came from Filepool, father of the CAS wave and later known as the EMC Centera offering.
As you can guess, OpenIO develops an object storage product with several interesting features and properties. The product is divided into 2 layers: an access layer and a persistent, or storage, layer. The access layer works as a set of gateways, acting as protocol translators that expose objects to the outside, while the storage layer is responsible for the durability of the data.

One of the challenges for solutions that address large-scale environments is how to distribute data, place it on the right node or element, and provide a fast method to retrieve it when needed. Some vendors choose to control that aspect from top to bottom and use hashing algorithms such as consistent hashing, but OpenIO decided to implement a different method, based on a massively distributed 3-level directory. The product can then learn from the environment to place data better. This approach, named "Conscience", is one of the key differentiators of the product. Internally, at most 3 hops are necessary to reach the data in the worst case.
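To make the "at most 3 hops" idea concrete, here is a minimal sketch of a three-level directory lookup. It is only an illustration under my own assumptions: the level names (level0, level1, level2), the hash-prefix scheme and the Directory class are hypothetical and are not taken from the OpenIO code base.

```python
import hashlib


class Directory:
    """Toy three-level directory. Level names and layout are illustrative
    assumptions, not OpenIO's actual data structures."""

    def __init__(self):
        self.level0 = {}  # hash prefix of the container name -> level-1 node id
        self.level1 = {}  # (level-1 node id, container) -> level-2 node id
        self.level2 = {}  # (level-2 node id, object name) -> chunk locations

    @staticmethod
    def prefix(container):
        # Short hash prefix used to pick the level-1 node (assumed scheme).
        return hashlib.sha256(container.encode()).hexdigest()[:2]

    def put(self, container, obj, locations):
        p = self.prefix(container)
        l1 = self.level0.setdefault(p, f"l1-{p}")
        l2 = self.level1.setdefault((l1, container), f"l2-{container}")
        self.level2[(l2, obj)] = list(locations)

    def locate(self, container, obj):
        """Return (chunk locations, number of directory hops)."""
        hops = 0
        l1 = self.level0[self.prefix(container)]
        hops += 1
        l2 = self.level1[(l1, container)]
        hops += 1
        locations = self.level2[(l2, obj)]
        hops += 1
        return locations, hops


directory = Directory()
directory.put("mailbox-42", "msg-0001", ["node-a:/chunks/c1", "node-b:/chunks/c2"])
print(directory.locate("mailbox-42", "msg-0001"))
```

Each dictionary stands in for one directory level, so a lookup with no caching always costs exactly three hops here; a real deployment would spread each level across many nodes.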
OpenIO uses a hierarchy that will surely remind you of something: Namespace/Account/Container/Object, with a flat container structure. An object is a named BLOB plus metadata, of course. At the container level, a set of options can be defined, such as the number of copies, the number of versions... Versioning works at the container level. IO operations are classic (GET/PUT/DELETE) and work on the latest revision of the object. Data is split into chunks, and each chunk is immutable and represented by an independent file on the local file system. The system uses extended attributes to store object metadata. In terms of data protection, OpenIO supports replication and erasure coding based on Cauchy Reed-Solomon in systematic mode. A compression mode is available at the chunk level and works asynchronously, and the product also provides a tiering capability controlled by storage policies, which define protection (replication or EC) and processing (compression or encryption). To access data, Amazon S3, Swift and an OpenIO API are available, as with many object storage products on the market; in the end, the battle is no longer about the access method but about the core.
The product is currently at version 0.8.1 and available on GitHub at https://github.com/open-io/oio-sds/tree/v0.8.1. The company has already deployed more than 10PB in email environments. The story seems pretty attractive and the product easy to deploy and use. Let's see now how the market, and the US market in particular, will react to a new player in that space...
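The chunking model described above can be sketched in a few lines. This is a rough illustration, not OpenIO's implementation: the function names, the tiny chunk size and the chunk naming scheme are all my assumptions, and the per-chunk metadata is kept in a plain manifest list instead of file system extended attributes so that the example runs anywhere.

```python
import hashlib
import os
import tempfile

CHUNK_SIZE = 4  # bytes; deliberately tiny, an assumption for illustration only


def put_object(root, name, data, chunk_size=CHUNK_SIZE):
    """Split data into chunks and write each chunk as its own file.

    Returns a manifest: one metadata dict per chunk, in order. A real system
    would attach this metadata to the chunk files (the article mentions
    extended attributes); a list is used here for portability.
    """
    manifest = []
    for pos, offset in enumerate(range(0, len(data), chunk_size)):
        chunk = data[offset:offset + chunk_size]
        # Hypothetical naming scheme: hash of object name, position and content.
        chunk_id = hashlib.sha256(f"{name}:{pos}:".encode() + chunk).hexdigest()
        # Mode "xb" refuses to overwrite an existing file, so chunks are write-once.
        with open(os.path.join(root, chunk_id), "xb") as f:
            f.write(chunk)
        manifest.append({"object": name, "position": pos,
                         "size": len(chunk), "chunk_id": chunk_id})
    return manifest


def get_object(root, manifest):
    """Rebuild the object by reading its chunks back in position order."""
    parts = []
    for meta in sorted(manifest, key=lambda m: m["position"]):
        with open(os.path.join(root, meta["chunk_id"]), "rb") as f:
            parts.append(f.read())
    return b"".join(parts)


store = tempfile.mkdtemp()
manifest = put_object(store, "msg-0001", b"hello world")
print(len(manifest), get_object(store, manifest))
```

Writing chunks once and never modifying them is what makes replication or erasure coding per chunk straightforward: a new revision of an object simply produces new chunk files.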
