September 13, 2018

Versity unveils new file system

I was a big fan of QFS and SAM-QFS and was surprised that Sun's acquisition of LSC in 2001 ultimately did not deliver on its promises. The fact that the team that developed LSC's product, especially Harriet Coverston, started a new chapter with Versity Software was a good sign that the story was not over and that an evolution was still possible.

I wrote an early post in June 2013 about Versity and its new Storage Manager (sorry, that post was in French). A few months later, still in 2013, Cray and Versity announced a reseller agreement (sorry, still in French) ahead of the SC show in Denver, and the supercomputer leader has since distributed VSM under the name Tiered Adaptive Storage, or TAS. Surprisingly, TAS is no longer visible on the Cray web site unless you search for it.

They started to make noise during the MSST 2016 conference (now in English), and we met Versity Software in June 2016 for the 19th edition of The IT Press Tour, where Bruce Gilpin told us that his team was developing a new file system. For many months I kept a close eye on Versity to get some information about that development, and I can say that the team is ready to announce something very soon, even if the initial timeframe was a bit too ambitious. I wrote 2 posts (June and July 2016) about VSM and Versity's ideas for a new scale-out file system dedicated to large-scale archive.

Versity and its founding team have always had the ambition to build the most scalable data archiving software to address current and coming data management challenges. VSM, described in the various posts mentioned above, was a proven and well-adopted product that suffered from an old design (remember, it was under development 20 years ago), but it worked and still works very well. Now, with the data deluge we have lived with for more than a decade, something new must be offered that is aligned with these new and coming challenges. The good things are kept, but many new things are introduced with VSM 2.0, which marks a new era in that domain.

VSM 2.0 has 2 parts, ScoutFS and ScoutAM, plus a module named AFM. ScoutFS belongs to the group of hyper-scale file systems that recognize that metadata and data must be segregated. At first, Scout may seem a bizarre name, but it is really aligned with the Scale-Out philosophy from which the name was built. ScoutFS represents the persistent layer implemented in the Linux kernel; it respects POSIX like VSM 1.0 and is developed under the open source GPLv2 license. It is what we call a symmetric distributed file system. Symmetric because there are no dedicated metadata servers: all nodes can serve this role in addition to being data servers. As a remark, a subset of nodes can be configured to support metadata and data and the rest only data, but we still call this a symmetric approach, and by design there is no single point of failure (SPOF). Distributed refers to the pooling of servers, metadata and data, acting together to constitute the file system from a shared block device here.

As I write this post, I'm not sure whether ScoutFS is parallel, in the sense of splitting files and writing/distributing file chunks to independent data server targets. I understand that this parallel aspect is delivered by the coupling between ScoutFS and ScoutAM. One of the key design constraints was the ability of the file system to support 1 trillion files, yes, you read correctly, 1 trillion. Running on Linux like VSM 1.0, ScoutFS is also what the industry calls a Software-Defined Storage (SDS) solution running on commodity servers, inviting users to select their preferred server brands and to continue to support the environment for a few decades, surviving multiple server lifecycles. The file system is exposed via standard NAS protocols, namely NFS and SMB, to facilitate application integration.

Since LSC, the team has continued to add a top layer that manages the archive and provides policies and all the logic associated with the archiving environment. Here the file system companion is named ScoutAM, for Scout Archive Manager. It delivers multiple functions and services such as multi-copy, WORM capability, the GNU TAR format, and support for cloud (AWS, GCP and Azure) and tape targets in addition to ScoutFS. As an extension of my remark above on parallelism, ScoutAM adds this feature on top of ScoutFS.
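
To make the multi-copy and GNU TAR ideas a bit more concrete, here is a minimal Python sketch. It is not ScoutAM's implementation, which this post does not describe: it simply packs files into a GNU-format tar archive with the standard `tarfile` module and stores identical copies in several target directories. The function name `archive_copies`, the archive name and the use of local directories as stand-ins for tape or cloud targets are all assumptions made for illustration.

```python
import shutil
import tarfile
from pathlib import Path


def archive_copies(files, targets, name="archive.tar"):
    """Pack `files` into one GNU-format tar and store a copy in each target dir.

    Hypothetical helper: local directories stand in for tape or cloud targets.
    """
    targets = [Path(t) for t in targets]
    if not targets:
        raise ValueError("at least one target is required")
    for t in targets:
        t.mkdir(parents=True, exist_ok=True)

    first = targets[0] / name
    # GNU_FORMAT selects the GNU tar variant mentioned in the text.
    with tarfile.open(first, "w", format=tarfile.GNU_FORMAT) as tar:
        for f in files:
            f = Path(f)
            tar.add(f, arcname=f.name)

    copies = [first]
    for t in targets[1:]:
        # Additional copies are byte-for-byte duplicates of the first archive.
        copies.append(Path(shutil.copy2(first, t / name)))
    return copies
```

Writing the archive once and duplicating it keeps every copy identical, which is the property a multi-copy policy relies on; a real archive manager would also verify checksums and track where each copy lives.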

In addition to VSM, Versity provides AFM (Archive Fabric Module), which analyzes external file systems and helps find candidate files for migration to the archiving environment. AFM can also transfer and replicate files to VSM.
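
As a rough illustration of what finding migration candidates can look like, the Python sketch below walks a directory tree and selects regular files that have not been modified for a given number of days. This is not how AFM works internally, which this post does not cover; the function name `find_candidates`, the age-based criterion and the threshold are hypothetical.

```python
import os
import stat
import time


def find_candidates(root, min_age_days, now=None):
    """Yield (path, size) for regular files under `root` older than the threshold.

    Hypothetical selection rule: age is measured from the last modification time.
    """
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            path = os.path.join(dirpath, fname)
            try:
                st = os.stat(path, follow_symlinks=False)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_mtime <= cutoff:
                yield path, st.st_size
```

For example, `sum(size for _, size in find_candidates("/data", 365))` would estimate how many bytes have been untouched for a year under a hypothetical `/data` tree.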

The solution is sold under a subscription model, and you can read the data sheet here.

We expect to learn more at the next SuperComputing show in Dallas in mid-November. It would also make sense to meet the team again with The IT Press Tour for a deep-dive session.
