Thursday, September 13, 2018

Versity unveils new file system

I was a big fan of QFS and SAM-QFS, and I was surprised that Sun's acquisition of LSC in 2001 ultimately did not deliver on its promises. The fact that the team that developed LSC's product, especially Harriet Coverston, started a new chapter with Versity Software was a good sign that the story had not ended and that an evolution was still possible.

I wrote an early post in June 2013 about Versity and its new Storage Manager; sorry, that post was in French. A few months later, still in 2013, Cray and Versity announced (sorry, still in French) a reseller agreement just before the SC show in Denver, and the supercomputer leader has since distributed VSM under the name Tiered Adaptive Storage, or TAS. Surprisingly, TAS is no longer visible on the Cray web site unless you search for it.

They started to make noise at the MSST 2016 conference (now in English), and we met Versity Software in June 2016 during the 19th edition of The IT Press Tour, where Bruce Gilpin told us that his team was developing a new file system. For many months I have watched Versity closely to get some information about that development, and I can say that the team is ready to announce something very soon, even if the initial timeframe was a bit too ambitious. I wrote 2 posts (June and July 2016) about VSM and Versity's ideas for a new scale-out file system dedicated to large-scale archive.

Versity and its founding team have always had the ambition to build the most scalable data archiving software to address current and coming data management challenges. VSM, covered in the posts mentioned above, is a proven and well-adopted product that suffers from an old design (remember, it was under development 20 years ago), but it worked and still works very well. With the data deluge we have been living through for more than a decade, something new must be offered that is aligned with these new and coming challenges. The good things are kept, but many new things are introduced with VSM 2.0, which marks a new era in that domain.

VSM 2.0 has 2 parts, ScoutFS and ScoutAM, plus a module named AFM. ScoutFS belongs to the group of hyper-scale file systems that recognize that metadata and data must be segregated. The Scout name derives from the Scale-Out philosophy. ScoutFS is the persistent layer, implemented in the Linux kernel; it respects POSIX like VSM 1.0 and is developed under the open source GPLv2 license.

First, it is a disk file system, and we can even say a block-based file system. Second, it is a shared file system, and as it relies on shared disks, it is what the industry calls a cluster file system. To avoid bottlenecks and service degradation, the developers chose a model without a single or dedicated metadata server and prefer to use a subset of all nodes as metadata servers, spreading the load to maintain response time and quality of service. So it operates more like a symmetric implementation. And finally, as the file system is only seen by ScoutFS nodes and not outside the cluster, it belongs to the internal file system category. In other words, client machines don't mount the file system natively but need an extra layer of service and translation to access the archiving platform. This is done via NFS or SMB, the two main industry-standard file sharing protocols.

Some parallelism exists inside the cluster to migrate files to the archive storage targets. One of the key design constraints was the ability of the file system to support 1 trillion files; yes, you read correctly, 1 trillion. Running on Linux like VSM 1.0, ScoutFS is also what the industry calls a Software-Defined Storage (or SDS) solution: it runs on commodity servers, which invites users to select their preferred server brands and to keep the environment supported for a few decades, surviving multiple server lifecycles.

Since LSC, the team has continued to attach a service layer that manages the archive and provides policies and all the logic associated with the archiving environment. Here the file system companion is named ScoutAM, for Scout Archive Manager. It delivers multiple functions and services such as multi-copy, WORM capability, the GNU TAR format, and support for cloud - AWS, GCP and Azure - and tape in addition to ScoutFS.
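To give a feel for what multi-copy combined with the GNU TAR format can mean in practice, here is a minimal Python sketch. It is not ScoutAM code and says nothing about how Versity implements these features; the function names `archive_copies` and `read_back` and the directory layout are hypothetical, chosen only for illustration. It wraps one file in a GNU-format tar archive, writes that archive to several target directories, and returns a checksum per copy so the copies can be compared.

```python
import hashlib
import io
import tarfile
from pathlib import Path
from typing import List


def archive_copies(name: str, payload: bytes, targets: List[Path]) -> List[str]:
    """Write `payload` as a one-member GNU tar archive into each target directory.

    Hypothetical helper, not a ScoutAM API. Returns the SHA-256 digest of
    every archive copy so the copies can be checked against each other.
    """
    digests = []
    for target in targets:
        target.mkdir(parents=True, exist_ok=True)
        archive_path = target / (name + ".tar")
        with tarfile.open(archive_path, "w", format=tarfile.GNU_FORMAT) as tar:
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            info.mtime = 0  # fixed timestamp keeps all copies byte-identical
            tar.addfile(info, io.BytesIO(payload))
        digests.append(hashlib.sha256(archive_path.read_bytes()).hexdigest())
    return digests


def read_back(archive_path: Path, name: str) -> bytes:
    """Read one member back from an archive copy with the standard tar reader."""
    with tarfile.open(archive_path, "r") as tar:
        member = tar.extractfile(name)
        return member.read()
```

The interest of an open format like GNU tar in an archive is that any copy can be read back with standard tools, independently of the software that wrote it; the sketch uses Python's own `tarfile` reader for that.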

In addition to VSM, Versity provides AFM (Archive Fabric Module), which analyzes external file systems and helps find files that are candidates for migration to the Versity archiving platform. AFM can also transfer and replicate files to VSM.

The solution is sold under a subscription model, and you can read the data sheet here.

We expect to learn more during the next SuperComputing show in Dallas in mid-November. It would also make sense to meet the team again with The IT Press Tour for a deep-dive session.
