Friday, December 22, 2017

Panasas is back

Panasas, a historic leader in high-performance file storage, has started a new era following several years spent redesigning and re-architecting its solution.

In fact, the original motivation behind this period was to go beyond traditional HPC and apply scalable file storage to other market segments. In other words, there are market categories with similar needs where the Panasas solution would be a very good fit.

Recently, with the IT Press Tour crew, we had the privilege of spending a few hours at the Panasas HQ in Sunnyvale. It was a very interesting and interactive session, and the executive team was very transparent with us.

Back to the roots of the company: Panasas was founded in 1999 in Pittsburgh, PA by Garth Gibson, a famous researcher associated with RAID patents. Garth Gibson and his former colleagues had an approach that was later formalized in SCSI T10 as Object-based Storage Devices, or OSDs. For readers just discovering Panasas, the name stands for Pittsburgh Advanced Network Attached Storage Application Software. So far the company has raised $171 million (the last round was in 2013) and has delivered its product to more than 500 customers in 50 countries. Some simple math: 500 customers over 18 years is about 28 new customers per year, or more than 2 per month on average over 216 months. Many players in such markets would dream of these numbers. The mission was, and still is, to deliver a high-performance scale-out NAS solution. The company went through several executive changes over the years, but Faye Pairman (left on the photo below) has now been CEO for about 7 years. A few members of the current team share a background at Adaptec, which was a key storage player many years ago.

Initiated and supported by famous US research labs, the company has developed a pretty unique solution to file storage performance challenges in very demanding IT environments. As already mentioned, this story doesn't end with HPC: the product is also a very good fit for several use cases in M&E, Manufacturing, Life Sciences, Education/University, Government and, of course, Energy. We still don't understand why Gartner decided to remove Panasas from its “bizarre” Magic Quadrant for Distributed File Systems and Object Storage. Read my comments in the article I published on StorageNewsletter almost 2 months ago.

We also have some remarks about the following picture, as Panasas has omitted Primary Data, Rozo Systems, Quantum Xcellis scale-out NAS and WekaIO for the asymmetric distributed parallel file system category, which is very similar to Panasas PanFS, as well as Avere Systems, Elastifile and Qumulo for the “classic” NAS play. To be clear, Panasas sells ActiveStor appliances powered by PanFS.

Back to the product: it’s fundamental to understand what makes a parallel file system different, and especially a design philosophy such as that of PanFS. A consumer of the file system, i.e. a client, can send a file to multiple storage targets at the same time, splitting the content across these units. The time to write and read is thus dramatically reduced. This is very different from sending a file via SMB or NFS, where the entire file goes through a single NAS head. To get this behavior with NFS, you have to consider pNFS, available since NFS v4.1; otherwise, you need a special piece of software embedded in the client machine that understands the interaction between metadata server(s) and data servers and processes I/O operations. A parallel file system can be asymmetric or symmetric, which only relates to how the metadata server role is operated; PanFS uses an asymmetric model. By the way, Panasas was a key contributor to pNFS, a standardized proposal to extend NFS with this asymmetric mode. I invite you to refer to for more details.
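To make the striping idea concrete, here is a minimal sketch in Python. It is not PanFS or DirectFlow code: the function names, the round-robin placement and the `stripe_unit` parameter are assumptions chosen for illustration, and the "storage targets" are plain in-memory lists.

```python
from typing import List


def stripe(data: bytes, num_targets: int, stripe_unit: int) -> List[List[bytes]]:
    """Cut data into fixed-size units and deal them round-robin across targets.

    Hypothetical helper: real parallel file systems choose placement through
    their metadata service, not a simple modulo.
    """
    if num_targets < 1 or stripe_unit < 1:
        raise ValueError("num_targets and stripe_unit must be >= 1")
    targets: List[List[bytes]] = [[] for _ in range(num_targets)]
    for index, offset in enumerate(range(0, len(data), stripe_unit)):
        targets[index % num_targets].append(data[offset:offset + stripe_unit])
    return targets


def reassemble(targets: List[List[bytes]]) -> bytes:
    """Read the units back in the same round-robin order to rebuild the file."""
    out = bytearray()
    depth = max((len(units) for units in targets), default=0)
    for row in range(depth):
        for units in targets:
            if row < len(units):
                out += units[row]
    return bytes(out)


if __name__ == "__main__":
    payload = bytes(range(10))
    layout = stripe(payload, num_targets=3, stripe_unit=3)
    print([len(units) for units in layout])  # number of units held by each target
    print(reassemble(layout) == payload)
```

In a real deployment each list would be a separate storage node and the units would be written concurrently, which is where the time saving comes from.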

To detail the definition of an asymmetric distributed parallel file system, we need to mention that:
  • asymmetric refers to the use of side machines acting as metadata servers (this role can also be added to data servers),
  • distributed means that the file system spans and relies on multiple machines, and
  • parallel, as explained above, refers to the concurrent use of storage targets.
In current market terminology, the metadata servers form the control plane and the data servers the data plane.

In short, one of the benefits lies in the elapsed time of I/O operations. If you need T seconds to write a file to a single server, you will need only approximately T/10 seconds if you send the same file across 10 back-end servers. This clearly makes sense when applications consume large files, as most of the time is spent on data I/O rather than metadata I/O: we very often see 5-10% of metadata operations versus 90-95% of data operations.
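The T/10 rule of thumb can be written as a tiny model. The sketch below assumes, as a simplification not stated by Panasas, that the metadata share of the time is not sped up by adding data servers and that the data share scales perfectly; `parallel_transfer_time` and its parameters are hypothetical names.

```python
def parallel_transfer_time(serial_seconds: float,
                           num_servers: int,
                           metadata_fraction: float = 0.0) -> float:
    """Estimate elapsed time when the data part of an I/O is spread over servers.

    Assumption: the metadata part stays serial, the data part divides evenly.
    """
    if num_servers < 1:
        raise ValueError("num_servers must be >= 1")
    if not 0.0 <= metadata_fraction <= 1.0:
        raise ValueError("metadata_fraction must be between 0 and 1")
    metadata_time = serial_seconds * metadata_fraction
    data_time = serial_seconds * (1.0 - metadata_fraction)
    return metadata_time + data_time / num_servers


if __name__ == "__main__":
    # Ideal case from the text: T seconds becomes T/10 across 10 servers.
    print(parallel_transfer_time(100.0, 10))
    # Same file with 5% of the time spent on metadata operations.
    print(parallel_transfer_time(100.0, 10, metadata_fraction=0.05))
```

With 5% metadata time, a 100-second transfer lands at roughly 14.5 seconds rather than 10, which is why the approach pays off most on large files where data I/O dominates.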

Panasas PanFS supports both modes: parallel access with the DirectFlow agent, which is fully POSIX compliant, and NAS access with the NFS and SMB protocols.

With such performance in critical environments, this kind of platform must provide advanced data protection mechanisms. Panasas offers file-based erasure coding in an N+2 fashion, thus tolerating 2 simultaneous drive failures. RAID 6 and other disk-oriented approaches fail to protect data within a limited rebuild time, especially with large drives and large capacities. For small files and small data volumes, file replication across nodes is still a pretty good method.
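As a rough illustration of what N+2 means for capacity, the sketch below computes the usable fraction of a stripe made of N data units plus 2 parity units. It assumes an ideal code in which any 2 lost units can be rebuilt; the function name and the returned fields are hypothetical and do not describe the actual PanFS layout.

```python
from typing import Dict, Union


def stripe_profile(data_units: int, parity_units: int = 2) -> Dict[str, Union[int, float]]:
    """Summarize an N+parity stripe: size, usable share, failures tolerated.

    Assumption: an ideal erasure code that survives the loss of any
    `parity_units` units in the stripe.
    """
    if data_units < 1 or parity_units < 0:
        raise ValueError("data_units must be >= 1 and parity_units >= 0")
    total_units = data_units + parity_units
    return {
        "total_units": total_units,
        "usable_fraction": data_units / total_units,
        "tolerated_failures": parity_units,
    }


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, stripe_profile(n))
```

Wider stripes keep the same tolerance of 2 failures while spending a smaller share of raw capacity on parity.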

I/O performance and protection improve with scale, as stripes can be wider, reducing elapsed operation time.

For PanFS, the team has made a great effort to simplify management of the platform with an intuitive GUI and console and, of course, a powerful CLI.

Panasas has recently made a few announcements:
  • An even more disaggregated architecture with a 2U director blade (you know, the famous metadata servers) with 4 nodes in the chassis. It is named ActiveStor Director 100 or ASD-100 and is pretty well aligned with metadata-intensive operations. Each ASD-100 node has 8GB of NVDIMM for the transaction logs in addition to 96GB of DDR4 RAM, and a 2x40/4x10 GbE Chelsio NIC.
  • A new storage data blade, the ActiveStor Hybrid 100 aka ASH-100, hybrid this time, with a choice of HDD and SSD sizes.
  • A new DirectFlow software with 15%+ more bandwidth and availability on macOS in addition to Linux.
  • A new SMB stack coming from Samba with a PanFS ACL translation module.
  • An updated foundation with FreeBSD.
A very good meeting that leads us to anticipate more good news from Panasas in 2018.
