June 30, 2020

Sandia Labs uses SoftIron

SoftIron, the promoter of Ceph-based Software-Defined Storage, announced that Sandia National Laboratories has adopted HyperDrive.

The Vanguard program selected Ceph to offer file and object storage for its ARM-based Stria HPC cluster. This cluster supports the Astra supercomputer as a development system for preparing software releases and codes to be used on the petascale Astra system.

SoftIron designs, develops and builds HyperDrive as a storage appliance tailored for demanding environments, with optimized components at a very efficient cost.

With HyperDrive, SoftIron contributes to making Ceph a serious and strong solution beyond the hype of open source and its wide availability and adoption. It helps Ceph support critical environments with a robust design.

Sandia Labs' choice is a true validation of SoftIron's decision to embed Ceph in an appliance form factor with an ARM-based operating environment. Being open source helps adoption but also creates a lot of frustration, as the hardware is not always aligned with real usage and thus often falls short of performance and availability needs.

SoftIron is definitely a player to watch in 2020.

June 29, 2020

Pavilion signs with Sony Innovation Studios

Pavilion Data Systems, the emerging leader in high-performance unified storage, continues its market penetration as adoption seems to accelerate. This confirms the maturity of the product and its capabilities, with an interesting feature set.

This increase in visibility, seen for several months, is the result of the arrival of Amy Love, CMO, who changed Pavilion's image, building a real marketing practice and establishing a real connection with press and analysts. Clearly Amy has had a real impact, turning an engineering-driven company into a more business-oriented one.

When you read the Pavilion product spec sheet, you understand why users with highly demanding storage needs select HFA. In 4U, so in a super dense chassis, the I/O throughput it delivers is super high, with 120GB/s and 20M IOPS at less than 100 microseconds of latency in block mode.

HFA was chosen by Sony Innovation Studios, a division of Sony Pictures Entertainment, to be the storage foundation of its volumetric capture based on Atom View software. These 3-D virtual and mixed environments put the storage layer under heavy stress and only a few vendors can absorb such workloads.

Sony has selected an integrated solution from Alliance Integrated Technology and Pixit Media coupled with HFAs. The capabilities of HFA, its parallel design, internal fast switching with 20 controllers and end-to-end NVMe, here with RoCE links, met the performance requirements for GPU-based rendering.

This solution uses IBM Spectrum Scale to offer parallel file access to the datastore, leveraging the parallelism of multiple HFAs in the rendering farm.

Great case studies, more should come soon.

June 26, 2020

Qumulo launches Shift

Qumulo, among the 4 file storage blitzscalers*, recently unveiled a very interesting function that delivers real value to AWS users and, of course, now invites others.

Launched during the recent IT Press Tour, Qumulo Shift provides a way to copy file data to AWS S3. Data comes from on-prem or cloud Qumulo cluster instances. The idea is simple and very efficient, and the Qumulo team wishes to leverage the many AWS applications available.

The image below summarizes the layered model of the architecture and the place of the various data services and core functionalities.

Started on-prem with flash in mind, Qumulo has been extended to the cloud with AWS and more recently GCP, and we expect Azure soon. It is now a real hybrid cloud file storage offering.

With Shift, the master or reference site is still the Qumulo cluster, but it is now coupled with AWS S3, with the capability to feed AWS applications with file data. It offers a way to offload data processing to a large spectrum of apps. One limitation comes from AWS S3, with its object size limit of 5TB, but that doesn't seem to be a blocking point for Qumulo.

"The functionality doesn't address the potential data divergence between clusters and AWS S3, the idea is really to offload processing to AWS" confirms Molly Presley, head of global product marketing at Qumulo.

Ben Gitenstein, Vice President of Products, confirmed during the session that the average file size on Qumulo clusters is between 3 and 4TB. We don't yet know whether the service is unavailable for files larger than 5TB, as applications would need to handle that case with multiple objects.
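
To make the multi-object case concrete, here is a minimal Python sketch of how a file above the S3 per-object limit could be mapped to several objects. This is purely illustrative and is not how Qumulo Shift is known to behave: the function name `plan_objects` is hypothetical, and the limit is assumed to be 5 TiB.

```python
# Assumption: the S3 per-object size cap, taken here as 5 TiB.
S3_MAX_OBJECT_BYTES = 5 * 1024**4


def plan_objects(file_size, max_object=S3_MAX_OBJECT_BYTES):
    """Return (offset, length) byte ranges, one per object, covering the file.

    Hypothetical helper for illustration; not part of any Qumulo or AWS API.
    """
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    if file_size == 0:
        return [(0, 0)]  # an empty file still maps to one empty object
    count = -(-file_size // max_object)  # integer ceiling division
    return [
        (i * max_object, min(max_object, file_size - i * max_object))
        for i in range(count)
    ]


if __name__ == "__main__":
    TIB = 1024**4
    for size in (4 * TIB, 12 * TIB):
        ranges = plan_objects(size)
        print(f"{size // TIB} TiB file -> {len(ranges)} object(s)")
```

A file in the 3 to 4TB range fits in a single object, while anything above the limit requires the consuming application to know how the pieces are named and reassembled, which is exactly the open question here.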

This Shift service nicely complements what Qumulo did with MinIO, embedding its gateway flavor to provide a unified and ubiquitous access method to data. Locally and in the cloud, Qumulo can expose NAS and S3 interfaces on the same data, and now pure S3 from AWS. VAST Data offers the same data access from NFS, SMB and S3 as Qumulo, but Pure Storage can't do that.

This new service will be available in a few days, completely free, through a simple software upgrade, but of course limited to users under a maintenance contract.

*The term blitzscaler refers to the Coldago Research Map 2019 for File Storage, available here, and of course to the book "Blitzscaling" by Reid Hoffman.

June 22, 2020

Where is ClearSky Data? Evaporated

While waiting for the Nebulon announcement, I checked a few vendors in the cloud-connected on-prem storage ecosystem and discovered that ClearSky Data has disappeared.

The web site is not responding and we even discovered that Helen Rubin, co-founder and CEO, has been head of Storage Gateway Services at AWS since January 2020. Funny, it seems that AWS has again made a silent acquisition, like KMesh a few months ago. Some other employees left the company, even though we found that Lazarus Vekiarides, CTO and co-founder, is still listed at ClearSky Data on his LinkedIn page. The ClearSky Data Twitter account has disappeared as well, as has the company's LinkedIn profile.

June 18, 2020

Alcestor is dead

Alcestor, the initiative from Tuxera to promote and sell MooseFS, is dead. The domain is also not responding. I wrote a short post when I met Mikko Välimäki, the co-founder of Tuxera and initiator of the Alcestor project to address enterprise storage needs. But it crashed...

June 17, 2020

A strong robust Ceph appliance from SoftIron

SoftIron, a fast-growing player in Ceph-based storage, delivered a very good session a few days ago during the current digital IT Press Tour.

The company is a UK-based enterprise designing, developing and building a Ceph-based storage appliance. This system, named HyperDrive, uses ARM CPUs for task-specific needs with low power consumption in mind.

Phil Straw, CEO of SoftIron, insisted on the next phase of IT infrastructure, with several fundamental drivers: data is considered the most precious asset, while processing becomes disposable and by nature ephemeral.

Hybrid and edge put pressure on data center design, with the need to move data to where it will be used. Again this illustrates the data gravity effect: the bigger the data volume, the harder it is to process centrally. And of course power consumption and energy optimization become key at scale.

Four key challenges are defined and addressed by SoftIron, let me summarize them:
  1. As an on-prem vendor, SoftIron preaches an on-prem or hybrid model, but the reality is not a vendor-centric view; it will be pushed by users. We understand that the vendors' goal is to limit on-prem erosion. Personally I don't know and I'm not sure hybrid is better. What are the criteria to judge this?
  2. The second question is about open source software (OSS) vs. proprietary, and for sure OSS drives DC and cloud [r]evolution. SoftIron claimed that OSS plus specific hardware can deliver better value than proprietary approaches.
  3. Hardware has a key role to play even with SDS. Good point, it confirms a use case model.
  4. Security starts with the sourcing of chosen components, validating here the choice by SoftIron to manufacture its own appliance.
The key message here is that SoftIron designs systems and does not just assemble components as many server vendors do, much like car manufacturers.

Three products are offered:
  • HyperDrive is a Ceph-based appliance coupled with a management tool named HyperDrive Storage Manager and HyperDrive Router as a gateway for specific access needs,
  • HyperSwitch is a wire-speed network switch purpose built for hyperscale and enterprise storage based on SONiC,
  • HyperCast is a high density concurrent transcoding device based on FFmpeg.
The design shows 14 drives in 1U attached to each SoC and connected via SATA 3. It confirms the tier 2 role of Ceph. NVMe is on the roadmap and should arrive soon, helping to boost I/O performance for the system. The cooling of the server is paramount as it contributes to the TCO; imagine servers up and running for 5 years... The team insisted on the sourcing of the components chosen for their appliance, which is not the case for other server vendors.

The company has developed 6 models, from full HDD with 120TB to hybrid with 168TB to full SSD with 112TB. In addition to these nodes, you add one or multiple Routers to expose various access protocols such as iSCSI, SMB, NFS, Custom and CephFS, and the Storage Manager to control the Ceph environment.

An independent firm, StackHPC, measured performance and compared the results with a reference architecture. To summarize, HyperDrive delivers 817MB/s in non-cached sequential writes, which is 26% better than the reference configuration, and 3300MB/s in cached random reads, which is 44% better. The test was pretty complete and we invite the reader to check the entire report available here.
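
For readers who want a feel for the reference configuration, the two percentages can be turned back into approximate baseline figures. The Python sketch below assumes that "X% better" means measured = reference x (1 + X/100); the resulting values are derived here for illustration and are not quoted from the StackHPC report. The function name `implied_baseline` is hypothetical.

```python
def implied_baseline(measured_mb_s, gain_pct):
    """Reference throughput implied by a measured value and a relative gain.

    Assumes measured = baseline * (1 + gain_pct / 100).
    """
    return measured_mb_s / (1 + gain_pct / 100)


# Figures from the text: 817MB/s (+26%) and 3300MB/s (+44%).
write_ref = implied_baseline(817, 26)
read_ref = implied_baseline(3300, 44)

if __name__ == "__main__":
    print(f"implied write baseline: {write_ref:.0f} MB/s")
    print(f"implied read baseline:  {read_ref:.0f} MB/s")
```

Under that assumption the reference configuration would sit at roughly 650MB/s for writes and 2300MB/s for reads; the report itself remains the authoritative source.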

SoftIron takes a unique approach in storage, coupling Ceph with specific, optimized hardware that it designs and builds to deliver a ready-to-use hyperscale appliance. Adoption has started and seems to accelerate...

June 16, 2020

WekaIO recruits its CRO

WekaIO, an emerging leader in file storage for high performance needs, has just recruited Ken Grohe as its CRO. Ken just spent almost a year at Stellus, the mysterious storage company.

We don't know what it means for Andrew Perry, the current VP Sales, as Richard Dyke, the previous VP Sales, has moved to a sales advisor role. On average, a VP Sales lasts 18 months from what we see...

June 15, 2020

Pavilion has lots of ambitions

Pavilion Data Systems, an emerging leader in high-performance unified storage, delivered a super session during the recent digital IT Press Tour.

Started historically as a full-flash block storage array with impressive characteristics, the Pavilion array, today named Hyperparallel Flash Array (HFA), shows really high performance for its density. It's a real challenge; the mission is clear and ambitious: offer DAS performance with the benefits of a shared device.

To position itself, Pavilion promotes HFA as the leader of a 3rd wave of storage, the first wave being HDD and the second Flash/SSD. This timeline is very device-centric, and by device I mean the disk itself. I regret that connectivity and architecture are not considered, as FC/SAN was a key milestone for the industry, especially since HFA accepts NVMe SSDs but above all supports NVMe-oF. As a very rough shortcut, we can think of NVMe-oF as the transport for NVMe commands, like FC did/does for SCSI commands.

The HFA is a 4U chassis with 20 controllers - 10 dual controller cards - exposing iSCSI, FC and NVMe-oF (TCP and RDMA, i.e. IB and RoCE v2) but also NFSv3 & v4 and S3 with an embedded MinIO engine. This is what the industry named Unified Storage, and I have used the acronym BFO, for Block, File and Object, several times to qualify the offering.

Internally the system holds 72 NVMe U.2 SSDs organized in 4 groups of 18 drives. Drives and controllers are connected via a redundant 6.4Tb/s PCIe switch and each drive is dual-ported to facilitate failover in case of controller failures. A controller is built with neither FPGAs nor ASICs, as it is essentially an optimized and enriched Linux running on Xeon CPUs with a bunch of DRAM, offering 2 network ports; thus a dual controller board delivers 4 x 100GbE/IB ports. The array implements a cacheless model with a shared PCIe memory for metadata that allows each controller to immediately serve consistent data to hosts.

The total capacity of a full HFA reaches 1.1PB raw, built from 72 x 16TB (exactly 15.36TB) SSDs, but the usable figure is lower as each group of 18 drives is internally protected by a RAID 6 implementation. The parity overhead is therefore 12.5% relative to data capacity (2 parity drives for 16 data drives, i.e. 18/16 = 1.125), and if a spare is enabled per group it becomes 13.3% (2 parity drives for 15 data drives), so pretty similar. In that case a group is divided into 15 data drives + 2 parity drives + 1 spare. Overall, usable capacity drops to about 920TB with strong protection, i.e. dual parity plus spare. In terms of storage space efficiency, before any data processing on top, HFA delivers 83.3% of raw capacity as usable storage.
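
The capacity arithmetic above can be checked with a short Python sketch. It only uses figures from the text (72 drives of 15.36TB, 4 groups of 18, RAID 6 with 2 parity drives per group, an optional spare per group); the function and field names are hypothetical and chosen for illustration.

```python
DRIVES = 72            # NVMe SSDs in a full chassis
GROUPS = 4             # groups of 18 drives each
DRIVE_TB = 15.36       # exact capacity of a "16TB" SSD
PARITY_PER_GROUP = 2   # RAID 6 dual parity


def capacity(spares_per_group):
    """Raw/usable capacity and overhead ratios for a given spare policy."""
    per_group = DRIVES // GROUPS
    data = per_group - PARITY_PER_GROUP - spares_per_group
    raw_tb = DRIVES * DRIVE_TB
    usable_tb = GROUPS * data * DRIVE_TB
    return {
        "data_drives_per_group": data,
        "raw_tb": raw_tb,
        "usable_tb": usable_tb,
        "usable_fraction": usable_tb / raw_tb,
        "parity_over_data": PARITY_PER_GROUP / data,
    }


if __name__ == "__main__":
    for spares in (0, 1):
        c = capacity(spares)
        print(
            f"spares={spares}: {c['data_drives_per_group']} data drives/group, "
            f"{c['usable_tb']:.1f}TB usable of {c['raw_tb']:.1f}TB raw "
            f"({c['usable_fraction']:.1%}), "
            f"parity/data {c['parity_over_data']:.1%}"
        )
```

With one spare per group this gives 15 data drives per group, about 921.6TB usable out of 1105.92TB raw (83.3%) and a parity-to-data ratio of 13.3%; without spares the ratio is 12.5%, matching the figures quoted.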

In terms of performance, the array is pretty impressive with 120GB/s, 20M IOPS and 100 microseconds of latency. The following slide perfectly illustrates the capability of the product.

In terms of data services beyond access methods, Pavilion OS provides:
  • Thin Provisioning
  • RAID 6 and snapshots
  • Encryption-at-rest
  • and Multipathing
About access methods exposed by the HFA, as said, 3 different ones are available but a controller is able to expose only one. A protocol namespace belongs to only one controller today, being associated with one disk group, so up to 18 SSDs. Later this year, namespaces will be able to span controllers, but still with the namespace limit per controller. And we expect that namespaces will also span chassis in the future. As namespaces are completely separated, it's not possible today to expose a namespace via multiple protocols, especially between file-based protocols like NFS and object-based ones like S3. Also, SMB is not identified as a need by prospects and users. This confirms that HFA targets specific use cases where performance is a key requirement.

Data reduction with compression and asynchronous replication across chassis are on the roadmap, to make the array highly resilient to complete array downtime, site failures or simple inaccessibility.

On the business side, Pavilion is 100% channel and leverages key partners to sell its product. The company is a Silver Business Partner with IBM and has field relationships with HPE and Dell.

Pavilion is already deployed in one of the largest NVMe-oF configurations at TACC, the Texas Advanced Computing Center, where multiple HFAs are coupled with IBM Spectrum Scale in a shared array model. The product has already been adopted by several highly demanding environments where data access requires a high SLA.

A competitor with similar ideas, Vexata, hit a wall a few quarters ago due to overly long sales cycles that finally killed them. With Vexata close to bankruptcy, StorCentric was able to take over the company and the engineering team at a very attractive price.

It seems that the Pavilion team understands market fluctuations and impacts pretty well. Definitely a storage player to keep on the radar.

June 12, 2020

SoftIron recruits a strategic leader

SoftIron, a pioneer in Ceph-based storage appliances, just recruited Andrew Moloney as VP Strategy to lead the go-to-market effort.

As the company continues to build and offer new solutions - HyperDrive, HyperCast and HyperSwitch - SoftIron needs to take a new direction and accelerate market visibility, presence and adoption. Moloney comes from his own consulting firm and, before that, RSA Security and 3Com.

June 11, 2020

Data Dynamics confirms its data management momentum

Data Dynamics, a leader in unstructured data management, recently unveiled a major new milestone in its enterprise strategy.

Data Dynamics has entered into a new era for the company but also for the market as they set the pace now with a rich solution.

At the company level, the team is growing with new key people like Helen Johnson as CTO and Brijesh Kumar as VP of software development. There are new offices as well in Pune, India, and London, plus an expansion in Houston, the historical site even during the NuView days. We don't know exactly what triggered this almost sudden wave of hiring and development, as we haven't seen - yet - any money injection.

The first reaction to this is that the company wishes to go beyond a product approach, as it is known for StorageX, and StorageX has already had a long life under different companies. Companies changed but the product persists, a good sign perhaps.

The second piece of information here lies in the company's desire to build product families coupled under a platform model. This platform, named the Unified Unstructured Data Management platform, is defined by key words: the term Unified is central here, meaning an aggregation of various functions centered around file management but also covering the object storage aspect. Of course the company is about unstructured data, nothing new here, but the term platform is essential as it invites users to see a consolidation of usages and a central role in data management. It's even more critical today as enterprises live in a heterogeneous world with multiple models and brands, file servers and NAS running Windows, Linux or "exotic" OSes, deployed at local and remote offices and also connected to cloud entities. Having a broad solution that can cover this inherent heterogeneity under one file management umbrella is attractive.

The market is very segmented as there are several point products in different categories of file management, and consolidating functions into one platform is key for IT operations efficiency and investment protection.

Multiplying point product functions for every file management aspect creates complexity, introduces divergence, potential integration issues and compatibility questions, and of course increases costs. The firm understood some time ago that one of the key differentiators across these solutions should come from the content and how you make decisions from the information you discover in files. For sure, discovering some financial words in a file should invite the engine to increase file redundancy, add security for access, classify and tag that file, and potentially promote it to an encryption workflow... The reverse is also true: "basic" files have to be processed with regular policies, and this alignment has to be transparent and homogeneous across the enterprise. For this reason, and more globally for data governance needs, Data Dynamics made a clever acquisition with Infintus at the end of 2019 to feed this strategy, and today the integration bears real fruit with Insight Analytics 1.1.

Current enterprise-class file management products embed what was named many years ago SRM (Storage Resource Management), used when you wished to understand user access patterns but also to find duplicates and potentially create links to them, reclaiming space and thus delaying new purchases.

StorageX 8.2 brings new capabilities, and one of them is the file-to-object storage copy feature, to local or remote targets, on private or public cloud, which finally extends sharing. The term used, "Transform and Sync", can confuse the reader as there is no transformation of the file; it is just a copy that then gives access to the content via an object protocol, although there are certainly some actions on metadata.

Supported platforms are wide-ranging: any NAS exposing NFS or CIFS, Amazon S3 or EFS, NetApp StorageGRID or Cloud Volumes, Azure Blob or NetApp Files (NFS and CIFS again = NetApp Cloud Volumes), GCP Object Storage and IBM Cloud Object Storage. Other brands like Cloudian, MinIO, Pure Storage FlashBlade, Hitachi Vantara, Dell EMC ECS, PowerScale and Isilon should be supported as long as NFS, SMB or S3 is exposed.

Data Dynamics is expanding its relationship with Lenovo, works with Dell EMC in the field and of course continues its partnership with NetApp. It seems to be talking with IBM, as illustrated by the support of IBM COS listed above, so the coverage is pretty large now. Azure should become a key cloud partner as Data Dynamics already supports several Azure storage services.

We recently wrote an article about the NAS migration challenge and our last comment was "The NAS migration battle is on and there will be in 2020 some new partnerships and developments as this is a common need and often a nightmare for admins.". This Data Dynamics announcement confirms our prediction.