January 14, 2020

Spectra confirms its software strategy with StorCycle

Spectra Logic, a leader in secondary storage, has developed a new data tiering solution dedicated to unstructured (file) data. This category used to be called HSM, for Hierarchical Storage Management, more recently ILM, for Information Lifecycle Management, and even DLM, for Data Lifecycle Management.

Nothing is new in the concept: the idea is to reduce primary storage costs by migrating inactive (non-accessed) data to cheaper storage devices. Common criteria for identifying inactive data are file size and age. Doing so requires an intelligent engine: a file system scanner or crawler to find the right candidate files, a file mover to copy them to secondary units, and a mechanism that links each file's new residence to its original location to maintain seamless access. You get the idea, and the approach has been known for a long time through products such as the HSM mentioned above.
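
To make the scanner step concrete, here is a minimal sketch in Python of how a crawler could select candidates by age and size. It is only an illustration of the general idea, not StorCycle's implementation: the function name `find_candidates` and its parameters are hypothetical, and it assumes that the last access time reported by the file system is meaningful (some mounts disable or relax access time updates).

```python
import os
import stat
import time
from pathlib import Path


def find_candidates(root, min_age_days, min_size_bytes, now=None):
    """Yield regular files under root that look inactive.

    A file qualifies when it is at least min_size_bytes large and its
    last access time is at least min_age_days old. Both thresholds are
    hypothetical policy knobs chosen for this illustration.
    """
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400  # seconds per day
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                st = os.lstat(path)  # do not follow symbolic links
            except OSError:
                continue  # file vanished or is unreadable: skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore links, sockets, devices, ...
            if st.st_size >= min_size_bytes and st.st_atime <= cutoff:
                yield path
```

Calling `find_candidates("/data", min_age_days=365, min_size_bytes=1024 * 1024)` would list files of at least 1 MiB that have not been accessed for a year; a real product would add a catalog, exclusions and throttling on top of such a scan.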

HSM masks secondary storage levels from applications, so data on the second tier must be copied or cached back to the primary storage unit before it can be accessed, which introduces some latency for migrated files. More recently, new tiering methods have appeared that move files between storage tiers but allow access directly from their current residence.

Instead of partnering with a software vendor, Spectra Logic has decided to develop its own migration solution.

The other strategy behind this software is to solidify Spectra's secondary storage footprint, with a clear wish to span all kinds of devices. Spectra promotes a two-tier model: the first tier is the primary tier for production data, and the second, named the perpetual tier, comprises various devices and entities such as tape libraries, NAS, object storage and even cloud storage, essentially exposed through an S3 API.

Spectra built StorCycle as a sophisticated, modern HSM product in the sense defined above: applications neither see nor touch the secondary storage, and data is streamed from the secondary tier when needed.

This model generates savings on primary storage backup and reduces the cost of the primary tier, since the volume stored there is optimized.

The product supports encryption, end-to-end checksums, multiple copies and offsite replication. With multiple storage entities on tier 2, the other key element is the ability to pick the right device for migrated data so that future accesses get a good enough access time. Although the image below depicts it that way, StorCycle is not in-band; the diagram simply makes the logical view of the configuration easier to follow.

StorCycle offers two migration modes: Auto-Migrate and Project-Migrate. Auto-Migrate is the simple method, with a file selection policy based on user-defined age and size. Project-Migrate moves a directory of files through a manual operation and is designed for completed projects.

When files are migrated, StorCycle offers 4 techniques:

  1. Like other HSM products, StorCycle replaces each migrated file with a specific file that keeps the original filename and adds an html extension; this is called an HTML link. To access the file, users have to open this new file in a browser and use the copy-back feature. This works pretty well, but it is a manual and very slow process, and it seems that it can't be batched.
  2. Migrated files are replaced by a symbolic link to the same file on secondary storage. This technique requires that if NFS is used on primary, it is also used on secondary, with a similar file tree to simplify tree navigation; the same holds for SMB. This method is not yet available, but it provides completely transparent file access from the original location, avoiding the copy back to the original residence. Komprise, for instance, works that way.
  3. Migrated files are not replaced, meaning that the user must search for the file in the StorCycle catalog.
  4. Finally, a copy is made on secondary storage, in which case there is no migration at all.
Migrated files are kept in their original format, except for Tar and Zip packed files, and can be accessed directly from their new location without StorCycle.
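
The symbolic link technique (number 2 above) can be sketched as follows. This is a generic illustration under stated assumptions, not StorCycle code: `migrate_with_symlink` and `sha256_of` are hypothetical helpers, both tiers are assumed to be mounted as local POSIX paths, and the secondary tree mirrors the primary one, as the text requires. The checksum comparison stands in for the end-to-end verification a real mover would perform before giving up the primary copy.

```python
import hashlib
import os
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_with_symlink(src, primary_root, secondary_root):
    """Copy src to the mirrored path on secondary, then replace it by a link.

    Hypothetical helper: the relative path under primary_root is reused
    under secondary_root so that both trees keep the same layout.
    """
    src = Path(src)
    rel = src.relative_to(primary_root)
    dst = Path(secondary_root).resolve() / rel
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst)  # copies data and basic metadata

    # Verify the copy before touching the original file.
    if sha256_of(src) != sha256_of(dst):
        dst.unlink()
        raise OSError(f"checksum mismatch while migrating {src}")

    # Create the link under a temporary name, then rename it over the
    # original so the path never disappears from the primary tree.
    tmp = src.with_name(src.name + ".migrating")
    os.symlink(dst, tmp)
    os.replace(tmp, src)
    return dst
```

After the call, opening the original path transparently reads the file from its new residence, which is the behavior described for this technique; there is no copy back to primary.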

By reducing file proliferation and presence on primary storage, this data management approach avoids or delays the purchase of new primary storage units or extensions to existing ones.

Four license levels exist, described in this table:

For TCO-sensitive and large-scale environments, StorCycle is a very interesting approach that confirms Spectra's software strategy and its ambition to control all data on secondary storage at scale.
