







In addition to Ceph, Croit more recently added DAOS, the large-scale storage project initiated by Intel, as the team realized how scarce expertise, or even basic knowledge, is in this field.
Both open source products, Ceph and DAOS, can deliver very good storage services and performance levels if configurations are optimized and well mastered. The opposite is also true: being real Lego sets, they can just as easily become a nightmare.
For enterprises and large corporations wishing to build hyperscaler-like storage environments, Ceph, and for high-demanding ones, DAOS, seem to be good choices. And this is a real opportunity, as we saw some other companies building and offering services on top of these, especially Ceph; DAOS is a real differentiator in favor of Croit.
The firm recently received 2 awards, one from Deloitte and a second one, for innovation, from the Top100 organization in Germany.
Croit selected a few key vendors to partner with, like Intel of course for DAOS, but also 2CRSi, Supermicro, Seagate, Western Digital and Fujitsu. The pricing model is based on the number of nodes in the configuration and their respective size.
CunoFS
S3 addresses a scalability challenge and, by design, durability with advanced data protection mechanisms. S3 was, and still is, an answer to the limited scalability of unstructured NAS in the past. Recent developments by a few vendors have demonstrated that the limitations were not only coming from access protocols such as NFS or SMB. With the availability of NVMe, NVMe-oF, persistent memory and novel architecture ideas, these limits are pushed to the horizon. But in the meantime, since S3 was released in 2006 by AWS, plenty of companies have jumped into on-premises or public S3 services, and many applications still need a file interface. We know several answers, like gateways or other approaches from Arcitecta, CTera, Hammerspace, Panzura, Nasuni, Spectra Logic, Tiger Technology or XenData to name a few, that address this aspect. And we met PetaGene, a UK software company dedicated to life sciences, which develops CunoFS, a universal file-to-S3 data service.
This service is radically different from the ones listed above, as it is a client software layer that gives users a local file system experience. This software piece, which could be perceived as intrusive, must be installed on every system that requires access to S3-controlled content. There is no network file system involved here, such as NFS or SMB, which often introduces latency. CunoFS is fully POSIX compliant and, since clients are independent, avoids all inter-node communication, consensus or synchronization.
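The core idea of such a client-side layer can be sketched as a simple bidirectional mapping between POSIX paths and S3 objects. The sketch below is illustrative only; the mount prefix and helper names are assumptions for the example, not CunoFS's actual design.

```python
# Hypothetical sketch of a client-side file-to-S3 layer: POSIX paths
# under a local prefix translate 1:1 into a bucket name plus object
# key, so no gateway or network file system sits in the data path.
# The "/mnt/s3" prefix is an assumed mount point for illustration.

POSIX_PREFIX = "/mnt/s3"

def path_to_object(path: str) -> tuple[str, str]:
    """Split an absolute POSIX path into (bucket, key)."""
    if not path.startswith(POSIX_PREFIX + "/"):
        raise ValueError(f"{path!r} is outside the S3 namespace")
    rest = path[len(POSIX_PREFIX) + 1:]
    bucket, _, key = rest.partition("/")
    if not bucket or not key:
        raise ValueError("expected /mnt/s3/<bucket>/<key...>")
    return bucket, key

def object_to_path(bucket: str, key: str) -> str:
    """Inverse mapping: present an object as a local file path."""
    return f"{POSIX_PREFIX}/{bucket}/{key}"
```

Because the mapping is purely local and stateless, each client can resolve it without talking to any other node, which is consistent with the absence of inter-node consensus described above.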
The team has produced some interesting benchmark results to illustrate throughput advantages against EFS, FSx and other solutions. Playing in the life sciences space, PetaGene positions this product as a universal layer usable in various industries for distinct use cases. Again, applications have the feeling of operating on local data, and we can even imagine such instances serving global points of presence and access. For sure a new experience that shifts market positions a bit.
Insurgo
Tape has been everywhere for decades, and we see strong reasons why it will last for a long time. This media, which some saw disappearing, has received new consideration with the ransomware threat. And users have understood one of the key values of tape: it is a passive media, disconnected from live units. At the same time, users have accumulated various generations of tape, LTO or others, with a recurrent need to upgrade their configurations. And then questions arise: how is it possible to recycle or logically destroy them? The logic could be to reuse them, sell them to external entities, participate in exchange programs, or simply comply with industry procedures and regulations and “logically” destroy them.
Not satisfied by degaussing, shredding, incineration or landfill, Insurgo, founded in 2009 in Wales, initiated developments around a dedicated solution for essentially 2 usages: recycling of tapes, with a 100% guaranteed cleaning process that invites tape owners to safely dispose of tapes or sell them on the secondary market, and destruction, for definitive elimination. These 2 services are KIT and SWAT.
KIT, for “Kill Information on Tape”, exists for LTO and 3592 tapes and is certified by several official organizations. It fully erases the CM chip, doesn't damage the tape or servo tracks, and offers a secured lifecycle. It works today up to LTO8, and support for LTO9, the latest LTO generation available on the market, is under preparation.
SWAT, for “Securely Wipes All Tracks”, was developed to fully destroy the tape, meaning data, CM chip and servo tracks. The tape can't be used any more and can be safely put in a bin, even a public one; there is no chance at all of reading any data from it.
Neither solution physically destroys the cartridge: obviously not the first one, as the goal is to reuse the tapes, but not the second one either.
These 2 processes are very fast: according to Insurgo itself, KIT takes 6.7 minutes for LTO7 and 5.4 minutes for LTO6, and SWAT 4.8 minutes for LTO6.
The company also offers an inventory tool, called Investigo, to scan a tape and its CM chip in just a second without mounting it. A report is generated that helps track and confirm cartridges in any state, even after the SWAT process.
Simplyblock
SDS is a hot topic promoted by established vendors, open source solutions and a few newcomers. Simplyblock belongs to this latter group. Founded in 2022 in Berlin, Germany, by Michael Schmidt and Rob Pankow, the company develops a block SDS as its first release and plans to add file and object as the next 2 interfaces.
The team has in mind the Ceph installed base and its potential complexity, as well as other expensive solutions. They chose to be fully independent on the hardware side and want their solution to run on-premises and in the cloud.
They selected SPDK, like many others, to build their solution and deliver several advanced features such as a high durability level with erasure coding, encryption, clones and unlimited snapshots, compaction/compression, tiering and replication. Obviously the architecture relies on a multi-node model of up to 255 nodes leveraging x86, Linux, NVMe, NVMe-oF and Ethernet networking.
The first performance results are remarkable, with 500,000 IOPS per CPU core and ultra-low latency in the 10-microsecond range.
Within the cluster, data protection is based on a distributed erasure coding technique, and data integrity is guaranteed by a transaction model with variable write sizes that remain atomic across nodes and devices.
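To give a feel for what an N+M erasure coding layout trades off, the small sketch below computes usable capacity and fault tolerance. This is generic erasure coding arithmetic, not simplyblock's published implementation details.

```python
# Generic N+M erasure coding math: N data chunks plus M parity chunks
# give a usable-capacity ratio of N/(N+M) and survive up to M
# simultaneous chunk (node/device) failures.

def ec_profile(n_data: int, m_parity: int, raw_tb: float) -> dict:
    """Return efficiency and fault tolerance of an N+M layout."""
    total = n_data + m_parity
    efficiency = n_data / total  # fraction of raw space usable for data
    return {
        "scheme": f"{n_data}+{m_parity}",
        "efficiency": round(efficiency, 3),
        "usable_tb": round(raw_tb * efficiency, 1),
        "failures_tolerated": m_parity,
    }

# Example: a 4+2 layout over 100TB raw keeps ~66.7TB usable and
# tolerates 2 concurrent failures.
profile = ec_profile(4, 2, 100.0)
```

Compared with plain 3-way replication (33% efficiency, 2 failures tolerated), a 4+2 scheme offers the same fault tolerance at roughly double the usable capacity, which is why distributed EC is the usual choice for durability at scale.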
Currently in Beta, the project is ambitious and should receive a positive reaction from the market.
Pricing is based on effective capacity consumed per month, and the product can be fully tested for 3 months. We understand that evaluators will use that period to stress the solution.
The next features scheduled in the coming months are support for AWS and Kubernetes, followed by GCP and Azure; later, in Q3/2024, file and object should appear if everything works as planned. But these 2 new services are a very different story, not so easy to deliver at scale.
Tuxera
Founded in 2008 in Finland, Tuxera plays in the data management space, especially in the file system segment. The company had a turnover of €25.1 million with 130 employees. 55% of its revenue comes from APAC, 26% from North America and the remaining 19% from Europe. In terms of verticals, 49% of the revenue is generated by the automotive industry, followed by the consumer segment, with telco/networking in third place.
Very well known for quite some time for its Windows/macOS file system portability capabilities, Tuxera addresses 2 market segments, embedded and enterprise, with 2 distinct product lines.
The embedded strategy is covered by the Reliance and Gravity CS suites, with FAT, exFAT, NTFS, APFS and HFS+ file systems, distributed through key OEM partners. It is extended with networking- and flash-oriented products.
The second axis for the firm is the enterprise segment, targeted with NTFS by Tuxera, Enterprise Edition, and Fusion File Share. The first product offers the capability to run NTFS on Linux and perfectly complements the Windows and macOS flavors. This is the result of a very close partnership with Microsoft since 2009, and it finally makes NTFS a universal, horizontal solution. The second product is a serious alternative to Samba, delivering high performance and security levels, up to 60 times faster than Samba. The product implements an active/passive or active/active model in a scale-out mode, with persistent handles, failover, SMB over RDMA, multi-channel and compression; it is optimized for macOS and coupled with Windows-oriented features like Active Directory, quotas and ACLs plus encryption, signing and authentication. Of course it runs on various Linux distributions and supports different Windows client, server and macOS versions, as well as CIFS and SMB up to 3.1.1. Fusion has already been chosen by Croit, StorONE and WEKA to name a few players. Tuxera confirms its key role in the file system space with its ubiquitous presence.
Yuzuy
Qumulo has triggered some interesting partner ecosystems, and Yuzuy belongs to one of these, dedicated to data protection and business continuity. The company, founded in 2022 in Hamburg, Germany, develops an add-on for Qumulo clusters to extend continuous replication and backup. In fact, they play in the space where a cluster needs to be protected by one or several others, as internal data protection, within the cluster, is achieved with erasure coding.
The first service is related to failover and failback between clusters based on Qumulo replication. The firm developed a dedicated GUI to fully control that process.
The second is the backup integration with BareOS, providing a specific plug-in for Qumulo. BareOS, an open source backup software, fully protects Qumulo clusters thanks to this plug-in, which could potentially be used by other backup software with little coding. This Yuzuy development relies on Qumulo's snapshots to drastically reduce the backup window with a very fast file and block selection mechanism. The copy process remains the same, but the time reduction avoids deep, time-consuming tree walks and load on nodes. It also invites users to consider a stricter RPO, as they can run more frequent backup jobs if needed.
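The selection mechanism described above can be sketched as a diff between two snapshot listings instead of a full tree walk. The dict-based listings below are a stand-in for a real snapshot-diff API; the helper name and version counters are assumptions for illustration.

```python
# Hedged sketch of snapshot-diff backup selection: given two snapshot
# listings (path -> change marker such as an mtime or version counter),
# hand the backup engine only the paths that were created or modified,
# avoiding a deep walk of the whole tree.

def changed_paths(prev: dict[str, int], curr: dict[str, int]) -> list[str]:
    """Paths created or modified between two snapshots (deletions ignored)."""
    return sorted(
        path for path, version in curr.items()
        if prev.get(path) != version
    )

# Example: only /b (modified) and /c (new) would be copied.
to_backup = changed_paths(
    {"/a": 1, "/b": 1},
    {"/a": 1, "/b": 2, "/c": 1},
)
```

The cost of selection then scales with the size of the snapshot metadata rather than with the number of directories walked, which is what shrinks the backup window.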
The Yuzuy tool is a VM appliance deployed on-premises or in the cloud, and is aligned with the Qumulo charging model. The team plans to add a multi-factor authentication module, integration with syslog and synchronization of snapshot policies.
With more than 900 Qumulo installations worldwide, the addressable market is big enough for Yuzuy, even if, as a starting company, they focus for the moment on Germany and adjacent European countries.
The young company develops a block SDS in its first release and should add file and object interfaces later.
They target expensive configurations usually covered by hardware solutions and wish to appear as a universal solution for various workloads and use cases, operating on-premises and in the cloud.
The first release exposes a block interface and is based on SPDK, with interesting advanced features such as high durability, encryption, clones and unlimited snapshots, compaction/compression, tiering and replication. The architecture relies on a multi-node approach with up to 255 nodes in the cluster, leveraging standards like x86, Linux and Ethernet networking.
Surfing on the NVMe and NVMe-oF waves, the product delivers up to 500,000 IOPS per CPU core and ultra-low latency in the 10-microsecond range. Data protection is achieved with a distributed erasure coding implementation, and data integrity is guaranteed by a transaction approach with variable write size atomicity: all writes are atomic across nodes and devices.
The company targets Ceph installations, even if Ceph provides a unified model exposing block, file and object; today Simplyblock offers only block, and the product is in beta.
The pricing model is based on effective capacity used per month, and the product can be fully tested for 3 months, perfect for MSPs and early adopters.
The roadmap is ambitious, with AWS and Kubernetes support soon, followed by GCP and Azure, and later in Q3/24 file and object services to offer a unified SDS.
Sitting between end-users and the installed product, the firm offers a variety of services like additional software to master Ceph installation, control and monitoring, but also more classic consulting, professional services and support. And we understand that Ceph and DAOS can deliver very good services and performance levels if they're well configured and deployed. The opposite is also true, and it can turn into a disaster, let's say a nightmare. From an external point of view it could be seen as a cherry on the cake, but Croit recently received an innovator award from the Top100 organization in Germany and, last year, a recognition from Deloitte.
Ceph belongs to the small group of software that invites users to deploy and build hyperscaler-like infrastructure, at least on the storage side. But the Ceph approach is an open source Lego that also represents a business opportunity for other similar players like SoftIron, Ambedded, XSky or PetaSAN to list a few, or in the past BigTera. BigTera was acquired by Silicon Motion in 2017, and Leander Yu founded GRAID Technology in 2020.
In terms of go-to-market, Croit partners with a few key vendors, for instance Intel for DAOS, but also 2CRSi, Supermicro, Seagate, Western Digital and Fujitsu. The service fee is based on the number of nodes in the configuration and their size.
The team is preparing a new core platform, some key developments for DAOS, and SaaS and managed service offerings. They also wish to go beyond storage with data movement coupled with computing, an obvious direction with AI generating tons of data everywhere.
In fact, the product is developed by PetaGene, a UK-based ISV dedicated to life sciences data challenges.
The basic problem has been the same since AWS released the S3 API to access its object storage service: how can you, I mean an application, interact with a storage system exposed through that interface if that application speaks and supports only file semantics? There are tons of such applications where users can't change their I/O behaviors.
I know, we know, plenty of services, from entry-level to enterprise ones, sometimes free, that have done that object-to-file translation for more than a decade, like Arcitecta, CTera, Hammerspace, Panzura, Nasuni, Spectra Logic, Tiger Technology or XenData to name a few, and even AWS announced such capabilities a few weeks ago. Some can be considered gateways, others side engines, but it's a bit different here, as CunoFS is a client software installed on every system that needs to access this data. No gateway concept exists in the architecture, and no network file system is involved either. Everything realized at the file level is done locally on the client, which appears to applications like a local file system. CunoFS of course supports POSIX semantics and, as clients are independent, doesn't suffer from inter-node communication, consensus or synchronization.
Now, having said that, this service layer offers dramatic performance levels that are really different from other implementations. Results speak for themselves, with 56.9Gb/s in read mode and 52.3Gb/s in write mode for copying files vs. EFS, FSx, S3fs and other solutions. The same goes for time to copy, read and write, and for aggregated throughput beating linearity. The CunoFS team lists EBS in their configuration comparisons; it should be listed with a file system coupled to it, as EBS is a block storage service. I'm surprised to read on slide 9 of the presentation that file storage is based on RAID and object storage on erasure coding. Isilon was built on erasure coding from day 1, more than 2 decades ago, and some object storage players that started later offered only replication, adding EC afterwards. This is the case for Caringo, founded in 2005 and now owned by DataCore, Cloudian, and Scality, founded in 2009.
It's important to understand that object storage is a technology and an internal organization with a specific access method, but an object-based access API like S3 doesn't imply an object storage back-end; we find S3 today on top of almost everything.
CunoFS definitely represents an attractive solution for users wishing to get ease of use and performance with full POSIX compliance. Then it's a question of philosophy, as a client software must be installed on every machine and could be considered intrusive and time-consuming. For more details I invite you to check their web site, but also, interestingly, the performance white paper published by Dell available here.
The firm generated in 2022 approximately €25.1 million, with €5.9 million of EBITDA, for 130 employees. This revenue comes from APAC for 55%, North America for 26% and Europe for 19%. In terms of vertical industries, automotive is clearly #1 with 49% of the revenue, followed by consumer and telco & network manufacturers. To support that growth, Tuxera also made 2 acquisitions, Datalight in 2019 and HCC Embedded in 2021, and both immediately contributed significantly to the revenue trajectory.
On the business side, the team addresses 2 segments, the embedded and the enterprise market, with 2 distinct product lines.
The firm is well known for its embedded file system products, largely adopted worldwide and distributed by key OEM partners, today covered by the Tuxera Reliance product family. Tuxera also develops Disk Manager, a famous and recognized tool to manage file systems between macOS and Windows with FAT, exFAT or NTFS but also APFS and HFS+; it is now promoted as part of the Gravity CS product line. And there are a few other products, like networking- or flash-oriented ones. The team will add snapshots, deduplication and encryption, and later a distributed file system should appear as well, plus some additional security features. Embedded here means various use cases and customers, like automotive for built-in DVR storage, energy with smart metering, or video capture for the space industry.
The second group of solutions, targeting enterprises, is represented by Fusion and NTFS by Tuxera, Enterprise Edition. Fusion File Share is a high-performance SMB product already chosen by Croit, StorONE and WEKA to name a few. It is a real alternative to Samba on Linux and delivers high performance and security levels, up to 60x faster than Samba, operating in active/passive or active/active mode with persistent handles and failover in scale-out configurations. Of course, this product supports SMB over RDMA, multi-channel and compression, is optimized for macOS clients, and offers Windows-oriented features like Active Directory, quotas and ACLs plus encryption, signing and authentication. This is why some file storage players have adopted it, as Tuxera is synonymous with quality. Running on various Linux distributions, Fusion works with Windows 11 and below, Windows Server 2022 and macOS, and supports CIFS and SMB 1.0, 2.0, 2.1, 3.0, 3.0.2 and 3.1.1. This solution is definitely a strong SMB offering for enterprises beyond classic Samba, Ryussi or Visuality.
With a close and pretty unique partnership with Microsoft since 2009, the company has developed an NTFS enterprise flavor for Linux that complements native Windows NTFS and macOS capabilities. Finally, Tuxera has made NTFS a universal file system, “extracting” a proprietary approach to the masses. It is used for instance by Orca Security.
Speaking about file system expertise, clearly when a vendor or user is looking for a cross-platform, enterprise or embedded file storage solution, Tuxera should be considered, no doubt about it.
The scale-out NAS vendor is a key player in the domain, having introduced a new generation of product as a natural successor to Isilon, known for a few years as PowerScale in the Dell catalog. Historically the company targeted on-premises deployments and realized more recently that a clear shift towards cloud was needed for file storage. Thus the firm jumped into it to follow the trend, and it appears to be a strong orientation for the company, promoting a real hybrid model with capabilities to bridge on-premises and in-cloud cluster instances. This is associated with the “Scale Anywhere” approach, inviting users to consider Qumulo clusters in various places and finally make them ubiquitous. As of today, it is not a question of making a gigantic geo-cluster but rather of connecting independent, geographically dispersed clusters. The team is working on Nexus to first globally manage all these clusters at the edge, in the cloud and in the data center, and then potentially add data services across them.
On the analyst side, Coldago Research continues to position Qumulo as a leader in high performance file storage and a challenger for enterprise file storage in its Map 2022 for File Storage.
On the company side, Bill Richter and his management team have rationalized the payroll to reach today approximately 350 employees, and it seems to be an efficient ratio, as recent growth touched 60% in the second half of FY2023, putting the company in a cash-positive situation. So far the company has raised almost $350 million in 7 rounds, according to Crunchbase.
With some doubts a few quarters or years ago, especially with the cloud strategy “hesitation” I should say, the business is now strong again, illustrated by the capacity deployed, which is always an interesting metric to check even if the $/TB dropped. The company has now delivered 3EB globally since the first release; the move from 2 to 3EB lasted just 11 months. It took 6.5 years to reach the 1st EB and 1.5 years to go from 1 to 2EB. In other words, going from 2 to 3EB was 7 times faster than going from nothing to 1EB.
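The pace claim above is easy to verify from the figures given:

```python
# Months to deploy each exabyte, from the figures cited above.
months_to_1eb = 6.5 * 12   # 78 months for the first EB
months_1_to_2 = 1.5 * 12   # 18 months from 1 to 2EB
months_2_to_3 = 11         # 11 months from 2 to 3EB

# Speed-up of the latest EB vs the first one: 78 / 11 ≈ 7.1x.
speedup = months_to_1eb / months_2_to_3
```

So the stated "7 times faster" is consistent with the raw numbers (78 months vs 11 months per exabyte).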
Ryan Farris told us that today Qumulo has 900 customers worldwide, covering 20 countries and managing 500 billion files, and that repeat business is strong. Also a good sign.
Wishing to deliver a global unstructured interface to its content, the engineering team has developed its own S3 API to replace the MinIO gateway, end-of-life for a few quarters now. NFS, SMB and S3 can access the same content.
The product offers SMB multi-channel support, NFS 4.1 with Kerberos, and security capabilities based on Varonis. An adaptive data protection mode, aka transcoding, has been introduced to allow dynamic reconfiguration of the erasure coding N+M scheme to “follow” cluster growth. It means that a configuration can add a new M, let's say 2 more parities, for the same N; it is also easy to adapt the N and keep the same M, as many combinations are possible. A 4+2 mode can be changed to 14+2 or 14+4. The company continues and even extends partnerships with companies like Yuzuy, in addition to Atempo Miria.
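To see why such N+M transcoding matters as a cluster grows, compare the parity overhead of the schemes cited above. This is generic erasure coding arithmetic, not Qumulo's internal layout.

```python
# Parity overhead of an N+M erasure coding scheme: the fraction of raw
# capacity consumed by parity is M/(N+M), while M still bounds the
# number of simultaneous failures tolerated.

def overhead(n: int, m: int) -> float:
    """Fraction of raw capacity consumed by parity in an N+M scheme."""
    return m / (n + m)

for n, m in [(4, 2), (14, 2), (14, 4)]:
    print(f"{n}+{m}: {overhead(n, m):.1%} parity overhead, "
          f"survives {m} failures")
# Prints:
# 4+2: 33.3% parity overhead, survives 2 failures
# 14+2: 12.5% parity overhead, survives 2 failures
# 14+4: 22.2% parity overhead, survives 4 failures
```

Widening 4+2 to 14+2 keeps the same fault tolerance while cutting parity overhead from a third to an eighth of raw capacity, which is exactly the incentive to transcode as nodes are added.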
For pricing, the company has made some adjustments to reflect users' needs and demands. The main characteristic is the predictability of the cost for the user and therefore of the revenue for Qumulo. For instance, on Azure with Azure Native Qumulo, a managed service, 1TB costs $85 per month.
Ryan Farris confirmed that even if HPE picked a competitor, VAST Data, for GreenLake, so far they didn't see any impact on their revenue. The market is big enough to push both product lines, especially for different use cases and deployment models.
Qumulo is working on different projects: running on any hardware and anywhere, now a key strategic play for the company operating as a pure SDS, an advanced data reduction technology, Nexus and geo XXX, and the company plans something for November.