Wednesday, November 27, 2024

The IT Press Tour #59 will land soon in Malta

The 59th IT Press Tour will take place in Valletta, Malta, in a few days.


Topics will cover IT infrastructure, cloud, networking, security, data management, big data, analytics, storage and of course AI, as it is everywhere. We'll meet 6 innovative companies:
  • DigiFilm Corporation, an emerging player in long-term data preservation,
  • EasyVirt, a specialist in the efficiency of physical and virtual servers,
  • Indexima, a reference in fast BI and Analytics,
  • Manticore Search, a key actor for information search,
  • ProxySQL, the fast enabler for MySQL and PostgreSQL,
  • and Scalytics, the fast-growing company in AI federation.
I invite you to follow us on Twitter with #ITPT and @ITPressTour, my Twitter handle, @CDP_FST and the journalists' respective handles.

Friday, November 15, 2024

Arcitecta and Wasabi join forces

Arcitecta, a leader in unstructured data management, and Wasabi Technologies, the reference in alternative cloud storage, just announced a partnership. Mediaflux, the Arcitecta product, supports the S3 API, so it's no surprise that Wasabi is now supported as a new member of the storage realm. It provides cloud storage, often remote, that is transparently accessible from any client connected to the global namespace enabled by Mediaflux.


Thursday, November 07, 2024

Congruity360 unveils Classify360 3.1

Identified as a key player in unstructured data management, Congruity360 just announced a new iteration of its data classification platform, Classify360. During the recent IT Press Tour in Boston, Mark Ward, COO, pre-announced and detailed this latest release, with new features for the Insights, Actions and Comply modules to sustain and simplify data management at scale. Among them:

  • Data Normalization for AI fueled by precise classification,
  • Scan performance improvements and insights for Dell PowerScale, NetApp, Microsoft OneDrive and SharePoint on-premises, plus enhancements for redundant, obsolete and trivial data, DSPM and AI governance,
  • New supported data sources with Nasuni, Wasabi, DāSTOR Object, VAST Data, and Oracle Cloud,
  • and updates of prepackaged risk models to support HITRUST and NYDFS plus "bring-your-own data dictionary".


Friday, October 25, 2024

Hydrolix adds Kibana dashboards thanks to Quesma

Hydrolix, a fast-growing player in the streaming data lake landscape, has signed a partnership with Quesma, a Polish software company that develops a translation layer for database services. This middleware operates as a database gateway that stores data within Hydrolix while keeping Kibana and Logstash/Beats in place, which reduces costs. In other words, Hydrolix can replace Elastic, ingest data from Logstash and Beats, and work with Elastic Common Schema. In the same way, OpenSearch users are able to leverage Hydrolix and then connect to OpenSearch Dashboards.

Wednesday, October 16, 2024

Swissbit targets high-end SSDs

Known for its wide product line, Swissbit, the European leader in flash media and SSD, has accelerated its strategy for the enterprise and data center market segments. In 2021, the company acquired the German company Hyperstone, a SATA controller player for SSDs, to confirm its move. But SATA is definitely not enough for this demanding area, with PCIe 4, 5 and soon 6, and NVMe. NVMe, along with its network companion, was a big change for storage infrastructure. It helps fill the gap between the access performance needed and the capability provided by internal devices. For our readers, it's worth mentioning that NVMe provides a series of significant improvements, with up to 64k queues and 64k commands per queue, a big gap with SATA, which has a single queue and 32 commands, and SAS, still with a single queue and 256 commands. Coupled with PCIe, the performance delivered is massive, with examples like 14,000 MB/s for sequential read, 10,000 MB/s for sequential write and 3,200K IOPS in random read from a FADU SSD.
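To put the queue figures above in perspective, here is a minimal Python sketch that multiplies queues by commands per queue for each interface. The numbers are the ones quoted in the text; reading "64k" as 65,536 is an assumption, and the function name is purely illustrative.

```python
# Maximum outstanding commands per interface, from the figures quoted in the text.
# "64k" is interpreted here as 64 * 1024 = 65,536 (an assumption).
INTERFACES = {
    "SATA": {"queues": 1, "commands_per_queue": 32},
    "SAS": {"queues": 1, "commands_per_queue": 256},
    "NVMe": {"queues": 64 * 1024, "commands_per_queue": 64 * 1024},
}


def max_outstanding(name: str) -> int:
    """Theoretical number of commands that can be in flight at once."""
    spec = INTERFACES[name]
    return spec["queues"] * spec["commands_per_queue"]


for name in INTERFACES:
    print(f"{name:5s} {max_outstanding(name):>13,d} commands in flight")
```

This is a theoretical ceiling only; real devices and drivers typically expose far fewer queues than the maximum.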


More recently, with the pressure of AI but also the opportunity it brings, the firm chose to partner with Burlywood Technology, a Colorado-based specialist founded in 2015 with a minimum of $20 million raised, to enter the enterprise and data center SSD segment. The partnership was announced in September 2022. Since then, Burlywood appears to have disappeared, and it seems that Swissbit silently absorbed it. In fact, Swissbit acquired the assets and some employees joined Swissbit, like Tod Earhart, the founder, original CEO and later CTO of Burlywood. The company was shut down after this move and, obviously, its web site is no longer accessible.

We expect NVMe SSDs for data centers and enterprises in the next few quarters, in 2025.

Friday, October 11, 2024

Hydrolix shakes the data lake landscape

Hydrolix, a fast-growing data lake vendor, joined The IT Press Tour for the first time this week in Boston. The company was founded in 2018 by Marty Kagan, CEO, and Hasan Alayli, CTO, and has so far raised $65 million in 4 VC rounds. They both worked in the past at Cedexis, later acquired by Citrix.


The company develops an observability platform able to process real-time logs, leveraging S3 storage coupled with independent ingest and query service layers. It includes real-time ETL, the combination of multiple sources into a single table, and SQL and Spark for ingest. The solution can store PBs of data with very efficient compression techniques, reaching 20:1 and even 50:1 ratios.

The architecture below shows how each element scales independently of the others.


Orchestrated by Kubernetes, it is deployable on-premises but also in the cloud with Azure AKS, AWS EKS, Google GKE and also Akamai LKE following their Linode acquisition. Data connectors accepted so far are Splunk, Spark and Kibana.

In terms of use cases, the solution is positioned for platform and network observability, compliance, SIEM, multi-CDN observability and traffic steering, real user monitoring or ML/AI for anomaly detection...

At Paramount for instance, the numbers are impressive and illustrate pretty well the scalability of the Hydrolix platform. The peak ingestion rate is 10.8 million rows/sec, for a total of 53 billion records collected and 41TiB compressed into 5.76TiB. At peak, across all clients, it delivers 20 million rows/sec for 100 billion log lines.
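As a quick check on the Paramount figures above, the short Python sketch below derives the effective compression ratio and the time the quoted peak rate would need to ingest the full record count. The function names are illustrative, and the units are taken as written in the text.

```python
def compression_ratio(raw_tib: float, compressed_tib: float) -> float:
    """Ratio between the raw volume and what is actually stored."""
    return raw_tib / compressed_tib


def seconds_at_peak(total_rows: float, rows_per_sec: float) -> float:
    """Time needed to ingest total_rows if the peak rate were sustained."""
    return total_rows / rows_per_sec


ratio = compression_ratio(41, 5.76)        # 41TiB stored as 5.76TiB
duration = seconds_at_peak(53e9, 10.8e6)   # 53 billion rows at 10.8 million rows/sec

print(f"compression: {ratio:.1f}:1")
print(f"53 billion rows at peak rate: {duration / 60:.0f} minutes")
```

For this dataset the ratio works out to roughly 7.1:1, and the whole record count represents about 82 minutes of ingestion at sustained peak rate.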

The product is often compared with Snowflake and BigQuery; below is the comparison against the first one.
We anticipate an acceleration of the business in the coming quarters, as the trajectory is already impressive...

Thursday, October 10, 2024

Congruity360, a very comprehensive file management solution

Congruity360, an established data management player, joined the 58th edition of The IT Press Tour this week in Boston. We spoke with Mark Ward, COO, about enterprises' pains and how the company solves and addresses these challenges.


Founded in 2016 close to Boston, MA, the firm has so far raised $25 million in 2 rounds. It also acquired 2 companies, Seven10 Software and NextGen Storage, in 2020 and 2017 respectively. Seven10 was absorbed to improve the data migration services offering with StorFirst, a well-recognized virtual file system that sits on top of file servers, object storage instances and CAS solutions. In 2022, Park Place Technologies purchased the StorFirst software platform from Congruity360.

The Classify360 product targets unstructured data and groups several key functions enterprises must adopt, like storage optimization, cloud migration, data protection, DSPM (Data Security Posture Management), AI enablement and GRC (Governance, Risk & Compliance). It competes against several point solutions but also a few integrated ones. The market is rich in this domain, as the pain has existed for a few decades and has become even more critical with the fast growth of unstructured data volumes over the last 2 decades.

It works in 3 simple, efficient steps. The first, obvious step builds knowledge of the environment through file, folder and content analysis. Then comes a classification phase leveraging supervised machine learning, followed by actions driven by a series of policies to delete, tag, move, secure, deduplicate, encrypt, alert or run other custom operations.
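To make the three steps concrete, here is a deliberately simplified Python sketch of a discover, classify and act pipeline. Nothing here comes from Classify360 itself: the class, the keyword table and the policy table are all hypothetical, and a plain keyword rule stands in for the supervised machine learning model mentioned above.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    """Step 1 output: what discovery knows about a file (hypothetical shape)."""
    path: str
    content: str
    label: str = "unclassified"


# Step 2 stand-in: a keyword rule replaces the supervised ML classifier.
KEYWORDS = {"ssn": "pii", "invoice": "finance"}

# Step 3: policies map a label to one of the actions listed in the text.
POLICIES = {"pii": "encrypt", "finance": "tag", "unclassified": "alert"}


def classify(record: FileRecord) -> FileRecord:
    text = record.content.lower()
    for keyword, label in KEYWORDS.items():
        if keyword in text:
            record.label = label
            break
    return record


def run_pipeline(records):
    """Return (path, label, action) for every discovered file."""
    return [(r.path, r.label, POLICIES[r.label]) for r in map(classify, records)]


discovered = [
    FileRecord("/hr/john.txt", "SSN: 000-00-0000"),
    FileRecord("/sales/q3.txt", "Invoice for Q3 services"),
    FileRecord("/tmp/notes.txt", "meeting notes"),
]
for row in run_pipeline(discovered):
    print(row)
```

The point of the sketch is the separation of concerns: classification only assigns labels, and the policy table alone decides which action follows.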

The product works with several data sources like file servers supporting NFS and SMB but also object storage with S3 and collaboration solutions such as Office365, Google Workspace, Microsoft Exchange, OneDrive & Azure, Box, Slack, NetApp, Dell EMC...


The company plans to announce product iterations to extend data governance based on AI and DSPM. We'll learn more about this very soon now.

In terms of business model, Congruity360 sells only via channel partners.

Wednesday, October 09, 2024

Swissbit, a European champion

Swissbit, the European leader in flash media and SSD, joined yesterday the 58th edition of The IT Press Tour in Boston. We had the opportunity to meet Matthias Poppel, CSMO, Grady Lambert, GM North America, and Chris Colliers, IAM Technologist.

The company, with its headquarters in Bronschhofen, Switzerland, targets data and identities with trusted products to deliver a strong, connected and fully digitized world.

To cover some key milestones for Swissbit, everything started in 2001 via an MBO from Siemens Memory, followed in 2008 by the creation of the industrial memory solution activity, 2013 with security solutions, 2014 and 2019 for SSD and the production unit in Berlin, Germany, 2020 with a key investment from Ardian, and 2021 with the acquisition of Hyperstone for SATA SSD controllers. I would add 2 key events: the partnership with Burlywood Technology, announced in 2022 to enter the data center SSD segment, which finally translated into the acquisition of the Colorado entity.

Today, with 400 employees, 5,000 customers and a production capacity of 3 million units per month, the company's revenue appears to be in the range of multiple 100s of millions of euros.


The product team has chosen a strong list of partners for NAND and SSD controllers with Kioxia, Phison, Micron, FADU, Samsung, Silicon Motion and SK Hynix in addition to their Hyperstone and Burlywood assets.

In terms of market segments, Swissbit promotes reliable data storage, data protection and digital identity and secure access to 8 key sectors: industrial automation, enterprise and networking/communications, edge computing, transportation, critical infrastructure (medical, financial, utilities), defense, industrial PC and public sector and governmental agencies.

For data storage, the product line is designed to address industrial, networking and enterprise data storage.

One aspect, strategic for all enterprises as they become more and more distributed, is their capability to process data where it is generated, i.e. at the edge. This is the case with some OEMs like Lenovo, whose ThinkEdge line embeds the Swissbit 1TB N3000 SSD, or Nvidia, whose BlueField-3 DPU uses the Swissbit 128GB EM-30 e.MMC. The same approach applies to automotive fleets. This aspect is even more sensitive with AI present everywhere, as it requires fast and reliable storage instances.


Wednesday, October 02, 2024

Boston, The IT Press Tour #58 is coming

The 58th IT Press Tour will land in a few days in Boston, MA.

Topics will be about IT infrastructure, cloud, networking, security, data management and storage with 8 innovative companies, among them:
  • Congruity360, a leader in unstructured data management,
  • HYCU, a reference in SaaS backup,
  • Hydrolix, a key player in log-intensive data platforms,
  • iRODS, the open source key solution for data management,
  • Swissbit, the European leader in flash media storage,
  • and Wasabi Technologies, the pioneer of hot cloud storage.
I invite you to follow us on Twitter with #ITPT and @ITPressTour, my Twitter handle, @CDP_FST and the journalists' respective handles.

Tuesday, September 17, 2024

MooseFS, a confidential pioneer that deserves a try

Distributed file systems have been a hot topic for many years, with many initiatives both commercial and open source. In the open source world, there are lots of projects, but MooseFS has a special identity: at 20 years old, it is one of the pioneers in the domain. We can also list Gluster, Lustre, Ceph, SaunaFS, RozoFS, BeeGFS, OrangeFS or XtreemFS, and some commercial offerings from Weka, Quobyte or Panasas, to name a few.

Many of these have their roots in the famous Google File System paper published in 2003. The philosophy relies on 3 elements: a backing store made of a series of data servers, called chunk servers here; a directory engine for data placement, locking and so on, controlled by central servers, called metadata servers here, one of them being the leader coupled with followers; and finally the client layer, which represents the access layer where the file system is exposed. Chunk servers run Linux, their local disks are formatted with classic disk file systems such as xfs, ext2 or zfs, and each chunk is a file within a tree structure, with its name associated with the chunk reference. Clients can run Linux, macOS or Windows, supporting various flavors of FUSE, and receive a software agent that exposes POSIX semantics and establishes communication with metadata and data servers. On Windows for instance, MooseFS leverages Dokany as a FUSE wrapper. As these machines access data directly and operate as standard machines, they usually run applications.

This is a real architecture. For MooseFS, everything started in 2005 when Gemius initiated an internal file system project. Some clusters deployed at the time continue to run and deliver services today without interruption. Being fully hardware agnostic, MooseFS is a perfect example of Software-Defined Storage.


The other important point to consider is that MooseFS is not a NAS, even if clients can expose NFS and SMB via the Ganesha and Samba extensions respectively, and even S3 with MinIO, to stay in the full open source spirit. The product is also able to expose a block interface, which I have to say surprised me, even if I understand the team's desire to address a wide variety of needs. For sizing information, MooseFS supports clusters of up to 16EB and 2 billion files.


The team had 4 main goals that are well illustrated by core features in the product:

  1. Scalability, with capacity delivered by multiplying servers,
  2. Performance by adding and processing I/O in parallel between clients and chunk servers,
  3. Reliability by utilizing replication then erasure coding,
  4. and TCO via the support of any commodity hardware.
To give details on how data is accessed from the client machine, it's important to understand that below 64MB, a client sends data to only one server, and above that level, data is chunked and distributed to different data servers. All this operates in parallel, so we can qualify MooseFS as a parallel file system in addition to being a distributed one. In other words, a distributed file system can be parallel or not, but a parallel file system is for sure distributed.
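The 64MB threshold described above can be sketched in a few lines of Python. This is only an illustration: the round-robin placement and the server names are assumptions (in MooseFS the metadata server decides placement), and "64MB" is read here as 64 MiB.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # "64MB" from the text, read as 64 MiB (assumption)


def plan_chunks(file_size, servers, chunk_size=CHUNK_SIZE):
    """Return (chunk_index, offset, length, server) tuples for a file.

    Placement is a naive round-robin, used purely for illustration.
    """
    if not servers:
        raise ValueError("at least one chunk server is required")
    plan = []
    offset = 0
    index = 0
    while offset < file_size:
        length = min(chunk_size, file_size - offset)
        plan.append((index, offset, length, servers[index % len(servers)]))
        offset += length
        index += 1
    return plan


MIB = 1024 * 1024
servers = ["cs1", "cs2", "cs3"]  # hypothetical chunk server names

print(plan_chunks(10 * MIB, servers))   # small file: a single chunk on one server
for chunk in plan_chunks(200 * MIB, servers):
    print(chunk)                        # large file: chunks spread over servers
```

A file at or below the threshold lands on a single server, while a larger one is cut into 64 MiB pieces that different servers can receive in parallel.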

For erasure coding based on Reed-Solomon, the mechanism is controlled by chunk servers and works in the background. First, data is written to chunk servers as fast as possible without any protection. These servers then trigger replication across servers to provide minimal protection, and later they initiate the erasure coding phase, with the split of data, the calculation of parities, and the redistribution of the data, with all placement information sent to the metadata server for future access by clients. The stripe unit size seems to be 256kB.
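To illustrate the split, parity calculation and rebuild idea, here is a minimal Python sketch using 8 data fragments and 1 parity fragment. It is not Reed-Solomon and not MooseFS code: a simple XOR parity is swapped in because it is enough to show how one lost fragment can be rebuilt, and all function names are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int = 8) -> List[bytes]:
    """Split data into k equal fragments (zero-padded) plus one XOR parity."""
    frag_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(frag_len * k, b"\0")
    frags = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    return frags + [reduce(xor_bytes, frags)]


def recover(frags: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing fragment by XOR-ing all the survivors."""
    frags = list(frags)  # work on a copy
    missing = [i for i, f in enumerate(frags) if f is None]
    if len(missing) > 1:
        raise ValueError("single parity tolerates only one lost fragment")
    if missing:
        survivors = [f for f in frags if f is not None]
        frags[missing[0]] = reduce(xor_bytes, survivors)
    return frags


data = b"MooseFS spreads chunks across chunk servers"
stripe = encode(data)

damaged = list(stripe)
damaged[3] = None  # simulate the loss of one chunk server
rebuilt = recover(damaged)

print(b"".join(rebuilt[:8])[:len(data)] == data)
print(f"raw capacity used: {len(stripe) / 8:.3f}x the data size")
```

With 8 data fragments and 1 parity, the raw capacity consumed is 9/8 of the data size, compared with 2x for keeping two full copies, which is the usual motivation for moving from replication to erasure coding in the background.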

Two editions exist for the product: a community edition, with everything available on GitHub, fully open source and free of charge, and a pro edition, sold based on raw capacity, which offers some unique enterprise features like advanced tiering or snapshots. The cluster is managed via a CLI or a web GUI and also presents an API.

The company behind MooseFS is based in Warsaw, Poland, and is privately held and profitable. Its revenue comes from support and from pro licenses, which are sold as lifetime licenses; no subscription exists so far. Many users started with the community edition and then expanded to the pro one. In terms of use cases or vertical industries, the team is very open and doesn't really target specific domains, as it promotes a universal approach and relies more on partners to "verticalize" the offering.

At the recent meeting during The IT Press Tour in Istanbul, Turkey, the team launched the Community Edition 4.0, several years after the pro. This version shares 97% of the code of the pro version and offers manual failover, erasure coding with an 8+1 model, limited but good enough for many configurations, and tiering.

The MooseFS team will be at SuperComputing in Atlanta mid-November, a perfect place to continue the conversation, discover the solution and start to evaluate it.