Tuesday, January 07, 2025

Recap of the 59th IT Press Tour in Valletta, Malta

Initially posted on StorageNewsletter 20/12/2024

This 59th edition of The IT Press Tour took place in Valletta, Malta, in early December, and the press group and the participating organizations spent time exchanging views on IT infrastructure, cloud, networking, security, data management and storage, analytics and big data, and of course AI, present across all these topics. Six companies were met: DigiFilm, EasyVirt, Indexima, Manticore Search, ProxySQL and Scalytics.


DigiFilm

Founded in 2013, the French company is on a mission to develop long-term data preservation media. Despite its small size, the team is highly skilled and led by Rip Hampton O’Neil, technical lead, Antoine Simkine, innovation, and Pierre Ollivier, CEO.

During the session, Antoine Simkine presented DigiFilm’s innovative technology, Archiflix, a Write Once, Read Many (WORM) media designed with a ‘Store and Forget’ philosophy. The solution relies on a strict media selection, film reels known for their exceptional longevity and robustness, plus standardized readers ensuring compatibility with commercial scanners to avoid reliance on proprietary devices. Unlike tape or other storage media, Archiflix eliminates the need for migrations or technology refreshes and is designed to endure multiple decades without degradation. Several existing long-term storage solutions compromise data integrity during retrieval, leading to transformations that result in data loss or altered metadata. Archiflix addresses this by ensuring the retrieved data is an exact replica of the original, including its metadata.

The company has developed the Pixa Code format to encode, store, and decode source data. This open standard allows for efficient retrieval: a simple scan followed by a decoding process produces a perfect copy of the original file. Each reel includes information about the decoding algorithm, scanner characteristics, and blueprints, making the system self-descriptive and ensuring straightforward data access. Additionally, Archiflix incorporates a dedicated Vault to securely store reels.
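
To make the self-descriptive reel idea concrete, here is a minimal, purely illustrative Python sketch of how a file could be split into frames that carry their own decoding metadata. This is not the Pixa Code specification; the frame layout, field names, frame size and checksum choice are assumptions made only to illustrate the encode/scan/decode round trip described above.

```python
import hashlib

FRAME_PAYLOAD_BYTES = 1024  # hypothetical per-frame capacity, not the real Pixa value

def encode_to_frames(data: bytes, metadata: dict) -> list[dict]:
    """Split a file into self-describing 'frames'.

    Each frame embeds its own decoding recipe (format version, index,
    total frame count, payload checksum), so a future reader only needs
    the frames themselves to rebuild the original file and its metadata.
    """
    chunks = [data[i:i + FRAME_PAYLOAD_BYTES] for i in range(0, len(data), FRAME_PAYLOAD_BYTES)]
    frames = []
    for index, chunk in enumerate(chunks):
        frames.append({
            "format": "illustrative-pixa-like/1.0",   # assumption, not the real identifier
            "index": index,
            "total": len(chunks),
            "source_metadata": metadata,              # original file metadata kept verbatim
            "checksum": hashlib.sha256(chunk).hexdigest(),
            "payload": chunk.hex(),                   # stand-in for the printed pattern on film
        })
    return frames

def decode_frames(frames: list[dict]) -> bytes:
    """Rebuild the original byte stream and verify every frame's checksum."""
    out = bytearray()
    for frame in sorted(frames, key=lambda f: f["index"]):
        chunk = bytes.fromhex(frame["payload"])
        assert hashlib.sha256(chunk).hexdigest() == frame["checksum"], "frame corrupted"
        out.extend(chunk)
    return bytes(out)

if __name__ == "__main__":
    original = b"Archive me for a century" * 100
    reel = encode_to_frames(original, {"filename": "contract.pdf", "created": "2024-12-01"})
    print(len(reel), "frames, round-trip identical:", decode_frames(reel) == original)
```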

DigiFilm is currently raising a funding round to further develop Archiflix and is actively seeking partners to accelerate growth. The company does not intend to create proprietary hardware scanners or machines, focusing instead on the Pixa format as the foundation for its business model.

While speed is not a priority for Archiflix due to its focus on archiving, the solution demonstrates solid performance: 5,000–10,000 documents archived in 5 minutes, 200GB stored on a 600m reel, approximately 10 minutes required to process one reel.

DigiFilm competes with Norway-based Piql, which has marketed microfilm storage for several years. Other players in this space include Cerabyte, BioMemory, DNA storage pioneers, and Folio Photonics, which is still in development. Hyperscalers have also begun exploring similar technologies, underscoring the demand for reliable, long-term storage solutions.

The storage industry is in clear need of innovative approaches to address long-term data preservation. DigiFilm’s Archiflix offers a compelling solution with its focus on longevity, data integrity, and sustainability. As the project evolves, its progress will be closely watched as the industry continues to iterate on these critical directions.


EasyVirt
A French IT software vendor founded in 2011, EasyVirt hosted a session that provided an excellent opportunity to explore its solutions and discuss pressing issues like energy consumption, which has become increasingly critical with the rapid rise of AI usage. Meanwhile, Europe’s push for stricter regulations adds complexity, though some view it as a potential disadvantage compared to competitors like China and India.

The company offers 3 core solutions – DC Scope, DC NetScope, and CO2 Scope – developed by a small yet highly skilled team. In 2024, EasyVirt achieved revenue of approximately €1 million. Known for their expertise in IT infrastructure virtualization, FinOps, CloudOps, and Green IT, EasyVirt serves over 100 customers across diverse industries. Their clientele includes prominent names like Safran, Fleury Michon, La Poste, Pole Emploi, CNES, MAIF, and Amundi.

EasyVirt engages with organizations seeking to assess the environmental impact of their digital services but often unsure where to begin. This task is complex, requiring precise energy measurement methods and algorithmic accuracy without disrupting existing services.

As listed above, the firm promotes DC Scope, which targets virtualized environments, whether on-premises, in the cloud, or hybrid, offering insights to optimize resource use; DC NetScope, with a special focus on analyzing network traffic for efficiency improvements; and CO2 Scope, specialized in measuring IT-related carbon emissions.

These solutions evolve continuously, incorporating feedback from users. Case studies have demonstrated tangible benefits, including reductions in vCPUs, RAM, VMs, and hypervisors, as well as the resizing of production environments. EasyVirt’s solutions adopt an agent-less architecture, ensuring security and control by avoiding SaaS models and keeping operations entirely local.
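
To give a flavor of what agent-less collection against a hypervisor can look like, the sketch below reads basic per-VM CPU and memory figures from a vCenter endpoint with the open-source pyVmomi SDK. This is not EasyVirt's code or sizing algorithm; the host, credentials and the 20% right-sizing threshold are placeholders, and a real tool would rely on far finer-grained, historical metrics.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

def list_vm_usage(host: str, user: str, password: str) -> None:
    """Agent-less inventory pass: query vCenter for per-VM quick stats."""
    ctx = ssl._create_unverified_context()  # lab only; verify certificates in production
    si = SmartConnect(host=host, user=user, pwd=password, sslContext=ctx)
    try:
        content = si.RetrieveContent()
        view = content.viewManager.CreateContainerView(
            content.rootFolder, [vim.VirtualMachine], True)
        for vm in view.view:
            stats = vm.summary.quickStats
            cfg = vm.summary.config
            # Flag VMs whose observed memory demand is far below their allocation
            # (purely illustrative 20% threshold).
            oversized = cfg.memorySizeMB and stats.guestMemoryUsage < 0.2 * cfg.memorySizeMB
            print(f"{cfg.name}: {stats.overallCpuUsage} MHz, "
                  f"{stats.guestMemoryUsage}/{cfg.memorySizeMB} MB"
                  f"{'  <- candidate for right-sizing' if oversized else ''}")
    finally:
        Disconnect(si)

if __name__ == "__main__":
    list_vm_usage("vcenter.example.local", "readonly@vsphere.local", "secret")
```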

Initially, VMware was the primary hypervisor supported, but the roadmap includes Proxmox, Nutanix AHV, Kubernetes (via Red Hat OpenShift and VMware Tanzu), as well as GPU-focused enhancements to address their significant energy consumption. Looking further ahead, EasyVirt plans to integrate XCP-ng.

EasyVirt markets its solutions through a network of partners, including resellers, integrators, MSPs, and consulting firms such as Capgemini, CGI, and even Dell. They offer a 30-day trial to promote a ‘try and buy‘ model. Pricing options include perpetual licenses and subscription-based models.

As the vendor continues to align its solutions with evolving user demands, their focused approach and innovative offerings position them well for success in the dynamic IT landscape.


Indexima
Indexima was founded in 2016 by Nicolas Korchia, CEO and co-founder, to enhance business intelligence (BI), and we had the chance to get deeper insights into the company and its product.

The concept for Indexima stemmed from a critical performance issue identified in 2015 at Mappy, a French geolocation service. Real-time navigation often resulted in significant latency whenever users made screen changes, leading to a frustrating user experience. In 2017, the team secured €1.3 million in funding, launching the project on Hadoop to address these challenges. By 2020, Indexima had expanded its platform to support additional databases and introduced a SaaS model, driving broader adoption and securing 15 customers that year.

The COVID-19 pandemic, however, presented challenges across industries, including for Indexima. This period became a pivotal moment for the company, prompting a strategic pivot. Recognizing the rapid adoption of Snowflake’s cloud data warehouse, Indexima identified it as a prime platform for growth. Over the same period, Hadoop’s relevance diminished as users transitioned to more modern open-source and commercial solutions. This evolution marked the advent of ‘Indexima 2.0,’ which now focuses on Snowflake and will soon include support for Databricks.

Indexima has developed deep expertise in SQL query optimization, particularly for complex queries. Its approach aims to deliver a seamless user experience through intuitive graphical dashboards or interfaces. At the core of its technology lies the Pre-Aggregation model, which uses keys and aggregates to optimize performance.

In its earlier iterations, Indexima and its data were deployed on-premises, requiring data copies that carried the risk of divergence. With the introduction of version 2.0, operations are conducted in-place. For Snowflake, this means functioning directly within the Cloud Data Warehouse (CDW) environment, ensuring all data remains accessible to users. Once the Indexima engine is deployed, it continuously collects SQL queries, analyzes request patterns using machine learning (ML) and AI, and learns the schema. The engine then optimizes performance by generating dynamic aggregation tables, dramatically reducing query delays – for instance, in a demo using NYC Citibike data, average query times dropped from 20 seconds to under 1 second.
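
The pre-aggregation idea can be illustrated with a small, generic sketch: build an aggregate table keyed on the columns users group by, then serve dashboard queries from that much smaller table. This only shows the general technique; Indexima's actual engine, its ML-driven choice of keys and aggregates, and its table names are not public in this form, and SQLite stands in here for the data warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trips (day TEXT, station TEXT, duration_s INTEGER);
    INSERT INTO trips VALUES
        ('2024-12-01', 'Central Park', 600),
        ('2024-12-01', 'Central Park', 480),
        ('2024-12-01', 'Times Square', 720),
        ('2024-12-02', 'Central Park', 300);
""")

-- = Pre-aggregation: keys = (day, station), aggregates = count and sum =
# A real engine would pick these from observed query patterns; here they are hard-coded.
conn.execute("""
    CREATE TABLE agg_trips AS
    SELECT day, station, COUNT(*) AS trip_count, SUM(duration_s) AS total_duration
    FROM trips GROUP BY day, station
""")

def answer(day: str):
    """Serve a dashboard-style query from the small aggregate instead of the raw fact table."""
    return conn.execute(
        "SELECT station, trip_count, total_duration FROM agg_trips "
        "WHERE day = ? ORDER BY station",
        (day,),
    ).fetchall()

print(answer("2024-12-01"))
# [('Central Park', 2, 1080), ('Times Square', 1, 720)]
```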

The benefits are immediate: there is no need for additional development work—users simply redirect to the Indexima URL instead of the Snowflake one. This approach also reduces costs, as smaller Snowflake instances suffice, while performance sees significant boosts, often achieving a 100:1 improvement ratio.

Indexima has a substantial opportunity ahead, with clear pathways to expand its market presence by collaborating with data warehouse and aggregation solution providers. The company is actively seeking new partners to accelerate this growth. Looking forward, their next target is Databricks, promising exciting times ahead for the Indexima team.


Manticore Search
Developed as an open-source, high-performance search engine project, Manticore Search addresses the classic challenge of efficient search for enterprises. The initiative was started by Sergey Nikolaev, CEO and co-founder.

The story began with the Sphinx project in the early 2000s, which ceased development in 2016–17. Recognizing the need for a new direction, the team decided to build upon Sphinx’s foundations while addressing emerging challenges and integrating modern technologies, all within an open-source framework.

Manticore Search was founded by Sergey Nikolaev, Peter Zaitsev, former CEO of Percona, and Mindaugas Zukas, COO of Altinity. Together, they lead a team of over 10 experienced developers.

The project’s mission is to deliver a simple yet scalable search engine in an open-source format, optimized for standard, cost-effective hardware. Manticore targets general search and log analytics, emphasizing faster query performance, efficient resource usage, and robust support for SQL and JSON. Beyond full-text search, the platform includes faceted, boolean, fuzzy, geo, and vector search capabilities, creating a comprehensive solution for modern search needs.
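
Manticore accepts SQL over the MySQL protocol (default port 9306), so a standard MySQL client library is enough to try full-text search. The sketch below assumes a local Manticore instance with default settings; the table and field names are invented for the example, and the SQL shown is a minimal illustration rather than a complete schema.

```python
import pymysql

# Manticore exposes a MySQL-compatible SQL endpoint (default port 9306, no auth by default).
conn = pymysql.connect(host="127.0.0.1", port=9306, user="", password="")
cur = conn.cursor()

# A real-time table with full-text fields and a filterable integer attribute.
cur.execute("CREATE TABLE IF NOT EXISTS articles(title TEXT, body TEXT, year INT)")
cur.execute("INSERT INTO articles(title, body, year) VALUES "
            "('Malta recap', 'press tour in Valletta about storage and AI', 2024)")

# Full-text MATCH() combined with an attribute filter.
cur.execute("SELECT id, title FROM articles WHERE MATCH('valletta storage') AND year = 2024")
for row in cur.fetchall():
    print(row)

conn.close()
```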

The live demonstrations showcased during the event were both impressive and transparent. Despite being relatively under the radar, Manticore Search already serves prominent clients, including Craigslist, Rozetka, Socialgist, Statista, Europrcs, Hotelplan, PubChem, and Huispedia.

While there are other competitors in the market, such as Elasticsearch, Manticore differentiates itself in several key areas, as summarized in the table provided during the presentation.

Looking ahead, the team plans to introduce auto-sharding, authentication, integration with Kibana, and auto-embedding capabilities for vector search. They are also exploring AI-driven enhancements to further elevate the platform’s capabilities.

The complete source code for Manticore Search is publicly available on GitHub, reinforcing the project’s commitment to open-source development.

ProxySQL
ProxySQL is about solving performance problems for MySQL databases, and we got the opportunity to delve into the technology that accelerates SQL queries with Jesmar Cannao, the company’s COO and co-founder.

The ProxySQL project was initiated in 2013, with the company formally established in 2014 to support developers and drive user adoption. ProxySQL was conceptualized by Rene Cannao, CEO and co-founder, a highly regarded database administrator (DBA) and MySQL expert. His vision was to create a solution that improves performance by positioning a centralized query traffic manager in the network between clients and MySQL servers. This approach optimizes query routing and overall database performance.

ProxySQL, built on an open-source foundation like MySQL itself, addresses several challenges faced by MySQL environments: network scalability, with support for up to 1 million concurrent client connections consolidated into optimized backend queries; multi-region and multi-cloud support, enabling operations across diverse architectures with thousands of backend servers; sharding and query routing, directing requests to the appropriate shard or replica in sharded databases; and additional functionality providing failover detection, request routing, security enhancements, clustering integration, and query caching.
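
As an illustration of the query-routing piece, ProxySQL is configured at runtime through its MySQL-protocol admin interface (port 6032 by default), where backend servers and query rules live in regular tables. The sketch below sets up a minimal split that routes reads to a replica hostgroup while writes keep going to the primary; hostnames and credentials are placeholders, and production deployments would add monitoring users, weights and more precise rules.

```python
import pymysql

# Connect to the ProxySQL admin interface (not the MySQL data port, which is 6033 by default).
admin = pymysql.connect(host="127.0.0.1", port=6032,
                        user="admin", password="admin", autocommit=True)
cur = admin.cursor()

# Declare one writer and one reader backend in two hostgroups.
cur.execute("INSERT INTO mysql_servers (hostgroup_id, hostname, port) "
            "VALUES (10, 'mysql-primary.example', 3306)")
cur.execute("INSERT INTO mysql_servers (hostgroup_id, hostname, port) "
            "VALUES (20, 'mysql-replica.example', 3306)")

# Route SELECTs to the reader hostgroup, except SELECT ... FOR UPDATE which stays on the writer.
cur.execute("INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply) "
            "VALUES (1, 1, '^SELECT.*FOR UPDATE', 10, 1)")
cur.execute("INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply) "
            "VALUES (2, 1, '^SELECT', 20, 1)")

# Push the new configuration to the running proxy and persist it.
for stmt in ("LOAD MYSQL SERVERS TO RUNTIME", "SAVE MYSQL SERVERS TO DISK",
             "LOAD MYSQL QUERY RULES TO RUNTIME", "SAVE MYSQL QUERY RULES TO DISK"):
    cur.execute(stmt)

admin.close()
```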

The product is compatible with various MySQL setups, including replication, Amazon Aurora, and Galera, as well as other databases like MariaDB and Percona Server.

The solution is trusted by 40 clients, including notable names in the finance and e-commerce sectors, as well as local Maltese companies in the gaming and betting industries. Clients benefit from improved performance without the need to scale MySQL servers or clusters, making ProxySQL a cost-effective solution.

The company’s distribution strategy includes partnerships with hyperscalers, MySQL integrators, and monitoring tool providers such as Grafana and Prometheus. OEM partnerships are also under exploration. ProxySQL is available in 2 versions: an open-source community edition and an enterprise edition with subscription-based licensing that provides premium features, updates, 24/7 priority support, and additional training services. On average, clients spend $45,000 annually on the enterprise solution.

Building on ProxySQL’s success and strong reputation, the team has begun developing a similar solution for PostgreSQL. While the company continues to grow, increasing visibility and market presence will be crucial to their long-term success.

ProxySQL exemplifies a practical, efficient approach to managing MySQL database performance, making it an appealing choice for enterprises seeking to optimize their database operations.

Scalytics
Scalytics, formerly known as DataBloom AI and founded in 2022, joined this Malta edition, which was the opportunity to meet its CEO and founder, Alexander Alten-Lorenz.

The project was initiated by a group of industry experts, many of whom are active members of the Apache Software Foundation. Currently, the company operates with a team of 6 employees and 20–25 associated students. Scalytics is self-funded but anticipates raising additional capital in the future to accelerate its growth.

The idea for Scalytics emerged from the challenges posed by the proliferation of independent AI and ML engines, which often lack integration. This disconnection leads to suboptimal results, infrastructure inefficiencies, and skyrocketing costs.

The company’s flagship product, Scalytics Connect, aims to bridge the gap between diverse AI engines and data processing platforms, including Snowflake, Databricks, and Confluent. The latest version, 1.2, was launched recently. Scalytics addresses the complexity of managing and moving massive data volumes, advocating for a more efficient approach to enable enterprise-wide AI adoption. Data access, as well as seamless integration, remains a key focus for Scalytics.

Version 1.2 introduces several significant enhancements: federated ML, enabling models to be trained across multiple platforms without the need to migrate or duplicate data, with native integration for Apache Spark, TensorFlow, and JDBC-compliant solutions; traceability and auditability, providing granular logging and visibility into training processes; enhanced performance, with a new runtime designed to simplify and accelerate development and integration; and broader compatibility, expanding platform support to advance Scalytics’ vision of universal AI integration.
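
To make the federated ML idea tangible, here is a generic federated-averaging sketch: each data platform trains on its local data, and only model weights, never rows, travel to the coordinator. This illustrates the general principle only, not Scalytics Connect's actual API or the Apache Wayang programming model; the two "platforms" and the linear model are invented for the example.

```python
import numpy as np

def local_train(X: np.ndarray, y: np.ndarray, epochs: int = 50, lr: float = 0.1) -> np.ndarray:
    """Train a tiny linear model on data that never leaves its platform."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_average(weight_sets, sample_counts):
    """FedAvg: combine per-site weights, weighted by local sample counts."""
    total = sum(sample_counts)
    return sum(w * (n / total) for w, n in zip(weight_sets, sample_counts))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_w = np.array([2.0, -1.0])
    # Two "platforms" (e.g. a warehouse and a lakehouse) each holding private data.
    sites = []
    for n in (200, 800):
        X = rng.normal(size=(n, 2))
        y = X @ true_w + rng.normal(scale=0.1, size=n)
        sites.append((X, y))

    local_weights = [local_train(X, y) for X, y in sites]
    global_w = federated_average(local_weights, [len(y) for _, y in sites])
    print("global model:", np.round(global_w, 2))   # close to [2.0, -1.0]
```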

A performance benchmark conducted with a 3TB dataset demonstrated a 150x reduction in processing time, highlighting the potential efficiency gains from Scalytics Connect.

Scalytics follows an open-source philosophy, rooted in its culture and community-driven approach. The company is also the initiator of the Apache Wayang project, which entered incubation in 2022. Scalytics leverages partnerships with organizations like NTT Data, Google Cloud, and ESA to expand its market reach. Recently, the company launched a partner program targeting MSPs, CSPs, and ISVs to further drive adoption.

Scalytics operates on a subscription-based model, with revenue gradually gaining traction. Though still modest, the company’s growth signals a promising trajectory.

As Scalytics continues to develop its solutions and grow its presence in the market, there is much more to anticipate from this innovative player in the AI and data integration space.


Tuesday, December 10, 2024

Coldago unveils its Map 2024 for File Storage

Coldago Research has published its 6th Map dedicated to File Storage and again it shows 3 categories: Enterprise, High Performance and Cloud File Storage.

This new report studied 27 distinct companies with the following details:

  • Enterprise File Storage with 11 players: DDN, Dell, Huawei, IBM, iXsystems, Microsoft, NetApp, Pure Storage, Qumulo, SUSE and VAST Data,
  • High Performance File Storage with 15 players: DDN, Dell, Fujitsu, Hammerspace, HPE, Huawei, IBM, Pure Storage, Quantum, Qumulo, Quobyte, ThinkParQ, VAST Data, Vdura and WEKA,
  • Cloud File Storage with 11 players: AWS, CTERA, Egnyte, Hammerspace, LucidLink, Microsoft, Nasuni, NetApp, Panzura, Peer Software and Tiger Technology.
And for leaders, Coldago selected:
  • EFS: Dell, Huawei, IBM, Microsoft, NetApp, Pure Storage and VAST Data,
  • HPFS: DDN, Dell, Huawei, IBM, Pure Storage, Quantum, Qumulo, VAST Data and WEKA,
  • CFS: CTERA, Hammerspace, Nasuni and Panzura.
I invite you to check the Map page; the 3 images are shown below.



Friday, December 06, 2024

Indexima improves Snowflake user experience

Indexima, launched in 2016 to boost BI, joined The IT Press Tour this week in Valletta, Malta. It was the right time to meet one of the founders, Nicolas Korchia, CEO, to learn more about the company and the product.

The idea came from a real performance problem discovered at Mappy, the French geo-positioning service, around 2015. Real-time navigation implied real latency each time a change was requested on screen, and the user experience became a nightmare. In 2017, the team raised €1.3M and the project took root on Hadoop to address these performance challenges. In 2020, more databases were added along with a SaaS mode, which triggered real adoption; the company reached 15 customers that year. Then Covid happened and hurt companies of all flavors. This key moment served as a pivot point for Indexima, and the team realized that the fast-growing Snowflake cloud data warehouse represented the right environment to accelerate. During these years, Hadoop lost its footprint as many users adopted new open-source or commercial solutions. This new wave was named Indexima 2.0, today for Snowflake and soon for Databricks.

The company has developed real expertise in SQL query optimization, especially for complex queries, with the goal of solving this challenge and offering an easy user experience via simple, interactive graphical dashboards or interfaces. The key idea in Indexima’s approach relies on the Pre-Aggregation model with keys and aggregates.


In the past, all data and Indexima were deployed on-premises, relying on data copies with some risk of data divergence. With v2, things are done in-place, and for Snowflake it means working within the CDW environment, so everything remains available to all users. As soon as the Indexima engine is deployed, it starts to collect SQL queries continuously, understands request patterns leveraging ML and AI, and learns the schema. Based on that, it performs its own optimization, collecting information and creating special dynamic aggregation tables... and a 20s average delay is reduced to less than 1 second in the NYC Citibike demo we saw.

The value is immediate: there is no need to develop anything, just point to the new Indexima URL instead of the Snowflake one; cost is reduced as the Snowflake instance can be smaller; and performance is boosted with ratios such as 100:1.

The window of opportunity for Indexima appears to be significant, with real key paths to penetrate the market alongside data warehouse and aggregation players. The team is looking for new partners to accelerate on this market aspect. The next effort will target Databricks, so we anticipate some good days for the Indexima team.


Thursday, December 05, 2024

Manticore Search for a universal search engine

Manticore Search, an open-source, high-performance search engine project, joined The IT Press Tour this week to introduce its approach to a challenge that has been common for a few decades now. Sergey Nikolaev, CEO and co-founder, took time to introduce the idea and gave us lots of details.

Everything started with the Sphinx project in the early 2000s, but it stopped in 2016-17, so a new direction was clearly needed to leverage all the work done while also considering new challenges and needs: better performance, new technologies underneath, still in open source.

The firm was founded by Sergey Nikolaev, Peter Zaitsev, former CEO of Percona, and Mindaugas Zukas, COO of Altinity, leading a team of 10+ key developers.

The mission is to deliver a very simple, scalable search engine, in open-source mode, operating on affordable standard hardware. The team targets general search and log analytics with a real motivation to boost query speed, reduce resource consumption and support SQL and JSON. Beyond full-text search, it also adds faceted, Boolean, fuzzy, geo and vector search, a very comprehensive model.

The demos we saw were pretty impressive and fully transparent. The product is still fairly under the radar but has secured notable names like Craigslist, Rozetka, Socialgist, Statista, Europrcs, Hotelplan, PubChem or Huispedia.

Several other products exist on the market, like Elasticsearch; the table below explains some of the differences:

They plan to add auto-sharding, authentication, integration with Kibana and auto-embeddings for vector search and consider some enhancements with AI.

The full source code is available on GitHub via this link.

Wednesday, December 04, 2024

EasyVirt surfs on data center energy optimization needs

EasyVirt, a French IT software vendor founded in 2011, joined The IT Press Tour this week in Valletta, Malta, and it was the perfect opportunity to learn more about their solutions.

We spent some time illustrating electricity and energy consumption, which becomes even more critical with fast-growing AI usage. At the same time, Europe continues to promote new regulations, probably to let China and India overtake Europe... and at that, Europe is really a champion.

The company develops 3 solutions - DC Scope, DC NetScope and CO2 Scope - with a small, highly skilled team and generated around €1 million in 2024. They're recognized for their expertise in IT infrastructure virtualization, FinOps and CloudOps and Green IT, confirmed by 100+ customers across industries, with names like Safran, Fleury Michon, La Poste, Pole Emploi, CNES, MAIF or Amundi among others.

They approach prospects that have a strong desire to understand the green impact of their digital services but clearly don't know where to start. Building such a solution is very difficult, due to the challenge of collecting energy measurements and choosing the right algorithms, with the necessity of not altering current services.


DC Scope targets virtualized environments deployed on-premises or in the cloud, and of course in a hybrid model. DC NetScope is dedicated to network traffic analysis, and CO2 Scope's mission is around IT carbon measurement. These solutions are stressed and used by their clients, and improvements are included regularly based on end-user feedback. The various case studies have generated reductions in vCPU or RAM, deletion of VMs, removal of hypervisors, resizing of production environments or other gains. The philosophy is agent-less and no SaaS model is available; everything is local and secure for better control. Until now VMware has been the hypervisor of choice for the team, and they plan to add Proxmox, Nutanix AHV, Kubernetes with Red Hat OpenShift and VMware Tanzu, but also GPUs as they are significant energy burners. Later they plan to add XCP-ng.

EasyVirt sells its solutions via a network of partners such as resellers, integrators, MSPs and even consulting firms like Capgemini or CGI, but also Dell. A 30-day trial is available, promoting a try-and-buy model. Pricing is based on a perpetual model as well as an on-demand subscription. We'll see where EasyVirt is going, but clearly their approach fits current end-user needs, no doubt.


Wednesday, November 27, 2024

The IT Press Tour #59 will land soon in Malta

The 59th IT Press Tour will take place in Valletta, Malta, in a few days.


Topics will be about IT infrastructure, cloud, networking, security, data management, big data, analytics and storage and of course AI as it is everywhere. We'll meet 6 innovative companies, among them:
  • DigiFilm Corporation, an emerging player in long-term data preservation,
  • EasyVirt, a specialist in the efficiency of physical and virtual servers,
  • Indexima, a reference in fast BI and Analytics,
  • Manticore Search, key actor for information search,
  • ProxySQL, the fast enabler for MySQL and PostgreSQL,
  • and Scalytics, the fast growing company in AI federation.
I invite you to follow us on Twitter with the #ITPT hashtag, the @ITPressTour handle, my own handle @CDP_FST, and the journalists' respective handles.

Wednesday, November 20, 2024

Recap of the 58th IT Press Tour in Boston, MA

Initially posted on StorageNewsletter 15/11/2024

This 58th edition of The IT Press Tour took place in Boston, MA, in early October, and the press group and the participating organizations spent time exchanging views on IT infrastructure, cloud, networking, security, data management and storage, analytics and big data, and of course AI, present across all these topics. Eight companies were met: Congruity360, ExaGrid, HYCU, Hydrolix, iRODS, Swissbit, Sync Computing and Wasabi Technologies.

Congruity360
A somewhat under-the-radar player, Congruity360 appears to be an alternative in the dynamic unstructured data management segment. The firm has developed a rich and comprehensive software solution, Classify360, to tackle the ever-growing pains of managing such data within enterprises. The company was founded in 2016 and has raised $25 million so far.

The company acquired 2 organizations, NextGen Storage and Seven10, respectively in 2017 and 2020, and continues to leverage these technologies and AI to improve data migration services and cloud aspects. It appears that the Seven10 asset was sold to Park Place Technologies in 2022.

The product Classify360 covers several domains such as storage optimization, cloud migration, data protection, DSPM, AI enablement and GRC, with competitors in each one. The team identified its competition with a reference actor in each category: storage optimization with Komprise, cloud migration with Datadobi, data protection with Veritas, DSPM with Laminar, AI enablement with Microsoft and GRC with BigID. The interesting aspect is related to the market size and opportunity of each segment, the number of players and the absence of a clearly established strong leader. Many of them are small and pretty limited in scope, which is a paradox; Komprise and Datadobi are small companies when you consider their annual revenue. Laminar, small as well, was acquired by Rubrik mid-2023, as the backup player had already identified DSPM as a key need.

Rich in terms of features, the solution works in 3 simple steps: first, understand the data environment, with file, folder and content information centrally indexed and analyzed; second, classify all the discovered data based on supervised machine learning and the rules and parameters set; and finally, act on this data through automated policies to move, delete, tag, secure, deduplicate, encrypt, alert… Several of these tasks remind me of what the industry delivered 10 or even 20 years ago with SRM, or storage resource management, but for sure, recent challenges have invited players to develop richer and more modern answers, especially with embedded AI. This is of course obvious with the security pressure on the infrastructure, both on-premises and in the cloud, as data is present in both places.
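
The three-step flow (index, classify, act) can be sketched generically: walk a source, tag each item with simple rules, then let a policy decide the action. This is a toy illustration of the workflow described above, not Classify360 code; the rules, labels and actions are invented for the example, and a real product would rely on trained classifiers rather than two regular expressions.

```python
import re
from dataclasses import dataclass

@dataclass
class Item:
    path: str
    content: str
    age_days: int

RULES = {  # toy rule set standing in for supervised classification models
    "pii": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # SSN-like pattern
    "financial": re.compile(r"\b(invoice|iban)\b", re.I),
}

def classify(item: Item) -> set[str]:
    """Step 2: tag each indexed item with the classes its content matches."""
    return {label for label, rx in RULES.items() if rx.search(item.content)}

def apply_policy(item: Item, labels: set[str]) -> str:
    """Step 3: automated policy turning classifications into actions."""
    if "pii" in labels:
        return "encrypt-and-restrict"
    if item.age_days > 365 * 3 and not labels:
        return "archive-to-cold-tier"
    return "keep"

inventory = [  # Step 1: normally produced by crawling NFS/SMB/S3/SaaS sources
    Item("/finance/inv-2021.txt", "Invoice 4411 for ACME", 1200),
    Item("/hr/badge.csv", "ssn 123-45-6789", 30),
    Item("/tmp/notes.txt", "lunch ideas", 10),
]

for it in inventory:
    print(it.path, "->", apply_policy(it, classify(it)))
```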

Data sources are various supporting NFS, SMB and S3 plus collaboration software such as Office365, Google Workspace, Microsoft Exchange, OneDrive & Azure, Box, Slack, NetApp, Dell EMC…

The business model is fully indirect leveraging resellers, integrators and distributors as an extension of classic software and hardware storage players.

The company is preparing an announcement around AI and DSPM.


ExaGrid
A reference in secondary storage dedicated to backup, ExaGrid currently delivers annual revenue just below $200 million and is confident it will pass this barrier in the next fiscal year. Growth has been strong and sustained for several quarters, illustrating that the solution developed by the company has found its market. The firm is still led by Bill Andrews, CEO for almost 20 years, one of the longest tenures in the industry. The current installed base touches more than 4,400 customers across 80+ countries, with deals becoming larger and larger, several in 7 digits. The firm confirms its desire to dig into the backup segment, which is big enough, with new challenges and regulations like DORA, NIS2 and others.

As backup has seriously evolved with strict requirements for a few decades, storing backup images on disk-based units became a natural answer for modern RPOs and RTOs.

The team develops storage software designed to run on commodity hardware for keeping backup data, but sold as appliances. This is aligned with the engineering choice to leverage standard components as well as modern and recent technology developments. Very scalable, as users can start small and grow the configuration by adding nodes, the ExaGrid product supports up to 6PB. Unlike other backup storage products, ExaGrid’s idea is to store the received data stream as fast as possible, to acknowledge the backup process in the shortest time. Then, asynchronously, a data reduction process is triggered to reduce the data footprint and store the result in a secured data storage zone not exposed to the outside, named the repository tier. With this design, restores are also fast, as the vast majority of these operations involve recent images. It means that some data is maintained in a dual state, on both the landing and repository zones. For even more protection, replication to other ExaGrid targets is possible in star (up to 16 sites) or cascade topologies, but also to the cloud.
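
The landing-zone/repository design can be sketched as a simple two-stage flow: ingest is acknowledged as soon as the raw stream hits the landing tier, and deduplication into the non-exposed repository happens asynchronously. This is a conceptual illustration only, not ExaGrid's implementation; the chunk size, hashing and in-memory structures are arbitrary.

```python
import hashlib
import queue
import threading

CHUNK = 4096  # arbitrary illustrative chunk size

landing_zone: list[bytes] = []       # fast tier: recent backups kept in native format
repository: dict[str, bytes] = {}    # deduplicated tier, not exposed to backup clients
dedupe_queue: queue.Queue = queue.Queue()

def ingest(stream: bytes) -> str:
    """Write the backup stream to the landing zone and acknowledge immediately."""
    landing_zone.append(stream)
    dedupe_queue.put(stream)         # data reduction happens later, off the backup path
    return "ACK"

def dedupe_worker() -> None:
    """Asynchronously split streams into chunks and keep only unique ones."""
    while True:
        stream = dedupe_queue.get()
        for i in range(0, len(stream), CHUNK):
            chunk = stream[i:i + CHUNK]
            repository.setdefault(hashlib.sha256(chunk).hexdigest(), chunk)
        dedupe_queue.task_done()

threading.Thread(target=dedupe_worker, daemon=True).start()

print(ingest(b"full-backup-monday" * 1000))   # client sees ACK right away
print(ingest(b"full-backup-monday" * 1000))   # the second copy adds no new repository chunks
dedupe_queue.join()
print(f"landing copies: {len(landing_zone)}, unique chunks: {len(repository)}")
```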

Key partnerships are essential for the vendor; top ones are Veeam, Veritas and Commvault, and we anticipate some extension with Rubrik. We also noticed that the company has started some activities with HYCU.

Regarding the future of the company, we understood that Andrews and his team still wish to stay independent but could potentially target an IPO to capitalize on this success. At the same time, the team doesn’t need to raise any money from the private or public sector. As the trajectory has been pretty impressive for several years, we realize that ExaGrid could join the candidate list of Storage Unicorns published bi-annually by Coldago Research.

HYCU
An obviously active and key player in SaaS and cloud backup, HYCU continues to deliver new services around new applications to protect, regulations and data mobility.

It has more than 4,200 customers over 78 countries, leveraging 440+ partners globally. Its market penetration is significant thanks to a vast list of supported applications, both on-premises but above all online. This clearly matters for enterprises that ignore the real number of online services they subscribe to while maintaining a strong risk and security exposure, as recently detailed in their SaaS survey available here. And recent events, especially the famous CrowdStrike bug, resonate in everyone’s minds, not to forget Google’s accidental massive deletion of pension data records.

HYCU has shaken the SaaS backup landscape with a new approach offering the capability to develop specific application modules quickly, coupled with the R-Cloud engine. Today more than 80 applications and SaaS services are covered and more are coming, which appears to be the largest number available today. They support various domains like infrastructure services, ITSM, DevOps, IAM, Data Management & Analytics and Collaborative Work Management.

This is a critical aspect, as backup suffers from infrastructure diversity and users’ choice of applications. Finding a solution that can address at the same time on-premises and cloud deployments, bare-metal, VMs, containers and SaaS applications, storing backup images locally or in the cloud as well, is a real mission.

This explains why this part of IT became a nightmare and offers opportunities for new or recent players. Again, it’s a classic cycle: the industry creates its own complexity trying to solve new challenges, inviting new players to address these new levels, probably a virtuous circle.

Backup is a foundational IT service, and the Boston team has recognized the need to restore data anywhere, from VMware to AWS or GCP to Nutanix or Azure, mixing on-premises and cloud environments.

The latest news that puts pressure on data protection vendors such as HYCU, and on users, is the soon-to-be-in-force European Union regulations named DORA and NIS 2. These must-do rules, according to Article 12, will force users to keep a local copy of their data, pretty easy with on-premises applications, as that is where we come from, but potentially a real difficulty for SaaS and cloud applications. This is confirmed by the new secondary storage platforms supported by R-Cloud, with Nutanix Objects, Dell, Cloudian and OVHcloud in addition to the ones currently available.


Hydrolix
Founded in 2018 by Marty Kagan, CEO, and Hasan Alayli, CTO, Hydrolix has raised $65 million so far with the mission to develop a modern observability platform dedicated to processing real-time streaming logs. The Hydrolix streaming data lake has already reached 287 customers and delivers 12x Y/Y growth at just a fraction of the cost of classic solutions. The founding team leveraged its strong background in distributed observability with massive capability to process real-time logs coming from CDNs or client telemetry. Real-time observability coupled with the historical dimension really creates a differentiator, thanks to Hydrolix’s innovations in the product design.

The architecture of the product is built around 3 key services: ingest, store and query. On the ingest side, several access methods are available via stream or batch mode, with data transformation performed in this phase; all data is then routed and stored on S3-compatible storage deployed on-premises or in the cloud. Very efficient compression methods are used to reach 20:1 or even 50:1 ratios, and everything can be queried via SQL, Spark and other APIs and visualized in dashboards such as Splunk, Databricks, Looker, Kibana or Grafana. Each of these 3 cacheless, decoupled services is independently and dynamically scalable based on what is needed to align the platform SLA with key objectives.
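
As a rough illustration of that ingest/store/query split, the snippet below posts a small JSON batch to an HTTP ingest endpoint and then issues a SQL query. The URL paths, header name and table name are hypothetical placeholders, not Hydrolix's documented API; the sketch only shows the shape of a streaming-ingest-then-SQL workflow.

```python
import requests

BASE = "https://hydrolix.example.com"        # hypothetical cluster URL
TABLE = "weblogs.cdn_requests"               # hypothetical project.table name

# Streaming ingest: small JSON batches pushed over HTTP (endpoint path is illustrative).
events = [
    {"timestamp": "2024-10-01T12:00:00Z", "status": 200, "bytes": 5123, "pop": "BOS"},
    {"timestamp": "2024-10-01T12:00:01Z", "status": 404, "bytes": 312, "pop": "BOS"},
]
requests.post(f"{BASE}/ingest/event",            # placeholder path
              json=events,
              headers={"x-hdx-table": TABLE},    # placeholder header
              timeout=10)

# Query side: SQL over the stored, compressed partitions (endpoint path is illustrative).
resp = requests.post(f"{BASE}/query",            # placeholder path
                     json={"query": f"SELECT pop, count() AS errors FROM {TABLE} "
                                    f"WHERE status >= 400 GROUP BY pop"},
                     timeout=30)
print(resp.json())
```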

The service is deployed to address real-time and historical big data analytics use cases in various sectors. The first example is the famous movie company Paramount. Its IT team is able to ingest data at a rate of 10.8 million rows per second; globally they collected 53 billion records, translating into 41TB of transformed data, finally stored as 5.76TB of compressed data. The second is illustrated by Akamai, both a customer and a partner, who delivers this Hydrolix service under the name TrafficPeak on the Akamai Connected Cloud, the vendor’s massive distributed edge and cloud platform. Users realized that understanding all this CDN data is a must at a time of very active security risk and very high traffic volumes, but it should cost less to efficiently collect, store and query over a multi-year period. Hydrolix insisted that these global costs drop by 75%.

The product is available as a container orchestrated by Kubernetes for Akamai LKE, Google GKE, Amazon EKS and Azure AKS but also on-premises. The service is promoted and marketed with a direct sales force but also via key partners like Akamai.

The firm also announced a partnership and technology integration with Quesma, a Polish database access software vendor, to connect their platform to Kibana and OpenSearch Dashboards via Beats/Logstash and Data Prepper ingestion tools.


iRODS
It’s not new: open source is everywhere and touches every IT segment, and data management simply crystallizes this deep wave. Just see what Linux has triggered for a few decades on the market and you realize its ubiquitous presence. The iRODS consortium represents a strong initiative addressing the long-standing need for universal data management.

Historically, the project started in 1995 as a storage resource broker at the San Diego Supercomputer Center with General Atomics; then in 2006 the name iRODS was introduced with an open-source BSD-3 license, and the project transitioned to UNC Chapel Hill/RENCI to finally land as the consortium in 2013, with a community and membership but also services and support.

The solution is adopted in various sectors, but clearly the vast adoption comes from supercomputing and research centers and universities. We explain this by a financial dimension: these public entities have skilled human resources that can leverage this open-source software solution, contribute to it and extend it, instead of buying commercial software. They, of course, prefer to route their budget to buying hardware that can’t be replaced by human intelligence.

iRODS, as software, provides an abstraction layer and represents a universal model for data integration, control, automation, search and access, with rich metadata and advanced policy management exposing a unified namespace over an existing infrastructure. It relies on a client-server architecture, and the software is written in C++, introducing the iRODS protocol and RPC API. On the client side, it is offered as CLI, WebDAV, WebApp… leveraging the existing data silos, file servers, NAS, object and cloud storage and even archival storage entities. The policy engine is very comprehensive, with movement, verification, retention, replication, placement, metadata extraction, application and conformance, used to build more global actions such as tiering, auditing, indexing, integrity, ingest, publishing and compliance. The use cases and associated applications are pretty wide, from file system synchronization across sites to providing data to compute, for instance in HPC or a specific compute zone, or the reverse. Again, the key element here, as seen with other players in the industry for probably 2 decades now, is the unified namespace, of course with some significant evolutions.
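
For a feel of the client side, here is a minimal sketch using the open-source python-irodsclient: connect to a zone, upload a file into the unified namespace and attach searchable metadata. Host, zone, credentials and paths are placeholders, and the server-side policy rules that would then act on this object are not shown.

```python
from irods.session import iRODSSession

# Placeholder connection details for an example zone.
with iRODSSession(host="irods.example.org", port=1247,
                  user="alice", password="secret", zone="exampleZone") as session:
    logical_path = "/exampleZone/home/alice/run42/results.csv"

    # Put a local file into the unified namespace; the storage resource actually
    # used underneath is decided by the server-side configuration and policies.
    session.data_objects.put("results.csv", logical_path)

    # Attach metadata so the object can later be found by attribute rather than by path.
    obj = session.data_objects.get(logical_path)
    obj.metadata.add("experiment", "run42")
    obj.metadata.add("instrument", "sequencer-7")
```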

The team plans vertical integrations with applications in identified key sectors, the addition of time series and statistics, and obviously dashboards with costs and visibility of all resources plus advanced management capabilities.

iRODS will exhibit at the RENCI booth at SC24 in Atlanta, GA, very soon now.


Swissbit

A leader in Europe for flash media and SSDs, Swissbit confirms its business trajectory with today 400 employees and 5,000 customers, fueled by a production capacity of 3 million units per month.

A few milestones are key to better understand the current profile of Swissbit. The story started with an MBO from Siemens Memory in 2001, followed by the creation of industrial memory solutions in 2008, the addition of security solutions in 2013, SSDs in 2014 and a production unit in Berlin, Germany, in 2019. Another strategic event happened in 2020 with the investment of Ardian in Swissbit, followed by 2 acquisitions: the first in 2021 with Hyperstone, the German developer of SATA SSD controllers, and later some synergy and asset absorption of the Colorado-based entity Burlywood Technology following a partnership announced in 2022. The production site in Berlin illustrates the level of independence Swissbit has achieved over the past few years, offering guarantees for some specific markets or regions.

Designed to address industrial, networking and enterprise data storage, the memory and storage team has chosen to partner with NAND and SSD controllers key players such as FADU, Kioxia, Micron, Phison, Samsung, Silicon Motion and SK Hynix in addition to the technology they now master.

The product line is pretty wide, with different form factors, capacities, connectivity and interfaces: PCIe, SATA, NVMe, eMMC and USB/CF/SD cards.

The second activity for the company is related to digital identity and secure access with FIDO2-certified solutions. One of them is the phishing-resistant MFA represented by the iShield Key product family, with 2 instances: the FIDO2 one and the Pro. A new one is coming, named iShield Key MIFARE, also combining FIDO2. All these advanced security products offer strong authentication, access control, time tracking, secure login and one-time passwords. This instance supports up to 300 passkeys for passwordless logins. For government segments, the company signed a partnership with RSA and together they provide the RSA iShield Key 2 series.

For embedded systems, the security aspects must comply with various regulations and standards in Europe, the US and globally. The vendor has released a security upgrade kit with encryption and access control for MicroSD/SD cards. The encryption uses real-time 256-bit AES, and the memory is based on industrial-grade pSLC to improve endurance.

Swissbit will exhibit at SC24 in Atlanta, GA, very soon now.


Sync Computing
Initiated at MIT, Sync Computing is a very young company founded in 2019 with $22.9 million raised as of today. With roots in high performance computing, the team has already signed some key technology partnerships with Nvidia and Databricks.

Resource allocation has long been identified as a key need in HPC around job schedulers, and we all know famous examples, some historical ones like NQS, Platform or Altair… and today a highly visible one, Slurm; the resource allocation problem remains a complex challenge to solve. Globally, two approaches address it from different starting points: the resource itself or the task. In other words, it is again critical to allocate the right job to the right resource at the right time, and profiling tasks and monitoring the IT computing environment are mandatory with pricey configurations. At scale, this is a real challenge that significantly impacts SLAs. This is where declarative compute resources appear, with ML models aligned with the cloud infrastructure. The idea is to connect the environment to an intelligent service that learns from usage and consumption and other key data such as cost inputs. The model is then perpetually enriched with closed-loop feedback and thus delivers better results as time goes by.

Databricks, chosen to illustrate and validate the idea, is today widely used but comes with high associated costs. Sync develops Gradient as an AI-based software solution to address these 3 dimensions: cost, optimization and SLAs.

Alternatives appear to be less dynamic and automatic; here the value is brought by the AI integration. Gradient relies on self-improving ML algorithms developed at MIT, leveraging closed-loop feedback behavior.
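
A closed feedback loop of this kind can be sketched in a few lines: observe each run, record its cost and runtime, and recommend the next cluster configuration from what has been learned. This is a deliberately naive illustration of the principle, not Gradient's algorithm; the data points and the SLA value are invented.

```python
history = []   # (workers, runtime_s, cost_usd) observed from past runs

def record_run(workers: int, runtime_s: float, cost_usd: float) -> None:
    """Feed actual job telemetry back into the loop."""
    history.append((workers, runtime_s, cost_usd))

def recommend(sla_seconds: float) -> int:
    """Pick the cheapest past configuration that met the SLA; explore otherwise."""
    meeting_sla = [(cost, w) for w, rt, cost in history if rt <= sla_seconds]
    if meeting_sla:
        return min(meeting_sla)[1]
    # No configuration has met the SLA yet: try more workers than the fastest run so far.
    fastest_workers = min(history, key=lambda r: r[1])[0] if history else 2
    return fastest_workers * 2

# Simulated closed loop: each recommendation is tried, measured and fed back.
record_run(workers=4, runtime_s=1900, cost_usd=8.0)
record_run(workers=8, runtime_s=1100, cost_usd=11.0)
record_run(workers=16, runtime_s=650, cost_usd=18.0)
print("next cluster size for a 1200s SLA:", recommend(1200))   # -> 8
```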

Adoption has taken off with a few references like Duolingo, Forma.ai, Insider, Mediaradar, Abnormal or Handelsblatt confirming 2:1 ROI and 2x faster runtimes. As mentioned above, Sync Computing has identified Databricks as a first data intensive computing environment to optimize but more are coming.


Wasabi Technologies

Recognized as the elephant in the room, seriously shaking the top 3 cloud providers, AWS, GCP and Azure, Wasabi Technologies embodies a real leader in storage services. The company shared impressive figures: more than 100,000 customers, 15,000+ partners, several exabytes of data with more than 1EB of Veeam backup images alone, all stored across 14 storage regions on the planet. On the financial side, as the business is really capital intensive, with hardware to buy and deploy, the firm has raised more than half a billion dollars. The revenue pace is impressive, with more than 60% growth year-over-year.

On the service side, Wasabi offers an online S3 storage service at a fraction of the AWS price. It is charged at $6.99/TB/month without any charges for egress, downloads or API requests, fees which seriously inflate the bill at other providers and are a real cash machine for them. This choice explains the success of the company: they picked a de-facto standard, S3, and offer an alternative service at a lower price. The model also offers a reserved capacity option with 1, 3 or 5 year increments.
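
Because the service is plain S3, adopting Wasabi is mostly a matter of pointing an existing S3 client at a Wasabi endpoint. A minimal boto3 sketch is below; the bucket name and credentials are placeholders, and the endpoint shown is the commonly documented us-east-1 service URL, to be adjusted per region and account.

```python
import boto3

# Standard S3 SDK, pointed at a Wasabi endpoint instead of AWS.
s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.us-east-1.wasabisys.com",   # adjust to your Wasabi region
    aws_access_key_id="WASABI_ACCESS_KEY",                # placeholder credentials
    aws_secret_access_key="WASABI_SECRET_KEY",
)

bucket = "example-backup-bucket"                          # placeholder bucket name
s3.upload_file("backup-2024-12-01.tar.gz", bucket, "backups/backup-2024-12-01.tar.gz")

# Egress and API requests are not billed separately, so listing and restoring
# add no per-operation charges on top of the flat capacity price.
for obj in s3.list_objects_v2(Bucket=bucket).get("Contents", []):
    print(obj["Key"], obj["Size"])
```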

The business model is indirect, relying on partners through channels. Partners integrate their products with Wasabi S3, as is the case in data protection, video surveillance or media and entertainment. In terms of use cases, that means backup and recovery, archiving, data analytics, application development, content delivery, surveillance, IoT and AI/ML.

A few months ago, the company ran a survey and discovered that 92% of education respondents expect an increase in data stored in the cloud in the coming months, while spending half of their budget on egress fees. The figure is 90% in the surveillance segment and 97% in M&E.

While structured data is inherently easy to search, unstructured data remains a challenge, even though interesting content-indexing technologies, online or local, have been around for years. So the company made a decision and jumped into AI with the acquisition of Curio AI from GrayMeta early this year. GrayMeta's CEO joined Wasabi as SVP of AI and ML. The idea here is to feed and enrich metadata with information discovered in the stored content and then offer this search capability to users to help them find the right info in seconds. Clearly, this AI-powered object storage is a key direction for Wasabi.

Speaking about the future, David Friend told us that an IPO could be a natural next step for the company. On the product side, pushed by some European regulations, an on-premises flavor of the solution will appear, of course with remote copy or replication to guarantee the committed 11 nines of data durability.


Friday, November 15, 2024

Arcitecta and Wasabi join forces

Arcitecta, a leader in unstructured data management, and Wasabi Technologies, the reference in alternative cloud storage, just announced a partnership. Mediaflux, the Arcitecta product, supports the S3 API, so it's not a surprise that Wasabi is supported as a new member of the storage realm. It provides cloud storage, often remote, accessible transparently from any client connected to the global namespace enabled by Mediaflux.


Thursday, November 07, 2024

Congruity360 unveils Classify360 3.1

Identified as a key player in unstructured data management, Congruity360 just announced a new iteration of its data classification platform, Classify360. Well detailed during the recent IT Press Tour in Boston, this latest release was pre-announced by Mark Ward, COO, with new features for the Insights, Actions and Comply modules to sustain and simplify data management at scale. Among them:

  • Data Normalization for AI fueled by precise classification,
  • Scan performance improvements and insights for Dell PowerScale, NetApp, Microsoft OneDrive and SharePoint on-premises, plus enhancements for redundant, obsolete and trivial data, DSPM and AI governance,
  • New supported data sources with Nasuni, Wasabi, DāSTOR Object, VAST Data, and Oracle Cloud,
  • and updates of prepackaged risk models to support HITRUST and NYDFS plus "bring-your-own data dictionary".


Friday, October 25, 2024

Hydrolix adds Kibana dashboards thanks to Quesma

Hydrolix, a fast-growing player in the streaming data lake landscape, has signed a partnership with Quesma, a Polish software company that develops translation layers for database services. This middleware operates as a database gateway to store data within Hydrolix while maintaining Kibana and Logstash/Beats, and therefore reduces costs. In other words, Hydrolix can replace Elastic, can ingest data from Logstash and Beats, and works with the Elastic Common Schema. In the same way, OpenSearch users are able to leverage Hydrolix and then connect to OpenSearch Dashboards.