Hybrid Transactional/Analytical Storage

I work for Confluent in its Technical Strategy Group (TSG). We’re a small team of technologists with a long and varied history in data systems. Our remit is to do forward-looking research with the aim of helping to guide the Confluent ship through dangerous waters. So, I write about technical strategy internally, and recently, I’ve begun writing about Confluent strategy externally too. This is one such post, so if you want to understand the recent Confluent announcements, read on. If you prefer my usual dist-sys writing, head on over to my Analysis section; I’m in the middle of a set of deep dives into the table formats, starting with their consistency models (Hudi first, with Delta Lake and Iceberg coming soon).

Confluent has made two key feature announcements in the spring of 2024:

  • Freight Clusters, a new cluster type that writes directly to object storage. It is aimed at the “freight” of data streaming: workloads such as log ingestion, clickstreams, and large-scale ETL that can be cost-prohibitive to run on a low-latency, multi-AZ replication architecture in the cloud.

  • Tableflow, an automated feature that provides seamless materialization of Kafka topics as Apache Iceberg tables (and vice-versa in the future). 

This trend towards object storage is not just happening at Confluent but across the data ecosystem. Many different types of cloud data systems are integrating object storage into their architecture, and I have covered a few of these, such as Neon and ClickHouse Cloud. There is also a flurry of start-ups building logs (often exposing the Kafka API) directly over object storage (six and counting). Money is being plowed into data systems that use object storage as the primary, and sometimes only, storage layer.

Confluent is also doubling down on object storage, and this post will explain the vision Confluent has for using object storage as an enabler of a new storage paradigm: Hybrid Transactional/Analytical Storage (HTAS). This is not an established term, and it may only go as far as this blog post, but it makes for a good vehicle to describe Confluent’s vision for unifying transactional and analytical systems in one platform. You may notice the resemblance to Hybrid Transactional/Analytical Processing (HTAP), a term from the world of database systems. Let’s start there before looking at what HTAS is.

Hybrid Transactional/Analytical Processing (HTAP)

According to Wikipedia, the acronym HTAP, or Hybrid Transactional/Analytical Processing, was coined by Gartner back in 2014. 

Hybrid transaction/analytical processing (HTAP) is an emerging application architecture that "breaks the wall" between transaction processing and analytics. It enables more informed and "in business real time" decision making.

The term has always been used in relation to database systems. Traditionally, transactional workloads and analytical workloads have relied on purpose-built database systems as each workload has unique constraints and requires different technical solutions. 

For example, OLTP database systems such as Postgres and SQL Server, and storage engines such as RocksDB, use row-oriented B+trees or LSM trees, as these are well suited to the demands of an OLTP workload (small-to-medium data sizes, low-latency operations, lots of simple CRUD). However, row-oriented B+trees aren’t such a good choice for massive data sets where the main workload is aggregations that slice and dice the data. OLAP database systems prefer column-oriented data organization and massively parallel processing architectures.
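
To make the row/column distinction concrete, here is a minimal, purely illustrative Python sketch (the records and field names are invented) showing why an aggregation over a column-oriented layout only has to touch the one field it needs, while a row-oriented layout forces it to walk every record.

```python
# Illustrative only: the same three "orders" laid out row-wise and column-wise.
rows = [
    {"order_id": 1, "customer": "acme", "amount": 120.0},
    {"order_id": 2, "customer": "acme", "amount": 75.5},
    {"order_id": 3, "customer": "zenith", "amount": 300.0},
]

# Row-oriented: the aggregation walks every record and plucks out one field.
total_row_oriented = sum(r["amount"] for r in rows)

# Column-oriented: each field is stored contiguously, so the same aggregation
# scans one dense array and never touches the order_id or customer data.
columns = {
    "order_id": [1, 2, 3],
    "customer": ["acme", "acme", "zenith"],
    "amount": [120.0, 75.5, 300.0],
}
total_column_oriented = sum(columns["amount"])

assert total_row_oriented == total_column_oriented == 495.5
```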

The concept of HTAP is to merge OLTP and OLAP into a single system, eliminating the ETL and ELT jobs organizations would otherwise run to keep two data sets synchronized. This unification simplifies data management and enables more informed and 'in business real-time' decision-making.

I like Gartner’s wording of "breaks the wall" between transaction processing and analytics, and it is this spirit of wall breaking that Confluent is applying to storage.

Hybrid Transactional/Analytical Storage (HTAS)

The concept of HTAP can potentially extend to any data system where there is a division between transactional and analytical workloads. Moreover, with the advent of object stores taking on the role of the (near) universal storage layer, we now have the capability to unify entirely different workload types. However, HTAP is focused on “processing” and tends to apply to a single processing engine that is capable of both OLTP and OLAP. My focus today is not one processing engine to rule them all but multi-modal storage engines that can serve transactional workloads and horizontally integrate object storage for analytical compute - Hybrid Transactional/Analytical Storage.

To understand the difference between HTAS and HTAP, we need to look at another trend that is gathering pace in 2024: the adoption of the table formats Apache Iceberg, Apache Hudi, and Delta Lake. This trend is about using object storage tables as shared tables that you can bring your own compute to. A colleague of mine in TSG dubbed this the “headless data architecture”.

Fig 1. The headless data architecture of bring-your-own-compute/query-engines over shared tables.

This data sharing is at the core of Confluent’s value. It makes money by helping organizations share data between teams and departments, and even with other organizations.

“Iceberg et al have added another solid sharing primitive ideal for the analytics estate with its multitude of query engines and BI tools that can read and write to these tables. The first manifestation of this new sharing primitive is the commoditization of the data lake and the rise of the headless data architecture. With storage and compute disaggregated, each organization gets to choose the query engines/BI tools that suit their needs, putting more power into the hands of the organization to choose where they place their data and how they access it. Data warehouse vendors like Snowflake are also joining the fray by adding the ability to query “external” tables using the Iceberg table format.” Tableflow: the stream/table, Kafka/Iceberg duality

Rather than building The One compute engine, we democratize access to a shared storage layer. Rather than building data gravity, we open up the data for all. Where it gets interesting is when we combine the headless data architecture with a transactional data system to make a hybrid transactional/analytical storage (HTAS) architecture. Streaming storage, table storage, low-latency storage, low-cost storage: there are many workloads and many drivers (economic, latency, reliability). An HTAS architecture is one that seamlessly integrates the transactional and analytical sides with one storage engine.

The building blocks of HTAS at Confluent

Tableflow

Tableflow is a manifestation of HTAS that uses a single storage engine (Kora) to provide both transactional streaming semantics and analytical streaming/tabular semantics for the same topic (via native horizontal integration with object storage).

Fig 2. Kora integrates stream and table storage into a single data set (the feature known as Tableflow).

Microservices use the Kafka API for event-driven architectures (EDA), a workload characterized by many small, low-latency transactional operations. In this sense, the row-oriented event stream (via the Kafka API) is the streaming equivalent of an OLTP database such as Postgres.

Stream processors like Apache Flink, Kafka Streams, and Apache Spark Structured Streaming can use this row-oriented Kafka API to read and write to streams.
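
As a hedged sketch of what that row-oriented, per-event interaction looks like in practice, here is a minimal producer/consumer pair using the confluent-kafka Python client; the broker address, topic name, group id, and payload are all placeholders.

```python
import json
from confluent_kafka import Producer, Consumer

conf = {"bootstrap.servers": "localhost:9092"}  # placeholder address

# Produce a single small event, the typical unit of work in an EDA workload.
producer = Producer(conf)
producer.produce(
    "orders",                                   # hypothetical topic
    key="order-42",
    value=json.dumps({"order_id": 42, "amount": 99.9}),
)
producer.flush()

# Consume events one record at a time with low latency.
consumer = Consumer({**conf, "group.id": "order-service", "auto.offset.reset": "earliest"})
consumer.subscribe(["orders"])
try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None or msg.error():
            continue
        event = json.loads(msg.value())
        print(f"processing order {event['order_id']}")  # per-event business logic goes here
finally:
    consumer.close()
```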

Fig 3. Regular topics can straddle the transactional and analytical divide, as well as the row-oriented and column-oriented divide.

However, the Kafka API is not well suited to analytical data systems relying on column-oriented data organization, such as Trino, Spark, or BigQuery. Even a regular Flink streaming job might prefer a columnar stream if it only needs a subset of fields. Tableflow unifies transactional and analytical processing by offering streaming row-oriented semantics to applications and tabular column-oriented semantics to analytical systems. Iceberg can do streaming too. Append-only Iceberg tables are essentially streams that can be incrementally processed using either streaming or table-oriented APIs.
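
On the tabular side, an append-only Iceberg table can be consumed incrementally, snapshot by snapshot. The sketch below uses Spark’s incremental read between two snapshot IDs; it assumes an Iceberg catalog is already configured on the Spark session, and the catalog name, table name, and snapshot IDs are placeholders.

```python
from pyspark.sql import SparkSession

# Assumes the Spark session is already configured with an Iceberg catalog.
spark = SparkSession.builder.appName("iceberg-incremental-read").getOrCreate()

# Read only the rows appended between two snapshots of an append-only table,
# effectively treating the Iceberg table as a replayable stream.
increment = (
    spark.read.format("iceberg")
    .option("start-snapshot-id", "8924558786060583479")  # placeholder snapshot IDs
    .option("end-snapshot-id", "6536733823181975045")
    .load("demo_catalog.clickstream.page_views")          # hypothetical table
)

# A column-oriented engine only scans the fields the query references.
increment.groupBy("page").count().show()
```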

In the future, Tableflow will work bi-directionally, so that data created/modified in the analytics realm is seamlessly materialized as Kafka topics for the transactional application realm.

Freight clusters

Freight Clusters are one more building block towards unifying transactional and analytical workloads with a single storage engine. While Apache Kafka has historically been the universal event streaming storage layer, it is no longer top dog for massive streaming datasets in the cloud. The issue for Kafka is not Kafka per se but the pricing model of the modern cloud provider, which charges for network traffic across availability zones. This economic model did not exist when Kafka, Pulsar, RabbitMQ, and Redis were designed. While they continue to handle massive workloads in environments without networking charges (such as on-premises, co-los, and, for now, Azure), the replication architecture does not match the cloud’s economic model for the big workloads.

Fig 4. Freight cluster topics provide a low-cost option for high volume topics. On the analytics side, this object storage stream can be consumed with streaming semantics (via the Kafka API), or as Iceberg tables for a more incremental, columnar approach.

Freight Clusters, with leaderless, stateless brokers, address today’s cloud economics by simply writing directly to object storage and avoiding cross-AZ network transfers. This brings down the cost of the high-volume workloads. These volume-optimized topics can be consumed via the Kafka API on the analytics side for lower-than-Iceberg-latency, row-oriented streams or materialized as column-oriented object storage Iceberg tables by Tableflow.
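
On the producer side, this kind of high-volume, latency-tolerant traffic is typically written with the ordinary Kafka client, just tuned for throughput rather than latency. Below is a hedged sketch using the confluent-kafka Python client with batching and compression settings that suit bulk ingestion; the values are illustrative, not Freight-specific defaults.

```python
from confluent_kafka import Producer

# Illustrative tuning for a high-volume, latency-tolerant workload:
# batch aggressively and compress, since end-to-end latency is not the priority.
producer = Producer({
    "bootstrap.servers": "localhost:9092",  # placeholder address
    "linger.ms": 200,                       # wait long enough to build large batches
    "batch.size": 1048576,                  # up to 1 MiB per batch
    "compression.type": "zstd",             # shrink the bytes crossing the network
    "acks": "all",                          # durability still matters
})

for i in range(100_000):
    producer.produce("clickstream-raw", value=f'{{"event_id": {i}}}')  # hypothetical topic
    producer.poll(0)  # serve delivery callbacks without blocking

producer.flush()
```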

The future of streaming storage is multi-modal

The Kora engine now speaks many different kinds of storage, enabling multiple workloads to be served from the same platform, and even the same topic.

  • Stream abstraction

    • Fast, fault-tolerant read/write-through cache (distributed, replicated over block storage). The classic Kafka API for transactional workloads. Caches have been used for decades and still play a pivotal role in high-performance storage systems. Clients interact directly with the stateful Kora brokers, which replicate the data with low-millisecond latency and asynchronously write it to object storage. This is the home of the transactional, row-oriented streaming workload.

    • Object storage direct-write (Freight). High-throughput topics, which often contain analytics data, move huge amounts of data where latency is not an issue. Stateless Kora brokers write directly to object stores, avoiding the networking costs of the cloud.

  • Table abstraction

    • Apache Iceberg tables (Tableflow). Kora materializes topics as Iceberg tables over object storage (Tableflow), incorporating the Iceberg driver directly into the storage engine itself. This is the home of column-oriented analytics processing engines, such as Trino, Snowflake, Databricks, and BigQuery. The headless data architecture lives here.

With a multi-modal storage engine as the base, the various APIs become apps that sit on top.

Fig 5. Customers choose the API and storage format that suits them best.

HTAS and Caching

The enabler of this one-storage-engine, multiple-workloads architecture has been object storage. But you can’t build low-latency transactional systems on top of object storage without some kind of caching strategy. While S3 Express One Zone solved the latency issue, it did so without bringing the same economic model as S3 Standard. Not only is S3 Express One Zone more costly for storage, but its per-request costs remain too high to make the high-frequency writes needed for low-latency operations economical.

For a single storage engine to serve low-latency transactional and large-scale analytical workloads at a competitive price, you need fault-tolerant caching. In Kora, this caching layer is a set of stateful Kora brokers that implement the Kora replication protocol, which has its roots in Kafka (but has since diverged). Essentially, we’re talking about caching implemented using replication for fault tolerance and high availability. 

The streaming workload is a perfect fit for seamlessly combining a read-write cache with object storage. The sequential nature of a stream allows low-latency replication to be used for the hot transactional data and object storage for the rest.
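
To make the shape of this tiering concrete, here is a deliberately simplified, purely conceptual Python sketch: the hot tail of the log lives in an in-memory segment (standing in for the replicated broker cache), and sealed segments are offloaded to an object store (a plain dict standing in for S3). It is not Kora’s protocol, just the general pattern.

```python
SEGMENT_SIZE = 4  # records per segment, kept tiny for illustration


class TieredLog:
    """Hot tail in memory (stand-in for the replicated broker cache);
    sealed segments offloaded to an object store (a plain dict here)."""

    def __init__(self):
        self.object_store = {}   # segment_id -> list of records (cold tier)
        self.hot_segment = []    # open segment: the low-latency write path
        self.next_offset = 0

    def append(self, record):
        offset = self.next_offset
        self.hot_segment.append(record)
        self.next_offset += 1
        if len(self.hot_segment) == SEGMENT_SIZE:
            # Seal the segment; a real system would upload it asynchronously.
            self.object_store[offset // SEGMENT_SIZE] = self.hot_segment
            self.hot_segment = []
        return offset

    def read(self, offset):
        hot_base = self.next_offset - len(self.hot_segment)
        if offset >= hot_base:
            return self.hot_segment[offset - hot_base]  # served from the hot cache
        return self.object_store[offset // SEGMENT_SIZE][offset % SEGMENT_SIZE]  # cold tier


log = TieredLog()
for i in range(10):
    log.append(f"event-{i}")
assert log.read(9) == "event-9"  # recent data: hot, replicated path
assert log.read(2) == "event-2"  # older data: fetched from object storage
```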

HTAS vs HTAP

HTAS and HTAP are in the same neighborhood but have different priorities regarding the unification of transactional and analytical workloads.

HTAP is a harder problem to solve because it aims to provide a single consistent data set to both transactional and analytical systems. That is a tough challenge and requires a single vertically integrated stack of storage and compute. It’s not an open system.

HTAS loosens its consistency model in exchange for openness. Analytics systems don’t usually need strict real-time consistency or isolation (such as linearizability and strict serializability). A consistency level with looser time constraints, such as sequential consistency or some form of “eventual consistency”, is usually enough. By giving up real-time consistency between the transactional and analytical estates, HTAS gains something arguably more important: open access for any compute you want to bring to it.

Final thoughts

Freight Clusters are a really important addition to Confluent’s product suite for sure, but what makes Freight and Tableflow such a game-changing pair of announcements is the kind of platform that emerges from the combination. As I said in my last blog post on competition in the age of object storage, many things that used to be companies are now features due to the commoditization of cloud data primitives. Tableflow alone is not a viable business, and neither are Freight Clusters. It is the platform that is the bedrock of the modern data infrastructure company. The platform is the single coherent interface with multiple capabilities that serves different market needs around cost, performance, and availability.

As the capabilities of the table formats grow, so too will the value of HTAS. Iceberg at this stage is still just a protocol for how to organize Parquet and metadata files to form a table abstraction. But in the future, I expect standards around indexing, table maintenance, and so on to emerge. Apache Hudi already has a lot of this. As these open table formats become more capable, the need for closed-source data lake technology will diminish. The headless data architecture of shared tables, shared indexes, and shared metadata will have totally commoditized lakehouse storage. The future battle will be among the compute engines and integrated transactional systems.

I don’t know if the term HTAS will survive beyond this blog post, but it’s been a useful term to describe the Confluent storage strategy. Object storage will continue to shake up the data infrastructure ecosystem, and infra companies will need to stay nimble in order to stay relevant.