Strategy and commentary

Hybrid Transactional/Analytical Storage

Confluent made two key feature announcements in the spring of 2024:

  • Freight Clusters, a new cluster type that writes directly to object storage. It is aimed at the “freight” of data streaming workloads: log ingestion, clickstreams, large-scale ETL and other high-volume workloads that can be cost-prohibitive on a low-latency, multi-AZ replication architecture in the cloud.

  • Tableflow, an automated feature that provides seamless materialization of Kafka topics as Apache Iceberg tables (and vice versa in the future).

This trend towards object storage is not just happening at Confluent but across the data ecosystem.

Tableflow: the stream/table, Kafka/Iceberg duality

Confluent just announced Tableflow, the seamless materialization of Apache Kafka topics as Apache Iceberg tables. It has to be the most impactful announcement I’ve witnessed while at Confluent. This post is about why Iceberg tables aren’t just another destination to sync data to; they fundamentally change the world of streaming. It’s also about the macro trends that have led us to this point and why Iceberg (and the other table formats) are so important to the future of streaming.

S3 Express One Zone, not quite what I hoped for

AWS just announced a new lower-latency S3 storage class, and for those of us in the data infrastructure business, this is big news. It’s no secret that a low-latency object storage primitive has the potential to change how we build cloud data systems forever. So has this new world arrived with S3 Express One Zone?

The answer is no, but this is a good time to talk about cloud object storage, its role in modern cloud data systems and the potential future role it can take.

On the future of cloud services and BYOC

My job at Confluent involves a mixture of research, engineering and helping us figure out the best technical strategy to follow. BYOC is something I’ve been thinking about recently so I decided to write down the thoughts I have on it and where I think cloud services are going in general.

Bring Your Own Cloud (BYOC) is a deployment model which sits somewhere between a SaaS cloud service and an on-premises deployment. The vendor deploys their software in a VPC in the customer account but manages most of the administration for the customer. It’s not a new idea: the term Managed Service Provider (MSP) has been around since the 90s and refers to the general practice of outsourcing the management and operations of IT infrastructure deployed within customer or third-party data centers.