September 2024 newsletter

Welcome to the September ClickHouse newsletter, which rounds up what’s
happened in real-time data warehouses over the last month. This month, we have
the much-awaited JSON data type, our first ClickHouse research paper, a Private
Preview of BYOC on AWS, better PyPI stats with Ibis, and more!

 

Inside this issue

 

Featured community member


beehiiv is a newsletter platform that helps creators, publishers, and
businesses build and grow their email audiences. They collect events capturing
every time an email is processed, every time it lands in an inbox, every time
it’s deferred, every time it’s bounced, every time you open it, every time you
click a link, and so on.

Eric has worked at beehiiv for just over a year and was responsible for moving data operations from Postgres to ClickHouse Cloud. There’s a user story on the work he and his team did, and he also presented at the New York meetup in the summer.

Eric previously worked as a Tech Lead at Arthur.ai, where he architected and built the company’s data ingestion pipeline, storage, and much of the backend infrastructure.

Follow Eric on LinkedIn

 

Upcoming events

Global events

Free training

Events in EMEA

Events in Asia Pacific

 

VLDB 2024: First ClickHouse research paper


It’s been almost a year in the making, and at the end of August, we presented
our first research paper at VLDB 2024. 

VLDB, the International Conference on Very Large Data Bases, is widely regarded
as one of the leading conferences in data management. VLDB generally has an
acceptance rate of ~20% among the hundreds of submissions.

The paper concisely describes the most interesting architectural and system
design components that make ClickHouse so fast. We’ve embedded the PDF of the
paper in the blog post linked below.

Read the blog post

 

How Reco leverages advanced analytics to detect sophisticated SaaS threats

Reco is a full-lifecycle SaaS security solution that uses ClickHouse as the
foundation of its advanced analytics system. Nir Barak explains how ClickHouse
gives them a holistic view of data across multiple layers and allows them to
detect outliers and anomalies.

Read the blog post

 

24.8 LTS release


The 24.8 release is here, and it has an exciting feature that I (and many of
you) have been waiting for – the new JSON data type! 

It’s in experimental mode, but that didn’t stop us from putting it through its
paces while exploring structured event data from football/soccer matches.

This release also introduces the TimeSeries table engine, which can store
Prometheus data, and a new Kafka table engine that supports exactly-once event
processing.

Read the release post

 

Better PyPI stats with Ibis, ClickHouse, and Shiny


ClickPy is a ClickHouse-backed application that analyzes downloads of Python
packages published on PyPI. In addition to using the front-end application, you
can also query the underlying data, which is exactly what Cody Peterson has
done.

Cody shows how to connect to ClickPy using Ibis and then explores the
seasonality of downloads of the clickhouse-connect package by day of the week
and month. The results are visualized using plot.ly, and Cody then puts
everything together into a Shiny application.
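
To give a feel for the kind of seasonality breakdown described above, here is a minimal sketch in plain Python. It does not use Ibis or connect to ClickPy; the rows, the download counts, and the helper name `downloads_by` are all invented for illustration, and Cody's post shows the real query against the real data.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical (date, download_count) rows. These numbers are made up
# for illustration and are NOT real ClickPy data.
SAMPLE_ROWS = [
    (date(2024, 8, 5), 120),   # Monday
    (date(2024, 8, 6), 130),   # Tuesday
    (date(2024, 8, 10), 40),   # Saturday
    (date(2024, 8, 12), 110),  # Monday
    (date(2024, 9, 1), 30),    # Sunday
    (date(2024, 9, 2), 125),   # Monday
]


def downloads_by(rows, key):
    """Sum download counts, grouped by key(day) for each row."""
    totals = defaultdict(int)
    for day, count in rows:
        totals[key(day)] += count
    return dict(totals)


# Group by day of the week (date.weekday() returns 0 for Monday).
by_weekday = downloads_by(SAMPLE_ROWS, lambda d: WEEKDAYS[d.weekday()])

# Group by calendar month, keyed as "YYYY-MM".
by_month = downloads_by(SAMPLE_ROWS, lambda d: f"{d.year}-{d.month:02d}")

print(by_weekday)
print(by_month)
```

With a real dataset, the same two group-bys would typically be pushed down to ClickHouse rather than run in Python, which is what a query layer such as Ibis is for.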

Read the blog post

 

ClickHouse Cloud: BYOC AWS in Private Preview


ClickHouse Cloud has been running for almost two years and supports all the
major cloud platforms: AWS, Azure, and GCP. So far, it’s been a SaaS offering
that runs entirely in ClickHouse’s cloud account, which made it a non-starter
for users with strict data residency and compliance requirements.

We’re therefore happy to announce the Private Preview release of Bring Your
Own Cloud (BYOC) on AWS. BYOC is a fully managed ClickHouse Cloud service
deployed to your AWS account.

The waiting list is now open, so be sure to sign up, and we’ll contact you to
set you up.

Join the waitlist

 

Quick reads

 

Post of the month

Our favorite post this month was by Michael Driscoll about the new JSON data
type:
