Welcome to the April ClickHouse newsletter, where we round up what’s been
happening in real-time data warehouses over the last month.
This month, we have the 24.3 release, a look at building a rate limiter, a
story of migrating from MySQL to ClickHouse, meetup videos, and more!
Inside this issue
- Featured community member
- Upcoming events
- 24.3 release
- Storing Continuous Profiling Data in ClickHouse
- Migrating to ClickHouse: Releem’s Journey
- How we Built a 19 PiB Logging Platform with ClickHouse and Saved Millions
- Building a Rate Limiter with ClickHouse
- Video corner
- ClickHouse Cloud Updates
- Post of the month
Featured community member
This month’s featured community member is Shivji Kumar Jha, a Staff Engineer
for Data Platforms at Nutanix.
Shiv leads a five-member team, managing and supporting Nutanix’s data
platform, which acts as a service for messaging, streaming, event sourcing,
analytics, and time series databases. Shiv actively engages with the
communities of the technologies used at Nutanix, including ClickHouse.
We recently hosted a ClickHouse meetup in Nutanix’s office in Bangalore,
India. Shiv was invaluable in making this event happen, helping organize it,
and acting as an MC for the evening. He recorded all the talks and
uploaded them to YouTube afterward. Shiv also participated in a follow-up Q&A session on 15th April to address unanswered questions from the meetup.
Thanks for all your work, Shiv, and we’ll see you at the next meetup!
Upcoming events
- Copenhagen Meetup – April 23rd
- FREE ClickHouse Training – April 24th & 25th
- AWS Summit London – April 24th
- v24.4 ClickHouse Community Call – April 30th
- Bengaluru Meetup – May 4th
- AWS Summit Berlin – May 15th
- Stockholm Meetup – May 22nd
- Dubai Meetup – May 28th
24.3 release
The big feature in the 24.3 release is the analyzer being enabled by default.
The analyzer is a new query analysis and optimization infrastructure that’s
been in the works for a couple of years. It lets you have multiple ARRAY JOIN
clauses in a query, treats tuple elements like columns, handles queries with
nested CTEs and sub-queries, and more.
Storing Continuous Profiling Data in ClickHouse
Coroot is an open-source tool for observability that turns observability data
into actionable insights. Nikolay Sivko wrote a blog post in which he
describes how they built their own storage system for profiling data based on
ClickHouse. After defining continuous profiling, Nikolay takes us through the
data model and gives examples of queries that check on the performance of a
service.
Migrating to ClickHouse: Releem’s Journey
Releem is a MySQL performance tuning tool that automatically detects
performance degradation and optimizes configuration files. To do this, they
collect metrics from hundreds of database servers across various operating
systems and cloud solutions.
They used to store these metrics in MySQL, which started to struggle once it
reached almost 5 billion records. Enter ClickHouse, which helped shrink the
database size by 20 times, cut aggregation query times from 45 minutes to 2 minutes,
and reduced the page load time of the Releem dashboard by 25%.
How we Built a 19 PiB Logging Platform with ClickHouse and Saved Millions
Rory Crispin, SRE at ClickHouse, shared his experience building a platform for the logging data generated by ClickHouse Cloud. Rory takes us through key design decisions, including whether to use Kafka and structured vs unstructured logging. He also explains why the team decided to use OpenTelemetry to collect metrics and does a cost comparison of the in-house solution vs using an off-the-shelf product like Datadog.
Building a Rate Limiter with ClickHouse
If you were going to build a rate limiter, the obvious choice for storing the
data would be Redis. But Brad Lhotsky, Systems and Security Administrator at
Craigslist, was curious whether ClickHouse would be fit for purpose and used
it to build a proof-of-concept. Brad shared the slides of a talk explaining
how he imported data from Kafka, built a bridge from the ACL API to
ClickHouse, and tested high availability, all in just one week.
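For readers new to the idea, a rate limiter counts recent events per client
and rejects requests once a threshold is reached. The sketch below is a
minimal in-memory sliding-window version in Python. It is not Brad’s
implementation, which stored its data in ClickHouse; the class name, the
limit, the window length, and the example IP address are all illustrative
assumptions.

```python
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most `limit` events per key within the last `window_seconds`.

    Hypothetical illustration only: events live in process memory here,
    whereas the talk described storing them in ClickHouse.
    """

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        # key -> timestamps of accepted events, oldest first
        self._events = defaultdict(deque)

    def allow(self, key: str, now: float) -> bool:
        events = self._events[key]
        # Drop timestamps that have fallen out of the window.
        while events and events[0] <= now - self.window_seconds:
            events.popleft()
        if len(events) >= self.limit:
            return False  # over the limit: reject and do not record
        events.append(now)
        return True


# Three requests per minute for one (made-up) client address.
limiter = SlidingWindowRateLimiter(limit=3, window_seconds=60.0)
decisions = [
    limiter.allow("203.0.113.7", now=t) for t in (0.0, 10.0, 20.0, 30.0, 61.0)
]
print(decisions)
```

The fourth request is rejected because three events already sit inside the
60-second window, and the fifth is accepted once the first event has aged out.
Timestamps are passed in explicitly so the behavior is easy to check by hand.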
Video corner
- At the New York City meetup, Adam Azzam presented how Prefect uses
  ClickHouse to enable real-time event-driven orchestration.
- Mark Needham walked us through some of the most common aggregate function
  combinators and showed how and why we might use them.
- At KubeCon Europe 2024, Manish Gill discussed the challenges of
  auto-scaling databases in Kubernetes, using ClickHouse Cloud as a case study.
ClickHouse Cloud Updates
- Over the last 9 months, we’ve been rebuilding the UI for ClickHouse Cloud,
  and last week we started rolling it out to everybody.
- Today, ClickPipes introduces beta support for continuous data ingestion
  from S3 and GCS. Let us know if you’re interested in giving this a try by
  replying to this email!
- Tokyo (ap-northeast-1) has been added as a new region for AWS. Sign up now.
Post of the month
Our favorite post this month was by Divyendu Singh about real-time
monitoring.