April 2024 Newsletter

Welcome to the April ClickHouse newsletter, where we round up what’s been
happening in real-time data warehouses over the last month.

This month, we have the 24.3 release, building a rate limiter, a story of
migrating from MySQL to ClickHouse, meetup videos, and more!

Featured community member

This month’s featured community member is Shivji Kumar Jha, a Staff Engineer
for Data Platforms at Nutanix.

Shiv leads a five-member team, managing and supporting Nutanix’s data
platform, which acts as a service for messaging, streaming, event sourcing,
analytics, and time series databases. Shiv actively engages with the
communities of the technologies used at Nutanix, including ClickHouse.

We recently hosted a ClickHouse meetup in Nutanix’s office in Bangalore,
India. Shiv was invaluable in making this event happen, helping organize it
and acting as MC for the evening. He recorded all the talks and uploaded
them to YouTube afterward. Shiv also participated in a follow-up Q&A session
on 15th April to address unanswered questions from the meetup.

Thanks for all your work, Shiv, and we’ll see you at the next meetup!

Follow Shivji on LinkedIn

 

Upcoming events

 

24.3 release

The big feature in the 24.3 release is that the analyzer is now enabled by
default. The analyzer is a new query analysis and optimization infrastructure
that’s been in the works for a couple of years. It lets you use multiple
ARRAY JOIN clauses in a query, treats tuple elements like columns, handles
queries with nested CTEs and subqueries, and more.

Read the release post

 

Storing Continuous Profiling Data in ClickHouse

Coroot is an open-source observability tool that turns observability data
into actionable insights. Nikolay Sivko wrote a blog post describing how the
team built its own storage system for profiling data on top of ClickHouse.
After defining continuous profiling, Nikolay takes us through the data model
and gives examples of queries that check on the performance of a service.

Read the blog post

 

Migrating to ClickHouse: Releem’s Journey

Releem is a MySQL performance tuning tool that automatically detects
performance degradation and optimizes configuration files. To do this, they
collect metrics from hundreds of database servers across various operating
systems and cloud solutions.

They used to store these metrics in MySQL, which started to struggle once it
reached almost 5 billion records. Enter ClickHouse, which helped shrink the
database size 20-fold, cut aggregation query times from 45 minutes to 2
minutes, and reduced the page load time of the Releem dashboard by 25%.

Read the blog post

 

How we Built a 19 PiB Logging Platform with ClickHouse and Saved Millions

Rory Crispin, SRE at ClickHouse, shared his experience building a platform
for the logging data generated by ClickHouse Cloud. Rory takes us through key
design decisions, including whether to use Kafka and structured vs.
unstructured logging. He also explains why the team decided to use
OpenTelemetry to collect metrics and does a cost comparison of the in-house
solution vs. using an off-the-shelf product like Datadog.

Read the blog post

 

Building a Rate Limiter with ClickHouse

If you were going to build a rate limiter, the obvious choice for storing the
data would be Redis. But Brad Lhotsky, Systems and Security Administrator at
Craigslist, was curious whether ClickHouse would be fit for purpose and used
it to build a proof of concept. Brad shared the slides of a talk explaining
how he imported data from Kafka, built a bridge from the ACL API to
ClickHouse, and tested high availability, all in just one week.

View the slide deck

 

Video corner

 

ClickHouse Cloud Updates

  • Over the last nine months, we’ve been rebuilding the UI for ClickHouse
    Cloud, and last week we started rolling it out to everybody.
  • Today, ClickPipes introduces beta support for continuous data ingestion
    from S3 and GCS. Let us know if you’re interested in giving this a try by
    replying to this email!
  • Tokyo (ap-northeast-1) has been added as a new region for AWS.
    Sign up now.

 

Post of the month

Our favorite post this month was by Divyendu Singh about real-time
monitoring.

See it here