Database Amigos: October 2025

Why This Matters

ClickHouse powers Brevo’s large-scale analytical workloads — handling billions of events per day.
While it excels at fast reads and inserts, deleting data in ClickHouse is a completely different story.

In this article, we’ll explore:

Why traditional deletes slow down ClickHouse
What’s happening under the hood
Real examples from our production cluster
How we made deletes 3× faster and safer using partition-based strategies

How ClickHouse Stores Data

Think of ClickHouse as a container ship warehouse.

ClickHouse Concept	Real-World Analogy
Table	Container ship
Partition	Container on deck
Part	Carton inside the container
Row	Individual item inside a carton

Each new insert adds more cartons (parts) inside the containers (partitions).
ClickHouse is designed to append data, not rewrite it — and that’s where delete challenges begin.

Inserts Are Fast (Append-Only)

ClickHouse shines with bulk inserts:

Append-only: Data is always added, never modified in place.
Columnar storage: Each column is stored separately for better compression.
Batch-friendly: Ideal for large inserts (thousands of rows per batch).
Immediate availability: Data becomes queryable as soon as it’s inserted.

Deletes Are Slow — They Trigger Mutations

When you delete in ClickHouse, it doesn’t remove rows directly.
It creates a mutation job, which rewrites the affected parts in the background.


ALTER TABLE events.email_transactional
DELETE WHERE organization_id = 140354
  AND email = 'heiditanyh@gmail.com';

Each mutation:

Rewrites entire parts (even if only one row changes)
Blocks merges on affected partitions
Consumes heavy CPU, disk I/O, and temporary space

Partitions, Merges, and Mutations — Visualized

Merge: Background Optimization

Merges continuously combine small parts into bigger ones — improving query performance.

Mutation: Costly Rewrites

Mutations, on the other hand, must open and repack containers — blocking merges until they finish.
Only one mutation per part runs at a time, so queue buildup is inevitable during heavy delete workloads.

The Problem: Cross-Partition Deletes

Here’s an example of a problematic query we once ran:


ALTER TABLE logs.automation_workflow 
DELETE 
WHERE organization_id = 2373101
  AND event_date >= '2020-01-20'
  AND event_date <= '2021-01-31';

Why It’s Bad

Affects 13+ partitions (202001–202101)
Rewrites 20K+ parts
Triggers massive merge backlog
Keeps mutation queues busy for hours

The Fix: Partition-Scoped Deletes

Instead of deleting across all partitions, we delete one partition at a time:


ALTER TABLE logs.automation_workflow
DELETE IN PARTITION '202012'
WHERE organization_id = 2373101
  AND event_date >= '2020-12-01'
  AND event_date < '2021-01-01';

We repeat this for each month.
This drastically reduces the number of parts under mutation at once.

One partition at a time keeps merges running, avoids massive lock windows, and isolates failure risk.

Real Result: 3× Faster Execution

Before: Single wide mutation → 90 mins, heavy cluster load
After: Monthly partition deletes → 30 mins total, smooth merges
CPU and I/O dropped by ~60%
No Keeper overload or replication lag

How Mutations Replicate Across Replicas

ClickHouse-Keeper acts as the control tower for all replicas:

The mutation is recorded as an instruction in Keeper.
Each replica picks it up and executes independently.
Finished replicas can fetch mutated parts instead of reprocessing.
Lagging replicas replay missed mutation IDs later.

Monitor replication status


SELECT database, table, mutation_id, is_done, parts_to_do
FROM system.mutations
WHERE database='logs' AND table='automation_workflow_local_replicated';

Root Cause Summary – Why Mutations Slow the Cluster

Delayed Execution: Mutations queued per table & replica; depend on background load.
Frozen Parts: Merges paused → part backlog grows.
Resource Contention: Each replica rewrites full parts; heavy CPU & disk I/O.

Together, they cause long mutation queues, merge starvation, and replication lag.

Real Queries for Monitoring


-- Check active mutations
SELECT * FROM system.mutations WHERE is_done = 0;

-- Detect merge backlog
SELECT table, count() AS merging_parts FROM system.merges GROUP BY table;

-- Check replication lag
SELECT table, queue_size, last_queue_update
FROM system.replication_queue
ORDER BY queue_size DESC;

Best Practices for Safe Deletes

Principle	Recommendation
Scope by Partition	Always delete within a partition (`DELETE IN PARTITION`)
Batch Deletes	Delete by org IDs in groups (e.g., 50–200 at a time)
Off-Peak Schedule	Run during low traffic windows
Monitor Progress	Watch `system.mutations` and `system.merges`
Optimize Afterwards	`OPTIMIZE TABLE ... PARTITION ... FINAL` if needed

Key Takeaways

ClickHouse is append-optimized, not delete-friendly.
Mutations rewrite entire parts → slow and costly.
Cross-partition deletes cripple merges and replication.
Partition-scoped deletes = 3× faster, safer, cleaner.
Always monitor system.mutations and plan deletes during off-peak hours.

Tuesday, October 14, 2025