Compacted Topic

KSN and Apache Kafka both support topic compaction, a key-based data retention mechanism. Unlike segment-based retention, which removes old segments based on time or size, key-based retention keeps the latest value for each key. However, compaction behaves slightly differently in KSN and Apache Kafka because KSN topic compaction is built on Apache Pulsar.

  • Apache Kafka supports a combined delete+compact retention policy (cleanup.policy=compact,delete). This policy also applies segment-based cleanup, so a record can be removed once it exceeds the retention time or size (retention.ms, 7 days by default in Apache Kafka) even if it holds the latest value for its key. KSN can't remove compacted keys by retention time or size; a key is removed only when a tombstone (null value) message is written to the topic.
  • The topic configurations max.compaction.lag.ms and min.compaction.lag.ms are not supported.
  • KSN topic compaction does not currently work with transactions; support is planned for a future release.
  • In Kafka, a tombstone (a key with a null value) is kept for a while (set by delete.retention.ms), but in Pulsar the tombstone is removed immediately.
  • KSN supports manually triggering compaction; Kafka does not.
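
The key-based retention described above can be illustrated with a small, self-contained simulation. This is only a conceptual sketch, not the KSN, Pulsar, or Kafka implementation: the `compact` function and the sample keys are hypothetical, a record is modeled as a (key, value) pair, and a value of `None` stands in for a tombstone. Tombstones are dropped immediately, matching the Pulsar behavior described in the list.

```python
from typing import Dict, List, Optional, Tuple

# A record is a (key, value) pair; a value of None models a tombstone.
Record = Tuple[str, Optional[str]]


def compact(log: List[Record]) -> List[Record]:
    """Keep the latest record per key and drop keys whose latest value is a tombstone."""
    latest: Dict[str, Tuple[int, Optional[str]]] = {}
    for offset, (key, value) in enumerate(log):
        # A later offset always overwrites an earlier one for the same key.
        latest[key] = (offset, value)

    # Drop tombstoned keys, then restore the original offset order.
    survivors = [
        (offset, key, value)
        for key, (offset, value) in latest.items()
        if value is not None
    ]
    survivors.sort()
    return [(key, value) for _, key, value in survivors]


log = [
    ("user-1", "a"),
    ("user-2", "b"),
    ("user-1", "c"),   # supersedes ("user-1", "a")
    ("user-2", None),  # tombstone: removes user-2 entirely
    ("user-3", "d"),
]
compacted = compact(log)
print(compacted)
```

In this model, `user-1` keeps only its latest value, `user-2` disappears because its latest record is a tombstone, and `user-3` is untouched.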

Use Compacted Topic

Broker Configuration

You must configure the following in KSN; these are already enabled by default on SN Cloud.

exposingBrokerEntryMetadataToClientEnabled=true
compactionServiceFactoryClassName=io.streamnative.pulsar.compaction.SNCompactionServiceFactory

Create Compacted Topic

To create a compacted topic, follow the CLI Tools tutorial and run the following command:

./bin/kafka-topics.sh --create --bootstrap-server <YOUR-BOOTSTRAP-SERVER-ADDRESS> --replication-factor 1 --partitions 1 --topic my_compact_topic --config "cleanup.policy=compact"

Configure Compaction Policy

You can change the compaction threshold policy to control how often compaction is triggered. The threshold (100MB by default) specifies how large the topic backlog can grow before compaction is triggered. Alternatively, you can trigger compaction manually using the Pulsar admin API. For more information, see the Topic Compaction Cookbook.
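
As a rough mental model of the threshold, the check can be sketched as a simple comparison between the topic backlog and the configured limit. The names below are hypothetical, the sketch assumes that 100MB means 100 * 1024 * 1024 bytes, and it assumes a strict greater-than comparison; the actual broker logic may differ.

```python
# Assumption: the 100MB default is interpreted as 100 MiB.
DEFAULT_THRESHOLD_BYTES = 100 * 1024 * 1024


def should_trigger_compaction(
    backlog_bytes: int, threshold_bytes: int = DEFAULT_THRESHOLD_BYTES
) -> bool:
    """Simplified model: compaction is due once the backlog exceeds the threshold."""
    return backlog_bytes > threshold_bytes


# A 10 MiB backlog stays below the default threshold; a 150 MiB backlog exceeds it.
small_backlog = should_trigger_compaction(10 * 1024 * 1024)
large_backlog = should_trigger_compaction(150 * 1024 * 1024)
```

A lower threshold makes compaction run more often at the cost of more background work; a higher threshold lets more superseded records accumulate between runs.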
