Compacted Topic
KSN and Apache Kafka both support topic compaction, a key-based data retention mechanism. Unlike segment-based retention, which removes old segments based on time or size, key-based retention keeps the latest value for each key. However, compaction behaves slightly differently in KSN than in Apache Kafka, because the KSN topic compaction mechanism is based on Apache Pulsar:
- Apache Kafka supports a delete+compact retention policy, which applies the segment-based data cleanup policy (1 day by default) on top of compaction, so a record can be removed even if it holds the latest value for its key. KSN can't remove compacted keys by retention time or size; a key is removed only when a tombstone (a message with a null value) is written to the topic.
- The topic configs max.compaction.lag.ms and min.compaction.lag.ms are not supported.
- KSN topic compaction does not currently work with transactions; support is planned for a future release.
- In Kafka, a tombstone (a key with a null value) is kept for a while (set by delete.retention.ms), but in Pulsar the tombstone is removed immediately.
- KSN supports manually triggering compaction; Kafka does not.
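The key-based retention rule described above can be illustrated with a small stand-alone sketch. It is only a model of the semantics, not how KSN or Kafka implement compaction: the log is a hypothetical text stream with one tab-separated key and value per line (oldest first), and the literal string NULL stands in for a tombstone's null value. The function name compact_log is made up for this example.

```shell
# Model of key-based compaction: keep the latest value per key and drop any
# key whose latest record is a tombstone (represented here by the text NULL).
compact_log() {
  awk -F'\t' '
    {
      if (!($1 in seen)) { seen[$1] = 1; order[++n] = $1 }  # remember first-seen order
      latest[$1] = $2                                        # later records overwrite earlier ones
    }
    END {
      for (i = 1; i <= n; i++) {
        k = order[i]
        if (latest[k] != "NULL") print k "\t" latest[k]      # tombstoned keys disappear
      }
    }
  '
}

# Key "a" is updated twice, key "b" is deleted by a tombstone, key "c" is written once.
printf 'a\t1\nb\t2\na\t3\nb\tNULL\nc\t4\n' | compact_log
# prints:
# a	3
# c	4
```

The sketch drops the tombstoned key immediately, which matches the Pulsar-style behavior in the list above; it does not model Kafka's delete.retention.ms window or the delete+compact policy.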
Use Compacted Topic
Configuration
Broker Configuration
You must configure the following in KSN; these are already enabled by default on SN Cloud.
exposingBrokerEntryMetadataToClientEnabled=true
compactionServiceFactoryClassName=io.streamnative.pulsar.compaction.SNCompactionServiceFactory
Create Compacted Topic
To create a compacted topic, follow the CLI Tools tutorial and run the following command:
./bin/kafka-topics.sh --create --bootstrap-server <YOUR-BOOTSTRAP-SERVER-ADDRESS> --replication-factor 1 --partitions 1 --topic my_compact_topic --config "cleanup.policy=compact"
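To see compaction in action, you can write several values for the same key and read the topic back. The following is a minimal sketch using the standard Kafka console tools. The variable names, the localhost:9092 default, and the helper function names are assumptions for illustration and are not part of KSN.

```shell
# Assumptions: the Kafka CLI tools live under ./bin and BOOTSTRAP holds your
# bootstrap server address. Override the variables to match your environment.
KAFKA_BIN="${KAFKA_BIN:-./bin}"
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"
KAFKA_TOPIC="${KAFKA_TOPIC:-my_compact_topic}"

# Send "key:value" lines read from stdin as keyed records.
produce_keyed() {
  "$KAFKA_BIN/kafka-console-producer.sh" --bootstrap-server "$BOOTSTRAP" \
    --topic "$KAFKA_TOPIC" --property parse.key=true --property key.separator=:
}

# Read the topic from the beginning, printing each record's key and value.
consume_keyed() {
  "$KAFKA_BIN/kafka-console-consumer.sh" --bootstrap-server "$BOOTSTRAP" \
    --topic "$KAFKA_TOPIC" --from-beginning --property print.key=true
}

# Example usage (commented out so that sourcing this file has no side effects):
# printf 'user1:red\nuser2:green\nuser1:blue\n' | produce_keyed
# consume_keyed
```

Once compaction has run, a fresh read from the beginning should return only the latest value for each key (user1 with blue, user2 with green in this example). Before compaction runs, all three records are still visible.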
Configure Compaction Policy
You can change the compaction-threshold policy to control how often compaction is triggered. The threshold specifies how large the topic backlog can grow before compaction is triggered (100 MB by default). You can also trigger compaction manually using the Pulsar admin API. For more information, see Topic Compaction Cookbook.
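As a rough sketch, both options can be driven from the pulsar-admin CLI. The commands below assume that pulsar-admin is installed and already configured for your cluster, and that the Kafka topic my_compact_topic maps to the Pulsar topic persistent://public/default/my_compact_topic. The tenant and namespace, the variable names, and the helper function names are assumptions, so check the mapping used by your deployment.

```shell
# Assumptions: pulsar-admin is on PATH and points at your cluster; the
# namespace and topic names below are placeholders for your own.
PULSAR_ADMIN="${PULSAR_ADMIN:-pulsar-admin}"
PULSAR_NAMESPACE="${PULSAR_NAMESPACE:-public/default}"
PULSAR_TOPIC="${PULSAR_TOPIC:-persistent://public/default/my_compact_topic}"

# Set the backlog size at which compaction is triggered automatically
# for every topic in the namespace (defaults to 100M if no size is given).
set_compaction_threshold() {
  "$PULSAR_ADMIN" namespaces set-compaction-threshold \
    --threshold "${1:-100M}" "$PULSAR_NAMESPACE"
}

# Trigger compaction manually for one topic.
trigger_compaction() {
  "$PULSAR_ADMIN" topics compact "$PULSAR_TOPIC"
}

# Check whether the last compaction run is still in progress or has finished.
compaction_status() {
  "$PULSAR_ADMIN" topics compaction-status "$PULSAR_TOPIC"
}

# Example usage (commented out so that sourcing this file has no side effects):
# set_compaction_threshold 200M
# trigger_compaction && compaction_status
```

Depending on the Pulsar version, a partitioned topic may need to be compacted one partition at a time, for example by passing a name ending in -partition-0. Treat this as something to verify against your cluster.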