KSN and Apache Kafka both support topic compaction, a key-based data retention mechanism. Unlike segment-based retention, which removes old segments based on time or size, key-based retention keeps the latest value for each key. However, topic compaction in KSN differs from Apache Kafka in a few subtle ways, because the KSN compaction mechanism is built on Apache Pulsar:
- Apache Kafka supports a delete+compact retention policy, which also applies the segment-based data cleanup policy (1 day by default) and can therefore remove an old record even if it holds the latest value for its key. KSN cannot remove compacted keys by retention time or size; a key is removed only when a tombstone (null value) message is written to the topic.
- The Topic config
- KSN topic compaction does not currently work with transactions; support is planned for the future.
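
The key-based retention and tombstone behavior described above can be illustrated with a small local simulation. This is only a sketch of the semantics, not how KSN or Kafka implements compaction: the `compact_log` function, the `key:value` line format, and the `NULL` marker standing in for a null value are all assumptions made for this example, and the sketch drops tombstones immediately rather than retaining them for any period.

```shell
#!/bin/sh
# Sketch of key-based retention: for every key, keep only its latest record.
# Input: one "key:value" record per line, oldest first.
# "NULL" is a hypothetical stand-in for a tombstone (null value) in this sketch.
compact_log() {
  awk -F: '
    {
      key = $1
      keys[NR] = key
      vals[NR] = substr($0, length(key) + 2)  # everything after the first ":"
      last[key] = NR                          # position of the newest record for this key
    }
    END {
      for (i = 1; i <= NR; i++) {
        # Keep a record only if it is the newest one for its key
        # and it is not a tombstone.
        if (last[keys[i]] == i && vals[i] != "NULL") {
          print keys[i] ":" vals[i]
        }
      }
    }
  '
}

# k1 is overwritten, k2 is removed by a tombstone, k3 is untouched.
printf '%s\n' 'k1:a' 'k2:b' 'k1:c' 'k2:NULL' 'k3:d' | compact_log
# prints:
# k1:c
# k3:d
```

In this model, `k2` disappears only because a tombstone was written for it, which mirrors the KSN behavior: a key whose latest value is never followed by a tombstone stays in the compacted topic regardless of its age.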
Use a Compacted Topic
To create a compacted topic, follow the CLI Tools tutorial and run the following command:
./bin/kafka-topics.sh --create --bootstrap-server <YOUR-BOOTSTRAP-SERVER-ADDRESS> --replication-factor 1 --partitions 1 --topic my_compact_topic --config "cleanup.policy=compact"