`kafka` and `pulsar`. Each format has distinct characteristics:
- `kafka` format: This provides the best performance; however, a Pulsar consumer cannot consume messages in this format unless a payload processor is employed.
- `pulsar` format: This is the default data entry format on StreamNative Cloud. It supports interoperability between Kafka and Pulsar clients: Kafka client to Kafka client, Pulsar client to Kafka client, and vice versa. This means data is interoperable between Apache Kafka and Apache Pulsar. It is suitable for most scenarios where performance is not a critical consideration.
- `pulsar_non_batched` format: This is similar to the `pulsar` entry format. The difference is that it encodes Kafka batch messages as non-batched messages so that Pulsar clients can consume them with a key-shared subscription. For Ursa, the behavior is the same as the `pulsar` format.

The entry format of a topic can be set with `bin/pulsar-admin topics update-properties`. The configuration key is `kafkaEntryFormat`, and the possible values are `kafka` or `pulsar`. The default value is `pulsar` if not specified.
For the `kafka` entry format, Kafka clients can produce and consume messages directly. However, Pulsar producers SHOULD NOT produce messages to these topics because they are unable to encode messages into a format consumable by Kafka clients.
However, since version `2.9` of the Pulsar client, a message payload processor is available for Pulsar consumers. This means that messages produced by Kafka producers can be consumed and decoded by Pulsar consumers.
The `pulsar` format allows messages to be freely produced and consumed between Kafka and Pulsar clients. The broker handles the message format conversion automatically, so you can use either Kafka clients or Pulsar clients.
`pulsar` format

`Key_Shared` subscriptions

Message keys and header values are `byte[]` in Kafka, while in Pulsar, the types of key and property value are both `String`. For keys, each key is converted to a base64-encoded string that is used as the Pulsar key. See the following example:
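A minimal sketch of this key conversion, using only the JDK. The class and helper names (`KafkaKeyToPulsarKey`, `toPulsarKey`, `toKafkaKey`) and the sample key are hypothetical and only illustrate the base64 mapping described above; they are not the broker's actual implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class KafkaKeyToPulsarKey {
    // Hypothetical helper: maps a Kafka byte[] key to a base64 string key,
    // mirroring the conversion described in the text.
    static String toPulsarKey(byte[] kafkaKey) {
        return Base64.getEncoder().encodeToString(kafkaKey);
    }

    // Hypothetical helper: recovers the original Kafka key bytes
    // from the base64 string key.
    static byte[] toKafkaKey(String pulsarKey) {
        return Base64.getDecoder().decode(pulsarKey);
    }

    public static void main(String[] args) {
        // Sample key chosen for illustration only.
        byte[] kafkaKey = "user-42".getBytes(StandardCharsets.UTF_8);

        String pulsarKey = toPulsarKey(kafkaKey);
        System.out.println(pulsarKey); // the base64 form, not the text "user-42"

        // Decoding the base64 string yields the original key bytes.
        System.out.println(new String(toKafkaKey(pulsarKey), StandardCharsets.UTF_8));
    }
}
```

The point of the sketch is that the string form of the key is the base64 text, so a consumer comparing it against the original key text would not find a match without decoding first.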
Use `getKeyBytes()` or `getOrderingKey()` to retrieve the original keys of Kafka messages. Counterintuitively, the `getKey()` method returns the base64-encoded string. This is by design, because a byte array can change after a `bytes -> UTF-8 string -> bytes` conversion, for example:
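The effect can be reproduced with the JDK alone. The sketch below is illustrative: the class name `Utf8RoundTripSketch`, its helpers, and the sample bytes are assumptions, chosen so that the input is not valid UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class Utf8RoundTripSketch {
    // bytes -> UTF-8 string -> bytes. Malformed sequences are replaced
    // with U+FFFD during decoding, so the result can differ from the input.
    static byte[] viaUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    // bytes -> base64 string -> bytes. This round trip is lossless
    // for any byte array.
    static byte[] viaBase64(byte[] raw) {
        return Base64.getDecoder().decode(Base64.getEncoder().encodeToString(raw));
    }

    public static void main(String[] args) {
        // 0xFF and 0xFE never appear in valid UTF-8.
        byte[] raw = {(byte) 0xFF, (byte) 0xFE, 0x01};

        System.out.println(Arrays.equals(raw, viaUtf8(raw)));   // false
        System.out.println(Arrays.equals(raw, viaBase64(raw))); // true
    }
}
```

Because a Kafka key can be an arbitrary byte array, only the base64 route is guaranteed to preserve it when the key has to be carried as a string.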
`__ksn_internal_header_format` will be received by the Pulsar consumer if a header value consists of the bytes of a base64-encoded string.
`kafkaEntryFormat=kafka` property.

`pulsar_non_batched` format
Starting from version `3.3.5.1` or `4.0.1.1`, you can use the `pulsar-admin` CLI to get the producer and consumer stats of a KSN topic.
`__ksn_internal_subscription` as the subscription name.

The producer's name is `{clusterName}-{generatorInstanceId}-{counter}`, and the consumer's name is `__KSN__internal_consumer_{remoteAddress}`. The subscription name is `__ksn_internal_subscription`.
The internal producer and consumer will not send or consume messages, nor will they acknowledge messages or affect message retention.
You can use the following command to get the stats: