Optimize Kafka Client for Latency

StreamNative Cloud supports two data streaming engines: Classic Engine and Ursa Engine. The Classic Engine uses BookKeeper for storage, providing lower latency (under 100 ms, typically single-digit milliseconds) at a higher cost. The Ursa Engine uses object storage, which reduces cost but raises latency (sub-second, typically 200-500 ms). Choose the engine that matches your latency requirements.

Note

The following latency recommendations apply to Classic Engine clusters and are not recommended for Ursa Engine clusters. Even with client-side optimizations, overall latency on Ursa Engine remains in the sub-second range because of broker-side batching.

Beyond the choice of data streaming engine, the rest of this guide gives general client-side recommendations for reducing latency.

Data Format

StreamNative can store data in two formats, kafka and pulsar, which offer different levels of interoperability between the Kafka and Pulsar protocols. Each format has different performance characteristics:

kafka format: The default format for StreamNative Cloud. It provides the best performance with Kafka clients. However, Pulsar consumers cannot consume this format unless a payload processor is employed.
pulsar format: Provides the highest interoperability between protocols. However, it incurs a performance penalty as it requires transforming data from Kafka producers into the Pulsar format before storage.

If you want to achieve the lowest latency with Kafka clients and don't need Pulsar clients to read the data, consider configuring your cluster to store data in the kafka format.

Batching Messages

Producers automatically batch messages, collecting multiple messages to send together. To minimize latency when producing data to StreamNative Cloud, reduce the time the producer spends waiting for batches to fill, which is controlled by the linger.ms parameter. With linger.ms=0, the producer sends data as soon as it is available. Batching is still enabled, because messages are always sent in batches, but a batch may contain only one message (unless messages arrive faster than the producer can send them). The Java producer has historically defaulted to linger.ms=0; Apache Kafka 4.0 raised that default to 5, so set linger.ms=0 explicitly if you rely on it.
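To make the effect of linger.ms concrete, here is a minimal sketch of a toy batching model. It is not how any Kafka client is implemented: the function name `group_into_batches` and the arrival times are made up for illustration, and the model ignores batch.size, in-flight requests, and network time.

```python
def group_into_batches(arrival_ms, linger_ms):
    """Group message arrival times (in ms) into batches under a toy linger model.

    A batch opens when its first message arrives and closes linger_ms later.
    Messages that arrive strictly inside that window join the open batch.
    """
    batches = []
    window_end = None
    for t in sorted(arrival_ms):
        if batches and t < window_end:
            batches[-1].append(t)
        else:
            # Start a new batch and remember when its linger window closes.
            batches.append([t])
            window_end = t + linger_ms
    return batches


arrivals = [0, 1, 2, 10, 11]  # hypothetical arrival times in milliseconds
print(group_into_batches(arrivals, linger_ms=0))  # every message in its own batch
print(group_into_batches(arrivals, linger_ms=5))  # fewer, larger batches
```

In this model, a larger linger.ms produces fewer requests, but the first message of each batch waits the full linger time before it is sent.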

Compression

Consider whether you need compression. Enabling compression costs additional CPU cycles but reduces network bandwidth usage. Disabling it (compression.type=none) spares CPU cycles but increases network bandwidth usage. A good compression codec may reduce latency by shortening network transfer time, but the CPU overhead of compressing can offset that gain. Evaluate your specific use case: if CPU is your bottleneck, consider disabling compression; if network bandwidth is constrained, compression may reduce overall latency.
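One rough way to evaluate this trade-off is to measure compression ratio and CPU time on a payload that resembles your own data. The sketch below uses Python's standard `gzip` module as a stand-in for compression.type=gzip. The payload and the helper name `measure_gzip` are assumptions for illustration, and results for other codecs or for real client libraries will differ.

```python
import gzip
import time


def measure_gzip(payload: bytes, rounds: int = 50):
    """Return (compressed size / original size, average seconds per compress call)."""
    start = time.perf_counter()
    for _ in range(rounds):
        compressed = gzip.compress(payload)
    elapsed = (time.perf_counter() - start) / rounds
    return len(compressed) / len(payload), elapsed


# Hypothetical batch of 500 small JSON records; substitute a sample of real data.
sample_batch = b'{"sensor": "s-01", "reading": 21.5}\n' * 500
ratio, seconds = measure_gzip(sample_batch)
print(f"compressed to {ratio:.1%} of original, {seconds * 1e3:.3f} ms per batch")
```

If the time spent compressing a batch is larger than the network time saved by sending fewer bytes, compression is likely to hurt latency for that workload.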

Producer Acknowledgements

You can tune the number of acknowledgments the producer requires from the designated broker in StreamNative Cloud before considering a request complete.

Note

This producer acknowledgment is separate from when a message is considered durably committed to storage.

The sooner the designated broker responds, the sooner the producer can send the next batch of messages, reducing producer latency. You can configure this using the acks parameter:

acks=0: The producer does not wait for any acknowledgment. This gives the lowest latency but no durability guarantees.
acks=1: The producer waits for the designated broker to acknowledge, which the broker does after it receives at least one acknowledgment from storage.
acks=all: The producer waits for the designated broker to acknowledge, which the broker does after it receives all acknowledgments from storage.

The default, acks=all, provides the strongest durability guarantees but the highest latency. Setting acks=1 lowers latency while still waiting for one acknowledgment from storage. Latency-sensitive applications that can tolerate message loss can set acks=0, but be aware that messages may then be lost silently if a broker fails.
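The choice between the three settings can be written down as a small decision helper. This is only a sketch of the reasoning above: `choose_acks` and its two parameters are hypothetical names, not part of any Kafka client API.

```python
def choose_acks(tolerates_loss: bool, needs_all_storage_acks: bool) -> str:
    """Pick an acks value from two hypothetical application requirements."""
    if needs_all_storage_acks:
        return "all"  # strongest durability, highest latency
    if tolerates_loss:
        return "0"    # lowest latency, messages may be lost silently
    return "1"        # waits for one acknowledgment from storage


# A latency-sensitive application that still cannot lose messages silently.
producer_props = {"acks": choose_acks(tolerates_loss=False, needs_all_storage_acks=False)}
print(producer_props)
```

The resulting dictionary uses the Kafka property name `acks`, so it can be merged into whatever configuration object your client library expects.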

Consumer Fetching

Similar to producer batching, you can tune consumers for lower latency by adjusting how much data a consumer gets from each fetch from the designated broker in StreamNative Cloud. The consumer configuration parameter fetch.min.bytes defaults to 1, which means a fetch request is answered as soon as a single byte of data is available or the fetch request times out (controlled by fetch.max.wait.ms). Together, these two parameters control the minimum amount of data returned by a fetch (fetch.min.bytes) and how long the broker waits for that data (fetch.max.wait.ms).

For lowest latency, keep fetch.min.bytes at its default of 1 and reduce fetch.max.wait.ms from its default of 500ms. This ensures consumers receive data as soon as it's available, though at the cost of potentially more frequent fetch requests.
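The interaction between the two parameters can be illustrated with a simplified model of when a broker answers a fetch. This is a sketch under stated assumptions, not broker code: the function `fetch_response_time_ms` and the arrival data are made up, and times are measured from the moment the fetch request reaches the broker.

```python
def fetch_response_time_ms(arrivals, fetch_min_bytes=1, fetch_max_wait_ms=500):
    """Return when a fetch is answered in a toy model.

    arrivals is a list of (time_ms, num_bytes) pairs describing data that
    becomes available after the fetch request arrives. The fetch is answered
    once fetch_min_bytes have accumulated, or at fetch_max_wait_ms otherwise.
    """
    total = 0
    for time_ms, num_bytes in sorted(arrivals):
        if time_ms > fetch_max_wait_ms:
            break  # the wait timeout fires before this data shows up
        total += num_bytes
        if total >= fetch_min_bytes:
            return time_ms
    return fetch_max_wait_ms


arrivals = [(20, 100), (300, 5000)]  # hypothetical: 100 bytes at 20 ms, 5000 bytes at 300 ms
print(fetch_response_time_ms(arrivals, fetch_min_bytes=1))
print(fetch_response_time_ms(arrivals, fetch_min_bytes=1024))
print(fetch_response_time_ms(arrivals, fetch_min_bytes=1024, fetch_max_wait_ms=100))
```

In this model, fetch.min.bytes=1 returns the first record as soon as it arrives, while a larger fetch.min.bytes holds the response until enough data accumulates or fetch.max.wait.ms expires.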

Summary

Here's a summary of key configurations for optimizing latency:

Producer Configurations

Configuration	Recommended Value	Default Value	Description
`linger.ms`	0	0 (5 in Apache Kafka 4.0 and later Java clients)	Time to wait for batches to fill, in milliseconds
`compression.type`	`none`	`none`	Compression codec to use
`acks`	`1`	`all`	Number of acknowledgments required

Consumer Configurations

Configuration	Recommended Value	Default Value	Description
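`fetch.min.bytes`	1	1	Minimum data size for fetch responses, in bytes
`fetch.max.wait.ms`	Below 500	500	Maximum time the broker waits before answering a fetch, in milliseconds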
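As a starting point, the recommended values can be collected into plain key-value settings. The sketch below renders them as Java-style properties text. The helper `to_properties` and the constant names are illustrative, and some non-Java clients name or default these settings differently, so check your client's documentation before reusing the keys. fetch.max.wait.ms is left out because this guide gives no specific value for it.

```python
# Recommended producer and consumer values from this guide.
PRODUCER_PROPS = {
    "linger.ms": 0,
    "compression.type": "none",
    "acks": "1",
}
CONSUMER_PROPS = {
    "fetch.min.bytes": 1,
}


def to_properties(props: dict) -> str:
    """Render a settings dictionary as key=value lines."""
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


print(to_properties(PRODUCER_PROPS))
print(to_properties(CONSUMER_PROPS))
```

Connection settings such as bootstrap servers and authentication are omitted and must be added for a working client.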