Optimize Kafka Client for Latency

StreamNative Cloud supports two data streaming engines: Classic Engine and Ursa Engine. The Classic Engine uses BookKeeper for storage, providing lower latency (less than 100ms, typically single-digit milliseconds) at a higher cost. The Ursa Engine uses object storage, which reduces cost but raises latency (sub-second, typically in the range of 200-500ms). Choose the engine that matches your latency requirements.

Note

The latency optimizations in this guide apply to Classic Engine clusters. They are not recommended for Ursa Engine clusters: even with client-side tuning, overall latency on Ursa Engine remains in the sub-second range because of broker-side batching.

Beyond the choice of data streaming engine, the rest of this guide provides general recommendations for optimizing latency.

Data Format

StreamNative supports three storage formats (kafka, mixed_kafka, and pulsar) that offer varying levels of interoperability between protocols. Each format has different performance characteristics:

  • kafka format: The default format for StreamNative Cloud. It provides the best performance with Kafka clients. However, Pulsar consumers cannot consume this format unless a payload processor is employed.
  • mixed_kafka format: Functions similarly to the kafka format and supports some non-official Kafka clients for encoding or decoding Kafka messages. Offers moderate performance.
  • pulsar format: Provides the highest interoperability between protocols. However, it incurs a performance penalty as it requires transforming data from Kafka producers into the Pulsar format before storage.

If you want to achieve the lowest latency with Kafka clients and don't need Pulsar clients to read the data, consider configuring your cluster to store data in the kafka format.

Batching Messages

Producers automatically batch messages, collecting multiple messages to send together. To minimize latency when producing data to StreamNative Cloud, reduce the time spent waiting for batches to fill. By default, the producer is optimized for low latency: linger.ms is set to 0, so the producer sends data as soon as it is available. Batching is still enabled with linger.ms=0, because messages are always sent in batches, but a batch may contain only one message unless messages arrive faster than the producer can send them.
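To make the effect of linger.ms concrete, the following is a rough model, not a measurement, of the longest time a record can sit in the producer before its batch is sent. It assumes a batch is sent when linger.ms elapses or when the batch reaches batch.size (a standard Kafka producer setting that this guide does not otherwise cover), whichever comes first. The function name and the traffic figures are illustrative assumptions.

```python
def max_batching_delay_ms(linger_ms: float, batch_size_bytes: int,
                          record_bytes: int, records_per_sec: float) -> float:
    """Upper bound on how long the first record of a batch waits in the producer.

    Simplified model: the batch is sent when linger.ms elapses or when it
    fills to batch.size, whichever happens first. Network and broker time
    are ignored.
    """
    if linger_ms <= 0:
        return 0.0  # linger.ms=0: no deliberate wait for more records
    # Number of records needed to fill the batch, and how long they take to arrive.
    records_to_fill = max(1, batch_size_bytes // record_bytes)
    fill_time_ms = records_to_fill / records_per_sec * 1000.0
    return min(float(linger_ms), fill_time_ms)


# 16384 bytes is Kafka's default batch.size; 1 KiB records at 100 records/s are assumed.
print(max_batching_delay_ms(0, 16384, 1024, 100))  # 0.0
print(max_batching_delay_ms(5, 16384, 1024, 100))  # 5.0
```

At low message rates the delay equals linger.ms, which is why keeping it at 0 matters most for lightly loaded, latency-sensitive producers.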

Compression

Consider whether you need to enable compression. Enabling compression requires additional CPU cycles but reduces network bandwidth usage. Disabling compression (setting compression.type=none) spares CPU cycles but increases network bandwidth usage. A good compression codec can reduce latency by shortening network transfer time, but the CPU overhead of compression can offset those gains. Evaluate your specific use case: if CPU is your bottleneck, consider disabling compression; if network bandwidth is constrained, compression may help reduce overall latency.
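One way to evaluate this trade-off is to time compression on a representative batch and compare it with the transfer time it saves. The sketch below does that under simplifying assumptions: zlib from the Python standard library stands in for a real Kafka codec, the bandwidth figure is supplied by the caller, and decompression time on the consumer is ignored. The function name and the sample payload are hypothetical.

```python
import time
import zlib


def compression_tradeoff(payload: bytes, bandwidth_bytes_per_sec: float) -> dict:
    """Estimate send time for one batch with and without compression."""
    start = time.perf_counter()
    compressed = zlib.compress(payload)
    cpu_sec = time.perf_counter() - start  # time spent compressing

    return {
        "raw_bytes": len(payload),
        "compressed_bytes": len(compressed),
        # Without compression: transfer time only.
        "uncompressed_total_sec": len(payload) / bandwidth_bytes_per_sec,
        # With compression: compression time plus transfer of the smaller payload.
        "compressed_total_sec": cpu_sec + len(compressed) / bandwidth_bytes_per_sec,
    }


# Assumed sample: repetitive JSON-like records over a 1 MB/s link.
batch = b'{"user": "alice", "event": "click"}' * 1000
print(compression_tradeoff(batch, 1_000_000.0))
```

If compressed_total_sec comes out lower than uncompressed_total_sec for your data and link, compression is likely to help; real results depend on the codec, batch size, and hardware.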

Producer Acknowledgements

You can tune the number of acknowledgments the producer requires from the designated broker in StreamNative Cloud before considering a request complete.

Note

This producer acknowledgment is separate from when a message is considered durably committed to storage.

The sooner the designated broker responds, the sooner the producer can send the next batch of messages, reducing producer latency. You can configure this using the acks parameter:

  • acks=0: Producer doesn't wait for any acknowledgment, providing lowest latency but no durability guarantees
  • acks=1: Producer waits for acknowledgment from the designated broker after receiving at least one acknowledgment from storage
  • acks=all: Producer waits for acknowledgment from the designated broker after receiving all acknowledgments from storage

The default, acks=all, provides the strongest durability guarantees but the highest latency. Setting acks=1 lowers latency while still waiting for one acknowledgment from storage. For latency-sensitive applications that can tolerate potential message loss, you can set acks=0, but be aware that messages may be lost silently if a broker fails.
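The three acks settings described above can be compared with a small model: acks=0 waits for nothing, acks=1 waits for the first storage acknowledgment, and acks=all waits for the slowest one. This is a simplified sketch that ignores the network round trip between producer and broker; the function name and the latency figures are illustrative assumptions.

```python
from typing import Sequence


def ack_wait_ms(acks: str, storage_ack_latencies_ms: Sequence[float]) -> float:
    """Time the broker waits on storage before acknowledging the producer."""
    if acks == "0":
        return 0.0  # producer does not wait for any acknowledgment
    if acks == "1":
        return float(min(storage_ack_latencies_ms))  # first storage ack
    if acks in ("all", "-1"):
        return float(max(storage_ack_latencies_ms))  # slowest storage ack
    raise ValueError(f"unsupported acks value: {acks!r}")


# Assumed acknowledgment latencies from three storage nodes, in milliseconds.
latencies = [2.0, 5.0, 9.0]
for setting in ("0", "1", "all"):
    print(setting, ack_wait_ms(setting, latencies))
```

The gap between acks=1 and acks=all grows with the spread between the fastest and slowest storage node, so the benefit of acks=1 is largest when storage latencies are uneven.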

Consumer Fetching

Similar to producer batching, you can tune consumers for lower latency by adjusting how much data a consumer gets from each fetch from the designated broker in StreamNative Cloud. The consumer configuration parameter fetch.min.bytes defaults to 1, which means a fetch request is answered as soon as a single byte of data is available or the request times out (controlled by fetch.max.wait.ms). Together, these two parameters control the minimum size of a fetch response (fetch.min.bytes) and how long the broker waits for that much data (fetch.max.wait.ms).

For lowest latency, keep fetch.min.bytes at its default of 1 and reduce fetch.max.wait.ms from its default of 500ms. This ensures consumers receive data as soon as it's available, though at the cost of potentially more frequent fetch requests.
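The interaction between the two fetch parameters can be sketched as a simple model of when the broker answers a fetch request: immediately if enough data is already available, otherwise when enough data arrives or fetch.max.wait.ms expires. The function name, the steady arrival rate, and the sample numbers are assumptions made for illustration.

```python
def fetch_wait_ms(fetch_min_bytes: int, fetch_max_wait_ms: int,
                  available_bytes: int, arrival_bytes_per_ms: float) -> float:
    """Estimate how long the broker holds a fetch request before responding."""
    if available_bytes >= fetch_min_bytes:
        return 0.0  # enough data is already there: respond immediately
    if arrival_bytes_per_ms <= 0:
        return float(fetch_max_wait_ms)  # nothing arriving: request times out
    # Time until the missing bytes arrive, capped by the timeout.
    time_to_fill_ms = (fetch_min_bytes - available_bytes) / arrival_bytes_per_ms
    return min(float(fetch_max_wait_ms), time_to_fill_ms)


# Default fetch.min.bytes=1 with data already waiting: no added delay.
print(fetch_wait_ms(1, 500, 10, 0.0))      # 0.0
# A larger fetch.min.bytes on an idle topic waits for the full timeout.
print(fetch_wait_ms(1024, 500, 0, 0.0))    # 500.0
```

In this model a larger fetch.min.bytes is what introduces waiting, and fetch.max.wait.ms is the cap on that wait.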

Summary

Here's a summary of key configurations for optimizing latency:

Producer Configurations

Configuration     Recommended Value  Default Value  Description
linger.ms         0                  0              Time to wait for batches to fill
compression.type  none               none           Compression codec to use
acks              1                  all            Number of acknowledgments required

Consumer Configurations

Configuration    Recommended Value  Default Value  Description
fetch.min.bytes  1                  1              Minimum data size for fetch responses
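As a starting point, the recommended values can be written out as client properties. The sketch below renders them in Java properties format, which a Kafka client application could load. The keys are the standard Kafka client configuration names used in this guide. The fetch.max.wait.ms value of 100 is an illustrative assumption, since the guide only says to reduce it from 500ms. The variable and function names are hypothetical. Setting each value explicitly also avoids depending on the defaults of a particular client version.

```python
# Recommended producer settings from the summary above.
PRODUCER_LOW_LATENCY = {
    "linger.ms": "0",
    "compression.type": "none",
    "acks": "1",
}

# Recommended consumer settings; 100 ms is an assumed example value.
CONSUMER_LOW_LATENCY = {
    "fetch.min.bytes": "1",
    "fetch.max.wait.ms": "100",
}


def to_properties(config: dict) -> str:
    """Render a config mapping as key=value lines, one setting per line."""
    return "\n".join(f"{key}={value}" for key, value in config.items()) + "\n"


print(to_properties(PRODUCER_LOW_LATENCY))
print(to_properties(CONSUMER_LOW_LATENCY))
```

Connection settings such as bootstrap servers and authentication are omitted here and must be added for a working client.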