Write Quorum and Acknowledgment Quorum Size
The following configurations apply to the Classic Engine only. They do not apply to the Ursa Engine.
acks=all
or acks=-1
, two configuration parameters control message replication:
managedLedgerDefaultWriteQuorum
: Specifies the replication factor for storing messages (number of replicas)managedLedgerDefaultAckQuorum
: Specifies the minimum number of replicas that must acknowledge a write before it is considered successful
- Increase the write quorum to maintain more replicas of the data
- Increase the difference between write quorum and acknowledgment quorum sizes to improve write availability while maintaining durability guarantees
Consumer failures
Consumers in a consumer group can share processing load. If a consumer unexpectedly fails, StreamNative Cloud detects the failure and rebalances the partitions amongst the remaining consumers in the consumer group. Consumer failures can be either hard failures (for example,SIGKILL
) or soft failures (for example, expired session timeouts). These failures are detected when consumers fail to send heartbeats or poll()
calls.
Consumer liveness is maintained with a heartbeat (running in a background thread since KIP-62). The session.timeout.ms
configuration parameter dictates the timeout used to detect failed heartbeats. You can increase the session timeout to account for potential network delays and avoid soft failures.
Soft failures most commonly occur in two cases:
- When
poll()
returns a batch of messages that take too long to process - When a JVM GC pause takes too long
poll()
loop that spends significant time processing messages, you can:
- Increase
max.poll.interval.ms
to allow more time between fetching records - Reduce
max.poll.records
to decrease the batch size returned
Summary
Here’s a summary of key configurations for optimizing availability:Consumer Configurations
Configuration | Recommended Value | Default Value | Description |
---|---|---|---|
session.timeout.ms | 30000-60000 | 45000 | Time before consumer is considered failed |