Partition Key - StreamNative Documentation

Partition keys control how data is organized into partitions within the Iceberg table. Partitioning improves query performance by enabling partition pruning. partition.key is a dynamic configuration key that takes effect only at the topic level. Setting it at the cluster or namespace level has no effect.

Cluster-name prefix: All dynamic configuration keys must be prefixed with the cluster name (for example, <cluster-name>.partition.key). The cluster name is the value of clusterName in conf/broker.conf — see Finding the Cluster Name. The examples below use private-cloud as the cluster name; replace it with the name of your cluster.

Cardinality limit: Keep the total number of partition values across all levels under 10 (the cardinality of key1 × key2 × ... × keyN should not exceed 10). If the partitioning would produce more than 10 distinct partition values, use the bucket[N] transform to bound it. Excessive partitions cause many small files, which degrade write throughput and query performance.
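As a rough illustration of the arithmetic behind this limit, the sketch below multiplies the number of distinct values at each partition level. The helper name partition_cardinality and the counts used (4 values for one column, 3 for another) are hypothetical and not part of the product.

```shell
# Multiply the distinct-value counts of every partition level.
partition_cardinality() {
  total=1
  for n in "$@"; do
    total=$((total * n))
  done
  printf '%s\n' "$total"
}

# Hypothetical counts: one column with 4 distinct values, another with 3.
partition_cardinality 4 3   # 4 x 3 = 12, above the limit of 10

# Wrapping the first column in bucket[2] caps that level at 2 values.
partition_cardinality 2 3   # 2 x 3 = 6, within the limit
```

The second call shows why bucket[N] helps: the transform fixes the number of values at that level to N regardless of how many distinct source values exist.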

Configuration Format

The partition key is specified as a JSON array. Each element has three fields:

Field	Required	Description
`sourceColumn`	Yes	The field name from the topic schema
`transform`	No	Iceberg partition transform function. Defaults to `identity`.
`targetName`	No	Custom name for the transformed partition column
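To make the optional fields concrete, here is a minimal sketch of a shell helper that prints one array element and leaves out transform and targetName when they are not supplied. The helper name partition_key_element and the column name region are hypothetical, and the helper assumes the values contain no quotes or backslashes.

```shell
# Print one partition key element as JSON.
# Usage: partition_key_element <sourceColumn> [transform] [targetName]
# Assumes the arguments contain no characters that need JSON escaping.
partition_key_element() {
  out="{\"sourceColumn\":\"$1\""
  if [ -n "${2:-}" ]; then
    out="$out,\"transform\":\"$2\""
  fi
  if [ -n "${3:-}" ]; then
    out="$out,\"targetName\":\"$3\""
  fi
  printf '%s}\n' "$out"
}

# Only the required field; the transform then defaults to identity.
partition_key_element region

# All three fields.
partition_key_element timestamp hour ts_hour
```

The complete configuration value is these elements, comma-separated, wrapped in [ and ].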

Supported Iceberg Transforms

Transform	Description
`identity`	Use the field value as-is
`bucket[N]`	Hash into N buckets
`truncate[N]`	Truncate strings to N characters
`year`	Extract year from timestamp
`month`	Extract month from timestamp
`day`	Extract day from timestamp
`hour`	Extract hour from timestamp

For full semantics, see the Iceberg partition transforms specification.
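The sketch below approximates what three of these transforms compute, following the Iceberg definitions (truncate keeps a prefix; hour and day count whole units since 1970-01-01 00:00:00 UTC). It is an illustration only, not the code the system runs: the function names are hypothetical, the input is assumed to be ASCII text and non-negative epoch seconds, and bucket[N] is omitted because its hash is not practical to reproduce in shell.

```shell
# truncate[N] on a string: keep the first N characters (ASCII input assumed).
# Usage: truncate_string <N> <value>
truncate_string() {
  printf "%.${1}s\n" "$2"
}

# hour: whole hours since the Unix epoch (input in epoch seconds here).
hour_transform() {
  printf '%s\n' "$(( $1 / 3600 ))"
}

# day: whole days since the Unix epoch (input in epoch seconds here).
day_transform() {
  printf '%s\n' "$(( $1 / 86400 ))"
}

truncate_string 7 "123 Main Street"   # 123 Mai
hour_transform 1700000000             # 472222
day_transform 1700000000              # 19675
```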

Apply at Topic Level

bin/pulsar-admin topics update-properties \
  -p private-cloud.partition.key='[{"sourceColumn":"<field>","transform":"<transform>","targetName":"<name>"}]' \
  persistent://<tenant>/<namespace>/<topic>

Setting partition.key at the cluster (sn/system) or namespace level has no effect. Apply it on the topic only.

Example

Configure two partition keys on a topic:

timestamp: partitioned by hour using the hour transform, named ts_hour
address: truncated to 7 characters, named t_address

bin/pulsar-admin topics update-properties \
  -p private-cloud.partition.key="[{\"sourceColumn\":\"timestamp\",\"transform\":\"hour\",\"targetName\":\"ts_hour\"},{\"sourceColumn\":\"address\",\"transform\":\"truncate[7]\",\"targetName\":\"t_address\"}]" \
  persistent://public/default/events

Important Notes

The sourceColumn value must reference a field that exists in the topic schema.
The targetName is the name produced after applying the transform; it does not need to exist in the topic schema.
The JSON value must be a valid JSON array. When passing it on the shell, either wrap the value in single quotes or, if you use double quotes, escape the inner double quotes (\").
If the JSON cannot be parsed, the system falls back to a non-partitioned table.
Keep the total cardinality of partition values under 10. If a column has high cardinality, wrap it with bucket[N] to bound the number of partitions (for example, {"sourceColumn":"userId","transform":"bucket[8]"}). High partition counts produce many small files and degrade performance.
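The quoting and bucket[N] notes above can be combined as in the following sketch. It assumes the private-cloud cluster name and the persistent://public/default/events topic used earlier on this page; the targetName user_bucket is a hypothetical name chosen for illustration. The command is wrapped in a function so that the snippet can be sourced and inspected without contacting a broker.

```shell
# Assumed values: replace with your own cluster name and topic.
cluster_name="private-cloud"
topic="persistent://public/default/events"

# Single quotes keep the inner double quotes literal, so no backslashes are needed.
partition_key='[{"sourceColumn":"userId","transform":"bucket[8]","targetName":"user_bucket"}]'

# Property in the <cluster-name>.partition.key=<json> form.
property="${cluster_name}.partition.key=${partition_key}"

# Applies the property at the topic level when called.
apply_partition_key() {
  bin/pulsar-admin topics update-properties -p "$property" "$topic"
}

# Print the property for review before applying it.
printf '%s\n' "$property"
```

Calling apply_partition_key then runs the same update-properties command shown under Apply at Topic Level.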

Related

Dynamic Configuration Guide: Cluster-name prefix, override priority, and apply procedure
Upsert: Combining partition keys with upsert has compatibility constraints
