Documentation Index
Fetch the complete documentation index at: https://docs.streamnative.io/llms.txt
Use this file to discover all available pages before exploring further.
The Persist Extra Metadata feature injects a __meta column into the lakehouse table that contains additional message metadata captured at write time. Use it when downstream queries need access to information that is part of the message envelope but not part of the message body — for example, the message offset, publish time, event time, or producer-side properties.
What Gets Persisted
When the feature is enabled, every record written to the lakehouse table includes a __meta column populated with the following fields:
| Field | Description |
|---|
__messageOffset | The original Pulsar / Kafka message ID or offset |
__publishTime | Pulsar publish timestamp (epoch ms) |
__eventTime | Pulsar event time (epoch ms), if set by the producer |
__schemaVersion | Numeric schema version of the source message |
__properties | Producer-supplied key/value properties (Pulsar message properties / Kafka headers) |
The column is stored as an Iceberg / Delta Variant value. This keeps the schema stable when properties evolve (new keys, removed keys) and lets downstream engines extract individual fields with standard Variant accessors (for example, __meta.__publishTime in Spark SQL).
Configuration
| Property | Default | Description |
|---|
persistExtraMetadata | false | Master switch. When true, the __meta column is added to the table and populated for every record. |
Required Companion Settings
Because __meta is a Variant column, the feature shares the Variant type prerequisites:
| Property | Default | Required When |
|---|
variantTypeEnabled | false | Always (master switch for Variant support) |
tableEvolveSchemaEnabled | true | Always (the column is added by schema evolution) |
allowIcebergV3 | false | Required for Iceberg (Variant is an Iceberg V3 feature). Ignored for Delta. |
Important: Variant support is gated by a feature flag. Contact the StreamNative Support Team to enable it before turning on persistExtraMetadata.
Iceberg
Add the following to the compaction service custom config:
persistExtraMetadata: "true"
variantTypeEnabled: "true"
tableEvolveSchemaEnabled: "true" # default; only override if previously disabled
allowIcebergV3: "true"
Downstream query engine compatibility: When allowIcebergV3 is enabled, your readers (Spark, Trino, Athena, etc.) must support Iceberg V3 to read tables that contain Variant columns. See Variant Type for details.
Delta Lake
persistExtraMetadata: "true"
variantTypeEnabled: "true"
tableEvolveSchemaEnabled: "true"
Delta does not require allowIcebergV3.
Once enabled, the __meta column appears in the lakehouse table alongside the user-defined fields. Examples:
Spark SQL (Iceberg):
SELECT __meta:__publishTime AS publish_time,
__meta:__messageOffset AS offset,
__meta:__properties:trace_id AS trace_id,
*
FROM iceberg_catalog.namespace.events
WHERE __meta:__publishTime >= 1700000000000
LIMIT 10;
Spark SQL (Delta):
SELECT variant_get(__meta, '$.__publishTime', 'long') AS publish_time,
variant_get(__meta, '$.__messageOffset', 'string') AS offset,
*
FROM delta.`s3://bucket/path/events`
LIMIT 10;
The exact Variant accessor syntax depends on your engine version; consult the engine’s Variant documentation for the canonical form.
Behavior Notes
- Adding metadata to existing tables. Because the column is added through schema evolution (
tableEvolveSchemaEnabled=true), enabling the flag on a topic that already has a lakehouse table appends the __meta column on the next compaction. Records written before the flag was enabled will have a null value in the column.
- Disabling the feature. Setting
persistExtraMetadata back to false stops new records from receiving metadata, but the column itself is not removed. Older rows retain their values.
- Performance impact. The
__meta column is small per-row but increases storage and write throughput slightly. The Variant encoding is efficient and supports predicate pushdown when the engine extracts a specific field.
- Variant Type — Prerequisite feature for
persistExtraMetadata
- Schema Evolution — The mechanism by which the
__meta column is added to the table
- Persist Key — Companion feature that persists the message key as a separate column