The Variant type allows a single column to hold values of different data types, enabling flexible handling of semi-structured data without defining a rigid schema upfront. StreamNative Ursa supports the Variant type for both Apache Iceberg (V3) and Delta Lake tables.
Important: Variant type support is disabled by default. Contact the StreamNative Support Team to enable the feature flag before using Variant types.

Enabling Variant Support

Variant support is gated by a small set of broker properties. The required combination depends on the target table format.
| Property | Default | Description |
| --- | --- | --- |
| variantTypeEnabled | false | Master switch for Variant support. Required for both Iceberg and Delta. |
| tableEvolveSchemaEnabled | true | Schema evolution must remain enabled so Variant fields can be added/removed during writes. Required for both Iceberg and Delta. |
| allowIcebergV3 | false | Enables Iceberg V3 features (including Variant). Required for Iceberg, ignored for Delta. |

Iceberg

Set the following properties on the compaction service custom config:
variantTypeEnabled: "true"
tableEvolveSchemaEnabled: "true"   # default; only override if previously disabled
allowIcebergV3: "true"
Downstream query engine compatibility: When allowIcebergV3 is enabled, the downstream query engine reading the table must also support Iceberg V3. Older Spark / Trino / Athena versions that only support Iceberg V2 will fail to read tables that use Variant or other V3-only features. Verify your engine’s Iceberg support level before enabling.

Delta Lake

Delta Lake’s Variant type is not gated by an Iceberg version flag. Only the master switch and schema evolution are required:
variantTypeEnabled: "true"
tableEvolveSchemaEnabled: "true"   # default; only override if previously disabled
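
As a quick sanity check before rollout, the two property sets above can be expressed as a small validation routine. This is only a sketch: VariantConfigCheck, TableFormat, and missing are hypothetical names, not part of any StreamNative API, and the config is modeled as a plain string map. The property names and defaults are taken from the table in this section.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not a StreamNative API): lists the properties that
// still need to be set to "true" for the chosen table format.
final class VariantConfigCheck {

    enum TableFormat { ICEBERG, DELTA }

    // Defaults as documented in the property table.
    private static final Map<String, String> DEFAULTS = Map.of(
            "variantTypeEnabled", "false",
            "tableEvolveSchemaEnabled", "true",
            "allowIcebergV3", "false");

    static List<String> missing(Map<String, String> config, TableFormat format) {
        // Both formats need the master switch and schema evolution.
        List<String> required = new ArrayList<>(
                List.of("variantTypeEnabled", "tableEvolveSchemaEnabled"));
        // Only Iceberg needs the V3 flag; Delta ignores it.
        if (format == TableFormat.ICEBERG) {
            required.add("allowIcebergV3");
        }
        List<String> problems = new ArrayList<>();
        for (String key : required) {
            // Fall back to the documented default when a key is not set.
            String value = config.getOrDefault(key, DEFAULTS.get(key));
            if (!"true".equals(value)) {
                problems.add(key);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Map<String, String> config = Map.of("variantTypeEnabled", "true");
        System.out.println(missing(config, TableFormat.DELTA));   // prints []
        System.out.println(missing(config, TableFormat.ICEBERG)); // prints [allowIcebergV3]
    }
}
```

With only the master switch set, the Delta check passes because tableEvolveSchemaEnabled defaults to true, while the Iceberg check still reports allowIcebergV3.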

Supported Data Types

  • Primitives: string, int, long, float, double, boolean, bytes
  • Complex types: map, list / array, set
  • Nested POJOs and entire POJOs

Configure Variant in Pulsar

Avro Schema

Use the @AvroSchema annotation with logicalType: "variant":
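import java.util.List;
import java.util.Map;

import lombok.Data;
import org.apache.avro.reflect.AvroSchema;

@Data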
public class Event {
    private String name;

    // Variant for primitive type
    @AvroSchema("{\"type\": \"string\", \"logicalType\": \"variant\"}")
    private String flexibleField;

    // Variant for Map
    @AvroSchema("{\"type\": \"map\", \"values\": \"string\", \"logicalType\": \"variant\"}")
    private Map<String, String> metadata;

    // Variant for List
    @AvroSchema("{\"type\": \"array\", \"items\": \"string\", \"logicalType\": \"variant\"}")
    private List<String> tags;

    // Variant for nested POJO with metadata fields for query optimization
    @AvroSchema("{\"type\": \"record\", \"name\": \"Address\", \"fields\": ["
            + "  {\"name\": \"city\", \"type\": \"string\"},"
            + "  {\"name\": \"zip\", \"type\": \"int\"}"
            + "],"
            + "\"logicalType\": \"variant\","
            + "\"variant-metadata-fields\": \"[\\\"zip\\\", \\\"city\\\"]\" "
            + "}")
    private Address address;
}

JSON Schema

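Annotate each Variant field with @JsonPropertyDescription("logicalType: variant"):
import java.util.List;
import java.util.Map;

import com.fasterxml.jackson.annotation.JsonPropertyDescription;
import lombok.Data;

@Data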
public class Event {
    private String name;

    @JsonPropertyDescription("logicalType: variant")
    private String flexibleField;

    @JsonPropertyDescription("logicalType: variant")
    private Map<String, String> metadata;

    @JsonPropertyDescription("logicalType: variant")
    private List<String> tags;

    @JsonPropertyDescription("logicalType: variant")
    private Address address;
}

ProtobufNative Schema

Define a custom field option named logical_type and set it to "variant" on each Variant field:
syntax = "proto3";

import "google/protobuf/descriptor.proto";

extend google.protobuf.FieldOptions {
  string logical_type = 1001;
}

message Event {
  string name = 1;
  string flexible_field = 2 [(logical_type) = "variant"];
  map<string, string> metadata = 3 [(logical_type) = "variant"];
  repeated string tags = 4 [(logical_type) = "variant"];
  Address address = 5 [(logical_type) = "variant"];
}

message Address {
  string city = 1;
  int32 zip = 2;
}

Configure Variant in Ursa (Kafka Protocol)

Avro Schema — POJO Annotation

Use the same @AvroSchema annotations as for Pulsar. Produce data with ReflectionAvroSerializer; do not use KafkaAvroSerializer.

Avro Schema — Inline Definition

Define logicalType: "variant" directly in the Avro schema:
{
  "type": "record",
  "name": "Event",
  "fields": [
    {
      "name": "id",
      "type": ["null", "string"],
      "default": null
    },
    {
      "name": "score",
      "type": {
        "type": "double",
        "logicalType": "variant"
      }
    },
    {
      "name": "tags",
      "type": {
        "type": "array",
        "items": "string",
        "logicalType": "variant"
      },
      "default": []
    },
    {
      "name": "attributes",
      "type": {
        "type": "map",
        "values": "string",
        "logicalType": "variant"
      },
      "default": {}
    },
    {
      "name": "address",
      "type": {
        "type": "record",
        "name": "Address",
        "fields": [
          {"name": "street", "type": "string"},
          {"name": "city", "type": "string"}
        ],
        "logicalType": "variant",
        "variant-metadata-fields": "[\"street\", \"city\"]"
      }
    }
  ]
}

JSON Schema

Use the same @JsonPropertyDescription("logicalType: variant") annotations as for Pulsar.

Protobuf Schema

Protobuf is supported via the same logical_type custom field option as Pulsar’s ProtobufNative schema. Define the option once in your .proto file and tag each Variant field:
syntax = "proto3";

import "google/protobuf/descriptor.proto";

extend google.protobuf.FieldOptions {
  string logical_type = 1001;
}

message Event {
  string name = 1;
  string flexible_field = 2 [(logical_type) = "variant"];
  map<string, string> metadata = 3 [(logical_type) = "variant"];
  repeated string tags = 4 [(logical_type) = "variant"];
  Address address = 5 [(logical_type) = "variant"];
}

message Address {
  string city = 1;
  int32 zip = 2;
}

Performance Optimization

Use variant-metadata-fields to specify fields that should be extracted as top-level columns. This speeds up queries by enabling predicate pushdown on those fields. The value is a JSON array of field names encoded as a string:
"variant-metadata-fields": "[\"zip\", \"city\"]"
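
Because the attribute value is a JSON array wrapped in a string, the inner quotes must be backslash-escaped, which is easy to get wrong by hand. The sketch below builds the attribute from a list of field names. VariantMetadataFields and its methods are hypothetical helper names, and the code assumes field names contain no quotes or backslashes.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: renders the variant-metadata-fields attribute.
// Assumes field names contain no double quotes or backslashes.
final class VariantMetadataFields {

    // Renders the field names as JSON array text, e.g. ["zip", "city"].
    static String arrayText(List<String> fieldNames) {
        return fieldNames.stream()
                .map(name -> "\"" + name + "\"")
                .collect(Collectors.joining(", ", "[", "]"));
    }

    // Embeds the array text as a JSON string value, escaping each inner quote.
    static String schemaAttribute(List<String> fieldNames) {
        String escaped = arrayText(fieldNames).replace("\"", "\\\"");
        return "\"variant-metadata-fields\": \"" + escaped + "\"";
    }

    public static void main(String[] args) {
        // Prints the same attribute line shown above for zip and city.
        System.out.println(schemaAttribute(List.of("zip", "city")));
    }
}
```

When the schema is itself embedded in a Java string literal, as in the @AvroSchema examples, each quote and backslash in this output needs one further level of Java escaping.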

Schema Evolution Rules for Variant

| Operation | Supported |
| --- | --- |
| Adding new Variant fields | Yes |
| Removing existing Variant fields | Yes |
| Converting non-Variant field to Variant | No (messages sent to Dead Letter Table) |
| Converting Variant field to another type | No (messages sent to Dead Letter Table) |
For more information about the Dead Letter Table (DLT), see Schema Evolution — Dead Letter Table.
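
For teams that review schema changes before deploying producers, the rules in the table above can be captured as a simple lookup. This is an illustrative sketch only: VariantEvolutionRules, Change, and Outcome are hypothetical names, and the code mirrors the table rather than any Ursa internals.

```java
// Hypothetical lookup that mirrors the schema evolution table; it does not
// call or reflect any Ursa internals.
final class VariantEvolutionRules {

    enum Change {
        ADD_VARIANT_FIELD,
        REMOVE_VARIANT_FIELD,
        NON_VARIANT_TO_VARIANT,
        VARIANT_TO_OTHER_TYPE
    }

    enum Outcome { APPLIED, DEAD_LETTER_TABLE }

    static Outcome outcomeOf(Change change) {
        switch (change) {
            case ADD_VARIANT_FIELD:
            case REMOVE_VARIANT_FIELD:
                // Supported: the table schema evolves.
                return Outcome.APPLIED;
            case NON_VARIANT_TO_VARIANT:
            case VARIANT_TO_OTHER_TYPE:
                // Unsupported: messages are sent to the Dead Letter Table.
                return Outcome.DEAD_LETTER_TABLE;
            default:
                throw new IllegalArgumentException("Unknown change: " + change);
        }
    }

    public static void main(String[] args) {
        System.out.println(outcomeOf(Change.ADD_VARIANT_FIELD));      // prints APPLIED
        System.out.println(outcomeOf(Change.NON_VARIANT_TO_VARIANT)); // prints DEAD_LETTER_TABLE
    }
}
```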