After preparing your external catalog, configure the compaction service to connect to it. Catalog configuration goes in the `compactionScheduler.config.custom` section of the PulsarBroker YAML.
Multi-Catalog Architecture
StreamNative supports configuring multiple catalogs simultaneously. Different namespaces or topics can route data to different catalogs.
Configuration Pattern
- Iceberg catalogs: `iceberg.catalog.<catalog-name>.<property>`
- Delta catalogs: `delta.catalog.<catalog-name>.<property>`
Default Catalog
Set the default catalog used when no topic or namespace override is specified:
```yaml
custom:
  catalog.default: <catalog-name>
```
Catalog Resolution Priority
The catalog used for a topic is resolved in this order:
1. Topic property (`catalog.name`)
2. Namespace property (`catalog.name`), if the topic property is not set
3. Default catalog (`catalog.default`), if neither is set
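As a hypothetical illustration of this resolution order (the catalog names `main` and `audit` below are placeholders, not values from this page): with the broker default set as follows, a topic-level `catalog.name` wins over both the namespace property and the default.

```yaml
# Broker default: any topic without an override compacts into "main"
compactionScheduler:
  config:
    custom:
      catalog.default: main

# If namespace public/default sets the property catalog.name=audit,
#   every topic in that namespace routes to "audit".
# If topic persistent://public/default/orders additionally sets
#   catalog.name=main, that single topic routes to "main",
#   overriding the namespace property.
```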
See Enable Lakehouse Integration for how to assign catalogs at namespace and topic level.
Iceberg Catalogs
Unity Catalog (Managed Iceberg Table)
```yaml
compactionScheduler:
  config:
    custom:
      catalog.default: <catalog-name>
      iceberg.catalog.<catalog-name>.catalog-backend: "UNITYCATALOG"
      iceberg.catalog.<catalog-name>.type: "rest"
      iceberg.catalog.<catalog-name>.uri: "https://<workspace-url>/api/2.1/unity-catalog/iceberg-rest"
      iceberg.catalog.<catalog-name>.warehouse: "<catalog-name-in-databricks>"
      iceberg.catalog.<catalog-name>.credential: "<access-token>"
      iceberg.catalog.<catalog-name>.oauth2-server-uri: "https://<workspace-url>/oidc/v1/token"
      iceberg.catalog.<catalog-name>.scope: "all-apis"
      iceberg.catalog.<catalog-name>.security: "OAUTH2"
      iceberg.catalog.<catalog-name>.vended-credentials-enabled: "true"
      iceberg.catalog.<catalog-name>.token-refresh-enabled: "true"
```
| Property | Description |
|---|---|
| `catalog-backend` | `UNITYCATALOG` |
| `type` | `rest` |
| `uri` | Unity Catalog Iceberg REST endpoint on the Databricks workspace URL |
| `warehouse` | Catalog name created in Databricks |
| `credential` | Databricks access token |
| `oauth2-server-uri` | Databricks OAuth2 service URI |
| `scope` | `all-apis` |
| `security` | `OAUTH2` |
| `vended-credentials-enabled` | `true` |
| `token-refresh-enabled` | `true` |
Snowflake Horizon Catalog
```yaml
compactionScheduler:
  config:
    custom:
      catalog.default: <catalog-name>
      iceberg.catalog.<catalog-name>.catalog-backend: "HORIZON"
      iceberg.catalog.<catalog-name>.type: "rest"
      iceberg.catalog.<catalog-name>.uri: "https://<org>-<account>.snowflakecomputing.com/polaris/api/catalog"
      iceberg.catalog.<catalog-name>.credential: "<PAT-token>"
      iceberg.catalog.<catalog-name>.scope: "session:role:<role>"
      iceberg.catalog.<catalog-name>.warehouse: "<database-name>"
      iceberg.catalog.<catalog-name>.header.X-Iceberg-Access-Delegation: "vended-credentials"
      iceberg.catalog.<catalog-name>.token-refresh-enabled: "true"
```
| Property | Description |
|---|---|
| `catalog-backend` | `HORIZON` |
| `uri` | Snowflake Horizon REST API endpoint |
| `credential` | PAT token |
| `scope` | Snowflake role scope (e.g., `session:role:PUBLIC`) |
| `warehouse` | Snowflake database name |
| `header.X-Iceberg-Access-Delegation` | `vended-credentials` (required) |
| `token-refresh-enabled` | `true` (recommended) |
Snowflake Open Catalog (Polaris)
```yaml
compactionScheduler:
  config:
    custom:
      catalog.default: <catalog-name>
      iceberg.catalog.<catalog-name>.catalog-backend: "POLARIS"
      iceberg.catalog.<catalog-name>.type: "rest"
      iceberg.catalog.<catalog-name>.uri: "https://<account>.snowflakecomputing.com/polaris/api/catalog"
      iceberg.catalog.<catalog-name>.credential: "<client-id>:<client-secret>"
      iceberg.catalog.<catalog-name>.warehouse: "<catalog-name>"
      iceberg.catalog.<catalog-name>.header.X-Iceberg-Access-Delegation: "vended-credentials"
      iceberg.catalog.<catalog-name>.scope: "PRINCIPAL_ROLE:ALL"
      iceberg.catalog.<catalog-name>.token-refresh-enabled: "true"
```
| Property | Description |
|---|---|
| `catalog-backend` | `POLARIS` |
| `credential` | Client ID and secret in `<id>:<secret>` format |
| `warehouse` | Polaris catalog name |
| `header.X-Iceberg-Access-Delegation` | `vended-credentials` |
| `scope` | `PRINCIPAL_ROLE:ALL` |
| `token-refresh-enabled` | `true` |
AWS S3Table
```yaml
compactionScheduler:
  config:
    custom:
      catalog.default: <catalog-name>
      iceberg.catalog.<catalog-name>.catalog-backend: "S3TABLE"
      iceberg.catalog.<catalog-name>.type: "rest"
      iceberg.catalog.<catalog-name>.rest.sigv4-enabled: "true"
      iceberg.catalog.<catalog-name>.rest.signing-name: "s3tables"
      iceberg.catalog.<catalog-name>.rest.signing-region: "<region>"
      iceberg.catalog.<catalog-name>.uri: "https://s3tables.<region>.amazonaws.com/iceberg"
      iceberg.catalog.<catalog-name>.warehouse: "arn:aws:s3tables:<region>:<account>:bucket/<bucket-name>"
      iceberg.catalog.<catalog-name>.rest-metrics-reporting-enabled: "false"
```
| Property | Description |
|---|---|
| `catalog-backend` | `S3TABLE` |
| `rest.sigv4-enabled` | `true` (required for AWS SigV4 auth) |
| `rest.signing-name` | `s3tables` |
| `rest.signing-region` | AWS region of the S3Table bucket |
| `uri` | S3Tables REST endpoint (varies by region) |
| `warehouse` | S3Table bucket ARN |
| `rest-metrics-reporting-enabled` | `false` (S3Table does not support metric reporting) |
Important: The Ursa cluster must run in the same region as the S3Table bucket.
Google BigLake
```yaml
compactionScheduler:
  config:
    custom:
      catalog.default: <catalog-name>
      iceberg.catalog.<catalog-name>.catalog-backend: "BIGLAKE"
      iceberg.catalog.<catalog-name>.type: "rest"
      iceberg.catalog.<catalog-name>.uri: "https://biglake.googleapis.com/iceberg/v1/restcatalog"
      iceberg.catalog.<catalog-name>.warehouse: "gs://<bucket-name>"
      iceberg.catalog.<catalog-name>.header.x-goog-user-project: "<gcp-project-id>"
      iceberg.catalog.<catalog-name>.rest.auth.type: "org.apache.iceberg.gcp.auth.GoogleAuthManager"
      iceberg.catalog.<catalog-name>.io-impl: "org.apache.iceberg.gcp.gcs.GCSFileIO"
      iceberg.catalog.<catalog-name>.rest-metrics-reporting-enabled: "false"
      iceberg.catalog.<catalog-name>.header.X-Iceberg-Access-Delegation: "vended-credentials"
```
| Property | Description |
|---|---|
| `catalog-backend` | `BIGLAKE` |
| `warehouse` | GCS bucket path from the BigLake catalog properties |
| `header.x-goog-user-project` | GCP project ID from the BigLake catalog properties |
| `rest.auth.type` | `org.apache.iceberg.gcp.auth.GoogleAuthManager` (fixed) |
| `io-impl` | `org.apache.iceberg.gcp.gcs.GCSFileIO` (fixed) |
| `header.X-Iceberg-Access-Delegation` | `vended-credentials` (fixed) |
Delta Lake Catalogs
Unity Catalog (Delta)
```yaml
compactionScheduler:
  config:
    custom:
      catalog.default: <catalog-name>
      delta.catalog.<catalog-name>.unityCatalogUri: "https://<workspace-url>"
      delta.catalog.<catalog-name>.unityCatalogName: "<catalog-name-in-databricks>"
      delta.catalog.<catalog-name>.unityCatalogToken: "<access-token>"
```
Authentication Options
Token-based (recommended):
```yaml
delta.catalog.<catalog-name>.unityCatalogToken: "<token>"
# OR from file:
delta.catalog.<catalog-name>.unityCatalogTokenFile: "/path/to/token/file"
```
OAuth2 (machine-to-machine):
```yaml
delta.catalog.<catalog-name>.unityCatalogClientId: "<client-id>"
delta.catalog.<catalog-name>.unityCatalogClientSecret: "<client-secret>"
```
BYOL (Bring Your Own Lakehouse)
Enable managed commit support for Unity Catalog:
```yaml
# Delta Lake
delta.catalog.<catalog-name>.unityCatalogByolEnabled: "true"

# Iceberg
iceberg.catalog.<catalog-name>.unityCatalogByolEnabled: "true"
```
Without Catalog (Direct Bucket)
If you do not need an external catalog service, data can be written directly to the object storage bucket.
Required permissions: When no external catalog is used, the compaction-scheduler pod’s IAM role (AWS), service account (GCP), or workload identity (Azure) must have read, write, create, and list permissions on the target bucket. Without an external catalog, the compaction service interacts with object storage directly to create namespaces, write metadata, list existing files, and read prior snapshots.
Examples of the required permissions per cloud:
| Cloud | Permissions |
|---|---|
| AWS S3 | s3:GetObject, s3:PutObject, s3:DeleteObject, s3:ListBucket, s3:GetBucketLocation on the warehouse bucket and prefix |
| GCS | storage.buckets.get, storage.objects.get, storage.objects.list, storage.objects.create, storage.objects.delete (or the Storage Object Admin role) |
| Azure Blob / ADLS | Storage Blob Data Contributor on the container |
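As a sketch, the AWS S3 permissions above translate to an IAM policy like the following. The bucket name `my-warehouse-bucket` and prefix `lakehouse/` are placeholders; substitute your warehouse bucket and path.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "WarehouseBucketAccess",
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": "arn:aws:s3:::my-warehouse-bucket"
    },
    {
      "Sid": "WarehouseObjectAccess",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::my-warehouse-bucket/lakehouse/*"
    }
  ]
}
```

Note that `s3:ListBucket` and `s3:GetBucketLocation` apply to the bucket ARN itself, while the object actions apply to the object prefix.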
Iceberg (Hadoop Catalog)
The default Hadoop catalog writes Iceberg metadata and data files directly to the configured storage path. No external catalog service is required.
```yaml
compactionScheduler:
  config:
    lakehouseType: iceberg
    streamTableMode: "EXTERNAL"
    custom:
      catalog.default: <catalog-name>
      iceberg.catalog.<catalog-name>.type: "hadoop"
      iceberg.catalog.<catalog-name>.warehouse: "<bucket>/suffix"
```
Delta (No Unity Catalog)
Delta tables are written directly to the configured storage path without Unity Catalog integration.
```yaml
compactionScheduler:
  config:
    lakehouseType: delta
    streamTableMode: "EXTERNAL"
    custom:
      catalog.default: <catalog-name>
      delta.catalog.<catalog-name>.directExternalStoragePath: "<bucket>/suffix"
```
Multi-Catalog Example
Configure two catalogs (one Polaris, one S3Table) and set a default:
```yaml
compactionScheduler:
  config:
    custom:
      # Default catalog
      catalog.default: polaris-prod

      # Catalog 1: Snowflake Open Catalog (Polaris)
      iceberg.catalog.polaris-prod.catalog-backend: "POLARIS"
      iceberg.catalog.polaris-prod.type: "rest"
      iceberg.catalog.polaris-prod.uri: "https://xyz.snowflakecomputing.com/polaris/api/catalog"
      iceberg.catalog.polaris-prod.credential: "<client-id>:<client-secret>"
      iceberg.catalog.polaris-prod.warehouse: "prod-catalog"

      # Catalog 2: AWS S3Table
      iceberg.catalog.s3table-analytics.catalog-backend: "S3TABLE"
      iceberg.catalog.s3table-analytics.type: "rest"
      iceberg.catalog.s3table-analytics.rest.sigv4-enabled: "true"
      iceberg.catalog.s3table-analytics.rest.signing-name: "s3tables"
      iceberg.catalog.s3table-analytics.rest.signing-region: "us-east-2"
      iceberg.catalog.s3table-analytics.uri: "https://s3tables.us-east-2.amazonaws.com/iceberg"
      iceberg.catalog.s3table-analytics.warehouse: "arn:aws:s3tables:us-east-2:123456789:bucket/analytics"
      iceberg.catalog.s3table-analytics.rest-metrics-reporting-enabled: "false"

    # Configure SDT
    streamTableMode: "EXTERNAL"

    # Configure to use Iceberg
    lakehouseType: "ICEBERG"
```
Then assign catalogs per namespace or topic:
```shell
# Use the default (polaris-prod) for all topics in the namespace
pulsar-admin namespaces set-property -k catalog.name -v polaris-prod public/default

# Override a specific topic to use S3Table
pulsar-admin topics update-properties \
  -p catalog.name=s3table-analytics \
  persistent://public/default/analytics-topic
```
Limitations
- A namespace or topic can reference only one catalog at a time; assigning multiple catalogs to a single topic or namespace is not supported
- Different topics or namespaces can be assigned different catalogs
Next Steps