This page provides instructions for setting up catalog integration with Databricks Unity Catalog for Apache Iceberg on AWS.

Introduction

StreamNative’s integration with managed Iceberg tables in Unity Catalog enables organizations to seamlessly deliver real-time data into governed, open lakehouse environments. By leveraging the Iceberg REST catalog protocol, this integration ensures strong schema enforcement, lineage tracking, and security, making streaming data AI- and analytics-ready while simplifying operations for data teams. This document focuses on the setup steps for AWS.

Setup Databricks

Before initiating the integration of Databricks with StreamNative Cloud, ensure the following prerequisites are fulfilled. You can also watch this video to learn more: Preparing Databricks Account (AWS Example).

Setup storage location

Before creating managed Iceberg tables in Unity Catalog, you need to configure a storage location where table data will be stored. A storage location is a secure path in cloud object storage (e.g., an S3 bucket in AWS) that Unity Catalog uses to manage table files and metadata. Setting this up ensures that all data written through StreamNative is organized, governed, and accessible within your chosen cloud environment.

Create a storage bucket

Create an Amazon S3 bucket that will serve as the storage location for topic data managed through Unity Catalog.
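If you script your infrastructure, the bucket can also be created with boto3. This is a minimal sketch; the bucket name and region are placeholders, not values from this guide:

```python
import boto3

# Hypothetical bucket name and region; substitute your own values.
BUCKET_NAME = "my-unity-catalog-topics-data"
REGION = "us-east-1"

s3 = boto3.client("s3", region_name=REGION)

# us-east-1 is the default location and must not be passed as a
# LocationConstraint; every other region requires one.
if REGION == "us-east-1":
    s3.create_bucket(Bucket=BUCKET_NAME)
else:
    s3.create_bucket(
        Bucket=BUCKET_NAME,
        CreateBucketConfiguration={"LocationConstraint": REGION},
    )
```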

Create an IAM role

To allow Unity Catalog to access the S3 storage bucket, you must create an AWS IAM role with the necessary permissions. This role should include a trust relationship with Databricks and grant access policies that allow read and write operations on the designated S3 bucket. Configuring the IAM role correctly ensures secure, controlled access to your topic data while maintaining compliance with AWS security best practices.
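The sketch below scripts this step with boto3. The role name is a placeholder, the Databricks Unity Catalog principal ARN should be verified against the current Databricks documentation, and the ExternalId is a temporary placeholder that you will replace in a later step:

```python
import json
import boto3

iam = boto3.client("iam")

# The Databricks-operated Unity Catalog role that will assume this role.
# Verify the current value in the Databricks documentation.
DATABRICKS_UC_ROLE = (
    "arn:aws:iam::414351767826:role/"
    "unity-catalog-prod-UCMasterRole-14S5ZJVKOTYTL"
)

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": DATABRICKS_UC_ROLE},
            "Action": "sts:AssumeRole",
            # Placeholder; replaced later with the ExternalId shown when
            # you create the external location in Databricks.
            "Condition": {"StringEquals": {"sts:ExternalId": "0000"}},
        }
    ],
}

iam.create_role(
    RoleName="unity-catalog-storage-role",  # hypothetical role name
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)
```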

Create a policy for storage bucket permissions and attach to role

Next, create an IAM policy that grants the required permissions on the S3 bucket: s3:PutObject, s3:GetObject, s3:DeleteObject, and s3:ListBucket. This policy ensures that Unity Catalog can securely read from and write to the storage location. Once created, attach the policy to the IAM role defined in the previous step so that Databricks can assume the role and access the bucket with the appropriate level of permissions.
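A hedged boto3 sketch of this step, reusing the placeholder bucket and role names from the earlier examples. Note that the object actions apply to the objects in the bucket, while s3:ListBucket applies to the bucket itself:

```python
import json
import boto3

iam = boto3.client("iam")
BUCKET = "my-unity-catalog-topics-data"  # hypothetical bucket name

storage_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
        },
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": f"arn:aws:s3:::{BUCKET}",
        },
    ],
}

# Attach the policy inline to the role created in the previous step.
iam.put_role_policy(
    RoleName="unity-catalog-storage-role",
    PolicyName="unity-catalog-storage-access",
    PolicyDocument=json.dumps(storage_policy),
)
```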

Create a policy for storage bucket notifications and attach to role

In addition to read and write access, Unity Catalog requires permissions to manage bucket notifications so that updates to table data can be tracked in real time. Create a separate IAM policy (or extend the existing one) that includes actions such as s3:GetBucketNotification, s3:PutBucketNotification, and s3:ListBucket. Attaching this policy to the IAM role ensures that Unity Catalog can configure and read bucket event notifications, enabling proper synchronization and governance of managed Iceberg tables.
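As a sketch, the notification permissions can be attached the same way, again assuming the placeholder bucket and role names used above:

```python
import json
import boto3

iam = boto3.client("iam")
BUCKET = "my-unity-catalog-topics-data"  # hypothetical bucket name

notification_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetBucketNotification",
                "s3:PutBucketNotification",
                "s3:ListBucket",
            ],
            # Notification actions target the bucket, not its objects.
            "Resource": f"arn:aws:s3:::{BUCKET}",
        }
    ],
}

iam.put_role_policy(
    RoleName="unity-catalog-storage-role",
    PolicyName="unity-catalog-bucket-notifications",
    PolicyDocument=json.dumps(notification_policy),
)
```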

Create external location

Right-click on the catalog, select the gear icon to open Settings, and then choose Create External Location. Select the Manual option.

Enter external location details

Enter the following details and click Create:
  • A name for the external location
  • The storage type, set to S3
  • The S3 bucket URL
  • The IAM role ARN
Creating an external location opens a window to configure the IAM role. Before clicking the IAM role configured button, copy the ExternalId from this window and update it in the trust policy of the IAM role.

Update trust policy

Open a new browser window, navigate to the IAM role’s Trust Policy section, and update the ExternalId. After updating the trust policy, click the IAM role configured button to create the external location.
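If you prefer to script this step, the sketch below rewrites the role’s trust policy with the copied ExternalId; all names are the placeholders used earlier:

```python
import json
import boto3

iam = boto3.client("iam")

# Value copied from the Create External Location window in Databricks.
EXTERNAL_ID = "<external-id-from-databricks>"
DATABRICKS_UC_ROLE = (
    "arn:aws:iam::414351767826:role/"
    "unity-catalog-prod-UCMasterRole-14S5ZJVKOTYTL"  # verify in Databricks docs
)

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": DATABRICKS_UC_ROLE},
            "Action": "sts:AssumeRole",
            "Condition": {"StringEquals": {"sts:ExternalId": EXTERNAL_ID}},
        }
    ],
}

# Overwrite the role's trust policy with the real ExternalId.
iam.update_assume_role_policy(
    RoleName="unity-catalog-storage-role",
    PolicyDocument=json.dumps(trust_policy),
)
```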

Test connection

Once the external location is created, click Test Connection to validate the configuration.

Create and configure catalog

Create a new catalog in Databricks by entering a name and selecting Standard from the type dropdown. Select the external location created in the previous step, enter the sub path, and click Create. Once the catalog is created, click View catalog.

Grant permissions for the catalog. The EXTERNAL_USE_SCHEMA permission is required for managed Iceberg tables in Unity Catalog.

Click the catalog, open the Permissions tab, and then click Grant. Select the required privileges, ensuring that the External Use Schema option is enabled, and click Confirm. The assigned privileges will then be visible.
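The same grant can be issued in SQL. This sketch uses the databricks-sql-connector package; the warehouse connection details, catalog name, and principal are placeholders, and granting at the catalog level is one option (the privilege can also be granted per schema):

```python
from databricks import sql  # pip install databricks-sql-connector

# Hypothetical connection details; take these from your SQL warehouse.
conn = sql.connect(
    server_hostname="<workspace-host>.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/<warehouse-id>",
    access_token="<personal-access-token>",
)

with conn.cursor() as cur:
    # EXTERNAL USE SCHEMA lets an external engine (here, StreamNative)
    # access managed tables through the Iceberg REST catalog.
    cur.execute(
        "GRANT EXTERNAL USE SCHEMA ON CATALOG my_catalog "
        "TO `streamnative-service-principal`"
    )

conn.close()
```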

Create OAuth2 credentials for client authentication

Within Databricks settings, click the Identity & Access tab to create a service principal. Click Add service principal, then create an OAuth2 client ID and secret.
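StreamNative will authenticate to Unity Catalog with this client ID and secret. As a quick sanity check, you can exercise the credentials yourself using Databricks’ OAuth client-credentials (M2M) flow; this sketch assumes the documented /oidc/v1/token endpoint and all-apis scope, with placeholder workspace and credential values:

```python
import requests

# Hypothetical values; use the client ID/secret of your service principal.
WORKSPACE = "https://<workspace-host>.cloud.databricks.com"
CLIENT_ID = "<service-principal-client-id>"
CLIENT_SECRET = "<service-principal-client-secret>"

# Exchange the client credentials for a short-lived access token.
resp = requests.post(
    f"{WORKSPACE}/oidc/v1/token",
    auth=(CLIENT_ID, CLIENT_SECRET),
    data={"grant_type": "client_credentials", "scope": "all-apis"},
)
resp.raise_for_status()
print(resp.json()["access_token"])
```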

Enable external data access for the metastore

Click Metastore, then enable External data access.

Setup Catalog Integration in StreamNative Cloud

Enable catalog integration

Catalog integration is supported in both StreamNative Ursa and Classic engines across Serverless, Dedicated, and BYOC deployment options.

Storage location

When enabling catalog integration, the option to configure a storage location is available in the StreamNative Ursa engine on BYOC deployments. This option is not supported for the Classic engine in Serverless or Dedicated deployments. Users can choose either a StreamNative-provided bucket or their own bucket. For a StreamNative-provided bucket, users can view the bucket location where data is stored in Ursa format. To configure your own bucket, enter the following details:
  • AWS role ARN
  • Cloud provider region
  • Storage bucket name
  • Storage bucket path

Catalog integration in Ursa engine - BYOC deployment

Click to enable Lakehouse Table, choose Databricks Unity Catalog for Iceberg as the catalog provider, and select a catalog from the dropdown. You may also register a new catalog directly from the dropdown.

Catalog Integration in Classic engine - Serverless, Dedicated deployments

Click to enable Lakehouse Table, choose Databricks Unity Catalog for Iceberg as the catalog provider, and select a catalog from the dropdown. You may also register a new catalog directly from the dropdown.
Note: Databricks Unity Catalog for Delta is currently not available in the Classic engine.

Lakehouse Settings

To apply the catalog settings to all topics, enable the option and select Deliver to External Table.
Note: The option to Expose Internal Table is currently not available.

Query Data in Databricks

Once the Iceberg tables are published in Databricks Unity Catalog, users can query them using the SQL Editor or explore them using Genie.
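The same tables can also be queried programmatically. A minimal sketch using the databricks-sql-connector package; the hostname, warehouse path, token, and three-level table name are placeholders:

```python
from databricks import sql  # pip install databricks-sql-connector

# Hypothetical connection details and table name; substitute your own.
with sql.connect(
    server_hostname="<workspace-host>.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/<warehouse-id>",
    access_token="<personal-access-token>",
) as conn:
    with conn.cursor() as cur:
        # Tables use Unity Catalog's three-level namespace:
        # catalog.schema.table.
        cur.execute("SELECT * FROM my_catalog.my_schema.my_topic LIMIT 10")
        for row in cur.fetchall():
            print(row)
```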