This page provides instructions for setting up catalog integration with Databricks Unity Catalog for Apache Iceberg on AWS.
Introduction
StreamNative’s integration with managed Iceberg tables in Unity Catalog enables organizations to seamlessly deliver real-time data into governed, open lakehouse environments. By leveraging the Iceberg REST catalog protocol, this integration ensures strong schema enforcement, lineage tracking, and security, making streaming data AI- and analytics-ready while simplifying operations for data teams. This document primarily focuses on the setup steps for AWS.
Setup Databricks
Before initiating the integration of Databricks with StreamNative Cloud, please ensure the following prerequisites are fulfilled. You can also watch this video to learn more: Preparing Databricks Account (AWS Example).
Setup storage location
Before creating managed Iceberg tables in Unity Catalog, you need to configure a storage location where table data will be stored. A storage location is a secure path in cloud object storage (e.g., an S3 bucket in AWS) that Unity Catalog uses to manage table files and metadata. Setting this up ensures that all data written through StreamNative is organized, governed, and accessible within your chosen cloud environment.
Create a storage bucket
Create an Amazon S3 bucket that will serve as the storage location for topics data managed through Unity Catalog.
Create an IAM role
To allow Unity Catalog to access the S3 storage bucket, you must create an AWS IAM role with the necessary permissions. This role should include a trust relationship with Databricks and grant access policies that allow read and write operations on the designated S3 bucket. By configuring the IAM role correctly, you ensure secure and controlled access to your topics data while maintaining compliance with AWS security best practices.
Create a policy for storage bucket permissions and attach to role
Next, create an IAM policy that grants the required permissions on the S3 bucket, such as s3:PutObject, s3:GetObject, s3:DeleteObject, and s3:ListBucket. This policy ensures that Unity Catalog can securely read from and write data to the storage location. Once created, attach the policy to the IAM role defined in the previous step so that Databricks can assume the role and access the bucket with the appropriate level of permissions.
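As a sketch, a policy granting the actions listed above might look like the following. The bucket name my-unity-catalog-bucket is a placeholder; replace it with your own bucket. Note that the bucket-level actions (such as s3:ListBucket) apply to the bucket ARN, while the object-level actions apply to the objects under it, and that s3:GetBucketLocation is commonly required as well:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "UnityCatalogBucketAccess",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::my-unity-catalog-bucket",
        "arn:aws:s3:::my-unity-catalog-bucket/*"
      ]
    }
  ]
}
```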
Create a policy for storage bucket notifications and attach to role
In addition to read and write access, Unity Catalog requires permissions to manage bucket notifications so that updates to table data can be tracked in real time. Create a separate IAM policy (or extend the existing one) to include actions such as s3:GetBucketNotification, s3:PutBucketNotification, and s3:ListBucket. Attaching this policy to the IAM role ensures that Unity Catalog can configure and read bucket event notifications, enabling proper synchronization and governance of managed Iceberg tables.
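A minimal sketch of the notifications policy, again using the placeholder bucket name my-unity-catalog-bucket (notification actions are bucket-level, so only the bucket ARN is needed):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "UnityCatalogBucketNotifications",
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketNotification",
        "s3:PutBucketNotification",
        "s3:ListBucket"
      ],
      "Resource": "arn:aws:s3:::my-unity-catalog-bucket"
    }
  ]
}
```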
Create external location
Right-click on the catalog, select the gear icon to open Settings, and then choose Create External Location.
Select the Manual option.
Enter external location details
Enter the following details and click Create:
- Name for the external location
- Set the storage type to S3
- Provide the S3 bucket URL
- Specify the IAM role ARN
Creating an external location opens a window to configure the IAM role, as shown in the image below. Before clicking the IAM role configured button, copy the ExternalId from this window and update it in the trust policy of the IAM role.
Update trust policy
Open a new browser window, navigate to the IAM role’s Trust Policy section, and update the ExternalId, as shown in the image below.
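As a sketch, the updated trust policy typically takes the following shape. The principal ARN shown is the Databricks-managed Unity Catalog role for AWS as published in Databricks documentation, and the sts:ExternalId value is the one copied from the external location window; verify both against what your Databricks workspace displays:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::414351767826:role/unity-catalog-prod-UCMasterRole-14S5ZJVKOTYTL"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "<ExternalId copied from the external location window>"
        }
      }
    }
  ]
}
```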
After updating the trust policy, click the IAM role configured button to create the external location.
Test connection
Once the external location is created, click Test Connection to validate the configuration.
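Alternatively, the external location can be created with Databricks SQL, assuming a storage credential wrapping the IAM role already exists; the location name, URL, and credential name below are placeholders:

```sql
CREATE EXTERNAL LOCATION IF NOT EXISTS sn_topics_location
  URL 's3://my-unity-catalog-bucket/topics'
  WITH (STORAGE CREDENTIAL sn_topics_credential)
  COMMENT 'Storage for StreamNative topic data';
```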
Create and configure catalog
Create a new catalog in Databricks by entering a name and selecting Standard from the type dropdown.
Select the external location created in the previous step and enter the sub path and click Create.
Once the catalog is created, click View catalog.
Grant permissions for the catalog. The EXTERNAL USE SCHEMA privilege is required for managed Iceberg tables in Unity Catalog.
Click the catalog, open the Permissions tab, and then click Grant.
Select the privileges as shown in the image below, ensuring that the External Use Schema option is enabled, and click Confirm.
The assigned privileges will be visible as shown below.
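The same grants can also be issued with Databricks SQL. The catalog name streamnative_catalog and the grantee `data-engineers` are placeholders (in practice the grantee would typically be the principal that StreamNative authenticates as); the EXTERNAL USE SCHEMA grant is the one this integration requires:

```sql
GRANT USE CATALOG          ON CATALOG streamnative_catalog TO `data-engineers`;
GRANT USE SCHEMA           ON CATALOG streamnative_catalog TO `data-engineers`;
GRANT SELECT               ON CATALOG streamnative_catalog TO `data-engineers`;
GRANT EXTERNAL USE SCHEMA  ON CATALOG streamnative_catalog TO `data-engineers`;
```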
Create OAuth2 credentials for client authentication
Within the Databricks settings, click the Identity & Access tab to create a service principal.
Click Add service principal to create an OAuth2 client ID and secret.
Enable external data access for the metastore
Click on Metastore
Enable External data access
Setup Catalog Integration In StreamNative Cloud
Enable catalog integration
Catalog integration is supported in both StreamNative Ursa and Classic engines across Serverless, Dedicated, and BYOC deployment options.
Storage location
When enabling catalog integration, the option to configure a storage location is available in the StreamNative Ursa engine on BYOC deployments. This option is not supported for the Classic engine in Serverless or Dedicated deployments. Users can choose either a StreamNative-provided bucket or their own bucket. For a StreamNative-provided bucket, users can view the bucket location where data is stored in Ursa format.
To configure your own bucket, enter the following details:
- AWS role ARN
- Cloud provider region
- Storage bucket name
- Storage bucket path
Catalog integration in Ursa engine - BYOC deployment
Click to enable Lakehouse Table, choose Databricks Unity Catalog for Iceberg as the catalog provider, and select a catalog from the dropdown. You may also register a new catalog directly from the dropdown.
Catalog Integration in Classic engine - Serverless, Dedicated deployments
Click to enable Lakehouse Table, choose Databricks Unity Catalog for Iceberg as the catalog provider, and select a catalog from the dropdown. You may also register a new catalog directly from the dropdown.
Note: Databricks Unity Catalog for Delta is currently not available in the Classic engine.
Lakehouse Settings
To apply the catalog settings to all topics, enable the option and select Deliver to External Table.
Note: The option to Expose Internal Table is currently not available.
Query Data in Databricks
Once the Iceberg tables are published in Databricks Unity Catalog, users can query them using the SQL Editor or explore them using Genie.
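As a sketch, a published table can be queried through its three-level namespace. The catalog, schema, and table names below are placeholders; the actual names depend on your catalog configuration and on how topics are mapped to tables:

```sql
SELECT *
FROM streamnative_catalog.public.my_topic
LIMIT 10;
```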