Quick Start for URSA Engine BYOC Cluster
StreamNative Cloud is a resilient, scalable data streaming service powered by the URSA engine, delivered as a fully managed Pulsar and Kafka service.
StreamNative Cloud provides multiple interfaces for management and interaction:
StreamNative Cloud Console: A user-friendly web-based interface for managing cluster resources, configuring settings, and handling billing.
Command-Line Interfaces (CLIs):
- StreamNative CLI (`snctl`): For creating and managing StreamNative Cloud resources, such as cloud connections, cloud environments, instances, clusters, service accounts, user accounts, and more.
- Pulsar CLI (`pulsarctl`): For managing cluster-specific resources, such as tenants, namespaces, topics, functions, connectors, and more.
REST APIs: For programmatic access and integration with other systems.
Terraform Providers: For provisioning and managing StreamNative Cloud resources as code.
These tools provide flexibility in how you interact with and manage your StreamNative Cloud environment, catering to different user preferences and use cases.
Get started for Free
Sign up for StreamNative Cloud and get $200 of free credits. No credit card required.
This quick start guides you through getting started with StreamNative BYOC. It demonstrates how to use a StreamNative BYOC cluster to create topics, produce data to the cluster, and consume data from it.
Note
This quick start assumes you are familiar with the basic concepts of StreamNative Cloud clusters.
Prerequisites
- Access to StreamNative Cloud.
- Internet connectivity.
- Access to Your AWS Account for provisioning the BYOC infrastructure.
- Install Java 17. For details, see overview of JDK installation.
- Install Maven. For details, see installing Apache Maven.
Step 1: Sign up
Note
If you have an email account configured for using Single Sign-On (SSO) with StreamNative Cloud, use that email address and password when signing up.
To sign up, navigate to the StreamNative Cloud Console signup page. Follow the prompts to create an account. After you click Finish, you might have to wait briefly for your first organization to be created. After your new organization is created, continue on to creating your first instance and cluster.
Step 2: Grant StreamNative Vendor Access
Before deploying a StreamNative cluster within your cloud account, you must first grant StreamNative vendor access. Follow the instructions in Account Access for BYOC on AWS to provision AWS access for StreamNative Cloud. This step ensures StreamNative has the necessary permissions to manage resources in your AWS account.
Once completed, please note the account ID of the AWS account you have granted access to StreamNative Cloud. You will use this account ID to create a Cloud Connection.
Step 3: Create a Cloud Connection
In the upper-right corner of the StreamNative Cloud Console, click your Profile and select Cloud Environments.
Select Cloud Connections tab and click New Cloud Connection.
In the Create connection dialog, enter the following information:
- Name: Enter a name for the cloud connection. For example, `my-aws-connection`.
- Connection provider: Select AWS as the connection provider.
- AWS Account ID: Enter the account ID of the AWS account you noted in the previous step.
- Check Confirm if vendor access Terraform module is executed.
Click Submit.
The cloud connection creation process will start. Once completed, the status of the cloud connection turns to `Connected` in the Cloud Connections tab.
Step 4: Create a Cloud Environment
Once you have created a cloud connection, you can create a cloud environment and provision a BYOC instance.
Navigate to Organization Dashboard.
Select Instances at the left navigation pane.
On the Instances page, click + New Instance.
On the Choose the deployment type for your instance page, click Deploy BYOC.
You will see a "Cloud Environment required" dialog. Click the Create button to create a cloud environment.
On the Cloud Connection page, select the cloud connection you created in the previous step. In this case, it is `my-aws-connection`. Then click Environment setup and fill out the information for the cloud environment.
- Region: Select the region where you want to deploy the BYOC cluster. In this example, it is `us-west-2`.
- Environment tag: Enter a tag for the cloud environment. For example, `poc`.
- Configure StreamNative Managed VPC Network:
  - Network CIDR: By default, it creates a StreamNative Managed VPC with a CIDR of `10.0.0.0/16`. If you need to specify a different CIDR, you can enter it here.
  - Default Gateway: Configure how you want to expose your BYOC cluster, whether public or private. By default, it is public.
Click Create.
The provisioning process of a cloud environment usually takes about 40 minutes to complete. You can safely close the page and come back later. You will also receive an email notification when the cloud environment is ready.
For more information about provisioning BYOC infrastructure, see Provision BYOC Infrastructure.
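The default `10.0.0.0/16` CIDR mentioned above determines how many IP addresses the managed VPC can allocate. As a quick sanity check when choosing a different CIDR, here is a minimal sketch in plain Java (purely illustrative, not part of any StreamNative tooling) that computes the address count for a given prefix length:

```java
public class CidrSize {
    // Number of IPv4 addresses covered by a prefix of the given length.
    // A /16 such as 10.0.0.0/16 covers 2^(32-16) = 65,536 addresses.
    public static long addressCount(int prefixLength) {
        if (prefixLength < 0 || prefixLength > 32) {
            throw new IllegalArgumentException("prefix length must be 0..32");
        }
        return 1L << (32 - prefixLength);
    }

    public static void main(String[] args) {
        System.out.println("/16 -> " + addressCount(16) + " addresses"); // 65536
        System.out.println("/24 -> " + addressCount(24) + " addresses"); // 256
    }
}
```

A smaller prefix length means a larger address pool; pick one that does not overlap with networks you plan to peer with.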
Step 5: Create a StreamNative Instance & Cluster
Once the cloud environment is ready, you can create a Pulsar instance.
Navigate to Organization Dashboard.
Select Instances at the left navigation pane.
On the Instances page, click + New Instance.
On the Choose the deployment type for your instance page, click Deploy BYOC again. You will not see the dialog "Cloud Environment required" this time.
On the Instance Configuration page, fill out the information for the Pulsar instance. Then, click Cluster Location to start the cluster creation process.
- Instance Name: Enter a name for the instance. For example, `my-ursa-instance`.
- Cloud Connection: Select the cloud connection you created in the previous step. In this example, it is `my-aws-connection`.
- Engine: Select the engine for the instance. In this example, it is URSA Engine.
On the Cluster Location page, fill out the information for the cluster.
- Cluster Name: Enter a name for the cluster. For example, `my-ursa-cluster`.
- Cloud Environment: Select the cloud environment you created in the previous step. In this example, it is `aws-usw2-poc-<random-suffix>`.
Click Cluster Size to configure the cluster size.
- You can use the slider to adjust the throughput accordingly.
- You can also manually configure the number of brokers and their corresponding resources in the Advanced section.
- There are no bookies in a URSA Engine cluster.
On the right side of the page, you can see the estimated cost for the cluster.
Click Finish to start the cluster creation process.
The cluster page appears, displaying the cluster creation process. Depending on the chosen cloud provider and other settings, it may take several minutes to provision the cluster. Once the cluster has been provisioned, the page will show "Cluster Provisioned successfully" and you can click Go To The Dashboard to access the Cluster Dashboard page.
Now you can get started configuring apps and data on your new cluster.
Step 6: Create a service account
To interact with your cluster by producing and consuming messages, you need to set up authentication and authorization. This is done by creating a Service Account, which serves as an identity for authenticating and authorizing access to the cluster. The service account will provide the necessary credentials for your applications to securely connect and perform operations on the Pulsar cluster.
In the upper-right corner of the StreamNative Cloud Console, click your Profile and select Accounts & Accesses.
On the left navigation pane, click Service Accounts.
On the Service Account page, click + New.
On the Create Service Account dialog, enter a name for the service account, and then click Confirm.
On the Service Account page, in the row of the service account you just created, click the ... icon, and select Create API Key in the dropdown menu.
On the New API Key dialog:
- Enter a name for the API key
- Set the expiration date
- Select the instance you created in the previous step
- Write a description for the API key
- Click Confirm
Note
An API key and associated secret apply to the active StreamNative instance. If you add a new instance, you must create a new API key for producers and consumers on the new Pulsar instance. For more information, see Use API Keys to Authenticate to StreamNative Cloud.
After the API key is created, you can see the API key shown in the New API Key dialog. Click the Copy and close button to copy the API key to your clipboard. Please note that you cannot retrieve the API key later after closing the dialog.
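The API key you just copied is what your Kafka clients present when connecting. As an illustration only, here is a sketch of the client properties a SASL/PLAIN setup typically uses. The bootstrap endpoint and the exact username/password convention are placeholder assumptions; copy the authoritative values from the code generated by the Console's client setup wizard in Step 9:

```java
import java.util.Properties;

public class ApiKeyAuth {
    // Builds Kafka client properties for SASL/PLAIN authentication.
    // The bootstrap URL and the username/password convention below are
    // placeholders; the Console wizard emits the real values for your cluster.
    public static Properties clientProps(String bootstrapServers, String apiKey) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("security.protocol", "SASL_SSL");
        props.put("sasl.mechanism", "PLAIN");
        // Many StreamNative examples pass the API key as the SASL password
        // prefixed with "token:"; confirm the exact format in the wizard output.
        props.put("sasl.jaas.config",
                "org.apache.kafka.common.security.plain.PlainLoginModule required"
                + " username=\"user\" password=\"token:" + apiKey + "\";");
        return props;
    }
}
```

Because the key is a secret, prefer loading it from an environment variable rather than hard-coding it as shown here.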
Step 7: Create Tenant and Namespace, and Authorize the Service Account
After creating the service account and obtaining the API key, the next crucial step is to authorize the service account. This process grants the necessary permissions for the service account to interact with your StreamNative Cloud cluster.
Authorization involves setting up Access Control Lists (ACLs) that define what actions the service account can perform. Typically, you'll want to grant permissions for producing and consuming messages on specific topics or namespaces.
For more information about authorization, see Authorization and ACL.
Go back to the Cluster Dashboard page.
On the left navigation pane, click Instances.
On the Instances page, click the name of the instance you created in Step 5.
On the Cluster Dashboard page, click Tenants on the left navigation pane.
On the Tenants page, click + New Tenant.
On the New Tenant dialog:
- Enter a name for the tenant
- Select your user account as the Admin role
- Select the cluster created in Step 5 as Allowed clusters
- Click Confirm
On the Tenants page, click the name of the tenant you just created. You will be directed to the Tenant Dashboard page.
On the Tenant Dashboard page, click Namespaces on the left navigation pane.
On the Namespaces page, click New Namespace.
On the New Namespace dialog:
- Enter a name for the namespace
- Select the cluster created in Step 5 as Allowed clusters and Replication clusters
- Click Confirm
On the Namespaces page, click the name of the namespace you just created. You will be directed to the Namespace Dashboard page.
On the Namespace Dashboard page, click Configuration on the left navigation pane.
On the Namespace configuration page, click ADD ROLE. Select the name of the service account you just created, and choose the consume and produce permissions. This grants your service account the `produce` and `consume` permissions for this namespace.
Step 8: Grant permission to access Kafka Schema Registry
This quick start uses Avro to produce messages, so you need to grant the service account access to the Kafka Schema Registry.
You can do this by granting the service account the `produce` permission on the `public/__kafka_schemaregistry/__schema-registry` topic. This is required because the current implementation of the Kafka Schema Registry uses this topic's ACL to authorize access to the schema registry.
Navigate to the Namespace Dashboard page of the `public/__kafka_schemaregistry` namespace.
On the left navigation pane, click Topics.
On the Topics page, click the topic `__schema-registry`.
On the topic details page, click the Policy tab.
Click the + Add button and, in the dropdown menu, select the name of the service account you just created and choose the produce permission.
Now you have created a tenant and a namespace, and granted the service account the `produce` and `consume` permissions for the namespace. You have also granted the service account permission to access the Kafka Schema Registry. You can now continue on to building a Java app, connecting to the cluster, and producing and consuming messages.
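To illustrate where the schema registry access granted in this step is used, here is a hedged sketch of the serializer-related client properties. The property names are the standard Confluent serializer config keys; the registry URL and the credential format are placeholder assumptions — the authoritative values come from the Console's Kafka client setup wizard:

```java
import java.util.Properties;

public class SchemaRegistryConfig {
    // Adds Kafka Avro serializer / schema registry settings to client properties.
    // The registry URL is a placeholder, and the "user:<api-key>" credential
    // format below is an assumption; confirm both against the code generated
    // by the Console's client setup wizard.
    public static Properties withSchemaRegistry(Properties props,
                                                String registryUrl, String apiKey) {
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", registryUrl);
        // HTTP basic auth against the schema registry endpoint.
        props.put("basic.auth.credentials.source", "USER_INFO");
        props.put("basic.auth.user.info", "user:" + apiKey);
        return props;
    }
}
```

The Avro serializer registers the record schema with the registry on first use, which is why the service account needs the `produce` permission on the registry topic.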
Step 9: Produce and consume messages
Note
This QuickStart provides you with an example Java app to get you up and running with consuming and producing messages. This is a simple example and is not intended for production environments.
Create a producer/consumer
Return to the StreamNative Cloud Console and go to the "Cluster Dashboard" page.
On the left navigation pane, click Kafka Clients.
On the Kafka client setup page, click the Code libraries tab and follow the setup wizard to get the sample code for your producer and consumer.
a. Select Java as the client library and click Next.
b. Select the service account you created and click Next.
c. Select API Key as the authentication type and click Next. If you already have an API key, you can use the API key noted in Step 6. Otherwise, you can create a new API key.
d. Check Kafka Schema Registry and click Next.
e. Copy the required dependencies to your `pom.xml` file, and click Next.
f. Select the target tenant, namespace, topic, and subscription.
g. You are now ready to copy the auto-generated sample code.
Build your project.
a. Create a new file named `pom.xml` and add the following content:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>org.example</groupId>
  <artifactId>kafka-examples</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>jar</packaging>

  <properties>
    <log4j.version>2.17.1</log4j.version>
    <!-- Maven properties for compilation -->
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
  </properties>

  <dependencies>
    <!-- Add the Kafka dependency -->
    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka-clients</artifactId>
      <version>3.4.0</version>
    </dependency>
    <dependency>
      <groupId>io.confluent</groupId>
      <artifactId>kafka-avro-serializer</artifactId>
      <version>7.5.0</version>
    </dependency>
    <dependency>
      <groupId>io.streamnative.pulsar.handlers</groupId>
      <artifactId>oauth-client</artifactId>
      <version>3.1.0.4</version>
    </dependency>
    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka-streams</artifactId>
      <version>3.4.0</version>
    </dependency>
    <!-- Add the Log4j dependency -->
    <dependency>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-slf4j-impl</artifactId>
      <version>${log4j.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-core</artifactId>
      <version>${log4j.version}</version>
    </dependency>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-log4j12</artifactId>
      <version>1.7.30</version>
    </dependency>
  </dependencies>

  <repositories>
    <repository>
      <id>confluent</id>
      <url>https://packages.confluent.io/maven/</url>
    </repository>
  </repositories>
</project>
```
b. Create a folder `src/main/java/org/example`.
c. Under `src/main/java/org/example`, create a file named `SNCloudTokenProducer.java`. Copy and paste the producer code into this file.
d. Under `src/main/java/org/example`, create a file named `SNCloudTokenConsumer.java`. Copy and paste the consumer code into this file.
e. In both files, replace `<JWT Token>` with the API key you copied from the Service Account page.
f. Go to the root folder of your project and run the following command to build your project:
mvn clean install
Run the clients to produce and consume your first message
Open a terminal window, navigate to the root folder of your project, and run the following command:
mvn exec:java -Dexec.mainClass="org.example.SNCloudTokenConsumer"
Open a second terminal window, navigate to the root folder of your project, and run the following command:
mvn exec:java -Dexec.mainClass="org.example.SNCloudTokenProducer"
You will see a message like the following:
Send hello to <YOUR-TOPIC-NAME>-0@0
Return to the first terminal window. You should see the following:
Receive record: {"name": "jwt-sr", "age": 20} from <YOUR-TOPIC-NAME>-0@0
You can continue to produce and consume messages by repeating the above steps.
For example, you can run the producer in a loop to produce 100 messages:
for i in {1..100}; do
mvn exec:java -Dexec.mainClass="org.example.SNCloudTokenProducer"
done
Step 10: Check the storage bucket
Since URSA Engine currently uses S3 as the storage layer, you can check the storage bucket to verify the data is persisted.
Navigate to the Cluster Dashboard page.
Click Details tab.
On the Details page, you can find the S3 bucket name listed in the Storage Bucket field under the Access Points section.
Navigate to your AWS account and check the S3 bucket to verify the data is persisted.
You should be able to see the data persisted in the S3 bucket under the folder `<org-id>-<cluster-name>-<randomized-string>-ursa`. There are two subfolders: `storage` and `compaction`. The `storage` folder contains the raw WAL files, and the `compaction` folder contains the compacted lakehouse tables. The lakehouse tables are organized by `<tenant>/<namespace>/<topic>`.
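To make the layout above concrete, here is a hypothetical helper (the class and method names are mine, not part of any StreamNative SDK) that builds the S3 prefix of a compacted table from its tenant, namespace, and topic:

```java
public class LakehousePaths {
    // Illustrates the compacted-table layout described above:
    //   <bucket-folder>/compaction/<tenant>/<namespace>/<topic>
    // where <bucket-folder> follows the pattern
    //   <org-id>-<cluster-name>-<randomized-string>-ursa
    // This helper is purely illustrative.
    public static String compactionPrefix(String bucketFolder, String tenant,
                                          String namespace, String topic) {
        return String.join("/", bucketFolder, "compaction", tenant, namespace, topic);
    }

    public static void main(String[] args) {
        System.out.println(compactionPrefix(
                "<org-id>-<cluster-name>-<randomized-string>-ursa",
                "public", "default", "my-topic"));
    }
}
```

You can use a prefix like this with the AWS Console's S3 search box or `aws s3 ls` to locate a specific topic's table files.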
Step 11: Query the compacted lakehouse tables using DuckDB
Install DuckDB. For details, see DuckDB Installation.
Run DuckDB.
duckdb
Query the compacted lakehouse table from DuckDB.
CREATE SECRET ( TYPE S3, PROVIDER CREDENTIAL_CHAIN );
SELECT COUNT(*) FROM delta_scan('s3://path/to/compacted/lakehouse/table');
You should be able to see the output like the following:
┌──────────────┐
│ count_star() │
│    int64     │
├──────────────┤
│          101 │
└──────────────┘
Next steps
After you have successfully provisioned a BYOC cluster and connected to the cluster, you can learn more about working with StreamNative Cloud by reading through Cloud Console basics.
If you want to learn more about Kafka and StreamNative Cloud, take our developer courses at the StreamNative Developer Portal.