
Flink SQL

Flink SQL provides an interactive SQL interface for processing streaming data in Apache Pulsar. This document describes how to create a Flink cluster and run interactive queries with Flink SQL.

Currently, Flink SQL is only available for the standard subscription plan through snctl.

Before creating a Flink cluster, make sure that a Pulsar cluster is running under the same Pulsar instance. The Flink cluster is then associated with that Pulsar cluster automatically.

This section describes how to provision Flink clusters through snctl.

You can use the snctl create flinkcluster FLINK_CLUSTER_NAME --instance-name INSTANCE_NAME command to create a Flink cluster. In addition, you can use the --node-type NODE_TYPE and --location LOCATION options to specify the node type and the location for the Flink cluster.

This example shows how to create the foo2 Flink cluster, which uses the tiny-1 node type and is deployed on Google Cloud Platform (GCP).

snctl create flinkcluster foo2 --instance-name gcp-1 --location us-east4

Output

flinkcluster.flink.streamnative.io/foo2 created
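
As a minimal sketch of how the optional flags combine with the required ones, the helper below assembles the create command from its parts and prints it for review instead of running it. The function name flink_create_cmd is hypothetical and is not part of snctl; only the command and flags named above are used.

```shell
#!/bin/sh
# Hypothetical helper (not part of snctl): prints the create command.
# Arguments: NAME INSTANCE [NODE_TYPE] [LOCATION]; the last two are optional.
flink_create_cmd() {
  name=$1
  instance=$2
  node_type=${3:-}
  location=${4:-}

  # Required parts of the command.
  cmd="snctl create flinkcluster $name --instance-name $instance"

  # Append each optional flag only when a value was supplied.
  if [ -n "$node_type" ]; then
    cmd="$cmd --node-type $node_type"
  fi
  if [ -n "$location" ]; then
    cmd="$cmd --location $location"
  fi

  printf '%s\n' "$cmd"
}

# Print the command with both optional flags set.
flink_create_cmd foo2 gcp-1 tiny-1 us-east4
```

Passing an empty string for the node type, as in flink_create_cmd foo2 gcp-1 "" us-east4, prints the same command as the example above.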

You can use the snctl get flinkcluster FLINK_CLUSTER_NAME command to get details of the Flink cluster.

This example shows how to get details of the foo2 Flink cluster.

snctl get flinkcluster foo2

Output

...
status:
  conditions:
  - lastTransitionTime: "2021-03-11T00:49:10Z"
    reason: AllConditionStatusTrue
    status: "True"
    type: Ready
  gatewayConnectionString: foo2-sql.test.test.sn2.dev

In the output, the entry under conditions has status set to "True" and type set to Ready. This means that the Flink cluster was created successfully.
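
The readiness check described above can be scripted. The following is a minimal sketch, assuming the output of snctl get flinkcluster keeps the shape shown in the sample (a conditions list with status and type keys, and a gatewayConnectionString line). The function names flink_cluster_ready and flink_gateway are hypothetical helpers, not snctl commands.

```shell
#!/bin/sh
# Hypothetical helpers that read the cluster details on standard input.
# They assume the output shape shown in the sample above.

# Exits 0 if a condition with type Ready has status "True", 1 otherwise.
flink_cluster_ready() {
  awk '
    function check() { if (type == "Ready" && status == "True") ready = 1 }
    # A line starting with "- " opens a new condition entry.
    /^[ ]*- / { check(); status = ""; type = "" }
    {
      line = $0
      sub(/^[ ]*(- )?/, "", line)          # drop indentation and list marker
      if (line ~ /^status: /) { v = line; sub(/^status: */, "", v); gsub(/"/, "", v); status = v }
      if (line ~ /^type: /)   { v = line; sub(/^type: */, "", v);   gsub(/"/, "", v); type = v }
    }
    END { check(); exit (ready ? 0 : 1) }
  '
}

# Prints the value of gatewayConnectionString, if present.
flink_gateway() {
  awk '$1 == "gatewayConnectionString:" { print $2 }'
}

# Usage (requires snctl and an existing cluster):
#   if snctl get flinkcluster foo2 | flink_cluster_ready; then
#     echo "foo2 is ready"
#   fi
#   snctl get flinkcluster foo2 | flink_gateway
```

Matching on both keys within the same list entry avoids a false positive when a different condition happens to report status "True".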

For details about the commands used to provision the Flink cluster, see the snctl command reference.

You can use an application to load data into a Pulsar topic. Here is an example of how to use the taxidata tool to load the well-known NYC Taxi Data dataset into a Pulsar topic.
