The Google Cloud Pub/Sub source connector reads data from Google Cloud Pub/Sub topics and writes it to Pulsar topics.
How to get
This section describes how to get the Google Cloud Pub/Sub source connector.
Work with Function Worker
You can get the Google Cloud Pub/Sub source connector using one of the following methods if you use Pulsar Function Worker to run connectors in a cluster.
- Download the NAR package of the connector from the download page.
- Build the connector from the source code.
To build the Google Cloud Pub/Sub source connector from the source code, follow these steps.
Clone the source code to your machine.
git clone https://github.com/streamnative/pulsar-io-google-pubsub
Build the connector in the pulsar-io-google-pubsub directory.
mvn clean install -DskipTests
After the connector is successfully built, a NAR package is generated under the target directory.
ls target
pulsar-io-google-pubsub-2.11.4.3.nar
Work with Function Mesh
You can pull the Google Cloud Pub/Sub source connector Docker image from the Docker Hub if you use Function Mesh to run the connector.
How to configure
Before using the Google Cloud Pub/Sub source connector, you need to configure it. The following table lists the properties and their descriptions.
Name | Type | Required | Default | Description |
---|---|---|---|---|
pubsubEndpoint | String | false | "" (empty string) | The Google Cloud Pub/Sub end-point URL. |
pubsubCredential | String | false | "" (empty string) | The credential (JSON string) for accessing the Google Cloud. |
pubsubProjectId | String | true | "" (empty string) | The Google Cloud project ID. |
pubsubTopicId | String | true | "" (empty string) | The topic ID. It is used to read messages from or write messages to Google Cloud Pub/Sub topics. |
pubsubSchemaId | String | false | "" (empty string) | The schema ID. You must set the schema ID when creating a schema for Google Cloud Pub/Sub topics. |
pubsubSchemaType | String | false | "" (empty string) | The schema type. You must set the schema type when creating a schema for Google Cloud Pub/Sub topics. Currently, only the AVRO format is supported. |
pubsubSchemaEncoding | String | false | "" (empty string) | The encoding of the schema. You must set the schema encoding when creating a schema for Google Cloud Pub/Sub topics. Currently, only the JSON format is supported. |
pubsubSchemaDefinition | String | false | "" (empty string) | The definition of the schema. It is used to create a schema for, or parse messages from, Google Cloud Pub/Sub topics. |
Note
The provided Google Cloud credentials must have permissions to access Google Cloud resources. To use the Google Cloud Pub/Sub source connector, make sure the credentials have the following Google Cloud Pub/Sub API permissions:
- projects.subscriptions.get
- projects.subscriptions.create
- projects.subscriptions.pull
- projects.subscriptions.acknowledge
For more information about Google Cloud Pub/Sub API permissions, see Google Cloud Pub/Sub API permissions: Access control.
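Because pubsubCredential takes the credential as a JSON string, the key content has to be embedded as a single string value inside the connector configuration. The following Python sketch shows one way to do that. It assumes the credential is the content of a service account key file; the function name embed_credential and the truncated example key are hypothetical and used only for illustration.

```python
import json


def embed_credential(source_config: dict, key_file_text: str) -> dict:
    """Return a copy of source_config with pubsubCredential set to a compact JSON string."""
    # Parsing first rejects a malformed key before it reaches the connector.
    key = json.loads(key_file_text)
    # Re-serialize without newlines or extra spaces so the value is a single-line string.
    compact = json.dumps(key, separators=(",", ":"))
    configs = dict(source_config.get("configs", {}))
    configs["pubsubCredential"] = compact
    return {**source_config, "configs": configs}


# Hypothetical, truncated key content; a real key file has more fields.
example_key = '{\n  "type": "service_account",\n  "project_id": "pulsar-io-google-pubsub"\n}'
base = {
    "configs": {
        "pubsubProjectId": "pulsar-io-google-pubsub",
        "pubsubTopicId": "test-pubsub-source",
    }
}
print(json.dumps(embed_credential(base, example_key), indent=2))
```

The original configuration is left unchanged, so the credential only ends up in the copy you write out or submit.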
Work with Function Worker
You can create a configuration file (JSON or YAML) to set the properties if you use Pulsar Function Worker to run connectors in a cluster.
Example
JSON
{
  "tenant": "public",
  "namespace": "default",
  "name": "google-pubsub-source",
  "topicName": "test-google-pubsub-pulsar",
  "archive": "connectors/pulsar-io-google-pubsub-2.11.4.3.nar",
  "parallelism": 1,
  "configs": {
    "pubsubProjectId": "pulsar-io-google-pubsub",
    "pubsubTopicId": "test-pubsub-source"
  }
}
YAML
tenant: public
namespace: default
name: google-pubsub-source
topicName: test-google-pubsub-pulsar
archive: connectors/pulsar-io-google-pubsub-2.11.4.3.nar
parallelism: 1
configs:
  pubsubProjectId: pulsar-io-google-pubsub
  pubsubTopicId: test-pubsub-source
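The properties table marks pubsubProjectId and pubsubTopicId as required, while their default is an empty string, so a quick check before submitting the configuration can catch a missing or blank value early. The Python sketch below is a minimal illustration, not part of the connector; the helper name missing_required is hypothetical, and the configuration is assumed to be already loaded into a dictionary with the same layout as the examples above.

```python
# Required properties, taken from the "Required" column of the properties table.
REQUIRED_KEYS = ("pubsubProjectId", "pubsubTopicId")


def missing_required(source_config: dict) -> list:
    """Return the required connector properties that are absent or blank."""
    configs = source_config.get("configs") or {}
    return [
        key
        for key in REQUIRED_KEYS
        # Treat None, "" and whitespace-only values as missing.
        if not str(configs.get(key) or "").strip()
    ]


example = {
    "tenant": "public",
    "namespace": "default",
    "name": "google-pubsub-source",
    "topicName": "test-google-pubsub-pulsar",
    "archive": "connectors/pulsar-io-google-pubsub-2.11.4.3.nar",
    "parallelism": 1,
    "configs": {
        "pubsubProjectId": "pulsar-io-google-pubsub",
        "pubsubTopicId": "test-pubsub-source",
    },
}
print(missing_required(example))
```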
Work with Function Mesh
You can create a CustomResourceDefinition (CRD) file to define a Google Cloud Pub/Sub source connector. Using a CRD lets Function Mesh integrate naturally with the Kubernetes ecosystem. For more information about Pulsar source CRD configurations, see source CRD configurations.
You can define a CRD file (YAML) to set the properties as shown below.
apiVersion: compute.functionmesh.io/v1alpha1
kind: Source
metadata:
  name: google-pubsub-source-sample
spec:
  image: streamnative/pulsar-io-google-pubsub:2.11.4.3
  className: org.apache.pulsar.ecosystem.io.pubsub.PubsubSource
  replicas: 1
  maxReplicas: 1
  output:
    topics:
      - persistent://public/default/test-google-pubsub-pulsar
  sourceConfig:
    pubsubCredential: 'SECRETS'
    pubsubProjectId: pulsar-io-google-pubsub
    pubsubTopicId: test-google-pubsub-source
  pulsar:
    pulsarConfig: "test-pulsar-source-config"
  resources:
    limits:
      cpu: "0.2"
      memory: 1.1G
    requests:
      cpu: "0.1"
      memory: 1G
  java:
    jar: connectors/pulsar-io-google-pubsub-2.11.4.3.nar
  clusterName: test-pulsar
How to use
You can use the Google Cloud Pub/Sub source connector with Function Worker or Function Mesh.
Work with Function Worker
You can use the Google Cloud Pub/Sub source connector as a non built-in connector or a built-in connector as below.
If you already have a Pulsar cluster, you can use the Google Cloud Pub/Sub source connector as a non built-in connector directly.
This example shows how to create a Google Cloud Pub/Sub source connector on a Pulsar cluster using the pulsar-admin sources create command.
PULSAR_HOME/bin/pulsar-admin sources create \
  --source-config-file <google-pubsub-source-config.yaml>
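If you script this step, the same command can be assembled and run from Python. The sketch below is only an illustration under stated assumptions: pulsar_home stands for the Pulsar installation directory, config_file for the configuration file created earlier, and both function names are hypothetical. It uses only the subcommand and flag shown in the command above.

```python
import os
import subprocess


def build_create_command(pulsar_home: str, config_file: str) -> list:
    """Build the argument list for `pulsar-admin sources create`."""
    return [
        os.path.join(pulsar_home, "bin", "pulsar-admin"),
        "sources",
        "create",
        "--source-config-file",
        config_file,
    ]


def create_source(pulsar_home: str, config_file: str) -> None:
    """Run the command; check=True raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_create_command(pulsar_home, config_file), check=True)


# Printing the command first makes it easy to review before running it.
print(" ".join(build_create_command("/opt/pulsar", "google-pubsub-source-config.yaml")))
```

Passing the arguments as a list avoids shell quoting problems when the configuration path contains spaces.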