source
Google Cloud Pub/Sub Source Connector
Authored by: nodece, Huanli-Meng, nicoloboschi, urfreespace
Support type: streamnative
License: Business License

The Google Cloud Pub/Sub source connector reads data from Google Cloud Pub/Sub topics and writes it to Pulsar topics.

How to get

This section describes how to get the Google Cloud Pub/Sub source connector, either by downloading it or by building it from the source code.

Work with Function Worker

You can get the Google Cloud Pub/Sub source connector using one of the following methods if you use Pulsar Function Worker to run connectors in a cluster.

  • Download the NAR package of the connector from the download page.

  • Build the connector from the source code.

To build the Google Cloud Pub/Sub source connector from the source code, follow these steps.

  1. Clone the source code to your machine.

    git clone https://github.com/streamnative/pulsar-io-google-pubsub
    
  2. Build the connector in the pulsar-io-google-pubsub directory.

    mvn clean install -DskipTests
    

    After the connector is successfully built, a NAR package is generated under the target directory.

    ls target
    pulsar-io-google-pubsub-2.8.4.2.nar
    

Work with Function Mesh

If you use Function Mesh to run the connector, you can pull the Google Cloud Pub/Sub source connector Docker image from Docker Hub.

How to configure

Before using the Google Cloud Pub/Sub source connector, you need to configure it. The following table lists the properties and their descriptions.

| Name | Type | Required | Default | Description |
|------|------|----------|---------|-------------|
| pubsubEndpoint | String | false | "" (empty string) | The Google Cloud Pub/Sub endpoint URL. |
| pubsubCredential | String | false | "" (empty string) | The credential (JSON string) for accessing Google Cloud. |
| pubsubProjectId | String | true | "" (empty string) | The Google Cloud project ID. |
| pubsubTopicId | String | true | "" (empty string) | The topic ID. It is used to read messages from or write messages to Google Cloud Pub/Sub topics. |
| pubsubSchemaId | String | false | "" (empty string) | The schema ID. You must set the schema ID when creating a schema for Google Cloud Pub/Sub topics. |
| pubsubSchemaType | String | false | "" (empty string) | The schema type. You must set the schema type when creating a schema for Google Cloud Pub/Sub topics. Currently, only the AVRO format is supported. |
| pubsubSchemaEncoding | String | false | "" (empty string) | The encoding of the schema. You must set the schema encoding when creating a schema for Google Cloud Pub/Sub topics. Currently, only the JSON format is supported. |
| pubsubSchemaDefinition | String | false | "" (empty string) | The definition of the schema. It is used to create a schema for, or parse messages from, Google Cloud Pub/Sub topics. |
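
The required and optional properties listed above can be checked before a connector is submitted. The following is a minimal sketch, not part of the connector: `validate_pubsub_config` is a hypothetical helper, and the case-insensitive comparison of the schema type and encoding is an assumption about how lenient the connector is.

```python
# Hypothetical pre-flight check for the "configs" section of a source config.
REQUIRED_KEYS = ("pubsubProjectId", "pubsubTopicId")


def validate_pubsub_config(configs: dict) -> list:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    # Required properties must be present and non-empty.
    for key in REQUIRED_KEYS:
        if not str(configs.get(key, "")).strip():
            problems.append(f"{key} is required")
    # Per the table, only AVRO is supported as a schema type.
    schema_type = str(configs.get("pubsubSchemaType", ""))
    if schema_type and schema_type.upper() != "AVRO":
        problems.append("pubsubSchemaType must be AVRO")
    # Per the table, only JSON is supported as a schema encoding.
    encoding = str(configs.get("pubsubSchemaEncoding", ""))
    if encoding and encoding.upper() != "JSON":
        problems.append("pubsubSchemaEncoding must be JSON")
    return problems


if __name__ == "__main__":
    print(validate_pubsub_config({"pubsubProjectId": "pulsar-io-google-pubsub"}))
```

Returning a list rather than raising on the first problem lets a caller report every issue in one pass.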

Note

The provided Google Cloud credentials must have permissions to access Google Cloud resources. To use the Google Cloud Pub/Sub source connector, ensure that the credentials have the following permissions for the Google Cloud Pub/Sub API:

  • projects.subscriptions.get
  • projects.subscriptions.create
  • projects.subscriptions.pull
  • projects.subscriptions.acknowledge

For more information about Google Cloud Pub/Sub API permissions, see Google Cloud Pub/Sub API permissions: Access control.
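
Because `pubsubCredential` takes the credential as a JSON string, a key file has to be serialized into a single string value before it is placed in the connector configuration. The sketch below shows one way to do that. `load_credential` is a hypothetical helper, and the file name `key.json` in the usage note is an assumption.

```python
import json


def load_credential(path: str) -> str:
    """Read a JSON credential file and return it as a compact JSON string."""
    with open(path, encoding="utf-8") as fh:
        key = json.load(fh)  # raises a ValueError subclass if the file is not valid JSON
    if not isinstance(key, dict):
        raise ValueError("credential file must contain a JSON object")
    # Compact separators keep the value on one line for use in a config file.
    return json.dumps(key, separators=(",", ":"))
```

A caller could then set `configs["pubsubCredential"] = load_credential("key.json")` before writing the configuration file.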

Work with Function Worker

You can create a configuration file (JSON or YAML) to set the properties if you use Pulsar Function Worker to run connectors in a cluster.

Example

  • JSON

    {
        "tenant": "public",
        "namespace": "default",
        "name": "google-pubsub-source",
        "topicName": "test-google-pubsub-pulsar",
        "archive": "connectors/pulsar-io-google-pubsub-2.8.4.2.nar",
        "parallelism": 1,
        "configs":
        {
          "pubsubProjectId": "pulsar-io-google-pubsub",
          "pubsubTopicId": "test-pubsub-source"
        }
    }
    
  • YAML

    tenant: public
    namespace: default
    name: google-pubsub-source
    topicName: test-google-pubsub-pulsar
    archive: connectors/pulsar-io-google-pubsub-2.8.4.2.nar
    parallelism: 1
    configs:
      pubsubProjectId: pulsar-io-google-pubsub
      pubsubTopicId: test-pubsub-source
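
If the configuration file is generated rather than written by hand, the same structure can be produced from a script. This is a minimal sketch using only the Python standard library. `build_source_config` and `write_source_config` are hypothetical helpers, and the tenant, namespace, and connector name are fixed to the values from the example above.

```python
import json


def build_source_config(project_id, topic_id, pulsar_topic, nar_path):
    """Build a Function Worker source config mirroring the JSON example."""
    return {
        "tenant": "public",
        "namespace": "default",
        "name": "google-pubsub-source",
        "topicName": pulsar_topic,
        "archive": nar_path,
        "parallelism": 1,
        "configs": {
            "pubsubProjectId": project_id,
            "pubsubTopicId": topic_id,
        },
    }


def write_source_config(config, path):
    """Write the config as indented JSON, ending with a newline."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(config, fh, indent=4)
        fh.write("\n")


if __name__ == "__main__":
    cfg = build_source_config(
        "pulsar-io-google-pubsub",
        "test-pubsub-source",
        "test-google-pubsub-pulsar",
        "connectors/pulsar-io-google-pubsub-2.8.4.2.nar",
    )
    write_source_config(cfg, "google-pubsub-source-config.json")
```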
    

Work with Function Mesh

You can use a CustomResourceDefinition (CRD) to create a Google Cloud Pub/Sub source connector. Using a CRD lets Function Mesh integrate naturally with the Kubernetes ecosystem. For more information about Pulsar source CRD configurations, see source CRD configurations.

You can define a CRD file (YAML) to set the properties as below.

apiVersion: compute.functionmesh.io/v1alpha1
kind: Source
metadata:
  name: google-pubsub-source-sample
spec:
  image: streamnative/pulsar-io-google-pubsub:2.8.4.2
  className: org.apache.pulsar.ecosystem.io.pubsub.PubsubSource
  replicas: 1
  maxReplicas: 1
  output:
    topics:
      - persistent://public/default/test-google-pubsub-pulsar
  sourceConfig:
    pubsubCredential: 'SECRETS'
    pubsubProjectId: pulsar-io-google-pubsub
    pubsubTopicId: test-google-pubsub-source
  pulsar:
    pulsarConfig: "test-pulsar-source-config"
  resources:
    limits:
      cpu: "0.2"
      memory: 1.1G
    requests:
      cpu: "0.1"
      memory: 1G
  java:
    jar: connectors/pulsar-io-google-pubsub-2.8.4.2.nar
  clusterName: test-pulsar
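
In the manifest above, the connector version appears in both the image tag and the NAR path, so the two can drift apart when only one is updated. The sketch below derives both from a single value. `source_manifest` is a hypothetical helper, the `resources` block and `pubsubCredential` are omitted for brevity, and the output is JSON, which Kubernetes tooling accepts alongside YAML.

```python
import json


def source_manifest(version, project_id, topic_id, output_topic):
    """Build a Function Mesh Source manifest with one version for image and jar."""
    return {
        "apiVersion": "compute.functionmesh.io/v1alpha1",
        "kind": "Source",
        "metadata": {"name": "google-pubsub-source-sample"},
        "spec": {
            "image": f"streamnative/pulsar-io-google-pubsub:{version}",
            "className": "org.apache.pulsar.ecosystem.io.pubsub.PubsubSource",
            "replicas": 1,
            "maxReplicas": 1,
            "output": {"topics": [output_topic]},
            "sourceConfig": {
                "pubsubProjectId": project_id,
                "pubsubTopicId": topic_id,
            },
            "pulsar": {"pulsarConfig": "test-pulsar-source-config"},
            "java": {"jar": f"connectors/pulsar-io-google-pubsub-{version}.nar"},
            "clusterName": "test-pulsar",
        },
    }


if __name__ == "__main__":
    manifest = source_manifest(
        "2.8.4.2",
        "pulsar-io-google-pubsub",
        "test-google-pubsub-source",
        "persistent://public/default/test-google-pubsub-pulsar",
    )
    print(json.dumps(manifest, indent=2))
```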

How to use

You can use the Google Cloud Pub/Sub source connector with Function Worker or Function Mesh.

Work with Function Worker

You can use the Google Cloud Pub/Sub source connector as either a non built-in connector or a built-in connector, as described below.

If you already have a Pulsar cluster, you can use the Google Cloud Pub/Sub source connector as a non built-in connector directly.

This example shows how to create a Google Cloud Pub/Sub source connector on a Pulsar cluster using the pulsar-admin sources create command.

PULSAR_HOME/bin/pulsar-admin sources create \
--source-config-file <google-pubsub-source-config.yaml>