StreamNative Private Cloud uses Kubernetes Storage Classes to provision persistent storage volumes for ZooKeeper and BookKeeper.

Use default Kubernetes StorageClass

By default, StreamNative Private Cloud uses the default Kubernetes StorageClass to provision persistent volumes for Custom Resources (CRs). Use the following command to get the name of the current default StorageClass:
kubectl get sc
To change the default StorageClass used to provision volumes, see Change the default StorageClass.
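If you need to change which StorageClass is the default, the standard Kubernetes approach is to toggle the storageclass.kubernetes.io/is-default-class annotation. A minimal sketch, with placeholder class names:

# Mark the new class as default
kubectl patch storageclass <storage-class-name> \
  -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'

# Optionally clear the flag on the previous default
kubectl patch storageclass <old-default-storage-class-name> \
  -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "false"}}}'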

Use specific Kubernetes StorageClass

You can specify a particular Kubernetes StorageClass for the ZooKeeper and BookKeeper volumes. To use a specific StorageClass, follow these steps.
  1. Create the StorageClass you want to use in your Kubernetes cluster, or pick an existing one. If you create a new StorageClass rather than reusing a pre-defined one, you need sufficient permissions to create and modify StorageClasses in the cluster. An example StorageClass manifest is shown after these steps.
  2. In your ZooKeeper and BookKeeper CRs, specify the name of the StorageClass to use:
  • ZooKeeperCluster
spec:
  persistence:
    data:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 40Gi
      # Set a pre-defined Kubernetes Storage Class
      storageClassName: <Your Storage Class name>
    dataLog:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 20Gi
      # Set a pre-defined Kubernetes Storage Class
      storageClassName: <Your Storage Class name>
  • BookKeeperCluster
spec:
  storage:
    journal:
      numDirsPerVolume: 1
      numVolumes: 1
      volumeClaimTemplate:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 20Gi
        # Set a pre-defined Kubernetes Storage Class
        storageClassName: <Your Storage Class name>
    ledger:
      numDirsPerVolume: 1
      numVolumes: 1
      volumeClaimTemplate:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 80Gi
        # Set a pre-defined Kubernetes Storage Class
        storageClassName: <Your Storage Class name>
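For reference, the snippet below sketches a StorageClass you might create for step 1. It assumes the AWS EBS CSI driver (ebs.csi.aws.com) and a gp3 volume type, which are not prescribed by this guide; substitute the provisioner and parameters that match your environment.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: <your-storage-class-name>
# Assumption: AWS EBS CSI driver; replace with your cluster's provisioner
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer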

PVC metadata

You can add custom annotations and labels to BookKeeper PVC resources. Currently, only BookKeeper PVCs support this feature. To configure PVC metadata for BookKeeper, add the metadata field under journal and/or ledger in the BookKeeperCluster CR:
spec:
  storage:
    journal:
      metadata:
        annotations:
          example.com/annotation-key: "annotation-value"
        labels:
          example.com/label-key: "label-value"
      volumeClaimTemplate:
        # ... other settings
    ledger:
      metadata:
        annotations:
          example.com/annotation-key: "annotation-value"
        labels:
          example.com/label-key: "label-value"
      volumeClaimTemplate:
        # ... other settings
The configured annotations and labels will be added to the PVC resources created by the BookKeeperCluster CR.
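To confirm that the metadata was applied, you can inspect the generated PVCs; for example (the namespace and PVC names are placeholders):

kubectl -n <namespace> get pvc --show-labels
kubectl -n <namespace> get pvc <pvc-name> -o jsonpath='{.metadata.annotations}'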

Tiered Storage

Tiered Storage makes storing huge volumes of data in Pulsar manageable by reducing operational burden and cost. The fundamental idea is to separate data storage from data processing, allowing each to scale independently. With Tiered Storage, you can send data to cost-effective object storage, and scale brokers only when you need more compute resources. StreamNative Private Cloud supports the following object storage solutions for Tiered Storage:
  • AWS S3
  • Google Cloud Storage
  • Azure Blob Storage

Enable Tiered Storage

To enable Tiered Storage, configure the type of blob storage to use and its related properties, such as the bucket or container, the region, and the credentials, in the PulsarBroker CR. When a Pulsar cluster is deleted, StreamNative Private Cloud does not garbage-collect the contents of the Tiered Storage bucket. You can either wait for the configured deletion interval or manually delete the objects in the bucket.
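Once an offloader is configured, offload is driven by the standard Pulsar mechanisms: a namespace-level threshold or a manual trigger per topic. The pulsar-admin commands below are a sketch; the tenant, namespace, topic, and size values are placeholders.

# Offload ledgers automatically once a topic's BookKeeper storage exceeds 10 GB
bin/pulsar-admin namespaces set-offload-threshold --size 10G <tenant>/<namespace>

# Trigger offload manually, keeping roughly 10 MB of the topic in BookKeeper
bin/pulsar-admin topics offload --size-threshold 10M persistent://<tenant>/<namespace>/<topic>

# Check the offload status for the topic
bin/pulsar-admin topics offload-status persistent://<tenant>/<namespace>/<topic>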

Configure Tiered Storage for AWS S3

Before enabling Tiered Storage on Amazon Web Services (AWS) with Amazon Simple Storage Service (S3), you need to configure the following (an example set of CLI commands follows this list):
  • Create an AWS S3 bucket.
  • Create an IAM role in your AWS account and attach the following IAM policy to grant the necessary permissions for accessing the S3 bucket:
      {
          "Version": "2012-10-17",
          "Statement": [
              {
                  "Effect": "Allow",
                  "Action": [
                      "s3:ListBucket"
                  ],
                  "Resource": [
                      "arn:aws:s3:::<bucket-name>"
                  ]
              },
              {
                  "Effect": "Allow",
                  "Action": [
                      "s3:PutObject",
                      "s3:GetObject",
                      "s3:DeleteObject"
                  ],
                  "Resource": [
                      "arn:aws:s3:::<bucket-name>/*"
                  ]
              }
          ]
      }
    
  • Create a Kubernetes ServiceAccount with the IAM role annotation:
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      annotations:
        eks.amazonaws.com/role-arn: <your-custom-role-arn>
      name: <service-account-name>
      namespace: <namespace>
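On Amazon EKS with IAM Roles for Service Accounts (IRSA) enabled, eksctl can create the IAM role and the annotated ServiceAccount above in one step. The commands below are only a sketch; the bucket, cluster, policy, and account values are placeholders, and they assume the IAM policy shown above has already been created.

# Create the S3 bucket
aws s3 mb s3://<bucket-name> --region <region>

# Create the IAM role and the annotated ServiceAccount together (EKS + IRSA)
eksctl create iamserviceaccount \
  --cluster <eks-cluster-name> \
  --namespace <namespace> \
  --name <service-account-name> \
  --attach-policy-arn arn:aws:iam::<account-id>:policy/<s3-access-policy-name> \
  --approve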
    
To enable Tiered Storage for AWS S3, configure the PulsarBroker CR as follows:
apiVersion: pulsar.streamnative.io/v1alpha1
kind: PulsarBroker
metadata:
  name: <PulsarBroker name>
  namespace: <namespace>
spec:
  image: <Pulsar image version>
  replicas: 1
  zkServers: <ZooKeeper address>
  serviceAccountName: <service-account-name>
  config:
    custom:
      managedLedgerOffloadDriver: "aws-s3"
      managedLedgerMinLedgerRolloverTimeMinutes: "10"
      managedLedgerMaxEntriesPerLedger: "50000"
      offloadersDirectory: /pulsar/offloaders
      s3ManagedLedgerOffloadRegion: '<YOUR REGION OF S3>'
      s3ManagedLedgerOffloadBucket: '<YOUR BUCKET OF S3>'
      s3ManagedLedgerOffloadServiceEndpoint: "https://s3.amazonaws.com"
      s3ManagedLedgerOffloadMaxBlockSizeInBytes: '67108864'
      s3ManagedLedgerOffloadReadBufferSizeInBytes: '1048576'
This table outlines fields available for configuring Tiered Storage for AWS S3.
Field | Description | Default | Required
--- | --- | --- | ---
config.custom.managedLedgerOffloadDriver | The offloader driver name. Set to aws-s3 for AWS S3. | N/A | Required
config.custom.managedLedgerMinLedgerRolloverTimeMinutes | The minimum time in minutes to wait before rolling over a ledger. | "10" | Optional
config.custom.managedLedgerMaxEntriesPerLedger | The maximum number of entries to append to a ledger before triggering a rollover. | "50000" | Optional
config.custom.offloadersDirectory | The directory where offloader implementations are stored. | /pulsar/offloaders | Optional
config.custom.s3ManagedLedgerOffloadBucket | The AWS S3 bucket. | N/A | Required
config.custom.s3ManagedLedgerOffloadRegion | The AWS S3 region. | N/A | Required
config.custom.s3ManagedLedgerOffloadMaxBlockSizeInBytes | The maximum size of a block sent during a multi-part upload to AWS S3. It cannot be smaller than 5 MB. | 64 MB | Optional
config.custom.s3ManagedLedgerOffloadReadBufferSizeInBytes | The block size for each individual read when reading data from AWS S3. | 1 MB | Optional
config.custom.s3ManagedLedgerOffloadServiceEndpoint | An alternative AWS S3 endpoint to connect to (for testing purposes). | N/A | Optional
spec.serviceAccountName | The name of the Kubernetes ServiceAccount associated with the IAM role, used for assume-role authentication. | N/A | Required
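After an offload completes, the offloaded ledgers appear as objects in the configured bucket. One way to spot-check, assuming the AWS CLI has access to the bucket:

aws s3 ls s3://<bucket-name> --recursive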

Configure Tiered Storage for Google Cloud Storage

Before enabling Tiered Storage with Google Cloud Storage (GCS), you need to configure the following (an example set of gcloud and gsutil commands follows this list):
  • Create a GCS service account.
  • Create a GCS bucket.
  • Create a Kubernetes secret that stores your Google credentials with the following command. You reference this secret when you configure Tiered Storage, and the Pulsar brokers use the stored credentials to access the storage bucket. When your storage credentials change, you need to restart the Pulsar cluster.
    kubectl -n <k8s_namespace> create secret generic <secret-name> \
      --from-file=<gcs_service_account_path>
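The first two prerequisites above can be created with the gcloud and gsutil CLIs. The commands below are a sketch with placeholder names; roles/storage.objectAdmin is one reasonable role for giving the offloader read/write access to the bucket, but follow your organization's access policies.

# Create the service account and download a key file (gcs.json)
gcloud iam service-accounts create <service-account-name> --project <project-id>
gcloud iam service-accounts keys create gcs.json \
  --iam-account <service-account-name>@<project-id>.iam.gserviceaccount.com

# Create the bucket and grant the service account access to it
gsutil mb -l <region> gs://<bucket-name>
gsutil iam ch serviceAccount:<service-account-name>@<project-id>.iam.gserviceaccount.com:roles/storage.objectAdmin gs://<bucket-name>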
    
To enable Tiered Storage for Google Cloud Storage, configure the PulsarBroker CR as follows:
apiVersion: pulsar.streamnative.io/v1alpha1
kind: PulsarBroker
metadata:
  name: <PulsarBroker name>
  namespace: <namespace>
spec:
  image: <Pulsar image version>
  replicas: 1
  zkServers: <ZooKeeper address>
  config:
    custom:
      managedLedgerOffloadDriver: 'google-cloud-storage'
      managedLedgerMinLedgerRolloverTimeMinutes: "10"
      managedLedgerMaxEntriesPerLedger: "50000"
      offloadersDirectory: /pulsar/offloaders
      gcsManagedLedgerOffloadRegion: '<YOUR REGION OF GCS>'
      gcsManagedLedgerOffloadBucket: '<YOUR BUCKET OF GCS>'
      gcsManagedLedgerOffloadServiceAccountKeyFile: "/pulsar/srvaccts/gcs.json"
      gcsManagedLedgerOffloadMaxBlockSizeInBytes: '67108864'
      gcsManagedLedgerOffloadReadBufferSizeInBytes: '1048576'
  pod:
    secretRefs:
    - mountPath: /pulsar/srvaccts/gcs.json
      secretName: <secret-name>
This table outlines fields available for configuring Tiered Storage for Google Cloud Storage.
Field | Description | Default | Required
--- | --- | --- | ---
config.custom.managedLedgerOffloadDriver | The offloader driver name. Set to google-cloud-storage for GCS. | N/A | Required
config.custom.managedLedgerMinLedgerRolloverTimeMinutes | The minimum time in minutes to wait before rolling over a ledger. | "10" | Optional
config.custom.managedLedgerMaxEntriesPerLedger | The maximum number of entries to append to a ledger before triggering a rollover. | "50000" | Optional
config.custom.offloadersDirectory | The directory where offloader implementations are stored. | /pulsar/offloaders | Optional
config.custom.gcsManagedLedgerOffloadBucket | The Google Cloud Storage bucket. | N/A | Required
config.custom.gcsManagedLedgerOffloadRegion | The Google Cloud Storage bucket region. | N/A | Required
config.custom.gcsManagedLedgerOffloadServiceAccountKeyFile | The path to the GCS service account key file. | /pulsar/srvaccts/gcs.json | Optional
config.custom.gcsManagedLedgerOffloadMaxBlockSizeInBytes | The maximum size of a block sent during a multi-part upload to Google Cloud Storage. It cannot be smaller than 5 MB. | 64 MB | Optional
config.custom.gcsManagedLedgerOffloadReadBufferSizeInBytes | The block size for each individual read when reading data from Google Cloud Storage. | 1 MB | Optional
pod.secretRefs | Mounts the GCS service account JSON file at /pulsar/srvaccts/gcs.json. | N/A | Required

Configure Tiered Storage for Azure Blob Storage

Before enabling Tiered Storage with Azure Blob Storage, you need to configure the following (an example set of Azure CLI commands follows this list):
  • Create an Azure storage account and a storage account access key.
  • Create an Azure Blob container.
  • Create a Kubernetes secret that stores your Azure credentials with the command below. You reference this secret when you configure Tiered Storage, and the Pulsar brokers use the stored credentials to access the storage container. When your storage credentials change, you need to restart the Pulsar cluster.
    kubectl -n <k8s_namespace> create secret generic <secret-name> \
      --from-literal=AZURE_STORAGE_ACCOUNT=<azure_storage_account> \
      --from-literal=AZURE_STORAGE_ACCESS_KEY=<azure_storage_access_key>
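For reference, the first two prerequisites above can be created with the Azure CLI. The commands below are a sketch with placeholder names; adjust the SKU, resource group, and location for your environment.

# Create the storage account and list its access keys
az storage account create --name <storage-account> --resource-group <resource-group> \
  --location <region> --sku Standard_LRS
az storage account keys list --account-name <storage-account> --resource-group <resource-group>

# Create the blob container
az storage container create --name <container-name> \
  --account-name <storage-account> --account-key <storage-account-access-key>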
    
To enable Tiered Storage for Azure Blob Storage, configure the PulsarBroker CR as follows:
apiVersion: pulsar.streamnative.io/v1alpha1
kind: PulsarBroker
metadata:
  name: <PulsarBroker name>
  namespace: <namespace>
spec:
  image: <Pulsar image version>
  replicas: 1
  zkServers: <ZooKeeper address>
  config:
    custom:
      managedLedgerOffloadDriver: 'azureblob'
      managedLedgerMinLedgerRolloverTimeMinutes: "10"
      managedLedgerMaxEntriesPerLedger: "50000"
      offloadersDirectory: /pulsar/offloaders
      managedLedgerOffloadBucket: '<YOUR BLOB CONTAINER>'
      managedLedgerOffloadServiceEndpoint: "https://<your-storage-account>.blob.core.windows.net"
      managedLedgerOffloadMaxBlockSizeInBytes: '67108864'
      managedLedgerOffloadReadBufferSizeInBytes: '1048576'
  pod:
    vars:
    - name: AZURE_STORAGE_ACCOUNT
      valueFrom:
        secretKeyRef:
          name: <secret-name>
          key: AZURE_STORAGE_ACCOUNT
    - name: AZURE_STORAGE_ACCESS_KEY
      valueFrom:
        secretKeyRef:
          name: <secret-name>
          key: AZURE_STORAGE_ACCESS_KEY
This table outlines fields available for configuring Tiered Storage for Azure Blob Storage.
Field | Description | Default | Required
--- | --- | --- | ---
config.custom.managedLedgerOffloadDriver | The offloader driver name. Set to azureblob for Azure Blob Storage. | N/A | Required
config.custom.managedLedgerMinLedgerRolloverTimeMinutes | The minimum time in minutes to wait before rolling over a ledger. | "10" | Optional
config.custom.managedLedgerMaxEntriesPerLedger | The maximum number of entries to append to a ledger before triggering a rollover. | "50000" | Optional
config.custom.offloadersDirectory | The directory where offloader implementations are stored. | /pulsar/offloaders | Optional
config.custom.managedLedgerOffloadBucket | The Azure Blob container. | N/A | Required
config.custom.managedLedgerOffloadMaxBlockSizeInBytes | The maximum size of a block sent during a multi-part upload to Azure Blob Storage. It cannot be smaller than 5 MB. | 64 MB | Optional
config.custom.managedLedgerOffloadReadBufferSizeInBytes | The block size for each individual read when reading data from Azure Blob Storage. | 1 MB | Optional
config.custom.managedLedgerOffloadServiceEndpoint | An alternative Azure Blob Storage endpoint to connect to (for testing purposes). | N/A | Optional
pod.vars | Environment variables that reference the Azure credentials from the Kubernetes secret. | N/A | Required