Metrics is a valuable tool for getting visibility into your Cloud deployment. StreamNative Cloud provides a broad range of metrics that you can use to help fine-tune performance and troubleshoot issues.

Metrics endpoint

StreamNative Cloud provides an endpoint that exposes real-time metrics in Prometheus metrics format. The following table displays the currently available metrics endpoints.

Currently, the Cloud Metrics API only exposes resource-related metrics for Pulsar, including Tenants, Namespaces, Topics, Functions, Connectors, and others. System-level metrics are not exposed through this API. These system-level metrics are actively monitored and managed by the StreamNative Cloud team. However, for advanced observability use cases, you might need access to these system-level metrics. To meet this requirement, you can use the Local Metrics Endpoint. Please note that the Local Metrics Endpoint is only available for BYOC Pro clusters.

Endpoint	Description
`https://metrics.streamnative.cloud/v1/cloud/metrics/export`	Export Pulsar resource metrics
`https://metrics.streamnative.cloud/v1/cloud/metrics/source/export`	Export Source connector metrics
`https://metrics.streamnative.cloud/v1/cloud/metrics/sink/export`	Export Sink connector metrics
`https://metrics.streamnative.cloud/v1/cloud/metrics/function/export`	Export Function metrics
`https://metrics.streamnative.cloud/v1/cloud/metrics/kafkaconnect/export`	Export Kafka Connect metrics
`https://metrics.streamnative.cloud/v1/cloud/metrics/health/export`	Export Cluster health metrics

Metrics authorization

To access and scrape metrics from the Cloud endpoints, you must use a Super Admin service account or a normal service account with metrics-viewer role.

Super Admin service account

To create a super admin service account, please check the create a service account.

metrics-viewer role

To bind a service account with metrics-viewer, your can configure it through snctl or terraform.

create a normal service account

snctl create serviceaccount metrics-account

create role binding with metrics-viewer

snctl create rolebinding metrics-viewer --serviceaccount metrics-account --clusterrole metrics-viewer

In case you want to remove the permission to list metrics you can delete the rolebinding

snctl delete rolebinding metrics-viewer

Pulsar resource metrics

Name	Type	Description
pulsar_topics_count	Gauge	The number of Pulsar topics of the namespace owned by this broker.
pulsar_subscriptions_count	Gauge	The number of Pulsar subscriptions of the topic served by this broker.
pulsar_producers_count	Gauge	The number of active producers of the topic connected to this broker.
pulsar_consumers_count	Gauge	The number of active consumers of the topic connected to this broker.
pulsar_rate_in	Gauge	The total message rate of the namespace coming into this broker (message/second).
pulsar_rate_out	Gauge	The total message rate of the namespace going out from this broker (message/second).
pulsar_throughput_in	Gauge	The total throughput of the topic coming into this broker (byte per second).
pulsar_throughput_out	Gauge	The total throughput of the topic going out from this broker (byte per second).
pulsar_storage_size	Gauge	The total storage size of the topics in this topic owned by this broker (bytes).
pulsar_storage_backlog_size	Gauge	The total backlog size of the topics of this topic owned by this broker (in bytes).
pulsar_storage_offloaded_size	Gauge	The total amount of the data in this topic offloaded to the tiered storage (bytes).
pulsar_storage_write_rate	Gauge	The total message batches (entries) written to the storage for this topic (message batch per second).
pulsar_storage_read_rate	Gauge	The total message batches (entries) read from the storage for this topic (message batch per second).
pulsar_subscription_delayed	Gauge	The total message batches (entries) are delayed for dispatching.
pulsar_broker_publish_latency	Summary	The total latency of pulsar broker publish.

Source connector metrics

Name	Type	Description
pulsar_source_written_total	Counter	The total number of records written to a Pulsar topic
pulsar_source_written_1min_total	Counter	The total number of records written to a Pulsar topic in the last 1 minute
pulsar_source_received_total	Counter	The total number of records received from source
pulsar_source_received_1min_total	Counter	The total number of records received from source in the last 1 minute
pulsar_source_last_invocation	Gauge	The timestamp of the last invocation of the source
pulsar_source_source_exception	Gauge	The exception from a source
pulsar_source_source_exceptions_total	Counter	The total number of source exceptions
pulsar_source_source_exceptions_1min_total	Counter	The total number of source exceptions in the last 1 minute
pulsar_source_system_exception	Gauge	The exception from system code
pulsar_source_system_exceptions_total	Counter	The total number of system exceptions
pulsar_source_system_exceptions_1min_total	Counter	The total number of system exceptions in the last 1 minute
pulsar_source_user_metric_*	Summary	The user-defined metrics
process_cpu_seconds_total	Counter	Total user and system CPU time spent in seconds.
jvm_memory_bytes_committed	Gauge	Committed (bytes) of a given JVM memory area.
jvm_memory_bytes_max	Gauge	Max (bytes) of a given JVM memory area.
jvm_memory_direct_bytes_used	Gauge	Used bytes of a given JVM memory area.
jvm_memory_bytes_init	Gauge	Initial bytes of a given JVM memory area.
jvm_gc_collection_seconds_sum	Summary	Time spent in a given JVM garbage collector in seconds.

Sink connector metrics

Name	Type	Description
pulsar_sink_written_total	Counter	The total number of records written to a Pulsar topic
pulsar_sink_written_1min_total	Counter	The total number of records written to a Pulsar topic in the last 1 minute
pulsar_sink_received_total	Counter	The total number of records received from sink
pulsar_sink_received_1min_total	Counter	The total number of records received from sink in the last 1 minute
pulsar_sink_last_invocation	Gauge	The timestamp of the last invocation of the sink
pulsar_sink_sink_exception	Gauge	The exception from a sink
pulsar_sink_sink_exceptions_total	Counter	The total number of sink exceptions
pulsar_sink_sink_exceptions_1min_total	Counter	The total number of sink exceptions in the last 1 minute
pulsar_sink_system_exception	Gauge	The exception from system code
pulsar_sink_system_exceptions_total	Counter	The total number of system exceptions
pulsar_sink_system_exceptions_1min_total	Counter	The total number of system exceptions in the last 1 minute
pulsar_sink_user_metric_*	Summary	The user-defined metrics
process_cpu_seconds_total	Counter	Total user and system CPU time spent in seconds.
jvm_memory_bytes_committed	Gauge	Committed (bytes) of a given JVM memory area.
jvm_memory_bytes_max	Gauge	Max (bytes) of a given JVM memory area.
jvm_memory_direct_bytes_used	Gauge	Used bytes of a given JVM memory area.
jvm_memory_bytes_init	Gauge	Initial bytes of a given JVM memory area.
jvm_gc_collection_seconds_sum	Summary	Time spent in a given JVM garbage collector in seconds.

Function metrics

Name	Type	Description
pulsar_function_processed_successfully_total	Counter	The total number of messages processed successfully
pulsar_function_processed_successfully_1min_total	Counter	The total number of messages processed successfully in the last 1 minute
pulsar_function_system_exceptions_total	Counter	The total number of system exceptions
pulsar_function_system_exceptions_1min_total	Counter	The total number of system exceptions in the last 1 minute
pulsar_function_user_exceptions_total	Counter	The total number of user exceptions
pulsar_function_user_exceptions_1min_total	Counter	The total number of user exceptions in the last 1 minute
pulsar_function_process_latency_ms	Summary	The process latency in milliseconds
pulsar_function_process_latency_ms_1min	Summary	The process latency in milliseconds in the last 1 minute
pulsar_function_last_invocation	Gauge	The timestamp of the last invocation of the function
pulsar_function_received_total	Counter	The total number of messages received from source
pulsar_function_received_1min_total	Counter	The total number of messages received from source in the last 1 minute
pulsar_function_user_metric_*	Summary	The user-defined metrics
process_cpu_seconds_total	Counter	Total user and system CPU time spent in seconds.
jvm_memory_bytes_committed	Gauge	Committed (bytes) of a given JVM memory area. (Java Functions only)
jvm_memory_bytes_max	Gauge	Max (bytes) of a given JVM memory area. (Java Functions only)
jvm_memory_direct_bytes_used	Gauge	Used bytes of a given JVM memory area. (Java Functions only)
jvm_memory_bytes_init	Gauge	Initial bytes of a given JVM memory area. (Java Functions only)
jvm_gc_collection_seconds_sum	Summary	Time spent in a given JVM garbage collector in seconds. (Java Functions only)

Kafka Connect metrics

Name	Type	Description
kafka_connect_connector_task_batch_size_avg	Gauge	The average size of the batches processed by the connector
kafka_connect_connector_task_batch_size_max	Gauge	The maximum size of the batches processed by the connector
kafka_connect_connector_task_offset_commit_avg_time_ms	Gauge	The average time in milliseconds taken by this task to commit offsets
kafka_connect_connector_task_offset_commit_failure_percentage	Gauge	The average percentage of this task’s offset commit attempts that failed
kafka_connect_connector_task_offset_commit_max_time_ms	Gauge	The maximum time in milliseconds taken by this task to commit offsets
kafka_connect_connector_task_offset_commit_success_percentage	Gauge	The average percentage of this task’s offset commit attempts that succeeded
kafka_connect_connector_task_pause_ratio	Gauge	The fraction of time this task has spent in the pause state
kafka_connect_connector_task_running_ratio	Gauge	The fraction of time this task has spent in the running state
kafka_connect_source_task_source_record_poll	Gauge	The total number of records produced/polled (before transformation) by this task belonging to the named source connector in this worker
kafka_connect_source_task_source_record_poll_rate	Gauge	The average per-second number of records produced/polled (before transformation) by this task belonging to the named source connector in this worker
kafka_connect_source_task_source_record_write	Gauge	The number of records output from the transformations and written to Kafka for this task belonging to the named source connector in this worker, since the task was last restarted
kafka_connect_source_task_source_record_write_rate	Gauge	The average per-second number of records output from the transformations and written to Kafka for this task belonging to the named source connector in this worker
kafka_connect_source_task_poll_batch_avg_time_ms	Gauge	The average time in milliseconds taken by this task to poll for a batch of source records
kafka_connect_source_task_poll_batch_max_time_ms	Gauge	The maximum time in milliseconds taken by this task to poll for a batch of source records
kafka_connect_source_task_source_record_active_count	Gauge	The number of records that have been produced by this task but not yet completely written to Kafka
kafka_connect_source_task_source_record_active_count_avg	Gauge	The average number of records that have been produced by this task but not yet completely written to Kafka
kafka_connect_source_task_source_record_active_count_max	Gauge	The maximum number of records that have been produced by this task but not yet completely written to Kafka
kafka_connect_sink_task_offset_commit_completion	Gauge	The total number of offset commit completions that were completed successfully
kafka_connect_sink_task_offset_commit_completion_rate	Gauge	The average per-second number of offset commit completions that were completed successfully
kafka_connect_sink_task_offset_commit_seq_no	Gauge	The current sequence number for offset commits
kafka_connect_sink_task_offset_commit_skip	Gauge	The total number of offset commit completions that were received too late and skipped/ignored
kafka_connect_sink_task_offset_commit_skip_rate	Gauge	The average per-second number of offset commit completions that were received too late and skipped/ignored
kafka_connect_sink_task_partition_count	Gauge	The number of topic partitions assigned to this task belonging to the named sink connector in this worker
kafka_connect_sink_task_put_batch_avg_time_ms	Gauge	The average time taken by this task to put a batch of sinks records
kafka_connect_sink_task_put_batch_max_time_ms	Gauge	The maximum time taken by this task to put a batch of sinks records
kafka_connect_sink_task_sink_record_active_count	Gauge	The number of records that have been read from Kafka but not yet completely committed/flushed/acknowledged by the sink task
kafka_connect_sink_task_sink_record_active_count_avg	Gauge	The average number of records that have been read from Kafka but not yet completely committed/flushed/acknowledged by the sink task
kafka_connect_sink_task_sink_record_active_count_max	Gauge	The maximum number of records that have been read from Kafka but not yet completely committed/flushed/acknowledged by the sink task
kafka_connect_sink_task_sink_record_read	Gauge	The total number of records read from Kafka by this task belonging to the named sink connector in this worker, since the task was last restarted
kafka_connect_sink_task_sink_record_read_rate	Gauge	The average per-second number of records read from Kafka for this task belonging to the named sink connector in this worker. This is before transformations are applied
kafka_connect_sink_task_sink_record_send	Gauge	The total number of records output from the transformations and sent/put to this task belonging to the named sink connector in this worker, since the task was last restarted
kafka_connect_sink_task_sink_record_send_rate	Gauge	The average per-second number of records output from the transformations and sent/put to this task belonging to the named sink connector in this worker
kafka_connect_task_error_deadletterqueue_produce_failures	Gauge	The number of failed writes to the dead letter queue
kafka_connect_task_error_deadletterqueue_produce_requests	Gauge	The number of attempted writes to the dead letter queue
kafka_connect_task_error_last_error_timestamp	Gauge	The epoch timestamp when this task last encountered an error
kafka_connect_task_error_total_errors_logged	Gauge	The total number of errors that were logged
kafka_connect_task_error_total_record_errors	Gauge	The total number of record processing errors in this task
kafka_connect_task_error_total_record_failures	Gauge	The total number of record processing failures in this task
kafka_connect_task_error_total_records_skipped	Gauge	The total number of records skipped due to errors
kafka_connect_task_error_total_retries	Gauge	The total number of operations retried
kafka_connect_worker_connector_destroyed_task_count	Gauge	The number of destroyed tasks of the connector on the worker
kafka_connect_worker_connector_failed_task_count	Gauge	The number of failed tasks of the connector on the worker
kafka_connect_worker_connector_paused_task_count	Gauge	The number of paused tasks of the connector on the worker
kafka_connect_worker_connector_restarting_task_count	Gauge	The number of restarting tasks of the connector on the worker
kafka_connect_worker_connector_running_task_count	Gauge	The number of running tasks of the connector on the worker
kafka_connect_worker_connector_total_task_count	Gauge	The number of tasks of the connector on the worker
kafka_connect_worker_connector_unassigned_task_count	Gauge	The number of unassigned tasks of the connector on the worker
process_cpu_seconds_total	Counter	Total user and system CPU time spent in seconds
jvm_memory_committed_bytes	Gauge	Committed (bytes) of a given JVM memory area
jvm_memory_max_bytes	Gauge	Max (bytes) of a given JVM memory area
jvm_memory_init_bytes	Gauge	Initial bytes of a given JVM memory area
jvm_memory_used_bytes	Gauge	Used bytes of a given JVM memory area
jvm_gc_collection_seconds_sum	Summary	Time spent in a given JVM garbage collector in seconds

Health metrics

Name	Type	Description
pulsar_detector_e2e_latency_ms	Summary	The latency distribution from message sending to message consumption
pulsar_detector_publish_latency_ms	Summary	The latency distribution of message sending
pulsar_detector_pulsar_sla_messaging_up	Gauge	The gauge for indicating the messaging service up or down
pulsar_detector_pulsar_sla_webservice_up	gauge	The gauge for indicating the webservice up or down
pulsar_detector_geo_latency_ms	Summary	The latency distribution Latency distribution from message sending to message consumption across clusters

Metrics API integration

The examples below demonstrate how to configure your observability tool to scrape the metrics endpoint. While StreamNative Cloud provides the metrics endpoint, it is your responsibility to set up and manage your own observability stack.

Prometheus integration

To collect Pulsar metrics into Prometheus, add the following to your Prometheus configuration file. The bearer tokens have a limited life cycle, therefore it is recommended to use the OAuth2 authentication method.

global:
  scrape_interval: 120s
  scrape_timeout: 60s
scrape_configs:
  - job_name: streamnative
    metrics_path: /v1/cloud/metrics/export
    scheme: https
    oauth2:
      client_id: '${client_id}'
      client_secret: '${client_secret}'
      token_url: https://auth.streamnative.cloud/oauth/token
      endpoint_params:
        grant_type: 'client_credentials'
        audience: '${audience}'
    static_configs:
      - targets: [metrics.streamnative.cloud]

You can find the values of client_id and client_secret in the Key file of a Super Admin Service Account. For more information, see work with service accounts. The audience parameter is the Uniform Resource Name (URN), which is a combination of the urn:sn:pulsar, the organization name, and the Pulsar instance name at StreamNative:

"urn:sn:pulsar:${org_name}:${instance_name}"

The Prometheus response can be large, if your cluster has a lot of topics. Make sure to set the scrape_timeout parameter large enough to cover the duration of the curl request above. Your scrape_interval parameter should also be larger than your scrape_timeout parameter.

OpenTelemetry collector integration

The OpenTelemetry collector, as described on its official page, is a vendor-agnostic agent process designed for gathering and sending telemetry data from various sources. StreamNative Cloud, which outputs its metrics in the Prometheus format, is compatible with the OpenTelemetry collector. To collect metrics from StreamNative Cloud, configure your OpenTelemetry collector to utilize the Prometheus Receiver, which is fully compatible with Prometheus’s scape_config settings. To configure your collector, refer to the guidance provided in the Prometheus Integration section. There, you will find instructions to create a scape_config for collecting metrics from StreamNative Cloud. This config should be placed in your collector’s configuration file under the following section:

receivers:
  prometheus:
    config:

An example of such configuration is as follows:

receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: streamnative
          metrics_path: /v1/cloud/metrics/export
          scheme: https
          oauth2:
            client_id: '${client_id}'
            client_secret: '${client_secret}'
            token_url: https://auth.streamnative.cloud/oauth/token
            endpoint_params:
              grant_type: 'client_credentials'
              audience: '${audience}'
          static_configs:
            - targets: [metrics.streamnative.cloud]

The OpenTelemetry collector’s versatility allows it to support a range of exporters, facilitating the routing of metrics from StreamNative Cloud to various observability platforms. A comprehensive list of supported exporters by the OpenTelemetry collector is available here.

NewRelic integration

You can use a Prometheus instance to forward metrics to NewRelic. To do this, add a remote_write entry to the prometheus.yml configuration file as described in the Prometheus Integration section:

remote_write:
  - url: https://metric-api.newrelic.com/prometheus/v1/write?prometheus_server=streamnative
    authorization:
      credentials: '${newrelic_ingest_key}'

The NewRelic ingestion point could also be metric-api.eu.newrelic.com depending on your account configuration.

Then by running a Prometheus instance, the Pulsar metrics are scraped from the StreamNative endpoint and forwarded to NewRelic:

prometheus --config.file=prometheus.yml

If you want to keep data from going into this Prometheus instance, you can setup a short retention time with the storage.tsdb.retention.time parameter:

prometheus --config.file=prometheus.yml --storage.tsdb.retention.time=15m

Grafana Cloud integration

You can use a Prometheus instance to forward metrics to Grafana Cloud. To do this, add a remote_write entry to the prometheus.yml configuration file as described in the Prometheus Integration section:

remote_write:
  - url: ${grafana_cloud_endpoint}/api/prom/push
    basic_auth:
      username: '${grafana_cloud_username}'
      password: '${grafana_cloud_api_key}'

You can find the grafana_cloud_endpoint and grafana_cloud_username values by selecting Prometheus at https://grafana.com/orgs/${grafana_org}. You can find grafana_cloud_api_key at https://grafana.com/orgs/${grafana_org}/api-keys. Then by running a Prometheus instance, the Pulsar metrics are scraped from the StreamNative endpoint and forwarded to Grafana Cloud:

prometheus --config.file=prometheus.yml

If you want to keep data from going into this Prometheus instance, you can setup a short retention time with the storage.tsdb.retention.time parameter:

prometheus --config.file=prometheus.yml --storage.tsdb.retention.time=15m

Datadog integration

Integrate with Datadog Agent

The integration with StreamNative Cloud requires the PR 16812 which released in the Datadog Agent 7.52.0.

Using Datadog Agent, you can connect Datadog to the StreamNative Cloud Metrics endpoint to start collecting metrics. Datadog Agent supports most platform to host and this documentation will mainly to demonstrate with Docker and Kubernetes.

Create a file conf.yaml, with the spec of your Datadog Agent deployment configuration.

init_config:
  service: docker

instances:
  - openmetrics_endpoint: https://metrics.streamnative.cloud/v1/cloud/metrics/export
    request_size: 900
    min_collection_interval: 180
    metrics:
      - pulsar_topics_count:
          type: gauge
          name: pulsar_topics_count
      - pulsar_subscriptions_count:
          type: gauge
          name: pulsar_subscriptions_count
      - pulsar_producers_count:
          type: gauge
          name: pulsar_producers_count
      - pulsar_consumers_count:
          type: gauge
          name: pulsar_consumers_count
      - pulsar_rate_in:
          type: gauge
          name: pulsar_rate_in
      - pulsar_rate_out:
          type: gauge
          name: pulsar_rate_out
      - pulsar_throughput_in:
          type: gauge
          name: pulsar_throughput_in
      - pulsar_throughput_out:
          type: gauge
          name: pulsar_throughput_out
      - pulsar_storage_size:
          type: gauge
          name: pulsar_storage_size
      - pulsar_storage_backlog_size:
          type: gauge
          name: pulsar_storage_backlog_size
      - pulsar_storage_offloaded_size:
          type: gauge
          name: pulsar_storage_offloaded_size
      - pulsar_storage_read_rate:
          type: gauge
          name: pulsar_storage_read_rate
      - pulsar_subscription_delayed:
          type: gauge
          name: pulsar_subscription_delayed
      - pulsar_storage_write_latency_le_0_5:
          type: histogram
          name: pulsar_storage_write_latency_le_0_5
      - pulsar_storage_write_latency_le_1:
          type: histogram
          name: pulsar_storage_write_latency_le_1
      - pulsar_storage_write_latency_le_5:
          type: histogram
          name: pulsar_storage_write_latency_le_5
      - pulsar_storage_write_latency_le_10:
          type: histogram
          name: pulsar_storage_write_latency_le_10
      - pulsar_storage_write_latency_le_20:
          type: histogram
          name: pulsar_storage_write_latency_le_20
      - pulsar_storage_write_latency_le_50:
          type: histogram
          name: pulsar_storage_write_latency_le_50
      - pulsar_storage_write_latency_le_100:
          type: histogram
          name: pulsar_storage_write_latency_le_100
      - pulsar_storage_write_latency_le_200:
          type: histogram
          name: pulsar_storage_write_latency_le_200
      - pulsar_storage_write_latency_le_1000:
          type: histogram
          name: pulsar_storage_write_latency_le_1000
      - pulsar_storage_write_latency_le_overflow:
          type: histogram
          name: pulsar_storage_write_latency_le_overflow
      - pulsar_entry_size_le_128:
          type: histogram
          name: pulsar_entry_size_le_128
      - pulsar_entry_size_le_512:
          type: histogram
          name: pulsar_entry_size_le_512
      - pulsar_entry_size_le_1_kb:
          type: histogram
          name: pulsar_entry_size_le_1_kb
      - pulsar_entry_size_le_4_kb:
          type: histogram
          name: pulsar_entry_size_le_4_kb
      - pulsar_entry_size_le_16_kb:
          type: histogram
          name: pulsar_entry_size_le_16_kb
    auth_token:
      reader:
        type: oauth
        url: https://auth.streamnative.cloud/oauth/token
        client_id: { your-admin-service-account-client-id }
        client_secret: { your-admin-service-account-client-secret }
        options:
          audience: urn:sn:pulsar:{your-organization}:{your-instance}
      writer:
        type: header
        name: Authorization
        value: Bearer <TOKEN>
        placeholder: <TOKEN>

[1] client_id: Required. You need to prepare a service account with Super Admin pemision and the client_id can be obtained from an OAuth2 credential file.
[2] client_secret: Required. You need to prepare a service account with Super Admin pemision andt the client_id can be obtained from an OAuth2 credential file.
[3] audience: Required. Audience is the Uniform Resource Name (URN), which is a combination of the urn:sn:pulsar, your organization name, and your Pulsar instance name. {organization} is the name of your organization and the {instance} is the name of your instance.

Run the docker commands to create a Datadog Agent container:

docker run -d --name dd-agent \
-e DD_API_KEY={ your-Datadog-API-Key } \
-e DD_SITE={ your-Datadog-Site-region } \
-e DD_APM_NON_LOCAL_TRAFFIC=true \
-v {your-config-yaml-file-path}:/etc/datadog-agent/conf.d/openmetrics.d/conf.yaml:ro \
-v /var/run/docker.sock:/var/run/docker.sock:ro \
-v /proc/:/host/proc/:ro \
-v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro \
-v /var/lib/docker/containers:/var/lib/docker/containers:ro \
datadog/agent:7.52.0

[1] DD_API_KEY: Your Datadog API key.
[2] DD_SITE: Destination site for your metrics, traces, and logs. Set your Datadog site to: datadoghq.com. Defaults to datadoghq.com.
[3] your-config-yaml-file-path: The conf.yaml configuration file created in the first step.

More detailed usage please refer the Docker Agent for Docker.

Bridge with OpenTelemetry

You can use OpenTelemetry Collector to collect the metrics from StreamNative Cloud and export them to Datadog. To export metrics to Datadog, you can use the Datadog Exporter and add it to your OpenTelemetry Collector configuration. Use the example file which provides a basic configuration that is ready to use after you set your Datadog API key as the ${DD_API_KEY} variable:

receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: streamnative
          metrics_path: /v1/cloud/metrics/export
          scheme: https
          oauth2:
            client_id: '${client_id}'
            client_secret: '${client_secret}'
            token_url: https://auth.streamnative.cloud/oauth/token
            endpoint_params:
              grant_type: 'client_credentials'
              audience: '${audience}'
          static_configs:
            - targets: [metrics.streamnative.cloud]

processors:
  batch:
    send_batch_max_size: '10MiB'
    send_batch_size: 4096
    timeout: 120s

exporters:
  datadog:
    api:
      site: ${DD_SITE}
      key: ${DD_API_KEY}

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [batch]
      exporters: [datadog]

Where ${DD_SITE} is your site, . The above configuration enables the receiving of metrics from StreamNative Cloud, sets up a batch processor, which is mandatory for any non-development environment, and exports to Datadog. You can refer to this full documented example configuration file for all possible configuration options for Datadog Exporter.

Introduction

Get Started

Clusters

Data Streams

Security

Governance

Connect

Lakehouse

Process

Networking

Log And Monitor

Universal Linking

Billing

References

Cluster Metrics

Metrics endpoint

Metrics authorization

Super Admin service account

metrics-viewer role

Pulsar resource metrics

Source connector metrics

Sink connector metrics

Function metrics

Kafka Connect metrics

Health metrics

Metrics API integration

Prometheus integration

OpenTelemetry collector integration

NewRelic integration

Grafana Cloud integration

Datadog integration

Integrate with Datadog Agent

Bridge with OpenTelemetry

Introduction

Get Started

Clusters

Data Streams

Security

Governance

Connect

Lakehouse

Process

Networking

Log And Monitor

Universal Linking

Billing

References

​Metrics endpoint

​Metrics authorization

​Super Admin service account

​metrics-viewer role

​Pulsar resource metrics

​Source connector metrics

​Sink connector metrics

​Function metrics

​Kafka Connect metrics

​Health metrics

​Metrics API integration

​Prometheus integration

​OpenTelemetry collector integration

​NewRelic integration

​Grafana Cloud integration

​Datadog integration

​Integrate with Datadog Agent

​Bridge with OpenTelemetry

Metrics endpoint

Metrics authorization

Super Admin service account

metrics-viewer role

Pulsar resource metrics

Source connector metrics

Sink connector metrics

Function metrics

Kafka Connect metrics

Health metrics

Metrics API integration

Prometheus integration

OpenTelemetry collector integration

NewRelic integration

Grafana Cloud integration

Datadog integration

Integrate with Datadog Agent

Bridge with OpenTelemetry