Cluster Metrics
Metrics are a valuable tool for gaining visibility into your Cloud deployment. StreamNative Cloud provides a broad range of metrics that you can use to fine-tune performance and troubleshoot issues.
Exposed metrics
StreamNative Cloud provides an endpoint that exposes real-time metrics in Prometheus metrics format. The following table lists the metrics currently exposed.
Note
Some metrics may be missing in empty clusters.
Name | Type | Description |
---|---|---|
pulsar_topics_count | Gauge | The number of Pulsar topics of the namespace owned by this broker. |
pulsar_subscriptions_count | Gauge | The number of Pulsar subscriptions of the namespace served by this broker. |
pulsar_producers_count | Gauge | The number of active producers of the namespace connected to this broker. |
pulsar_consumers_count | Gauge | The number of active consumers of the namespace connected to this broker. |
pulsar_rate_in | Gauge | The total message rate of the namespace coming into this broker (messages/second). |
pulsar_rate_out | Gauge | The total message rate of the namespace going out from this broker (messages/second). |
pulsar_throughput_in | Gauge | The total throughput of the namespace coming into this broker (bytes/second). |
pulsar_throughput_out | Gauge | The total throughput of the namespace going out from this broker (bytes/second). |
pulsar_storage_size | Gauge | The total storage size of the topics in this namespace owned by this broker (bytes). |
pulsar_storage_backlog_size | Gauge | The total backlog size of the topics of this namespace owned by this broker (messages). |
pulsar_storage_offloaded_size | Gauge | The total amount of the data in this namespace offloaded to the tiered storage (bytes). |
pulsar_storage_write_rate | Gauge | The total message batches (entries) written to the storage for this namespace (message batches / second). |
pulsar_storage_read_rate | Gauge | The total message batches (entries) read from the storage for this namespace (message batches / second). |
pulsar_subscription_delayed | Gauge | The total message batches (entries) delayed for dispatching. |
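pulsar_storage_write_latency_le_* | Histogram | The entry rate of a namespace for which the storage write latency is smaller than a given threshold. Each available threshold is exposed as its own metric, with the threshold in place of the * suffix. |
pulsar_entry_size_le_* | Histogram | The entry rate of a namespace for which the entry size is smaller than a given threshold. Each available threshold is exposed as its own metric, with the threshold in place of the * suffix. |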
Prometheus endpoint
The export endpoint can be used to collect real-time metrics in Prometheus metrics format. First, you need a token from a Super Admin service account.
Note
- Before getting the token of a service account, verify that the service account is authorized as a superuser or an admin of the tenants and namespaces.
- A token has a system-defined Time-To-Live (TTL) of 7 days. Before a token expires, ensure that you generate a new token for your service account.
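The 7-day TTL above lends itself to a simple rotation check. The following is a minimal sketch, assuming you record the time at which each token was generated. The names TOKEN_TTL, token_expiry, and needs_rotation, as well as the one-day safety margin, are illustrative choices and not part of StreamNative Cloud.

```python
from datetime import datetime, timedelta, timezone

# TTL stated in the note above; the safety margin is an assumption.
TOKEN_TTL = timedelta(days=7)
DEFAULT_MARGIN = timedelta(days=1)


def token_expiry(generated_at):
    """Return the moment a token generated at `generated_at` expires."""
    return generated_at + TOKEN_TTL


def needs_rotation(generated_at, now, margin=DEFAULT_MARGIN):
    """Return True once `now` is within `margin` of the token's expiry."""
    return now >= token_expiry(generated_at) - margin


if __name__ == "__main__":
    generated = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print("expires at:", token_expiry(generated).isoformat())
    print("rotate now:", needs_rotation(generated, datetime.now(timezone.utc)))
```

A scheduled job could call needs_rotation and alert you to generate a new token before the old one stops working.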
To get a token using the StreamNative Console, follow these steps.
On the left navigation pane, click Service Accounts.
In the row of the service account you want to use, in the Token column, click Generate new token, then click the Copy icon to copy the token to your clipboard.
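Then pass the token in the Authorization header when you request the export endpoint. The following command assumes the token is stored in the TOKEN environment variable:
curl https://metrics.streamnative.cloud/cloud/metrics/export \
  -H "Authorization: Bearer ${TOKEN}"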
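The curl request above can also be issued from a script. The following is a rough sketch using only the Python standard library. The function names fetch_metrics and parse_samples are hypothetical, and the parser is deliberately simplified: it skips comment lines, ignores optional timestamps, and assumes that any closing brace on a sample line ends the label set.

```python
import urllib.request

METRICS_URL = "https://metrics.streamnative.cloud/cloud/metrics/export"


def fetch_metrics(token, timeout=60.0):
    """Fetch the raw Prometheus text exposition using a bearer token."""
    request = urllib.request.Request(
        METRICS_URL, headers={"Authorization": "Bearer " + token}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_samples(text):
    """Parse sample lines into (series, value) pairs.

    `series` is the metric name including its label set, if any.
    """
    samples = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        brace = line.rfind("}")
        if brace != -1:
            series, rest = line[: brace + 1], line[brace + 1 :].split()
        else:
            parts = line.split()
            series, rest = parts[0], parts[1:]
        if not rest:
            continue  # malformed line without a value
        samples.append((series, float(rest[0])))
    return samples
```

Calling parse_samples(fetch_metrics(token)) returns a list that you can filter by metric name, for example to pick out every pulsar_topics_count series.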
Prometheus integration
To collect Pulsar metrics into Prometheus, add the following to your Prometheus configuration file. Because bearer tokens expire, the OAuth2 authentication method is recommended.
global:
  scrape_interval: 120s
  scrape_timeout: 60s
scrape_configs:
  - job_name: streamnative
    metrics_path: /cloud/metrics/export
    scheme: https
    oauth2:
      client_id: '${client_id}'
      client_secret: '${client_secret}'
      token_url: https://auth.streamnative.cloud/oauth/token
      endpoint_params:
        grant_type: 'client_credentials'
        audience: '${audience}'
    static_configs:
      - targets: [metrics.streamnative.cloud]
You can find the values of client_id and client_secret in the Key file of a Super Admin Service Account. For more information, see work with service accounts.
The audience parameter is a Uniform Resource Name (URN) that combines the urn:sn:pulsar prefix, the organization name, and the Pulsar instance name at StreamNative:
"urn:sn:pulsar:${org_name}:${instance_name}"
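As a small illustration of the URN format above, the following sketch builds the audience value from its two variable parts. The function name audience_urn is hypothetical, and rejecting empty values or values containing a colon is an assumption made here to catch obvious mistakes, not a documented StreamNative rule.

```python
def audience_urn(org_name, instance_name):
    """Build the audience as urn:sn:pulsar:<org_name>:<instance_name>."""
    for label, value in (("org_name", org_name), ("instance_name", instance_name)):
        # Assumption: a colon would make the URN ambiguous, so reject it.
        if not value or ":" in value:
            raise ValueError("invalid " + label + ": " + repr(value))
    return "urn:sn:pulsar:" + org_name + ":" + instance_name


if __name__ == "__main__":
    # Placeholder names; substitute your own organization and instance.
    print(audience_urn("my-org", "my-instance"))
```

The returned string is what replaces ${audience} in the endpoint_params of the Prometheus configuration.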
The Prometheus response can be large if your cluster has a lot of topics. Set the scrape_timeout parameter high enough to cover the duration of the curl request above. Your scrape_interval parameter should also be larger than your scrape_timeout parameter.
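The relationship between the two parameters above can be checked before you deploy a configuration. The following is a minimal sketch with hypothetical helper names (duration_ms, timeout_fits_interval). It accepts Prometheus-style durations such as 120s or 1m30s, but it is a simplification: it does not enforce every rule that Prometheus applies to duration strings, such as unit ordering.

```python
import re

# Milliseconds per unit, for the duration units used by Prometheus.
_UNIT_MS = {
    "ms": 1,
    "s": 1000,
    "m": 60 * 1000,
    "h": 60 * 60 * 1000,
    "d": 24 * 60 * 60 * 1000,
    "w": 7 * 24 * 60 * 60 * 1000,
    "y": 365 * 24 * 60 * 60 * 1000,
}
_TOKEN = re.compile(r"(\d+)(ms|s|m|h|d|w|y)")


def duration_ms(text):
    """Convert a duration such as '120s' or '1m30s' to milliseconds."""
    if not text:
        raise ValueError("empty duration")
    total, pos = 0, 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError("invalid duration: " + repr(text))
        total += int(match.group(1)) * _UNIT_MS[match.group(2)]
        pos = match.end()
    return total


def timeout_fits_interval(scrape_interval, scrape_timeout):
    """Return True when scrape_interval is larger than scrape_timeout."""
    return duration_ms(scrape_interval) > duration_ms(scrape_timeout)


if __name__ == "__main__":
    # Values taken from the configuration shown earlier.
    print(timeout_fits_interval("120s", "60s"))
```

To choose a timeout in the first place, time the curl request against your own cluster and leave some headroom above the slowest run you observe.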
NewRelic integration
Currently, remote writing of metrics directly into NewRelic is not supported. You can use a Prometheus instance to forward metrics to NewRelic. To do this, add a remote_write entry to the prometheus.yml configuration file described in the Prometheus integration section:
remote_write:
  - url: https://metric-api.newrelic.com/prometheus/v1/write?prometheus_server=streamnative
    authorization:
      credentials: '${newrelic_ingest_key}'
Note
The NewRelic ingestion point could also be metric-api.eu.newrelic.com, depending on your account configuration.
Then run a Prometheus instance. It scrapes the Pulsar metrics from the StreamNative endpoint and forwards them to NewRelic:
prometheus --config.file=prometheus.yml
If you do not want this Prometheus instance to retain the data for long, you can set a short retention time with the storage.tsdb.retention.time parameter:
prometheus --config.file=prometheus.yml --storage.tsdb.retention.time=15m
Grafana Cloud integration
Currently, remote writing of metrics directly into Grafana Cloud is not supported. You can use a Prometheus instance to forward metrics to Grafana Cloud. To do this, add a remote_write entry to the prometheus.yml configuration file described in the Prometheus integration section:
remote_write:
  - url: ${grafana_cloud_endpoint}/api/prom/push
    basic_auth:
      username: '${grafana_cloud_username}'
      password: '${grafana_cloud_api_key}'
You can find the grafana_cloud_endpoint and grafana_cloud_username values by selecting Prometheus at https://grafana.com/orgs/${grafana_org}. You can find grafana_cloud_api_key at https://grafana.com/orgs/${grafana_org}/api-keys.
Then run a Prometheus instance. It scrapes the Pulsar metrics from the StreamNative endpoint and forwards them to Grafana Cloud:
prometheus --config.file=prometheus.yml
If you do not want this Prometheus instance to retain the data for long, you can set a short retention time with the storage.tsdb.retention.time parameter:
prometheus --config.file=prometheus.yml --storage.tsdb.retention.time=15m