
Lakehouse tiered storage metrics

Note

This feature is currently in private preview. If you want to try it out or have any questions, submit a ticket to the support team.

Introduction

Monitoring tiered storage metrics in a private cloud setup shows how efficiently data is being offloaded and how the offload processes perform. To expose these metrics, enable the embedded stat service (embeddedStatServiceEnabled) and set the metrics port (statServicePort) in the offload.conf file. The metrics then give detailed insight into the offload framework's operations.
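
As a minimal sketch of checking those two settings, the Python snippet below parses configuration text and reports the metrics port only when the stat service is enabled. It assumes offload.conf uses plain key=value lines with # comments, which this page does not confirm. The sample contents and the port 9094 are hypothetical placeholders, not documented defaults.

```python
def parse_conf(text):
    """Parse key=value lines, skipping blank lines and # comments (assumed format)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def stat_service_port(settings):
    """Return the metrics port if the embedded stat service is enabled, else None."""
    if settings.get("embeddedStatServiceEnabled", "").lower() != "true":
        return None
    port = settings.get("statServicePort", "")
    return int(port) if port.isdigit() else None


# Hypothetical excerpt; 9094 is a placeholder, not a documented default.
SAMPLE_CONF = """
# offload.conf (illustrative excerpt)
embeddedStatServiceEnabled=true
statServicePort=9094
"""

print(stat_service_port(parse_conf(SAMPLE_CONF)))
```

A result of None means either the service is disabled or the port is missing or non-numeric, so both cases should be checked before trying to collect metrics.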

Components of Tiered Storage Metrics

  1. Service Stats: resource usage statistics at the offload service level.
  2. Offload Stats: performance of the offload framework during data offloading, with namespace, topic, and partition labels.
  3. Container Stats: message container activity, useful for optimizing how data is flushed to the Lakehouse storage.
  4. Write Stats: latency and throughput of encoding and writing data to the Lakehouse storage.
  5. Read Stats: reads from the Lakehouse storage, including cache hits, misses, and entry retrieval metrics.
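
The metric names listed on this page share one prefix per component, so a collected metric can be mapped back to its component by prefix. The Python sketch below shows one way to do that. The prefixes are taken from the metric tables on this page, while the helper names are illustrative.

```python
# Prefix-to-component mapping, derived from the metric names on this page.
COMPONENT_PREFIXES = {
    "offload_service_": "Service Stats",
    "offload_framework_": "Offload Stats",
    "offload_container_": "Container Stats",
    "storage_write_": "Write Stats",
    "storage_read_": "Read Stats",
}


def component_of(metric_name):
    """Return the component a metric belongs to, or None if the prefix is unknown."""
    for prefix, component in COMPONENT_PREFIXES.items():
        if metric_name.startswith(prefix):
            return component
    return None


def group_by_component(metric_names):
    """Group metric names by component; unknown names are grouped under None."""
    groups = {}
    for name in metric_names:
        groups.setdefault(component_of(name), []).append(name)
    return groups


names = ["storage_read_READ_CACHE_HITS", "offload_container_BYTES_IN"]
print(group_by_component(names))
```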

[Figure: Lakehouse tiered storage architecture]

Service Stats Metrics

Metric Name | Type | Description
offload_service_IN_USAGE_CONTAINERS_COUNT | Gauge | Number of allocated message containers
offload_service_IDLE_CONTAINERS_COUNT | Gauge | Number of idle message containers
offload_service_LEDGER_DELETION_LATENCY | Summary | Latency for ledger deletions
offload_service_PARALLEL_READ_CONTAINERS | Gauge | Number of parallel read containers
offload_service_OFFLOADING_TOPICS_COUNT | Gauge | Number of parallel offloading topics
offload_service_READ_CACHE_SIZE | Gauge | Size of the read cache in bytes
offload_service_READ_CACHE_COUNT | Gauge | Entry count in the read cache

Offload Stats Metrics

Metric Name | Type | Description
offload_framework_WAITING_PROMISE | Gauge | Number of waiting offload promises
offload_framework_MESSAGE_OUT_OF_ORDER_COUNT | Counter | Count of out-of-order messages
offload_framework_MESSAGE_PUT_INTO_CONTAINER_FAILED_COUNT | Counter | Count of failed message placements into containers
offload_framework_MESSAGE_PUT_INTO_CONTAINER_RETRY_COUNT | Counter | Count of retried message placements into containers
offload_framework_FETCH_LEDGER_METADATA_LATENCY | Summary | Latency for fetching ledger metadata
offload_framework_FETCH_TOPIC_SCHEMA_LATENCY | Summary | Latency for fetching topic schemas
offload_framework_ACKNOWLEDGE_FAILED_COUNT | Counter | Count of failed message acknowledgments
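
Offload stats carry namespace, topic, and partition labels, so a consumer of these metrics usually needs to split each sample into name, labels, and value. The Python sketch below does this for a single line, assuming the stat service emits the Prometheus text exposition format, which this page does not state. The sample line, its label spellings, and its values are hypothetical.

```python
import re

# One sample line: name, optional {labels}, value, optional timestamp.
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)(?:\s+-?\d+)?$'
)
# One key="value" pair inside the braces; allows escaped characters in the value.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Split one exposition-format line into (name, labels, value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a metric sample: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


# Hypothetical sample; label names and values are assumptions for illustration.
line = ('offload_framework_ACKNOWLEDGE_FAILED_COUNT'
        '{namespace="public/default",topic="orders",partition="0"} 3')
name, labels, value = parse_sample(line)
print(name, labels["topic"], value)
```

For production use, an existing Prometheus client or scraper is likely a better choice than a hand-written parser, since it also handles comment lines, escaping rules, and Summary quantiles.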

Container Stats Metrics

Metric Name | Type | Description
offload_container_SWITCH_COUNT | Counter | Container switch count
offload_container_CURRENT_LEDGERID | Gauge | Current container's ledger ID
offload_container_BYTES_IN | Counter | Bytes put into message containers
offload_container_MESSAGE_IN | Counter | Number of messages put into containers
offload_container_MESSAGE_COUNT_IN_LAST_CONTAINER | Gauge | Message count in the last container
offload_container_BYTES_IN_LAST_CONTAINER | Gauge | Bytes in the last container
offload_container_ALLOCATED_COUNT | Gauge | Number of allocated containers per topic
offload_container_LIFECYCLE_TIME | Gauge | Lifecycle time of a container
offload_container_PROCESS_FAILED_COUNT | Counter | Count of container processing failures
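
One derived value that can help when tuning container flushing is the average message size per topic, computed from offload_container_BYTES_IN and offload_container_MESSAGE_IN. The Python sketch below is illustrative only. The topic names and counter values are made up, and it assumes both counters were read at the same moment.

```python
def average_message_size(bytes_in, messages_in):
    """Average bytes per message, or None when no messages have been counted."""
    if messages_in == 0:
        return None
    return bytes_in / messages_in


# Hypothetical per-topic readings: topic -> (BYTES_IN, MESSAGE_IN).
readings = {
    "orders": (2_048_000, 1_000),
    "audit": (0, 0),
}

for topic, (bytes_in, messages_in) in readings.items():
    print(topic, average_message_size(bytes_in, messages_in))
```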

Write Stats Metrics

Metric Name | Type | Description
storage_write_MESSAGE_ENCODE_LATENCY | Summary | Latency for message encoding
storage_write_CONTAINER_WRITE_LATENCY | Summary | Latency for writing container data to Lakehouse storage
storage_write_CONTAINER_WRITE_MESSAGE_LATENCY | Summary | Latency for writing messages in a container
storage_write_CONTAINER_WRITE_METADATA_LATENCY | Summary | Latency for writing metadata in a container
storage_write_CONTAINER_WRITE_FLUSH_LATENCY | Summary | Latency for flushing a container
storage_write_CONTAINER_WRITE_COMMIT_RETRY_COUNT | Counter | Retry count for container write commits
storage_write_DATA_DELETION_LATENCY | Summary | Latency for data deletion in the Lakehouse storage
storage_write_BYTES_OUT | Counter | Bytes written to the Lakehouse storage
storage_write_MESSAGE_OUT | Counter | Message count written to the Lakehouse storage
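
Because storage_write_BYTES_OUT and storage_write_MESSAGE_OUT are counters, write throughput comes from the difference between two readings divided by the time between them. The Python sketch below shows that calculation. The reset handling (treating a drop in the counter as a restart from zero) is a common convention assumed here, not something this page specifies, and the sample values are hypothetical.

```python
def counter_rate(prev_value, curr_value, interval_seconds):
    """Per-second rate of a monotonically increasing counter between two readings."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_value - prev_value
    if delta < 0:
        # Assumed convention: the counter restarted from zero between readings,
        # so only the current value is known to have accumulated.
        delta = curr_value
    return delta / interval_seconds


# Hypothetical readings of storage_write_BYTES_OUT taken 30 seconds apart.
print(counter_rate(1_000, 4_000, 30))
```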

Read Stats Metrics

Metric Name | Type | Description
storage_read_READ_ADDITIONAL_DATA_LATENCY | Summary | Latency for reading ledger metadata
storage_read_READ_ENTRIES_LATENCY | Summary | Latency for reading entries from Lakehouse storage
storage_read_READ_CACHE_HITS | Counter | Count of read cache hits
storage_read_READ_CACHE_MISSES | Counter | Count of read cache misses
storage_read_READ_CACHE_HIT_ENTRIES | Counter | Number of entries hit in the read cache
storage_read_READ_CACHE_HIT_BYTES | Counter | Number of bytes hit in the read cache
storage_read_READ_CACHE_PREFETCHED_ENTRIES | Counter | Number of prefetched entries in the read cache
storage_read_READ_CACHE_PREFETCHED_BYTES | Counter | Number of prefetched bytes in the read cache
storage_read_READ_ENTRIES_FROM_STORAGE_COUNT | Counter | Number of entries read from Lakehouse storage
storage_read_READ_BYTES_FROM_STORAGE_COUNT | Counter | Number of bytes read from Lakehouse storage
storage_read_READ_FROM_STORAGE_LATENCY | Summary | Latency for reading entries from Lakehouse storage
storage_read_MESSAGE_DECODE_LATENCY | Summary | Latency for decoding fetched messages into Pulsar entries
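
A common way to summarize read cache effectiveness is the hit ratio, computed from storage_read_READ_CACHE_HITS and storage_read_READ_CACHE_MISSES. The Python sketch below assumes every cache lookup increments exactly one of the two counters, which this page does not confirm. The input values are hypothetical.

```python
def cache_hit_ratio(hits, misses):
    """Fraction of cache lookups that were hits, or None if there were no lookups."""
    total = hits + misses
    if total == 0:
        return None
    return hits / total


# Hypothetical counter readings.
print(cache_hit_ratio(90, 10))
```

Since both inputs are cumulative counters, applying the same formula to the increase over a recent window gives a more current picture than the lifetime totals.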

Additional Metrics for Delta Offloader

The Delta offloader exposes additional metrics specific to the Delta writer, reader, and index design, which extend monitoring to Delta-based operations.

Together, these metrics let you monitor and optimize the performance of your Lakehouse tiered storage in a private cloud environment.
