Configure Auto-Scaling
Auto-Scaling automatically adjusts the available resources of your deployments and eliminates the need for scripts or consulting services to make scaling decisions. It currently supports scaling Pulsar nodes only and works on a rolling basis, so the process doesn't incur any downtime.
You can specify the minimum and maximum number of Pulsar nodes that your Pulsar cluster can automatically scale to. The Cloud scaler monitors the CPU workload of the Broker nodes and adjusts the node count based on the scaling rule.
Overview of Auto-Scaling
One of the significant challenges faced by organizations dealing with real-time data is ensuring that the underlying infrastructure can handle varying workloads. Traditional scaling methods often involve manual intervention, leading to inefficiency and increased operational costs. Private Cloud Auto-Scaling addresses these challenges by providing an intelligent, automated solution.
The Power of Private Cloud Auto-Scaling:
- Dynamic Resource Allocation: Private Cloud Auto-Scaling dynamically adjusts resources based on the incoming workload. Whether it's handling a sudden spike in traffic or scaling down during periods of low activity, Pulsar ensures optimal resource utilization, leading to cost savings and improved performance.
- Efficient Load Balancing: Auto-Scaling in Pulsar ensures that the message processing load is evenly distributed across brokers. This prevents any single broker from becoming a bottleneck, allowing the system to maintain high throughput and low latency even under heavy loads.
- Cost-Effective Scaling: Traditional scaling methods often result in over-provisioning to handle peak loads, leading to unnecessary costs. Private Cloud Auto-Scaling optimizes resource allocation, ensuring that organizations pay only for the resources they need, making it a cost-effective solution for real-time data processing.
After Auto-Scaling is enabled, the scaling controller tracks the average metrics usage of the Pulsar nodes and adjusts the node count to keep usage at the target level. If the average usage is above the target, the scaling controller scales out to add more nodes; if it is below the target, the scaling controller scales in to save resources.
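For illustration, if the scaling controller follows the standard Kubernetes HPA calculation (an assumption for this example; the exact algorithm of the Private Cloud scaler is not specified here), the desired node count can be estimated as shown below.
# Illustration only: standard Kubernetes HPA formula, assumed to apply here.
# desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue)
# Example: 3 Broker nodes averaging 90% CPU against an 80% utilization target.
echo $(( (3 * 90 + 80 - 1) / 80 ))   # prints 4 -> the cluster scales out by one node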
Auto-Scaling with resource metrics
Prerequisites
Private Cloud Auto-Scaling uses resource metrics for Pod scaling by default, which requires the Kubernetes Metrics Server to provide metrics.
- Deploy the Metrics Server with the following command:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
- Verify that the metrics-server deployment is running:
kubectl get deployment metrics-server -n kube-system
NAME             READY   UP-TO-DATE   AVAILABLE   AGE
metrics-server   1/1     1            1           7m14s
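Optionally, you can run a quick sanity check that resource metrics are being served before enabling Auto-Scaling (the pulsar namespace below matches the examples in this guide; adjust it if your cluster differs):
# Confirm the Metrics Server is returning node and Pod resource metrics
kubectl top nodes
kubectl top pods -n pulsar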
Enable Auto-Scaling with resource metrics
Private Cloud Auto-Scaling has a default scaling policy based on CPU utilization:
type: Resource
resource:
  name: cpu
  target:
    type: Utilization
    averageUtilization: 80
After installing the metrics-server, you can easily enable Auto-Scaling with the spec.autoScalingPolicy field on the PulsarBroker CR and the PulsarProxy CR:
apiVersion: pulsar.streamnative.io/v1alpha1
kind: PulsarBroker
metadata:
  name: brokers
  namespace: pulsar
  labels:
    k8s.streamnative.io/coordinator-name: private-cloud
spec:
  image: streamnative/private-cloud:3.0.1.4
  replicas: 3
  zkServers: zookeepers-zk:2181
  pod:
    resources:
      requests:
        cpu: 200m
        memory: 512Mi
    securityContext:
      runAsNonRoot: true
  autoScalingPolicy:
    minReplicas: 1
    maxReplicas: 5
---
apiVersion: pulsar.streamnative.io/v1alpha1
kind: PulsarProxy
metadata:
  name: proxys
  namespace: pulsar
  labels:
    k8s.streamnative.io/coordinator-name: private-cloud
spec:
  image: streamnative/private-cloud:3.0.1.4
  replicas: 3
  brokerAddress: brokers-broker
  pod:
    resources:
      requests:
        cpu: 200m
        memory: 512Mi
    securityContext:
      runAsNonRoot: true
  autoScalingPolicy:
    minReplicas: 1
    maxReplicas: 5
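To verify that scaling is in effect, you can inspect the autoscaler resources managed for the Broker and Proxy. This is a minimal check that assumes they are exposed as standard HorizontalPodAutoscaler objects in the pulsar namespace:
# List autoscalers with their current metrics and replica counts
kubectl get hpa -n pulsar
# Show scaling conditions and recent events (resource names may differ in your cluster)
kubectl describe hpa -n pulsar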
Auto-Scaling with custom metrics
Private Cloud Auto-Scaling supports configuring the scaling policy based on custom metrics, including multiple metrics at once, to cover more complex scenarios.
Prerequisites
To use custom metrics for Auto-Scaling, you need to install Prometheus and the Prometheus Adapter to provide the metrics.
- Create a Kubernetes namespace for Prometheus and the Prometheus Adapter:
kubectl create ns monitor
- Add the prometheus-community Helm repo:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
- Install the Prometheus chart:
helm install prometheus prometheus-community/prometheus -n monitor --set alertmanager.enabled=false --set kube-state-metrics.enabled=false --set prometheus-node-exporter.enabled=false --set prometheus-pushgateway.enabled=false
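You can confirm that the Prometheus server is up before continuing (a simple check; Pod names will vary):
# The prometheus-server Service is the endpoint referenced by the adapter config below
kubectl get pods -n monitor
kubectl get svc prometheus-server -n monitor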
- Prepare a values config sample-config.yaml for the Prometheus Adapter chart:
prometheus:
  url: http://prometheus-server.monitor.svc
  port: 80
listenPort: 8443
rules:
  default: false
  custom:
  - seriesQuery: '{namespace!="",__name__!~"^container_.*"}'
    resources:
      template: "<<.Resource>>"
    name:
      matches: "^(.*)_total"
      as: ""
    metricsQuery: |
      sum by (<<.GroupBy>>) (
        irate (
          <<.Series>>{<<.LabelMatchers>>}[1m]
        )
      )
  - seriesQuery: 'container_cpu_usage_seconds_total{namespace!~"(sn-system|kube-system|olm|cert-manager)"}'
    seriesFilters: []
    resources:
      overrides:
        pod:
          resource: pod
        namespace:
          resource: namespace
    name:
      matches: "container_cpu_usage_seconds_total"
      as: "cpu_usage"
    metricsQuery: sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>}[5m])) by (<<.GroupBy>>) / (sum(container_spec_cpu_shares{<<.LabelMatchers>>}/1000) by (<<.GroupBy>>)) * 100
  - seriesQuery: 'container_network_receive_bytes_total{namespace!~"(sn-system|kube-system|olm|cert-manager)"}'
    seriesFilters: []
    resources:
      overrides:
        pod:
          resource: pod
        namespace:
          resource: namespace
    name:
      matches: "container_network_receive_bytes_total"
      as: "network_in_rate_kb"
    metricsQuery: rate(container_network_receive_bytes_total{<<.LabelMatchers>>}[5m]) / 1024
  - seriesQuery: 'container_network_transmit_bytes_total{namespace!~"(sn-system|kube-system|olm|cert-manager)"}'
    seriesFilters: []
    resources:
      overrides:
        pod:
          resource: pod
        namespace:
          resource: namespace
    name:
      matches: "container_network_transmit_bytes_total"
      as: "network_out_rate_kb"
    metricsQuery: rate(container_network_transmit_bytes_total{<<.LabelMatchers>>}[5m]) / 1024
  - seriesQuery: 'container_fs_reads_bytes_total{namespace!~"(sn-system|kube-system|olm|cert-manager)"}'
    seriesFilters: []
    resources:
      overrides:
        pod:
          resource: pod
        namespace:
          resource: namespace
    name:
      matches: "container_fs_reads_bytes_total"
      as: "disk_read_rate_kb"
    metricsQuery: sum(rate(container_fs_reads_bytes_total{<<.LabelMatchers>>}[5m])) by (<<.GroupBy>>) / 1024
  - seriesQuery: 'container_fs_writes_bytes_total{namespace!~"(sn-system|kube-system|olm|cert-manager)"}'
    seriesFilters: []
    resources:
      overrides:
        pod:
          resource: pod
        namespace:
          resource: namespace
    name:
      matches: "container_fs_writes_bytes_total"
      as: "disk_write_rate_kb"
    metricsQuery: sum(rate(container_fs_writes_bytes_total{<<.LabelMatchers>>}[5m])) by (<<.GroupBy>>) / 1024
- Install the Prometheus Adapter chart:
helm install prometheus-adapter -f sample-config.yaml prometheus-community/prometheus-adapter -n monitor
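After the adapter is running, you can check that the custom metrics defined in sample-config.yaml are exposed through the custom metrics API (jq is optional and only used for readability):
# List the custom metrics served by the Prometheus Adapter
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq -r '.resources[].name'
# Expect entries such as pods/cpu_usage, pods/network_in_rate_kb and pods/network_out_rate_kb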
Enable Auto-Scaling with custom metrics
After installing Prometheus and the Prometheus Adapter, you can configure custom metrics with the spec.autoScalingPolicy.metrics field:
spec:
  autoScalingPolicy:
    minReplicas: 1
    maxReplicas: 5
    metrics:
    - pods:
        metric:
          name: cpu_usage
        target:
          averageValue: "75"
          type: AverageValue
      type: Pods
    - pods:
        metric:
          name: network_in_rate_kb
        target:
          averageValue: "204800"
          type: AverageValue
      type: Pods
    - pods:
        metric:
          name: network_out_rate_kb
        target:
          averageValue: "204800"
          type: AverageValue
      type: Pods
This configuration will leverage cpu_usage, network_in_rate_kb, and network_out_rate_kb for Auto-Scaling.
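Before relying on these metrics for scaling, you can query one of them directly to make sure values are reported for the Pulsar Pods (the pulsar namespace matches the CRs above; adjust it if yours differs):
# Fetch the cpu_usage custom metric for all Pods in the pulsar namespace
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/pulsar/pods/*/cpu_usage" | jq .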
To customize the scaling behavior for Auto-Scaling, you can configure the spec.autoScalingPolicy.behavior field:
spec:
  autoScalingPolicy:
    minReplicas: 1
    maxReplicas: 5
    behavior:
      scaleDown:
        stabilizationWindowSeconds: 300
        policies:
        - type: Percent
          value: 100
          periodSeconds: 15
      scaleUp:
        stabilizationWindowSeconds: 0
        policies:
        - type: Percent
          value: 100
          periodSeconds: 15
        - type: Pods
          value: 4
          periodSeconds: 15
        selectPolicy: Max
- [1] scaleDown: Defines the behavior while scaling down.
- [2] scaleUp: Defines the behavior while scaling up.
- [3] stabilizationWindowSeconds: The stabilization window restricts flapping of the replica count when the metrics used for scaling keep fluctuating.
- [4] policies: Defines the scaling policies; you can limit the number of Pods or the percentage scaled within each periodSeconds.
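To observe how these behavior settings play out during a scaling event, you can watch the autoscaler decisions and Pod counts change over time (a minimal sketch; resource names and namespace depend on your deployment):
# Watch scaling decisions and Pod counts while the workload changes
kubectl get hpa -n pulsar -w
kubectl get pods -n pulsar -w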