[Daily morning study] Kubernetes HPA (Horizontal Pod Autoscaler)

#daily morning study


What is HPA?

HPA (Horizontal Pod Autoscaler) is a Kubernetes controller that automatically adjusts the number of pods in a Deployment or StatefulSet. It adds or removes pods based on CPU utilization, memory utilization, or custom metrics.

Vertical scaling gives each pod more resources (CPU, memory), whereas horizontal scaling increases the number of pods. HPA automates the latter.

How it works

HPA runs as a control loop. By default, every 15 seconds it fetches the current metrics from Metrics Server, compares them with the target value, and calculates the number of pods required.

Formula for the number of pods:

desiredReplicas = ceil[currentReplicas × (currentMetricValue / desiredMetricValue)]

Example: 2 current pods, target CPU 50%, current average CPU 80%

desiredReplicas = ceil[2 × (80 / 50)] = ceil[3.2] = 4

The pod count is scaled up to 4.
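
As a minimal sketch, the formula above can be written in Python. The function name desired_replicas and its min_replicas/max_replicas parameters are hypothetical names chosen for illustration. The real controller also applies a tolerance and readiness checks, which are left out here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=None):
    """Apply ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * (current_metric / target_metric))
    # Clamp to the minReplicas/maxReplicas bounds of the HPA spec.
    if max_replicas is not None:
        raw = min(raw, max_replicas)
    return max(raw, min_replicas)


# The example above: 2 pods, 80% average CPU, 50% target.
print(desired_replicas(2, 80, 50))  # 4
```

With max_replicas=3 the same call would return 3, which is how maxReplicas caps the computed value.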

Components

| Component | Role |
| --- | --- |
| Metrics Server | Collects CPU/memory metrics from pods |
| HPA Controller | Decides the number of pods based on metrics |
| Deployment | The target whose pod count is actually adjusted |

How to configure HPA

Basic YAML (CPU-based)

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

  • minReplicas: minimum number of pods
  • maxReplicas: maximum number of pods
  • averageUtilization: target CPU utilization (%)

Quick creation with kubectl

kubectl autoscale deployment my-app --cpu-percent=50 --min=2 --max=10

Checking status

kubectl get hpa
kubectl describe hpa my-app-hpa

Metric types (v2 API)

Resource metrics: resource-based metrics such as CPU and memory

metrics:
- type: Resource
  resource:
    name: memory
    target:
      type: AverageValue
      averageValue: 500Mi

External metrics: metrics from outside the cluster (for example, SQS queue length)

metrics:
- type: External
  external:
    metric:
      name: sqs_queue_length
    target:
      type: Value
      value: "100"

Pods metrics: custom metrics reported by the pods themselves

metrics:
- type: Pods
  pods:
    metric:
      name: http_requests_per_second
    target:
      type: AverageValue
      averageValue: "100"

Controlling scaling behavior (behavior)

To prevent abrupt scale-up or scale-down, the behavior field lets you control the scaling rate.

spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # 5-minute stabilization window
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60               # remove at most 10% per minute
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Pods
        value: 4
        periodSeconds: 15               # add at most 4 pods every 15 seconds

Scale-down uses a long stabilization window because removing pods abruptly can disrupt traffic handling. Scale-up needs to react quickly, so its window is usually kept short.
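
One way to picture the stabilization window: the controller keeps recent replica recommendations and, for scale-down, acts on the highest one seen within the window. The sketch below is a simplified model of that idea, not the controller's actual code. The class name ScaleDownStabilizer and its recommend method are hypothetical.

```python
from collections import deque


class ScaleDownStabilizer:
    """Simplified model of a scale-down stabilization window."""

    def __init__(self, window_seconds=300):
        self.window_seconds = window_seconds
        self._history = deque()  # (timestamp in seconds, recommended replicas)

    def recommend(self, now, desired):
        """Record a new recommendation and return the stabilized value."""
        self._history.append((now, desired))
        cutoff = now - self.window_seconds
        # Drop recommendations that have aged out of the window.
        while self._history and self._history[0][0] < cutoff:
            self._history.popleft()
        # Acting on the highest recent value delays scale-down.
        return max(replicas for _, replicas in self._history)


stabilizer = ScaleDownStabilizer(window_seconds=300)
print(stabilizer.recommend(now=0, desired=10))   # 10
print(stabilizer.recommend(now=60, desired=4))   # 10, the earlier 10 is still in the window
print(stabilizer.recommend(now=301, desired=4))  # 4, the 10 has aged out
```

A window of 0, as in the scaleUp block above, would make every recommendation take effect immediately.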

Prerequisites

1. Install Metrics Server

HPA requires Metrics Server to be installed in the cluster. Without it, kubectl get hpa shows a target such as <unknown>/50%, meaning the metrics cannot be read.

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

kubectl top pods
kubectl top nodes

2. Set resources.requests

CPU-based HPA works correctly only when the Deployment's pod spec specifies resources.requests. CPU utilization is calculated relative to the requested CPU, so without requests there is no baseline and HPA does not work.

resources:
  requests:
    cpu: "200m"
    memory: "256Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"
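
To see why requests matter, here is a rough sketch of how a utilization percentage can be derived: total CPU usage divided by total CPU requests. The function name and the millicore inputs are assumptions for illustration. The controller's exact rounding and its handling of unready pods are not modeled.

```python
def average_cpu_utilization(usage_millicores, request_millicores):
    """Return CPU utilization as a percentage of the requested CPU."""
    if not request_millicores:
        raise ValueError("need at least one pod")
    if len(usage_millicores) != len(request_millicores):
        raise ValueError("need one request per pod")
    if any(not request for request in request_millicores):
        # A missing (None or 0) request leaves nothing to divide by.
        raise ValueError("every pod needs a CPU request")
    return 100 * sum(usage_millicores) / sum(request_millicores)


# Two pods requesting 200m each and using 160m each.
print(average_cpu_utilization([160, 160], [200, 200]))  # 80.0
```

The 80% result is the kind of value HPA compares against averageUtilization: 50 in the earlier manifest.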

HPA vs VPA

| Item | HPA | VPA (Vertical Pod Autoscaler) |
| --- | --- | --- |
| Scaling method | Increases the number of pods | Increases pod resources (CPU/memory) |
| Restart required | No | Usually requires a pod restart |
| Best suited for | Stateless apps | Single-instance or stateful apps |
| Usage frequency | High | Low |

HPA and VPA can be used together, but enabling both on CPU/memory metrics causes them to conflict. The usual approach is to separate them: HPA scales on custom metrics, while VPA handles resource adjustment.

Practical caveats

  • Scale-down delay: by default, scale-down has a 5-minute stabilization window. Pods are not removed immediately even after traffic drops.
  • Cold start: if the app takes a long time to start, set minReplicas generously or configure the Readiness Probe carefully.
  • Metrics Server reliability: if metric collection fails temporarily, HPA stops scaling. Setting up alerts is recommended.
  • minReplicas: 0 is not allowed: a default HPA cannot scale down to 0. If you need to scale to zero, use KEDA (Kubernetes Event-Driven Autoscaling).