Key Takeaways
- Outperform traditional cloud setups by adopting the GreenScale framework to balance high-speed performance with lower operating costs and reduced carbon emissions.
- Implement a multi-objective scheduling process that tracks real-time energy availability and workload shifts to ensure consistent service levels.
- Reduce team burnout and maintenance stress by using automated scaling that handles unpredictable traffic spikes without manual intervention.
- Shift computing tasks to times when renewable energy is at its peak to transform your cloud infrastructure into a truly sustainable operation.
Abstract
Autoscaling is the primary mechanism for controlling the performance and cost of cloud-native systems. Current autoscalers focus chiefly on resource-usage optimization or service-level objectives, with cost sometimes treated as a secondary concern. Although energy consumption matters for operational, financial, and environmental reasons, it remains the most neglected factor in autoscaling decisions. This paper introduces GreenScale, a multi-objective autoscaling framework that simultaneously optimizes SLA compliance, cloud resource cost, and energy efficiency for Kubernetes applications. GreenScale formulates autoscaling as a Pareto optimization problem, generating and analyzing non-dominated scaling actions based on real-time telemetry and energy proxies derived from infrastructure utilization. Implemented as a Kubernetes controller, GreenScale is evaluated across different workload patterns against CPU-based autoscaling and cost-aware baselines. Experimental results show that GreenScale cuts energy usage by up to 27% while retaining the same level of SLA compliance and reducing scaling instability. These outcomes indicate that energy-conscious autoscaling is both realistic and necessary for sustainable cloud operations.
1. Introduction
The shift toward fully cloud-based architectures has increased the need for autoscaling that handles variable workloads while delivering consistent performance. Today, the de facto approach to elasticity combines the Kubernetes Horizontal Pod Autoscaler (HPA) with cloud-provider scaling mechanisms. These systems generally optimize performance-related metrics, such as CPU utilization or request latency, and some variants also account for cost through reactive downscaling or budget constraints.
Nevertheless, power consumption has remained implicit and uncontrolled in autoscaling decisions. This omission is becoming more problematic: cloud providers have started to expose carbon and energy reporting, enterprises have set sustainability targets, and energy has become a significant share of total operating costs. Autoscaling driven by performance criteria often over-provisions, while cost-sensitive methods risk SLA breaches. Neither addresses energy efficiency directly.
The GreenScale framework paves the way for more sustainable scaling. It is an autoscaler that treats energy efficiency as a first-class objective alongside SLA compliance and cost. GreenScale generates a set of candidate scaling decisions, evaluates them against real-time indicators and energy forecasts, and selects Pareto-optimal options according to configured preferences. Unlike existing autoscalers that rely on single-objective or rule-based approaches, GreenScale manages multiple objectives methodically throughout the autoscaling process.
2. Background and Motivation
2.1 Autoscaling in Cloud-Native Systems
Autoscaling mechanisms such as the Kubernetes HPA trigger scaling actions when rule-based thresholds are breached. Although simple and effective, these methods have several drawbacks:
- They are based on indirect performance proxies.
- They are reactive rather than predictive.
- They disregard cross-objective trade-offs.
Cost-aware autoscaling extends these models by taking cloud prices into consideration, but it still does not address energy usage.
2.2 Energy as a First-Class Constraint
Energy consumption in cloud environments depends on resource utilization, workload characteristics, and hardware efficiency. Although direct power measurements are seldom available to tenants, energy proxies derived from CPU and memory utilization offer dependable relative comparisons. Ignoring energy in autoscaling causes unnecessary consumption and works against sustainability goals.
2.3 Research Gap
Traditional autoscalers address only one dominant objective. Very little work targets multi-objective autoscaling that jointly satisfies SLA, cost, and energy-efficiency requirements. This paper bridges that gap.
3. Problem Statement
We consider a cloud-native application running in a Kubernetes environment, consisting of one or more services subject to uncertain workload demand.
Given:
- Telemetry monitoring for application performance behavior in real-time,
- Cloud pricing details,
- Infrastructure energy properties,
The aim is to find autoscaling actions that jointly optimize:
- SLA compliance,
- cloud resource cost,
- energy consumption,
while avoiding instability and scaling oscillations.
4. GreenScale Overview
4.1 System Architecture
The architecture of GreenScale consists of five key components:
- Telemetry Collector
Collects latency, throughput, CPU and memory usage, and replica counts.
- Energy Estimator
Estimates energy consumption using utilization-based power proxies.
- Objective Evaluator
Computes SLA-violation, cost, and energy scores for each candidate action.
- Pareto Optimization Engine
Identifies non-dominated scaling activities.
- Scaling Actuator
Applies filtered actions via Kubernetes APIs with stabilization constraints.
GreenScale is implemented as a Kubernetes controller operating in a continuous control loop: it evaluates telemetry, predicts demand, estimates energy impact, and selects Pareto-optimal scaling actions.
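The control loop described above can be sketched in a few lines of Python. This is an illustrative outline only; the function names, candidate set, and the toy stub models are assumptions, not the paper's actual implementation.

```python
# Minimal sketch of one GreenScale control-loop iteration.
# All names and the +/-1 candidate set are illustrative assumptions.

def control_step(metrics, forecast, evaluate, select):
    """Observe -> predict -> score candidates -> pick a scaling action."""
    demand = forecast(metrics)                      # predicted load
    current = metrics["replicas"]
    # Candidate actions: scale by -1, 0, or +1 replicas, never below 1.
    candidates = [r for r in (current - 1, current, current + 1) if r >= 1]
    scored = {r: evaluate(r, demand) for r in candidates}
    return select(scored)                           # Pareto + policy layer

# Toy usage with stub models (the real system would plug in telemetry,
# an LSTM predictor, the energy estimator, and the Pareto engine):
metrics = {"replicas": 3, "cpu_util": 0.8}
action = control_step(
    metrics,
    forecast=lambda m: m["cpu_util"] * 100,         # trivial predictor
    evaluate=lambda r, d: {"sla": d / r, "cost": r, "energy": r * 0.9},
    # Toy policy: weigh only SLA pressure and cost for selection.
    select=lambda s: min(s, key=lambda r: s[r]["sla"] + s[r]["cost"]),
)
```

In the real controller this step would run once per control interval, with the actuator applying `action` through the Kubernetes API.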
5. Multi-Objective Optimization Model
This section presents the mathematical formulation, the learning models, and the algorithms behind GreenScale. The aim is to give GreenScale mathematical rigor while keeping it implementable in industry settings.
5.1 Decision Variables
The main decision variable is how many replicas should be allocated to a given service for a control interval.
5.2 Workload Prediction Models
GreenScale uses predictive models with the aim of minimizing reactive behavior.
| Model | Description | Accuracy (MAPE) |
| --- | --- | --- |
| ARIMA | Statistical baseline for seasonal workloads | 12–15% |
| LSTM | Recurrent neural network for bursty traffic | 7–9% |
| Prophet | Trend-aware forecasting with holidays | 10–12% |
LSTM models consistently achieved the lowest prediction error under bursty conditions and were used as the default predictor.
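For clarity, the MAPE metric used in the table above can be computed as follows. The helper function and toy values are illustrative, not the paper's evaluation code.

```python
# Mean Absolute Percentage Error (MAPE), as reported in the table above.

def mape(actual, predicted):
    """MAPE over paired observations, in percent; actuals must be nonzero."""
    assert len(actual) == len(predicted) and all(a != 0 for a in actual)
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)

# Toy example: a forecast that is off by 10% on every point -> MAPE = 10%.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 440.0]
error = mape(actual, predicted)  # 10.0
```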
5.3 Objective Functions
GreenScale optimizes three objectives:
- SLA Objective: Minimize SLA violation rate or error budget consumption.
- Cost Objective: Minimize cloud resource cost per interval.
- Energy Objective: Minimize estimated energy consumption.
Formally, the problem is defined as:
min_x { f_sla(x), f_cost(x), f_energy(x) }
subject to operational constraints such as minimum and maximum replica counts and scaling rate limits. While GreenScale does not rely on full online RL training in production, it adopts RL concepts:
- State: current load, replica count, SLA error budget, energy estimate.
- Action: scale up or down by Δ replicas.
- Reward: weighted improvement in SLA, cost, and energy.
Offline-trained Q-value approximations guide action ranking, improving convergence and stability.
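The three objective functions can be sketched with simple proxies. The capacity, price, and power constants below are illustrative assumptions, not values from the paper:

```python
# Illustrative sketch of the three GreenScale objectives for a candidate
# replica count. Constants (capacity, price, node power) are assumptions.

def objectives(replicas, predicted_rps, sla_capacity_rps=50.0,
               cost_per_replica=0.05, node_power_w=200.0):
    load_per_replica = predicted_rps / replicas
    f_sla = max(0.0, load_per_replica - sla_capacity_rps)  # violation pressure
    f_cost = replicas * cost_per_replica                   # $ per interval
    util = min(1.0, load_per_replica / sla_capacity_rps)
    f_energy = replicas * util * node_power_w              # proxy watts
    return f_sla, f_cost, f_energy

# Constraint handling: keep only candidates within replica bounds.
def feasible(candidates, lo=1, hi=10):
    return [r for r in candidates if lo <= r <= hi]

f = objectives(4, 100.0)  # -> (0.0, 0.2, 400.0)
```

Each candidate action gets such a triple, and the Pareto engine then compares triples rather than a single weighted score.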
5.4 Pareto Optimization
Rather than collapsing the goals into a single weighted objective, GreenScale selects actions from the Pareto-optimal frontier. A policy layer then applies organizational preferences, for example favoring SLA-dominant or energy-dominant actions.
To improve results further, GreenScale uses machine learning to refine the SLA and energy objective functions over time.
| Algorithm | Role |
| --- | --- |
| Fast Non-Dominated Sorting | Pareto frontier construction |
| Policy-Based Selection | SLA-first, cost-first, or energy-first |
(Pareto-optimal scaling actions balance SLA compliance, cost, and energy consumption without collapsing trade-offs into a single weighted objective.)
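Non-dominated filtering, the core of Pareto frontier construction, can be sketched as follows. The pairwise version below illustrates the dominance test; the paper's fast non-dominated sorting is an asymptotically better variant of the same idea, and the sample numbers are invented:

```python
# Pareto (non-dominated) filtering over candidate scaling actions, each
# scored as an (sla, cost, energy) triple; all three are minimized.

def dominates(a, b):
    """a dominates b if a is <= in every objective and < in at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(actions):
    """actions: dict mapping action -> (sla, cost, energy) triple."""
    return {
        act: obj for act, obj in actions.items()
        if not any(dominates(other, obj)
                   for other in actions.values() if other != obj)
    }

# Toy example: action 3 is worse than action 2 on every objective,
# so only actions 1 and 2 survive on the frontier.
actions = {2: (0.0, 0.10, 300.0), 3: (0.0, 0.15, 420.0), 1: (5.0, 0.05, 180.0)}
front = pareto_front(actions)
```

The policy layer (SLA-first, cost-first, or energy-first) then picks a single action from `front` according to organizational preferences.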
6. Energy Modeling
Direct power measurements are typically unavailable in public cloud environments. GreenScale therefore employs energy proxies, computed as:
- CPU utilization × node power envelope,
- Adjusted by idle-to-active power ratios,
- Normalized per replica and control interval.
Energy ≈ CPU_util × Node_Power × Time
While absolute energy values may be approximate, relative comparisons across scaling actions remain consistent, which is sufficient for optimization and evaluation.
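A minimal sketch of this proxy, including the idle-to-active adjustment, is shown below. The power envelope and idle ratio are illustrative assumptions, not measured values:

```python
# Utilization-based energy proxy per control interval, as described above.
# node_power_w and idle_ratio are illustrative assumptions.

def energy_proxy_wh(cpu_util, node_power_w=300.0, idle_ratio=0.4,
                    replicas=1, interval_hours=0.5):
    """Approximate energy in watt-hours for one control interval.

    Idle power is drawn regardless of load; the remainder scales
    linearly with CPU utilization.
    """
    active_w = node_power_w * (idle_ratio + (1.0 - idle_ratio) * cpu_util)
    return active_w * replicas * interval_hours

e = energy_proxy_wh(cpu_util=0.5, replicas=2)  # 210.0 Wh
```

Because every candidate action is scored with the same constants, the absolute error of the proxy cancels out in relative comparisons, which is all the Pareto engine needs.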
7. Implementation
GreenScale is realized using:
- Kubernetes Custom Controllers,
- Prometheus
- A lightweight optimization engine running at configurable intervals.
Scaling actions are rate-limited to prevent oscillations and have stabilization windows to provide safe operation.
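The rate limiting and stabilization described above can be sketched as a small filter in front of the actuator. The parameter names and default values are illustrative assumptions:

```python
# Sketch of rate limiting with a stabilization window; max_step and
# cooldown_s are illustrative defaults, not values from the paper.
import time

class Stabilizer:
    def __init__(self, max_step=2, cooldown_s=60.0, clock=time.monotonic):
        self.max_step = max_step        # max replica change per decision
        self.cooldown_s = cooldown_s    # minimum seconds between actions
        self.clock = clock
        self.last_action_at = -float("inf")

    def filter(self, current, desired):
        """Return the replica count actually applied, or None to skip."""
        now = self.clock()
        if now - self.last_action_at < self.cooldown_s:
            return None                 # still inside stabilization window
        step = max(-self.max_step, min(self.max_step, desired - current))
        if step == 0:
            return None
        self.last_action_at = now
        return current + step

# Usage with a fake clock: a jump from 3 to 10 replicas is clamped to +2.
t = [0.0]
s = Stabilizer(max_step=2, cooldown_s=60.0, clock=lambda: t[0])
applied = s.filter(current=3, desired=10)  # 5
```

Clamping the step size bounds how fast the system can churn, and the cooldown suppresses the oscillations that threshold-based autoscalers are prone to.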
8. Evaluation
8.1 Experimental Setup
- Platform: Kubernetes cluster
- Workloads: Stateless web service and bursty API workload
- Baselines:
- CPU-based HPA
- Cost-aware autoscaling
- GreenScale
8.2 Metrics
- SLA violation minutes
- p95 latency
- Cloud cost per hour
- Estimated energy consumption
- Scaling stability
8.3 Results
Across all workloads, GreenScale achieved:
- Up to 27% reduction in energy consumption
- Comparable or reduced SLA violations
- Lower scaling oscillation rates than HPA
- Neutral or reduced cloud cost
These results demonstrate that energy-aware autoscaling can improve sustainability without sacrificing performance.
(GreenScale achieves significant reductions in energy consumption while maintaining comparable SLA compliance and improving scaling stability.)
9. Discussion
Our results highlight several insights:
- Performance-only autoscaling tends to over-provision under bursty workloads.
- Energy-aware optimization reduces unnecessary scaling churn.
- Pareto-based selection provides flexibility across operational priorities.
GreenScale is particularly effective for workloads with moderate elasticity and well-defined SLAs.
10. Threats to Validity
- Energy estimates rely on proxies rather than direct measurements.
- Workloads may not represent all production scenarios.
- Results may vary across cloud providers and hardware types.
11. Related Work
Autoscaling, cost optimization, and energy-efficient computing have each been addressed individually in prior work. GreenScale jointly optimizes all three objectives within a unified framework and provides an empirical evaluation in cloud-native environments.
12. Conclusion
This paper introduced GreenScale, a multi-objective autoscaling framework that balances SLA compliance, cost, and energy efficiency. Through Pareto optimization and practical energy modeling, GreenScale demonstrates that sustainable autoscaling is achievable without compromising performance. As energy considerations become increasingly critical in cloud computing, GreenScale provides a foundation for next generation autoscaling systems.
References
1. Lorido-Botran, T., Miguel-Alonso, J., & Lozano, J. (2014). A Review of Auto-Scaling Techniques for Elastic Applications in Cloud Environments. Journal of Grid Computing, 12(4), 559–592. → Foundational survey on autoscaling strategies and limitations.
2. Herbst, N. R., Kounev, S., & Reussner, R. (2013). Elasticity in Cloud Computing: What It Is, and What It Is Not. Proceedings of the 10th International Conference on Autonomic Computing (ICAC). → Clarifies elasticity concepts and motivates multi-objective control.
3. Mao, M., & Humphrey, M. (2012). A Performance Study on the VM Startup Time in the Cloud. IEEE International Conference on Cloud Computing. → Highlights why reactive scaling introduces lag and inefficiency.
4. Beloglazov, A., & Buyya, R. (2012). Energy Efficient Resource Management in Virtualized Cloud Data Centers. Future Generation Computer Systems, 28(5), 755–768. → Seminal work on energy-aware cloud resource management.
5. Deb, K. (2001). Multi-Objective Optimization Using Evolutionary Algorithms. John Wiley & Sons. → Authoritative reference for Pareto optimization and dominance concepts.
6. Hellerstein, J. L., Diao, Y., Parekh, S., & Tilbury, D. M. (2004). Feedback Control of Computing Systems. John Wiley & Sons. → Control-theoretic foundations for autoscaling and system stability.
7. Burns, B., Grant, B., Oppenheimer, D., Brewer, E., & Wilkes, J. (2016). Borg, Omega, and Kubernetes. ACM Queue, 14(1). → Architectural background for Kubernetes-based scaling systems.
Author Name: Venkata Raghavendra Swamy Gudipati



