Cost Optimization Pillar

TL;DR

The Cost Optimization pillar focuses on maximizing the value delivered by your cloud spend. Key concepts:

  • Right-sizing: Match resources to actual workload needs
  • Commitment discounts: Use reservations and savings plans for predictable workloads
  • Spot instances: Leverage unused capacity for fault-tolerant workloads
  • FinOps practices: Implement financial accountability and governance
  • Continuous optimization: Regularly review and adjust spending

Design Principles

Core Cost Optimization Principles

Principle | Description | Implementation
Choose the right resources | Match SKU to requirements | Right-sizing analysis
Set up budgets and alerts | Proactive cost monitoring | Azure Cost Management
Dynamically allocate resources | Scale with demand | Auto-scaling, serverless
Optimize workloads | Improve efficiency | Code optimization, caching
Continuously monitor | Track and adjust | Cost reviews, anomaly detection

Cost Optimization Lifecycle


Azure Pricing Models

Pricing Model Comparison

Model | Discount | Commitment | Best For
Pay-as-you-go | 0% | None | Variable workloads, testing
Reserved Instances | Up to 72% | 1 or 3 years | Steady-state production
Savings Plans | Up to 65% | 1 or 3 years | Flexible compute usage
Spot VMs | Up to 90% | None (can be evicted) | Fault-tolerant batch jobs
Dev/Test Pricing | Up to 55% | VS subscription | Development environments
Hybrid Benefit | Up to 85% | Existing licenses | Windows/SQL migrations

When to Use Each Model


Right-Sizing Strategies

Identifying Right-Sizing Opportunities

Right-Sizing Guidelines

Metric and threshold | Indicates | Action
CPU < 20% average | Underutilized | Downsize or consolidate
CPU > 80% average | Constrained | Upsize or scale out
Memory < 30% average | Over-provisioned | Choose smaller SKU
Memory > 90% average | Memory pressure | Add memory or optimize
Disk IOPS < 50% of provisioned | Over-provisioned | Use standard storage
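
The thresholds in the table above can be read as a simple rule set. The following is a minimal sketch, not an Azure API: the function name `right_sizing_actions` and its inputs (average utilization percentages you have already collected from your monitoring data) are assumptions made for illustration.

```python
def right_sizing_actions(cpu_avg, mem_avg, disk_iops_pct):
    """Map average utilization percentages to the actions in the guideline table.

    All three inputs are hypothetical, pre-aggregated percentages (0-100).
    """
    actions = []

    # CPU: below 20% is underutilized, above 80% is constrained
    if cpu_avg < 20:
        actions.append("Downsize or consolidate")
    elif cpu_avg > 80:
        actions.append("Upsize or scale out")

    # Memory: below 30% is over-provisioned, above 90% is memory pressure
    if mem_avg < 30:
        actions.append("Choose smaller SKU")
    elif mem_avg > 90:
        actions.append("Add memory or optimize")

    # Disk: using less than half of the provisioned IOPS
    if disk_iops_pct < 50:
        actions.append("Use standard storage")

    return actions


print(right_sizing_actions(cpu_avg=12, mem_avg=25, disk_iops_pct=40))
```

A resource that falls between the thresholds on every metric returns an empty list, meaning no change is suggested.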

Azure Advisor Recommendations

# Get cost recommendations from Azure Advisor
az advisor recommendation list --category Cost --output table

# Get specific VM right-sizing recommendations
az advisor recommendation list \
  --category Cost \
  --query "[?shortDescription.problem=='Right-size or shutdown underutilized virtual machines']"

Reserved Instances and Savings Plans

Reserved Instances

RI vs Savings Plans

Feature | Reserved Instances | Savings Plans
Flexibility | Specific SKU and region | Any SKU, any region
Discount | Up to 72% | Up to 65%
Scope | Subscription or shared | Subscription or shared
Exchange | Yes, with restrictions | No
Best for | Known, stable workloads | Dynamic compute needs

Calculating RI Savings

Monthly Pay-as-you-go cost: $1,000
3-year RI discount: 60%
Monthly RI cost: $400

Monthly savings: $600
Annual savings: $7,200
3-year savings: $21,600
Break-even: ~5 months (one year of RI payments, $4,800, equals 4.8 months of pay-as-you-go)
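
The arithmetic above can be reproduced for other prices and discounts. This is a minimal sketch: the function name `ri_savings` and its parameters are illustrative assumptions, and real reservation pricing depends on SKU, region and payment option.

```python
def ri_savings(payg_monthly, discount_pct, term_years):
    """Compare pay-as-you-go with a reservation at a flat percentage discount."""
    # Monthly cost after applying the reservation discount
    ri_monthly = payg_monthly * (100 - discount_pct) / 100
    monthly_savings = payg_monthly - ri_monthly
    return {
        "ri_monthly": ri_monthly,
        "monthly_savings": monthly_savings,
        "annual_savings": monthly_savings * 12,
        "term_savings": monthly_savings * 12 * term_years,
    }


result = ri_savings(payg_monthly=1000, discount_pct=60, term_years=3)
for name, value in result.items():
    print(f"{name}: ${value:,.0f}")
```

With the figures from the worked example ($1,000 per month, 60% discount, 3 years) this yields $400, $600, $7,200 and $21,600.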

Spot VMs

Spot VM Use Cases

Use Case | Suitability | Reason
Batch processing | Excellent | Can restart interrupted jobs
CI/CD agents | Good | Short-lived, replaceable
Dev/Test | Good | Non-critical workloads
Rendering | Excellent | Parallelizable, fault-tolerant
Machine Learning | Good | Checkpoint-based training
Production web | Poor | Eviction causes downtime

Spot VM Configuration

// Bicep - Create Spot VM
resource spotVM 'Microsoft.Compute/virtualMachines@2023-07-01' = {
  name: 'spot-vm'
  location: location
  properties: {
    hardwareProfile: {
      vmSize: 'Standard_D4s_v3'
    }
    priority: 'Spot'
    evictionPolicy: 'Deallocate' // or 'Delete'
    billingProfile: {
      // Max price per hour; -1 caps the price at the pay-as-you-go rate (no price-based eviction).
      // Bicep has no decimal literal, so a fractional value is passed through json().
      maxPrice: json('0.05')
    }
    // ... other configuration
  }
}

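Handling Spot Evictions

// C# - Check for Spot VM eviction using Scheduled Events
// Requires: using System.Linq; using System.Net.Http; using System.Text.Json;
public async Task<bool> CheckForEviction()
{
    using var client = new HttpClient();
    // The Instance Metadata Service only answers requests that carry "Metadata: true"
    client.DefaultRequestHeaders.Add("Metadata", "true");
    var response = await client.GetAsync(
        "http://169.254.169.254/metadata/scheduledevents?api-version=2020-07-01");
    response.EnsureSuccessStatusCode();

    // ScheduledEvents is a user-defined type mirroring the response (an Events list with EventType)
    var events = JsonSerializer.Deserialize<ScheduledEvents>(
        await response.Content.ReadAsStringAsync());

    // A "Preempt" event means this Spot VM is about to be evicted
    return events?.Events?.Any(e => e.EventType == "Preempt") ?? false;
}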
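
The eviction check comes down to scanning the Scheduled Events response for a `Preempt` entry. The sketch below isolates that step so it can be tested without a VM. The function name `has_preempt_event` and the sample payload are illustrative assumptions; only the `Events` and `EventType` field names are taken from the code above.

```python
import json


def has_preempt_event(payload: str) -> bool:
    """Return True if a Scheduled Events JSON document contains a Preempt event."""
    doc = json.loads(payload)
    # A missing "Events" key is treated as "no events scheduled"
    return any(e.get("EventType") == "Preempt" for e in doc.get("Events", []))


# Hypothetical sample payloads for illustration only
evicting = '{"Events": [{"EventId": "A1", "EventType": "Preempt", "Resources": ["spot-vm"]}]}'
quiet = '{"Events": []}'

print(has_preempt_event(evicting))
print(has_preempt_event(quiet))
```

A workload would typically poll the endpoint on a short interval and checkpoint or drain its work as soon as this returns true.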

Azure Cost Management

Cost Management Features

Setting Up Budgets

# Create a budget with Azure CLI
az consumption budget create \
  --budget-name "Monthly-Budget" \
  --amount 10000 \
  --category Cost \
  --time-grain Monthly \
  --start-date 2024-01-01 \
  --end-date 2024-12-31 \
  --resource-group myRG

# Create a budget that alerts at 80% of actual spend
# (start and end dates are required; the az consumption commands are in preview, so confirm
#  with `az consumption budget create --help` that your CLI version accepts --notifications)
az consumption budget create \
  --budget-name "Alert-Budget" \
  --amount 5000 \
  --category Cost \
  --time-grain Monthly \
  --start-date 2024-01-01 \
  --end-date 2024-12-31 \
  --notifications '{
    "Actual_GreaterThan_80_Percent": {
      "enabled": true,
      "operator": "GreaterThan",
      "threshold": 80,
      "contactEmails": ["admin@example.com"],
      "thresholdType": "Actual"
    }
  }'
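
A budget notification fires when actual spend crosses a percentage of the budget amount. The following sketch shows that comparison in isolation; `triggered_thresholds` is a hypothetical helper, not part of any Azure SDK, and it models only the "Actual" / "GreaterThan" case used in the alert above.

```python
def triggered_thresholds(budget_amount, actual_spend, thresholds_pct):
    """Return the alert thresholds (in percent) that actual spend has exceeded."""
    # Percentage of the budget consumed so far
    pct_used = actual_spend * 100 / budget_amount
    return [t for t in sorted(thresholds_pct) if pct_used > t]


# $4,200 spent against a $5,000 budget is 84% of the budget
print(triggered_thresholds(5000, 4200, [50, 80, 100]))
```

Defining several thresholds (for example 50, 80 and 100 percent) gives teams an early warning before the budget is fully consumed.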

Cost Analysis Queries

// KQL - Analyze costs by resource type
AzureCostManagement
| where TimeGenerated > ago(30d)
| summarize TotalCost = sum(Cost) by ResourceType
| order by TotalCost desc
| take 10

// Costs by tag
AzureCostManagement
| where TimeGenerated > ago(30d)
| extend CostCenter = tostring(Tags["CostCenter"])
| summarize TotalCost = sum(Cost) by CostCenter
| order by TotalCost desc
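
The same "sum by tag, largest first" aggregation can be applied to exported cost records outside a query engine. This is a minimal sketch under stated assumptions: the record shape (a `cost` number and a `tags` dictionary) and the function name `cost_by_tag` are hypothetical and do not correspond to a specific export schema.

```python
from collections import defaultdict


def cost_by_tag(records, tag_key):
    """Sum cost per tag value and return (value, total) pairs, highest total first."""
    totals = defaultdict(float)
    for record in records:
        # Resources without the tag are grouped under a visible placeholder
        value = record.get("tags", {}).get(tag_key, "(untagged)")
        totals[value] += record["cost"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample data for illustration only
records = [
    {"cost": 120.0, "tags": {"CostCenter": "CC-12345"}},
    {"cost": 80.0, "tags": {"CostCenter": "CC-67890"}},
    {"cost": 40.0, "tags": {"CostCenter": "CC-12345"}},
    {"cost": 10.0, "tags": {}},
]

for cost_center, total in cost_by_tag(records, "CostCenter"):
    print(f"{cost_center}: {total:.2f}")
```

Keeping untagged spend as its own row makes gaps in tag coverage visible in the same report.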

FinOps Practices

FinOps Framework

FinOps Maturity Model

Stage | Characteristics | Focus
Crawl | Basic visibility, reactive | Understand spending
Walk | Proactive optimization, some automation | Reduce waste
Run | Full automation, predictive | Maximize value

Cost Allocation Strategy

Dimension | Purpose | Example
Environment | Separate prod/dev costs | env:production
Cost Center | Business unit allocation | costcenter:engineering
Project | Project-level tracking | project:customer-portal
Owner | Accountability | owner:team-alpha
Application | Application costs | app:order-service

Tagging Strategy

Required Tags

{
  "Environment": "production",
  "CostCenter": "CC-12345",
  "Owner": "platform-team@company.com",
  "Project": "customer-portal",
  "Application": "order-service",
  "CreatedBy": "terraform",
  "CreatedDate": "2024-01-15"
}
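
A required-tag set like the one above is easy to check before deployment. The sketch below assumes the seven keys shown are the full required set; the function name `missing_tags` is hypothetical.

```python
# Required keys, taken from the example tag set above
REQUIRED_TAGS = [
    "Environment",
    "CostCenter",
    "Owner",
    "Project",
    "Application",
    "CreatedBy",
    "CreatedDate",
]


def missing_tags(tags):
    """Return the required tag keys that are absent or empty on a resource."""
    return [key for key in REQUIRED_TAGS if not tags.get(key)]


print(missing_tags({"Environment": "production", "CostCenter": "CC-12345", "Owner": ""}))
```

Running a check like this in a CI pipeline catches missing tags earlier than a deployment-time policy denial.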

Enforcing Tags with Azure Policy

// Azure Policy - Require CostCenter tag
{
  "mode": "Indexed",
  "policyRule": {
    "if": {
      "allOf": [
        {
          "field": "tags['CostCenter']",
          "exists": false
        },
        {
          "field": "type",
          "notEquals": "Microsoft.Resources/subscriptions/resourceGroups"
        }
      ]
    },
    "then": {
      "effect": "deny"
    }
  }
}

# Apply tags to the resource group itself (resources in the group do not inherit them);
# note that az tag create replaces any tags already on the target
az tag create --resource-id /subscriptions/{sub}/resourceGroups/{rg} \
  --tags CostCenter=CC-12345 Environment=production

# List untagged resources
az resource list --query "[?tags.CostCenter==null].{Name:name, Type:type}" -o table
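
The deny rule in the policy above is a conjunction: a request is blocked only when the `CostCenter` tag is missing and the resource is not a resource group. The following sketch mirrors that logic for illustration. It is not how Azure Policy is implemented; the `is_denied` name and the resource dictionary shape are assumptions, as is the case-insensitive type comparison.

```python
RESOURCE_GROUP_TYPE = "Microsoft.Resources/subscriptions/resourceGroups"


def is_denied(resource):
    """Evaluate the 'allOf' condition of the CostCenter policy for one resource."""
    tags = resource.get("tags") or {}

    # Condition 1: the CostCenter tag does not exist
    missing_cost_center = "CostCenter" not in tags

    # Condition 2: the resource is not a resource group
    # (compared case-insensitively here, which is an assumption of this sketch)
    not_resource_group = resource.get("type", "").lower() != RESOURCE_GROUP_TYPE.lower()

    # allOf: both conditions must hold for the deny effect to apply
    return missing_cost_center and not_resource_group


# Hypothetical resources for illustration only
vm_untagged = {"type": "Microsoft.Compute/virtualMachines", "tags": {}}
vm_tagged = {"type": "Microsoft.Compute/virtualMachines", "tags": {"CostCenter": "CC-12345"}}
rg_untagged = {"type": RESOURCE_GROUP_TYPE, "tags": None}

print(is_denied(vm_untagged), is_denied(vm_tagged), is_denied(rg_untagged))
```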

Storage Cost Optimization

Storage Tier Comparison

Tier | Access Frequency | Storage Cost (per GB) | Access Cost
Hot | Frequent | $$$ | $
Cool | Infrequent (30+ days) | $$ | $$
Cold | Rare (90+ days) | $ | $$$
Archive | Rarely (180+ days) | ¢ | $$$$

Lifecycle Management

// Storage Lifecycle Policy
{
  "rules": [
    {
      "name": "MoveToCool",
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": ["blockBlob"],
          "prefixMatch": ["logs/"]
        },
        "actions": {
          "baseBlob": {
            "tierToCool": { "daysAfterModificationGreaterThan": 30 },
            "tierToArchive": { "daysAfterModificationGreaterThan": 90 },
            "delete": { "daysAfterModificationGreaterThan": 365 }
          }
        }
      }
    }
  ]
}
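
The three actions in the policy above form an age ladder: cool after 30 days, archive after 90, delete after 365. The sketch below shows which action applies to a blob of a given age, assuming the most advanced matching action wins. The function name `lifecycle_action` is hypothetical and the code does not call the storage service.

```python
def lifecycle_action(days_since_modified):
    """Return the policy action that applies to a blob of the given age, or None."""
    # Thresholds use "greater than", matching daysAfterModificationGreaterThan
    if days_since_modified > 365:
        return "delete"
    if days_since_modified > 90:
        return "tierToArchive"
    if days_since_modified > 30:
        return "tierToCool"
    return None


for age in (10, 31, 91, 366):
    print(age, lifecycle_action(age))
```

Because the comparison is strict, a blob that is exactly 30 days old stays in its current tier until the following day.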

Compute Cost Optimization

Auto-Scaling Configuration

// Bicep - Autoscale settings for App Service
resource autoscale 'Microsoft.Insights/autoscalesettings@2022-10-01' = {
  name: 'autoscale-webapp'
  location: location
  properties: {
    targetResourceUri: appServicePlan.id
    enabled: true
    profiles: [
      {
        name: 'Default'
        capacity: {
          minimum: '1'
          maximum: '10'
          default: '2'
        }
        rules: [
          {
            metricTrigger: {
              metricName: 'CpuPercentage'
              metricResourceUri: appServicePlan.id
              timeGrain: 'PT1M'
              statistic: 'Average'
              timeWindow: 'PT5M'
              timeAggregation: 'Average'
              operator: 'GreaterThan'
              threshold: 70
            }
            scaleAction: {
              direction: 'Increase'
              type: 'ChangeCount'
              value: '1'
              cooldown: 'PT5M'
            }
          }
          {
            metricTrigger: {
              metricName: 'CpuPercentage'
              metricResourceUri: appServicePlan.id
              timeGrain: 'PT1M'
              statistic: 'Average'
              timeWindow: 'PT5M'
              timeAggregation: 'Average'
              operator: 'LessThan'
              threshold: 30
            }
            scaleAction: {
              direction: 'Decrease'
              type: 'ChangeCount'
              value: '1'
              cooldown: 'PT5M'
            }
          }
        ]
      }
    ]
  }
}
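
The two rules above amount to a small decision function: add one instance when average CPU exceeds 70%, remove one when it drops below 30%, and stay within the capacity bounds of 1 to 10. The following sketch illustrates that decision only. The function name `next_instance_count` is hypothetical, and the code deliberately ignores the 5-minute evaluation window and cooldown that the real autoscale engine applies.

```python
def next_instance_count(current, avg_cpu, minimum=1, maximum=10):
    """Apply the scale-out / scale-in rules once and return the new instance count."""
    if avg_cpu > 70:
        # Scale out by one instance, capped at the profile maximum
        return min(current + 1, maximum)
    if avg_cpu < 30:
        # Scale in by one instance, never below the profile minimum
        return max(current - 1, minimum)
    # Between the thresholds: no change
    return current


print(next_instance_count(current=2, avg_cpu=85))
print(next_instance_count(current=2, avg_cpu=20))
print(next_instance_count(current=2, avg_cpu=50))
```

The gap between the two thresholds (30% to 70%) helps prevent the instance count from oscillating when load hovers around a single value.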

Serverless Cost Benefits

Scenario | Traditional | Serverless | Savings
Low traffic API | 24/7 VM ($100/mo) | Pay per request ($5/mo) | 95%
Batch processing | Dedicated VM ($200/mo) | Functions ($20/mo) | 90%
Event processing | Always-on service | Event-driven ($10/mo) | 85%
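
The savings column follows directly from the two monthly costs. A minimal sketch of that calculation, using a hypothetical helper named `savings_pct` and the first two rows of the table (the third row does not state a traditional cost, so it is left out):

```python
def savings_pct(traditional_monthly, serverless_monthly):
    """Percentage saved by moving from the traditional cost to the serverless cost."""
    return (traditional_monthly - serverless_monthly) * 100 / traditional_monthly


print(savings_pct(100, 5))   # Low traffic API
print(savings_pct(200, 20))  # Batch processing
```

These figures hold only at low or bursty volume; at sustained high request rates, per-request pricing can exceed the cost of an always-on instance.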

Cost Optimization Checklist

Planning Phase

  • Estimate costs using Azure Pricing Calculator
  • Set budgets and alerts
  • Define tagging strategy
  • Plan for reserved capacity

Implementation Phase

  • Apply required tags to all resources
  • Configure auto-scaling
  • Use appropriate storage tiers
  • Implement lifecycle policies

Operations Phase

  • Review Azure Advisor recommendations weekly
  • Analyze cost trends monthly
  • Right-size underutilized resources
  • Purchase/renew reservations as needed

Governance Phase

  • Enforce tagging with Azure Policy
  • Implement cost allocation
  • Conduct regular cost reviews
  • Train teams on cost awareness

Assessment Questions

Area | Question
Visibility | Can you see costs by team/project/application?
Budgets | Do you have budgets and alerts configured?
Right-sizing | When did you last review resource utilization?
Reservations | Are predictable workloads covered by RIs?
Tagging | Are all resources properly tagged?
Automation | Do you auto-scale based on demand?
Storage | Are you using appropriate storage tiers?
Governance | Who is accountable for cloud costs?

Key Takeaways

  1. Right-size first: Don't pay for resources you don't use
  2. Commit to save: Use reservations for predictable workloads
  3. Tag everything: Enable cost allocation and accountability
  4. Automate scaling: Match capacity to demand
  5. Review regularly: Cost optimization is continuous

Resources