# Performance Efficiency Pillar

## TL;DR

The Performance Efficiency pillar focuses on meeting workload demands efficiently. Key concepts:

- **Scaling**: Match capacity to demand (vertical and horizontal)
- **Caching**: Reduce latency and backend load
- **Optimization**: Improve code and query efficiency
- **Capacity planning**: Anticipate future needs
- **Performance testing**: Validate under realistic conditions
## Design Principles

### Core Performance Efficiency Principles
| Principle | Description | Implementation |
|---|---|---|
| Design for scaling | Handle varying loads | Auto-scaling, stateless design |
| Optimize hot paths | Focus on critical code paths | Profiling, optimization |
| Use caching | Reduce repeated work | Redis, CDN, local cache |
| Partition workloads | Distribute load effectively | Sharding, partitioning |
| Continuously monitor | Understand performance | APM, load testing |
## Scaling Strategies

### Vertical vs Horizontal Scaling
| Aspect | Vertical Scaling | Horizontal Scaling |
|---|---|---|
| Approach | Bigger machines | More machines |
| Limit | Hardware limits | Virtually unlimited |
| Downtime | Usually required | Zero downtime |
| Complexity | Simple | Requires stateless design |
| Cost | Diminishing returns | Linear cost increase |
| Use Case | Databases, legacy apps | Web apps, microservices |
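The horizontal path in the table assumes instances are interchangeable, which means they hold no per-user state in process memory. A minimal ASP.NET Core sketch of one common step, moving session state into a shared Redis cache; the `Redis:ConnectionString` setting and the `Microsoft.Extensions.Caching.StackExchangeRedis` package are assumptions for illustration:

```csharp
// Program.cs - keep per-user state out of process memory so any
// instance behind the load balancer can serve any request.
var builder = WebApplication.CreateBuilder(args);

// Session state goes to a shared Redis cache instead of in-process memory
builder.Services.AddStackExchangeRedisCache(options =>
{
    options.Configuration = builder.Configuration["Redis:ConnectionString"];
});
builder.Services.AddSession(); // backed by the distributed cache above

var app = builder.Build();
app.UseSession();
app.MapGet("/", (HttpContext ctx) =>
{
    // Behaves identically no matter which instance handles the request
    ctx.Session.SetString("lastSeen", DateTime.UtcNow.ToString("O"));
    return "ok";
});
app.Run();
```

With state externalized, an auto-scaler can add or remove instances freely without losing user sessions.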
### Auto-Scaling Patterns

#### Auto-Scale Configuration

```bicep
// Bicep - Configure auto-scaling for App Service
resource autoscale 'Microsoft.Insights/autoscalesettings@2022-10-01' = {
  name: 'autoscale-webapp'
  location: location
  properties: {
    targetResourceUri: appServicePlan.id
    enabled: true
    profiles: [
      {
        name: 'DefaultProfile'
        capacity: {
          minimum: '2'
          maximum: '10'
          default: '2'
        }
        rules: [
          // Scale out on high CPU
          {
            metricTrigger: {
              metricName: 'CpuPercentage'
              metricResourceUri: appServicePlan.id
              timeGrain: 'PT1M'
              statistic: 'Average'
              timeWindow: 'PT5M'
              timeAggregation: 'Average'
              operator: 'GreaterThan'
              threshold: 70
            }
            scaleAction: {
              direction: 'Increase'
              type: 'ChangeCount'
              value: '2'
              cooldown: 'PT5M'
            }
          }
          // Scale in on low CPU (longer window and cooldown to avoid flapping)
          {
            metricTrigger: {
              metricName: 'CpuPercentage'
              metricResourceUri: appServicePlan.id
              timeGrain: 'PT1M'
              statistic: 'Average'
              timeWindow: 'PT10M'
              timeAggregation: 'Average'
              operator: 'LessThan'
              threshold: 30
            }
            scaleAction: {
              direction: 'Decrease'
              type: 'ChangeCount'
              value: '1'
              cooldown: 'PT10M'
            }
          }
        ]
      }
      // Scheduled scaling for known traffic patterns
      {
        name: 'BusinessHours'
        capacity: {
          minimum: '4'
          maximum: '10'
          default: '4'
        }
        recurrence: {
          frequency: 'Week'
          schedule: {
            timeZone: 'Eastern Standard Time'
            days: ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
            hours: [9]
            minutes: [0]
          }
        }
        rules: []
      }
    ]
  }
}
```
## Caching Strategies
### Cache Pattern Comparison
| Pattern | Use Case | Pros | Cons |
|---|---|---|---|
| Cache-Aside | General purpose | Simple, flexible | Cache misses hit DB |
| Read-Through | Read-heavy workloads | Transparent to app | Cache dependency |
| Write-Through | Data consistency | Always consistent | Write latency |
| Write-Behind | Write-heavy workloads | Fast writes | Eventual consistency |
| Refresh-Ahead | Predictable access | No cache misses | Wasted refreshes |
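The cache-aside example below covers the first row. For contrast, a minimal write-through sketch, reusing the `Product` and `IProductRepository` types from the surrounding examples (the class name here is hypothetical, shown only to make the table's write-latency trade-off concrete):

```csharp
// Write-through: every write updates the cache and the backing store
// together, so subsequent reads can always trust the cache.
public class WriteThroughProductService
{
    private readonly IDatabase _cache;               // StackExchange.Redis
    private readonly IProductRepository _repository;

    public WriteThroughProductService(IDatabase cache, IProductRepository repository)
    {
        _cache = cache;
        _repository = repository;
    }

    public async Task SaveProductAsync(Product product)
    {
        // Write the system of record first...
        await _repository.UpdateAsync(product);

        // ...then synchronously refresh the cache before returning.
        // This extra write on the hot path is the "write latency" cost
        // noted in the table.
        await _cache.StringSetAsync(
            $"product:{product.Id}",
            JsonSerializer.Serialize(product),
            TimeSpan.FromMinutes(10));
    }
}
```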
### Cache-Aside Pattern

```csharp
// C# - Cache-aside pattern with Redis (StackExchange.Redis + System.Text.Json)
public class ProductService
{
    private readonly IDatabase _cache;
    private readonly IProductRepository _repository;
    private readonly TimeSpan _cacheExpiry = TimeSpan.FromMinutes(10);

    public async Task<Product> GetProductAsync(string productId)
    {
        // Try cache first
        var cacheKey = $"product:{productId}";
        var cached = await _cache.StringGetAsync(cacheKey);
        if (cached.HasValue)
        {
            return JsonSerializer.Deserialize<Product>(cached);
        }

        // Cache miss - get from database
        var product = await _repository.GetByIdAsync(productId);
        if (product != null)
        {
            // Store in cache
            await _cache.StringSetAsync(
                cacheKey,
                JsonSerializer.Serialize(product),
                _cacheExpiry);
        }
        return product;
    }

    public async Task UpdateProductAsync(Product product)
    {
        // Update database
        await _repository.UpdateAsync(product);

        // Invalidate cache
        var cacheKey = $"product:{product.Id}";
        await _cache.KeyDeleteAsync(cacheKey);
    }
}
```
### Azure Cache for Redis

```bash
# Create Redis Cache
az redis create \
  --name myRedisCache \
  --resource-group myRG \
  --location eastus \
  --sku Standard \
  --vm-size c1

# List access keys (used to build the connection string)
az redis list-keys --name myRedisCache --resource-group myRG
```
```csharp
// C# - Configure Redis in ASP.NET Core
builder.Services.AddStackExchangeRedisCache(options =>
{
    options.Configuration = builder.Configuration["Redis:ConnectionString"];
    options.InstanceName = "myapp:";
});

// Use IDistributedCache
public class CachedService
{
    private readonly IDistributedCache _cache;

    public async Task<string> GetCachedDataAsync(string key)
    {
        var cached = await _cache.GetStringAsync(key);
        if (cached != null) return cached;

        var data = await FetchDataAsync();
        await _cache.SetStringAsync(key, data, new DistributedCacheEntryOptions
        {
            AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10),
            SlidingExpiration = TimeSpan.FromMinutes(2)
        });
        return data;
    }
}
```
### CDN Configuration

```bicep
// Bicep - Configure Azure CDN
resource cdnProfile 'Microsoft.Cdn/profiles@2023-05-01' = {
  name: 'cdn-profile'
  location: 'global'
  sku: {
    name: 'Standard_Microsoft'
  }
}

resource cdnEndpoint 'Microsoft.Cdn/profiles/endpoints@2023-05-01' = {
  parent: cdnProfile
  name: 'cdn-endpoint'
  location: 'global'
  properties: {
    // Host header must be the origin's host name, not the full endpoint URL
    originHostHeader: '${storageAccount.name}.blob.core.windows.net'
    origins: [
      {
        name: 'storage-origin'
        properties: {
          hostName: '${storageAccount.name}.blob.core.windows.net'
          httpPort: 80
          httpsPort: 443
        }
      }
    ]
    isCompressionEnabled: true
    contentTypesToCompress: [
      'text/plain'
      'text/html'
      'text/css'
      'application/javascript'
      'application/json'
    ]
    queryStringCachingBehavior: 'IgnoreQueryString'
  }
}
```
## Database Optimization

### Query Optimization Techniques
| Technique | Impact | Implementation |
|---|---|---|
| Indexing | High | Create indexes on frequently queried columns |
| Query rewriting | High | Optimize slow queries |
| Partitioning | Medium | Split large tables |
| Read replicas | High | Offload read traffic |
| Connection pooling | Medium | Reuse database connections |
| Caching | High | Cache frequent queries |
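Connection pooling usually needs no application code at all: with `Microsoft.Data.SqlClient` it is on by default and tuned through the connection string. A brief sketch (server, database, and pool sizes are illustrative placeholders):

```csharp
// Pooling is enabled by default; the connection string only tunes it.
var connectionString =
    "Server=tcp:myserver.database.windows.net;Database=mydb;" +
    "Min Pool Size=5;Max Pool Size=100;";   // keep 5 warm, cap at 100

// Open and close cheaply: Dispose returns the connection to the pool
// instead of tearing down the underlying TCP/TLS session.
await using var connection = new SqlConnection(connectionString);
await connection.OpenAsync();
// ... run queries ...
```

The anti-pattern to avoid is holding one connection open for the application's lifetime; open late, close early, and let the pool do the reuse.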
### SQL Server Index Strategy

```sql
-- Identify missing indexes
SELECT
    CONVERT(decimal(18,2), migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans)) AS improvement_measure,
    'CREATE INDEX [IX_' + OBJECT_NAME(mid.object_id) + '_' + REPLACE(REPLACE(REPLACE(ISNULL(mid.equality_columns,''), ', ', '_'), '[', ''), ']', '') + ']'
        + ' ON ' + mid.statement
        + ' (' + ISNULL(mid.equality_columns, '') + CASE WHEN mid.inequality_columns IS NOT NULL THEN ',' + mid.inequality_columns ELSE '' END + ')'
        + ISNULL(' INCLUDE (' + mid.included_columns + ')', '') AS create_index_statement
FROM sys.dm_db_missing_index_groups mig
INNER JOIN sys.dm_db_missing_index_group_stats migs ON migs.group_handle = mig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details mid ON mig.index_handle = mid.index_handle
ORDER BY improvement_measure DESC;

-- Find unused indexes (read counters are zero, but every write still pays to maintain them)
SELECT
    OBJECT_NAME(i.object_id) AS TableName,
    i.name AS IndexName,
    i.type_desc AS IndexType,
    ius.user_seeks,
    ius.user_scans,
    ius.user_lookups,
    ius.user_updates
FROM sys.indexes i
LEFT JOIN sys.dm_db_index_usage_stats ius ON i.object_id = ius.object_id AND i.index_id = ius.index_id
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
  AND i.type_desc = 'NONCLUSTERED'
  AND ISNULL(ius.user_seeks, 0) + ISNULL(ius.user_scans, 0) + ISNULL(ius.user_lookups, 0) = 0
ORDER BY ius.user_updates DESC;
```
### Cosmos DB Optimization

```csharp
// C# - Optimized Cosmos DB queries
public class CosmosRepository
{
    private readonly Container _container;

    // Use partition key for efficient queries
    public async Task<Product> GetProductAsync(string productId, string category)
    {
        // Point read with partition key - most efficient (lowest RU cost)
        var response = await _container.ReadItemAsync<Product>(
            productId,
            new PartitionKey(category));
        return response.Resource;
    }

    // Use query with partition key
    public async Task<List<Product>> GetProductsByCategoryAsync(string category)
    {
        var query = new QueryDefinition(
            "SELECT * FROM c WHERE c.category = @category")
            .WithParameter("@category", category);

        var options = new QueryRequestOptions
        {
            PartitionKey = new PartitionKey(category),
            MaxItemCount = 100
        };

        var results = new List<Product>();
        using var iterator = _container.GetItemQueryIterator<Product>(query, requestOptions: options);
        while (iterator.HasMoreResults)
        {
            var response = await iterator.ReadNextAsync();
            results.AddRange(response);
        }
        return results;
    }
}
```
### Read Replica Configuration

```csharp
// C# - Read/write splitting with EF Core
public class ApplicationDbContext : DbContext
{
    private readonly string _writeConnectionString;
    private readonly string _readConnectionString;
    private bool _useReadReplica;

    // Note: OnConfiguring runs when the context instance is first used,
    // so switch modes before the first query on a given instance.
    public void UseReadReplica() => _useReadReplica = true;
    public void UsePrimary() => _useReadReplica = false;

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        var connectionString = _useReadReplica ? _readConnectionString : _writeConnectionString;
        optionsBuilder.UseSqlServer(connectionString);
    }
}

// Usage
public async Task<List<Product>> GetProductsAsync()
{
    _context.UseReadReplica();
    return await _context.Products.ToListAsync();
}

public async Task CreateProductAsync(Product product)
{
    _context.UsePrimary();
    _context.Products.Add(product);
    await _context.SaveChangesAsync();
}
```
## Asynchronous Processing

### Queue-Based Load Leveling

```csharp
// C# - Azure Service Bus producer
public class OrderService
{
    private readonly ServiceBusSender _sender;

    public async Task<string> SubmitOrderAsync(Order order)
    {
        var orderId = Guid.NewGuid().ToString();
        order.Id = orderId;

        var message = new ServiceBusMessage(JsonSerializer.Serialize(order))
        {
            MessageId = orderId,
            ContentType = "application/json",
            Subject = "NewOrder"
        };

        await _sender.SendMessageAsync(message);
        return orderId; // Return immediately; processing happens in the background
    }
}

// C# - Azure Service Bus consumer
public class OrderProcessor : BackgroundService
{
    private readonly ServiceBusProcessor _processor;
    private readonly ILogger<OrderProcessor> _logger;

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        _processor.ProcessMessageAsync += async args =>
        {
            var order = JsonSerializer.Deserialize<Order>(args.Message.Body.ToString());
            await ProcessOrderAsync(order);
            await args.CompleteMessageAsync(args.Message);
        };

        _processor.ProcessErrorAsync += args =>
        {
            _logger.LogError(args.Exception, "Error processing message");
            return Task.CompletedTask;
        };

        await _processor.StartProcessingAsync(stoppingToken);
    }
}
```
## Performance Testing

### Testing Types
| Type | Purpose | Tools |
|---|---|---|
| Load Testing | Validate under expected load | k6, JMeter, Azure Load Testing |
| Stress Testing | Find breaking points | k6, Locust |
| Soak Testing | Find memory leaks, degradation | Long-running load tests |
| Spike Testing | Handle sudden traffic bursts | k6, Gatling |
### k6 Load Test Example

```javascript
// load-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '2m', target: 100 }, // Ramp up to 100 users
    { duration: '5m', target: 100 }, // Stay at 100 users
    { duration: '2m', target: 200 }, // Ramp up to 200 users
    { duration: '5m', target: 200 }, // Stay at 200 users
    { duration: '2m', target: 0 },   // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'], // 95% of requests under 500ms
    http_req_failed: ['rate<0.01'],   // Less than 1% failure rate
  },
};

export default function () {
  const res = http.get('https://api.example.com/products');
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });
  sleep(1);
}
```
### Azure Load Testing

```yaml
# load-test-config.yaml
version: v0.1
testId: api-load-test
displayName: API Load Test
testPlan: load-test.jmx
engineInstances: 5
failureCriteria:
  - avg(response_time_ms) > 500
  - percentage(error) > 5
env:
  - name: API_URL
    value: https://api.example.com
```
## Capacity Planning

### Capacity Estimation
| Metric | Current | Growth Rate | 12-Month Forecast |
|---|---|---|---|
| Daily Active Users | 10,000 | 20%/month | 89,000 |
| Requests/second | 100 | 20%/month | 890 |
| Storage (GB) | 500 | 10%/month | 1,570 |
| Database DTUs | 100 | 15%/month | 535 |
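The forecast column compounds the monthly growth rate over twelve months: forecast = current × (1 + rate)^12. A quick C# check of the table's numbers:

```csharp
// Compound the monthly growth rate over 12 months.
static double Forecast(double current, double monthlyRate, int months = 12) =>
    current * Math.Pow(1 + monthlyRate, months);

Console.WriteLine(Forecast(10_000, 0.20)); // daily active users -> ~89,161
Console.WriteLine(Forecast(100, 0.20));    // requests/second    -> ~892
Console.WriteLine(Forecast(500, 0.10));    // storage GB         -> ~1,569
Console.WriteLine(Forecast(100, 0.15));    // database DTUs      -> ~535
```

Compounding is why a "modest" 20%/month rate implies nearly 9x capacity in a year; linear extrapolation would badly undersize the forecast.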
### Right-Sizing Recommendations

```kusto
// KQL - Analyze resource utilization for capacity planning
Perf
| where TimeGenerated > ago(30d)
| where ObjectName == "Processor" and CounterName == "% Processor Time"
| summarize
    AvgCPU = avg(CounterValue),
    MaxCPU = max(CounterValue),
    P95CPU = percentile(CounterValue, 95)
    by Computer
| extend Recommendation = case(
    P95CPU < 20, "Consider downsizing",
    P95CPU > 80, "Consider upsizing",
    "Right-sized")
```
## Azure Services for Performance

### Compute Performance Tiers

| Service | Standard | Premium | Ultra |
|---|---|---|---|
| App Service | Dedicated compute | Dedicated, auto-scale | Isolated (App Service Environment) |
| SQL Database | DTU-based | vCore, Hyperscale | Business Critical |
| Storage | Standard HDD | Standard SSD | Premium SSD, Ultra Disk |
| Redis | Basic | Standard | Premium (clustering) |
### Performance-Focused Services

| Service | Use Case | Benefit |
|---|---|---|
| Azure Front Door | Global load balancing | Edge caching, acceleration |
| Azure CDN | Static content delivery | Global edge network |
| Azure Cache for Redis | Distributed caching | Sub-millisecond latency |
| Cosmos DB | Global database | Single-digit ms latency |
| Premium/Ultra Storage | High IOPS workloads | Up to 160,000 IOPS per Ultra Disk |
## Performance Efficiency Checklist

### Design Phase
- Identify performance requirements and SLAs
- Design for horizontal scaling
- Plan caching strategy
- Choose appropriate service tiers
- Design for async processing where possible
### Implementation Phase
- Implement caching at multiple levels
- Configure auto-scaling
- Optimize database queries and indexes
- Use connection pooling
- Implement async patterns
### Testing Phase
- Conduct load testing
- Perform stress testing
- Validate auto-scaling behavior
- Test failover scenarios
- Benchmark against requirements
### Operations Phase
- Monitor performance metrics
- Set up performance alerts
- Review and optimize regularly
- Plan for capacity growth
- Conduct periodic load tests
## Assessment Questions
| Area | Question |
|---|---|
| Scaling | Can your application scale horizontally? |
| Caching | Do you have a caching strategy? |
| Database | Are your queries optimized? |
| Async | Do you use async processing for long operations? |
| Testing | Do you load test before releases? |
| Monitoring | Can you identify performance bottlenecks? |
| Capacity | Do you have a capacity planning process? |
| SLAs | Are you meeting your performance SLAs? |
## Key Takeaways

- **Scale horizontally**: Design stateless applications that can scale out
- **Cache aggressively**: Use caching at every layer to reduce latency
- **Optimize hot paths**: Focus optimization efforts on critical code paths
- **Test under load**: Validate performance before production deployment
- **Monitor continuously**: Use APM to identify and fix bottlenecks