Architectural Thinking: The Staff+ Mindset
Architectural thinking is the ability to see systems holistically, understand trade-offs, and make decisions that balance competing concerns. It's what separates Staff+ engineers from senior engineers.
TL;DR
| Concept | Definition |
|---|---|
| Architecture | Significant decisions that are hard to change |
| Trade-offs | Everything has pros and cons - there are no silver bullets |
| Fitness Functions | Measurable criteria for architectural success |
| -ilities | Non-functional requirements (scalability, maintainability, etc.) |
What is Software Architecture?
"Architecture is about the important stuff. Whatever that is." - Ralph Johnson
Architecture consists of:
- Structure - Components and their relationships
- Characteristics - The "-ilities" (scalability, reliability, etc.)
- Decisions - Significant choices and their rationale
- Design Principles - Guidelines that inform decisions
The First Law of Software Architecture
"Everything in software architecture is a trade-off."
Corollary: "If an architect thinks they have discovered something that isn't a trade-off, more likely they just haven't identified the trade-off yet."
- Mark Richards & Neal Ford, Fundamentals of Software Architecture
Trade-off Examples
| Decision | Benefit | Trade-off |
|---|---|---|
| Microservices | Independent deployment, scaling | Distributed complexity, network latency |
| Monolith | Simplicity, easier debugging | Scaling limitations, deployment coupling |
| Event Sourcing | Complete audit trail, temporal queries | Complexity, eventual consistency |
| CQRS | Optimized reads and writes | Two models to maintain |
| Caching | Performance | Consistency challenges |
Architectural Characteristics (-ilities)
The Big List
┌─────────────────────────────────────────────────────────┐
│              ARCHITECTURAL CHARACTERISTICS              │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  OPERATIONAL                                            │
│  • Availability    - System uptime                      │
│  • Scalability     - Handle growth                      │
│  • Performance     - Response time, throughput          │
│  • Reliability     - Consistent behavior                │
│  • Recoverability  - Recover from failure               │
│                                                         │
│  STRUCTURAL                                             │
│  • Maintainability - Ease of changes                    │
│  • Testability     - Ease of testing                    │
│  • Deployability   - Ease of deployment                 │
│  • Modularity      - Well-defined components            │
│  • Extensibility   - Adding new features                │
│                                                         │
│  CROSS-CUTTING                                          │
│  • Security        - Protection from threats            │
│  • Observability   - Understanding system state         │
│  • Auditability    - Tracking changes                   │
│  • Compliance      - Meeting regulations                │
│                                                         │
└─────────────────────────────────────────────────────────┘
You Can't Have Everything
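The key insight: characteristics conflict. Optimizing for one often hurts another. For example, encrypting and authorizing every internal call improves security but adds latency, which hurts performance.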
Prioritization Exercise
For any system, identify the top 3 characteristics:
| System Type | Top Characteristics |
|---|---|
| Banking | Security, Reliability, Auditability |
| E-commerce | Scalability, Performance, Availability |
| Healthcare | Security, Compliance, Reliability |
| Startup MVP | Time-to-market, Simplicity, Extensibility |
| IoT Platform | Scalability, Performance, Reliability |
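If stakeholders disagree on the top three, a simple tally can make the disagreement visible. The sketch below assumes each stakeholder names their own top three and counts the picks. The `topCharacteristics` helper and the sample votes are hypothetical, and a real prioritization would also weigh who is voting and why.

```typescript
// Each inner array is one stakeholder's top picks (hypothetical data).
const votes: string[][] = [
  ["Security", "Reliability", "Auditability"],
  ["Security", "Performance", "Reliability"],
  ["Reliability", "Security", "Scalability"],
  ["Auditability", "Security", "Performance"],
];

// Counts how often each characteristic was picked and returns the `limit` most popular.
function topCharacteristics(ballots: string[][], limit: number): string[] {
  const counts: Record<string, number> = {};
  ballots.forEach((ballot) => {
    ballot.forEach((name) => {
      counts[name] = (counts[name] || 0) + 1;
    });
  });
  return Object.keys(counts)
    .sort((a, b) => counts[b] - counts[a] || a.localeCompare(b)) // ties break alphabetically
    .slice(0, limit);
}

const topThree = topCharacteristics(votes, 3);
```

Characteristics that fall just outside the cut (here, Performance) are worth recording as explicit non-priorities rather than silently dropping.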
Fitness Functions
Fitness functions are objective measures of how well architecture meets its goals.
Definition
A fitness function is an objective integrity assessment of some architectural characteristic.
Examples
// Assumes `metrics`, `lastWeek`, and `architectureAnalyzer` are provided by your own tooling.
// Fitness Function: API Response Time
function apiResponseTimeFitness(): boolean {
  const p99Latency = metrics.getP99Latency('/api/orders');
  return p99Latency < 200; // Must be under 200ms
}
// Fitness Function: Deployment Frequency
function deploymentFrequencyFitness(): boolean {
  const deploysPerWeek = metrics.getDeployCount(lastWeek);
  return deploysPerWeek >= 5; // At least 5 deploys per week
}
// Fitness Function: Test Coverage
function testCoverageFitness(): boolean {
  const coverage = metrics.getCodeCoverage();
  return coverage >= 80; // At least 80% coverage
}
// Fitness Function: Cyclic Dependencies
function noCyclicDependenciesFitness(): boolean {
  const cycles = architectureAnalyzer.findCycles();
  return cycles.length === 0; // No cycles allowed
}
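The `architectureAnalyzer.findCycles()` call above is left undefined. As a rough sketch of what such a static check could do, the following runs a depth-first search over a module dependency graph and reports one cycle if it finds one. The `DependencyGraph` shape, the `findCycle` name, and the module names are illustrative assumptions, not the API of any particular tool.

```typescript
// Hypothetical input: each module maps to the modules it imports.
type DependencyGraph = Record<string, string[]>;

// Returns the modules forming one dependency cycle, or null if there is none.
function findCycle(graph: DependencyGraph): string[] | null {
  const state: Record<string, "visiting" | "done"> = {};
  const path: string[] = [];

  function visit(node: string): string[] | null {
    if (state[node] === "done") return null;
    if (state[node] === "visiting") {
      // Back edge: the cycle is the part of the current path starting at `node`.
      return path.slice(path.indexOf(node)).concat(node);
    }
    state[node] = "visiting";
    path.push(node);
    const deps = graph[node] || [];
    for (let i = 0; i < deps.length; i++) {
      const cycle = visit(deps[i]);
      if (cycle) return cycle;
    }
    path.pop();
    state[node] = "done";
    return null;
  }

  const modules = Object.keys(graph);
  for (let i = 0; i < modules.length; i++) {
    const cycle = visit(modules[i]);
    if (cycle) return cycle;
  }
  return null;
}

// Fitness function: passes only when the graph is acyclic.
function noCyclicDependencies(graph: DependencyGraph): boolean {
  return findCycle(graph) === null;
}

const layered: DependencyGraph = { api: ["domain"], domain: [], infra: ["domain"] };
const tangled: DependencyGraph = { orders: ["billing"], billing: ["orders"] };

const layeredOk = noCyclicDependencies(layered); // true
const tangledOk = noCyclicDependencies(tangled); // false
```

Returning the offending path rather than a bare boolean makes a failed check actionable: the build output can name the modules that need to be untangled.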
Fitness Function Categories
| Category | Description (example) |
|---|---|
| Atomic | Single metric (response time < 200ms) |
| Holistic | Multiple metrics combined (overall system health) |
| Triggered | Run on specific events (deployment, PR) |
| Continuous | Always running (monitoring) |
| Static | Code analysis (no cycles, coverage) |
| Dynamic | Runtime metrics (latency, throughput) |
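To make the atomic/holistic distinction concrete, here is a small sketch that composes several atomic checks into one holistic check. The `FitnessCheck` type, the `holistic` helper, and the sample readings are assumptions for illustration; in a real system the readings would come from your metrics and analysis tools.

```typescript
// An atomic fitness function: one name, one pass/fail measurement.
interface FitnessCheck {
  name: string;
  run: () => boolean;
}

interface FitnessReport {
  passed: boolean;
  failures: string[];
}

// A holistic fitness function: runs every atomic check and reports all failures together.
function holistic(checks: FitnessCheck[]): FitnessReport {
  const failures = checks
    .filter((check) => !check.run())
    .map((check) => check.name);
  return { passed: failures.length === 0, failures: failures };
}

// Hypothetical readings standing in for real metrics.
const readings = { p99LatencyMs: 240, coveragePercent: 86, dependencyCycles: 0 };

const report = holistic([
  { name: "p99 latency < 200ms", run: () => readings.p99LatencyMs < 200 },
  { name: "coverage >= 80%", run: () => readings.coveragePercent >= 80 },
  { name: "no dependency cycles", run: () => readings.dependencyCycles === 0 },
]);
// report.passed is false; report.failures contains only the latency check.
```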
Implementing Fitness Functions
# Example: Architecture fitness tests in CI/CD
fitness_tests:
  - name: "No circular dependencies"
    type: static
    tool: archunit
    rule: "no cycles in package dependencies"
  - name: "API latency P99 < 200ms"
    type: dynamic
    tool: datadog
    metric: "api.latency.p99"
    threshold: 200
  - name: "Domain layer has no infrastructure imports"
    type: static
    tool: archunit
    rule: "domain packages should not depend on infrastructure"
  - name: "All public APIs have OpenAPI docs"
    type: static
    tool: custom
    rule: "every controller method has swagger annotation"
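A config like this still needs something to evaluate it. The sketch below shows one way a CI step might evaluate the `dynamic` entries once the config has been parsed into objects. The `DynamicTest` shape mirrors the fields above, while `evaluateDynamicTests`, the `MetricReader` callback, and the sample values are assumptions; it does not call Datadog or any real monitoring API.

```typescript
// Mirrors a `type: dynamic` entry from the config above.
interface DynamicTest {
  name: string;
  metric: string;
  threshold: number; // the measured value must stay below this
}

// Hypothetical metric source; returns undefined when a metric is missing.
type MetricReader = (metric: string) => number | undefined;

interface TestResult {
  name: string;
  passed: boolean;
  detail: string;
}

function evaluateDynamicTests(tests: DynamicTest[], read: MetricReader): TestResult[] {
  return tests.map((test) => {
    const value = read(test.metric);
    if (value === undefined) {
      // Treat missing data as a failure so a broken metric cannot pass silently.
      return { name: test.name, passed: false, detail: "no data for " + test.metric };
    }
    return {
      name: test.name,
      passed: value < test.threshold,
      detail: test.metric + " = " + value + " (threshold " + test.threshold + ")",
    };
  });
}

// Sample values standing in for a real monitoring query.
const sampleMetrics: Record<string, number> = { "api.latency.p99": 180 };

const results = evaluateDynamicTests(
  [{ name: "API latency P99 < 200ms", metric: "api.latency.p99", threshold: 200 }],
  (metric) => sampleMetrics[metric]
);
const buildShouldFail = results.some((r) => !r.passed); // false for the sample values
```

Failing on missing data is a deliberate choice here: a fitness test that passes when its metric disappears protects nothing.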
Decision Frameworks
The Architecture Decision Framework
┌─────────────────────────────────────────────────────────┐
│             ARCHITECTURE DECISION FRAMEWORK             │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  1. CONTEXT                                             │
│     • What problem are we solving?                      │
│     • What are the constraints?                         │
│     • What are the requirements?                        │
│                                                         │
│  2. OPTIONS                                             │
│     • What are the alternatives?                        │
│     • What are the trade-offs of each?                  │
│                                                         │
│  3. DECISION                                            │
│     • Which option do we choose?                        │
│     • Why this option over others?                      │
│                                                         │
│  4. CONSEQUENCES                                        │
│     • What are the positive outcomes?                   │
│     • What are the negative outcomes?                   │
│     • What risks are we accepting?                      │
│                                                         │
│  5. VALIDATION                                          │
│     • How will we know if it's working?                 │
│     • What fitness functions apply?                     │
│                                                         │
└─────────────────────────────────────────────────────────┘
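One way to keep all five steps honest is to capture them in a structured record. The sketch below models a decision as a TypeScript type and flags steps that were left blank. The `DecisionRecord` fields, the `missingSections` helper, and the example decision are illustrative assumptions, not a standard ADR schema.

```typescript
interface DecisionOption {
  name: string;
  tradeOffs: string[];
}

// One field per step of the framework above.
interface DecisionRecord {
  title: string;
  context: string;           // 1. problem, constraints, requirements
  options: DecisionOption[]; // 2. alternatives and their trade-offs
  decision: string;          // 3. chosen option and why
  consequences: string[];    // 4. outcomes and accepted risks
  validation: string[];      // 5. fitness functions or other checks
}

// Returns the names of the framework steps that are still empty.
function missingSections(record: DecisionRecord): string[] {
  const missing: string[] = [];
  if (record.context.trim() === "") missing.push("context");
  if (record.options.length < 2) missing.push("options"); // a decision needs alternatives
  if (record.decision.trim() === "") missing.push("decision");
  if (record.consequences.length === 0) missing.push("consequences");
  if (record.validation.length === 0) missing.push("validation");
  return missing;
}

const draft: DecisionRecord = {
  title: "Cache product catalog reads",
  context: "Catalog page P99 latency exceeds 200ms under peak load.",
  options: [
    { name: "In-process cache", tradeOffs: ["Fast", "Stale data across instances"] },
    { name: "Shared cache", tradeOffs: ["Consistent across instances", "Extra network hop"] },
  ],
  decision: "Shared cache, because stale prices are unacceptable.",
  consequences: ["New operational dependency"],
  validation: [],
};

const gaps = missingSections(draft); // ["validation"]
```

The draft above has a decision but no way to tell whether it worked, which is the gap step 5 exists to close.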
The "Last Responsible Moment"
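Don't make decisions too early, and don't wait until they are made for you. Decide at the last responsible moment, the latest point at which you still have good options: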
┌─────────────────────────────────────────────────────────┐
│                                                         │
│   Too Early        Last Responsible         Too Late    │
│       │                 Moment                  │       │
│       │                    │                    │       │
│       ▼                    ▼                    ▼       │
│ ──────●────────────────────●────────────────────●────── │
│       │                    │                    │       │
│ Insufficient            Optimal              Forced     │
│  Information            Decision             Decision    │
│                                                         │
│ Risk: Wrong decision  Benefit: Informed   Risk: No good │
│ based on assumptions  decision with       options       │
│                       options                           │
│                                                         │
└─────────────────────────────────────────────────────────┘
Reversibility Spectrum
Consider how hard decisions are to reverse:
| Reversibility | Examples | Approach |
|---|---|---|
| Easy | Library choice, coding style | Decide quickly, change if needed |
| Medium | Database schema, API design | Think carefully, plan for change |
| Hard | Programming language, cloud provider | Extensive analysis, pilot first |
| Very Hard | Monolith vs microservices, data model | Maximum analysis, accept trade-offs |
Thinking in Trade-offs
The Trade-off Matrix
For any decision, map options against characteristics. The ratings below are illustrative; rescore them for your own context:
| Option | Performance | Simplicity | Scalability | Maintainability |
|---|---|---|---|---|
| Monolith | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐ |
| Microservices | ⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| Modular Monolith | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
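A star matrix only becomes a decision once you weight the columns by what matters for your system. The sketch below scores the three options using the star counts from the table and two sets of weights. The weights and the `score` and `rank` helpers are assumptions for illustration; different weights will produce a different ranking.

```typescript
type Characteristic = "performance" | "simplicity" | "scalability" | "maintainability";
type Ratings = Record<Characteristic, number>;

// Star counts copied from the table above (1 to 5).
const matrix: Record<string, Ratings> = {
  "Monolith": { performance: 3, simplicity: 5, scalability: 2, maintainability: 3 },
  "Microservices": { performance: 4, simplicity: 2, scalability: 5, maintainability: 4 },
  "Modular Monolith": { performance: 4, simplicity: 4, scalability: 3, maintainability: 5 },
};

// Weighted sum of an option's ratings.
function score(ratings: Ratings, weights: Ratings): number {
  const keys = Object.keys(weights) as Characteristic[];
  return keys.reduce((total, key) => total + ratings[key] * weights[key], 0);
}

// Ranks option names from highest to lowest weighted score.
function rank(weights: Ratings): string[] {
  return Object.keys(matrix).sort((a, b) => score(matrix[b], weights) - score(matrix[a], weights));
}

// Hypothetical weights: an early-stage product that values simplicity most.
const mvpWeights: Ratings = { performance: 1, simplicity: 3, scalability: 1, maintainability: 2 };
// Hypothetical weights: a high-traffic system that values scalability most.
const scaleWeights: Ratings = { performance: 2, simplicity: 1, scalability: 3, maintainability: 1 };

const mvpRanking = rank(mvpWeights);     // Modular Monolith, Monolith, Microservices
const scaleRanking = rank(scaleWeights); // Microservices, Modular Monolith, Monolith
```

The table did not change between the two rankings, only the priorities did. That is the practical meaning of "it depends on context".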
Questions to Ask
For every architectural decision:
- What problem does this solve?
- What problems does this create?
- What are we optimizing for?
- What are we sacrificing?
- How will we know if it's working?
- How hard is this to change later?
Common Thinking Traps
Trap 1: Resume-Driven Development
❌ "Let's use Kubernetes because it looks good on my resume"
✅ "Let's evaluate if Kubernetes solves our actual problems"
Trap 2: Cargo Culting
❌ "Netflix uses microservices, so we should too"
✅ "Let's understand why Netflix chose microservices and if our context is similar"
Trap 3: Analysis Paralysis
❌ "We can't decide until we have perfect information"
✅ "Let's make the best decision we can with current information and plan to adapt"
Trap 4: Silver Bullet Thinking
❌ "Event sourcing will solve all our problems"
✅ "Event sourcing solves X but introduces Y - is that trade-off worth it?"
Staff+ Interview Questions
Q: How do you approach a new architectural decision?
A: I follow a structured process:
- Understand context - What problem? What constraints?
- Identify characteristics - What -ilities matter most?
- Generate options - At least 3 alternatives
- Analyze trade-offs - Pros/cons of each
- Make decision - Document in ADR
- Define fitness functions - How we'll measure success
Q: How do you handle disagreements about architecture?
A: I focus on:
- Shared understanding - Do we agree on the problem?
- Explicit trade-offs - Make pros/cons visible
- Data over opinions - Prototypes, benchmarks, research
- Reversibility - Can we try and change if wrong?
- Document decision - ADR captures context for future
Q: What's the most important architectural characteristic?
A: It depends on context - that's the key insight. For a banking system, security and reliability. For a startup, time-to-market and simplicity. For a high-traffic site, scalability and performance. The skill is identifying which characteristics matter for your specific situation.
Quick Reference Card
┌─────────────────────────────────────────────────────────┐
│         ARCHITECTURAL THINKING QUICK REFERENCE          │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  FIRST LAW                                              │
│  Everything is a trade-off                              │
│                                                         │
│  KEY QUESTIONS                                          │
│  • What problem does this solve?                        │
│  • What problems does this create?                      │
│  • What are we optimizing for?                          │
│  • How hard is this to reverse?                         │
│                                                         │
│  DECISION PROCESS                                       │
│  1. Context → 2. Options → 3. Decision →                │
│  4. Consequences → 5. Validation                        │
│                                                         │
│  FITNESS FUNCTIONS                                      │
│  Objective measures of architectural success            │
│                                                         │
│  AVOID                                                  │
│  • Resume-driven development                            │
│  • Cargo culting                                        │
│  • Analysis paralysis                                   │
│  • Silver bullet thinking                               │
│                                                         │
└─────────────────────────────────────────────────────────┘
Next Steps
- Layered Architecture - The classic pattern
- ADRs - Documenting architectural decisions