Software and Security Engineering

Part IA Easter notes from the provided Software Engineering L1-L6 slides, with the course-outline security and safety topics integrated · focus: large systems, critical failures, engineering process, assurance, delivery, evolution, and responsible use of AI

Course Map

This course is about the gap between writing a program and engineering a system. Small programming exercises are judged by whether they work now. Real systems are judged by whether teams can understand, change, test, deploy, secure, and operate them for years while users and environments change.

Main Software Engineering Arc

  1. Why software fails: complexity, integration, socio-technical context, and poor process.
  2. Requirements: turning vague user/business needs into testable specifications.
  3. Process: waterfall, spiral, agile, XP, Scrum, Kanban, and scaling.
  4. Delivery: version control, review, testing, CI/CD, cloud, containers, release strategies.
  5. Evolution: maintenance, APIs, refactoring, observability, SRE, sustainability.
  6. AI era: using AI tools while retaining engineering responsibility.

Security / Critical Systems Arc

  • Safety cases and security policies specify what must never happen.
  • Assurance depends on architecture, human procedures, verification, testing, auditability, and management.
  • Threats include capable attackers, accidental harm, fraud, unsafe usability, protocol attacks, implementation bugs, and supply-chain risk.
  • Critical failures are often system failures, not just isolated bugs.

Exam answers should keep connecting technical details back to economics and organisations: failures happen at boundaries between teams, components, assumptions, users, procedures, and incentives.

The Software Crisis

The 1968 NATO conference popularised the term software engineering because software projects were routinely late, over budget, unreliable, and difficult to maintain. The central problem is that software complexity scales faster than human ability to understand it.

Programming | Software Engineering
Single developer or very small team. | Many developers working over long periods.
Small codebase and short lifespan. | Large codebase with years or decades of maintenance.
Main question: does it work today? | Main questions: is it maintainable, secure, extensible, testable, and operable?
Local reasoning is often enough. | Failure often emerges from component interactions and organisational interfaces.

Why Software Is Hard To Predict

  • State explosion: the number of discrete states a program can reach is enormous, so exhaustive testing is normally impossible.
  • Interdependence: adding people to a late project can make it later (Brooks's law) because communication and training overheads grow.
  • Emergence: a bug may live in the interaction between correct-looking components.
  • Change: APIs, libraries, operating systems, protocols, hardware, user needs, and regulations move underneath old code.
  • Maintenance economics: initial development is a minority of lifecycle cost; most cost is reading, modifying, debugging, and operating the system.

Failure Case Studies

Case | Failure Mode | Engineering Lesson
Therac-25 | Safety depended on software after hardware interlocks were removed; a race condition and dismissible cryptic errors contributed to radiation overdoses. | Use defence in depth, actionable errors, concurrency discipline, and independent safety mechanisms.
Mars Climate Orbiter | One team produced thruster data in imperial units while another expected metric units. | Interface specifications, type safety, and end-to-end integration tests matter most at team boundaries.
London Ambulance Service CAD | Big-bang deployment, underbidding, poor training, hostile users, radio/data errors, memory leak, and overload collapsed dispatch. | Critical systems are socio-technical; use phased rollout, load testing, user involvement, and fallback plans.
Post Office Horizon | Accounting bugs and institutional belief in software infallibility led to wrongful accusations and prosecutions. | Financial/legal systems need auditability, humility about bugs, and escalation paths for technical truth.
Boeing 737 MAX MCAS | Software compensated for hardware/aerodynamic change, relied on a single sensor, and was hidden from pilots. | Avoid single points of failure in safety-critical control; operators need correct mental models.
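
The type-safety lesson from the Mars Climate Orbiter case can be illustrated with unit-carrying types. This is a minimal sketch, not the mission's actual software: the class names `NewtonSeconds` and `PoundForceSeconds` and the function `apply_impulse` are hypothetical. The idea is that a value in the wrong unit is rejected at the interface instead of being silently treated as a plain number.

```python
from dataclasses import dataclass

# 1 pound-force second expressed in newton seconds.
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class NewtonSeconds:
    """Impulse in SI units (hypothetical wrapper type)."""
    value: float


@dataclass(frozen=True)
class PoundForceSeconds:
    """Impulse in imperial units (hypothetical wrapper type)."""
    value: float

    def to_si(self) -> NewtonSeconds:
        # Conversion is explicit, so it is visible in code review.
        return NewtonSeconds(self.value * LBF_S_TO_N_S)


def apply_impulse(total: NewtonSeconds, delta: NewtonSeconds) -> NewtonSeconds:
    """Accumulate impulse; refuses anything that is not explicitly SI."""
    if not isinstance(total, NewtonSeconds) or not isinstance(delta, NewtonSeconds):
        raise TypeError("impulse must be given as NewtonSeconds")
    return NewtonSeconds(total.value + delta.value)
```

Calling `apply_impulse(NewtonSeconds(10.0), PoundForceSeconds(1.0))` raises `TypeError`, while passing `PoundForceSeconds(1.0).to_si()` is accepted. A static type checker could catch the same mistake before the code runs; the runtime check here is only a fallback.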

Technical Debt

Technical debt is the cost of choosing a quick or weak design now and paying extra effort later. The principal is the eventual cleanup; the interest is every future change slowed by the poor structure.

  • Intentional debt can be a rational business choice if recorded and repaid.
  • Unintentional debt arises from poor understanding or skill and is harder to manage.
  • Technical bankruptcy happens when interest consumes nearly all engineering time and a rewrite becomes tempting.
  • Refactoring pays debt down by restructuring without changing externally visible behaviour; automated tests are needed to check that behaviour really is unchanged.

Professional responsibility means pushing back when deadlines compromise safety, security, or core quality. Engineers cannot treat management pressure as a complete excuse for predictable failure.

Requirements And Specifications

Requirements engineering bridges the translation gap between stakeholders, who usually know the business problem, and engineers, who know implementation possibilities. The goal is to separate the problem space from the solution space.

Problem Space

The why: business value, user needs, market reality, safety constraints, legal duties. Example: warehouse packing is too slow and causes late shipments.

Solution Space

The what/how: product features, technology choices, architecture, interfaces, and engineering constraints. Example: tablet barcode app with route optimisation.

Good Requirements Are Testable

Natural language is ambiguous, so vague requirements are dangerous. "Search should be fast" is not testable. "Return results for 1 million records in under 500 ms for 95% of queries under normal load" is testable and provides a clear boundary for done.
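
A testable requirement like the one above can be turned directly into an automated check. The following is a rough sketch under stated assumptions: the function names are hypothetical, the latency samples are assumed to have been collected elsewhere under normal load, and the nearest-rank definition of a percentile is one choice among several.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def meets_latency_requirement(latencies_ms, limit_ms=500, pct=95):
    """True when pct percent of queries finished in under limit_ms."""
    return percentile(latencies_ms, pct) < limit_ms
```

With 100 samples, the check passes if at most 5 of them are at or above 500 ms and fails at 6. The "1 million records" and "normal load" parts of the requirement still have to be pinned down by the test environment; this code only checks the latency bound.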

Type | Meaning | Examples
Functional requirement | A specific action or behaviour the system must perform. | Add item to basket, calculate VAT, send confirmation email.
Non-functional requirement | A quality, constraint, or performance bound on system operation. | Latency, throughput, reliability, maintainability, portability, usability, security, accessibility.
Fit criterion | A measurement proving an NFR has been met. | 99.99% monthly uptime; WCAG 2.1 AA; 10,000 concurrent connections.

Security may be expressed as a non-functional requirement (NFR) but often generates functional requirements (FRs). For example, GDPR compliance may require account deletion, export, audit logging, consent capture, and data retention behaviour.

Models And UML

Models communicate system structure without showing every line of code. Modern UML is usually used as a sketch, not a complete blueprint.

Diagram | Shows | Use When
Class diagram | Static object structure: classes, attributes, methods, association, inheritance, aggregation, composition. | You need to explain entities, ownership, or domain structure.
Sequence diagram | Message flow over time between actors/services/components. | You need to explain protocols, login flows, API calls, async behaviour, or distributed interactions.
State machine diagram | Allowed states and transitions. | You need to explain lifecycle rules such as cart states, workflow stages, or safety modes.

A useful model hides the right amount of detail. If a diagram fits on a whiteboard and the team understands the idea, it is often good enough.
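
The cart lifecycle mentioned for state machine diagrams can also be written down as a transition table in code. This is a minimal sketch; the states and the allowed transitions are hypothetical and chosen only for illustration.

```python
from enum import Enum


class CartState(Enum):
    EMPTY = "empty"
    ACTIVE = "active"
    CHECKED_OUT = "checked_out"
    ABANDONED = "abandoned"


# Each state maps to the set of states it may move to (assumed rules).
ALLOWED = {
    CartState.EMPTY: {CartState.ACTIVE},
    CartState.ACTIVE: {CartState.EMPTY, CartState.CHECKED_OUT, CartState.ABANDONED},
    CartState.CHECKED_OUT: set(),  # terminal state
    CartState.ABANDONED: {CartState.ACTIVE},
}


def transition(current: CartState, target: CartState) -> CartState:
    """Return the new state, or raise if the diagram has no such arrow."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```

Keeping the rules in one table means the diagram and the code can be compared line by line, and an illegal move such as `CHECKED_OUT -> ACTIVE` fails loudly instead of corrupting state.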

Roles And Prioritisation

  • Product Manager: owns the why and what: user research, market trends, business value, requirements, and prioritisation.
  • Engineering Manager: owns execution and team health: staffing, architecture oversight, delivery, and developer growth.
  • Product Requirements Document: living source of truth with background, business goals, personas, FRs, NFRs, and constraints.
  • Traceability: connects requirement origin to code and tests, e.g. requirement -> pull request -> integration test. Essential in regulated domains.

MoSCoW prioritisation separates Must, Should, Could, and Won't requirements so the team avoids building based only on who shouts loudest.

Process Models

A software process is a structured set of activities for specification, design and implementation, validation, and evolution. Ad-hoc code-and-fix works for tiny scripts but breaks when many people build a large system.

Model | Idea | Strength | Failure Mode
Waterfall | Strict sequential phases: requirements, design, implementation, integration/testing, operation. | Clear milestones and accountability; plausible for stable, regulated, or hardware-coupled projects. | Assumes requirements can be frozen; discovers integration and requirement mistakes late.
Spiral | Risk-driven loops: objectives, risk assessment/prototypes, development, planning. | Finds high-risk unknowns before full commitment. | Can be management-heavy; needs skill to identify real risks.
Iterative / incremental | Build pieces and refine them repeatedly. | Delivers value earlier and incorporates feedback. | Needs modular architecture and disciplined integration.
Agile | Adapt through short cycles, working software, customer collaboration, and team interaction. | Handles changing requirements and fast feedback. | Fake agile skips discipline while keeping the vocabulary.

Agile, XP, And Scrum

The Agile Manifesto values individuals and interactions over processes and tools, working software over comprehensive documentation, customer collaboration over contract negotiation, and responding to change over following a plan. This does not mean no documentation, no planning, or chaos; agile teams plan frequently and need strong engineering discipline.

Extreme Programming

  • Pair programming: driver writes, navigator reviews and thinks ahead.
  • Test-driven development: write failing test, write minimal code, refactor.
  • Continuous integration: merge and test frequently to avoid integration surprises.

Scrum

Concept | Meaning
Pillars | Transparency, inspection, adaptation.
Product backlog | Ordered evolving list of everything known to be needed.
User story | "As a [user type], I want [an action], so that [value/reason]."
Acceptance criteria | Specific conditions a story must satisfy.
Definition of Done | Team-wide checklist: review, tests, docs, deployability, etc.
Sprint | Fixed timebox, often two weeks, producing a usable increment.
Daily Scrum | Short engineering synchronisation, not a management status report.
Review / retrospective | Inspect product with stakeholders; inspect and improve the process.

Story points measure relative complexity, effort, and risk, not literal hours. Planning poker exposes hidden assumptions when estimates differ widely.

Kanban And Scaling

Kanban uses continuous flow rather than timeboxed sprints. Work moves across a board such as To Do -> In Progress -> Review -> Done, with work-in-progress limits to reveal bottlenecks.

Scaling agile beyond one team introduces coordination roles and frameworks. Technical Program Managers coordinate cross-team dependencies. SAFe, LeSS, and squad/guild/tribe structures are attempts to preserve flow while many teams work on related systems.

Tooling And Coding Standards

Fast teams need technical coordination. Version control, issue tracking, review, and standards are not bureaucracy by default; they are tools for preserving history, sharing context, and preventing local changes from damaging the whole system.

Tool / Practice | Purpose
Git commits | Snapshots with hashes and messages explaining why a change happened.
Branches | Isolate feature work; merge back into main when reviewed and tested.
Pull requests | Code review, discussion, knowledge sharing, style consistency, and bug detection.
Issue trackers | Bug lifecycle and traceability from problem to code change.
Coding standards | Reduce cognitive load; automated linters remove style arguments from review.

Trunk-based development keeps changes small and integrates quickly, but requires strong automated tests. Long-lived branch models can support traditional releases but risk painful late merges.

Testing And Quality Assurance

Tests provide confidence, executable documentation, safe refactoring, and design feedback. Code that is hard to test is often too tightly coupled or too complex.

Layer | Role | Tradeoff
Unit tests | Test small functions/classes in isolation, often with mocks/stubs. | Fast and precise but may miss integration failures.
Integration tests | Test real interactions between components such as service + database. | More realistic but slower and harder to isolate.
End-to-end tests | Exercise user flows through the whole stack. | High confidence but slow, expensive, and brittle.
Manual exploratory tests | Human intuition catches UX and visual problems. | Slow, hard to repeat, and does not scale.

  • Coverage shows what code ran during tests; it is not the same as quality.
  • Mutation testing changes code slightly and checks whether tests fail; surviving mutants reveal weak tests.
  • TDD uses red-green-refactor to force a testable design and preserve behaviour during cleanup.
  • Fuzzing feeds malformed or random inputs to discover crashes and edge cases.
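
The point about coverage versus mutation testing can be shown by hand. In this sketch the mutant is written manually (real mutation tools generate mutants automatically), and all function names are hypothetical. Both test suites execute every line of `is_adult`, so both give full line coverage, but only one of them notices the mutation.

```python
def is_adult(age):
    """Original code under test."""
    return age >= 18


def is_adult_mutant(age):
    """Hand-written mutant: the boundary operator >= was changed to >."""
    return age > 18


def weak_suite(fn):
    """Checks only values far from the boundary."""
    return fn(30) is True and fn(5) is False


def strong_suite(fn):
    """Adds the boundary cases that the weak suite skipped."""
    return weak_suite(fn) and fn(18) is True and fn(17) is False


def mutant_killed(suite):
    """A mutant is killed when the suite passes on the original code
    but fails on the mutated code."""
    return suite(is_adult) and not suite(is_adult_mutant)
```

Here `mutant_killed(weak_suite)` is `False` (the mutant survives) and `mutant_killed(strong_suite)` is `True`. The surviving mutant points at the missing boundary test.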

CI/CD And Release Engineering

DevOps removes the wall between development and operations: if a team builds a service, it should understand and support it in production.

Term | Meaning
Continuous Integration | Merge to main frequently; every push triggers build, tests, linting, and scanning.
Continuous Delivery | Code is always deployable; production release is a business decision.
Continuous Deployment | Every passing change goes live automatically.
Quality gate | Pipeline step that must pass before promotion to the next environment.

Release strategies reduce risk. Blue-green deployment switches traffic between two equivalent environments. Canary deployment exposes a small percentage of users first. Feature flags decouple deployment from release and provide an emergency shutoff. A/B testing measures product value, while canaries measure technical health.
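
A percentage rollout behind a feature flag might be sketched as follows. This is one possible approach, not a description of any particular flag service: the function names, the flag name, and the hashing scheme are assumptions. Hashing the user ID gives each user a stable bucket, so a user who sees the new code at 5% still sees it at 50%, and the kill switch acts as the emergency shutoff.

```python
import hashlib


def rollout_bucket(user_id: str, flag_name: str) -> int:
    """Map a user to a stable bucket in the range 0..99 for a given flag."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(user_id: str, flag_name: str, percent: int,
               kill_switch: bool = False) -> bool:
    """Decide whether this user gets the new behaviour.

    percent is the share of users exposed (0 to 100); kill_switch
    turns the feature off for everyone regardless of percent.
    """
    if kill_switch:
        return False
    return rollout_bucket(user_id, flag_name) < percent
```

The code is already deployed for everyone; raising `percent` is what releases it. Including the flag name in the hash means different flags expose different subsets of users.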

Cloud Systems

Model | What You Get | Tradeoff
IaaS | Virtual machines, storage, and networks. | Maximum control but high management overhead.
PaaS | Platform handles OS/runtime/scaling; you upload code. | Fast delivery but less control and possible vendor lock-in.
SaaS | Complete application delivered over the web. | Instant value but little control over features, data location, or internals.

The shared responsibility model matters: providers secure the cloud infrastructure; customers secure their application, data, identities, network configuration, and operational practices.

Containers package code and dependencies into reproducible images. Kubernetes orchestrates many containers with scheduling, load balancing, self-healing, and autoscaling. Observability uses metrics, logs, traces, and dashboards to judge whether deployments are healthy.

Software Evolution

Software is never finished. Lehman's laws say evolutionary systems must keep changing or become less useful, and that complexity increases unless work is done to reduce it.

Maintenance Type | Meaning
Corrective | Fix bugs found in use.
Adaptive | Update for new environments, platforms, regulations, or dependencies.
Perfective | Improve performance, maintainability, readability, or usability.
Preventive | Fix latent problems before they become failures.

Legacy code is not just old code. A practical definition is code without tests or code the team is afraid to change. Software archaeology uses source code, `git blame`, `git log`, issue trackers, and commit links to recover why the system is the way it is.

API Design And Versioning

APIs are contracts. Once others depend on an API, changing it becomes expensive because users may depend on documented and undocumented observable behaviour.

SemVer Part | Change Meaning
MAJOR | Incompatible (breaking) API changes.
MINOR | Backwards-compatible new functionality.
PATCH | Backwards-compatible bug fix.

  • Backward compatibility: new code handles old data/requests.
  • Forward compatibility: old code gracefully handles newer data, often by ignoring unknown fields.
  • Lockfiles: record exact dependency versions for reproducible builds.
  • Deprecation cycle: announce, mark deprecated, wait, then remove in a major version.
  • Hyrum's Law: with enough users, someone depends on every observable behaviour.
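
Two of the ideas above, the SemVer compatibility rule and forward compatibility by ignoring unknown fields, can be sketched in a few lines. This is a simplified illustration: the function names and the `id`/`name` fields are hypothetical, and the version parser ignores pre-release tags, build metadata, and the special treatment of 0.x versions in the SemVer specification.

```python
def parse_version(text):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints (simplified)."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def is_compatible_upgrade(current, candidate):
    """True when candidate has the same MAJOR and is not older than current."""
    cur, cand = parse_version(current), parse_version(candidate)
    return cand[0] == cur[0] and cand >= cur


KNOWN_FIELDS = {"id", "name"}  # fields this (old) reader understands


def read_user(payload):
    """Old reader: keep known fields, ignore fields added by newer writers."""
    missing = KNOWN_FIELDS - payload.keys()
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    return {key: payload[key] for key in KNOWN_FIELDS}
```

For example, `is_compatible_upgrade("1.4.2", "1.5.0")` is `True` and `is_compatible_upgrade("1.4.2", "2.0.0")` is `False`. `read_user` accepts a payload with an extra `nickname` field and drops it, which is what lets a newer writer add fields in a MINOR release without breaking old readers.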

Refactoring And Remediation

Refactoring is disciplined restructuring without behaviour change. It is safest when supported by tests.

Code Smell | Signal
Long method | Too much behaviour in one place; extract smaller named operations.
God object | One class knows or does too much; split responsibilities.
Feature envy | A method is more interested in another class's data than its own.
Data clumps | Groups of values always travel together and may deserve their own object.

Characterisation tests record current behaviour before refactoring unknown legacy code. They test consistency, not correctness. The strangler pattern replaces a legacy monolith gradually by routing some calls to new code while leaving the rest on the old system until migration is complete.
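
A characterisation test can be sketched as "record what the code does today, then demand the same after refactoring". Everything here is hypothetical: `legacy_shipping_cost` stands in for undocumented legacy code, and the input cases are chosen to hit each branch and boundary.

```python
def legacy_shipping_cost(weight_kg, express):
    # Stand-in for legacy code whose intent is undocumented.
    cost = 5
    if weight_kg > 10:
        cost = cost + (weight_kg - 10) * 2
    if express:
        cost = cost * 2 - 1
    return cost


def characterise(fn, cases):
    """Record what fn currently returns for each input tuple."""
    return {case: fn(*case) for case in cases}


CASES = [(1, False), (10, False), (11, False), (25, True), (0, True)]

# Golden record captured from the legacy code BEFORE any refactoring.
GOLDEN = characterise(legacy_shipping_cost, CASES)


def refactored_shipping_cost(weight_kg, express):
    """Restructured version that must reproduce the recorded behaviour."""
    base = 5 + max(weight_kg - 10, 0) * 2
    return base * 2 - 1 if express else base


def behaviour_preserved(fn):
    """True when fn matches the golden record on every recorded case."""
    return characterise(fn, CASES) == GOLDEN
```

The odd-looking `* 2 - 1` for express orders may well be a bug, but the golden record deliberately preserves it. Whether it is correct is a separate question to settle after the code is safe to change.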

Observability And SRE

Monitoring says something is wrong; observability helps explain why. The telemetry triad is logs, metrics, and traces.

Golden Signal | Meaning
Latency | Time to service a request.
Traffic | Demand placed on the system.
Errors | Rate or count of failed requests.
Saturation | How full a resource is, such as CPU, memory, disk, queue, or connection pool.

SRE defines service level indicators and objectives. The error budget is the permitted unreliability: if budget remains, teams can take delivery risk; if it is exhausted, stability work takes priority. Blameless post-mortems focus on why the system permitted a failure and what automated safeguards should be added.
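
The error budget arithmetic is simple enough to sketch. The function names and the 30-day window are assumptions made for illustration; a real policy would define its own window and what counts as downtime.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Permitted downtime in minutes for an availability SLO over the window.

    slo is a fraction, e.g. 0.999 for "99.9% available".
    """
    if not 0 < slo <= 1:
        raise ValueError("slo must be in the range (0, 1]")
    return (1 - slo) * window_days * 24 * 60


def budget_remaining(slo: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Minutes of budget left after the downtime already incurred."""
    return error_budget_minutes(slo, window_days) - downtime_minutes


def may_ship_risky_change(slo: float, downtime_minutes: float,
                          window_days: int = 30) -> bool:
    """Policy sketch: risky releases only while budget remains."""
    return budget_remaining(slo, downtime_minutes, window_days) > 0
```

A 99.9% SLO over 30 days allows roughly 43.2 minutes of downtime (0.1% of 43,200 minutes). After 10 minutes of downtime the team can still take delivery risk; after 50 minutes the budget is exhausted and stability work takes priority.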

Security And Safety Engineering

The course outline extends software engineering into systems that must withstand attack or avoid harm. The common pattern is assurance: define a policy or safety claim, identify how it can fail, design controls, and gather evidence that controls work.

Concept | How To Think About It
Security policy | Rules about allowed information flow, access, authority, and operations.
Safety case | Structured argument, backed by evidence, that a system is acceptably safe in a defined context.
One-way flow | A confidentiality policy prevents information flowing from high secrecy to low secrecy; a safety policy may require that control signals or hazards do not propagate the wrong way.
Separation of duties | No single actor can complete a sensitive action alone; reduces fraud and error.
Policy vs. mechanism | Policy states what must be true; mechanism is how the system enforces it. Decoupling improves clarity and changeability.
Top-down analysis | Start from unacceptable losses or policy goals and derive hazards/threats and controls.
Bottom-up analysis | Start from component failures or vulnerabilities and analyse consequences.
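
Separation of duties is a policy; the sketch below shows one possible mechanism that enforces it for a payment workflow. The `Payment` class, its methods, and the audit log format are all hypothetical.

```python
class Payment:
    """A payment that needs a requester and a different approver."""

    def __init__(self, amount, requested_by):
        self.amount = amount
        self.requested_by = requested_by
        self.approved_by = None
        # Append-only record of who did what, for later audit.
        self.audit_log = [f"requested by {requested_by}"]

    def approve(self, approver):
        # Policy: the requester may not approve their own payment.
        if approver == self.requested_by:
            raise PermissionError("requester cannot approve their own payment")
        self.approved_by = approver
        self.audit_log.append(f"approved by {approver}")

    def release(self):
        # Policy: nothing is released without an independent approval.
        if self.approved_by is None:
            raise PermissionError("payment needs an independent approver")
        self.audit_log.append("released")
        return f"released {self.amount}"
```

The policy ("no single actor completes the action alone") is stated in two comments and enforced in two checks, so it can be changed, for example to require two approvers above some amount, without touching the rest of the workflow. In a real system the actor identities would come from authentication, not from caller-supplied strings.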

Security And Safety Usability

Users are part of the system. If a secure or safe workflow is unusable, people create workarounds: shared passwords, skipped checks, mis-entered statuses, ignored alarms, or unsafe defaults. Good engineering predicts likely errors and designs them out where possible.

  • Make the safe action the easy/default action.
  • Use affordances, warnings, confirmations, and constraints where users are likely to err.
  • Make error messages actionable rather than cryptic.
  • In accounting, preserve audit trails and require independent checks for high-risk operations.
  • In medical devices, treat user interface errors as safety hazards, not merely UX defects.

Security Protocols And Bugs

Security protocols enforce policy through structured human interaction, cryptography, or both. Middleperson attacks exploit the gap between intended parties and the actual communication path, so authentication, key binding, certificate validation, and usable warnings are central.

Implementation bugs can defeat correct designs. Important classes include syntactic mistakes, timing bugs, concurrency bugs, injection bugs, buffer overflows, side channels, and memory safety errors. Defensive programming uses secure coding standards, contracts, input validation, least privilege, fuzzing, code review, and dependency patching.
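
An injection bug and its standard fix can be demonstrated with Python's built-in `sqlite3` module. The table, data, and function names are hypothetical; the contrast is between building SQL by string formatting and passing user input as a bound parameter.

```python
import sqlite3


def make_db():
    """Create a small in-memory database for the demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def find_secrets_unsafe(conn, name):
    # Vulnerable: attacker-controlled text becomes part of the SQL statement.
    query = f"SELECT secret FROM users WHERE name = '{name}'"
    return [row[0] for row in conn.execute(query)]


def find_secrets_safe(conn, name):
    # Parameterised: name is passed as data and is never parsed as SQL.
    query = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]
```

With the input `x' OR '1'='1`, the unsafe version returns every user's secret because the quote characters change the structure of the query, while the safe version returns an empty list because no user has that literal name.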

Verification Limits

Verification can prove properties of a model or implementation under assumptions, but it does not prove the assumptions are right. It can miss wrong requirements, misunderstood users, bad deployment, compromised dependencies, hardware faults, misleading interfaces, and organisational pressure.

Software Engineering In The AI Era

AI changes the tools, not the goals. Reliability, maintainability, security, and user value still matter. The engineer remains responsible for every line committed.

AI Use | Value | Risk
Code generation | Boilerplate, prototypes, language translation, routine APIs. | Happy-path code, hallucinated libraries, insecure old patterns, unreadable debt.
Test generation | Edge cases, mocks, fuzz inputs, fast coverage growth. | Tests may encode what code does, not what it should do.
Debugging/log analysis | Correlates stack traces, logs, and symptoms quickly. | Can infer wrong root causes if context is incomplete.
Code review/refactoring | Finds architectural smells and performs mechanical cleanup. | May violate project architecture unless reviewed by humans.
Workflow agents | Can navigate files, run tests, and implement multi-step changes. | Autonomy increases blast radius; needs guardrails, review, and reproducible checks.
Building AI systems | New product capabilities. | Probabilistic outputs, latency, token cost, provider lock-in, eval difficulty, guardrail design.

The durable skillset moves toward domain knowledge, system design, communication, verification, and judgement. Knowing syntax is less valuable than knowing what should be built, how parts fit together, and how to prove the result is good enough.

Exam Use

When answering, avoid listing tools in isolation. Explain which failure mode the tool addresses, what assumption it relies on, and where it can still fail.

If Asked About... | Say...
Large project failure | Discuss requirements ambiguity, team boundaries, integration testing, deployment strategy, user acceptance, management incentives, and fallback.
Waterfall vs Agile | Waterfall gives control and documentation when requirements are stable; agile gives feedback and adaptability but needs discipline and technical practices.
Testing | Use the pyramid; coverage is not quality; mutation/fuzzing find weak tests and edge cases; automated tests enable refactoring.
Security/safety | Start with policy or safety claim, identify threats/hazards, choose mechanisms, consider users, and collect evidence.
DevOps/SRE | Use CI/CD, observability, progressive delivery, error budgets, and blameless post-mortems to reduce delivery and operational risk.
Legacy evolution | Use archaeology, tests, semantic versioning, deprecation, refactoring, and gradual replacement rather than big-bang rewrites.
AI in engineering | AI accelerates coding and analysis but does not remove ownership, testing, security review, or architectural judgement.

Key Phrases Worth Remembering

  • Software engineering is management of complexity over time.
  • Most failures are socio-technical, not just technical.
  • Late requirement fixes are disproportionately expensive.
  • Tests are executable specifications and refactoring insurance.
  • Deployment and release are different when feature flags are used.
  • Security and safety are emergent system properties.
  • Auditability and traceability matter when software affects money, law, medicine, or life.