SE 101: Introduction to Methods of Software Engineering

Paul Ward

Estimated study time: 23 minutes

Sources and References

  • Sommerville, I. Software Engineering, 10th ed. Pearson, 2016.
  • Pressman, R. S. & Maxim, B. R. Software Engineering: A Practitioner’s Approach, 8th ed. McGraw-Hill, 2015.
  • Pfleeger, S. L. & Atlee, J. M. Software Engineering: Theory and Practice, 4th ed. Prentice Hall, 2010.
  • MIT 6.031: Elements of Software Construction. MIT OpenCourseWare.
  • CMU 17-214: Principles of Software Construction. Carnegie Mellon University.

Chapter 1: The Software Engineering Process

1.1 What Is Software Engineering?

Software engineering is the application of a systematic, disciplined, and quantifiable approach to the development, operation, and maintenance of software. The discipline emerged in the late 1960s in response to the so-called “software crisis”—a period characterised by cost overruns, schedule delays, and unreliable software systems—and formalised the idea that building software requires engineering rigour comparable to that applied in bridge or aircraft design.

Sommerville distinguishes software engineering from computer science by noting that computer science focuses on theory and foundational principles, while software engineering is concerned with practical techniques for producing reliable software on time and within budget. Software products are either generic (developed for an open market, such as a word processor) or custom (developed to a specific client’s specification).

A central insight of the field is that software quality cannot be achieved by effort at the end of a project alone. Quality must be built into every phase of development, from the initial statement of requirements through to deployment and ongoing maintenance.

1.2 Requirements Engineering

Requirements engineering is the process of establishing what services a system should provide and the constraints under which it must operate. Poorly defined requirements are consistently among the leading causes of project failure. Sommerville identifies two categories of requirement:

  • Functional requirements specify what the system shall do—the services, functions, and behaviours the system must exhibit in response to particular inputs or in particular situations.
  • Non-functional requirements specify constraints on the system or its development process, such as performance bounds, security properties, reliability targets, or platform compatibility.

The requirements engineering process itself encompasses elicitation (gathering requirements from stakeholders through interviews, workshops, and observation), analysis (resolving conflicts and feasibility concerns), specification (documenting requirements in a form usable by developers and verifiable by testers), and validation (confirming that requirements reflect what stakeholders actually need).

A requirements specification document serves as a contract between the development team and the client. IEEE Std 830 describes recommended practices for software requirements specifications. Requirements are often expressed as use cases or user stories; the choice depends on the development methodology in use.

1.3 Software Architecture and Design

Architecture describes a system’s gross structure: its major components, the relationships among them, and the principles governing their design and evolution. Architectural decisions are costly to reverse because they shape every subsequent design and implementation decision.

Common architectural styles discussed in Pressman & Maxim include:

  • Layered architecture: the system is decomposed into layers each of which provides services to the layer above and uses services from the layer below. Common in operating systems and enterprise applications.
  • Client–server architecture: a server component provides services to multiple client components that initiate requests.
  • Pipe-and-filter architecture: data flows through a sequence of processing components (filters) connected by channels (pipes). Common in compilers and signal-processing systems.
  • Repository (shared data) architecture: components interact primarily through a central data store. Useful when large amounts of data must be shared across components.
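As a minimal illustration of the pipe-and-filter style above, Python generators map naturally onto filters connected by lazy streams. The filter names here are invented for the example:

```python
def numbers(limit):
    """Source filter: emit the integers 0..limit-1."""
    yield from range(limit)

def evens(stream):
    """Filter: pass through only even values."""
    return (x for x in stream if x % 2 == 0)

def squared(stream):
    """Filter: square each value in the stream."""
    return (x * x for x in stream)

# Compose the pipeline: numbers -> evens -> squared.
pipeline = squared(evens(numbers(6)))
result = list(pipeline)  # [0, 4, 16]
```

Because each stage is lazy, data flows through the pipeline one item at a time, mirroring how pipes decouple filters in the architectural style.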

Detailed design refines the architecture into module specifications, data structure choices, and algorithm selection. Design should exhibit high cohesion (each module does one thing well) and low coupling (modules depend on each other as little as possible). These two structural properties, identified by Yourdon and Constantine in the 1970s, remain central heuristics for maintainable design. Object-oriented design principles are often summarised as SOLID: Single Responsibility, Open–Closed, Liskov Substitution, Interface Segregation, and Dependency Inversion.
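Low coupling and the Dependency Inversion principle can be sketched together: the high-level module depends on an abstraction rather than a concrete class. The class names below are invented for the example:

```python
from abc import ABC, abstractmethod

class MessageSender(ABC):
    """Abstraction that high-level code depends on (Dependency Inversion)."""
    @abstractmethod
    def send(self, text: str) -> str: ...

class EmailSender(MessageSender):
    """One concrete low-level implementation; others can be swapped in."""
    def send(self, text: str) -> str:
        return f"email: {text}"

class AlertService:
    """High-level module: coupled only to the MessageSender interface."""
    def __init__(self, sender: MessageSender):
        self.sender = sender

    def alert(self, text: str) -> str:
        return self.sender.send(text)

service = AlertService(EmailSender())
print(service.alert("disk full"))  # email: disk full
```

Replacing EmailSender with, say, an SMS implementation requires no change to AlertService, which is precisely what low coupling buys.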

1.4 Implementation

Implementation translates design artefacts into executable code in a chosen programming language. Coding standards and style guides promote consistency and readability. The choice of programming language affects productivity, performance, safety, and the availability of libraries.

Version control is a non-negotiable practice in professional software development. Git is the dominant distributed version control system. Under Git, a repository holds a complete history of changes; developers commit snapshots of their working tree, and branches allow parallel lines of development. GitFlow is a branching model that prescribes long-lived branches (main and develop) together with short-lived feature, release, and hotfix branches, providing a predictable release workflow.

Code review—the systematic inspection of source code by developers other than the author—is one of the most cost-effective defect-removal techniques known. Both formal inspections (Fagan inspections) and lightweight pull-request reviews catch defects earlier than testing alone.

1.5 Testing and Quality Assurance

Testing is the process of executing a program with the intention of finding errors. It cannot prove the absence of defects (Dijkstra’s dictum), but systematic testing significantly raises confidence in software correctness.

The test vocabulary distinguishes:

  • Unit testing: testing individual components (functions, classes, modules) in isolation from the rest of the system. Frameworks such as PyTest (for Python) automate test execution and assertion checking.
  • Integration testing: testing the interaction between components as they are assembled.
  • System testing: testing the entire integrated system against its requirements.
  • Acceptance testing: testing conducted by or on behalf of the client to determine whether requirements are satisfied.

Test-driven development (TDD) inverts the conventional order: tests are written before the code they exercise. The red–green–refactor cycle drives design toward testable, well-structured implementations.
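The red–green step can be shown in miniature: the test is written first (and would fail, since the function does not yet exist), then the minimal implementation makes it pass. The `slugify` function is a hypothetical example, not from the text:

```python
# Step 1 (red): the test is written before the implementation exists.
def test_slugify_lowercases_and_joins_with_hyphens():
    assert slugify("Hello World") == "hello-world"

# Step 2 (green): the minimal implementation that makes the test pass.
def slugify(title: str) -> str:
    return "-".join(title.lower().split())

test_slugify_lowercases_and_joins_with_hyphens()  # now passes
```

The refactor step would then improve the implementation's structure while keeping the test green.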

Quality assurance (QA) is the broader activity of ensuring that development processes consistently produce high-quality outputs. QA encompasses process standards, audits, metrics collection, and the monitoring of defect rates over time. Verification asks “are we building the product right?” while validation asks “are we building the right product?”—a distinction articulated by Boehm.

1.6 Maintenance

Maintenance begins at deployment and accounts for a substantial fraction of total software lifecycle cost—estimates range from 40% to 80% of total effort over the lifetime of a system. Sommerville classifies maintenance activities as:

  • Corrective maintenance: fixing defects found after delivery.
  • Adaptive maintenance: modifying software to accommodate changes in the operating environment (new operating systems, hardware platforms, or regulations).
  • Perfective maintenance: improving performance or maintainability without changing external behaviour.
  • Preventive maintenance: restructuring or rewriting code to forestall future problems.

Lehman’s Laws of Software Evolution describe empirical regularities observed in long-lived systems: a program that is used must be continually adapted or it becomes progressively less satisfactory; the complexity of a system increases unless active work is done to reduce it.


Chapter 2: Project Management

2.1 Scope, Planning, and Estimation

Project management in software engineering is the activity of planning, organising, monitoring, and controlling software projects to deliver the agreed product within agreed constraints of time, cost, and quality. A project that lacks management discipline frequently experiences scope creep, missed deadlines, and cost overruns.

The project scope defines what the project will and will not deliver. A Work Breakdown Structure (WBS) decomposes the project into manageable tasks. Effort estimation techniques include:

  • Expert judgment: experienced practitioners estimate based on comparable past projects.
  • Algorithmic models: models such as COCOMO II derive estimates from lines of code or function points together with cost drivers.
  • Analogy-based estimation: a new project is compared to a historical project of known cost.

Schedules are often visualised as Gantt charts (bar charts showing task durations and dependencies) or as network diagrams (CPM/PERT), which identify the critical path—the longest chain of dependent tasks that determines the minimum project duration. Brooks’ Law states that adding people to a late software project makes it later, because ramp-up time and communication overhead outweigh the added capacity in the short term.
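The critical path is the longest chain of dependent tasks, so the minimum project duration is a longest-path computation over the task graph. A minimal sketch, with task names, durations, and dependencies invented for the example:

```python
from functools import lru_cache

# Hypothetical task durations (days) and dependency edges.
duration = {"spec": 3, "design": 5, "code": 8, "test": 4, "docs": 2}
depends_on = {
    "design": ["spec"],
    "code": ["design"],
    "docs": ["design"],
    "test": ["code"],
}

@lru_cache(maxsize=None)
def finish(task: str) -> int:
    """Earliest finish time: longest path to this task plus its duration."""
    preds = depends_on.get(task, [])
    start = max((finish(p) for p in preds), default=0)
    return start + duration[task]

project_duration = max(finish(t) for t in duration)
print(project_duration)  # 20: spec(3) + design(5) + code(8) + test(4)
```

Here the critical path is spec → design → code → test; shortening "docs" would not shorten the project at all.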

2.2 Traditional Project Management

Traditional (plan-driven) project management, exemplified by the waterfall model, proceeds through a sequence of phases—requirements, design, implementation, verification, maintenance—with formal reviews and signed-off artefacts between phases. This model is appropriate when requirements are well understood and unlikely to change, and when the development team is large and geographically distributed.

The principal advantage of a plan-driven approach is predictability: scope, schedule, and cost are defined early and can be baselined. The principal weakness is inflexibility: late-phase discovery of requirement errors is extremely expensive to correct, because earlier artefacts must be reworked.

2.3 Agile Project Management

Agile methods emerged in the 1990s as a reaction to the rigidity and overhead of plan-driven approaches for projects with volatile requirements and small co-located teams. The Agile Manifesto (2001) articulates four values:

  1. Individuals and interactions over processes and tools.
  2. Working software over comprehensive documentation.
  3. Customer collaboration over contract negotiation.
  4. Responding to change over following a plan.

Agile does not advocate ignoring process, documentation, contract, or planning; rather, it reorders the priorities among these concerns.

Scrum

Scrum is the most widely adopted agile framework. Work is divided into fixed-length iterations called sprints (typically one to four weeks). Key roles include:

  • Product Owner: represents stakeholder interests, owns and prioritises the Product Backlog—the ordered list of all desired work.
  • Scrum Master: serves the team by removing impediments and coaching adherence to the Scrum process.
  • Development Team: self-organising, cross-functional group responsible for delivering a potentially shippable product increment at the end of each sprint.

Key ceremonies include Sprint Planning (selecting and committing to backlog items), the Daily Scrum (a 15-minute synchronisation meeting), the Sprint Review (demonstrating the increment to stakeholders), and the Sprint Retrospective (inspecting team process and identifying improvements).

Kanban

Kanban is a flow-based method originally derived from lean manufacturing. Work items are visualised on a Kanban board with columns representing workflow states (e.g., To Do, In Progress, Done). The key practice is limiting work in progress (WIP) to reduce multitasking and identify bottlenecks. Unlike Scrum, Kanban imposes no fixed iteration length; work is pulled through the system continuously. Lead time and cycle time are the principal performance metrics.


Chapter 3: Professionalism, Ethics, and Legal Responsibilities

3.1 The Software Engineering Profession

Software engineers occupy positions of trust in society. Software defects have caused loss of life (Therac-25 radiation overdoses), large financial losses (Knight Capital’s 2012 trading loss), and breaches of privacy (various data security incidents). This societal impact demands professional responsibility.

The ACM/IEEE-CS Software Engineering Code of Ethics identifies eight principles, the most fundamental being that software engineers shall act consistently with the public interest. The code identifies obligations to clients and employers, the profession itself, colleagues, and society at large.

Professional engineering licensing (P.Eng. in Canada, PE in the United States) imposes legal accountability on individuals who sign off on engineering work. In Ontario, the practice of professional engineering is regulated by Professional Engineers Ontario (PEO). Software engineering shares this accountability framework when software is embedded in safety-critical systems.

3.2 Ethical Frameworks in Software Engineering

Several ethical frameworks are applied to software engineering decisions:

  • Consequentialism judges actions by their outcomes. A consequentialist analysis of a privacy trade-off weighs the aggregate benefit of a feature against the aggregate harm of the resulting data exposure.
  • Deontology judges actions by conformity to rules or duties, independent of consequences. A deontological framing asserts that certain actions—such as unauthorised data collection—are intrinsically wrong.
  • Virtue ethics asks what a person of good character would do in a given situation, emphasising the development of professional virtues such as honesty, diligence, and care.

In practice, ethical analysis requires reasoning carefully about multiple frameworks rather than mechanically applying any single one.


Chapter 4: Intellectual Property

4.1 Copyright and Open-Source Licences

Copyright is an automatic legal right that arises at the moment an original work is created in a fixed form. In software, copyright protects the source code and object code as literary works; it does not protect ideas, algorithms, or functionality themselves. The copyright holder has exclusive rights to reproduce, distribute, modify, and licence the work.

Open-source licences are a structured exercise of copyright. Permissive licences (MIT, BSD, Apache 2.0) allow unrestricted use, modification, and redistribution with minimal conditions. Copyleft licences (GPL, LGPL, AGPL) require that derivative works be distributed under the same terms, a property known as share-alike. Understanding licence compatibility is essential when combining open-source components.

4.2 Patents

A patent grants the holder a time-limited monopoly (20 years from filing date in most jurisdictions) to practise an invention in exchange for public disclosure of how the invention works. To be patentable, an invention must be novel, non-obvious, and useful. Software patents are controversial because they can cover algorithms and business methods, and their breadth has sometimes hindered innovation. In Canada, software per se is not patentable, but software that produces a physical effect or is tied to a specific hardware implementation may qualify.

Design patents (in the United States) or industrial design registrations (in Canada) protect the ornamental appearance of a product—relevant when the visual design of a user interface constitutes a competitive differentiator.

4.3 Trademarks and Trade Secrets

A trademark is a sign capable of distinguishing goods or services of one enterprise from those of another. It protects brand identifiers such as names, logos, and slogans, and its registration is renewable indefinitely provided the mark remains in use.

A trade secret is confidential business information that provides competitive advantage. Unlike patents, trade secrets require no registration and have no fixed term; protection lasts as long as secrecy is maintained. Software source code is often protected as a trade secret, in which case the developer relies on confidentiality agreements (NDAs) and access controls rather than patent protection.


Chapter 5: Safety and Risk Management

5.1 Hazards, Risks, and Safety

A hazard is a condition that can lead to an accident; a risk is the combination of the probability that a hazard will lead to an accident and the severity of the resulting harm. Safety engineering is the discipline concerned with identifying hazards, assessing risk, and designing systems to reduce risk to an acceptable level.

The standard decomposition of risk is:

\[ \text{Risk} = \text{Probability of Occurrence} \times \text{Severity of Consequence} \]

Risk matrices plot probability against severity to prioritise mitigation effort.
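The risk formula above can be exercised directly. A minimal sketch using an invented 1–5 scale for both factors, with hypothetical hazards:

```python
def risk_score(probability: int, severity: int) -> int:
    """Risk = probability of occurrence x severity, each on a 1-5 scale."""
    return probability * severity

# A tiny risk register: (hazard, probability, severity).
register = [
    ("data loss on crash", 2, 5),
    ("UI colour mismatch", 4, 1),
    ("third-party API outage", 3, 3),
]

# Rank hazards by score so mitigation effort goes to the worst first.
ranked = sorted(register, key=lambda r: risk_score(r[1], r[2]), reverse=True)
print(ranked[0][0])  # data loss on crash (score 10)
```

Note how the low-probability, high-severity hazard outranks the frequent but trivial one—the multiplication captures exactly the trade-off a risk matrix visualises.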

Safety-critical systems—medical devices, avionics, automotive control systems, nuclear plant instrumentation—must meet stringent safety standards (IEC 61508, DO-178C, ISO 26262). For such systems, the development process itself must be certified, not merely the final product.

5.2 Risk Management Process

The risk management process consists of identification, assessment, planning, and monitoring:

  1. Identification: eliciting potential risks through techniques such as hazard and operability studies (HAZOP), fault tree analysis (FTA), and failure mode and effects analysis (FMEA).
  2. Assessment: estimating probability and impact for each identified risk.
  3. Planning: selecting mitigation strategies—avoidance (removing the source of risk), reduction (lowering probability or impact), transfer (insurance, contractual allocation), or acceptance (choosing to accept residual risk).
  4. Monitoring: tracking risk indicators throughout the project and updating the risk register as new risks emerge or existing risks change.

Workplace Hazardous Materials Information System (WHMIS) training addresses chemical hazard identification and safe handling in engineering environments—an element of general safety orientation for engineering students.


Chapter 6: Technical Communications

6.1 Communication in Engineering Practice

Technical communication is the discipline of conveying technical information to specific audiences for specific purposes. Engineers routinely produce requirements documents, design specifications, test plans, progress reports, technical proposals, and oral presentations. Clarity, precision, and appropriate level of detail are the governing principles.

Audience analysis is the foundational step: a document written for end users differs fundamentally from one written for a development team or a regulatory body. Technical writing must be structured so that readers can locate relevant sections efficiently. Standards such as ISO/IEC/IEEE 29148 govern requirements documents; IEEE Std 1016 governs software design descriptions.

6.2 Structured Documents and Reports

Effective technical documents share structural conventions: an executive summary or abstract that allows a busy reader to understand the conclusion and significance without reading the full document; a body organised to support the argument or specification logically; and supporting material (appendices, references) that enriches without cluttering.

For project reports, a standard structure includes: Introduction (problem statement, scope, objectives), Background (relevant literature and prior work), Method (what was done and why), Results (what was found), Discussion (interpretation of results, limitations), and Conclusion (summary of contributions and recommendations).

6.3 Presentations and Professional Communication

Oral presentations compress the key messages of written documents into a time-bounded format. Effective slides serve as visual scaffolding for the speaker’s words, not as a script to be read. The 10/20/30 rule (attributed to Guy Kawasaki) suggests no more than 10 slides, 20 minutes, and no font size smaller than 30 points—heuristics that serve the goals of focus and legibility.

Professional communication also encompasses résumé writing and interview preparation. A software engineering résumé should highlight technical skills, project experience, and measurable outcomes. Co-op work terms represent the first systematic encounter with professional communication expectations in industry contexts.


Chapter 7: Software Development Tools and Methodologies

7.1 Version Control with Git and GitFlow

Git implements a distributed version control model in which every clone of a repository is a full backup with complete history. Key concepts include the working tree, the index (staging area), and the commit graph. Essential operations are commit, branch, merge, and rebase.

GitFlow prescribes the following branch structure:

  • main (or master): contains only production-ready code; each commit on this branch corresponds to a release.
  • develop: integration branch where completed features are merged before release.
  • Feature branches (feature/...): branch from and merge back into develop.
  • Release branches (release/...): branch from develop, allow release preparation, and merge into both main and develop.
  • Hotfix branches (hotfix/...): branch from main to patch a production defect and merge into both main and develop.

Trunk-based development is an alternative model in which all developers commit directly to main (or merge short-lived branches into it multiple times per day), relying on feature flags and continuous integration to manage incomplete features.

7.2 Python Programming

Python is a dynamically typed, interpreted, general-purpose language designed for readability. Its syntax uses indentation to delimit blocks, eliminating the need for explicit begin/end keywords. Python’s extensive standard library and ecosystem of third-party packages (NumPy, pandas, Flask, SQLAlchemy) make it suitable for scripting, data analysis, web development, and rapid prototyping.

Important language features include list comprehensions, generators, decorators, context managers, and the distinction between mutable and immutable types. Python’s built-in data structures—lists, tuples, sets, and dictionaries—cover the majority of common programming patterns.
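Three of the features named above—list comprehensions, generators, and the mutable/immutable distinction—in miniature:

```python
# List comprehension: build a list declaratively.
squares = [n * n for n in range(5)]            # [0, 1, 4, 9, 16]

# Generator: values are produced lazily, one at a time.
def countdown(n):
    while n > 0:
        yield n
        n -= 1

print(list(countdown(3)))                      # [3, 2, 1]

# Mutable vs immutable: a list can change in place; a tuple cannot.
coords = [1, 2]
coords.append(3)                               # fine: lists are mutable
point = (1, 2)
try:
    point[0] = 9                               # tuples reject in-place change
except TypeError:
    print("tuples are immutable")
```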

7.3 Testing with PyTest

PyTest is the de facto standard testing framework for Python. Tests are plain functions (or methods) whose names begin with test_. PyTest discovers and runs tests automatically, reporting failures with detailed context. Fixtures provide reusable setup and teardown logic via dependency injection. The @pytest.mark.parametrize decorator allows a single test function to run against multiple input cases, reducing code duplication.
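A minimal PyTest sketch showing a fixture and parametrised cases; the function under test, `to_celsius`, is invented for the example (run with `pytest <file>.py`):

```python
import pytest

def to_celsius(fahrenheit: float) -> float:
    """Hypothetical unit under test."""
    return (fahrenheit - 32) * 5 / 9

@pytest.fixture
def freezing_point():
    """Fixture: reusable setup injected into tests that name it."""
    return 32.0

def test_freezing(freezing_point):
    assert to_celsius(freezing_point) == 0.0

# One test function, three input cases, three reported results.
@pytest.mark.parametrize("f, c", [(32, 0), (212, 100), (-40, -40)])
def test_known_points(f, c):
    assert to_celsius(f) == pytest.approx(c)
```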

A well-structured test suite has three properties: tests are isolated (no test depends on the side effects of another), tests are repeatable (same result regardless of execution order or environment), and tests are fast (slow tests end up being run less often, which lengthens the feedback cycle).

7.4 Regular Expressions

A regular expression (regexp or regex) is a sequence of characters that defines a search pattern. Regular expressions are implemented in virtually every programming language and are used for input validation, text search and replace, lexical analysis, and log parsing.

Key metacharacters include . (any character), * (zero or more), + (one or more), ? (zero or one), ^ and $ (line anchors), [] (character class), () (capturing group), and | (alternation). Quantifiers are greedy by default; appending ? makes them lazy. Named groups ((?P<name>...) in Python) improve readability of complex patterns.
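Named groups and the greedy/lazy distinction in Python's `re` module, using a made-up log line and HTML snippet:

```python
import re

# Named groups make the extracted fields self-documenting.
log_pattern = re.compile(r"^(?P<level>[A-Z]+): (?P<message>.*)$")

m = log_pattern.match("ERROR: disk full")
print(m.group("level"))    # ERROR
print(m.group("message"))  # disk full

# Greedy vs lazy: * takes as much as possible, *? as little as possible.
html = "<b>bold</b> and <i>italic</i>"
greedy = re.findall(r"<.*>", html)   # ['<b>bold</b> and <i>italic</i>']
lazy = re.findall(r"<.*?>", html)    # ['<b>', '</b>', '<i>', '</i>']
```

The greedy match swallowing the entire string is a classic pitfall when extracting delimited substrings.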

7.5 Logging and Debugging

Logging is the systematic recording of runtime events for diagnostic and audit purposes. Python’s logging module provides a hierarchy of loggers, handlers, and formatters. Log levels (DEBUG, INFO, WARNING, ERROR, CRITICAL) allow selective filtering; in production, DEBUG output is suppressed while ERROR and above are retained.
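The logger/handler/formatter hierarchy and level filtering described above, in a minimal configuration (logger name and messages invented for the example):

```python
import logging

# A named logger; the handler decides what actually gets emitted.
logger = logging.getLogger("app")
logger.setLevel(logging.DEBUG)

handler = logging.StreamHandler()
handler.setLevel(logging.WARNING)  # production-style filter: WARNING and up
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
logger.addHandler(handler)

logger.debug("cache miss for key %s", "user:42")  # suppressed by the handler
logger.warning("disk usage at %d%%", 91)          # emitted to stderr
```

Raising or lowering the handler's level changes what is recorded without touching any of the logging call sites—which is the point of separating loggers from handlers.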

Effective debugging proceeds systematically: reproduce the defect reliably; isolate the scope of the defect by bisecting the code or data; form a hypothesis about the cause; test the hypothesis; and verify the fix does not introduce regressions. Debuggers (such as pdb in Python) allow inspection of program state at breakpoints and single-step execution, accelerating hypothesis testing.

7.6 Relational Databases and SQL

A relational database management system (RDBMS) organises data into tables (relations) composed of rows (tuples) and columns (attributes). The relational model, proposed by E. F. Codd in 1970, provides a mathematically rigorous foundation for data organisation. MySQL is a widely deployed open-source RDBMS.

Structured Query Language (SQL) is the standard language for defining, manipulating, and querying relational data. Core SQL operations include:

  • CREATE TABLE / DROP TABLE (data definition)
  • INSERT, UPDATE, DELETE (data manipulation)
  • SELECT ... FROM ... WHERE ... (data query)
  • JOIN operations (combining data from multiple tables)

Normalisation is the process of decomposing a table schema to reduce redundancy and update anomalies. The first three normal forms (1NF, 2NF, 3NF) address the most common structural problems. A single-table schema suitable for a first course covers primary keys, data types, basic constraints (NOT NULL, UNIQUE, CHECK), and simple queries with filtering and ordering.
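The single-table scope described above can be run end to end. The chapter names MySQL, but for a self-contained sketch the same core SQL runs on Python's built-in sqlite3 module (the schema and data are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Data definition: a single table with a primary key and constraints.
cur.execute("""
    CREATE TABLE student (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        grade REAL CHECK (grade BETWEEN 0 AND 100)
    )
""")

# Data manipulation.
cur.executemany(
    "INSERT INTO student (name, grade) VALUES (?, ?)",
    [("Ada", 91.0), ("Grace", 88.5), ("Alan", 79.0)],
)

# Data query with filtering and ordering.
cur.execute("SELECT name FROM student WHERE grade >= 80 ORDER BY grade DESC")
print([row[0] for row in cur.fetchall()])  # ['Ada', 'Grace']
```

The parameter placeholders (?) keep data out of the SQL text itself, which is also the standard defence against SQL injection.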

7.7 Basic Android Development

Android is the dominant mobile operating system globally, built on a Linux kernel. Android applications are written in Java or Kotlin and structured around Activities (screen-oriented components), Fragments (reusable sub-screens), Services (long-running background components), and BroadcastReceivers (components that respond to broadcast messages). The Android manifest file declares components and permissions.

User interfaces are defined in XML layout files and inflated into view hierarchies at runtime. The Model–View–ViewModel (MVVM) architectural pattern is the currently recommended approach for Android UI, separating data management (ViewModel, LiveData/StateFlow) from UI logic (Fragment/Activity). Android Studio, built on IntelliJ IDEA, is the standard development environment.


Chapter 8: The Software Engineering Lifecycle in Context

8.1 Putting It Together: Process Models

Software development processes orchestrate the activities of requirements, design, implementation, testing, and maintenance into a coherent sequence with defined roles, artefacts, and checkpoints. No single process model suits all projects; the choice depends on requirements volatility, team size, regulatory constraints, and organisational culture.

The spiral model (Boehm, 1988) treats the process as a series of cycles, each addressing risk through prototyping and evaluation before committing to the next level of elaboration. Each spiral cycle passes through planning, risk analysis, engineering, and customer evaluation quadrants. The spiral model was influential in establishing risk as a primary driver of process decisions.

Continuous integration and continuous delivery (CI/CD) practices, now standard in industry, automate the build, test, and deployment pipeline so that every commit can potentially be released to production. These practices depend on a comprehensive automated test suite, version-controlled infrastructure configuration (infrastructure as code), and monitoring of production systems.

8.2 Deployment and Decommissioning

Deployment is the process of making software available for use. It includes installation, configuration, data migration, user training, and the transition from old to new systems. Deployment strategies such as blue-green deployment (maintaining two production environments and switching traffic) and canary release (gradually shifting traffic to the new version) reduce deployment risk.

Decommissioning marks the end of a system’s operational life. It requires migrating or archiving data, notifying users, and ensuring that dependent systems are not left without their dependencies. Proper decommissioning is a legal and organisational responsibility often overlooked during initial development.

8.3 Measurement and Analysis

Measurement in software engineering collects quantitative data about products and processes to support management decisions and quality improvement. Metrics fall into two broad categories: product metrics (size, complexity, defect density) and process metrics (defect removal efficiency, build duration, deployment frequency).

The Goal–Question–Metric (GQM) paradigm (Basili, 1988) provides a disciplined approach: begin with a goal (e.g., improve reliability), derive questions that characterise achievement of the goal, then identify metrics that answer those questions. This top-down structure prevents the common anti-pattern of collecting metrics without a clear purpose.

Cyclomatic complexity (McCabe, 1976) measures the number of linearly independent paths through a program module:

\[ V(G) = E - N + 2P \]

where \(E\) is the number of edges, \(N\) is the number of nodes, and \(P\) is the number of connected components in the control flow graph. High cyclomatic complexity correlates with increased defect density and reduced testability, motivating refactoring to simplify control flow.
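As a worked instance of the formula: the control flow graph of a function containing a single if-else has four nodes (decision, then-branch, else-branch, exit) and four edges, so V(G) = 4 − 4 + 2(1) = 2—one path per branch. A trivial calculator makes the arithmetic explicit:

```python
def cyclomatic_complexity(edges: int, nodes: int, components: int = 1) -> int:
    """V(G) = E - N + 2P for a control flow graph."""
    return edges - nodes + 2 * components

# Single if-else: decision -> then, decision -> else, then -> exit, else -> exit.
print(cyclomatic_complexity(edges=4, nodes=4))  # 2
```

Each additional independent decision point in the code adds one to V(G), which is why the metric is often approximated by counting branches.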
