CS 246 - Object-Oriented Software Development

Victoria Sakhnini

Estimated reading time: 35 minutes

Table of contents

Linux
Software Testing
Basic C++
Object-Oriented Programming
Design patterns
- Iterator
- UML
- Inheritance
- Observer
- Decorator
- Factory
- Strategy
- Template
- Visitor
- Bridge
- MVC
STL
Exceptions
Further topics

The notes below are based on the text-based lectures.

Linux

Shell, File System, I/O…

A shell is a program that runs in your computer and provides a user interface, which allows you (the user) to communicate with the operating system.

In most Linux distributions, a program called bash acts as the shell. This name stands for Bourne Again Shell, which is an enhanced version of Unix’s original shell program, sh.

Standard Input (or stdin for short): the device from which the shell reads text input.
Standard Output (or stdout for short): the device to which the shell writes text output.
Standard Error (or stderr for short): similar to stdout, it is a device to which the shell writes text output. The difference is that the shell and the programs we run usually write normal output to stdout and error messages to stderr.

As you probably know, the files on our computers are organized into directories (also known as folders). This organization is implemented by the file system. A path is the general name of a file or directory in a textual format.

The path consisting of only / is the root, i.e., the starting point of the file system, and every other file or directory lives somewhere beneath it. Any path that begins with / is therefore an absolute path, because it identifies exactly where the file or directory is in the file system.

Any path that does not begin with a / is a relative path. A relative path is always relative to the current directory (or working directory).

A dot (.) identifies the current directory. Two dots (..) identify the parent of the current directory.

There is a global system variable named $PATH that specifies where the shell should find executable files. The contents of the $PATH variable are a list of pathnames separated by the colon character (:). For example:

$ echo $PATH
/home/gustavo/bin:/usr/local/bin:/usr/bin

The command ls can be used to list the files (and directories) in the current directory (or any directory if you pass a path as a parameter).

$ ls -l
-rw-r----- 1 j2smith cs246 25 May 4 15:27 abc.txt
type/perms   owner   group size  modified name

Each file or directory has a set of three types of permissions: user, group, and other:

user permissions: apply to the file’s owner user
group permissions: apply to members of the file’s group (other than the owner)
other permissions: apply to everyone else

Each one of these types may contain three specific permissions: Read, Write, and eXecute.

Scripts are text files that contain commands in a specific programming language.

Programs are files that contain commands in binary format.

Using cat to simply read from the keyboard and output to the monitor does not seem very useful. However, you can use the shell to redirect cat’s standard input and output devices to read from or write to files.

Use the character < to redirect the standard input.
Redirect the standard output using the character >.
Redirect the standard error device with 2>.

You can also pass arguments to a program or script by writing each argument separated by a space after the command name. These are called command-line arguments.

If you want to pass a single argument that contains spaces within it, you need to quote the argument with double (") or single (') quotes. The difference is that single quotes will not interpolate anything (e.g., variables), whereas double quotes will.

$ echo "My shell is $0"
My shell is -bash
$ echo 'My shell is $0'
My shell is $0

Pipes allow us to use the output of one program as the input of another by connecting the second program’s stdin to the first program’s stdout.

Globbing Patterns

The shell can automatically expand a few wildcard patterns to match all the files that satisfy the pattern. This is known as globbing.

Operator	Meaning
*	matches zero or more characters
?	matches one character
[abc]	matches exactly one of the characters in the brackets
[!abc]	matches any character except the ones in the brackets
[a-z]	matches any character in the given range
{pat1,pat2}	matches either pat1 or pat2 (note no spaces)

Regular Expressions

Now we have the ability to pipe the output of one program into another program. But, what should we do with it? One of the most common things to do with pipes is process data, and one of the most common ways of processing data is pattern matching.

The patterns that egrep accepts are called regular expressions, and they’re more powerful than the globbing patterns we’ve seen before. A regular expression is a pattern formed by combining smaller patterns.

Now it’s your time to check all patterns supported by egrep. Do a search. I’ll not include them here…

Script

#!/bin/bash
date
whoami
pwd

The very first line of the file isn’t required, but it’s a good idea! It’s called a shebang line (less commonly known as “shabang” or “sh-bang”) because it starts with the hash symbol (‘#’), sometimes read as “sharp”, followed by an exclamation point (‘!’), which some people call “bang”. This line tells the system which interpreter should execute the script, since the script could otherwise be run in different environments with a different default shell.

bash also has a simple programming language syntax available. This lets you write more complicated scripts.

#!/bin/bash -x

The -x option tells bash to print every command and its arguments as it executes.

A bash variable:

x=1
echo $x

Special variables:

$0: the shell-script name
$1, $2: each command-line argument, based upon its position
$#: the total number of command-line arguments
$?: the exit/return value of the most-recently executed command; used to tell us whether it succeeded or failed. 0 means success, while not-0 means failure

Now it’s your time to do a search: if in bash.

A condition is generally specified within a pair of square brackets, “[ ]”. Note that bash is very picky about the spacing in that there must be a blank space between the opening square bracket and the condition being tested, as well as a blank space between the condition and the closing square bracket.

Now it’s your time to do a search: function in bash, loop in bash.

Software Testing

Software testing is an investigation conducted to verify if the software works as intended, identify existing errors, and possibly assess the quality of the software.

Approaches

In human testing, a person manually verifies if the software is working as intended and tries to identify existing errors (bugs).

In automated testing, test suites are implemented that automatically test the software and compare the results with the expected ones. In general, test suites should contain a list of input sets and matching expected outputs, covering different situations according to the requirement specifications.

Testing is the process that identifies the errors but does not solve them. Debugging, on the other hand, is the process that identifies the causes of known errors to solve them.

Types of Software Tests

Unit tests are conducted at the lowest level, testing only one specific module/unit of the software.

Integration tests are conducted a level above the unit tests and aim to verify if the different modules/units of the software work correctly together.

Functional tests focus on the business requirements of the application.

Acceptance tests are usually part of a formal process in which the client must verify that the produced software meets all the requirements, so it can be accepted.

Sometimes, acceptance tests may be detailed into phases such as alpha testing (done at the end of the development, by a subset of users, but in the controlled environment of the developer) and beta testing (done at the end of the development after the alpha tests, by a subset of real users, in the user’s environment).

Regression tests are conducted after any modification in the software, to ensure that it continues working as intended, i.e., that the modification did not introduce new errors.

Performance tests are designed to verify if the run-time performance of the system will be adequate.

White/Black Box Tests

In white-box tests, tests are created with knowledge of the internal structure of a program.

In black-box tests, tests are created based only on the requirement specifications, without any knowledge of the internal structure of the program (i.e., without access to the source code).

Grey-box testing is just a mix of the two approaches above.

Basic C++

Basics

The compiler takes a high-level programming language file (or files) and produces machine-readable instructions by first preprocessing (we will explain what preprocessing is later), then compiling (transforming C++ code into assembly code), assembling (transforming assembly code into machine code), and linking (combining several pieces of machine code, called object files, into a single executable file).

Options passed into the compiler’s cmd-line:

Option	Meaning
-std=c++14	standard...
-g	Produce debugging information in the operating system's native format. The GNU debugger, gdb, can work with this debugging information.
-c	Compile and assemble the source file, but do not link. Produces an object file (ends in .o) that contains machine code.
-Wall	This enables all the warnings about constructions that some users consider questionable, and that are easy to avoid (or modify to prevent the warning).
-Wextra	This enables some extra warnings that aren't enabled by -Wall.
-Wpedantic	This issues the warnings demanded by the strict ISO C++ standard and warns about programs that use forbidden extensions.
-D	Lets us define a macro name as a command-line argument to the compiler.

There are three global variables in the standard (std) namespace that define the stream objects used for basic input and output: cin, cout, and cerr.

Now do some search: cin.eof(), cin.ignore(), cin.clear().
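As a starting point for that search, here is a minimal sketch of how eof(), clear(), and ignore() usually work together when reading integers. The function name sumInts and the decision to skip one bad character at a time are assumptions made for illustration. The function takes any std::istream, so it can be called with std::cin as well.

```cpp
#include <iostream>
#include <sstream>   // only needed for the usage example below

// Hypothetical helper: sums every integer found on the stream,
// skipping over characters that cannot be read as part of an integer.
int sumInts(std::istream &in) {
    int total = 0;
    int value;
    while (true) {
        if (in >> value) {        // successful read
            total += value;
        } else if (in.eof()) {    // nothing left to read
            break;
        } else {                  // bad input: the fail bit is set
            in.clear();           // reset the error flags so reading can resume
            in.ignore();          // discard the single offending character
        }
    }
    return total;
}
```

For example, given std::istringstream in{ "1 2 x 3" }, the call sumInts(in) skips the x and returns 6.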

File I/O

#include <fstream>

int main() {
    std::ifstream infile{ "input.txt" };
    std::ofstream outfile{ "output.txt" };
    int i;
    while ( true ) {
        infile >> i;
        if ( infile.fail() ) break;
        outfile << i << std::endl;
    }
}

Further documentation: c++.com

Now do some search: format the output using setw, setprecision. cppref

Strings

#include <string>
...
std::string s = "hey";
std::string name{ "Sibelius" };

cpp.com on strings

There is an unusual type in C++ called stringstream, available through the sstream library. It is a hybrid of both the std::string class, and the I/O stream classes. It lets you read/write to/from strings using stream operators. (While you can use the stringstream for either input or output, we recommend that you use the type explicitly defined for input, istringstream, or output, ostringstream, as appropriate.)
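A small sketch of both directions may help. The helper names countWords and describe are hypothetical, chosen only for this illustration. The first reads from a string with >>, the second builds a string with <<.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: counts the whitespace-separated words in a line
// by treating the string as an input stream.
int countWords(const std::string &line) {
    std::istringstream in{line};
    std::string word;
    int count = 0;
    while (in >> word) ++count;   // >> skips leading whitespace
    return count;
}

// Hypothetical helper: builds a string with the stream output operator.
std::string describe(const std::string &name, int age) {
    std::ostringstream out;
    out << name << " is " << age << " years old";
    return out.str();             // str() extracts the accumulated text
}
```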

Command-line arguments

#include <iostream>

int main(int argc, char **argv) {
  for (int argi = 0; argi < argc; argi++) {
    std::cout << argv[argi] << std::endl;
  }
}

Functions

You should already be familiar with the concept of separating a function into its declaration (signature), and its definition (implementation).

Where C++ is different from C is that it allows a function to be overloaded. In other words, I can have more than one function with the exact same function name so long as the number of arguments and/or their types are different. Example from cpp.com:

// overloading functions
#include <iostream>
using namespace std;

int operate (int a, int b) {
  return (a*b);
}

double operate (double a, double b) {
  return (a/b);
}

int main () {
  int x=5,y=2;
  double n=5.0,m=2.0;
  cout << operate (x,y) << '\n';
  cout << operate (n,m) << '\n';
  return 0;
}

Note that the decision as to which function is to be called must be made at compile-time.

C++ provides a mechanism called default parameters. cppref
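A minimal sketch of default parameters follows. The function repeat and its parameter names are made up for this example. Note that only trailing parameters may be given defaults.

```cpp
#include <string>

// Hypothetical function: 'times' and 'sep' have default values,
// so callers may omit them (from the right-hand side only).
std::string repeat(const std::string &s, int times = 2,
                   const std::string &sep = " ") {
    std::string result;
    for (int i = 0; i < times; ++i) {
        if (i > 0) result += sep;
        result += s;
    }
    return result;
}
```

Calling repeat("hey") behaves like repeat("hey", 2, " "), while repeat("hey", 3) overrides only the second parameter.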

Structures

cpp.com

In modern C++, use nullptr rather than NULL. There’s a nice discussion of the reason why here.

Declaring a value const means that it must be initialized when it is defined, since it cannot be changed later!

References

We first need to introduce two terms that are going to help explain the concept, that of rvalues and lvalues. Scott Meyers, on page 2 of “Effective Modern C++”, gives a simple explanation that works well. If you can take the address of an expression, it is an lvalue; otherwise, it’s an rvalue.

So, what is a reference? A reference is an lvalue that acts like a constant pointer but the compiler automatically dereferences it. Since it’s a constant, it must be initialized when it is defined.
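The sketch below illustrates the "automatically dereferenced constant pointer" view of a reference. The function names swapByRef and aliasDemo are hypothetical.

```cpp
// Swaps two ints through references: there is no explicit & at the
// call site and no * inside the function, unlike a pointer version.
void swapByRef(int &a, int &b) {
    int tmp = a;
    a = b;
    b = tmp;
}

// Hypothetical demonstration that a reference is an alias for an lvalue.
int aliasDemo() {
    int x = 5;
    int &r = x;   // must be initialized; r refers to x from now on
    r = 10;       // assigns through the reference, so x becomes 10
    return x;
}
```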

isocpp

Dynamic memory allocation

The C++ standard uses the terms automatic versus dynamic storage.

C++:

Uses keywords new and delete.
Type-safe, i.e., new allocates space of the appropriate size and returns a pointer of the appropriate type.

Returning information

There are three ways in which we can return information from a function in C++:

return by value
return by pointer
return by reference
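The three options can be sketched side by side as follows. The Vec struct mirrors the one used elsewhere in these notes, while the function names makeVec, makeHeapVec, and scale are assumptions made for the example.

```cpp
struct Vec {
    int x, y;
};

// Return by value: the caller receives its own Vec object.
Vec makeVec(int x, int y) {
    return Vec{x, y};
}

// Return by pointer: the caller receives the address of a heap object
// and is responsible for calling delete on it.
Vec *makeHeapVec(int x, int y) {
    return new Vec{x, y};
}

// Return by reference: safe here only because the returned object
// (the argument) outlives the call. Never return a reference to a
// local variable.
Vec &scale(Vec &v, int factor) {
    v.x *= factor;
    v.y *= factor;
    return v;
}
```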

Operator overloading

Vec operator+( const Vec & v1, const Vec & v2 ) {
   Vec v{ v1.x + v2.x, v1.y + v2.y };
   return v;
}

The Preprocessor

Both C and C++ use a tool called the C preprocessor, which is tasked with handling preprocessor directives like #include. See more explanations in Section 4.2 of cs 146. See some discussions in Section 2.1 of cs 246e.

Including isn’t all that the C preprocessor can do, however. It can also define its own constants:

#include <iostream>
using namespace std;

#define GEESE 15

int main() {
    cout << "I have " << GEESE << " geese!" << endl;
    return 0;
}

Moreover, #if (closed by a matching #endif) can be used to include code conditionally, which is handy for debugging.
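As a rough sketch of conditional compilation for debugging, the function below prints a trace only when a macro is defined. The macro name DEBUG and the function sumTo are assumptions. The example uses #ifdef, which is shorthand for #if defined(...), and the macro could be supplied on the command line with the -D option described earlier (for example, -DDEBUG).

```cpp
#include <iostream>

// DEBUG is a hypothetical macro name; when it is not defined, the
// preprocessor removes the tracing line before compilation.
int sumTo(int n) {
    int total = 0;
    for (int i = 1; i <= n; ++i) {
        total += i;
#ifdef DEBUG
        std::cerr << "i = " << i << ", total = " << total << std::endl;
#endif
    }
    return total;
}
```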

Separate Compilation

Recall the definitions of declaration and definition.

We will split our code into two components:

interface: Declarations including function prototypes, with no actual code, as well as type definitions. Put in a separate file from the actual code, typically named with .h (which stands for “header”).
implementation: The full definition for every function, as well as space for any global variables. In this course, typically named .cc. C files are always named .c.

Separate compilation: we can compile each part of our program on its own, without recompiling the rest.

$ g++ -c main.cc
$ g++ -c vec.cc
$ g++ main.o vec.o -o vecs
$

The -c option to g++ requests that a .cc file be compiled into a .o file, or an object file. Combining multiple .o files to create a program is called linking.

The extern keyword may be applied to a global variable, function, or template declaration. It specifies that the symbol has external linkage. From docs.microsoft

An example from class:

// vec.h
struct Vec {
  int x, y;
};

extern Vec origin;

Vec operator+(const Vec &v1, const Vec &v2);
Vec addToOrigin(const Vec &v);

// vec.cc
#include "vec.h"

Vec origin {0, 0};

Vec operator+(const Vec &v1, const Vec &v2) {
  return {v1.x + v2.x, v1.y + v2.y};
}

Vec addToOrigin(const Vec &v) {
  return v + origin;
}

// main.cc
#include "vec.h"

int main() {
  Vec v {1,2};
  origin.x = 12;
  v = addToOrigin(v);
  return 0;
}

This separation of declarations from definitions creates a system of dependencies.

Make and Makefiles

In the previous example, the relationship between our files created a system of dependencies, and those dependencies were making compiling our program a bit complicated.

Luckily, a tool exists to automate this tedium, by combining the list of commands with the dependency relationship: make. Most often, the makefile directs Make on how to compile and link a program. A makefile works upon the principle that files only need recreating if their dependencies are newer than the file being created/recreated. (from wiki)

In a Makefile, each file to be built is a target, that is, something the Makefile describes how to create.

The commands to create a target must be indented with tabs, even if you normally indent with spaces, so be careful!

The command to create any given target is called a recipe.

Phony target: a target that exists only for its recipe, and doesn’t actually build anything. The clean phony target is commonly used for cleanup commands:

.PHONY: clean

clean:
	rm *.o vecs

We can make our Makefile more flexible, by using make variables.

As a bonus, make includes some default recipes, including a recipe for compiling .cc files into .o files, which uses the pre-defined make variables CXX and CXXFLAGS, so we didn’t have to include those recipes here.

The -MMD flag to g++ causes it to create .d files, which are make dependencies, whenever it compiles a .cc file.

Final version of makefile:

CXX=g++
CXXFLAGS=-std=c++14 -MMD
OBJECTS=main.o vec.o
DEPENDS=${OBJECTS:.o=.d}
EXEC=vecs

${EXEC}: ${OBJECTS}
	${CXX} ${OBJECTS} -o ${EXEC}

-include ${DEPENDS}

.PHONY: clean

clean:
	rm ${OBJECTS} ${DEPENDS} ${EXEC}

Preprocessor guards

Put preprocessor guards, also called include guards, into every header file. The general format of an include guard is:

#ifndef SOME_UNIQUE_MACRO_NAME_H
#define SOME_UNIQUE_MACRO_NAME_H
...
#endif

Never write using namespace std; in a header (.h) file. Always prefix each type with std:: in these files.

Object-Oriented Programming

Coupling and Cohesion

Cohesion measures the amount of “relatedness” that a module or unit of code contains.

Coupling measures the amount of dependency between units/modules.

Maximize cohesion and minimize coupling!

Procedural vs Object-Oriented Programming, Classes

The style of programming you have been doing so far in C/C++ is called procedural programming. It has this name because the code is organized in procedures that implement the logic of the program (which in C/C++ are the functions) and variables that hold the data.

In object-oriented programming (OOP), we implement our programs using objects. Objects are units of code that contain data and the procedures that implement the logic that operates over the data. In OOP, the data contained within an object are called attributes or member fields. The procedures of an object are called methods, operations, or member functions.

To create objects, we must first define classes. Classes are blueprints or type specifications that describe the contents of an object.

A class is a type. An instance of a class is called an object.

#include "student.h"

float Student::grade() {
  return assns * 0.4 + mt * 0.2 + final * 0.4;
}

Note that in the implementation file for the Student, we have to use the scope resolution operator ::, prefixed by the class name, to specify that the grade function is a method of the Student class.

An object s of type Student can be initialized by passing in three integer values in an initialization list delimited by curly braces ({}).

All non-static class methods have a hidden, extra first parameter called this, which is a pointer to the object on which the method was invoked.

It turns out that C++ classes have a special type of method, called a constructor.

Name-of-Class-Type( parameter-list ) {
   // necessary code
}

The rules in C++ as to when we can use = and when we can use () are rather arbitrary. C++ now has uniform initialization syntax using {}, which is meant to be used in nearly all initialization situations. From geeksforgeeks,

Uniform initialization is a feature in C++ 11 that allows the usage of a consistent syntax to initialize variables and objects ranging from primitive type to aggregates. In other words, it introduces brace-initialization that uses braces ({}) to enclose initializer values. The syntax is as follows:

type var_name{arg1, arg2, ..., arg n}

Big 5

By definition, a default constructor is a constructor that can be called with no arguments: it either has 0 parameters or gives every parameter a default value.

3 steps when creating an object:

allocate space
construct the data fields
run the constructor body

Special syntax: member initialization list (MIL). The member initialization list code must be placed as part of the constructor implementation. It occurs between the closing parenthesis of the parameter list, and the opening curly brace of the constructor body. It consists of a colon (‘:’) followed by a comma-separated list of data field names to be initialized, where each data field is initialized using uniform initialization syntax.
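A minimal sketch of an MIL follows. The Book class and its fields are hypothetical and were picked because a const field can only be set in the MIL, never by assignment in the constructor body.

```cpp
#include <string>

struct Book {
    const std::string title;   // const field: must be set in the MIL
    std::string author;
    int numPages;

    // The MIL sits between the parameter list and the constructor body.
    Book(const std::string &title, const std::string &author, int numPages)
        : title{title}, author{author}, numPages{numPages} {
        // Body is empty: every field was already initialized above.
    }
};
```

Inside the MIL, the name outside the braces is always the field, and the name inside the braces is looked up as the parameter, so reusing the same names is allowed.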

The compiler-provided versions of these methods behave as follows:

default constructor: calls the default constructor on all fields that are objects
copy constructor: copies all fields from the object passed in
copy assignment operator: copies all fields from the object passed in
destructor: does nothing by default
move constructor: takes data from the object passed in
move assignment operator: takes data from the object passed in

There’s one type of constructor with which you need to be careful—single-argument constructors. By definition, they’re used to implicitly convert an argument of the specified type into an object of the constructor type.

Disable implicit conversion by labelling all single-argument constructors with the explicit keyword.
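The effect of explicit can be sketched as follows. The class Metres and the function doubled are hypothetical. The static_assert uses std::is_convertible from <type_traits>, which reports whether an implicit conversion exists.

```cpp
#include <type_traits>

// Hypothetical class used only for illustration.
struct Metres {
    double value;
    explicit Metres(double v) : value{v} {}
};

double doubled(Metres m) { return 2 * m.value; }

// With 'explicit', a bare double no longer converts silently:
//   doubled(3.5);          // error: no implicit conversion to Metres
//   doubled(Metres{3.5});  // OK: the conversion is spelled out
static_assert(!std::is_convertible<double, Metres>::value,
              "explicit blocks the implicit conversion");
```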

Copy-and-swap idiom. article

#include <utility>
struct Node {
   ...
   void swap( Node & other ) {
      std::swap( data, other.data );
      std::swap( next, other.next );
   }

   Node & operator=( const Node & other) {
      Node tmp{ other };
      swap( tmp );
      return *this;
   }
   ...
};

The compiler can carry out an optimization called elision. (Copy/Move Elision).

If we compile the program with the g++ option -fno-elide-constructors, the compiler turns this optimization off, and we can see how many copy and move constructor calls are made when elision is not applied.

Static members are associated with the class itself, and not with any particular instance (object) of the class. For more discussion, see Section 6.1 of cs 146.

Static member functions don’t depend on a specific instance for their computation (they don’t have an implicit this parameter).
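A common use of static members is counting instances. The sketch below assumes a hypothetical class named Tracker.

```cpp
// Hypothetical class that counts how many of its objects currently exist.
class Tracker {
    static int numInstances;   // one shared counter for the whole class
  public:
    Tracker() { ++numInstances; }
    Tracker(const Tracker &) { ++numInstances; }
    ~Tracker() { --numInstances; }

    // Static member function: there is no 'this', so it may only
    // touch static members.
    static int count() { return numInstances; }
};

// A static data member is declared in the class but defined once outside it.
int Tracker::numInstances = 0;
```

Because count() is static, it is called on the class itself, as in Tracker::count(), even when no Tracker object exists.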

Advanced object uses: object arrays, constant object (an object whose fields cannot be modified).

physical constness: whether the actual bit pattern that makes up the object changes; logical constness: whether any change to the object is apparent to outsiders.

We want to be able to update numMethodCalls, even if the object is const. We declare the field mutable:

struct Student {
    . . .
    mutable int numMethodCalls = 0;
    float grade() const {  // can be const now
        ++numMethodCalls;  // OK now
        return . . . ;
    }
};

Another keyword, which is not typically covered/mentioned in cs 246, but mentioned in cs241/cs246e: inline.

struct Vector {
    ...
    const int &itemAt(size_t i) const;
    // Will be called if the object is const
    int &itemAt(size_t i);
    // Will be called if object is non-const
};
inline const int &Vector::itemAt(size_t i) const { return theVector[i]; }
inline int &Vector::itemAt(size_t i) { return theVector[i]; }

inline suggests that the compiler replace the function call with the function body, which saves the cost of the function call.

This is an example of const overloading. Check Chapter 6 of cs 246e.

Encapsulation

To enforce invariants, we introduce encapsulation, which means that we make clients treat our objects as black boxes (capsules) that create abstractions in which implementation details are sealed away.

We implement encapsulation by setting the access modifier (or visibility) for each one of the members (fields or methods) of a class: private and public.

Discussion on struct vs. class.

We can have accessor (getX()) and mutator (setX()).

friend in class:

class List {
  public:
    class Node {   // Public nested class
        ...
        friend class List;    // List has access to all members of Node
    };
  private:
    ...
  public:
    ...
};

friend function:

// vec.h
#include <ostream>

class Vec {
    int x, y;
    friend std::ostream &operator<<(std::ostream &out, const Vec &v);
};

// vec.cc
#include "vec.h"

std::ostream &operator<<(std::ostream &out, const Vec &v) {
    return out << v.x << ' ' << v.y;
}

However, in general, you should give your classes as few friends as possible. Friendships weaken encapsulation and thus should be used only if really necessary.

Design patterns

A design pattern is a codified solution to a common software problem.

Iterator

From wiki, the iterator pattern is a design pattern in which an iterator is used to traverse a container and access the container’s elements.

automatic type deduction:

for (auto it = lst.begin(); it != lst.end(); ++it) {
    cout << *it << endl; // prints 3 2 1
}

range-based for loop

for (auto n : lst) {
    cout << n << endl;
}

UML

The Unified Modeling Language, known as UML for short, is a visual modeling language that lets people design and document a software system.

A class model is a diagram that visually represents a group of classes, and the relationships between them. A box contains class name, attributes, operations, comment.

Selected notations:

An abstract operation is italicized.
dependency: dashed line
static attr/ops: underline

Relationships:

association (from sourcemaking): An association indicates that objects of one class have a relationship with objects of another class, in which this connection has a specifically defined meaning (for example, “is flown with”).
aggregation: Has-A. Shallow copy.
composition: Owns-A. Deep copy.
generalization (or specialization): superclass and subclass. Is-A.

Inheritance

Three possible ways to keep different types in a single array: C unions, C void pointers, C++ inheritance.

class Book {
    ...
    virtual bool isHeavy() const;
};
class Text : public Book {
    ...
    bool isHeavy() const override;
};
class Comic : public Book {
    ...
    bool isHeavy() const override;
};

This is public inheritance. Private and protected inheritance also exist.

The ability to accommodate multiple types under one abstraction is called polymorphism.

int main() {
    Book* books [] { 
        new Book{ ... },
        new Book{ ... },
        new Text{ ... },
        new Text{ ... },
        new Comic{ ...},
        new Comic{ ...}
    };
}

Note that using an array of objects polymorphically can misalign the data, because a subclass object may be larger than a superclass object. Instead, use an array of pointers.
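To see the dynamic dispatch in action, here is a compact, self-contained sketch in the spirit of the Book example. The page-count field, the thresholds 200 and 30, and the helper countHeavy are assumptions made for illustration, not the course's actual definitions.

```cpp
class Book {
    int numPages;
  public:
    explicit Book(int numPages) : numPages{numPages} {}
    virtual ~Book() {}   // virtual destructor: safe delete through a Book*
    int pages() const { return numPages; }
    virtual bool isHeavy() const { return numPages > 200; }
};

class Comic : public Book {
  public:
    explicit Comic(int numPages) : Book{numPages} {}
    bool isHeavy() const override { return pages() > 30; }
};

// Counts heavy books. The call to isHeavy() is dispatched at run time
// according to the actual type of each object.
int countHeavy(Book *books[], int n) {
    int heavy = 0;
    for (int i = 0; i < n; ++i) {
        if (books[i]->isHeavy()) ++heavy;
    }
    return heavy;
}
```

With these assumed thresholds, a 100-page Book is not heavy but a 100-page Comic is, even though both are reached through a Book pointer.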

From wiki,

In programming languages, an abstract type is a type in a nominative type system that cannot be instantiated directly; a type that is not abstract – which can be instantiated – is called a concrete type. Every instance of an abstract type is an instance of some concrete subtype. Abstract types are also known as existential types.

In C++, a class is abstract when it has at least one pure virtual method:

class Foo {
    ...
    public:
        virtual void bar() const = 0;
};

To avoid circular includes, one can use forward declaration. Check page 112 of CS 246E for a full discussion.

Observer

The Observer design pattern is also known as Dependents or Publish-Subscribe.

See chapter 2 of the “Head First Design Patterns” book by Freeman and Freeman.
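A bare-bones sketch of the pattern is shown below. All class and method names (Observer, Subject, LastSeen, attach, notify, setValue) are choices made for this illustration and not a prescribed interface.

```cpp
#include <vector>

// Abstract observer: anything that wants to be told about changes.
class Observer {
  public:
    virtual void notify(int newValue) = 0;
    virtual ~Observer() {}
};

// Subject: keeps a list of observers and notifies them on every change.
class Subject {
    std::vector<Observer *> observers;   // not owned by the subject
    int value = 0;
  public:
    void attach(Observer *o) { observers.push_back(o); }
    void setValue(int v) {
        value = v;
        for (Observer *o : observers) o->notify(value);
    }
};

// Concrete observer that remembers the last value it was told about.
class LastSeen : public Observer {
  public:
    int last = -1;
    void notify(int newValue) override { last = newValue; }
};
```

The subject knows its observers only through the abstract Observer interface, which keeps coupling low.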

Decorator

The Decorator design pattern is intended to let you add functionality or features to an object at run-time rather than to the class as a whole.

See chapter 3 of the “Head First Design Patterns” book by Freeman and Freeman.
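One possible sketch of the pattern follows. The pizza theme, the class names, and the prices are all assumptions made for the example. Each Topping is itself a Pizza that wraps another Pizza, so features can be stacked at run-time.

```cpp
#include <memory>
#include <string>
#include <utility>

class Pizza {
  public:
    virtual double price() const = 0;
    virtual std::string description() const = 0;
    virtual ~Pizza() {}
};

// The plain object that decorators will wrap.
class CrustAndSauce : public Pizza {
  public:
    double price() const override { return 5.0; }
    std::string description() const override { return "pizza"; }
};

// A decorator is itself a Pizza and owns the Pizza it wraps.
class Topping : public Pizza {
    std::unique_ptr<Pizza> base;
    std::string name;
  public:
    Topping(std::unique_ptr<Pizza> base, const std::string &name)
        : base{std::move(base)}, name{name} {}
    double price() const override { return base->price() + 0.5; }
    std::string description() const override {
        return base->description() + " with " + name;
    }
};
```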

Factory

The Factory Method design pattern provides an interface for object creation, but lets the subclasses decide which object to create. It is also known as the Virtual Constructor.

The Abstract Factory design pattern provides an interface for creating families of related or dependent objects. It is also known as Kit.

See chapter 4 of the “Head First Design Patterns” book by Freeman and Freeman.

Strategy

The Strategy design pattern is intended to define a family of algorithms, encapsulating each one, and making them interchangeable. Allows the algorithm to vary independently of the client. Also known as Policy.

See chapter 1 of the “Head First Design Patterns” book by Freeman and Freeman.
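A small sketch of the idea, assuming a hypothetical family of scoring algorithms (ScoringStrategy, BestMark, TotalMark) and a client function named evaluate:

```cpp
#include <vector>

// Strategy interface: a family of interchangeable scoring algorithms.
class ScoringStrategy {
  public:
    virtual int score(const std::vector<int> &marks) const = 0;
    virtual ~ScoringStrategy() {}
};

class BestMark : public ScoringStrategy {
  public:
    int score(const std::vector<int> &marks) const override {
        int best = 0;
        for (int m : marks) if (m > best) best = m;
        return best;
    }
};

class TotalMark : public ScoringStrategy {
  public:
    int score(const std::vector<int> &marks) const override {
        int total = 0;
        for (int m : marks) total += m;
        return total;
    }
};

// The client depends only on the interface, so the algorithm can be
// swapped without changing this function.
int evaluate(const std::vector<int> &marks, const ScoringStrategy &strategy) {
    return strategy.score(marks);
}
```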

Template

The Template Method design pattern defines the steps of an algorithm in an operation, but lets the subclasses redefine certain steps though not the overall algorithm’s structure.

See chapter 8 of the “Head First Design Patterns” book by Freeman and Freeman.
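The sketch below shows one way this could look. The Report and GradeReport classes and the bracketed header and footer strings are invented for the example. The public render() method fixes the order of the steps, and subclasses may redefine only body().

```cpp
#include <string>

class Report {
  public:
    // The template method: fixes the order of the steps and is not virtual.
    std::string render() const {
        return header() + body() + footer();
    }
    virtual ~Report() {}
  private:
    std::string header() const { return "[begin]"; }
    std::string footer() const { return "[end]"; }
    // The one step that subclasses are allowed to redefine.
    virtual std::string body() const = 0;
};

class GradeReport : public Report {
    std::string body() const override { return "grades"; }
};
```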

Visitor

The Visitor design pattern lets the programmer define an operation to be performed upon the elements of an object structure. New operations can be added without changing the classes of the elements on which they operate.

See 26.3 of CS 246E for an example of overloading and overriding.

See chapter 14, the appendix, of the “Head First Design Patterns” book by Freeman and Freeman.

Bridge

The Bridge design pattern decouples an abstraction from its implementation, allowing the two to vary independently. It is also known as Handle/Body.

See chapter 14, the appendix, of the “Head First Design Patterns” book by Freeman and Freeman.

MVC

In the Model-View-Controller (MVC) architecture, the program state, presentation logic, and control logic are all separated. Therefore, it is in theory possible to modify any one of them without modifying the other two.

STL

Template programming allows us to create parameterized classes (templates) that are specialized to actual code when we need to use them. See isocpp or problem 10 & problem 24 of CS 246E.

The Standard Template Library (STL) is a large collection of useful templates that already exist in C++.

Two important and useful class templates: vector and map. More: set, stack, queue, list…

Also template functions: for_each, find, count, copy, transform.
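A few of these pieces can be sketched together as follows. The helper names wordCounts, doubleAll, and contains are hypothetical, while std::map, std::vector, std::transform, and std::find are the real STL components.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Counts how often each word occurs, using std::map as a dictionary.
std::map<std::string, int> wordCounts(const std::vector<std::string> &words) {
    std::map<std::string, int> counts;
    for (const auto &w : words) ++counts[w];   // operator[] inserts 0 if missing
    return counts;
}

// Returns a copy of v with every element doubled, using std::transform.
std::vector<int> doubleAll(const std::vector<int> &v) {
    std::vector<int> result(v.size());
    std::transform(v.begin(), v.end(), result.begin(),
                   [](int n) { return 2 * n; });
    return result;
}

// Returns true if target occurs in v, using std::find.
bool contains(const std::vector<int> &v, int target) {
    return std::find(v.begin(), v.end(), target) != v.end();
}
```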

Exceptions

See problem 9 of CS 246E for an introduction.

Throw by value, catch by reference.

try {
    throw 1;
} catch (...) {
    ...
}
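A slightly fuller sketch of the same advice, using a standard exception class instead of an int. The functions safeDivide and tryDivide are hypothetical; std::invalid_argument and std::exception come from <stdexcept>.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical function: throws a standard exception object by value.
int safeDivide(int a, int b) {
    if (b == 0) throw std::invalid_argument{"division by zero"};
    return a / b;
}

// Catches by const reference, which avoids copying (and slicing) the
// exception object; invalid_argument is a subclass of std::exception.
std::string tryDivide(int a, int b) {
    try {
        return std::to_string(safeDivide(a, b));
    } catch (const std::exception &e) {
        return std::string{"error: "} + e.what();
    }
}
```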

Three levels of exception safety:

Basic guarantee
Strong guarantee
No-throw guarantee.

Further topics

smart pointers

unique_ptr and shared_ptr in <memory>. See problem 14 & 23 of CS 246E
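As a rough illustration of the two ownership models, the sketch below builds a linked list with unique_ptr and shows shared ownership with shared_ptr. The Node struct and the functions makeList and ownersAfterCopy are assumptions made for the example. std::make_unique requires C++14.

```cpp
#include <memory>
#include <utility>

struct Node {
    int data;
    std::unique_ptr<Node> next;   // each node exclusively owns its successor
    Node(int data, std::unique_ptr<Node> next)
        : data{data}, next{std::move(next)} {}
};

// Builds the list 1 -> 2 -> 3. No delete is needed anywhere, because
// destroying the head destroys the rest of the chain.
std::unique_ptr<Node> makeList() {
    std::unique_ptr<Node> list;
    for (int i = 3; i >= 1; --i) {
        list = std::make_unique<Node>(i, std::move(list));
    }
    return list;
}

// shared_ptr: several owners; the object is freed when the last one goes away.
long ownersAfterCopy() {
    std::shared_ptr<int> a = std::make_shared<int>(42);
    std::shared_ptr<int> b = a;   // a and b now share ownership
    return a.use_count();
}
```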

Idioms

Resource Acquisition Is Initialization

In the Pimpl Idiom, the term “Pimpl” is short for “Pointer to Implementation”.

Casts

static_cast, reinterpret_cast, const_cast, dynamic_cast, dynamic_pointer_cast.

Refactoring

https://refactoring.com/

Function Objects and Lambdas

This is typically covered in the last lecture of cs 246 (in person version).

Function Objects:

class Plus1 {
  public:
    int operator() (int n) { return n + 1; }
};
. . .
Plus1 p;
p(4); // produces 5

Lambda:

vector <int> v { . . . };
int x = count_if(v.begin(), v.end(), [](int n) { return n % 2 == 0; });