CS 246 - Object-Oriented Software Development
Victoria Sakhnini
The notes below are generated from text-based lecture.
Linux
Shell, File System, I/O…
A shell is a program that runs in your computer and provides a user interface, which allows you (the user) to communicate with the operating system.
In most Linux distributions, a program called bash acts as the shell. The name stands for Bourne Again Shell; bash is an enhanced version of Unix’s original shell program, sh.
- Standard Input (or stdin for short): the device from which the shell reads text input.
- Standard Output (or stdout for short): the device to which the shell writes text output.
- Standard Error (or stderr for short): similar to stdout, it is a device to which the shell writes text output. The difference is that the shell and the programs we run usually write normal output to stdout and error messages to stderr.
As you probably know, the files on our computers are organized into directories (also known as folders). This organization is implemented by the file system. A path is a textual name that identifies the location of a file or directory in the file system.
The path consisting of only / is the root, i.e., the starting point of the file system. Every file and directory can be reached from the root, so any path that begins with / is an absolute path: it identifies exactly where the file or directory is in the file system.
Any path that does not begin with a / is a relative path. A relative path is always interpreted relative to the current directory (also called the working directory).
A single dot (.) identifies the current directory. Two dots (..) identify the parent of the current directory.
There is a global system variable named $PATH that specifies where the shell should look for executable files. The contents of the $PATH variable are a list of directory pathnames separated by the colon character (:). For example:
$ echo $PATH
/home/gustavo/bin:/usr/local/bin:/usr/bin
The command ls lists the files (and directories) in the current directory, or in any directory whose path you pass as an argument.
$ ls -l
-rw-r----- 1 j2smith cs246 25 May 4 15:27 abc.txt
type/perms owner group size modified name
Each file or directory has a set of three types of permissions: user, group, and other:
- user permissions: apply to the file’s owner user
- group permissions: apply to members of the file’s group (other than the owner)
- other permissions: apply to everyone else
Each one of these types may contain three specific permissions: Read, Write, and eXecute.
Scripts are text files that contain commands written in a specific programming language and are run by an interpreter such as the shell.
Programs are files that contain instructions in binary format.
Using cat to simply read from the keyboard and write to the monitor does not seem very useful. However, you can use the shell to redirect cat’s standard input and output devices to read from or write to files.
- Use the character < to redirect the standard input.
- Use the character > to redirect the standard output.
- Use 2> to redirect the standard error.
You can also pass arguments to a program or script by writing each argument separated by a space after the command name. These are called command-line arguments.
If you want to pass a single argument that contains spaces, you need to quote the argument with double (") or single (') quotes. The difference is that the shell does not interpolate anything (e.g., variables) inside single quotes, whereas it does inside double quotes.
$ echo "My shell is $0"
My shell is -bash
$ echo 'My shell is $0'
My shell is $0
Pipes allow us to use the output of one program as the input of another by connecting the second program’s stdin to the first program’s stdout.
Globbing Patterns
The shell can automatically expand a few wildcard patterns to match all the files that satisfy the pattern. This is known as globbing.
Operator | Meaning |
---|---|
* | matches zero or more characters |
? | matches exactly one character |
[abc] | matches exactly one of the characters in the brackets |
[!abc] | matches exactly one character that is not in the brackets |
[a-z] | matches exactly one character in the given range |
{pat1,pat2} | matches either pat1 or pat2 (note no spaces) |
Regular Expressions
Now we have the ability to pipe the output of one program into another program. But, what should we do with it? One of the most common things to do with pipes is process data, and one of the most common ways of processing data is pattern matching.
The patterns that egrep accepts are called regular expressions, and they’re more powerful than the globbing patterns we’ve seen before. A regular expression is a pattern formed by combining smaller patterns.
Now it’s your time to check all patterns supported by egrep. Do a search. I’ll not include them here…
Script
#!/bin/bash
date
whoami
pwd
The very first line of the file isn’t required, but it’s a good idea! It’s called a shebang line (less commonly known as “shabang” or “sh-bang”) because it starts with the hash symbol (‘#’, which resembles the musical sharp sign) followed by an exclamation point (‘!’), which some people call “bang”. This line tells the system which interpreter should run the script, since the script could otherwise be executed in different environments with a different default shell.
bash also has a simple programming language syntax available. This lets you write more complicated scripts.
#!/bin/bash -x
The -x option tells bash to print every command and its arguments as it executes.
A bash variable:
x=1
echo $x
Special variables:
- $0: the shell-script name
- $1, $2, etc.: each command-line argument, based upon its position
- $#: the total number of command-line arguments
- $?: the exit/return value of the most recently executed command; used to tell us whether it succeeded or failed. 0 means success, while non-zero means failure
Now it’s your time to do a search: if in bash.
A condition is generally specified within a pair of square brackets, “[ ]”. Note that bash is very picky about the spacing in that there must be a blank space between the opening square bracket and the condition being tested, as well as a blank space between the condition and the closing square bracket.
Now it’s your time to do a search: function in bash, loop in bash.
Software Testing
Software testing is an investigation conducted to verify if the software works as intended, identify existing errors, and possibly assess the quality of the software.
Approaches
In human testing, a person manually verifies if the software is working as intended and tries to identify existing errors (bugs).
In automated testing, test suites are implemented that automatically test the software and compare the results with the expected ones. In general, test suites should contain a list of input sets and matching expected outputs, covering different situations according to the requirement specifications.
Testing is the process that identifies the errors but does not solve them. Debugging, on the other hand, is the process that identifies the causes of known errors to solve them.
Types of Software Tests
Unit tests are conducted at the lowest level, testing only one specific module/unit of the software.
Integration tests are conducted a level above the unit tests and aim to verify if the different modules/units of the software work correctly together.
Functional tests focus on the business requirements of the application.
Acceptance tests are usually part of a formal process in which the client must verify that the produced software meets all the requirements, so it can be accepted.
Sometimes, acceptance tests may be detailed into phases such as alpha testing (done at the end of the development, by a subset of users, but in the controlled environment of the developer) and beta testing (done at the end of the development after the alpha tests, by a subset of real users, in the user’s environment).
Regression tests are conducted after any modification in the software, to ensure that it continues working as intended, i.e., that the modification did not introduce new errors.
Performance tests are designed to verify if the run-time performance of the system will be adequate.
White/Black Box Tests
In white-box tests, tests are created with knowledge of the internal structure of a program.
In black-box tests, tests are created based only on the requirement specifications, without any knowledge of the internal structure of the program (i.e., without access to the source code).
Grey-box testing is just a mix of the two approaches above.
Basic C++
Basics
The compiler takes a high-level programming language file (or files) and produces machine-readable instructions by first preprocessing (we will explain what preprocessing is later), then compiling (transforming C++ code into assembly code), assembling (transforming assembly code into machine code), and linking (combining several pieces of machine code, called object files, into a single executable file).
Options passed on the compiler’s command line:
Option | Meaning |
---|---|
-std=c++14 | standard... |
-g | Produce debugging information in the operating system's native format. The GNU debugger, gdb, can work with this debugging information. |
-c | Compile and assemble, but do not link. Produces an object file (ends in .o) that contains machine code. |
-Wall | This enables all the warnings about constructions that some users consider questionable, and that are easy to avoid (or modify to prevent the warning). |
-Wextra | This enables some extra warnings that aren't enabled by -Wall. |
-Wpedantic | This issues the warnings demanded by the strict ISO C++ standard, and warns about the use of forbidden extensions. |
-D | Lets us define a macro name as a command-line argument to the compiler. |
There are three global variables in the standard (std) namespace that define the stream objects used for basic input and output: cin, cout, and cerr.
Now do some search: cin.eof(), cin.ignore(), cin.clear().
File I/O
#include <fstream>
int main() {
    std::ifstream infile{ "input.txt" };
    std::ofstream outfile{ "output.txt" };
    int i;
    // Copy integers until a read fails (end of file or malformed input).
    while ( true ) {
        infile >> i;
        if ( infile.fail() ) break;
        outfile << i << std::endl;
    }
}
Further documentation: c++.com
Now do some search: format the output using setw and setprecision. cppref
Strings
#include <string>
...
std::string s = "hey";
std::string name{ "Sibelius" };
There is an unusual type in C++ called stringstream, available through the sstream library. It is a hybrid of the std::string class and the I/O stream classes. It lets you read from and write to strings using stream operators. (While you can use stringstream for either input or output, we recommend that you use the type explicitly defined for input, istringstream, or output, ostringstream, as appropriate.)
Command-line arguments
#include <iostream>
int main(int argc, char **argv) {
for (int argi = 0; argi < argc; argi++) {
std::cout << argv[argi] << std::endl;
}
}
Functions
You should already be familiar with the concept of separating a function into its declaration (signature) and its definition (implementation).
Where C++ differs from C is that it allows a function to be overloaded. In other words, we can have more than one function with the exact same name, so long as the number of parameters and/or their types are different. Example from cpp.com:
// overloading functions
#include <iostream>
using namespace std;
int operate (int a, int b) {
return (a*b);
}
double operate (double a, double b) {
return (a/b);
}
int main () {
int x=5,y=2;
double n=5.0,m=2.0;
cout << operate (x,y) << '\n';
cout << operate (n,m) << '\n';
return 0;
}
Note that the decision as to which function is to be called must be made at compile-time.
C++ provides a mechanism called default parameters: a parameter declared with a default value may be omitted by the caller. cppref
Structures
In modern C++ you should use nullptr rather than NULL. There’s a nice discussion of the reason why here.
Declaring a value const means that it must be initialized when it is defined, since it cannot be changed later!
References
We first need to introduce two terms that are going to help explain the concept, that of rvalues and lvalues. Scott Meyers, on page 2 of “Effective Modern C++”, gives a simple explanation that works well. If you can take the address of an expression, it is an lvalue; otherwise, it’s an rvalue.
So, what is a reference? A reference acts like a constant pointer that the compiler automatically dereferences, and it is itself an lvalue. Since it is constant (it cannot be made to refer to a different object later), it must be initialized when it is defined.
Dynamic memory allocation
The C++ standard uses the terms automatic versus dynamic storage.
C++:
- Uses the keywords new and delete.
- Is type safe, i.e., new allocates space of the appropriate size and returns a pointer of the appropriate type.
Returning information
There are three ways in which we can return information from a function in C++:
- return by value
- return by pointer
- return by reference
Operator overloading
Vec operator+( const Vec & v1, const Vec & v2 ) {
Vec v{ v1.x + v2.x, v1.y + v2.y };
return v;
}
The Preprocessor
Both C and C++ use a tool called the C preprocessor, which is tasked with handling preprocessor directives like #include. See more explanations in Section 4.2 of cs 146. See some discussions in Section 2.1 of cs 246e.
Including isn’t all that the C preprocessor can do, however. It can also define its own constants:
#include <iostream>
using namespace std;
#define GEESE 15
int main() {
cout << "I have " << GEESE << " geese!" << endl;
return 0;
}
Moreover, #if (ended by #endif) can be used to include or exclude code conditionally, which is useful for debugging.
Separate Compilation
Recall the definitions of declaration and definition.
We will split our code into two components:
- interface: Declarations, including function prototypes (with no actual code) and type definitions. Put in a separate file from the actual code, typically named with the extension .h (which stands for “header”).
- implementation: The full definition of every function, as well as space for any global variables. In this course, these files are typically named with the extension .cc. C files are always named .c.
Separate compilation: we can compile only one part of our whole program at any time.
$ g++ -c main.cc
$ g++ -c vec.cc
$ g++ main.o vec.o -o vecs
The -c option to g++ requests that a .cc file be compiled into a .o file, or an object file. Combining multiple .o files to create a program is called linking.
The extern keyword may be applied to a global variable, function, or template declaration. It specifies that the symbol has external linkage. (From docs.microsoft.)
An example from class:
// vec.h
struct Vec {
int x, y;
};
extern Vec origin;
Vec operator+(const Vec &v1, const Vec &v2);
Vec addToOrigin(const Vec &v);
// vec.cc
#include "vec.h"
Vec origin {0, 0};
Vec operator+(const Vec &v1, const Vec &v2) {
return {v1.x + v2.x, v1.y + v2.y};
}
Vec addToOrigin(const Vec &v) {
return v + origin;
}
// main.cc
#include "vec.h"
int main() {
Vec v {1,2};
origin.x = 12;
v = addToOrigin(v);
return 0;
}
This separation of declarations from definitions creates a system of dependencies.
Make and Makefiles
In the previous example, the relationship between our files created a system of dependencies, and those dependencies were making compiling our program a bit complicated.
Luckily, a tool exists to automate this tedium by combining the list of commands with the dependency relationships: make. Most often, the makefile directs make on how to compile and link a program. A makefile works on the principle that a file only needs recreating if one of its dependencies is newer than the file itself. (from wiki)
In a Makefile, each file is a target, that is, something this Makefile describes how to create.
The commands to create a target must be indented with tabs, even if you normally indent with spaces, so be careful!
The list of commands that creates a given target is called a recipe.
Phony target: a target that exists only for its recipe, and doesn’t actually build anything. The clean phony target is commonly used for cleanup commands:
.PHONY: clean
clean:
	rm *.o vecs
We can make our Makefile more flexible by using make variables.
As a bonus, make includes some default recipes, including a recipe for compiling .cc files into .o files, which uses the pre-defined make variables CXX and CXXFLAGS, so we don’t have to write those recipes ourselves.
The -MMD flag to g++ causes it to create .d files, which list make dependencies, whenever it compiles a .cc file.
Final version of makefile:
CXX=g++
CXXFLAGS=-std=c++14 -MMD
OBJECTS=main.o vec.o
DEPENDS=${OBJECTS:.o=.d}
EXEC=vecs
${EXEC}: ${OBJECTS}
	${CXX} ${OBJECTS} -o ${EXEC}
-include ${DEPENDS}
.PHONY: clean
clean:
	rm ${OBJECTS} ${DEPENDS} ${EXEC}
Preprocessor guards
Put preprocessor guards, also called include guards, into every header file. The general format of an include guard is:
#ifndef SOME_UNIQUE_MACRO_NAME_H
#define SOME_UNIQUE_MACRO_NAME_H
...
#endif
Never write using namespace std; in a header (.h) file. Always prefix each standard-library name with std:: in these files.
Object-Oriented Programming
Coupling and Cohesion
Cohesion measures the amount of “relatedness” that a module or unit of code contains.
Coupling measures the amount of dependency between units/modules.
Maximize cohesion and minimize coupling!
Procedural vs Object-Oriented Programming, Classes
The style of programming you have been doing so far in C/C++ is called procedural programming. It has this name because the code is organized in procedures that implement the logic of the program (which in C/C++ are the functions) and variables that hold the data.
In object-oriented programming (OOP), we implement our programs using objects. Objects are units of code that contain data and the procedures that implement the logic that operates over the data. In OOP, the data contained within an object are called attributes or member fields. The procedures of an object are called methods, operations, or member functions.
To create objects, we must first define classes. Classes are blueprints or type specifications that describe the contents of an object.
A class is a type. An instance of a class is called an object.
#include "student.h"
float Student::grade() {
return assns * 0.4 + mt * 0.2 + final * 0.4;
}
Note that in the implementation file for Student, we have to use the scope resolution operator ::, prefixed by the class name, to specify that the grade function is a method of the Student class.
It initializes s by passing in three integer values to the initialization list delimited by the curly braces ({}).
All class methods have a hidden, extra first parameter called this that is a pointer of the class type.
It turns out that C++ classes have a special type of method, called a constructor.
Name-of-Class-Type( parameter-list ) {
// necessary code
}
The rules in C++ as to when we can use = and when we can use () are rather arbitrary. C++ now has uniform initialization syntax using {}, which is meant to be used in nearly all initialization situations. From geeksforgeeks,
Uniform initialization is a feature in C++ 11 that allows the usage of a consistent syntax to initialize variables and objects ranging from primitive type to aggregates. In other words, it introduces brace-initialization that uses braces ({}) to enclose initializer values. The syntax is as follows:
type var_name{arg1, arg2, ..., arg n}
Big 5
By definition, a default constructor is a constructor that has 0 parameters.
3 steps when creating an object:
- allocate space
- construct the data fields
- run the constructor body
Special syntax: member initialization list (MIL). The member initialization list is written as part of the constructor implementation, between the closing parenthesis of the parameter list and the opening curly brace of the constructor body. It consists of a colon (‘:’) followed by a comma-separated list of data field names to be initialized, where each data field is initialized using uniform initialization syntax.
When the compiler supplies these methods for you, they behave as follows:
- default constructor: calls the default constructor on all fields that are objects
- copy constructor: copies all fields from the object passed in
- copy assignment operator: copies all fields from the object passed in
- destructor: does nothing by default
- move constructor: takes data from the object passed in
- move assignment operator: takes data from the object passed in
There’s one type of constructor with which you need to be careful: single-argument constructors. Unless told otherwise, the compiler may use them to implicitly convert an argument of the specified type into an object of the constructor type.
Disable implicit conversion by labelling all single-argument constructors with the explicit keyword.
Copy-and-swap idiom. article
#include <utility>
struct Node {
...
void swap( Node & other ) {
std::swap( data, other.data );
std::swap( next, other.next );
}
Node & operator=( const Node & other) {
Node tmp{ other };
swap( tmp );
return *this;
}
...
};
The compiler can carry out an optimization called copy/move elision, which omits unnecessary copy and move constructor calls.
If we compile the program with the g++ option -fno-elide-constructors, this optimization is turned off, and we can see how many copy and move constructor calls would be made without elision.
Static members are associated with the class itself, and not with any particular instance (object) of the class. For more discussion, see Section 6.1 of cs 146.
Static member functions don’t depend on a specific instance for their computation (they don’t have an implicit this parameter).
Advanced object uses: object arrays, constant object (an object whose fields cannot be modified).
physical constness: whether the actual bit pattern that makes up the object changes; logical constness: whether any change to the object is apparent to outsiders.
We want to be able to update numMethodCalls, even if the object is const. We declare the field mutable:
struct Student {
. . .
mutable int numMethodCalls = 0;
float grade() const { // can be const now
++numMethodCalls; // OK now
return . . . ;
}
};
Another keyword, which is not typically covered in cs 246 but is mentioned in cs241/cs246e: inline.
struct Vector {
...
const int &itemAt(size_t i) const;
// Will be called if the object is const
int &itemAt(size_t i);
// Will be called if object is non-const
};
inline const int &Vector::itemAt(size_t i) const { return theVector[i]; }
inline int &Vector::itemAt(size_t i) { return theVector[i]; }
inline suggests to the compiler that it replace the function call with the function body, which saves the cost of a function call.
This is an example of const overloading. Check Chapter 6 of cs 246e.
Encapsulation
To enforce invariants, we introduce encapsulation, which means that we make clients treat our objects as black boxes (capsules) that create abstractions in which implementation details are sealed away.
We implement encapsulation by setting the access modifier (or visibility) for each one of the members (fields or methods) of a class: private and public.
Discussion on struct vs. class.
We can have accessor (getX()) and mutator (setX()) methods.
friend in class:
class List {
public:
class Node { // Public nested class
...
friend class List; // List has access to all members of Node
};
private:
...
public:
...
};
friend function:
// vec.h
class Vec {
int x, y;
friend std::ostream &operator<<(std::ostream &out, const Vec &v);
};
//vec.cc
std::ostream &operator<<(std::ostream &out, const Vec &v) {
return out << v.x << ' ' << v.y;
}
However, in general, you should give your classes as few friends as possible. Friendships weaken encapsulation and thus should be used only if really necessary.
Design patterns
A design pattern is a codified solution to a common software problem.
Iterator
From wiki, iterator pattern is a design pattern in which an iterator is used to traverse a container and access the container’s elements.
for (auto it = lst.begin(); it != lst.end(); ++it) {
cout << *it << endl; // prints 3 2 1
}
for (auto n : lst) {
cout << n << endl;
}
UML
The Unified Modeling Language, known as UML for short, is a visual modeling language that lets people design and document a software system.
A class model is a diagram that visually represents a group of classes, and the relationships between them. A box contains class name, attributes, operations, comment.
Selected notations:
- An abstract operation is italicized.
- dependency: dashed line
- static attr/ops: underline
Relationships:
- association (from sourcemaking): An association indicates that objects of one class have a relationship with objects of another class, in which this connection has a specifically defined meaning (for example, “is flown with”).
- aggregation: Has-A. Shallow copy
- composition: Owns-A. Deep copy.
- generalization (or specialization): superclass and subclass. Is-A.
Inheritance
Three possible ways to keep different types in a single array: C unions, C void pointers, C++ inheritance.
class Book {
    ...
    virtual bool isHeavy() const;
};
class Text : public Book {
    ...
    bool isHeavy() const override;
};
class Comic : public Book {
    ...
    bool isHeavy() const override;
};
This is public inheritance. C++ also supports private and protected inheritance.
The ability to accommodate multiple types under one abstraction is called polymorphism.
int main() {
Book* books [] {
new Book{ ... },
new Book{ ... },
new Text{ ... },
new Text{ ... },
new Comic{ ...},
new Comic{ ...}
};
}
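Note that treating an array of objects polymorphically (e.g., accessing an array of Text objects through a Book pointer) can misalign the data, because element positions are computed from the size of Book rather than the size of Text. Instead, use an array of pointers.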
From wiki,
In programming languages, an abstract type is a type in a nominative type system that cannot be instantiated directly; a type that is not abstract – which can be instantiated – is called a concrete type. Every instance of an abstract type is an instance of some concrete subtype. Abstract types are also known as existential types.
Needs a pure virtual method:
class Foo {
...
public:
virtual void bar() const = 0;
};
To avoid circular includes, one can use forward declaration. Check page 112 of CS 246E for a full discussion.
Observer
The Observer design pattern is also known as Dependents or Publish-Subscribe.
See chapter 2 of the “Head First Design Patterns” book by Freeman and Freeman.
Decorator
The Decorator design pattern is intended to let you add functionality or features to an object at run-time rather than to the class as a whole.
See chapter 3 of the “Head First Design Patterns” book by Freeman and Freeman.
Factory
The Factory Method design pattern provides an interface for object creation, but lets the subclasses decide which object to create. It is also known as the Virtual Constructor.
The Abstract Factory design pattern provides an interface for creating families of related or dependent objects. It is also known as Kit.
See chapter 4 of the “Head First Design Patterns” book by Freeman and Freeman.
Strategy
The Strategy design pattern is intended to define a family of algorithms, encapsulating each one, and making them interchangeable. Allows the algorithm to vary independently of the client. Also known as Policy.
See chapter 1 of the “Head First Design Patterns” book by Freeman and Freeman.
Template
The Template Method design pattern defines the steps of an algorithm in an operation, but lets the subclasses redefine certain steps though not the overall algorithm’s structure.
See chapter 8 of the “Head First Design Patterns” book by Freeman and Freeman.
Visitor
The Visitor design pattern allows the programmer to apply an operation to be performed upon the elements contained in an object structure. New operations can be added without changing the elements on which it operates.
See 26.3 of CS 246E for an example of overloading and overriding.
See chapter 14, the appendix, of the “Head First Design Patterns” book by Freeman and Freeman.
Bridge
The Bridge design pattern decouples an abstraction from its implementation, allowing the two to vary independently. It is also known as Handle/Body.
See chapter 14, the appendix, of the “Head First Design Patterns” book by Freeman and Freeman.
MVC
The Model-View-Controller (MVC) architecture presents a solution to this common issue. In MVC, the program state, presentation logic, and control logic are all separated. Therefore, it is in theory possible to modify any one of them without modifying the other two.
STL
Template programming allows us to create parameterized classes (templates) that are specialized to actual code when we need to use them. See isocpp or problem 10 & problem 24 of CS 246E.
The Standard Template Library (STL) is a large collection of useful templates that already exist in C++.
Two important and useful class templates: vector and map. More: set, stack, queue, list…
Also function templates: for_each, find, count, copy, transform.
Exceptions
See problem 9 of CS 246E for an introduction.
Throw by value, catch by reference.
try {
throw 1;
} catch (...) {
...
}
Three levels of exception safety:
- Basic guarantee
- Strong guarantee
- No-throw guarantee.
Further topics
smart pointers
unique_ptr and shared_ptr in <memory>. See problems 14 & 23 of CS 246E.
Idioms
Resource Acquisition Is Initialization
In the Pimpl Idiom, the term “Pimpl” is short for “Pointer to Implementation”.
Casts
static_cast, reinterpret_cast, const_cast, dynamic_cast, dynamic_pointer_cast.
Refactoring
Function Objects and Lambdas
This is typically covered in the last lecture of cs 246 (in person version).
Function Objects:
class Plus1 {
public:
int operator() (int n) { return n + 1; }
};
. . .
Plus1 p;
p(4); // produces 5
vector <int> v { . . . };
int x = count_if(v.begin(), v.end(), [](int n) { return n % 2 == 0; });