CS 330: Management Information Systems

Kevin Lanctot

Estimated study time: 1 hr 37 min

Table of contents

Topic 1: IT Infrastructure

Readings: Management Information Systems: Managing the Digital Firm, 7th Canadian Edition, Chapter 5 (Laudon, Laudon and Brabston, 2013).

Computer Hardware Basics

What is a Computer?

At its most fundamental level, a computer is a machine that takes data as input, processes it, and produces output. A computer consists of four main components working together: an input device (such as a keyboard, mouse, touchscreen, or microphone), an output device (such as a monitor, speaker, or printer), a processor (the central processing unit, or CPU, which does the actual computation), and memory (which stores both the data and the instructions the processor needs).

Everything a computer handles — numbers, text, images, video, sound — is ultimately represented as binary data: sequences of 0s and 1s. A single binary digit is called a bit. The reason computers use binary is electrical: a wire can either carry a voltage (representing 1) or not (representing 0). By combining many such wires, computers can represent arbitrarily complex information.

Representing Data in Binary

A single bit can represent two values: 0 or 1. With two bits, there are four possible combinations (00, 01, 10, 11), and in general, n bits can represent 2^n different values. A group of 8 bits is called a byte, which can represent 2^8 = 256 different values — enough to encode, for example, all the letters of the English alphabet (both uppercase and lowercase), digits, punctuation, and various control characters.
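The 2^n rule is easy to verify directly — a quick sketch in Python:

```python
# n bits can represent 2**n distinct values
for n in (1, 2, 8, 16, 32):
    print(n, "bits ->", 2 ** n, "values")
# 8 bits (one byte) gives 2**8 = 256 values
```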

The standard encoding for English text is ASCII (American Standard Code for Information Interchange), which uses 7 bits to represent 128 characters. For example, the uppercase letter “A” is represented as 1000001 in binary (65 in decimal), and the lowercase “a” is 1100001 (97 in decimal). To handle the world’s many writing systems — Chinese, Arabic, Hindi, emoji, and more — the Unicode standard was developed, which uses up to 32 bits per character and can represent over a million distinct symbols.
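The ASCII codes above can be checked with Python's built-in `ord` and `bin` functions:

```python
# ASCII code points for 'A' and 'a'
print(ord('A'), bin(ord('A')))   # 65, 0b1000001
print(ord('a'), bin(ord('a')))   # 97, 0b1100001

# Unicode covers far more than ASCII's 128 characters
print(ord('€'))                  # 8364 — outside the 7-bit ASCII range
```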

When measuring larger quantities of data, we use standard prefixes. A kilobyte (KB) is roughly one thousand bytes (technically 1,024 = 2^10), a megabyte (MB) is roughly one million bytes, a gigabyte (GB) is roughly one billion bytes, and a terabyte (TB) is roughly one trillion bytes. For context, a typical digital photo might be 2–5 MB, while a feature-length HD movie might be 4–8 GB.

Representing Numbers

Computers represent integers using a fixed number of bits. An unsigned integer uses all its bits for magnitude: with 32 bits, values range from 0 to 2^32 − 1 (about 4.3 billion). A signed integer reserves one bit to indicate the sign (positive or negative), so a 32-bit signed integer ranges from −2^31 to 2^31 − 1 — roughly −2.1 billion to +2.1 billion.
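These ranges follow directly from the bit counts. A small sketch, assuming the usual two's-complement representation for signed integers (which gives one more negative value than positive):

```python
bits = 32
unsigned_max = 2 ** bits - 1          # 4,294,967,295 (about 4.3 billion)
signed_min = -(2 ** (bits - 1))       # -2,147,483,648 (two's complement)
signed_max = 2 ** (bits - 1) - 1      # 2,147,483,647
print(unsigned_max, signed_min, signed_max)
```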

For numbers with decimal points, computers use floating point representation. The number is broken into three parts: a sign bit, an exponent, and a mantissa (also called the significand). This is analogous to scientific notation — for example, 6.02 × 10^23 has a mantissa of 6.02 and an exponent of 23. In computing, the standard IEEE 754 format specifies how these parts are stored: a 32-bit (“single precision”) float provides about 7 significant digits of precision, while a 64-bit (“double precision”) float provides about 15–16 significant digits.

An important consequence of floating point representation is that not all decimal numbers can be represented exactly. The fraction 1/3 has no exact binary representation, and even 0.1 — which looks exact in decimal — has no finite binary form and cannot be stored precisely. This is why financial calculations often use integer arithmetic (counting cents) rather than floating point.
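The rounding error is visible in any language with binary floats; the cents-based and exact-decimal alternatives are shown alongside (using Python's standard `decimal` module):

```python
from decimal import Decimal

print(0.1 + 0.2)            # 0.30000000000000004 — binary rounding error
print(0.1 + 0.2 == 0.3)     # False

# Financial code often counts integer cents instead
price_cents = 1999          # $19.99 stored exactly as an integer

# Or uses exact decimal arithmetic
print(Decimal('0.1') + Decimal('0.2'))   # 0.3 — exact
```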

Representing Images and Sound

A digital image is composed of tiny dots called pixels (picture elements). Each pixel stores colour information — commonly as three values for red, green, and blue (RGB), each ranging from 0 to 255 (one byte per colour channel). A single pixel thus requires 3 bytes (24 bits) of storage. A 12-megapixel image — typical of a modern smartphone camera — contains 12 million pixels, requiring approximately 36 MB of raw storage. Compression algorithms like JPEG reduce this significantly by discarding information the human eye is less sensitive to.
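The 36 MB figure is just pixels times bytes per pixel:

```python
pixels = 12_000_000        # a 12-megapixel image
bytes_per_pixel = 3        # one byte each for red, green, blue
raw_bytes = pixels * bytes_per_pixel
print(raw_bytes / 1_000_000, "MB of raw storage")   # 36.0 MB
```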

Sound is represented by sampling the audio waveform at regular intervals. The standard for CD-quality audio is 44,100 samples per second (44.1 kHz), with each sample stored as a 16-bit value, in stereo (two channels). This works out to about 10 MB per minute of uncompressed audio. Compression formats like MP3 and AAC can reduce this by a factor of 10 or more with minimal perceptible quality loss.
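The per-minute figure follows directly from the sampling parameters:

```python
sample_rate = 44_100       # samples per second (CD quality, 44.1 kHz)
bytes_per_sample = 2       # 16 bits per sample
channels = 2               # stereo
seconds = 60
total = sample_rate * bytes_per_sample * channels * seconds
print(total / 1_000_000, "MB per minute")   # about 10.6 MB, uncompressed
```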

The Processor (CPU)

The central processing unit (CPU) is the “brain” of the computer. It executes the instructions that make up a software program. A CPU has several key components: the arithmetic logic unit (ALU), which performs mathematical operations (addition, subtraction, multiplication, comparison); registers, which are tiny, extremely fast storage locations that hold the data currently being processed; and the control unit, which coordinates the sequence of operations.

The CPU operates in a cycle: fetch an instruction from memory, decode it to determine what operation to perform, and then execute it. This cycle repeats billions of times per second in a modern processor.

Clock Speed

The speed at which a CPU’s clock “ticks” — really a square wave alternating between high and low voltage — is measured in hertz (Hz). One megahertz (MHz) equals one million ticks per second, and one gigahertz (GHz) equals one billion ticks per second, meaning one clock tick every billionth of a second. Modern desktop processors typically run at 3–5 GHz.
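A quick check of how long one tick lasts at typical clock speeds:

```python
# Duration of one clock tick, in nanoseconds
for ghz in (1, 3, 5):
    tick_ns = 1 / (ghz * 1e9) * 1e9
    print(f"{ghz} GHz -> {tick_ns:.2f} ns per tick")
# A 1 GHz clock ticks once per nanosecond; 5 GHz is a fifth of that
```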

When discussing computer performance, it helps to appreciate how small the relevant time intervals are:

Unit          Symbol   Fraction of a second
millisecond   ms       1/1,000 s
microsecond   μs       1/1,000,000 s
nanosecond    ns       1/1,000,000,000 s
picosecond    ps       1/1,000,000,000,000 s

However, clock speed alone does not determine performance. A processor with a lower clock speed but a more efficient architecture — one that can do more work per tick — may outperform a higher-clocked processor. Modern CPUs also contain multiple cores (essentially multiple processors on a single chip), allowing them to execute multiple instructions simultaneously.

Multicore Processors

A multicore processor places two or more independent processing cores on the same chip. A dual-core processor has two cores, a quad-core has four, and some high-end chips have 16, 32, or even more cores. Each core can execute its own stream of instructions, which allows the computer to run multiple programs simultaneously (or to split a single program across cores, if the software is designed to take advantage of parallelism).

Most modern processors also support hyper-threading (Intel’s term) or simultaneous multithreading (SMT), which allows each physical core to behave as if it were two logical cores, improving utilization when one thread is waiting for data from memory.

Memory and Storage

Where to Store Data?

A computer has many different storage devices, including static random access memory (SRAM) used for registers and caches, dynamic random access memory (DRAM, commonly just called RAM), solid state drives (SSD), hard drives (HD), USB flash drives, secure digital (SD) cards, digital versatile disks (DVD), and Blu-ray disks (BD). Why so many kinds? Because no single storage technology optimizes all the properties we care about — speed, cost, capacity, and durability — simultaneously.

The Processor-Memory Performance Gap

Historically, processor performance has been increasing much faster than memory performance. Data from Computer Architecture: A Quantitative Approach by Hennessy and Patterson shows this gap widening dramatically from 1980 to 2010: processor speeds grew by orders of magnitude while memory speeds improved only modestly. The result is that accessing memory — reading from it and writing to it — is often the bottleneck that limits overall system performance.

The Memory Hierarchy

The solution to this gap is the memory hierarchy, a layered arrangement of storage from fastest and most expensive at the top to slowest and cheapest at the bottom:

Type                       Speed     Capacity
Registers                  0.2 ns    100–300 B
Caches                     1–10 ns   10 KB – 10 MB
RAM                        100 ns    4 GB – 400 GB
Solid state drive          10 μs     256 GB – 1 TB
Hard drive                 10 ms     8 TB – 12 TB
Network storage            varies    varies
Off-site storage (cloud)   varies    varies

The guiding principle is to make the system appear to have large amounts of fast memory by storing commonly used data and instructions in fast memory and storing rarely used data and instructions in slow memory. This works because of the principle of locality: programs tend to access the same data and instructions repeatedly over short periods of time.

[Diagram: Memory hierarchy — Registers → Cache (L1/L2/L3) → RAM (DRAM) → Solid State Drive → Hard Disk Drive → Network/Cloud Storage. Faster levels are smaller and more expensive per bit; slower levels are larger and cheaper per bit.]

To put these speeds in perspective with a physical analogy: if memory were like sheets of paper and each clock tick were one inch, then CPU registers would be like an index card right beside you (1 inch), the L1 cache would be a single page at 1 inch away, the L2 cache would be 31 pages at 3 inches, the L3 cache would be 125 pages at 16 inches, main memory would be 200 books at 5 feet, disk/virtual memory would be 60,000 books at 950 miles (roughly a science library in St. Louis), and tertiary storage would be 100 million books at 1 million miles (roughly the Library of Congress on the Moon).

Types of Random Access Memory

There are two main types of RAM. Static RAM (SRAM) is expensive but very fast, and it is used for registers and cache memory. Dynamic RAM (DRAM) is less expensive but slower, and it is what we typically mean when we say a computer has “16 GB of RAM.” The key difference is that SRAM retains its contents as long as power is applied without any special action, while DRAM must be periodically refreshed — its contents re-read and re-written — because the electrical charges that represent the stored data gradually leak away.

Main Memory

Main memory includes registers, caches, and RAM. It is often simply called RAM because the defining characteristic of random access memory is the ability to directly access any randomly chosen address in roughly the same amount of time. Main memory is volatile — its contents disappear when power is lost. It stores all or part of the software program being executed, the data that program is using, and the operating system programs that manage the computer’s operation.

The processor can directly access data stored in main memory. Registers are memory directly connected to the ALU and are the fastest. Caches (L1, L2, L3, and possibly L4) store the most commonly used data and instructions, acting as a buffer between the extremely fast registers and the relatively slower RAM. When you click on a program or file, the operating system loads it from secondary storage into RAM so the processor can access it.

Secondary Storage

Secondary storage — also called external memory or external storage — is where programs and files reside when they are not actively being used. Unlike main memory, it is not directly accessible by the processor; data and programs must be copied into main memory before the processor can work with them. Secondary storage is slower and cheaper than main memory, but crucially it is non-volatile: its contents persist even when the power is turned off.

Secondary storage includes hard drives (HD), flash drives (SSD, SD cards, USB flash drives), and optical drives (CD, DVD, Blu-ray).

Hard Drives

A traditional hard disk drive (HDD) stores data magnetically on spinning platters — a set of disks stacked on top of each other, each with a smooth magnetic coating on both sides. The platters spin at thousands of revolutions per minute (RPM); common speeds are 5,400 RPM and 7,200 RPM, with higher RPMs allowing faster data access. An actuator arm moves across the disk surface to position the read/write heads, which change the orientation of the magnetic field at particular locations to represent 0s and 1s.

Video: Parts of a hard drive (0:00–2:15)

Video: A hard drive in action: booting up, deleting a folder, etc. (0:00–0:55)

Hard drive reliability is characterized by the Mean Time Between Failures (MTBF) — for a non-repairable drive, effectively the Mean Time To Failure (MTTF) — which is approximately 100,000 hours. The probability of failure follows a bathtub curve: drives are more likely to fail initially (due to manufacturing defects) and later in life (due to wear), with a relatively stable period in between.

The Annualized Failure Rate (AFR) for enterprise-grade drives (the kind a university like UW would purchase) is typically 0.7–0.8%, while consumer-grade drives have an AFR of about 1.25–1.89% when replaced every four years (as of 2019). An AFR of 1.89% means that roughly (1 − 0.0189)^4 × 100% ≈ 92.7% of drives would still be working after four years. The cloud storage company Backblaze, which operates over 100,000 hard drives, publicly tracks and publishes its failure data.
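The survival arithmetic can be checked directly:

```python
afr = 0.0189                       # annualized failure rate, consumer-grade
years = 4
survival = (1 - afr) ** years      # probability a drive survives all four years
print(f"{survival * 100:.1f}% of drives still working")   # ≈ 92.7%
```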

Solid State Drives (SSD)

A solid state drive is an alternative to a traditional hard disk. SSDs have several advantages: they are typically around 10 times faster for data access, they tend to last longer under normal use, and they have no moving parts that can wear out mechanically. On the other hand, SSDs are more expensive per gigabyte, their flash cells can wear out sooner than hard drives when large amounts of data are written repeatedly, and the stored data can fade over very long periods without power.

Hybrid Drives

A hybrid drive combines a smaller SSD (for speed) with a larger hard drive (for capacity at a lower price). Software on the hybrid drive tracks which files are accessed most frequently and places them on the SSD portion to achieve faster access for commonly used files. This is an instance of a common strategy in computer science: optimize for the common case. The price and performance of a hybrid drive falls between that of a pure HDD and a pure SSD.

Optical Drives

Optical drives — CDs, DVDs, and Blu-ray disks — are similar to hard drives in that they use a spinning disk, but they typically use only one surface (the bottom of the disc). Instead of a magnetic read/write head, optical drives use a laser and a mirror to read and write data. The smooth aluminum surface reflects light well to represent a 0, while the laser creates tiny pits on the surface that scatter light to represent a 1. Optical media is slower and has less capacity than a hard drive, but it is inexpensive and durable. These formats are becoming less common as cloud storage and streaming take over.

Assessing Storage Performance

When evaluating storage options, several key measures matter. Price per gigabyte favors hard drives — as of 2021, a 4 TB drive might cost around $100, working out to about 2.5 cents per GB, and prices continue to fall. Capacity also favors hard drives, which continue to get larger. Bandwidth (data transfer speed, measured in MB/s or GB/s) favors solid state drives. Durability is debatable: some argue DVDs are most durable, others favor SSDs. The right choice depends on the specific use case.

Will Adding RAM Improve Performance?

The answer is: it depends. If there is not enough RAM to hold all the programs and data currently in use, the operating system will resort to using secondary storage (the HD or SSD) as an extension of memory — a technique called virtual memory or paging. Since secondary storage is much slower to access, this significantly degrades performance. In this case, adding more RAM will help noticeably. However, if the operating system never needs to use this strategy (because there is already sufficient RAM), then adding more will provide no benefit. The solution is to monitor how much RAM is being used — in Windows 10, check the Task Manager; in macOS, check the Activity Monitor.

Specialty Computers

Mainframes

A mainframe computer is a large, powerful machine designed for three key characteristics: reliability (often achieved through redundant parts), the ability to hot swap components (replace a failing hard drive, for example, while the system continues running and processing data), and the ability to support many users simultaneously — for example, processing 100,000 bank transactions, credit card transactions, or airline reservations very quickly. Examples of mainframes include IBM zSystems, Unisys ClearPath Libra, Hewlett-Packard NonStop, and Fujitsu BS2000.

Supercomputers

A supercomputer is optimized for fast floating point computations. Its main use is for complex calculations such as simulations, weather forecasting, and scientific computations. Speed is measured in FLOPS (floating point operations per second). As of 2021, the fastest supercomputer could perform about 400 PFLOPS — that is, 400,000,000,000,000,000 floating point operations per second, more than a million times faster than a typical desktop computer. Supercomputers use on the order of 100,000 cores (processors), and the primary challenge is managing the data flow across all these cores. The University of Waterloo has access to supercomputing resources through SHARCNET.

Microcontrollers

A microcontroller is a simple processor with built-in RAM and I/O capabilities that can cost as little as $0.25. Microcontrollers are used in embedded systems — devices where the computer is part of a larger product, such as home appliances, digital watches, traffic lights, robots, and cars. By 2014, a typical car had the computing power of 20 personal computers, contained about 100 million lines of programming code, and processed up to 25 gigabytes of data per hour.

Evolution of IT Infrastructure

What is IT Infrastructure?

IT infrastructure refers to the shared technology resources that provide the platform for a firm’s information system applications. It includes investment in hardware, software, and services such as consulting, education, and training. IT infrastructure has evolved in five stages since the 1950s, and each configuration is still around today in some form.

Stage 1: Mainframe / Minicomputer (1959–present)

The earliest IT infrastructure was built around mainframes. These systems were very expensive and centralized — one system controlled by operators, owned by large corporations such as banks and insurance companies. Users later interacted with the mainframe directly via terminals (screens and keyboards with no processing power of their own). Minicomputers came along later as cheaper alternatives; a large university might have several minicomputers.

Stage 2: Personal Computers (1981–present)

The personal computer era brought a computer designed to be used by one person. Early PCs initially cost roughly $4,000 (adjusted for inflation) and could perform simple word processing, accounting, and game playing. Users were technically sophisticated in those early days. PCs started with text-based interfaces but eventually evolved to graphical user interfaces with a mouse. The software market was eventually dominated by Microsoft.

Stage 3: Client/Server (1983–present)

The client/server model divides computing between two types of machines. Clients — typically inexpensive devices — request and use services. Servers — typically more powerful and expensive machines — run applications and provide them to others over a network. Examples include Google searches, streaming music on Spotify, streaming video on YouTube, accessing lecture slides on Learn, and course selection on Quest. This remains the most common form of distributed computing architecture.

Peer-to-Peer (P2P)

Client/server is not the only distributed architecture. In a peer-to-peer (P2P) network, every machine in the network both consumes services (acting as a client) and provides services (acting as a server) at the same time. For example, on torrent sites, you can download files from other people’s computers while they can download files from yours. P2P networks are hard to control because there is no central computer “in charge.” P2P technology started out associated with software, music, and video piracy (such as Napster), but today it is also used legitimately for distributing software updates.

Stage 4: Enterprise Computing (1992–present)

Enterprise computing links together different networks and applications throughout the firm — a process sometimes called integration. This involves linking different types of hardware, different data formats, using internet protocols for the network, creating standards for data formats, and using software to translate between various formats.

Stage 5: Cloud and Mobile Computing (2000–present)

Cloud computing is an extension of the client/server model, but rather than having a dedicated server, resources come from a shared pool. These resources include clusters of computers, software (such as Gmail and Google Docs), and storage. Software can be sold as a service delivered over the internet — for example, Microsoft’s Office 365. Mobile computing extends this further: smartphones and tablets are small, portable devices whose location on the network changes as their users move around.

Drivers of Technology

The evolution of IT infrastructure has been driven by five key forces. Understanding these trends matters for business decision-making: when designing a product that will be available in 18 months, it is important to consider what hardware performance will be like at that time.

1. Moore’s Law

Moore’s Law is really a trend rather than a law. A transistor is a fundamental building block used to make computer chips. Moore’s Law states that the number of transistors that can fit on a chip doubles every 18 months. This trend has been interpreted in several equivalent ways: the power of microprocessors doubles every 18 months, computing power doubles every 18 months, and the price of computing falls by half every 18 months. Processing power can be expressed as how many millions of instructions a processor can execute in one second (MIPS). This trend held true from 1959, but as of 2010–2013, the rate began to slow down. As more transistors fit on a chip, the cost of a single transistor on that chip decreases — this has driven dramatic cost reductions since 1965.

2. The Law of Mass Digital Storage

The amount of digital information is roughly doubling every year, and this growth is exponential. Since 1990, the storage capacity of hard drives has increased at a rate of 65% per year. The cost of storing a gigabyte is falling at an exponential rate, being cut in half approximately every 15 months. (The course textbook claims this rate is “100% per year,” which does not make sense — it would be free after one year.)
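Halving every 15 months compounds quickly; a small sketch with a hypothetical $1.00/GB starting cost (the starting value is illustrative, not from the source):

```python
cost_per_gb = 1.00   # hypothetical starting cost in dollars per gigabyte
# Cost halves every 15 months
for months in (0, 15, 30, 45, 60):
    print(months, "months ->", round(cost_per_gb * 0.5 ** (months / 15), 4))
# After 5 years (60 months), cost has halved four times: 1/16 of the original
```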

3. Metcalfe’s Law

Metcalfe’s Law observes that the value of a network grows in proportion to the square of the number of network members — much faster than linearly. With 2 members, there is 1 possible connection; with 5, there are 10; with 12, there are 66. More precisely, a network of n members has n(n-1)/2 possible connections. This explains why social networks, communication platforms, and marketplaces become enormously more valuable as they grow — and why it is so hard for competitors to dislodge an established network.
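The connection counts are easy to verify:

```python
def connections(n):
    """Possible pairwise connections in a network of n members."""
    return n * (n - 1) // 2

for n in (2, 5, 12):
    print(n, "members ->", connections(n), "connections")   # 1, 10, 66
```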

4. Declining Communication Costs

Communication costs have been declining steadily. The cost per kilobit dropped from nearly $2.00 in 1995 to near zero by the early 2000s and has remained extremely low since. The lower the cost of communication, the more businesses rely on it to conduct operations.

5. The Creation of Technology Standards

The creation of technology standards allows competition, increases interoperability, and reduces costs. Examples include ASCII and Unicode for representing text, the Portable Operating System Interface (POSIX) for Unix and Linux, TCP/IP for interconnecting networks (the internet), Ethernet and Wi-Fi for connecting devices, and HTML and the World Wide Web for formatting and displaying content.

Infrastructure Components

There are seven major components of IT infrastructure, and the choices among them must be coordinated — a choice in one component affects the options available in the others.

[Diagram: IT infrastructure layers, top to bottom — Users / Business Processes; Enterprise Applications (ERP, CRM, SCM); Middleware / Data Management; Operating System Platform; Computer Hardware (CPU, Memory, Storage). Each higher layer depends on the ones below.]

1. Computer Hardware Platforms

Hardware platforms divide into two varieties of machines. Client machines include desktops, laptops, tablets, and smartphones. Server machines are specialized high-end computers — these could be a single mainframe or a large number of rack servers or blade servers (thin, modular computers without dedicated keyboards or monitors, stacked like books in a bookcase). Companies like Google, Facebook, and Amazon have server farms — collections of hundreds of thousands of rack or blade servers stored in large, windowless, air-conditioned rooms. This design minimizes the physical space required.

2. Operating System (OS) Platforms

The operating system manages a computer’s hardware and software resources: the processor, memory, peripherals, files, and applications. For laptops and desktops (as of Q1 2020), 88.6% of PCs ran Microsoft Windows, 9.4% ran macOS, and 1.5% ran Linux. For smartphones and tablets, 71.3% ran Android (acquired by Google, based on Linux) and 28.3% ran iOS. For servers (as of April 2020), 69.7% of servers in the United States ran Unix or Linux, while 30.3% ran Windows.

3. Enterprise Applications (EA)

Enterprise applications are computer programs used by organizations that integrate business applications and services across many different departments. For example, a central database and set of programs used by Sales and Marketing, Finance and Accounting, Human Resources, and Manufacturing and Production — such as Quest at the University of Waterloo. Previously, departments maintained their own separate databases, making it hard to combine data across the organization. The largest suppliers of enterprise software are SAP, Oracle, IBM, and Microsoft.

4. Data Management and Storage

A database management system (DBMS) organizes and stores the company’s data. Open-source options like MySQL are available free of charge. A database server may be needed to run the DBMS, particularly if it must be accessible by several machines or through the internet. The leading database software providers are Oracle, IBM (DB2), Microsoft (SQL Server), and Sybase. For data storage, the major types are hard drives, solid state drives, and cloud-based storage (historically, tape drives were also popular). Organizations can use Redundant Array of Independent Disks (RAID) to improve hard disk performance, reliability, or both. The hard drive market is dominated by three companies: Western Digital, Seagate, and Toshiba. Tape drives, once popular for remote offsite backup (archiving) due to their portability, have been mostly replaced by cloud-based storage.

RAID Storage Architecture

RAID uses many drives together to achieve improvements in reliability, availability, performance, and capacity. There are currently seven officially recognized levels (RAID 0 through RAID 6), plus many non-standard levels. Each achieves a different balance of these four properties.

RAID 0 (disk striping) breaks a file into fixed-size blocks and distributes them across multiple disks. For example, with two disks and a 24 KB file divided into six 4 KB blocks, odd-numbered blocks go to disk 1 and even-numbered blocks go to disk 2. When reading the file, both disks work simultaneously — disk 1 retrieves block 1 at the same time disk 2 retrieves block 2 — yielding roughly twice the bandwidth of a single disk (and with three disks, roughly three times). The downside is decreased reliability: if either disk fails, every file striped across the array is corrupted. There is a fundamental tradeoff between performance and reliability.
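The block distribution is a simple round-robin. A minimal sketch — the `stripe` helper is hypothetical, not real controller logic:

```python
def stripe(data: bytes, n_disks: int, block_size: int = 4096):
    """Distribute a file's fixed-size blocks across n_disks, round-robin (RAID 0 sketch)."""
    disks = [[] for _ in range(n_disks)]
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    for i, block in enumerate(blocks):
        disks[i % n_disks].append(block)   # block 0 -> disk 0, block 1 -> disk 1, ...
    return disks

# A 24 KB file striped across two disks in 4 KB blocks: three blocks per disk
disks = stripe(b"x" * 24 * 1024, n_disks=2, block_size=4 * 1024)
print([len(d) for d in disks])   # [3, 3]
```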

RAID 1 (disk mirroring) stores a complete copy of the data on another disk. This improves reliability (if one disk fails, use the copy) and can improve read performance (if one disk is busy, read from the other). The tradeoff is decreased capacity: RAID 1 uses twice as much space to store a file. Some schemes combine two RAID levels — for example, RAID 1+0 and RAID 0+1 both use striping and mirroring.

RAID levels 2 through 6 use parity to detect and possibly correct errors. Even parity adds an extra bit (0 or 1) at the end of a sequence of bits to ensure the total number of 1s is even. For example, EvenParity(1001000) = 10010000 (already has an even number of 1s, so add 0), while EvenParity(1001001) = 10010011 (odd number of 1s, so add 1). Parity can detect a single error (or any odd number of errors) in the stored data.
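The even-parity rule can be written directly. A toy sketch on bit strings — real RAID controllers compute parity with XOR across whole blocks, not character counting:

```python
def even_parity(bits: str) -> str:
    """Append a parity bit so the total number of 1s is even."""
    return bits + ('1' if bits.count('1') % 2 else '0')

print(even_parity('1001000'))   # 10010000 — already even, append 0
print(even_parity('1001001'))   # 10010011 — odd number of 1s, append 1
```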

Data Backup

Two approaches protect against data loss. Online backup (also called hot backup) provides instant, real-time backup that protects against the failure of a single drive — RAID 1 and RAID 5 are examples. Offline backup (archiving) involves copying data at the end of the day and transferring it to a different location — for example, backing up to the cloud or to tape. Offline backup protects against complete failure but can only recover data from one day ago or more. Organizations must choose between full backups (copying everything) and incremental backups (copying only what has changed since the last backup).

5. Network and Telecom Platforms

A network is a group of computers linked together to share resources such as printers and network drives. Common network hardware components include a hub (sends data to all connected devices), a bridge (has one input and one output, examines the destination and decides whether to forward), a switch (has many ports, examines the destination and decides which port to send on), and a router (like a switch but connects different networks together). A firewall is hardware or software (or both) placed between the internal network and the internet to prevent unauthorized access.

Computers connect to networks using a Network Interface Card (NIC), which typically supports Ethernet, Wi-Fi, Bluetooth, or near field communication (NFC). Leading network hardware providers include Cisco, HPE/Aruba, Juniper, Huawei, and Arista.

A Network Operating System (NOS) historically referred to an OS with networking capabilities — managing users, groups, file sharing, printer access, and security. These capabilities are now standard in every desktop OS. Today, the term NOS typically refers to an operating system made specifically for network hardware such as firewalls or routers.

The telecom platform also includes telephone and cell phone services, telephones, cell phones, telephone systems (PBXs), automated attendants, call centre software, and fax machines. Major Canadian telecom service vendors include Rogers, Bell, Telus, and Shaw, plus regional carriers.

6. Internet Platforms

An Internet Service Provider (ISP) provides the link from your home or company network to the rest of the internet. The larger ISPs are typically telephone or cable companies that own the telephone line and cable that runs to your home or office (the “last mile”). In Canada, about 40 smaller regional ISPs lease network access from the bigger ISPs and provide their own customer service and tech support. Major Canadian ISPs include Rogers, Bell, and Shaw.

For website development, simple websites use languages like HTML (hypertext markup language) and JavaScript. Simple websites are typically static — the site does not change unless a person edits the web page files. More sophisticated websites are dynamic — when a client makes a query, a web page is created using a combination of scripts and database queries to provide the most recent and relevant information. For example, when you click on a YouTube video, the page displays the current view count, likes, and latest comments, all of which change over time. Programming languages for dynamic web pages include PHP, ASP.NET (by Microsoft), and JSP/Java (by Oracle). Many large companies also use artificial intelligence to learn what content to present to each user.

To host a website, you need a server (a fairly powerful computer), a domain name and an IP address, and web server software that accepts requests from web browsers. As of September 2020, the most common web servers were Apache (36.3% market share), nginx (32.5%), and Cloudflare Server (15.9%). Alternatively, you can use template-based services like Wix, content management systems like WordPress, or build your own from scratch if you have a programming background.

7. Service Platform

A service platform is the collection of services that enable the information system to function, chiefly consulting and system integration services. Most firms cannot develop their systems without significant outside help, including identifying which parts of the business can be improved by using IT, ensuring new systems integrate with legacy systems, maintenance, training, and security.

A Note on Bundling

The term “server” can refer to hardware (the server machine), software (the server application), or both. Some IT components may be bundled: machines might come with preinstalled operating systems, the enterprise application platform may dictate the server machines needed, some enterprise applications are bundled with their own DBMS, some require dedicated server machines, and some come with a service package (integration, maintenance, and training).

Contemporary Hardware Trends

Eight contemporary hardware trends are shaping the IT landscape.

Trend 1: The Mobile Digital Platform

Increasingly, internet access happens via highly portable devices — smartphones and tablets — requiring businesses to provide apps rather than just websites. Smartphones are taking over the functions of many other electronic devices, such as GPS devices, cameras, and music players. The integration of voice (the telephone network) and data (computer networks) is bringing together two historically distinct global networks.

Trend 2: Consumerization of IT and BYOD

BYOD — Bring Your Own Device — is a policy that allows employees to bring their own personal devices to work and use consumer software services such as Gmail, Google, Facebook, and Twitter for business purposes. The key trend is the consumerization of IT: technology originally meant for consumers moves into the business world. This presents companies with challenges around boundaries (what can and cannot be used), security, software availability, ownership, and privacy.

Trend 3: Grid Computing

Processors are idle most of the time — you can verify this by looking at the Task Manager in Windows or the Activity Monitor in macOS. Grid computing harnesses this idle capacity by organizing the computational power of a network of computers to simulate a supercomputer. The machines may be geographically remote and run different operating systems. The benefit is the ability to tackle problems that require short-term access to large computational capacity. SETI@home was a famous example of grid computing.

However, grid computing has a fundamental limitation: only tasks that can be broken up into smaller independent tasks (parallelized) can take advantage of it. For example, finding all occurrences of a keyword in a large collection of documents is easily parallelized — each computer searches a subset. But computing the Fibonacci number F(x) for some large x is inherently sequential, since each value depends on the previous two.
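The keyword-search example above can be sketched in a few lines of Python: a `multiprocessing.Pool` stands in for the grid, and each worker counts occurrences in its own document independently before a cheap final combine. The document collection and keyword here are made up for illustration.

```python
from multiprocessing import Pool

# Toy "document collection" split across workers, standing in for
# machines on a grid; the texts and keyword are hypothetical.
DOCUMENTS = [
    "grid computing pools idle processors",
    "the fibonacci sequence is inherently sequential",
    "idle capacity can simulate a supercomputer",
    "searching documents is easily parallelized",
]

def count_keyword(doc: str, keyword: str = "idle") -> int:
    """Each worker independently counts the keyword in its own document."""
    return doc.split().count(keyword)

if __name__ == "__main__":
    with Pool(processes=2) as pool:
        per_doc = pool.map(count_keyword, DOCUMENTS)
    # The cheap final step: combine the independent partial results.
    print(sum(per_doc))  # total occurrences of "idle" across all documents
```

Fibonacci offers no such decomposition: each worker would have to wait for the previous two results, so adding machines does not help.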

Trend 4: Virtualization

Virtualization is the creation of a virtual (simulated) version of something — a hardware platform, an operating system, a storage device, or network resources — rather than the actual physical version. It takes several forms: “one looks like something else” (a machine running macOS can act as if it is running Windows), “many looks like one” (many smaller hard drives configured to look like one large drive, as in RAID 0), and “one looks like many” (a single powerful server configured to look like many smaller computers, each potentially running a different OS).

In a typical computer, the hardware is managed by the operating system, and application software interacts with the hardware through the OS. Virtualization adds a layer: software that simulates hardware — creating a virtual machine — which can then run its own operating system and applications. For example, to run a Linux application on a Windows computer, you could create virtual hardware using software like VMware or VirtualBox, install Linux on it, and run the application in that virtual environment.

Benefits of virtualization include better resource management (using more of the processor’s capacity, reducing space, expense, and energy), support for legacy applications (by running older versions of an OS), and testing (software can be tested on a variety of virtual configurations without needing physical hardware for each).

Trend 5: Cloud Computing

Cloud computing is the leasing of hardware, programming tools, or software from another company, accessed over the internet, as a service. There are three main categories: infrastructure as a service (IaaS) — processing or storage hardware, such as MS OneDrive, Apple iCloud, or Google Drive; platform as a service (PaaS) — development tools and libraries, such as machine learning frameworks; and software as a service (SaaS) — complete applications, such as Microsoft 365, Google Docs, Hotmail, or Gmail.

Cloud computing offers several advantages: lower cost for covering peak demand, convenience (use as needed), and flexibility (not tied to a fixed number of machines or types of OS). But there are also concerns: privacy (less control over where data is stored), liability (Google Cloud experienced six outages in one year), legal compliance (must comply with Canadian privacy laws), and general loss of control. A hybrid cloud approach uses your own infrastructure for mission-critical systems while using the cloud for peak demand and less critical workloads.

On-Demand (Utility) Computing

On-demand computing is a form of cloud computing where firms off-load peak demand to remote, large-scale data processing centers. Firms pay only for the computing power they use, much like an electrical utility. This is excellent for businesses with spiked demand curves — for example, an online shopping website on Black Friday, or a concert ticket sales platform — and saves firms from purchasing excessive infrastructure to handle occasional peaks.

Meeting Peak Demand: Three Options

Consider a scenario where an accounting system handles an average of 10,000 transactions per day with peak demand of 20,000 during tax season. Three technical options are available. Load balancing distributes the workload evenly across many servers — for example, four servers, each with a capacity of 6,000 transactions per day, so each operates at about 41.7% capacity on a normal day (2,500 of 6,000 transactions) and 83.3% at peak (5,000 of 6,000). This handles crashes, upgrades, and seasonal peaks gracefully, but requires purchasing and maintaining hardware that is rarely fully utilized. Cloud computing provides additional capacity as needed without owning extra hardware. On-demand computing specifically off-loads peak demand to external data centers, paying only for what is used.
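A minimal sketch of the load-balancing option, assuming the four-server scenario above: a round-robin dispatcher hands each incoming transaction to the next server in turn, so the load stays even. The server names are hypothetical.

```python
from itertools import cycle

# Four servers, each with a (hypothetical) capacity of 6,000
# transactions per day, as in the scenario above.
SERVERS = ["server-1", "server-2", "server-3", "server-4"]
CAPACITY_PER_SERVER = 6000

def dispatch(num_transactions: int) -> dict:
    """Round-robin: hand each transaction to the next server in turn."""
    load = {s: 0 for s in SERVERS}
    ring = cycle(SERVERS)
    for _ in range(num_transactions):
        load[next(ring)] += 1
    return load

# Normal day: 10,000 transactions -> 2,500 per server (~41.7% utilization).
normal = dispatch(10_000)
# Tax-season peak: 20,000 transactions -> 5,000 per server (~83.3%).
peak = dispatch(20_000)
print(normal["server-1"] / CAPACITY_PER_SERVER)
```

Real load balancers use more sophisticated policies (least connections, health checks), but the even spread is the core idea.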

Trend 6: Green Computing

Green computing is the design and use of computer systems in a way that minimizes their impact on the environment. This involves four R’s: reduce power consumption (servers alone account for about 1% of global power), reduce standby power (also known as vampire power), reuse parts to repair older computers, and recycle e-waste (old cell phones, old laptops, and other electronics).

An important consideration when recycling is to sanitize the device — erase all data — before disposing of it. Simply deleting files is not enough, as the data may still be recoverable. Different methods are required for solid-state drives (SSDs) versus hard disk drives (HDDs), and many devices have built-in commands to securely erase themselves. You can verify your method worked by running a file recovery program (such as Recuva) afterward to check whether any files remain.

Trend 7: High-Performance and Power-Saving Processors

Modern processors feature multicore designs where individual cores can disconnect from power when not in use. Energy-efficient designs (using fewer transistors) are common in cell phones and tablets, where battery life is a critical concern.

Trend 8: Autonomic Computing

Computer systems have become so complex that the cost of managing them has risen significantly — a substantial portion of a company’s IT budget is spent preventing or recovering from system crashes, and the most common cause of crashes is operator or user error. Autonomic computing is an industry-wide effort to develop systems that are capable of self-management: self-configure, self-protect, self-optimize, and self-heal. Examples include P2P systems like Skype and the internet (where the network continues functioning if some nodes go down) and S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology) hard drives that can predict their own failure.

Future Hardware Technology

Nanotechnology

Nanotechnology is the science of using nanostructures to build devices. A nanometer is a billionth of a meter, roughly the size of a few atoms or a small molecule. Nanotechnology uses individual atoms and molecules to create computer chips and other devices. As of 2021, a transistor is about 14 nanometers wide (roughly 70 silicon atoms), and the practical limit for silicon-based approaches appears to be around 5 nanometers. Researchers are exploring new materials and approaches to create even smaller transistors or their replacements.

Quantum Computing

Classical computers use bits — two different voltage levels — to represent data. Quantum computers use a quantum property called entanglement of a group of electrons to represent data. The properties of n entangled electrons (called n qubits) can be described by a probability distribution over 2^n different states. For a single electron, a state could be the direction of its spin in a magnetic field (spin up or spin down). Quantum computers can store a large amount of information in a small number of qubits and can minimize the number of steps needed to arrive at a result for certain problems — such as factoring large numbers, which is the basis of much modern cryptography. However, quantum computers are not a universal replacement for classical computers; they excel at specific types of problems.

Contemporary Software Trends

Trend 1: Open-Source Software

Open-source software is source code that is publicly available and can be modified and redistributed by anyone. Different standards exist: the Free Software Foundation (FSF), started in 1985, emphasizes that software should be free to use, share, and modify. The Open Source Initiative (OSI), started in 1998, may add some restrictions to make open-source commercially viable — for example, requiring that any modifications be made publicly available.

Open-source software is often developed and maintained by a worldwide network of programmers under the management of user communities, though it could be maintained by a single person. Sometimes a company funds an open-source challenger to a competitor’s product (as Google funded Firefox to compete with Microsoft’s Internet Explorer), or a company may open-source a product it no longer supports (as Sun Microsystems did with StarOffice, now called OpenOffice). Notable examples of open-source software include the operating system Linux, the web browser Firefox, the web server Apache, the database MySQL, the programming language Python, the media player VLC, the audio editor Audacity, and the office suites OpenOffice and LibreOffice.

The benefits of open-source software include lower cost, potentially more security and fewer bugs (because many people inspect the code), flexibility (you can modify it), transparency (you know exactly what the code does), and freedom from vendor lock-in. The drawbacks include that open-source software is less likely to be easy to use, to meet specific customer needs, to be compatible with particular hardware, or to come with professional support.

Trend 2: HTML and HTML5

HTML stands for hypertext markup language. “Hypertext” refers to text containing links to other texts that you can access quickly, and “markup language” refers to a way of annotating and presenting text — bold, italics, titles, subtitles, and so on. HTML originally did not support audio and video, requiring third-party plugins. The latest version, HTML5 (finalized in 2014), includes native support for audio and video.

Trend 3: Web Services and SOA

Web services are software components that exchange information with each other using web communication standards and languages. The web provides well-known and well-supported standards for presenting information. Web browsers use HTML, which specifies how text and graphics are displayed. A generalization of HTML is eXtensible Markup Language (XML), which can also specify what the data means — for example, indicating that “$16,800” represents a price in Canadian dollars by wrapping it in a tag like <PRICE CURRENCY="CAD">$16 800</PRICE>.

XML provides a format for different programs — potentially on different platforms, at different companies — to exchange information. The use of web services to achieve integration among different applications and platforms is referred to as service-oriented architecture (SOA). SOA is a cost-effective way to adopt new technology and integrate different applications. For example, a car rental company can interact with an airline’s website, a tour company’s website, and a travel reservation system by converting its information to web standards, allowing customers to book a flight, rent a car, and book a tour all at the same website.
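A short sketch of why the markup matters: a program reading the PRICE element above can recover both the value and its meaning. This uses Python's built-in ElementTree parser, with the price written as a plain number and wrapped in a hypothetical ORDER record for simplicity.

```python
import xml.etree.ElementTree as ET

# The PRICE element from the example above, inside a small (hypothetical) record.
xml_data = '<ORDER><PRICE CURRENCY="CAD">16800</PRICE></ORDER>'

root = ET.fromstring(xml_data)
price = root.find("PRICE")
# Unlike plain HTML, the markup tells the program what the data means:
print(price.text, price.get("CURRENCY"))  # 16800 CAD
```

Any program on any platform that agrees on the tag names can exchange this data, which is exactly what SOA exploits.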

Trend 4: Software Outsourcing

Firms have several options for acquiring software. They can purchase a customizable generic software package (such as SAP or Oracle-PeopleSoft), contract custom software development or maintenance to a third party (which could be located in another country — this started with maintenance and data entry but now also includes developing new software), or use software available from the cloud, called software as a service (SaaS) — for example, Salesforce.com for customer relations management.

Management Issues

1. Dealing with Change

Firms need to be able to grow or shrink their IT infrastructure as needed. Scalability is the ability to expand (or contract) to serve a larger or smaller number of users without the system breaking down.

2. Management and Governance

Who is responsible for the IT infrastructure? There are several models: each department has its own IT group (decentralized), one overall IT group for the whole company (centralized), or a mixture of both — for example, a central IT group decides on the email server while each department decides on its own course websites.

3. Infrastructure Investments

Total Cost of Ownership (TCO)

There are different ways to estimate the total cost of ownership (TCO) of IT infrastructure. A common rule of thumb: the acquisition costs for hardware and software represent only about 20% of the TCO (ranging from 20–35% depending on what is purchased). TCO is like an iceberg — you only see the tip.

TCO can be broken down into capital expenditure (fixed, one-time costs to acquire the system) and operational expenditure (ongoing expenses for running it). The components that contribute to TCO include:

Component          Cost
Hardware           Computers, cables, terminals, storage, printers
Software           Operating systems, applications
Installation       Staff to install computers and software
Training           Time and people for both developers and end users
Support            Ongoing technical support and help desks
Maintenance        Upgrades for hardware and software
Infrastructure     Networks and backup units
Downtime           Lost productivity during system failures
Space and Energy   Real estate, furniture, and utility costs

Costs can also be categorized as direct IT costs — costs the company pays for explicitly (hardware, software, printer paper, internet, facilities, support, training) — and indirect IT costs — costs due to lost productivity (downtime, poorly trained users, user mistakes, using the computer for non-business purposes, users installing unauthorized software).

A Nash Networks report estimated costs for a PC over a 3–4 year lifetime: purchase costs at $3,090, deployment at $500, operations at $1,040, support at $1,680, and retirement at $630, for a total of $6,940 and approximate annual costs of $2,000. For large companies (2,500+ employees) in 2007, the average IT operating budget was 5.5% of revenue, IT capital budget was 2.5% of revenue, and IT operating budget per employee was $9,100. IT spending broke down to hardware (26%), software (20%), support including staff and contractors (41%), and telecommunications (13%).

One way to reduce costs is through computer management policies. At the “unmanaged” extreme, users can install any application and change any setting, resulting in the highest indirect costs ($5,867 total per PC). At the “locked and well-managed” extreme, users cannot install software or change critical settings, reducing total costs to $3,413 per PC. The more a computer is managed, the less the indirect costs.

Competitive Forces Model

When deciding how much to spend on IT, firms should consider six factors: the demand for services (what services do customers, suppliers, and employees need?), business strategy (what new capabilities are required?), IT strategy (how will IT help achieve these goals?), IT assessment (is the current infrastructure too old or too new?), competitor’s services (what do competitor firms offer?), and competitor’s IT investments (how much have they spent?).

Topic 2: Databases

Readings: Management Information Systems, Chapter 6 (Laudon, Laudon and Brabston, 2013).

What is a Database?

A database is an organized collection of related data. Databases are everywhere — they store student records (Quest), library catalogues, online store inventories, banking information, health records, and much more. At the University of Waterloo, Quest is backed by a database that tracks students, courses, grades, and enrolments.

The term data refers to raw facts — for example, a list of items scanned at a supermarket checkout. Information is data shaped into a form that is meaningful to human beings — for example, a report showing which items are selling well and which are not. A database stores data so that it can be efficiently transformed into useful information.

The Traditional File Approach

Before databases existed, organizations used a traditional file approach: each department maintained its own files using its own application programs. The Sales department had its own customer files, the Accounting department had its own, and so on. This led to several serious problems.

Data redundancy occurs when the same data is stored in multiple files across the organization. For example, both the Sales department and the Billing department might store a customer’s address. Data inconsistency is a consequence of redundancy: when data is updated in one file but not in others, the copies become inconsistent. If a customer moves and updates their address with Sales but not with Billing, the organization now has two different addresses for the same customer.

The traditional approach also suffers from a lack of data sharing (it is difficult to combine data from different departments), program-data dependence (the programs are tightly coupled to the specific file format, so any change to the file structure requires changing all programs that use it), and inflexibility (ad hoc queries are difficult because a programmer must write a new program for each new question).

The Database Approach

The database approach solves these problems by storing all of the organization’s data in a single, centralized database managed by a database management system (DBMS). The DBMS sits between application programs and the physical data files. It provides a level of data abstraction — application programs interact with a logical view of the data rather than directly with the files. This eliminates program-data dependence, reduces redundancy and inconsistency, and enables data sharing across the organization.

A DBMS provides several key capabilities: it defines and creates the structure of the database, it allows users and programs to store, retrieve, and update data, it provides a query language for ad hoc questions, and it manages concurrent access by multiple users.

The Relational Database Model

The most common type of database is the relational database, which organizes data into two-dimensional tables called relations (or simply tables). Each table represents one type of entity — for example, a Suppliers table, a Parts table, or an Orders table. Each row of a table is called a record (or tuple), and each column is called a field (or attribute).

Every table must have a primary key — a field (or combination of fields) whose value uniquely identifies each record. For example, in a Students table, the student ID number is the primary key because no two students share the same ID. A foreign key is a field in one table that refers to the primary key of another table, creating a link between the two tables. For example, in an Orders table, a Supplier_ID field would be a foreign key referencing the Suppliers table.
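In SQL, these keys are declared when a table is created, and the DBMS then enforces them. A minimal sketch using Python's built-in sqlite3 module (note that SQLite checks foreign keys only after the PRAGMA); the table and column names follow the supplier example below.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked

conn.execute("""
    CREATE TABLE Supplier (
        Supplier_Number INTEGER PRIMARY KEY,  -- uniquely identifies each row
        Supplier_Name   TEXT
    )""")
conn.execute("""
    CREATE TABLE Orders (
        Order_Number    INTEGER PRIMARY KEY,
        Supplier_Number INTEGER,
        -- foreign key: each order must reference an existing supplier
        FOREIGN KEY (Supplier_Number) REFERENCES Supplier (Supplier_Number)
    )""")

conn.execute("INSERT INTO Supplier VALUES (8259, 'CBM Inc.')")
conn.execute("INSERT INTO Orders VALUES (1, 8259)")      # OK: supplier exists
try:
    conn.execute("INSERT INTO Orders VALUES (2, 9999)")  # no supplier 9999
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # the DBMS rejects orders that point at no supplier
```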

Example: A Supplier Database

Consider a simple database for tracking suppliers and the parts they provide:

Supplier Table:

Supplier_Number   Supplier_Name   Supplier_Street    Supplier_City   Supplier_Province   Supplier_PostalCode
8259              CBM Inc.        74 Brook St.       Waterloo        ON                  N2L 1N5
8261              B.R. Molds      1277 Gandalf Dr.   Toronto         ON                  M6H 1V7
8263              Jackson Parts   8 Jefferson St.    Rochester       NY                  14618
8444              Bryant Corp.    Route 60           Kitchener       ON                  N2G 1V6

The primary key is Supplier_Number — it uniquely identifies each supplier.

Parts Table:

Part_Number   Part_Name      Unit_Price
137           Door latch     22.00
145           Side mirror    12.00
150           Door molding   6.00
152           Door lock      31.00
178           Seal strip     10.00

The primary key is Part_Number.

Orders Table:

Order_Number   Order_Date   Part_Number   Supplier_Number   Quantity
…              …            137           8259              10
…              …            145           8444              250
…              …            152           8263              7

The Orders table uses Part_Number and Supplier_Number as foreign keys to link orders to the Parts and Suppliers tables.

Normalization and Redundancy

Why not combine everything into one big table? Because that would introduce enormous redundancy. Every time supplier 8259 appears in an order, the supplier’s name, address, city, province, and postal code would be repeated. If the supplier moves, every row containing that supplier would need to be updated — and if even one row is missed, the data becomes inconsistent.

Normalization is the process of organizing data to minimize redundancy. The goal is that each piece of information is stored in exactly one place. When data needs to be combined from multiple tables, the DBMS performs a join operation using the relationships defined by primary and foreign keys.

Structured Query Language (SQL)

SQL (Structured Query Language) is the standard language for interacting with relational databases. It allows users to create tables, insert data, update data, delete data, and — most importantly — query data.

Basic SQL Queries

The fundamental SQL query uses SELECT, FROM, and WHERE:

  • SELECT specifies which columns to include in the result
  • FROM specifies which table(s) to query
  • WHERE specifies conditions that filter the rows

For example, to find all suppliers located in Ontario:

SELECT Supplier_Name, Supplier_City
FROM Supplier
WHERE Supplier_Province = 'ON';

To find all parts with a unit price greater than $15:

SELECT Part_Name, Unit_Price
FROM Parts
WHERE Unit_Price > 15;

Joining Tables

The real power of SQL comes from combining data across multiple tables. To find out which supplier provides part number 137:

SELECT Supplier.Supplier_Name, Parts.Part_Name, Orders.Quantity
FROM Orders, Supplier, Parts
WHERE Orders.Supplier_Number = Supplier.Supplier_Number
  AND Orders.Part_Number = Parts.Part_Number
  AND Parts.Part_Number = 137;

This join operation links rows from three tables based on matching primary and foreign key values.
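The same join can be run end-to-end with Python's built-in sqlite3 module, using a subset of the sample data above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Build a small in-memory copy of the supplier database from above.
cur.executescript("""
    CREATE TABLE Supplier (Supplier_Number INTEGER PRIMARY KEY, Supplier_Name TEXT);
    CREATE TABLE Parts    (Part_Number     INTEGER PRIMARY KEY, Part_Name     TEXT);
    CREATE TABLE Orders   (Part_Number INTEGER, Supplier_Number INTEGER, Quantity INTEGER);

    INSERT INTO Supplier VALUES (8259, 'CBM Inc.'), (8444, 'Bryant Corp.');
    INSERT INTO Parts    VALUES (137, 'Door latch'), (145, 'Side mirror');
    INSERT INTO Orders   VALUES (137, 8259, 10), (145, 8444, 250);
""")

# The same join as above: match foreign keys to primary keys.
cur.execute("""
    SELECT Supplier.Supplier_Name, Parts.Part_Name, Orders.Quantity
    FROM Orders, Supplier, Parts
    WHERE Orders.Supplier_Number = Supplier.Supplier_Number
      AND Orders.Part_Number = Parts.Part_Number
      AND Parts.Part_Number = 137
""")
print(cur.fetchall())  # [('CBM Inc.', 'Door latch', 10)]
```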

Aggregate Functions

SQL provides functions for computing summary statistics: COUNT (number of rows), SUM (total of values), AVG (average), MIN (smallest value), and MAX (largest value). The GROUP BY clause groups rows before applying these functions. For example, to find the total quantity ordered for each part:

SELECT Part_Number, SUM(Quantity) AS Total_Ordered
FROM Orders
GROUP BY Part_Number;

Other SQL Operations

INSERT INTO adds new records, UPDATE modifies existing records, and DELETE removes records. The ORDER BY clause sorts results, and LIKE enables pattern matching (using % as a wildcard). SQL also supports subqueries — queries nested inside other queries.

Database Design

Designing a database involves several key decisions: what entities to represent (what tables to create), what attributes each entity has (what columns each table needs), and how entities relate to each other (what foreign keys to define). Three types of relationships exist between entities:

  • One-to-one (1:1): Each record in table A corresponds to exactly one record in table B. Example: each student has exactly one transcript.
  • One-to-many (1:N): Each record in table A can correspond to many records in table B, but each record in table B corresponds to exactly one record in table A. Example: each supplier can have many orders, but each order comes from one supplier. This is the most common relationship type.
  • Many-to-many (M:N): Records in table A can correspond to many records in table B, and vice versa. Example: students and courses — each student takes many courses, and each course has many students. Many-to-many relationships require a third “junction” table (like an Enrolment table with Student_ID and Course_ID as foreign keys).
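A minimal sketch of the junction-table approach, using Python's built-in sqlite3 module; the student names and course codes are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student   (Student_ID INTEGER PRIMARY KEY, Name  TEXT);
    CREATE TABLE Course    (Course_ID  TEXT    PRIMARY KEY, Title TEXT);
    -- Junction table: one row per (student, course) pairing.
    CREATE TABLE Enrolment (
        Student_ID INTEGER REFERENCES Student (Student_ID),
        Course_ID  TEXT    REFERENCES Course (Course_ID)
    );

    INSERT INTO Student   VALUES (1, 'Avery'), (2, 'Morgan');
    INSERT INTO Course    VALUES ('CS330', 'MIS'), ('CS338', 'Databases');
    -- Each student takes many courses; each course has many students.
    INSERT INTO Enrolment VALUES (1, 'CS330'), (1, 'CS338'), (2, 'CS330');
""")

rows = conn.execute("""
    SELECT Student.Name, Course.Title
    FROM Enrolment
    JOIN Student ON Enrolment.Student_ID = Student.Student_ID
    JOIN Course  ON Enrolment.Course_ID  = Course.Course_ID
    WHERE Course.Course_ID = 'CS330'
    ORDER BY Student.Name
""").fetchall()
print(rows)  # [('Avery', 'MIS'), ('Morgan', 'MIS')]
```

The M:N relationship is thus reduced to two 1:N relationships, one on each side of the junction table.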

An entity-relationship diagram (ER diagram) is a visual tool for depicting the entities, their attributes, and the relationships between them during the database design process.

Ensuring Data Quality

Data quality is a persistent challenge. Common problems include errors during data entry (misspellings, wrong numbers), missing data (blanks or “unknown” values), inconsistent data (the same information recorded differently in different places), and outdated data (information that was once correct but is no longer). Organizations can improve data quality through data cleansing (also called data scrubbing) — using software to detect and correct errors — and by establishing data governance policies that define who is responsible for data quality, what standards data must meet, and how data quality is measured and maintained.

Big Data and Business Intelligence

Organizations are collecting data at unprecedented rates. Big data refers to datasets so large, fast-moving, or complex that traditional data processing methods cannot handle them. Big data is often characterized by three V’s: volume (the sheer amount of data), velocity (the speed at which new data arrives), and variety (the different formats — structured tables, unstructured text, images, video, sensor readings).

Business intelligence (BI) refers to tools and techniques for turning raw data into useful information for decision-making. BI encompasses data warehousing (consolidating data from multiple sources into a single repository), online analytical processing (OLAP) for exploring data from multiple dimensions, and data mining (using algorithms to discover patterns and trends).

Data warehouses pull data from various operational databases, clean and standardize it, and store it in a format optimized for analysis rather than day-to-day transactions. Data mining uses statistical and machine learning techniques to find patterns in large datasets — for example, discovering that customers who buy product A also tend to buy product B (market basket analysis), or predicting which customers are likely to cancel their subscriptions (churn prediction).

Topic 3: Networking

Readings: Management Information Systems, Chapter 7 (Laudon, Laudon and Brabston, 2013).

What is a Network?

A computer network is a collection of two or more computers connected together to share resources. Networks enable communication (email, messaging), resource sharing (printers, files, databases), and collaboration (shared documents, video conferencing). Without networks, the modern digital world — from the internet to cloud computing — would not exist.

Key Networking Concepts

Protocols

A protocol is a set of rules for communication between two parties. In everyday life, there are protocols for introductions, phone calls, and letters. In networking, protocols define the format of messages, the order in which messages are sent, and the actions taken when a message is received. Multiple protocols work together in layers — each layer handles a specific aspect of communication.

The Internet Protocol Suite (TCP/IP)

The internet runs on a family of protocols called TCP/IP (Transmission Control Protocol / Internet Protocol). IP is responsible for addressing — giving each device on the internet a unique IP address (like a mailing address) so data can be routed to the correct destination. An IPv4 address consists of four numbers separated by dots (e.g., 192.168.1.1), each ranging from 0 to 255. Since the world has run out of IPv4 addresses, the newer IPv6 standard uses 128-bit addresses (written as eight groups of four hexadecimal digits), providing enough addresses for every grain of sand on Earth.
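Python's standard ipaddress module illustrates the two address formats; the sample addresses are arbitrary (2001:db8:… is a range reserved for documentation).

```python
import ipaddress

v4 = ipaddress.ip_address("192.168.1.1")
v6 = ipaddress.ip_address("2001:db8:85a3::8a2e:370:7334")

print(v4.version, v6.version)
# An IPv4 address is 32 bits (four numbers, 0-255); IPv6 is 128 bits.
print(v4.max_prefixlen, v6.max_prefixlen)  # 32 128
# Out-of-range octets are rejected:
try:
    ipaddress.ip_address("256.1.1.1")
except ValueError:
    print("invalid IPv4 address")
```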

TCP is responsible for reliability — breaking messages into packets, ensuring all packets arrive at the destination, reassembling them in the correct order, and requesting retransmission of any lost packets.

Packet Switching

Networks transmit data using packet switching: messages are broken into small packets, each of which independently travels through the network to the destination, possibly taking different routes. At the destination, the packets are reassembled into the original message. This is fundamentally different from the old telephone system’s circuit switching, where a dedicated connection was established for the entire duration of a call. Packet switching is more efficient because network capacity is shared among many users.
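A toy simulation of the idea: break a message into small numbered packets, let them arrive in any order, and reassemble them by sequence number. Real protocols add headers, checksums, and retransmission, but the split-route-reassemble cycle is the same.

```python
import random

def packetize(message: str, size: int = 4):
    """Break a message into numbered packets of `size` characters each."""
    return [(seq, message[i:i + size])
            for seq, i in enumerate(range(0, len(message), size))]

def reassemble(packets):
    """Sort packets by sequence number and rejoin them into the message."""
    return "".join(chunk for _, chunk in sorted(packets))

message = "packet switching shares capacity"
packets = packetize(message)
random.shuffle(packets)                # packets may arrive in any order
print(reassemble(packets) == message)  # True
```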

The Domain Name System (DNS)

Humans find it difficult to remember IP addresses, so the Domain Name System (DNS) provides human-readable names. When you type “www.google.com” into your browser, a DNS server translates that name into the corresponding IP address (e.g., 142.250.80.46). DNS is essentially the internet’s phone book.

URLs

A URL (Uniform Resource Locator) specifies the location of a resource on the internet. It has several parts: the protocol (e.g., https), the domain name (e.g., www.uwaterloo.ca), and optionally a path to a specific page or resource.
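Python's standard urlparse function pulls a URL apart into exactly these pieces; the URL here is hypothetical.

```python
from urllib.parse import urlparse

# A hypothetical URL, split into the parts described above.
url = "https://www.uwaterloo.ca/courses/cs330?term=fall"
parts = urlparse(url)

print(parts.scheme)  # https             (the protocol)
print(parts.netloc)  # www.uwaterloo.ca  (the domain name)
print(parts.path)    # /courses/cs330    (the path to a resource)
print(parts.query)   # term=fall         (optional query parameters)
```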

Types of Networks

Networks are classified by their geographic scope:

  • LAN (Local Area Network): covers a small area like a room, floor, or building. Most LANs use Ethernet (wired) or Wi-Fi (wireless) technology.
  • MAN (Metropolitan Area Network): covers a city or campus — for example, the University of Waterloo’s campus network.
  • WAN (Wide Area Network): covers a large geographic area — a country or the entire world. The internet is the largest WAN.

The Internet

The internet is a global network of networks — millions of private, public, academic, and government networks linked together using the TCP/IP protocol suite. No single organization owns or controls the internet; it is a decentralized system. The internet’s physical infrastructure includes fibre optic cables (including undersea cables connecting continents), satellites, cellular towers, and routing equipment.

The World Wide Web

The World Wide Web (WWW) is not the same as the internet. The internet is the underlying network infrastructure; the web is a service that runs on top of it. The web uses HTTP (Hypertext Transfer Protocol) to transfer web pages, which are written in HTML and can contain text, images, video, and links to other pages. The web was invented by Tim Berners-Lee at CERN in 1989.

Other Internet Services

Besides the web, the internet supports many other services: email (using SMTP, POP3, and IMAP protocols), file transfer (FTP), remote access (SSH), instant messaging, voice and video calls (VoIP), streaming media, and more.

Network Security Basics

As covered more extensively in Topic 8, networks face various security threats including unauthorized access, eavesdropping (sniffing), and denial-of-service attacks. Firewalls filter traffic between an internal network and the internet. Virtual Private Networks (VPNs) create encrypted tunnels through the public internet, allowing secure remote access to a private network. Encryption protects data in transit from eavesdroppers.

Topic 4: Management Information Systems

Readings: Management Information Systems, Chapter 1 (Laudon, Laudon and Brabston, 2013).

What is a Management Information System?

A management information system (MIS) deals with the planning, development, management, and use of information technology tools to help people perform tasks related to information processing and management. More specifically, an information system (IS) is a set of interrelated components that collect, process, store, and distribute information to support decision-making and control in an organization.

An information system contains information about significant people, places, and things within the organization or in the environment surrounding it. Three basic activities produce the information organizations need: input captures raw data from the organization or its environment, processing converts raw data into a meaningful form, and output transfers the processed information to the people who will use it. A fourth activity, feedback, returns output to appropriate members of the organization to help evaluate or correct the input stage.

Data, Information, and Knowledge

Three related but distinct concepts form a hierarchy. Data consists of raw facts — for example, a list of items scanned at a supermarket checkout. Information is data shaped into a form that is meaningful to human beings — for example, a report showing total sales per product category this month. Knowledge goes further: it is the discovery of patterns, rules, and contexts where information is useful — for example, understanding that customers are more likely to buy an item that is placed at eye level on a grocery store shelf.

Why Information Systems Matter

Information systems are not just a technical tool — they are a strategic asset. Organizations that use information systems effectively gain a competitive advantage. IS can improve decision-making (by providing timely and accurate information), increase operational efficiency (by automating routine tasks), enable new products and services (such as online banking or e-commerce), and improve customer relationships (through better understanding of customer needs and behaviour).

The Dimensions of Information Systems

Information systems have three dimensions: technology (hardware, software, data management, and networking), organizations (the structure, culture, and business processes of the firm), and people (the managers, knowledge workers, and end users who interact with the system). A common mistake is to think of IS as purely a technology issue — in reality, the organizational and human dimensions are equally important.

Information Systems and Business Strategy

Information systems can help firms achieve several strategic objectives: operational excellence (streamlining processes to reduce costs and improve efficiency), new products and services (creating offerings that were not previously possible), customer and supplier intimacy (building closer relationships through better information sharing), improved decision-making (providing managers with accurate and timely data), competitive advantage (doing things better or differently than competitors), and survival (meeting regulatory or industry requirements that cannot be met without IS).

Porter’s Competitive Forces Model

Michael Porter’s competitive forces model identifies five forces that shape competition in any industry: the threat of new entrants, the bargaining power of suppliers, the bargaining power of buyers, the threat of substitute products or services, and rivalry among existing competitors. Information systems can be used strategically to address each of these forces — for example, raising barriers to entry, reducing supplier power through better information, or differentiating products through IT-enabled features.

[Figure: Porter's Five Forces — rivalry among existing industry competitors at the centre, surrounded by the threat of new entrants, the threat of substitute products, the bargaining power of suppliers, and the bargaining power of buyers/customers.]

Porter’s Value Chain Model

Porter’s value chain model views the firm as a series of activities that add value to its products or services. Primary activities directly relate to producing and delivering the product: inbound logistics, operations, outbound logistics, marketing and sales, and service. Support activities enable the primary activities: firm infrastructure, human resource management, technology development, and procurement. Information systems can improve performance at any point in the value chain.

Topic 5: Business Processes and Types of Information Systems

Readings: Management Information Systems, Chapters 2 and 12 (Laudon, Laudon and Brabston, 2013).

Business Processes

A business process is a collection of activities required to produce a product or a service. Examples include hiring an employee, fulfilling an order, creating a marketing plan, and manufacturing a product. Business processes can span multiple functional areas — for example, fulfilling an order may involve Sales (taking the order), Manufacturing (producing the product), and Shipping (delivering it).

Information systems can improve business processes by automating manual steps, enforcing consistency, providing real-time visibility, and enabling entirely new processes that were not possible before.

Types of Information Systems

Different levels of an organization have different information needs, and different types of information systems serve those needs.

Transaction Processing Systems (TPS)

Transaction processing systems serve the operational level of the organization. They record the routine, day-to-day transactions that are necessary for conducting business — sales, purchases, payroll, inventory changes, and so on. TPS are typically highly structured and automated. Examples include point-of-sale systems in a retail store, airline reservation systems, and banking systems that process deposits and withdrawals. TPS provide the raw data that feeds into all other types of information systems.

Management Information Systems (MIS)

Management information systems (in the narrow sense) serve middle managers. They provide routine reports and summaries of transaction-level data — for example, a weekly sales report broken down by region, or a monthly inventory status report. MIS typically answer structured, recurring questions: “How did sales compare to last month?” rather than “Why did sales decline in the Western region?”

Decision Support Systems (DSS)

Decision support systems serve middle and senior managers who need to make semi-structured or unstructured decisions. Unlike MIS, which provide fixed reports, DSS are interactive and flexible — they allow managers to ask “what-if” questions and explore different scenarios. For example, a DSS might allow a logistics manager to model the impact of opening a new warehouse in a different city, changing shipping routes, or adjusting inventory levels. DSS often use mathematical models and data analysis tools.

Executive Support Systems (ESS)

Executive support systems (also called executive information systems) serve senior executives. They provide a high-level, consolidated view of the organization’s performance, often through dashboards and visualizations that highlight key performance indicators. ESS draw data from both internal sources (MIS, DSS, TPS) and external sources (market data, industry reports, news) to help executives make strategic decisions.

Enterprise Systems

An enterprise system (also called an Enterprise Resource Planning or ERP system) integrates all of the organization’s business processes and data into a single, unified system. Before enterprise systems, each department might have had its own separate information system — Sales used one system, Manufacturing used another, HR used a third, and so on. This made it difficult to share data and coordinate activities across departments.

[Figure: ERP architecture — the Finance & Accounting, HR Management, Manufacturing, and Sales & Distribution modules all share one central ERP database.]

An ERP system uses a single, central database that is shared by all departments. When a salesperson enters an order, the system automatically updates inventory, schedules production, calculates the financial impact, and plans shipping — all in real time. Major ERP vendors include SAP, Oracle, and Microsoft.

The benefits of enterprise systems include eliminating data redundancy and inconsistency, providing a unified view of the organization, improving coordination across departments, and enabling more informed decision-making. The challenges include high cost (both software and implementation), complexity (ERP implementations can take years), and rigidity (the organization may need to change its business processes to fit the software).

Supply Chain Management (SCM)

A supply chain is the network of organizations and business processes involved in producing and delivering a product — from raw materials to the final customer. Supply chain management (SCM) systems help firms manage relationships with suppliers, optimize inventory levels, and coordinate production and distribution.

Effective SCM requires sharing information across organizational boundaries. The bullwhip effect is a classic supply chain problem: small fluctuations in consumer demand are amplified as they move up the supply chain, causing suppliers and manufacturers to experience much larger swings in orders than the actual change in demand warrants.

Customer Relationship Management (CRM)

Customer relationship management (CRM) systems help firms manage their interactions with current and potential customers. CRM systems consolidate customer data from multiple sources (sales, service, marketing, social media) to provide a unified view of each customer. This enables personalized marketing, better customer service, and more effective sales processes. Major CRM vendors include Salesforce, Oracle, SAP, and Microsoft.

Topic 6: Organizations and Information Systems

Readings: Management Information Systems, Chapter 3 (Laudon, Laudon and Brabston, 2013).

How Organizations Use Information Systems

Organizations use information systems to achieve greater efficiency, to gain competitive advantage, and to transform the way they do business. But the relationship between organizations and IT is bidirectional — organizations shape how IT is used, and IT reshapes organizations.

Organizational Structure

Organizations have different structures that affect how IS are used. A hierarchical (or tall) organization has many levels of management between the top executive and frontline employees. A flat organization has few levels. Information systems can enable flatter organizations by giving managers access to more information, reducing the need for middle management layers that primarily pass information up and down.

Organizational Culture

Organizational culture — the shared values, beliefs, and practices of an organization — significantly affects how IS are adopted and used. An organization with a culture of openness and collaboration will more readily adopt knowledge-sharing systems. An organization with a culture of departmental silos will resist systems that require cross-departmental data sharing.

Organizational Politics

Different groups within an organization may have competing interests. IS implementations often shift power and resources, which can lead to resistance. A new information system that makes certain jobs obsolete, changes established workflows, or redistributes access to information will face political opposition regardless of its technical merits.

The Impact of IS on Organizations

Information systems have transformed organizations in several ways. Flattening — reducing the number of management levels by giving senior managers direct access to operational data. Separation of work from location — enabling telecommuting and distributed teams. Increased flexibility — allowing organizations to respond more quickly to market changes. Reorganization of workflows — automating manual processes and eliminating unnecessary steps. Changing management processes — providing better data for decision-making, enabling more evidence-based management.

Ethical Implications

Information systems raise important organizational questions: Who owns the data? Who has access to it? How should employee monitoring be balanced against privacy? What happens to employees whose jobs are automated? These questions have no purely technical answers — they require organizational policies and ethical frameworks.

Topic 7: Social, Ethical, and Legal Issues

Readings: Management Information Systems, Chapter 4 (Laudon, Laudon and Brabston, 2013).

Ethics and Information Systems

Information systems raise new ethical questions because they create opportunities for intense social change, threatening existing distributions of power, money, rights, and obligations. Technology is a “double-edged sword” — it can be used to achieve social progress, but it can also be used in ways that harm individuals and communities.

A Framework for Ethical Analysis

The course presents a model for thinking about ethical, social, and political issues raised by information systems. Five moral dimensions are particularly relevant: information rights and obligations (privacy and freedom in the internet age), property rights and obligations (intellectual property protection), accountability and liability (who is responsible when things go wrong), system quality (what standards of data and system quality should be expected), and quality of life (what values should be preserved in an information- and knowledge-based society).

Ethical Principles

Several ethical principles can guide decision-making in technology contexts. The Golden Rule: do unto others as you would have them do unto you. Immanuel Kant’s Categorical Imperative: if an action is not right for everyone to take, it is not right for anyone. The utilitarian principle: take the action that achieves the greatest good for the greatest number of people. The risk aversion principle: take the action that produces the least harm or the least potential cost. The “no free lunch” rule: if something someone else has created is useful to you, it has value and you should assume the creator wants compensation.

Concern 1: Personal Information and Privacy

What Information is Collected?

Internet companies collect vast amounts of personal information. When you search the web, browse websites, make online purchases, or use social media, your activities are tracked and recorded. Websites use cookies — small text files stored on your computer — to identify and track users across visits. Even when you delete cookies, websites can re-identify you through browser fingerprinting (analyzing the unique characteristics of your browser and system configuration) or by linking old and new cookies when you log in to the same account.

How is that Information Used?

Personal information collected online is used for advertising products and services you may be interested in, tailoring the content you see (suggesting articles and videos, which can create an echo-chamber effect where you are only exposed to viewpoints similar to your own), making hiring decisions, setting insurance coverage and premiums, offering preferential pricing, identifying security risks, and solving crimes. When combined with other information, these data points create a detailed profile of each person.

Nonobvious relationship awareness (NORA) technology combines information from multiple sources — telephone listings, customer lists, mailing lists, credit card purchases, and data brokers — to create detailed profiles and discover hidden connections between people.

[Figure: NORA (Nonobvious Relationship Awareness) — data from multiple sources is combined to reveal hidden connections between people.]

Privacy Laws in Canada

Canada’s Personal Information Protection and Electronic Documents Act (PIPEDA), enacted in 2000, establishes rules for how private-sector organizations collect, use, and disclose personal information. PIPEDA is based on ten fair information principles, including accountability, identifying purposes, consent, limiting collection, limiting use/disclosure/retention, accuracy, safeguards, openness, individual access, and challenging compliance. Provincial legislation — such as Alberta’s PIPA, British Columbia’s PIPA, and Quebec’s private-sector privacy law — provides substantially similar protections.

Strategies for Protecting Privacy

Simply deleting cookies once a week has limited effectiveness — if you have previously logged into a website with old cookies and then log in again with new cookies, the website can match both sets to the same person and continue accumulating information. Browser fingerprinting and IP addresses can also be used to tentatively associate old and new cookies.

A more practical strategy is to use two browsers: one for day-to-day access (accounts you must sign into), and a second for more private browsing (where you delete cookies frequently, block third-party cookies, and never log into any accounts). You could also keep an old computer dedicated to private access, using software like BleachBit to periodically sanitize it, and cycling your IP address by restarting your modem. Using a computer in a lab or library for some searches provides another layer of separation — your identity can be tracked to the organization but not to you personally (though if you did something illegal, the institution’s login records could still identify you).

To view cookies in Chrome, click the lock icon on the left side of the address bar, or go to chrome://settings/siteData. In Firefox, click the ‘i’ icon on the left side of the address bar, or go to about:preferences#privacy and select Manage Data in the Cookies and Site Data section.

Concern 2: Digital Property Rights

What is Intellectual Property?

Intellectual property (IP) is intangible property — a recipe, a song, an invention, software — created by individuals or corporations. It can be protected by one of three legal traditions: trade secrets, copyrights, or patents.

A trade secret is intellectual work or product belonging to a business that is not in the public domain, that confers economic advantage, and for which reasonable attempts have been made to keep it secret. Examples include the recipe for Coca-Cola and the layout of a chemical plant. The risk is a breach of confidentiality. Most End User License Agreements (EULAs) prohibit the reverse engineering of software to protect trade secrets.

A copyright protects original literary, musical, artistic, dramatic works, and computer software. It prohibits copying the entire work or parts of it for at least 50 years. Copyrighting the “look and feel” of a device remains a legally murky area, as demonstrated by the Apple v. Microsoft case (1994, Mac OS vs. Windows 2.0) and Apple v. Samsung (2011, smartphones and tablets). Obtaining a copyright is relatively simple — you can place “Copyright 2021” on a work and mail a copy to yourself as a registered letter (which you do not open).

A patent grants the owner an exclusive monopoly on the ideas behind an invention for between 17 and 20 years. It is intended to promote innovation by protecting investments made to commercialize inventions. Key concepts include originality (you created it), novelty (it is a new idea), and invention (it is useful). Patents can offer protection in all 160 countries that are members of the World Trade Organization (WTO). Notably, you cannot patent software in Canada, though you can in the United States.

Challenges to IP Rights

The internet has made it easy to copy and distribute intellectual property. Perfect digital copies cost almost nothing to create, sharing digital content costs almost nothing, a web page may present data from many sources, and websites and software for file sharing are hard to regulate.

Canada’s Response

The Copyright Modernization Act (2011) addresses these challenges. It prohibits circumventing digital locks. Time shifting, format shifting, and backup copies are permitted as long as there are no digital locks. The act includes fair dealing provisions for education, satire, and parody. Damages for non-commercial infringement (such as illegally downloading music and videos) are limited to between $100 and $5,000.

The act includes a notice-and-notice provision (in effect as of January 1, 2015): copyright holders notify the ISP about infringement, the ISP notifies the customer (without revealing the customer’s identity to the copyright holder), and the copyright holder must obtain a court order for the ISP to reveal a customer’s identity.

Concern 3: Data and System Quality

No large program is error-free. Even if the probability of an error in any single line is low, scale makes bugs inevitable — Windows 10 has about 55 million lines of code and macOS has about 87 million. It is impossible to test every combination of hardware, settings, features, and inputs for a large program within any reasonable amount of time, so software producers knowingly ship products with bugs. The number of bugs can reach a steady state: in the process of fixing existing bugs and adding new features, new bugs are created. The largest source of error, however, is poor data quality rather than faulty hardware or software.

A notable example: in 2011, Intel temporarily halted shipments of a new chip platform due to a design flaw that could cause 5% of chips to fail over the next three to five years. The estimated cost to Intel was $1 billion, including the cost of fixing nearly half a million desktops and laptops already sold.

Concern 4: Accountability and Liability

Software is typically licensed, not sold. Most EULAs limit the producer’s liability. In law, publishers of books and magazines are not legally liable for their content, to allow for freedom of expression. When software acts more like an information provider (e.g., a book publisher), the producer is not liable. When software acts more like a service provider (e.g., a machine controller), the producer can be held liable.

Concern 5: Quality of Life

Information systems have negative social costs. They blur work-home boundaries — employees are expected to do more work at home with company laptops and cell phones. They create a centralized control structure — companies such as Google, Facebook, Amazon, and Microsoft dominate the collection of personal information. The rapidity of change driven by globalization means companies must respond very quickly to environmental changes. Organizations become dependent on IS — many companies are vulnerable to any failure in their information systems, yet these systems are not regulated. Cybercrime has opened whole new areas of crime, and institutions have been slow to respond. Other concerns include job loss and repetitive stress injury / carpal tunnel syndrome.

Topic 8: Security

Readings: Management Information Systems, Chapter 8 (Laudon, Laudon and Brabston, 2013).

Secure Communication

Basic Idea

Encryption is the process of rendering a message unreadable so that anyone intercepting it will not be able to determine the original message. Decryption is the reverse — retrieving the original message from the encrypted version. The strength of an encryption method depends on the number of possible keys — more possible keys means it takes longer for an attacker to try them all.

Consider a simple example: pick a key of length one — say, the number 3 — and add 3 to each letter. The plaintext “meet me after the toga party” becomes the ciphertext “phhw ph diwhu wkh wrjd sduwb.” Because the key has length one, each plaintext letter always maps to the same ciphertext letter (‘e’ always becomes ‘h’, ‘t’ always becomes ‘w’), making the cipher easy to break using frequency analysis.

Using a longer key makes the cipher harder to break. With a key of length five — say (3, 6, 5, 2, 4) — each group of five letters is shifted by different amounts, so the same plaintext letter can map to different ciphertext letters depending on its position. This is called symmetric key encryption: the same key is used to encrypt and decrypt the message.
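A minimal Python sketch of this shift cipher follows; spaces and punctuation are passed through unchanged, and the key repeats over the letters of the message.

```python
def shift_encrypt(plaintext: str, key: list[int]) -> str:
    """Shift each lowercase letter forward by the key value for its position;
    the key repeats, and non-letters pass through unchanged."""
    out, i = [], 0
    for ch in plaintext:
        if ch.isalpha():
            out.append(chr((ord(ch) - ord("a") + key[i % len(key)]) % 26 + ord("a")))
            i += 1
        else:
            out.append(ch)
    return "".join(out)

def shift_decrypt(ciphertext: str, key: list[int]) -> str:
    """Decryption is encryption with the shifts reversed (same symmetric key)."""
    return shift_encrypt(ciphertext, [-k for k in key])

print(shift_encrypt("meet me after the toga party", [3]))
# phhw ph diwhu wkh wrjd sduwb
print(shift_encrypt("meet me after the toga party", [3, 6, 5, 2, 4]))
```

With the length-five key, the two ‘e’s in “meet” encrypt to different ciphertext letters, which is what defeats simple frequency analysis.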

Brute force search means trying every possible key to find the actual key. The difficulty grows exponentially with key size:

Key Size (bits)    Number of Possible Keys    Average Time at 10^12 Attempts/sec
32                 2^32 ≈ 4.3 × 10^9          2.15 milliseconds
56                 2^56 ≈ 7.2 × 10^16         10 hours
128                2^128 ≈ 3.4 × 10^38        5.4 × 10^18 years
168                2^168 ≈ 3.7 × 10^50        5.9 × 10^30 years
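For the single-shift cipher above, the entire key space is only 26 keys, so a brute force search finishes instantly. In the sketch below, checking for the word “the” is a crude stand-in for recognizing readable English.

```python
def caesar_decrypt(ciphertext: str, shift: int) -> str:
    """Undo a single-shift cipher; non-letters pass through unchanged."""
    return "".join(
        chr((ord(c) - ord("a") - shift) % 26 + ord("a")) if c.isalpha() else c
        for c in ciphertext
    )

ciphertext = "phhw ph diwhu wkh wrjd sduwb"
for shift in range(26):                      # try every possible key
    candidate = caesar_decrypt(ciphertext, shift)
    if " the " in f" {candidate} ":          # crude test for readable English
        print(shift, candidate)              # 3 meet me after the toga party
```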

Computational Security

An encryption method is computationally secure if it will take the attacker a very long time to crack the message using the best existing technology. What is secure today may not be secure years from now — both because of Moore’s Law (faster computers) and because of novel methods such as quantum computation.

Secure Hashing

A hash function maps input of any size onto an output of a fixed size. Secure Hash Algorithms (SHA) are a family of hashing functions. SHA256 maps any message to a 32-byte (256-bit) number — there are 2^256 ≈ 1.16 × 10^77 different possible output values. Even a tiny change in the input produces a drastically different hash value. Given a hash value x, it is computationally hard to find a message m such that SHA256(m) = x — typically an attacker would need brute force.
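Python's hashlib module exposes SHA256 directly; note how changing a single character of the input changes the entire hash.

```python
import hashlib

h1 = hashlib.sha256(b"meet me after the toga party").hexdigest()
h2 = hashlib.sha256(b"meet me after the toga partY").hexdigest()

print(len(h1) * 4)   # 256 bits (64 hex digits)
print(h1 == h2)      # False: one changed character alters the whole hash
print(h1)
print(h2)
```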

The Key Distribution Problem

In symmetric key encryption, both parties must know the key. But how do you securely share a symmetric key with a website you are visiting for the first time? You cannot simply send the key over the internet — an eavesdropper could intercept it. This is the key distribution problem.

Public Key Encryption

The solution is public key encryption, which uses a pair of keys: a public key and a private key. The two keys are mathematically related so that when you encrypt with either one, the only way to decrypt (other than brute force) is using the other one. Public key encryption is generally used to exchange a shared symmetric key or a digital signature, rather than to encrypt an entire message (because it is slower than symmetric encryption).

With public key encryption, if you encrypt a message with Amazon’s public key, only Amazon can decrypt it (using Amazon’s private key). Conversely, if Amazon encrypts something with its private key, anyone can decrypt it with Amazon’s public key — proving the message came from Amazon.
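The mathematics can be illustrated with textbook RSA using deliberately tiny numbers (this classic example is far too small to be secure; real keys are 2048 bits or more):

```python
# Key generation from two small primes (real RSA uses enormous primes).
p, q = 61, 53
n = p * q                 # 3233: the modulus, part of both keys
phi = (p - 1) * (q - 1)   # 3120
e = 17                    # public exponent; the public key is (e, n)
d = pow(e, -1, phi)       # 2753: private exponent; the private key is (d, n)

message = 65                   # a message encoded as a number smaller than n
cipher = pow(message, e, n)    # anyone can encrypt with the public key
plain = pow(cipher, d, n)      # only the private key holder can decrypt
print(cipher, plain)           # 2790 65

# The keys also work in the other direction, which is what signatures use:
signature = pow(message, d, n)          # "encrypt" with the private key
print(pow(signature, e, n) == message)  # True: anyone can verify with the public key
```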

Digital Signatures

A digital signature proves that a message came from the sender (authenticity) and has not been tampered with (data integrity). The process works as follows: the sender and receiver agree on a hash function (e.g., SHA256). The sender calculates the hash of the message h(m), encrypts h(m) with the sender’s private key to produce an encrypted hash h(m)_c, and sends both the message m and h(m)_c. The receiver independently calculates h(m), decrypts h(m)_c using the sender’s public key, and checks that the two hash values match. If they match, the message is authentic and unaltered.

Note that a digital signature does not make the message secret — it only guarantees authenticity and integrity. Only the sender could have encrypted h(m) with the sender’s private key. And because the hash function is secure, it is computationally hard for an attacker to find a different message n such that h(n) = h(m).
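The whole scheme can be sketched by combining SHA256 with textbook RSA numbers (p = 61, q = 53, a classic toy example). The hash is reduced mod n only so that it fits the tiny key; a real scheme signs all 256 bits of the hash.

```python
import hashlib

n, e, d = 3233, 17, 2753   # toy RSA key pair; real keys are 2048+ bits

def toy_hash(message: bytes) -> int:
    """SHA256 of the message, reduced mod n so it fits the toy key."""
    return int(hashlib.sha256(message).hexdigest(), 16) % n

def sign(message: bytes) -> int:
    """Encrypt the hash with the private key: only the key holder can do this."""
    return pow(toy_hash(message), d, n)

def verify(message: bytes, signature: int) -> bool:
    """Decrypt with the public key and compare against a freshly computed hash."""
    return pow(signature, e, n) == toy_hash(message)

m = b"pay Alice $100"
sig = sign(m)
print(verify(m, sig))                    # True: authentic and unaltered
print(verify(b"pay Mallory $100", sig))  # almost certainly False: tampered
```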

Certificates and Certificate Authorities

How do you find out a sender’s public key in a reliable way? Through a Certificate Authority (CA). A CA is a trusted organization (such as DigiCert) that verifies the identity of entities and issues digital certificates. A certificate contains information about the entity (e.g., Amazon) and its public key, digitally signed by the CA. A list of trusted CAs and their public keys is included with every web browser.

Secure Browsing (HTTPS)

When you visit a secure website for the first time, the following process occurs: (1) The website presents its certificate to your browser. (2) Your browser verifies that the certificate has been signed by a recognized CA (using the CA’s public key, which is built into the browser). (3) If valid, the browser extracts the website’s public key from the certificate. (4) The browser randomly generates a symmetric key and encrypts it using the website’s public key. (5) The browser sends the encrypted symmetric key to the website. (6) Only the website can decrypt it (using its private key). (7) Now both parties share a symmetric key, and all subsequent communication is encrypted using this key.
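Steps (4) through (7) can be sketched with the toy RSA numbers used earlier and a repeating-XOR stand-in for a real symmetric cipher such as AES:

```python
import random

n, e, d = 3233, 17, 2753   # website's toy RSA key pair (from its certificate)

# (4) The browser picks a random symmetric key and wraps it in the
#     website's public key before sending it.
session_key = random.randrange(2, n)
wrapped = pow(session_key, e, n)

# (6) Only the website, holding the private key, can unwrap it.
unwrapped = pow(wrapped, d, n)
assert unwrapped == session_key

# (7) Both sides now share the key; XOR here stands in for a real cipher.
def xor_crypt(data: bytes, key: int) -> bytes:
    return bytes(b ^ (key % 256) for b in data)

request = b"GET /cart HTTP/1.1"
encrypted = xor_crypt(request, session_key)   # browser encrypts the request
print(xor_crypt(encrypted, unwrapped))        # website decrypts it
```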

You can tell a connection is secure by looking at the address bar — secure connections use https (not http) and show a lock icon. Insecure connections show a warning such as “Not secure.”

Security: The Challenges

People

One major source of security problems is people. People are careless and make mistakes. Many people are lazy — they prefer simple, easy-to-remember passwords. People can be tricked through social engineering into divulging confidential information. For example, IT professionals are discouraged from having LinkedIn accounts because hackers can use their profile information to craft convincing phishing emails to other employees of the same company.

Bugs and Vulnerabilities

Any complex piece of hardware or software contains bugs and many points of vulnerability. A computer processor (billions of transistors) or an operating system (hundreds of millions of lines of code) are enormously complex — vulnerabilities like Spectre and Meltdown have demonstrated this. For even a moderately complex enterprise system, there are many points of vulnerability: at the client (unauthorized access, errors), on communication lines (tapping, sniffing, message alteration, radiation), at corporate servers (hacking, viruses, worms, denial of service), and at corporate systems (data theft, copying, alteration, hardware failure, software failure).

Sources of Data Breaches

A 2011 study found that data breaches came from many sources: hacked computer or server (16%), scraped from the web (12%), fraud or scam (10%), stolen laptop (7%), document found in trash or unattended (7%), stolen computer (6%), snail mail exposed or intercepted (5%), email exposed or intercepted (4%), stolen document (3%), lost media found (3%), lost document found (3%), virus (2%), lost computer drive found (2%), and stolen computer drive (2%). Web scraping — using automated programs to collect information from websites — is a significant source. Companies are sometimes pressured to create backdoors (secret ways for governments to access private data).

The Scale of the Problem

In 2013, Edward Snowden revealed that the NSA could breach many security protocols, including encrypted chat, encrypted VoIP (Voice over IP), VPN (Virtual Private Network), SSH (Secure Shell), and HTTPS (secure browsing, originally developed at Netscape, the predecessor of Firefox). In June 2014, McAfee estimated the global cost of cybercrime at between $375 billion and $500 billion per year — approximately 0.8% of global GDP, comparable to the costs of car crashes (1.0%) and narcotics (0.9%).

Common Malware

Malware (malicious software) is software designed to cause damage to or loss of control of a computer or network. Nine common types of malware include:

A computer virus is software that attaches to other programs or data in order to be executed. It can copy itself from file to file, harm data, programs, machines, or the network, and open a backdoor to hackers. Antivirus software helps minimize the risk.

A worm is similar to a virus but runs on its own — it does not need to attach to other programs. Worms can cause the same damage as viruses and use computer networks to spread. For example, a worm might try to remotely log onto other computers using default administrator names and passwords for various operating systems.

A Trojan horse is a software program that appears to be benign but does something unexpected behind the scenes. Unlike viruses and worms, Trojan horses must be launched by the user and cannot replicate on their own. An example would be an Android weather app that also allows a hacker to download files from the phone.

Phishing is an email or text message that pretends to come from a trusted authority and asks for confidential information — for example, asking you to log into your account to “verify some information.” Phishing is becoming increasingly common and sophisticated.

A denial of service attack involves many computers overwhelming a website with requests, blocking legitimate users from accessing the site. No data is lost — only potential business. A botnet (robot network) is a collection of compromised computers used together for a common purpose, such as launching denial of service attacks. The largest botnet found and removed controlled over 12 million computers; it has been estimated that as much as 10% of computers worldwide may be part of one botnet or another.

Sniffing is eavesdropping on network communication to obtain proprietary information such as passwords, emails, confidential reports, and company files.

Spam is junk email, usually sent in bulk. It has become less of a problem because Gmail, Hotmail, Outlook, and other providers now have excellent spam filters, and there are laws against spam.

Ransomware is software that threatens to publish the victim’s files or prevents the victim from accessing their files unless a ransom is paid (usually demanded in Bitcoin, because such payments are difficult to trace). Ransomware is becoming increasingly common.

Computer Security

Computer security encompasses the policies, procedures, and technical measures used to prevent unauthorized access, alteration, theft, interruption, or physical damage to information systems. Security involves more than just passwords and encryption.

Six Security Services

When a customer wants to order an item online, several security concerns arise. These are addressed by six security services:

  1. Authentication: assurance that the other party is who they claim to be (not an impostor).
  2. Access control: prevention of unauthorized use of a resource.
  3. Data confidentiality: protection of data from unauthorized disclosure.
  4. Data integrity: assurance that data received is exactly what the authorized entity sent (not tampered with).
  5. Availability: assurance that services are available when needed.
  6. Non-repudiation: protection against denial by one of the parties in a communication — either the sender denying they sent the data, or the receiver denying they received it.

Tools for Protecting Information Systems

Authentication

Three approaches to authentication exist: something you know (password, security question, PIN), something you have (smart card, wireless key card, cell phone app, USB security token), and something you are (biometrics — fingerprint, retinal image, face). Combining two or more approaches is called two-factor (or multi-factor) authentication.

The 25 most commonly hacked passwords include “password,” “123456,” “qwerty,” “abc123,” “monkey,” “1234567,” “letmein,” “dragon,” “baseball,” “iloveyou,” “master,” “sunshine,” and so on. Password crackers work by trying common passwords first, then common passwords with suffixes of 2-3 characters, then dictionary words with variations in capitalization and spelling (’$’ for ’s’, ‘1’ for ’l’, ‘@’ for ‘a’), then combinations of dictionary words. To target a specific person, they gather information from the web — names of partners, children, pets, favourite sports, food, musicians, and actors — and use these in their strategies.

The best defence is a password manager, which generates a different random sequence of characters for each website. An alternative method is to convert a meaningful phrase into a password — rather than picking a pet’s name like “Bailey” or “b@i1ey,” think of a phrase like “I was 14 when Bailey arrived. She was so tiny!” and convert it to the password “Iw14wBa.Swst!” by taking the first letter of each word while keeping capitalization, punctuation, and numbers.
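The phrase-to-password technique above can be sketched in a few lines of Python. The function name and the exact punctuation handling are illustrative assumptions, not a standard algorithm:

```python
def phrase_to_password(phrase: str) -> str:
    """Keep the first letter of each word (preserving case), whole numbers,
    and trailing punctuation, as in the "Iw14wBa.Swst!" example."""
    parts = []
    for token in phrase.split():
        if token.isdigit():
            parts.append(token)            # keep numbers whole ("14" stays "14")
        else:
            parts.append(token[0])         # first letter, keeping its case
            stripped = token.rstrip(".,!?;:")
            parts.append(token[len(stripped):])   # keep trailing punctuation
    return "".join(parts)

print(phrase_to_password("I was 14 when Bailey arrived. She was so tiny!"))
# Iw14wBa.Swst!
```

The result is long, mixes cases, digits, and symbols, and appears in no dictionary, yet remains easy to reconstruct from the memorable phrase.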

Firewalls

A firewall controls the traffic flowing between a computer (or an internal network) and the outside world, blocking connections that violate its rules. Both macOS and Windows have had software firewalls for over a decade. Many cable and DSL modems have hardware firewalls built in, or a separate device can be used.

Intrusion Detection Systems

Intrusion detection systems look for unusual activities. For example, if an employee normally works weekdays 8:30 AM to 4:30 PM and typically only logs into their desktop computer, email, and Learn, but is suddenly trying to remotely log into every other computer on the network at 3 AM on a Saturday, that behaviour should be flagged.

Antivirus Software

Antivirus (anti-malware) software — such as Avast, AVG, Windows Defender (built into Windows 10), and File Quarantine (built into macOS) — works by looking for telltale bit patterns in programs, called signatures, to recognize known viruses, worms, Trojan horses, spyware, and ransomware. It can also look for “unusual behaviour” to detect new malware — for example, a program accessing the internet excessively. The downside is that antivirus software can slow the launching and running of programs, the opening of files, and the mounting of USB drives.
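Signature matching itself is simple byte-pattern search. The sketch below uses one made-up signature plus the industry-standard EICAR test string (a harmless file that antivirus products deliberately detect); real products use databases of millions of signatures plus heuristics:

```python
# Map each known byte pattern (signature) to the name of the threat it identifies.
SIGNATURES = {
    b"EICAR-STANDARD-ANTIVIRUS-TEST-FILE": "EICAR test file",  # standard harmless test
    b"\xde\xad\xbe\xef": "hypothetical-worm-A",                # made-up signature
}

def scan(data: bytes) -> list:
    """Return the names of any known signatures found in the data."""
    return [name for sig, name in SIGNATURES.items() if sig in data]

print(scan(b"hello world"))                               # []
print(scan(b"...EICAR-STANDARD-ANTIVIRUS-TEST-FILE..."))  # ['EICAR test file']
```

This also shows why signature scanning cannot catch brand-new malware: a pattern must be in the database before it can be matched, which is why behaviour-based detection complements it.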

Wireless Security

Setting up a secure Wi-Fi connection involves understanding three parameters:

Bandwidth (speed): Wi-Fi comes in several versions with increasing maximum speeds — 802.11b (11 Mb/s), 802.11a/g (54 Mb/s), 802.11n (300 Mb/s), 802.11ac / Wi-Fi 5 (3.5 Gb/s), and 802.11ax / Wi-Fi 6 (9 Gb/s). These are maximum speeds, not typical speeds. Newer versions are generally backward-compatible with older ones.

Security protocol: Newer protocols are more secure — WEP (Wired Equivalent Privacy) can be easily cracked, WPA (Wi-Fi Protected Access) was a temporary replacement, WPA2/AES is newer and stronger, and WPA3 (2018) is the newest and most secure.

Authentication: For personal/home use, WPS (Wi-Fi Protected Setup) is least secure, while Personal or PSK (Pre-Shared Key) is most secure. For enterprise use (e.g., Eduroam at UW), you typically just follow the organization’s configuration.

The fundamental problem is that ISPs want to make the default setup compatible with the oldest equipment (WEP or WPA), but you want the most secure option available. The solution is to check each device to see if it supports the most recent protocol (WPA3), and configure your router to try WPA3 first, then fall back to WPA2/AES, and then WPA.
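The per-device fallback described above amounts to walking a preference list. The function below is an illustrative sketch of that negotiation, not router firmware:

```python
# The router's preference order, most to least secure. WEP is deliberately
# excluded: better to refuse a connection than to fall back to a broken protocol.
PREFERENCE = ["WPA3", "WPA2/AES", "WPA"]

def negotiated_protocol(device_supports):
    """Return the strongest protocol this device supports, or None."""
    for protocol in PREFERENCE:
        if protocol in device_supports:
            return protocol
    return None

print(negotiated_protocol({"WPA3", "WPA2/AES"}))   # WPA3
print(negotiated_protocol({"WPA2/AES", "WPA"}))    # WPA2/AES  (older device)
print(negotiated_protocol({"WEP"}))                # None      (refuse WEP-only gear)
```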

Securing Your System

At a bare minimum: use strong passwords, use anti-malware protection, and activate automatic updates for your operating system, browser, and anything else that uses the internet. Bugs create security vulnerabilities, and while eliminating all bugs is impossible, vendors release patches and updates to fix known flaws. The discovery of bugs outpaces the ability of even big companies to fix them all.

Best practices include isolating and encrypting sensitive data (Android, iOS, Linux, macOS, and Windows 10 Pro all support encrypting secondary storage — though Windows 10 Home does not). Use AES-256-based encryption software such as VeraCrypt. Consider using an AES-256 encrypted flash drive (e.g., Kingston DataTraveler Vault Privacy). Have a separate user account on your computer for banking and financial activities.

To minimize your attack surface — the different places in your system where a hacker can try to add or extract data — use WPA3 or WPA2/AES for Wi-Fi, configure the firewall in your OS and your modem/router, disconnect from the internet when not in use (turn off Wi-Fi or your modem when sleeping, at school, or at work), and remove unnecessary browser plugins, software, and apps.

Security and Control Framework

Business Value

Inadequate security and control can result in lost business and create serious legal liabilities. Businesses must protect the information assets of their company, employees, customers, and business partners. A sound security and control framework can produce a high return on investment.

In Canada, PIPEDA (the Personal Information Protection and Electronic Documents Act) establishes principles for the collection, use, disclosure, and safeguarding of personal information. Companies must be able to respond to legal requests for electronic documents relevant to a civil case (a discovery request). Canada’s equivalent of the U.S. Sarbanes-Oxley Act (informally C-SOX, enacted as Ontario’s Bill 198) requires internal controls to govern the accuracy of information in financial statements.

Six Management Tools

Risk assessment determines the level of risk to the firm for various classes of risks — for example, identifying that power failures have a 30% probability of occurrence per year with an average loss of $100,000, yielding an expected annual loss of $30,000, which justifies spending $20,000 on a backup system.
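The arithmetic in this example generalizes to a one-line calculation, often called the annualized loss expectancy. The function below is a sketch using the figures from the text:

```python
def expected_annual_loss(probability_per_year, average_loss):
    """Annualized loss expectancy = probability of occurrence x average loss."""
    return probability_per_year * average_loss

# Power-failure example from the text: 30% annual probability, $100,000 average loss.
loss = expected_annual_loss(0.30, 100_000)
print(loss)            # 30000.0
print(20_000 < loss)   # True: a $20,000 backup system costs less than the expected loss
```

Computing this figure for each class of risk lets management rank threats and spend the security budget where the expected losses are largest.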

A security policy identifies the main security risks, defines acceptable security goals (e.g., downtime of no more than 3 minutes per year), and specifies mechanisms to achieve those goals (e.g., uninterruptible power supplies and diesel generator backup).

An acceptable use policy (AUP) states the acceptable uses of information and computers — covering privacy, user responsibility, personal use of devices, access rules for different employees, and the technical measures used to enforce the policies. It identifies the information employees have access to and the type of access (read-only vs. update) based on their role.

Disaster recovery planning focuses on getting IT systems (computing and communication) up and running after a disruption — for example, backing up files and maintaining backup systems.

Business continuity planning focuses on getting the business up and running after a disaster — safeguarding people as well as machines, identifying and documenting critical business processes, creating action plans, and lining up offsite resources such as cloud computing.

A security audit investigates whether the current security and control framework is adequate. It involves a comprehensive assessment of the company’s security policies, procedures, technical measures, personnel, training, and documentation — and may even simulate an attack. Risk assessment is done before security implementation, while auditing is done after implementation and repeated periodically.

The Bottom Line

Many companies assume that a disaster is too improbable and that security and control are not worth the investment in time and money. Lack of knowledge and lack of motivation are among the greatest causes of computer security breaches.

Topic 9: Managing Knowledge

Readings: Management Information Systems, Chapter 11.1 (Laudon, Laudon and Brabston, 2013).

Why Knowledge Management?

The Knowledge Economy

For several decades the world’s best-known forecasters of societal change have predicted the emergence of a new economy in which brainpower, not machine power, is the critical resource. But the future has already turned into the present, and the era of knowledge has arrived.

The Learning Organization, Economist Intelligence Unit and IBM (1996)

That quote is from 1996. Most students in this course have lived their entire lives in the era of the knowledge economy.

The Increasing Demand for Knowledge Workers

The composition of the workforce has shifted dramatically over the past century. In 1900, the largest category was farmworkers, with labourers, operators, and craftspeople also comprising large shares. By 2000, the largest categories were professional and technical workers, managerial and administrative workers, and service workers — the “knowledge workers.” This trend shows no sign of reversing.

Cost of Mismanagement

Each year, poor documentation and communications cost the Canadian economy more than $50 billion (Peter Richardson, Coping with the Crisis in the Office: Canada’s $50 Billion Challenge). When people leave an organization, they often take years of knowledge with them — knowledge that was never documented.

The Exponential Growth of Digital Media

The amount of data stored is roughly doubling every year, most of it digital. Printed documents account for only 0.003% of information growth. Organizations need better ways to manage the data and information circulating inside the organization, extract and share the knowledge stored in the minds of employees, and harness external data and information.

Knowledge Management Concepts

Data, Information, Knowledge, and Wisdom

Building on the data-information distinction from earlier topics: data consists of raw facts (e.g., a list of items scanned at a checkout). Information is data shaped into a meaningful form (e.g., which items are selling well). Knowledge is the discovery of patterns, rules, and contexts where information is useful (e.g., customers are more likely to buy items at eye level). Wisdom is knowing when, where, and how to apply knowledge to solve a problem (e.g., how to maximize revenue per square foot in a grocery store).

Two Types of Knowledge

Roughly 20% of organizational knowledge is explicit knowledge — knowledge that has been documented somewhere: reports, policies, manuals, emails, databases, books, magazines, and journals. It is formal and codified.

The remaining 80% is tacit knowledge — what employees know that has not been documented. Tacit knowledge is held in the minds of employees, is informal and uncodified, and encompasses values, perspectives, culture, and the memories of staff, suppliers, and vendors.

What is Knowledge Management?

Knowledge management (KM) is the task of acquiring, storing, disseminating, and applying an organization’s explicit and tacit knowledge to meet mission objectives. The objective of KM is to connect those who know to those who need to know, to leverage knowledge transfer from one to many, and to capture know-how, know-why, and know-who.

KM’s Role

KM is one of the fastest-growing areas of software investment in companies. Knowledge is a source of wealth for an organization, just like land, labour, factories, or financial capital. The key challenge of the knowledge-based economy is to foster innovation. A substantial part of a company’s stock value is related to its intangible assets, and these assets must be properly managed.

Implementing a KM System

Stage 1: Create a Knowledge Network

The first step is to develop a sharing environment through mentor programs, virtual teams, expert panels, seminars, conferences, and communities of practice. Use collaboration tools to encourage information sharing — shared drives, wikis, blogs, groupware like SharePoint, and FAQs. Ideally, everything you do, say, and know should be properly documented and stored in digital form.

A typical concern is resistance to sharing: “the more I share, the less valuable I am to the company.” If you are a student, would you share your studying strategies? If you are a professor, would you share your lectures? If you are a machine operator, would you share your knowledge of operations? If you are a stock broker, would you share trade information? However, having a reputation in the company for being knowledgeable and helpful actually makes your job safer.

Stage 2: Implement a Search Engine

Provide relevant information for decision-making using a text-based search engine that covers internal sources (emails, internal forums, meeting minutes, reports, memos, database systems) and external sources (everything publicly available on the internet). A key concern is relevancy — from a human standpoint, relevancy is user-dependent (it depends on the specific user’s judgment and current needs), time-dependent (it changes over time), and geographically dependent (an approach that works in one part of the country may not work in another; municipal and provincial laws may differ).
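At its core, a text-based search engine builds an inverted index mapping each word to the documents that contain it, then ranks matches. A toy sketch, with made-up document names and contents:

```python
from collections import defaultdict

# Hypothetical internal sources, mapped from document ID to text.
docs = {
    "memo-17":   "quarterly security audit results and firewall policy",
    "minutes-3": "meeting minutes on the new acceptable use policy",
    "report-9":  "security policy review and audit schedule",
}

# Inverted index: word -> set of documents containing that word.
index = defaultdict(set)
for doc_id, text in docs.items():
    for word in text.lower().split():
        index[word].add(doc_id)

def search(query):
    """Rank documents by how many query words each one contains."""
    counts = defaultdict(int)
    for word in query.lower().split():
        for doc_id in index[word]:
            counts[doc_id] += 1
    return sorted(counts, key=lambda d: -counts[d])

print(search("security audit policy"))   # memo-17 and report-9 outrank minutes-3
```

Production engines add the relevancy signals noted above — who is asking, when, and from where — on top of this basic word-matching core.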

Stage 3: Build an Intelligent System

The ultimate goal of knowledge management is to build on the search engine with the addition of an inference engine or machine learning. Such a system is capable of making suggestions or computing solutions — for example, automated medical diagnosis, detecting suspicious (possibly fraudulent) credit card transactions, or flagging suspicious tax returns.
