PLAN 281: Introduction to Geographic Information Systems

Su-Yin Tan

Estimated study time: 28 minutes

Sources and References

Primary textbook — Paul A. Longley, Michael F. Goodchild, David J. Maguire, David W. Rhind, Geographic Information Science and Systems, 4th ed. (Wiley). Supplementary texts — Ian Heywood, Sarah Cornelius, Steve Carver, An Introduction to Geographical Information Systems; Kang-tsung Chang, Introduction to Geographic Information Systems; Peter Burrough, Rachael McDonnell, Christopher Lloyd, Principles of Geographical Information Systems. Online resources — ESRI ArcGIS Pro documentation; QGIS official documentation; MIT OpenCourseWare 11.205 “Introduction to Spatial Analysis and GIS”.

Chapter 1 — What Is GIS? Data, Models, and Questions

A Geographic Information System is a coordinated assembly of hardware, software, data, people, and procedures that captures, stores, manipulates, analyzes, and displays information tied to locations on the Earth’s surface. Longley and colleagues distinguish the system (the software and tooling) from the science (GIScience, the study of the principles underlying spatial representation) and from the studies that apply these tools to decisions. For planners, the system is the everyday instrument; the science is what keeps the instrument honest.

Almost every question a planner asks has a “where” embedded in it. Where should we site a new transit station? Which parcels sit in a flood risk zone? How are tree canopy and surface temperature correlated across neighbourhoods? A GIS answers such questions by binding attributes — populations, land uses, elevations, ownership — to geometric primitives anchored in a coordinate system. The binding is what makes spatial reasoning computable.

Geographic data share several properties that set them apart from ordinary tabular data. They are spatially autocorrelated: Waldo Tobler’s “first law of geography” says nearby things are more related than distant things, which both enables interpolation and complicates statistical inference. They are multi-scale: a parcel, a census tract, and a watershed all describe the same neighbourhood in different grains. They are uncertain: every boundary, measurement, and classification carries error. And they are expensive to collect, so GIS workflows usually reuse, reconcile, and repair inherited datasets rather than surveying from scratch.

Conceptually, all GIS work rests on a representation choice. The world is continuous and infinitely detailed; a database cannot be. So we impose either a discrete-object view (the world is full of things with sharp edges — buildings, parcels, roads) or a continuous-field view (every location has a value — elevation, temperature, soil moisture). These two views lead to the two dominant data models, vector and raster, introduced next. The rest of the course is a disciplined tour of how to get data into one of these models, how to clean and project it, how to analyze it, and how to reason about the errors you inevitably introduce along the way.

Chapter 2 — Spatial Data Models: Vector vs Raster

The vector model represents the world as an assembly of discrete geometric primitives: points, lines (polylines), and polygons. A point is an ordered pair (x, y), or a triple if elevation matters. A line is a sequence of points connected by straight segments. A polygon is a closed line enclosing an interior. Each primitive carries an identifier that links it to rows in an attribute table — land use, population, speed limit, and so on. Vector data compresses well when features are sparse and is ideal for things with crisp legal or functional boundaries: parcels, municipal borders, road centrelines, fire hydrants.

The raster model divides space into a regular grid of cells (pixels), each storing a single value. A raster therefore has an origin, a cell size, a number of rows and columns, and an array of values. Rasters are the natural form for continuous phenomena — elevation, rainfall, reflectance from a satellite sensor, pollution concentration — because every location in the extent has a value. They are also the only practical form for remotely sensed imagery. Their cost is resolution: halving the cell size quadruples storage and processing time.

The choice between the models is rarely a matter of pure preference. Planners routinely mix the two, keeping parcels as vectors and terrain as a raster, and intersect them when answering questions like “which parcels drain into this creek.” Both models can represent the same reality, but with different fidelity. A coastline drawn as a vector polyline has whatever precision the digitizer chose; the same coast rasterized at 30 m loses subpixel detail but becomes straightforward to combine with Landsat imagery.

Within each model there are variants. Vector data may be stored as simple features (geometry alone) or as a topological structure (explicit shared boundaries). Rasters may be stored uncompressed, run-length encoded, tiled, or as pyramids (multi-resolution overviews). File formats proliferate — shapefile, GeoPackage, GeoJSON, File Geodatabase on the vector side; GeoTIFF, NetCDF, IMG, Cloud-Optimized GeoTIFF on the raster side — but the conceptual split between discrete objects and continuous fields is the division that actually matters. Everything downstream in GIS theory is some refinement of this choice.

Chapter 3 — Topology and Feature Relationships

Topology is the branch of mathematics that cares about connectivity and adjacency but not about exact shape or distance. In GIS, a topological data model records the relationships among features explicitly: which polygons share which edges, which lines meet at which nodes, which segments form the boundary of which area. A non-topological (spaghetti) model simply stores each feature as an independent geometry and lets the viewer infer relationships visually.

Topology matters for three reasons. First, data integrity: if a municipal boundary is shared between two polygons, a topological model stores it once, so editing the line updates both polygons and they cannot drift apart. Second, analysis: many questions — routing across a road network, tracing a river, flood-filling contiguous land parcels — require knowing which features connect to which. Third, error detection: slivers, dangling nodes, overshoots, undershoots, and unclosed polygons are topological defects that a rules engine can flag automatically.

A common formalism is the planar graph: nodes (endpoints), edges (lines between nodes), and faces (areas bounded by edges). The arc–node structure popularized by early Arc/INFO encodes each arc’s start node, end node, left face, and right face, so one pass through the database answers adjacency questions without geometric computation. Modern geodatabases use topology rules — “must not overlap,” “must not have gaps,” “must be covered by” — to enforce the same discipline without demanding the user think in graph theory.
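The arc–node idea can be sketched in a few lines of Python. The toy table below uses hypothetical arc and face identifiers; it shows how a single pass over the left/right face records answers polygon adjacency with no geometric computation at all:

```python
# Toy arc-node table: each arc stores the polygon (face) on its left and
# right. "OUT" marks the unbounded exterior face. Adjacency queries then
# become a scan of the table -- no coordinate geometry is touched.
arcs = [
    {"arc": 1, "left": "A", "right": "OUT"},
    {"arc": 2, "left": "A", "right": "B"},   # shared boundary of A and B
    {"arc": 3, "left": "B", "right": "OUT"},
    {"arc": 4, "left": "B", "right": "C"},
]

def neighbours(polygon_id, arcs):
    """Return the set of polygons sharing an edge with polygon_id."""
    out = set()
    for a in arcs:
        if a["left"] == polygon_id and a["right"] != "OUT":
            out.add(a["right"])
        if a["right"] == polygon_id and a["left"] != "OUT":
            out.add(a["left"])
    return out
```

With this table, neighbours("B", arcs) returns {"A", "C"}: exactly the adjacency question the arc–node structure was designed to answer cheaply.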

The nine-intersection model (Egenhofer and Franzosa) formalizes the relationships between two regions by asking whether the interior, boundary, and exterior of each intersect the corresponding parts of the other. From it come the familiar spatial predicates of OGC Simple Features: equals, disjoint, intersects, touches, crosses, within, contains, overlaps. These predicates are the vocabulary every spatial SQL engine speaks, and they are what turn questions like “which schools lie within 500 m of a bus stop” into executable queries. Network topology adds direction and cost to edges, a foundation we return to under network analysis.

Chapter 4 — Coordinate Systems, Datums, and Projections

Every point in a GIS must ultimately be anchored to a position on the Earth. The Earth, however, is neither flat nor a perfect sphere. It is approximately an oblate ellipsoid bulging slightly at the equator, and its true surface — the geoid — is a gravity equipotential surface that undulates by tens of metres relative to any clean mathematical model. Georeferencing is the chain of choices needed to express a location as coordinates.

A geographic coordinate system uses latitude and longitude on an ellipsoid. An ellipsoid is defined by its semi-major axis and flattening; common choices include WGS84, GRS80, and Clarke 1866. A datum fastens an ellipsoid to the Earth by specifying its centre, orientation, and scale. NAD27 used the Clarke 1866 ellipsoid fixed at Meades Ranch, Kansas; NAD83 and WGS84 are Earth-centred datums consistent with GPS. A point’s latitude and longitude depend on which datum you assume, and the shift between NAD27 and NAD83 in parts of Canada exceeds 100 m — large enough to move a building off its parcel.

Datums can be horizontal (for x, y) or vertical (for elevation). Vertical datums reference mean sea level or the geoid; CGVD28 and CGVD2013 are Canadian examples. Heights above the ellipsoid (from GPS) and heights above the geoid (orthometric, what surveyors call “elevation”) differ by the geoid–ellipsoid separation, which GIS software corrects using a geoid model such as HT2.0.

A projected coordinate system converts latitude and longitude into planar x, y coordinates using a map projection. Common projected systems include Universal Transverse Mercator (UTM), State Plane, and national grids like MTM in Ontario. Each is built on a specific datum plus a specific projection plus chosen parameters (central meridian, false easting, standard parallels). The full georeferencing stack — ellipsoid, datum, projection, parameters, units — is what an EPSG code (like 26917 for NAD83/UTM zone 17N) captures in a single integer. Any responsible GIS workflow records that integer for every layer.

Chapter 5 — Map Projections and Distortions

Projecting a curved surface onto a flat one cannot preserve all geometric properties simultaneously. Gauss’s theorema egregium guarantees this: the Gaussian curvature of the sphere is nonzero, so no isometric (shape- and distance-preserving) map to a plane exists. Every projection therefore trades off among four properties — area, shape (angles), distance, and direction — and the planner’s job is to pick the trade that hurts least for the task at hand.

Projections are classified by the preserved property: conformal projections preserve local angles and shapes of small features (Mercator, Lambert Conformal Conic, Transverse Mercator), equal-area projections preserve area (Albers Equal-Area Conic, Mollweide, Lambert Azimuthal Equal-Area), equidistant projections preserve distance along certain lines, and azimuthal projections preserve direction from a central point. A conformal projection distorts area; an equal-area projection distorts shape; you cannot have both.

They are also classified by developable surface: cylindrical (wrap a cylinder around the globe), conic (seat a cone on it), and azimuthal or planar (touch a plane at a point). The Mercator projection, cylindrical and conformal, famously inflates high-latitude landmasses — Greenland rivalling Africa on classroom walls — which is harmless for navigation (rhumb lines are straight) but misleading for thematic maps of population.
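Mercator’s behaviour is easy to verify numerically. The sketch below uses the standard spherical Mercator equations (the same ones behind Web Mercator); the radius is the conventional value, and the code is illustrative rather than a production projection library:

```python
import math

R = 6378137.0  # spherical radius used by Web Mercator, in metres

def mercator_xy(lon_deg, lat_deg):
    """Forward spherical Mercator projection of a lon/lat pair."""
    x = R * math.radians(lon_deg)
    y = R * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y

def linear_scale(lat_deg):
    """Local linear scale factor k = sec(latitude).

    Mercator is conformal, so k is the same in every direction at a
    point, but areas are exaggerated by k**2."""
    return 1.0 / math.cos(math.radians(lat_deg))
```

At 60° latitude the linear scale factor is 2, so areas appear four times too large; near 70° (southern Greenland) the area exaggeration is roughly a factor of eight, which is the classroom-wall distortion described above.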

Practical rules of thumb: for small, mid-latitude regions use Transverse Mercator or Lambert Conformal Conic; for countries wider east–west than north–south use a conic with two standard parallels; for statistical maps use an equal-area projection so that choropleth colours are proportional to real ground area; for web maps accept the Web Mercator convention despite its area distortion because the whole ecosystem is wired for it. Tissot’s indicatrix — an imagined tiny circle drawn everywhere on the globe and then projected — gives a visual audit: conformal projections keep it circular everywhere but varying in size; equal-area projections keep its area constant but let it squash into ellipses. When combining layers from different projections, everything must be reprojected into a common system before any measurement or overlay can be trusted.

Chapter 6 — Attribute Data and Database Fundamentals

Geometry without attributes is a picture; attributes without geometry are a spreadsheet. GIS power comes from the join. In a vector database each feature has a unique identifier, and a parallel table stores its attributes — one row per feature, one column per variable. The attribute table obeys ordinary relational rules: columns have data types (integer, float, text, date), rows are uniquely keyed, and integrity constraints guard against nonsense entries.

Attributes are classified on the four Stevens scales: nominal (categories, like land use codes), ordinal (ranked, like road classes), interval (numeric without a true zero, like temperature in Celsius), and ratio (numeric with a true zero, like population). The measurement level dictates which operations are legal and which cartographic conventions work — you can average ratio values but not nominal ones, and choropleth shading requires data that are at least ordinal.

Relational databases shine when multiple tables describe the same features. A parcels table might join to an ownership table through a tax roll number, to an assessment history through a date, and to a zoning table through a bylaw code. Joins can be one-to-one (each parcel owned by one owner) or one-to-many (each owner holding many parcels). Spatial joins, introduced later, add a fourth kind of join key: geometric relationship rather than shared identifier.

Metadata is the layer that explains what each dataset is. Good metadata records the source, date, lineage, coordinate reference, attribute definitions, completeness, positional accuracy, and contact information. ISO 19115 and the FGDC Content Standard for Digital Geospatial Metadata codify these fields. Without metadata a dataset is an orphan: one cannot tell whether a “population” column reports residents or daytime workers, whether a date is the survey date or the publication date, whether an elevation is orthometric or ellipsoidal. Modern geodatabases store metadata inside the container; lightweight formats rely on separate XML sidecars. A GIS office that treats metadata as optional is one audit away from discovering that half its layers are unusable. Database design, attribute hygiene, and metadata discipline are the quiet work that distinguishes a GIS shop from a map-making hobby.

Chapter 7 — Vector Analysis: Buffers, Overlays, Spatial Joins

Vector analysis operations compute new geometry or new attributes from existing layers. Four operations do most of the work in planning practice.

A buffer generates a new polygon at a specified distance around input features. Buffering a road network by 50 m defines a noise zone; buffering wells by 100 m defines a wellhead protection area; buffering schools by 400 m identifies walkshed catchments. Buffers can be constant-distance, attribute-driven (wider around arterial roads, narrower around locals), or multi-ring (concentric 100/200/500 m bands). On geographic (lat/lon) coordinates, a buffer in metres requires first projecting to a suitable planar system or using a geodesic algorithm that operates on the ellipsoid directly.
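Constructing a true buffer polygon requires a geometry engine such as PostGIS or Shapely, but the multi-ring selection logic reduces to distance bucketing once coordinates are in a projected system. A minimal Python sketch, with hypothetical ring distances in metres:

```python
import math

def ring_label(feature_xy, centre_xy, rings=(100, 200, 500)):
    """Assign a point feature to the innermost concentric ring (by outer
    radius, in metres of projected coordinates) that contains it, or
    None if it lies beyond the outermost ring."""
    d = math.dist(feature_xy, centre_xy)
    for r in rings:
        if d <= r:
            return r
    return None
```

For a school at (0, 0), a bus stop at (60, 80) sits exactly 100 m away and lands in the 100 m band, while one at (0, 600) falls outside every ring.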

Overlay combines two polygon layers, splitting geometries wherever they intersect and carrying attributes from both parents into the result. The Boolean flavours are union (keep everything), intersect (keep only overlapping portions), identity (keep all of one layer, overlapping portions of the other), and symmetric difference (keep the non-overlapping parts). Intersecting a parcel layer with a flood-hazard polygon yields the at-risk portion of each parcel and whatever attributes — owner, land use, assessed value — the parcel carried. Overlay is the workhorse of suitability analysis, site selection, and impact assessment.

A spatial join transfers attributes from one layer to another based on a geometric predicate — intersects, contains, is within, is nearest. Joining points (complaints) to polygons (neighbourhoods) by containment produces a complaint count per neighbourhood. Joining point-to-point by nearest neighbour assigns each bus stop its closest shelter.
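A containment-based spatial join can be sketched with the classic ray-casting point-in-polygon test. The data below are hypothetical, and a real workflow would use a spatial index instead of the brute-force double loop, but the join logic is the same:

```python
def point_in_polygon(pt, poly):
    """Ray-casting containment test. poly is a list of (x, y) vertices."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Does a horizontal ray from pt cross the edge (x1,y1)-(x2,y2)?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def spatial_join_count(points, polygons):
    """Count points contained in each named polygon: a minimal
    'complaints per neighbourhood' spatial join."""
    counts = {name: 0 for name in polygons}
    for pt in points:
        for name, poly in polygons.items():
            if point_in_polygon(pt, poly):
                counts[name] += 1
    return counts
```

Joining three complaint points against one square neighbourhood polygon, for instance, returns a count of 2 when one point falls outside.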

Other vector operations fill out the toolbox: clip (cut one layer by the extent of another without transferring attributes), erase (the complement), dissolve (merge adjacent polygons sharing an attribute value, so census blocks become tracts), and generalize or simplify (reduce vertex count at a controlled tolerance using algorithms like Douglas–Peucker). Aggregation can also run in the attribute direction, summing, averaging, or counting feature values by group. Every vector workflow is some ordered chain of these primitives, and the skill is knowing which predicate and which geometry transformation matches the question being asked.

Chapter 8 — Raster Fundamentals and Map Algebra

A raster is an array of cells with a geographic anchor. Each cell stores a single value, interpreted according to the raster’s type: discrete (land cover codes), continuous (elevation, temperature), or binary (presence/absence). Multi-band rasters stack several values per cell, most commonly the spectral bands of a satellite image.

A raster is placed in space by the coordinates of one corner, the cell sizes in x and y, and a coordinate reference system. The corner coordinates and cell sizes form an affine transform that maps each (row, column) to an (x, y) location. Resampling methods — nearest neighbour, bilinear, cubic convolution — come into play whenever a raster is reprojected, rescaled, or aligned to another grid; nearest neighbour preserves original values and is mandatory for categorical rasters, while bilinear and cubic are smoother choices for continuous data.
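The (row, column) to (x, y) mapping for a north-up raster is a two-line affine transform. A sketch, assuming the stored origin is the top-left corner and that rows count downward:

```python
def cell_to_xy(row, col, origin_x, origin_y, dx, dy):
    """Map a (row, col) cell index to the x, y of the cell centre for a
    north-up raster whose origin is the top-left corner."""
    x = origin_x + (col + 0.5) * dx
    y = origin_y - (row + 0.5) * dy   # rows increase downward
    return x, y
```

With a 30 m cell size and a top-left origin of (500000, 4650000), cell (0, 0) has its centre at (500015, 4649985); the general affine transform adds two rotation terms for non-north-up grids, which are zero here.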

Map algebra, formalized by Dana Tomlin, treats rasters as variables and combines them with arithmetic, Boolean, and comparison operators. Adding two rasters cell-by-cell sums the inputs; multiplying a binary mask by an elevation raster zeroes out cells outside the mask; a weighted sum of reclassified suitability factors produces a composite suitability surface. Tomlin grouped these operations into four families by the spatial scope of the computation: local (per cell), focal (per cell plus its neighbours), zonal (per cell summarized over a region), and global (per cell using all cells).

Local operations take one or more aligned rasters and compute each output cell from the corresponding input cells alone. Reclassification, thresholding, arithmetic combinations, and statistical summaries (mean, majority, variety across a stack of rasters) are all local. Boolean tests — is slope less than 15 percent and elevation less than 500 m and land cover equal to forest? — reduce multi-criteria questions to a single mask raster. Map algebra’s power is that it turns complicated spatial reasoning into terse expressions on entire grids, and because computation proceeds cell by cell it parallelizes almost trivially, which is why raster workflows scale to continental-sized problems.
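A local Boolean overlay like the slope/elevation/forest test above can be written directly. The sketch below operates on aligned lists-of-lists standing in for rasters; the land cover code is a hypothetical value:

```python
FOREST = 41  # hypothetical land cover class code

def suitability_mask(slope, elev, landcover):
    """Local map algebra: a cell-by-cell Boolean test over aligned grids
    (equal-shaped lists of lists), producing a 0/1 mask raster."""
    rows, cols = len(slope), len(slope[0])
    return [[1 if (slope[r][c] < 15 and elev[r][c] < 500
                   and landcover[r][c] == FOREST) else 0
             for c in range(cols)] for r in range(rows)]
```

Because every output cell depends only on the corresponding input cells, the loop body could be handed to as many workers as there are cells, which is the parallelism claim made above.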

Chapter 9 — Raster Analysis: Focal, Zonal, and Surface Operations

Focal (neighbourhood) operations compute each output cell from a moving window — typically 3x3, 5x5, or a circular kernel — of input cells. A mean focal filter smooths a noisy elevation raster; a majority filter cleans speckle in a classified land cover map; a focal range flags sharp edges. Convolution with directional kernels produces slope and aspect from a digital elevation model, and a Laplacian kernel highlights curvature. Hillshade combines slope, aspect, and a modelled sun position to produce the shaded relief images that give topographic maps their three-dimensional look.
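A focal mean is a nested loop over a moving window. The sketch below shrinks the window at the grid edges, which is one common edge-handling choice among several (others pad with NoData or mirror the edge rows):

```python
def focal_mean(grid, size=3):
    """Moving-window mean with a square kernel over a list-of-lists
    grid; edge cells average only the neighbours that exist."""
    half = size // 2
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            vals = [grid[rr][cc]
                    for rr in range(max(0, r - half), min(rows, r + half + 1))
                    for cc in range(max(0, c - half), min(cols, c + half + 1))]
            out[r][c] = sum(vals) / len(vals)
    return out
```

Swapping the mean for a majority vote gives the speckle-cleaning filter, and swapping it for max minus min gives the focal range mentioned above.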

Zonal operations summarize one raster — the value raster — within regions defined by another raster or a polygon layer — the zone layer. Zonal mean elevation per watershed, zonal sum of rainfall per county, zonal majority land cover per neighbourhood: each reduces a continuous surface to one number per zone. Zonal histograms cross-tabulate two classified rasters and are the raster analogue of a pivot table. The zone layer need not be contiguous; disjoint polygons sharing an identifier count as one zone.
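A zonal summary is a grouped aggregation keyed by zone id, which is also why disjoint polygons sharing one identifier collapse into a single zone. A minimal sketch:

```python
def zonal_mean(value_grid, zone_grid):
    """Mean of value_grid per zone id in zone_grid (two aligned
    lists-of-lists). Cells with the same id, contiguous or not,
    contribute to one zone."""
    sums, counts = {}, {}
    for vrow, zrow in zip(value_grid, zone_grid):
        for v, z in zip(vrow, zrow):
            sums[z] = sums.get(z, 0.0) + v
            counts[z] = counts.get(z, 0) + 1
    return {z: sums[z] / counts[z] for z in sums}
```

Replacing the running sum with a per-zone histogram of a second classified grid gives the pivot-table-style zonal cross-tabulation described above.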

Global operations use information from the entire raster to compute each output cell. Euclidean distance from the nearest source cell, cost-distance across a friction surface, viewshed from an observer, and watershed delineation are all global. Cost-distance generalizes straight-line distance by letting each cell specify a cost per unit of traversal — steep slopes cost more than gentle ones, water bodies cost infinitely — and propagates least-cost paths outward with a Dijkstra-like algorithm. This is the machinery behind route-of-least-impact studies for pipelines, power lines, and wildlife corridors.

Surface analysis specializes in digital elevation models. From a DEM one derives slope (maximum rate of change), aspect (direction of maximum slope), curvature (second derivative), profile and plan curvatures, stream networks (by following flow directions downhill), drainage basins (by accumulating upstream cells), and relative landforms. The combination of map algebra for the calculation and focal/zonal/global operators for the context is expressive enough to implement hydrological, habitat, visibility, and accessibility models with only a handful of steps.

Chapter 10 — Network Analysis

Roads, transit lines, rivers, pipes, and cables are linear features whose value lies in connectivity, and analyzing them demands a network model: a graph of nodes and edges with attributes — length, speed, direction, capacity, turn restrictions, barriers — attached to both. Classical network analysis answers four families of questions. Shortest path: given origin, destination, and an impedance (time, distance, cost), find the path minimizing the sum. Dijkstra’s algorithm or its heuristic cousin A* solves this efficiently on non-negative weights. Service area: given one or more facilities and a budget (10 minutes, 500 m), find the extent of the network reachable within the budget, and optionally the polygon enclosing it. Closest facility: for each demand point, find the nearest facility on the network and return the route. Origin–destination matrix: compute impedances between all pairs, the input to gravity models and accessibility indices.
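The shortest-path core underneath all four question families is Dijkstra’s algorithm. A sketch on a dictionary-based network with non-negative impedances; the node names and weights are hypothetical:

```python
import heapq

def shortest_path(graph, origin, dest):
    """Dijkstra's algorithm on graph = {node: [(neighbour, impedance), ...]}.
    Returns (total impedance, node path) for non-negative impedances."""
    dist = {origin: 0.0}
    prev = {}
    pq = [(0.0, origin)]           # priority queue of (impedance, node)
    seen = set()
    while pq:
        d, node = heapq.heappop(pq)
        if node in seen:
            continue
        seen.add(node)
        if node == dest:
            break
        for nbr, w in graph.get(node, []):
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node   # remember predecessor for the path
                heapq.heappush(pq, (nd, nbr))
    if dest not in dist:
        return float("inf"), []    # unreachable
    path, node = [dest], dest
    while node != origin:
        node = prev[node]
        path.append(node)
    return dist[dest], path[::-1]
```

Stopping the same loop when d exceeds a budget, instead of when the destination is popped, yields the set of reachable nodes, which is the service-area question; running it from every origin fills an origin–destination matrix.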

Two more advanced problems appear frequently in planning. Location–allocation picks a subset of candidate sites to minimize a weighted impedance — where to place p facilities so that total travel is minimized (p-median), or so that maximum travel is minimized (minimax), or so that a target demand is covered within a threshold (maximum covering). These solve classical questions of fire station siting, school redistricting, and charging-station placement. The vehicle routing problem extends shortest path to fleets: assign stops to vehicles and sequence them to minimize cost subject to capacity, time windows, and driver constraints, central to sanitation pickup, home care, and delivery logistics.

Network models live or die by their attributes. Turn tables encode intersection rules; one-way flags forbid illegal moves; hierarchy levels let the solver prefer highways over alleys; restrictions exclude trucks from parkways. Impedances may be constant (length) or dynamic (time-of-day speeds, historical traffic). Any output is only as trustworthy as these attributes. A well-curated network dataset with accurate turn restrictions will route a courier around a median; a naive one will send the courier the wrong way down a one-way. The lesson, typical of GIS in general, is that analytical sophistication cannot redeem poor data.

Chapter 11 — Data Acquisition: GPS, Remote Sensing, Digitizing

GIS data comes from three broad sources. Primary field collection uses Global Navigation Satellite Systems (GNSS) — the U.S. GPS, Russian GLONASS, European Galileo, Chinese BeiDou — whose receivers trilaterate position from four or more satellites. Autonomous civilian GPS yields roughly 5–10 m horizontal accuracy; differential corrections from a nearby reference station (DGPS) bring it under a metre; real-time kinematic (RTK) and post-processed kinematic (PPK) methods using carrier phase push accuracy to centimetres. Multipath, canopy obstruction, ionospheric delay, and poor satellite geometry (dilution of precision) all degrade fixes, so field protocols matter as much as hardware.

Remote sensing acquires data without physical contact, most often through sensors on satellites, aircraft, or drones. Passive sensors record reflected or emitted energy — optical (Landsat, Sentinel-2, PlanetScope), thermal infrared, or microwave radiometers — while active sensors emit their own energy and measure the return, including radar (Sentinel-1, RADARSAT) and lidar. Each sensor is defined by its spatial, spectral, temporal, and radiometric resolutions, and the trade among these is physical: finer spatial resolution usually means narrower swath and longer revisit. Lidar has become the dominant source of high-resolution topography, producing point clouds that are gridded into bare-earth DEMs after classification removes vegetation and buildings.

Photogrammetry, older than remote sensing, derives three-dimensional coordinates from overlapping photographs using stereo parallax; structure-from-motion pipelines now run this on drone imagery to produce dense point clouds and orthomosaics at centimetre resolution.

Secondary acquisition uses existing maps and documents. Heads-up digitizing traces features on screen over a georeferenced raster backdrop — scanned paper maps, orthophotos, satellite imagery — producing vectors of a quality limited by the backdrop’s own accuracy. Vectorization software can automate tracing for clean inputs. Scanning and georeferencing historical paper maps (by identifying ground control points and warping the image) unlocks decades of records for change detection. Government open data portals, cadastral offices, census agencies, and crowdsourced platforms like OpenStreetMap are the everyday well that most planners draw from. Every dataset should arrive with provenance, because every subsequent analysis inherits the errors of its sources.

Chapter 12 — Data Quality, Error, and Uncertainty

All geographic data are approximations, and the planner’s duty is to know by how much. Quality is usually decomposed into several components. Positional accuracy measures how close recorded coordinates are to true ground positions, reported as a root mean square error computed against higher-accuracy checkpoints. Attribute accuracy measures the correctness of recorded values — for classified maps it is summarized by a confusion matrix whose overall accuracy and kappa statistic quantify agreement with ground truth. Temporal accuracy concerns currency: how old is the dataset relative to the real world it claims to represent? Completeness measures what is missing. Logical consistency measures internal contradictions: polygons that overlap when they should not, attribute domains violated, topology broken. Lineage describes where the data came from and what was done to it. ISO 19157 standardizes these categories.
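Overall accuracy and kappa fall straight out of the confusion matrix. A sketch, assuming the conventional layout with rows as classified labels and columns as reference (ground truth) labels:

```python
def accuracy_and_kappa(matrix):
    """Overall accuracy and Cohen's kappa from a square confusion
    matrix (rows = classified, columns = reference)."""
    n = sum(sum(row) for row in matrix)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    # Expected chance agreement from row and column marginals.
    expected = sum(sum(matrix[i]) * sum(row[i] for row in matrix)
                   for i in range(len(matrix))) / (n * n)
    return observed, (observed - expected) / (1 - expected)
```

For a two-class map with 40 correct and 10 confused samples per class, overall accuracy is 0.80 but kappa is 0.60, reflecting how much of that agreement chance alone would have produced.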

Error propagates through analysis. A positional error of 5 m in each of two buffered polygons can propagate to larger errors in their intersection; an attribute error in a single factor in a weighted overlay bleeds into every downstream cell. Sensitivity analysis and Monte Carlo simulation — perturbing inputs within plausible error bounds and rerunning the model many times — are the disciplined answer. They produce probability surfaces rather than single hard outputs, revealing which conclusions are robust and which depend on the least-certain inputs.
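A Monte Carlo run is, at bottom, a loop that perturbs inputs and re-measures. The sketch below propagates Gaussian positional error through a simple distance measurement; the error magnitude, sample count, and seed are illustrative choices:

```python
import math
import random

def simulate_distance(p1, p2, sigma, n=10000, seed=42):
    """Monte Carlo error propagation: perturb both endpoints with
    Gaussian positional error (std dev sigma, metres) and collect the
    resulting distances, yielding a distribution rather than one hard
    number."""
    rng = random.Random(seed)
    dists = []
    for _ in range(n):
        x1 = p1[0] + rng.gauss(0, sigma); y1 = p1[1] + rng.gauss(0, sigma)
        x2 = p2[0] + rng.gauss(0, sigma); y2 = p2[1] + rng.gauss(0, sigma)
        dists.append(math.hypot(x2 - x1, y2 - y1))
    return dists
```

For two points nominally 100 m apart with 5 m positional error, the simulated distances cluster near 100 m but spread over several metres; reporting that spread, rather than the single nominal value, is the honest output.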

Uncertainty is broader than measurable error. It includes vagueness (where exactly does a forest end and a shrubland begin?), ambiguity (which of several valid classifications applies?), and ignorance (we have no data for this variable). Vague categories can be handled with fuzzy sets: a pixel is 0.7 forest and 0.3 grassland rather than one or the other. Ambiguity is managed with ontologies and controlled vocabularies that make classification criteria explicit. Ignorance can only be handled honestly — by acknowledging it in the metadata and, when stakes are high, refusing to make claims the data do not support. The famous MAUP (modifiable areal unit problem) reminds us that conclusions about aggregated data can change when zone boundaries change, and that our maps are therefore artifacts of our zoning. Uncertainty and scale-sensitivity are not reasons to abandon GIS; they are reasons to use it thoughtfully.

Chapter 13 — Spatial Interpolation and Geostatistics

Interpolation estimates values at unmeasured locations from a sparse set of measured ones, producing continuous surfaces out of point samples. It rests on Tobler’s first law — nearby points are more alike than distant ones — and different methods encode this intuition differently. Thiessen (Voronoi) polygons assign each unsampled location the value of its nearest sample; the surface is piecewise constant and discontinuous at cell edges, appropriate when values are categorical or when each sample genuinely “owns” its region.

Inverse distance weighted (IDW) interpolation averages nearby samples with weights that decline as a power of distance. The power parameter controls smoothness: higher powers make nearby points dominate, sharpening the surface; lower powers spread influence, smoothing it. IDW is simple, fast, exact at sample locations, but it produces bullseyes around isolated high points and cannot extrapolate beyond the sampled range.
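IDW is short enough to write out in full. A sketch with a hypothetical sample layout, showing both the exactness at sample points and the distance-power weighting:

```python
def idw(x, y, samples, power=2):
    """Inverse distance weighted estimate at (x, y) from
    samples = [(sx, sy, value), ...]. Exact at sample locations."""
    num = den = 0.0
    for sx, sy, v in samples:
        d2 = (x - sx) ** 2 + (y - sy) ** 2
        if d2 == 0:
            return v                      # exact interpolator
        w = 1.0 / d2 ** (power / 2)       # weight = 1 / distance**power
        num += w * v
        den += w
    return num / den
```

Midway between samples valued 10 and 20 the estimate is 15, and no choice of power can push an estimate above 20 or below 10, which is the no-extrapolation limitation noted above.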

Splines fit smooth mathematical surfaces through or near the data points, minimizing curvature. Thin-plate and regularized splines yield aesthetically smooth surfaces for phenomena known to be gradual — elevation, temperature — but can overshoot in regions of sparse data.

Kriging, the centrepiece of geostatistics, treats the surface as a realization of a spatially correlated random process. The analyst first computes an empirical variogram: half the average squared difference between pairs of samples (the semivariance) as a function of their separation. Fitting a theoretical model (spherical, exponential, Gaussian) to the variogram captures the range, sill, and nugget — the distance beyond which points are uncorrelated, the total variance, and the microscale plus measurement noise. Kriging then computes weights that minimize the expected squared error at each prediction location, and uniquely among interpolators it returns a kriging variance map quantifying prediction uncertainty. Ordinary, universal, indicator, and co-kriging extend the framework to handle trends, thresholds, and auxiliary variables.
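The empirical semivariogram is a pairwise computation followed by distance binning. A sketch; keying each bin by its lower edge is a simplification here, as production tools usually report bin centres and enforce a maximum lag:

```python
import math

def empirical_semivariogram(samples, bin_width):
    """Semivariance per distance bin: half the mean squared difference
    over all sample pairs whose separation falls in the bin.
    samples = [(x, y, value), ...]; bins keyed by lower edge."""
    bins = {}
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            x1, y1, v1 = samples[i]
            x2, y2, v2 = samples[j]
            h = math.hypot(x2 - x1, y2 - y1)
            bins.setdefault(int(h // bin_width), []).append(0.5 * (v1 - v2) ** 2)
    return {b * bin_width: sum(vals) / len(vals)
            for b, vals in sorted(bins.items())}
```

Rising semivariance at short lags that levels off at some distance is the range-and-sill shape the theoretical models are fitted to; a nonzero intercept is the nugget.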

Choice of method depends on the phenomenon, the density of samples, and whether uncertainty estimates are needed. Exploratory spatial data analysis — histograms, scatterplots against coordinates, variogram clouds, trend tests — should always precede interpolation, because blindly applying a method to non-stationary or anisotropic data yields plausible but wrong surfaces.

Chapter 14 — Spatial Modelling and Decision Support

The final step in GIS maturity is modelling: stringing together data, projections, overlays, algebra, and interpolation into a pipeline that represents a process or supports a decision. Models fall on a spectrum from descriptive (what is where) through explanatory (why it is there) to predictive (what will be) and prescriptive (what should we do).

Multi-criteria decision analysis (MCDA) is the workhorse method for suitability and site selection. Analysts identify criteria (slope, distance to roads, land cover, zoning), standardize each to a common scale (often 0–1 or 0–10), weight them by importance, and combine them into a composite score with weighted sum, weighted product, or ordered weighted averaging. The analytic hierarchy process (AHP) helps elicit defensible weights from pairwise comparisons. The output is a suitability surface that can be thresholded, ranked, or handed to a location-allocation solver.
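The standardize-then-weight step reduces to a dot product. A sketch with illustrative criterion values and weights (any real study would justify both through the elicitation process described above):

```python
def standardize(value, worst, best):
    """Linear 0-1 rescaling of a raw criterion value, where `worst`
    maps to 0 and `best` maps to 1 (best may be below worst, e.g.
    for slope, where flatter is better)."""
    return (value - worst) / (best - worst)

def weighted_sum(criteria, weights):
    """Composite MCDA score from criteria standardized to 0-1
    (higher = more suitable) and weights summing to 1."""
    assert abs(sum(weights) - 1) < 1e-9, "weights must sum to 1"
    return sum(c * w for c, w in zip(criteria, weights))
```

A site with a 15 percent slope on a 0-to-30 scale standardizes to 0.5; combining that with two other standardized criteria, say 0.8 and 1.0, under weights 0.5/0.3/0.2 gives a composite score of 0.69, ready for thresholding or ranking.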

Dynamic spatial models simulate how systems evolve. Cellular automata models — the SLEUTH urban growth model is the canonical example — update each cell’s state based on its neighbours’ states and a set of transition rules, producing plausible urban footprints decades into the future. Agent-based models replace cells with autonomous actors (households, firms, vehicles) whose behaviours generate emergent spatial patterns. Hydrological, ecological, and transportation models couple domain-specific equations with GIS data preparation and post-processing.

Spatial decision support systems (SDSS) wrap these models in interfaces that non-specialists can actually use, letting stakeholders set weights, toggle constraints, and view alternatives interactively. The principles behind them — transparency about assumptions, traceability of inputs, communicability of results — are the same principles that guide any good analysis.

What separates a useful GIS project from a decorative one is its engagement with the question that prompted it. A planner who begins with “the council needs to know which neighbourhoods are most exposed to heat stress” will collect temperature, land cover, census, and health data; reproject them to a common frame; align them on a grid; compute suitability; report uncertainty; and produce a map the council can act on. A planner who begins with “I have some shapefiles and ArcGIS Pro” will produce something pretty. The course’s aim is to graduate the first kind.
