CO 454: Scheduling

Levent Tunçel



Sources and References

Primary textbook — M. Pinedo, Scheduling: Theory, Algorithms, and Systems (6th ed., Springer, 2022; freely available at the author’s website)

Supplementary texts — S. French, Sequencing and Scheduling: An Introduction to the Mathematics of the Jobshop (Ellis Horwood, 1982); E.L. Lawler, J.K. Lenstra, A.H.G. Rinnooy Kan & D.B. Shmoys, “Sequencing and Scheduling: Algorithms and Complexity” in Logistics of Production and Inventory (North-Holland, 1993); K.R. Baker, Introduction to Sequencing and Scheduling (Wiley, 1974); D.S. Hochbaum (ed.), Approximation Algorithms for NP-Hard Problems (PWS, 1997)

Online resources — D.R. Karger, C. Stein & J. Wein, “Scheduling Jobs with Fixed Precedence Constraints” survey; Peter Brucker, Scheduling Algorithms (Springer, 5th ed.); MIT OCW 6.854 Advanced Algorithms (scheduling segments)


Chapter 1: Framework and Notation

1.1 Motivation and Practical Examples

Scheduling problems arise whenever a set of tasks must be assigned to resources over time. Representative instances include:

  • Manufacturing: assigning jobs to machines in a factory to minimise total completion time or meet delivery deadlines.
  • Computer science: scheduling processes on CPUs or distributed cores to minimise response time or energy consumption.
  • Healthcare: assigning nurses to shifts or surgeries to operating rooms subject to legal working-hour constraints.
  • Transportation: routing aircraft or railway units, with crew rostering satisfying union rules.
  • Project management: coordinating project activities with technological precedence constraints using critical-path methods (PERT/CPM).

Despite this diversity, the mathematical structure is remarkably uniform. A scheduling instance consists of a set of jobs (or tasks, operations) to be processed on a set of machines (or processors, resources). Jobs consume machine time; the goal is to find an assignment of jobs to machines and a sequencing of jobs on each machine that optimises a given performance criterion.

1.2 The Three-Field (Graham) Notation

The standard shorthand for scheduling problems is the α | β | γ notation introduced by Graham, Lawler, Lenstra and Rinnooy Kan (1979). Each field restricts or qualifies the problem:

Definition (α | β | γ notation): A scheduling problem is written as α | β | γ, where:
  • α (machine environment) describes the number and configuration of machines.
  • β (job characteristics) lists side constraints and additional parameters.
  • γ (objective function) specifies the criterion to be optimised.
Remark (Historical context of the notation): The \(\alpha|\beta|\gamma\) notation was introduced by Graham, Lawler, Lenstra, and Rinnooy Kan in their landmark 1979 survey "Optimization and Approximation in Deterministic Sequencing and Scheduling." Before this, scheduling literature used inconsistent ad hoc notation, making it difficult to compare results across papers. The notation's power is that it places every scheduling problem in a unified taxonomy, immediately communicating the machine environment, any side constraints, and the objective. The systematic classification of which problems are polynomial and which are NP-hard required decades of subsequent work — and is still incomplete for some parameter combinations, particularly for shop problems with more than two machines and for problems combining release dates with complex objectives. The NP-hardness of many scheduling problems was genuinely surprising to the community, since problems like \(1\|\|L_{\max}\) (polynomial) and \(1\|r_j\|L_{\max}\) (NP-hard) look nearly identical but differ in computational complexity.

Machine environments (α field)

Definition (Machine environments): The α field specifies the machine configuration:
  • Single machine (\(1\)): one machine processes all jobs sequentially. Every job must wait for the previous one to finish.
  • Identical parallel machines (\(P_m\)): \(m\) machines each running at the same speed. Job \(j\) takes time \(p_j\) on any machine. If \(m\) is variable (part of the input), written \(P\).
  • Uniform parallel machines (\(Q_m\)): \(m\) machines with speeds \(s_1, \ldots, s_m\). Job \(j\) on machine \(i\) takes time \(p_j / s_i\). Faster machines finish jobs sooner.
  • Unrelated parallel machines (\(R_m\)): \(m\) machines with job-dependent speeds. Job \(j\) on machine \(i\) takes time \(p_{ij}\), arbitrary. Most general parallel model.
  • Flow shop (\(F_m\)): \(m\) machines in series. Every job visits machines \(1, 2, \ldots, m\) in exactly that order, with processing time \(p_{jk}\) on machine \(k\).
  • Job shop (\(J_m\)): \(m\) machines. Each job has its own routing — a specific sequence of machines to visit — which may differ from other jobs' routings.
  • Open shop (\(O_m\)): \(m\) machines. Each job must be processed once on each machine, but the order of machines is unconstrained and can be chosen as part of the optimisation.

Job characteristics (β field)

Multiple entries may appear simultaneously, separated by commas.

  • \(r_j\): job \(j\) has release date \(r_j\) (earliest start time)
  • \(d_j\): job \(j\) has a hard deadline \(d_j\) (must complete by \(d_j\))
  • \(\bar{d}_j\): due date (soft; lateness is penalised but not infeasible)
  • \(p_j\): processing time of job \(j\) (assumed integer without loss of generality by scaling)
  • \(w_j\): non-negative weight (priority factor) of job \(j\)
  • \(\text{pmtn}\): preemption allowed; jobs may be interrupted and resumed later
  • \(\text{prec}\): precedence constraints; job \(j\) cannot start until all its predecessors are complete
  • \(s_{jk}\): sequence-dependent setup time between job \(j\) and job \(k\)

Objective functions (γ field)

Let \(C_j\) denote the completion time of job \(j\). Define:

  • Makespan: \(C_{\max} = \max_j C_j\)
  • Lateness: \(L_j = C_j - \bar{d}_j\); maximum lateness \(L_{\max} = \max_j L_j\)
  • Tardiness: \(T_j = \max(C_j - \bar{d}_j, 0)\)
  • Unit penalty (lateness indicator): \(U_j = \mathbf{1}[C_j > \bar{d}_j]\)
  • Weighted completion time: \(\sum_j w_j C_j\)
  • Weighted tardiness: \(\sum_j w_j T_j\)
  • Weighted number of late jobs: \(\sum_j w_j U_j\)
Example: The problem \(1 \| r_j, \text{pmtn} \| L_{\max}\) asks for a preemptive schedule of jobs with release dates on a single machine minimising maximum lateness. The problem \(P \| \text{prec} \| C_{\max}\) asks for a minimum-makespan schedule of jobs with precedence constraints on identical parallel machines.
Example (More three-field notation): The following instances illustrate the range of scheduling problems captured by the \(\alpha|\beta|\gamma\) notation.

\(1|\text{pmtn}|L_{\max}\): single machine, preemption allowed, minimise maximum lateness. Preemption here means a job’s processing may be suspended at any moment and resumed without penalty. This problem is polynomial (solved by preemptive EDD), in contrast to \(1|r_j|L_{\max}\) (NP-hard) and \(1||L_{\max}\) (polynomial by EDD without preemption).

\(P|\text{prec}, p_j=1|C_{\max}\): identical parallel machines (\(m\) machines, variable), unit-processing-time jobs, precedence constraints, minimise makespan. The constraint \(p_j = 1\) restricts every job to unit duration. Despite this simplification, the problem is NP-hard when the number of machines is part of the input (Ullman, 1975). For \(m = 2\), polynomial algorithms exist based on matching (Fujii, Kasami and Ninomiya) and on the Coffman–Graham labelling; for fixed \(m \geq 3\), the complexity is a long-standing open question.

\(J3||C_{\max}\): 3-machine job shop (each job visits the machines in a job-specific order), no side constraints, minimise makespan. This is NP-hard in the strong sense: one of the earliest classical NP-hardness results in scheduling. The 2-machine job shop \(J2||C_{\max}\) with at most two operations per job is polynomial via Jackson's rule, but it becomes strongly NP-hard once jobs may have three or more operations. The 2-machine flow shop \(F2||C_{\max}\) (fixed machine order) is polynomial via Johnson's algorithm.

\(R|r_j|\sum C_j\): unrelated parallel machines (job \(j\) on machine \(i\) takes time \(p_{ij}\), arbitrary), release dates, minimise total completion time. This is NP-hard — the combination of unrelated machines and release dates makes it intractable even without weights. The LP relaxation admits a 2-approximation via randomised rounding (Schulz and Skutella).

In each case, the three fields uniquely specify the problem: the \(\alpha\) field fixes the hardware configuration, the \(\beta\) field lists side constraints (release dates, precedence, preemption, unit sizes), and the \(\gamma\) field names the criterion. Changing any single field can shift the problem from polynomial to NP-hard or vice versa.

Definition (Preemptive scheduling): A schedule is preemptive if a job's processing may be interrupted at any time point and resumed later, possibly on a different machine. Formally, job \(j\) is assigned a (possibly discontinuous) set of time intervals \(I_j \subseteq [0, \infty)\) of total measure \(p_j\), with no two jobs sharing the same machine at the same time. The completion time is \(C_j = \sup I_j\). In our model the preemption penalty is zero: there is no overhead for suspending or resuming a job, and no minimum run-length constraint. Under this assumption, preemptive schedules can always do at least as well as non-preemptive ones, since any non-preemptive schedule is also a preemptive schedule (with each \(I_j\) a single interval). The size of the gap between the preemptive and non-preemptive optima depends on the objective: it can be substantial for \(\sum C_j\) in the presence of release dates, while for \(P\|\|C_{\max}\) the non-preemptive optimum exceeds the preemptive optimum by a factor of at most \(4/3\).
Remark (Complexity landscape of single-machine scheduling): The computational complexity of scheduling problems is surprisingly sensitive to the precise combination of parameters. Consider the following progression on a single machine:

\(1||\sum C_j\): solvable in \(O(n \log n)\) by SPT (Shortest Processing Time first). Every permutation that sorts jobs by non-decreasing \(p_j\) is optimal.

\(1|\text{prec}|\sum C_j\): adding arbitrary precedence constraints makes the problem NP-hard (Lenstra and Rinnooy Kan, 1978). Note that with unit processing times the objective becomes trivial: every feasible sequence completes the jobs at times \(1, 2, \ldots, n\), so \(\sum C_j = n(n+1)/2\) regardless of order; the hardness genuinely stems from varying processing times.

\(1|r_j|\sum C_j\): adding release dates (without precedence) makes the problem strongly NP-hard (Lenstra, Rinnooy Kan and Brucker, 1977). A subtlety: an optimal schedule may deliberately keep the machine idle while a long job is waiting, in order to start a short job that is about to be released.

\(1|r_j, \text{pmtn}|\sum C_j\): allowing preemption restores polynomial solvability. The preemptive SRPT (Shortest Remaining Processing Time) rule is optimal and runs in \(O(n \log n)\).

\(1||\sum w_j C_j\): weighted completion time without release dates is polynomial (WSPT, \(O(n \log n)\)). Weights do not increase complexity here.

\(1||\sum w_j U_j\): the weighted number of late jobs is NP-hard (reduction from PARTITION), whereas \(1||\sum U_j\) is polynomial (Moore’s algorithm).

The interplay between preemption, precedence constraints, release dates, weights, and the objective function governs the complexity boundary. No single rule captures all cases: one must check each combination in the known complexity classification tables (see Lawler–Lenstra–Rinnooy Kan–Shmoys 1993 survey).

1.3 Standard Assumptions

Unless stated otherwise:

  1. All data (\(p_j, r_j, d_j, w_j\)) are non-negative integers.
  2. Machines process at most one job at a time.
  3. Jobs are non-preemptive (interruption is forbidden) unless β contains “pmtn”.
  4. Each job requires exactly one operation on each relevant machine.
  5. Setup and teardown times are absorbed into processing times or treated separately via \(s_{jk}\).
Example (Concrete 4-job single-machine scheduling): Consider four jobs with processing times \(p_1 = 3\), \(p_2 = 1\), \(p_3 = 4\), \(p_4 = 2\) on a single machine. We compare two orderings. In the index order \((1, 2, 3, 4)\): \[ C_1 = 3,\quad C_2 = 3+1 = 4,\quad C_3 = 4+4 = 8,\quad C_4 = 8+2 = 10, \]\[ \sum C_j = 3 + 4 + 8 + 10 = 25. \] In the SPT order \((2, 4, 1, 3)\), sorted by non-decreasing \(p_j\): \[ C_2 = 1,\quad C_4 = 1+2 = 3,\quad C_1 = 3+3 = 6,\quad C_3 = 6+4 = 10, \]\[ \sum C_j = 1 + 3 + 6 + 10 = 20. \]

SPT achieves \(\sum C_j = 20\), which is 5 less than the arbitrary order. To verify optimality, use the formula: if the SPT ordering has jobs \([1],[2],[3],[4]\) in positions \(1,2,3,4\) with processing times \(p_{[k]}\), then

\[ \sum_{j} C_j = \sum_{k=1}^{4} (n - k + 1)\, p_{[k]} = 4 \cdot 1 + 3 \cdot 2 + 2 \cdot 3 + 1 \cdot 4 = 4 + 6 + 6 + 4 = 20. \]

No permutation can do better — this follows from the rearrangement inequality: the sum \(\sum_k (n-k+1) p_{[k]}\) is minimised when the decreasing sequence of multipliers \((n-k+1)\) is paired with an increasing sequence \(p_{[k]}\), i.e., the shortest job receives the largest multiplier and the longest job the smallest.
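The comparison above can be checked with a few lines of Python (a sketch; the function name is ours, not from the notes):

```python
# Total completion time on one machine, verifying the positional formula
# sum_k (n-k+1) * p_[k] against a direct simulation of the instance
# p = (3, 1, 4, 2) from the example.

def total_completion_time(order):
    """Sum of completion times when jobs run in the given order of p_j values."""
    t, total = 0, 0
    for p in order:
        t += p          # completion time of this job
        total += t
    return total

p = [3, 1, 4, 2]
spt = sorted(p)                      # SPT: non-decreasing processing times
n = len(p)

direct = total_completion_time(spt)
positional = sum((n - k) * spt[k] for k in range(n))  # (n-k+1) with 0-based k

print(total_completion_time(p), direct, positional)   # 25 20 20
```

The positional formula agrees with the direct simulation, as the rearrangement-inequality argument predicts.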


Chapter 2: Single-Machine Scheduling

2.1 The Problem \(1 \| \| C_{\max}\)

With a single machine and no side constraints, makespan equals the total processing time regardless of sequence:

\[ C_{\max} = \sum_{j=1}^{n} p_j. \]

This is trivially optimal; every permutation achieves the same makespan.

2.2 Minimising Weighted Completion Time: \(1 \| \| \sum w_j C_j\)

Theorem (WSPT rule): The optimal schedule for \(1 \| \| \sum w_j C_j\) sequences jobs in non-decreasing order of the ratio \(p_j / w_j\), equivalently non-increasing order of \(w_j / p_j\).
Proof (exchange argument). Suppose an optimal schedule \(\sigma\) has adjacent jobs \(j\) and \(k\) (with \(j\) immediately before \(k\)) such that \(p_j/w_j > p_k/w_k\), i.e., \(w_k p_j > w_j p_k\). Let \(S\) be the start time of job \(j\). The difference in cost between order \((j, k)\) and order \((k, j)\) is: \[ \Delta = \bigl[w_j(S+p_j) + w_k(S+p_j+p_k)\bigr] - \bigl[w_k(S+p_k) + w_j(S+p_k+p_j)\bigr] = w_k p_j - w_j p_k. \]

By assumption \(w_k p_j > w_j p_k\), so \(\Delta > 0\): swapping \(j\) and \(k\) reduces the objective by \(w_k p_j - w_j p_k > 0\), contradicting the optimality of \(\sigma\). Hence no such inversion exists in any optimal schedule, and the WSPT order is optimal.

The WSPT rule (also called Smith’s rule) runs in \(O(n \log n)\) time. When all weights are equal (\(w_j = 1\)), WSPT reduces to SPT (Shortest Processing Time first), which minimises \(\sum C_j\).

Example (WSPT for 3 jobs): Three jobs with \((p_1, w_1) = (4, 6)\), \((p_2, w_2) = (2, 2)\), \((p_3, w_3) = (3, 9)\). Compute the priority ratios \(w_j/p_j\): \[ \frac{w_1}{p_1} = \frac{6}{4} = 1.5, \qquad \frac{w_2}{p_2} = \frac{2}{2} = 1.0, \qquad \frac{w_3}{p_3} = \frac{9}{3} = 3.0. \] WSPT order (decreasing \(w_j/p_j\)): job 3, job 1, job 2. \[ C_3 = 3, \quad C_1 = 3+4 = 7, \quad C_2 = 7+2 = 9, \]\[ \sum w_j C_j = 9 \cdot 3 + 6 \cdot 7 + 2 \cdot 9 = 27 + 42 + 18 = 87. \] For comparison, the order \((1, 2, 3)\) gives \[ C_1 = 4, \quad C_2 = 6, \quad C_3 = 9, \]\[ \sum w_j C_j = 6 \cdot 4 + 2 \cdot 6 + 9 \cdot 9 = 24 + 12 + 81 = 117, \] and the order \((1, 3, 2)\) gives \[ C_1 = 4, \quad C_3 = 7, \quad C_2 = 9, \]\[ \sum w_j C_j = 6 \cdot 4 + 9 \cdot 7 + 2 \cdot 9 = 24 + 63 + 18 = 105. \]

WSPT achieves 87, which is optimal. Intuitively, job 3 has the highest priority (weight 9, short processing time 3) and should run first to avoid making a high-weight job wait.
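A minimal Python sketch of Smith's rule, run on the 3-job instance above (illustrative names, not course code):

```python
# WSPT (Smith's rule): sort by non-increasing w_j / p_j, then accumulate
# weighted completion times along the resulting sequence.

def wspt_schedule(jobs):
    """jobs: list of (p_j, w_j). Returns (order of job indices, sum of w_j * C_j)."""
    order = sorted(range(len(jobs)), key=lambda j: jobs[j][1] / jobs[j][0],
                   reverse=True)     # decreasing w/p, i.e. increasing p/w
    t, cost = 0, 0
    for j in order:
        p, w = jobs[j]
        t += p                       # completion time of job j
        cost += w * t
    return order, cost

jobs = [(4, 6), (2, 2), (3, 9)]      # (p_j, w_j) for jobs 1, 2, 3
order, cost = wspt_schedule(jobs)
print([j + 1 for j in order], cost)  # [3, 1, 2] 87
```

Sorting dominates the running time, giving the \(O(n \log n)\) bound stated above.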

2.3 Minimising Maximum Lateness: \(1 \| \| L_{\max}\)

Theorem (EDD rule): For \(1 \| \| L_{\max}\), the Earliest Due Date (EDD) rule — scheduling jobs in non-decreasing order of \(\bar{d}_j\) — is optimal.
Proof (exchange argument). Suppose schedule \(\sigma\) has adjacent jobs \(j\) before \(k\) with \(\bar{d}_j > \bar{d}_k\). Let \(S\) be the start time of job \(j\). In the current order: \[ L_j = S + p_j - \bar{d}_j, \qquad L_k = S + p_j + p_k - \bar{d}_k. \] After swapping to order \((k, j)\): \[ L_k' = S + p_k - \bar{d}_k, \qquad L_j' = S + p_k + p_j - \bar{d}_j. \]

Note that \(L_k' < L_k\) (job \(k\) now completes earlier), and \(L_j' = L_k + (\bar{d}_k - \bar{d}_j) < L_k\) because \(\bar{d}_k < \bar{d}_j\). The lateness of every other job is unchanged. Thus \(\max(L_j', L_k') \leq L_k \leq \max(L_j, L_k)\), and the swap does not increase \(L_{\max}\). Applying such swaps until no inversion remains gives the EDD order.

With release dates (\(1 \| r_j \| L_{\max}\)), the problem becomes NP-hard in general. However:

Theorem (Horn's algorithm, 1972): The problem \(1 \| r_j, \text{pmtn} \| L_{\max}\) is solvable in polynomial time by the preemptive EDD rule: at each time point, among all released jobs, process the one with smallest due date, preempting the current job if necessary.
Proof sketch. Let \(\sigma\) be the preemptive EDD schedule and \(\sigma^*\) any feasible preemptive schedule; we show \(L_{\max}(\sigma) \leq L_{\max}(\sigma^*)\). Let \(j\) be a job attaining the maximum lateness in \(\sigma\), and let \(t\) be the last time before \(C_j(\sigma)\) at which the machine is idle or processing a job with due date greater than \(d_j\) (take \(t = 0\) if no such time exists). Throughout \((t, C_j(\sigma)]\) the machine works without interruption on jobs with due dates at most \(d_j\), and by the choice of \(t\) and the EDD rule, each of these jobs is released at or after \(t\). Let \(K\) be this set of jobs; then \(C_j(\sigma) = \min_{k \in K} r_k + \sum_{k \in K} p_k\). In any schedule, the work of \(K\) cannot start before \(\min_{k \in K} r_k\), so in \(\sigma^*\) some job \(k \in K\) completes no earlier than \(C_j(\sigma)\). Since \(d_k \leq d_j\), its lateness satisfies \(L_k(\sigma^*) \geq C_j(\sigma) - d_j = L_{\max}(\sigma)\). Hence \(L_{\max}(\sigma^*) \geq L_{\max}(\sigma)\).
Example (Preemptive EDD for \(1|r_j, \text{pmtn}|L_{\max}\)): Four jobs with \((p_j, r_j, d_j)\):
  • Job 1: \(p_1 = 4\), \(r_1 = 0\), \(d_1 = 6\)
  • Job 2: \(p_2 = 2\), \(r_2 = 2\), \(d_2 = 4\)
  • Job 3: \(p_3 = 3\), \(r_3 = 3\), \(d_3 = 8\)
  • Job 4: \(p_4 = 1\), \(r_4 = 5\), \(d_4 = 7\)

Preemptive EDD trace:

\(t=0\): Only job 1 released. Process job 1.

\(t=2\): Job 2 arrives (\(d_2 = 4 < d_1 = 6\)). Preempt job 1 (remaining: 2 units). Process job 2.

\(t=3\): Job 3 arrives (\(d_3 = 8\)). Current: job 2 (\(d=4\)), job 1 (\(d=6\)), job 3 (\(d=8\)). Continue job 2 (smallest due date).

\(t=4\): Job 2 completes (\(C_2 = 4\), \(L_2 = 4 - 4 = 0\)). Available: job 1 (\(d=6\), rem=2), job 3 (\(d=8\), rem=3). Process job 1 (smallest due date).

\(t=5\): Job 4 arrives (\(d_4 = 7\)). Available: job 1 (\(d=6\), rem=1), job 3 (\(d=8\), rem=3), job 4 (\(d=7\), rem=1). Process job 1 (smallest due date = 6).

\(t=6\): Job 1 completes (\(C_1 = 6\), \(L_1 = 6 - 6 = 0\)). Available: job 3 (\(d=8\), rem=3), job 4 (\(d=7\), rem=1). Process job 4 (smallest due date = 7).

\(t=7\): Job 4 completes (\(C_4 = 7\), \(L_4 = 7 - 7 = 0\)). Process job 3.

\(t=10\): Job 3 completes (\(C_3 = 10\), \(L_3 = 10 - 8 = 2\)).

\[L_{\max} = \max(0, 0, 2, 0) = 2.\]

This schedule achieves \(L_{\max} = 2\), which is optimal for this instance: the total processing time is \(4 + 2 + 3 + 1 = 10\), so in any schedule some job completes no earlier than \(t = 10\); since every due date is at most \(d_3 = 8\), a lateness of at least 2 is unavoidable.
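The trace above can be reproduced by a unit-step simulation, which works because all data are integers (a sketch; the function name is ours):

```python
import heapq

# Unit-step simulation of preemptive EDD (Horn's rule) for 1 | r_j, pmtn | L_max.
# At each integer time t we release newly arrived jobs into a min-heap keyed by
# due date, then run one unit of the job with the smallest due date.

def preemptive_edd_lmax(jobs):
    """jobs: list of (p_j, r_j, d_j). Returns L_max of the preemptive EDD schedule."""
    n = len(jobs)
    remaining = [p for p, r, d in jobs]
    completion = [None] * n
    heap, t = [], 0
    while any(c is None for c in completion):
        for j, (p, r, d) in enumerate(jobs):      # release jobs arriving at time t
            if r == t:
                heapq.heappush(heap, (d, j))
        if heap:
            d, j = heap[0]                        # smallest due date runs
            remaining[j] -= 1
            if remaining[j] == 0:
                heapq.heappop(heap)
                completion[j] = t + 1
        t += 1
    return max(completion[j] - jobs[j][2] for j in range(n))

jobs = [(4, 0, 6), (2, 2, 4), (3, 3, 8), (1, 5, 7)]
print(preemptive_edd_lmax(jobs))   # 2
```

Advancing one unit at a time is wasteful for large processing times but keeps the sketch close to the hand trace.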

Example (EDD for 4 jobs): Four jobs with \((p_j, \bar{d}_j)\): job 1: \((3, 5)\), job 2: \((2, 3)\), job 3: \((1, 4)\), job 4: \((4, 7)\). EDD order sorts by due date: job 2 \((\bar{d}=3)\), job 3 \((\bar{d}=4)\), job 1 \((\bar{d}=5)\), job 4 \((\bar{d}=7)\). \[ C_2 = 2, \quad C_3 = 2+1 = 3, \quad C_1 = 3+3 = 6, \quad C_4 = 6+4 = 10, \]\[ L_2 = 2-3 = -1, \quad L_3 = 3-4 = -1, \quad L_1 = 6-5 = 1, \quad L_4 = 10-7 = 3, \]\[ L_{\max} = 3. \] For comparison, the index order \((1, 2, 3, 4)\) gives \[ C_1=3,\ C_2=5,\ C_3=6,\ C_4=10, \]\[ L_1=3-5=-2,\ L_2=5-3=2,\ L_3=6-4=2,\ L_4=10-7=3, \]\[ L_{\max} = 3, \] while the order \((4, 1, 2, 3)\) gives \[ C_4=4,\ C_1=7,\ C_2=9,\ C_3=10, \]\[ L_4=4-7=-3,\ L_1=7-5=2,\ L_2=9-3=6,\ L_3=10-4=6, \]\[ L_{\max} = 6. \]

EDD achieves \(L_{\max} = 3\), which is optimal for this instance. Placing a job with a tight due date late (as in the last ordering) dramatically increases maximum lateness.
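A few lines of Python suffice to check the three \(L_{\max}\) values (a sketch, not course code):

```python
# Maximum lateness of a fixed sequence on one machine, compared across the
# EDD order and the two alternative orderings from the example.

def l_max(jobs, order):
    """jobs: list of (p_j, due_j); order: job indices. Returns max lateness."""
    t, worst = 0, float("-inf")
    for j in order:
        p, d = jobs[j]
        t += p                 # completion time of job j
        worst = max(worst, t - d)
    return worst

jobs = [(3, 5), (2, 3), (1, 4), (4, 7)]           # jobs 1..4 (0-indexed)
edd = sorted(range(4), key=lambda j: jobs[j][1])  # EDD: sort by due date
print(l_max(jobs, edd),                           # 3
      l_max(jobs, [0, 1, 2, 3]),                  # 3
      l_max(jobs, [3, 0, 1, 2]))                  # 6
```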

2.4 Minimising the Number of Late Jobs: \(1 \| \| \sum U_j\)

Theorem (Moore's algorithm, 1968): An optimal schedule for \(1 \| \| \sum U_j\) (minimise number of late jobs, unit weights) is produced as follows:
  1. Sort jobs by EDD order.
  2. Maintain a set A of on-time jobs. Process jobs in EDD order. When adding job \(j\) makes some job in \(A \cup \{j\}\) late, remove the job with the largest processing time from \(A \cup \{j\}\) (mark it late).
  3. Schedule jobs in \(A\) (on-time) followed by all late jobs.
The algorithm runs in \(O(n \log n)\) time.

For the weighted version \(1 \| \| \sum w_j U_j\), the problem is NP-hard (reduces from PARTITION).

Example (Moore's algorithm — 5 jobs): Five jobs arrive with the following data, already sorted by EDD order:
  • Job 1: \(p_1 = 3\), \(d_1 = 3\)
  • Job 2: \(p_2 = 2\), \(d_2 = 5\)
  • Job 3: \(p_3 = 1\), \(d_3 = 6\)
  • Job 4: \(p_4 = 4\), \(d_4 = 8\)
  • Job 5: \(p_5 = 2\), \(d_5 = 9\)

Apply Moore’s algorithm. Maintain a set \(A\) of on-time jobs and process jobs one by one in EDD order.

Add job 1 (\(p=3\)): cumulative time = 3, \(C_1 = 3 \leq d_1 = 3\). On time. \(A = \{1\}\).

Add job 2 (\(p=2\)): cumulative time = \(3+2=5\), \(C_2 = 5 \leq d_2 = 5\). On time. \(A = \{1,2\}\).

Add job 3 (\(p=1\)): cumulative time = \(5+1=6\), \(C_3 = 6 \leq d_3 = 6\). On time. \(A = \{1,2,3\}\).

Add job 4 (\(p=4\)): cumulative time = \(6+4=10\), \(C_4 = 10 > d_4 = 8\). Late! Among \(A \cup \{4\} = \{1,2,3,4\}\), the largest processing time is \(p_4 = 4\). Remove job 4 from \(A\), mark it late. Cumulative time reverts to \(6\). \(A = \{1,2,3\}\).

Add job 5 (\(p=2\)): cumulative time = \(6+2=8\), \(C_5 = 8 \leq d_5 = 9\). On time. \(A = \{1,2,3,5\}\).

\[ C_1 = 3,\quad C_2 = 5,\quad C_3 = 6,\quad C_5 = 8,\quad C_4 = 12. \]\[ U_1 = 0,\quad U_2 = 0,\quad U_3 = 0,\quad U_5 = 0,\quad U_4 = 1 \quad (12 > 8). \]

\(\sum U_j = 1\). This is optimal: we cannot schedule all 5 jobs on time, because the total processing time \(3+2+1+4+2 = 12\) exceeds the largest due date \(d_5 = 9\), so whichever job finishes last is necessarily late in any schedule. Moore’s algorithm guarantees the maximum number of on-time jobs (here 4 out of 5).
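Moore's algorithm admits a clean \(O(n \log n)\) implementation with a max-heap of the jobs accepted so far; the following sketch reproduces the 5-job example (the function name is ours):

```python
import heapq

# Moore-Hodgson algorithm for 1 || sum U_j: process jobs in EDD order; on a
# due-date violation, drop the largest job scheduled so far. Python's heapq is
# a min-heap, so we store negated processing times to pop the maximum.

def moore_hodgson(jobs):
    """jobs: list of (p_j, d_j). Returns the minimum number of late jobs."""
    jobs = sorted(jobs, key=lambda jd: jd[1])    # EDD order
    heap, t, late = [], 0, 0
    for p, d in jobs:
        heapq.heappush(heap, -p)
        t += p
        if t > d:                                # some job must go late
            t += heapq.heappop(heap)             # subtract the largest p
            late += 1
    return late

jobs = [(3, 3), (2, 5), (1, 6), (4, 8), (2, 9)]
print(moore_hodgson(jobs))   # 1
```

Here the violation occurs when job 4 is added, the heap pops \(p_4 = 4\), and every remaining job fits, matching the hand trace.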

Theorem (Optimality of preemptive SRPT for \(1|r_j, \text{pmtn}|\sum C_j\)): At each time \(t\), among all jobs that have been released and not yet completed, process the one with the shortest remaining processing time. This rule — Shortest Remaining Processing Time (SRPT) — produces an optimal schedule for \(1|r_j, \text{pmtn}|\sum C_j\).
Proof sketch (exchange argument). Suppose a schedule runs job \(j\) at some time \(t\) while another available job \(i\) has strictly smaller remaining processing time. Take all the time slots the schedule devotes to \(i\) and \(j\) from time \(t\) onward and reassign them: first complete the remaining work of \(i\), then the remaining work of \(j\). Since \(i\)'s remaining time is smaller, \(i\) now finishes no later than the earlier of the two original completion times, while \(j\) finishes exactly at the later one; hence \(C_i + C_j\) does not increase, and no other job is affected. Repeating this exchange wherever the SRPT rule is violated transforms any optimal schedule into the SRPT schedule without increasing \(\sum C_j\), so SRPT is optimal.
Example (SRPT for 4 jobs with release dates): Four jobs with processing times and release dates:
  • Job 1: \(p_1 = 4\), \(r_1 = 0\)
  • Job 2: \(p_2 = 2\), \(r_2 = 1\)
  • Job 3: \(p_3 = 1\), \(r_3 = 2\)
  • Job 4: \(p_4 = 3\), \(r_4 = 4\)

SRPT trace.

\(t=0\): Only job 1 is available (released). Start job 1. Remaining: \(p_1^{\text{rem}} = 4\).

\(t=1\): Job 2 arrives with \(p_2 = 2\). Remaining times: \(p_1^{\text{rem}} = 3\), \(p_2^{\text{rem}} = 2\). Shortest remaining = job 2. Preempt job 1, start job 2.

\(t=2\): Job 3 arrives with \(p_3 = 1\). Remaining: \(p_1^{\text{rem}} = 3\), \(p_2^{\text{rem}} = 1\), \(p_3^{\text{rem}} = 1\). Tie between jobs 2 and 3 (both remaining = 1). Continue job 2 (tie-break: prefer the running job).

\(t=3\): Job 2 completes (\(C_2 = 3\)). Available: job 1 (\(p^{\text{rem}}=3\)), job 3 (\(p^{\text{rem}}=1\)). Run job 3.

\(t=4\): Job 3 completes (\(C_3 = 4\)). Job 4 arrives with \(p_4 = 3\). Available: job 1 (\(p^{\text{rem}}=3\)), job 4 (\(p^{\text{rem}}=3\)). Tie — run job 1.

\(t=7\): Job 1 completes (\(C_1 = 7\)). Run job 4.

\(t=10\): Job 4 completes (\(C_4 = 10\)).

Completions: \(C_1 = 7\), \(C_2 = 3\), \(C_3 = 4\), \(C_4 = 10\).

\[ \sum C_j = 7 + 3 + 4 + 10 = 24. \] For comparison, a non-preemptive schedule that runs job 1 to completion and then jobs 3, 2, 4 gives \[ C_1 = 4,\quad C_3 = 5,\quad C_2 = 7,\quad C_4 = 10, \]\[ \sum C_j = 4 + 5 + 7 + 10 = 26. \]

SRPT achieves \(\sum C_j = 24 < 26\), demonstrating the benefit of preemption: by interrupting job 1 when the shorter job 2 arrives, we complete job 2 earlier and reduce overall waiting.
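The trace can be reproduced with an event-driven simulation that advances to the next completion or release, whichever comes first (a sketch; the heap's tie-breaking by job index happens to match the tie-break used in the trace):

```python
import heapq

# Event-driven SRPT for 1 | r_j, pmtn | sum C_j. The heap holds
# (remaining time, job id) for released, unfinished jobs; the top job runs
# until it finishes or the next release arrives.

def srpt_total_completion(jobs):
    """jobs: list of (p_j, r_j). Returns sum of completion times under SRPT."""
    events = sorted(range(len(jobs)), key=lambda j: jobs[j][1])  # by release
    heap, t, i, total = [], 0, 0, 0
    while heap or i < len(events):
        if not heap:                              # machine idle: jump ahead
            t = max(t, jobs[events[i]][1])
        while i < len(events) and jobs[events[i]][1] <= t:
            j = events[i]                         # release newly arrived jobs
            heapq.heappush(heap, (jobs[j][0], j))
            i += 1
        rem, j = heapq.heappop(heap)              # shortest remaining time
        run = rem if i == len(events) else min(rem, jobs[events[i]][1] - t)
        t += run
        if run == rem:
            total += t                            # job j completes at t
        else:
            heapq.heappush(heap, (rem - run, j))  # preempted: push back
    return total

jobs = [(4, 0), (2, 1), (1, 2), (3, 4)]
print(srpt_total_completion(jobs))   # 24
```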

Remark (Weighted number of late jobs): The problem \(1||\sum w_j U_j\) (weighted number of late jobs) is NP-hard, while the unweighted version \(1||\sum U_j\) is polynomial (Moore's algorithm). This illustrates a recurring phenomenon in scheduling: adding weights to a polynomial problem can render it NP-hard. The hardness of \(1||\sum w_j U_j\) arises because the \(0\text{-}1\) penalty \(U_j\) combined with weights \(w_j\) turns the problem into a knapsack-type selection: we want a maximum-weight subset of jobs that can all be completed on time, which is NP-hard by reduction from PARTITION (though, like knapsack, it is solvable in pseudo-polynomial time by dynamic programming). In contrast, \(1||\sum w_j C_j\) (weighted completion time) remains polynomial with weights because the WSPT rule's exchange argument extends directly to the weighted case. The pattern is: weights interact badly with threshold objectives (\(U_j, T_j\)) but not with linear objectives (\(C_j\)).

2.5 Minimising Weighted Tardiness: \(1 \| \| \sum w_j T_j\)

This problem is strongly NP-hard (Lawler, 1977; the proof reduces from 3-PARTITION). For equal weights, \(1 \| \| \sum T_j\) is also NP-hard, though only in the ordinary sense (Du and Leung, 1990), and it admits a pseudo-polynomial dynamic program.

Dynamic programming approach: index states by subsets. Let \(f(S)\) denote the minimum weighted tardiness achievable when exactly the jobs in \(S\) occupy the first \(p(S) = \sum_{j \in S} p_j\) time units. Conditioning on which job \(j \in S\) is scheduled last (so that \(C_j = p(S)\)) gives the recurrence:

\[ f(S) = \min_{j \in S} \left\{ f(S \setminus \{j\}) + w_j \max\bigl(p(S) - \bar{d}_j,\, 0\bigr) \right\}, \qquad f(\varnothing) = 0. \]

The optimum is \(f(\{1, \ldots, n\})\). This runs in \(O(2^n \cdot n)\) time and \(O(2^n)\) space: exponential but exact.
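The recurrence translates directly into a bitmask dynamic program. The 4-job instance below is illustrative, not from the notes:

```python
# Exact O(2^n * n) subset DP for 1 || sum w_j T_j. f[S] is the minimum
# weighted tardiness when exactly the jobs in bitmask S occupy the first
# p(S) time units; the minimising j is the job scheduled last in S.

def min_weighted_tardiness(p, w, d):
    n = len(p)
    psum = [0] * (1 << n)                   # p(S) for every subset S
    for S in range(1, 1 << n):
        j = (S & -S).bit_length() - 1       # lowest set bit = some job in S
        psum[S] = psum[S & (S - 1)] + p[j]
    INF = float("inf")
    f = [INF] * (1 << n)
    f[0] = 0
    for S in range(1, 1 << n):
        for j in range(n):
            if S >> j & 1:                  # try job j last: C_j = p(S)
                tard = max(psum[S] - d[j], 0)
                f[S] = min(f[S], f[S ^ (1 << j)] + w[j] * tard)
    return f[(1 << n) - 1]

p, w, d = [3, 2, 1, 4], [1, 2, 1, 3], [5, 3, 4, 7]
print(min_weighted_tardiness(p, w, d))   # 5
```

On this instance the optimum schedules job 1 last (finishing at \(p(S) = 10\), tardiness 5, weight 1), with the other three jobs on time.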

2.6 Preemptive Scheduling on a Single Machine

The most important preemptive single-machine problems arise when jobs have release dates. Without release dates, preemption never helps for regular objectives (since any non-preemptive schedule can be converted to a preemptive one with the same completion times).

Definition (Regular objective): An objective function \(\gamma\) is regular if it is non-decreasing in each \(C_j\). All standard objectives (\(\sum C_j\), \(\sum w_j C_j\), \(L_{\max}\), \(\sum T_j\), \(\sum U_j\)) are regular. For regular objectives, an optimal non-preemptive schedule always exists among non-idle schedules (no unnecessary idle time). When release dates are present, however, idle time may be forced (the machine must wait for an unreleased job), and preemption can help by switching to a different released job during that wait.
Remark (Preemption vs. non-preemption with release dates): Preemption can help substantially when release dates are present. Consider \(n\) jobs: job 1 has \(p_1 = n\), \(r_1 = 0\); jobs \(2, \ldots, n\) each have \(p_j = 1\), \(r_j = 1\). A non-preemptive schedule that starts job 1 at \(t=0\) finishes it at \(t=n\), after which jobs \(2,\ldots,n\) finish at \(t=n+1, \ldots, 2n-1\), for a total of \(\sum C_j = n + (n+1) + \cdots + (2n-1) = n(3n-1)/2\). Preemptively (SRPT): at \(t=1\), preempt job 1 and run jobs \(2, \ldots, n\) (each takes 1 unit, done by \(t = n\)), then resume job 1 (remaining \(n-1\) units), finishing at \(t = 2n-1\). Total: \(\sum C_j = (2 + 3 + \cdots + n) + (2n-1) \approx n^2/2\), roughly a third of the greedy non-preemptive value. A non-preemptive schedule could nearly match SRPT here by deliberately idling until \(t=1\), but that requires foreknowledge of the arrivals; preemption achieves the savings without such foresight, and the advantage grows with the diversity of job sizes.

2.7 Precedence Constraints

With precedence constraints (partial order \(\prec\)), the problem \(1 \| \text{prec} \| \sum w_j C_j\) is NP-hard in general. A special case:

Theorem (Lawler, 1978): The problem \(1 \| \text{prec} \| \sum w_j C_j\) with arbitrary weights and processing times is NP-hard. However, for series-parallel precedence constraints (which include chains and trees), Lawler gave a polynomial-time algorithm. Note that with unit weights and unit processing times (\(p_j = w_j = 1\)) the problem is trivial: on a single machine every feasible sequence completes the jobs at times \(1, 2, \ldots, n\), so \(\sum C_j = n(n+1)/2\) regardless of order.
Example (Precedence constraints vs. SPT): Four jobs in a chain precedence constraint \(1 \to 2 \to 3 \to 4\) with processing times \(p_1=3, p_2=1, p_3=2, p_4=4\). Because of the chain, the only feasible order is \((1,2,3,4)\): \[ C_1=3,\quad C_2=4,\quad C_3=6,\quad C_4=10, \]\[ \sum C_j = 3+4+6+10 = 23. \] Without the precedence constraint, the SPT order \((2,3,1,4)\) would give \[ C_2=1,\quad C_3=3,\quad C_1=6,\quad C_4=10, \]\[ \sum C_j = 1+3+6+10 = 20. \]

The precedence constraint forces a suboptimal ordering: \(\sum C_j = 23\) versus the unconstrained SPT optimum of 20. In general, precedence constraints can force deviations from SPT that increase \(\sum C_j\) by an arbitrarily large factor. Both \(1|\text{prec}|\sum C_j\) and \(1|\text{prec}|\sum w_j C_j\) are NP-hard for general precedence constraints (Lenstra and Rinnooy Kan, 1978); polynomial algorithms exist for structured partial orders such as chains, trees, and series-parallel precedences.


Chapter 3: Parallel Machine Scheduling

3.1 Identical Parallel Machines

Makespan: \(P \| \| C_{\max}\)

This problem is NP-hard for \(m \geq 2\) (by reduction from PARTITION; see Chapter 4). Graham’s list scheduling algorithm provides approximation guarantees.

Definition (List Scheduling): Given a priority list of jobs, assign each job, as it becomes available, to the first idle machine.
Theorem (Graham, 1969): List scheduling on \(m\) identical machines achieves approximation ratio \(2 - 1/m\) for \(P \| \| C_{\max}\).
Proof sketch. Let \(j^*\) be the last job to finish, completing at time \(C_{\max}\). Let \(t\) be the time at which \(j^*\) started. At time \(t\), all \(m\) machines were busy (otherwise \(j^*\) would have started earlier by the list scheduling rule). Therefore all \(m\) machines together performed at least \(m \cdot t\) work during \([0, t]\). Since the total work is \(\sum_j p_j\), we have \(m \cdot t \leq \sum_j p_j\), giving \(t \leq \sum_j p_j / m \leq \mathrm{OPT}\). Also \(p_{j^*} \leq \mathrm{OPT}\) (since any single job must fit in OPT). Thus: \[ C_{\max} = t + p_{j^*} \leq \mathrm{OPT} + \mathrm{OPT} = 2 \cdot \mathrm{OPT}. \] A more careful analysis accounting for the fact that \(t \leq (\sum_j p_j - p_{j^*})/m\) gives the tighter bound \(C_{\max} \leq (2 - 1/m) \cdot \mathrm{OPT}\).
Example (List scheduling can be bad): Consider 2 machines and 5 jobs: four jobs of length 1 and one job of length 4. If the priority list processes the four short jobs first, then the long job: machine 1 takes jobs 1 and 2 (done at \(t=2\)), machine 2 takes jobs 3 and 4 (done at \(t=2\)), then machine 1 takes the long job 5 (done at \(t=6\)). List scheduling makespan: 6.

Optimal: assign the long job (length 4) to machine 1 alone, assign the four short jobs (total length 4) to machine 2 (done at \(t=4\)). Optimal makespan: 4.

Ratio: \(6/4 = 1.5\), matching the bound \(2 - 1/m = 2 - 1/2 = 1.5\) for \(m=2\). Similar instances (\(m(m-1)\) unit jobs followed by one job of length \(m\)) show the bound is tight for every \(m\).

Theorem (Graham, 1969 — LPT rule): The Longest Processing Time first (LPT) rule — list scheduling with jobs sorted in non-increasing order of \(p_j\) — achieves approximation ratio \(4/3 - 1/(3m)\) for \(P \| \| C_{\max}\).
Proof sketch. Lower bounds: \(\mathrm{OPT} \geq \max_j p_j\) and \(\mathrm{OPT} \geq \sum_j p_j / m\). Let \(j^*\) be the last job to finish; as in the list-scheduling analysis, \(C_{\max} \leq \sum_j p_j / m + (1 - 1/m)\, p_{j^*}\). Case 1: if \(p_{j^*} \leq \mathrm{OPT}/3\), this gives \(C_{\max} \leq \mathrm{OPT} + (1 - 1/m)\,\mathrm{OPT}/3 = (4/3 - 1/(3m))\,\mathrm{OPT}\). Case 2: if \(p_{j^*} > \mathrm{OPT}/3\), then since \(j^*\) comes last in LPT order, every job has \(p_j > \mathrm{OPT}/3\), so each machine processes at most two jobs in any optimal schedule; for such instances LPT can be shown to be exactly optimal.
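Both rules fit in a few lines; a minimal sketch (function names are ours, not from the text) reproducing the bad-list example above:

```python
from heapq import heappush, heappop

def list_schedule(jobs, m):
    """Greedy list scheduling: assign each job, in list order, to the
    machine that becomes idle first. Returns the makespan."""
    loads = [0] * m                 # current finishing time of each machine
    heap = [(0, i) for i in range(m)]
    for p in jobs:
        t, i = heappop(heap)        # first idle machine
        loads[i] = t + p
        heappush(heap, (loads[i], i))
    return max(loads)

def lpt_schedule(jobs, m):
    """Graham's LPT rule: list scheduling on jobs sorted longest-first."""
    return list_schedule(sorted(jobs, reverse=True), m)

# The instance from the example: four unit jobs and one job of length 4.
bad_order = [1, 1, 1, 1, 4]
print(list_schedule(bad_order, 2))  # 6  (short jobs first trap the long job)
print(lpt_schedule(bad_order, 2))   # 4  (long job first: optimal here)
```

The heap makes "first idle machine" an \(O(\log m)\) operation, so a whole list is scheduled in \(O(n \log m)\) time.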

Weighted Completion Time: \(P \| \| \sum w_j C_j\)

Smith’s WSPT rule generalises heuristically: sort jobs in non-increasing order of \(w_j / p_j\) (ties broken arbitrarily) and list-schedule, assigning each job to the currently least-loaded machine. For the unweighted objective \(\sum C_j\) this reduces to SPT list scheduling, which is in fact optimal on identical machines. For general weights, WSPT list scheduling is not optimal for \(m \geq 2\); its worst-case ratio is \((1+\sqrt{2})/2 \approx 1.21\) (Kawaguchi & Kyan, 1986), and it is often near-optimal in practice.

Preemptive Makespan: \(P \| \text{pmtn} \| C_{\max}\)

Theorem (McNaughton, 1959): An optimal preemptive schedule for \(P \| \text{pmtn} \| C_{\max}\) has makespan \[ C_{\max}^* = \max\!\left(\max_j p_j,\; \frac{\sum_j p_j}{m}\right). \] Such a schedule can be constructed in \(O(n)\) time by the wrap-around algorithm: lay all jobs end-to-end on a single tape and cut the tape at multiples of \(C_{\max}^*\); segment \(i\) becomes the schedule of machine \(i\). A job split by a cut is preempted: it runs at the end of one machine's schedule and at the start of the next machine's, and since \(p_j \leq C_{\max}^*\), its two pieces never overlap in time.
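A sketch of the wrap-around construction (function and variable names are ours):

```python
def mcnaughton(p, m):
    """McNaughton's wrap-around rule for P|pmtn|Cmax.
    Returns (C, schedule), where schedule[i] is a list of
    (job, start, end) pieces assigned to machine i."""
    C = max(max(p), sum(p) / m)        # optimal preemptive makespan
    schedule = [[] for _ in range(m)]
    machine, t = 0, 0.0                # current machine and time on it
    for j, pj in enumerate(p):
        remaining = pj
        while remaining > 1e-12:
            piece = min(remaining, C - t)
            schedule[machine].append((j, t, t + piece))
            t += piece
            remaining -= piece
            if C - t <= 1e-12:         # machine full: wrap to the next one
                machine, t = machine + 1, 0.0
    return C, schedule

C, sched = mcnaughton([4, 3, 2, 1], 2)
print(C)  # 5.0
```

On this instance job 2 (length 3) is split: one unit as the last piece of machine 1 and two units at the start of machine 2, with no overlap because \(3 \leq C^*_{\max} = 5\).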

3.2 Uniform and Unrelated Machines

On uniform machines \(Q \| \| C_{\max}\), machine \(i\) processes job \(j\) in time \(p_j / s_i\). An LP relaxation over job-to-machine assignments can be rounded to yield a \((4/3)\)-approximation.

On unrelated machines \(R \| \| C_{\max}\), the processing time \(p_{ij}\) of job \(j\) on machine \(i\) is arbitrary. An LP relaxation (assigning a fraction of each job \(j\) to each machine \(i\)), rounded deterministically using the sparse structure of extreme-point solutions (Lenstra, Shmoys & Tardos, 1990), gives a 2-approximation. This is close to best possible: no polynomial algorithm achieves ratio \(< 3/2\) unless P = NP.

3.3 Critical Path and Precedence Constraints

For \(P \| \text{prec} \| C_{\max}\) and more generally for project scheduling, the critical path method (CPM/PERT) provides a lower bound:

\[ C_{\max} \geq \max_{\text{path } \pi} \sum_{j \in \pi} p_j, \]

where the maximum is over all chains in the precedence DAG. When \(m = \infty\) (unlimited machines), the optimal makespan equals this longest path, computable in \(O(n + e)\) time by topological sort (where \(e\) is the number of precedence edges).

3.3b Minimising Weighted Completion Time on Parallel Machines

For \(P||\sum w_j C_j\), the problem is NP-hard in general (by reduction from PARTITION). However, important special cases are tractable.

Theorem (Preemptive LRPT for \(P|\text{pmtn}|C_{\max}\)): On identical parallel machines with preemption, the Longest Remaining Processing Time (LRPT) rule is optimal for the makespan \(C_{\max}\): at each moment, process the \(m\) jobs with the longest remaining processing times. For \(\sum C_j\), by contrast, the single-machine intuition survives: non-preemptive SPT list scheduling is already optimal for \(P||\sum C_j\), and preemption does not improve it.

The reason LRPT minimises the makespan: it keeps all machines busy on the longest remaining jobs, continually equalising the remaining work across jobs, so that no single long job is left running alone at the end while other machines sit idle.

Example (LRPT vs. SPT on 2 parallel machines): Two machines, 4 jobs with \(p = (4, 3, 2, 1)\). The total processing time is 10, so \(C_{\max} \geq 5\). LRPT attains this bound, finishing everything by \(t = 5\); with a natural tie-breaking its completion times are \(3, 4, 4, 5\): \[\sum C_j = 3 + 4 + 4 + 5 = 16.\] Non-preemptive SPT (globally shortest job to the first idle machine) gives completion times \(1, 2, 4, 6\): \[\sum C_j = 1 + 2 + 4 + 6 = 13,\] with makespan 6.

The comparison illustrates the trade-off: LRPT is optimal for the makespan (5 versus SPT's 6), while SPT minimises the total completion time (13 versus LRPT's 16). Each rule optimises its own objective at the expense of the other.
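A minimal sketch of the non-preemptive SPT list rule on identical machines (names ours), reproducing \(\sum C_j = 13\) and makespan 6 on this instance:

```python
from heapq import heappush, heappop

def spt_sum_completion(p, m):
    """Non-preemptive SPT list scheduling on m identical machines:
    globally shortest job to the first idle machine (optimal for
    P||sum Cj). Returns (sum of completion times, makespan)."""
    heap = [(0, i) for i in range(m)]   # (current load, machine)
    total, cmax = 0, 0
    for pj in sorted(p):                # shortest processing time first
        t, i = heappop(heap)
        t += pj                         # completion time of this job
        total += t
        cmax = max(cmax, t)
        heappush(heap, (t, i))
    return total, cmax

print(spt_sum_completion([4, 3, 2, 1], 2))  # (13, 6)
```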

3.4 Bin-Packing Connection

Bin packing: given items of sizes \(s_1, \ldots, s_n \in (0,1]\), pack them into the fewest unit-capacity bins. The decision version of \(P_m \| \| C_{\max}\) with target makespan \(T^*\) is exactly a bin-packing question: can \(n\) items of sizes \(p_j / T^*\) be packed into \(m\) bins of capacity 1?

First Fit Decreasing (FFD) uses at most \(11/9 \cdot \mathrm{OPT} + 6/9\) bins (Johnson, 1973, proved the ratio \(11/9\) with a larger additive constant; the tight constant \(6/9\) is due to Dósa, 2007). For the number-of-bins objective, no polynomial algorithm guarantees ratio below \(3/2\) unless P = NP (by reduction from PARTITION), although asymptotic approximation schemes exist (Fernandez de la Vega & Lueker, 1981; Karmarkar & Karp, 1982). For the makespan objective on \(m\) fixed machines, a PTAS exists (see Chapter 8).
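A sketch of FFD (names ours); on the small instance below it meets the optimum of 3 bins:

```python
def first_fit_decreasing(sizes, capacity=1.0):
    """First Fit Decreasing: place each item (largest first) into the
    first bin with enough residual capacity; open a new bin otherwise."""
    residual = []   # remaining capacity of each open bin
    packing = []    # items placed in each bin
    for s in sorted(sizes, reverse=True):
        for i, r in enumerate(residual):
            if s <= r + 1e-12:          # fits in an existing bin
                residual[i] -= s
                packing[i].append(s)
                break
        else:                           # no open bin fits: open a new one
            residual.append(capacity - s)
            packing.append([s])
    return packing

items = [0.6, 0.5, 0.5, 0.4, 0.3, 0.3, 0.2]  # total 2.8, so OPT = 3 bins
print(len(first_fit_decreasing(items)))      # 3
```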


Chapter 4: Computational Complexity in Scheduling

4.1 Complexity Classes

Definition (P, NP, NP-complete, NP-hard):
  • P: decision problems solvable in polynomial time by a deterministic Turing machine.
  • NP: decision problems for which a yes-certificate can be verified in polynomial time.
  • NP-complete: problems in NP to which every NP problem reduces in polynomial time (Cook–Levin theorem: 3-SAT is NP-complete).
  • NP-hard: at least as hard as NP-complete problems; the optimisation versions of NP-complete decision problems are NP-hard.

4.2 Strong vs. Weak NP-Hardness

Definition (Strongly / weakly NP-hard): A problem is strongly NP-hard if it remains NP-hard even when all numerical data are bounded by a polynomial in the number of inputs (i.e., it is NP-hard in the unary encoding). A problem is weakly NP-hard if it is NP-hard but admits a pseudo-polynomial time algorithm (running time polynomial in the numeric value of the input).

Key consequences:

  • A strongly NP-hard problem has no FPTAS (fully polynomial-time approximation scheme) unless P = NP.
  • A weakly NP-hard problem may admit an FPTAS (scale and round the DP).

4.3 Key Reductions for Scheduling

PARTITION and \(P2 \| \| C_{\max}\)

PARTITION: Given positive integers \(a_1, \ldots, a_n\), does there exist \(S \subseteq [n]\) with \(\sum_{i \in S} a_i = \sum_{i \notin S} a_i\)?

Theorem: \(P2 \| \| C_{\max}\) is NP-hard (weakly) by reduction from PARTITION.
Proof. We reduce PARTITION to the decision version of \(P2||C_{\max}\): given a PARTITION instance with positive integers \(a_1, \ldots, a_n\) summing to \(2B\), does there exist a partition \(S \cup \bar{S} = \{1,\ldots,n\}\) with \(\sum_{i \in S} a_i = B\)?

Construct a scheduling instance with \(n\) jobs with processing times \(p_j = a_j\) on \(m = 2\) identical machines. The total processing time is \(\sum p_j = 2B\). We ask: is the optimal makespan \(C_{\max} \leq B\)?

Reduction is polynomial: listing \(n\) processing times takes \(O(n)\) time.

Correctness: If PARTITION has a solution \(S\), assign jobs in \(S\) to machine 1 and jobs in \(\bar{S}\) to machine 2. Machine 1 runs for exactly \(\sum_{i \in S} a_i = B\) and machine 2 for \(2B - B = B\), so \(C_{\max} = B\). Conversely, if \(C_{\max} \leq B\), each machine receives jobs of total processing time at most \(B\); since the two totals sum to \(2B\), both equal exactly \(B\). This gives a partition summing to \(B\) on each side.

Since PARTITION is NP-complete (and is known to be only weakly NP-complete), \(P2||C_{\max}\) is NP-hard. The hardness is weak because PARTITION has a pseudo-polynomial \(O(nB)\) algorithm.

Since the values \(a_i\) may be exponentially large in the bit-length of the input, the reduction says nothing about unary encodings: \(P2 \| \| C_{\max}\) is only weakly NP-hard. In fact, an FPTAS exists, obtained by scaling and rounding the pseudo-polynomial makespan DP.

Remark (Weak vs. strong NP-hardness in scheduling): The PARTITION reduction establishes only weak NP-hardness for \(P2||C_{\max}\), because PARTITION itself is only weakly NP-complete: it has a pseudo-polynomial \(O(nB)\) dynamic programming algorithm (the subset-sum DP). As a consequence, \(P2||C_{\max}\) also admits a pseudo-polynomial algorithm and, after scaling and rounding, an FPTAS. The situation changes dramatically for \(P||C_{\max}\) with variable \(m\): the 3-PARTITION reduction (see below) gives strong NP-hardness, ruling out any FPTAS under standard assumptions. Similarly, \(J3||C_{\max}\) is strongly NP-hard, meaning no pseudo-polynomial algorithm exists for it unless P = NP: numerical scaling does not help when the hardness comes from the combinatorial structure rather than from large numbers. The distinction between weak and strong hardness is practically crucial: weakly hard problems often admit excellent approximation schemes, while strongly hard ones typically do not.

3-PARTITION and \(P \| \| C_{\max}\) (strongly NP-hard)

3-PARTITION: Given \(3q\) positive integers \(a_1, \ldots, a_{3q}\) with \(B/4 < a_i < B/2\) and \(\sum a_i = qB\), can they be partitioned into \(q\) triples each summing to \(B\)?

Theorem (Garey & Johnson, 1975): \(P \| \| C_{\max}\), with the number of machines \(m\) part of the input, is NP-hard in the strong sense, by reduction from 3-PARTITION.
Proof sketch. Create \(m = q\) machines and \(n = 3q\) jobs with \(p_j = a_j\). An optimal makespan of exactly \(B\) exists if and only if the 3-PARTITION instance has a solution. Since 3-PARTITION is strongly NP-complete, no FPTAS exists for \(P \| \| C_{\max}\) with \(m\) variable unless P = NP.

NP-hardness of \(1 \| r_j \| \sum C_j\)

Proof sketch (\(1 \| r_j \| \sum C_j\) is NP-hard by reduction from PARTITION): Given a PARTITION instance with integers \(a_1, \ldots, a_n\) summing to \(2B\), construct a scheduling instance as follows. Create \(n\) jobs with processing times \(p_j = a_j\) and release dates \(r_j = 0\), plus a “marker” job \(n+1\) with \(p_{n+1} = 0\) and \(r_{n+1} = B\). The marker can complete at time exactly \(B\) only if the machine is not in the middle of a job at time \(B\); otherwise either the marker completes late or idle time is inserted before \(B\), and both possibilities increase \(\sum C_j\). A partition of the \(a_j\) into two halves each summing to \(B\) corresponds to a schedule that runs one half in \([0, B]\), the marker at time \(B\), and the other half in \([B, 2B]\) with no idle time, and a suitable threshold \(K\) makes this the only way to achieve \(\sum C_j \leq K\). Hence the decision version of \(1 \| r_j \| \sum C_j\) is NP-complete and the optimisation problem is NP-hard (in fact NP-hard in the strong sense; Lenstra, Rinnooy Kan & Brucker, 1977). This contrasts with \(1 \| \| \sum C_j\) (polynomial, solved by SPT) and shows that even adding release dates to the simplest single-machine objective makes it intractable.

Other NP-hardness results

  • \(1 \| \| \sum w_j T_j\): NP-hard in the strong sense; from 3-PARTITION (Lawler, 1977).
  • \(1 \| r_j \| L_{\max}\): NP-hard; PARTITION gives weak hardness, 3-PARTITION gives hardness in the strong sense (Lenstra et al., 1977).
  • \(F_3 \| \| C_{\max}\): NP-hard in the strong sense; from 3-PARTITION (Garey et al., 1976).
  • \(J_2 \| \| C_{\max}\) (jobs with three or more operations): NP-hard in the strong sense (Garey et al., 1976).
  • \(1 \| \| \sum w_j U_j\): NP-hard (weakly); from PARTITION.
Theorem (\(1|r_j|L_{\max}\) is NP-hard): The single-machine maximum-lateness problem with release dates is NP-hard by reduction from PARTITION (and in fact NP-hard in the strong sense via 3-PARTITION; Lenstra, Rinnooy Kan & Brucker, 1977).
Proof sketch. Note first that \(1||C_{\max}\) is trivially polynomial: the makespan equals \(\sum_j p_j\) regardless of the sequence. Adding release dates alone (\(1|r_j|C_{\max}\)) keeps the problem polynomial: process jobs in non-decreasing order of release dates (ERD). In the ERD order \(\sigma\), \[ C_{\max} = \max_{k} \Bigl( r_{\sigma(k)} + \sum_{i \geq k} p_{\sigma(i)} \Bigr), \] and every schedule must respect each of these bounds (the jobs \(\sigma(k), \ldots, \sigma(n)\) all have release dates at least \(r_{\sigma(k)}\)), so ERD is optimal. Replacing the makespan by maximum lateness, however, creates genuine difficulty.

Given a PARTITION instance \(a_1, \ldots, a_n\) with \(\sum a_i = 2B\), construct a scheduling instance as follows. Create \(n\) “data” jobs with \(p_j = a_j\), \(r_j = 0\) and \(d_j = 2B + 1\). Add a “blocker” job \(n+1\) with \(p_{n+1} = 1\), \(r_{n+1} = B\) and \(d_{n+1} = B + 1\). A schedule with \(L_{\max} \leq 0\) must run the blocker exactly in the interval \([B, B+1]\), and the data jobs must then fill \([0, B]\) and \([B+1, 2B+1]\) without idle time (any idle time pushes some data job past its due date \(2B+1\)). This is possible if and only if a subset of the \(a_j\) sums to exactly \(B\). Thus the decision problem (is \(L_{\max} \leq 0\)?) is equivalent to PARTITION.

Remark (Why \(1|r_j|C_{\max}\) is easy but \(1|r_j|L_{\max}\) is hard): For the makespan, only the last completion time matters, and the ERD order simultaneously minimises the idle time before every suffix of the schedule; the choice among released jobs is essentially free. A tight due date changes this completely: it forces one particular job to occupy one particular time window, and deciding whether the remaining jobs can be packed exactly around that window is a subset-sum question. The same packing phenomenon drives the NP-hardness of \(1|r_j|\sum C_j\) above.

Polynomial-time solvable problems (selected):

  • \(1 \| \| \sum w_j C_j\): WSPT, \(O(n \log n)\).
  • \(1 \| \| L_{\max}\): EDD, \(O(n \log n)\).
  • \(1 \| \| \sum U_j\): Moore's algorithm, \(O(n \log n)\).
  • \(1 \| r_j, \text{pmtn} \| L_{\max}\): preemptive EDD, \(O(n \log n)\).
  • \(P \| \text{pmtn} \| C_{\max}\): McNaughton's wrap-around, \(O(n)\).
  • \(F_2 \| \| C_{\max}\): Johnson's rule, \(O(n \log n)\).

Chapter 5: Integer Programming and Exact Methods

5.1 IP Formulations for Single-Machine Problems

Positional (assignment) formulation

Let \(x_{jk} = 1\) if job \(j\) is assigned to position \(k\) in the sequence, subject to the assignment constraints \(\sum_k x_{jk} = 1\) for all \(j\), \(\sum_j x_{jk} = 1\) for all \(k\), and \(x_{jk} \in \{0,1\}\). For the unweighted objective \(\sum C_j\), the positional costs are linear: the processing time of the job in position \(k\) is counted in the completion times of positions \(k, k+1, \ldots, n\), that is, \(n - k + 1\) times, so the objective becomes

\[ \min \sum_{j=1}^{n} \sum_{k=1}^{n} (n - k + 1)\, p_j \, x_{jk}, \]

a linear assignment problem solvable in polynomial time. For the weighted objective \(\sum w_j C_j\), the cost of placing job \(j\) in position \(k\) depends on which jobs occupy the earlier positions, so the positional formulation is no longer linear in \(x\).
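For \(\sum C_j\) the positional cost of placing job \(j\) in (0-indexed) position \(k\) of \(n\) can be taken as \((n-k)\,p_j\); a brute-force sketch (function names are ours) confirms on a small instance that the assignment optimum coincides with SPT:

```python
from itertools import permutations

def assignment_value(p, order):
    """Positional cost for 1||sum Cj: the job in 0-indexed position k
    has its processing time counted in n - k completion times."""
    n = len(p)
    return sum((n - k) * p[j] for k, j in enumerate(order))

def best_assignment(p):
    """Brute force over all position assignments (permutations)."""
    return min(assignment_value(p, order)
               for order in permutations(range(len(p))))

p = [3, 1, 2]
spt_order = sorted(range(len(p)), key=lambda j: p[j])  # SPT: shortest first
print(assignment_value(p, spt_order), best_assignment(p))  # 10 10
```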

Disjunctive (precedence) formulation

Let \(y_{jk} = 1\) if job \(j\) precedes job \(k\). Let \(C_j\) be a continuous completion-time variable. Then \(1 \| \| \sum w_j C_j\) (with sequence-dependent setups \(s_{jk}\)) can be written:

\[ C_k \geq C_j + s_{jk} + p_k - M(1 - y_{jk}), \quad \forall j \neq k, \]\[ C_j \geq p_j, \quad \forall j, \]\[ y_{jk} + y_{kj} = 1, \quad \forall j < k, \quad y_{jk} \in \{0,1\}. \]

With \(y_{jk} = 1\) the first constraint forces job \(k\) to start only after job \(j\) completes and the setup \(s_{jk}\) is performed; with \(y_{jk} = 0\) it is vacuous, and the constraint for the reversed pair \((k,j)\) applies instead.

This “big-M” formulation has \(O(n^2)\) binary variables and is the basis for branch-and-bound.

5.1b LP Relaxations and Duality in Scheduling

Linear programming relaxations play a central role in deriving lower bounds and approximation algorithms for scheduling problems. The key insight is that many scheduling problems can be formulated as integer programs whose LP relaxations are both tractable and informative.

Theorem (LP bound for \(1||\sum w_j C_j\)): The following LP relaxation provides a lower bound on \(\min \sum w_j C_j\): \[ \min \sum_{j} w_j C_j \] subject to Queyranne's "parallel" inequalities \[ \sum_{j \in S} p_j C_j \;\geq\; \frac{1}{2}\Bigl( p(S)^2 + \sum_{j \in S} p_j^2 \Bigr) \quad \forall S \subseteq \{1,\ldots,n\}, \]

where \(p(S) = \sum_{j \in S} p_j\) (the singleton case \(S = \{j\}\) gives \(C_j \geq p_j\)). These inequalities hold with equality whenever the jobs of \(S\) are processed consecutively from time 0, and Queyranne (1993) showed that they describe exactly the set of feasible completion-time vectors, so the LP optimum equals the scheduling optimum (no integrality gap): this is a tight formulation.

The exponentially many constraints (one per subset \(S\)) are handled in practice by a separation oracle: given a candidate solution \((C_j)\), find a violated constraint or certify that none exists. For this particular LP, a most violated set can always be taken to be a prefix of the jobs sorted by non-decreasing \(C_j\), so separation takes \(O(n \log n)\) time and the LP is solvable in polynomial time via the ellipsoid method.
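A sketch of the separation step (names ours), assuming Queyranne's inequalities \(\sum_{j \in S} p_j C_j \geq \tfrac12 (p(S)^2 + \sum_{j \in S} p_j^2)\) and the prefix property just described:

```python
def most_violated(p, C):
    """Separation oracle for Queyranne's inequalities. A most violated
    set is a prefix of the jobs sorted by C_j, so sorting plus one
    scan suffices. Returns (slack, S); slack < 0 iff violated."""
    order = sorted(range(len(p)), key=lambda j: C[j])
    best, best_S = float("inf"), []
    lhs = pS = sq = 0.0
    S = []
    for j in order:                     # grow the prefix one job at a time
        S.append(j)
        lhs += p[j] * C[j]
        pS += p[j]
        sq += p[j] ** 2
        slack = lhs - (pS * pS + sq) / 2
        if slack < best:
            best, best_S = slack, list(S)
    return best, best_S

p = [1, 2]
print(most_violated(p, [3, 2]))  # (0.0, [1]): feasible point, tight constraint
print(most_violated(p, [1, 1]))  # (-4.0, [0, 1]): violated constraint found
```

The point \(C = (3, 2)\) is the completion-time vector of the schedule that runs job 2 first, so every inequality holds; \(C = (1, 1)\) is infeasible and the oracle returns the violated set.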

Remark (LP duality and scheduling lower bounds): LP duality provides a systematic way to extract lower bounds for scheduling problems. For the assignment formulation of \(P||C_{\max}\), the LP dual variables can be interpreted as "machine prices" — the cost of using a machine for one unit of time. An optimal dual solution assigns prices to machine-time slots such that no job wants to use a different slot, analogous to an economic equilibrium. This duality connection is exploited in the Lenstra–Shmoys–Tardos 2-approximation for \(R||C_{\max}\): the LP dual provides a fractional assignment, and the primal-dual relationship bounds the rounding error. More broadly, LP duality underlies the analysis of many scheduling approximation algorithms — whenever an algorithm is shown to achieve ratio \(\rho\), the proof typically exhibits a dual solution of value \(\geq \text{ALG}/\rho\), thereby certifying the bound.

5.2 Connection to the Travelling Salesman Problem

When sequence-dependent setup times \(s_{jk}\) are present, minimising \(C_{\max}\) on one machine is equivalent to finding a minimum-weight Hamiltonian path in the complete directed graph on \(n+1\) nodes (jobs plus a dummy start node) with edge weights \(s_{jk}\): the processing times contribute the constant \(\sum_j p_j\), so only the total setup time varies with the sequence. This is precisely the asymmetric TSP (path version). Conversely, instances of TSP can be encoded as single-machine scheduling problems with sequence-dependent setups, establishing that \(1 \| s_{jk} \| C_{\max}\) is NP-hard.

5.3 Dynamic Programming

Branch-and-bound finds exact solutions by pruning the search tree, but it offers no polynomial-time guarantee — its worst-case complexity is still exponential. Dynamic programming is a different exact method that exploits optimal substructure: if the jobs \(S \setminus \{j\}\) are scheduled optimally and job \(j\) is appended last, the result is an optimal schedule for \(S\) (given that \(j\) is scheduled last). This substructure holds for a wide range of single-machine objectives — including \(\sum w_j T_j\) — because total weighted tardiness decomposes additively over jobs and the contribution of job \(j\) depends only on when it completes.

Two flavors of DP are relevant in scheduling. The exponential-state DP stores one value per subset \(S \subseteq \{1,\ldots,n\}\), giving \(O(2^n \cdot n)\) time — exponential but far better than \(n!\). This is valuable when \(n \leq 20\) or so. The pseudo-polynomial DP applies when the objective has integer data and the problem is only weakly NP-hard: the state space is polynomial in the numeric values of the input (e.g., total processing time), making it practical when those values are small. The FPTAS for \(P2 \| \| C_{\max}\) below is a rounding technique that turns a pseudo-polynomial DP into a fully polynomial approximation scheme.

Example (Bitmask DP for \(1|\text{prec}|\sum C_j\) with 4 jobs): Consider four jobs with processing times \(p = (2, 3, 1, 2)\) and precedence constraints \(1 \to 3\), \(2 \to 3\), \(3 \to 4\) (jobs 1 and 2 must both complete before job 3 starts; job 3 must complete before job 4 starts).

Feasible orderings: any permutation \(\sigma\) of \(\{1,2,3,4\}\) must have \(\sigma^{-1}(1) < \sigma^{-1}(3)\), \(\sigma^{-1}(2) < \sigma^{-1}(3)\), and \(\sigma^{-1}(3) < \sigma^{-1}(4)\). The only feasible permutations are \((1,2,3,4)\) and \((2,1,3,4)\).

Evaluate \(\sum C_j\) for each feasible permutation:

Order \((1,2,3,4)\):

\[ C_1 = 2,\quad C_2 = 2+3 = 5,\quad C_3 = 5+1 = 6,\quad C_4 = 6+2 = 8. \]\[ \sum C_j = 2 + 5 + 6 + 8 = 21. \]

Order \((2,1,3,4)\):

\[ C_2 = 3,\quad C_1 = 3+2 = 5,\quad C_3 = 5+1 = 6,\quad C_4 = 6+2 = 8. \]\[ \sum C_j = 3 + 5 + 6 + 8 = 22. \]

Optimal: order \((1,2,3,4)\) with \(\sum C_j = 21\). Intuitively, of the two initially unblocked jobs, the shorter job 1 (\(p_1 = 2\)) should precede the longer job 2 (\(p_2 = 3\)): with equal weights, SPT among the currently available jobs is the right tie-breaker, and here it decides the only free choice in the sequence.

The bitmask DP generalises this enumeration without listing permutations explicitly. Let \(f(S)\) be the minimum total completion time over schedules whose first \(|S|\) jobs are exactly the set \(S\), with \(f(\emptyset) = 0\) and \(p(S) = \sum_{j \in S} p_j\):

\[ f(S) = \min_{j \in S,\, \text{no successor of } j \text{ in } S} \bigl\{ f(S \setminus \{j\}) + p(S) \bigr\}. \]

This runs in \(O(2^n \cdot n)\) time and \(O(2^n)\) space. For \(n=20\), approximately \(20 \times 10^6\) operations — tractable. For \(n=30\), approximately \(30 \times 10^9\) — borderline. For \(n \geq 40\), exponential DP is typically infeasible.
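The recurrence and the 4-job example can be checked mechanically; a sketch (function names are ours):

```python
def min_total_completion(p, prec):
    """Bitmask DP for 1|prec|sum Cj. prec is a list of (a, b) pairs
    meaning job a must complete before job b starts. f[S] is the
    minimum total completion time when the jobs of S form the prefix
    of the schedule."""
    n = len(p)
    succ = [0] * n                       # bitmask of successors of each job
    for a, b in prec:
        succ[a] |= 1 << b
    INF = float("inf")
    f = [INF] * (1 << n)
    f[0] = 0
    for S in range(1, 1 << n):
        pS = sum(p[j] for j in range(n) if S >> j & 1)
        for j in range(n):
            # j may be last within the prefix S only if it is in S and
            # none of its successors lies in S
            if S >> j & 1 and not (succ[j] & S) and f[S ^ (1 << j)] < INF:
                f[S] = min(f[S], f[S ^ (1 << j)] + pS)
    return f[(1 << n) - 1]

# The 4-job example (0-indexed): p = (2,3,1,2), precedences 1->3, 2->3, 3->4
print(min_total_completion([2, 3, 1, 2], [(0, 2), (1, 2), (2, 3)]))  # 21
```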

Theorem (Lawler's algorithm for \(1|\text{prec}|L_{\max}\)): The optimal schedule for minimising maximum lateness with arbitrary precedence constraints on a single machine can be computed in \(O(n^2)\) time by the following backward greedy algorithm:

Lawler’s algorithm: Repeat until all jobs are scheduled. Among all unscheduled jobs whose successors are all already scheduled (and hence occupy later positions), select the job \(j^*\) with the largest due date \(d_{j^*}\). Assign \(j^*\) to the last remaining position and remove it from consideration.

Proof sketch (exchange argument). Consider the first step; whichever job is placed last completes at \(P = \sum_j p_j\). Lawler places there a job \(j\) with no successors and maximum due date among such jobs. Let \(\sigma^*\) be an optimal schedule ending with some job \(k' \neq j\). Modify \(\sigma^*\) by moving \(j\) to the last position and shifting the jobs that followed \(j\) one slot earlier. Feasibility is preserved (\(j\) has no successors), every shifted job completes earlier, and job \(j\)'s new lateness is \(P - d_j \leq P - d_{k'} \leq L_{\max}(\sigma^*)\). So the modified schedule is still optimal and agrees with Lawler's choice in the last position. Iterating the argument on the remaining \(n-1\) jobs shows Lawler's schedule is optimal.
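A sketch of the backward greedy (names ours), on a small hypothetical 3-job instance:

```python
def lawler_lmax(p, d, prec):
    """Lawler's backward greedy for 1|prec|Lmax. prec is a list of
    (a, b) pairs: a must precede b. Fills positions from the back:
    among jobs whose successors are all scheduled, the job with the
    LARGEST due date goes last. Returns (sequence, Lmax)."""
    n = len(p)
    succ = [set() for _ in range(n)]
    for a, b in prec:
        succ[a].add(b)
    unscheduled = set(range(n))
    t = sum(p)                     # completion time of the next-placed job
    seq = []
    lmax = -float("inf")
    while unscheduled:
        eligible = [j for j in unscheduled if not (succ[j] & unscheduled)]
        j = max(eligible, key=lambda j: d[j])   # latest due date goes last
        lmax = max(lmax, t - d[j])
        seq.append(j)
        unscheduled.remove(j)
        t -= p[j]
    seq.reverse()
    return seq, lmax

# Jobs with p = (2, 3, 2), due dates d = (3, 6, 9), precedence 0 -> 1
print(lawler_lmax([2, 3, 2], [3, 6, 9], [(0, 1)]))  # ([0, 1, 2], -1)
```

Each round scans all remaining jobs, giving the \(O(n^2)\) total stated in the theorem.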
Remark (Exponential DP in practice): The bitmask DP for \(1|\text{prec}|\sum C_j\) runs in \(O(2^n \cdot n)\) time. For \(n = 20\) this is about \(20 \times 10^6\) operations — feasible on modern hardware in under a second. For \(n = 25\) it is about \(800 \times 10^6\) — still feasible in minutes. For \(n = 30\) it requires about \(30 \times 10^9\) operations and tens of gigabytes of memory for the DP table — typically infeasible. In practice, branch-and-bound with lower bounds from LP relaxations handles instances of \(n\) up to 50–100 for well-structured precedence graphs (chains, trees). For harder shop scheduling problems such as \(J||C_{\max}\), exact algorithms rarely scale beyond \(n = 20\) jobs without heavy pruning and problem-specific lower bounds. For operational planning in factories, this means exact methods apply only to short planning horizons or small job sets; heuristics and metaheuristics (simulated annealing, tabu search) are necessary for realistic instances.

DP for \(1 \| \| \sum w_j T_j\)

Let \(f(S, t)\) = minimum weighted tardiness of jobs in \(S\) when they are scheduled starting at time \(t\). Then:

\[ f(\emptyset, t) = 0, \]\[ f(S, t) = \min_{j \in S} \left\{ w_j \max(t + p_j - d_j, 0) + f(S \setminus \{j\}, t + p_j) \right\}. \]

Since \(t\) is determined by \(S\) (it equals the total processing time of jobs not in \(S\)), the state space has \(2^n\) entries. Running time: \(O(2^n \cdot n)\).
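A sketch over subsets (names ours), using the equivalent formulation in which the set \(S\) is the prefix of the schedule and therefore ends at time \(p(S)\); the 3-job instance is hypothetical:

```python
def min_weighted_tardiness(p, w, d):
    """Subset DP for 1||sum wj Tj. g[S] is the minimum weighted
    tardiness of a schedule whose prefix is exactly the job set S;
    the prefix ends at time p(S), so time is implicit in the state."""
    n = len(p)
    INF = float("inf")
    g = [INF] * (1 << n)
    g[0] = 0
    for S in range(1, 1 << n):
        t = sum(p[j] for j in range(n) if S >> j & 1)   # end of prefix S
        g[S] = min(
            g[S ^ (1 << j)] + w[j] * max(t - d[j], 0)   # j last in S
            for j in range(n) if S >> j & 1
        )
    return g[(1 << n) - 1]

print(min_weighted_tardiness([3, 2, 1], [1, 1, 1], [2, 4, 6]))  # 2
```

On this instance the optimum is the order \((1, 2, 3)\), not SPT: meeting the tight due date of the long job outweighs finishing the short jobs early.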

Pseudo-polynomial DP for weakly NP-hard problems

For \(P2 \| \| C_{\max}\) with integer data: let \(W = \sum_j p_j\). Define \(V[i]\) = 1 if a subset of jobs with total processing time exactly \(i\) exists. Recurrence: \(V[i] = 1\) if \(V[i - p_j] = 1\) for some \(j\). Running time: \(O(nW)\). Optimal makespan: \(\min_{i \leq \lfloor W/2 \rfloor} \{W - i : V[i] = 1\}\). This DP can be scaled to yield an FPTAS for \(P2 \| \| C_{\max}\).
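A sketch of this DP (names ours):

```python
def p2_cmax(p):
    """Pseudo-polynomial DP for P2||Cmax with integer data. V[i] is True
    iff some subset of jobs has total processing time exactly i; the
    optimal makespan is W - i for the largest achievable i <= W // 2."""
    W = sum(p)
    V = [False] * (W + 1)
    V[0] = True
    for pj in p:
        for i in range(W, pj - 1, -1):   # backward scan: each job used once
            V[i] = V[i] or V[i - pj]
    best = max(i for i in range(W // 2 + 1) if V[i])
    return W - best

print(p2_cmax([4, 5, 6, 7]))  # 11: total 22 splits as 11 + 11
```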

Remark (Sequencing games and mechanism design): When jobs are submitted by strategic agents — firms, users, or processors that can lie about their processing times or due dates to gain priority — the scheduling problem becomes a mechanism design problem. A naive scheduler that simply applies SPT can be manipulated: a job owner with processing time \(p_j = 10\) might report \(p_j' = 1\) to gain earlier placement, harming all other agents. The Vickrey-Clarke-Groves (VCG) mechanism applied to scheduling yields a payment rule that makes truthful reporting a dominant strategy: each agent pays the externality they impose on others. For \(1\|\|\sum C_j\) with unit weights, the VCG mechanism combined with SPT is both optimal (minimises total completion time) and incentive-compatible (no agent benefits by misreporting processing time). However, the VCG mechanism generally requires monetary transfers and is not applicable to all objective functions — for weighted settings or with due dates, the combination of optimality and incentive compatibility may not simultaneously be achievable.
Example (LP relaxation lower bound for a single-machine scheduling instance): Consider \(1||\sum w_j C_j\) with 4 jobs: \((p_1, w_1) = (3, 4)\), \((p_2, w_2) = (2, 3)\), \((p_3, w_3) = (4, 6)\), \((p_4, w_4) = (1, 2)\). \[ \frac{w_1}{p_1} = \frac{4}{3} \approx 1.33,\quad \frac{w_2}{p_2} = \frac{3}{2} = 1.5,\quad \frac{w_3}{p_3} = \frac{6}{4} = 1.5,\quad \frac{w_4}{p_4} = \frac{2}{1} = 2. \]

WSPT order: job 4 (ratio 2), jobs 2 and 3 (ratio 1.5, tie — try both), job 1 (ratio 1.33).

Order \((4, 2, 3, 1)\): \[ C_4 = 1,\quad C_2 = 3,\quad C_3 = 7,\quad C_1 = 10, \qquad \sum w_j C_j = 2(1) + 3(3) + 6(7) + 4(10) = 2 + 9 + 42 + 40 = 93. \] Order \((4, 3, 2, 1)\): \[ C_4 = 1,\quad C_3 = 5,\quad C_2 = 7,\quad C_1 = 10, \qquad \sum w_j C_j = 2(1) + 6(5) + 3(7) + 4(10) = 2 + 30 + 21 + 40 = 93. \]

Both orderings give 93 — the WSPT tie-breaking does not affect the objective. Optimal \(\sum w_j C_j = 93\).

Lower bound from LP: the LP relaxation of the positional formulation (allowing fractional assignment of jobs to positions) gives a lower bound of at most 93. For this instance the LP bound equals the IP optimum: when WSPT is optimal, the relaxation exhibits no integrality gap.
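The arithmetic above can be checked mechanically; a brute-force comparison (names ours) confirms that 93 is also the optimum over all 24 orderings:

```python
from itertools import permutations

def total_weighted_completion(order):
    """Sum of w_j * C_j for a sequence of (p, w) pairs."""
    t = total = 0
    for p, w in order:
        t += p                  # completion time of this job
        total += w * t
    return total

jobs = [(3, 4), (2, 3), (4, 6), (1, 2)]           # (p_j, w_j)
wspt = sorted(jobs, key=lambda jw: jw[1] / jw[0], reverse=True)
print(total_weighted_completion(wspt))                                # 93
print(min(total_weighted_completion(o) for o in permutations(jobs)))  # 93
```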

5.4 Branch-and-Bound

Branch-and-bound (B&B) explores the implicit enumeration tree of all schedules, pruning branches whose lower bound exceeds the current best-known upper bound.

Branching strategies:

  1. Fix the next job: at each node, branch on which job is assigned to the next open position. Produces a tree of depth \(n\) with \(n!\) leaves.
  2. Assign jobs to time slots: branch on which job is processed at a given time on a given machine.

Lower bounds:

  • Preemptive relaxation: solve the preemptive version (polynomial) to get a lower bound on the non-preemptive optimum.
  • Machine relaxation: allow a job to be processed simultaneously on all machines; the resulting single-machine problem is easier.
  • LP relaxation: solve the LP relaxation of the IP formulation; the optimal LP value is a valid lower bound.

Carlier–Pinson algorithm (1989): For \(J \| \| C_{\max}\) (job shop), a sub-problem on the bottleneck machine yields a 1-machine problem; solving this by Jackson’s preemptive schedule and computing its lower bound enables efficient pruning.


Chapter 6: Flow Shops

6.1 Johnson’s Algorithm for \(F_2 \| \| C_{\max}\)

In a 2-machine flow shop, every job \(j\) has a processing time \(p_{j1}\) on machine 1 and \(p_{j2}\) on machine 2, performed in that order.

Theorem (Johnson, 1954): An optimal schedule for \(F_2 \| \| C_{\max}\) is produced by the following algorithm:
  1. Partition jobs into \(S_1 = \{j : p_{j1} \leq p_{j2}\}\) and \(S_2 = \{j : p_{j1} > p_{j2}\}\).
  2. Order jobs in \(S_1\) by non-decreasing \(p_{j1}\).
  3. Order jobs in \(S_2\) by non-increasing \(p_{j2}\).
  4. The optimal sequence is: \(S_1\) (in order) followed by \(S_2\) (in order).
Running time: \(O(n \log n)\).
Proof sketch (exchange argument). Consider two adjacent jobs, \(j\) immediately before \(k\), and suppose for simplicity that both machines are free at time \(T\) when \(j\) starts (the general case only adds the same machine-2 waiting term to both orders). With order \((j,k)\), machine 2 finishes job \(k\) at time: \[ F_{jk} = \max(T + p_{j1} + p_{j2},\; T + p_{j1} + p_{k1}) + p_{k2}. \] With order \((k,j)\): \[ F_{kj} = \max(T + p_{k1} + p_{k2},\; T + p_{k1} + p_{j1}) + p_{j2}. \]

Order \((j,k)\) is at least as good as \((k,j)\) if and only if \(F_{jk} \leq F_{kj}\), which simplifies (after cancelling common terms) to:

\[ \min(p_{j1}, p_{k2}) \leq \min(p_{k1}, p_{j2}). \]

Johnson’s algorithm constructs a sequence satisfying this pairwise condition for every adjacent pair. If \(j \in S_1\) and \(k \in S_2\), then \(p_{j1} \leq p_{j2}\) and \(p_{k2} < p_{k1}\), so \(\min(p_{j1}, p_{k2}) \leq \min(p_{k1}, p_{j2})\) automatically. Within \(S_1\), the sort gives \(p_{j1} \leq p_{k1}\), so the left-hand side is at most \(p_{j1} \leq \min(p_{k1}, p_{j2})\). Within \(S_2\), the sort gives \(p_{j2} \geq p_{k2}\), so the left-hand side is at most \(p_{k2} \leq \min(p_{k1}, p_{j2})\).

Example (Johnson's algorithm — 5-job 2-machine flow shop): Five jobs with \((p_{j1}, p_{j2})\): job A \((3,5)\), job B \((1,2)\), job C \((4,1)\), job D \((2,4)\), job E \((5,3)\).

Step 1: Partition.

  • \(S_1 = \{j : p_{j1} \leq p_{j2}\}\): job A (\(3 \leq 5\)), job B (\(1 \leq 2\)), job D (\(2 \leq 4\)). Sort \(S_1\) by increasing \(p_{j1}\): B (1), D (2), A (3).
  • \(S_2 = \{j : p_{j1} > p_{j2}\}\): job C (\(4 > 1\)), job E (\(5 > 3\)). Sort \(S_2\) by decreasing \(p_{j2}\): E (3), C (1).

Step 2: Johnson sequence = B, D, A, E, C.

Step 3: Compute makespan. Track both machines:

| Job | \(p_{j1}\) | Start M1 | End M1 | \(p_{j2}\) | Start M2 | End M2 |
|-----|-----|-----|-----|-----|-----|-----|
| B | 1 | 0 | 1 | 2 | 1 | 3 |
| D | 2 | 1 | 3 | 4 | 3 | 7 |
| A | 3 | 3 | 6 | 5 | 7 | 12 |
| E | 5 | 6 | 11 | 3 | 12 | 15 |
| C | 4 | 11 | 15 | 1 | 15 | 16 |

Makespan \(C_{\max} = 16\). This is optimal for this instance: no ordering achieves a smaller makespan.

Why Johnson’s rule works: The exchange argument shows that for any adjacent jobs \(j\) before \(k\), it is never better to swap if \(\min(p_{j1}, p_{k2}) \leq \min(p_{k1}, p_{j2})\). The algorithm constructs a sequence satisfying this pairwise condition for all adjacent pairs.
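Johnson's rule and the makespan computation can be sketched as follows (function names are ours), reproducing the sequence and makespan above:

```python
def johnson_sequence(jobs):
    """Johnson's rule for F2||Cmax. jobs maps name -> (p1, p2):
    S1 (p1 <= p2) sorted by increasing p1, then S2 by decreasing p2."""
    s1 = sorted((j for j in jobs if jobs[j][0] <= jobs[j][1]),
                key=lambda j: jobs[j][0])
    s2 = sorted((j for j in jobs if jobs[j][0] > jobs[j][1]),
                key=lambda j: jobs[j][1], reverse=True)
    return s1 + s2

def f2_makespan(jobs, seq):
    """Simulate the 2-machine flow shop for a given sequence."""
    m1 = m2 = 0
    for j in seq:
        p1, p2 = jobs[j]
        m1 += p1                # machine 1 is never idle
        m2 = max(m2, m1) + p2   # machine 2 waits for the part from machine 1
    return m2

jobs = {"A": (3, 5), "B": (1, 2), "C": (4, 1), "D": (2, 4), "E": (5, 3)}
seq = johnson_sequence(jobs)
print(seq, f2_makespan(jobs, seq))  # ['B', 'D', 'A', 'E', 'C'] 16
```

For 5 jobs a brute-force check over all 120 orders confirms that 16 is the optimum.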

Remark (Flow shop vs. job shop complexity): The complexity of shop scheduling depends sharply on the number of machines and whether jobs follow a fixed or arbitrary routing. The key boundary results are:

\(F2||C_{\max}\): polynomial, solved by Johnson’s algorithm in \(O(n \log n)\).

\(F3||C_{\max}\): NP-hard in the strong sense (Garey, Johnson and Sethi, 1976). Even restricting to 3 machines in a fixed order makes the problem hard.

\(J2||C_{\max}\): polynomial when every job has at most two operations, one per machine (Jackson, 1956): the problem decomposes into flow-shop subgroups handled by Johnson's rule. When jobs may have three or more operations on the two machines, the 2-machine job shop becomes NP-hard in the strong sense.

\(J3||C_{\max}\): NP-hard in the strong sense — one of the first NP-hardness results established in the scheduling literature (Gonzalez and Sahni, 1976). No pseudo-polynomial algorithm exists unless P = NP.

\(O2||C_{\max}\): polynomial (Gonzalez and Sahni, 1976). The 2-machine open shop (jobs visit both machines, in any order) is solvable because the machine order can be optimised as part of the schedule.

The pattern: fixing the machine order (flow shop) is easier than allowing arbitrary routings (job shop); open shop is intermediate since the order is free but must be set consistently. Adding one machine to the polynomial flow shop or job shop problem typically induces NP-hardness.

Theorem (Jackson's algorithm for \(J2||C_{\max}\)): The 2-machine job shop, where each job has at most two operations and may visit either machine first, can be solved optimally by the following algorithm.

Partition jobs into three groups:

  • Group \(A\): jobs processed first on machine 1, then machine 2 (routing \(1 \to 2\)).
  • Group \(B\): jobs processed first on machine 2, then machine 1 (routing \(2 \to 1\)).
  • Group \(C\): jobs processed on only one machine.

Apply Johnson’s algorithm to group \(A\) (treating machine-1 time as \(p_{j1}\) and machine-2 time as \(p_{j2}\)) to obtain sequence \(\sigma_A\); apply Johnson’s algorithm to group \(B\) (with its machine-2 time as first-stage time) to obtain \(\sigma_B\). The optimal schedule is: machine 1 processes the \(A\)-jobs in order \(\sigma_A\), then the group-\(C\) jobs that require only machine 1 (in any order), then the \(B\)-jobs’ machine-1 operations in order \(\sigma_B\), each starting once its machine-2 operation has finished. Machine 2 processes the \(B\)-jobs in order \(\sigma_B\), then the group-\(C\) jobs that require only machine 2, then the \(A\)-jobs’ machine-2 operations in order \(\sigma_A\).

The resulting schedule is optimal for \(J2||C_{\max}\).
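Under the assumptions above, the decomposition can be sketched as follows (group \(C\) is omitted for brevity, and the instance is hypothetical; its machine-1 load of \(3+1+4=8\) certifies that the computed makespan is optimal):

```python
def johnson_order(jobs):
    """Johnson's rule on jobs[j] = (first-stage time, second-stage time)."""
    s1 = sorted((j for j in jobs if jobs[j][0] <= jobs[j][1]),
                key=lambda j: jobs[j][0])
    s2 = sorted((j for j in jobs if jobs[j][0] > jobs[j][1]),
                key=lambda j: jobs[j][1], reverse=True)
    return s1 + s2

def jackson_makespan(group_a, group_b):
    """Jackson's rule for J2||Cmax with at most two operations per job.

    group_a[j] = (time on M1, time on M2), routed M1 -> M2.
    group_b[j] = (time on M2, time on M1), routed M2 -> M1.
    """
    sa, sb = johnson_order(group_a), johnson_order(group_b)
    done = {}                      # completion time of each job's first operation
    t1 = t2 = 0
    for j in sa:                   # M1 runs A-jobs first, in Johnson order
        t1 += group_a[j][0]
        done[('A', j)] = t1
    for j in sb:                   # M2 runs B-jobs first, in Johnson order
        t2 += group_b[j][0]
        done[('B', j)] = t2
    for j in sb:                   # M1 then runs the B-jobs' second operations
        t1 = max(t1, done[('B', j)]) + group_b[j][1]
    for j in sa:                   # M2 then runs the A-jobs' second operations
        t2 = max(t2, done[('A', j)]) + group_a[j][1]
    return max(t1, t2)

A = {1: (3, 2), 3: (1, 3)}         # routed M1 -> M2
B = {2: (2, 4)}                    # routed M2 -> M1
print(jackson_makespan(A, B))      # 8
```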

6.2 The Flow Shop with More Machines: \(F_m \| \| C_{\max}\), \(m \geq 3\)

Theorem (Garey, Johnson & Sethi, 1976): \(F_3 \| \| C_{\max}\) is NP-hard in the strong sense (reduction from 3-PARTITION).

For \(F_3 \| \| C_{\max}\) in which machine 2 is dominated (e.g., \(\max_j p_{j2} \leq \min_j p_{j1}\) or \(\max_j p_{j2} \leq \min_j p_{j3}\)), Johnson’s algorithm generalises: define \(a_j = p_{j1} + p_{j2}\), \(b_j = p_{j2} + p_{j3}\) and apply 2-machine Johnson on these combined times.

6.3 Heuristics for Large Flow Shops

NEH heuristic (Nawaz, Enscore & Ham, 1983): one of the best constructive heuristics for \(F_m \| \| C_{\max}\).

Definition (NEH heuristic):
  1. Sort jobs in non-increasing order of total processing time \(\sum_k p_{jk}\); let this order be \(\pi_1, \pi_2, \ldots, \pi_n\).
  2. Start with a partial schedule containing \(\pi_1\).
  3. For \(i = 2, \ldots, n\): insert job \(\pi_i\) into the best of the \(i\) possible positions in the current partial schedule (the position that minimises the current makespan). Ties broken arbitrarily.
Running time: \(O(n^3 m)\) if each candidate makespan is computed from scratch; Taillard’s acceleration reduces this to \(O(n^2 m)\). NEH consistently outperforms random or greedy dispatching rules in computational studies.
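A direct NEH implementation that recomputes each candidate makespan from scratch (no acceleration; instance data hypothetical):

```python
def makespan(perm, p):
    """Permutation flow shop makespan; p[j] = tuple of times on machines 1..m."""
    finish = [0] * len(p[perm[0]])        # completion times of the previous job
    for j in perm:
        prev = 0
        for k in range(len(finish)):
            finish[k] = max(finish[k], prev) + p[j][k]
            prev = finish[k]
    return finish[-1]

def neh(p):
    """NEH: insert jobs, longest total work first, at the best position."""
    order = sorted(p, key=lambda j: sum(p[j]), reverse=True)
    seq = [order[0]]
    for j in order[1:]:
        seq = min((seq[:i] + [j] + seq[i:] for i in range(len(seq) + 1)),
                  key=lambda s: makespan(s, p))
    return seq

p = {1: (5, 2), 2: (3, 4), 3: (2, 3)}     # hypothetical 3-job, 2-machine instance
seq = neh(p)
print(seq, makespan(seq, p))
```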

6.4 Flexible Flow Shops

A flexible flow shop (hybrid flow shop) has \(s\) stages; at stage \(k\), there are \(m_k \geq 1\) parallel identical machines. Every job must be processed at every stage. This generalises both the flow shop (\(m_k = 1\)) and the parallel machine problem (\(s = 1\)). Minimising \(C_{\max}\) is NP-hard even with \(s = 2\) and \(m_1 = 1\), \(m_2 = 2\).


Chapter 7: Job Shops and Open Shops

7.1 The Job Shop: \(J_m \| \| C_{\max}\)

In a job shop, each job \(j\) has a specific routing: a sequence of machines \(\mu_{j1}, \mu_{j2}, \ldots, \mu_{j n_j}\) (not necessarily all machines) with corresponding processing times. Jobs cannot be preempted.

Theorem (Garey, Johnson & Sethi, 1976): \(J_2 \| \| C_{\max}\) (the 2-machine job shop) is NP-hard in the strong sense once jobs may have three or more operations, i.e., may revisit a machine; recall that with at most two operations per job the problem is polynomial by Jackson’s rule. The reduction is from 3-PARTITION.

Two trivial lower bounds on the makespan: \(C_{\max} \geq \max_j \sum_k p_{jk}\) (no job can finish before its own total processing time) and \(C_{\max} \geq \max_i \sum_j p_{ji}\) (no machine can finish before its total load is processed).

7.2 The Disjunctive Graph Model

Definition (Disjunctive graph): For a job shop instance, the disjunctive graph \(G = (V, C \cup D)\) is constructed as follows:
  • Nodes: one node per operation, plus a source \(s\) and sink \(t\).
  • Conjunctive arcs \(C\): for each job, arcs from each operation to the next in the routing, with weight equal to the processing time of the preceding operation. Arcs from \(s\) to the first operation of each job, and from the last operation to \(t\).
  • Disjunctive arcs \(D\): for each machine \(i\) and each pair of operations \(u, v\) assigned to machine \(i\), an undirected edge \(\{u, v\}\) (equivalently, a pair of oppositely directed arcs) — this represents the machine conflict: either \(u\) precedes \(v\) or vice versa.

A complete acyclic orientation of the disjunctive arcs assigns a direction to every disjunctive arc such that the resulting directed graph is acyclic. Each such orientation corresponds to a feasible (non-preemptive) job shop schedule. The makespan equals the length of the longest path from \(s\) to \(t\) in the oriented graph.

Theorem: The optimal makespan for \(J_m \| \| C_{\max}\) equals the minimum, over all complete acyclic orientations of \(D\), of the longest path from \(s\) to \(t\) in \(G\).
Example (Disjunctive graph for a small job shop): Consider a 2-machine job shop with 3 jobs. Job 1 has routing \(M1 \to M2\) with times \((3, 2)\); job 2 has routing \(M2 \to M1\) with times \((2, 4)\); job 3 has routing \(M1 \to M2\) with times \((1, 3)\).

Nodes: source \(s\), sink \(t\), operations \(o_{11}\) (job 1 on M1), \(o_{12}\) (job 1 on M2), \(o_{21}\) (job 2 on M2), \(o_{22}\) (job 2 on M1), \(o_{31}\) (job 3 on M1), \(o_{32}\) (job 3 on M2).

Conjunctive arcs (job routing order): \(s \to o_{11} \to o_{12} \to t\), \(s \to o_{21} \to o_{22} \to t\), \(s \to o_{31} \to o_{32} \to t\). Arc weights equal the processing time of the source node.

Disjunctive arcs (machine conflicts): On M1: conflict between \(o_{11}\), \(o_{22}\), \(o_{31}\) (all use M1). On M2: conflict between \(o_{12}\), \(o_{21}\), \(o_{32}\) (all use M2). Each conflict is represented by a pair of undirected arcs.

An orientation of M1 conflicts could be: \(o_{31} \to o_{11} \to o_{22}\) (process job 3, then job 1, then job 2 on M1). An orientation of M2 conflicts could be: \(o_{21} \to o_{12} \to o_{32}\). This gives a specific feasible schedule; the makespan is the longest path from \(s\) to \(t\) in the resulting DAG.

Different orientations correspond to different schedules. The optimal makespan is found by choosing the orientation that minimises the longest path.
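The makespan-as-longest-path computation for this particular orientation can be checked directly (node names follow the example):

```python
from collections import defaultdict

# Processing times of the six operations from the example.
p = {'o11': 3, 'o12': 2, 'o21': 2, 'o22': 4, 'o31': 1, 'o32': 3}

# Conjunctive arcs (job routings) plus the chosen orientation of the
# disjunctive arcs: o31 -> o11 -> o22 on M1 and o21 -> o12 -> o32 on M2.
arcs = [('o11', 'o12'), ('o21', 'o22'), ('o31', 'o32'),   # routings
        ('o31', 'o11'), ('o11', 'o22'),                   # M1 orientation
        ('o21', 'o12'), ('o12', 'o32')]                   # M2 orientation

def makespan(p, arcs):
    """Longest s-t path in the oriented disjunctive graph = makespan."""
    preds = defaultdict(list)
    for u, v in arcs:
        preds[v].append(u)
    start = {}
    def earliest(v):              # longest path from the source, memoised
        if v not in start:
            start[v] = max((earliest(u) + p[u] for u in preds[v]), default=0)
        return start[v]
    return max(earliest(v) + p[v] for v in p)

print(makespan(p, arcs))   # 9
```

The critical path here is \(s \to o_{11} \to o_{12} \to o_{32} \to t\) with length \(1 + 3 + 2 + 3 = 9\) (the leading 1 comes through \(o_{31} \to o_{11}\)).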

7.3 Branch-and-Bound for \(J \| \| C_{\max}\)

The most effective exact algorithms for the job shop use branch-and-bound over the disjunctive graph:

  1. Branching: At each node, select an unresolved disjunctive arc (machine conflict). Branch into two sub-problems: one where operation \(u\) precedes \(v\), one where \(v\) precedes \(u\).
  2. Lower bound: Compute the longest path in the partially oriented graph. This requires solving a longest-path problem in a DAG: \(O(|V| + |C \cup D'|)\) where \(D'\) is the set of currently oriented disjunctive arcs.
  3. Pruning: If the lower bound at any node exceeds the best known solution, prune that branch.

The Carlier–Pinson algorithm (1989) strengthens the lower bound by extracting a single-machine bottleneck subproblem (\(1 \| r_j, q_j \| C_{\max}\) with head/tail times) and solving it exactly — this gives a tighter bound and enables more pruning.

7.4 Open Shops

In an open shop, each job must be processed once on each machine, but the order of machines is unconstrained (can differ per job and can be chosen as part of the optimisation).

Example (Open shop vs. flow shop for the same data): Two machines, 3 jobs with processing times \(p_{j1}\) (on M1) and \(p_{j2}\) (on M2): job 1 \((3, 1)\), job 2 \((2, 2)\), job 3 \((1, 3)\).

As a flow shop \(F2||C_{\max}\) (fixed order \(M1 \to M2\)): Apply Johnson’s algorithm. Since \(p_{11}=3 > p_{12}=1\), job 1 goes to \(S_2\); \(p_{21}=2 = p_{22}=2\) is a tie, so job 2 may go to either set (say \(S_1\)); \(p_{31}=1 < p_{32}=3\), so job 3 goes to \(S_1\). Sorting \(S_1 = \{2, 3\}\) by non-decreasing \(p_{j1}\) puts job 3 (time 1) before job 2 (time 2); \(S_2 = \{1\}\). Johnson sequence: 3, 2, 1.

Makespan computation: M1: job 3 ends 1, job 2 ends 3, job 1 ends 6. M2: job 3 starts 1, ends 4; job 2 starts 4, ends 6; job 1 starts 6, ends 7. \(C_{\max} = 7\).

As an open shop \(O2||C_{\max}\) (machine order free per job): Gonzalez and Sahni showed that the optimal 2-machine open shop makespan always equals the lower bound \(\max\bigl(\max_j (p_{j1}+p_{j2}),\; \sum_j p_{j1},\; \sum_j p_{j2}\bigr)\). Here \(\max_j(p_{j1}+p_{j2}) = \max(4, 4, 4) = 4\), \(\sum_j p_{j1} = 6\), and \(\sum_j p_{j2} = 6\), so \(C_{\max}^* = 6\).

Achievable schedule: M1 runs job 3 during \([0,1]\), job 1 during \([1,4]\), job 2 during \([4,6]\); M2 runs job 2 during \([0,2]\), job 3 during \([2,5]\), job 1 during \([5,6]\). No job’s two operations overlap and neither machine is double-booked, so \(C_{\max} = 6\).

The open shop achieves \(C_{\max} = 6 < 7\) — one unit better than the flow shop for the same data, because the machine-order freedom allows better load balancing.

Theorem (Gonzalez & Sahni, 1976): The problem \(O_m \| \text{pmtn} \| C_{\max}\) is solvable in polynomial time. The optimal preemptive makespan is: \[ C_{\max}^* = \max\!\left(\max_j \sum_k p_{jk},\; \max_k \sum_j p_{jk},\; \frac{\sum_j \sum_k p_{jk}}{m}\right). \] A constructive schedule achieving this bound is produced in \(O(mn)\) time.

For non-preemptive open shops: \(O_2 \| \| C_{\max}\) is solvable in polynomial time (Gonzalez & Sahni), but \(O_m \| \| C_{\max}\) for \(m \geq 3\) is NP-hard (Gonzalez & Sahni, by reduction from PARTITION).


Chapter 8: Approximation Algorithms

8.1 Framework and Definitions

Definition (Approximation ratio): An algorithm \(A\) for a minimisation problem has approximation ratio \(\rho \geq 1\) if for every instance \(I\), \[ A(I) \leq \rho \cdot \mathrm{OPT}(I). \] A PTAS (polynomial-time approximation scheme) is a family of algorithms \(\{A_\varepsilon\}_{\varepsilon > 0}\) where \(A_\varepsilon\) achieves ratio \(1 + \varepsilon\) and runs in time polynomial in \(|I|\) (but possibly exponential in \(1/\varepsilon\)). A FPTAS (fully polynomial-time approximation scheme) also runs in time polynomial in \(1/\varepsilon\).
Definition (APX-hard): A problem is APX-hard if every problem in APX (the class of constant-factor approximable problems) reduces to it under PTAS reductions. APX-hard problems have no PTAS unless P = NP.

8.2 List Scheduling: \(P_m \| \| C_{\max}\)

Recall from Chapter 3:

  • List scheduling: \(2 - 1/m\) ratio (Graham, 1966).
  • LPT rule: \(4/3 - 1/(3m)\) ratio (Graham, 1969).

The LPT ratio is tight: for \(m = 3\), the 7 jobs with processing times \(5,5,4,4,3,3,3\) have \(\mathrm{OPT} = 9\) (pair \(5+4\) twice, plus \(3+3+3\)), while LPT produces makespan \(11\), giving ratio \(11/9 = 4/3 - 1/9\).

Theorem (LPT approximation ratio): The Longest Processing Time first (LPT) rule achieves approximation ratio \(4/3 - 1/(3m)\) for \(P_m||C_{\max}\).
Proof. Let \(j^*\) be the last job to finish in the LPT schedule, completing at time \(C_{\max}^{\text{LPT}}\). Let \(t\) be the time at which \(j^*\) started. At time \(t\), all \(m\) machines were busy (otherwise \(j^*\) would have started earlier by the list scheduling rule, contradicting that \(t\) is its start time). Therefore, all machines together performed at least \(m \cdot t\) units of work during \([0, t]\). Since total work is \(\sum_j p_j\): \[ m \cdot t \leq \sum_j p_j - p_{j^*} \quad \Rightarrow \quad t \leq \frac{\sum_j p_j - p_{j^*}}{m}. \] Thus: \[ C_{\max}^{\text{LPT}} = t + p_{j^*} \leq \frac{\sum_j p_j - p_{j^*}}{m} + p_{j^*} = \frac{\sum_j p_j}{m} + p_{j^*}\!\left(1 - \frac{1}{m}\right). \]

Now we use two lower bounds on OPT: \(\mathrm{OPT} \geq \sum_j p_j / m\) (average load) and \(\mathrm{OPT} \geq \max_j p_j\) (the largest job). If \(n \leq m\), each job gets its own machine and LPT is trivially optimal, so assume \(n > m\). We may also assume \(j^*\) is the last job in the LPT list: jobs listed after \(j^*\) do not change \(C_{\max}^{\text{LPT}}\), and deleting them can only decrease OPT, so the worst-case ratio is attained in this situation. Then \(p_{j^*} = \min_j p_j\). If \(p_{j^*} > \mathrm{OPT}/3\), every job exceeds \(\mathrm{OPT}/3\), so each machine processes at most two jobs in any optimal schedule; a direct exchange argument shows that LPT is optimal on such instances. Otherwise \(p_{j^*} \leq \mathrm{OPT}/3\). Substituting \(p_{j^*} \leq \mathrm{OPT}/3\) and \(\sum_j p_j / m \leq \mathrm{OPT}\):

\[ C_{\max}^{\text{LPT}} \leq \mathrm{OPT} + \frac{\mathrm{OPT}}{3}\!\left(1 - \frac{1}{m}\right) = \mathrm{OPT}\!\left(\frac{4}{3} - \frac{1}{3m}\right). \qquad \square \]
Example (LPT vs. list scheduling on 3 machines): Three machines, 7 jobs with processing times \(p = (6, 5, 5, 4, 4, 3, 3)\) (listed in decreasing order). Total processing time \(\sum p_j = 30\), so \(\mathrm{OPT} \geq 30/3 = 10\). Also \(\mathrm{OPT} \geq 6\) (largest job). In fact \(\mathrm{OPT} = 10\), achieved by the assignment \(\{6,4\}\) on M1 (load 10), \(\{5,5\}\) on M2 (load 10), \(\{4,3,3\}\) on M3 (load 10).

Arbitrary list scheduling (input order \(6,5,5,4,4,3,3\)): M1 gets job 1 (\(p=6\)), M2 gets job 2 (\(p=5\)), M3 gets job 3 (\(p=5\)). Next job 4 (\(p=4\)): least loaded is M2 or M3 (load 5); assign to M2. Next job 5 (\(p=4\)): least loaded M3 (load 5); assign to M3. Next job 6 (\(p=3\)): M1 load=6, M2=9, M3=9; assign to M1. Next job 7 (\(p=3\)): M1 load=9, M2=9, M3=9; assign to M1 (break tie). M1 final load = \(6+3+3 = 12\). \(C_{\max} = 12\).

LPT rule (same sorted order in this case): assigns jobs identically to list scheduling here since the input is already sorted. So LPT also gives \(C_{\max} = 12\) in this instance.

The LPT ratio bound says \(C_{\max} \leq (4/3 - 1/9) \cdot 10 = (12/9 - 1/9) \cdot 10 = (11/9) \cdot 10 \approx 12.2\), so \(C_{\max} = 12 \leq 12.2\). The bound holds, and this instance demonstrates the bound is close to tight: the ratio is \(12/10 = 1.2\), which is near \(4/3 \approx 1.33\) for small \(m\).

The tight family for general \(m\) uses \(2m+1\) jobs with sizes \(2m-1, 2m-1, 2m-2, 2m-2, \ldots, m+1, m+1, m, m, m\): LPT balances the \(2m\) larger jobs and then adds the final job of size \(m\) to a machine with load \(3m-1\), giving makespan \(4m-1\), while \(\mathrm{OPT} = 3m\). The ratio \((4m-1)/(3m) = 4/3 - 1/(3m)\) matches the bound exactly; for \(m=3\) this is the instance \((5,5,4,4,3,3,3)\) with ratio \(11/9\).
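A minimal LPT implementation keeps machine loads in a min-heap; run on the worked example and on a tight 7-job instance for \(m = 3\):

```python
import heapq

def lpt_makespan(p, m):
    """LPT: sort jobs by length descending, always load the least-loaded machine."""
    loads = [(0, i) for i in range(m)]      # (current load, machine id) min-heap
    heapq.heapify(loads)
    for t in sorted(p, reverse=True):
        load, i = heapq.heappop(loads)
        heapq.heappush(loads, (load + t, i))
    return max(load for load, _ in loads)

print(lpt_makespan([6, 5, 5, 4, 4, 3, 3], 3))   # 12 (OPT = 10)
print(lpt_makespan([5, 5, 4, 4, 3, 3, 3], 3))   # 11 (OPT = 9, ratio 11/9)
```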

Remark (Beyond LPT: PTAS and EPTAS for scheduling): The LPT rule achieves a constant approximation ratio \(4/3 - 1/(3m)\). For fixed \(m\), the PTAS of Hochbaum and Shmoys (1987) achieves ratio \(1+\varepsilon\) in polynomial time for any \(\varepsilon > 0\), and the later Efficient PTAS (EPTAS) of Jansen (2010) achieves ratio \(1+\varepsilon\) in time \(f(1/\varepsilon) \cdot \mathrm{poly}(n)\), with the exponent of \(n\) independent of \(\varepsilon\), making it practical even for small \(\varepsilon\). For variable \(m\), \(P||C_{\max}\) is NP-hard in the strong sense, so no FPTAS exists unless P = NP; a PTAS still exists, since the Hochbaum–Shmoys dual-approximation scheme handles \(m\) as part of the input, but its running time grows rapidly as \(\varepsilon\) shrinks. The practical hierarchy is: exact algorithms (for small \(n\) or fixed \(m\)), PTAS/EPTAS (when near-optimality is required), constant-factor approximations such as LPT (fast and simple, with worst-case guarantees), and heuristics (for very large instances).

8.3 PTAS for \(P_m \| \| C_{\max}\) (Fixed \(m\))

Theorem (Hochbaum & Shmoys, 1987): For fixed \(m\), there is a PTAS for \(P_m \| \| C_{\max}\).
Proof sketch. Given \(\varepsilon > 0\) and a known lower bound \(T^*\) on OPT (use \(T^* = \max(\max_j p_j, \sum_j p_j / m)\)):
  1. Large jobs: jobs with \(p_j \geq \varepsilon T^*\). There are at most \(1/\varepsilon\) large jobs per machine in any optimal schedule, so at most \(m/\varepsilon\) large jobs total. Enumerate all assignments of large jobs to machines (at most \(m^{m/\varepsilon}\) choices — constant for fixed \(m, \varepsilon\)).
  2. Small jobs: for each assignment of large jobs, greedily assign small jobs one-by-one to the currently least-loaded machine. Small jobs add at most \(\varepsilon T^*\) to the makespan (since each small job has \(p_j < \varepsilon T^*\)).
  3. The resulting schedule has \(C_{\max} \leq (1 + \varepsilon) T^* \leq (1 + \varepsilon) \mathrm{OPT}\).
Total running time: \(O(n) + m^{m/\varepsilon} \cdot O(n/\varepsilon)\) — polynomial in \(n\) for fixed \(m, \varepsilon\).
Example (PTAS via rounding for \(P_m \| \| C_{\max}\)): Suppose \(m=2\) machines, \(\varepsilon = 0.5\), and OPT is known to be around \(T^* = 10\). Large jobs are those with \(p_j \geq \varepsilon T^* = 5\); say there are 2 large jobs (sizes 7 and 6). We enumerate all \(m^{m/\varepsilon} = 2^{2/0.5} = 2^4 = 16\) assignments of large jobs to machines — but with only 2 large jobs there are just 4 distinct assignments. The best assignment puts the size-7 job on machine 1 and size-6 job on machine 2 (loads 7 and 6). Small jobs (all sizes \(< 5\)) are then assigned greedily to the least-loaded machine, adding at most \(\varepsilon T^* = 5\) to the makespan. Final makespan \(\leq 7 + 5 = 12 \leq (1 + 0.5) \cdot 10 = 15\). The actual bound can be tighter; the key point is that the ratio is provably \(1+\varepsilon\) for all instances.

Hardness for variable \(m\): When \(m\) is part of the input, \(P \| \| C_{\max}\) is NP-hard in the strong sense (reduction from 3-PARTITION), so no FPTAS exists unless P = NP. A PTAS still exists: the dual-approximation technique of Hochbaum and Shmoys handles \(m\) as part of the input, so the strong NP-hardness rules out only fully polynomial schemes.

Example (PTAS in action — enumerate large jobs, greedily assign small): Three machines (\(m=3\)), \(\varepsilon = 1/3\), and 8 jobs with processing times \(p = (9, 8, 7, 2, 2, 1, 1, 1)\). Lower bound \(T^* = \max(9, 31/3) = \max(9, 10.3) = 10.3\), so \(\mathrm{OPT} \geq 10.3\) and we round up to \(\mathrm{OPT} \geq 11\) (since processing times are integers).

Large jobs: \(p_j \geq \varepsilon \cdot T^* = (1/3)(10.3) = 3.4\). So jobs with \(p_j \geq 4\) are large: \(\{9, 8, 7\}\) (3 jobs). Small jobs: \(\{2, 2, 1, 1, 1\}\).

Enumerate all assignments of large jobs to machines: \(3^3 = 27\) possible assignments (each of 3 large jobs to one of 3 machines). The best balanced assignment is \(\{9\}\), \(\{8\}\), \(\{7\}\) — one large job per machine, giving loads 9, 8, 7.

Assign small jobs greedily (each to least-loaded machine):

  • Small job \(p=2\): assign to M3 (load 7), new loads: 9, 8, 9.
  • Small job \(p=2\): assign to M2 (load 8), new loads: 9, 10, 9.
  • Small jobs \(p=1,1,1\): assign to M1 (load 9→10), M3 (load 9→10), M1 (load 10→11). Final loads: M1=11, M2=10, M3=10. \(C_{\max} = 11\).

Optimal \(C_{\max} = 11\) (since \(\sum p_j = 31\) and \(\lceil 31/3 \rceil = 11\), the load lower bound is tight). The PTAS achieves the optimal value here. The ratio guarantee is \((1+\varepsilon) = 4/3\), but this instance achieves optimality exactly.
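The enumerate-large/greedy-small scheme can be sketched directly; this brute-force version tries every large-job assignment and reproduces the computation above:

```python
import heapq
from itertools import product

def ptas_makespan(p, m, eps):
    """PTAS sketch: enumerate machine assignments of the large jobs,
    then place small jobs greedily on the least-loaded machine."""
    t_star = max(max(p), sum(p) / m)          # lower bound on OPT
    large = [t for t in p if t >= eps * t_star]
    small = [t for t in p if t < eps * t_star]
    best = float('inf')
    for assign in product(range(m), repeat=len(large)):
        loads = [0] * m
        for t, i in zip(large, assign):
            loads[i] += t
        heap = [(load, i) for i, load in enumerate(loads)]
        heapq.heapify(heap)
        for t in sorted(small, reverse=True): # greedy list scheduling
            load, i = heapq.heappop(heap)
            heapq.heappush(heap, (load + t, i))
        best = min(best, max(load for load, _ in heap))
    return best

print(ptas_makespan([9, 8, 7, 2, 2, 1, 1, 1], m=3, eps=1/3))   # 11
```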

8.4 Approximation for \(R \| \| \sum w_j C_j\)

For unrelated machines with weighted completion time:

  1. Formulate an LP relaxation: minimise \(\sum_j w_j \bar{C}_j\) where \(\bar{C}_j\) are fractional completion times obtained by LP. The LP can be solved in polynomial time and its value is at most OPT.
  2. Randomised rounding: assign each job \(j\) to machine \(i\) with probability \(x_{ij}\) (from the LP solution), independently. The resulting integer schedule has expected cost at most 2·OPT (Schulz & Skutella, 2002).

This gives a 2-approximation (the integrality gap of the LP is exactly 2).

8.5 Preemptive Lower Bounds and Rounding

A general technique: solve the preemptive relaxation (which is polynomial) to obtain \(C_{\max}^{\text{pmtn}} \leq \mathrm{OPT}_{\text{non-pmtn}}\). Then convert the preemptive schedule to a non-preemptive one. For \(P \| \| C_{\max}\):

\[ C_{\max}^{\text{non-pmtn}} \leq C_{\max}^{\text{pmtn}} + \max_j p_j \leq \mathrm{OPT} + \mathrm{OPT} = 2 \cdot \mathrm{OPT}. \]

(Each job is delayed by at most one preemption-induced gap of size \(\max_j p_j \leq \mathrm{OPT}\).)

Sharper, algorithm-specific analyses (such as Graham’s argument for LPT above) improve the constant from 2 to \(4/3 - 1/(3m)\).


Chapter 9: Metaheuristics and Practical Procedures

9.1 Local Search

Neighbourhood structure: For permutation scheduling problems (single machine, flow shop), define a neighbourhood \(N(\pi)\) of schedule \(\pi\) as all schedules reachable by one move:

  • Swap: interchange jobs at positions \(i\) and \(j\).
  • Insert (shift): remove job at position \(i\) and reinsert at position \(j\).
  • 2-opt: reverse a sub-sequence between positions \(i\) and \(j\).

Local search algorithm:

  1. Start from an initial solution \(\pi_0\) (e.g., from a constructive heuristic).
  2. Evaluate all (or a random subset of) neighbours.
  3. Move to the best improving neighbour.
  4. Terminate when no improving neighbour exists (local optimum).

Local search for \(J \| \| C_{\max}\): the neighbourhood is defined by swapping two adjacent operations on the critical path (the longest path in the disjunctive graph). Only moves on the critical path can improve the makespan, so the effective neighbourhood is small.
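A minimal improving-move local search over permutations with the swap neighbourhood, evaluated on a hypothetical 2-machine flow shop instance:

```python
from itertools import combinations

def makespan(perm, p):
    """Permutation flow shop makespan; p[j] = tuple of times on machines 1..m."""
    finish = [0] * len(p[perm[0]])
    for j in perm:
        prev = 0
        for k in range(len(finish)):
            finish[k] = max(finish[k], prev) + p[j][k]
            prev = finish[k]
    return finish[-1]

def local_search(perm, p):
    """Repeatedly apply improving swap moves until a local optimum is reached."""
    perm = list(perm)
    improved = True
    while improved:
        improved = False
        current = makespan(perm, p)
        for i, j in combinations(range(len(perm)), 2):
            cand = perm[:]
            cand[i], cand[j] = cand[j], cand[i]
            if makespan(cand, p) < current:
                perm, current, improved = cand, makespan(cand, p), True
    return perm

p = {1: (5, 2), 2: (3, 4), 3: (2, 3)}     # hypothetical instance
seq = local_search([1, 2, 3], p)
print(seq, makespan(seq, p))
```

On this instance the search reaches the optimal permutation after two swap moves; in general it stops at a local optimum.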

9.2 Simulated Annealing

Simulated annealing (Kirkpatrick, Gelatt & Vecchi, 1983) extends local search by occasionally accepting worse solutions to escape local optima.

Definition (Simulated annealing):
  1. Initialise: \(\pi \leftarrow \pi_0\), temperature \(T \leftarrow T_0\) (high).
  2. Repeat:
    • Choose a random neighbour \(\pi' \in N(\pi)\).
    • If \(f(\pi') < f(\pi)\): accept (\(\pi \leftarrow \pi'\)).
    • Else: accept with probability \(\exp\!\left(-\frac{f(\pi') - f(\pi)}{T}\right)\).
  3. Reduce temperature: \(T \leftarrow \alpha T\) (geometric cooling, e.g., \(\alpha = 0.95\)) after a fixed number of iterations.
  4. Terminate when \(T\) is below a threshold or no improvement for many iterations.
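The annealing loop above can be sketched as follows (swap neighbourhood on a permutation flow shop; the instance data and parameters are illustrative, and the seed is fixed only for reproducibility):

```python
import math
import random

def makespan(perm, p):
    """Permutation flow shop makespan; p[j] = tuple of times on machines 1..m."""
    finish = [0] * len(p[perm[0]])
    for j in perm:
        prev = 0
        for k in range(len(finish)):
            finish[k] = max(finish[k], prev) + p[j][k]
            prev = finish[k]
    return finish[-1]

def simulated_annealing(perm, p, t0=10.0, alpha=0.95, steps=50, t_min=0.1):
    """Geometric-cooling SA with random swap neighbours."""
    random.seed(0)                        # reproducible run, for illustration
    cur, best = list(perm), list(perm)
    t = t0
    while t > t_min:
        for _ in range(steps):
            i, j = random.sample(range(len(cur)), 2)
            cand = cur[:]
            cand[i], cand[j] = cand[j], cand[i]
            delta = makespan(cand, p) - makespan(cur, p)
            # accept improving moves always, uphill moves with prob e^{-delta/T}
            if delta < 0 or random.random() < math.exp(-delta / t):
                cur = cand
                if makespan(cur, p) < makespan(best, p):
                    best = cur[:]
        t *= alpha                        # geometric cooling
    return best

p = {0: (5, 2), 1: (3, 4), 2: (2, 3), 3: (4, 1), 4: (1, 5), 5: (6, 2)}
start = list(range(6))
best = simulated_annealing(start, p)
print(makespan(start, p), makespan(best, p))
```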

Under appropriate cooling schedules, simulated annealing converges to a global optimum with probability 1 (Hajek, 1988) — but the required number of iterations is generally exponential. In practice, SA produces near-optimal solutions for \(J \| \| C_{\max}\) in reasonable computation time.

Application to job shop: The neighbourhood consists of single-arc reversals on the critical path (van Laarhoven et al., 1992). Starting from a greedy schedule (e.g., LPT dispatching), SA explores thousands of moves per second and typically finds solutions within 1–2% of the best known lower bound on benchmark instances.

9.3 Tabu Search

Tabu search (Glover, 1986) keeps a tabu list of recently performed moves (or recently visited solutions) and forbids reversing them for a fixed number of iterations (the tabu tenure).

Key features:

  • Intensification: once a promising region is found, search more deeply there.
  • Diversification: when stuck, make a large random move to unexplored regions.
  • Aspiration criterion: accept a tabu move if it produces a solution better than all previously seen solutions.

For \(F_m \| \| C_{\max}\), tabu search on the space of job permutations with insert/swap moves is state-of-the-art (Nowicki & Smutnicki, 1996 for job shop; Taillard, 1990 for flow shop).

9.4 Genetic Algorithms

Genetic algorithms (Holland, 1975) maintain a population of solutions and evolve it via selection, crossover, and mutation.

Encoding for scheduling: A permutation \(\pi\) of jobs represents a schedule (evaluated by a priority rule or constructive decoder). Alternatively, a matrix of machine assignments encodes a parallel machine schedule.

Crossover operators for permutations:

  • PMX (Partially Mapped Crossover): select two cut points in parent permutations; copy the middle segment from parent 1; fill remaining positions using the order from parent 2, avoiding duplicates by the mapping defined in the copied segment.
  • OX (Order Crossover): copy a sub-sequence from parent 1; fill remaining positions in the order they appear in parent 2.

Mutation: swap two randomly chosen positions in the permutation (or apply a random insert move).

Selection: tournament selection (select \(k\) individuals at random, keep the best) or roulette wheel selection (probability proportional to fitness).

For scheduling, the fitness function is the objective value (\(C_{\max}\), \(\sum w_j C_j\), etc.) after decoding the chromosome to a schedule.
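A common left-to-right variant of OX can be sketched as follows (the classical version fills positions starting after the second cut point and wraps around; cut points here are fixed for illustration):

```python
def order_crossover(p1, p2, a, b):
    """OX variant: copy p1[a:b] into the child, then fill the remaining
    positions left to right in the order the genes appear in p2."""
    child = [None] * len(p1)
    child[a:b] = p1[a:b]
    used = set(p1[a:b])
    fill = (g for g in p2 if g not in used)   # p2's order, duplicates skipped
    for i in range(len(child)):
        if child[i] is None:
            child[i] = next(fill)
    return child

p1 = [1, 2, 3, 4, 5, 6, 7]
p2 = [3, 7, 5, 1, 6, 2, 4]
print(order_crossover(p1, p2, 2, 5))   # [7, 1, 3, 4, 5, 6, 2]
```

The child keeps positions 2–4 from parent 1 and inherits the relative order 7, 1, 6, 2 of the remaining jobs from parent 2, so it is always a valid permutation.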

Iterated local search (ILS) combines local search with a perturbation step to escape local optima repeatedly.

Definition (Iterated local search):
  1. Generate an initial solution \(\pi_0\) (e.g., by NEH or LPT).
  2. Apply local search to \(\pi_0\) to reach a local optimum \(\pi^*\).
  3. Perturb \(\pi^*\) (e.g., apply a random double-bridge move — cut the permutation into 4 segments and reconnect in a different order) to obtain \(\pi'\).
  4. Apply local search to \(\pi'\) to reach a new local optimum \(\pi^{**}\).
  5. Accept \(\pi^{**}\) as the new incumbent if it improves the objective (or by an acceptance criterion). Return to step 3.
  6. Terminate after a time limit or iteration count.
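The double-bridge perturbation of step 3 can be sketched as follows; the cut points are random, and the A-C-B-D reconnection guarantees the result differs from the input:

```python
import random

def double_bridge(perm):
    """Cut the permutation into 4 segments A|B|C|D at 3 random points
    and reconnect them as A-C-B-D."""
    n = len(perm)
    i, j, k = sorted(random.sample(range(1, n), 3))
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]

random.seed(1)                     # fixed seed, for reproducibility only
tour = list(range(10))
print(double_bridge(tour))
```

Because the three cut points are distinct, segments B and C are non-empty and swapping them always produces a genuinely different permutation.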

ILS is especially effective for flow shop and job shop scheduling because the double-bridge perturbation is large enough to escape deep local optima but preserves enough structure to allow quick re-optimisation by local search. It consistently outperforms pure local search on benchmark instances (Taillard, 1993) while being much simpler to implement than genetic algorithms.

Remark (No-free-lunch theorem and scheduling heuristics): The no-free-lunch theorem (Wolpert and Macready, 1997) states that no single optimisation algorithm outperforms all others when averaged over all possible problem instances. In scheduling, this manifests as the empirical finding that the best heuristic depends heavily on the problem structure. For flow shop \(F_m||C_{\max}\), the NEH heuristic combined with tabu search is state-of-the-art. For job shop \(J||C_{\max}\), iterated disjunctive graph local search (moving operations along the critical path) is superior. For parallel machine scheduling, LP-based rounding often outperforms pure combinatorial heuristics. Practitioners must choose their algorithm based on the specific scheduling environment, constraint structure, and computational budget — there is no universally dominant method. The theory of approximation algorithms provides the worst-case guarantee framework, while computational experiments on benchmark instances (Taillard benchmarks for flow shop; Fisher–Thompson instances for job shop) provide empirical guidance.

9.5 Real-World Scheduling Complexity

Practical scheduling instances involve constraints far beyond the standard \(\alpha|\beta|\gamma\) framework:

  • Workforce scheduling (e.g., nurse rostering): legal limits on consecutive hours worked, minimum rest periods between shifts, fairness (equitable distribution of undesirable shifts), part-time employees.
  • Airline crew rostering: pairing of flights into duty periods, crew legality rules (FAA/EASA regulations), crew qualifications, overnight layovers, bidding and seniority.
  • Sports league scheduling: home/away balance, travel distance minimisation, TV broadcast constraints, venue availability, avoiding certain matchups in consecutive rounds.

These problems are typically modelled as integer programmes with hundreds of thousands of variables, or as constraint satisfaction problems (CSP) solved by constraint programming (CP) solvers such as IBM CP Optimizer or Google OR-Tools.

Constraint programming reasons about feasibility through arc consistency and domain propagation, often combined with a branch-and-bound search. For scheduling, CP supports global constraints such as cumulative (machine capacity over time) and no-overlap (non-preemptive scheduling), enabling efficient pruning without explicit enumeration.


Chapter 10: Online and Stochastic Scheduling

10.1 Online Scheduling

In offline scheduling, the algorithm knows all jobs in advance before making any assignment decision. In online scheduling, jobs arrive over time and must be assigned irrevocably upon arrival, without knowledge of future jobs.

Definition (Online algorithm and competitive ratio): An online scheduling algorithm receives jobs one at a time and must assign each job to a machine (and possibly a time slot) as it arrives, without knowledge of future jobs. The competitive ratio of an online algorithm \(\text{ALG}\) on a minimisation problem is: \[ \rho = \sup_I \frac{\text{ALG}(I)}{\text{OPT}(I)}, \] where the supremum is over all input sequences \(I\) and \(\text{OPT}(I)\) is the optimal offline solution. An algorithm with competitive ratio \(\rho\) guarantees that its output is at most \(\rho\) times the optimal on every input. The competitive ratio framework, introduced by Sleator and Tarjan (1985), measures worst-case performance against an adversary who may choose \(I\) after seeing the algorithm but before jobs arrive.
Theorem (Lower bound on deterministic online \(P_m||C_{\max}\)): No deterministic online algorithm for \(P2||C_{\max}\) has competitive ratio better than \(3/2\). Consequently list scheduling, whose ratio \(2 - 1/m\) equals \(3/2\) for \(m = 2\), is an optimal deterministic online algorithm on two machines.
Proof (adversarial argument). The adversary first presents two jobs of size 1. If the online algorithm places both on the same machine, the adversary stops: the online makespan is 2 while \(\mathrm{OPT} = 1\), ratio 2. If it places them on different machines, the adversary presents one job of size 2. Wherever that job goes, the online makespan is \(1 + 2 = 3\), while the optimal offline schedule pairs the two unit jobs on one machine opposite the size-2 job, so \(\mathrm{OPT} = 2\) and the ratio is \(3/2\). In either case the ratio is at least \(3/2\). \(\square\)

For larger \(m\), more intricate adversarial constructions push the deterministic lower bound above \(1.88\), so no online algorithm approaches the offline LPT guarantee of \(4/3 - 1/(3m)\).
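The classic two-machine adversarial sequence (two unit jobs, then one job of size 2) is easy to replay against list scheduling:

```python
import heapq

def list_schedule(jobs, m):
    """Graham's list scheduling: each arriving job goes to the least-loaded machine."""
    loads = [(0, i) for i in range(m)]
    heapq.heapify(loads)
    for t in jobs:
        load, i = heapq.heappop(loads)
        heapq.heappush(loads, (load + t, i))
    return max(load for load, _ in loads)

online = list_schedule([1, 1, 2], 2)   # the unit jobs split, the big job stacks
offline = 2                            # optimal: {1, 1} on one machine, {2} on the other
print(online, online / offline)        # 3 1.5
```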

Remark (Online algorithms versus LPT for \(P_m||C_{\max}\)): LPT achieves \(4/3 - 1/(3m)\), but it must sort all jobs before scheduling, so it is not an online algorithm. List scheduling (assign each arriving job to the currently least-loaded machine) is online and \((2-1/m)\)-competitive; for \(m = 2\) and \(m = 3\) this matches the deterministic lower bound (Faigle, Kern and Turán, 1989), so list scheduling is online-optimal there. For large \(m\), better online algorithms exist: Albers (1999) achieves competitive ratio \(1.923\), improving on \(2 - 1/m\), while lower bounds around \(1.88\) show that no deterministic online algorithm can come close to \(4/3\).

10.2 Stochastic Scheduling

In stochastic scheduling, processing times are not deterministic constants but random variables. The scheduler observes job completions and may choose the next job based on past observations (adaptive policy).

Definition (Stochastic scheduling): In a stochastic scheduling model, each job \(j\) has a random processing time \(P_j\) drawn from a known distribution \(F_j\). A non-anticipatory policy may use the history of completions (and any information observed during processing) but not future processing times. The objective is to minimise the expected value of the criterion, e.g., \(\mathbf{E}[\sum C_j]\). A policy is optimal if it achieves the minimum expected cost among all non-anticipatory policies.
Theorem (SEPT is optimal for stochastic \(1||\mathbf{E}[\sum C_j]\) with exponential distributions): When processing times are exponentially distributed — \(P_j \sim \text{Exp}(\mu_j)\) with mean \(1/\mu_j\) — the Shortest Expected Processing Time (SEPT) rule is optimal for minimising \(\mathbf{E}[\sum C_j]\) on a single machine. SEPT orders jobs in non-decreasing order of \(1/\mu_j = \mathbf{E}[P_j]\).

The key property enabling SEPT’s optimality in richer settings (preemptive and parallel-machine models) is the memoryless property of the exponential distribution: the remaining processing time of a job in service is independent of how long it has already been running. For the non-preemptive single-machine problem the exponential assumption is not even needed: for a fixed sequence, \(\mathbf{E}[\sum C_j]\) depends only on the means \(\mathbf{E}[P_j]\), so the WSPT exchange argument carries over to arbitrary distributions.
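For a fixed non-preemptive sequence on one machine, \(\mathbf{E}[\sum C_j]\) is a weighted sum of the mean processing times (the job in position \(k\) of \(n\) contributes its mean \(n-k+1\) times), so SEPT’s optimality can be checked by brute force over all sequences (mean values hypothetical):

```python
from itertools import permutations

def expected_total_completion(means):
    """E[sum C_j] for a fixed sequence: by linearity of expectation, the
    job in (0-indexed) position k contributes its mean (n - k) times."""
    n = len(means)
    return sum((n - k) * mu for k, mu in enumerate(means))

means = [1.0, 2.0, 3.0, 4.0]          # hypothetical E[P_j] values
best = min(permutations(means), key=expected_total_completion)
print(best)                            # (1.0, 2.0, 3.0, 4.0) -- the SEPT order
```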

Remark (Stochastic scheduling beyond exponential distributions): For general (non-exponential) processing time distributions, the optimal stochastic scheduling policy is considerably more complex. Even on a single machine with preemption, minimising \(\mathbf{E}[\sum C_j]\) with general independent processing times may require a policy that monitors the attained service of the running job and preempts it in favour of another — essentially the stochastic analogue of SRPT. For the multi-armed bandit model (a generalisation of stochastic scheduling where each job is a "project" with a state-dependent reward and cost), the Gittins index theorem (Gittins, 1979) provides the optimal policy: at each moment, activate the project with the highest Gittins index (a function of the project's current state and reward/cost structure). The Gittins index reduces to SEPT when all projects have exponential service times and the objective is \(\mathbf{E}[\sum C_j]\). For non-exponential distributions, the Gittins index is harder to compute but still characterises the optimal policy for a wide class of objectives.

In practical scheduling (e.g., call centre routing, cloud computing job scheduling, operating system process scheduling), stochastic models are essential when processing times are highly variable. Common distributions used in practice include log-normal (heavy-tailed for file transfer), Pareto (for internet request sizes), and Weibull (for reliability-related tasks). The optimal policy depends critically on whether preemption is allowed and on the specific distribution family.


Summary Table: Key Results in CO 454

| Problem | Complexity | Best algorithm | Bound/ratio |
|---|---|---|---|
| \(1 \mid\mid \sum w_j C_j\) | P | WSPT | optimal |
| \(1 \mid\mid L_{\max}\) | P | EDD | optimal |
| \(1 \mid\mid \sum U_j\) | P | Moore’s algorithm | optimal |
| \(1 \mid r_j, \text{pmtn} \mid \sum C_j\) | P | SRPT | optimal |
| \(1 \mid r_j, \text{pmtn} \mid L_{\max}\) | P | preemptive EDD | optimal |
| \(1 \mid \text{prec} \mid L_{\max}\) | P | Lawler’s algorithm | optimal, \(O(n^2)\) |
| \(P \mid \text{pmtn} \mid C_{\max}\) | P | McNaughton wrap-around | optimal |
| \(F_2 \mid\mid C_{\max}\) | P | Johnson’s algorithm | optimal |
| \(J_2 \mid\mid C_{\max}\) | P | Jackson’s algorithm | optimal |
| \(O_2 \mid \text{pmtn} \mid C_{\max}\) | P | Gonzalez–Sahni | optimal |
| \(1 \mid r_j \mid \sum C_j\) | NP-hard (strong) | B&B | exact only |
| \(1 \mid\mid \sum w_j U_j\) | NP-hard (weak) | DP | pseudo-poly |
| \(1 \mid\mid \sum w_j T_j\) | NP-hard (strong) | bitmask DP | \(O(2^n n)\) exact |
| \(P_2 \mid\mid C_{\max}\) | NP-hard (weak) | DP / FPTAS | FPTAS |
| \(P_m \mid\mid C_{\max}\) (fixed \(m\)) | NP-hard (weak) | DP / FPTAS | FPTAS |
| \(P \mid\mid C_{\max}\) (\(m\) part of input) | NP-hard (strong) | PTAS (Hochbaum–Shmoys); LPT | \((1+\varepsilon)\); \(4/3 - 1/(3m)\) |
| \(R \mid\mid C_{\max}\) | NP-hard (strong) | LP rounding (Lenstra et al.) | 2 |
| \(F_3 \mid\mid C_{\max}\) | NP-hard (strong) | B&B / SA | no exact poly-time (unless P = NP) |
| \(J_3 \mid\mid C_{\max}\) | NP-hard (strong) | B&B / SA | no exact poly-time (unless P = NP) |
| \(O_m \mid\mid C_{\max}\) (\(m \geq 3\)) | NP-hard | B&B | exact only |

Chapter 11: Mathematical Programming Formulations

Combinatorial scheduling problems admit natural formulations as integer programmes (IPs) or mixed-integer programmes (MIPs). While these formulations are not polynomial-time algorithms, they provide two distinct utilities: they allow optimal solutions to be found by branch-and-bound solvers (CPLEX, Gurobi) for moderate-sized instances, and they are the starting point for LP-relaxation-based approximation algorithms.

11.1 Time-Indexed Formulations

The time-indexed formulation is the most natural IP model for scheduling. Discretise the time horizon into integer slots \(t = 1, 2, \ldots, T\), where \(T = \sum_j p_j\) is an upper bound on the makespan.

Definition (Time-indexed IP for \(1||\sum w_j C_j\)): Introduce binary variables \[ x_{jt} = \begin{cases} 1 & \text{if job } j \text{ completes at time } t, \\ 0 & \text{otherwise.} \end{cases} \] The formulation is: \[ \min \sum_j w_j \sum_t t \cdot x_{jt} \]

subject to:

\[ \sum_t x_{jt} = 1 \quad \forall j \qquad \text{(each job completes exactly once)}, \]\[ \sum_j \sum_{s=t}^{t+p_j-1} x_{js} \leq 1 \quad \forall t \qquad \text{(machine capacity — at most one job runs at each time)}, \]\[ x_{jt} \in \{0,1\} \ \ \forall j, t, \qquad x_{jt} = 0 \ \text{ for } t < r_j + p_j \quad \text{(job } j \text{ cannot complete before } r_j + p_j\text{)}. \]

The LP relaxation of this formulation (replacing \(x_{jt} \in \{0,1\}\) with \(0 \leq x_{jt} \leq 1\)) is remarkably strong. By \(\alpha\)-point rounding — scheduling each job in the order in which its fractional completion first reaches a threshold \(\alpha\) — one obtains constant-factor approximations for \(1|r_j|\sum w_j C_j\); a careful randomised choice of \(\alpha\) yields ratio \(1.6853\) (Goemans, Queyranne, Schulz, Skutella and Wang), and for unit weights the ratio improves to \(e/(e-1) \approx 1.582\). The time-indexed LP is the basis of several state-of-the-art approximation algorithms, including those of Goemans and of Schulz and Skutella.

Remark (Size of the time-indexed formulation): With \(n\) jobs and horizon \(T = \sum p_j\), the number of variables is \(O(nT)\), which is pseudo-polynomial in the input size since \(T\) grows with the magnitudes of the processing times. When all processing times are bounded by a constant — e.g., unit processing times, where \(T = n\) — this is \(O(n^2)\) variables and the model is genuinely polynomial. With large processing times, time-indexed models become impractical for exact solution, but they remain extremely effective when processing times are small integers.
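As a concreteness check, here is a minimal sketch (hypothetical three-job instance, no release dates) that encodes a schedule as time-indexed variables and verifies the assignment and machine-capacity constraints of the formulation above:

```python
# Sketch: encode a schedule for 1||sum w_j C_j as time-indexed variables
# x[j][t] (job j completes at slot t) and check the IP's constraints.
p = {"a": 2, "b": 3, "c": 1}             # processing times (hypothetical)
T = sum(p.values())                      # horizon T = sum_j p_j

def to_time_indexed(completion):         # completion[j] = C_j, slots 1..T
    return {j: {t: int(t == completion[j]) for t in range(1, T + 1)}
            for j in p}

def feasible(x):
    # each job completes exactly once
    if any(sum(x[j].values()) != 1 for j in p):
        return False
    # machine capacity: job j occupies slot t iff it completes at some
    # s in [t, t + p_j - 1]; at most one job may occupy each slot
    for t in range(1, T + 1):
        busy = sum(x[j][s] for j in p
                   for s in range(t, min(t + p[j] - 1, T) + 1))
        if busy > 1:
            return False
    return True

assert feasible(to_time_indexed({"a": 2, "b": 5, "c": 6}))      # order a,b,c
assert not feasible(to_time_indexed({"a": 2, "b": 4, "c": 6}))  # b overlaps a
```

An IP solver searches over exactly this space of 0/1 assignments; the sketch only evaluates the constraint system, which is what an LP-rounding algorithm also works against.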

11.2 Disjunctive Programme for the Job Shop

The classical mathematical programming model for job shop scheduling uses disjunctive variables to represent the ordering of jobs on each machine.

Definition (Disjunctive programme for \(J_m||C_{\max}\)): Let \(s_{jk}\) denote the start time of job \(j\) on machine \(k\), and let \(V\) be a large constant (big-M). For each pair of jobs \(i, j\) that share machine \(k\), introduce a binary variable \(z_{ijk} \in \{0,1\}\) indicating which job precedes the other. The formulation is: \[ \min C_{\max} \] subject to: \[ s_{j,\sigma_j(l+1)} \geq s_{j,\sigma_j(l)} + p_{j,\sigma_j(l)} \quad \forall j, l \qquad \text{(routing constraints)}, \]\[ s_{jk} \geq s_{ik} + p_{ik} - V(1 - z_{ijk}) \quad \forall i \neq j, \forall k, \]\[ s_{ik} \geq s_{jk} + p_{jk} - V z_{ijk} \quad \forall i \neq j, \forall k, \]\[ C_{\max} \geq s_{jk} + p_{jk} \quad \forall j, k, \]\[ z_{ijk} \in \{0,1\}, \quad s_{jk} \geq 0. \]

Here \(\sigma_j\) is the machine routing of job \(j\).

The LP relaxation (relaxing \(z_{ijk}\) to \([0,1]\)) has an unbounded integrality gap in general, so branch-and-bound on the disjunctive variables is the dominant exact method for \(J_m||C_{\max}\) in practice. The related shifting bottleneck procedure — which fixes machine orders one machine at a time by solving single-machine \(1|r_j|L_{\max}\) subproblems — is the classical high-quality heuristic built on the same disjunctive view.
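Fixing the disjunctive variables amounts to choosing a job order on each machine; the makespan of that choice can then be computed by propagating earliest start times. A minimal sketch on a hypothetical two-job, two-machine instance (assuming the chosen orders are acyclic; an inconsistent orientation would leave operations permanently unschedulable):

```python
# Evaluate a job-shop schedule after fixing the disjunctions: a job order on
# each machine. routing[j] = list of (machine, p) in processing order.
routing = {"J1": [(0, 3), (1, 2)], "J2": [(1, 4), (0, 1)]}
order = {0: ["J1", "J2"], 1: ["J2", "J1"]}   # chosen orientation (hypothetical)

job_ready = {j: 0 for j in routing}          # earliest start of j's next op
mach_ready = {k: 0 for k in order}           # machine free time
next_op = {j: 0 for j in routing}            # index into routing[j]
pending = {k: list(js) for k, js in order.items()}
cmax = 0
progress = True
while progress:                              # schedule ops as they unblock
    progress = False
    for k, queue in pending.items():
        if queue:
            j = queue[0]
            m, p = routing[j][next_op[j]]
            if m == k:                       # head of queue is j's next op
                s = max(job_ready[j], mach_ready[k])
                job_ready[j] = mach_ready[k] = s + p
                cmax = max(cmax, s + p)
                next_op[j] += 1
                queue.pop(0)
                progress = True
assert cmax == 6   # matches the machine-1 load lower bound, so optimal here
```

Branch-and-bound explores exactly these orientations, pruning with lower bounds such as machine loads and critical paths in the disjunctive graph.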

11.3 Approximation Schemes

For certain NP-hard scheduling problems, a polynomial-time approximation scheme (PTAS) exists: for any fixed \(\varepsilon > 0\), the algorithm runs in time polynomial in \(n\) (though possibly exponential in \(1/\varepsilon\)) and returns a solution within a \((1+\varepsilon)\) factor of optimal.

Theorem (PTAS for \(P||\sum C_j\) and for \(P||C_{\max}\)): For each \(\varepsilon > 0\):
  • There is a PTAS for \(P||\sum C_j\) (identical parallel machines, minimise total completion time).
  • There is a PTAS for \(P||C_{\max}\) (identical parallel machines, minimise makespan), due to Hochbaum and Shmoys (1987).
Proof sketch (Hochbaum–Shmoys PTAS for \(P||C_{\max}\)): Fix \(\varepsilon > 0\), and suppose a target value \(\text{OPT}\) is available (the unknown optimum is handled by binary search over candidate makespans). Call job \(j\) large if \(p_j > \varepsilon \cdot \text{OPT}\) and small otherwise. Since \(\sum_j p_j \leq m \cdot \text{OPT}\), there are at most \(m/\varepsilon\) large jobs. Round large job sizes up to the nearest multiple of \(\varepsilon^2 \cdot \text{OPT}\); as large sizes lie in \((\varepsilon \cdot \text{OPT}, \text{OPT}]\), this creates at most \(1/\varepsilon^2\) distinct size classes. Enumerate the assignments of large jobs to machines — polynomial time for fixed \(\varepsilon\), e.g., by dynamic programming over machine "configurations" of rounded sizes — and choose one that minimises the makespan on large jobs. Then schedule small jobs greedily on top, assigning each to the currently least-loaded machine. When a small job is placed, the chosen machine's load is at most the average load, hence at most \(\text{OPT}\), so its new load is at most \((1+\varepsilon)\text{OPT}\); the rounding of large jobs costs a further \(O(\varepsilon) \cdot \text{OPT}\). The total makespan is at most \((1 + O(\varepsilon)) \cdot \text{OPT}\), and rescaling \(\varepsilon\) gives the claim.
Remark (FPTAS versus PTAS): A fully polynomial-time approximation scheme (FPTAS) runs in time polynomial in both \(n\) and \(1/\varepsilon\) — it is efficiently applicable even for very small \(\varepsilon\). A PTAS is polynomial in \(n\) for fixed \(\varepsilon\) but may be exponential in \(1/\varepsilon\). For \(P_2||C_{\max}\) (two identical machines), an FPTAS exists via the subset-sum FPTAS: round all processing times to multiples of \(\delta = \varepsilon \cdot \sum p_j / n\), solve the rounded problem optimally by DP in \(O(n^2/\varepsilon)\) time, and the solution approximates the original. For \(P||C_{\max}\) with general \(m\), only a PTAS is known (since the problem is strongly NP-hard, and strongly NP-hard problems cannot have an FPTAS unless P = NP).
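The exact pseudo-polynomial DP that the \(P_2||C_{\max}\) FPTAS rounds can be written as a subset-sum sweep over reachable loads of one machine; this is a minimal sketch of the unrounded version:

```python
# Exact DP for P2||Cmax: the set of reachable loads of machine 1 is a set of
# subset sums; the best split minimises max(load, total - load).
# Pseudo-polynomial: O(n * sum(p)) states.
def p2_cmax(p):
    total = sum(p)
    reachable = {0}                       # loads of machine 1 realisable so far
    for pj in p:
        reachable |= {s + pj for s in reachable}
    return min(max(s, total - s) for s in reachable)

assert p2_cmax([5, 4, 3, 3, 3]) == 9     # split {5, 4} vs {3, 3, 3}
```

The FPTAS in the remark runs this same recursion on processing times rounded to multiples of \(\delta\), shrinking the state space from \(\sum p_j\) to \(O(n/\varepsilon)\) distinct sums.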

11.4 Critical Path and the Project Scheduling Perspective

Scheduling with precedence constraints (\(\text{prec}\)) arises naturally in project management. The critical path method (CPM) solves the deterministic project scheduling problem optimally.

Definition (Activity-on-arc network): Model a project as a directed acyclic graph (DAG) where each arc \((i,j)\) represents an activity with duration \(d_{ij}\) and nodes represent events (milestones). The critical path is the longest path from the source (project start) to the sink (project completion). Its length is the minimum project duration.
Example (Critical path on a 6-activity project): For hand computation it is convenient to pass to the equivalent activity-on-node view, in which each activity is a node carrying its duration and arcs encode precedences. Consider activities \(A, B, C, D, E, F\) with durations \(3, 4, 2, 5, 3, 1\) days and precedence structure \(A \to C\), \(A \to D\), \(B \to D\), \(B \to E\), \(C \to F\), \(D \to F\), \(E \to F\). Compute earliest start times (EST) by a forward pass: \(\text{EST}(A) = 0\), \(\text{EST}(B) = 0\), \(\text{EST}(C) = 3\) (after \(A\)), \(\text{EST}(D) = \max(3, 4) = 4\) (after \(A\) and \(B\)), \(\text{EST}(E) = 4\) (after \(B\)), \(\text{EST}(F) = \max(3+2, 4+5, 4+3) = \max(5, 9, 7) = 9\). Project duration = \(9 + 1 = 10\) days.

The critical path is \(B \to D \to F\) (durations \(4 + 5 + 1 = 10\)), confirming the minimum project duration. Activities on the critical path have zero float — any delay in \(B\), \(D\), or \(F\) directly delays project completion. Activities \(A, C, E\) have positive float (slack), meaning they can be delayed without affecting the project end date.
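The forward pass is a longest-path computation in a DAG processed in topological order; a minimal sketch reproducing the example's numbers:

```python
# CPM forward pass on the 6-activity example (activity-on-node form):
# EST(v) = max over predecessors u of EST(u) + dur(u).
dur = {"A": 3, "B": 4, "C": 2, "D": 5, "E": 3, "F": 1}
preds = {"A": [], "B": [], "C": ["A"], "D": ["A", "B"], "E": ["B"],
         "F": ["C", "D", "E"]}

est = {}
for v in ["A", "B", "C", "D", "E", "F"]:          # a topological order
    est[v] = max((est[u] + dur[u] for u in preds[v]), default=0)

finish = max(est[v] + dur[v] for v in dur)        # minimum project duration
assert est["F"] == 9 and finish == 10
```

A symmetric backward pass (latest start times from the project end) yields the floats, identifying \(B \to D \to F\) as the zero-float critical path.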

The connection to scheduling: CPM solves the project scheduling problem with unlimited resources — effectively \(P\infty|\text{prec}|C_{\max}\) — in \(O(n+m)\) time, a longest-path computation in a DAG. (On a single machine the makespan is trivially \(\sum_j p_j\), so precedence constraints matter only for other objectives.) The single-machine problems \(1|\text{prec}|L_{\max}\) and \(1|\text{prec}|\sum w_j C_j\) are considerably harder: Lawler’s algorithm solves \(1|\text{prec}|L_{\max}\) in \(O(n^2)\), while weighted completion-time criteria with precedence constraints are strongly NP-hard.

11.5 Parametric and Sensitivity Analysis

In practice, scheduling decisions are made repeatedly in a changing environment. Sensitivity analysis asks: how much can problem parameters change before the optimal schedule changes? This is dual to standard LP sensitivity analysis.

For the WSPT rule (\(1||\sum w_j C_j\)), the optimal sequence is determined by the ratios \(w_j / p_j\). A perturbation of job \(j\)’s weight by \(\delta\) changes its position in the optimal order only if the perturbed ratio crosses the ratio of an adjacent job in the sequence. Formally, if job \(j\) is optimally scheduled immediately before job \(k\) (so \(w_j/p_j \geq w_k/p_k\)), the ordering remains optimal provided \(w_j / p_j + \delta/p_j \geq w_k/p_k\), i.e., \(\delta \geq p_j(w_k/p_k - w_j/p_j)\). This gives a per-job stability range for the weights.

Example (WSPT stability range): Suppose three jobs have \((p_j, w_j) = (2, 4), (3, 6), (1, 1)\), giving WSPT ratios \(2, 2, 1\). Jobs 1 and 2 tie (both ratio 2), so either ordering is optimal; job 3 goes last. If the weight of job 3 increases to \(w_3 = 3\), its ratio becomes 3, and it should now be scheduled first. The critical weight for job 3 to overtake jobs 1 and 2 is \(w_3 / 1 \geq 2\), i.e., \(w_3 \geq 2\). Sensitivity analysis identifies this threshold without re-solving the problem from scratch.
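A quick check of the example's threshold behaviour (0-based indices, so "job 3" is index 2):

```python
# WSPT sensitivity: the ratios w_j / p_j determine the order, so a weight
# perturbation changes the schedule only when a ratio crosses a neighbour's.
jobs = [(2, 4), (3, 6), (1, 1)]                    # (p_j, w_j), ratios 2, 2, 1
ratio = lambda p, w: w / p

order = sorted(range(3), key=lambda j: -ratio(*jobs[j]))
assert order[-1] == 2            # job 3 last: its ratio 1 is below threshold 2

jobs[2] = (1, 3)                 # raise w_3 past the critical weight w_3 >= 2
order = sorted(range(3), key=lambda j: -ratio(*jobs[j]))
assert order[0] == 2             # ratio 3 > 2: job 3 moves to the front
```

Only the two comparisons against the neighbouring ratios were recomputed; no full re-solve is needed, which is the point of the stability range.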

Chapter 12: Open Problems and Research Frontiers

Several fundamental questions in scheduling theory remain open:

  1. The landscape around \(1|r_j|\sum C_j\): The non-preemptive problem is NP-hard in the strong sense (Lenstra, Rinnooy Kan and Brucker, 1977), while the unweighted preemptive version \(1|r_j, \text{pmtn}|\sum C_j\) is solved optimally by SRPT in polynomial time. The weighted preemptive problem \(1|r_j, \text{pmtn}|\sum w_j C_j\) is again NP-hard. Approximation schemes exist for these hard variants (Afrati et al., 1999), but pinning down the best running-time versus approximation-quality trade-offs, and the exact boundary between polynomial and hard special cases in this family, remains an active question.

  2. Approximation ratio for \(J_m||C_{\max}\): The general job shop has no known constant-factor approximation algorithm; the best known algorithms achieve polylogarithmic approximation factors (Shmoys, Stein and Wein, with subsequent refinements). Closing the gap to a constant factor, or proving a matching inapproximability lower bound, is a major open problem.

  3. Online scheduling lower bounds: For online makespan minimisation on \(m\) identical machines, the best known bounds place the optimal competitive ratio for large \(m\) between roughly \(1.88\) (lower bound) and \(1.92\) (upper bound; Fleischer and Wahl, improving Albers' \(1.923\)). Closing this gap is open.

  4. Stochastic vs. deterministic gaps: For most objectives, the best stochastic scheduling policy is not known in closed form for general (non-exponential) distributions. Understanding how much the optimal expected cost exceeds the optimal deterministic cost (the “price of stochasticity”) in terms of distributional parameters is an active research area.

These questions connect scheduling to broader themes in computational complexity, approximation algorithms, and probability theory. The field remains one of the most mathematically rich and practically relevant areas of combinatorial optimisation.


Appendix: Reduction Templates for Scheduling NP-Hardness Proofs

Most NP-hardness reductions for scheduling follow one of three templates, each translating a known NP-hard problem into a scheduling instance.

Template 1: Reduction from PARTITION. To prove \(P_2||C_{\max}\) is weakly NP-hard, reduce from PARTITION: given integers \(a_1, \ldots, a_n\) with \(\sum a_i = 2B\), construct jobs with \(p_j = a_j\) on two machines and ask if \(C_{\max} \leq B\). A perfect partition corresponds to a balanced schedule. This template yields weak NP-hardness (the problem is polynomial when processing times are bounded polynomially in \(n\)).
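The correspondence can be brute-forced on a tiny hypothetical instance:

```python
from itertools import product

# Template 1 sketch: a PARTITION instance mapped to the P2||Cmax decision
# question "is there a schedule with Cmax <= B?", checked by enumeration.
a = [3, 1, 1, 2, 3]              # sum = 10, so B = 5 (hypothetical instance)
B = sum(a) // 2

def has_schedule_with_cmax_at_most_B():
    for assign in product((0, 1), repeat=len(a)):   # machine of each job
        loads = [sum(x for x, m in zip(a, assign) if m == i) for i in (0, 1)]
        if max(loads) <= B:
            return True
    return False

assert has_schedule_with_cmax_at_most_B()   # witness: 3 + 2 = 1 + 1 + 3 = 5
```

Since both machine loads sum to \(2B\), a schedule with \(C_{\max} \leq B\) forces both loads to equal \(B\) exactly, which is precisely a perfect partition.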

Template 2: Reduction from 3-PARTITION. To prove strong NP-hardness, reduce from 3-PARTITION (NP-hard in the strong sense): given \(3m\) integers \(a_i \in (B/4, B/2)\) summing to \(mB\), ask if they partition into \(m\) triples each summing to \(B\). Construct \(m+1\) jobs per “slot” such that each machine in the schedule must fit exactly 3 jobs per unit of time. This template yields strong NP-hardness — no pseudo-polynomial algorithm exists unless P = NP.

Template 3: Reduction from DIRECTED HAMILTONIAN PATH. To prove NP-hardness of problems with precedence constraints (e.g., \(1|\text{prec}|\sum w_j C_j\)), construct a job for each arc and node of the Hamiltonian instance. Precedence constraints encode the graph structure; the optimal schedule corresponds to a Hamiltonian path. Since a Hamiltonian path instance contains no numerical parameters, all numbers in the constructed scheduling instance can be kept polynomially bounded, so hardness obtained this way is automatically NP-hardness in the strong sense.

Remark (Pseudo-polynomial vs. strongly NP-hard): A scheduling problem is weakly NP-hard if it is NP-hard but admits a pseudo-polynomial time algorithm (running time polynomial in \(n\) and the magnitude of the largest input number). It is strongly NP-hard if it remains NP-hard even when all numerical parameters are bounded by a polynomial in \(n\). Template 1 typically gives weak NP-hardness, so an FPTAS may exist; Templates 2 and 3 give strong NP-hardness, ruling out pseudo-polynomial algorithms and (for polynomially bounded objectives) any FPTAS unless P = NP. Knowing which category a problem falls into determines the type of exact algorithm or approximation scheme to pursue: strongly NP-hard problems admit at best a PTAS, while weakly NP-hard problems often admit an FPTAS via rounding and dynamic programming.