MTE 201: Experimental Measurement and Statistical Analysis
Hongxia Yang
Estimated study time: 1 hr 5 min
Sources and References
Primary reference — D.C. Montgomery, G.C. Runger, N.F. Hubele, Engineering Statistics, 5th ed., Wiley, 2011.
Online resources — MIT OCW 18.650 Statistics for Applications, NIST/SEMATECH e-Handbook of Statistical Methods, Khan Academy Statistics and Probability, BIPM Guide to Expression of Uncertainty in Measurement (GUM).
Chapter 1: Descriptive Statistics and Data Visualization
1.1 The Role of Statistics in Engineering
Modern engineering is inseparable from data. Every manufactured component has dimensional variation; every measurement carries noise; every physical process exhibits run-to-run fluctuation. Statistical thinking provides a principled language for quantifying that variability, deciding whether an observed difference is real or random, and designing experiments that extract maximum information at minimum cost.
The workflow of an engineering study follows a recurring cycle: formulate a question, design an experiment, collect data, summarize and visualize, fit a model, make an inference, and communicate conclusions. This course addresses each step with rigour appropriate to mechatronics practice — from reading a pressure transducer to comparing two manufacturing processes with a hypothesis test.
1.2 Populations, Samples, and Variables
A population is the entire collection of objects or outcomes of interest. A sample is the subset actually observed. Because populations are usually infinite or impractically large, statistical inference draws conclusions about population parameters from sample statistics.
Variables are classified by the type of values they take:
- Continuous: temperature, voltage, dimension — can take any value in an interval.
- Discrete: defect count, number of cycles to failure — takes countable values.
- Categorical / nominal: supplier identity, material grade — no natural ordering.
1.3 Measures of Location
Let \( x_1, x_2, \ldots, x_n \) be a sample of \( n \) observations.
The sample mean
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]
is the most common measure of centre. It is sensitive to extreme values (outliers).
1.4 Measures of Spread
The sample variance
\[ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \]
and its square root, the sample standard deviation \( s \), quantify spread about the mean. The divisor \( n - 1 \) (rather than \( n \)) makes \( s^2 \) an unbiased estimator of the population variance \( \sigma^2 \).
1.5 Stem-and-Leaf Diagrams
A stem-and-leaf diagram retains the actual data values while providing a visual impression of shape and spread. Each observation is split into a stem (leading digit(s)) and a leaf (trailing digit). Stems are listed vertically; leaves fan out horizontally. Because no information is lost, stem-and-leaf plots are ideal for small-to-medium data sets (roughly \( n \leq 100 \)).
Example — twenty resistance measurements (\(\Omega\)): 98.2, 99.1, 100.4, 101.3, 99.8, 98.7, 100.0, 101.1, 99.5, 100.6, 98.9, 100.2, 101.7, 99.3, 100.8, 98.4, 100.5, 101.0, 99.7, 100.1
Stems are the integer part (98, 99, 100, 101). Leaves are the tenths digit:
| Stem | Leaves |
|---|---|
| 98 | 2 4 7 9 |
| 99 | 1 3 5 7 8 |
| 100 | 0 1 2 4 5 6 8 |
| 101 | 0 1 3 7 |
The distribution is roughly symmetric, with the mode in the 100-\(\Omega\) stem and no obvious outliers. The sample mean is \( \bar{x} \approx 100.0 \, \Omega \) and the sample standard deviation \( s \approx 0.99 \, \Omega \).
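These summary statistics are quick to verify with Python's standard library:

```python
import statistics

# The 20 resistance measurements (ohm) from the stem-and-leaf example
data = [98.2, 99.1, 100.4, 101.3, 99.8, 98.7, 100.0, 101.1, 99.5, 100.6,
        98.9, 100.2, 101.7, 99.3, 100.8, 98.4, 100.5, 101.0, 99.7, 100.1]

xbar = statistics.mean(data)    # sample mean
s = statistics.stdev(data)      # sample standard deviation (n - 1 divisor)
med = statistics.median(data)   # sample median
print(f"mean = {xbar:.2f}, s = {s:.2f}, median = {med:.2f}")
```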
1.6 Histograms
A histogram partitions the data range into \( k \) class intervals (bins) of equal width \( h \) and plots the frequency or relative frequency of observations in each bin. Choice of \( k \) matters: too few bins hide shape; too many introduce noise. Sturges’ rule suggests \( k \approx 1 + \log_2 n \).
Histograms reveal modality, skewness, and the presence of gaps or outliers. For continuous data, the area of each bar (not just height) is proportional to relative frequency.
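Sturges' rule is a one-liner; rounding up gives a usable integer bin count:

```python
import math

def sturges_bins(n: int) -> int:
    """Suggested number of histogram bins by Sturges' rule, k ~ 1 + log2(n)."""
    return math.ceil(1 + math.log2(n))

print(sturges_bins(20), sturges_bins(100), sturges_bins(1000))
```

For the 20 resistance measurements above it suggests about 6 bins; the rule is only a starting point and should be adjusted by eye.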
1.7 Box-and-Whisker Plots
A box plot summarises the data with:
- A box spanning \( Q_1 \) to \( Q_3 \) (the interquartile range, \( \mathrm{IQR} = Q_3 - Q_1 \)).
- A line inside the box at the median \( Q_2 \).
- Whiskers extending to the most extreme observations within \( 1.5 \times \mathrm{IQR} \) of the quartiles.
- Points beyond the whiskers plotted individually as suspected outliers.
Box plots excel at side-by-side comparison of multiple groups, making them indispensable in factorial experiments.
Chapter 2: Probability and Random Variables
2.1 Axioms of Probability
Let \( S \) be the sample space of an experiment and \( A \subseteq S \) an event.
- \( P(A) \geq 0 \) for every event \( A \).
- \( P(S) = 1 \).
- For mutually exclusive events \( A_1, A_2, \ldots \): \( P\!\left(\bigcup_{i} A_i\right) = \sum_i P(A_i) \).
Derived results include \( P(A^c) = 1 - P(A) \) and, for arbitrary events, the addition rule \( P(A \cup B) = P(A) + P(B) - P(A \cap B) \).
The conditional probability of \( A \) given \( B \) is \( P(A \mid B) = P(A \cap B)/P(B) \) for \( P(B) > 0 \). Two events are independent if \( P(A \cap B) = P(A)\,P(B) \). Equivalently, \( P(A \mid B) = P(A) \): knowing \( B \) occurred does not change the probability of \( A \).
2.2 Random Variables
A random variable \( X \) assigns a numerical value to each outcome in the sample space. Random variables are discrete or continuous according to the values they take.
2.3 Probability Distributions for Continuous Random Variables
A continuous random variable \( X \) is described by a probability density function (pdf) \( f(x) \) with the properties:
- \( f(x) \geq 0 \) for all \( x \).
- \( \displaystyle\int_{-\infty}^{\infty} f(x)\, dx = 1 \).
- \( \displaystyle P(a \leq X \leq b) = \int_a^b f(x)\, dx \).
The cumulative distribution function (CDF) is \( F(x) = P(X \leq x) = \int_{-\infty}^{x} f(t)\, dt \). The CDF is non-decreasing, right-continuous, with \( F(-\infty) = 0 \) and \( F(\infty) = 1 \).
2.4 Expected Value and Variance
For any function \( g(X) \):
\[ E\left[g(X)\right] = \int_{-\infty}^{\infty} g(x)\, f(x)\, dx. \]
In particular, the mean is \( \mu = E[X] \) and the variance is \( \sigma^2 = \mathrm{Var}(X) = E\left[(X - \mu)^2\right] \). The standard deviation \( \sigma = \sqrt{\sigma^2} \) is in the same units as \( X \).
Key linearity property: for constants \( a, b \),
\[ E\left[aX + b\right] = a\mu + b, \qquad \mathrm{Var}(aX + b) = a^2 \sigma^2. \]
2.5 The Normal Distribution
A normal random variable \( X \sim N(\mu, \sigma^2) \) has density
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), \quad -\infty < x < \infty, \]
so the distribution is fully characterised by its mean \( \mu \) and variance \( \sigma^2 \). The standard normal \( Z \sim N(0,1) \) has CDF \( \Phi(z) \) tabulated in every statistics text. Any normal variable is standardised by
\[ Z = \frac{X - \mu}{\sigma}. \]
About 4.55 % of shafts fall outside specification. The engineer should investigate whether tightening the process standard deviation below 0.04 mm is economically feasible.
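The 4.55 % figure can be checked with the error function, assuming (as the number implies) that the specification limits sit at \( \mu \pm 2\sigma \):

```python
from math import erf, sqrt

def phi(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Fraction of production falling outside mu +/- 2*sigma
p_out = 2.0 * (1.0 - phi(2.0))
print(f"{100 * p_out:.2f} %")   # about 4.55 %
```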
Chapter 3: Common Probability Distributions
3.1 Continuous Distributions
3.1.1 Exponential Distribution
An exponential random variable has density
\[ f(x) = \lambda e^{-\lambda x}, \quad x \geq 0, \]
with \( E[X] = 1/\lambda \) and \( \mathrm{Var}(X) = 1/\lambda^2 \). The parameter \( \lambda > 0 \) is the rate.
The exponential distribution models time between events in a Poisson process and has the memoryless property: \( P(X > s + t \mid X > s) = P(X > t) \). It is widely used for electronic component lifetimes and inter-arrival times in queuing.
3.1.2 Uniform Distribution
A uniform random variable \( X \sim U(a, b) \) has density \( f(x) = 1/(b-a) \) for \( a \leq x \leq b \), with \( E[X] = (a+b)/2 \) and \( \mathrm{Var}(X) = (b-a)^2/12 \). Round-off error in digital measurement is often modelled as \( U(-h/2, h/2) \), where \( h \) is the value of the least significant digit.
3.1.3 Weibull Distribution
A Weibull random variable has CDF
\[ F(x) = 1 - \exp\!\left[-\left(\frac{x}{\delta}\right)^{\beta}\right], \quad x \geq 0, \]
where \( \beta > 0 \) is the shape parameter and \( \delta > 0 \) is the scale parameter.
The Weibull hazard rate \( h(x) = (\beta/\delta)(x/\delta)^{\beta-1} \) is increasing for \( \beta > 1 \) (wear-out), constant for \( \beta = 1 \) (exponential; random failures), and decreasing for \( \beta < 1 \) (infant mortality). This makes it the dominant distribution in reliability engineering.
3.2 Discrete Distributions
3.2.1 Binomial Distribution
If \( X \sim B(n, p) \) counts successes in \( n \) independent trials, each with success probability \( p \), then
\[ P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0, 1, \ldots, n, \]
with \( E[X] = np \) and \( \mathrm{Var}(X) = np(1-p) \).
With \( X \sim B(25, 0.03) \):
\[ P(X \leq 1) = P(X=0) + P(X=1) = (0.97)^{25} + 25(0.03)(0.97)^{24} \approx 0.4670 + 0.3611 = 0.8281. \]
There is roughly an 83 % chance of observing at most one defective board.
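The binomial tail can be evaluated exactly in a couple of lines:

```python
from math import comb

n, p = 25, 0.03
# P(X <= 1) = P(X = 0) + P(X = 1) for X ~ B(25, 0.03)
p_le_1 = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(2))
print(f"P(X <= 1) = {p_le_1:.4f}")   # about 0.8280
```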
3.2.2 Poisson Distribution
A Poisson random variable with rate \( \lambda > 0 \) has
\[ P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}, \quad k = 0, 1, 2, \ldots, \]
and \( E[X] = \mathrm{Var}(X) = \lambda \). The equality of mean and variance is a diagnostic signature of Poisson data.
The Poisson distribution is also an approximation to \( B(n, p) \) when \( n \) is large, \( p \) is small, and \( \lambda = np \) is moderate — the rare-event approximation.
3.2.3 Normal Approximation for Discrete Data
When \( n \) is large and \( p \) is not too close to 0 or 1, the binomial \( B(n, p) \) is well approximated by \( N(np, np(1-p)) \). The continuity correction improves accuracy: replace \( P(X \leq k) \) by \( P(Y \leq k + 0.5) \) where \( Y \sim N(np, np(1-p)) \). A common rule of thumb is that \( np \geq 5 \) and \( n(1-p) \geq 5 \) are required for the approximation to be reasonable.
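The accuracy of the continuity-corrected approximation is easy to check numerically; the values \( n = 100 \), \( p = 0.3 \), \( k = 25 \) below are illustrative choices, not from the text:

```python
from math import comb, erf, sqrt

n, p, k = 100, 0.3, 25
exact = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

mu, sd = n * p, sqrt(n * p * (1 - p))
# Continuity correction: P(X <= k) ~ P(Y <= k + 0.5), Y ~ N(mu, sd^2)
approx = 0.5 * (1.0 + erf((k + 0.5 - mu) / (sd * sqrt(2.0))))
print(f"exact = {exact:.4f}, normal approx = {approx:.4f}")
```

Here \( np = 30 \) and \( n(1-p) = 70 \), well inside the rule of thumb, and the two values agree to about two decimal places.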
Chapter 4: Sampling Distributions and the Central Limit Theorem
4.1 Sampling and the Sample Mean
When we draw a random sample \( X_1, X_2, \ldots, X_n \) from a population with mean \( \mu \) and variance \( \sigma^2 \), the sample mean
\[ \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i \]
is itself a random variable. Its distribution — called the sampling distribution — is the key link between a single dataset and the population from which it came.
Because \( E[\bar{X}] = \mu \) and \( \mathrm{Var}(\bar{X}) = \sigma^2/n \), the standard deviation \( \sigma/\sqrt{n} \) is called the standard error of the mean.
4.2 The Central Limit Theorem
Central Limit Theorem: if \( X_1, \ldots, X_n \) are independent and identically distributed with mean \( \mu \) and finite variance \( \sigma^2 \), then
\[ Z_n = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \xrightarrow{d} N(0, 1) \quad \text{as } n \to \infty. \]
Equivalently, \( \bar{X} \) is approximately \( N(\mu, \sigma^2/n) \) for large \( n \), regardless of the shape of the original population distribution.
Proof sketch (assuming the MGF exists). Let \( Y_i = (X_i - \mu)/\sigma \), so that \( E[Y_i] = 0 \), \( \mathrm{Var}(Y_i) = 1 \), and \( Z_n = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} Y_i \). By independence, \( M_{Z_n}(t) = \left[M_Y(t/\sqrt{n})\right]^n \). Expanding \( M_Y(t/\sqrt{n}) \) in a Taylor series around 0 using the known cumulants \( \kappa_1 = 0, \kappa_2 = 1 \):
\[ M_Y\!\left(\frac{t}{\sqrt{n}}\right) = 1 + \frac{t^2}{2n} + O\!\left(n^{-3/2}\right). \]
Therefore
\[ M_{Z_n}(t) = \left(1 + \frac{t^2}{2n} + O(n^{-3/2})\right)^n \to e^{t^2/2} \quad \text{as } n \to \infty, \]
which is the MGF of \( N(0, 1) \). By the continuity theorem for MGFs, \( Z_n \xrightarrow{d} N(0,1) \). \( \square \)
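A short simulation makes the theorem concrete: sample means of a strongly skewed population (exponential with \( \mu = \sigma = 1 \)) cluster around 1 with standard error \( 1/\sqrt{n} \):

```python
import random
import statistics

random.seed(0)
n, reps = 50, 2000
# Each entry is the mean of n = 50 exponential(1) draws; despite the
# skewed population, the means concentrate near mu = 1 with SE ~ 0.141
means = [statistics.mean(random.expovariate(1.0) for _ in range(n))
         for _ in range(reps)]
print(f"mean of means = {statistics.mean(means):.3f}, "
      f"sd of means = {statistics.stdev(means):.3f}")
```

A histogram of `means` would look close to normal even though the individual draws are far from it.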
4.3 Small Samples and the t-Distribution
When the population variance \( \sigma^2 \) is unknown — which is the usual situation — and the sample size is small (\( n < 30 \)), the standardised sample mean follows a t-distribution rather than a standard normal.
If \( Z \sim N(0,1) \) and \( V \sim \chi^2_\nu \) are independent, then \( T = Z/\sqrt{V/\nu} \) follows the t-distribution with \( \nu \) degrees of freedom. For a random sample from \( N(\mu, \sigma^2) \):
\[ T = \frac{\bar{X} - \mu}{S / \sqrt{n}} \sim t_{n-1}. \]
The t-distribution is bell-shaped and symmetric about zero, but has heavier tails than the standard normal. As \( \nu \to \infty \), \( t_\nu \to N(0,1) \).
4.4 Probability Plotting
A normal probability plot (quantile-quantile or Q-Q plot) assesses whether data come from a normal distribution. The sorted sample quantiles are plotted against the corresponding theoretical quantiles of a standard normal. Approximate linearity supports normality; systematic curves indicate skewness or heavy tails. Outliers appear as isolated points far from the line.
For Weibull data, a similar linearisation is possible: if \( X \sim \mathrm{Weibull}(\beta, \delta) \), then \( \ln(-\ln(1 - F(x))) = \beta \ln x - \beta \ln \delta \), which is linear in \( \ln x \). Plotting \( \ln(-\ln(1-\hat{F})) \) vs. \( \ln x \) (Weibull probability paper) yields a straight line whose slope estimates \( \beta \).
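The Weibull linearisation can be sketched numerically: generate Weibull data by inverse-CDF sampling, assign the common median-rank plotting positions \( \hat{F}_i = (i - 0.3)/(n + 0.4) \) (one standard convention among several), and fit a line whose slope recovers \( \beta \):

```python
import math
import random

random.seed(1)
beta, delta = 2.0, 100.0   # true Weibull parameters (illustrative)
# Inverse-CDF sampling: X = delta * (-ln(1 - U))^(1/beta)
xs = sorted(delta * (-math.log(1.0 - random.random())) ** (1.0 / beta)
            for _ in range(200))

n = len(xs)
F = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]   # median ranks
X = [math.log(x) for x in xs]                  # ln x
Y = [math.log(-math.log(1.0 - f)) for f in F]  # ln(-ln(1 - F))

# OLS slope of Y on X estimates the shape parameter beta
xb, yb = sum(X) / n, sum(Y) / n
slope = (sum((xi - xb) * (yi - yb) for xi, yi in zip(X, Y))
         / sum((xi - xb) ** 2 for xi in X))
print(f"estimated shape beta = {slope:.2f}")
```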
4.5 Joint Probability Distributions
The covariance \( \mathrm{Cov}(X, Y) = E\left[(X - \mu_X)(Y - \mu_Y)\right] \) measures linear association between two random variables. The Pearson correlation coefficient \( \rho = \mathrm{Cov}(X,Y) / (\sigma_X \sigma_Y) \) satisfies \( -1 \leq \rho \leq 1 \). If \( X \) and \( Y \) are independent, then \( \mathrm{Cov}(X,Y) = 0 \), but the converse is not generally true.
A critical property for the CLT and regression: for independent random variables,
\[ \mathrm{Var}\!\left(\sum_{i=1}^n a_i X_i\right) = \sum_{i=1}^n a_i^2\, \mathrm{Var}(X_i). \]
Chapter 5: Measurement Systems and Experimental Uncertainty
5.1 Overview of Measurement Systems
A measurement system converts a physical quantity into a signal that can be read, recorded, or processed. Every system consists of a sensor (transduces the physical quantity), a signal conditioner (amplifies, filters, or converts), and a readout (display, data acquisition system, or actuator). Understanding the behaviour and limitations of each stage is prerequisite to interpreting data correctly.
Key performance characteristics of a sensor include:
- Range: the interval of inputs over which it operates correctly.
- Sensitivity: change in output per unit change in input (\( K = \Delta \mathrm{output} / \Delta \mathrm{input} \)).
- Resolution: the smallest detectable input change.
- Accuracy: closeness of the indicated value to the true value.
- Precision (repeatability): closeness of repeated measurements under unchanged conditions.
- Linearity: degree to which sensitivity is constant across the range.
- Hysteresis: difference in output for the same input depending on direction of approach.
5.2 Temperature Measurement
5.2.1 Thermocouples
A thermocouple exploits the Seebeck effect: two dissimilar metals joined at a measurement junction produce an electromotive force (EMF) proportional to the temperature difference between the measurement junction and the reference junction (typically held at 0°C). Common types include J (iron/constantan, 0–760°C), K (chromel/alumel, −200–1260°C), and T (copper/constantan, −200–370°C).
The relationship between EMF \( E \) and temperature \( T \) is nonlinear (polynomial of degree 5–9 in ITS-90 reference tables). Over narrow ranges, a linear approximation \( E \approx S \cdot T \) holds, where \( S \) (the Seebeck coefficient, in \(\mu\mathrm{V}/^\circ\mathrm{C}\)) is approximately constant.
5.2.2 Resistance Temperature Detectors (RTDs)
The electrical resistance of a metal varies with temperature. For platinum RTDs (PT100: \( R_0 = 100 \, \Omega \) at 0°C):
\[ R(T) = R_0 \left(1 + AT + BT^2\right), \]where for platinum \( A = 3.9083 \times 10^{-3} \, ^\circ\mathrm{C}^{-1} \) and \( B = -5.775 \times 10^{-7} \, ^\circ\mathrm{C}^{-2} \). RTDs are more stable and accurate than thermocouples but slower and more fragile.
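The Callendar–Van Dusen polynomial above (valid for \( T \geq 0\,^\circ\mathrm{C} \)) is trivial to evaluate; a PT100 should read close to the standard 138.51 Ω at 100°C:

```python
# Callendar-Van Dusen resistance for a PT100 above 0 degC
A = 3.9083e-3    # 1/degC
B = -5.775e-7    # 1/degC^2
R0 = 100.0       # ohm at 0 degC

def pt100_resistance(t_c: float) -> float:
    """R(T) = R0 * (1 + A*T + B*T^2) for T >= 0 degC."""
    return R0 * (1.0 + A * t_c + B * t_c**2)

print(pt100_resistance(100.0))   # about 138.51 ohm
```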
5.2.3 Thermistors
Thermistors are semiconductor devices with a large negative (NTC) or positive (PTC) temperature coefficient of resistance. Their response is highly nonlinear, often modelled by the Steinhart-Hart equation:
\[ \frac{1}{T} = A_0 + A_1 \ln R + A_3 (\ln R)^3, \]
where \( T \) is the absolute temperature (K) and \( R \) the resistance. High sensitivity makes thermistors ideal for narrow-range precision measurements (e.g., body temperature), but their nonlinearity and self-heating demand careful calibration.
5.3 Pressure Measurement
Pressure is defined as force per unit area. Gauge pressure \( p_g \) is measured relative to atmospheric; absolute pressure \( p_{abs} = p_g + p_{atm} \); differential pressure is the difference between two points.
Common transducer technologies include:
- Bourdon tube: elastic tube that straightens under pressure; displacement drives a pointer or LVDT. Rugged but limited to static or slowly varying pressures.
- Piezoresistive (strain-gauge) transducers: pressure deflects a thin diaphragm whose strain is sensed by a Wheatstone bridge. High frequency response; widely used in data acquisition.
- Piezoelectric transducers: dynamic pressure generates a charge; suitable for high-frequency measurements (combustion, acoustics) but cannot measure static pressure.
5.4 Displacement and Strain Measurement
Strain gauges exploit the piezoresistive effect in a foil grid bonded to a surface. The gauge factor \( GF \) relates relative resistance change to strain \( \varepsilon \):
\[ \frac{\Delta R}{R} = GF \cdot \varepsilon, \qquad GF \approx 2 \text{ for metallic gauges.} \]
A Wheatstone bridge converts the small resistance change to a measurable voltage. Temperature compensation is achieved by using dummy gauges in an adjacent bridge arm.
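For a quarter bridge (one active gauge), the small-strain linearised relation is \( V_{\mathrm{out}}/V_{\mathrm{ex}} \approx GF\,\varepsilon/4 \); a minimal sketch under that assumption:

```python
def quarter_bridge_strain(v_out: float, v_ex: float, gf: float = 2.0) -> float:
    """Strain from a quarter-bridge output, small-strain linear
    approximation: v_out / v_ex ~ gf * strain / 4."""
    return 4.0 * v_out / (gf * v_ex)

# 2.5 mV output on a 5 V excitation with GF = 2 -> 1000 microstrain
eps = quarter_bridge_strain(2.5e-3, 5.0)
print(f"{1e6 * eps:.0f} microstrain")
```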
Linear Variable Differential Transformers (LVDTs) are AC-excited inductive sensors that produce a voltage output linearly proportional to core displacement over a specified range. They are highly linear, frictionless, and have infinite resolution.
5.5 Calibration
A calibration curve relates the instrument output \( y \) to the reference standard input \( x \). If the device is linear, the curve is \( y = a_0 + a_1 x \); least-squares regression (Chapter 7) determines \( a_0 \) and \( a_1 \). Calibration uncertainty must be combined with measurement uncertainty in the final result.
5.6 Systematic and Random Errors
Every measurement \( x_m \) deviates from the true value \( x_t \) according to:
\[ x_m = x_t + e_s + e_r, \]where \( e_s \) is the systematic error (bias) — a fixed, repeatable offset — and \( e_r \) is the random error — a zero-mean, varying component.
Systematic errors arise from sensor calibration offsets, zero drift, loading effects (e.g., inserting a thermometer perturbs the temperature field), electromagnetic interference, and quantisation. They cannot be reduced by averaging; they must be identified and corrected.
Random errors arise from electrical noise, mechanical vibration, air currents, and round-off. They follow a probability distribution (often normal with mean zero) and can be reduced by averaging \( n \) measurements: \( \sigma_{\bar{x}} = \sigma_x / \sqrt{n} \).
5.7 Uncertainty Analysis and Error Propagation
In many experiments, the quantity of interest \( y \) is not measured directly but computed from several directly measured quantities \( x_1, x_2, \ldots, x_k \) through a known functional relationship \( y = f(x_1, x_2, \ldots, x_k) \).
A first-order Taylor expansion about the nominal values gives
\[ \delta y \approx \sum_{i=1}^{k} \frac{\partial f}{\partial x_i}\, \delta x_i. \]
The partial derivatives are evaluated at the nominal values \( \bar{x}_1, \ldots, \bar{x}_k \).
Taking the variance of both sides and using independence (\( \mathrm{Cov}(\delta x_i, \delta x_j) = 0 \) for \( i \neq j \)):
\[ \mathrm{Var}(\delta y) \approx \sum_{i=1}^{k} \left(\frac{\partial f}{\partial x_i}\right)^2 \mathrm{Var}(\delta x_i) = \sum_{i=1}^{k} \left(\frac{\partial f}{\partial x_i}\right)^2 u_i^2. \]
Taking the square root gives the combined standard uncertainty \( u_y \). \( \square \)
So \( P = 36.0 \pm 0.85 \) W (approximately 2.4 % relative uncertainty).
Evaluating with \( d = 0.05 \) m and \( v = 2.00 \) m/s: \( Q = \frac{\pi d^2}{4}\, v = \frac{\pi (0.05)^2}{4}(2.00) \approx 3.927 \times 10^{-3} \) m\(^3\)/s. The relative uncertainty is dominated by the diameter term because of the square dependence — a lesson that precision in diameter measurement pays greater dividends than precision in velocity.
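Error propagation is easy to automate with finite-difference partials. The sketch below uses the power example; the voltage and current uncertainties are assumed for illustration (they happen to reproduce the 36.0 ± 0.85 W figure quoted above):

```python
from typing import Callable, Sequence

def combined_uncertainty(f: Callable[[Sequence[float]], float],
                         x: Sequence[float], u: Sequence[float],
                         h: float = 1e-6) -> float:
    """First-order (GUM) combined standard uncertainty, with the
    partial derivatives estimated by central differences."""
    total = 0.0
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        dfdx = (f(xp) - f(xm)) / (2.0 * h)
        total += (dfdx * u[i]) ** 2
    return total ** 0.5

# P = V*I with V = 12.0 +/- 0.2 V, I = 3.00 +/- 0.05 A (illustrative)
u_p = combined_uncertainty(lambda v: v[0] * v[1], [12.0, 3.0], [0.2, 0.05])
print(f"P = {12.0 * 3.0:.1f} +/- {u_p:.2f} W")
```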
Chapter 6: Hypothesis Testing and Confidence Intervals
6.1 Framework of Hypothesis Testing
A statistical hypothesis is a claim about a population parameter. Hypothesis testing provides a rule for deciding, based on sample data, whether the evidence is sufficient to reject the claim.
- Null hypothesis \( H_0 \): the default claim (e.g., \( \mu = \mu_0 \)). It is assumed true until evidence forces rejection.
- Alternative hypothesis \( H_1 \): the claim supported by rejection of \( H_0 \).
A one-sided test has \( H_1: \mu > \mu_0 \) or \( H_1: \mu < \mu_0 \). A two-sided test has \( H_1: \mu \neq \mu_0 \).
| Decision | \( H_0 \) is true | \( H_0 \) is false |
|---|---|---|
| Reject \( H_0 \) | Type I error (prob. \( \alpha \)) | Correct (power \( 1 - \beta \)) |
| Fail to reject \( H_0 \) | Correct (prob. \( 1 - \alpha \)) | Type II error (prob. \( \beta \)) |
\( \alpha \) is the significance level. \( \beta \) is the probability of missing a true effect. Power \( = 1 - \beta \) is the probability of correctly detecting a real difference.
6.2 Tests on a Single Mean
For \( H_0: \mu = \mu_0 \) with \( \sigma \) known, compute the test statistic
\[ z_0 = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}. \]
Reject \( H_0 \) if \( \left|z_0\right| > z_{\alpha/2} \), the upper \( \alpha/2 \) critical value of \( N(0,1) \).
When \( \sigma \) is unknown, compute \( t_0 = (\bar{x} - \mu_0)/(s/\sqrt{n}) \) and reject \( H_0: \mu = \mu_0 \) vs. \( H_1: \mu \neq \mu_0 \) if \( \left|t_0\right| > t_{\alpha/2,\, n-1} \).
6.3 The p-Value
The p-value is the probability, computed assuming \( H_0 \) is true, of obtaining a test statistic at least as extreme as the one observed. Reject \( H_0 \) when the p-value is at most \( \alpha \); a small p-value means the observed data would be unlikely if \( H_0 \) were true.
6.4 Type II Error and Sample Size Determination
The Type II error probability \( \beta \) depends on the true value \( \mu = \mu_0 + \delta \) (the actual departure from \( H_0 \)), the significance level \( \alpha \), and the sample size \( n \). For the two-sided z-test:
\[ \beta \approx \Phi\!\left(z_{\alpha/2} - \frac{\delta\sqrt{n}}{\sigma}\right) - \Phi\!\left(-z_{\alpha/2} - \frac{\delta\sqrt{n}}{\sigma}\right). \]
To achieve power \( 1 - \beta \) at a minimum detectable difference \( \delta = |\mu - \mu_0| \):
\[ n \approx \frac{(z_{\alpha/2} + z_\beta)^2 \sigma^2}{\delta^2}. \]
With 11 measurements, the test has 90 % power to detect a 2 N bias at the 5 % significance level.
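The sample-size formula is straightforward to implement; the standard normal quantile can be obtained by bisection on the CDF. The value \( \sigma = 2 \) N below is an assumption (it reproduces the \( n = 11 \) figure above):

```python
import math

def z_quantile(p: float) -> float:
    """Standard normal quantile by bisection on the CDF."""
    phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    lo, hi = -10.0, 10.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sample_size(alpha: float, power: float, sigma: float, delta: float) -> int:
    """n ~ (z_{alpha/2} + z_beta)^2 * sigma^2 / delta^2, rounded up."""
    za, zb = z_quantile(1.0 - alpha / 2.0), z_quantile(power)
    return math.ceil((za + zb) ** 2 * sigma**2 / delta**2)

print(sample_size(0.05, 0.90, 2.0, 2.0))   # 11
```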
6.5 Confidence Intervals
For \( \sigma \) known:
\[ \bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \leq \mu \leq \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}. \]
For \( \sigma \) unknown (use sample \( s \), \( t_{n-1} \) critical value):
\[ \bar{x} - t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}} \leq \mu \leq \bar{x} + t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}}. \]
Example: \( n = 8 \), \( \bar{x} = 28.21 \) MPa, \( s = 0.483 \) MPa, \( t_{0.025, 7} = 2.365 \).
\[ \bar{x} \pm t_{0.025,7} \frac{s}{\sqrt{n}} = 28.21 \pm 2.365 \cdot \frac{0.483}{\sqrt{8}} = 28.21 \pm 0.40. \]
The 95 % CI is \( \left[27.81, 28.61\right] \) MPa. If the design specification requires \( \mu \geq 28.0 \) MPa, the data do not yet demonstrate compliance: the interval extends below 28.0 MPa, so a true mean as low as 27.81 MPa remains plausible at this confidence level. A larger sample or reduced variability would be needed to establish compliance.
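The interval arithmetic is a two-liner once the tabulated critical value is known:

```python
from math import sqrt

n, xbar, s = 8, 28.21, 0.483
t_crit = 2.365               # t_{0.025, 7} from tables
half = t_crit * s / sqrt(n)  # half-width of the 95 % CI
print(f"95% CI: [{xbar - half:.2f}, {xbar + half:.2f}] MPa")
```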
6.6 Inference on Two Population Means
6.6.1 Paired t-Test
Used when each observation in one group is naturally paired with an observation in the other (e.g., before/after measurements on the same unit).
Form differences \( d_i = x_{1i} - x_{2i} \) and test \( H_0: \mu_D = 0 \) using \( t_0 = \bar{d} / (s_d / \sqrt{n}) \sim t_{n-1} \) under \( H_0 \).
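A quick numerical illustration of the paired procedure, using hypothetical before/after data (not from the text):

```python
import statistics

# Before/after measurements on the same 6 units (illustrative data)
before = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3]
after  = [11.7, 11.6, 12.0, 11.9, 11.5, 12.0]

d = [b - a for b, a in zip(before, after)]   # paired differences
n = len(d)
dbar = statistics.mean(d)
sd = statistics.stdev(d)
t0 = dbar / (sd / n**0.5)    # compare with t_{alpha/2, n-1}
print(f"dbar = {dbar:.2f}, t0 = {t0:.2f} on {n - 1} df")
```

Pairing removes the unit-to-unit variability, which is why the differences, not the raw samples, are tested.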
6.6.2 Two-Sample t-Test (Unpaired)
For independent samples from two populations, test \( H_0: \mu_1 = \mu_2 \).
Equal variances assumed (pooled estimate):
\[ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}, \qquad t_0 = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{1/n_1 + 1/n_2}} \sim t_{n_1 + n_2 - 2}. \]
Unequal variances (Welch’s test):
\[ t_0 = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}, \]with Welch-Satterthwaite approximate degrees of freedom
\[ \nu \approx \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{\frac{(s_1^2/n_1)^2}{n_1-1} + \frac{(s_2^2/n_2)^2}{n_2-1}}. \]
6.7 Confidence Intervals for Proportions
For a proportion \( p \) estimated from \( n \) observations with \( x \) successes, \( \hat{p} = x/n \):
\[ \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \leq p \leq \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}. \]
Valid when \( n\hat{p} \geq 5 \) and \( n(1-\hat{p}) \geq 5 \).
Chapter 7: Design of Experiments and Regression Analysis
7.1 Strategy of Experimentation
Engineering experiments rarely involve a single variable. A thoughtful experimental strategy controls nuisance factors, ensures balanced treatment of all factor combinations, and allows estimation of factor interactions. Key principles:
- Randomisation: run trials in random order to protect against lurking time-trends.
- Replication: replicate each treatment combination to estimate pure experimental error and increase power.
- Blocking: group experimental units into homogeneous blocks to reduce background variability.
7.2 The 2² Factorial Design
A \( 2^k \) factorial design involves \( k \) factors each at two levels, coded \( -1 \) (low) and \( +1 \) (high). A \( 2^2 \) design has four treatment combinations.
- Main effect of A: average change in \( y \) when \( A \) moves from low to high, averaged over levels of \( B \).
- Main effect of B: analogous.
- Interaction AB: difference in the effect of \( A \) at the two levels of \( B \). A non-zero interaction means that the effect of one factor depends on the level of the other.
With \( n \) replicates per cell and total \( N = 4n \) runs, the effects are estimated by contrasts and can be tested with an \( F \)-test in the ANOVA framework. Significant interactions render main-effect interpretations incomplete.
7.3 The 2³ Factorial Design and Response Surfaces
A \( 2^3 \) design adds a third factor \( C \), giving \( 8 \) treatment combinations. It estimates 7 effects: three main effects (\( A, B, C \)), three two-factor interactions (\( AB, AC, BC \)), and one three-factor interaction (\( ABC \)). A three-factor interaction is usually difficult to interpret physically and, if absent, the corresponding contrast estimates pure experimental error.
When a \( 2^k \) design indicates that a region of the experimental space is promising, response surface methodology adds centre points and axial runs to fit a second-order model:
\[ \hat{y} = \beta_0 + \sum_i \beta_i x_i + \sum_i \beta_{ii} x_i^2 + \sum_{i<j} \beta_{ij} x_i x_j. \]
Worked example (\( 2^2 \) design): factor A is temperature, factor B is injection speed, and the response \( \bar{y} \) is the mean part shrinkage (mm) in each cell.
| Run | A | B | \( \bar{y} \) |
|---|---|---|---|
| (1) | − | − | 0.80 |
| a | + | − | 1.20 |
| b | − | + | 0.95 |
| ab | + | + | 1.05 |
Main effect of A: \( \frac{1}{2}\left[(1.20 - 0.80) + (1.05 - 0.95)\right] = \frac{1}{2}(0.40 + 0.10) = 0.25 \) mm.
Main effect of B: \( \frac{1}{2}\left[(0.95 - 0.80) + (1.05 - 1.20)\right] = \frac{1}{2}(0.15 - 0.15) = 0.00 \) mm.
Interaction AB: \( \frac{1}{2}\left[(1.05 - 0.95) - (1.20 - 0.80)\right] = \frac{1}{2}(0.10 - 0.40) = -0.15 \) mm.
Temperature has a strong main effect; injection speed alone has no effect on average. The interaction means the effect of speed depends on temperature: at low temperature, higher speed increases shrinkage (0.80 to 0.95 mm), while at high temperature it decreases shrinkage (1.20 to 1.05 mm). An engineer minimising shrinkage should run low temperature and low speed; if forced to run high temperature, high speed reduces shrinkage.
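The effect estimates can be reproduced with the standard contrast formulas (the interaction contrast compares the effect of A at high B against the effect of A at low B):

```python
# Cell means of the 2x2 example: (1), a, b, ab
y1, ya, yb, yab = 0.80, 1.20, 0.95, 1.05

A  = 0.5 * ((ya - y1) + (yab - yb))   # main effect of temperature
B  = 0.5 * ((yb - y1) + (yab - ya))   # main effect of speed
AB = 0.5 * ((yab - yb) - (ya - y1))   # interaction contrast
print(f"A = {A:.2f}, B = {B:.2f}, AB = {AB:.2f}")
```

With this convention the interaction comes out negative, reflecting that the speed effect reverses sign between the two temperature levels.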
7.4 Simple Linear Regression
When a response \( y \) depends linearly on a single predictor \( x \), the model is
\[ Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad \varepsilon_i \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2), \quad i = 1, \ldots, n. \]
The unknown parameters are the intercept \( \beta_0 \), slope \( \beta_1 \), and error variance \( \sigma^2 \).
7.5 Least-Squares Estimation
The least-squares estimators minimise the sum of squared residuals \( \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 \). They are
\[ \hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \frac{S_{xy}}{S_{xx}}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}, \]
where \( S_{xy} = \sum(x_i - \bar{x})(y_i - \bar{y}) \) and \( S_{xx} = \sum(x_i - \bar{x})^2 \).
Proof: setting the partial derivatives of the sum of squares to zero yields the normal equations
\[ \sum_{i=1}^{n} y_i = n\hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^{n} x_i, \qquad \sum_{i=1}^{n} x_i y_i = \hat{\beta}_0 \sum_{i=1}^{n} x_i + \hat{\beta}_1 \sum_{i=1}^{n} x_i^2. \]
Dividing the first equation by \( n \): \( \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \). Substituting into the second equation and rearranging yields \( \hat{\beta}_1 = S_{xy}/S_{xx} \). \( \square \)
7.6 Properties of OLS Estimators and Inferences
The unbiased estimator of \( \sigma^2 \) is the mean squared error:
\[ \hat{\sigma}^2 = s^2 = \frac{\mathrm{SS}_E}{n-2} = \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{n-2}, \]where \( \hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i \) are the fitted values and \( e_i = y_i - \hat{y}_i \) are the residuals. Two degrees of freedom are lost because two parameters (\( \beta_0, \beta_1 \)) are estimated.
Hypothesis test on slope: To test \( H_0: \beta_1 = 0 \) (no linear relationship):
\[ t_0 = \frac{\hat{\beta}_1}{s / \sqrt{S_{xx}}} \sim t_{n-2} \text{ under } H_0. \]
A \( (1-\alpha)100\% \) CI on \( \beta_1 \) is \( \hat{\beta}_1 \pm t_{\alpha/2,\, n-2} \cdot s/\sqrt{S_{xx}} \).
The coefficient of determination
\[ R^2 = 1 - \frac{\mathrm{SS}_E}{\mathrm{SS}_T}, \quad \mathrm{SS}_T = \sum(y_i - \bar{y})^2, \]
measures the proportion of total variability explained by the regression. \( R^2 \in [0, 1] \); values close to 1 indicate a strong linear fit.
Example — thermocouple calibration data:
| \( x \) (°C) | 100 | 200 | 300 | 400 | 500 |
|---|---|---|---|---|---|
| \( y \) (mV) | 4.10 | 8.14 | 12.21 | 16.40 | 20.65 |
Compute \( \bar{x} = 300 \), \( \bar{y} = 12.30 \), \( S_{xx} = 100{,}000 \), \( S_{xy} = 4136 \).
\[ \hat{\beta}_1 = \frac{4136}{100{,}000} = 0.04136 \text{ mV/°C}, \qquad \hat{\beta}_0 = 12.30 - 0.04136 \times 300 = -0.11 \text{ mV}. \]
The calibration curve is \( \hat{y} = -0.11 + 0.04136\, x \) mV. The Seebeck coefficient estimate \( 0.04136 \) mV/°C \( = 41.4\,\mu\mathrm{V}/^\circ\mathrm{C} \) aligns with tabulated values for type K thermocouples (approximately \( 41\,\mu\mathrm{V}/^\circ\mathrm{C} \) near 300°C). The residuals should be checked: any systematic curvature would indicate the linear model is insufficient over this range and a quadratic term should be added.
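Recomputing the fit directly from the tabulated data serves as a check on the hand calculation (these values give \( S_{xy} = 4136 \)):

```python
xs = [100, 200, 300, 400, 500]
ys = [4.10, 8.14, 12.21, 16.40, 20.65]

n = len(xs)
xb, yb = sum(xs) / n, sum(ys) / n
sxy = sum((x - xb) * (y - yb) for x, y in zip(xs, ys))
sxx = sum((x - xb) ** 2 for x in xs)
b1 = sxy / sxx          # OLS slope
b0 = yb - b1 * xb       # OLS intercept
print(f"Sxy = {sxy:.0f}, slope = {b1:.5f} mV/degC, intercept = {b0:.3f} mV")
```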
7.7 Multiple Linear Regression
When \( p \) predictors are available, the model becomes
\[ Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i. \]
In matrix notation with \( \mathbf{y} \in \mathbb{R}^n \), design matrix \( \mathbf{X} \in \mathbb{R}^{n \times (p+1)} \), and \( \boldsymbol{\beta} \in \mathbb{R}^{p+1} \):
\[ \mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}. \]
The OLS estimator is
\[ \hat{\boldsymbol{\beta}} = (\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{y}, \]
provided \( \mathbf{X}^\top \mathbf{X} \) is invertible (no perfect multicollinearity).
The error variance estimate is \( \hat{\sigma}^2 = \mathrm{SS}_E / (n - p - 1) \). The adjusted \( R^2 \) penalises for adding predictors:
\[ R^2_{\mathrm{adj}} = 1 - \frac{\mathrm{SS}_E / (n-p-1)}{\mathrm{SS}_T / (n-1)}. \]
7.8 Model Adequacy Checking
Fitting a regression model is not the end of the analysis. Residual diagnostics reveal violations of assumptions:
- Normality: a normal probability plot of residuals should be approximately linear. Marked curvature indicates non-normality.
- Constant variance (homoscedasticity): a plot of residuals \( e_i \) vs. fitted values \( \hat{y}_i \) should show random scatter with no funnel shape. Funnel patterns indicate variance increasing with the mean — a log transform of \( y \) often stabilises variance.
- Independence: residuals vs. run order should show no trend or pattern. Trends suggest time-varying conditions or autocorrelation.
- Influential observations: Cook’s distance and leverage values identify individual points that disproportionately affect the fitted model.
A quadratic model \( \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x + \hat{\beta}_2 x^2 \) is fitted. Adding the squared term reduces \( \mathrm{SS}_E \) by 68 %, and the residuals now appear random. The quadratic coefficient \( \hat{\beta}_2 \) captures the nonlinearity of the bridge. The engineer includes this correction in the calibration firmware.
Summary of Key Formulae
The table below collects the central formulae developed throughout these notes.
| Quantity | Formula |
|---|---|
| Sample mean | \( \bar{x} = \frac{1}{n}\sum x_i \) |
| Sample variance | \( s^2 = \frac{1}{n-1}\sum(x_i - \bar{x})^2 \) |
| Standard error of mean | \( \sigma_{\bar{X}} = \sigma/\sqrt{n} \) |
| Error propagation | \( u_y = \sqrt{\sum_i \left(\partial f/\partial x_i\right)^2 u_i^2} \) |
| z-test statistic | \( z_0 = (\bar{x} - \mu_0)/(\sigma/\sqrt{n}) \) |
| t-test statistic | \( t_0 = (\bar{x} - \mu_0)/(s/\sqrt{n}) \) |
| CI for \( \mu \) (\( \sigma \) unknown) | \( \bar{x} \pm t_{\alpha/2,\, n-1}\, s/\sqrt{n} \) |
| Sample size (power) | \( n \approx (z_{\alpha/2} + z_\beta)^2 \sigma^2/\delta^2 \) |
| OLS slope | \( \hat{\beta}_1 = S_{xy}/S_{xx} \) |
| OLS intercept | \( \hat{\beta}_0 = \bar{y} - \hat{\beta}_1\bar{x} \) |
| MSE | \( s^2 = \mathrm{SS}_E/(n-2) \) |
| Multiple OLS | \( \hat{\boldsymbol{\beta}} = (\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top\mathbf{y} \) |