MTE 201: Experimental Measurement and Statistical Analysis

Hongxia Yang

Estimated study time: 1 hr 5 min

Table of contents

Sources and References

Primary reference — D.C. Montgomery, G.C. Runger, N.F. Hubele, Engineering Statistics, 5th ed., Wiley, 2011.

Online resources — MIT OCW 18.650 Statistics for Applications, NIST/SEMATECH e-Handbook of Statistical Methods, Khan Academy Statistics and Probability, BIPM Guide to the Expression of Uncertainty in Measurement (GUM).


Chapter 1: Descriptive Statistics and Data Visualization

1.1 The Role of Statistics in Engineering

Modern engineering is inseparable from data. Every manufactured component has dimensional variation; every measurement carries noise; every physical process exhibits run-to-run fluctuation. Statistical thinking provides a principled language for quantifying that variability, deciding whether an observed difference is real or random, and designing experiments that extract maximum information at minimum cost.

The workflow of an engineering study follows a recurring cycle: formulate a question, design an experiment, collect data, summarize and visualize, fit a model, make an inference, and communicate conclusions. This course addresses each step with rigour appropriate to mechatronics practice — from reading a pressure transducer to comparing two manufacturing processes with a hypothesis test.

Engineering context. Descriptive statistics are not a preliminary formality. The histogram of surface-roughness measurements from a CNC mill can immediately reveal whether the process is centred on specification and whether its spread is acceptable — before any formal testing is needed. Getting comfortable with rapid graphical analysis is a professional skill.

1.2 Populations, Samples, and Variables

A population is the entire collection of objects or outcomes of interest. A sample is the subset actually observed. Because populations are usually infinite or impractically large, statistical inference draws conclusions about population parameters from sample statistics.

Variables are classified by the type of values they take:

  • Continuous: temperature, voltage, dimension — can take any value in an interval.
  • Discrete: defect count, number of cycles to failure — takes countable values.
  • Categorical / nominal: supplier identity, material grade — no natural ordering.

1.3 Measures of Location

Let \( x_1, x_2, \ldots, x_n \) be a sample of \( n \) observations.

Sample mean. The arithmetic average \[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]

is the most common measure of centre. It is sensitive to extreme values (outliers).

Sample median. The middle value when the data are sorted. For odd \( n \), the median is \( x_{((n+1)/2)} \); for even \( n \), it is the average of \( x_{(n/2)} \) and \( x_{(n/2+1)} \). The median is robust to outliers.

1.4 Measures of Spread

Sample variance and standard deviation. \[ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2, \qquad s = \sqrt{s^2}. \]

The divisor \( n - 1 \) (rather than \( n \)) makes \( s^2 \) an unbiased estimator of the population variance \( \sigma^2 \).

Range. The simplest spread measure: \( R = x_{(n)} - x_{(1)} \). Easy to compute but uses only two data points.
Interquartile range (IQR). \( \mathrm{IQR} = Q_3 - Q_1 \), the range of the middle 50 % of sorted data. Robust to outliers.

1.5 Stem-and-Leaf Diagrams

A stem-and-leaf diagram retains the actual data values while providing a visual impression of shape and spread. Each observation is split into a stem (leading digit(s)) and a leaf (trailing digit). Stems are listed vertically; leaves fan out horizontally. Because no information is lost, stem-and-leaf plots are ideal for small-to-medium data sets (roughly \( n \leq 100 \)).

Example 1.1 — Resistor tolerances. A quality engineer measures the resistance (in \( \Omega \)) of 20 resistors nominally rated at 100 \( \Omega \):

98.2, 99.1, 100.4, 101.3, 99.8, 98.7, 100.0, 101.1, 99.5, 100.6, 98.9, 100.2, 101.7, 99.3, 100.8, 98.4, 100.5, 101.0, 99.7, 100.1

Stems are the integer part (98, 99, 100, 101). Leaves are the tenths digit:

Stem | Leaves
 98  | 2 4 7 9
 99  | 1 3 5 7 8
100  | 0 1 2 4 5 6 8
101  | 0 1 3 7

The distribution is approximately symmetric (the mean falls just below the median), with the mode in the 100-\(\Omega\) stem. No obvious outliers. The sample mean is \( \bar{x} = 99.97 \, \Omega \approx 100.0 \, \Omega \) and the sample standard deviation \( s \approx 0.99 \, \Omega \).
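As a quick check, the summary statistics can be recomputed with Python's standard library (a minimal sketch using the Example 1.1 data):

```python
import statistics

# Resistance measurements (ohms) from Example 1.1
data = [98.2, 99.1, 100.4, 101.3, 99.8, 98.7, 100.0, 101.1, 99.5, 100.6,
        98.9, 100.2, 101.7, 99.3, 100.8, 98.4, 100.5, 101.0, 99.7, 100.1]

xbar = statistics.mean(data)    # sample mean
med = statistics.median(data)   # sample median
s = statistics.stdev(data)      # sample standard deviation (n-1 divisor)

print(f"mean = {xbar:.3f}, median = {med:.3f}, s = {s:.3f}")
# → mean = 99.965, median = 100.050, s = 0.990
```

Note that `statistics.stdev` uses the \( n-1 \) divisor of Section 1.4, so it returns the sample (not population) standard deviation.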

1.6 Histograms

A histogram partitions the data range into \( k \) class intervals (bins) of equal width \( h \) and plots the frequency or relative frequency of observations in each bin. Choice of \( k \) matters: too few bins hide shape; too many introduce noise. Sturges’ rule suggests \( k \approx 1 + \log_2 n \).

Histograms reveal modality, skewness, and the presence of gaps or outliers. For continuous data, the area of each bar (not just height) is proportional to relative frequency.
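Sturges' rule and equal-width binning can be sketched in a few lines (illustrative helper functions, not a library API):

```python
import math

def sturges_bins(n):
    """Sturges' rule: k = ceil(1 + log2(n))."""
    return math.ceil(1 + math.log2(n))

def histogram(data, k=None):
    """Equal-width binning; returns (bin_edges, counts)."""
    if k is None:
        k = sturges_bins(len(data))
    lo, hi = min(data), max(data)
    h = (hi - lo) / k                       # common bin width
    counts = [0] * k
    for x in data:
        i = min(int((x - lo) / h), k - 1)   # clamp the maximum into the last bin
        counts[i] += 1
    return [lo + j * h for j in range(k + 1)], counts
```

For the \( n = 20 \) resistor data of Example 1.1, Sturges' rule gives \( k = \lceil 1 + \log_2 20 \rceil = 6 \) bins.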

1.7 Box-and-Whisker Plots

Box plot. A graphical five-number summary consisting of:
  • A box spanning \( Q_1 \) to \( Q_3 \) (the IQR).
  • A line inside the box at the median \( Q_2 \).
  • Whiskers extending to the most extreme observations within \( 1.5 \times \mathrm{IQR} \) of the quartiles.
  • Points beyond the whiskers plotted individually as suspected outliers.

Box plots excel at side-by-side comparison of multiple groups, making them indispensable in factorial experiments.


Chapter 2: Probability and Random Variables

2.1 Axioms of Probability

Let \( S \) be the sample space of an experiment and \( A \subseteq S \) an event.

Kolmogorov axioms.
  1. \( P(A) \geq 0 \) for every event \( A \).
  2. \( P(S) = 1 \).
  3. For mutually exclusive events \( A_1, A_2, \ldots \): \( P\!\left(\bigcup_{i} A_i\right) = \sum_i P(A_i) \).

Derived results include \( P(A^c) = 1 - P(A) \) and, for arbitrary events, the addition rule \( P(A \cup B) = P(A) + P(B) - P(A \cap B) \).

Conditional probability. Given \( P(B) > 0 \): \[ P(A \mid B) = \frac{P(A \cap B)}{P(B)}. \]
Statistical independence. Events \( A \) and \( B \) are independent if and only if \[ P(A \cap B) = P(A) \cdot P(B). \]

Equivalently, \( P(A \mid B) = P(A) \): knowing \( B \) occurred does not change the probability of \( A \).

2.2 Random Variables

Random variable. A function \( X: S \to \mathbb{R} \) that assigns a real number to each outcome in the sample space. Random variables are discrete if their range is countable, and continuous if their range is an interval (or union of intervals).

2.3 Probability Distributions for Continuous Random Variables

Probability density function (pdf). A function \( f(x) \) is a pdf for a continuous random variable \( X \) if:
  • \( f(x) \geq 0 \) for all \( x \).
  • \( \displaystyle\int_{-\infty}^{\infty} f(x)\, dx = 1 \).
  • \( \displaystyle P(a \leq X \leq b) = \int_a^b f(x)\, dx \).
Cumulative distribution function (CDF). \[ F(x) = P(X \leq x) = \int_{-\infty}^{x} f(t)\, dt. \]

The CDF is non-decreasing, right-continuous, with \( F(-\infty) = 0 \) and \( F(\infty) = 1 \).

2.4 Expected Value and Variance

Expected value. For a continuous random variable with pdf \( f \): \[ \mu = E\left[X\right] = \int_{-\infty}^{\infty} x\, f(x)\, dx. \]

For any function \( g(X) \):

\[ E\left[g(X)\right] = \int_{-\infty}^{\infty} g(x)\, f(x)\, dx. \]
Variance. \[ \sigma^2 = \mathrm{Var}(X) = E\!\left[\left(X - \mu\right)^2\right] = E\!\left[X^2\right] - \mu^2. \]

The standard deviation \( \sigma = \sqrt{\sigma^2} \) is in the same units as \( X \).

Key linearity property: for constants \( a, b \),

\[ E\left[aX + b\right] = a\mu + b, \qquad \mathrm{Var}(aX + b) = a^2 \sigma^2. \]

2.5 The Normal Distribution

Normal (Gaussian) distribution. \( X \sim N(\mu, \sigma^2) \) has pdf \[ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), \quad -\infty < x < \infty. \]

The normal distribution is fully characterised by its mean \( \mu \) and variance \( \sigma^2 \). The standard normal \( Z \sim N(0,1) \) has CDF \( \Phi(z) \) tabulated in every statistics text. Any normal variable is standardised by

\[ Z = \frac{X - \mu}{\sigma}. \]
Empirical rule (68–95–99.7 rule). For \( X \sim N(\mu, \sigma^2) \): \[ P(\mu - \sigma \leq X \leq \mu + \sigma) \approx 0.6827, \]\[ P(\mu - 2\sigma \leq X \leq \mu + 2\sigma) \approx 0.9545, \]\[ P(\mu - 3\sigma \leq X \leq \mu + 3\sigma) \approx 0.9973. \]
Example 2.1 — Shaft diameter. The diameter of a machined shaft is \( X \sim N(50.00, 0.04^2) \) mm. The specification limits are \( 50.00 \pm 0.08 \) mm. Find the fraction of shafts within specification. \[ P(49.92 \leq X \leq 50.08) = P\!\left(\frac{49.92-50.00}{0.04} \leq Z \leq \frac{50.08-50.00}{0.04}\right) = P(-2 \leq Z \leq 2) \approx 0.9545. \]

About 4.55 % of shafts fall outside specification. The engineer should investigate whether tightening the process standard deviation below 0.04 mm is economically feasible.
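The in-specification probability of Example 2.1 can be reproduced with the standard normal CDF, which Python exposes through the error function (a quick sketch):

```python
import math

def phi(z):
    """Standard normal CDF: Phi(z) = 0.5 * (1 + erf(z / sqrt(2)))."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Example 2.1: X ~ N(50.00, 0.04^2) mm, specification 50.00 ± 0.08 mm
mu, sigma = 50.00, 0.04
z_lo = (49.92 - mu) / sigma   # -2
z_hi = (50.08 - mu) / sigma   # +2
p_in_spec = phi(z_hi) - phi(z_lo)
print(f"P(in spec) = {p_in_spec:.4f}")   # → 0.9545
```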


Chapter 3: Common Probability Distributions

3.1 Continuous Distributions

3.1.1 Exponential Distribution

Exponential distribution. \( X \sim \mathrm{Exp}(\lambda) \) has pdf \[ f(x) = \lambda e^{-\lambda x}, \quad x \geq 0, \]

with \( E[X] = 1/\lambda \) and \( \mathrm{Var}(X) = 1/\lambda^2 \). The parameter \( \lambda > 0 \) is the rate.

The exponential distribution models time between events in a Poisson process and has the memoryless property: \( P(X > s + t \mid X > s) = P(X > t) \). It is widely used for electronic component lifetimes and inter-arrival times in queuing.
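The memoryless property can be verified numerically from the survival function \( P(X > x) = e^{-\lambda x} \) (the rate \( \lambda \) and the times below are arbitrary illustrative values):

```python
import math

lam = 0.5          # rate, per hour (illustrative)
s, t = 2.0, 3.0    # elapsed and additional time (illustrative)

def surv(x):
    """P(X > x) for an exponential with rate lam."""
    return math.exp(-lam * x)

lhs = surv(s + t) / surv(s)   # P(X > s+t | X > s)
rhs = surv(t)                 # P(X > t)
assert abs(lhs - rhs) < 1e-12  # memoryless: both equal exp(-lam * t)
```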

3.1.2 Uniform Distribution

Continuous uniform distribution. \( X \sim U(a, b) \) has pdf \( f(x) = 1/(b-a) \) for \( a \leq x \leq b \) and zero elsewhere. Then \[ E[X] = \frac{a+b}{2}, \qquad \mathrm{Var}(X) = \frac{(b-a)^2}{12}. \]

Round-off error in digital measurement is often modelled as \( U(-h/2, h/2) \) where \( h \) is the least significant digit.

3.1.3 Weibull Distribution

Weibull distribution. \( X \sim \mathrm{Weibull}(\beta, \delta) \) has pdf \[ f(x) = \frac{\beta}{\delta} \left(\frac{x}{\delta}\right)^{\beta - 1} \exp\!\left(-\left(\frac{x}{\delta}\right)^\beta\right), \quad x \geq 0, \]

where \( \beta > 0 \) is the shape parameter and \( \delta > 0 \) is the scale parameter.

The Weibull hazard rate \( h(x) = (\beta/\delta)(x/\delta)^{\beta-1} \) is increasing for \( \beta > 1 \) (wear-out), constant for \( \beta = 1 \) (exponential; random failures), and decreasing for \( \beta < 1 \) (infant mortality). This makes it the dominant distribution in reliability engineering.

3.2 Discrete Distributions

3.2.1 Binomial Distribution

Binomial distribution. Let \( X \) be the number of successes in \( n \) independent Bernoulli trials each with success probability \( p \). Then \( X \sim B(n, p) \) with \[ P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0, 1, \ldots, n. \]

\( E[X] = np \), \( \mathrm{Var}(X) = np(1-p) \).

Example 3.1 — Defective components. A PCB assembly line has a 3 % defect rate. A batch of 25 boards is inspected. Find the probability that at most 1 is defective.

With \( X \sim B(25, 0.03) \):

\[ P(X \leq 1) = P(X=0) + P(X=1) = (0.97)^{25} + 25(0.03)(0.97)^{24}. \]\[ \approx 0.4670 + 0.3611 = 0.8280. \]

There is roughly an 83 % chance of observing at most one defective board.
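Example 3.1 can be checked directly from the binomial pmf (a short sketch):

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Example 3.1: X ~ B(25, 0.03), P(X <= 1)
n, p = 25, 0.03
prob = binom_pmf(0, n, p) + binom_pmf(1, n, p)
print(f"P(X <= 1) = {prob:.4f}")   # → 0.8280
```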

3.2.2 Poisson Distribution

Poisson distribution. \( X \sim \mathrm{Poisson}(\lambda) \) models the number of events occurring in a fixed interval when events happen independently at a constant average rate \( \lambda \): \[ P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}, \quad k = 0, 1, 2, \ldots \]

\( E[X] = \mathrm{Var}(X) = \lambda \). The equality of mean and variance is a diagnostic signature of Poisson data.

The Poisson distribution is also an approximation to \( B(n, p) \) when \( n \) is large, \( p \) is small, and \( \lambda = np \) is moderate — the rare-event approximation.

3.2.3 Normal Approximation for Discrete Data

When \( n \) is large and \( p \) is not too close to 0 or 1, the binomial \( B(n, p) \) is well approximated by \( N(np, np(1-p)) \). The continuity correction improves accuracy: replace \( P(X \leq k) \) by \( P(Y \leq k + 0.5) \) where \( Y \sim N(np, np(1-p)) \). A common rule of thumb is that \( np \geq 5 \) and \( n(1-p) \geq 5 \) are required for the approximation to be reasonable.
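A sketch comparing the exact binomial CDF with the continuity-corrected normal approximation (the \( B(100, 0.4) \) case below is illustrative):

```python
import math
from math import comb

def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_approx_cdf(k, n, p):
    """P(X <= k) via N(np, np(1-p)) with continuity correction k + 0.5."""
    mu = n * p
    sd = math.sqrt(n * p * (1 - p))
    z = (k + 0.5 - mu) / sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# np = 40 and n(1-p) = 60 both comfortably exceed 5, so the rule of thumb holds
exact = binom_cdf(35, 100, 0.4)
approx = normal_approx_cdf(35, 100, 0.4)
print(exact, approx)
```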


Chapter 4: Sampling Distributions and the Central Limit Theorem

4.1 Sampling and the Sample Mean

When we draw a random sample \( X_1, X_2, \ldots, X_n \) from a population with mean \( \mu \) and variance \( \sigma^2 \), the sample mean

\[ \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i \]

is itself a random variable. Its distribution — called the sampling distribution — is the key link between a single dataset and the population from which it came.

Exact properties of \( \bar{X} \). For any population with mean \( \mu \) and finite variance \( \sigma^2 \): \[ E[\bar{X}] = \mu, \qquad \mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}. \]

The standard deviation \( \sigma/\sqrt{n} \) is called the standard error of the mean.

4.2 The Central Limit Theorem

Central Limit Theorem (CLT). Let \( X_1, \ldots, X_n \) be i.i.d. random variables with mean \( \mu \) and finite variance \( \sigma^2 \). Then as \( n \to \infty \), \[ \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \xrightarrow{d} N(0, 1). \]

Equivalently, \( \bar{X} \) is approximately \( N(\mu, \sigma^2/n) \) for large \( n \), regardless of the shape of the original population distribution.

Sketch of proof via moment-generating functions. Let \( Y_i = (X_i - \mu)/\sigma \) so that \( E[Y_i] = 0 \) and \( \mathrm{Var}(Y_i) = 1 \). The standardised sample mean is \( Z_n = \frac{1}{\sqrt{n}} \sum_{i=1}^n Y_i \). The moment-generating function (MGF) of \( Z_n \) is \[ M_{Z_n}(t) = \left[ M_Y\!\left(\frac{t}{\sqrt{n}}\right) \right]^n. \]

Expanding \( M_Y(t/\sqrt{n}) \) in a Taylor series around 0 using the known cumulants \( \kappa_1 = 0, \kappa_2 = 1 \):

\[ M_Y\!\left(\frac{t}{\sqrt{n}}\right) = 1 + \frac{t^2}{2n} + O\!\left(n^{-3/2}\right). \]

Therefore

\[ M_{Z_n}(t) = \left(1 + \frac{t^2}{2n} + O(n^{-3/2})\right)^n \to e^{t^2/2} \quad \text{as } n \to \infty, \]

which is the MGF of \( N(0, 1) \). By the continuity theorem for MGFs, \( Z_n \xrightarrow{d} N(0,1) \). \( \square \)

Practical convergence speed. For symmetric, unimodal populations, \( n \geq 10 \) is often sufficient. For highly skewed distributions (e.g., the exponential, whose skewness is 2 regardless of \( \lambda \)), \( n \geq 30 \) is the conventional threshold. In practice, always plot the data before blindly invoking the CLT.
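The CLT is easy to see by simulation: means of samples drawn from a skewed exponential parent cluster around the population mean with standard error \( \sigma/\sqrt{n} \). The seed and sample sizes below are arbitrary illustrative choices:

```python
import random
import statistics

random.seed(1)

# Parent population: Exp(1), so mu = sigma = 1 and SE = 1/sqrt(30) ≈ 0.183
n, reps = 30, 2000
means = [statistics.fmean(random.expovariate(1.0) for _ in range(n))
         for _ in range(reps)]

print(statistics.fmean(means))   # close to mu = 1
print(statistics.stdev(means))   # close to sigma/sqrt(n) ≈ 0.183
```

A histogram of `means` would look approximately normal even though the parent distribution is strongly right-skewed.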

4.3 Small Samples and the t-Distribution

When the population variance \( \sigma^2 \) is unknown — which is the usual situation — the standardised sample mean with the sample standard deviation in place of \( \sigma \) follows a t-distribution rather than a standard normal. The result is exact for samples from a normal population, and the correction matters most for small samples (\( n < 30 \)).

t-distribution. If \( Z \sim N(0,1) \) and \( V \sim \chi^2_\nu \) are independent, then \[ T = \frac{Z}{\sqrt{V/\nu}} \sim t_\nu, \]

the t-distribution with \( \nu \) degrees of freedom. For a random sample from \( N(\mu, \sigma^2) \):

\[ T = \frac{\bar{X} - \mu}{S / \sqrt{n}} \sim t_{n-1}. \]

The t-distribution is bell-shaped and symmetric about zero, but has heavier tails than the standard normal. As \( \nu \to \infty \), \( t_\nu \to N(0,1) \).

4.4 Probability Plotting

A normal probability plot (quantile-quantile or Q-Q plot) assesses whether data come from a normal distribution. The sorted sample quantiles are plotted against the corresponding theoretical quantiles of a standard normal. Approximate linearity supports normality; systematic curves indicate skewness or heavy tails. Outliers appear as isolated points far from the line.

For Weibull data, a similar linearisation is possible: if \( X \sim \mathrm{Weibull}(\beta, \delta) \), then \( \ln(-\ln(1 - F(x))) = \beta \ln x - \beta \ln \delta \), which is linear in \( \ln x \). Plotting \( \ln(-\ln(1-\hat{F})) \) vs. \( \ln x \) (Weibull probability paper) yields a straight line whose slope estimates \( \beta \).
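This linearisation can be sketched numerically. The median-rank plotting positions \( \hat{F}_i = (i - 0.3)/(n + 0.4) \) (Benard's approximation) are one common choice, and the synthetic data below are illustrative:

```python
import math
import random

random.seed(42)
beta_true, delta_true = 2.0, 10.0   # shape and scale (illustrative)
x = sorted(random.weibullvariate(delta_true, beta_true) for _ in range(200))
n = len(x)

# Linearised coordinates: (ln x_i, ln(-ln(1 - F_hat_i)))
pts = [(math.log(xi), math.log(-math.log(1 - (i - 0.3) / (n + 0.4))))
       for i, xi in enumerate(x, start=1)]

# Least-squares slope of the Weibull plot estimates the shape parameter beta
mx = sum(px for px, _ in pts) / n
my = sum(py for _, py in pts) / n
beta_hat = (sum((px - mx) * (py - my) for px, py in pts)
            / sum((px - mx) ** 2 for px, _ in pts))
print(beta_hat)   # should fall near beta_true = 2
```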

4.5 Joint Probability Distributions

Joint pdf. For continuous random variables \( X \) and \( Y \), a function \( f(x, y) \geq 0 \) with \( \iint f(x,y)\,dx\,dy = 1 \) is their joint pdf. Marginal distributions are obtained by integrating out the other variable: \[ f_X(x) = \int_{-\infty}^{\infty} f(x, y)\, dy. \]
Independence. Continuous random variables \( X \) and \( Y \) are independent if and only if \[ f(x, y) = f_X(x) \cdot f_Y(y) \quad \text{for all } x, y. \]
Covariance and correlation. \[ \mathrm{Cov}(X, Y) = E\!\left[(X - \mu_X)(Y - \mu_Y)\right] = E[XY] - \mu_X \mu_Y. \]

The Pearson correlation coefficient \( \rho = \mathrm{Cov}(X,Y) / (\sigma_X \sigma_Y) \) satisfies \( -1 \leq \rho \leq 1 \). If \( X \) and \( Y \) are independent, then \( \mathrm{Cov}(X,Y) = 0 \), but the converse is not generally true.

A critical property for the CLT and regression: for independent random variables,

\[ \mathrm{Var}\!\left(\sum_{i=1}^n a_i X_i\right) = \sum_{i=1}^n a_i^2\, \mathrm{Var}(X_i). \]

Chapter 5: Measurement Systems and Experimental Uncertainty

5.1 Overview of Measurement Systems

A measurement system converts a physical quantity into a signal that can be read, recorded, or processed. Every system consists of a sensor (transduces the physical quantity), a signal conditioner (amplifies, filters, or converts), and a readout (display, data acquisition system, or actuator). Understanding the behaviour and limitations of each stage is prerequisite to interpreting data correctly.

Key performance characteristics of a sensor include:

  • Range: the interval of inputs over which it operates correctly.
  • Sensitivity: change in output per unit change in input (\( K = \Delta \mathrm{output} / \Delta \mathrm{input} \)).
  • Resolution: the smallest detectable input change.
  • Accuracy: closeness of the indicated value to the true value.
  • Precision (repeatability): closeness of repeated measurements under unchanged conditions.
  • Linearity: degree to which sensitivity is constant across the range.
  • Hysteresis: difference in output for the same input depending on direction of approach.
Accuracy vs. precision. A sensor can be precise (low scatter) but inaccurate (biased), or accurate on average but imprecise (high scatter). Good calibration reduces systematic error (bias); averaging many readings reduces random error. Both are needed for trustworthy data.

5.2 Temperature Measurement

5.2.1 Thermocouples

A thermocouple exploits the Seebeck effect: two dissimilar metals joined at a measurement junction produce an electromotive force (EMF) proportional to the temperature difference between the hot junction and the reference junction (typically held at 0°C). Common types include J (iron/constantan, 0–760°C), K (chromel/alumel, −200–1260°C), and T (copper/constantan, −200–370°C).

The relationship between EMF \( E \) and temperature \( T \) is nonlinear (polynomial of degree 5–9 in ITS-90 reference tables). Over narrow ranges, a linear approximation \( E \approx S \cdot T \) holds, where \( S \) (the Seebeck coefficient, in \(\mu\mathrm{V}/^\circ\mathrm{C}\)) is approximately constant.

5.2.2 Resistance Temperature Detectors (RTDs)

The electrical resistance of a metal varies with temperature. For platinum RTDs (PT100: \( R_0 = 100 \, \Omega \) at 0°C):

\[ R(T) = R_0 \left(1 + AT + BT^2\right), \]

where for platinum \( A = 3.9083 \times 10^{-3} \, ^\circ\mathrm{C}^{-1} \) and \( B = -5.775 \times 10^{-7} \, ^\circ\mathrm{C}^{-2} \). RTDs are more stable and accurate than thermocouples but slower and more fragile.
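A sketch of this quadratic form (the Callendar–Van Dusen equation for \( T \geq 0\,^\circ\mathrm{C} \)); the check at 100°C reproduces the familiar 138.51 Ω PT100 value:

```python
def pt100_resistance(T):
    """R(T) = R0 * (1 + A*T + B*T^2) for a PT100 RTD, valid for T >= 0 °C."""
    R0, A, B = 100.0, 3.9083e-3, -5.775e-7
    return R0 * (1 + A * T + B * T**2)

print(pt100_resistance(0.0))                # → 100.0
print(round(pt100_resistance(100.0), 2))    # → 138.51
```

Inverting the quadratic (e.g., via the quadratic formula) recovers \( T \) from a measured resistance.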

5.2.3 Thermistors

Thermistors are semiconductor devices with a large negative (NTC) or positive (PTC) temperature coefficient of resistance. Their response is highly nonlinear, often modelled by the Steinhart-Hart equation:

\[ \frac{1}{T} = A + B \ln R + C \left(\ln R\right)^3, \]

where \( T \) is absolute temperature and \( A, B, C \) are calibration constants fitted from measurements at three reference temperatures.

High sensitivity makes thermistors ideal for narrow-range precision measurements (e.g., body temperature), but their nonlinearity and self-heating demand careful calibration.

5.3 Pressure Measurement

Pressure is defined as force per unit area. Gauge pressure \( p_g \) is measured relative to atmospheric; absolute pressure \( p_{abs} = p_g + p_{atm} \); differential pressure is the difference between two points.

Common transducer technologies include:

  • Bourdon tube: elastic tube that straightens under pressure; displacement drives a pointer or LVDT. Rugged but limited to static or slowly varying pressures.
  • Piezoresistive (strain-gauge) transducers: pressure deflects a thin diaphragm whose strain is sensed by a Wheatstone bridge. High frequency response; widely used in data acquisition.
  • Piezoelectric transducers: dynamic pressure generates a charge; suitable for high-frequency measurements (combustion, acoustics) but cannot measure static pressure.

5.4 Displacement and Strain Measurement

Strain gauges exploit the piezoresistive effect in a foil grid bonded to a surface. The gauge factor \( GF \) relates relative resistance change to strain \( \varepsilon \):

\[ \frac{\Delta R}{R} = GF \cdot \varepsilon, \qquad GF \approx 2 \text{ for metallic gauges.} \]

A Wheatstone bridge converts the small resistance change to a measurable voltage. Temperature compensation is achieved by using dummy gauges in an adjacent bridge arm.
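For a quarter bridge (one active gauge, three fixed arms of equal nominal resistance), the exact output ratio is \( V_o/V_{ex} = (\Delta R/R)/(4 + 2\,\Delta R/R) \), which reduces to the familiar \( GF\,\varepsilon/4 \) approximation for small strain. A sketch with illustrative values:

```python
def quarter_bridge_output(strain, v_ex=5.0, gf=2.0):
    """Quarter-bridge Wheatstone output voltage for one active gauge.

    Exact ratio: V_o/V_ex = (dR/R) / (4 + 2*dR/R) ≈ GF*eps/4 for small strain.
    """
    dr_over_r = gf * strain          # relative resistance change, GF * eps
    return v_ex * dr_over_r / (4 + 2 * dr_over_r)

# 500 microstrain with GF = 2 and 5 V excitation: dR/R = 0.001, V_o ≈ 1.25 mV
v = quarter_bridge_output(500e-6)
print(v)
```

The small output (millivolts per volt of excitation) is why bridge amplifiers are a standard part of strain-gauge signal conditioning.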

Linear Variable Differential Transformers (LVDTs) are AC-excited inductive sensors that produce a voltage output linearly proportional to core displacement over a specified range. They are highly linear, frictionless, and have infinite resolution.

5.5 Calibration

Calibration. The process of comparing a measurement device's output to a known reference standard under controlled conditions, and adjusting or documenting the relationship between indicated and true values.

A calibration curve relates the instrument output \( y \) to the reference standard input \( x \). If the device is linear, the curve is \( y = a_0 + a_1 x \); least-squares regression (Chapter 7) determines \( a_0 \) and \( a_1 \). Calibration uncertainty must be combined with measurement uncertainty in the final result.

Traceability. Calibration is meaningful only when the reference standard is traceable to national or international standards (e.g., NIST in the USA, NRC in Canada). Metrological traceability requires an unbroken chain of comparisons, each with documented uncertainty.

5.6 Systematic and Random Errors

Every measurement \( x_m \) deviates from the true value \( x_t \) according to:

\[ x_m = x_t + e_s + e_r, \]

where \( e_s \) is the systematic error (bias) — a fixed, repeatable offset — and \( e_r \) is the random error — a zero-mean, varying component.

Systematic errors arise from sensor calibration offsets, zero drift, loading effects (e.g., inserting a thermometer perturbs the temperature field), electromagnetic interference, and quantisation. They cannot be reduced by averaging; they must be identified and corrected.

Random errors arise from electrical noise, mechanical vibration, air currents, and round-off. They follow a probability distribution (often normal with mean zero) and can be reduced by averaging \( n \) measurements: \( \sigma_{\bar{x}} = \sigma_x / \sqrt{n} \).

5.7 Uncertainty Analysis and Error Propagation

In many experiments, the quantity of interest \( y \) is not measured directly but computed from several directly measured quantities \( x_1, x_2, \ldots, x_k \) through a known functional relationship \( y = f(x_1, x_2, \ldots, x_k) \).

Error propagation formula (first-order Taylor expansion). If \( x_1, \ldots, x_k \) are independent, each with uncertainty \( u_i \), then the combined uncertainty \( u_y \) in \( y = f(x_1, \ldots, x_k) \) is \[ u_y = \sqrt{\sum_{i=1}^{k} \left(\frac{\partial f}{\partial x_i}\right)^2 u_i^2}. \]

The partial derivatives are evaluated at the nominal values \( \bar{x}_1, \ldots, \bar{x}_k \).

Derivation via Taylor expansion. Write \( \delta x_i = x_i - \bar{x}_i \) for the deviation of the \( i \)-th measurement from its nominal. A first-order Taylor expansion gives \[ \delta y \approx \sum_{i=1}^{k} \frac{\partial f}{\partial x_i} \delta x_i. \]

Taking the variance of both sides and using independence (\( \mathrm{Cov}(\delta x_i, \delta x_j) = 0 \) for \( i \neq j \)):

\[ \mathrm{Var}(\delta y) \approx \sum_{i=1}^{k} \left(\frac{\partial f}{\partial x_i}\right)^2 \mathrm{Var}(\delta x_i) = \sum_{i=1}^{k} \left(\frac{\partial f}{\partial x_i}\right)^2 u_i^2. \]

Taking the square root gives the combined standard uncertainty \( u_y \). \( \square \)

Example 5.1 — Electrical power. Power is calculated as \( P = V I \) where \( V \) and \( I \) are independently measured. Given \( V = 12.0 \pm 0.2 \) V and \( I = 3.0 \pm 0.05 \) A (both standard uncertainties), find the uncertainty in \( P \). \[ \frac{\partial P}{\partial V} = I = 3.0, \qquad \frac{\partial P}{\partial I} = V = 12.0. \]\[ u_P = \sqrt{(3.0)^2(0.2)^2 + (12.0)^2(0.05)^2} = \sqrt{0.36 + 0.36} = \sqrt{0.72} \approx 0.85 \text{ W}. \]

So \( P = 36.0 \pm 0.85 \) W (approximately 2.4 % relative uncertainty).
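The propagation formula translates directly into a small helper; Example 5.1 serves as the check:

```python
import math

def propagate(partials, uncertainties):
    """Combined standard uncertainty for independent inputs:
    u_y = sqrt(sum((df/dx_i)^2 * u_i^2))."""
    return math.sqrt(sum((d * u) ** 2 for d, u in zip(partials, uncertainties)))

# Example 5.1: P = V*I, so dP/dV = I and dP/dI = V
V, I = 12.0, 3.0
u_P = propagate([I, V], [0.2, 0.05])
print(round(u_P, 2))   # → 0.85
```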

Example 5.2 — Volumetric flow rate. Pipe flow rate \( Q = \frac{\pi d^2}{4} v \) where \( d = 0.050 \pm 0.001 \) m and \( v = 2.00 \pm 0.05 \) m/s. Note that \( d \) appears squared, so errors in \( d \) are amplified. \[ \frac{\partial Q}{\partial d} = \frac{\pi d}{2} v, \qquad \frac{\partial Q}{\partial v} = \frac{\pi d^2}{4}. \]\[ u_Q = \sqrt{\left(\frac{\pi (0.05)(2.00)}{2}\right)^2 (0.001)^2 + \left(\frac{\pi (0.05)^2}{4}\right)^2 (0.05)^2}. \]

Evaluating: \( Q = \frac{\pi (0.0025)}{4}(2.00) \approx 3.927 \times 10^{-3} \) m\(^3\)/s. The relative uncertainty is dominated by the diameter term because of the square dependence — a lesson that precision in diameter measurement pays greater dividends than precision in velocity.
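Evaluating Example 5.2 numerically confirms that the diameter term dominates (a quick sketch):

```python
import math

d, u_d = 0.050, 0.001   # diameter (m) and its standard uncertainty
v, u_v = 2.00, 0.05     # velocity (m/s) and its standard uncertainty

Q = math.pi * d**2 / 4 * v
dQ_dd = math.pi * d / 2 * v        # partial derivative wrt diameter
dQ_dv = math.pi * d**2 / 4         # partial derivative wrt velocity

var_d = (dQ_dd * u_d) ** 2         # diameter contribution to Var(Q)
var_v = (dQ_dv * u_v) ** 2         # velocity contribution to Var(Q)
u_Q = math.sqrt(var_d + var_v)

print(f"u_Q = {u_Q:.3e} m^3/s ({100 * u_Q / Q:.1f}% relative)")
print(f"diameter term share of variance: {100 * var_d / (var_d + var_v):.0f}%")
```

The diameter term contributes roughly 72 % of the total variance, giving a combined uncertainty of about \( 1.85 \times 10^{-4} \) m\(^3\)/s (≈ 4.7 % relative).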


Chapter 6: Hypothesis Testing and Confidence Intervals

6.1 Framework of Hypothesis Testing

A statistical hypothesis is a claim about a population parameter. Hypothesis testing provides a rule for deciding, based on sample data, whether the evidence is sufficient to reject the claim.

Null and alternative hypotheses.
  • Null hypothesis \( H_0 \): the default claim (e.g., \( \mu = \mu_0 \)). It is assumed true until evidence forces rejection.
  • Alternative hypothesis \( H_1 \): the claim supported by rejection of \( H_0 \).

A one-sided test has \( H_1: \mu > \mu_0 \) or \( H_1: \mu < \mu_0 \). A two-sided test has \( H_1: \mu \neq \mu_0 \).

Type I and Type II errors.
Decision                 | \( H_0 \) is true                  | \( H_0 \) is false
Reject \( H_0 \)         | Type I error (prob. \( \alpha \))  | Correct (power \( 1 - \beta \))
Fail to reject \( H_0 \) | Correct (prob. \( 1 - \alpha \))   | Type II error (prob. \( \beta \))

\( \alpha \) is the significance level. \( \beta \) is the probability of missing a true effect. Power \( = 1 - \beta \) is the probability of correctly detecting a real difference.

6.2 Tests on a Single Mean

z-test for a single mean (\( \sigma \) known). To test \( H_0: \mu = \mu_0 \) against \( H_1: \mu \neq \mu_0 \) at significance level \( \alpha \):

Compute the test statistic

\[ z_0 = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}. \]

Reject \( H_0 \) if \( \left|z_0\right| > z_{\alpha/2} \), the upper \( \alpha/2 \) critical value of \( N(0,1) \).

t-test for a single mean (\( \sigma \) unknown). With \( s \) in place of \( \sigma \): \[ t_0 = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \sim t_{n-1} \text{ under } H_0. \]

Reject \( H_0: \mu = \mu_0 \) vs. \( H_1: \mu \neq \mu_0 \) if \( \left|t_0\right| > t_{\alpha/2, n-1} \).

6.3 The p-Value

p-value. The probability, assuming \( H_0 \) is true, of observing a test statistic at least as extreme as the one actually computed. A small p-value provides evidence against \( H_0 \). Formally, reject \( H_0 \) at level \( \alpha \) if and only if \( p\text{-value} < \alpha \).
Common misconception. The p-value is not the probability that \( H_0 \) is true. It is a conditional probability about data given \( H_0 \). A p-value of 0.03 means: if \( H_0 \) were true, only 3 % of random samples would produce a test statistic this extreme or more so — not that there is a 97 % probability \( H_0 \) is false.

6.4 Type II Error and Sample Size Determination

The Type II error probability \( \beta \) depends on the true value \( \mu = \mu_0 + \delta \) (the actual departure from \( H_0 \)), the significance level \( \alpha \), and the sample size \( n \). For the two-sided z-test:

\[ \beta \approx \Phi\!\left(z_{\alpha/2} - \frac{\delta\sqrt{n}}{\sigma}\right) - \Phi\!\left(-z_{\alpha/2} - \frac{\delta\sqrt{n}}{\sigma}\right). \]

To achieve power \( 1 - \beta \) at a minimum detectable difference \( \delta = |\mu - \mu_0| \):

\[ n \approx \frac{(z_{\alpha/2} + z_\beta)^2 \sigma^2}{\delta^2}. \]
Example 6.1 — Sensor calibration check. A force sensor is claimed to have mean output \( \mu_0 = 100 \) N at a reference load. A technician suspects bias. From past data, \( \sigma = 2 \) N. If the true mean is \( 102 \) N (a 2 N bias), how many measurements are needed to detect this with 90 % power at \( \alpha = 0.05 \)? \[ n \approx \frac{(z_{0.025} + z_{0.10})^2 \sigma^2}{\delta^2} = \frac{(1.96 + 1.282)^2 (4)}{4} = (3.242)^2 = 10.5 \implies n = 11. \]

With 11 measurements, the test has 90 % power to detect a 2 N bias at the 5 % significance level.
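The sample-size formula can be sketched with the standard library's normal quantiles; Example 6.1 is the check:

```python
import math
from statistics import NormalDist

def sample_size(delta, sigma, alpha=0.05, power=0.90):
    """Two-sided z-test: n = ((z_{alpha/2} + z_beta)^2 * sigma^2) / delta^2."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}, e.g. 1.96
    z_b = NormalDist().inv_cdf(power)           # z_beta, e.g. 1.282
    return math.ceil((z_a + z_b) ** 2 * sigma**2 / delta**2)

# Example 6.1: detect a 2 N bias with sigma = 2 N, 90% power, alpha = 0.05
print(sample_size(delta=2.0, sigma=2.0))   # → 11
```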

6.5 Confidence Intervals

Confidence interval for \( \mu \). A \( (1-\alpha) \times 100 \% \) confidence interval for the population mean is an interval \( [L, U] \) constructed from the data such that, in repeated sampling, \( (1-\alpha) \times 100 \% \) of intervals so constructed will contain the true \( \mu \).

For \( \sigma \) known:

\[ \bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \leq \mu \leq \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}. \]

For \( \sigma \) unknown (use sample \( s \), \( t_{n-1} \) critical value):

\[ \bar{x} - t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}} \leq \mu \leq \bar{x} + t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}}. \]
Example 6.2 — Compressive strength. Eight concrete cylinders are tested, yielding strengths (MPa): 28.1, 27.6, 29.0, 28.4, 27.9, 28.8, 28.2, 27.7. Compute a 95 % CI for the true mean strength.

\( n = 8 \), \( \bar{x} = 28.21 \) MPa, \( s = 0.500 \) MPa, \( t_{0.025, 7} = 2.365 \).

\[ \bar{x} \pm t_{0.025,7} \frac{s}{\sqrt{n}} = 28.21 \pm 2.365 \cdot \frac{0.500}{\sqrt{8}} = 28.21 \pm 0.42. \]

The 95 % CI is \( \left[27.79, 28.63\right] \) MPa. If the design specification requires \( \mu \geq 28.0 \) MPa, the interval extends below 28.0 MPa, so compliance is not demonstrated at the 95 % confidence level; a larger sample would be needed to tighten the interval.
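Recomputing the summary statistics and interval half-width (a quick sketch using the cylinder data; the t critical value is taken from tables):

```python
import math
import statistics

data = [28.1, 27.6, 29.0, 28.4, 27.9, 28.8, 28.2, 27.7]
n = len(data)
xbar = statistics.mean(data)
s = statistics.stdev(data)
t_crit = 2.365                       # t_{0.025, 7} from tables

half_width = t_crit * s / math.sqrt(n)
print(f"{xbar:.2f} +/- {half_width:.2f} MPa")   # → 28.21 +/- 0.42 MPa
```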

6.6 Inference on Two Population Means

6.6.1 Paired t-Test

Used when each observation in one group is naturally paired with an observation in the other (e.g., before/after measurements on the same unit).

Form differences \( d_i = x_{1i} - x_{2i} \) and test \( H_0: \mu_D = 0 \) using \( t_0 = \bar{d} / (s_d / \sqrt{n}) \sim t_{n-1} \) under \( H_0 \).

6.6.2 Two-Sample t-Test (Unpaired)

For independent samples from two populations, test \( H_0: \mu_1 = \mu_2 \).

Equal variances assumed (pooled estimate):

\[ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}, \qquad t_0 = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{1/n_1 + 1/n_2}} \sim t_{n_1 + n_2 - 2}. \]

Unequal variances (Welch’s test):

\[ t_0 = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}, \]

with Welch-Satterthwaite approximate degrees of freedom

\[ \nu \approx \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{\frac{(s_1^2/n_1)^2}{n_1-1} + \frac{(s_2^2/n_2)^2}{n_2-1}}. \]
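A sketch of Welch's statistic and the Welch–Satterthwaite degrees of freedom (the summary statistics below are hypothetical, for illustration only):

```python
def welch(x1bar, x2bar, s1sq, n1, s2sq, n2):
    """Welch t statistic and Welch-Satterthwaite approximate df."""
    se_sq = s1sq / n1 + s2sq / n2
    t0 = (x1bar - x2bar) / se_sq**0.5
    nu = se_sq**2 / ((s1sq / n1)**2 / (n1 - 1) + (s2sq / n2)**2 / (n2 - 1))
    return t0, nu

# Hypothetical two-process comparison
t0, nu = welch(x1bar=10.5, x2bar=9.2, s1sq=4.0, n1=10, s2sq=9.0, n2=12)
print(t0, nu)
```

In practice \( \nu \) is rounded down to the nearest integer before the critical value is looked up.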

6.7 Confidence Intervals for Proportions

For a proportion \( p \) estimated from \( n \) observations with \( x \) successes, \( \hat{p} = x/n \):

\[ \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \leq p \leq \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}. \]

Valid when \( n\hat{p} \geq 5 \) and \( n(1-\hat{p}) \geq 5 \).
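The large-sample (Wald) interval translates directly into code (the counts \( x = 40 \), \( n = 100 \) are illustrative):

```python
import math
from statistics import NormalDist

def proportion_ci(x, n, alpha=0.05):
    """Large-sample (Wald) confidence interval for a proportion."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    hw = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - hw, p_hat + hw

lo, hi = proportion_ci(40, 100)   # n*p_hat = 40 and n*(1-p_hat) = 60, both >= 5
print(lo, hi)
```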


Chapter 7: Design of Experiments and Regression Analysis

7.1 Strategy of Experimentation

Engineering experiments rarely involve a single variable. A thoughtful experimental strategy controls nuisance factors, ensures balanced treatment of all factor combinations, and allows estimation of factor interactions. Key principles:

  • Randomisation: run trials in random order to protect against lurking time-trends.
  • Replication: replicate each treatment combination to estimate pure experimental error and increase power.
  • Blocking: group experimental units into homogeneous blocks to reduce background variability.

7.2 The 2² Factorial Design

A \( 2^k \) factorial design involves \( k \) factors each at two levels, coded \( -1 \) (low) and \( +1 \) (high). A \( 2^2 \) design has four treatment combinations.

Main effects and interactions in a 2² design. With factors \( A \) and \( B \) and response \( y \):
  • Main effect of A: average change in \( y \) when \( A \) moves from low to high, averaged over levels of \( B \).
  • Main effect of B: analogous.
  • Interaction AB: difference in the effect of \( A \) at the two levels of \( B \). A non-zero interaction means that the effect of one factor depends on the level of the other.

With \( n \) replicates per cell and total \( N = 4n \) runs, the effects are estimated by contrasts and can be tested with an \( F \)-test in the ANOVA framework. Significant interactions render main-effect interpretations incomplete.

7.3 The 2³ Factorial Design and Response Surfaces

A \( 2^3 \) design adds a third factor \( C \), giving \( 8 \) treatment combinations. It estimates 7 effects: three main effects (\( A, B, C \)), three two-factor interactions (\( AB, AC, BC \)), and one three-factor interaction (\( ABC \)). A three-factor interaction is usually difficult to interpret physically and, if absent, the corresponding contrast estimates pure experimental error.

When a \( 2^k \) design indicates that a region of the experimental space is promising, response surface methodology adds centre points and axial runs to fit a second-order model:

\[ \hat{y} = \beta_0 + \sum_i \beta_i x_i + \sum_i \beta_{ii} x_i^2 + \sum_{i<j} \beta_{ij} x_i x_j. \]

The stationary point (maximum, minimum, or saddle) of the fitted surface guides optimisation.

Example 7.1 — Injection moulding. A \( 2^2 \) design studies the effect of melt temperature (A: 200°C vs. 230°C) and injection speed (B: 50 mm/s vs. 80 mm/s) on part shrinkage (mm, averaged over 2 replicates per cell):
| Run | A | B | \( \bar{y} \) (mm) |
|-----|---|---|--------------------|
| (1) | − | − | 0.80 |
| a   | + | − | 1.20 |
| b   | − | + | 0.95 |
| ab  | + | + | 1.05 |

Main effect of A: \( \frac{1}{2}\left[(1.20 - 0.80) + (1.05 - 0.95)\right] = \frac{1}{2}(0.40 + 0.10) = 0.25 \) mm.

Main effect of B: \( \frac{1}{2}\left[(0.95 - 0.80) + (1.05 - 1.20)\right] = \frac{1}{2}(0.15 - 0.15) = 0.00 \) mm.

Interaction AB: \( \frac{1}{2}\left[(1.05 - 0.95) - (1.20 - 0.80)\right] = \frac{1}{2}(0.10 - 0.40) = -0.15 \) mm.

Temperature has a strong main effect; injection speed alone has no effect on average. The negative interaction means the effect of speed reverses with temperature: higher speed increases shrinkage at low temperature (0.80 → 0.95 mm) but reduces it at high temperature (1.20 → 1.05 mm). An engineer minimising shrinkage should run low temperature with low speed; if forced to run at high temperature, high speed gives the smaller shrinkage.
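The three estimates can be checked mechanically from the contrasts. A sketch using the cell means above, with each effect computed as its contrast divided by \( 2^{k-1} = 2 \):

```python
# Contrast-based effect estimates for the 2^2 injection-moulding example.
# Cell means (mm) keyed by standard order: (1), a, b, ab.
y = {"(1)": 0.80, "a": 1.20, "b": 0.95, "ab": 1.05}

# Each effect is its contrast divided by 2^(k-1) = 2 for k = 2 factors.
A = (y["a"] + y["ab"] - y["(1)"] - y["b"]) / 2    # main effect of A
B = (y["b"] + y["ab"] - y["(1)"] - y["a"]) / 2    # main effect of B
AB = (y["(1)"] + y["ab"] - y["a"] - y["b"]) / 2   # interaction contrast

print(f"A = {A:.2f} mm, B = {B:.2f} mm, AB = {AB:.2f} mm")
```

The signs follow from the standard contrast \( AB = \frac{1}{2}\left[(1) - a - b + ab\right] \); writing the contrasts this way generalises directly to \( 2^3 \) and beyond.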

7.4 Simple Linear Regression

When a response \( y \) depends linearly on a single predictor \( x \), the model is

\[ Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad \varepsilon_i \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2), \quad i = 1, \ldots, n. \]

The unknown parameters are the intercept \( \beta_0 \), slope \( \beta_1 \), and error variance \( \sigma^2 \).

7.5 Least-Squares Estimation

Ordinary least-squares (OLS) estimators. The least-squares estimators \( \hat{\beta}_0 \) and \( \hat{\beta}_1 \) minimise \[ S(\beta_0, \beta_1) = \sum_{i=1}^{n} \left(y_i - \beta_0 - \beta_1 x_i\right)^2. \]

They are

\[ \hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \frac{S_{xy}}{S_{xx}}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}. \]

where \( S_{xy} = \sum(x_i - \bar{x})(y_i - \bar{y}) \) and \( S_{xx} = \sum(x_i - \bar{x})^2 \).

Derivation of OLS estimators. Differentiate \( S \) with respect to \( \beta_0 \) and \( \beta_1 \) and set to zero (the normal equations): \[ \frac{\partial S}{\partial \beta_0} = -2 \sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i) = 0 \implies n\hat{\beta}_0 + \hat{\beta}_1 \sum x_i = \sum y_i, \]\[ \frac{\partial S}{\partial \beta_1} = -2 \sum_{i=1}^{n} x_i (y_i - \beta_0 - \beta_1 x_i) = 0 \implies \hat{\beta}_0 \sum x_i + \hat{\beta}_1 \sum x_i^2 = \sum x_i y_i. \]

Dividing the first equation by \( n \): \( \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \). Substituting into the second equation and rearranging yields \( \hat{\beta}_1 = S_{xy}/S_{xx} \). \( \square \)
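The closed-form estimators translate directly into code. A minimal sketch, checked on an exact line so the estimates must recover the true coefficients:

```python
import statistics

def ols_fit(x, y):
    """Closed-form simple-regression estimates: b1 = Sxy/Sxx, b0 = ybar - b1*xbar."""
    xbar, ybar = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

# Sanity check on noise-free data y = 2 + 3x: OLS must recover the line exactly.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0 + 3.0 * xi for xi in x]
b0, b1 = ols_fit(x, y)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```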

7.6 Properties of OLS Estimators and Inferences

Gauss-Markov theorem. Under the regression model with \( E[\varepsilon_i] = 0 \), \( \mathrm{Var}(\varepsilon_i) = \sigma^2 \) (constant), and uncorrelated errors, the OLS estimators \( \hat{\beta}_0 \) and \( \hat{\beta}_1 \) are the Best Linear Unbiased Estimators (BLUE): they have minimum variance among all linear unbiased estimators.

The unbiased estimator of \( \sigma^2 \) is the mean squared error:

\[ \hat{\sigma}^2 = s^2 = \frac{\mathrm{SS}_E}{n-2} = \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{n-2}, \]

where \( \hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i \) are the fitted values and \( e_i = y_i - \hat{y}_i \) are the residuals. Two degrees of freedom are lost because two parameters (\( \beta_0, \beta_1 \)) are estimated.

Hypothesis test on slope: To test \( H_0: \beta_1 = 0 \) (no linear relationship):

\[ t_0 = \frac{\hat{\beta}_1}{s / \sqrt{S_{xx}}} \sim t_{n-2} \text{ under } H_0. \]

A \( (1-\alpha)100\% \) CI on \( \beta_1 \) is \( \hat{\beta}_1 \pm t_{\alpha/2,\, n-2} \cdot s/\sqrt{S_{xx}} \).

The coefficient of determination

\[ R^2 = 1 - \frac{\mathrm{SS}_E}{\mathrm{SS}_T}, \quad \mathrm{SS}_T = \sum(y_i - \bar{y})^2, \]

measures the proportion of total variability explained by the regression. \( R^2 \in [0, 1] \); values close to 1 indicate a strong linear fit.
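Putting the pieces of this section together, a sketch that fits a line to a small fixed dataset (invented for illustration) and computes the slope test statistic and \( R^2 \):

```python
import math
import statistics

# Small fixed dataset, roughly y = 2x with noise (values are illustrative).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
xbar, ybar = statistics.mean(x), statistics.mean(y)
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = ybar - b1 * xbar

fitted = [b0 + b1 * xi for xi in x]
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
sst = sum((yi - ybar) ** 2 for yi in y)

s = math.sqrt(sse / (n - 2))      # residual standard deviation, df = n - 2
t0 = b1 / (s / math.sqrt(sxx))    # compare against t_{alpha/2, n-2}
r2 = 1 - sse / sst
print(f"b1 = {b1:.3f}, t0 = {t0:.1f}, R^2 = {r2:.4f}")
```

With \( n - 2 = 3 \) degrees of freedom and \( t_0 \) in the thirties, \( H_0: \beta_1 = 0 \) is rejected decisively, consistent with the \( R^2 \) near 1.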

Example 7.2 — Thermocouple calibration. A K-type thermocouple produces EMF readings (mV) at known temperatures (°C):
| \( x \) (°C) | \( y \) (mV) |
|--------------|--------------|
| 100 | 4.10 |
| 200 | 8.14 |
| 300 | 12.21 |
| 400 | 16.40 |
| 500 | 20.65 |

Compute \( \bar{x} = 300 \), \( \bar{y} = 12.30 \), \( S_{xx} = 100{,}000 \), \( S_{xy} = 4136 \).

\[ \hat{\beta}_1 = \frac{4136}{100{,}000} = 0.04136 \text{ mV/°C}, \qquad \hat{\beta}_0 = 12.30 - 0.04136 \times 300 = -0.11 \text{ mV}. \]

The calibration curve is \( \hat{y} = -0.11 + 0.04136 x \) mV. The slope estimate \( 0.04136 \) mV/°C \( = 41.4\,\mu\mathrm{V}/°C \) aligns with tabulated Seebeck coefficients for type K thermocouples (approximately \( 41\,\mu\mathrm{V}/°C \) near 300°C), and the small negative intercept is consistent with a zero-offset error in the signal chain. The residuals should still be checked: any systematic curvature would indicate the linear model is insufficient over this range and a quadratic term should be added.

7.7 Multiple Linear Regression

When \( p \) predictors are available, the model becomes

\[ Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i. \]

In matrix notation with \( \mathbf{y} \in \mathbb{R}^n \), design matrix \( \mathbf{X} \in \mathbb{R}^{n \times (p+1)} \), and \( \boldsymbol{\beta} \in \mathbb{R}^{p+1} \):

\[ \mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}. \]

The OLS estimator is

\[ \hat{\boldsymbol{\beta}} = (\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{y}, \]

provided \( \mathbf{X}^\top \mathbf{X} \) is invertible (no perfect multicollinearity).

The error variance estimate is \( \hat{\sigma}^2 = \mathrm{SS}_E / (n - p - 1) \). The adjusted \( R^2 \) penalises for adding predictors:

\[ R^2_{\mathrm{adj}} = 1 - \frac{\mathrm{SS}_E / (n-p-1)}{\mathrm{SS}_T / (n-1)}. \]
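Numerically, it is preferable to solve the normal equations \( (\mathbf{X}^\top\mathbf{X})\hat{\boldsymbol{\beta}} = \mathbf{X}^\top\mathbf{y} \) rather than form the inverse explicitly. A sketch on exact synthetic data, so the estimates must recover the generating coefficients:

```python
import numpy as np

# Multiple regression via the normal equations (X'X) beta = X'y.
# Exact synthetic data y = 1 + 2*x1 - 0.5*x2 (no noise), so OLS must
# recover the coefficients exactly.
x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([1.0, 0.0, 2.0, 1.0, 3.0, 2.0])
y = 1.0 + 2.0 * x1 - 0.5 * x2

X = np.column_stack([np.ones_like(x1), x1, x2])   # design matrix with intercept
beta = np.linalg.solve(X.T @ X, X.T @ y)          # solves (X'X) beta = X'y
print("beta =", np.round(beta, 6))
```

If `np.linalg.solve` raises `LinAlgError`, \( \mathbf{X}^\top\mathbf{X} \) is singular — the numerical symptom of perfect multicollinearity.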

7.8 Model Adequacy Checking

Fitting a regression model is not the end of the analysis. Residual diagnostics reveal violations of assumptions:

  • Normality: a normal probability plot of residuals should be approximately linear. Marked curvature indicates non-normality.
  • Constant variance (homoscedasticity): a plot of residuals \( e_i \) vs. fitted values \( \hat{y}_i \) should show random scatter with no funnel shape. Funnel patterns indicate variance increasing with the mean — a log transform of \( y \) often stabilises variance.
  • Independence: residuals vs. run order should show no trend or pattern. Trends suggest time-varying conditions or autocorrelation.
  • Influential observations: Cook’s distance and leverage values identify individual points that disproportionately affect the fitted model.
Connection to the measurement chapters. Residuals in a regression model of sensor data often carry physical meaning. A systematic pattern in residuals (e.g., residuals increasing at high temperatures) may indicate that a sensor's nonlinearity is significant and a higher-order calibration model is needed. Residual analysis thus bridges statistical modelling and instrumentation science — the same tools serve both.
Example 7.3 — Strain gauge bridge output. An engineer collects Wheatstone bridge output voltage \( V \) (mV) as a function of applied load \( F \) (kN) for a load cell. After fitting the model \( \hat{V} = 0.200F + 0.015 \), the residuals are plotted vs. \( F \). A slight quadratic pattern is observed, with residuals positive at low and high loads and negative at mid-range. This indicates the sensor has mild nonlinearity, and the quadratic model \[ V = \beta_0 + \beta_1 F + \beta_2 F^2 + \varepsilon \]

is fitted. Adding the squared term reduces \( \mathrm{SS}_E \) by 68 %, and residuals now appear random. The quadratic coefficient \( \hat{\beta}_2 \) captures the nonlinearity of the bridge. The engineer includes this correction in the calibration firmware.
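The linear-vs-quadratic comparison can be sketched on synthetic bridge data with a deliberately small quadratic term (the coefficients below are illustrative, not the measured values from the example):

```python
import numpy as np

# Synthetic load-cell data: mild quadratic nonlinearity on top of a line.
F = np.linspace(0.0, 10.0, 11)             # applied load, kN
V = 0.015 + 0.200 * F + 0.002 * F ** 2     # bridge output, mV

def sse(order):
    """Residual sum of squares for a polynomial fit of the given order."""
    coeffs = np.polyfit(F, V, order)
    resid = V - np.polyval(coeffs, F)
    return float(np.sum(resid ** 2))

sse1, sse2 = sse(1), sse(2)
print(f"SSE linear = {sse1:.4g}, SSE quadratic = {sse2:.4g}")
```

The linear fit leaves a systematic quadratic residual pattern; adding the squared term drives \( \mathrm{SS}_E \) essentially to zero here because the synthetic data contain no noise, whereas real data would retain the pure-error floor.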


Summary of Key Formulae

The table below collects the central formulae developed throughout these notes.

| Quantity | Formula |
|----------|---------|
| Sample mean | \( \bar{x} = \frac{1}{n}\sum x_i \) |
| Sample variance | \( s^2 = \frac{1}{n-1}\sum(x_i - \bar{x})^2 \) |
| Standard error of mean | \( \sigma_{\bar{X}} = \sigma/\sqrt{n} \) |
| Error propagation | \( u_y = \sqrt{\sum_i \left(\partial f/\partial x_i\right)^2 u_i^2} \) |
| z-test statistic | \( z_0 = (\bar{x} - \mu_0)/(\sigma/\sqrt{n}) \) |
| t-test statistic | \( t_0 = (\bar{x} - \mu_0)/(s/\sqrt{n}) \) |
| CI for \( \mu \) (\( \sigma \) unknown) | \( \bar{x} \pm t_{\alpha/2,\, n-1}\, s/\sqrt{n} \) |
| Sample size (power) | \( n \approx (z_{\alpha/2} + z_\beta)^2 \sigma^2/\delta^2 \) |
| OLS slope | \( \hat{\beta}_1 = S_{xy}/S_{xx} \) |
| OLS intercept | \( \hat{\beta}_0 = \bar{y} - \hat{\beta}_1\bar{x} \) |
| MSE | \( s^2 = \mathrm{SS}_E/(n-2) \) |
| Multiple OLS | \( \hat{\boldsymbol{\beta}} = (\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top\mathbf{y} \) |
Integrating the two halves of the course. Measurement and statistics are not separate subjects bolted together for curriculum convenience. Every sensor reading is a random variable; its distribution is characterised by the physics of transduction and the electronics of signal conditioning. Calibration fits a regression model. Uncertainty propagation uses derivatives — the same linearisation that underlies the CLT proof. Hypothesis tests decide whether a process has drifted; confidence intervals quantify how much. A mechatronics engineer who internalises this unity can move fluidly from an oscilloscope trace to a regression residual plot to a two-sample t-test — using one coherent set of tools throughout the design-build-test cycle.