INTEG 220: Nature of Scientific Knowledge

Katie Plaisance

Estimated study time: 1 hr 4 min

Science is one of the most powerful knowledge-making enterprises in human history. It has given us germ theory, quantum mechanics, the structure of DNA, vaccines, and space telescopes. Yet most of us, including scientists themselves, rarely stop to ask the most basic philosophical questions about it: What is scientific knowledge? When can we trust it? How does it get produced? Who counts as an expert? And how should it inform the decisions we make together as a society?

These are the questions at the heart of INTEG 220. The course sits at the intersection of philosophy of science and science studies, drawing on thinkers from Aristotle to Thomas Kuhn to contemporary philosophers examining the relationship between science, values, diversity, and public trust. Rather than treating science as a black box that simply delivers correct answers, this course pulls back the curtain and examines the machinery inside — the observations, experiments, arguments, social structures, institutions, and human judgments that together produce what we call scientific knowledge.

Part I: What Is Science?

Chapter 1: Pulling Back the Curtain

The Epistemology of Science

Most courses in science ask you to learn what scientists have discovered: the laws of thermodynamics, the structure of DNA, the mechanisms of evolution. This course asks something different — not what science has found, but how it works and why we should trust it.

This is a question for epistemology, the branch of philosophy that studies knowledge. Epistemologists ask: What is knowledge? What distinguishes knowing something from merely believing it? How do we share knowledge, and how do we update it when new evidence arrives? These questions apply to all domains of inquiry, but they become especially important when we apply them to science, because science plays such a central role in modern life. Medical treatments, environmental policies, public health guidance — these are all grounded in scientific research. Understanding how that research actually works is therefore not merely an academic exercise but a civic necessity.

The dominant popular image of science is roughly as follows: scientists make objective observations of the natural world, derive universal laws from those observations through careful logical reasoning, and thereby build up a body of reliable knowledge that is free from human bias and subjective interference. This image has deep roots in a philosophical tradition known as logical positivism, which flourished in the early twentieth century in Vienna and elsewhere. The logical positivists believed that genuine scientific knowledge had to be grounded in observable experience, verified by experiment, and expressed in logically precise language. Their slogan — science is knowledge derived from the facts of experience — captures an appealing picture of how science works.

The British philosopher Alan Chalmers, in his widely read book What Is This Thing Called Science?, subjects this logical positivist picture to a systematic critique. His argument is not that science is unreliable or that we should distrust it. Rather, he shows that the actual practice of science is far richer, more complex, and more interesting than the simple model suggests. Working through Chalmers’s analysis chapter by chapter is one of the best introductions available to the philosophy of science, because it forces us to confront how much careful philosophical thinking lies behind what looks, on the surface, like a straightforward empirical enterprise.

Why Philosophy of Science Matters

For students who will live in a world saturated with scientific claims — in medicine, in environmental policy, in debates over public health — the ability to think carefully about science is not optional. It is not enough to simply trust science because scientists say so, nor is it enough to dismiss science because it is produced by imperfect human beings. What is needed is a nuanced understanding of what makes scientific knowledge reliable, where its limitations lie, and how to evaluate claims that are made in science’s name.

One of the distinctive features of the University of Waterloo’s Knowledge Integration program is the insistence that integrating knowledge across disciplines requires understanding not just the content of different fields but the epistemic frameworks they rely on — their methods, their standards of evidence, their background assumptions. That metalevel understanding is precisely what philosophy of science provides.

Chapter 2: Observation and Theory-Ladenness

The Myth of Pure Observation

The logical positivist picture of science assumes that we can make observations that are simply given to us by experience — that seeing, hearing, and measuring are passive activities in which the world imprints itself directly on our senses without any interpretation or prior knowledge interfering. Chalmers’s first major move is to challenge this assumption, and he does so by asking us to look at how observation actually works.

Consider a simple optical illusion such as the Necker cube or the duck-rabbit figure. Two people looking at the same drawing genuinely see different things, and the visual system snaps between interpretations without any change in the underlying image. This is not a mere curiosity. It reveals something profound: perception is not passive. What we see depends on what our visual system expects to see, on patterns we have learned to recognize, on background frameworks that organize incoming stimuli into meaningful wholes. Chalmers calls this the theory-ladenness of observation — the idea that what a person observes is partly determined by the theories and background knowledge they bring to the situation.

This has been demonstrated dramatically in the history of science. When Galileo turned his newly constructed telescope toward Jupiter in early 1610, he interpreted the small points of light orbiting the giant planet as evidence for a heliocentric solar system. But his contemporaries, trained in Aristotelian physics and the Ptolemaic cosmology, literally could not make sense of what they were seeing through the same instrument. Their background theory told them that the Earth was the center of the universe, that perfect circular motions defined the heavens, and that there could be no moons orbiting Jupiter. So the observations that were, for Galileo, evidence for a new cosmology were, for his contemporaries, either instrumental artifacts or philosophical impossibilities. Same phenomenon, entirely different observations, because the observers brought entirely different theoretical frameworks.

The point generalizes. A trained physician examining an X-ray sees pneumonia; an untrained observer sees only grey and white shadows. A marine biologist examining a rock pool identifies species, feeding behaviors, and ecological relationships; a casual visitor sees pretty little creatures. In both cases, what the expert “sees” is shot through with theory and background knowledge. The observations are not raw data delivered by the world but interpretations constructed through a trained perceptual apparatus. This is not a failing of science — it is part of what makes scientific observation so powerful. But it does mean that observation cannot provide the theory-independent foundation that the logical positivists imagined.

Active and Public Observation

Chalmers also emphasizes two further features of scientific observation that distinguish it from the casual seeing of everyday life. First, scientific observation is active, not passive. Scientists do not simply wait for the world to present itself to them. They design experiments, construct instruments, choose what to measure and how to measure it, and interpret results against a backdrop of theoretical expectations. This active, interventionist character of scientific observation requires substantial background knowledge: knowing which measurements are relevant, which instruments are appropriate, how to control for confounding factors, and how to interpret the results once obtained.

Second, scientific observation is public rather than private. One of science’s most important epistemic features is that its claims can be independently checked by other scientists. When a research team publishes a finding in a peer-reviewed journal, other experts in the field can attempt to replicate the experiments, scrutinize the methodology, and challenge the interpretation. This intersubjective checkability is what Chalmers proposes as the basis for a reconceived notion of scientific objectivity.

The traditional conception of objectivity — the idea that scientific observations are free from any kind of bias — is untenable, because we have seen that perceptual and cognitive biases inevitably shape what we observe. But there is a different conception of objectivity that is more defensible: the idea that scientific claims are objective insofar as they can be publicly tested by straightforward procedures, checked and rechecked by independent investigators. On this view, objectivity is not about eliminating subjectivity from individual scientists but about subjecting findings to a community process of scrutiny and criticism.

This reconception of objectivity has a corollary: intersubjective agreement can serve as a marker of reliability. If multiple independent investigators, using different methods and starting from different theoretical backgrounds, converge on the same finding, that convergence provides stronger evidence than any single observation. But intersubjective agreement is not infallible. If all the investigators share the same cognitive biases or the same demographic background, their agreement might reflect shared blind spots rather than genuine evidence. This issue — the problem of homogeneity of perspective in scientific communities — will become important when we turn to Helen Longino’s argument for diversity in science.
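
To see why independence matters here, consider a toy calculation (the numbers are invented purely for illustration): suppose each of five investigators has a 10 percent chance of being wrong about a given finding. If their errors are independent, the chance that all five converge on the same wrong answer is one in a hundred thousand; if they all share the same blind spot, the group is no more reliable than any one of its members.

    # Toy calculation with invented numbers: agreement among investigators is
    # strong evidence only to the extent that their errors are independent.
    p_err = 0.1   # assumed chance that any single investigator gets it wrong
    k = 5         # number of investigators who converge on the same finding

    independent_errors = p_err ** k   # all five wrong together, if errors are independent
    shared_blind_spot = p_err         # a fully shared bias leaves the group no better than one person

    print(f"All {k} agree but are wrong (independent errors): {independent_errors:.5f}")
    print(f"All {k} agree but are wrong (shared blind spot):  {shared_blind_spot:.1f}")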

A striking case study of what happens when research samples are too narrow comes from a well-known paper on WEIRD psychology. “WEIRD” stands for Western, Educated, Industrialized, Rich, and Democratic, and the finding is that a disproportionate amount of human psychology research has been conducted on subjects from these populations — often university undergraduates at North American and European institutions. The worry is that generalizations drawn from these subjects may not hold for the much broader range of human populations that actually exist. When your sample is systematically narrow, intersubjective agreement among researchers working with that sample may give you high reliability within that sample while telling you little about people more generally.

Chapter 3: Experimentation and Causation

Why Observation Alone Is Not Enough

Even if we grant that scientific observation is active, theory-laden, and subject to public checking, there is a limit to what observation alone can tell us. Scientists are not merely curious about how the world appears — they want to understand why things happen. They want to identify causal relationships: not just that A and B tend to go together, but that A produces or brings about B. And causal claims cannot be established by observation alone, because observation can only give us correlations.

The distinction between correlation and causation is one of the most frequently cited principles in scientific reasoning. Ice cream sales and drowning deaths are correlated — both peak in summer — but ice cream does not cause drowning. The correlation is explained by a third variable: hot weather drives both ice cream consumption and swimming. To establish that A causes B, you need to do something more than observe them together. You need to intervene — to change A while holding everything else constant and see whether B changes as a result.

This is why experiments are so central to science. An experiment is not merely a structured observation; it is a deliberate intervention designed to break correlations and reveal causal structure. By manipulating independent variables and measuring their effects on dependent variables while controlling for confounders, experiments provide a pathway to causal knowledge that passive observation cannot.
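
A minimal simulation can make the contrast concrete. The sketch below (illustrative only; the numbers and variable names are invented) generates data in which hot weather drives both ice cream sales and swimming, and drownings depend only on swimming. In the observational data, sales and drownings are strongly correlated; when we intervene and set sales independently of the weather, the correlation disappears, revealing that sales were never a cause.

    # Illustrative sketch: a confounder (hot weather) produces a correlation
    # between ice cream sales and drownings even though neither causes the other.
    import random

    random.seed(0)

    def correlation_between_sales_and_drownings(intervene=False, n=10_000):
        sales_list, deaths_list = [], []
        for _ in range(n):
            hot = random.random() < 0.5                        # confounder: hot day or not
            if intervene:
                sales = random.gauss(50, 10)                   # intervention: sales set independently of weather
            else:
                sales = random.gauss(80 if hot else 20, 10)    # hot weather drives sales
            swimmers = random.gauss(200 if hot else 40, 20)    # hot weather drives swimming
            deaths = 0.01 * swimmers + random.gauss(0, 0.5)    # drownings depend on swimming only
            sales_list.append(sales)
            deaths_list.append(deaths)
        # Pearson correlation, computed directly to avoid extra dependencies
        mx, my = sum(sales_list) / n, sum(deaths_list) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(sales_list, deaths_list)) / n
        sx = (sum((x - mx) ** 2 for x in sales_list) / n) ** 0.5
        sy = (sum((y - my) ** 2 for y in deaths_list) / n) ** 0.5
        return cov / (sx * sy)

    print("Observed correlation:          ", round(correlation_between_sales_and_drownings(), 2))
    print("Correlation after intervention:", round(correlation_between_sales_and_drownings(intervene=True), 2))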

Knockout Experiments and Genetic Causation

One of the most illuminating examples of how experimental intervention works comes from molecular genetics. When scientists want to understand the function of a particular gene — what proteins it produces, what developmental processes it influences, what phenotypic effects it has — they cannot simply observe which genes are present in organisms with different traits. That would give correlations, not causation. Instead, they perform knockout experiments: they create organisms (typically mice) in which a specific gene has been selectively disabled, then compare these knockout organisms to controls in which the gene is intact.

If the knockout mice develop green eyes while the controls have blue eyes, that is evidence that the knocked-out gene plays a role in eye color. But interpreting the results requires care. Genetics has revealed that genetic redundancy is common: multiple genes may contribute to the same phenotypic outcome, so that knocking out one gene has no observable effect because backup genes compensate. The absence of a phenotypic difference is therefore not conclusive evidence that the gene is irrelevant — it may mean you need to knock out several genes to see an effect.
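
The logic of redundancy is easy to see in a toy model. The sketch below uses invented gene names and the eye-color example from above: either of two redundant genes is enough to produce the pigment, so a single knockout shows no phenotypic difference even though both genes are causally relevant, and only the double knockout reveals the effect.

    # Toy model (hypothetical genes): redundancy hides single-knockout effects.
    def eye_color(gene_a_active: bool, gene_b_active: bool) -> str:
        pigment = gene_a_active or gene_b_active   # either gene suffices to make the pigment
        return "blue" if pigment else "green"

    print(eye_color(True, True))     # control mouse    -> blue
    print(eye_color(False, True))    # gene A knockout  -> blue (no visible effect)
    print(eye_color(True, False))    # gene B knockout  -> blue (no visible effect)
    print(eye_color(False, False))   # double knockout  -> green (effect finally revealed)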

This example illustrates a broader point about causation in complex biological systems. To understand what causes eye color in mice, you do not simply ask which gene is present or absent. You ask which variable, when changed, makes a difference to the outcome. Scientists call this the difference-that-makes-a-difference principle. In a burning building, oxygen is certainly causally relevant to fire — no oxygen, no fire — but when we ask what caused the fire, we are typically asking which factor changed in a way that led to the fire starting. If oxygen is always present, it is not what made the difference on this occasion; the struck match is what made the difference.

Which causes are scientifically relevant will depend on the research question. Consider phenylketonuria (PKU), a genetic condition in which individuals lack the enzyme needed to break down the amino acid phenylalanine. PKU is classified as a genetic disorder because the relevant variation is in a gene. Yet the most effective intervention is environmental, not genetic: simply restricting dietary phenylalanine prevents the harmful effects. From the perspective of treatment, the relevant causal factor to manipulate is the diet, even though the underlying mechanism is genetic. This illustrates that the relevance of a causal explanation depends on the purposes for which it is being sought — a deeply philosophical point that has practical implications for how we approach complex social and health problems.

Theory-Dependence of Experiments

Just as observations are theory-laden, experiments are theory-dependent. Designing a useful experiment requires knowing which variables might be causally relevant, which confounders to control, what measurements to take, and how to interpret the results. None of this is possible without a rich background theory. A scientist who believed in a simple “one gene, one protein” model of genetics — as geneticists once did — would interpret the absence of a phenotypic effect from a knockout experiment very differently from a scientist who knew about genetic redundancy.

This theory-dependence does not make experiments epistemically circular, but it does mean that the evidential relationship between experimental results and scientific claims is never simple or direct. Experimental findings are always interpreted against a theoretical background that partly determines what they mean. We will see this point take on greater philosophical weight when we discuss Karl Popper’s falsificationism and the Duhem-Quine thesis.

Chapter 4: Scientific Reasoning — Induction and Its Problems

Deduction and Induction

Having analyzed the “facts of experience” side of the logical positivist slogan, Chalmers now turns to the “derived from” side. The question is: what kind of logical relationship connects scientific observations to scientific laws and theories? The logical positivists seem to have assumed that this derivation is a form of inference — that after making enough observations, scientists can logically infer general laws. But what kind of inference is this?

Chalmers distinguishes two fundamental types of argument: deductive and inductive. A deductive argument is one in which the truth of the premises guarantees the truth of the conclusion — if the premises are true, the conclusion must be true. This property is called validity. An argument is deductively valid if it is impossible for all the premises to be true and the conclusion false simultaneously. Consider the classic example:

  • All metals expand when heated.
  • This rod is a metal.
  • Therefore, this rod expands when heated.

If both premises are true, the conclusion cannot possibly be false. This is a valid deductive argument.

Beyond validity, a deductive argument can also be sound: a sound argument is valid and has all true premises. Soundness guarantees a true conclusion. The distinction between validity and soundness matters because arguments can be valid while having false premises (and therefore a potentially false conclusion) and sound arguments give us the strongest possible logical guarantee.

An inductive argument, by contrast, moves from specific observations to general conclusions. From observing that many individual metals expand when heated, we infer that all metals expand when heated. But note what has happened: the conclusion goes beyond the observations. No matter how many metal samples we have examined, there might always be an unexamined sample that does not expand. Inductive arguments, as Chalmers emphasizes, are not truth-preserving — the premises can all be true while the conclusion is false.

The Problem of Induction

This brings us to one of the most famous problems in the philosophy of science: the problem of induction, first clearly formulated by David Hume in the eighteenth century. The problem is simple to state: how can we ever be justified in making inductive inferences? We cannot justify induction deductively, because the inference goes beyond the premises. We cannot justify it inductively — by pointing out that induction has worked well in the past — without already assuming that the future resembles the past, which is precisely what needs to be justified.

Chalmers illustrates the point with Newton’s laws of motion. For more than two hundred years after Newton published his Principia Mathematica in 1687, his three laws of motion were verified by observation after observation, experiment after experiment. In every practical setting that human beings encountered, from falling apples to orbiting planets, the laws held. The inductive evidence seemed overwhelming. Yet Einstein’s special theory of relativity, developed in the early twentieth century, showed that Newton’s laws break down at velocities approaching the speed of light. The laws remain excellent approximations for everyday speeds, but they are not universally true. The inductive generalization from centuries of confirming observations was, in the end, false for the full range of conditions.

This is not a peculiarity of Newton’s physics. It reflects a deep logical feature of all inductive inference: however many observations confirm a generalization, the next observation might refute it. The British philosopher Bertrand Russell illustrated this with the fable of the inductivist chicken, who observed each day that humans brought food when they appeared, concluded inductively that humans always bring food, and was then killed by the farmer on the day before Christmas. The chicken’s inductive inference was perfectly reasonable given the evidence, yet it was fatally wrong.

Criteria for Good Inductive Inference

Chalmers does not conclude that induction is useless — he acknowledges that scientists constantly make inductive inferences, and that this is unavoidable. But he asks: what makes an inductive inference a good one? He considers three candidate criteria:

  1. The number of observations must be large. We should not generalize from two or three cases. But “large” is vague: how large is large enough? The answer varies by field, research question, and what prior knowledge suggests. A large number in ecology might be a dozen observations; in particle physics it might be millions. Determining what counts as a sufficient sample requires scientific judgment, which itself draws on background theory.

  2. Observations must be repeated under a wide variety of conditions. Generalizations are more reliable if they hold across diverse circumstances. But again, “wide variety” requires interpretation: which conditions are relevant? Which dimensions of variation matter? Newton’s laws were tested under a wide variety of conditions, but not at relativistic speeds — and that turned out to be the crucial exception.

  3. No accepted observation should conflict with the derived law. A generalization that faces any counterexample should be revised or rejected. But this criterion proves surprisingly difficult to apply, because, as we shall see with Popper and Duhem, determining whether a genuine counterexample exists often requires further theoretical assumptions.

What emerges from Chalmers’s analysis is that inductive inference in science is not a mechanical procedure. It requires judgment at every step — judgment about sample size, about relevant conditions, about how to handle apparent exceptions. And that judgment draws on background knowledge and theoretical commitments that are themselves not given by the data. Science is, in this sense, a thoroughly human enterprise, not a logical machine.

Chapter 5: Falsificationism — A More Sophisticated Picture

Popper’s Challenge to the Inductivists

The Austrian-British philosopher Karl Popper recognized, even as a young man, that the problem of induction was damaging for any attempt to ground science in observation. His response was to propose a radical reconception of what science actually does. Scientists, Popper argued, do not try to verify their theories by accumulating confirming observations. They try to falsify them by searching for observations that the theory predicts should not exist.

The key logical insight is asymmetry: while no number of confirming observations can prove a universal generalization true, a single disconfirming observation can prove it false. If the theory says “all swans are white,” then a thousand white swans provide only inductive confirmation (which, as we have seen, does not logically guarantee truth), but a single black swan definitively refutes the universal claim. This asymmetry is logically straightforward: to disprove “all S are P,” you only need to find one S that is not P.

Popper’s falsificationism therefore defines scientific theories by their falsifiability: a theory is genuinely scientific if it makes predictions that could, in principle, be shown false by some possible observation. Theories that are compatible with any possible observation — that can always be saved from refutation by adding auxiliary hypotheses or reinterpreting the evidence — are not really scientific at all. They are pseudoscience.

The most famous targets of Popper’s critique were Freudian psychoanalysis and Adlerian psychology. Popper noticed that practitioners of these theories could interpret literally any observation as confirming their theories. A man jumps into the water to save a drowning child? That confirms Adler’s theory of inferiority complex (because the man needed to prove his courage). A man pushes a drowning child into the water? That also confirms Adler’s theory (because the man’s inferiority complex made him act antisocially). A theory that “predicts” both outcomes equally well predicts nothing at all. Contrast this with Einstein’s general theory of relativity, which made precise, specific, bold predictions about how light would bend near massive objects — predictions that were tested and confirmed during the solar eclipse of 1919, but that might very well have been falsified.

The Duhem-Quine Thesis and Holism

Popper’s falsificationism is elegant and appealing, but it faces a profound difficulty: in real scientific practice, it is almost never a single hypothesis that confronts an observation. When scientists test a hypothesis, they do so against a backdrop of auxiliary assumptions: assumptions about how their instruments work, about the initial conditions of their experiment, about the meanings of key concepts, about other theories that are taken for granted. This web of interconnected claims means that when an experiment yields an unexpected result, scientists cannot simply conclude that the hypothesis being tested is false. The unexpected result might instead indicate that one of the auxiliary assumptions is wrong.

This insight is associated with the French physicist and philosopher Pierre Duhem and the American philosopher W.V.O. Quine, and is known as the Duhem-Quine thesis or sometimes as holism. The thesis states that scientific hypotheses face the tribunal of experience not individually but as a corporate body — what Quine evocatively called “the web of belief.”

Chalmers illustrates the point with a historical example. When astronomers made observations inconsistent with Newtonian gravitational theory, they did not immediately conclude that Newton was wrong. Instead, they first explored whether some auxiliary assumption might be at fault — perhaps there was an undetected planet perturbing the orbit. In the case of Uranus, this strategy succeeded: the predicted planet Neptune was found, and Newtonian mechanics was vindicated. In the case of Mercury’s anomalous precession, the strategy of invoking an auxiliary hypothesis did not work, and the anomaly ultimately required general relativity to explain. But the important point is that there was no purely logical procedure for determining which path to take. The choice between revising the core hypothesis and revising an auxiliary assumption is a matter of scientific judgment and community debate.

Falsificationism, in its naive form, thus founders on the Duhem-Quine thesis. No hypothesis can be straightforwardly falsified by a single observation, because the observation is always interpreted against a background of auxiliary assumptions, and it is always possible — in principle — to save the hypothesis by modifying one of those assumptions instead.

Chapter 6: Kuhn’s Revolution — Paradigms and Scientific Change

Beyond Falsificationism

Popper’s falsificationism represented a significant advance over naive inductivism, but it still conceived of science primarily as a logical activity: a series of bold conjectures subjected to rigorous attempts at refutation. The American historian and philosopher of science Thomas Kuhn, in his landmark 1962 book The Structure of Scientific Revolutions, challenged this picture with a very different account of how science actually works — one grounded not in logic but in history and sociology.

Kuhn’s central contribution is the concept of the paradigm. A paradigm, in Kuhn’s sense, is not merely a theory. It is a whole framework of exemplary problem solutions, shared methods, conceptual tools, standards of evidence, and background assumptions that defines what counts as legitimate scientific work in a particular field at a particular time. Newtonian mechanics was a paradigm. So was Ptolemaic astronomy, before Copernicus. So is contemporary molecular biology.

Under a paradigm, scientists engage in what Kuhn calls normal science: the systematic extension and application of the paradigm to new problems, the refinement of its predictions, the resolution of its puzzles. Normal science is not primarily about testing the paradigm — it is about solving problems within the paradigm, using the paradigm’s tools and methods. Most scientific work, most of the time, is normal science. Scientists are not radical skeptics perpetually questioning their fundamental assumptions; they are puzzle-solvers operating within a shared framework they largely take for granted.

Anomalies and Scientific Revolutions

Normal science inevitably generates anomalies — results that resist explanation within the existing paradigm. For a while, these anomalies are set aside, attributed to experimental error or the complexity of the system, or addressed by adding auxiliary hypotheses. But if anomalies accumulate and deepen, if the paradigm begins to seem unable to account for a growing range of phenomena, the scientific community may enter a period of crisis. During a crisis, the foundational assumptions of the paradigm are questioned, alternative frameworks are proposed, and scientists begin to feel that something is fundamentally wrong.

Eventually, a new paradigm emerges that resolves the accumulated anomalies and opens up new lines of inquiry. The transition from the old paradigm to the new one is a scientific revolution. Famous examples include the Copernican revolution (replacing Ptolemaic geocentrism with heliocentrism), the Darwinian revolution (replacing special creation with natural selection), and the quantum mechanical revolution (replacing classical physics for the domain of atomic phenomena).

Kuhn made several controversial claims about scientific revolutions. One was that rival paradigms are incommensurable — they use different concepts, different standards of evidence, and different criteria for what counts as a good explanation, so that defenders of competing paradigms often talk past each other. Another was that the choice between paradigms cannot be purely rational, because the criteria for evaluating theories are themselves partly determined by paradigm. There is no neutral Archimedean standpoint from which to adjudicate between paradigms using logic alone; the choice involves something more like a gestalt switch or a conversion experience.

From Kuhn to Contemporary Philosophy of Science

Kuhn’s work was enormously influential, and it pushed the philosophy of science in a direction that Popper found deeply uncomfortable: it seemed to imply that scientific change is not purely rational, that social and psychological factors play a significant role, and that scientific progress is not straightforwardly cumulative. The sociology of scientific knowledge (SSK) took these implications seriously and argued, in the 1970s and 1980s, that scientific knowledge should be explained entirely in terms of social factors without privileging science over other ways of knowing.

This represented one extreme — a relativism about scientific knowledge. But by the time we reach contemporary philosophy of science, the philosophical consensus has moved toward a more nuanced position. As Chalmers acknowledges in his later chapters, and as the thinkers we encounter in the second half of this course emphasize, science is neither the perfectly rational, objective enterprise of the logical positivists nor the merely social construction of the strong relativists. Understanding science adequately requires holding both of these insights in tension: science is a social activity conducted by fallible human beings with biases and interests, and it has epistemic features — its self-correcting mechanisms, its community processes of criticism and replication — that make it one of the most reliable methods of inquiry we have developed.

One of the key points that contemporary philosophers of science share is that the context of discovery — the social, psychological, and historical circumstances under which a scientist forms a hypothesis — cannot be cleanly separated from the context of justification — the logical and evidential process by which that hypothesis is evaluated. The logical positivists wanted to restrict philosophy of science to the context of justification, treating discovery as a matter for psychologists and sociologists. But Kuhn showed that the values and assumptions embedded in a paradigm shape what gets counted as evidence and what gets treated as an anomaly. The two contexts are intertwined.

Part II: Science and Society

Chapter 7: Science and Values — Rejecting the Value-Free Ideal

The Value-Free Ideal

The second half of the course shifts from asking what is science? to asking how does science relate to society? The transition is anchored by a cluster of questions about values. The logical positivists, and most popular accounts of science, assume a sharp distinction between facts on one hand and values on the other. Science, on this view, deals in facts; ethics and politics deal in values. Scientific research is — or should be — value-free: conducted without the influence of moral, social, or political commitments.

This is the value-free ideal of science, and it is the target of a compelling argument by the philosopher Heather Douglas in her book Science, Policy, and the Value-Free Ideal. Douglas’s core argument is that this ideal is both unattainable and, if properly understood, undesirable.

To understand what Douglas means, we need to distinguish two kinds of values. Epistemic values are values that directly bear on whether a theory is a good one as a theory: simplicity, precision, explanatory breadth, empirical adequacy, testability. These are the values of good scientific reasoning, and almost no one disputes that they belong in science. Non-epistemic values (also called moral or social values) are values that reflect ethical, political, or social commitments: fairness, human well-being, environmental protection, social justice. The value-free ideal says that while epistemic values are appropriate in science, non-epistemic values are not — at least not in the internal stages of scientific research (data collection, analysis, interpretation).

Douglas challenges this claim with a careful analysis of research on the carcinogenicity of dioxin, an environmental toxin studied extensively in connection with regulatory policy. When scientists study whether dioxin causes cancer in rats, they face a series of methodological decisions: Which assay methods to use? How many tissue samples to examine? When examining cellular tissue for signs of malignancy, how much ambiguity is tolerable before classifying a cell as cancerous? What threshold of statistical significance is required before concluding that a compound is carcinogenic?

None of these decisions is purely epistemic. They all involve judgments about the consequences of error. If you set your threshold for concluding that a compound is carcinogenic very high (requiring very strong evidence before calling something harmful), you will minimize false positives — false claims that a compound is harmful when it is not — but you will increase false negatives — cases where a compound really is harmful but your study fails to detect it. The consequences of false positives and false negatives are different: false positives may lead to unnecessary regulation and economic costs, while false negatives may lead to continued exposure to harmful substances. Deciding how to balance these consequences requires weighing human health risks against economic impacts — and that is a value judgment.
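
A small numerical sketch (with invented numbers, not drawn from the actual dioxin studies) shows how this tradeoff works. The same noisy assay results are scored against three different evidential thresholds: the stricter the threshold, the fewer false positives and the more false negatives, and nothing in the data themselves says which balance is the right one.

    # Sketch with invented numbers: where you set the evidential threshold
    # determines how false positives trade off against false negatives.
    import random

    random.seed(1)

    def run_assays(threshold, n=1_000):
        false_pos = false_neg = 0
        for _ in range(n):
            truly_harmful = random.random() < 0.3              # assume 30% of test compounds are carcinogenic
            mean = 0.6 if truly_harmful else 0.4               # harmful compounds score higher on average
            observed = random.gauss(mean, 0.1)                 # but the assay is noisy
            classified_harmful = observed > threshold
            if classified_harmful and not truly_harmful:
                false_pos += 1                                 # harmless compound flagged as carcinogenic
            if not classified_harmful and truly_harmful:
                false_neg += 1                                 # carcinogenic compound missed
        return false_pos, false_neg

    for threshold in (0.45, 0.55, 0.65):
        fp, fn = run_assays(threshold)
        print(f"threshold {threshold}: {fp} false positives, {fn} false negatives")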

Douglas’s conclusion is that non-epistemic values inevitably play a role in the internal stages of science, particularly wherever scientists must decide how to handle uncertainty. The question is not whether such values are present but whether they are transparent. Scientists who subscribe to the value-free ideal may not realize they are making value-laden decisions; they will therefore fail to make those decisions explicitly and accountably. By contrast, scientists who recognize that values are at work can be explicit about what values are informing their choices, making the research more transparent and more useful to policymakers and the public.

Values in Climate Modeling

Douglas’s argument is not merely theoretical. The philosopher Nancy Tuana, a feminist philosopher of science who collaborates directly with climate scientists, has worked with climate modelers to help them identify and be explicit about the value-laden choices embedded in their models. How should a model handle uncertainty about aerosol effects? Should it include or exclude a particular positive feedback mechanism? Scientists disagree, and their disagreements often trace back to different judgments about how to balance competing scientific and ethical considerations. Tuana’s approach is to ask modelers to be transparent about these choices in their publications — to include not just their findings but the reasoning behind methodological decisions, including the values that informed them.

The practical benefit is substantial. When different climate modeling groups make different predictions, those differences are often traceable to different value-laden methodological choices. Understanding this helps policymakers interpret the uncertainty in climate projections more accurately. It also helps the public understand why scientists sometimes disagree without this meaning that science as a whole is unreliable.

Chapter 8: Objectivity through Diversity — Helen Longino

From Individual Objectivity to Social Epistemology

Heather Douglas’s argument against the value-free ideal raises an urgent question: if values inevitably influence scientific reasoning, how can we preserve any meaningful notion of scientific objectivity? The answer developed by the philosopher Helen Longino represents one of the most influential contributions to contemporary philosophy of science.

Longino’s move is to shift the question of objectivity from the level of individuals to the level of communities. Earlier accounts of objectivity asked: is this scientist free from bias? Does this experiment control for all relevant confounders? But Longino argues that objectivity is not primarily a property of individual scientists or individual methods — it is a property of scientific communities and the processes by which they produce knowledge.

Her key insight builds on both Kuhn’s account of science and the Duhem-Quine thesis. If scientific reasoning always involves background assumptions, and if different scientists bring different background assumptions to their work, then the process of transformative criticism — the process by which scientists examine and challenge each other’s background assumptions — is essential for identifying and correcting the value-laden or otherwise problematic assumptions that any individual scientist might not be able to see. This is because one’s own background assumptions are often invisible to oneself. What you take for granted, you do not question.

Four Criteria for Objective Scientific Communities

Longino proposes four criteria that a scientific community must satisfy in order to produce genuinely objective knowledge:

1. Recognized avenues for criticism. The community must have institutional mechanisms through which its members can criticize each other’s work. Peer review is the most obvious, but it includes informal mechanisms too: conference discussions, exchanges in journals, replication studies, and commentary literature. Without such avenues, background assumptions can become entrenched simply because no one challenges them.

2. Shared standards for evaluation. Criticism is only effective if there are shared standards against which claims can be assessed. If two scientists use different criteria for what counts as good evidence, they cannot productively criticize each other. Shared standards in science include statistical thresholds, methodological norms, conceptual definitions, and standards for experimental control. Importantly, these standards themselves can be criticized and revised — they are not fixed forever — but at any given time, a productive scientific community needs enough shared ground to make criticism possible.

3. Community uptake of criticism. It is not enough to allow criticism; the community must actually respond to it. If a researcher raises a serious methodological objection and it is simply ignored, the machinery of objectivity has broken down. Uptake does not mean automatic acceptance — a criticism can be evaluated and rejected on good grounds. But it must be taken seriously, not dismissed because of the critic’s identity, status, or institutional affiliation.

4. Equality of intellectual authority. This criterion is the most explicitly political. The community must not systematically discount contributions from members on the basis of irrelevant factors such as gender, race, or institutional rank. If the objections of women or members of minority groups are routinely dismissed while the same objections raised by white men are taken seriously, the community is epistemically damaged — it is failing to use all the critical resources available to it.

Diversity and Objectivity

The connection between diversity and objectivity is the philosophical pay dirt of Longino’s account. Because individuals from different social locations — different genders, races, classes, cultural backgrounds — tend to bring different background assumptions, experiences, and perspectives to their work, a diverse scientific community is more likely to identify hidden assumptions than a homogeneous one. This is not because diversity is intrinsically valuable in a moral sense (though it may be that too) — it is because, epistemically, diverse perspectives provide more vantage points from which to identify the blind spots of any particular perspective.

The economist Scott Page, in his book The Diversity Bonus, illustrates this with a striking example from economic history. When women began entering economics in significant numbers in the latter decades of the twentieth century, women economists began pointing out that conventional measures of economic output — in particular, the gross domestic product (GDP) — systematically excluded unpaid labor. Cooking, childcare, caregiving, housekeeping — these activities are economically productive, they contribute to human welfare and to the conditions under which market activity is possible, but they were not counted in GDP because they were unpaid. The reason they were excluded was not a deliberate choice but an invisible assumption: the assumption that “labor” means “paid employment.” Male economists, who as a class were less likely to perform unpaid domestic labor, were less likely to notice that assumption. Women economists, who as a class had more direct experience of performing that unpaid labor, were more likely to see it as an assumption worth questioning.

The result was a genuine scientific advance: economists developed new measures of economic activity that better captured the full range of economically productive behavior. The advance was driven not by new data but by new perspectives that identified a previously invisible assumption.

Longino’s framework thus provides a philosophical foundation for the empirical case for diversity in science. The point is not that scientists should be selected for their demographic characteristics rather than their qualifications. It is that communities that systematically exclude certain perspectives are epistemically impoverished — they are missing the critical resources that those perspectives might provide.

Chapter 9: Trust, Expertise, and Knowledge

Epistemic Dependence

Modern science is irreducibly collaborative. No single scientist can master all the knowledge required to conduct the research that science now undertakes. A paper in high-energy physics may have 99 authors, none of whom is in a position to independently verify every claim in the paper; each co-author contributes a specialized piece of knowledge and trusts the others to have contributed theirs reliably. The Human Genome Project required hundreds of labs across dozens of countries, coordinating for years, each trusting the others’ contributions to the whole.

The philosopher John Hardwig drew out the epistemological implications of this situation in a 1991 paper titled “The Role of Trust in Knowledge.” Hardwig’s argument is that in contemporary science, epistemic dependence is inescapable. We depend on others for much of what we know, not merely as a matter of convenience but as a matter of necessity. I cannot verify, with my own eyes and my own reasoning alone, the experimental findings that underpin modern medicine. I have no choice but to trust the scientists who produced them. And this trust is not irrational — it is, in most cases, the epistemically responsible thing to do. But it means that trust is not merely a social or psychological phenomenon. It is an epistemic phenomenon: it is part of the very structure of knowledge.

Hardwig’s insight was revolutionary in the context of mainstream epistemology, which had been almost entirely individualistic. The dominant tradition in Western epistemology, going back to Descartes, treated knowledge as something each person must establish for themselves through their own rational faculties. Hardwig showed that this picture is untenable for the actual practice of knowledge production in modern science.

Whyte and Crease: Three Cases of Trust

The philosophers Kyle Whyte and Robert Crease built on Hardwig’s foundation in a 2007 paper analyzing how trust operates between scientific communities and the broader public. Their analysis identifies three types of cases that arise when scientific knowledge is applied in social and political contexts:

Unrecognized contributor cases occur when scientists, in conducting their research, fail to recognize and incorporate the local knowledge of the communities they are studying or whose environment they are investigating. The paradigmatic example Whyte and Crease discuss comes from Brian Wynne’s sociological study of British sheep farmers in Cumbria following the Chernobyl nuclear disaster in 1986. Scientists dispatched to assess radioactive contamination in the region set up models of how radionuclides would spread through the environment and began advising farmers on whether it was safe to sell their sheep. But the farmers, who had worked this land for generations, tried to tell the scientists that their models were wrong — that the terrain had local features, drainage patterns, and sheep-grazing habits that the scientists had not accounted for. The scientists dismissed the farmers’ knowledge as mere anecdote, continued to rely on their own models, and produced advice that was, indeed, inaccurate in ways the farmers had predicted.

The epistemological lesson is not that scientists should simply defer to local knowledge — but that they failed to recognize the farmers’ knowledge as knowledge at all. They had a narrow conception of expertise that included only credentialed scientists, and by failing to treat the farmers as legitimate sources of relevant information, they impoverished their own research. The result was worse science and, predictably, a breakdown of trust between scientists and the farming community.

Poison well cases are situations in which public distrust of scientific findings appears, on the surface, to reflect misunderstanding or ignorance — but actually reflects a different set of values or a different judgment about risk. Whyte and Crease argue that misdiagnosing a poison well case as a simple case of public ignorance is a serious mistake. If policymakers assume that the problem is that people do not understand the science, they will try to fix it by better science communication and education. But if people already understand the science and simply weigh its implications differently — because they have different values, different risk tolerances, or different levels of trust in the institutions producing the science — then more education will not change their minds and may even deepen their distrust. We will encounter this analysis again when we discuss vaccine hesitancy.

Trusting mediator cases are the success stories — cases in which scientists and communities with different kinds of expertise have managed to build collaborative relationships that produce better science and greater public trust simultaneously. These cases often involve what Collins and Evans call interactional experts: individuals who can move between the language of science and the language of the community, facilitating genuine knowledge exchange rather than one-way transmission.

Indigenous Knowledge and Scientific Expertise

Kyle Whyte’s work on Indigenous science and traditional ecological knowledge provides some of the most important examples of trusting mediator cases. Indigenous communities, particularly those whose livelihoods have depended for generations on specific local ecosystems, often possess detailed empirical knowledge of those ecosystems — knowledge about seasonal patterns, species interactions, historical changes in species abundance — that cannot be derived from laboratory science and that may not even be conceivable within the conceptual frameworks of academic ecology. This knowledge is not mystical or merely cultural; it is empirical knowledge built up over generations of careful observation and practical engagement with the land.

Whyte argues that if scientists conceive of expertise too narrowly — as requiring formal credentials and institutional affiliation — they will systematically fail to recognize and incorporate this knowledge, with the result that their own science is impoverished. Conversely, Indigenous communities have often experienced scientists as coming in with predetermined models, dismissing local knowledge, and producing recommendations that reflect the scientists’ assumptions more than the actual conditions on the ground. This history of epistemological dismissal is a significant source of the distrust of Western science in many Indigenous communities.

The historian of science Naomi Oreskes argues that one of science’s goals is not merely to produce knowledge but to produce knowledge that is actually used. If that is right, then scientists have an interest not only in the quality of their epistemic processes but in the trustworthiness of their products in the communities where those products need to do work. Science that the relevant community does not trust is science that will not achieve its practical purposes. This gives scientists a strong reason, beyond the purely epistemic one, to build collaborative relationships with communities whose knowledge and trust they need.

Chapter 10: Rethinking Expertise — Collins and Evans

The Problem of the Expert/Non-Expert Divide

One of the sharpest questions in the philosophy of science concerns the line between expertise and non-expertise. The logical positivists drew this line sharply: scientific experts were those with formal training and institutional credentials; everyone else was a layperson. On this view, the appropriate relationship between science and the public is one in which experts communicate established findings to a passive audience of non-experts.

Kuhn’s work complicated this picture, as did the sociology of scientific knowledge, which showed that the internal workings of scientific communities were shaped by social factors in ways that the “expertise = credentials” view could not accommodate. By the 1980s, some sociologists of science had taken a more radical position: since scientific knowledge is socially constructed and does not have privileged access to objective truth, there is no principled reason to prefer expert judgment to lay judgment on matters of scientific controversy.

The sociologists Harry Collins and Robert Evans, in their book Rethinking Expertise (2007), argued that this pendulum had swung too far. Their aim is to develop a more nuanced account of expertise — one that avoids both the naive credentialism of the logical positivists and the corrosive relativism that threatens to dissolve the expert/non-expert distinction entirely.

Collins and Evans begin by distinguishing two problems that any adequate account of expertise must address. The problem of legitimacy is the problem of expertise being conceived too narrowly: if we require formal credentials before recognizing anyone’s knowledge as legitimate, we will dismiss the local knowledge of farmers, the experiential knowledge of patients, the traditional ecological knowledge of indigenous communities. The problem of extension is the mirror image: if we extend expertise to anyone with a sincere belief and personal experience, the notion of expertise becomes so broad as to be meaningless, and we lose the ability to distinguish genuine experts from pretenders.

The Periodic Table of Expertises

Collins and Evans’s solution is to map the different levels of knowledge and expertise that exist along a spectrum, from the most basic general knowledge to the deepest specialist mastery. They call this their periodic table of expertises.

At the most basic level is beer-mat knowledge — the kind of trivia that fits on the coasters used in British pubs, the sort of facts that might win a game of Trivial Pursuit. Knowing that phenylalanine is an amino acid, or that the human genome contains roughly three billion base pairs, is beer-mat knowledge. It is useful for certain purposes, but it provides no real understanding of how the systems involved work.

The next level is popular understanding, acquired through popular science books, science journalism, and documentary films. A reader of Scientific American who understands the basics of how viruses replicate and how vaccines generate immunity has popular understanding of immunology. This is genuinely more useful than beer-mat knowledge — it provides a framework for interpreting new information. But Collins and Evans point out that popular understanding of settled science (well-established and not currently under active revision) is more reliable than popular understanding of frontier science (areas where experts are actively debating, methods are being developed, and findings are provisional). During a pandemic, when the science is evolving rapidly, popular understanding of new findings may be significantly out of date before it even becomes widely circulated.

Primary source knowledge comes from reading original peer-reviewed research — the actual journal articles in which scientists report their methods and findings. Primary sources include the caveats, methodological details, and quantitative uncertainty that popular accounts systematically strip away. The p-value, the confidence interval, the limitations section, the discussion of alternative explanations — these are present in the original paper and absent from the newspaper headline. Someone who reads primary sources gains access to the actual texture of scientific argument and uncertainty, which is a substantial epistemic advantage.

Collins and Evans note that one of the most effective strategies for acquiring primary source knowledge in a field where you are not already an expert is to ask genuine experts what to read. They can point you to the most important and most reliable work, provide context for why it matters, and warn you away from the papers that have been criticized or superseded. This is essentially what instructors do in university courses: they curate the primary literature, provide frameworks for interpreting it, and make the tacit knowledge of the field more explicit.

Specialist Expertise: Contributory and Interactional

At the top of Collins and Evans’s hierarchy are two forms of specialist expertise, and the distinction between them is the most novel and influential part of their account.

Contributory expertise is what we normally think of as scientific expertise: the kind possessed by a practicing scientist who can contribute new knowledge to a field. A physicist who can design and run experiments, interpret the results, and publish findings that advance the field has contributory expertise in physics. Contributory expertise requires not only deep knowledge of the field’s content but also tacit knowledge — the implicit, inarticulate practical knowledge acquired through years of immersion in the practice of a discipline. You cannot learn to run a gel or debug a climate model simply by reading about it; you need to have done it, repeatedly, with guidance and feedback.

Collins and Evans borrow the concept of tacit knowledge from the philosopher Michael Polanyi, who famously argued that we know more than we can tell. Experts know how to do things that they cannot fully articulate. A master violin-maker knows how to select wood by its grain and resonance in ways that are not captured in any manual. An experienced clinician recognizes a patient’s presentation as characteristic of a particular condition through a pattern recognition that resists full verbalization. This tacit dimension of expertise is why enculturation — immersion in the practice of a community — is essential to genuine expertise. Graduate school, in this light, is primarily an apprenticeship: students learn the tacit norms, practices, and judgments of a scientific discipline by practicing alongside those who already possess them.

Interactional expertise is the concept that Collins and Evans introduce to address the problem of legitimacy. An interactional expert is someone who can speak the language of a specialism — who understands the concepts, the methods, the debates, and the standards — without being able to actually practice it. They have acquired the discursive competence of the field without the practical competence.

Collins himself became an interactional expert in gravitational wave physics through years of close collaboration with physicists — attending their conferences, reading their papers, interviewing them at length, and immersing himself in their discussions — even though he cannot design or run a gravitational wave detector. The test he proposes is a version of the Turing test: an interactional expert should be able to answer technical questions in a way that genuine contributory experts cannot reliably distinguish from the answers of a fellow specialist.

Interactional expertise is significant because it describes a kind of knowledge that bridges the gap between expert communities. Kyle Powys Whyte’s ability to converse fluently with both environmental scientists and indigenous elders is a form of interactional expertise — he understands enough of both sets of practices and concepts to facilitate genuine exchange between them. Katie Plaisance, as a philosopher of science who collaborates with behavioral geneticists, has developed interactional expertise in behavioral genetics, enabling her to ask philosophically productive questions of the scientists she works with and to understand their responses in context. For students in Knowledge Integration — a program dedicated to working across disciplinary boundaries — interactional expertise is, arguably, the central professional aspiration of the degree.

Chapter 11: Science Communication and Vaccine Hesitancy

The Deficit Model and Its Failures

When public health officials, scientists, and commentators worry about vaccine hesitancy, they typically frame the problem in terms of a knowledge deficit: people who are hesitant to vaccinate themselves or their children must not understand the scientific evidence for vaccine safety and efficacy; if they understood it, they would vaccinate. The solution, on this view, is better science communication and public education.

This framing is so deeply embedded in how scientists and policymakers think about public understanding of science that it has come to be called the deficit model. The model has three defining features. First, it draws a sharp distinction between the scientific community (the experts with knowledge) and the public (the laypeople with deficits). Second, it treats the public as a homogeneous group all suffering from the same deficit. Third, it places the responsibility for public misunderstanding entirely on the public (or on the popularizers who distort scientific findings), absolving scientists themselves of any responsibility.

The sociologist of science Brian Wynne, whose work on the Cumbrian sheep farmers we encountered earlier, has been one of the most persistent critics of the deficit model. Decades of empirical research in the sociology of science communication have consistently shown that the model does not fit the evidence. People who are skeptical of particular scientific claims are often not ignorant of those claims; they have specific reasons for their skepticism that relate to their values, their lived experiences, their trust in institutions, and their assessments of risk.

Maya Goldenberg, a philosopher of science at the University of Guelph, applies this critique specifically to vaccine hesitancy in her paper “Public Misunderstanding of Science? Reframing the Problem of Vaccine Hesitancy.” She argues that vaccine hesitancy should be recognized as a poison well case in Whyte and Crease’s sense: the disagreement between vaccine advocates and the vaccine hesitant is not primarily a disagreement about the scientific facts but a disagreement about values, risk, and institutional trust.

Reframing Vaccine Hesitancy

Consider the range of reasons that parents give for vaccine hesitancy. Some are concerned about the pace of vaccine development. Some have experienced adverse events — their own or a family member’s — following vaccination and feel these events are not taken seriously by medical institutions. Some have specific concerns about particular vaccines (such as those produced by pharmaceutical companies with histories of misconduct). Some weigh the risks of vaccine adverse events more heavily than the risks of the diseases being prevented, particularly when the prevalence of those diseases has been reduced by previous rounds of vaccination to the point where they are no longer obviously threatening. And some belong to communities that have historical reasons to distrust medical institutions — communities whose members remember Tuskegee, forced sterilization programs, and other episodes of medical exploitation.

None of these reasons is simply ignorance. They are specific, often historically grounded, and many of them involve legitimate value judgments. A parent who believes their child’s interests matter more than herd immunity thresholds is not confused about science; they are weighing risks differently than public health authorities do. A member of an indigenous community who distrusts a vaccine produced by a large pharmaceutical company in the context of a history of medical exploitation is not displaying an irrational phobia; they are applying historically informed skepticism to a new situation.

The important implication is that if vaccine hesitancy is a poison well case — driven more by values and trust than by ignorance — then the deficit model’s proposed solution (more science communication, more education) will not work and may actively backfire. Research has confirmed this. Sending vaccine information to hesitant parents, explaining the scientific evidence in ever greater detail, does not increase vaccination rates and in some studies has been associated with decreased vaccination intentions, presumably because the parents felt condescended to.

What does work, as Goldenberg argues, is rebuilding trust. This requires acknowledging the legitimate concerns of hesitant communities, taking seriously their experiences of medical harm, improving the transparency of the processes by which vaccines are approved and regulated, and developing the kinds of trusting relationships that Whyte and Crease identify in their trusting mediator cases. It is a slow, relational process — very different from a broadcasting campaign.

The Contextualist Model

Goldenberg advocates replacing the deficit model with a contextualist model of public understanding of science, developed by sociologists of science, most notably Brian Wynne (who appears here both as a scholar of science communication and as the researcher behind the Cumbrian sheep farmer case). The contextualist model begins from the recognition that there is no single “public” — there are many publics, each with different knowledge, values, interests, and relationships to scientific institutions. Understanding public responses to scientific findings requires understanding these specific contexts, not assuming a uniform deficit.

On the contextualist model, the scientist’s role in public communication shifts from that of an authoritative transmitter delivering knowledge to passive recipients toward something more like a participant in a dialogue. This shift requires scientists to acknowledge uncertainty honestly, to explain not just what the findings are but how they were produced and what their limitations are, to take seriously the concerns and values of the communities they are addressing, and to be transparent about the interests that fund and shape their research.

Chapter 12: Manufactured Scientific Controversies

When Doubt Is the Product

Not all public uncertainty about scientific issues reflects genuine uncertainty in the scientific community. Sometimes, uncertainty is manufactured — deliberately created by well-resourced interests seeking to delay regulation, protect industries, or discredit inconvenient findings.

The historians of science Naomi Oreskes and Erik Conway document this phenomenon extensively in their book Merchants of Doubt. Their central case studies include the tobacco industry’s response to evidence linking smoking to cancer, industry-backed campaigns against the evidence on acid rain and ozone depletion, and the fossil fuel industry’s response to evidence of human-caused climate change. In each case, industry-funded actors employed a common playbook: fund contrarian scientists to challenge the mainstream consensus, amplify uncertainties and disagreements in the scientific literature, create the appearance of ongoing expert debate where there was essentially none, and use media coverage of that apparent debate to delay regulatory action.

The philosopher Michael Parker, in his analysis of manufactured scientific controversies, identifies several features that characterize them. Manufactured controversies typically involve:

  • Jargon and technical opacity: using scientific-sounding language that lay audiences cannot evaluate, lending an appearance of scientific legitimacy to claims that do not have it.
  • Cherry-picking evidence: selectively citing studies that support a predetermined conclusion while ignoring the much larger body of evidence pointing the other way.
  • False balance: insisting that because there is disagreement, both sides deserve equal consideration — exploiting the journalistic norm of presenting “both sides” even when one side represents a tiny minority of informed opinion.
  • Ad hominem attacks on scientists: discrediting individual researchers rather than engaging with their arguments.
  • Shifting the burden of proof: demanding certainty before accepting findings, when all scientific findings are probabilistic.

Understanding how manufactured controversies work requires applying the analytical tools developed throughout this course. Chalmers showed us that observation is theory-laden and that scientific reasoning depends on background assumptions, and this insight cuts in two directions. On one hand, it shows that science is more complex than naive empiricism suggests. On the other hand, it reveals why manufactured controversies can be epistemically damaging: they exploit the genuine complexity of scientific reasoning to create confusion in audiences who lack the background knowledge to evaluate the arguments.

From Understanding to Combat

An exercise conducted in class illustrated how manufactured controversies work from the inside. Student groups were asked to design a manufactured scientific controversy — to create memes, public service announcements, or other communicative artifacts that would spread a scientifically dubious claim. The exercise revealed, in practice, the mechanisms that Parker and others have described theoretically.

One group focused on the personal carbon footprint, pointing out that the concept was popularized by British Petroleum as a deliberate strategy to shift responsibility for climate change from corporations to individuals. Another group exploited the “heavier objects fall faster” misconception using technical-sounding language and borrowed mathematical formulas that the audience could not evaluate. A third group created content around the health effects of red wine, an area where the science is genuinely contested but where cherry-picked headlines create the impression of settled consensus in favor of drinking.

The pedagogical insight is that understanding manufactured controversies requires both the critical philosophical tools to identify where scientific arguments are being distorted and the epistemological humility to recognize that genuine scientific uncertainty is always present. The challenge is to distinguish genuine uncertainty — the kind that scientists honestly acknowledge and that should inform policy decisions with appropriate caution — from manufactured uncertainty designed to forestall action.

Chapter 13: Interdisciplinary Collaboration and the Toolbox Dialogue Initiative

The Challenge of Working Across Disciplines

The course ends where Knowledge Integration lives: at the intersection of disciplines. Throughout the semester, we have seen that scientific knowledge is shaped by the paradigms, background assumptions, and values of particular disciplinary communities. Different sciences ask different questions, use different methods, apply different standards of evidence, and adopt different ontological commitments — different views about what kinds of things fundamentally exist and what kinds of explanation count as genuine explanation.

When scientists from different disciplines collaborate — as they must when addressing complex problems like climate change, pandemic response, food security, or mental health — they bring these disciplinary differences with them. What seems obvious to an experimental psychologist may be contested by an evolutionary biologist. What counts as a sufficiently large sample in ecology may seem far too small to a physicist. What a quantitative researcher calls analysis a qualitative researcher may call reductionism. These are not merely terminological differences — they reflect genuine differences in epistemic commitments, shaped by the paradigms each researcher has been trained in.

The Toolbox Dialogue Initiative

The challenge of interdisciplinary communication was the problem that prompted philosopher Michael O’Rourke and entomologist Sanford Eigenbrode to develop the Toolbox Dialogue Initiative (TDI), which began as the Toolbox Project in the early 2000s at the University of Idaho and is now based at Michigan State University. The occasion was an NSF-funded graduate program in which PhD students from different disciplines were required to co-author a chapter of their dissertation with a student from a different field. What O’Rourke and Eigenbrode found, repeatedly, was that the students were unable to collaborate effectively — not because of interpersonal friction, but because they were operating within such different paradigms that they could not even agree on what the problem was.

The TDI’s solution is a structured dialogue intervention. Participants from different disciplines are gathered for a two-hour facilitated workshop in which they fill out a questionnaire — the “toolbox instrument” — designed to surface their underlying epistemic commitments. The instrument asks them to rate their agreement with statements such as: “Replication is absolutely necessary for confirming scientific findings,” “The goal of science is to produce universal laws,” “Qualitative data can constitute scientific evidence,” and “The values of researchers inevitably affect the conclusions they draw.” Participants first fill out the instrument individually, then share and discuss their responses as a group, guided by a trained facilitator.

The immediate effect is to make visible the tacit assumptions that normally remain below the surface. Scientists who thought they shared a common framework with their collaborators discover that they hold fundamentally different views on what constitutes good evidence. A quantitative ecologist who believes strongly in replication discovers that her collaborator from conservation biology works with populations too small to replicate, and that this has led him to develop qualitative methods she has not encountered. Making these differences explicit — rather than allowing them to fester as unexplained sources of conflict — is the first step toward genuinely productive collaboration.

Kuhnian Paradigms Made Explicit

What the Toolbox Dialogue Initiative is doing, at a philosophical level, is precisely what Kuhn’s analysis of paradigms suggests should be done: it is making the normally invisible assumptions of a paradigm visible. Recall that one of Kuhn’s central insights was that scientists working within a paradigm typically take its background assumptions for granted — they are the “rules of the game” that everyone operates by without thinking about them. The problem in interdisciplinary collaboration is that collaborators from different disciplines have been shaped by different paradigms, and their respective background assumptions may be in conflict. The TDI creates a structured occasion to bring those assumptions to the surface.

The connection to Collins and Evans is equally direct. The kind of understanding facilitated by TDI workshops is a form of interactional expertise acquisition: participants gain enough insight into the language, concepts, and commitments of disciplines different from their own to collaborate productively without becoming contributory experts in those disciplines. This is exactly the kind of boundary-crossing knowledge that students in Knowledge Integration programs need to develop.

The connection to Longino is perhaps the most fundamental. Longino argued that objectivity in science requires recognized avenues for criticism, shared standards, uptake of criticism, and equality of intellectual authority. All of these conditions are threatened in interdisciplinary collaboration, because collaborators may not share standards, may not understand each other’s criticism, and may apply different criteria of intellectual authority across disciplines. The TDI can be understood as an attempt to create, for interdisciplinary teams, the conditions that Longino identifies as necessary for objective inquiry: a structured environment in which background assumptions become explicit, criticism becomes possible, and different perspectives are given genuine uptake.

Tacit Knowledge and Explicit Dialogue

One of the most important empirical findings from the TDI’s research program is that participants often discover they agree more than they thought — or disagree more fundamentally than they realized. Both outcomes are valuable. When scientists discover that their apparent disagreements were terminological rather than substantive, they can proceed with greater confidence and mutual understanding. When they discover that their disagreements run deeper — that they really do hold different values about what counts as evidence or what counts as a good explanation — they can negotiate explicitly about how to handle those differences in their collaboration rather than letting them generate unresolved conflict.

The TDI represents, in this way, a practical application of the philosophy of science to a real-world problem. It takes seriously the insights that science is shaped by paradigms, that background assumptions are often invisible, that tacit knowledge is central to expertise, and that diverse perspectives are essential to objectivity — and it turns these insights into a concrete tool for improving collaborative scientific practice.

This is, in miniature, what philosophy of science at its best does: not merely analyze science from the outside but contribute, through conceptual clarification and critical insight, to making science better.


The questions raised in this course do not have tidy resolutions. Science is neither the perfectly rational enterprise of the logical positivists nor the merely social construction of the radical sociologists. It is a human activity conducted by communities of fallible, value-laden, socially situated inquirers, constrained and guided by rigorous methods, subject to elaborate processes of collective criticism, and capable of generating genuinely reliable knowledge about a complex world — while also being capable of error, bias, capture by interests, and systematic exclusion of perspectives that would have improved it.

Understanding this complex reality is not a reason to distrust science. It is the condition for trusting it wisely.
