‘At the present state of science it does not seem possible to avoid the formal character of the quantum theory which is shown by the fact that the interpretation of atomic phenomena does not involve a description of the mechanism of the discontinuous processes […]. On the correspondence principle it seems nevertheless possible to […] arrive at a consistent description of optical phenomena by connecting the discontinuous effects occurring in atoms with the continuous radiation field […]’. This announced a turning point in atomic physics research; for the first time there were reasons to assert that discontinuity was not in itself incompatible with the theory's descriptive content, and rational solutions were provided that related a whole set of phenomena dependent on the radiation properties of matter to the continuous picture of the radiation field. This was the conclusion arrived at in a new paper by Bohr, which took up and developed the hypothesis put forward by the American physicist John Slater that ‘the atom, even before a process of transition between two stationary states takes place, is capable of communication with distant atoms through a virtual radiation field’. The article, ‘The Quantum Theory of Radiation’, signed by Bohr, Kramers and Slater, was published in the Philosophical Magazine in January 1924.
Slater had arrived in Copenhagen at the end of the preceding year with the idea that his hypothesis might lead to very different theoretical results.
In 1913 the Philosophical Magazine published a long article by Bohr in three parts containing the first quantum theory of the atom. The article was entitled ‘On the Constitution of Atoms and Molecules’ and soon became known in the scientific circles of the day as Bohr's ‘trilogy’. The article provided a sound theoretical foundation for the Rutherford model of the nuclear atom that had become established in 1911 thanks to new experimental discoveries about the elementary constituents of matter. The importance and originality of Bohr's paper are usually seen as lying in his successful use of quantum concepts in the solution of problems concerning the constitution and physical properties of atoms, thereby effecting a significant extension of the scope of the quantization hypotheses first introduced by Planck at the beginning of the century. Until 1910 physicists had – with few but important exceptions (in particular Einstein, von Laue and Ehrenfest) – generally been convinced that Planck's constant h was characteristic only of the problem of heat radiation, i.e. had seen it as a particular hypothesis making possible the theoretical derivation of the black-body law. Bohr's work would thus assume a two-fold importance in the evolution of 20th century physics. On the one hand, it would represent the first attempt to formulate a consistent theory of the constitution of the atom capable of explaining much of the experimental data available and of deducing empirical laws concerning the spectra of the elements. On the other, it would mark a decisive advance for quantum-theoretical conceptions by establishing their high level of generality.
As Quine wrote in 1961, what strikes us in a paradox is its initial air of absurdity, which develops into a sort of psychological discomfort when we compare the conclusions of the reasoning with the apparently irrefutable arguments on which it is based. However, as he went on to observe, ‘More than once in history the discovery of paradox has been the occasion for major reconstruction at the foundations of thought’. Catastrophe may therefore lurk even in the most innocent-seeming of paradoxes and force us to recognize the arbitrary nature ‘of a buried premise or of some preconception previously reckoned as central to physical theory, to mathematics or to the thinking process’. More recently, ideas of this kind have been used to assert in general that the role of paradox in the growth of knowledge is to generate category switches and to construct new universes of discourse. It could also be argued that paradoxes constitute elements of the logic of discovery capable of amplifying and thus making intelligible the still obscure phases accompanying paradigm shifts. Paradoxes would thus represent points of accumulation for the tension created between a given system of conceptual representation and a theory or empirical discovery that violates it.
Historians of science have tried different interpretative approaches in their attempts to unravel the thought process that was to lead Einstein in 1905 to the formulation of special relativity and the simultaneous demolition of the classical conception of space and time.
‘Sprache und Wirklichkeit in der modernen Physik’ is the title of a lecture Heisenberg gave in 1960 at the Bayerische Akademie der Schönen Künste. The polemics that had accompanied the establishment of quantum mechanics were over by then and the so-called problems of its foundations had lost much of their initial interest. Einstein had been dead for some years. Right up to the end he had expressed his profound dissatisfaction with a theory which, in his opinion, had something unreasonable about it. With the volume dedicated to him by the community of physicists and philosophers, the long debate between Bohr and Einstein had come to an end. It had been a disappointing conclusion, marked by the reaffirmation of the respective viewpoints of two scientists who by now found great difficulty even in defining a common code of communication. As though seeking to narrow the gap between them, Heisenberg took the opportunity to underline the existence of a common feature in their contributions to 20th century physics. Their discoveries had in fact made it possible to recognize that ‘even the fundamental and most elementary concepts of science, such as space, time, place, velocity, have become problematic and must be re-examined’. The conceptual and cognitive implications of relativity and of quantum mechanics were obviously different but Heisenberg maintained that both theories asked the same question: ‘Does the language we use when we speak of experiments correspond to the artificial language of mathematics which, as we know, describes real relationships correctly; or has it become separated from it so that we must be content with imprecise linguistic formulations and only return to the artificial language of mathematics when we are forced to express ourselves with precision?’.
As is already clear from the preceding historical sketch of the development of foundational problems in statistical mechanics, the concept of probability is invoked repeatedly in the important discussions of foundational issues. It is used informally in the dialectic designed to reconcile the time-asymmetry of statistical mechanics with the time-reversibility of the underlying dynamics, although, as we have seen, its introduction cannot by itself resolve that dilemma. Again, informally, it is used to account for the existence of equilibrium as the macro-state corresponding to the “overwhelmingly most probable” micro-states, and to account for the approach to equilibrium as the evolution of micro-states from the less to the more probable. More formally, the attempts at finding an acceptable derivation of Boltzmann-like kinetic equations all rest ultimately on attempts to derive, in some sense, a dynamical evolution of a “probability distribution” over the micro-states compatible with the initial macro-constraints on the system. The picturesque notion of the ensemble, invoked in the later work of Maxwell and Boltzmann and made the core of the Gibbs presentation of statistical mechanics, really amounts to the positing of a probability distribution over the micro-states of a system compatible with its macro-constitution, and a study of the changes of such a distribution over time as determined by the underlying dynamics.
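To fix ideas, the ensemble notion admits a standard textbook formulation (the notation below is illustrative and not drawn from the text itself): a probability distribution over the micro-states compatible with the system's macroscopic constitution, the simplest case being the micro-canonical distribution, which weights equally all micro-states of the given energy, together with Boltzmann's relation between entropy and the number of such states.

\[
\rho(x) \;=\;
\begin{cases}
1/W(E) & \text{if } H(x) = E,\\
0 & \text{otherwise,}
\end{cases}
\qquad
S \;=\; k_{\mathrm{B}} \ln W(E),
\]

Here \(x\) ranges over the micro-states, \(H\) is the Hamiltonian fixed by the underlying dynamics, and \(W(E)\) is the number (or measure) of micro-states compatible with the macro-constraints; equilibrium then appears as the macro-state to which the overwhelming bulk of this probability is assigned.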
Positivist versus derivational models of reduction
The progress of science is marked by the continual success of attempts to unify a greater and greater range of phenomena in more and more comprehensive theoretical schemes. This unifying process often takes the form of a theory that encompasses the phenomena in one domain of experience being “reduced” to some other theory, the full range of phenomena handled by the reduced theory now being handled by the reducing theory. And the reducing theory continues to do justice to the phenomena for which it was originally designed as well.
Examples are manifold. We are told that Kepler's laws of planetary motion reduce to Newtonian mechanics, that Newtonian mechanics reduces to special relativity, and special relativity to general relativity. On the other hand, we are told that Newtonian mechanics reduces to quantum mechanics. The physical optical theory of light allegedly reduces to the theory of electromagnetism, a theory itself obtained in the search for the unifying account to which the earlier separate theories of electricity and magnetism were reduced. Nowadays we are told that the quantum theory of electromagnetism (quantum electrodynamics) reduces to the electro-weak quantum field theory and that there are hopes that this theory will reduce, along with the quantum field theoretic account of strong interactions, to a grand unified theory.
Let us review once more the basic components of Boltzmann's final account of the asymmetry of the world described by the Second Law of Thermodynamics. In this picture of statistical mechanics, we would expect to find an isolated system almost always at or near equilibrium. Excursions to states of very low entropy ought to be rare, and we should expect higher-entropy states both immediately before and immediately after such improbable excursions from the near-equilibrium condition. How can we reconcile this account of the probabilities of micro-states in the world with what we actually find? What we find is a universe apparently quite far from equilibrium. It seems to be approaching equilibrium in the future direction of time but, as far as we can tell, to have been ever further from equilibrium the further back we look into the past. In addition, this entropic asymmetry of the universe as a whole is matched by the parallelism of entropic increase of branch systems temporarily isolated from the main system.
Boltzmann, the reader will remember, offers a multi-faceted story about the world to reconcile his probability attributions with the observed facts. First, it is posited that the universe available to our inspection is only a tiny fragment of the whole universe.
At this point, it might be useful to take a retrospective look at some of the major questions posed throughout this book, asking ourselves to what extent the questions have been answered and to what extent important puzzling issues still remain to be resolved. The reader will not be surprised by now, I expect, to discover that it is the author's view that many of the most important questions still remain unanswered in very fundamental and important ways.
A reasonable understanding of the role played by probabilistic assertions in statistical mechanics requires, I believe, some version of an interpretation of probability that views it as frequency or proportion in the physical world. Although “subjectivist” or “logicist” interpretations certainly are suggested by the role played by symmetry principles and principles of indifference in generating the posited probabilities, and although, as we have seen, the probabilities can sometimes be generated out of the lawlike structure of the underlying dynamics alone, still an understanding of how these probabilities are then used to describe the evolution of systems and to explain that behavior requires that they be understood in the manner of proportions.
But, as we have also seen, this understanding of probability is fraught with subtle difficulties.
The aim of this chapter is to present an extremely abbreviated and selective historical survey of the development of thermodynamics, kinetic theory, and statistical mechanics. No attempt will be made to be comprehensive. Nor is the material presented in a manner that would suit historians of science. Neither issues of chronology and attribution nor the far more important questions of placing the specific scientific results in their broader scientific and cultural context are our primary concern. The goal is merely to present some important developments in the history of these theories in such a way as to provide a background or context that will facilitate our later conceptual exploration.
Many of the scientific results noted in this chapter will only be mentioned here and not referred to again. They are being noted simply to place other, more relevant, results in their historical context. Other aspects of the historical development — those dealing with the conceptual problems at the foundational level — will be treated in much greater detail, especially in Chapters 5 through 7. The reader ought not to be discouraged, then, if certain conceptual and foundational issues seem to be too sketchily treated in this chapter to be grasped with full clarity, nor upset that many questions posed in the historical context are left in abeyance.
What results might we want to obtain from a statistical mechanical theory of non-equilibrium? Even a brief itemization of our ability to describe the non-equilibrium situation in macroscopic terms will be sufficient to indicate the breadth and depth of the results we would like to underpin with a statistical, microscopic theory.
It is an experimental fact that many non-equilibrium states of matter (and radiation) can be characterized in terms of a small number of field parameters — that is, assignments of values of a physical quantity by means of a function from locations in the system to numerical values. In general, these parameters are derived from those we use to characterize the equilibrium situation by a kind of generalization. Thus, such kinematic and dynamical quantities as energy, pressure, and volume are carried over, except that an intensive quantity like pressure now becomes local pressure at a point. And such purely thermodynamic quantities as temperature and entropy are generalized to local temperature and local entropy production. So the macroscopic situations with which we expect to be able to deal, at least most straightforwardly, in statistical mechanics are those of systems that are either close to equilibrium or that, although far from equilibrium, are such that they can be described in terms borrowed from equilibrium theory in small enough spatio-temporal regions of them.
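As a purely illustrative sketch (standard notation, not the text's own), such a description amounts to specifying a few fields over the system together with a local entropy balance:

\[
T(\mathbf{x},t), \quad p(\mathbf{x},t), \quad u(\mathbf{x},t), \quad s(\mathbf{x},t),
\qquad
\frac{\partial s}{\partial t} + \nabla \cdot \mathbf{J}_s \;=\; \sigma(\mathbf{x},t) \;\ge\; 0,
\]

where \(T\), \(p\), \(u\), and \(s\) are the local temperature, pressure, energy density, and entropy density, \(\mathbf{J}_s\) is an entropy flux, and \(\sigma\) is the local entropy production. The "local equilibrium" idea is that these fields obey, point by point and region by region, the relations that hold globally at equilibrium.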
Autonomous equilibrium theory and its rationalization
From Maxwell's equilibrium distribution to the generalized micro-canonical ensemble
The existence of an equilibrium state, describable by a few macroscopic parameters whose values and interrelations do not change with time, is the most fundamental thermodynamic fact for which statistical mechanics must account. A full understanding of the equilibrium situation would require a demonstration within the context of the dynamical theory of non-equilibrium that the equilibrium state exists as the “attractor” to which the dynamics of non-equilibrium drives systems, and it is that approach to the problem we shall examine in Chapters 6 and 7.
Beginning with Maxwell's first derivation of the equilibrium velocity distribution for molecules of a gas, though, there have been approaches to deriving the equilibrium features of a system that at least minimize considerations of the underlying micro-dynamics and of the time evolution toward equilibrium that one hopes to show is driven by it. As we saw, Maxwell's first derivation of the equilibrium distribution relied solely on the assumption that the components of molecular velocity in three perpendicular directions were probabilistically independent, an assumption clearly recognized by Maxwell as “precarious.”
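In outline, and in modern notation rather than Maxwell's own, the argument runs as follows: if the velocity distribution factorizes into independent components and, by isotropy, depends only on the speed,

\[
f(v_x, v_y, v_z) \;=\; g(v_x)\,g(v_y)\,g(v_z) \;=\; \phi\!\left(v_x^2 + v_y^2 + v_z^2\right),
\]

then the only solution of this functional equation is a Gaussian in each component, and the equilibrium distribution takes the Maxwellian form

\[
f(\mathbf{v}) \;\propto\; \exp\!\left(-\frac{m v^2}{2kT}\right).
\]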
Boltzmann's first essay on equilibrium treated it from the general dynamical perspective, using the kinetic equation and the H-theorem in an attempt to show that the equilibrium distribution was the unique stationary distribution. Later, in the course of his probabilistic response to the earlier criticisms of the kinetic equation and the H-theorem, he developed a new method applicable to the case of the ideal gas.
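Schematically, and in later notation rather than Boltzmann's: the dynamical route studies the functional

\[
H(t) \;=\; \int f(\mathbf{v},t)\,\ln f(\mathbf{v},t)\,d^3v ,
\]

arguing from the kinetic equation that \(dH/dt \le 0\), with equality only for the Maxwellian distribution. The later probabilistic method for the ideal gas instead counts the number of ways of distributing \(N\) molecules over energy cells with occupation numbers \(n_i\),

\[
W \;=\; \frac{N!}{\prod_i n_i!},
\]

and identifies equilibrium with the occupation numbers that maximize \(W\) subject to fixed total particle number and energy.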
There are four fundamental theories that constitute, at present, the foundational pillars of our physical theory of the world: general relativity, quantum mechanics, the theory of elementary particles, and statistical mechanics. Physics is of course far from a finished discipline, and each of these fundamental theories presents its own budget of scientific and philosophical problems. But the kinds of problems faced by those who would examine the so-called foundational issues in these areas vary in a marked and interesting way from theory to theory.
General relativity — at present the most plausible theory of the structure of space-time, a domain of inquiry that, since Einstein, is usually taken to include the theory of gravitation — is in many ways the most fully worked out of the theories. Many fascinating scientific questions remain: Should we accept general relativity, or some alternative to it like the Dicke-Brans scalar-tensor theory? Are there generalizations of the theory that might encompass other forms of interaction over and above gravitation? Should the allowable worlds be restricted by some conditions of causal “niceness,” for example? But we are, at least, clear about what the theory itself amounts to. Scientifically, the most dubious aspect of the theory is its totally classical nature, and all expect that some day we will have a new quantized theory to take its place.
The late nineteenth century saw two attempts in science to invoke the conditions of the universe in the large to explain phenomena that had previously been thought to be purely “local” in their nature and hence explainable by reference to only nearby facts about the world.
One of these novel approaches to explanation was Mach's introduction of the overall structure of the universe as a component of his proposed solution to the problem of absolute acceleration in Newtonian dynamics. Newton had accounted for the difference between inertial and non-inertial motion (a physical difference revealed by the presence, in the latter case, of effects due to the so-called inertial forces, effects absent in inertial motion) by positing that genuine accelerations were accelerations relative to “space itself.” Later, related theories posit as the reference structure, relative to which motion counts as absolutely accelerated, the inertial frames of neo-Newtonian space-time, the Minkowski space-time of special relativity, or, in the case of general relativity, the local time-like geodesic structure of space-time. Mach found the postulation of “space itself” methodologically illegitimate, and sought an explanation of the asymmetry between inertial and non-inertial motion in a theory that would invoke only the relative motions of material objects with respect to one another in its explanatory account.
The aim of this work is to continue the exploration of the foundational questions concerning the physical theory that underpins our general theory of thermodynamics and that, for the first time, introduced probabilistic considerations into the fundamentals of our physical description of the world. That physical theory is statistical mechanics.
The history of this foundational quest is a long one. It begins with an intense examination of the premises of the theory at the hands of James Clerk Maxwell, Ludwig Boltzmann, and their brilliant critics. It continues in the work of Hans Reichenbach, who introduced their deep questions to the philosophical community. And the quest has persisted as a set of difficult conceptual challenges, in a study made ever richer by the development of ever more sophisticated technical resources with which to treat the problems. I hope that this book will encourage others in the philosophical community to join with those in physics who continue to look for the elusive resolutions of the puzzles.
I have benefited enormously from discussions with John Earman, who has often guided me to the crucial questions to be addressed. James Joyce and Robert Batterman, as students and as colleagues, have been enormously helpful to me in my thinking about these issues.