The transition from the atomic core model to the electron's spin, the debate on the adequacy of semi-classical models, and the lack of an appropriate scientific terminology are symptomatic of the revolutionary transition from the old quantum theory to the new quantum theory around 1921–5. As such, they provide us with a foil for rethinking Kuhn's view on scientific revolutions. In this view, scientific revolutions are distinctively accompanied by incommensurability between paradigms, or – to use Kuhn's later terminology – untranslatability between scientific lexicons. In the light of the historical reconstruction offered in Chapter 2, I shall here argue for the prospective intelligibility of the revolutionary transition around 1924 via a two-step argument that (1) reconsiders Kuhn's notion of incommensurability as untranslatability (Section 3.2), and (2) offers a positive account of the way the electron's Zweideutigkeit and the exclusion rule came out of the old quantum theory (Section 3.3).
The revolutionary transition from the old quantum theory to the new quantum theory
On 5 November 1980, Thomas Kuhn delivered a lecture at Harvard University entitled ‘The crisis of the old quantum theory: 1922–25’. In his distinctive style of reasoning, Kuhn presented the rise of quantum mechanics after 1925 as the result of a period of crisis of the old quantum theory between 1922 and 1925. The old quantum theory – in Kuhn's view – cannot be regarded as a full-blown theory but rather as a set of algorithms to solve problems and paradoxes.
The history of the exclusion principle is already an old one, but its conclusion has not yet been written.
It is now eighty years since Wolfgang Pauli introduced an ‘extremely natural’ prescriptive rule, while dealing with some spectroscopic anomalies that beset physicists in the heyday of the old quantum theory. The rule excluded the possibility that any two bound electrons in an atom were in the same dynamic state, identified by a set of four quantum numbers. Hence the name of Ausschließungsregel (exclusion rule), or Pauli's Verbot (Pauli's veto) as Werner Heisenberg nicknamed it. The far-reaching physical significance of this rule became clear only later.
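The content of the rule can be illustrated with a minimal sketch. It assumes the modern textbook labelling of the four quantum numbers as (n, l, m_l, m_s), which is not the labelling Pauli himself used in 1924; the function names `shell_states` and `obeys_exclusion` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def shell_states(n):
    """List every (n, l, m_l, m_s) label available for principal quantum number n.

    Assumption: modern textbook quantum numbers, with l = 0..n-1,
    m_l = -l..l and m_s = -1/2 or +1/2.
    """
    half = Fraction(1, 2)
    return [
        (n, l, m_l, m_s)
        for l in range(n)
        for m_l in range(-l, l + 1)
        for m_s in (-half, half)
    ]


def obeys_exclusion(electrons):
    """Return True if no two electrons carry the same set of four quantum numbers."""
    return len(set(electrons)) == len(electrons)


# Counting the distinct labels gives the maximum occupancy of each shell.
capacities = [len(shell_states(n)) for n in (1, 2, 3)]
print(capacities)  # [2, 8, 18]
```

Under these assumptions, the shell capacities 2, 8 and 18 follow from nothing more than counting distinct sets of four quantum numbers, which is the sense in which the rule constrains electronic configurations.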
From spectroscopy to atomic physics, from quantum field theory to high-energy physics, there is hardly another scientific principle that has more far-reaching implications than Pauli's exclusion principle. It is thanks to Pauli's principle that one obtains the electronic configurations underlying the classification of chemical elements in Mendeleev's periodic table as well as atomic spectra. To this same principle we credit the statistical behaviour of any half-integral spin particles (protons, neutrons, among many others) and the stability of matter. Shifting to high-energy physics, it is the exclusion principle that fixes the crucial constraint for binding quarks in hadrons, which together with leptons compose our physical world.
This book advances a philosophical analysis of the enduring and far-reaching validity of Pauli's principle. It does not aim to address what a scientific principle is.
The aim of this book was to investigate the rationale for Pauli's principle, and the conditions under which we are justified in regarding a phenomenological and contingent rule as an important scientific principle. To this purpose, I urged a Kantian perspective. As a conclusion, I want to foreshadow how this perspective relates to contemporary discussions about images of science, and how it bears upon the on-going debate on scientific realism in philosophy of science. I shall barely scratch the surface of this complex topic and its vast literature. The best I can do in these concluding remarks is to suggest a philosophical position that I think deserves to be further explored. What follows should then be read more as an outline for future research, than as a conclusion.
I began this book by reconstructing the origins of the exclusion principle as a phenomenological rule arising from the spectroscopic research of the 1920s. I argued that the rule was derived – in conjunction with the concept of the electron's Zweideutigkeit – from spectroscopic phenomena with the help of some theoretical assumptions, in the period of revolutionary transition from the old to the new quantum theory. I defended the prospective intelligibility of this revolutionary transition in the light of the piecemeal process of transformation of the old quantum theory, from which brand new scientific concepts and nomic generalizations followed.
The exclusion principle was the final outcome of Pauli's struggle to understand some spectroscopic anomalies in the early 1920s: doublets were observed in the spectra of alkali metals, singlets and triplets in the spectra of the alkaline earths, and even more anomalous patterns were observed when chemical elements were placed in an external magnetic field (anomalous Zeeman effect and Paschen–Back effect). These anomalous spectra challenged the old quantum theory, and prompted a radical theoretical change (Section 2.1). From 1920 to 1924 Alfred Landé, Werner Heisenberg, and Niels Bohr were all engaged in trying to save the traditional spectroscopic model (the so-called atomic core model) and to reconcile it with the observed anomalies. The impasse was solved only with Pauli's introduction of a fourth degree of freedom for the electron, and the consequent demise of the atomic core model (Section 2.2). What Pauli called the ‘twofoldness’ [Zweideutigkeit] of the electron's angular momentum was soon reinterpreted as the electron's spin (Section 2.3). Pauli's exclusion rule was announced in this semi-classical spectroscopic context that characterized the revolutionary transition from the old quantum theory to the new quantum theory around 1925.
The prehistory of Pauli's exclusion principle
Atomic spectra and the Bohr–Sommerfeld theory of atomic structure
The existence of spectral lines had been known to scientists since the beginning of the nineteenth century when Wollaston and Fraunhofer first observed the dark absorption lines in the spectrum of the Sun.
This book is the result of almost ten years of research. It has accompanied me through an intense period of my life, from the end of my undergraduate studies in Rome across the years of my Ph.D. in London until my current Research Fellowship at Girton College (University of Cambridge, UK). I have grown with it, and with it, I have come to develop my philosophical ideas. Looking back, I can see the way they have evolved and focussed; how they came to be refined, and sometimes revised. I owe intellectual debts to many people who in various ways have contributed to the development of my ideas over this span.
My original intention of studying the exclusion principle dates back to 1996. At that time I was an undergraduate student in Rome, very keen on philosophy of science and history of modern physics. Reading Pauli's scientific correspondence, I was struck by a passage of a letter to Landé in which the famous exclusion principle was introduced as an ‘extremely natural rule’. It may have appeared ‘extremely natural’ to Pauli, but to me the overall manoeuvre seemed mysterious and intriguing. I could not help plunging into the details of this fascinating historical episode. I owe an old debt to my teachers Silvano Tagliagambe, who hooked me on philosophy of science, and Sandro Petruccioli, who encouraged me to consider Wolfgang Pauli as a possible research topic.
During the stimulating years of my Ph.D. at the London School of Economics, my research project received a new twist.
This chapter sets the scene for the philosophical analysis of the exclusion principle that I shall carry out in this book. What is the role and function of a scientific principle? Whence does it derive its accreditation and nomological strength? In the philosophical literature on scientific principles, different answers have been given to these questions, from Poincaré's conventionalism to Reichenbach's analysis of coordinating principles. More recently, Michael Friedman has latched onto the latter tradition to defend ‘relativized a priori principles’ as principles that are subject to revision during scientific revolutions, but at the same time maintain a constitutively a priori role within a theoretical framework. This is germane to a reinterpretation of Kant's notion of ‘a priori’, whose purpose is to make a Kantian approach to scientific principles compatible with scientific revolutions and modern scientific developments; whence a resultant ‘dynamic Kantianism’. I shall endorse a suitable version of dynamic Kantianism to investigate the nature of the exclusion principle as playing a ‘regulative’ rather than a ‘constitutive’ function. The regulative/constitutive distinction has a distinguished philosophical pedigree in Kant and in the neo-Kantian tradition of Ernst Cassirer, as I shall spell out in Section 1.4.
Introduction
In a letter to Alfred Landé on 24 November 1924, Wolfgang Pauli announced an ‘extremely natural prescriptive rule’ that could shed light on some puzzling spectroscopic phenomena he had dealt with in the past three years. The foundation of the rule remained an open question.
Shortly after their introduction, the electron's Zweideutigkeit and Pauli's Ausschliessungsregel were embedded into a growing theoretical framework: from Fermi–Dirac statistics in 1926 (Section 4.2), to the non-relativistic quantum mechanics of the magnetic electron with Pauli's spin matrices in 1927 (Section 4.3); from Wigner and von Neumann's group theoretical derivation of the spin matrices in 1927 (Section 4.4), to Jordan's reinterpretation of Pauli's exclusion principle in terms of anticommutation relations for particle creation and annihilation operators in 1928 (Section 4.5), which paved the way for quantum field theory. The most important step in building up this theoretical framework was the transition from non-relativistic to relativistic quantum mechanics, with Dirac's equation for the electron in 1928 (Section 4.6), which finally allowed the derivation of the electron's spin with its anomalous magnetic moment and hence clarified in a conclusive way the origin of spectroscopic anomalies. The negative energy solutions of the Dirac equation, and Dirac's attempt to accommodate them via the hole theory in 1930, anticipated the experimental discovery of the antiparticle of the electron, the positron. Sections 4.7–4.9 reconstruct Pauli's sustained criticism of Dirac's hole theory. The history of this debate is intertwined with Pauli's search for a spin–statistics connection, which finally culminated in the spin–statistics theorem in 1940. With this theorem, the nomological shift from the status of a phenomenological rule to that of a fundamental scientific principle is finally completed.
Introduction
The year 1925 was the annus mirabilis for quantum mechanics.
Quantum mechanics is our most successful physical theory. It underlies our very detailed understanding of atomic physics, chemistry, and nuclear physics, and the many technologies to which physical systems in these regimes give rise. Additionally, relativistic quantum mechanics is the basis for the standard model of elementary particles, which very successfully gives a partial unification of the forces operating at the atomic, nuclear, and subnuclear levels.
However, from its inception the probabilistic nature of quantum mechanics, and the fact that “quantum measurements” in the orthodox formulation appear to require the intervention of non-quantum mechanical “classical systems,” have led to speculations by many physicists, mathematicians, and philosophers of science that quantum mechanics may be incomplete. Among the Founding Fathers of quantum theory, Einstein and Schrödinger were both of the opinion that quantum mechanics is in some way unsatisfactory, and this view has been amplified in more recent profound work of John Bell, among others. In an opposing camp, many others in the physics, mathematics, and philosophy communities have attempted to provide an interpretational foundation in which quantum mechanics remains a complete and self-contained system. Among the Founding Fathers, Bohr, Born, and Heisenberg maintained that quantum mechanics is a complete system, and a number of recent proposals have been made to improve upon or to provide alternatives to their “Copenhagen Interpretation.” The debate continues, and has spawned an enormous literature.
In this chapter we set up a classical Lagrangian and Hamiltonian dynamics for matrix models. The fundamental idea is to set up an analog of classical dynamics in which the phase space variables are non-commutative, and the basic tool that allows one to accomplish this is cyclic invariance under a trace. Since no assumptions about commutativity of the phase space variables (such as canonical commutators/anticommutators) are made at this stage, the dynamics that we set up is not the same as standard quantum mechanics. Quantum mechanical behavior will be seen to emerge only when, in Chapters 4 and 5, we study the statistical mechanics of the classical matrix dynamics formulated here.
In Section 1.1, we introduce our basic notation for bosonic and fermionic matrices, and give the cyclic identities that will be used repeatedly throughout the book. In Section 1.2, we define the derivative of a trace quantity with respect to an operator, and give the basic properties of this definition. In Section 1.3, we use the operator derivative to formulate a Lagrangian and Hamiltonian dynamics for matrix models. In Section 1.4, we introduce a generalized Poisson bracket appropriate to trace dynamics, constructed from the operator derivative defined in Section 1.2, and give its properties and some applications. Finally, in Section 1.5 we discuss the relation between the trace dynamics time evolution equations, and the usual unitary Heisenberg picture equations of motion obtained when one assumes standard canonical commutators/anticommutators.
Bosonic and fermionic matrices and the cyclic trace identities
We shall assume finite-dimensional matrices, although ultimately an extension to the infinite-dimensional case may be needed. The matrix elements of these matrices will be constructed from ordinary complex numbers, and from complex anticommuting Grassmann numbers.
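The role of cyclic invariance under a trace can be checked numerically with a minimal sketch. It assumes small matrices whose entries are ordinary complex numbers only; the anticommuting Grassmann entries mentioned above are not modelled here, and the helper names `matmul`, `trace` and `random_matrix` are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))


def random_matrix(n, rng):
    """An n x n matrix with complex entries drawn from a seeded generator."""
    return [
        [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
        for _ in range(n)
    ]


rng = random.Random(0)
A, B, C = (random_matrix(3, rng) for _ in range(3))

# Cyclic permutations of the product leave the trace unchanged.
t_abc = trace(matmul(matmul(A, B), C))
t_bca = trace(matmul(matmul(B, C), A))
t_cab = trace(matmul(matmul(C, A), B))
cyclic_ok = abs(t_abc - t_bca) < 1e-12 and abs(t_abc - t_cab) < 1e-12

# The matrices themselves do not commute: AB and BA differ entry by entry.
noncommuting = any(
    abs(x - y) > 1e-9
    for rows in zip(matmul(A, B), matmul(B, A))
    for x, y in zip(*rows)
)
print(cyclic_ok, noncommuting)
```

The point of the sketch is that the trace remains invariant under cyclic reordering even though the underlying variables are non-commutative, which is the property the text identifies as the basic tool of the construction.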
In Section 2.4, we illustrated the trace dynamics formalism by constructing the trace dynamics analog of a simple field theory model, in which a Dirac fermion interacts with a scalar Klein–Gordon field. Much of the recent literature in quantum field theory has concerned itself with supersymmetric theories, in which invariance under the Poincaré group has been extended to invariance under the graded Poincaré group, and theories of this type are considered likely to play a central role in the ultimate unification of the forces. Our aim in this chapter (which can be omitted on a first reading) is to show that the trace dynamics formalism naturally extends to globally supersymmetric theories. Specifically, we shall see that, when there is a global supersymmetry, there is a conserved trace supersymmetry current with a time-independent trace supercharge Qα, that together with the trace four momentum obeys the Poincaré supersymmetry algebra under the generalized Poisson bracket of Eq. (1.11a). We shall illustrate this statement with three concrete examples, the trace dynamics versions (Adler 1997a,b) of the Wess–Zumino model (Section 3.1), the supersymmetric Yang–Mills model (Section 3.2), and the so-called “matrix model for M theory” (Section 3.3). These three examples are worked out using component field methods; we close in Section 3.4 with a short discussion of a superspace approach, and of the obstruction that prevents the construction of a trace dynamics theory with local supersymmetry.
The Wess–Zumino model
We begin with the trace dynamics transcription of the Wess–Zumino model.