Physics is a logical activity, which unlike some other intellectual pursuits frowns on radical departures, progressing by the introduction of elegant ideas which give a better basis for what we already know while leading to new results. However, this inevitably means that the subject at a fundamental level is in a constant state of reinterpretation. Also, it is often not easy to see how old concepts fit into a new framework. A prime example is the concept of mass, which has traditionally been regarded as the source of the gravitational field. Historically, a source and its field have been viewed as separate things. But as recognized by a number of workers through time, this distinction is artificial and leads to significant technical problems. Our most successful theory of gravity is general relativity, which traditionally has been formulated in terms of a set of field equations whose left-hand side is geometrical (the Einstein tensor) and whose right-hand side is material (the energy-momentum tensor). However, Einstein himself realized soon after the formulation of general relativity that this split has drawbacks, and for many years looked for a way to transpose the "base-wood" of the right-hand side of his equations into the "marble" of the left-hand side. Building on ideas of Kaluza and Klein, it has recently become feasible to realize Einstein's dream, and the present volume is mainly a collection of technical results, which shows how this can be done. The basic idea is to unify the source and its field using the rich algebra of higher-dimensional Riemannian geometry. In other words: space, time and matter become parts of geometry.
This is an idea many workers would espouse, but to be something more than an academic jaunt we have to recall the two conditions noted above. Namely, we have to recover what we already know (with an unavoidable need for reinterpretation); and we have to derive something new with at least a prospect of testability. The present chapter is concerned with the first of these, and the succeeding chapters mainly with the second. Thus the present chapter is primarily a review of gravitation and particle physics as we presently understand these subjects. Since this is mainly known material, these accounts will be kept brief, and indeed those readers who are familiar with these subjects may wish to skim through them. However, there is a theme in the present chapter which transcends the division of physics into theories of macroscopic and microscopic scope. This is the nature and origin of the so-called fundamental constants. These are commonly taken as indicators of what kind of theory is under consideration (e.g., Newton's constant is commonly regarded as typical of classical theory and Planck's constant as typical of quantum theory). But at least one fundamental constant, the speed of light, runs through all modern physical theories; and we cannot expect to reach a meaningful unification of the latter without a proper understanding of where the fundamental constants originate. In fact, the chapters after this one make use of a view that needs to be established here, and which may be unfamiliar to some workers: the fundamental constants are not really fundamental, their main purpose being to enable us to dimensionally transpose certain material quantities so that we can write down consistent laws of physics.
A lot has been written on the fundamental constants, and there is a large literature on unsuccessful searches for their possible variations in time and space. We will be mainly concerned with their origin and status, on which several reviews are available. Notably there are the books by Wesson (1978), Petley (1985) and Barrow and Tipler (1986); the conference proceedings edited by McCrea and Rees (1983); and the articles by Barrow (1981) and Wesson (1992). We will presume a working physicist's knowledge of the constants concerned; the purpose of the present section is to provide a basis for the discussions of physical theory which follow.
The so-called fundamental constants are widely regarded as a kind of distillation of physics. Their dimensions are related to the forms of physical laws, whose structure can in many cases be recovered from the constants by dimensional analysis. Their sizes for some choice of units allow the physical laws to be evaluated and compared to observation. Despite their perceived fundamental nature, however, there is no theory of the constants as such. For example, there is no generally accepted formalism that tells us how the constants originate, how they relate to one another, or how many of them are necessary to describe physics. This lack of background seems odd for parameters that are widely regarded as basic.
The constants we will be primarily concerned with are those that figure in gravity and particle physics. It is convenient to collect the main ones here, along with their dimensions and approximate sizes in c.g.s. units:
Speed of light | c | L T⁻¹ | 3.0×10¹⁰
Gravitational constant | G | M⁻¹ L³ T⁻² | 6.7×10⁻⁸
Planck's constant | h | M L² T⁻¹ | 6.6×10⁻²⁷
Electron charge (modulus) | e | M^(1/2) L^(3/2) T⁻¹ | 4.8×10⁻¹⁰
Here e is measured in electrostatic or Gaussian units. We will use e.s.u. in the bulk of what follows, though S.I. will be found useful in places. The two systems of units are related through the factor 4πε0, where the permittivity of free space is ε0 = 8.9×10⁻¹² C² s² kg⁻¹ m⁻³. In S.I., e = 1.6×10⁻¹⁹ C (Coulombs: see Jackson 1975, pp. 29, 817; and Griffiths 1987, p. 9). The permeability of free space μ0 is not an independent constant because c² ≡ 1/ε0μ0. The above table suggests that we need to understand 3 overlapping things: constants, dimensions and units.
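As a quick check of these unit relations, the conversions can be sketched as follows (the numerical constants here are assumed CODATA-style values, not taken from the table above):

```python
# Verify c^2 = 1/(eps0*mu0) and the e.s.u. <-> S.I. charge conversion.
# Constants below are assumed CODATA-style values (an illustration only).
import math

eps0 = 8.854e-12          # permittivity of free space, C^2 s^2 kg^-1 m^-3
mu0 = 4 * math.pi * 1e-7  # permeability of free space, S.I.

# mu0 is not independent: c = 1/sqrt(eps0*mu0)
c_si = 1.0 / math.sqrt(eps0 * mu0)
print(c_si)               # ~3.0e8 m/s

# 1 C corresponds to ~2.998e9 e.s.u., so the electron charge converts as
e_si = 1.6e-19            # C
e_esu = e_si * 2.998e9    # ~4.8e-10 e.s.u., matching the table
print(e_esu)
```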
One common view of the constants is that they define asymptotic states. Thus c is the maximum velocity of a massive particle moving in flat spacetime; G defines the limiting potential for a mass that does not form a black hole in curved spacetime; ε0 is the empty-space or vacuum limit of the dielectric constant; and h defines a minimum amount of energy (alternatively ħ ≡ h/2π defines a minimum amount of angular momentum). This view is acceptable, but somewhat begs the question of the constants' origin.
Another view is that the constants are necessary inventions. Thus if a photon moves away from an origin and attains distance r in time t, it is necessary to write r = ct as a way of reconciling the different natures of space and time. Or, if a test particle of mass m₁ moves under the gravitational attraction of another mass m₂ and its acceleration is d²r/dt² at separation r, it is observed that m₁d²r/dt² is proportional to m₁m₂/r², and to get an equation out of this it is necessary to write d²r/dt² = Gm₂/r² as a way of reconciling the different natures of mass, space and time. A similar argument applies to the motion of charged bodies and ε0. In quantum theory, the energy E of a photon is directly related to its frequency ν, so we necessarily have to write E = hν. The point is that, given a law of physics which relates quantities of different dimensional types, constants with the dimensions c = LT⁻¹, G = M⁻¹L³T⁻², ε0 = Q²M⁻¹L⁻³T² and h = ML²T⁻¹ are obligatory.
This view of the constants is logical, but disturbing to many because it means they are not really fundamental and in fact largely subjective in origin. However, it automatically answers the question raised in the early days of dimensional analysis as to why the equations of physics are dimensionally homogeneous (e.g. Bridgman 1922). It also explains why subsequent attempts to formalize the constants using approaches such as group theory have led to nothing new physically (e.g. Bunge 1971). There have also been notable adherents of the view that the fundamental constants are not what they appear to be. Eddington (1929, 1935, 1939) put forward the opinion that while an external world exists, our laws are subjective in the sense that they are constructed to match our own physical and mental modes of perception. Though he was severely criticized for this opinion by physicists and philosophers alike, recent advances in particle physics and relativity make it more palatable now than before. Jeffreys (1973, pp. 87-94, 97) did not see eye to eye with Eddington about the sizes of the fundamental constants, but did regard some of them as disposable. In particular, he pointed out that in electrodynamics c merely measures the ratio of electrostatic and electromagnetic units for charge. Hoyle and Narlikar (1974, pp. 97, 98) argued that the c² in the common relativistic expression (c²t² − x² − y² − z²) should not be there, because "there is no more logical reason for using a different time unit than there would be for measuring x, y, z in different units". They stated that the velocity of light is unity, and its size in other units is equivalent to the definition 1 s = 299 792 500 m, where the latter number is man-made. McCrea (1986, p. 150) promulgated an opinion that is exactly in line with what was discussed above, notably in regard to c, h and G, which he regarded as "conversion constants and nothing more".
These comments show that there is a case that can be made for removing the fundamental constants from physics.
Absorbing constants in the equations of physics has become commonplace in recent years, particularly in relativity where the algebra is usually so heavy that it is undesirable to encumber it with unnecessary symbols. Formally, the rules for carrying this out in a consistent fashion are well known (see e.g. Desloge 1984). Notably, if there are N constants with N bases, and the determinant of the exponents of the constants' dimensions is nonzero so they are independent, then their magnitudes can be set to unity. For the constants c, G, ε0, h with bases M, L, T, Q it is obvious that ε0 and Q can be removed this way. (Setting ε0 = 1 gives Heaviside-Lorentz units, which are not the same as setting 4πε0 = 1 for Gaussian units, but the principle is clearly the same: see Griffiths, 1987, p. 9.) The determinant of the remaining dimensional combinations M⁰L¹T⁻¹, M⁻¹L³T⁻², M¹L²T⁻¹ is finite, so the other constants c, G, h can be set to unity. Conceptually, the absorbing of constants in this way prompts 3 comments. (a) There is an overlap and ambiguity between the idea of a base dimension and the idea of a unit. All of mechanics can be expressed with dimensional bases M, L, T; and we have argued above that these originate because of our perceptions of mass, length and time as being different things. We could replace one or more of these by another base (e.g. in engineering force is sometimes used as a base), but there will still be 3. If we extend mechanics to include electrodynamics, we need to add a new base Q. But the principle is clear, namely that the base dimensions reflect the nature and extent of physical theory. In contrast, the idea of a unit is less conceptual but more practical. We will discuss units in more detail below, but for now we point up the distinction by noting that a constant can have different sizes depending on the choice of units while retaining the same dimensions. (b) The process of absorbing constants cannot be carried arbitrarily far.
For example, we cannot set e = 1, ħ = 1 and c = 1, because this would make the electromagnetic fine-structure constant α ≡ e²/ħc equal to 1, whereas in the real world it is observed to be approximately 1/137. This value actually has to do with the peculiar status of e compared to the other constants (see below), but the caution is well taken. (c) Constants mutate with time. For example, the local acceleration of gravity g was apparently at one time viewed as a `fundamental' constant, because it is very nearly the same at all places on the Earth's surface. But today we know that g = GME/rE² in terms of the mass ME and radius rE of the Earth, thus redefining g in more basic terms. Another example is that the gravitational coupling constant in general relativity is not really G but the combination 8πG/c⁴ (Section 1.3), and more examples are forthcoming from particle physics (Section 1.4). The point of this and the preceding comments is that where the fundamental constants are concerned, formalism is inferior to understanding.
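Desloge's independence test mentioned above can be sketched numerically: the constants can all be set to unity precisely when the matrix of their dimensional exponents is nonsingular.

```python
# Independence test for absorbing constants (a sketch of the rule cited
# from Desloge 1984). Rows: c, G, h; columns: exponents of M, L, T,
# taken from the table in Section 1.2.
import numpy as np

exponents = np.array([
    [ 0, 1, -1],   # c = M^0 L^1 T^-1
    [-1, 3, -2],   # G = M^-1 L^3 T^-2
    [ 1, 2, -1],   # h = M^1 L^2 T^-1
])
det = np.linalg.det(exponents)
print(det)  # nonzero, so c, G and h are independent and can all be set to 1
```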
To gain more insight, let us discuss in greater detail the relation between base dimensions and units, concentrating on the latter. There are 7 base dimensions in widespread use (Petley 1985, pp. 26-29). Of these 3 are the familiar M, L, T of mechanics. Then electric current is used in place of Q. And the other 3 are temperature, luminous intensity and amount of substance (mole). As noted above, we can swap dimensional bases if we wish as a matter of convenience, but the status of physics fixes their number. By contrast, choices of units are infinite in number. At present there is a propensity to use the S.I. system (Smith 1983). While not enamoured by workers in astrophysics and certain other disciplines because of the awkwardness of the ensuing numbers, it is in widespread use for laboratory-based physics. The latter requires well-defined and reproducible standards, and it is relevant to review here the status of our basic units of time, length and mass.
The second in S.I. units is defined as 9 192 631 770 periods of a microwave oscillator running under well-defined conditions and tuned to maximize the transition rate between two hyperfine levels in the ground state of atoms of 133Cs moving without collisions in a near vacuum. This is a fairly sophisticated definition, which is used because the caesium clock has a long-term stability of 1 part in 10¹⁴ and an accuracy of reproducibility of 1 part in 10¹³. These specifications are better than those of any other apparatus, though in principle a water clock would serve the same purpose. So much for a unit of time. The metre was originally defined as the distance between two scratch marks on a bar of metal kept in Paris. But it was redefined in 1960 to be 1 650 763.73 wavelengths of one of the orange-red lines in the spectrum of a 86Kr lamp running under certain well-defined conditions. This standard, though, was defined before the invention of the laser with its high degree of stability, and is not so good. A better definition of the metre can be made as the distance traveled by light in vacuum in a time of 1/299 792 458 (caesium clock) seconds. Thus we see that a unit of length can be defined either autonomously or in conjunction with the speed of light. The kilogram started as a lump of metal in Paris, but unlike its compatriot the metre continued in use in the form of carefully weighed copies. This was because Avogadro's number, which gives the number of atoms in a mass of material equal to the atomic weight in grams, was not known by traditional means to very high precision. However, it is possible to obtain a better definition of the kilogram in terms of Avogadro's number derived from the lattice spacing of a pure crystal of a material like 28Si, where the spacing can be determined by X-ray diffraction and optical interference. Thus, a unit of mass can be defined either primitively or in terms of the mass of a crystal of known size.
We conclude that most accuracy can be achieved by defining a unit of time, and using this to define a unit of length, and then employing this to obtain a unit of mass. However, more direct definitions can be made for all of these quantities, and there is no reason as far as units are concerned why we should not absorb c, G and h.
This was actually realized by Planck, who noted that their base dimensions are such as to allow us to define `natural' units of mass, length and time. (See Barrow 1983: similar units were actually suggested by Stoney somewhat earlier; and some workers have preferred to absorb ħ rather than h.) The correspondence between natural or Planck units and the conventional gram, centimetre and second can be summarized as follows:
1 mp ≡ (ħc/G)^(1/2) = 2.2×10⁻⁵ g      1 g = 4.6×10⁴ mp
1 lp ≡ (Għ/c³)^(1/2) = 1.6×10⁻³³ cm      1 cm = 6.3×10³² lp
1 tp ≡ (Għ/c⁵)^(1/2) = 5.4×10⁻⁴⁴ s      1 s = 1.9×10⁴³ tp
In Planck units, all of the constants c, G and ħ become unity and they consequently disappear from the equations of physics.
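The correspondence can be verified directly (a sketch in c.g.s., with assumed CODATA-style values for ħ, c and G rather than values quoted in the text):

```python
# Evaluate the Planck units in c.g.s. from hbar, c and G.
import math

hbar = 1.055e-27   # erg s (assumed CODATA-style value)
c = 2.998e10       # cm/s
G = 6.674e-8       # cm^3 g^-1 s^-2

m_p = math.sqrt(hbar * c / G)      # Planck mass, g
l_p = math.sqrt(G * hbar / c**3)   # Planck length, cm
t_p = math.sqrt(G * hbar / c**5)   # Planck time, s

print(m_p, l_p, t_p)  # ~2.2e-5 g, ~1.6e-33 cm, ~5.4e-44 s
```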
This is convenient but it involves a choice of units only and does not necessarily imply anything more. It has often been stated that a consistent theory of quantum gravity that involves c, G and ħ would naturally produce particles of the Planck mass noted above. However, this is theoretically unjustified in view of what we have discussed; and the observation that the universe is not dominated by 10⁻⁵ g black holes argues the same way in practice. A more significant view is that all measurements and observations involve comparing one thing with another thing of similar type to produce what is ultimately a dimensionless number (see Dicke 1962; Bekenstein 1979; Barrow 1981; Smith 1983; Wesson 1992). The latter can have any value, and are the things that physics needs to explain. For example, the electromagnetic fine-structure constant α ≡ e²/ħc ≅ 1/137 needs to be explained, which is equivalent to saying that the electron charge needs to be explained (Griffiths 1987). The `gravitational fine-structure constant' Gmp²/ħc ≅ 5×10⁻³⁹ needs to be explained, which is equivalent to saying that the mass of the proton needs to be explained (Carr and Rees 1979). And along the same lines, we need to explain the constant involved in the observed correlation between the spin angular momenta and masses of astronomical objects, which is roughly GM²/Jc ≅ 1/300 (Wesson 1983). In other words, we get no more out of dimensional analysis and a choice of units than is already present in the underlying equations, and neither technique is a substitute for proper physics.
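The first two dimensionless numbers quoted above can be evaluated directly (c.g.s./e.s.u.; the constant values are assumed CODATA-style, not taken from the text):

```python
# The electromagnetic and gravitational fine-structure constants.
G = 6.674e-8          # cm^3 g^-1 s^-2 (assumed values)
c = 2.998e10          # cm/s
hbar = 1.055e-27      # erg s
e = 4.803e-10         # electron charge, e.s.u.
m_proton = 1.673e-24  # g

alpha_em = e**2 / (hbar * c)               # ~1/137
alpha_grav = G * m_proton**2 / (hbar * c)  # ~5e-39
print(1 / alpha_em, alpha_grav)
```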
The physics of explaining the charge of the electron or the mass of a proton, referred to above, probably lies in the future. However, some comments can be made now. As regards e, it is an observed fact that α is energy or distance-dependent. Equivalently, e is not a fundamental constant in the same class as c, h and G. The current explanation for this involves vacuum polarization, which effectively screens the charge of one particle as experienced by another (see Section 1.4). This mechanism is depressingly mechanical to some field theorists, and in attributing an active role to the vacuum would have been anathema to Einstein. [There are also alternative explanations for it, such as the influence of a scalar field, as discussed in Nodvik (1985) and Chapter 5.] However, the philosophy of trying to understand the electron charge, rather than just accepting it as a given, has undoubted merit. The same applies to the masses of the elementary particles, which however are unquantized and so present more of a challenge. The main question is not whether we wish to explain charges and masses, but rather what is the best approach.
In this regard, we note that both are geometrizable (Hoyle and Narlikar 1974; Wesson 1992). The rest mass of a particle m is the easiest to treat, since using G or h we can convert m to a length:
xm ≡ Gm/c²    or    xm ≡ h/mc .
Physically, the choice here would conventionally be described as one between gravitational or atomic units, a ploy which has been used in several theories that deal with the nature of mass (see Wesson 1978 for a review). Mathematically, the choice is one of coordinates, provided we absorb the constants and view mass as on the same footing as time and space (see Chapter 7). The electric charge of a particle q is harder to treat, since it can only be geometrized by including the gravitational constant via xq ≡ (G/c4)1/2 q. This, together with the trite but irrefutable fact that masses can carry charges but not the other way round, suggests that mass is more fundamental than charge.
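As a numerical illustration of the two mass-lengths and the charge-length (a sketch in c.g.s. with assumed CODATA-style constants, taking the proton as an example particle):

```python
# Geometrized lengths for a proton: gravitational (Gm/c^2), atomic (h/mc)
# and the charge-length (G/c^4)^(1/2) q. Values are assumed, for illustration.
import math

G = 6.674e-8   # cm^3 g^-1 s^-2
c = 2.998e10   # cm/s
h = 6.626e-27  # erg s
m = 1.673e-24  # proton mass, g
q = 4.803e-10  # proton charge, e.s.u.

x_grav = G * m / c**2               # ~1.2e-52 cm (gravitational units)
x_atom = h / (m * c)                # ~1.3e-13 cm (Compton length, atomic units)
x_charge = math.sqrt(G) / c**2 * q  # ~1.4e-34 cm
print(x_grav, x_atom, x_charge)
```

The huge disparity between the first two lengths reflects the gravitational fine-structure number of order 10⁻³⁹ mentioned earlier.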
In the original form of general relativity due to Einstein, space is regarded as a construct in which only the relations between objects have meaning. The theory agrees with all observations of gravitational phenomena, and the best books that deal with it are those which give a fair treatment of its conceptual implications: notably, those by Weinberg (1972), Misner, Thorne and Wheeler (1973), Rindler (1977) and Will (1993). We should also mention the book by Jammer (1961) on concepts of mass; and the conference proceedings edited by Barbour and Pfister (1995) on the idea due to Mach that mass locally depends on the distribution of matter globally. The latter was of course a major motivation for Einstein, and while not incorporated into standard general relativity is an idea that will recur in subsequent chapters.
The theory is built on 10 dimensionless potentials which are the independent elements in a 4×4 metric tensor gαβ (α, β = 0−3). These define the square of the distance between 2 nearby points in 4D via ds² = gαβ dxα dxβ. (Here a repeated index upstairs and downstairs is summed over, and below we will use the metric tensor to raise and lower indices on other tensors.) The coordinates xα are in a local limit identified as x⁰ = ct, x¹ = x, x² = y, x³ = z using Cartesians. However, because the theory employs tensors and therefore gives relations valid in any system of coordinates (covariance), the space and time labels may be mixed up and combined arbitrarily. Thus space and time are not distinct entities. Also, the role of the speed of light c is to dimensionally transpose a quantity with the dimension T to one with dimension L, so that all 4 of xα may be treated on the same footing. Partial derivatives with respect to the xα can be combined to produce the Christoffel symbol Γγαβ, which enables one to create a covariant derivative such that the derivative of a vector is now given by ∇β Vα = ∂Vα/∂xβ − Γγαβ Vγ. From gαβ and its derivatives, one can obtain the Ricci tensor Rαβ, the Ricci scalar R and the Einstein tensor Gαβ ≡ Rαβ − Rgαβ/2. The last is constructed so as to have zero covariant divergence: ∇α Gαβ = 0. These tensors enable us to look at the relationship between geometry and matter. Specifically, the Einstein tensor Gαβ can be coupled via a constant κ to the energy-momentum tensor Tαβ that describes properties of matter: Gαβ = κTαβ. These are Einstein's field equations. In the weak-field limit where g00 ≅ (1 + 2φ/c²) for a fluid of density ρ, Einstein's equations give back Poisson's equation ∇²φ = 4πGρ. This presumes that the coupling constant is κ = 8πG/c⁴, and shows that Einstein gravity contains Newton gravity.
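The weak-field statement can be checked symbolically: outside the source, the Newtonian potential φ = −GM/r satisfies Poisson's equation with ρ = 0 (a sympy sketch):

```python
# Check that phi = -G M / r is harmonic away from the origin, i.e. it
# satisfies Poisson's equation with zero density (the exterior case).
import sympy as sp

x, y, z, G, M = sp.symbols('x y z G M', positive=True)
r = sp.sqrt(x**2 + y**2 + z**2)
phi = -G * M / r

# Laplacian in Cartesian coordinates
laplacian = sum(sp.diff(phi, v, 2) for v in (x, y, z))
print(sp.simplify(laplacian))  # 0, valid for r != 0
```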
However, Einstein's field equations have only been rigorously tested in the solar system and the binary pulsar, where the gravitational field exists essentially in empty space or vacuum. In this case, Tαβ = 0 and the field equations Gαβ = 0 are equivalent to the simpler set
Rαβ = 0        (1.1)
These 10 relations serve in principle to determine the 10 gαβ, and are the ones verified by observations.
Notwithstanding this, let us consider the full equations for a perfect isotropic fluid with density ρ and pressure p (i.e. there is no viscosity, and the pressure is equal in the 3 spatial directions). Then the energy-momentum tensor is Tαβ = (p + ρc²) uα uβ − p gαβ, where uα are the 4-velocities (see below). This is constructed so as to have zero divergence, and the equation of continuity and the equations of motion for the 3 spatial directions are derived from the 4 components of ∇α Tαβ = 0. The covariant derivative here actually treats the metric tensor as a constant, so it is possible to add a term proportional to this to either the left-hand side or right-hand side of Einstein's equations. The former usage is traditional, so the full field equations are commonly written
Rαβ − (R/2) gαβ + Λ gαβ = (8πG/c⁴) Tαβ        (1.2)
Here Λ is the cosmological constant, and its modulus is known to be small. It corresponds in the weak-field limit to a force per unit mass |Λ|c²r/3 which increases with radius r from the centre of (say) the solar system, but is not observed to significantly affect the orbits of the planets. However, it could be insignificant locally but significant globally, as implied by its dimensions (L⁻²). In this regard, it is instructive to move the Λ term over to the other side of the field equations and incorporate it into Tαβ as a "vacuum" contribution to the density and pressure:
ρv ≡ Λc²/8πG ,    pv ≡ −Λc⁴/8πG        (1.3)
This "vacuum fluid" has the equation of state pv = −ρv c², and while ρv is small by laboratory standards it could in principle be of the same order of magnitude as the material density of the galaxies (10⁻²⁹ − 10⁻³¹ g cm⁻³). Also, while |Λ| is constrained by general relativity and observations of the present universe, there are arguments concerning the stability of the vacuum from quantum field theory which imply that it could have been larger in the early universe. But Λ (and G, c) are true constants in the original version of general relativity, so models of quantum vacuum transitions involve step-like phase changes (see e.g. Henriksen, Emslie and Wesson 1983). It should also be noted that while matter in the present universe has a pressure that is positive or close to zero ("dust"), there is in principle no reason why in the early universe or other exotic situations it cannot be taken negative. Indeed, any microscopic process which causes the particles of a fluid to attract each other can in a macroscopic way be described by p < 0 (the vacuum treated classically is a simple example). In fact, it is clear that p and ρ in general relativity are phenomenological, in the sense that they are labels for unexplained particle processes. It is also clear that the prime function of G and c is to dimensionally transpose matter labels such as p and ρ so that they match the geometrical objects of the theory.
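As an order-of-magnitude sketch, taking |Λ| ~ 10⁻⁵⁶ cm⁻² (an assumed illustrative magnitude, not a value quoted in the text) puts the vacuum density inside the quoted range for the galaxies:

```python
# Vacuum density rho_v = Lambda c^2 / (8 pi G) in c.g.s.
# Lambda below is an ASSUMED magnitude, purely for illustration.
import math

G = 6.674e-8   # cm^3 g^-1 s^-2
c = 2.998e10   # cm/s
Lam = 1e-56    # cm^-2 (assumed)

rho_v = Lam * c**2 / (8 * math.pi * G)
print(rho_v)   # ~5e-30 g cm^-3, comparable to the material density of galaxies
```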
The pressure and density are intimately connected to the motion of the fluid which they describe. This can be appreciated by looking at the general equation of motion, in the form derived by Raychaudhuri, and the continuity or conservation equation:
3R̈/R = 2(w² − σ²) + Λc² − (4πG/c²)(3p + ρc²)
ρ̇ + 3(Ṙ/R)(ρ + p/c²) = 0        (1.4)
Here R is the scale factor of a region of fluid with vorticity w, shear σ, and uniform pressure and density (see Ellis 1984: a dot denotes the total derivative with respect to time, and R should not be confused with the Ricci scalar introduced above or taken as implying the existence of a physical boundary). From the first of (1.4) we see that the acceleration caused by a portion of the fluid depends on the combination (3p + ρc²), so for mass to be attractive and positive we need (3p + ρc²) > 0. From the second of (1.4), we see that the rate of change of density depends on the combination (p + ρc²), so for matter to be stable in some sense we need (p + ρc²) > 0. These inequalities, sometimes called the energy conditions, should not however be considered sacrosanct. Indeed, gravitational energy is a slippery concept in general relativity, and there are several alternative definitions of "mass" (Hayward 1994). These go beyond the traditional concepts of active gravitational mass as the agent which causes a gravitational field, passive gravitational mass as the agent which feels it, and inertial mass as the agent which measures energy content (Bonnor 1989). What the above shows is that in a fluid-dynamical context, (3p + ρc²) is the gravitational energy density and (p + ρc²) is the inertial energy density.
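The two combinations can be tabulated for simple equations of state (a toy sketch in units with c = 1; the function names are ours, not standard):

```python
# Energy conditions for simple equations of state, with c = 1.
def gravitational_density(p, rho):
    # ~ (3p + rho c^2): attractive, positive mass requires this > 0
    return 3 * p + rho

def inertial_density(p, rho):
    # ~ (p + rho c^2): stability in the sense above requires this > 0
    return p + rho

rho = 1.0
print(gravitational_density(0.0, rho) > 0)   # dust (p = 0): attractive
print(gravitational_density(-rho, rho) > 0)  # vacuum (p = -rho c^2): repulsive
print(inertial_density(-rho, rho))           # vacuum: marginal (zero)
```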
For a fluid which is homogeneous and isotropic ( ≡ uniform), without vorticity or shear, Einstein's equations reduce to 2 relations commonly called after Friedmann:
8πGρ = 3(Ṙ² + kc²)/R² − Λc²
8πGp/c² = −(2R̈R + Ṙ² + kc²)/R² + Λc²        (1.5)
Here k = ±1, 0 is the curvature constant which describes the departure of the 3D part of spacetime from flat Minkowski (specified by gαβ = ηαβ = diagonal +1, −1, −1, −1). There are many solutions of (1.5) which are more or less in agreement with cosmological observations. The simplest is the Einstein-de Sitter model. It has k = 0, Λ = 0, p = 0, ρ = 1/(6πGt²) and a scale factor R(t) which grows as t^(2/3). However, it requires about 2 orders of magnitude more matter to be present than in the visible galaxies, a topic we will return to in Sections 1.6 and 4.2. In general, solutions of (1.5) are called Friedmann-Robertson-Walker (FRW), where the last two names refer to the workers who derived the metric for these uniform cosmological models. This metric is commonly given in two different coordinate systems, whose justification has to do with whether one takes the global view wherein all directions in 3D space are treated the same, or the local view wherein quantities are measured from us as `centre'. Noting that the radial coordinates r are different, the (3D) isotropic and non-isotropic forms of the metric are given by:
ds² = c²dt² − R²(t) (dr² + r²dΩ²)/(1 + kr²/4)²
ds² = c²dt² − R²(t) [dr²/(1 − kr²) + r²dΩ²]        (1.6)
Here dΩ² ≡ dθ² + sin²θ dφ² defines the angular part of the metric in spherical polar coordinates. A photon which moves radially in the field described by (1.6) is defined by ds = 0 with dθ = dφ = 0. Using the second of (1.6) its (coordinate-defined) velocity is then
dr/dt = ± (c/R)(1 − kr²)^(1/2)        (1.7)
Here the sign choice corresponds to whether the photon is moving towards or away from us. The important thing, though, is that the "speed" of the photon is not c.
This parameter, as noted in Section 1.2, is commonly regarded as defining an upper limit to the speed of propagation of causal effects. However, this interpretation is only true in the local, special-relativity limit. In the global, general-relativity case the size of causally-connected regions is defined by the concept of the horizon. An excellent account of this is given by its originator, Rindler (1977, p. 215). In the cosmological application, there are actually 2 kinds of horizon. An event horizon separates those galaxies we can see from those we cannot ever see even as t → ∞; a particle horizon separates those galaxies we can see from those we cannot see now at t = t0 (≅ 2×10¹⁰ yr). FRW models exist which have both kinds of horizon, one but not the other, or neither. A model in the latter category is that of Milne. (It has k = −1, Λ = 0, p = 0 and R(t) proportional to t, and would solve the so-called horizon problem posed by the 3K microwave background did it not also have ρ = 0.) The distance to the particle horizon defines the size of that part of the universe which is in causal communication with us. The distance can be worked out quite simply for any k if we assume Λ = p = 0 (Weinberg 1972, p. 489). In terms of Hubble's parameter now (H0 ≡ Ṙ0/R0) and the deceleration parameter now (q0 ≡ −R̈0R0/Ṙ0²), the distances are given by:
d = (c/H0)(2q0 − 1)^(−1/2) cos⁻¹(1/q0 − 1)        (k = +1)
d = 2c/H0 = 3ct0        (k = 0)
d = (c/H0)(1 − 2q0)^(−1/2) cosh⁻¹(1/q0 − 1)        (k = −1)        (1.8)
Even for the middle case, the Einstein-de Sitter model with flat 3-space sections, the distance to the horizon is not ct0. This confirms what was noted above, and shows that in relativity the purpose of c is merely to transpose a time to a length.
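The Einstein-de Sitter statements above can be checked symbolically: its density law satisfies the k = 0, Λ = 0 Friedmann relation, and its particle horizon lies at 3ct0 rather than ct0 (a sympy sketch, with R normalized so that R(t0) = 1):

```python
# Two checks on the Einstein-de Sitter model:
# (i)  R ~ t^(2/3) with rho = 1/(6 pi G t^2) satisfies 8 pi G rho/3 = (Rdot/R)^2;
# (ii) the particle horizon, integral of c dt / R(t) from 0 to t0, equals 3 c t0.
import sympy as sp

t, t0, c, G = sp.symbols('t t0 c G', positive=True)
R = (t / t0)**sp.Rational(2, 3)          # normalized so R(t0) = 1
rho = 1 / (6 * sp.pi * G * t**2)

friedmann = 8 * sp.pi * G * rho / 3 - (sp.diff(R, t) / R)**2
print(sp.simplify(friedmann))            # 0

d_horizon = sp.integrate(c / R, (t, 0, t0))
print(sp.simplify(d_horizon))            # 3*c*t0
```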
Particles with finite as opposed to zero rest masses move not along paths with ds = 0 but along paths with s a minimum. In particle physics with a special-relativity metric, the action principle for the motion of a particle with mass m is commonly written δ[∫m ds] = 0. Assuming m = constant and replacing ds by its general relativity analog using ds² = gαβ dxα dxβ, the variation leads to 4 equations of motion:
d²xγ/ds² + Γγαβ (dxα/ds)(dxβ/ds) = 0        (1.9)
This is the geodesic equation, and its 4 components serve in principle to determine the 4-velocities uγ ≡ dxγ/ds as functions of the coordinates. We note that, in addition to the assumption that m is constant, m does not appear in (1.9): general relativity is not a theory of forces but a theory of accelerations. In practice, (1.9) can only be solved algebraically for certain solutions of the field equations. The latter in vacuum are (1.1), and we note here that these can be obtained from an action via δ[∫R(−g)^(1/2) d⁴x] = 0. Here g is the determinant of the metric tensor, which with the conventional split of spacetime into time and space has signature (+ − − −) so g is negative. The field equations with matter can also be obtained from an action, but split into a geometrical part and a matter part. However, the split of a metric into time and space parts, and the split of the field equations into geometric and matter parts, are to a certain extent subjective.
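The Christoffel symbols entering (1.9) can be computed symbolically from a metric. A sympy sketch for a toy 2D line element ds² = dt² − R(t)²dx² (a flat FRW slice with c = 1; this simplified metric is our illustrative choice, not one used in the text):

```python
# Christoffel symbols Gamma^c_ab = (1/2) g^{cd} (d_a g_db + d_b g_da - d_d g_ab)
# for the 2D metric ds^2 = dt^2 - R(t)^2 dx^2.
import sympy as sp

t, x = sp.symbols('t x')
R = sp.Function('R')(t)
coords = [t, x]
g = sp.Matrix([[1, 0], [0, -R**2]])
ginv = g.inv()

def christoffel(up, a, b):
    return sum(sp.Rational(1, 2) * ginv[up, d] *
               (sp.diff(g[d, b], coords[a]) + sp.diff(g[d, a], coords[b])
                - sp.diff(g[a, b], coords[d]))
               for d in range(2))

print(sp.simplify(christoffel(0, 1, 1)))  # Gamma^t_xx = R * dR/dt
print(sp.simplify(christoffel(1, 0, 1)))  # Gamma^x_tx = (dR/dt) / R
```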
Particle physics has evolved along different lines than gravitation, and while general relativity is monolithic, the standard model of particle physics is composite. Of relevance are the books by Ramond (1981), Griffiths (1987), and Collins, Martin and Squires (1989). The last is a good review of the connections between particle physics and cosmology, and also treats higher-dimensional theories of the types we will examine in subsequent sections. However, the present section is mainly concerned with standard 4D particle physics as based on Lagrangians, and the conceptual differences between gravitation and quantum theory.
The material is ordered by complexity: we consider the equations of Maxwell, Schrodinger, Klein-Gordon, Dirac, Proca and Yang-Mills; and then proceed to quantum chromodynamics and the standard model (including Glashow-Salam-Weinberg theory). As before, there is an emphasis on fundamental constants and the number of parameters required to make theory compatible with observation.
Classical electromagnetism is described by a 4-potential Aα and a 4-current Jα (covariant and contravariant quantities differ now by at most a sign). Then Maxwell's equations are contained in the tensor relations
∂Fαβ/∂xα = (4π/c) Jβ        (1.10)

together with the cyclic identity

∂Fαβ/∂xγ + ∂Fβγ/∂xα + ∂Fγα/∂xβ = 0
implicit in the definition of the Faraday tensor Fαβ. However, Maxwell's equations may also be obtained by substituting the Lagrangian
L = −(1/16π) FαβFαβ − (1/c) JαAα        (1.11)
in the Euler-Lagrange equations, which give (1.10). Strictly, L here is a Lagrangian density and has dimensions energy/volume, presuming we use the c.g.s./e.s.u. system of units. These units also imply that ε0 does not appear (see Section 1.2). Thus c is the only constant that figures, in analogy with the original version of general relativity in which only G/c4 figured (no cosmological constant). This is connected with the fact that these theories describe photons and gravitons with exactly zero rest mass.
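To make the variation explicit (with Fαβ ≡ ∂Aβ/∂xα − ∂Aα/∂xβ and the sign conventions used here), the two derivatives needed in the Euler-Lagrange equations are:

```latex
% Variation of the Maxwell Lagrangian (1.11),
% L = -(1/16\pi) F_{\mu\nu}F^{\mu\nu} - (1/c) J^{\mu}A_{\mu}:
\frac{\partial L}{\partial(\partial_{\beta}A_{\alpha})}
  = -\frac{1}{4\pi}F^{\beta\alpha},
\qquad
\frac{\partial L}{\partial A_{\alpha}} = -\frac{1}{c}J^{\alpha},
% so the Euler-Lagrange equations give
\frac{\partial}{\partial x^{\beta}}
  \left[-\frac{1}{4\pi}F^{\beta\alpha}\right] + \frac{1}{c}J^{\alpha} = 0
\quad\Longrightarrow\quad
\frac{\partial F^{\beta\alpha}}{\partial x^{\beta}}
  = \frac{4\pi}{c}\,J^{\alpha},
```

which is the sourced Maxwell equation of (1.10); the cyclic identity follows directly from the definition of Fαβ.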
Planck's constant ħ comes into the field theory of particles when the 3-momentum p and total energy E of a particle are replaced by space and time operators that act on a wave-function Ψ. Thus the prescriptions p → (ħ/i) ∇ and E → (i ħ) ∂/ ∂t applied to the non-relativistic energy equation p2/2m + V = E (where m is rest mass and V is the potential energy) result in the Schrodinger equation
−(ħ²/2m) ∇²Ψ + VΨ = iħ ∂Ψ/∂t        (1.12)
The path Lagrangian for this is L = T − V in general, which for a particle with charge q moving with a 3-velocity dx/dt << c in an electromagnetic field is L = (m/2)(dx/dt)² − (q/c) Aα dxα/dt. The path action is S = ∫₁² L dt, where the integral is between two points. The variation δS = 0 gives the equations of motion of the particle between these two points, which in classical theory is a unique path. In quantum theory, there are non-unique paths, but the sum over paths Σexp(iS/ħ) has the interpretation that its modulus squared is the probability that the particle goes from position 1 to position 2. Clearly the phase S/ħ has to be dimensionless, and this is why ħ appears in the sum over paths. Instead of including it there, however, we could instead use Σexp(iS) and redefine the Lagrangian to be
L = (m/2ħ)(dx/dt)² − (q/ħc) Aα dxα/dt        (1.13)
This has been pointed out by Hoyle and Narlikar (1974, p. 102; see also Ramond, 1981, p. 35). They go on to argue that since the second term in (1.13) contains another q implicit in Aα, it is the combination q2/ħ that is important, and in it ħ can be absorbed into q2. Also, in the first term in (1.13) it is the combination m/ħ that is important, and in it ħ can be absorbed into m. Thus the Lagrangian reduces back to the form given before.
A similar prescription to that above applied to the relativistic energy equation E2 − p2c2 = m2c4 or pαpα = m2c2 for a freely-moving particle (V = 0) results in the Klein-Gordon equation
□φ + (mc/ħ)² φ = 0,   □ ≡ (1/c²) ∂²/∂t² − ∇²        (1.14)
Here φ is a single scalar field and the Lagrangian is
L = (1/2) [(∂φ/∂xα)(∂φ/∂xα) − (mc/ħ)² φ²]        (1.15)
Equations (1.14) and (1.15) describe a spin-0 particle in flat spacetime. We will consider the generalization to curved spacetime below.
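The substitution leading to (1.14) is worth writing out once:

```latex
% Operator substitution E -> i\hbar\,\partial/\partial t,
% p -> (\hbar/i)\nabla applied to E^2 = p^2c^2 + m^2c^4, acting on \varphi:
-\hbar^{2}\frac{\partial^{2}\varphi}{\partial t^{2}}
  = -\hbar^{2}c^{2}\nabla^{2}\varphi + m^{2}c^{4}\varphi
\quad\Longrightarrow\quad
\Box\varphi + \left(\frac{mc}{\hbar}\right)^{2}\varphi = 0,
\qquad
\Box \equiv \frac{1}{c^{2}}\frac{\partial^{2}}{\partial t^{2}} - \nabla^{2}.
```

As in the Schrodinger case, only the combination m/ħ appears, in keeping with the remarks above about absorbing ħ.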
Spin-1/2 particles were described in another equation formulated by Dirac, who `factorized' the energy relation pαpα = m2c2 with the help of four 4×4 matrices γα. These latter are related to the metric tensor of Minkowski spacetime by the relation γαγβ + γβγα = 2 ηαβ. The Dirac equation is
iħ γα (∂Ψ/∂xα) − mcΨ = 0        (1.16)
Here Ψ is a bi-spinor field, which can be thought of as a 4-element column matrix (though it is not a 4-vector) in which the upper two elements represent the two possible spin states of an electron while the lower two elements represent the two possible spin states of a positron. The Lagrangian is
L = iħc Ψ̄ γα (∂Ψ/∂xα) − mc² Ψ̄Ψ        (1.17)
Here Ψ̄ is the adjoint spinor defined by Ψ̄ ≡ Ψ†γ0, where Ψ† is the usual Hermitian conjugate obtained by transposing Ψ from a column to a row matrix and complex-conjugating its elements. The Lagrangian (1.17) is for a free particle. It is invariant under the global gauge or phase transformation Ψ → e^{iθ}Ψ (where θ is any real number), because Ψ̄ → e^{−iθ}Ψ̄ and the exponentials cancel out in the combination Ψ̄Ψ. But it is not invariant under the local gauge transformation Ψ → e^{iθ(x)}Ψ, which depends on location in spacetime. If the principle of local gauge invariance is desired, it is necessary to replace (1.17) by
L = iħc Ψ̄ γα (∂Ψ/∂xα) − mc² Ψ̄Ψ − q(Ψ̄ γα Ψ) Aα        (1.18)
Here Aα is a potential which we identify with electromagnetism and which changes under local gauge transformations according to Aα → Aα + ∂λ/ ∂xα where λ(xα) is a scalar function. In fact, we can say that the requirement of local gauge invariance for the Dirac Lagrangian (1.18) obliges the introduction of the field Aα typical of electromagnetism.
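To see why the extra coupling term restores invariance (with the sign conventions used here, taking θ(x) = −qλ(x)/ħc):

```latex
% Under \Psi \to e^{i\theta(x)}\Psi the derivative term of (1.17) shifts:
\partial_{\alpha}\Psi \to
  e^{i\theta}\left[\partial_{\alpha}\Psi + i(\partial_{\alpha}\theta)\Psi\right]
\;\Longrightarrow\;
i\hbar c\,\bar{\Psi}\gamma^{\alpha}\partial_{\alpha}\Psi \to
i\hbar c\,\bar{\Psi}\gamma^{\alpha}\partial_{\alpha}\Psi
  - \hbar c\,(\partial_{\alpha}\theta)\,\bar{\Psi}\gamma^{\alpha}\Psi .
% With \theta = -q\lambda/\hbar c, the gauge change
% A_{\alpha} \to A_{\alpha} + \partial\lambda/\partial x^{\alpha}
% shifts the coupling term by
-q\,\bar{\Psi}\gamma^{\alpha}\Psi\,
  \frac{\partial\lambda}{\partial x^{\alpha}}
  = +\hbar c\,(\partial_{\alpha}\theta)\,\bar{\Psi}\gamma^{\alpha}\Psi ,
% which cancels the unwanted piece, leaving (1.18) invariant.
```

The two shifts are equal and opposite, so (1.18) is locally gauge invariant while (1.17) alone is not.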
Actually the Lagrangian (1.18) should be even further extended by including a `free' term for the gauge field. In this regard, the transformation Aα → Aα + ∂λ/ ∂xα leaves Fαβ unchanged, but not a term like Aα Aα. The appropriate term to add to (1.18) is therefore (−1/16π) Fαβ Fαβ, so the full Dirac Lagrangian is
L = [iħc Ψ̄ γα (∂Ψ/∂xα) − mc² Ψ̄Ψ] − (1/16π) FαβFαβ − q(Ψ̄ γα Ψ) Aα        (1.19)
If we define a current density Jα ≡ cq(Ψ̄ γα Ψ), the last two terms give back Maxwell's Lagrangian (1.11). The Lagrange density (1.19) describes electrons or positrons interacting with an electromagnetic field consisting of massless photons. However, a term like the one we just discarded (Aα Aα) may be acceptable in a theory of massive gauge particles. Indeed, a field derived from a vector potential Aα associated with a particle of finite rest mass m is described by the Proca equation
∂Fαβ/∂xα + (mc/ħ)² Aβ = 0        (1.20)
This describes a spin-1 particle such as a massive photon, and can be obtained from the Lagrangian
L = −(1/16π) FαβFαβ + (1/8π)(mc/ħ)² AαAα        (1.21)
Again we see the combination m/ħ, so ħ may be absorbed here if so desired as it has been elsewhere.
If we consider two 4-component Dirac fields, it can be shown that a locally gauge-invariant Lagrangian can only be obtained if we introduce three vector fields (A1α, A2α, A3α). These can be thought of as a kind of 3-vector Aα. It is also necessary to change the definition of Fαβ used above. The 3 components of the new quantity (F1αβ, F2αβ, F3αβ) can again be thought of as a kind of vector, where now Fαβ ≡ [∂Aβ / ∂xα − ∂Aα / ∂xβ − (2q / ħ c) (Aα×Aβ)]. Further, the three Pauli matrices (τ1, τ2, τ3) can be regarded as a vector τ. Then with dot products between vectors defined in the usual way, the Lagrangian is
L = [iħc Ψ̄ γα (∂Ψ/∂xα) − mc² Ψ̄Ψ] − (1/16π) Fαβ·Fαβ − q(Ψ̄ γα τ Ψ)·Aα        (1.22)
Here Ψ can be thought of as a column matrix with elements Ψ1 and Ψ2, each of which is a 4-component Dirac spinor. The latter still describe spin-1/2 particles of mass m (where we have assumed both particles to have the same mass for simplicity), and they interact with three gauge fields A1α, A2α, A3α which by gauge invariance must be massless. The kind of gauge invariance obeyed by (1.22) is actually more complex than that involving global and local phase transformations with e^{iθ} considered above. There Ψ was a single spinor, whereas here Ψ is a 2-spinor column matrix. This leads us to consider a 2×2 matrix which we take to be unitary (U†U = 1). In fact the first two terms in (1.22) are invariant under the global transformation Ψ → UΨ, because Ψ̄ → Ψ̄U† so the combination Ψ̄Ψ is invariant. Just as any complex number of modulus 1 can be written as e^{iθ} with θ real, any unitary matrix can be written U = e^{iH} with H Hermitian (H† = H). Since H is a 2×2 matrix it involves 4 real numbers, say θ and a1, a2, a3 which can be regarded as the components of a 3-vector a. As before, let τ be the 3-vector whose components are the three 2×2 Pauli matrices, and let 1 stand for the 2×2 unit matrix. Then without loss of generality we can write H = θ1 + τ·a, so U = e^{iθ} e^{iτ·a}. The first factor here is the old phase transformation. The second is a 2×2 unitary matrix which is special in that its determinant is 1. Thus Ψ → e^{iτ·a}Ψ is a global special-unitary 2×2, or SU(2), transformation. It should be recalled that this global invariance only involves the first two terms of the Lagrangian (1.22), which resemble the Lagrangian (1.17) of Dirac. The passage to local invariance along lines similar to those considered above leads to the other terms in the Lagrangian (1.22) and was made by Yang and Mills.
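These claims about U can be checked numerically. The sketch below (the sample values of θ and a are arbitrary) builds exp(iτ·a) from the closed form exp(iτ·a) = cos|a| 1 + i sin|a| (τ·a/|a|), which follows from (τ·a)² = |a|² 1 for the Pauli matrices, and verifies that U = e^{iθ} e^{iτ·a} is unitary while the SU(2) factor has unit determinant:

```python
# Numerical check that U = exp(i*theta) * exp(i*tau.a) is unitary and that
# the exp(i*tau.a) factor is "special" (determinant 1), as claimed for SU(2).
import cmath
import math

# The three 2x2 Pauli matrices
tau = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def su2(a):
    """exp(i tau.a) via the closed form cos|a| 1 + i sin|a| (tau.a/|a|)."""
    norm = math.sqrt(sum(ai * ai for ai in a))
    n = [ai / norm for ai in a]                       # unit vector a/|a|
    c, s = math.cos(norm), math.sin(norm)
    taudotn = [[sum(n[k] * tau[k][i][j] for k in range(3))
                for j in range(2)] for i in range(2)]
    return [[(c if i == j else 0) + 1j * s * taudotn[i][j]
             for j in range(2)] for i in range(2)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2))
             for j in range(2)] for i in range(2)]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

U2 = su2([0.3, -1.1, 0.7])                            # the SU(2) factor
theta = 0.4
U = [[cmath.exp(1j * theta) * U2[i][j]                # the U(1) phase factor
      for j in range(2)] for i in range(2)]
```

The phase factor carries the determinant e^{2iθ}, so it is exactly the SU(2) part whose determinant is 1.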
The full Yang-Mills Lagrangian (1.22) is invariant under local SU(2) gauge transformations, and leads to field equations that were originally supposed to describe two equal-mass spin-1/2 particles interacting with three massless spin-1 (vector) particles. In this form the theory is somewhat unrealistic, but still useful. For example, if we drop the first two terms in (1.22) we obtain a Lagrangian for the three gauge fields alone which leads to an interesting classical-type field theory that resembles Maxwell electrodynamics. This correspondence becomes clear if, as before, we define currents Jα ≡ cq(Ψ̄ γα τ Ψ), whereby the last two terms in (1.22) give a gauge-field Lagrangian
L = −(1/16π) Fαβ·Fαβ − (1/c) Jα·Aα        (1.23)
This closely resembles the Maxwell Lagrangian (1.11). But of course (1.23) gives rise to a considerably more complicated theory, solutions of which have been reviewed by Actor (1979). Some of these represent magnetic monopoles, which have not been observed. Some represent instantons and merons, which are hypothetical particles that tunnel between topologically distinct vacuum regions. Tunneling can in principle be important cosmologically. For example, Vilenkin (1982) has suggested that a certain type of instanton tunneling to de Sitter space from nothing can give birth to an inflationary universe. However, it is doubtful if the kinds of particles predicted by pure SU(2) Yang-Mills theory will ever have practical applications. The real importance of this theory is that it showed it was feasible to use a symmetry group involving non-commuting 2×2 matrices to construct a non-Abelian gauge theory. This idea led to more successful theories, notably one for the strong interaction based on SU(3) colour symmetry.
Quantum chromodynamics (QCD) is described by 3 coloured Dirac spinors that can be denoted Ψred, Ψblue, Ψgreen and 8 gauge fields given by a kind of 8-vector Aα. Each of Ψr, Ψb, Ψg is a 4-component Dirac spinor, and it is convenient to regard them as the elements of a column matrix Ψ. This describes the colour states of a massive spin-1/2 quark. The 8 components of Aα are associated with the 8 Gell-Mann matrices (λ1, ..., λ8), which are the SU(3) equivalents of the Pauli matrices of SU(2), and describe massless spin-1 gluons. The Lagrangian for QCD can be constructed by adding together 3 Dirac Lagrangians like (1.17) above (one for each colour), insisting on local SU(3) gauge invariance (which brings in the 8 gauge fields), and adding in a free gauge-field term (using Fαβ as defined above for the original Yang-Mills theory). The complete Lagrangian is
L = [iħc Ψ̄ γα (∂Ψ/∂xα) − mc² Ψ̄Ψ] − (1/16π) Fαβ·Fαβ − q(Ψ̄ γα λ Ψ)·Aα        (1.24)
This resembles (1.22) above. However, the electric charge of a quark needs to be a fraction of e in order to account for the common hadrons as quark composites. And particle physics is best described by 6 quarks with different flavours (d, u, s, c, b, t) and different masses m. This means we really need 6 versions of (1.24) with different masses. A gluon does not carry electric charge, but it does carry colour charge. This is unlike its analogue the photon in electrodynamics, allowing bound gluon states (glueballs) and making chromodynamics generally quite complicated.
We do not need to go into the intricacies of QCD, especially since good reviews are available (Ramond 1981; Llewellyn Smith 1983; Griffiths 1987; Collins, Martin and Squires 1989). But a couple of points related to charges and masses are relevant to our discussion. In the case of electrons interacting via photons, the Dirac Lagrangian and the fact that α ≡ e²/ħc ≅ 1/137 is small allows perturbation analysis to be used to produce very accurate models. Indeed, quantum electrodynamics (QED) gives predictions that are in excellent agreement with experiment. However, the coupling parameter whose asymptotic value is the traditional fine-structure constant is actually energy or distance dependent. As mentioned in Section 1.2, this is commonly ascribed to vacuum polarization. Thus, a positive charge (say) surrounded by virtual electrons and positrons tends to attract the former and repel the latter. (Virtual particles, whose fleeting existence is permitted by Heisenberg's uncertainty relation, do not obey the usual energy-momentum relation; in modern quantum field theory the vacuum is regarded as full of them.) There is therefore a screening process, which means that the effective value of the embedded charge (and α) increases as the distance decreases. In analogy with QED, there is a similar process in QCD, but due to the different nature of the interaction the coupling parameter decreases as the distance decreases. This is the origin of asymptotic freedom, whose converse is that quarks in (say) a proton feel a strong restoring force if they move outwards, and are in fact confined. In addition to the variable nature of coupling `constants' and charges, the masses in QCD are also not what they appear to be. The m which appears in a Lagrangian like (1.24) is not really a given parameter, but is believed to arise from the spontaneous symmetry breaking which exists when a symmetry of the Lagrangian is not shared by the vacuum.
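The opposite running of the two couplings can be illustrated with the standard one-loop expressions (see e.g. Griffiths 1987). The numerical inputs below (the electron mass-energy, ΛQCD ≅ 0.2 GeV, and 6 flavours) are representative values only, chosen for illustration:

```python
# Illustration of the running couplings discussed above, using the
# standard one-loop expressions.  Input values are representative.
import math

ALPHA0 = 1.0 / 137.036     # fine-structure constant at low energy
ME = 0.000511              # electron mass-energy in GeV
LAMBDA_QCD = 0.2           # QCD scale in GeV (representative value)
NF = 6                     # number of quark flavours

def alpha_qed(Q):
    """One-loop QED coupling at energy Q (GeV): screening makes it grow."""
    return ALPHA0 / (1.0 - (ALPHA0 / (3.0 * math.pi))
                     * math.log(Q ** 2 / ME ** 2))

def alpha_qcd(Q):
    """One-loop QCD coupling: anti-screening makes it fall
    with increasing energy (asymptotic freedom)."""
    return 12.0 * math.pi / ((33.0 - 2.0 * NF)
                             * math.log(Q ** 2 / LAMBDA_QCD ** 2))

# Evaluate each coupling at a low and a high energy scale:
a_low, a_high = alpha_qed(1.0), alpha_qed(100.0)
s_low, s_high = alpha_qcd(10.0), alpha_qcd(1000.0)
```

Evaluating these at increasing Q shows α growing slowly while αs falls, which is the screening/anti-screening contrast described above.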
Thus a manifestly symmetric Lagrangian with massless gauge-field particles can be rewritten in a less symmetric form by redefining the fields in terms of fluctuations about a particular ground state of the vacuum. This results in the gauge-field particles becoming massive and in the appearance of a massive scalar field or Higgs particle. In QCD, the quarks are initially taken to be massless, but if they have Yukawa-type couplings to the Higgs particle then they acquire masses. The Higgs mechanism in QCD, however, is really imported from the theory of the weak interaction, and has been mentioned here to underscore that the masses of the quarks are not really fundamental parameters.
The theory of the weak interaction was originally developed by Fermi as a way of accounting for beta decay, but is today mainly associated with Glashow, Weinberg and Salam who showed that it was possible to unify the weak and electromagnetic interactions (for reviews see Salam 1980 and Weinberg 1980). As it is formulated today, the theory of the weak interaction involves mediation by 3 very massive intermediate vector (spin-1) bosons, two of which (W±) are electrically charged and one of which (Z0) is neutral. These can be combined with the photon of electromagnetism via the symmetry group SU(2)⊗U(1), which is however spontaneously broken by the mechanism outlined in the preceding paragraph. Actually, the massive Z0 and the massless photon are combinations of states that depend on a weak mixing angle θW, whose value is difficult to calculate from theory but is θW ≅ 29° from experiment. The theory of the weak interaction, like QED and QCD, involves a coupling parameter which is not constant.
What we have been discussing in the latter part of this section are parts of the standard model of particle physics, which symbolically unifies the electromagnetic, weak and strong interactions via the symmetry group U(1)⊗SU(2)⊗SU(3). An appealing feature of this theory is that with increasing energy the electromagnetic coupling increases while the weak and strong couplings decrease, suggesting that they come together at some unifying energy. This, however, is not known: it is probably of order 10¹⁶ GeV, but could be as large as the Planck mass of order 10¹⁹ GeV (see Weinberg 1983; Llewellyn Smith 1983; Ellis 1983; Kibble 1983; Griffiths 1987, p. 77; Collins, Martin and Squires 1989, p. 159). Also, there are uncertainties in the theory, notably to do with the QCD sector where the numbers of colors and flavors are conventionally taken as 3 and 6 respectively but could be different. This means that while in the conventional model there are 6 quark masses and 6 lepton masses, there could be more. In fact, if we include couplings and other things, there are at least 20 parameters in the theory (Ellis 1983). One might hope to reduce this by using a simple unifying group for U(1), SU(2) and SU(3), but the minimal example of SU(5) does not actually help much in this regard. And then there is the perennial question: What about gravity?
The idea that the world may have more than 4 dimensions is due to Kaluza (1921), who with a brilliant insight realized that a 5D manifold could be used to unify Einstein's theory of general relativity (Section 1.3) with Maxwell's theory of electromagnetism (Section 1.4). After some delay, Einstein endorsed the idea, but a major impetus was provided by Klein (1926). He made the connection to quantum theory by assuming that the extra dimension was microscopically small, with a size in fact connected via Planck's constant h to the magnitude of the electron charge e (Section 1.2). Despite its elegance, though, this version of Kaluza-Klein theory was largely eclipsed by the explosive development first of wave mechanics and then of quantum field theory. However, the development of particle physics led eventually to a resurgence of interest in higher-dimensional field theories as a means of unifying the long-range and short-range interactions of physics. Thus did Kaluza-Klein 5D theory lay the foundation for modern developments such as 11D supergravity and 10D superstrings (Section 1.6). In fact, there is some ambiguity in the scope of the phrase "Kaluza-Klein theory". We will mainly use it to refer to a 5D field theory, but even in that context there are several versions of it. The literature is consequently enormous, but we can mention the conference proceedings edited by De Sabbata and Schmutzer (1983), Lee (1984) and Appelquist, Chodos and Freund (1987). A recent comprehensive review of all versions of Kaluza-Klein theory is the article by Overduin and Wesson (1997a). The latter includes a short account of what is referred to by different workers as non-compactified, induced-matter or space-time-matter theory. Since this is the subject of the following chapters, the present section will be restricted to a summary of the main features of traditional Kaluza-Klein theory.
This theory is essentially general relativity in 5D, but constrained by two conditions. Physically, both have the motivation of explaining why we perceive the 4 dimensions of spacetime and (apparently) do not see the fifth dimension. Mathematically, they are somewhat different, however. (a) The so-called `cylinder' condition was introduced by Kaluza, and consists in setting all partial derivatives with respect to the fifth coordinate to zero. It is an extremely strong constraint that has to be applied at the outset of calculation. Its main virtue is that it reduces the algebraic complexity of the theory to a manageable level. (b) The condition of compactification was introduced by Klein, and consists in the assumption that the fifth dimension is not only small in size but has a closed topology (i.e. a circle if we are only considering one extra dimension). It is a constraint that may be applied retroactively to a solution. Its main virtue is that it introduces periodicity and allows one to use Fourier and other decompositions of the theory.
There are now 15 dimensionless potentials, which are the independent elements in a symmetric 5×5 metric tensor gAB (A, B = 0−4: compare section 1.3). The first 4 coordinates are those of spacetime, while the extra one x4 = l (say) is sometimes referred to as the "internal" coordinate in applications to particle physics. In perfect analogy with general relativity, one can form a 5D Ricci tensor RAB, a 5D Ricci scalar R and a 5D Einstein tensor GAB ≡ RAB − RgAB/2. The field equations would logically be expected to be GAB = kTAB with some appropriate coupling constant k and a 5D energy-momentum tensor. But the latter is unknown, so from the time of Kaluza and Klein onward much work has been done with the `vacuum' or `empty' form of the field equations GAB = 0. Equivalently, the defining equations are
RAB = 0   (A, B = 0−4)        (1.25)
These 15 relations serve to determine the 15 gAB, at least in principle.
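A useful piece of bookkeeping: under the 4 + 1 split that leads to the parametrization (1.26) below, the 15 independent components of the symmetric gAB sort into a 4D metric tensor, a 4-vector and a scalar:

```latex
% Counting the 15 potentials of 5D Kaluza-Klein theory:
g_{AB} \;\longrightarrow\;
\underbrace{g_{\alpha\beta}}_{10\ \text{(gravity)}},\quad
\underbrace{A_{\alpha}}_{4\ \text{(electromagnetism)}},\quad
\underbrace{\Phi}_{1\ \text{(scalar field)}},
\qquad 15 = 10 + 4 + 1 .
```

This counting anticipates the interpretation of the theory as a unified account of gravity, electromagnetism and a scalar field.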
In practice, this is impossible without some starting assumption about gAB. This is usually connected with the physical situation being investigated. In gravitational problems, an assumption about gAB = gAB(xc) is commonly called a choice of coordinates, while in particle physics it is commonly called a choice of gauge. We will meet numerous concrete examples later, where given the functional form of gAB(xc) we will calculate the 5D analogs of the Christoffel symbols ΓCAB which then give the components of RAB (Chapters 2-4). Kaluza was interested in electromagnetism, and realized that gAB can be expressed in a form that involves the 4-potential Aα that figures in Maxwell's theory. He adopted the cylinder condition noted above, but also put g44 = constant. We will do a general analysis of the electromagnetic problem later (Chapter 5), but here we look at an intermediate case where gAB = gAB(xα), g44 = −Φ²(xα). This illustrates well the scope of Kaluza-Klein theory, and has been worked on by many people, including Jordan (1947, 1955), Bergmann (1948), Thiry (1948), Lessner (1982), and Liu and Wesson (1997). The coordinates or gauge are chosen so as to write the 5D metric tensor in the form
gAB = | gαβ − κ²Φ²AαAβ    −κΦ²Aα |
      | −κΦ²Aβ             −Φ²    |        (1.26)
where κ is a coupling constant. Then the field equations (1.25) reduce to
Gαβ = (κ²Φ²/2) Tαβ − (1/Φ)[∇α(∂βΦ) − gαβ □Φ]

∇αFαβ = −3 (∂αΦ/Φ) Fαβ

□Φ = −(κ²Φ³/4) FαβFαβ        (1.27)
Here Gαβ and Fαβ are the usual 4D Einstein and Faraday tensors (see sections 1.3 and 1.4 respectively), and Tαβ is the energy-momentum tensor for an electromagnetic field given by Tαβ = (gαβ Fγδ Fγδ/4 − Fγα Fβγ)/2. Also □ ≡ gαβ ∇α ∇β is the wave operator, and the summation convention is in effect. Therefore we recognize the middle member of (1.27) as the 4 equations of electromagnetism modified by a function, which by the last member of (1.27) can be thought of as depending on a wave-like scalar field. The first member of (1.27) gives back the 10 Einstein equations of 4D general relativity, but with a right-hand side which in some sense represents energy and momentum that are effectively derived from the fifth dimension. In short, Kaluza-Klein theory is in general a unified account of gravity, electromagnetism and a scalar field.
Kaluza's case g44 = −Φ² = −1 together with the identification κ = (16πG/c⁴)^{1/2} makes (1.27) read
Gαβ = (8πG/c⁴) Tαβ,   ∇αFαβ = 0        (1.28)
These are of course the straight Einstein and Maxwell equations in 4D, but derived from vacuum in 5D, a consequence which is sometimes referred to as the Kaluza-Klein "miracle". However, these relations involve by (1.27) the choice of electromagnetic gauge Fαβ Fαβ = 0 and have no contribution from the scalar field. The latter could well be important, particularly in application to particle physics. In the language of that subject, the field equations (1.25) of Kaluza-Klein theory describe a spin-2 graviton, a spin-1 photon and a spin-0 boson which is thought to be connected with how particles acquire mass. The field equations can also be derived from a 5D action δ[ ∫R(−g)1/2 d5x ] = 0, in a way analogous to what happens in 4D Einstein theory.
It is also possible to put Kaluza-Klein theory into formal correspondence with other 4D theories, notably the Brans-Dicke scalar-tensor theory (see Overduin and Wesson 1997a). This theory is sometimes cast in a form where the scalar field is effectively disguised by putting the functional dependence into G, the gravitational `constant'. In this regard it belongs to a class of 4D theories, which includes ones by Dirac, Hoyle and Narlikar and Canuto et al., where the constants are allowed to vary with cosmic time (see Wesson 1978 and Barbour and Pfister 1995 for reviews). However, it should be stressed that Kaluza-Klein theory is essentially 5D, and trying to cast it into 4D form is technically awkward. It should also be noted that the reasons for treating 4D fundamental constants in this way are conceptually obscure.
These are based on the idea of supersymmetry, wherein each boson (integral spin) is matched with a fermion (half integral spin). Thus the particle which is presumed to mediate classical gravity (the graviton) has a partner (the gravitino). This kind of symmetry is natural, insofar as particle physics needs to account for both bosonic and fermionic matter fields. But it is also attractive because it leads to a cancellation of the enormous zero-point fields which otherwise exist but whose energy density is not manifested in the curvature of space (this is related to the so-called cosmological constant problem, which is discussed elsewhere). The literature on supergravity and superstrings is diverse, but we can mention the review articles by Witten (1981) and Duff (1996); and the books by West (1986) and Green, Schwarz and Witten (1987). The status of the electromagnetic zero-point field has been discussed by Wesson (1991). There is an obvious connection between 5D Kaluza-Klein theory, 11D supergravity and 10D superstrings. But while the former is more-or-less worked out, the latter are still in a state of development with an uncertain prognosis when it comes to their relevance to the real world. For this reason, and also because supersymmetry lies outside the scope of the rest of this work, we will content ourselves with a short history.
Supersymmetric gravity or supergravity began life as a 4D theory in 1976 but quickly made the jump to higher dimensions ("Kaluza-Klein supergravity"). It was particularly successful in 11D, for three principal reasons. First, Nahm showed that 11 was the maximum number of dimensions consistent with a single graviton (and an upper limit of two on particle spin). This was followed by Witten's proof that 11 was also the minimum number of dimensions required for a Kaluza-Klein theory to unify all the forces in the standard model of particle physics (i.e. to contain the gauge groups of the strong SU(3) and electroweak SU(2)⊗U(1) interactions). The combination of supersymmetry with Kaluza-Klein theory thus appeared to uniquely fix the dimensionality of the world. Second, whereas in lower dimensions one had to choose between several possible configurations for the matter fields, Cremmer et al. demonstrated in 1978 that in 11D there is a single choice consistent with the requirements of supersymmetry (in particular, that there be equal numbers of Bose and Fermi degrees of freedom). In other words, while a higher-dimensional energy-momentum tensor was still required, its form at least appeared somewhat natural. Third, Freund and Rubin showed in 1980 that compactification of the 11D model could occur in only two ways: to 7 or 4 compact dimensions, leaving 4 (or 7, respectively) macroscopic ones. Not only did 11D spacetime appear to be specially favored for unification, but it also split perfectly to produce the observed 4D world. (The other possibility, of a macroscopic 7D world, could however not be ruled out, and in fact at least one such model was constructed as well.) Buoyed by these three successes, 11D supergravity appeared set by the mid-1980s as a leading candidate for the hoped-for "theory of everything".
Unfortunately, certain difficulties have dampened this initial enthusiasm. For example, the compact manifolds originally envisioned by Witten (those containing the standard model) turn out not to generate quarks or leptons, and to be incompatible with supersymmetry. Their most successful replacements are the 7-sphere and the "squashed" 7-sphere, described respectively by the symmetry groups SO(8) and SO(5)⊗SU(2). But these groups do not contain the minimum symmetry requirements of the standard model [SU(3)⊗SU(2)⊗U(1)]. This is commonly rectified by adding matter-related fields, the "composite gauge fields", to the 11D Lagrangian. Another problem is that it is very difficult to build chirality (necessary for a realistic fermion model) into an 11D theory. A variety of remedies have been proposed for this, including the common one of adding even more gauge fields, but none has been universally accepted. It should also be mentioned that supergravity theory is marred by a large cosmological constant in 4D, which is difficult to remove even by fine-tuning. Finally, quantization of the theory inevitably leads to anomalies.
Some of these difficulties can be eased by descending to 10 dimensions: chirality is easier to obtain, and many of the anomalies disappear. However, the introduction of chiral fermions leads to new kinds of anomalies. And the primary benefit of the 11D theory - its uniqueness - is lost: 10D is not specially favored, and the theory does not break down naturally into 4 macroscopic and 6 compact dimensions. (One can still find solutions in which this happens, but there is no reason why they should be preferred.) In fact, most 10D supergravity models not only require ad hoc higher-dimensional matter fields to ensure proper compactification, but entirely ignore gauge fields arising from the Kaluza-Klein mechanism (i.e. from symmetries of the compact manifold). A theory which requires all gauge fields to be effectively put in by hand can hardly be considered natural.
A breakthrough in solving the uniqueness and anomaly problems of 10D theory occurred when Green and Schwarz and Gross et al. showed that there were 2 (and only 2) 10D supergravity models in which all anomalies could be made to vanish: those based on the groups SO(32) and E8⊗E8, respectively. Once again, extra terms (known as Chapline-Manton terms) had to be added to the higher-dimensional Lagrangian. This time, however, the addition was not completely arbitrary; the extra terms were those which would appear anyway if the theory were a low-energy approximation to certain kinds of supersymmetric string theory.
Supersymmetric generalizations of strings, or superstrings, are far from being understood. However, they have some remarkable virtues. For example, they retain the appeal of strings, wherein a point particle is replaced by an extended structure, which opens up the possibility of an anomaly-free approach to quantum gravity. (They do this while avoiding the generic prediction of tachyons, which plagued the old string theories.) Also, it is possible to make connections between certain superstring states and extreme black holes. (This may help resolve the problem of what happens to the information swallowed by classical singularities, which has been long standing in general relativity.) It is true that, for a while, there was thought to be something of a uniqueness problem for 10D superstrings, in that the groups SO(32) and E8⊗E8 admit five different string theories between them. But this difficulty was addressed by Witten, who showed that it is possible to view these five theories as aspects of a single underlying theory, now known as M-theory (for "Membrane"). The low-energy limit of this new theory, furthermore, turns out to be 11D supergravity. So it appears that the preferred dimensionality of spacetime may after all be 11, at least in regard to higher-dimensional theories which are compactified.
Supersymmetric particles such as gravitinos and neutralinos, if they exist, could provide the dark or hidden matter necessary to explain the dynamics of galaxies and bring cosmological observations into line with the simplest 4D cosmological models (see Section 1.3). However, such 'dark' matter is probably not completely dark, because the particles concerned are unstable to decay in realistic (non-minimal) supersymmetric theories, and will contribute photons to the intergalactic radiation field. Observations of the latter can be used to constrain supersymmetric weakly interacting massive particles (WIMPs). Thus gravitinos and neutralinos are viable dark-matter candidates if they have decay lifetimes greater than of order 10¹¹ yr and 10⁹ yr respectively (Overduin and Wesson 1997b). In this regard, they are favored over non-supersymmetric candidates such as massive neutrinos, axions and a possible decaying vacuum (Overduin and Wesson 1997c, 1992). There are other candidates, but clearly the identification of dark matter is an important way of testing supersymmetry.
This chapter has presented a potted account of theoretical physics as it exists at present. We have learned certain things, namely: that the fundamental constants are not truly fundamental (Section 1.2); that general relativity describes gravity excellently in curved 4D space (Section 1.3); that particle physics works well as a composite theory in flat 4D space (Section 1.4); that Kaluza-Klein theory in its original version unifies gravity and electromagnetism in curved 5D space (Section 1.5); and that supergravity and superstrings provide possible routes to new physics in 11D and 10D. So, where do we go from here?
There is no consensus answer to this, but let us consider the following line of reasoning. Physics is a description of the world as we perceive it (Eddington). In order to give a logical and coherent account of the maximum number of physical phenomena, we should presumably use the most advanced mathematical techniques. From the last century through to now, this has meant geometry (Einstein, Riemann). The field equations of general relativity place no mathematical constraint on the number of dimensions in which they should be applied; the choice follows from physics and depends on what we wish to explain. Also, there are certain ways of embedding lower-dimensional spaces with complicated structure in higher-dimensional spaces with simple structure, including flat ones (Campbell, Eisenhart: see the next chapter). So the question of how we can best describe gravity and particle physics is to a certain extent a question of algebraic technology. Now we might expect that the many quantum properties of elementary particles should be described by a space with a large number of dimensions, while the classical properties of matter should be handled by a space with a moderate number of dimensions.
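To make the embedding remark concrete, a standard textbook example (offered here only as an illustration) is the realization of curved 4D de Sitter spacetime as a hypersurface in flat 5D Minkowski space:

```latex
% Flat 5D Minkowski metric and the de Sitter hyperboloid:
ds^2 \;=\; dT^2 - dX^2 - dY^2 - dZ^2 - dW^2 ,
\qquad
X^2 + Y^2 + Z^2 + W^2 - T^2 \;=\; H^{-2} .
% The metric induced on this hypersurface is, in suitable coordinates,
ds^2 \;=\; dt^2 - e^{2Ht}\left(dx^2 + dy^2 + dz^2\right) ,
```

which is the de Sitter solution of Einstein's 4D equations with cosmological constant Λ = 3H². A curved 4D spacetime is thus obtained entirely from the geometry of a flat 5D one, in the spirit of the Campbell-Eisenhart embedding results discussed in the next chapter.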
The rest of this treatise is a compilation of (mainly technical) results which demonstrate this view. It will be seen that properties of matter such as the density and pressure of a fluid, as well as the rest mass and electric charge of a particle, can be derived from 5D geometry. This may sound surprising, but there are important differences between what we do now and what others have done before. The theory we will be working with is obviously not Einstein's general relativity, since it is 5D rather than 4D in nature. But it is not Kaluza-Klein theory either, because we do not invoke the hobbling cylinder condition typical of that theory, preferring instead to examine an unrestricted and rich 5D algebra. What we do in the following chapters also differs from previous work in that we do not need an explicit energy-momentum tensor: it will be seen that matter can be derived from geometry.