1.5 Corresponding States
The late 19th century attempts to reconcile the phenomenon of light within the spacetime framework of mechanics were based on the conception of light as a disturbance propagating in an extensive medium, i.e., a medium comprised of localizable parts in Galilean space and time. The medium of these theories was originally patterned on the notion of a continuous fluid or elastic solid substance, and light was regarded as a wave propagating at a characteristic speed c with respect to this medium, but eventually all the simple elastic and fluid models came into conflict with observation. Suggestions for a "molecular vortex" model were also never developed into a satisfactory theory. The most successful ether theory was that of Lorentz, who was a proponent of an absolutely rigid and motionless ether. |
Strictly speaking, ether theories can be regarded as relativistic, provided the medium is regarded as a substance in its own right. The principle of relativity then asserts that all physical behavior is unaffected by imparting to all substances (including the putative medium of light) an arbitrary uniform velocity. In fact, such theories are actually more relational than the special theory of relativity, because they consider the only physically significant motions of substantial objects to be their motions relative to other substantial objects, including the supposed luminiferous ether itself. However, ether theorists rarely addressed the question of whether inertial acceleration (for example) is to be defined relative to the ether, or relative to some prior background space, so the degree of relationism in these theories was ambiguous (not to say problematic). In any case, ether theorists themselves generally held that the ether's rest frame was effectively an absolute rest frame, and some went so far as to say that the ether was the vacuum - endowed with certain physical properties, such as a rest frame. In this sense ether theories entail a rejection of Galilean relativity.
Of course, in order for the ether to define a single uniform state of "rest" we must assume that its parts are all at rest with respect to each other, which tends to conflict with the notion of disturbances of the ether, and is inconsistent with the notion of the ether as an ordinary inertial substance. (It's possible to postulate a local absolute rest frame at each point based on the state of motion of the putative ether at that point, but such variations due to "dragging" of the ether imply observable consequences which are absent from our experience.) In addition, we must still account for the empirical fact that light propagates perfectly well across intervals of space and time devoid of all ordinary inertial substances, so we apparently have no means of ascertaining its rest frame. Furthermore, since we observe that the speed of light is invariant with respect to relatively moving inertial coordinate systems, it is necessary to postulate physical effects of "absolute motion" relative to the undetectable rest frame of the ether in order to account for these observations, and these effects generally entail a violation of Newton's third law for observable entities. Hence, the ether theories commit us to a profound revision of Newtonian mechanics (a fact that is sometimes overlooked). |
In effect, then, ether theories assert that light is based in a non-inertial substance, so the principle of inertial relativity (if it is to be meaningful) does not apply to light or its medium. In order to maintain this view, ether theories are forced to conclude that observers are generally being misled about the "true" spatial and temporal intervals between events, presumably as a result of anisotropic distortions of their rulers and changes in the rates of their clocks due to "absolute velocity" through the ether, all conspiring to make it appear as if each system of local coordinates is the true rest frame of the ether. According to this interpretation, rulers must automatically contract in the direction of motion, and clocks must slow down and somehow offset their synchronizations at different locations and states of motion, so that they conform to one of the systems of local coordinates in the theory developed by Lorentz in the years from 1886 to 1904.
The rationale given by Lorentz for nature's apparent conspiracy to make things appear relativistic (without actually being so) was based on the idea that all material bodies - including the molecules that comprise our rulers and clocks - are bound together and configured by means of forces, and that all such forces are of a similar nature to electromagnetic forces (i.e., the mechanism of light) and inertial forces in the sense that they transform from one Galilean system of coordinates to another just as do electromagnetic forces. Lorentz called this the Molecular Force Hypothesis. If this is granted, then it's plausible that when measuring the characteristics and behavior of material objects by means of the attributes and behavior of other material objects (such as clocks and rulers), we would never detect any effect that absolute velocity (with respect to some putative fixed background frame) has on such objects. In essence, this amounts to the observation that if we use a yardstick to measure itself, it will always indicate one yard, regardless of whether it has been stretched or compressed.
By this kind of reasoning, Lorentz and others argued that the apparent relativity of physical phenomena does not warrant the conclusion that the phenomena are "truly" relativistic, because all our measuring instruments (and indeed we ourselves) are subject to exactly the same effects of absolute velocity as the phenomena we are trying to measure. In this way it is possible to resolve the apparent paradox of two observers in relative states of motion, each measuring himself to be at the center of one and the same expanding spherical wave of light. We simply assume that the absolute velocity of each observer (relative to the ether) distorts his co-moving clocks and rulers in whatever way is necessary to make it appear that he remains at the center of the expanding sphere. According to this point of view, inertia is actually not isotropic with respect to most "true" systems of coordinates, and the classical inertial coordinate systems are erroneous, based on distorted measurements of time and distance.
However, Lorentz was sensitive to the criticism that contraction of lengths and slowing of clocks were simply ad hoc assumptions with no physical basis. Of course, in a historical sense they really were ad hoc, because Lorentz had been surprised by the need for length contraction implied by Michelson's null result. However, in retrospect, Lorentz elaborated a constructive theory that explains the apparent relativity of electrodynamics in terms of the detailed physical processes that govern the sizes, shapes, and motions of material objects, all assuming a traditional Galilean background of space and time. In those days the only known forces with any appreciable strength on a small scale were the electromagnetic forces, which Lorentz believed were propagated at the invariant speed c with respect to an absolutely motionless background medium (the luminiferous ether). Hence he argued that, by the Molecular Force Hypothesis, it was reasonable to suppose that the structure of all fundamental (non-ether) material entities was established and enforced by signals that propagate at the absolute speed c. For example, an elementary particle at rest in the ether would be expected to have a spherical shape, based on the idea that an electromagnetic wave emanating from the geometric center of the particle would expand spherically until reaching the radius of the particle, where we can imagine that it is reflected, and the reflected wave then contracts spherically back to a point (like a spatial filter) and re-expands on the next cycle. This is illustrated by the left-hand cycle below. (Only two spatial dimensions are shown; in full 4-dimensional spacetime each shell is a sphere.)
(This same conception will re-appear in broader context in section 9.11.) If the particle is moving relative to the putative ether, then obviously the absolute shape must change from a sphere to an ellipsoid, as illustrated by the right-hand figure above. Of course, the spatial size of the particle with respect to the ether rest frame coordinates is just the intersection of a horizontal time slice with the shaft swept out by these shells. It's also clear that, for any given characteristic particle, since there is no motion relative to the ether in the transverse direction, the size in the transverse direction is unaffected by the motion. Thus the widths of the shells in the "y" direction in the above figure are equal. |
On this basis, what can we infer about the natural measures of space and time for moving objects? The figure below shows side and top views of one cycle of a stationary and a moving particle (with motions referenced to the rest frame of the putative ether). |
It's understood that these represent the same characteristic particle (i.e., particles of the same intrinsic physical construction), so the transverse size is the same. The right-hand particle is moving with a speed v in the positive x direction. In each case the geometric center of the particle is moving from point A to point B. The characteristic radius of the particle shell has been taken as unity. |
In order to make the transverse sizes of the shells equal, the enclosed areas of the cross-sectional side views must be equal. Thus, light emanating from point A of the moving particle extends a distance $1/\lambda$ to the left and a distance $\lambda$ to the right, where $\lambda$ is a constant that depends only on v. Specifically, we must have

$$\lambda = \sqrt{\frac{1+v}{1-v}}$$

The leading edge of the shaft swept out by the moving shell crosses the x axis at a distance $\lambda(1-v)$ from the center point A, which implies that the object's instantaneous spatial extent from the center to the leading edge is only

$$\lambda(1-v) = \sqrt{1-v^2}$$

Likewise it's easy to see that the elapsed time (according to the putative ether rest frame coordinates) for one cycle of the moving particle, i.e., from point A to point B, is simply

$$\lambda + \frac{1}{\lambda} = \frac{2}{\sqrt{1-v^2}}$$
compared with an elapsed time of 2 for the same particle at rest. Hence we unavoidably arrive at Fitzgerald's length contraction and Lorentz's time dilation for objects in motion with respect to the x,y,t coordinates, provided only that all characteristic spatial and temporal intervals associated with physical entities are enforced by signals that propagate at the fixed speed c = 1 with respect to these coordinates. From these effects we easily deduce the full Lorentz transformation. This is essentially the same reasoning that leads to the elliptical shape of the retarded potential of a uniformly moving electric charge (the so-called Heaviside ellipsoid). |
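The shell geometry described above can be checked numerically. The following is a minimal sketch in units with c = 1, assuming (as the equal-area argument requires) that the forward and backward light extents multiply to one. The function name `shell_cycle` and the keys of the returned dictionary are illustrative choices, not part of the original text.

```python
import math

def shell_cycle(v):
    """One cycle of a unit-radius particle moving at speed v through the ether (c = 1).

    Assumption: the edges of the particle sit at x = +/- r + v*t, and the
    forward and backward light extents multiply to 1 (equal enclosed areas).
    """
    if not 0 <= v < 1:
        raise ValueError("speed must satisfy 0 <= v < 1")
    # Light moving right meets the leading edge when t = r + v*t, so t = r/(1 - v).
    # Light moving left meets the trailing edge when t = r - v*t, so t = r/(1 + v).
    # Requiring (r/(1 - v)) * (r/(1 + v)) = 1 fixes the longitudinal radius r.
    r = math.sqrt(1 - v * v)
    forward = r / (1 - v)          # extent of the light to the right
    backward = r / (1 + v)         # extent of the light to the left
    # Out to the leading edge takes r/(1 - v); back to the center takes r/(1 + v).
    period = forward + backward    # ether-frame time for one cycle, A to B
    return {"radius": r, "forward": forward, "backward": backward, "period": period}

cycle = shell_cycle(0.6)
print(cycle["radius"], cycle["period"])
```

For v = 0.6 the longitudinal radius shrinks from 1 to about 0.8 and the cycle time grows from 2 to about 2.5, which are the contraction and dilation factors named in the text.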
It's somewhat puzzling that as late as 1909 Poincaré still believed length contraction required a separate assumption or empirical justification, independent of the justification for the "local time" parameter, which he understood was the effective time indicated on clocks at rest in the transformed coordinates. This is surprising, because it seems fairly obvious that, since spatial lengths are related directly to time intervals by $dx = v\,dt$, any change in the effective time coordinate automatically implies a corresponding change in the effective space coordinate. If an observer moves at the speed v relative to the ground, and passes over an object of length L at rest on the ground, it's clear that the length of that object as assessed by the moving observer will be affected by his effective measure of time. Since he is moving at speed v, the length of the object is $v\,dt$, where dt is the time it takes him to traverse the length of the object. But which "dt" will he use? Naturally if he bases his length estimate on the measure of the time interval recorded on a ground clock, he will have $dt = L/v$, so he will judge the object to be $v(L/v) = L$ units in length. However, if he uses his own effective time as indicated on his own co-moving clock, he will have $dt' = dt\,(1-v^2)^{1/2}$, so the effective length is $v[(L/v)(1-v^2)^{1/2}] = L(1-v^2)^{1/2}$. Thus, effective length contraction is logically unavoidable if we accept effective time dilation.
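The transit-time argument can be written out as a few lines of arithmetic. This is only a sketch in units with c = 1; the function name `transit_length` and the flag `use_comoving_clock` are hypothetical names introduced for illustration.

```python
import math

def transit_length(rest_length, v, use_comoving_clock=True):
    """Length of a ground object inferred by an observer passing over it at speed v (c = 1)."""
    if not 0 < v < 1:
        raise ValueError("speed must satisfy 0 < v < 1")
    dt_ground = rest_length / v                 # transit time read from ground clocks
    if use_comoving_clock:
        dt = dt_ground * math.sqrt(1 - v * v)   # transit time read from the slowed co-moving clock
    else:
        dt = dt_ground
    return v * dt                               # length = speed * transit time

print(transit_length(10.0, 0.6, use_comoving_clock=False))  # length judged with ground time
print(transit_length(10.0, 0.6))                            # length judged with co-moving time
```

With L = 10 and v = 0.6, ground time returns the full length of about 10, while the co-moving clock gives about 8, the contracted value.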
We saw in section 1.4 that the full Lorentz transformation can be inferred from the principle of relativity combined with Maxwell's equations, and we also saw that the simple wave equation (for light) can be substituted for Maxwell's equations, and is sufficient to imply the Lorentz transformation. Einstein subsequently demonstrated that we can go even further in simplifying the basic principles, and replace the wave equation with the condition that light propagates (in vacuum) at the speed c with respect to every system of inertial coordinates. From this, the full Lorentz transformation, with its effective time dilation and length contraction, follows unavoidably. In fact, taking the viewpoint of Lorentz, this can even be demonstrated on the assumption that the speed of light is actually c only with respect to one particular system of coordinates. Even on this basis we can show that the speed of light will seem to be c with respect to every inertial system of coordinates. |
Suppose the speed of light is exactly c in all directions with respect to the fixed coordinates X,T. Also suppose, like Lorentz, that all physical processes conform to patterns established by bound light-like interactions, so we can represent arbitrary time keepers by means of "light clocks", idealized as a pair of mirrors facing each other, bouncing a pulse of light back and forth. If such a clock is moving with speed v in terms of the X,T coordinates, then clearly the pulses will occur more slowly than they would if the clock was stationary, because the light must travel a greater distance (with respect to the X,T coordinates). If the axis between the mirrors is perpendicular to the direction of motion of the clock, then increments of time in the effective co-moving coordinates x,t satisfy the relation $dt = dT\,(1-v^2)^{1/2}$. However, if the axis is parallel to the direction of motion we have $dt = dT\,(1-v^2)$. In other words, it might seem as if our moving clock keeps time differently depending on the direction in which it is pointing. Then, by the reasoning given above, we might also think that an object of rest-length L has the effective length $L(1-v^2)$, so we seem to have deduced a strange anisotropic system of measurements.
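The two tick rates quoted above follow from elementary light-path timing. The sketch below assumes c = 1, mirrors one unit apart, and no contraction of the clock; the function name `tick_ratio` and the orientation labels are illustrative assumptions.

```python
import math

def tick_ratio(v, orientation):
    """Ratio dt/dT of moving-clock time to ether time for an uncontracted light clock (c = 1)."""
    if not 0 <= v < 1:
        raise ValueError("speed must satisfy 0 <= v < 1")
    rest_round_trip = 2.0  # mirrors one unit apart, clock at rest in the ether
    if orientation == "transverse":
        # Each leg is a hypotenuse: T**2 = 1 + (v*T)**2, so T = 1/sqrt(1 - v**2).
        moving_round_trip = 2.0 / math.sqrt(1 - v * v)
    elif orientation == "longitudinal":
        # Chasing the far mirror takes 1/(1 - v); the return leg takes 1/(1 + v).
        moving_round_trip = 1.0 / (1 - v) + 1.0 / (1 + v)
    else:
        raise ValueError("orientation must be 'transverse' or 'longitudinal'")
    return rest_round_trip / moving_round_trip

print(tick_ratio(0.6, "transverse"))    # close to sqrt(1 - v^2) = 0.8
print(tick_ratio(0.6, "longitudinal"))  # close to 1 - v^2 = 0.64
```

The mismatch between the two printed values is the apparent anisotropy that the following paragraphs resolve by allowing the clock to contract.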
This analysis assumed that the moving clock's rest length (i.e., the distance between the mirrors) was uncontracted with respect to the ground coordinates, but since we are constructing a system of rest coordinates for the clock in terms of which the speed of light is to be isotropically c, the very same analysis that indicates length contraction for objects moving relative to the original coordinates X,T also indicates the same contraction for objects moving relative to the new coordinates x,t. This reciprocity implies that the clock contracts in the longitudinal direction relative to the ground's coordinates by the same factor that objects on the ground contract in terms of the moving coordinates.
At this stage we need to proceed carefully, because the amount of spatial contraction depends on the amount of time dilation, but the amount of time dilation depends on the spatial contraction. If we decide the clock is shortened by the full longitudinal factor of $(1-v^2)$, then there will be no time dilation at all, but of course this is logically inconsistent, because if there is no time dilation there is no length contraction, so we must restore the clock to its full length... which restores the time dilation, which implies length contraction again, and so on. There is only one logically self-consistent arrangement that reconciles each reference frame's natural measures of longitudinal time and length. The lengths and the times must both be reduced by the factor $(1-v^2)^{1/2}$. This is also equal to the transverse time dilation, so in fact we do have isotropic clocks with respect to the natural measures of space and time of any uniformly moving frame, and of course the speed of light is c with respect to any of those systems of coordinates. This is illustrated by the figures below, showing how the spacetime pattern of reflecting light rays imposes a skew in both the time and the space axes of relatively moving systems of coordinates.
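The self-consistency requirement can be posed as a fixed-point problem: find the contraction factor k for which a longitudinal light clock shortened by k is dilated by that same factor k. The sketch below (c = 1, unit mirror spacing at rest) solves this by bisection; the function names and the choice of bisection are assumptions made for illustration, not a method taken from the text.

```python
def longitudinal_dilation(v, contraction):
    """dt/dT for a longitudinal light clock whose mirror spacing is scaled by `contraction`."""
    round_trip = contraction / (1 - v) + contraction / (1 + v)
    return 2.0 / round_trip  # the rest-frame round trip takes 2 time units

def consistent_factor(v, tol=1e-12):
    """Bisect for the factor k satisfying k == longitudinal_dilation(v, k)."""
    lo, hi = 1 - v * v, 1.0  # the two inconsistent extremes discussed in the text
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The dilation falls as k rises, so k - dilation(k) increases with k.
        if mid - longitudinal_dilation(v, mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(longitudinal_dilation(0.6, 1.0))   # full-length clock: dilation near 1 - v^2
print(longitudinal_dilation(0.6, 0.64))  # fully shortened clock: dilation near 1
print(consistent_factor(0.6))            # self-consistent factor, near sqrt(1 - v^2)
```

Naively feeding each dilation back in as the next contraction just flips between the two extremes, which is the "and so on" regress in the text; the bisection instead settles on the single value where the two factors agree.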
A slightly different approach is to notice that according to a "transverse" light clock, we have the partial derivative $dt/dT = 1/(1-v^2)^{1/2}$ along the absolute time axis, i.e., the line X = 0. Integrating gives $t = (T - f(X))/(1-v^2)^{1/2}$ where f(X) is an arbitrary function of X. The question now is: does there exist a function f(X) that will yield physical relativity? If there does, then obviously those are the coordinates that will naturally be adopted as the rest frame by any observer at rest with respect to them. The answer is yes, we can set $f(X) = vX$, which gives $t = (T - vX)/(1-v^2)^{1/2}$. To show reciprocity, note that X = vT along the t axis, so we have $t = T(1-v^2)/(1-v^2)^{1/2}$, which gives $T = t/(1-v^2)^{1/2}$ and so $dT/dt = 1/(1-v^2)^{1/2}$. We've seen that this same transformation yields relativity in the longitudinal direction as well, so there does indeed exist, for any object in any state of motion, a coordinate system with respect to which all optical phenomena are isotropic, and as a matter of empirical fact this is precisely the same class of systems invoked by Galileo's principle of mechanical relativity, the inertial systems, i.e., coordinate systems with respect to which mechanical inertia is isotropic.
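A quick numerical check can confirm that the choice f(X) = vX keeps the speed of light at c in the effective coordinates. The sketch below uses units with c = 1 and pairs the time transformation from the text with the standard spatial counterpart x = (X - vT)/(1-v^2)^(1/2), which is an assumption here since this passage does not write it out. The function names are illustrative.

```python
import math

def to_moving(X, T, v):
    """Map ether coordinates (X, T) to effective coordinates (x, t), with c = 1."""
    g = 1.0 / math.sqrt(1 - v * v)
    x = g * (X - v * T)  # assumed standard spatial transformation
    t = g * (T - v * X)  # time transformation with f(X) = v*X
    return x, t

def light_speed_seen(v, direction):
    """Speed of the light pulse X = direction*T as measured in the moving coordinates."""
    x0, t0 = to_moving(0.0, 0.0, v)
    x1, t1 = to_moving(direction * 1.0, 1.0, v)
    return (x1 - x0) / (t1 - t0)

print(light_speed_seen(0.6, +1), light_speed_seen(0.6, -1))

# Reciprocity: along the t axis (X = v*T) the moving clock reads t = T*sqrt(1 - v^2).
x, t = to_moving(0.6 * 5.0, 5.0, 0.6)
print(x, t)
```

Both pulses come out with speed 1 in magnitude, and the event at T = 5 on the t axis is assigned x near 0 and t near 4, matching T = t/(1-v^2)^(1/2).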
The complete reciprocity and symmetry between the "true" rest frame coordinates and each of the local effective coordinate systems may seem surprising, and in fact some people have declared it logically impossible, since we seem to have shown that $dt/dT = dT/dt$, which ought to imply $(dt)^2 = (dT)^2$. However, this objection is based on a confusion between total and partial derivatives. What we evaluated as dt/dT and dT/dt were actually directional derivatives of the transformation along two different directions. The parameter t is a function of both X and T, and what we called dt/dT is really $(\partial t/\partial T)_X$, i.e., the partial derivative of t with respect to T at constant X. Likewise T is a function of both x and t, and what we called dT/dt is really $(\partial T/\partial t)_x$, i.e., the partial derivative of T with respect to t at constant x. Needless to say, there is nothing logically inconsistent about a transformation between (x,t) and (X,T) such that $(\partial t/\partial T)_X$ equals $(\partial T/\partial t)_x$.
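The distinction can be made explicit by writing the transformation and its inverse side by side. The block below is a sketch in units with c = 1; the symbol $\gamma$ and the spatial equations are standard companions to the time transformation and are assumed here, since this passage does not state them.

```latex
\begin{aligned}
t &= \gamma\,(T - vX), & x &= \gamma\,(X - vT), & \gamma &= (1-v^2)^{-1/2},\\
T &= \gamma\,(t + vx), & X &= \gamma\,(x + vt),\\
\left(\frac{\partial t}{\partial T}\right)_{X} &= \gamma, &
\left(\frac{\partial T}{\partial t}\right)_{x} &= \gamma, &
\left(\frac{\partial t}{\partial T}\right)_{x} &= \frac{1}{\gamma}.
\end{aligned}
```

The first two derivatives are equal because they are taken along different directions (constant X versus constant x). Along one and the same direction, constant x, the two derivatives are reciprocals, $\gamma$ and $1/\gamma$, as ordinary calculus requires.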
The preceding was derived on the basis of the assumption that light propagates isotropically at the speed c with respect to one particular "ether" frame of reference, but this led inevitably to the conclusion that there exist infinitely many effective coordinate systems, with spatial origins in any state of uniform motion relative to the "ether" frame, such that the speed of light is isotropically c with respect to each of these effective coordinate systems. We've also seen that complete reciprocity exists between these systems of coordinates, which implies that any one of them could, in principle, be regarded as the rest frame of the putative ether. Nevertheless, Lorentz and others maintained that only one of those frames was the "true" frame, and that all the other effective "local" coordinate systems were, in some sense, less physically genuine than the coordinates in which the putative ether is at rest. It is undeniably possible to accommodate the experimental results of Michelson et al. in this way.
However, there are reasons to be less than completely satisfied with this resolution. For one thing, it relies heavily on a hypothesized unmovable background (the ether) that acts on material bodies but is not acted upon by them, making it quite unlike any ordinary substance. Its only detectable characteristic seems to be complete undetectability. Moreover, from an operational point of view, Lorentz's explanation requires us to define all "true" lengths and time intervals on the basis of the one true isotropic (directionally symmetric) light frame, and yet it provides us with no means of identifying that crucially important frame, and in fact, according to the theory's own logic, that frame is not identifiable even in principle, at least not by means limited to ordinary physical phenomena. It is, of course, theoretically possible that some other kind of phenomenon which, unlike ordinary material bodies, is not affected by the light-carrying medium, might be able to identify the true state of rest, but all the evidence indicates that gravity and the strong and weak nuclear forces (unknown to Lorentz, of course) transform under relative velocity in precisely the same way as does the electromagnetic force, so we are still left with no (local) physical means of identifying the true rest frame, without which we are totally unable to determine the "true" values of any of the parameters of the theory. It is not very satisfactory for the central component of a theory to be both undetectable and of no effective physical significance. (Incidentally, this was the case with Newton's absolute space, which he carefully defined and justified at great length, but then never had occasion to use in the actual propositions of the Principia). |
Another, perhaps more important reason for dissatisfaction with this explanation is that it not only denies the principle of relativity with respect to the observed phenomena of light, it also implies that the principle of relativity is strictly inapplicable to the motions of material bodies as well. This is because, as noted above, theories in which an undetectable ether affects the inertial behavior of material objects necessarily involve a violation of Newton's third law for observable entities. If we consider an enclosed space capsule filled with quiescent air, we are required (according to Lorentz's conception) to believe that sound waves moving through the air in an apparently isotropic manner are actually not isotropic, because according to Lorentz the apparent isotropy is just an illusion created by our distorted rulers and clocks (unless we happen to be at true rest, which can't always be the case). So, in attempting to reconcile light with the principle of Galilean relativity, it seems that we have only succeeded in invalidating Galilean relativity entirely. Of course, it remains true in Lorentz's theory that everything - including light - still appears to be relativistic. We are simply asked to regard this fact as the inevitable consequence of self-referential measurements, i.e., a yard stick - which is fundamentally an electromagnetic configuration - will always indicate that it is one yard long, even if it is "actually" shrunk or stretched. Essentially Lorentz was recognizing that our facilities for measuring physical phenomena are necessarily physical themselves, and we can't assume our measurements are exempt from the physical effects we are trying to measure. This is somewhat analogous to the conclusion reached in quantum mechanics, where it's understood that our observations of physical entities and interactions consist of physical interactions themselves, so we cannot consider phenomena independently of the act of observation.
Of course, Einstein's relativistic interpretation invokes the same self-referential quality by defining distances and times operationally in terms of known physical processes and entities, such as clocks and rulers. The difference is that, in addition to operational time, Lorentz continued to maintain a metaphysical time, even after he had acknowledged that it could not be defined in any physical way, and that all known phenomena proceed as if it did not exist. According to Lorentz, only one particular set of (presumably inertial) coordinates represents the true measures of time and space, and all the other inertial coordinate systems are merely useful fictions, corresponding to how rulers and clocks are affected by their absolute states of motion relative to the rest frame of the ether. On the other hand, despite his attachment to a unique metaphysical decomposition of spacetime into absolute space and time, Lorentz ultimately acknowledged the greater simplicity and heuristic power of Einstein's interpretation (described in the next sections), which became progressively more evident as the years passed, especially once Minkowski reduced it to a simple assertion about the pseudo-metrical structure of spacetime, and even more so when Einstein showed how a generalization of this view led naturally to a field theory of gravitation, unifying inertia and gravity just as the special theory unifies electricity and magnetism. |