It's often overlooked that Einstein actually began his 1905 paper "On the Electrodynamics of Moving Bodies" by describing a system of coordinates based on a single absolute measure of time. He points out that we could assign time coordinates to each event |
...by using an observer located at the origin of the coordinate system, equipped with a clock, who coordinates the arrival of the light signal originating from the event to be timed and traveling to his position through empty space. |
This is equivalent to Lorentz's conception of "true" time, provided the origin of the coordinate system is at "true" rest. However, for every frame of reference except the one at rest with the origin, these coordinates would not be inertial, i.e., Newton's laws of motion would not even be quasi-statically valid. Furthermore, the selection of the origin is operationally arbitrary, and even if the origin were agreed upon, there would be significant logistical difficulties in actually carrying out a coordination based on such a network of signals. Einstein says "We arrive at a much more practical arrangement by means of the following considerations" (my emphasis). |
In his original presentation of special relativity Einstein proposed two basic principles, derived from experience. The first is nothing other than Galileo's classical principle of relativity, which states that for any material object in any state of motion there exists a system of coordinates, called inertial coordinates, with respect to which the object is instantaneously at rest and Newton's laws of motion are quasi-statically valid. However, as discussed in previous sections, this principle alone is not sufficient to give a useful basis for evaluating physical phenomena. We must also have knowledge of the transformation rule that determines how the description of events with respect to one system of inertial coordinates is related to the description of those same events with respect to another, relatively moving, system of inertial coordinates. Rather than simply assuming a transformation rule based on some prior metaphysical conception of space and time, Einstein realized that the correct rule could be deduced from the facts of experience, particularly "the unsuccessful attempts to discover any motion of the earth relatively to the 'light medium'". Since we define motion in terms of inertial coordinates, i.e., systems of coordinates in which mechanical inertia is isotropic, these experiments imply that the propagation of light is isotropic with respect to the very same class of coordinate systems. On the other hand, all the experimental results that are consolidated into Maxwell's equations imply that the propagation speed of light (with respect to any inertial coordinate system) is independent of the state of motion of the emitting source. |
As an aside, notice that isotropy with respect to inertial coordinates is what we would expect if light was a stream of inertial corpuscles (as suggested by Newton), whereas the independence of the speed of light from the motion of its source is what we would expect if light was a wave phenomenon. This is the same dichotomy that we encounter in quantum mechanics, and it's not coincidental that Einstein wrote his seminal paper on light quanta almost simultaneously with his paper on the electrodynamics of moving bodies. He might actually have chosen to combine the two into a single paper discussing general heuristic considerations arising from the observed properties of light, and the reconciliation of the apparent dichotomy in the nature of light as it is usually understood. |
From the facts that (1) light propagates isotropically with respect to every system of inertial coordinates (which is essentially just an extension of Galileo's principle of relativity), and (2) the speed of propagation of light with respect to any system of inertial coordinates is independent of the motion of the emitting source, it follows that the speed of light is invariant with respect to every system of inertial coordinates. From this simple fact, Einstein derived the correct transformation rule between relatively moving systems of inertial coordinates. |
To establish the transformation rule for this "more practical" system of coordinates (i.e., the class of inertial coordinate systems), Einstein notes that if an observer at location A sends a pulse of light at time $t_1$ toward a distant object B and receives the reflected pulse from B at time $t_2$, then according to the inertial coordinate system in which A is at rest the event of the light striking B occurs at the time $(t_1 + t_2)/2$. In other words, since light is isotropic with respect to the same class of coordinate systems in which mechanical inertia is isotropic, the light pulse takes the same amount of time, $(t_2 - t_1)/2$, to travel each way. Also the event of the light striking B occurs at the spatial location $c(t_2 - t_1)/2$ with respect to the inertial coordinates in which A is at rest at the origin. |
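The radar-style assignment just described can be sketched in a few lines of Python. This is only an illustration under the stated assumption that A is at rest at the origin of an inertial coordinate system; the function name `radar_coordinates` and the variable names are hypothetical, chosen here for readability.

```python
# Speed of light in metres per second (exact SI value).
C = 299_792_458.0

def radar_coordinates(t1, t2, c=C):
    """Return (t, x) assigned to the reflection event at B.

    t1: emission time of the pulse on A's clock (seconds)
    t2: reception time of the echo on A's clock (seconds)
    Assumes A is at rest at the origin of an inertial coordinate
    system, so the pulse takes equal times on the outward and
    return trips.
    """
    if t2 < t1:
        raise ValueError("the echo cannot arrive before the pulse is sent")
    t_event = (t1 + t2) / 2.0      # midpoint of emission and reception
    x_event = c * (t2 - t1) / 2.0  # one-way travel time multiplied by c
    return t_event, x_event

# Example: pulse sent at t1 = 0 s, echo received at t2 = 2 s.
t, x = radar_coordinates(0.0, 2.0)
print(t, x)
```

For this example the reflection is assigned the time 1 s and a distance of one light-second from A.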
Naturally the invariance of light speed with respect to inertial coordinates is implicit in the principles on which special relativity is based, but we must not make the mistake of thinking that this invariance is therefore tautological, or merely an arbitrary definition. Inertial coordinates are not arbitrary, and they are definable without explicit reference to the phenomenon of light. The real content of Einstein's principles is that light is an inertial phenomenon (despite its apparent wavelike nature). Oddly enough, the clearest statement of this insight came only as an afterthought appearing in Einstein's second paper on relativity in 1905, in which he explicitly concluded that "radiation carries inertia between emitting and absorbing bodies". Once it is posited that light is inertial, then Galileo's principle of relativity automatically implies that light propagates isotropically from the source, regardless of the source's state of uniform motion. Consequently, if we elect to use space and time coordinates in terms of which light speed is not isotropic (which we are certainly free to do), we will necessarily find that no inertial processes are isotropic. For example, we will find that two identical marbles expelled from a tube in opposite directions by an explosive charge located between them will not fly away at equal speeds, i.e., momentum will not be conserved. Conversely, if we use ordinary mechanical inertial processes together with the conservation of momentum (and we decline to assign any momentum or reaction to unobservable and/or immovable entities), we will necessarily arrive at clock synchronizations that are identical with those given by Einstein's light rays. Thus, Einstein's "more practical arrangement" is based on (and ensures) isotropy not just for light propagation, but for all inertial phenomena. |
If a uniformly moving observer uses pairs of identical material objects thrown with equal force in opposite directions to establish spaces of simultaneity, he will find that his synchronization agrees with that produced by Einstein's assumed isotropic light rays. The special attribute of light in this regard is due to the fact that, although light is inertial, it has no mass of its own, and therefore no rest frame. It can be regarded entirely as nothing but an interaction along a null interval between two massive bodies, the emitter and absorber. From this follows the indefinite metric of spacetime, and light's seemingly paradoxical combination of wavelike and inertial properties. (This is discussed more fully in Section 9.11.) |
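The marble example can be made concrete with a small numerical sketch. It assumes one spatial dimension, units with c = 1, and a hypothetical relabelling of time t' = t + kx/c (with |k| < 1) applied to standard inertial coordinates (x, t). This particular linear resynchronization, the parameter `k`, and the function name `skewed_velocity` are illustrative assumptions, not something taken from Einstein's paper.

```python
C = 1.0  # units in which the speed of light is 1

def skewed_velocity(u, k, c=C):
    """Coordinate velocity dx/dt' of an object whose velocity is u in
    standard inertial coordinates, after relabelling time as
    t' = t + k*x/c. Setting k = 0 recovers the standard coordinates.

    Along the path x = u*t we have t' = t*(1 + k*u/c), hence
    dx/dt' = u / (1 + k*u/c).
    """
    return u / (1.0 + k * u / c)

k = 0.5  # hypothetical amount of synchronization skew

# Light pulses sent in opposite directions from the origin.
light_right = skewed_velocity(+C, k)    # about  0.667
light_left = skewed_velocity(-C, k)     # -2.0

# Identical marbles expelled in opposite directions at standard speed 0.1.
marble_right = skewed_velocity(+0.1, k)  # about  0.0952
marble_left = skewed_velocity(-0.1, k)   # about -0.1053

print(light_right, light_left)
print(marble_right, marble_left)
```

With k = 0 the two marbles, like the two light pulses, have equal and opposite coordinate velocities. For any non-zero k in this sketch, the light pulses and the marbles become anisotropic together, which is the sense in which the two kinds of isotropy stand or fall as one.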
It's also worth noting that when Einstein invoked the operational definitions of time and distance based on light propagation, he commented that "we assume this definition of synchronization is free from contradictions, and possible for any number of points". This is crucial for understanding why a set of definitions based on the propagation of light is tenable, in contrast with a similar set of definitions based on non-inertial signals, such as acoustical waves or postal messages. A set of definitions based on any non-inertial signal can't possibly preserve inertial isotropy. Of course, a signal requiring an ordinary material medium for its propagation would obviously not be suitable for a universal definition of time, because it would be inapplicable across regions devoid of that substance. Likewise, a signal consisting of (or carried by) any material object would be unsuitable because such objects do not exhibit any particular fixed characteristic of motion, as shown by the fact that they can be brought to rest. Furthermore, if there exist any signals faster than those on which we base our definitions of temporal synchronization, those definitions will be easily falsified. The fact that Einstein's principles are empirically viable at all, far from being vacuous or tautological, is actually somewhat miraculous. |
In fact, if we were to describe the kind of physical phenomenon that would be required in order for us to have a consistent capability of defining a coherent basis of temporal synchronization for spatially separate events, clearly it could be neither a material object, nor a disturbance in a material medium, and yet it must exhibit some fixed characteristic quality of motion that exceeds the motion of any other object or signal. We hardly have any right to expect, a priori, that such a phenomenon exists. On the other hand, it could be argued that Einstein's second principle is just as classical as his first, because sight has always been the de facto arbiter of simultaneity (as well as of straightness, as in "uniform motion in a straight line"). Even in Galileo's day it was widely presumed that vision was instantaneous, so it automatically was taken to define simultaneity. (We discuss the historical progress of understanding the speed of light in Section 3.3.) The difference between this and the modern view is not so much the treatment of light as the means of defining simultaneity, but simply the realization that light propagates at a finite speed, and therefore the spacetime manifold is only partially ordered. |
The conventional axiomatic approach to deriving the Lorentz transformations follows closely the form of Einstein's 1905 paper, in which the special theory of relativity is formally deduced from two empirically derived principles: |
(1) The laws of physics take the same form with respect to any inertial system of coordinates. |
(2) The speed of light is c with respect to any inertial system of coordinates. |
It is often commented that if Maxwell's equations are regarded as fundamental laws of physics, then (2) is superfluous, because Maxwell's equations prescribe the speed of light propagation independent of the source's motion, and with no restriction to any particular system of inertial coordinates. However, by 1905 Einstein already had good reasons to doubt the absolute validity of Maxwell's equations, because he had already completed his paper on the photo-electric effect which introduced the idea of photons, i.e., light propagating as discrete packets of energy, a concept which cannot be represented as a solution of Maxwell's linear equations. In addition, Einstein realized that a purely electromagnetic theory of matter based on Maxwell's equations was impossible, because those equations by themselves could never explain the equilibrium of electric charge that constitutes a charged particle. "Only different, nonlinear field equations could possibly accomplish such a thing." This observation shows how unsupported was the "molecular force hypothesis" of Lorentz, according to which all the forces of nature were assumed to transform exactly as do electromagnetic forces as described by Maxwell's linear equations. Knowing that the molecular forces responsible for the equilibrium of charged particles must necessarily be of a fundamentally different character than the Lorentz forces of electromagnetism, and certainly knowing that the stability of matter may not even have a description in the form of a continuous field theory at all, it's clear that the constructive motivation for Lorentz's hypothesis on the basis of Maxwell's equations is very weak. |
Einstein's contribution was to recognize that "the bearing of the Lorentz transformation transcended its connection with Maxwell's equations and was concerned with the nature of space and time in general". So, instead of basing special relativity on an assumption of the absolute validity of Maxwell's equations, Einstein based it on the particular characteristic exhibited by those equations, namely Lorentz invariance, that he intuited was the more fundamental principle, one that could serve as an organizing principle analogous to the conservation of energy in thermodynamics, and one that could encompass all physical laws, even if they turned out to be completely dissimilar to Maxwell's equations. Remarkably, this has turned out to be the case. |
Although Einstein explicitly highlighted just two principles as the basis of special relativity in his 1905 paper, he later acknowledged (in an unpublished manuscript known as the Morgan document, written in 1921) three additional and important assumptions that had been tacitly invoked in that paper: |
(3) Homogeneity: The intrinsic properties of ideal rods and clocks do not depend on their positions in (empty) space, nor do they vary over time. |
(4) Spatial Isotropy: The intrinsic properties of ideal rods and clocks do not depend on their orientations in (empty) space. |
(5) Memorylessness: The extrinsic properties of rods and clocks may be functions of their current states of motion, but not of their previous states of motion. |
The last assumption is needed to exclude the possibility that every elementary particle may somehow "remember" its entire history of accelerations, and thereby "know" its present absolute velocity relative to a common fixed reference. By assuming memorylessness we assume that the current intrinsic properties and behavior of an elementary particle may depend on the particle's present velocity and acceleration, but not on its previous history of velocities and accelerations. |
To review the standard derivation, consider two inertial frames of reference k and K with coordinates (x,y,z,t) and (X,Y,Z,T) respectively, oriented so that the x and X axes coincide, and the xy plane coincides with the XY plane. Also, suppose the system K is moving in the positive x direction with fixed speed v relative to the system k, and the origins of the two systems momentarily coincide at time t = T = 0. According to the principle of homogeneity, the relationship between the two sets of coordinates must be linear, so there must be constants $A_1$ and $A_2$ (for a given v) such that $X = A_1 x + A_2 t$. Furthermore, if an object is stationary relative to K, and if it passes through the point (x,t) = (0,0), then its position in general satisfies x = vt, from the definition of velocity, and the X coordinate of that point with respect to the K system is 0. Therefore we have
$$X = A_1 (vt) + A_2 t = 0$$
Since this must be true for non-zero t, we must have $A_1 v + A_2 = 0$, and so $A_2 = -A_1 v$. Consequently, there is a single constant A (for any given v) such that
$$X = A(x - vt)$$
Similarly there must be constants B and C such that
$$Y = By, \qquad Z = Cz$$
Also, invoking isotropy and homogeneity, we claim that T is independent of y and z, so it must be of the form
$$T = Dx + Et$$
for some constants D and E (for a given v). Now it only remains to determine the values of the constants A, B, C, D, and E in the above expressions. |
Suppose at the instant when the spatial origins of k and K coincide a spherical wave of light is emitted from their common origin. At a subsequent time t in the first frame of reference the sphere of light must be the locus of points satisfying the equation
$$x^2 + y^2 + z^2 = c^2 t^2 \tag{1}$$
and likewise, according to our principles, in the second frame of reference the spherical wave at time T must be the locus of points described by
$$X^2 + Y^2 + Z^2 = c^2 T^2 \tag{2}$$
By substituting from the previous expressions for the upper case variables into equation (2) we have |
$$[A(x - vt)]^2 + (By)^2 + (Cz)^2 = c^2 (Dx + Et)^2$$
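Expanding these terms and rearranging gives
$$(A^2 - c^2 D^2)\,x^2 + B^2 y^2 + C^2 z^2 - 2(A^2 v + c^2 DE)\,xt = c^2 \left[E^2 - A^2 (v/c)^2\right] t^2 \tag{3}$$
The assumption that light propagates at the same speed in both frames of reference implies that a simultaneous spherical shell of light in one frame is also a simultaneous spherical shell of light in the other frame, so equation (3) must agree with equation (1). Equating the coefficients gives
$$A^2 - c^2 D^2 = 1, \quad B^2 = 1, \quad C^2 = 1, \quad 2(A^2 v + c^2 DE) = 0, \quad E^2 - A^2 (v/c)^2 = 1$$
Clearly we can take $B = C = 1$ (rather than -1, since we choose not to reflect the y and z directions). Dividing the 4th equation by 2, we're left with the three equations in the three unknowns A, D, and E:
$$A^2 - c^2 D^2 = 1, \qquad A^2 v + DE c^2 = 0, \qquad E^2 - A^2 (v/c)^2 = 1$$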
Solving the first equation for $A^2$ and substituting this into the 2nd and 3rd equations gives
$$(1 + c^2 D^2)\,v + DE c^2 = 0$$
$$E^2 - (1 + c^2 D^2)(v/c)^2 = 1$$
Solving the first for E and substituting into the 2nd gives a single quadratic equation in D, with the roots
$$D = \pm \frac{v/c^2}{\sqrt{1 - (v/c)^2}}$$
Substituting this into either equation and solving the resulting quadratic for E gives
$$E = \pm \frac{1}{\sqrt{1 - (v/c)^2}}$$
Note that the equations require opposite signs for D and E. Now, for small values of v/c we expect to find E approaching +1 (as in Galilean relativity), so we choose the positive root for E and the negative root for D. Finally, from the relation $A^2 - c^2 D^2 = 1$ we get
$$A = \pm \frac{1}{\sqrt{1 - (v/c)^2}}$$
and again we select the positive root. Consequently we have the Lorentz transformation
$$X = \frac{x - vt}{\sqrt{1 - (v/c)^2}}, \qquad Y = y, \qquad Z = z, \qquad T = \frac{t - vx/c^2}{\sqrt{1 - (v/c)^2}}$$
With this transformation we can easily verify that |
$$x^2 + y^2 + z^2 - c^2 t^2 = X^2 + Y^2 + Z^2 - c^2 T^2$$
so this quantity is the squared "absolute distance" from the origin to the point with lower case coordinates (x,y,z,t) and the corresponding K coordinates (X,Y,Z,T), which confirms that the absolute spacetime interval between two points is the same in both frames. Notice that equations (1) and (2) already implied this relation for null intervals. In other words, the original premise was that if $x^2 + y^2 + z^2 - c^2 t^2$ equals zero, then $X^2 + Y^2 + Z^2 - c^2 T^2$ also equals zero. The above reasoning shows that a consequence of this premise is that, for any arbitrary real number $\kappa$, if $x^2 + y^2 + z^2 - c^2 t^2$ equals $\kappa$, then $X^2 + Y^2 + Z^2 - c^2 T^2$ also equals $\kappa$. Therefore, this quadratic form represents an absolute invariant quantity associated with the interval from the origin to the event (x,y,z,t).
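The result can be spot-checked numerically. The following Python sketch assumes units with c = 1 by default and uses the chosen roots A = E = 1/sqrt(1 - (v/c)^2) and D = -(v/c^2)/sqrt(1 - (v/c)^2). It checks the three coefficient conditions A^2 - c^2 D^2 = 1, A^2 v + DE c^2 = 0 and E^2 - A^2 (v/c)^2 = 1, and then compares the quadratic form in the two frames for a sample event. The function names and the sample values are hypothetical, introduced only for illustration.

```python
import math

def lorentz_coefficients(v, c=1.0):
    """Return (A, D, E) for X = A(x - v t), T = D x + E t."""
    if abs(v) >= c:
        raise ValueError("requires |v| < c")
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    A = gamma                 # positive root chosen for A
    D = -gamma * v / c ** 2   # negative root chosen for D
    E = gamma                 # positive root chosen for E
    return A, D, E

def lorentz_transform(x, y, z, t, v, c=1.0):
    """Map k coordinates (x, y, z, t) to K coordinates (X, Y, Z, T)."""
    A, D, E = lorentz_coefficients(v, c)
    return A * (x - v * t), y, z, D * x + E * t

def interval(x, y, z, t, c=1.0):
    """Quadratic form x^2 + y^2 + z^2 - c^2 t^2."""
    return x * x + y * y + z * z - (c * t) ** 2

v = 0.6  # sample relative speed as a fraction of c
A, D, E = lorentz_coefficients(v)

# Residuals of the three coefficient conditions (with c = 1);
# each should vanish up to rounding error.
residuals = (
    A * A - D * D - 1.0,
    A * A * v + D * E,
    E * E - A * A * v * v - 1.0,
)

event = (3.0, -1.0, 2.0, 5.0)  # arbitrary sample event (x, y, z, t)
s_k = interval(*event)
s_K = interval(*lorentz_transform(*event, v))

print(residuals)
print(s_k, s_K)
```

A numerical check of this kind is no substitute for the algebra, but it is a quick way to catch a sign error in the chosen roots: taking D with the same sign as E, for instance, leaves the second residual non-zero.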