Initially the special theory of relativity was regarded as just a particularly simple and elegant interpretation of Lorentz's ether theory, but it soon became clear that there is a profound heuristic difference between the two theories, most evident when we consider the singularity implicit in the Lorentz transformation $x' = \gamma(x - vt)$, $t' = \gamma(t - vx)$, where $\gamma = 1/(1-v^2)^{1/2}$. As v approaches arbitrarily close to 1, the factor $\gamma$ goes to infinity. If these relations are strictly valid (locally), as all our observations and experiments suggest, then according to Lorentz's view all configurations of objects moving through the absolute ether must be capable of infinite spatial "contractions" and temporal "dilations", without the slightest distortion. This is clearly unrealistic. Hence the only plausible justification for the Lorentzian view is a belief that the Lorentz transformation equations are not strictly valid, i.e., that they must break down at some point. Indeed, this was Lorentz's ultimate justification, as he held to the possibility that absolute speed might, after all, make some difference to the intrinsic relations between physical entities. However, one hundred years after Lorentz's time, there still is no evidence to support his suspicion. To the contrary, all the tremendous advances of the last century in testing the Lorentz transformation "to the nth degree" have consistently confirmed its exact validity. At some point a reasonable person must ask himself "What if the Lorentz transformation really is exactly correct?" This is a possibility that a neo-etherist cannot permit himself to contemplate - because the absolute physical singularity along light-like intervals implied by the Lorentz transformation is plainly incompatible with any realistic ether - but it is precisely what special relativity requires us to consider, and this ultimately leads to a completely new and more powerful view of causality. |
The singularity of the Lorentz transformation is most clearly expressed in terms of the underlying Minkowski pseudo-metric. Recall that the invariant spacetime interval $d\tau$ between the events $(t,x)$ and $(t+dt, x+dx)$ is given by |
$$(d\tau)^2 = (dt)^2 - (dx)^2$$ |
where t and x are any set of inertial coordinates. This is called a pseudo-metric rather than a metric because, unlike a true metric, it doesn't satisfy the triangle inequality, and the interval between distinct points can be zero. This occurs for any interval such that $dt = dx$, in which case the invariant interval $d\tau$ is literally zero. Arguably, it is only in the context of Minkowski spacetime, with its null connections between distinct events, that phenomena involving quantum entanglement can be rationalized. |
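The invariance of the interval, and its vanishing between distinct lightlike-separated events, can be checked numerically. The following is only a minimal sketch in units with c = 1; the function names `gamma`, `boost`, and `interval_squared` are our own labels for illustration, not anything defined in the text.

```python
import math

def gamma(v):
    """Lorentz factor 1/sqrt(1 - v^2) for speed v in units with c = 1."""
    return 1.0 / math.sqrt(1.0 - v * v)

def boost(t, x, v):
    """Coordinates (t', x') of the event (t, x) in a frame moving with velocity v."""
    g = gamma(v)
    return g * (t - v * x), g * (x - v * t)

def interval_squared(dt, dx):
    """Squared invariant interval (dtau)^2 = (dt)^2 - (dx)^2."""
    return dt * dt - dx * dx

# A timelike separation keeps the same squared interval under any boost,
# even though gamma grows without bound as v approaches 1.
dt, dx = 5.0, 3.0
for v in (0.1, 0.9, 0.999):
    dt_b, dx_b = boost(dt, dx, v)
    print(v, gamma(v), interval_squared(dt_b, dx_b))

# A lightlike separation (dt = dx) has zero interval in every frame,
# although the two events are distinct.
print(interval_squared(*boost(2.0, 2.0, 0.6)))
```

The first loop shows the factor $\gamma$ growing rapidly while the squared interval stays fixed; the last line shows the degenerate (null) case.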
Pictorially, the locus of points whose squared distance from the origin is +1 or -1 consists of the two hyperbolas labeled +1 and -1 in the figure below. |
The diagonal axes denoted by $\alpha$ and $\beta$ represent the paths of light through the origin, and the magnitude of the squared spacetime interval along these axes is 0, i.e., the metric is degenerate along those lines. This is all expressed in terms of conventional space and time coordinates, but it's also possible to define the spacetime separations between events in terms of null coordinates along the light-line axes. Conceptually, we rotate the above figure by 45 degrees, and regard the $\alpha$ and $\beta$ lines as our coordinate axes, as shown below: |
In terms of a linear parameterization $(\alpha,\beta)$ of these "null coordinates" the locus of points at a squared "distance" $(d\tau)^2$ from the origin is an orthogonal hyperbola satisfying the equation |
$$(d\tau)^2 = (d\alpha)(d\beta)$$ |
Since the light-lines $\alpha$ and $\beta$ are degenerate, in the sense that the absolute spacetime intervals along those lines vanish, the absolute velocity of a worldline, given by the "slope" $d\beta/d\alpha = 0/0$, is strictly undefined. This indeterminacy, arising from the singular null intervals in spacetime, is at the heart of special relativity, allowing for infinitely many different scalings of the light-line coordinates. In particular, it is natural to define the rest frame coordinates $\alpha,\beta$ of any worldline in such a way that $d\alpha/d\beta = 1$. This expresses the principle of relativity, and also entails Einstein's second principle, i.e., that the (local) velocity of light with respect to the natural measures of space and time for any worldline is unity. The relationship between the natural null coordinates of any two worldlines is then expressed by the requirement that, for any given interval $d\tau$, the components $d\alpha, d\beta$ with respect to one frame are related to the components $d\alpha', d\beta'$ with respect to another frame according to the equation $(d\alpha)(d\beta) = (d\alpha')(d\beta')$. It follows that the scale factors of any two frames $S_i$ and $S_j$ are related according to |
where $v_{ij}$ is the usual velocity parameter (in units such that c = 1) of the origin of $S_j$ with respect to $S_i$. Notice there is no absolute constraint on the scaling of the $\alpha$ and $\beta$ axes; there is only a relative constraint, so the "gauge" of the light-lines really is indeterminate. Also, the scale factors are simply the relativistic Doppler shifts for approaching and receding sources. This accords with the view of the $\alpha\beta$ coordinate "grid lines" as the network of light-lines emitted by a strobed source moving along the reference world-line. |
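The reciprocal scaling of the two null coordinates can be illustrated numerically. The sketch below assumes the convention $\alpha = t + x$, $\beta = t - x$, which is our own choice (the figure that fixes the labeling of the axes is not reproduced here); with the opposite labeling the two scale factors simply trade places. Under this assumption a boost with velocity v divides $\alpha$ by the Doppler factor $k = \sqrt{(1+v)/(1-v)}$ and multiplies $\beta$ by it, leaving the product unchanged.

```python
import math

def boost(t, x, v):
    """Coordinates (t', x') of the event (t, x) in a frame moving with velocity v (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * (t - v * x), g * (x - v * t)

def null_coords(t, x):
    """Null coordinates under the ASSUMED convention alpha = t + x, beta = t - x."""
    return t + x, t - x

def doppler(v):
    """Relativistic Doppler factor sqrt((1 + v)/(1 - v))."""
    return math.sqrt((1.0 + v) / (1.0 - v))

t, x, v = 5.0, 3.0, 0.6
a, b = null_coords(t, x)                 # components in frame S_i
a2, b2 = null_coords(*boost(t, x, v))    # components in frame S_j, moving at v relative to S_i

k = doppler(v)
print(a / a2, k)         # alpha scales by the Doppler factor
print(b / b2, 1.0 / k)   # beta scales by its reciprocal
print(a * b, a2 * b2)    # the product (the squared interval) is the same in both frames
```

Only the ratio of the scalings between two frames enters here, which is the sense in which the constraint is relative rather than absolute.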
To illustrate how we can operate with these null coordinate scale relations, let us derive the addition rule for velocities. Given three collinear unaccelerated particles with the pairwise relative velocity parameters $v_{12}$, $v_{23}$, and $v_{13}$, we can solve the "$\alpha$ scale" relation for $v_{13}$ to give |
We also have |
Multiplying these together gives an expression for $d\alpha_1/d\alpha_3$, which can be substituted into (1) to give the expected result |
$$v_{13} = \frac{v_{12} + v_{23}}{1 + v_{12}v_{23}}$$ |
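The same steps can be followed numerically. In this sketch the scale factor for a relative velocity v is taken to be the Doppler factor $k = \sqrt{(1+v)/(1-v)}$ (whether this or its reciprocal is attached to the $\alpha$ axis depends on a labeling convention that we are assuming), and the helper names are ours. Inverting $k^2 = (1+v)/(1-v)$ gives $v = (k^2-1)/(k^2+1)$, and multiplying the factors for the pairs (1,2) and (2,3) gives the factor for the pair (1,3).

```python
import math

def doppler(v):
    """Scale factor sqrt((1 + v)/(1 - v)) for relative velocity v (c = 1)."""
    return math.sqrt((1.0 + v) / (1.0 - v))

def velocity_from_doppler(k):
    """Invert k = sqrt((1 + v)/(1 - v)) to obtain v = (k^2 - 1)/(k^2 + 1)."""
    return (k * k - 1.0) / (k * k + 1.0)

def compose(v12, v23):
    """Compose collinear velocities by multiplying the corresponding scale factors."""
    return velocity_from_doppler(doppler(v12) * doppler(v23))

v12, v23 = 0.5, 0.8
print(compose(v12, v23))                  # obtained from the scale factors
print((v12 + v23) / (1.0 + v12 * v23))    # usual velocity composition law
```

The two printed values agree to rounding error, and the composed speed stays below 1 for any inputs below 1.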
Interestingly, although neither the velocity parameter v nor the quantity $(1+v)/(1-v)$ is additive, it's easy to see that the parameter $\ln[(1+v)/(1-v)]$ is additive. In fact, this parameter corresponds to the arc length of the "$d\tau$ = constant" hyperbola connecting the two world lines at unit distances from their intersection, as shown by integrating the differential distance along that curve |
Since the equation of the hyperbola for $d\tau = 1$ is $1 = dt^2 - dx^2$ we have |
Substituting this into the previous expression and performing the integration gives |
Recalling that $d\tau^2 = dt^2 - dx^2$, we have $dt + dx = d\tau^2/(dt - dx)$, so with $d\tau = 1$ and $v = dx/dt$ the quantity $dx + dt$ can be written as |
$$dx + dt = \sqrt{\frac{dt + dx}{dt - dx}} = \sqrt{\frac{1+v}{1-v}}$$ |
Hence the absolute arc length along the $d\tau = 1$ surface between two world lines that intersect at the origin with a mutual velocity v is |
$$\ln\sqrt{\frac{1+v}{1-v}} = \frac{1}{2}\ln\left(\frac{1+v}{1-v}\right)$$ |
Naturally the additivity of this logarithmic form implies that the argument is a multiplicative measure of mutual speeds. The absolute interval between the intersection points of the two worldlines with the $d\tau = 1$ hyperbola is |
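The arc-length claim can be checked by direct numerical integration. This is a sketch under our own notation: we write t and x for the coordinates of a point on the unit hyperbola $t^2 - x^2 = 1$, so that along the curve $dx^2 - dt^2 = dx^2/(1+x^2)$, and we integrate from x = 0 out to the point $(\gamma, \gamma v)$ where a worldline of velocity v meets the curve. The `chord` function is our own evaluation of the interval between the points (1, 0) and $(\gamma, \gamma v)$; it is not a formula taken from the text.

```python
import math

def rapidity(v):
    """(1/2) ln((1 + v)/(1 - v)), which equals atanh(v)."""
    return 0.5 * math.log((1.0 + v) / (1.0 - v))

def compose(v1, v2):
    """Usual composition law for collinear velocities (c = 1)."""
    return (v1 + v2) / (1.0 + v1 * v2)

def arc_length(v, steps=20000):
    """Midpoint-rule integral of sqrt(dx^2 - dt^2) along t^2 - x^2 = 1,
    from x = 0 to the intersection with the worldline x = v t."""
    x_end = v / math.sqrt(1.0 - v * v)
    h = x_end / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        # On the curve t = sqrt(1 + x^2), so dx^2 - dt^2 = dx^2 / (1 + x^2).
        total += h / math.sqrt(1.0 + x * x)
    return total

def chord(v):
    """Magnitude of the spacelike interval between (1, 0) and (gamma, gamma v)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return math.sqrt(2.0 * (g - 1.0))

v = 0.6
print(arc_length(v), rapidity(v))                                   # both close to ln 2
print(rapidity(compose(0.5, 0.8)), rapidity(0.5) + rapidity(0.8))   # additivity
print(chord(v), 2.0 * math.sinh(0.5 * rapidity(v)))                 # chord versus arc
```

The second line illustrates the additivity of the logarithmic parameter under velocity composition, and the third compares the straight-line interval with the arc length s through the identity chord = 2 sinh(s/2).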
One strength of the conventional pseudo-metrical formalism is that (t,x) coordinates easily generalize to (t,x,y,z) coordinates, and the invariant interval generalizes to |
$$(d\tau)^2 = (dt)^2 - (dx)^2 - (dy)^2 - (dz)^2$$ |
The generalization of the null (lightlike) coordinates and corresponding invariant is not as algebraically straightforward, but it conveys some interesting aspects of the spacetime structure. Intuitively, an observer can conceive of the absolute interval between himself and some distant future event P by first establishing a scale of radial measure outward on his forward light cone in all directions, and then, for each direction, evaluating the parameterized null measure along the light cone to the point of intersection with the backward null cone of P. This will assign, to each direction in space, a parameterized distance from the observer to the backward light cone of P, and there will be (in flat spacetime) two distinguished directions, along which the null measure is maximum or minimum. These are the principal directions for the interval from the observer to P, and the product of the null measures in these directions is invariant. In other words, if a second observer, momentarily coincident with the first but with some relative velocity, determines the null measures along the principal directions to the backward light cone of P, with respect to his own natural parameterization, the product will be the same as found by the first observer. |
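The invariance of this product can be illustrated with a small computation. The sketch below makes an assumption of our own: the "null measure" in a unit direction n is taken to be the observer's coordinate time t at which the light ray from O in that direction meets the backward null cone of P = (T, X, Y, Z). Setting $(T - t)^2 = |R - t\,n|^2$ then gives $t = \tau^2 / [2(T - n \cdot R)]$, which is largest toward the spatial position R of P and smallest in the opposite direction.

```python
import math

def boost_x(event, v):
    """Boost the event (T, X, Y, Z) along the x axis with velocity v (c = 1)."""
    T, X, Y, Z = event
    g = 1.0 / math.sqrt(1.0 - v * v)
    return (g * (T - v * X), g * (X - v * T), Y, Z)

def null_measure(event, n):
    """Time t at which the ray from O in unit direction n meets the backward
    null cone of the event: t = tau^2 / (2 (T - n.R))."""
    T, X, Y, Z = event
    tau2 = T * T - X * X - Y * Y - Z * Z
    return tau2 / (2.0 * (T - (n[0] * X + n[1] * Y + n[2] * Z)))

def principal_product(event):
    """Product of the maximum and minimum null measures (requires R != 0)."""
    T, X, Y, Z = event
    r = math.sqrt(X * X + Y * Y + Z * Z)
    toward = (X / r, Y / r, Z / r)       # direction of maximum null measure
    away = (-X / r, -Y / r, -Z / r)      # direction of minimum null measure
    return null_measure(event, toward) * null_measure(event, away)

P = (5.0, 1.0, 2.0, 2.0)                    # tau^2 = 25 - 9 = 16
print(principal_product(P))                 # tau^2 / 4
print(principal_product(boost_x(P, 0.5)))   # same value for the moving observer
```

Under this assumption the product equals $\tau^2/4$ in every frame, even though the principal directions and the individual measures differ between the two observers.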
It's often convenient to take the interval to the point P as the time axis of inertial coordinates t,x,y,z, so the eigenvectors of the null cone intersections become singular, and we can simply define the null coordinates $u = t + r$, $v = t - r$, where $r = (x^2+y^2+z^2)^{1/2}$. From this we have $t = (u+v)/2$ and $r = (u-v)/2$ along with the corresponding differentials $dt = (du+dv)/2$ and $dr = (du-dv)/2$. Making these substitutions into the usual Minkowski metric in terms of polar coordinates |
$$(d\tau)^2 = (dt)^2 - (dr)^2 - r^2\left[(d\theta)^2 + \sin^2\theta\,(d\phi)^2\right]$$ |
and noting that $(dt)^2 - (dr)^2 = du\,dv$, we have the Minkowski line element in terms of angles and null coordinates |
$$(d\tau)^2 = du\,dv - \frac{(u-v)^2}{4}\left[(d\theta)^2 + \sin^2\theta\,(d\phi)^2\right]$$ |
These coordinates are often useful, but we can establish a more generic system of null coordinates in 3+1 dimensional spacetime by arbitrarily choosing four non-parallel directions in space from an observer at O, and then the coordinates of any timelike separated event are expressed as the four null measures radially in those directions along the forward null cone of O to the backward null cone of P. This provides enough information to fully specify the interval OP. |
In terms of the usual orthogonal spacetime coordinates, we specify the coordinates (T,X,Y,Z) of event P relative to the observer O at the origin in terms of the coordinates of four events $I_1$, $I_2$, $I_3$, $I_4$ on the intersection of the forward null cone of O and the backward null cone of P. If $t_i, x_i, y_i, z_i$ denote the conventional coordinates of $I_i$, then we have |
$$t_i^2 = x_i^2 + y_i^2 + z_i^2 \qquad\qquad (T - t_i)^2 = (X - x_i)^2 + (Y - y_i)^2 + (Z - z_i)^2$$ |
for i = 1, 2, 3, 4. Expanding the right hand equations and canceling based on the left hand equalities, we have the system of equations |
$$T^2 - X^2 - Y^2 - Z^2 = 2\left(T t_i - X x_i - Y y_i - Z z_i\right), \qquad i = 1, 2, 3, 4$$ |
The left hand side of all four of these equations is the invariant squared proper time interval $\tau^2$ from O to P, and we wish to express this in terms of just the four null measures in the four chosen directions. For a specified set of directions in space, this information can be conveyed by the four values $t_1$, $t_2$, $t_3$, and $t_4$, since the magnitudes of the spatial components are determined by the directions of the axes and the magnitude of the corresponding t. In general we can define the direction coefficients $a_{ij}$ such that |
$$x_i = a_{i1} t_i \qquad y_i = a_{i2} t_i \qquad z_i = a_{i3} t_i$$ |
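with the condition $a_{i1}^2 + a_{i2}^2 + a_{i3}^2 = 1$. Making these substitutions, the system of equations can be written in matrix form as |
$$\begin{bmatrix} 1 & -a_{11} & -a_{12} & -a_{13} \\ 1 & -a_{21} & -a_{22} & -a_{23} \\ 1 & -a_{31} & -a_{32} & -a_{33} \\ 1 & -a_{41} & -a_{42} & -a_{43} \end{bmatrix} \begin{bmatrix} T \\ X \\ Y \\ Z \end{bmatrix} = \frac{\tau^2}{2} \begin{bmatrix} 1/t_1 \\ 1/t_2 \\ 1/t_3 \\ 1/t_4 \end{bmatrix}$$ |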
We can use any four directions for which the determinant of the coefficient matrix does not vanish. One natural choice is to use the vertices of a tetrahedron inscribed in a unit sphere, so that the four directions are perfectly symmetrical. We can take as the coordinates of the vertices |
Inserting these values for the direction coefficients $a_{ij}$, we can solve the matrix equation for T, X, Y, and Z to give |
Substituting into the relation $\tau^2 = T^2 - X^2 - Y^2 - Z^2$ and solving for $\tau^2$ gives |
Naturally if $t_1 = t_2 = t_3 = t_4 = t$, then this gives $\tau = 2t$. Also, notice that, as expected, this expression is perfectly symmetrical in the four lightlike coordinates. It's interesting that if the right hand term were absent, then $\tau$ would be simply twice the harmonic mean of the $t_i$. |
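The tetrahedral construction can be checked numerically. The following sketch rests on assumptions of our own: the four directions are taken as alternating corners of a cube scaled to the unit sphere (one common way of writing tetrahedron vertices, not necessarily the coordinates used in the text), the null measure $t_i$ is the coordinate time of the event $I_i$, and the closed form $1/\tau^2 = \bar{u}^2 - 3\,\mathrm{var}(u)$ with $u_i = 1/(2t_i)$ is one arrangement that follows from those choices, not necessarily the form of the displayed equation.

```python
import math

# ASSUMED symmetric directions: alternating cube corners on the unit sphere,
# which are the vertices of a regular tetrahedron.
S = 1.0 / math.sqrt(3.0)
DIRECTIONS = [(S, S, S), (S, -S, -S), (-S, S, -S), (-S, -S, S)]

def null_measures(event):
    """Times t_i at which rays from O along each direction meet the backward
    null cone of the event (T, X, Y, Z): t_i = tau^2 / (2 (T - n_i.R))."""
    T, X, Y, Z = event
    tau2 = T * T - X * X - Y * Y - Z * Z
    return [tau2 / (2.0 * (T - (a * X + b * Y + c * Z))) for a, b, c in DIRECTIONS]

def interval_from_null_measures(ts):
    """Recover tau from four symmetric null measures using
    1/tau^2 = mean(u)^2 - 3 var(u), where u_i = 1/(2 t_i)."""
    u = [1.0 / (2.0 * t) for t in ts]
    mean = sum(u) / len(u)
    var = sum((ui - mean) ** 2 for ui in u) / len(u)
    return 1.0 / math.sqrt(mean * mean - 3.0 * var)

event = (5.0, 1.0, 2.0, 2.0)                   # tau = sqrt(25 - 9) = 4
ts = null_measures(event)
print(ts)
print(interval_from_null_measures(ts))         # recovers tau = 4
print(interval_from_null_measures([1.5] * 4))  # equal measures t give tau = 2t = 3
```

When all four measures are equal the variance term vanishes and $\tau$ is twice their common value, consistent with the remarks above.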
More generally, in a spacetime of 1 + (D-1) dimensions, the invariant interval in terms of D perfectly symmetrical null measures $t_1, t_2, \ldots, t_D$ satisfies the equation |
It can be verified that with D = 2 this expression reduces to $\tau^2 = 4t_1t_2$, which agrees with our earlier hyperbolic formulation $\tau^2 = \alpha\beta$ with $\alpha = 2t_1$ and $\beta = 2t_2$. In the particular case D = 4, if we define $U = 2/\tau$ and $u_j = 1/(2t_j)$ this equation can be written in the form |
where $\sigma$ is the average squared difference of the individual u terms from the average, i.e., |
This is the statistical variance of the $u_j$ values. Incidentally, we've seen that the usual representation $s^2 = x^2 - t^2$ of the invariant spacetime interval is a generalization of the familiar Pythagorean "sum-of-squares" equation of a circle, whereas the interval can also be expressed in the hyperbolic form $s^2 = \alpha\beta$. This reminds us of other fundamental relations of physics that have found expression as hyperbolic relations, such as the uncertainty relations |
$$\Delta x\,\Delta p \ge \frac{h}{4\pi} \qquad\qquad \Delta E\,\Delta t \ge \frac{h}{4\pi}$$ |
in quantum mechanics, where h is Planck's constant. In general if the operators A,B corresponding to two observables do not commute (i.e., if $AB - BA \neq 0$), then an uncertainty relation applies to those two observables, and they are said to be incompatible. Spatial position and momentum are maximally incompatible, as are energy and time. Such pairs of variables are called conjugates. This naturally raises the question of whether the variables parameterizing two oppositely directed null rays in spacetime can, in some sense, be regarded as conjugates, accounting for the invariance of their product. Indeed the special theory of relativity can be interpreted in terms of a fundamental limitation on our ability to make measurements, just as can the theory of quantum mechanics. In quantum mechanics we say that it's not possible to simultaneously measure the values of two conjugate variables such that the product of the uncertainties of those two measurements is less than $h/4\pi$. Likewise in special relativity we could say that it's not possible to measure the time difference dt between two events separated by the spatial distance dx such that the ratio dt/dx of the variables is less than 1/c. In quantum mechanics we may imagine that the particle possesses a precise position and momentum, even though we are unable to determine them due to practical limitations of our measurement techniques. If only we had an infinitely weak signal, i.e., if only h = 0, we could measure things with infinite precision. Likewise in special relativity we may imagine that there is an absolute and precise relationship between the times of two distant events, but we are prevented from determining it due to practical limitations. If only we had an infinitely fast signal, i.e., if only 1/c were zero, we could measure things with infinite precision. In other words, nature possesses structure and information that is inaccessible to us (hidden variables), due to the limitations of our measuring capabilities. |
However, it's also possible to regard the limitations imposed by quantum mechanics ($h \neq 0$) and special relativity ($1/c \neq 0$) not as limitations of measurement, but as expressions of an actual ambiguity and "incompatibility" in the independent meanings of those variables. Einstein's central contribution to modern relativity was the idea that there is no one "true" simultaneity between spatially separate events, but rather spacetime events are only partially ordered, and the decomposition of space and time into separate variables contains an inherent ambiguity on the scale of 1/c. In other words, he rejected Lorentz's "hidden variable" approach, and insisted on treating the ambiguity in the spacetime decomposition as fundamental. This is interesting in part because, when it came to quantum mechanics, Einstein's instinct was to continue trying to find ways of measuring the "hidden variables", and he was never comfortable with the idea that the Heisenberg uncertainty relations express a fundamental ambiguity in the decomposition of conjugate variables on the scale of h. (Late in life, as Einstein continued arguing against Bohr's notion of complementarity in quantum mechanics, one of his younger colleagues said "But Professor Einstein, you yourself originated this kind of positivist reasoning about conjugate variables in the theory of space and time", to which Einstein replied "Well, perhaps I did, but it's nonsense all the same".) |
Another model suggested by the relativistic interpretation of spacetime is to conceive of space and time as two superimposed waves, combining constructively in the directions of the space and time axes, but destructively (i.e., cancelling out) along light lines. For any given inertial coordinate system x,t, we can associate with each event an angle $\theta$ defined by $\tan(\theta) = t/x$. Thus the interval from the origin to the point x,t makes an angle $\theta$ with the positive x axis, and we have $t = x\tan(\theta)$, so we can express the squared magnitude of a spacelike interval as |
$$s^2 = x^2 - t^2 = x^2\left[1 - \tan^2(\theta)\right]$$ |
Multiplying through by $\cos^2(\theta)$ gives |
$$s^2\cos^2(\theta) = x^2\left[\cos^2(\theta) - \sin^2(\theta)\right] = x^2\cos(2\theta)$$ |
Substituting $t^2/\tan^2(\theta)$ for $x^2$ gives the analogous expression |
$$s^2\sin^2(\theta) = t^2\cos(2\theta)$$ |
Adding these two expressions gives the result |
$$s^2 = (x^2 + t^2)\cos(2\theta)$$ |
Consequently the "circular" locus of events satisfying $x^2 + t^2 = r^2$ for any fixed r can be represented in polar coordinates $(s,\theta)$ by the equation |
$$s^2 = r^2\cos(2\theta)$$ |
which is the equation of two lemniscates, as illustrated below. |
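The relation between the coordinate circle and the lemniscate is easy to confirm numerically. This sketch, with a function name of our own choosing, evaluates the squared interval $x^2 - t^2$ at points of the circle $x^2 + t^2 = r^2$ and compares it with $r^2\cos(2\theta)$; the value is positive in the spacelike sectors, negative in the timelike sectors (the second lemniscate), and zero along the light lines at 45 degrees.

```python
import math

def interval_squared_on_circle(r, theta):
    """Squared interval x^2 - t^2 at the point of the circle x^2 + t^2 = r^2
    that lies at angle theta from the positive x axis."""
    x = r * math.cos(theta)
    t = r * math.sin(theta)
    return x * x - t * t

r = 2.0
for deg in (0, 15, 30, 45, 60, 90):
    theta = math.radians(deg)
    print(deg, interval_squared_on_circle(r, theta), r * r * math.cos(2.0 * theta))
```

The two columns agree to rounding error at every angle, including the sign change across the light lines.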
The lemniscate was first discussed by Jakob Bernoulli in 1694, as the locus of points satisfying the equation |
which is, in Bernoulli's words, "a lying eight-like figure, folded in a knot of a bundle, or of a lemniscus, a knot of a French ribbon". (The study of this curve led Fagnano, Euler, Legendre, Gauss, and others to the discovery of addition theorems for integrals, of which the relativistic velocity composition law is an example.) Notice that the lemniscate is the inverse (in the sense of inversive geometry) of the hyperbola relative to the circle of radius k. In other words, if we draw a line emanating from the origin and it strikes the lemniscate at the radius s, then it strikes the hyperbola at the radius R where $sR = k^2$. This follows from the fact that the equation for a hyperbola in polar coordinates is $R^2 = k^2/[E^2\cos^2(\theta) - 1]$ where E is the eccentricity, and for an orthogonal hyperbola we have $E = \sqrt{2}$. Hence the denominator is $2\cos^2(\theta) - 1 = \cos(2\theta)$, and the equation of the hyperbola is $R^2 = k^2/\cos(2\theta)$. Since the polar equation for the lemniscate is $s^2 = k^2\cos(2\theta)$ we have $sR = k^2$. |
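The inversion property can be spot-checked along a few rays. The sketch below uses the two polar equations just quoted, with function names that are ours, and is restricted to angles within 45 degrees of the axis, where both radii are real.

```python
import math

def hyperbola_radius(k, theta, ecc=math.sqrt(2.0)):
    """Polar radius R of the hyperbola R^2 = k^2 / (E^2 cos^2(theta) - 1)."""
    return k / math.sqrt(ecc * ecc * math.cos(theta) ** 2 - 1.0)

def lemniscate_radius(k, theta):
    """Polar radius s of the lemniscate s^2 = k^2 cos(2 theta)."""
    return k * math.sqrt(math.cos(2.0 * theta))

k = 1.5
for deg in (0, 10, 20, 30, 40):
    theta = math.radians(deg)
    # The product s * R should equal k^2 along every ray.
    print(deg, lemniscate_radius(k, theta) * hyperbola_radius(k, theta), k * k)
```

As the angle approaches 45 degrees the lemniscate radius shrinks toward zero while the hyperbola radius grows without bound, with the product held at $k^2$.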