1.2 Systems of Reference
The assumptions of persistent identities and finite dimensionality help to contain the complexity of our model of the external world. Were it not for these, we might imagine the space of n objects to have on the order of n(n-1)/2 dimensions, proportional to the number of pairwise relations. Fortuitously, it appears possible to arrange our information about the world in such a way that the spatial relations between n particles can be fully represented by assigning to each particle just three real numbers, the spatial coordinates with respect to some (non-unique) system of reference, which we conceptualize as the Cartesian product of three complete copies of the set of real numbers. This is the ordinary model of three-dimensional Euclidean space, denoted $E^3$.
On this basis we can imagine an almost totally objective world consisting of multiple impersonal objects residing at specific locations within a three-dimensional continuous space, and allowing that the locations may independently change over a succession of instants. To parameterize this succession of instants we introduce a fourth coordinate, so that a given instance of a given particle is fully specified by three spatial coordinates and one temporal coordinate. It's natural to regard each particle as, at least potentially, an independent observer with its own sequence of states, but therein lies a possible ambiguity, because it isn't clear how the temporal states of one particle (or observer) are to be placed in correspondence with the temporal states of another. Here we must make an important decision about how our model of the world is to be constructed. We might choose to regard the totality of all entities as comprising a single element in a succession of universal temporal states, in which case the temporal correspondence between entities is unambiguous. In such a universe the temporal coordinate induces a total ordering of events, which is to say, if we let the symbol ≤ denote temporal precedence or equality, then for every three events a, b, c we have
(i) a ≤ a
(ii) if a ≤ b and b ≤ a, then a = b
(iii) if a ≤ b and b ≤ c, then a ≤ c
(iv) either a ≤ b or b ≤ a
On the other hand, we might choose to regard the temporal state of each individual particle or observer as an independent quantity, bearing in mind that orderings of the elements of a set are not necessarily total. For example, consider the subsets of a flat plane, and the ordering induced by the inclusion relation (⊆). Obviously the first three axioms of a total ordering are satisfied, because we have (i) a ⊆ a, (ii) if a ⊆ b and b ⊆ a, then a = b, and (iii) if a ⊆ b and b ⊆ c, then a ⊆ c. However, the fourth axiom is not satisfied, because it's entirely possible to have two sets neither of which is included in the other. This is called a partial ordering, and we may wish to allow for the possibility that the temporal relations between events induce a partial rather than a total ordering. In fact, we have no a priori reason to expect that temporal relations induce even a partial ordering, so it may be safest to assume that each entity possesses its own temporal state, and let our observations teach us how those states are mutually related. Similar caution should be applied when modeling the relations between the spatial states of particles.
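The distinction between a partial and a total ordering can be checked mechanically on a small example. The following is only an illustrative sketch: the four-element family of sets and the names `family`, `leq`, and `incomparable_pairs` are assumptions made for the example, standing in for subsets of the plane.

```python
from itertools import product

# A small, arbitrary family of sets (hypothetical stand-ins for regions of the plane).
family = [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2})]


def leq(a, b):
    """Ordering relation induced by inclusion: a 'precedes or equals' b when a is a subset of b."""
    return a <= b


# Axiom (i): every element is related to itself.
reflexive = all(leq(a, a) for a in family)

# Axiom (ii): mutual inclusion forces equality.
antisymmetric = all(
    a == b for a, b in product(family, repeat=2) if leq(a, b) and leq(b, a)
)

# Axiom (iii): inclusion chains compose.
transitive = all(
    leq(a, c) for a, b, c in product(family, repeat=3) if leq(a, b) and leq(b, c)
)

# Axiom (iv): every pair is comparable one way or the other.
total = all(leq(a, b) or leq(b, a) for a, b in product(family, repeat=2))

# Pairs for which neither set is included in the other.
incomparable_pairs = [
    (a, b) for a, b in product(family, repeat=2) if not leq(a, b) and not leq(b, a)
]

print("reflexive:", reflexive)
print("antisymmetric:", antisymmetric)
print("transitive:", transitive)
print("total:", total)
print("incomparable pairs:", incomparable_pairs)
```

In this family the sets {1} and {2} are incomparable, so the first three axioms hold while the fourth fails, which is exactly what makes inclusion a partial rather than a total ordering.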
Without digressing too far into abstraction, we may assume that an observer is able to organize his information and experiences into a representation in terms of a private system of reference (x,y,z,t), and that he maps the locus of any persistently identifiable entity as a function of his own internal state variable p by the functions x(p), y(p), z(p), and t(p). In particular, we can assume that he chooses coordinates so that his own locus is automatically characterized by
$$x(p) = y(p) = z(p) = 0, \qquad t(p) = p$$
Regarding the quantification of the state parameter p, we abstract this from the aggregate of our experience by identifying the largest equivalence class of recognizable sequences of "like" events or experiences that occur in constant proportions to each other, such as our heartbeats, ocean tides, the swinging of a pendulum, and the quantum evolution of a cesium clock. By defining time in terms of the largest and most reliable set of sequences that evolve linearly with respect to each other, we achieve the greatest possible simplification. In this way, we ultimately arrive at the identification of a particle's state parameter (or "proper time") p with the quantum phase of that particle. Obviously the observer has the derivatives dt(p)/dp = 1 and dx(p)/dp = 0 for his own locus.
In addition, each observer is assumed to be capable of assessing the values of x(p), y(p), z(p), and t(p) for the locus of any other particle, where p is still the observer's own state variable. From these he can evaluate the derivatives dt(p)/dp and dx(p)/dp for that other particle in terms of the observer's own system of reference. A measure of the relative "motion" of the particle (in the x direction) with respect to the observer is the ratio of these two derivatives, which gives dx(p)/dt(p). On this basis the observer's motion relative to himself is zero, which is an absolutely distinguished quantity. However, we could also consider the motion of an object with respect to itself to have an undefined or infinite quantity, which is also absolutely distinguished, and which could be represented by the reciprocal ratio dt(p)/dx(p). Both of these ratios (dt/dx and dx/dt) have absolutely distinguished values for the observer himself, and give suitable measures of motion for other particles. |
Likewise a second observer, with the state variable P, would organize his information in terms of a coordinate system (X,Y,Z,T) such that his own locus satisfies
$$X(P) = Y(P) = Z(P) = 0, \qquad T(P) = P$$
These are all relations of the lowest order, i.e., relations which each observer assigns independently to other particles. In order for this model to represent an objective world in common to all observers we require a way of establishing relations between these primitive relations. In particular, if observer A determines that a certain particle B has a state of motion u, and that another observer, C, has a state of motion v, we wish to determine the motion of B with respect to C. The metric for a given model of spacetime describes quantitatively how the objective absolute spacetime separation between two events depends on the individual spatial and temporal components of the separation under any given system of coordinates (possibly within some restricted class), which enables us to translate the components from one system of reference to another. There are different possible metrics that spacetime might possess, so we are once again required to make a choice. (By the way, we're using the term "metric" in this section to refer to any quantification of invariant separation, without assuming that it necessarily satisfies all the axioms of a true metric in the mathematical sense. This qualification is discussed more fully in Section 9.) |
The classical Galilean metric consists of two separate parts, one for space and one for time, but, strictly speaking, the spatial part can only be applied when the temporal part vanishes. To illustrate, consider two events with the coordinates (x1,y1,z1,t1) and (x2,y2,z2,t2) respectively. According to the Galilean metric we can infer the following information about the absolute separation between these two events:
$$T = t_2 - t_1, \qquad D = \begin{cases} \sqrt{(x_2-x_1)^2 + (y_2-y_1)^2 + (z_2-z_1)^2} & \text{if } t_1 = t_2 \\ \text{undefined} & \text{otherwise} \end{cases}$$
Why is the spatial distance D undefined if the events are not simultaneous? We might try to use the same formula as in the case of simultaneous events, but the relations between the (x,y,z) coordinates at time t1 and the (x,y,z) coordinates at time t2 depend on the state of motion of the reference frame, which according to Galilean relativity is completely arbitrary. For example, consider two events with spatial coordinates (0,0,0) and (3,0,0) meters at times t1 = 0 and t2 = 1 seconds respectively. In these coordinates it may appear that the events are separated by a spatial distance of 3 meters, but an equally valid coordinate system is one that is moving uniformly in the positive x direction at a speed of 3 meters/sec, in terms of which the spatial coordinates of the two events are both (0,0,0). In general, since Galilean relativity does not recognize any upper bound on the speed of a valid frame of reference, we can transform any apparent spatial distance into zero, or into any other value, simply by changing the frame of reference. To put this another way, given any two non-simultaneous events in a Galilean context, and any real number k, there exists a perfectly valid frame of reference with respect to which the spatial distance between those two events equals k. Thus, the absolute spatial distance between non-simultaneous events in Galilean relativity is totally undefined. |
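The numerical example above can be reproduced with a short sketch. It assumes the standard Galilean boost along the x axis (x' = x - vt, t' = t) and, for simplicity, tracks only the signed x separation; the function names are hypothetical and chosen for this illustration.

```python
def galilean_boost(event, v):
    """Coordinates of an event in a frame moving at speed v along +x.

    Assumes the standard Galilean transformation: x' = x - v*t, t' = t.
    """
    x, y, z, t = event
    return (x - v * t, y, z, t)


def x_separation(e1, e2):
    """Signed separation along x, read directly off the coordinates."""
    return e2[0] - e1[0]


def frame_speed_for_separation(e1, e2, k):
    """Frame speed v for which two non-simultaneous events are separated by k along x."""
    dx = e2[0] - e1[0]
    dt = e2[3] - e1[3]
    if dt == 0:
        raise ValueError("simultaneous events: the separation does not depend on v")
    return (dx - k) / dt


# The two events from the text: (x, y, z, t) in meters and seconds.
e1 = (0.0, 0.0, 0.0, 0.0)
e2 = (3.0, 0.0, 0.0, 1.0)

print(x_separation(e1, e2))  # separation in the original frame

# A frame moving at 3 m/s brings both events to the same spatial coordinates.
v_zero = frame_speed_for_separation(e1, e2, 0.0)
print(v_zero, x_separation(galilean_boost(e1, v_zero), galilean_boost(e2, v_zero)))

# Any other target separation k is reachable by some other frame speed.
k = -7.5
v_k = frame_speed_for_separation(e1, e2, k)
print(v_k, x_separation(galilean_boost(e1, v_k), galilean_boost(e2, v_k)))
```

Because no speed is excluded, `frame_speed_for_separation` always has a solution when the events are not simultaneous, which is the sense in which the spatial distance is undefined.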
It seems counter-intuitive that we can't define any definite spatial distance between non-simultaneous events, because surely there is a physically meaningful sense in which the Andromeda Galaxy tomorrow is farther away from us than our kitchen table tomorrow. This impression reflects our intuitive sense that speed is not entirely arbitrary, and some distances would be "more difficult" to traverse than others in a given interval of time. Of course, this impression may just reflect our intuitive sense of the finiteness of the energy at our disposal, which places practical limits on our ability to change the relative velocities of material bodies. Nevertheless, it would seem odd if motion (which is essentially a process that converts time into distance) could, in principle, completely change any non-simultaneous spatial distance to any other, and yet have no effect at all on the temporal component of the interval.
One way of giving absolute significance to non-simultaneous distances would be to limit the maximum possible relative speed, which would restrict the set of possible spatial coordinates a uniformly moving object at a certain spatial location could have at any other time (past or future). For example, suppose we stipulate a maximum relative velocity of c. This would mean that the two events discussed above, when extrapolated at any allowable speed to a common instant of time, must then be separated by a (signed) spatial distance no greater than D + cT and no less than D - cT, where
$$D = \sqrt{(x_2-x_1)^2 + (y_2-y_1)^2 + (z_2-z_1)^2}, \qquad T = t_2 - t_1$$
Although we haven't succeeded in defining a unique spatial distance, we have at least bounded it between two finite magnitudes D + cT and D - cT. Now, for any two events, suppose we define the quantity s as the geometric mean of these two bounds, so we have
$$s = \sqrt{(D + cT)(D - cT)}, \qquad s^2 = D^2 - c^2 T^2$$
In order for $s^2$ to be positive we must have D greater than cT, which implies that the interval in question is "spacelike". On the other hand, for intervals with D less than cT we could apply the same heuristic argument to hypothesize an absolute squared magnitude for timelike intervals
$$\tau^2 = \left(T + \frac{D}{c}\right)\left(T - \frac{D}{c}\right) = T^2 - \frac{D^2}{c^2}$$
where $\tau$ represents the geometric mean of the greatest and least time intervals extrapolated to the same location (analogous to the max and min spatial distances extrapolated to the same instant). As a result, we find $s^2 = -c^2 \tau^2$, which shows that these two squared intervals are essentially the same, differing only by the constant scale factor $-c^2$, which just converts between spatial and temporal units. Consequently we can speak of the space-time interval between two events. Of course, for any given pair of events only one of the quantities $s^2$ and $\tau^2$ will be positive, so either s or $\tau$ is real, depending on whether D is greater than or less than cT. This naturally splits the set of all possible intervals into two subsets, called spacelike ($s^2 > 0$) and timelike ($\tau^2 > 0$). In the special case when D exactly equals cT we find that both s and $\tau$ vanish.
Thus the complementary quantities $s^2$ and $\tau^2$ seem like very natural candidates to define as the (squared) absolute spacetime interval between two arbitrary events, especially because they are very nearly invariant under Galilean transformations, at least for relative velocities much less than the limiting value c. However, they aren't quite invariant, so we would need to revise our transformation rules in order to adopt this quantity as the absolute interval. We will see precisely how the rules need to be changed, and the profound implications of this change, in subsequent sections.
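The claim of near-invariance can be illustrated numerically. The sketch below is not part of the original argument: it takes the squared interval as the product of the two bounds, (D + cT)(D - cT) = D^2 - c^2 T^2, restricts attention to separations along x, applies a Galilean boost (D' = D - vT, T' = T), and assumes for illustration that c is the speed of light in vacuum. All function names are hypothetical.

```python
C = 299_792_458.0  # limiting speed in m/s (assumed here to be the speed of light)


def interval_squared(dx, dt, c=C):
    """Squared interval s^2 = D^2 - (c*T)^2 for a separation along x."""
    return dx**2 - (c * dt) ** 2


def galilean_separation(dx, dt, v):
    """Separation of the same two events in a frame moving at speed v along +x."""
    return dx - v * dt, dt


def relative_change(dx, dt, v, c=C):
    """Fractional change of s^2 produced by a Galilean boost of speed v."""
    s2 = interval_squared(dx, dt, c)
    dx_b, dt_b = galilean_separation(dx, dt, v)
    return (interval_squared(dx_b, dt_b, c) - s2) / s2


# A spacelike separation: 6e8 meters apart, 1 second apart (D > cT).
dx, dt = 6.0e8, 1.0

slow = relative_change(dx, dt, 30.0)      # roughly highway speed
fast = relative_change(dx, dt, 0.5 * C)   # half the limiting speed

print(f"v = 30 m/s : relative change in s^2 = {slow:.3e}")
print(f"v = c/2    : relative change in s^2 = {fast:.3e}")
```

For speeds far below c the fractional change is of order v/c, which is why the discrepancy goes unnoticed in everyday circumstances, while at an appreciable fraction of c the Galilean transformation alters the quantity substantially.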
Galilean relativity tells us there is no unique mapping from any given spatial location at one instant to any given spatial location at another, and there is no way to encode the uniform motion of an object. Either motion must be an inherent property of an object at a given instant, which is self-contradictory, or the property of motion must extend over multiple instants. What establishes that this point in this instant corresponds to that point in the next instant? There is no distinct "next" point on the real line, so we can't conceive of the instants as discrete entities. They must constitute a continuous range of instants, but that implies a relation between the longitudinal units and the transverse units, i.e., between space and time. The existence of motion requires a unified space-time manifold, which implies a conversion factor c between units of space and units of time.
If an object C is moving to the right with a speed v relative to an observer at rest in reference system A, and a different observer at rest in reference system B is moving to the left with a speed u relative to A, then what is the speed of the object C relative to reference system B? Both Galileo and Newton assumed that with respect to B the object was moving to the right with a speed v + u. In other words, they assumed co-linear speeds are simply additive. This assumption may appear inevitable, considering that speeds actually are simply additive (by definition) with respect to any one system of reference. It requires some careful thought to recognize that there is (at least potentially) a distinction between (1) the difference between the speeds of C and B relative to A, and (2) the speed of C relative to B. Both Galileo and Newton (not to mention everyone else prior to Einstein) believed that (1) and (2) were numerically equivalent, if not logically identical. This is the point on which modern science differs from the classical theory of Galilean kinematics, and it's worth emphasizing that the difference is not over the principle of relativity itself, but over the assumption of simple additivity for the composition of velocities, and the underlying conceptions of space and time on which that assumption was based. |
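The classical assumption can be made concrete with a small sketch. It assumes, as Galileo and Newton did, that both systems share the same time coordinate and are related by x' = x + ut (B's origin sits at x = -ut in A's coordinates); under that assumption quantities (1) and (2) coincide by construction. The function names are hypothetical.

```python
def position_in_A(speed, t):
    """Position in system A of something moving uniformly from the origin."""
    return speed * t


def to_system_B(x, t, u):
    """Galilean change of coordinates from A to B.

    B moves to the left at speed u relative to A, so B's origin is at x = -u*t
    in A's coordinates. Assumes the time coordinate is shared (t' = t).
    """
    return x + u * t, t


def speed_of_C_relative_to_B(v, u, t0=0.0, t1=1.0):
    """Speed of C in B's coordinates, from two points on C's path."""
    x0, tb0 = to_system_B(position_in_A(v, t0), t0, u)
    x1, tb1 = to_system_B(position_in_A(v, t1), t1, u)
    return (x1 - x0) / (tb1 - tb0)


v, u = 5.0, 2.0  # C moves right at 5, B moves left at 2 (both relative to A)

difference_in_A = v - (-u)                 # quantity (1)
speed_in_B = speed_of_C_relative_to_B(v, u)  # quantity (2), under Galilean assumptions

print(difference_in_A, speed_in_B)
```

The agreement here follows entirely from the assumed transformation `to_system_B`; a different rule for relating the coordinates of A and B would in general make the two quantities differ.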
Purely kinematic relativity contains enough degrees of freedom that we can simply define our systems of reference (i.e., coordinate systems) to satisfy the additivity of velocity. In other words, we can adopt velocity additivity as a principle, and this is essentially what scientists had tacitly done since ancient times. The great insight of Galileo and his successors was that this principle is inadequate to single out the physically meaningful reference systems. A new principle was necessary, namely, the principle of inertia, to be discussed in the next section. This new and more profound interpretation of relativity became the foundation of the scientific revolution of the 17th and 18th centuries, but not until the beginning of the 20th century did scientists recognize that it was actually incompatible with the ancient assumption of simply additive speeds. |