3.1 Postulates and Principles

The two main propositions on which Einstein based the special theory of relativity serve as postulates in a formal sense, but more fundamentally they represent principles. A postulate is an axiom stipulated as part of a purely formal deductive system, whereas a principle (in the context of physics) is a conceptual schema that serves both inductively to establish a framework within which to organize our knowledge, and deductively to make predictions. In his 1905 paper "On the Electrodynamics of Moving Bodies" Einstein begins with the crucial conjecture that

... the same laws of electrodynamics and optics will be valid for all coordinate systems in which the equations of mechanics hold good. We will raise this conjecture (hereafter called the "principle of relativity") to the status of a postulate...

This passage not only encapsulates the entire absolute physical content of special relativity, it also illustrates how conjectures, principles, and postulates are subtly interwoven into the foundations of a physical theory. The assertion was conjectural - at the time - because it had been empirically confirmed only up to the first order in v/c (per the experimental results cited explicitly in the paper, although Michelson and Morley had already confirmed the conjecture up to the second order). Einstein proposed to adopt this conjecture as a postulate in the formal structure of his theory, but he acknowledges its more fundamental role as a principle, since it entails the decision to organize our knowledge in terms of coordinate systems in which the equations of mechanics hold good, i.e., inertial coordinates.

He goes on to introduce a second proposition that will be employed as a postulate in the formal structure of the theory, namely,

... that light is always propagated in empty space with a definite velocity c that is independent of the state of motion of the emitting body. These two postulates suffice for the attainment of a simple and consistent electrodynamics of moving bodies based on Maxwell's theory for bodies at rest.

Interestingly, in the paper "Does the Inertia of a Body Depend on Its Energy Content?" published later in the same year, Einstein commented that

... the principle of the constancy of the velocity of light... is of course contained in Maxwell's equations.

In view of this we might ask why he did not simply dispense with his "second principle" and assert that the "laws of electrodynamics and optics" referred to in the statement of the relativity principle are none other than Maxwell's equations. He might simply have based his theory on the single proposition that Maxwell's equations are valid for every system of coordinates in which the laws of mechanics hold good. The reason he chose not to follow this course is apparently to be found in still another paper that Einstein published in the same year, "On a Heuristic Point of View Concerning the Production and Transformation of Light", in which he wrote

... despite the complete confirmation of [Maxwell's theory] by experiment, the theory of light, operating with continuous spatial functions, leads to contradictions when applied to the phenomena of emission and transformation of light.

In other words, Einstein was already aware of the fact that, although Maxwell's equations are empirically satisfactory in many respects, they cannot be regarded as fundamentally correct or valid, so it isn't surprising that he preferred not to base his theory of relativity on them explicitly. He needed to divine from them the key feature, whose significance "transcended its connection with Maxwell's equations", and which would serve as a viable principle for organizing our knowledge of all phenomena, including both optics and mechanics. The principle he decided on was the invariance of light speed (in empty space) with respect to any system of inertial coordinates.

After reviewing the operational definition of inertial coordinates in section 1 (which he does by optical rather than mechanical means, missing the opportunity to clarify the significance of inertial coordinates in establishing the connection between mechanical and optical phenomena), he gives more formal statements of his two principles

The following reflections are based on the principle of relativity and the principle of the constancy of the velocity of light. These two principles we define as follows:
1. The laws by which the states of physical systems undergo change are not affected, whether these changes of state be referred to the one or the other of two systems of co-ordinates in uniform translatory motion.
2. Any ray of light moves in the "stationary" system of co-ordinates with the determined velocity c, whether the ray is emitted by a stationary or by a moving body. Hence velocity equals [length of] light path divided by time interval [of light path], where time interval [and length are] to be taken in the sense of the definition in §1.

Clearly the first is nothing but the principle of inertia, accepted as a fundamental principle of physics since the time of Galileo. Although a clear and unambiguous understanding of this principle is more elusive than one might think, it was considered fairly unobjectionable and even conventional by physicists in 1905. Einstein's second principle, on the other hand, was regarded as a novelty and suspected of being unwarranted. Of course, as expressed above, it isn't even a self-contained statement, because its entire meaning and significance depend on "the sense of" time intervals and (implicitly) spatial lengths given in §1 of Einstein's paper, where we find that time intervals and spatial lengths are defined to be such that their ratio equals the fixed constant c for light paths. This has tempted some readers to conclude that "Einstein's second postulate" was merely a tautology, with no substantial content. The source of this confusion is the fact that the essential axiomatic foundations underlying special relativity are contained not in the two famous propositions at the beginning of §2 of Einstein's paper (as quoted above), but rather in the sequence of assumptions and definitions explicitly spelled out in §1. Among these is the very first statement

Let us take a system of co-ordinates in which the equations of Newtonian mechanics hold good.

To this statement Sommerfeld added the note "i.e., to the first approximation", meaning for motion with speeds small in comparison with the speed of light. Of course, Einstein was aware of the epistemological shortcomings of the above statement, because while it tells us to begin with an inertial system of coordinates, it doesn't tell us how to identify such a system. This has always been a potential source of ambiguity for mechanics based on the principle of inertia. Strictly speaking, Newton's laws are epistemologically circular, so in practice we must apply them both inductively and deductively. First we use them inductively with our primitive observations to identify inertial coordinate systems by observing how things behave. Then at some point when we've gained confidence in the inertialness of our coordinates, we begin to apply the laws deductively, i.e., we begin to deduce how things will behave with respect to our inertial coordinates. Ultimately this is how all physical theories are applied, first inductively as an organizing principle for our observations, and then deductively as "laws" to make predictions. Neither Galilean nor special relativity is able to justify the privileged role given to a particular class of coordinate systems, nor to provide a non-circular means of identifying those systems. In practice we identify inertial systems by means of an incomplete induction. Obviously Einstein was aware of the deficiency of this approach (which he subsequently labored to eliminate from the general theory), but in 1905 he judged it to be the only pragmatic way forward.

The next fundamental assertion in §1 of Einstein's paper is that lengths and time intervals can be measured by (and expressed in terms of) a set of primitive elements called "measuring rods" and "clocks". Again Einstein was fully aware of the deficiency. As he later wrote

It is striking that the theory introduces (in addition to the four-dimensional space) two kinds of physical things, namely, (1) measuring rods and clocks, and (2) all other things, including the electromagnetic field, the material point, and so on. This, in a certain sense, is inconsistent; strictly speaking, measuring rods and clocks should emerge as solutions of the basic equations (objects consisting of moving atomic configurations), not, as it were, as theoretically self-sufficient entities. ...it was clear from the very beginning that the postulates of the theory are not strong enough to deduce from them equations for physical events sufficiently complete and sufficiently free from arbitrariness in order to base upon such a foundation a theory of measuring rods and clocks... it was better to admit such inconsistency - with the obligation, however, of eliminating it at a later stage of the theory...

Thus the introduction of clocks and rulers as primitive entities was another pragmatic concession, and one that Einstein realized was not strictly justifiable on any other grounds than provisional expediency.

Next Einstein acknowledges that we could content ourselves with timing events by means of an observer located at the origin of the coordinate system, which corresponds to the absolute time of Lorentz, as discussed in Section 1.6. Following this he describes the "much more practical arrangement" based on the reciprocal operational definition of simultaneity. He says

We assume this definition of synchronization to be free of any possible contradictions, applicable to arbitrarily many points, and that the following relations are universally valid:
1. If the clock at B synchronizes with the clock at A, the clock at A synchronizes with the clock at B.
2. If the clock at A synchronizes with the clock at B and also with the clock at C, the clocks at B and C also synchronize with each other.

These are important and non-trivial assumptions about the viability of the proposed operational procedure for synchronizing clocks, but they are only indirectly invoked by the reference to "the sense of time intervals" in the statement of Einstein's second principle. Furthermore, as mentioned in Section 1.6, Einstein himself subsequently identified at least three more assumptions (homogeneity, spatial isotropy, memorylessness) that are tacitly invoked in the formal development of special relativity. Of course, the list of unstated assumptions would actually be even longer if we were to construct a theory beginning from nothing but an individual's primitive sense perceptions. The justification for leaving them out of a scientific paper is that these can mostly be classified as what Euclid called "common notions", i.e., axioms that are common to all fields of thought.
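
The synchronization criterion behind these assumptions can be made concrete with a small numerical sketch. In §1 of the 1905 paper a light signal leaves A at time t_A (read on A's clock), is reflected at B at time t_B (read on B's clock), and returns to A at time t'_A; the two clocks are said to synchronize if t_B - t_A = t'_A - t_B. The function names, the tolerance, and the sample readings below are illustrative assumptions of ours, not anything taken from Einstein's paper.

```python
import math


def is_synchronized(t_a_emit, t_b_reflect, t_a_return, tol=1e-12):
    """Einstein's criterion: outbound and return transit times agree."""
    outbound = t_b_reflect - t_a_emit
    inbound = t_a_return - t_b_reflect
    return math.isclose(outbound, inbound, rel_tol=0.0, abs_tol=tol)


def offset_to_apply(t_a_emit, t_b_reflect, t_a_return):
    """Amount to add to B's clock so that the criterion is satisfied."""
    midpoint = 0.5 * (t_a_emit + t_a_return)
    return midpoint - t_b_reflect


def round_trip_speed(distance_ab, t_a_emit, t_a_return):
    """Round-trip light speed, 2*AB / (t'_A - t_A), using A's clock only."""
    return 2.0 * distance_ab / (t_a_return - t_a_emit)


# Hypothetical readings in arbitrary units (invented for illustration).
t_a, t_b, t_a_prime = 0.0, 1.5, 2.0
correction = offset_to_apply(t_a, t_b, t_a_prime)
```

With these made-up readings the clocks do not initially satisfy the criterion, and resetting B's clock by `correction` makes them do so. Nothing in this arithmetic guarantees that the corrections found for the pairs (A,B), (A,C) and (B,C) will be mutually consistent; that consistency is precisely what the two quoted relations assume.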

In many respects Einstein modeled his presentation of special relativity on the formal theory of thermodynamics, which is founded on the principle of the conservation of energy. There are different kinds of energy, with formally different units, e.g., mechanical and gravitational potential energy are typically measured in terms of joules (a force times a distance, or equivalently a mass times a squared velocity), whereas heat energy is measured in calories (the amount of heat required to raise the temperature of 1 gram of water by one degree C). It's far from obvious that these two things can be treated as different aspects of the same thing, i.e., energy. However, through careful experiments and observations we find that whenever mechanical energy is dissipated by friction (or any other dissipative process), the amount of heat produced is proportional to the amount of mechanical energy dissipated. Conversely, whenever heat is involved in a process that yields mechanical work, the heat content is reduced in proportion to the amount of work produced. In both cases the constant of proportionality is found to be 4.1833 joules per calorie.

Now, the First Law of thermodynamics asserts that the total energy of any physical process is always conserved, provided we "correctly" account for everything. Of course, in order for this assertion to even make sense we need to define the proportionality constants between different kinds of energy, and those constants are naturally defined so as to make the First Law true. In other words, we determine the proportionality between heat and mechanical work by observing these quantities and assuming that those two changes represent equal quantities of something called "energy". But this assumption is essentially equivalent to the First Law, so if we apply these operational definitions and constants of proportionality, the conservation of energy can be regarded as a tautology or a convention.
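
The circularity just described can be illustrated with a short sketch. It assumes a set of paired readings of mechanical work dissipated (in joules) and heat produced (in calories); the numbers are invented for illustration, generated from the value 4.1833 quoted above, and are not experimental data. The conversion constant is fitted so that the two columns agree, after which "conservation" holds for those same readings by construction.

```python
def fit_conversion_constant(work_joules, heat_calories):
    """Least-squares slope through the origin for work ~ J * heat."""
    num = sum(w * q for w, q in zip(work_joules, heat_calories))
    den = sum(q * q for q in heat_calories)
    return num / den


def energy_imbalance(work_joules, heat_calories, j_per_cal):
    """Mechanical energy lost minus heat gained, both expressed in joules."""
    return [w - j_per_cal * q for w, q in zip(work_joules, heat_calories)]


# Made-up illustrative readings (NOT real measurements), constructed to be
# exactly proportional using the constant quoted in the text.
J_TEXT = 4.1833
heat_calories = [1.0, 2.0, 5.0, 10.0]
work_joules = [J_TEXT * q for q in heat_calories]

j_fit = fit_conversion_constant(work_joules, heat_calories)
residuals = energy_imbalance(work_joules, heat_calories, j_fit)
```

Once `j_fit` has been defined from the readings, the residuals vanish for those readings as a matter of bookkeeping. On this view, whatever empirical content remains lies in whether the same constant continues to work for other processes.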

This shows clearly that, just as in the case of Newton's laws, these propositions are actually principles rather than postulates, meaning that they first serve as organizing principles for our measurements and observations, and only subsequently do they serve as "laws" from which we may deduce further consequences. This is the sense in which fundamental physical principles always operate. Wien's letter of 1912 nominating Einstein and Lorentz for the Nobel prize commented on this same point, saying that "the confirmation of [special relativity] by experiment... resembles the experimental confirmation of the conservation of energy".

Einstein himself acknowledged that he consciously modeled the formal structure of special relativity on thermodynamics. He wrote in his autobiographical notes

Gradually I despaired of the possibility of discovering the true laws by means of constructive efforts based on known facts. The longer and the more desperately I tried, the more I came to the conviction that only the discovery of a universal formal principle could lead us to assured results. The example I saw before me was thermodynamics. The general principle was there given in the proposition: The laws of nature are such that it is impossible to construct a perpetuum mobile (of the first and second kinds).

This principle is a meta-law, i.e., it does not express a particular law of nature, but rather a general principle to which all the laws of nature conform. In 1907 Ehrenfest suggested that special relativity constituted a closed axiomatic system, but Einstein quickly replied that this was not the case. He explained that the relativity principle combined with the principle of invariant light speed is not a closed system at all, but rather it provides a coherent framework within which to conduct physical investigations. As he put it, the principles of special relativity "permit certain laws to be traced back to one another (like the second law of thermodynamics)."

Not only is there a close formal similarity between the axiomatic structures of thermodynamics and special relativity, each based on two fundamental principles, these two theories are also substantively extensions of each other. The first law of thermodynamics can be placed in correspondence with the basic principle of relativity, which suggests the famous relation E = mc², thereby enlarging the realm of applicability of the first law. The second law of thermodynamics, like Einstein's second principle of invariant light speed, is more sophisticated and more subtle. A physical process whose net effect is to remove heat from a body and produce an equivalent amount of work is called perpetual motion of the second kind. It isn't obvious from the first law that such a process is impossible, and indeed there were many attempts to find such a process - just as there were attempts to identify the rest frame of the electromagnetic ether - but all such attempts failed. Moreover, they failed in such a way as to make it clear that the failures were not accidental, but that a fundamental principle was involved.

In the case of thermodynamics this was ultimately formulated as the second law, one statement of which (as alluded to by Einstein in the quote above) is simply that perpetual motion of the second kind is impossible - provided the various kinds of energy are defined and measured in the prescribed way. (This theory was Einstein's bread and butter, not only because most of his scientific work prior to 1905 had been in the field of thermodynamics, but also because a patent examiner inevitably is called upon to apply the first and second laws to the analysis of hopeful patent applications.) Compare this with Einstein's second principle, which essentially asserts that it's impossible to measure a speed in excess of the constant c - provided the space and time intervals are defined and measured in the prescribed way. The strength of both principles is due ultimately to the consistency and coherence of the ways in which they propose to analyze the processes of nature.
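
As a hedged illustration of what the second principle means operationally, we can assume the standard relativistic composition rule for collinear speeds, w = (u + v)/(1 + uv/c²). This rule is a consequence of the two principles that is not derived in this section, and the function name and sample values below are our own. Under that assumption, composing two speeds below c never yields a speed above c, and composing c with any other speed returns c.

```python
C = 299_792_458.0  # speed of light in m/s (exact by the SI definition)


def compose_speeds(u, v, c=C):
    """Compose two collinear speeds measured in the prescribed way.

    Assumes the standard relativistic rule w = (u + v) / (1 + u*v/c**2).
    """
    return (u + v) / (1.0 + u * v / (c * c))


# Illustrative values in units where c = 1.
w_sub = compose_speeds(0.9, 0.9, c=1.0)    # two speeds of 0.9c
w_light = compose_speeds(1.0, 0.5, c=1.0)  # light speed combined with 0.5c
```

Here `w_sub` comes out at roughly 0.9945 in units of c rather than 1.8, and `w_light` stays at c.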

It's interesting to compare these modern axiomatic structures with one of the first formal systems of thought ever proposed. In Book I of The Elements, Euclid consolidated and systematized plane geometry as it was known circa 300 BC into a formal deductive system. In the form that has come down to us, it is based on five postulates together with several definitions and common notions. The first four of these postulates are stated very succinctly

1. A straight line may be drawn from any point to any other point.
2. A straight line segment can be uniquely and indefinitely extended.
3. We may draw a circle of any radius about any point.
4. All right angles are equal to one another.

Despite the fact that each of these assertions actually entails a fairly complicated set of premises and ambiguities, they were accepted as unobjectionable for two thousand years. However, Euclid's final postulate (like Einstein's second principle) was regarded with suspicion from earliest times. It has a very different appearance from the others - a difference that Euclid (and his subsequent editors and translators) did not attempt to disguise. The fifth postulate is expressed as follows:

5. If a straight line falling on two straight lines makes the [sum of the] interior angles on the same side less than two right angles, then the two straight lines, if produced indefinitely, meet on that side on which the angles are less than two right angles.

This postulate is equivalent to the statement that there's exactly one line through a given point P parallel to a given line L, as illustrated below

Although this proposition is fairly plausible (albeit somewhat awkward to state), many people suspected that it might be logically deducible from the other postulates, axioms, and common notions. This is analogous to efforts to replace Einstein's second principle with the assertion that Maxwell's equations (in their standard form) are fundamental laws of physics. There were also many attempts to substitute for Euclid's fifth postulate a simpler or more self-evident proposition, and there have been analogous attempts to find a more self-evident substitute for Einstein's second principle. (The reciprocity between v and 1/v discussed in Section 1.5 is one example.)

We now understand that Euclid's fifth postulate is logically independent of the rest of Euclid's logical structure, and that in fact it's possible to develop logically consistent geometries in which Euclid's fifth postulate is false. For example, we can assume that there are infinitely many lines through P that are parallel to (i.e., never intersect) the line L. It might seem (at first) that it would be impossible to reason with such an assumption, that it would either lead to contradictions or else cause the system to degenerate into a logical triviality about which nothing interesting could be said, but, remarkably, this turns out not to be the case.

Suppose that although there are infinitely many lines through P that never intersect L, there are also infinitely many that do intersect L. This, combined with the other axioms and postulates of plane geometry, implies that there are two lines through P defining the boundary between lines that do intersect L and lines that don't, as shown below:

 

This leads to the original non-Euclidean geometry of Lobachevski, Bolyai, and Gauss, i.e., the hyperbolic plane. The correspondence with Minkowski spacetime is obvious. This example (although positive-definite) is nicely suggestive of how the light-lines in spacetime serve as the dividing lines between those lines through P that intersect with the future "L" and those that don't (distinguishing between spacelike and timelike intervals). This is also a nice illustration of the fact that even though Minkowski spacetime is "flat" in the Riemannian sense, it is nevertheless distinctly non-Euclidean. The geometrical structure of the effective spatio-temporal manifold of events is profoundly different than had been assumed for thousands of years, and this realization led naturally to a new set of principles with which to organize and interpret our experience. (Incidentally, it can be argued that Minkowski's geometrical interpretation of special relativity turned Einstein's "theory of principle" into a constructive theory, i.e., the Minkowski metric of spacetime represents a primitive concept on the basis of which special relativity is constructively founded, just as the idea of atoms is the constructive basis for the kinetic theory of heat.)
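
The role of the light-lines as dividing lines between timelike and spacelike intervals can be sketched numerically. The sketch assumes a single spatial dimension and the sign convention s² = (c Δt)² - (Δx)², which is a choice of convention rather than something fixed by this section; the function names are likewise our own.

```python
def interval_squared(dt, dx, c=1.0):
    """Squared Minkowski interval, using the convention (c*dt)^2 - dx^2."""
    return (c * dt) ** 2 - dx ** 2


def classify_interval(dt, dx, c=1.0):
    """Classify the separation between two events relative to the light-lines."""
    s2 = interval_squared(dt, dx, c)
    if s2 > 0:
        return "timelike"
    if s2 < 0:
        return "spacelike"
    return "lightlike"
```

For example, in units where c = 1, `classify_interval(2.0, 1.0)` gives "timelike", `classify_interval(1.0, 2.0)` gives "spacelike", and `classify_interval(1.0, 1.0)` falls exactly on a light-line.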

Regarding the justification for Einstein's proposed principles, many popular accounts of special relativity give a prominent place to the famous experiments of Michelson and Morley, especially the crucial version performed in 1887, often presenting this as the "brute fact" that precipitated relativity. Why, then, does Einstein's 1905 paper fail to cite this famous experiment? It does mention at one point "the various unsuccessful attempts to measure the Earth's motion with respect to the ether", but never refers to Michelson's results specifically. The conspicuous absence of any reference to this important experimental result has puzzled biographers and historians of science. Clearly Einstein's intent was to present the most persuasive possible case for the relativity of space and time, and Michelson's results would (it seems) have been a very strong piece of evidence in his favor. Could he simply have been unaware of the experiment at the time of writing the paper?

Einstein's own recollections on this point were not entirely consistent. He sometimes said he couldn't remember if he had been aware in 1905 of Michelson's experiments, but at other times he acknowledged that he had known of them from having read the works of Lorentz. Indeed, considering Einstein's obvious familiarity with Lorentz's works, and given all the attention that Lorentz paid to Michelson's ether drift experiments over the years, it's difficult to imagine that Einstein never absorbed any reference to those experiments. Assuming he was aware of Michelson's results prior to 1905, why did he choose not to cite them in support of his second principle? Of course, his paper includes no formal "references" at all (which in itself seems peculiar, especially to modern readers accustomed to extensive citations in scholarly works), but it does refer to some other experiments and theories by name, so an explicit reference to Michelson's result would not have been out of place.

One possible explanation for Einstein's reluctance to cite Michelson, both in 1905 and subsequently, is that he was sophisticated enough to know that his "theory" was technically just a re-interpretation of Lorentz's theory - making identical predictions - so it could not be preferred on the basis of agreement with experiment. To Einstein the most important quality of his interpretation was not its consistency with experiment, but its inherent philosophical soundness. In other words, conflict with experiment was bad, but agreement with experiment by means of ad hoc assumptions was hardly any better. His critique of Lorentz's theory (or what he knew of it at the time) was not so much that it was empirically "wrong" (which it wasn't), but that the length contraction and time dilation effects had been inserted ad hoc to match the null results of Michelson. (It's debatable whether this critique was justified, as discussed in Section 8.8.) Therefore, Einstein would naturally have been concerned to avoid giving the impression that his relativistic theory had been contrived specifically to conform with Michelson's results. He may well have realized that any appeal to the Michelson-Morley experiment in order to justify his theory would diminish rather than enhance its persuasiveness.

This is not to suggest that Einstein was being disingenuous, because it's clear that the principles of special relativity actually do emerge very naturally from just the first-order effects of magnetic induction (for example), and even from more basic considerations of the mathematical intelligibility of Galilean versus Lorentzian transformations (as stressed by Minkowski in his famous 1908 lecture). It seems clear that Einstein's explanations for how he arrived at special relativity were sincere expressions of his beliefs about the origins of special relativity in his own mind. He was focused on the phenomenon of magnetic induction and the unphysical asymmetry of the pre-relativistic explanations. This was combined with a strong instinctive belief in the complete relativity of physics. He told Shankland in 1950 that the experimental results which had influenced him the most were stellar aberration and Fizeau's measurements on the speed of light in moving water. "They were enough," he said.
