Invariance, Contravariance, Covariance

To get an intuitive idea of the difference between invariance, 
covariance, and contravariance, suppose we have an aquarium tank 
filled with water, and we define rectangular cartesian coordinates 
(x,y,z) to identify each point in the tank.  We could now express 
the temperature of the water at each point by the function T(x,y,z).
This is just a single number associated with each point in the tank, 
representing the temperature (let's say in degrees C) at that point.  

Now suppose we change our minds and decide to use polar coordinates 
(r,theta,phi) instead of rectangular coordinates (x,y,z).  These new 
coordinates are known functions of the original coordinates r(x,y,z), 
theta(x,y,z), and phi(x,y,z).

Clearly the value of T is *invariant* with respect to changes in the 
coordinate system.  Thus, the temperature T(r,theta,phi) in polar 
coordinates is related to the temperature T(x,y,z) in rectangular 
coordinates by

              T(r,theta,phi)  =  T(x,y,z)

where r, theta, and phi are each functions of x, y, and z.  This means 
the value of T at a given point is the same, regardless of the 
coordinate system we choose.  
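
As a rough numerical illustration of this invariance, the sketch below evaluates a temperature field at one physical point labelled in both coordinate systems.  The field itself and the angle convention (theta measured from the z axis, phi the azimuth about it) are assumptions made only for this example; the text above does not specify either.

```python
import math

def to_cartesian(r, theta, phi):
    """Assumed convention: theta is measured from the z axis, phi is the azimuth."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def temperature_xyz(x, y, z):
    # Hypothetical temperature field in degrees C, chosen only for illustration.
    return 20.0 + 0.5 * x - 0.2 * y + 0.1 * z * z

def temperature_polar(r, theta, phi):
    # The same physical field, expressed as a function of the polar labels.
    return temperature_xyz(*to_cartesian(r, theta, phi))

# One physical point, labelled in both coordinate systems.
point_polar = (2.0, 0.7, 1.1)
point_xyz = to_cartesian(*point_polar)

t_polar = temperature_polar(*point_polar)
t_xyz = temperature_xyz(*point_xyz)
print(t_polar, t_xyz)  # the two values agree
```

The labels of the point differ, but the number attached to the point does not.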

However, suppose we had determined the _gradient_ G(x,y,z) of 
the temperature at each point.  This is a vector at each point 
(x,y,z) with the components

                G_1(x,y,z)  =  dT/dx

                G_2(x,y,z)  =  dT/dy

                G_3(x,y,z)  =  dT/dz

where these are partial derivatives.  Now suppose we want to convert 
the gradient to polar coordinates.  If the gradient were invariant 
with respect to coordinate changes we would expect the components 
to be unchanged at any given point.  That is, we would expect the 
components of the gradient to be given by

             G_i(r,theta,phi)  =  G_i(x,y,z)

for i=1,2,3.  However, that's clearly not the case, because the
components of the temperature gradient with respect to polar 
coordinates are

                G_1(r,theta,phi)  =  dT /d r

                G_2(r,theta,phi)  =  dT /d theta

                G_3(r,theta,phi)  =  dT /d phi

and these represent the derivatives of temperature with respect to
the polar coordinates, not with respect to the cartesian coordinates.  
Thus, the components of the gradient at a given point clearly depend 
on the coordinate system we are using.

Fortunately, it's still possible to express the components of G_polar 
in terms of the components of G_xyz, but we need to take into account 
the relation between the polar coordinates and the cartesian 
coordinates.  For example, the conversion for G_1 looks like this

                    dx              dy              dz
 G_1(r,theta,phi) = -- G_1(x,y,z) + -- G_2(x,y,z) + -- G_3(x,y,z)
                    dr              dr              dr

where x, y, and z are regarded as functions of r, theta, and phi (and 
the derivatives are partials), and similarly for the other two 
components.  Entities like the temperature gradient whose components 
transform according to this kind of rule are called *covariant*.
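
The covariant rule for G_1 can be checked numerically.  The sketch below is only an illustration under stated assumptions: the temperature field is hypothetical, and the polar convention (theta from the z axis, phi the azimuth) is assumed rather than taken from the text.  It compares the rule above with a direct finite-difference estimate of dT/dr.

```python
import math

def to_cartesian(r, theta, phi):
    """Assumed convention: theta is measured from the z axis, phi is the azimuth."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def temperature_xyz(x, y, z):
    # Hypothetical temperature field, chosen only for illustration.
    return 20.0 + 0.5 * x - 0.2 * y + 0.1 * z * z

def gradient_xyz(x, y, z):
    # Analytic partial derivatives (dT/dx, dT/dy, dT/dz) of the field above.
    return (0.5, -0.2, 0.2 * z)

r, theta, phi = 2.0, 0.7, 1.1
x, y, z = to_cartesian(r, theta, phi)
g1_xyz, g2_xyz, g3_xyz = gradient_xyz(x, y, z)

# Partials of x, y, z with respect to r, holding theta and phi fixed.
dx_dr = math.sin(theta) * math.cos(phi)
dy_dr = math.sin(theta) * math.sin(phi)
dz_dr = math.cos(theta)

# Covariant rule: G_1(polar) = dx/dr G_1 + dy/dr G_2 + dz/dr G_3.
g1_rule = dx_dr * g1_xyz + dy_dr * g2_xyz + dz_dr * g3_xyz

# Direct estimate of dT/dr by a central difference in r.
h = 1e-6
t_plus = temperature_xyz(*to_cartesian(r + h, theta, phi))
t_minus = temperature_xyz(*to_cartesian(r - h, theta, phi))
g1_direct = (t_plus - t_minus) / (2 * h)

print(g1_rule, g1_direct)  # agree to within finite-difference error
```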

Finally, suppose the water in the tank is not perfectly motionless,
but has some velocity at each point.  Like the gradient, this is a
vector at each point, and it has the components

              V_1(x,y,z)   =   dx/dt

              V_2(x,y,z)   =   dy/dt

              V_3(x,y,z)   =   dz/dt

where t stands for time.  It's worthwhile to compare these components
with those of the temperature gradient considered previously.  With 
the gradient we had  G_1(x,y,z) = dT/dx  whereas with the velocity 
we have  V_1(x,y,z) = dx/dt.  So the gradient consists of partial 
derivatives of some "other" variable (T) with respect to the coordinates, 
whereas the velocity consists of derivatives of the coordinates 
with respect to some "other" variable (t).  

Like the gradient, the velocity vector is not invariant under 
coordinate transformations, and the conversion depends on the 
relation between the two sets of coordinates.  However, the 
conversion has a different form.  For example, the first component 
of the velocity vector in polar coordinates is given by

                    dr              dr              dr
 V_1(r,theta,phi) = -- V_1(x,y,z) + -- V_2(x,y,z) + -- V_3(x,y,z)
                    dx              dy              dz

and similarly for the other two components.  Entities that transform 
from one coordinate system to another according to this kind of rule 
are called *contravariant*.
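
The contravariant rule can be checked the same way.  In the sketch below the particle path is hypothetical, invented only for illustration; the only fact used about the polar system is r = sqrt(x^2 + y^2 + z^2), so that dr/dx = x/r and so on.  The rule is compared with a direct finite-difference estimate of dr/dt along the path.

```python
import math

def position(t):
    # Hypothetical path of a small parcel of water (straight-line motion).
    return (1.0 + 0.3 * t, 2.0 - 0.1 * t, 0.5 + 0.2 * t)

def velocity_xyz(t):
    # Cartesian velocity components (dx/dt, dy/dt, dz/dt) of the path above.
    return (0.3, -0.1, 0.2)

def radius(x, y, z):
    return math.sqrt(x * x + y * y + z * z)

t = 1.5
x, y, z = position(t)
r = radius(x, y, z)
v1_xyz, v2_xyz, v3_xyz = velocity_xyz(t)

# Contravariant rule: V_1(polar) = dr/dx V_1 + dr/dy V_2 + dr/dz V_3,
# with dr/dx = x/r, dr/dy = y/r, dr/dz = z/r.
v1_rule = (x / r) * v1_xyz + (y / r) * v2_xyz + (z / r) * v3_xyz

# Direct estimate of dr/dt by a central difference in t.
h = 1e-6
v1_direct = (radius(*position(t + h)) - radius(*position(t - h))) / (2 * h)

print(v1_rule, v1_direct)  # agree to within finite-difference error
```

Note that the coefficients here are derivatives of the new coordinate with respect to the old ones, whereas in the covariant rule they are derivatives of the old coordinates with respect to the new one.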

This discussion has focused on scalars and vectors, but the same ideas
apply to tensors of any order.  You can also have "mixed" tensors, 
which are covariant with respect to some of their indices and 
contravariant with respect to others.

At this point people often wonder how we can talk about a vector being
contravariant or covariant when the direction and magnitude of a
vector (which are its defining properties) are actually invariant
with respect to coordinate changes.  This question points out a 
problem with the terminology.  People commonly talk about contra-
variant and covariant vectors and tensors, when they really mean 
contravariant and covariant *components*.  A given velocity vector 
(for example) has whatever direction and magnitude it has, independent 
of the coordinate system we use to express it.  So it's true that the 
velocity vector itself doesn't change when we switch coordinate 
systems.  However, the components of the vector change.

For example, suppose we have a velocity vector in the plane with
components (1,1) relative to a particular xy coordinate system.  This
vector has a magnitude of sqrt(2) and is pointing at 45 degrees up
from the x axis.  However, if we rotate the coordinate system about
the origin so that the x-axis lines up with the vector, it now has
components (sqrt(2),0).  Notice that its magnitude is still sqrt(2)
because the magnitude is *invariant*, but the components of the vector
are different.  We didn't change the direction of the vector, we
changed the orientation of the coordinate system, so the components
of the vector had to change accordingly.
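
The rotation example can be reproduced in a few lines.  This is a minimal sketch; the helper name components_in_rotated_axes is made up for the illustration.

```python
import math

def components_in_rotated_axes(vx, vy, angle):
    """Components of a fixed vector relative to axes rotated
    counterclockwise by `angle` (radians) about the origin."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * vx + s * vy, -s * vx + c * vy)

v = (1.0, 1.0)
v_rot = components_in_rotated_axes(*v, math.pi / 4)

print(v_rot)                               # approximately (sqrt(2), 0)
print(math.hypot(*v), math.hypot(*v_rot))  # both approximately sqrt(2)
```

The components change from (1,1) to approximately (sqrt(2),0), up to floating-point rounding, while the magnitude is the same in both systems.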

Now, if we accept that the components of a tensor have to change when
we change coordinate systems, we might still wonder why they change
differently depending on whether they are contravariant or covariant.
The distinction between these two kinds of components is a bit subtle.
Essentially, contravariant components are directed PARALLEL to the
coordinate AXES, whereas covariant components are directed NORMAL
(perpendicular) to constant coordinate SURFACES.  Of course, in the
case of orthogonal cartesian coordinates the axes are, by definition,
normal to constant coordinate surfaces, so the distinction between
contravariant and covariant components vanishes.
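
To see the two kinds of components come apart, consider a plane with oblique (non-perpendicular) axes.  The sketch below takes the usual definitions: the contravariant components are the coefficients in the expansion of the vector along the axis directions, and the covariant components are the dot products of the vector with those same axis vectors.  The particular axis vectors and the 60 degree angle between them are assumptions chosen only for illustration.

```python
import math

def contravariant(v, e1, e2):
    # Solve v = c1*e1 + c2*e2 for (c1, c2) using Cramer's rule.
    det = e1[0] * e2[1] - e1[1] * e2[0]
    c1 = (v[0] * e2[1] - v[1] * e2[0]) / det
    c2 = (e1[0] * v[1] - e1[1] * v[0]) / det
    return (c1, c2)

def covariant(v, e1, e2):
    # Dot products of v with each axis vector.
    return (v[0] * e1[0] + v[1] * e1[1],
            v[0] * e2[0] + v[1] * e2[1])

v = (1.0, 1.0)

# Oblique axes: unit vectors 60 degrees apart (illustrative choice).
a = math.radians(60)
e1, e2 = (1.0, 0.0), (math.cos(a), math.sin(a))
contra_oblique = contravariant(v, e1, e2)
co_oblique = covariant(v, e1, e2)

# Orthogonal cartesian axes: the two kinds of components coincide.
u1, u2 = (1.0, 0.0), (0.0, 1.0)
contra_cart = contravariant(v, u1, u2)
co_cart = covariant(v, u1, u2)

# Pairing contravariant with covariant components gives the squared
# magnitude of v, which is invariant.
contraction = sum(c * k for c, k in zip(contra_oblique, co_oblique))

print(contra_oblique, co_oblique)  # differ for oblique axes
print(contra_cart, co_cart)        # identical for cartesian axes
print(contraction)                 # approximately 2
```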
