Friday, 30 July 2010

Newton's Law of Gravity

The Proverbial Apple

The famous story that Newton came up with the idea for the law of gravity by having an apple fall on his head is not true, although he did begin thinking about the issue on his mother's farm when he saw an apple fall from a tree. He wondered whether the same force at work on the apple was also at work on the moon. If so, why did the apple fall to the Earth while the moon did not? Along with his Three Laws of Motion, Newton outlined his law of gravity in the 1687 book Philosophiae naturalis principia mathematica (Mathematical Principles of Natural Philosophy), which is generally referred to as the Principia. Johannes Kepler (German astronomer, 1571-1630) had developed three laws governing the motion of the then-known planets. He did not have a theoretical model for the principles governing this movement, but rather arrived at them through trial and error over the course of his studies. Newton's work, nearly a century later, was to take the laws of motion he himself had developed and apply them to planetary motion, yielding a rigorous mathematical framework for that motion.

Gravitational Forces

Newton eventually came to the conclusion that, in fact, the apple and the moon were influenced by the same force. He named that force gravitation (or gravity) after the Latin word gravitas, which literally translates as "heaviness" or "weight." In the Principia, Newton defined the force of gravity in the following way (translated from the Latin):
Every particle of matter in the universe attracts every other particle with a force that is directly proportional to the product of the masses of the particles and inversely proportional to the square of the distance between them.
Mathematically, this translates into the following force equation:

F_g = G \frac{m_1 m_2}{r^2}

In this equation, the quantities are defined as:
  • Fg = The force of gravity (typically in newtons)
  • G = The gravitational constant, which supplies the proper constant of proportionality in the equation. The value of G is 6.67259 × 10⁻¹¹ N·m²/kg², although the numerical value changes if other units are being used.
  • m1 & m2 = The masses of the two particles (typically in kilograms)
  • r = The straight-line distance between the two particles (typically in meters)
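
As a quick numerical illustration of the equation, here is a minimal Python sketch; the masses and distance are round-number textbook values for the Earth and the Moon, used purely as an example:

G = 6.67259e-11  # gravitational constant, N * m^2 / kg^2

def gravitational_force(m1, m2, r):
    """Magnitude of the gravitational attraction between two particles."""
    return G * m1 * m2 / r**2

m_earth = 5.97e24  # kg (approximate)
m_moon = 7.35e22   # kg (approximate)
r = 3.84e8         # mean Earth-Moon distance in m (approximate)

F = gravitational_force(m_earth, m_moon, r)
print(F)            # ~2.0e20 N, acting on both bodies
print(F / m_moon)   # ~2.7e-3 m/s^2: acceleration of the Moon
print(F / m_earth)  # ~3.3e-5 m/s^2: the Earth accelerates far less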

Interpreting the Equation

This equation gives us the magnitude of the force, which is an attractive force and therefore always directed toward the other particle. As per Newton's Third Law of Motion, the forces the two particles exert on each other are always equal in magnitude and opposite in direction: despite their different masses and sizes, the particles pull on each other with equivalent force. Newton's Three Laws of Motion give us the tools to interpret the motion caused by the force, and we see that the particle with less mass (which may or may not be the smaller particle, depending upon their densities) will accelerate more than the other particle. This is why light objects fall toward the Earth considerably faster than the Earth falls toward them. Still, the force acting on the light object and on the Earth is of identical magnitude, even though it doesn't look that way.
It is also significant that the force is inversely proportional to the square of the distance between the objects. As objects get farther apart, the force of gravity drops off very quickly. At most distances, only objects with very high masses, such as planets, stars, galaxies, and black holes, have significant gravitational effects.

Center of Gravity

In an object composed of many particles, every particle interacts with every particle of the other object. Since we know that forces (including gravity) are vector quantities, we can view these forces as having components in the parallel and perpendicular directions of the two objects. In some objects, such as spheres of uniform density, the perpendicular components of force will cancel each other out, so we can treat the objects as if they were point particles, concerning ourselves with only the net force between them. The center of gravity of an object (which is generally identical to its center of mass) is useful in these situations. We view gravity, and perform calculations, as if the entire mass of the object were focused at the center of gravity. In simple shapes - spheres, circular disks, rectangular plates, cubes, etc. - this point is at the geometric center of the object.
This idealized model of gravitational interaction can be applied in most practical applications, although in some more esoteric situations such as a non-uniform gravitational field, further care may be necessary for the sake of precision.

Inertial space


The expression inertial space refers to the background reference that is provided by the phenomenon of inertia.
Inertia is opposition to change of velocity, that is, change of velocity with respect to the background in which all physical processes are embedded. Accelerometers measure how hard an object is accelerating with respect to inertial space. More precisely: accelerometers measure the magnitude of the change of velocity with respect to inertial space.
The inertial guidance systems that are used in navigation and in guidance of missiles work by detecting acceleration and rotation with respect to inertial space.

Derivatives with respect to time

Position, velocity and acceleration form a natural sequence. Position can be seen as the zeroth time derivative of position, velocity is the first time derivative of position, and acceleration is the second time derivative of position.
The scientific understanding of space and time is that there does not exist such a thing as measuring an object's position with respect to inertial space, and no such thing exists as measuring an object's velocity with respect to inertial space. It is the third in the sequence, acceleration with respect to the background, that is the first to be physically manifest.

Gyroscopes

A spinning gyroscope, when suspended in such a way that no torque acts on the gyroscope wheel, will remain pointing in the same direction with respect to inertial space. The spinning gyroscope is locked onto the direction of inertial space that the gyroscope happened to be directed in when it was spun up. Two gyroscopes that start out pointing in the same direction will remain aligned with respect to each other. Since both gyroscopes are locked onto the same inertial space, it is impossible for two spinning gyroscopes to drift with respect to each other.

Astronomy

In 1899 the astronomer Karl Schwarzschild pointed out an observation about double stars. The motion of two stars orbiting each other is planar; the two orbits of the stars of the system lie in a plane. In the case of sufficiently nearby double star systems, it can be seen from Earth whether the periastron of the orbits of the two stars remains pointing in the same direction with respect to the solar system. Schwarzschild pointed out that this is invariably seen: the direction of the angular momentum of all observed double star systems remains fixed with respect to the direction of the angular momentum of the solar system. The logical inference is that, just like gyroscopes, the angular momentum of all celestial bodies is angular momentum with respect to a universal inertial space. [1]

Applications in navigation

Inertial guidance systems detect acceleration with respect to inertial space, and with those data it is possible to calculate the current velocity and position relative to the velocity and position at the moment the accelerometers started registering data.
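
As a sketch of that calculation, assuming ideal, noise-free accelerometer readings along a single axis sampled at fixed intervals, the velocity and position follow from two successive numerical integrations (Python; all names and values are my own illustration):

def integrate_motion(accelerations, dt, v0=0.0, x0=0.0):
    """Dead-reckon velocity and position from accelerometer samples.

    accelerations: samples of acceleration with respect to inertial space (m/s^2)
    dt: time between samples (s)
    v0, x0: velocity and position when registration started
    """
    v, x = v0, x0
    for a in accelerations:
        v += a * dt  # first integration: acceleration to velocity
        x += v * dt  # second integration: velocity to position
    return v, x

# Example: a steady 2 m/s^2 for 10 seconds, starting from rest.
print(integrate_motion([2.0] * 100, dt=0.1))  # ~(20.0, 101.0); exact answer is (20, 100),
                                              # simple Euler stepping overshoots slightly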
For detecting rotation, gyroscopes and fiber optic ring interferometers are used. The operating principle of ring interferometers is called the Sagnac effect.
A gyrocompass, employed for navigation of seagoing vessels, finds true (geographic) north. It does so not by sensing the Earth's magnetic field, but by using inertial space as its reference. The outer casing of the gyrocompass device is held in such a way that it remains aligned with the local plumb line. When the gyroscope wheel inside the gyrocompass device is spun up, the way the gyroscope wheel is suspended causes the gyroscope wheel to gradually align its spinning axis with the Earth's axis. Alignment with the Earth's axis is the only direction for which the gyroscope's spinning axis can be stationary with respect to the Earth and not be required to change direction with respect to inertial space. After being spun up, a gyrocompass can reach the direction of alignment with the Earth's axis in as little as a quarter of an hour.

References

[1] In the Shadow of the Relativity Revolution, Section 3: The Work of Karl Schwarzschild (2.2 MB PDF file). Download from mpiwg-berlin.mpg.de

Quantity of motion


Momentum and kinetic energy have in common that they are a measure of quantity of motion.
We take the following two postulates:
  • Time-reversal symmetry for perfectly elastic collision
  • Galilean invariance

Time reversal symmetry

When two objects of equal mass approach each other with equal velocity, and they collide with perfect elasticity, then their velocities will be reversed. That is a principle that scientists relied upon from the 17th century on in thinking about physical processes. Nowadays we have movie cameras, and we can play movies in reverse. We can readily see that in the case of a collision with perfect bounce we cannot discern whether the movie is being played forward or backward; perfectly elastic collision is symmetric in time.
(17th century scientists saw proof of the time-reversal symmetry in collision experiments. Take, for instance, the case of two pendulums hung side by side so that when hanging still the bobs just touch. When both bobs are released from the same height, moving towards each other, they bounce back to the same height as the height they were released from. Of course, in coming to these conclusions the scientists had to assume that the small discrepancies they saw were due to friction only.)

Galilean invariance

To be a powerful set of principles the time-reversal symmetry must be paired with the principle that was introduced by Galilei, nowadays called 'Galilean relativity'. Imagine a set of large boats, all sailing along on perfectly smooth water. Each boat has a uniform velocity, and each boat has some velocity relative to the other boats of the set. Any experiment conducted onboard any of those boats will find the same laws of motion.
The 17th century scientist Huygens pointed out the following procedure: if you want to calculate the outcome of any collision, then transform to the coordinate system that is co-moving with the common center of mass, reverse the velocities and then transform back to the original coordinate system.
A more challenging case is where the mass of the two objects is unequal.
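
Here is a minimal sketch of Huygens' recipe in Python, for one-dimensional motion; it handles the unequal-mass case just as easily (the function name and test values are my own):

def elastic_collision_1d(m1, v1, m2, v2):
    """Huygens' procedure: transform to the frame co-moving with the
    common center of mass, reverse the velocities, transform back."""
    v_ccm = (m1 * v1 + m2 * v2) / (m1 + m2)  # velocity of the common center of mass
    u1 = v1 - v_ccm                          # velocities relative to the CCM
    u2 = v2 - v_ccm
    return v_ccm - u1, v_ccm - u2            # reversed in the CCM frame, transformed back

print(elastic_collision_1d(1.0, 1.0, 1.0, -1.0))  # equal masses: (-1.0, 1.0)
print(elastic_collision_1d(2.0, 1.0, 1.0, -1.0))  # unequal masses: (-0.333..., 1.666...)

Momentum and kinetic energy are both conserved by construction: the bounce in the co-moving frame changes neither.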

Common Center of Mass

In statics the Common Center of Mass (CCM) is an equilibrium point. Let two objects, with unequal mass, be connected by a stick with length L (for simplicity regard the stick as massless). Somewhere along the stick there will be an equilibrium point. If object 1 has twice the mass of object 2 then object 1 is twice as close to the Common Center of Mass as object 2.
The respective distances to the CCM can be notated as d1 and d2.
The system is in static equilibrium if m2d2 = m1d1. That specifies a ratio between d1 and d2.
If velocities are imparted to the objects then the CCM will remain motionless if:
m1v1 = m2v2
If a force is applied to both objects then the CCM will remain unaccelerated if:
m1a1 = m2a2

Derivation of laws from the invariance principles

I will discuss only the case of motion along a single line, in other words, the case of 1-dimensional motion. Two spherical objects with unequal masses move towards each other, they collide and then move apart again.

Galilean invariant notation

We can express their respective velocities as va and vb with respect to some chosen origin, but for the intended derivation there is a much better representation: expressing the velocities as motion with respect to the common center of mass.
Vr = the relative velocity between the two objects
Vc = the velocity of the CCM relative to some chosen origin
The relative velocity is taken as positive when the two objects are approaching each other, and negative when the two objects are receding.
m1  Mass of object 1
m2  Mass of object 2
v1   Velocity of object 1 relative to the CCM
v2   Velocity of object 2 relative to the CCM
Then we have:
v_1 = V_r \frac{m_2}{m_1 + m_2}
v_2 = - V_r \frac{m_1}{m_1 + m_2}
v1 and v2 are in opposite directions, so one is added to Vc and the other is subtracted from it. You can readily verify that (v1 - v2) = Vr.
The benefit of this notation is that it literally embodies Galilean relativity. Motion is expressed in terms of two separate entities: the velocity relative to some chosen point of origin, and the relative velocity between the two objects. Thus the notation embodies that only the relative velocity matters for the physics taking place.
The notation enforces the demand that momentum is always conserved. Expressing velocities in terms of Vc and Vr is valid only if before and after a collision the CCM keeps the same velocity.
Arguably conservation of momentum and the principle of Galilean invariance are one and the same principle.

Kinetic energy

Whereas conservation of kinetic energy is stated very commonly, the following property seems to be somewhat overlooked: kinetic energy is Galilean invariant. That is a necessary property: if kinetic energy were not Galilean invariant, calculations would run into inconsistencies.
If kinetic energy is Galilean invariant it must be possible to derive the conservation of kinetic energy from the invariance principles. So let's try that.
Let the total kinetic energy be called Ek. Kinetic energy is ½mv², but I have omitted the ½ because here it is non-essential. Then we have the following expression for the kinetic energy before the collision. (The symbol ∝ means 'is proportional to'.)
E_k \quad  \varpropto \quad  m_1 \left( V_c + V_r  \frac{m_2}{m_1 + m_2} \right)^2  \;  + \; m_2 \left( V_c - V_r  \frac{m_1}{m_1 + m_2} \right) ^2
Because of the minus sign quite a few terms cancel against each other. After that cleanup the expression regroups as follows:
\begin{align}
E_k & \varpropto (m_1 + m_2) {V_c}^2 + \frac{m_1{m_2}^2 + m_2{m_1}^2}{(m_1 + m_2)^2} {V_r}^2 \\
& = (m_1 + m_2) {V_c}^2 + \frac{m_1 m_2}{m_1 + m_2} {V_r}^2
\end{align}
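
A quick numerical check of this regrouping, with arbitrary test values (the factor ½ is restored here so that the numbers are ordinary kinetic energies):

m1, m2 = 2.0, 3.0
Vc, Vr = 1.5, 4.0  # arbitrary test values

v1 = Vc + Vr * m2 / (m1 + m2)  # velocity of object 1 relative to the origin
v2 = Vc - Vr * m1 / (m1 + m2)  # velocity of object 2 relative to the origin

direct = 0.5 * m1 * v1**2 + 0.5 * m2 * v2**2
regrouped = 0.5 * (m1 + m2) * Vc**2 + 0.5 * (m1 * m2 / (m1 + m2)) * Vr**2
print(direct, regrouped)  # both 15.225: the two expressions agree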

After the collision

From symmetry it's immediately clear that in the expression for the total kinetic energy after the collision the same terms will drop away against each other, so after the cleanup and regrouping the expression will be the same as above.
There is the following limitation: while the derivation shows that there will be a conserved quantity that is proportional to the masses involved and to (Vr)², it doesn't go beyond that; it doesn't single out a particular expression for Ek.

Separate contributions

It's interesting to see how readily the total kinetic energy can be separated into independent contributions: a component that correlates to the relative velocity (Vr) of two objects, and a component that correlates to their common velocity (Vc) with respect to some reference. This shows that kinetic energy satisfies Galilean invariance: the amount of kinetic energy that is involved in the collision process depends only on the relative velocity between the two objects.

Momentum and kinetic energy

What is the relation between momentum and kinetic energy?
Given that the corresponding conservation laws both derive from the time-reversal symmetry and Galilean invariance, it appears that momentum and kinetic energy must in some sense be two sides of the same coin.

The general theory of relativity


The following exposition of the general theory of relativity does not follow the historical chain of events, even though many historical events are discussed. I present newtonian dynamics in such a way that a transition to relativistic dynamics is prepared for as much as possible.
This article relies on the reader being comfortable with the preceding article about special relativity.

Preparation: Newtonian dynamics

Fields

In the course of the 19th century it became customary to understand the laws governing motion of particles in terms of fields. Physicists began to think of an electrically charged particle as the origin of a field, with other particles interacting with that field. Rather than assuming that particles exert a force directly on each other, the existence of a field is assumed, with the field acting as mediator of the interaction. The field cannot be observed directly; only the consequences are observable. The observations can also be accounted for in terms of a theory in which particles exert a force directly on each other, but over time scientists noticed that a theory in terms of an interaction-mediating field is more economical. The field concept facilitates an economy of thought, and in the history of physics favoring the theory that offers the best economy of thought has proven to be a good strategy many times.
Naturally the question arose whether it would be possible to frame a field theory of gravitation. Is gravitational interaction mediated by a field?

Inertia and the concept of coupling to a field

Before turning to gravitation, I will discuss inertia.
Inertia is opposition to change of velocity. For instance, if you take a slingshot and you shoot a marble at a chunk of clay, the marble will penetrate into the clay because of its inertia. It's interesting to contemplate what this says about inertia. Think of pushing a stick into the clay. It takes a serious force to press the stick into the clay, and in order to exert that kind of force you have to brace yourself. More generally, I will refer to this as a need for leverage; without leverage it's not going to happen. So, what gave the marble the leverage to penetrate the clay? It's no good saying that the marble's momentum carried it into the clay; that's just empty words. The leverage that the marble needs cannot be innate to the marble. (Compare the Baron von Münchhausen story where he was sinking into a swamp: he pulled himself out of the bog by grabbing his own hair and pulling himself up with all the strength he had! If you think the Münchhausen story is unphysical, then it will also be obvious to you that the marble cannot push itself into the clay.)
The marble's leverage to penetrate the clay must come from coupling to something that is not part of the marble itself. Given the well-proven power of the field concept in electromagnetism it's a natural move to hypothesize that the marble couples to an inertia field; a uniform all-pervasive field with the property that it opposes change of velocity of objects that couple to the field.
The electric charge of a particle is a measure of how strongly it couples to an electric field. Accordingly the inertial mass of a particle is a measure of how strongly it couples to the inertia field.

Inductance

For an overview of the properties of the inertia field, reviewing the somewhat analogous phenomenon of inductance is helpful. A coil of conducting wire with self-induction has the following property: it will offer little resistance to current, but it will oppose any change of current strength. (If the coil is cooled down to a temperature at which superconductivity occurs, the coil will offer zero resistance to current.) The mechanism of opposing change of current strength is as follows: change of current strength induces a changing magnetic field, which in turn induces an electric field that opposes the change of current strength.
In a wire with zero resistance and zero self-induction, applying a voltage would result in an instantaneous jump to an infinitely strong current.
In a wire with some resistance and no self induction, applying a voltage results in a jump to a particular current strength (with the magnitude of the current strength described by Ohm's law: V=I*R)
In a wire with no resistance and some self-induction, and starting with zero current strength, applying a voltage results in a steadily increasing current strength. That is, if only self-induction is involved, the rate of change of current strength is proportional to the applied voltage.
The case of no resistance and some self-induction is the one that inertia is analogous to. The rate of change of velocity is proportional to the applied force.
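
To make the analogy concrete, here is a small sketch with hypothetical round-number values, integrating the two cases side by side: the current in a resistanceless coil under a constant voltage, and the velocity of a mass under a constant force:

L_coil, V = 2.0, 10.0  # self-inductance (henry), applied voltage (volt)
m, F = 2.0, 10.0       # mass (kg), applied force (newton)

I, v, dt = 0.0, 0.0, 0.01
for _ in range(100):        # one second of simulated time
    I += (V / L_coil) * dt  # dI/dt = V/L: rate of change of current strength
    v += (F / m) * dt       # dv/dt = F/m: rate of change of velocity
print(I, v)  # both 5.0: the two systems obey the same law of change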
While the analogy between inductance and inertia is intriguing it does not in itself explain inertia; in the case of the inertia field no mechanism is known. All that is possible is to describe the properties of the inertia field and how particles interact with it. That is what Newton's laws of motion do.

Newton's laws of motion

Newton's first and second law of motion can be understood as the axioms of a theory of the inertia field. (Of course those laws weren't formulated with an inertia field in mind, the concept of a field was not available in Newton's time, but they can readily be reinterpreted in that way: no discrepancy arises.)
Newton's first law:
In retrospect we recognize that opting to use euclidean geometry for representing inertial space is in itself a theory of physics. Quite understandably, Newton and his contemporaries did not regard using euclidean geometry to represent inertial space as a theory of physics; at the time there was no alternative for euclidean geometry, and euclidean geometry was perfectly adequate for the purpose! Newton's first law (reinterpreted from a modern perspective) asserts that euclidean geometry is an appropriate physics model for the isotropy and homogeneity of the inertia field. In the absence of a force the object will move along a euclidean straight line.
Definition of Force:
In newtonian dynamics, force is defined as follows: it is the interaction between a pair of objects, with each object's momentum being changed by the force exerted by the other object. It's crucial to note that by opting to define force in that way inertia is not included in the category "force". The concept of force could have been defined otherwise, how to define 'force' is a matter of choice, but once a definition is in place it's necessary to keep all subsequent statements consistent with it. Whatever you do, you cannot refer to inertia as a force; inertia falls in a category of its own.
Newton's second law:
The inertia field opposes change of velocity. In order to change velocity with respect to the inertia field, a force must be exerted. The rate of change of velocity is proportional to the applied force.
Note that in order to achieve change of velocity a pair of objects must be present, and that there must be an interaction between those two objects. By exerting a force upon each other they both change the other object's velocity relative to the inertia field.

Inertia field

To be sure, the above considerations do not prove the existence of an inertia field. What can be shown is that postulating an inertia field facilitates an economy of thought, and throughout the history of physics that has always been an important consideration.

The equivalence class of inertial coordinate systems

The symmetries of the inertia field define an equivalence class of coordinate systems: the class of inertial coordinate systems. The criterion for distinguishing an inertial coordinate system from a non-inertial coordinate system is that in an inertial coordinate system the laws of motion hold good. (This is discussed in more detail in the article Inertial coordinate system)

Spacetime

The electromagnetic field is assumed to be an occupant of space, or in terms of relativistic physics, an occupant of spacetime. The concept of the inertia field extends way beyond that. The inertia field is not an occupant of spacetime; Minkowski spacetime and the inertia field are regarded as one and the same entity.

Unification: general relativity

Whereas the special theory of relativity is strictly a theory of motion, the general theory of relativity is both a theory of motion and a theory of gravitation.
In terms of the general theory of relativity both inertia and gravitation are accounted for in terms of interaction with a single field, and an appropriate name for this single field is the inertio-gravitational field. (The expression 'inertio-gravitational field' is relatively new. See documentation for a list of authors who have started using that expression.)
In terms of newtonian dynamics there is conceptually a distinction between inertial mass and gravitational mass. It is noted of course that the inertial mass and the gravitational mass always coincide, but newtonian dynamics does not account for this equivalence.
In terms of general relativity distinction between inertial mass and gravitational mass does not enter as a matter of principle: objects couple to a single field, the inertio-gravitational field. Inertial mass and gravitational mass aren't just thought of as having the same value; inertia and gravitation are thought of as arising from the same coupling.
To see how it is possible to formulate a theory of gravitation starting from the postulate of the single inertio-gravitational field it is helpful to explore acceleration in Minkowski spacetime.

Acceleration in Minkowski space-time

Picture 1. Animation: Two accelerating spaceships, the red ship chasing the blue ship.
Animation 1 represents two accelerating spaceships with the "red ship" following the "blue ship". The two ships are exchanging light signals (not shown) to maintain a constant separation. The red ship adjusts its acceleration in such a way that the transit time of a round trip of the light signal remains the same. When the roundtrip time of the light signal remains the same, the following applies:
  • The red ship is accelerating harder than the blue ship, that is: the red ship is pulling more G's.
  • Signals emitted by the blue ship are blueshifted on reception, as measured by the red ship.
  • Signals emitted by the red ship are redshifted on reception, as measured by the blue ship.
  • For the blue ship a larger amount of proper time elapses than for the red ship.
Let me take a closer look at that time dilation effect. Take the case where the two spaceships are together in the beginning. First one starts to accelerate, then the other starts to follow, and for a duration of time the state of the red ship following behind the accelerating blue ship is maintained. This state is maintained far longer than the time it takes them to separate, or to rejoin later. When the two ships finally rejoin it will be seen that for the blue ship more proper time has elapsed than for the red ship. The important fact here is that the time dilation effect is not some illusion due to transmission delay: it's an actual physical effect.
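
The effect can be computed directly by integrating dτ = √(1 − v²) dt along each worldline (units with c = 1). A schematic sketch with velocity profiles of my own choosing, standing in for the gently accelerating ship and the harder accelerating ship:

import math

def proper_time(velocities, dt):
    """Integrate dtau = sqrt(1 - v^2) dt along a worldline (units with c = 1)."""
    return sum(math.sqrt(1.0 - v * v) * dt for v in velocities)

n, dt = 10000, 0.001  # 10 units of coordinate time
blue = [0.2] * n                             # schematic: mild velocity profile
red = [0.2 + 0.6 * i / n for i in range(n)]  # schematic: accelerating harder

print(proper_time(blue, dt))  # ~9.80 units of proper time
print(proper_time(red, dt))   # ~8.42 units: less proper time elapses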
Animation 1 does not represent a Minkowski spacetime diagram, for in the animation there is no global coordinate time, and no global standard of length. The vertical bars that move from left to right in the animation represent how the motion of inertially moving objects would be perceived by the red and the blue ship.

Picture 2. Animation: Circular motion in Minkowski space-time.
Animation 2 shows two spaceships in circular motion. In this animation the diameter of the circle of circular motion is rather small. Now consider circular motion along a circle with an extremely large diameter. The situation is then very close to being indistinguishable from the situation depicted in animation 1, which depicts linear acceleration. This indistinguishability can be seen as a form of the principle of relativity of inertial motion. In the case of motion along a circular path there is a large sideways velocity, a uniform velocity perpendicular to the direction of acceleration. But in Minkowski space-time uniform velocity is relative.
An object that is released by the blue ship will from that moment on be moving inertially. It will then accelerate with respect to the blue and red ship, accelerating (while moving inertially) towards the red ship.

Mediator of gravitational interaction

Half of the story is that the inertia field is regarded as acting upon inertial mass: when an object travels along a path in spacetime that is not the spatially shortest path, there is less lapse of proper time as compared to travelling along the spatially shortest path. In the general theory of relativity this is complemented with the assumption that the inertia field is not immutable, but that it is acted upon by inertial mass. In the general theory of relativity it is assumed that in the presence of inertial mass the inertia field is deformed away from isotropy, and this deformation of the inertia field subsequently acts as the mediator of gravitational interaction. When the quantity of mass is sufficiently large, such as the quantity of mass of a planet, the effects of the deformation of the inertia field are very noticeable.

Picture 3. Animation: An infinitely large slab of matter generates a uniform gravitational field.
Animation 3 shows two spaceships; the grey area represents a sideways view of a slab of matter that is infinite in size. Such an infinite slab of matter would alter the physical properties of the inertio-gravitational field in a uniform way. In order to keep their distance to the slab the same, the spaceships must use thrusters. As is the case with animation 1, the animation does not represent a Minkowski spacetime diagram, for there is no global coordinate time, nor a global standard of length.
The assumption of just a single field, the inertio-gravitational field, implies that the situation depicted in animation 3 must be equivalent to the situation of animation 1 in all aspects of physics.
The uniform gravitational field that would extend away from an infinitely large slab of matter is physically unrealistic, of course. The purpose of animations 1, 2, and 3 is to focus on the concept of the inertio-gravitational field as a single field, and to illustrate its properties. A uniform gravitational field is inherently undetectable. If the two spaceships can only perform transit time measurements on signals that they are exchanging with each other (that is, only local measurements), then presence or non-presence of the infinitely large slab of matter is inherently undetectable. Given the assumption of the single inertio-gravitational field, a uniform gravitational field has only a relative existence.
The gravitational field that extends away from a lump of matter such as a planet is a non-uniform gravitational field. The gravitational field that extends away from a planet is characterized by tidal effects, which are not relative.

Gravitational time dilation

Picture 4. Image: All over the world, clocks located at sea level will count the same amount of proper time.
Picture 4 shows the Earth, with its equatorial bulge very much exaggerated. For clocks located on Earth there are two factors that determine the amount of proper time that elapses. One factor is gravitational time dilation. Clocks located near the poles are closer to the Earth's center of mass; they are located deeper in the potential well, which corresponds to a smaller amount of proper time elapsing, as compared to objects located farther from the Earth's center of gravitation. The other factor is velocity time dilation. Objects that aren't located on either of the two poles are circumnavigating the Earth's axis, which corresponds to more velocity time dilation (smaller amount of proper time elapsing) than for objects located at the poles. The closer to the equator, the more velocity time dilation.
For a clock located somewhere between the poles and the equator the total time dilation can be understood as a combination of gravitational time dilation and velocity time dilation. For all terrestrial clocks the two effects add up to the same total amount of time dilation. Thus for all terrestrial clocks located at sea level the same amount of proper time elapses; the Earth's sea level is an isochrone. This state of the same amount of proper time elapsing all over the Earth's surface is a state towards which the system naturally evolves. For example, suppose that the Earth's rate of rotation has slowed down somewhat, but that the equatorial bulge has not decreased yet. Then instead of equilibrium a larger amount of proper time elapses at the equator than at the poles. A difference in lapse of proper time corresponds to a difference in potential energy, and this difference tends to even out; there will be a shear stress in the solid Earth. Over time there will be a redistribution of the Earth's mass toward a smaller equatorial bulge, until the amount of lapse of proper time is the same all over the Earth's surface. (Remark: the distinction between gravitational time dilation and velocity time dilation is helpful for recognizability, but as seen from a deeper level it's an artificial distinction. As seen from a more profound level the distinction isn't there - but that's beyond the scope of this article.)
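
First-order estimates give a feel for the sizes involved: the fractional clock-rate offsets are roughly gΔh/c² for altitude and v²/(2c²) for velocity. A rough sketch with round Earth numbers; this is an order-of-magnitude check only, since the exact cancellation on the geoid involves the full geopotential (including the centrifugal potential), which the crude gΔh estimate ignores:

c = 2.998e8   # speed of light, m/s
g = 9.81      # surface gravity, m/s^2
dh = 21.4e3   # equatorial radius minus polar radius, m
v = 465.0     # rotation speed at the equator, m/s

print(g * dh / c**2)      # ~2.3e-12: fractional rate gain from sitting higher in the potential well
print(v**2 / (2 * c**2))  # ~1.2e-12: fractional rate loss from circumnavigating the Earth's axis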
The discussion above covers only gravitational time dilation. Gravitational time dilation is one aspect of curvature of spacetime. A theory of gravitation that incorporated only gravitational time dilation would in particular yield wrong predictions for very fast motion.
General relativity predicts that the gravitational deflection of light that just grazes the sun will be 1.75 arcseconds. A theory of gravitation that incorporated only gravitational time dilation would predict half that value: about 0.87 arcsecond. In fact, in a 1911 paper by Einstein on an early exploratory theory of gravitation a value in that range was stated: 0.83 arcsecond.
A theory of gravitation that incorporated only gravitational time dilation would predict planetary orbits that are very close to the orbits described by newtonian dynamics and general relativity, but it would not correctly predict the precession of the perihelion of the planet Mercury. The actual general theory of relativity involves deformation of spacetime; both time and space are affected.

Motion along a geodesic

According to special relativity there is a connection between inertial motion and the amount of lapse of proper time. When a set of trajectories that bring an object/observer from event A in spacetime to event B is compared, then the inertial motion trajectory is seen to have the largest amount of lapse of proper time. Why motion in spacetime has this property is not known.
Picture 5. Image: The trajectory with only inertial motion (the continuous straight line in the diagram) corresponds to a maximum in lapse of proper time. For all trajectories in which more spatial distance than that is travelled, there is less lapse of proper time.
According to the general theory, the only way of describing inertial motion is in terms of that connection between inertial motion and the amount of lapse of proper time. The general theory describes that in the presence of matter the physical properties of space-time are changed away from isotropy. This change of the properties of space-time away from isotropy, which is referred to as 'curvature of space-time', acts as the mediator of gravitational interaction. Given a curvature of spacetime, a trajectory that from afar is seen to be curvilinear can be the trajectory with the largest amount of lapse of proper time.
The general theory replaces Newton's first law with a more general law: objects in inertial motion move along the trajectory that is the trajectory along which the amount of proper time is the largest possible.

Wheeler's antipode-antipode corridor

John Wheeler introduced the example of a corridor drilled through a planet to illustrate the concept of moving along a geodesic. A tunnel is constructed that connects, in a straight line, two points of a planet that are at opposite locations on that planet. The tunnel diameter is so small compared to the size of the planet that the difference in mass distribution is negligible.
When a capsule is released at one entrance to the corridor, it will from that moment on be oscillating in that corridor. In the case of the Earth the period of one oscillation would be about 84 minutes, the same amount of time as a circular orbit that circumnavigates the Earth at very low altitude. Given the properties of the space-time that the capsule is in, the oscillating trajectory is the path with the largest possible amount of lapse of proper time. If there were also some friction, then the capsule would eventually come to rest at the midpoint of the corridor.
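
For a planet of uniform density the gravity inside the corridor is proportional to the distance from the center, so the capsule executes simple harmonic motion with period T = 2π√(R³/GM). A quick check with Earth values (uniform density is an idealization; the real Earth is denser toward its core):

import math

G = 6.674e-11  # gravitational constant, N * m^2 / kg^2
M = 5.97e24    # mass of the Earth, kg
R = 6.371e6    # radius of the Earth, m

T = 2 * math.pi * math.sqrt(R**3 / (G * M))
print(T / 60)  # ~84 minutes, the same as a circular orbit at very low altitude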
A counterintuitive aspect is that on the one hand it is stated that objects tend to follow the trajectory that maximizes the amount of proper time that elapses, and on the other hand it is stated that objects tend to move towards the lowest point of the gravitational well. (The lower in the gravitational well, the less lapse of proper time.) That raises the question: if objects tend to move along a trajectory that maximizes lapse of proper time, then why do objects move towards a region where less proper time elapses? The determining factor is that for an object that moves in free fall from one altitude to another, more proper time elapses than in a state of being fixed at either of the two altitudes. (Here the state of 'free fall' also applies to ascending motion. An object that is thrown upwards is also in free fall during its rise to its highest point.)
Picture 6. Animation: The blue object follows the path with the largest lapse of proper time.
By analogy with euclidean geometry, this path that corresponds to an extremum is called a 'geodesic'. In euclidean geometry, a geodesic is the path with the shortest possible spatial length. In the general theory of relativity, the expression 'geodesic' refers to the path with the largest amount of lapse of proper time.
An accelerometer onboard the capsule would at all times measure zero acceleration; the capsule that is oscillating in the corridor is in inertial motion. On the other hand, objects that are at rest on the surface of the Earth are not in inertial motion. For an object at rest on the surface of the Earth the local inertial frame is accelerating towards the center of the Earth. An object at rest on the surface of the Earth is (due to the normal force exerted by the Earth's surface) being accelerated with respect to the local inertial frame.
In Minkowski space-time, the equivalence class of inertial coordinate systems is globally valid. In space-time as described by general relativity, the equivalence class of inertial systems at one point in space can be accelerating with respect to the equivalence class of inertial systems at another point. In the case of the solar system one can distinguish a hierarchy. A spacecraft that is in orbit around the Moon is in inertial motion. The center of mass of the Moon is in inertial motion around the common center of mass of the Earth-Moon system. The common center of mass of the Earth-Moon system is in inertial motion around the Sun. The center of mass of the solar system is in inertial motion around the center of the galaxy.

Analogies between two fundamental unifications

Picture 7. Image: Each plane of simultaneity can be seen as cutting a different cross-section of Minkowski spacetime.
Picture 7 is an illustration that was used in the article about special relativity. Every relative velocity has a corresponding distribution of coordinate time and coordinate distance. For every object/observer in spacetime there is a particular plane of simultaneity that is co-progressing in time with the object/observer. For each observer a particular slice of spacetime is valid; the slice that corresponds to his proper plane of simultaneity.
In the case of the electromagnetic field the observer's velocity relative to the electromagnetic field is the determining factor for what the observer will measure. In the four-dimensional world of Minkowski spacetime, the electromagnetic field is a single field. To any observer there are apparently two fields: an electric field and a magnetic field. For observers at different velocities, the electromagnetic field "decomposes" differently into an electric component and a magnetic component.
In the case of the inertio-gravitational field, the determining factor for what an observer will measure is the acceleration of the observer with respect to the field. For observers that are accelerating at different accelerations with respect to the local inertio-gravitational field the field manifests itself differently.
The two unifications: the electric field and the magnetic field unified into the single electromagnetic field, and inertia and gravitation unified into the single inertio-gravitational field.

Analogies between two transitions

The following paragraphs are designed to highlight the remarkable parallels between the transition from newtonian mechanics to special relativity on the one hand, and the transition from special relativity to general relativity on the other hand.
In the case of classical electrodynamics it is technically possible to formulate a theory in which it is assumed that there is an actually present background of newtonian absolute space and absolute time. Such a theory needs to find a way to accommodate the equivalence of the class of inertial coordinate systems. In such an ether theory, having a velocity with respect to the ether results in time dilation and length contraction effects that act in such a way that the assumed newtonian background is rendered unobservable. Any theory that assumes separate space and time needs additional hypotheses to account for the fact that no experiment ever detects uniform velocity with respect to the presumed immutable newtonian background. Special relativity has no such need because velocity with respect to spacetime does not enter special relativity.
In the case of special relativity it is technically possible to formulate a theory of motion and gravitation in which it is assumed that there is an actually present immutable Minkowski spacetime. That is: a theory that assumes a separate inertia field and gravitational field. Such a theory needs to find a way to accommodate the equivalence of gravitational and inertial mass. It turns out that a theory that assumes an immutable Minkowski spacetime needs to be amended by assuming time dilation and length contraction effects that act in such a way that the immutable Minkowski background is rendered unobservable. Any theory that assumes a separate inertia field and gravitational field would need additional hypotheses to account for the fact that no experiment ever detects uniform acceleration with respect to the assumed immutable background. General relativity has no such need because assumption of a fixed background does not enter general relativity.

Transition: replacing the metric of spacetime

In the article about special relativity I discussed that in the transition from newtonian dynamics to special relativity the spacetime metric was replaced: a shift from the Euclidean metric to the Minkowski metric. In the transition from the special theory of relativity to its successor, the general theory of relativity, once again the metric was replaced. The metric that applies in the case of the general theory of relativity is a pseudo-Riemannian metric, as the metric allows for curvature of spacetime. This metric has a (+,-,-,-) signature; the tangent space at any point of it is described by the Minkowski metric.
Thus both the transition from newtonian dynamics to the special theory of relativity and the transition from the special theory of relativity to the general theory of relativity can be seen as replacing the preceding spacetime metric with a more generalized spacetime metric. In each case the predecessor's metric is a limiting case of the metric that replaced it. In the limit of velocities approaching zero the Minkowski metric reduces to the Euclidean metric. In the limit of spacetime curvature approaching zero the pseudo-Riemannian (+,-,-,-) signature metric reduces to the Minkowski metric.

No fixed background

John Wheeler has coined the phrase: "spacetime is telling matter/energy how to move, matter/energy is telling spacetime how to curve." That is, in general relativity, the curvature of spacetime is a dynamic variable.
In Newtonian dynamics the purpose of writing down and solving the equation of motion is to find the motion of material objects with respect to the background. Solving equations of motion in Newtonian dynamics can be hard at times, but at least there is a fixed background: inertia. In general relativity, the purpose of writing down and solving the field equations is to find expressions of how the shape of the inertio-gravitational field evolves over time, and how the motion of objects evolves over time.
Let me elaborate on the above. When the purpose is to calculate a perfectly circular orbit, then there is one factor that is not constant: the instantaneous direction of velocity. The other factors are constant: the distance to the center of rotation, the magnitude of the orbital velocity and the magnitude of the centripetal force. The case of a perfectly circular orbit readily yields to analysis, and Huygens had derived a formula for the required centripetal force. But an orbit with some eccentricity is much harder to analyse. In the case of a keplerian orbit all the participating factors are in flux. Distance to the primary is affected by the velocity and the centripetal force, the velocity changes because of the centripetal force, and the magnitude of the centripetal force is a function of the distance to the center of rotation. All of the factors are functions of each other - no starting point for a derivation. To overcome this problem Newton developed a new branch of mathematics, which he called the method of fluxions. Today this kind of mathematics is called differential calculus. The power of the differential calculus allows the calculation to be lifted over the problem of not having a starting point. The reason that the problem can be solved after all is that the participating factors can all be expressed as functions of each other.
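
A numerical integrator implements exactly this step-by-step differential bookkeeping: at each instant all of the mutually dependent factors are evaluated from the current state, and then the state is advanced a small step. A minimal sketch in dimensionless units (GM = 1), with starting values of my own choosing that give an eccentric orbit:

import math

GM = 1.0
x, y = 1.0, 0.0    # start at unit distance from the primary
vx, vy = 0.0, 0.8  # below circular-orbit speed, so the orbit is eccentric
dt = 0.001

for _ in range(20000):
    r = math.hypot(x, y)
    ax, ay = -GM * x / r**3, -GM * y / r**3  # the force depends on the distance
    vx += ax * dt                            # the velocity changes because of the force
    vy += ay * dt
    x += vx * dt                             # the position changes because of the velocity
    y += vy * dt

print(x, y)  # a point on the closed, eccentric keplerian orbit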
As mentioned above, in solving the equations of newtonian dynamics space and time provide a constant background. So in fact not all of the participating factors are in flux. It is in the case of the problem posed by general relativity that really all participating factors are in flux. The inertio-gravitational field is curving as affected by distribution of matter in spacetime, with matter moving as affected by curvature of spacetime. Now there is really no starting point at all.
The equations of general relativity rely on a sophisticated mathematical apparatus that handles coordinate transformations, thus allowing equations to be expressed in a way that is not committed to a particular choice of coordinate system. The class of coordinate mappings that is supported by this apparatus is called the diffeomorphism class. Members of the diffeomorphism class can be transformed into one another by a transformation that does not "tear" or "cut" the "fabric" of space-time. That is, two points in space-time that are adjacent in one member of the diffeomorphism class are also adjacent points in all other members of the diffeomorphism class. Other than that, the mathematical apparatus accommodates a vast range of transformations: translation, uniform acceleration, any acceleration, rotation with constant rate, rotation with variable rate, any "flexing" deformation, etc.
Employing the mathematical apparatus that allows equations to be expressed in a form that is not committed to any choice of a particular coordinate system (among the diffeomorphism class) is called 'using coordinate-independent equations'. General relativity uses coordinate-independent equations to express certain physical properties. The physics of "matter/energy telling space-time how to curve, and curved spacetime telling matter/energy how to move" can be expressed in a coordinate-independent way. That is, the mathematical power of diffeomorphism invariance makes it possible to formulate equations even with the background itself being a dynamic variable. When a solution to the equations is obtained, the final step is to choose a coordinate system and map the solution to that coordinate system.

Matter, energy, fields and spacetime

In the introduction it was mentioned that the electromagnetic field can carry momentum, indicating that fields and matter are not as different as one might expect.
Electrodynamics describes that an oscillating electrically charged particle will radiate electromagnetic waves. The process of emitting electromagnetic waves decreases the kinetic energy of the oscillating particle.
General relativity describes that when two masses are orbiting each other in non-circular orbits, that system will radiate gravitational waves. That is: gravitational waves (propagating undulations of the inertio-gravitational field) can carry away energy and momentum from a system of orbiting masses. General relativity indicates that matter, energy, fields and spacetime are not as different as one might expect.

Special Relativity

1. Why laws of motion are possible

The first relativity principle: Galilean relativity

In the history of physics, recognition of the central position of inertia has proceeded in stages. With the benefit of current knowledge the Copernican revolution can be recognized as the first step in making inertia the prime organizing principle of the understanding of dynamics. (Here, I take 'Copernican revolution' in its widest possible sense: the revolution in thinking about motion that was completed by Newton's work.)
Galilei pointed out why it is possible to home in on laws of motion. Imagine you are in a cabin of a boat that is in motion over perfectly smooth water. I will call this particular boat the test boat. The test boat does not accelerate in any way; it is in uniform motion. You are on the test boat, juggling, or you are throwing darts, or something like that. (Galilei used other examples.) Galilei argued: no matter the velocity of the test boat, the skill for juggling the balls or throwing the darts is the same. There is a large range of circumstances that are different (boats having a velocity relative to each other), whereas as far as juggling is concerned things are the same on each separate boat. To become a skilful juggler is to acquire implicit knowledge (motor knowledge) of how to work with the properties of motion.
The grid of the dartboard serves as a reference; the dart thrower's aim is with respect to that reference system. One layer of 'the same' is that it does not matter where in space the dartboard is positioned, or in what direction the dartboard is facing; the properties of motion are the same. A second layer is that, given a state of unaccelerated motion of the test boat, its velocity relative to other boats does not matter either; the properties of motion as experienced on the test boat are always the same.

Symmetry of inertia

If you exert a force on an object, causing it to accelerate, then for all directions the same force results in the same acceleration: the isotropy of inertia (isotropy = the same in all directions). For all locations and orientations in space, inertia is symmetrical. Likewise, for all uniform velocities, inertia is symmetrical. Imagine the opposite: imagine that inertia were erratic, changing from place to place and from one instant to another - motion would be lawless then. But inertia is extremely symmetric. Formulating laws of motion is a fruitful undertaking because of the extensive symmetries of inertia.
A theory of motion describes the properties of inertia; in order to formulate a theory of motion it is necessary to have a way of embodying the symmetries of inertia. When a mathematical theory of motion was formulated, culminating in the work of Isaac Newton, embodying these symmetries was straightforward: euclidean space has the same symmetries as inertial space.
In retrospect we can recognize that the Copernican revolution changed the role that geometry plays in physics. For the ancients geometry was an instrument for describing shapes: the trajectory of a planet was thought of as a shape, and this shape was described by geometry. But in newtonian dynamics euclidean geometry plays a much more profound role. In newtonian dynamics inertia is the prime organizing principle, and euclidean geometry is appropriate because it perfectly models the symmetries of inertia. In retrospect we can recognize that the act of using euclidean geometry as an instrument for describing motion is actually the adoption of a physics theory.

Equivalence class of coordinate systems

The euclidean space is represented with a coordinate system (a coordinate grid). A coordinate system has a zero point and an orientation. The comprehensive representation of the symmetries of inertial space must be specified as an equivalence class of coordinate systems.

Linear transformations

The symmetry demands narrow down the transformations to the following:
  • Transformations between coordinate systems that are at an angle with respect to each other
  • Transformation between coordinate systems that are translated with respect to each other
  • Transformations between coordinate systems that have a uniform velocity relative to each other
For any object that is in inertial motion there is a coordinate system that is co-moving with that object. A coordinate system that is co-moving with an object in inertial motion is an inertial coordinate system. Since transforming to the inertial coordinate system co-moving with another object amounts to adding the relative velocity of the pair of objects, the law that describes velocity addition of pairs of objects and the law that describes the transformation between inertial coordinate systems are one and the same law.


2. Special relativity

The special theory of relativity is, like Newton's laws of motion, a theory of motion: it deals with relations between space, time and matter.
As is the case with Newton's laws of motion, special relativity uses the phenomenon of inertia as the prime organizing principle of dynamics understanding, but on a more profound level. The purpose of this article is to demonstrate that.

Synchronization procedure

Picture 1. Animation: Three clocks.
Picture 1 represents three clocks, counting time. To keep those clocks perfectly synchronised you need a way of disseminating time. One way is to use pulses of light; I will get to that later. The next animation shows the three clocks, with smaller clocks shuttling between them.

Picture 2. Animation: Three spaceships and a procedure to maintain synchronised fleet time.
Animation 2 represents a spacetime diagram. The yellow lines represent the worldlines of pulses of light that are emitted at t=0. The consecutive frames of the animation combined represent a single diagram.
An observer can always define his own position as the origin of a coordinate system to map the positions of other objects, such as the ships of a fleet of spaceships. In the animations of this article the fleet consists of three ships, but it can be any number of ships, and those ships can be regarded as forming a grid. That grid provides a coordinate system to assign a coordinate distance and a coordinate time-lapse between two events.
Each ship of the fleet logs the events taking place at its spatial coordinate, recording at what point in fleet time the event took place. The ships of the fleet communicate these logs to each other and on each ship of the fleet the information in the received logs can be put together in a comprehensive spacetime mapping of the events. Animation 2 is an example of such a comprehensive mapping.

The red circles represent clocks onboard the ships, counting time. The two orange circles represent miniclocks shuttling back and forth between the ships of the fleet; the miniclocks are used for a procedure to maintain synchronized fleet time. The synchronization procedure employed here relies on the isotropy of inertia: the ships of the fleet take care that every time the miniclocks are sent on their journey, they are propelled with equal force in both directions. (More precisely: all the miniclocks have the same mass, and in propelling each miniclock the ships of the fleet take care to impart the same amount of kinetic energy each time.)
The ships of the red fleet are 4 units of distance apart. Here, 4 units of distance means that, as measured by the clocks of the fleet, pulses of light take 4 units of time to propagate from one ship to another. In this article, distance is measured in terms of time: the transit time of pulses of light. In this animation the miniclocks take 5 units of fleet time to travel from one ship to another, so their velocity relative to the fleet is 4/5th of the speed of light.

Time dilation

The animation depicts special relativity's prediction for this case: during the journey from one ship to another, the miniclocks count 3 units of proper time. (Note that the time dilation doesn't prevent the synchronization: the amount of time dilation is perfectly predictable, so it can be accounted for.)
Over the course of a complete synchronization cycle you see the red clocks go round 10 times; that is, for the red clocks 10 units of time elapse. By contrast, you see the orange clocks go round 6 times; for the shuttles a complete cycle takes 6 units of shuttle time.
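These numbers fit together through the invariant spacetime interval that is introduced further on. With the transit mapped in fleet coordinates as t = 5 units of time and x = 4 units of distance, the proper time counted by a miniclock follows as:

    τ² = t² − x² = 5² − 4² = 25 − 16 = 9,   so τ = 3 units of proper time.

The numbers 3, 4 and 5 form a Pythagorean triple, which is why the animations work out in whole units.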

Transmission delays not fundamental

It's important to note that transmission delays play no part in special relativity. They must be accounted for, of course, in order to assemble a comprehensive spacetime mapping, but the substance of special relativity begins only after transmission delays have been taken into account. In order to bring the relativistic effect into focus, the animation is designed to avoid observational differences that arise from transmission delay.

No explanation

Relativistic physics does not provide an explanation as to how and why time dilation occurs. The starting point of relativistic physics is to assume that this is how things are, and the content of the theory consists of working out the ramifications of this assumption. The justification of the assumption is in the success of relativistic theory in applied physics.

Minkowski spacetime geometry

The conceptual shift in the transition from classical dynamics to the spacetime of special relativity is a shift from Euclidean space and time to Minkowski spacetime.
Picture 3. Image
The chrono-geometry of spacetime intervals.
The line on which the points A, B, C, and D are grouped connects all the points in spacetime that have in common that for an object moving with uniform velocity from point O in spacetime to any point on that line, 3 units of proper time elapse. For instance, the lines OA and OC in image 3 correspond to the worldlines of the miniclocks in animation 2.

Spacetime interval

The concept of spacetime interval in Minkowski spacetime is somewhat analogous to the concept of radial distance in Euclidean space. In Euclidean space with 2 dimensions of space there is the relation:
τ is not yet involved here:
r² = x² + y²
which is of course Pythagoras' theorem.
The radial distance between two points is an invariant, in the sense that it is independent of the particular choice of mapping a space with a Cartesian coordinate system. More precisely, radial distance between two points is invariant under a coordinate transformation that corresponds to a spatial rotation.
In this article the word 'space' is used in a very abstract sense, in a meaning that is quite different from the everyday meaning of the word. In this article, everything is described in terms of time. Spatial distance is measured in terms of the amount of time that it takes light to cover the distance. Both time and spatial distance are counted in units of time.
The invariant spacetime interval of Minkowski spacetime geometry embodies a relation between space and time. The size of the spacetime interval is counted in units of time: the proper time as measured by co-moving clocks. The standard symbol for proper time is τ (the Greek letter 'tau').
τ² = t² − x²
The radical difference is the presence of the minus sign.
When represented geometrically, the spacetime interval is associated with a hyperbola, as depicted in image 3, whereas radial distance in Euclidean space is of course associated with a circle. Note that image 3 represents a mapping of Minkowski space onto Euclidean space, rather than representing Minkowski space directly.
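As an illustration of this invariance, here is a minimal Python sketch (units in which c = 1, as used throughout this article); the boost function re-maps an event to a coordinate system with relative velocity v, and the squared interval comes out unchanged:

    import math

    def interval_squared(t, x):
        # Squared spacetime interval: tau^2 = t^2 - x^2 (units where c = 1).
        return t * t - x * x

    def boost(t, x, v):
        # Re-map an event in a coordinate system moving at velocity v (|v| < 1).
        gamma = 1.0 / math.sqrt(1.0 - v * v)
        return gamma * (t - v * x), gamma * (x - v * t)

    # Event: a miniclock arrives 4 units of distance away after 5 units of time.
    t, x = 5.0, 4.0
    print(interval_squared(t, x))    # 9.0, so tau = 3
    tb, xb = boost(t, x, 0.4)        # mapped in a frame moving at 2/5 c
    print(interval_squared(tb, xb))  # 9.0 again: the interval is invariant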

Physical consequences

The shuttles are going back and forth between the ships of the red fleet. Each time they return to the ship that they came from and dock there they are still the same size as when they left. Space would be very strange indeed if returning from a journey you would find yourself to be smaller than when you left.
Now to the physical effect that does occur: the effect on elapsed time.
The shuttles take a path that is not the spatially shortest path. Whenever that happens, then on rejoining an object that did move along the spatially shortest path, less proper time has elapsed for the traveller.

Metric of Minkowski spacetime

The measure of distance between two points in Euclidean space that is invariant under a rotation of the mapping coordinate system is called 'the metric of Euclidean space'.
In the case of Minkowski spacetime it is common to refer to its properties as 'geometry of Minkowski spacetime'. (A more accurate expression would be 'chrono-geometry of Minkowski spacetime', but that expression is rarely used.) By analogy with the concept of a metric in Euclidean context the formula for the invariant spacetime interval is referred to as 'the metric of Minkowski spacetime'.
The expression 'metric of Minkowski spacetime' is common usage, but because of the difference with the general concept of a metric it is also referred to as a 'pseudo-metric'. This signals that while in mathematical expressions the pseudo-metric performs exactly the same function as a metric, it is fundamentally different from a metric.
The concept of a metric can be applied in many different geometric contexts. A simple example is the metric of the way the king moves around in the game of chess: to go from one corner to another along a column or a row takes 7 steps, and to go diagonally also takes 7 steps. That metric is an example of a non-Euclidean metric, for Pythagoras' theorem does not apply.
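A minimal sketch of that king-move metric (known in mathematics as the Chebyshev distance):

    def king_distance(x1, y1, x2, y2):
        # Minimum number of king moves between two squares: the king can change
        # both coordinates in one step, so the larger displacement decides.
        return max(abs(x2 - x1), abs(y2 - y1))

    print(king_distance(0, 0, 7, 0))  # 7 steps along a row
    print(king_distance(0, 0, 7, 7))  # also 7 steps, diagonally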
The metric of Minkowski spacetime, with the square of one dimension being subtracted from the square of another dimension, is unexplained. There is no established theory for how the structure of space and time can be the way it is; at present, Minkowski spacetime geometry must be assumed in order to be able to formulate a theory at all.

Spacetime interval

Special relativity implies that the spacetime interval is more fundamental than spatial distance, and that space cannot be thought of as an entity with an independent existence. Rather, physicists feel compelled to think of space as a sort of 3D silhouette of a more fundamental entity: the spacetime continuum, involving three spatial dimensions and one time dimension. Depending on how the spacetime is mapped, spatial distances come out differently, in a way that is reminiscent of projective geometry.

Equivalence of different coordinate mappings

Picture 4. Animation
Three spaceships and a procedure to maintain synchronized fleet time.
The three dark green circles represent a fleet of spaceships. As in the first animations, miniclocks are shuttling back and forth between the ships of the fleet, as part of a procedure to maintain synchronized fleet time. The trajectories of the green miniclocks correspond to the worldlines OB and OD in image 3.
The green fleet has a velocity with respect to the red fleet of 2/5th of the speed of light. The spacetime diagram in animation 4 represents how the motion of the green fleet is mapped in a coordinate system that is co-moving with the red fleet.
The metric of Minkowski spacetime describes how everything will proceed for the green fleet. The central ship sends the miniclocks in opposite directions, each miniclock with a relative velocity of 4/5th of the speed of light with respect to the fleet. For each leg of the procedure, the miniclocks count 3 units of proper time, and the fleet clocks count 5 units of proper time.
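These velocities are consistent with the standard relativistic velocity-addition formula w = (u + v)/(1 + uv), in units where c = 1; a minimal sketch:

    def add_velocities(u, v):
        # Relativistic velocity addition (units where c = 1).
        return (u + v) / (1.0 + u * v)

    # The green fleet moves at 2/5 c relative to the red fleet; the miniclocks
    # move at +4/5 c and -4/5 c relative to the green fleet.
    print(add_velocities(0.4, 0.8))   # 10/11 c, as mapped by the red fleet
    print(add_velocities(0.4, -0.8))  # -10/17 c, for the opposite direction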

Symmetry

Picture 5. Image
Image 5 shows spacetime diagrams that map both the procedure of the red fleet and the procedure of the green fleet. The diagram on the left shows a mapping of events in spacetime in a coordinate system that is co-moving with the green fleet, the diagram on the right shows a mapping of events in spacetime in a coordinate system that is co-moving with the red fleet.
In this particular case the synchronization procedure and its mapping in a spacetime diagram was introduced with the red fleet first, mapping the physics in a coordinate system that is co-moving with the red fleet. It could also have been introduced with the green fleet first, mapping the physics in a coordinate system that is co-moving with the green fleet. According to special relativity there is complete symmetry between the two coordinate representations.

Equivalence class of coordinate systems

Picture 6. Animation
Equivalence class of coordinate systems.
In animation 6 the complete symmetry illustrated with image 5 is represented as an animation. Here, the sequence of frames is a sequence of coordinate systems with a velocity relative to each other. Each frame represents the same physical series of events: the synchronization procedure as outlined above. All individual frames of the animation represent the physics taking place equally well. Together, the set of all frames in which the physics is represented equally well constitutes an equivalence class of coordinate systems.
In particular the spacetime interval is the same in all spacetime mappings. On the other hand, in each frame simultaneity comes out differently relative to other frames.

Relativity of simultaneity

A tacit assumption in classical mechanics is that simultaneity is independent of motion. That assumption does not carry over to special relativity: there is no inherent criterion for regarding two events as simultaneous or not simultaneous. This is called the relativity of simultaneity.
While there is no inherent criterion for assigning simultaneity, there is an economy criterion. If you take as definition of simultaneity the synchronization procedure above (synchronization that uses the symmetries of inertia as reference), then the physical laws, such as the equations of electromagnetism, take their simplest form.

Comparison: luminiferous ether and Minkowski spacetime

Diagrams 7 to 10 represent cases of a synchronization procedure being applied. The yellow lines represent the worldlines of pulses of light.
In the previous examples the synchronization procedure used miniclocks. The advantage is that at each encounter you can see how much proper time has elapsed for the miniclocks during their journey, enabling you to check the fidelity of the synchronization: at every encounter it is checked that the same amount of proper time has elapsed for each miniclock. When synchronizing with pulses of light that information isn't there: no comparison of transit time is possible.

Classical physics

Picture 7. Diagram
The synchronization procedure in classical spacetime, using pulses of light. The procedure takes 10 units of time to complete.
Picture 8. Diagram
The case of a fleet that has a velocity relative to the luminiferous ether: the procedure takes more than 10 units of time to complete.
Picture 9. Diagram
The synchronization procedure in Minkowski spacetime. The coordinate system is co-moving with the fleet of spaceships. The procedure takes 10 units of time to complete.
Picture 10. Diagram
The synchronization procedure mapped in a coordinate system that is moving relative to the fleet. After the procedure is completed 10 units of proper time have elapsed.
Diagrams 7 and 8 represent what you expect to happen when the light signals are supposed to propagate in a medium, usually referred to as the luminiferous ether. If a fleet of spaceships has a velocity with respect to the luminiferous ether, the overall path length of the light signals is longer, and the procedure will take more time than when the spaceships are stationary with respect to the ether.
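A minimal sketch of that classical expectation, assuming motion along the line connecting the ships and units in which c = 1:

    def round_trip_time(distance, v):
        # Classical out-and-back light travel time for a rig moving at velocity
        # v through the ether: the outbound pulse closes at speed 1 - v, the
        # return pulse at 1 + v (units where c = 1).
        return distance / (1.0 - v) + distance / (1.0 + v)

    print(round_trip_time(4, 0.0))  # 8.0 units: at rest in the ether
    print(round_trip_time(4, 0.4))  # ~9.52 units: moving at 2/5 c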

Minkowski spacetime

Diagrams 9 and 10 represent what you expect to happen when the environment is Minkowski spacetime. Every mapping of the procedure will indicate that 10 units of proper time will elapse. In other words, the synchronization procedure will not reveal anything about a velocity with respect to some background structure.

Einstein synchronization procedure

When pulses of light are used for synchronization of clocks, the procedure is referred to as the Einstein synchronization procedure. In 'On the electrodynamics of moving bodies', the 1905 article in which Einstein introduced special relativity, he presented that procedure as a definition of simultaneity.
The diagrams illustrate in which environment the Einstein synchronization procedure is applicable. In space and time as envisioned prior to relativistic physics, you expect the synchronization procedure to take more time when the senders/receivers have a velocity relative to the luminiferous ether. That difference will give rise to inconsistencies, making the procedure unfit. In Minkowski spacetime, however, the Einstein synchronization procedure is the appropriate setup.

Consequences for measurements of the speed of light

The only way to measure the speed of light is to set up a trip back and forth. If the environment is Newtonian space and time, then you expect to find a different value for the speed of light, depending on the velocity of the measurement rig relative to the luminiferous ether.
Diagrams 9 and 10 illustrate how it works out in Minkowski spacetime: different measurement setups, with a velocity relative to each other, will each find the same value for the speed of light. That means that the speed of light is an invariant.

Symmetric velocity time dilation

Picture 11. Image
The situation is symmetrical. The red fleet and the green fleet have a velocity relative to each other, so for each unit of red time less than one unit of green time elapses, and for each unit of green time, less than one unit of red time elapses.
At time t=0 the two central ships of both fleets pass each other. At t=0, let the red ship emit a signal with a particular frequency, as measured in red fleet time. The green ship receives that signal, and that signal will be shifted to a lower frequency, as measured by green fleet time.
Conversely: at t=0 let the green ship emit a signal with a particular frequency, as measured in green fleet time. The red ship receives that signal, and that signal will be shifted to a lower frequency, as measured by red fleet time.
This type of time dilation is called symmetric velocity time dilation.
An example of that is the trajectories of the time-disseminating shuttles in the animations. At all times the shuttles have a velocity relative to each other, so there is a corresponding symmetric velocity time dilation. When the shuttles rejoin, it is seen that among the shuttles there is no difference in the amount of elapsed proper time.
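The mutual shift described above follows the standard longitudinal relativistic Doppler factor; a minimal sketch, in units where c = 1:

    import math

    def doppler_factor(beta):
        # Ratio f_received / f_emitted for an emitter receding at velocity beta.
        return math.sqrt((1.0 - beta) / (1.0 + beta))

    # The fleets recede from each other at 2/5 c after t=0: each receives the
    # other's signal at ~0.65 of the emitted frequency. The shift is symmetric.
    print(doppler_factor(0.4))  # ~0.6547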

Nonsymmetric velocity time dilation

Picture 12. Animation
A straight worldline and a helical worldline.
The animation is a schematic representation of asymmetric velocity time dilation. It represents motion as mapped in a Minkowski spacetime diagram, with two dimensions of space (the horizontal plane) and position in time vertical. The circles represent clocks, counting lapse of proper time. The Minkowski coordinate system is co-moving with the non-accelerating clock.
The clock in circular motion counts a smaller amount of proper time than the non-accelerating clock. Here, the difference in the amount of proper time that elapses is in a ratio of 1:2, which corresponds to a transversal velocity of 0.866 times the speed of light.
Any light emitted by the non-accelerating clock and received by the circling clock is received as a blue-shifted signal, in a ratio of 1:2. Any light emitted by the circling clock and received by the non-accelerating clock is received as a red-shifted signal, in a ratio of 2:1.
In this situation, symmetry is broken, and there is a difference in amount of proper time that elapses.
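The stated numbers can be checked against the standard time dilation factor γ = 1/√(1 − v²), in units where c = 1:

    import math

    v = math.sqrt(3) / 2                  # transversal velocity, ~0.866 c
    gamma = 1.0 / math.sqrt(1.0 - v * v)  # time dilation factor
    print(gamma)                          # 2.0: elapsed proper time ratio 1:2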


3. Unification: motion of objects and propagation of light

The Principle of relativity of inertial motion

The synchronization procedure that the fleets use relies on the isotropy of inertia for objects in inertial motion. Both the red fleet and the green fleet are in inertial motion; for both fleets inertia is isotropic. Both fleets take care to transfer the same amount of kinetic energy to the "left" and the "right" miniclock. Since an identical amount of kinetic energy is transferred to each (and given that they have identical mass), their velocity relative to the spaceship they were expelled from will be identical.
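As a check on that reasoning, the relativistic relation KE = (γ − 1)mc² can be inverted for the speed; a minimal sketch, in units where c = 1 and with the miniclock mass as the unit of mass:

    import math

    def speed_from_kinetic_energy(ke, mass=1.0):
        # Invert KE = (gamma - 1) * m * c^2 for the speed (units where c = 1).
        gamma = 1.0 + ke / mass
        return math.sqrt(1.0 - 1.0 / gamma**2)

    # Imparting kinetic energy equal to 2/3 of the rest energy gives v = 4/5 c,
    # the miniclock speed used in the animations.
    print(speed_from_kinetic_energy(2.0 / 3.0))  # 0.8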
When Galilei formulated the principle of relativity of inertial motion, the obvious supposition was that velocity vectors add in the same way as vectors of Euclidean space. The assumption of Galilean relativity is an assumption (a most intuitive one) about the chrono-geometric structure of space and time. At first sight it appears that Galilean relativity is the only possible embodiment of the principle of relativity of inertial motion.
As discussed in the introduction: the symmetry demands that follow from the principle of relativity of inertial motion narrow down the transformations to the following:
  • Transformations between coordinate systems that are at an angle with respect to each other
  • Transformations between coordinate systems that are translated with respect to each other
  • Transformations between coordinate systems that have a uniform velocity relative to each other
The revolution of special relativity lies in the recognition that there is yet another chrono-geometric structure that embodies the above set of symmetries: Minkowski spacetime. (Palash B. Pal has written up some very neat derivations showing how both Galilean relativity and special relativity satisfy the principle of relativity of inertial motion: Nothing but relativity (PDF file, 64 KB).)

The limit of approaching the speed of light

In the animations the miniclocks have a velocity relative to the fleet of 4/5th of the speed of light. What happens at even faster velocities, approaching the speed of light? Subatomic particles such as protons and electrons can be accelerated to very close to the speed of light and expelled at that velocity. Then, both as mapped in a coordinate system that is co-moving with the red fleet and as mapped in a coordinate system that is co-moving with the green fleet, the particles move very close to the speed of light.

Light

Light itself is the extremal case. The physics of Minkowski spacetime is such that light always propagates away from any emitter with a velocity of c relative to that emitter, regardless of the velocity of that emitter relative to other emitters. Ultimately, synchronization with particles and synchronization with light rely on the same principle: the principle of inertia.

From Classical to Quantum Physics


In 1901, when the first Nobel Prizes were awarded, the classical areas of physics seemed to rest on a firm basis built by great 19th century physicists and chemists. Hamilton had formulated a very general description of the dynamics of rigid bodies as early as the 1830s. Carnot, Joule, Kelvin and Gibbs had developed thermodynamics to a high degree of perfection during the second half of the century.
Maxwell's famous equations had been accepted as a general description of electromagnetic phenomena and had been found to be also applicable to optical radiation and the radio waves recently discovered by Hertz.
Everything, including the wave phenomena, seemed to fit quite well into a picture built on mechanical motion of the constituents of matter manifesting itself in various macroscopic phenomena. Some observers in the late 19th century actually expressed the view that what remained for physicists to do was only to fill in minor gaps in this seemingly well-established body of knowledge.
However, it would very soon turn out that this satisfaction with the state of physics was built on false premises. The turn of the century became a period of observations of phenomena that were completely unknown up to then, and radically new ideas on the theoretical basis of physics were formulated. It must be regarded as a historical coincidence, probably never foreseen by Alfred Nobel himself, that the Nobel Prize institution happened to be created just in time to enable the prizes to cover many of the outstanding contributions that opened new areas of physics in this period.
One of the unexpected phenomena during the last few years of the 19th century was the discovery of X-rays by Wilhelm Conrad Röntgen in 1895, which was awarded the first Nobel Prize in Physics (1901). Another was the discovery of radioactivity by Antoine Henri Becquerel in 1896, and the continued study of the nature of this radiation by Marie and Pierre Curie. The origin of the X-rays was not immediately understood at the time, but it was realized that they indicated the existence of a hitherto concealed world of phenomena (although their practical usefulness for medical diagnosis was evident enough from the beginning). The work on radioactivity by Becquerel and the Curies was rewarded in 1903 (with one half to Becquerel and the other half shared by the Curies), and in combination with the additional work by Ernest Rutherford (who got the Chemistry Prize in 1908) it was understood that atoms, previously considered as more or less structureless objects, actually contained a very small but compact nucleus. Some atomic nuclei were found to be unstable and could emit the α, β or γ radiation observed. This was a revolutionary insight at the time, and it led in the end, through parallel work in other areas of physics, to the creation of the first useful picture of the structure of atoms.
In 1897, Joseph J. Thomson, who worked with rays emanating from the cathode in partly evacuated discharge tubes, identified the carriers of electric charge. He showed that these rays consisted of discrete particles, later called "electrons". He measured a value for the ratio between their mass and (negative) charge, and found that it was only a very small fraction of that expected for singly charged atoms. It was soon realized that these lightweight particles must be the building blocks that, together with the positively charged nuclei, make up all different kinds of atoms. Thomson received his Prize in 1906. By then, Philipp E.A. von Lenard had already been acknowledged the year before (1905) for elucidating other interesting properties of the cathode rays, such as their ability to penetrate thin metal foils and produce fluorescence. Soon thereafter (in 1912) Robert A. Millikan made the first precision measurement of the electron charge with the oil-drop method, which led to a Physics Prize for him in 1923. Millikan was also rewarded for his works on the photoelectric effect.
In the beginning of the century, Maxwell's equations had already existed for several decades, but many questions remained unanswered: what kind of medium propagated electromagnetic radiation (including light) and what carriers of electric charges were responsible for light emission? Albert A. Michelson had developed an interferometric method, by which distances between objects could be measured as a number of wavelengths of light (or fractions thereof). This made comparison of lengths much more exact than what had been possible before. Many years later, the Bureau International des Poids et Mesures, Paris (BIPM) defined the meter unit in terms of the number of wavelengths of a particular radiation instead of the meter prototype. Using such an interferometer, Michelson had also performed a famous experiment, together with E. W. Morley, from which it could be concluded that the velocity of light is independent of the relative motion of the light source and the observer. This fact refuted the earlier assumption of an ether as a medium for light propagation. Michelson received the Physics Prize in 1907.
The mechanisms for emission of light by carriers of electric charge were studied by Hendrik A. Lorentz, who was one of the first to apply Maxwell's equations to electric charges in matter. His theory could also be applied to the radiation caused by vibrations in atoms, and it was in this context that it could be put to its first crucial test. As early as 1896 Pieter Zeeman, who was looking for possible effects of electric and magnetic fields on light, made an important discovery, namely that spectral lines from sodium in a flame were split up into several components when a strong magnetic field was applied. This phenomenon could be given a quite detailed interpretation by Lorentz's theory, as applied to vibrations of the recently identified electrons, and Lorentz and Zeeman shared the Physics Prize in 1902, i.e. even before Thomson's discovery was rewarded. Later, Johannes Stark demonstrated the direct effect of electric fields on the emission of light, by exposing beams of atoms ("anodic rays", consisting of atoms or molecules) to strong electric fields. He observed a complicated splitting of spectral lines as well as a Doppler shift depending on the velocities of the emitters. Stark received the 1919 Physics Prize.
With this background, it became possible to build detailed models for the atoms, objects that had existed as concepts ever since antiquity but were considered more or less structureless in classical physics. There existed already, since the middle of the previous century, a rich empirical material in the form of characteristic spectral lines emitted in the visible domain by different kinds of atoms, and to this was added the characteristic X-ray radiation discovered by Charles G. Barkla (Physics Prize in 1917, awarded in 1918), which after the clarification of the wave nature of this radiation and its diffraction by Max von Laue (Physics Prize in 1914), also became an important source of information on the internal structure of atoms.
Barkla's characteristic X-rays were secondary rays, specific for each element exposed to radiation from X-ray tubes (but independent of the chemical form of the samples). Karl Manne G. Siegbahn realized that measuring characteristic X-ray spectra of all the elements would show systematically how successive electron shells are added when going from the light elements to the heavier ones. He designed highly accurate spectrometers for this purpose by which energy differences between different shells, as well as rules for radiative transitions between them, could be established. He received the Physics Prize in 1924 (awarded in 1925). However, it would turn out that a deeper understanding of the atomic structure required a much further departure from the habitual concepts of classical physics than anyone could have imagined.
Classical physics assumes continuity in motion as well as in the gain or loss of energy. Why then, do atoms send out radiations with sharp wavelengths? Here, a parallel line of development, also with its roots in late 19th century physics, had given important clues for interpretation. Wilhelm Wien studied the "black-body" radiation from hot solid bodies (which in contrast to radiation from atoms in gases, has a continuous distribution of frequencies). Using classical electrodynamics, he derived an expression for the frequency distribution of this radiation and the shift of the maximum intensity wavelength, when the temperature of a black body is changed (the Wien displacement law, useful for instance in determining the temperature of the sun). He was awarded the Physics Prize in 1911.
However, Wien could not derive a distribution formula that agreed with experiments for both short and long wavelengths. The problem remained unexplained until Max K.E.L. Planck put forward his radically new idea that the radiated energy could only be emitted in quanta, i.e. portions that had a certain definite value, larger for the short wavelengths than for the long ones (equal to a constant h times the frequency for each quantum). This is considered to be the birth of quantum physics. Planck received the Physics Prize in 1918 (awarded in 1919). Important verification that light comes in the form of energy quanta came also through Albert Einstein's interpretation of the photoelectric effect (first observed in 1887 by Hertz), which also involved extensions of Planck's theories. Einstein received the Physics Prize for 1921 (awarded in 1922). The prize motivation also cited his other "services to theoretical physics," which will be referred to in another context.
Later experiments by James Franck and Gustav L. Hertz demonstrated the inverse of the photoelectric effect (i.e. that an electron that strikes an atom must have a specific minimum energy to produce light quanta of a particular energy from it) and showed the general validity of Planck's expressions involving the constant h. Franck and Hertz shared the 1925 prize, awarded in 1926. At about the same time, Arthur H. Compton (who received one-half of the Physics Prize for 1927) studied the energy loss in X-ray photon scattering on material particles, and showed that X-ray quanta, whose energies are more than 10,000 times larger than those of light, also obey the same quantum rules. The other half was given to Charles T.R. Wilson (see later), whose device for observing high energy scattering events could be used for verification of Compton's predictions.
With the concept of energy quantization as a background, the stage was set for further ventures into the unknown world of microphysics. Like some other well-known physicists before him, Niels H. D. Bohr worked with a planetary picture of electrons circulating around the nucleus of an atom. He found that the sharp spectral lines emitted by the atoms could only be explained if the electrons were circulating in stationary orbits characterized by a quantized angular momentum (integer units of Planck's constant h divided by 2π) and that the emitted frequencies ν corresponded to emission of radiation with energy hν equal to the difference between quantized energy states of the electrons. His suggestion indicated a still more radical departure from classical physics than Planck's hypothesis. Although it could only explain some of the simplest features of optical spectra in its original form, it was soon accepted that Bohr's approach must be a correct starting point, and he received the Physics Prize in 1922.
It turned out that a deeper discussion of the properties of radiation and matter (until then considered as forming two completely different categories) was necessary for further progress in the theoretical description of the microworld. In 1923 Prince Louis-Victor P. R. de Broglie proposed that material particles may also show wave properties, now that electromagnetic radiation had been shown to display particle aspects in the form of photons. He developed mathematical expressions for this dualistic behavior, including what has later been called the "de Broglie wavelength" of a moving particle. Early experiments by Clinton J. Davisson had indicated that electrons could actually show reflection effects similar to those of waves hitting a crystal, and these experiments were now repeated, verifying the associated wavelength predicted by de Broglie. Somewhat later, George P. Thomson (son of J. J. Thomson) made much improved experiments on higher energy electrons penetrating thin metal foils, which showed very clear diffraction effects. De Broglie was rewarded for his theories in 1929, and Davisson and Thomson later shared the 1937 Physics Prize.
What remained was the formulation of a new, consistent theory that would replace classical mechanics, valid for atomic phenomena and their associated radiations. The years 1924-1926 were a period of intense development in this area. Erwin Schrödinger built further on the ideas of de Broglie and wrote a fundamental paper on "Quantization as an eigenvalue problem" early in 1926. He created what has been called "wave mechanics". But the year before that, Werner K. Heisenberg had already started on a mathematically different approach, called "matrix mechanics", by which he arrived at equivalent results (as was later shown by Schrödinger). Schrödinger's and Heisenberg's new quantum mechanics meant a fundamental departure from the intuitive picture of classical orbits for atomic objects, and implied also that there are natural limitations on the accuracy by which certain quantities can be measured simultaneously (Heisenberg's uncertainty relations).
Heisenberg was rewarded with the Physics Prize for 1932 (awarded 1933) for the development of quantum mechanics, while Schrödinger shared the Prize one year later (1933) with Paul A.M. Dirac. Schrödinger's and Heisenberg's quantum mechanics was valid for the relatively low velocities and energies associated with the "orbital" motion of valence electrons in atoms, but their equations did not satisfy the requirements set by Einstein's rules for fast moving particles (to be mentioned later). Dirac constructed a modified formalism which took into account effects of Einstein's special relativity, and showed that such a theory not only contained terms corresponding to the intrinsic spinning of electrons (and therefore explaining their own intrinsic magnetic moment and the fine structure observed in atomic spectra), but also predicted the existence of a completely new kind of particles, the so-called antiparticles, with identical masses but opposite charge. The first antiparticle to be discovered, that of the electron, was observed in 1932 by Carl D. Anderson and was given the name "positron" (one-half of the Physics Prize for 1936).
Other important contributions to the development of quantum theory have been distinguished by Nobel Prizes in later years. Max Born, Heisenberg's supervisor in the early twenties, made important contributions to its mathematical formulation and physical interpretation. He received one-half of the Physics Prize for 1954 for his work on the statistical interpretation of the wave function. Wolfgang Pauli formulated his exclusion principle (which states that there can be only one electron in each quantum state) already on the basis of Bohr's old quantum theory. This principle was later found to be associated with the symmetry of wave functions for particles of half-integer spins in general, distinguishing what is now called fermions from the bosonic particles whose spins are integer multiples of h/2π. The exclusion principle has deep consequences in many areas of physics, and Pauli received the Nobel Prize in Physics in 1945.
The study of electron spins would continue to open up new horizons in physics. Precision methods for determining the magnetic moments of spinning particles were developed during the thirties and forties for atoms as well as nuclei (by Stern, Rabi, Bloch and Purcell, see later sections), and in 1947 they had reached such a precision that Polykarp Kusch could state that the magnetic moment of an electron did not have exactly the value predicted by Dirac, but differed from it by a small amount. At about the same time, Willis E. Lamb worked on a similar problem of electron spins interacting with electromagnetic fields, by studying the fine structure of optical radiation from hydrogen with very high resolution radio frequency resonance methods. He found that the fine structure splitting also did not have exactly the Dirac value, but differed from it by a significant amount. These results stimulated a reconsideration of the basic concepts behind the application of quantum theory to electromagnetism, a field that had been started by Dirac, Heisenberg and Pauli but still suffered from several insufficiencies. Kusch and Lamb were each awarded half of the Physics Prize in 1955.
In quantum electrodynamics (QED for short), charged particles interact through the interchange of virtual photons, as described by quantum perturbation theory. The older versions involved only single photon exchange, but Sin-Itiro Tomonaga, Julian Schwinger and Richard P. Feynman realized that the situation is actually much more complicated, since electron-electron scattering may involve several photon exchanges. A "naked" point charge does not exist in their picture; it always produces a cloud of virtual particle-antiparticle pairs around itself, such that its effective magnetic moment is changed and the Coulomb potential is modified at short distances. Calculations starting from this picture have reproduced the experimental data by Kusch and Lamb to an astonishing degree of accuracy and modern QED is now considered to be the most exact theory in existence. Tomonaga, Schwinger and Feynman shared the Physics Prize in 1965.
This progress in QED turned out to be of the greatest importance also for the description of phenomena at higher energies. The notion of pair production from a "vacuum" state of a quantized field (both as a virtual process and as a real materialization of particles), is also a central building block in the modern field theory of strong interactions, quantum chromodynamics (QCD).
Another basic aspect of quantum mechanics and quantum field theory is the symmetries of wave functions and fields. The symmetry properties under exchange of identical particles lie behind Pauli's exclusion principle mentioned above, but symmetries with respect to spatial transformations have turned out to play an equally important role. In 1956, Tsung-Dao Lee and Chen Ning Yang pointed out that physical interactions may not always be symmetric with respect to reflection in a mirror (that is, they may be different as seen in a left-handed and a right-handed coordinate system). This means that the wave function property called "parity", denoted by the symbol "P", is not conserved when the system is exposed to such an interaction, and the mirror reflection property may be changed. Lee's and Yang's work was the starting point for an intense search for such effects, and it was shown soon afterwards that beta decay and pi decay, which are both caused by the so-called "weak interaction", are not parity-conserving (see more below). Lee and Yang were jointly awarded the Physics Prize in 1957.
Other symmetries in quantum mechanics are connected with the replacement of a particle with its antiparticle, called charge conjugation (symbolized by "C"). In the situations discussed by Lee and Yang it was found that although parity was not conserved in the radioactive transformations there was still a symmetry in the sense that particles and antiparticles broke parity in exactly opposite ways and that therefore the combined operation "C"x"P" still gave results which preserved symmetry. But it did not last long before James W. Cronin and Val L. Fitch found a decay mode among the "K mesons" that violated even this principle, although only to a small extent. Cronin and Fitch made their discovery in 1964 and were jointly awarded the Physics Prize in 1980. The consequences of their result (which include questions about the symmetry of natural processes under reversal of time, called "T") are still discussed today and touch some of the deepest foundations of theoretical physics, because the "P"x"C"x"T" symmetry is expected always to hold.
The electromagnetic field is known to have another property, called "gauge symmetry", which means that the field equations keep their form even if the electromagnetic potentials are multiplied with certain quantum mechanical phase factors, or "gauges". It was not self-evident that the "weak" interaction should have this property, but it was a guiding principle in the work by Sheldon L. Glashow, Abdus Salam, and Steven Weinberg in the late 1960s, when they formulated a theory that described the weak and the electromagnetic interaction on the same basis. They were jointly awarded the Physics Prize in 1979 for this unified description and, in particular, for their prediction of a particular kind of weak interaction mediated by "neutral currents", which had been found recently in experiments.
The last Physics Prize (1999) in the 20th century was jointly awarded to Gerardus 't Hooft and Martinus J. G. Veltman. They showed the way to renormalize the "electro-weak" theory, which was necessary to remove terms that tended to infinity in quantum mechanical calculations (just as QED had earlier solved a similar problem for the Coulomb interaction). Their work allowed detailed calculations of weak interaction contributions to particle interactions in general, proving the utility of theories based on gauge invariance for all kinds of basic physical interactions.
Quantum mechanics and its extensions to quantum field theories are among the great achievements of the 20th century. This sketch of the route from classical physics to modern quantum physics has led us a long way toward a fundamental and unified description of the different particles and forces in nature, but much remains to be done and the goal is still far ahead. It still remains, for instance, to "unify" the electro-weak force with the "strong" nuclear force and with gravity. But here, it should also be pointed out that the quantum description of the microworld has another main application: the calculation of chemical properties of molecular systems (sometimes extended to biomolecules) and of the structure of condensed matter, branches that have been distinguished with several prizes, in physics as well as in chemistry.

Microcosmos and Macrocosmos

"From Classical to Quantum Physics", took us on a trip from the phenomena of the macroscopic world as we meet it in our daily experience, to the quantum world of atoms, electrons and nuclei. With the atoms as starting point, the further penetration into the subatomic microworld and its smallest known constituents will now be illustrated by the works of other Nobel Laureates.
It was realized, already in the first half of the 20th century, that such a further journey into the microcosmos of new particles and interactions would also be necessary for understanding the composition and evolution histories of the very large structures of our universe, the "macrocosmos". At the present stage elementary particle physics, astrophysics, and cosmology are strongly tied together, as several examples presented here will show.
Another link connecting the smallest and the largest objects in our universe is Albert Einstein's theories of relativity. Einstein first developed his special theory of relativity in 1905, which expresses the mass-energy relationship E = mc². Then, in the next decade, he continued with his theory of general relativity, which connects gravitational forces to the structure of space and time. Calculations of effective masses for high energy particles, energy transformations in radioactive decay as well as Dirac's predictions that antiparticles may exist, are all based on his special theory of relativity. The general theory is the basis for calculations of large scale motions in the universe, including discussions of the properties of black holes. Einstein received the Nobel Prize in Physics in 1921 (awarded in 1922), motivated by work on the photoelectric effect which demonstrated the particle aspects of light.
The works by Becquerel, the Curies, and Rutherford gave rise to new questions: what was the source of energy in the radioactive nuclei that could sustain the emission of α, β and γ radiation over very long time intervals, as observed for some of them, and what were the heavy alpha particles and the nuclei themselves actually composed of? The first of these problems (which seemed to violate the law of conservation of energy, one of the most important principles of physics) found its solution in the transmutation theory, formulated by Rutherford and Frederick Soddy (Chemistry Prize for 1921, awarded in 1922). They followed in detail several different series of radioactive decay and compared the energy emitted with the mass differences between "parent" and "daughter" nuclei. It was also found that nuclei belonging to the same chemical element could have different masses; such different species were called "isotopes". A Chemistry Prize was given in 1922 to Francis W. Aston for his mass-spectroscopic separation of a large number of isotopes of non-radioactive elements. Marie Curie had by then already received a second Nobel Prize (this time in Chemistry in 1911), for her discoveries of the chemical elements radium and polonium.
All isotopic masses were found to be nearly equal to multiples of the mass of the proton, a particle also first seen by Rutherford when he irradiated nitrogen nuclei with alpha particles. But the different isotopes could not possibly be made up entirely of protons since each particular chemical element must have one single value for the total nuclear charge. Protons were actually found to make up less than half of the nuclear mass, which meant that some neutral constituents had to be present in the nuclei. James Chadwick first found conclusive evidence for such particles, the neutrons, when he studied nuclear reactions in 1932. He received the Physics Prize in 1935.
Soon after Chadwick's discovery, neutrons were put to work by Enrico Fermi and others as a means to induce nuclear reactions that could produce new "artificial" radioactivity. Fermi found that the probability of neutron-induced reactions (which do not involve element transformations) increased when the neutrons were slowed down, and that this worked equally well for heavy elements as for light ones, in contrast to charged-particle induced reactions. He received the Physics Prize in 1938.
With neutrons and protons as the basic building blocks of atomic nuclei, the branch of "nuclear physics" could be established, and several of its major achievements were distinguished by Nobel Prizes. Ernest O. Lawrence, who received the Physics Prize in 1939, built the first cyclotron, in which acceleration took place by successively adding small amounts of energy to particles circulating in a magnetic field. With these machines, he was able to accelerate charged nuclear particles to such high energies that they could induce nuclear reactions, and he obtained important new results. Sir John D. Cockcroft and Ernest T.S. Walton, instead, accelerated particles by direct application of very high electrostatic voltages and were rewarded for their studies of transmutation of elements in 1951.
Otto Stern received the Physics Prize in 1943 (awarded in 1944), for his experimental methods of studying magnetic properties of nuclei, in particular for measuring the magnetic moment of the proton itself. Isidor I. Rabi increased the accuracy of magnetic moment determinations for nuclei by more than two orders of magnitude, with his radio frequency resonance technique, for which he was awarded the Physics Prize for 1944. Magnetic properties of nuclei provide important information for understanding details in the build-up of the nuclei from protons and neutrons. Later, in the second half of the century, several theoreticians were rewarded for their work on the theoretical modelling of this complex many-body system: Eugene P. Wigner (one-half of the prize), Maria Goeppert-Mayer (one-fourth) and J. Hans D. Jensen (one-fourth) in 1963 and Aage N. Bohr, Ben R. Mottelson and L. James Rainwater in 1975. We will come back to these works under the heading "From Simple to Complex Systems".
As early as 1912, it was found by Victor F. Hess (awarded half of the 1936 Physics Prize, with the other half going to Carl D. Anderson) that highly penetrating radiation is also reaching us continuously from outer space. This "cosmic radiation" was first detected by ionization chambers and later by Wilson's cloud chamber referred to earlier. Properties of particles in the cosmic radiation could be inferred from the curved particle tracks produced when a strong magnetic field was applied. It was in this way that C. D. Anderson discovered the positron. Anderson and Patrick M.S. Blackett showed that electron-positron pairs could be produced by γ rays (which requires a photon energy of at least 2mₑc²) and that electrons and positrons could annihilate, producing γ rays as they disappeared. Blackett received the Physics Prize in 1948 for his further development of the cloud chamber and the discoveries made with it.
Although accelerators were further developed, cosmic radiation continued for a couple of decades to be the main source of very energetic particles (and still surpasses the most powerful accelerators on earth in this respect, although with extremely low intensities), and it provided the first glimpses of a completely unknown subnuclear world. A new kind of particle, called mesons, was spotted in 1937, having masses approximately 200 times that of electrons (but about 10 times lighter than protons). In 1947, Cecil F. Powell clarified the situation by showing that there was actually more than one kind of such particle present. One of them, the "π meson", decays into the other one, the "µ meson". Powell was awarded the Physics Prize in 1950.
By that time, theoreticians had already been speculating about the forces that keep protons and neutrons together in nuclei. Hideki Yukawa suggested in 1935 that this "strong" force should be carried by an exchange particle, just as the electromagnetic force was assumed to be carried by an exchange of virtual photons in the new quantum field theory. Yukawa maintained that such a particle must have a mass of about 200 electron masses in order to explain the short range of the strong forces found in experiments. Powell's π meson was found to have the right properties to act as a "Yukawa particle". The µ particle, on the other hand, turned out to have a completely different character (and its name was later changed from "µ meson" to "muon"). Yukawa received the Physics Prize in 1949. Although later progress has shown that the strong force mechanism is more complex than Yukawa pictured it to be, he must still be considered the first to lead thinking on force carriers in this fruitful direction.
More new particles were discovered in the 1950s, in cosmic radiation as well as in collisions with accelerated particles. By the end of the 50s, accelerators could reach energies of several GeV (109 electron volts) which meant that pairs of particles, with masses equal to the proton mass, could be created by energy-to-mass conversion. This was the method used by the team of Owen Chamberlain and Emilio Segrè when they first identified and studied the antiproton in 1955 (they shared the Physics Prize for 1959). High energy accelerators also allowed more detailed studies of the structures of protons and neutrons than before, and Robert Hofstadter was able to distinguish details of the electromagnetic structure of the nucleons by observing how they scattered electrons of very high energy. He was rewarded with half the Physics Prize for 1961.
One after another, new mesons with their respective antiparticles appeared, as tracks on photographic plates or in electronic particle detectors. The existence of the "neutrino", predicted on theoretical grounds by Pauli as early as the 1930s, was established. The first direct experimental evidence for the neutrino was provided by C. L. Cowan and Frederick Reines in 1957, but it was not until 1995 that this discovery was awarded one-half of the Nobel Prize (Cowan had died in 1974). The neutrino is a participant in processes involving the "weak" interaction (such as beta decay and π meson decay to muons) and, as the intensity of particle beams increased, it became possible to produce secondary beams of neutrinos from accelerators. Leon M. Lederman, Melvin Schwartz and Jack Steinberger developed this method in the 1960s and demonstrated that the neutrinos accompanying µ emission in π decay were not identical to those associated with electrons in beta decay; they were two different particles, νµ and νe.
Physicists could now start to distinguish some order among the particles: the electron (e), the muon (µ), the electron neutrino (νe), the muon neutrino (νµ) and their antiparticles were found to belong to one class, called "leptons". They did not interact by the "strong" nuclear force, which, on the other hand, characterized the protons, neutrons, mesons and hyperons (a set of particles heavier than the protons). The lepton class was extended later in the 1970s when Martin L. Perl and his team discovered the tau lepton, a heavier relative of the electron and the muon. Perl shared the Physics Prize in 1995 with Reines.
All the leptons are still considered to be truly fundamental, i.e. point-like and without internal structure, but for the protons, etc, this is no longer true. Murray Gell-Mann and others managed to classify the strongly interacting particles (called "hadrons") into groups with common relationships and ways of interaction. Gell-Mann received the Physics Prize in 1969. His systematics was based on the assumption that they were all built up from more elementary constituents, called "quarks". The real proof that nucleons were built up from quark-like objects came through the works of Jerome I. Friedman, Henry W. Kendall and Richard E. Taylor. They "saw" hard grains inside these objects when they studied how electrons (of still higher energy than Hofstadter could use earlier) scattered inelastically on them. They shared the Physics Prize in 1990.
It was understood that all strongly interacting particles are built up from quarks. In the middle of the 1970s a very short-lived particle, discovered independently by the teams of Burton Richter and Samuel C.C. Ting, was found to contain a so far unknown type of quark, which was given the name "charm". This quark was a missing link in the systematics of the elementary particles, and Richter and Ting shared the Physics Prize in 1976. The present standard model of particle physics sorts the particles into three families, with two quarks (and their antiparticles) and two leptons in each: the "up" and "down" quarks, the electron and the electron neutrino in the first; the "strange" and the "charm" quark, the muon and the muon neutrino in the second; the "top" and the "bottom" quark, the tauon and the tau neutrino in the third. The force carriers for the combined electro-weak interaction are the photon, the Z-particle and the W-bosons, and for the strong interaction between quarks the so-called gluons.
In 1983, the existence of the W- and Z-particles was proven by Carlo Rubbia's team which used a new proton-antiproton collider with sufficient energy for production of these very heavy particles. Rubbia shared the 1984 Physics Prize with Simon van der Meer, who had made decisive contributions to the construction of this collider by his invention of "stochastic cooling" of particles. There are speculations that additional particles may be produced at energies higher than those attainable with the present accelerators, but no experimental evidence has been produced so far.
Cosmology is the science that deals with the structure and evolution of our universe and the large-scale objects in it. Its models are based on the properties of the known fundamental particles and their interactions, as well as the properties of space-time and gravitation. The "big bang" model describes a possible scenario for the early evolution of the universe. One of its predictions was experimentally verified when Arno A. Penzias and Robert W. Wilson discovered the cosmic microwave background radiation in 1964; they shared one-half of the Physics Prize for 1978. This radiation is an afterglow of the violent processes assumed to have occurred in the early stages of the big bang. Its equilibrium temperature is about 3 kelvin (more precisely, about 2.7 K) at the present age of the universe. It is almost uniform when observed in different directions; the small deviations from isotropy are now being investigated and will tell us more about the earliest history of our universe.
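A quick worked example shows why a roughly 3 K afterglow appears specifically in the microwave band (the constants below are standard values, not figures from the discovery itself). Wien's displacement law gives the wavelength at which blackbody emission peaks:

\[ \lambda_{\max} = \frac{b}{T} \approx \frac{2.898\times 10^{-3}\ \mathrm{m\,K}}{2.7\ \mathrm{K}} \approx 1.1\ \mathrm{mm} \]

about one millimetre, squarely in the microwave region, which is why the radiation went unnoticed until sensitive radio receivers happened upon it.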
Outer space has been likened to a large arena for particle interactions, where extreme conditions not attainable in a laboratory arise spontaneously. Particles may be accelerated to higher energies than in any accelerator on Earth, nuclear fusion reactions proceed in the interiors of stars, and gravitation can compress particle systems to extremely high densities. Hans A. Bethe first described the hydrogen and carbon cycles, in which energy is liberated in stars by the fusion of protons into helium nuclei. For this achievement he received the Physics Prize in 1967.
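The energy bookkeeping behind these cycles can be illustrated with a simple mass-defect estimate (standard atomic masses are used here for illustration; this calculation is not part of Bethe's original account). Four hydrogen atoms have a combined mass of about 4.0313 u, while a helium-4 atom has only 4.0026 u; the missing 0.0287 u is released as energy:

\[ \Delta E \approx 0.0287\ \mathrm{u} \times 931.5\ \mathrm{MeV/u} \approx 26.7\ \mathrm{MeV} \]

per helium nucleus formed, about 0.7% of the original rest mass. Repeated some 10³⁸ times per second, this is what keeps the Sun shining.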
Subrahmanyan Chandrasekhar described theoretically the evolution of stars, in particular those ending up as "white dwarfs." Under certain conditions the end product may instead be a "neutron star," an extremely compact object in which all protons have been converted into neutrons. In supernova explosions, the heavy elements created during stellar evolution are spread out into space. The details of some of the most important nuclear reactions in stars, and of heavy-element formation, were elucidated by William A. Fowler both in theory and in experiments using accelerators. Fowler and Chandrasekhar received one-half each of the 1983 Physics Prize.
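The "certain conditions" above hinge on the star's mass. Chandrasekhar's key result, quoted here in its commonly cited form for illustration, is that the electron degeneracy pressure supporting a white dwarf fails above a maximum mass:

\[ M_{\mathrm{Ch}} \approx \frac{5.83}{\mu_e^2}\, M_\odot \approx 1.4\, M_\odot \quad (\mu_e \approx 2) \]

where μe is the mean molecular weight per electron, about 2 for a carbon-oxygen white dwarf. A stellar core heavier than this limit cannot settle down as a white dwarf; it collapses further, toward a neutron star or beyond.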
Visible light and the cosmic background radiation are not the only forms of electromagnetic waves that reach us from outer space. At longer wavelengths, radio astronomy provides information on astronomical objects that cannot be obtained by optical spectroscopy. Sir Martin Ryle developed aperture synthesis, the method in which signals from several separated telescopes are combined in order to increase the resolution of radio maps of the sky. Antony Hewish and his group made an unexpected discovery in 1967 using Ryle's telescopes: radio-frequency pulses were being emitted with very well-defined repetition rates by some unknown objects, soon named pulsars. These were identified as neutron stars, acting like fast-rotating lighthouses that emit radio waves because they are also strong magnets. Ryle and Hewish shared the Physics Prize in 1974.
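The gain from combining separated telescopes can be estimated with the usual diffraction formula, with the longest baseline D between dishes playing the role of a single dish's diameter (the numbers below are illustrative and not taken from Ryle's actual instruments). Observing the 21 cm hydrogen line over a 5 km baseline gives

\[ \theta \approx \frac{\lambda}{D} = \frac{0.21\ \mathrm{m}}{5000\ \mathrm{m}} \approx 4.2\times 10^{-5}\ \mathrm{rad} \approx 9\ \mathrm{arcseconds}, \]

a resolution that would otherwise require one continuous dish five kilometres across.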
By 1974, pulsar searches were already routine among radio astronomers, but a new surprise came in the summer of that year when Russell A. Hulse and Joseph H. Taylor, Jr. noticed periodic modulations in the pulse frequencies of a newly discovered pulsar, called PSR 1913+16. It was the first binary pulsar detected, so named because the emitting neutron star turned out to be one component of a close double star system, with a companion of about equal mass. Observed over more than 20 years, this system has provided the first concrete evidence for gravitational radiation: the slow decrease of its orbital period is in close agreement with the predictions of Einstein's general theory of relativity for the energy lost to this kind of radiation. Hulse and Taylor shared the Physics Prize in 1993. However, a direct detection of gravitational radiation on Earth has still to be made.
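For a sense of the precision involved (the figures are the commonly quoted timing results, rounded here): the two neutron stars orbit each other in about 7.75 hours, and the orbital period shrinks by roughly

\[ \dot{P}_b \approx -76\ \mu\mathrm{s\ per\ year}, \]

a few parts in 10¹² per second, in agreement with the general-relativistic prediction for gravitational-wave emission to better than one percent.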