Chapter 23 - The Theory of Special Relativity
23.1 Overview
In this chapter, we introduce the theory of Special Relativity, originally formulated by Albert Einstein in 1905. Along with the development of Quantum Mechanics, Special Relativity marks the start of “modern physics”, and the introduction of theories to describe our world that are decidedly counter-intuitive.
23.2 Introduction: The issue with Maxwell’s equations
In Chapter 22.1, we summarized our knowledge of electromagnetism using Maxwell’s four equations. As far as we can tell, this is the best description that we have of classical electric and magnetic phenomena (classical in the sense that the equations do not describe the behaviour of particles that are described by Quantum Mechanics). One of the consequences of Maxwell’s equations is that they describe the existence of electromagnetic waves that propagate with a speed, $c$, given by:
$$c = \frac{1}{\sqrt{\epsilon_0\mu_0}}$$
where $\epsilon_0$ and $\mu_0$ are the permittivity and permeability of free space, respectively. We are familiar with waves that propagate through specific media. For example, ocean waves move through water and sound waves through air. The obvious question to ask about these electromagnetic waves is: “In what medium do these waves propagate?”. In the late 1800s, it was thought that the Universe was bathed in a substance called the “luminous ether” (or just “ether”), through which electromagnetic waves propagate. It was then thought that the speed, $c$, of these waves was, naturally, measured with respect to the ether. This led to the idea that there exists a special inertial frame of reference in the Universe, corresponding to that frame of reference in which light travels at a speed, $c$. This frame of reference would be at rest relative to the ether.
In the late 1880s, Michelson and Morley developed a clever experiment to measure the speed of the Earth relative to the ether. If the ether exists, and the Earth is moving through it, then a beam of light travelling parallel to the motion of the Earth should travel at a slightly different speed than a beam of light travelling in the perpendicular direction. However, Michelson and Morley conclusively demonstrated that this was not the case. There is no detectable motion of the Earth through a medium in which light (an electromagnetic wave) propagates. There is no ether. This was a very puzzling discovery, with strange implications for Maxwell’s equations.
Let us demonstrate, through a simple example, an “issue” with the theory of electromagnetism. Rather, it is not an issue, but a very strange implication. Consider two infinitely-long wires, separated by a distance, $h$, each carrying a uniform charge per unit length, $\lambda$, as illustrated in Figure 1.
We can easily calculate the magnitude of the repulsive electric force, $F^{E}$, exerted by one charged wire on a section of length $l$ of the other wire. The magnitude of the electric field at a distance, $h$, from an infinitely-long wire with charge per unit length $\lambda$ is given by:
$$E = \frac{\lambda}{2\pi\epsilon_0 h}$$
A section of length $l$ of the other wire carries charge, $Q = \lambda l$, so that the force on that section of wire has a magnitude:
$$F^{E} = QE = \frac{\lambda^2 l}{2\pi\epsilon_0 h}$$
And the force per unit length, on either one of the wires, has a magnitude:
$$\frac{F^{E}}{l} = \frac{\lambda^2}{2\pi\epsilon_0 h}$$
This is the only force exerted on one of the wires, and will thus allow us to completely specify the motion of that wire (we know all of the forces exerted on the wire, so we can use Newton’s Second Law to determine its acceleration and describe its motion).
Consider the same two wires, each carrying a charge per unit length, as viewed from a frame of reference that is moving downwards (parallel to the wires), with a speed, $v$. In this frame of reference, the infinite wires still have a net charge per unit length, but they also appear to carry an upwards current, $I'$, since we observe positive charges moving upwards through space.
In this new frame of reference, we see two wires with charges on them, moving upwards with speed, $v$. In an interval of time, $\Delta t$, we see a length of wire, $v\Delta t$, go by, with total charge, $\Delta Q = \lambda' v \Delta t$. For reasons that will be clear below, we use a different charge density, $\lambda'$, in the moving frame of reference, although we expect that $\lambda' = \lambda$. This corresponds to a current, $I'$, given by:
$$I' = \frac{\Delta Q}{\Delta t} = \lambda' v$$
Thus, in the downward-moving frame of reference, we see two wires, a distance $h$ apart, each carrying an upwards current, and these wires must exert an attractive magnetic force on each other, with magnitude (per unit length):
$$\frac{F'^{B}}{l} = -\frac{\mu_0 I'^2}{2\pi h} = -\frac{\mu_0 \lambda'^2 v^2}{2\pi h}$$
where the prime (') on the force indicates that the force is measured in this different inertial frame of reference, and the minus sign indicates that it is in the opposite direction from the repulsive electric force.
In the downward-moving frame of reference, the wires are still charged, and must still exert a repulsive electric force, with magnitude (per unit length):
$$\frac{F'^{E}}{l} = \frac{\lambda'^2}{2\pi\epsilon_0 h}$$
where, again, we used primes ('), to denote quantities that are measured in the moving frame of reference.
The description of how the wires will move should not depend on the frame of reference in which we choose to model the wires (they will move under the forces exerted on them regardless of whether we are observing them from a fixed or a moving point, and indeed regardless of whether we observe them at all!). Thus, the net force (per unit length) exerted on a wire cannot depend on our frame of reference. The total repulsive electric force, $F^{E}$, calculated in the stationary frame of reference must be equal to the sum of the magnetic and electric forces, $F'^{B}$ and $F'^{E}$, calculated in the moving frame of reference [47]:
$$\frac{F^{E}}{l} = \frac{F'^{E}}{l} + \frac{F'^{B}}{l} \quad\Rightarrow\quad \frac{\lambda^2}{2\pi\epsilon_0 h} = \frac{\lambda'^2}{2\pi\epsilon_0 h} - \frac{\mu_0 \lambda'^2 v^2}{2\pi h}$$
where we recognized that the charge per unit length, $\lambda'$, must be different in the moving frame of reference, or the above would give an inconsistent equation (the electric forces would cancel and we would find that the magnetic force is equal to zero). Thus, the repulsive electric force must be larger as observed in the moving frame of reference, or the net force on the wire would be different when evaluated in the two frames of reference. This is a truly bizarre conclusion, as we will see.
Before proceeding, let us clearly state our assumptions in modelling the force between the two charged wires:
- The net force on the wire, allowing us to describe its motion, cannot depend on our frame of reference. We expect the laws of physics to be applicable from any inertial frame of reference.
- We assume that Maxwell’s equations hold in all inertial frames of reference. In particular, we assume that the constants, $\epsilon_0$ and $\mu_0$, are the same in all inertial reference frames.
The first assumption allows us to state that the net force in the two frames of reference must be the same. The second assumption implies that we must change the charge density, $\lambda'$, in the moving frame of reference, since the constants must remain the same, and this is the only quantity that can lead to a different electric force in the moving frame of reference (which is required if the net force is to be the same, according to our first assumption). Let us determine the new charge density, $\lambda'$, in terms of the charge density, $\lambda$, that is measured at rest. Starting with the requirement that the net force on the wire must not depend on the frame of reference, we find:
$$\frac{\lambda^2}{2\pi\epsilon_0 h} = \frac{\lambda'^2}{2\pi\epsilon_0 h} - \frac{\mu_0 \lambda'^2 v^2}{2\pi h}$$
$$\lambda^2 = \lambda'^2\left(1 - \epsilon_0\mu_0 v^2\right)$$
$$\lambda' = \frac{\lambda}{\sqrt{1 - \epsilon_0\mu_0 v^2}}$$
Finally, recognizing that we can use the speed of light, $c$, to replace the combination of constants, $\epsilon_0\mu_0 = 1/c^2$, we find:
$$\lambda' = \frac{\lambda}{\sqrt{1 - \frac{v^2}{c^2}}}$$
Thus, the charge per unit length on the wire is larger when measured from the moving frame of reference ($\lambda'$ is greater than $\lambda$ if $v > 0$). It should be somewhat bothersome to you that the charge per unit length depends on the frame of reference in which it is measured, but this is the only way for our two assumptions to hold.
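As a quick numerical check of this result, the following Python sketch compares the net force per unit length computed in the two frames. The function names, the wire separation `h`, and the sample values of the charge density and frame speed are illustrative assumptions, not values taken from the text.

```python
import math

EPS0 = 8.8541878128e-12   # permittivity of free space (F/m)
MU0 = 1.25663706212e-6    # permeability of free space (T*m/A)

def force_per_length_rest(lam, h):
    """Repulsive electric force per unit length in the wires' rest frame."""
    return lam**2 / (2 * math.pi * EPS0 * h)

def force_per_length_moving(lam, h, v):
    """Electric plus magnetic force per unit length in the moving frame.

    The charge density in the moving frame is taken to be
    lam' = lam / sqrt(1 - eps0 * mu0 * v^2), as derived in the text.
    """
    lam_prime = lam / math.sqrt(1 - EPS0 * MU0 * v**2)
    electric = lam_prime**2 / (2 * math.pi * EPS0 * h)          # repulsive
    magnetic = -MU0 * lam_prime**2 * v**2 / (2 * math.pi * h)   # attractive
    return electric + magnetic

# Illustrative (assumed) values: 1 uC/m, wires 5 cm apart, frame speed 2e8 m/s
lam, h, v = 1e-6, 0.05, 2.0e8
f_rest = force_per_length_rest(lam, h)
f_moving = force_per_length_moving(lam, h, v)
print(f_rest, f_moving)
```

With the modified charge density, the two printed values should agree to within floating-point rounding; using the unmodified density in the moving frame would instead leave the magnetic term unbalanced.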
So far, this has just been some math to ensure that “things work out”, namely that our description of the motion of the wire does not depend on our frame of reference. However, the consequences of what we just derived are profound. We concluded that the charge per unit length on a wire depends on our frame of reference.
Since we are dealing with infinitely long wires, we can draw two lines to define a section of the wire in the rest frame, as in Figure 3. The charge per unit length on the wire, λ, is found by counting the number of charges between the two lines and dividing that by the distance between the two lines. Now, both an observer at rest relative to the wire, and one that is moving relative to the wire will agree on the number of charges contained between the two lines. They will both count the same number. Thus, if the observer moving relative to the wire is to measure a larger charge density, then the distance between the lines must be smaller for that observer! To the observer moving relative to the wire, the wire is actually shorter. It does not appear to be shorter, it IS shorter!
To summarize, by requiring that the laws of physics are the same in all inertial frames of reference, and by requiring that Maxwell’s equations are the same in all inertial frames of reference, we conclude that the charge per unit length that is measured on a wire must depend on the frame of reference in which it is measured. Since it cannot be the number of charges on the wire that depends on the frame of reference, it must be the length of the wire that depends on the frame of reference. Thus, either we accept that Maxwell’s equations are incorrect, or we accept that they are correct but that they imply that objects shrink in length when they are moving (regardless of whether charges are involved). It turns out that the latter choice provides a better description of nature (and one that has not been invalidated!).
An additional consequence of accepting these implications from Maxwell’s equations is that the definition of the electric and magnetic fields must depend on the frame of reference. In the example from this section, we saw that what looks like a purely electric field in the stationary frame of reference can appear as the combination of a magnetic field and an electric field in a moving frame of reference.
23.3 Einstein’s postulates
Albert Einstein was the first to provide a complete description of how to deal with the issues that arise from Maxwell’s equations when these are examined in different inertial frames of reference. The Theory of Special Relativity is based on Einstein’s two postulates:
- The laws of physics are the same in all inertial reference frames. There is no experiment that can be performed to determine whether one is at rest or moving with constant velocity.
- The speed of light propagating in vacuum is the same in all inertial reference frames. Any observer in an inertial frame of reference, regardless of their velocity, will measure that light has a speed of $c$ when it propagates in vacuum.
These postulates are equivalent to the assumptions that we made above to model the force between the two wires (we stated that the constants, $\epsilon_0$ and $\mu_0$, were independent of the reference frame, instead of $c$). While the first postulate is perhaps “acceptable” to our common sense, the second one grossly defies common intuition. Consider two archers, as illustrated in Figure 4.
Both archers can fire an arrow with speed $v_a$. One archer fires her arrow from the ground, and that arrow will hit its target with speed $v_a$. The other archer is on a train that is moving with speed $v_t$ in the same direction that she wishes to shoot her arrow. She measures her arrow to leave her bow with speed $v_a$, but, as seen from the ground (and from the target), her arrow has a speed $v_a + v_t$, and it will hit the target with a higher speed, as expected.
Now, consider two people that instead fire a pulse of laser light at a target on the ground, as illustrated in Figure 5.
In this case, according to Einstein’s second postulate, the speed of the pulses as measured on the ground (by the target), will be $c$, regardless of whether one of the pulses was fired from a moving train. This is truly strange and not compatible with our experience. For example, imagine that the train moves close to the speed of light. The person on the train would fire a laser pulse that he would observe to move away from him at the speed of light. However, when observed from the ground, we would see the pulse of light pull ahead of him only very slowly, since the pulse and the train move at nearly the same speed.
23.3.1 Simultaneity
As a first consequence of Einstein’s postulates, let us consider the notion of simultaneity. Figure 6 shows Alice on the platform of a train station. Alice is midway between two clocks, $A$ and $B$. Both identical clocks were configured so that they send a pulse of laser light when the time is 20 minutes past four o’clock. Since Alice is midway between the clocks, if they emit their pulses of light at the same time, then Alice will see two pulses of light arrive at her location at the same time. She signals that the two pulses of light have reached her at the same time by raising her hands.
Brice is located on a train that is travelling with speed, $v$, in the direction from clock $B$ to clock $A$, as illustrated in Figure 7. He sees Alice and the platform moving towards him.
Brice must agree that the two pulses arrived at Alice’s location at the same time, since he can also see her raise her hands. In Brice’s frame of reference, the two pulses of light must travel with the speed of light (Einstein’s second postulate). Once the pulse of light has been emitted from clock $A$, Brice observes that Alice is moving away from the location where the pulse was emitted, so that pulse must travel a large distance, $d_A$. On the other hand, once the pulse from clock $B$ is emitted, Brice observes that Alice moves towards where the pulse was emitted, so it only needs to travel a shorter distance, $d_B$, in order to reach Alice. Thus, for both pulses to arrive at Alice at the same time and travel at the speed of light, the pulse from clock $A$ had to be emitted first, according to Brice.
That is, while Alice measures the clocks to be synchronized and emit pulses at the same time, Brice measures that clock $A$ is running ahead of clock $B$. The two observers, Alice and Brice, in different reference frames, cannot agree on whether two events are simultaneous. Even worse, if a third observer, Chloë, is located on a train going in the opposite direction from Brice’s train, she will conclude that the pulse from clock $B$ was emitted earlier than the pulse from clock $A$. A consequence of Einstein’s postulates is that observers in different frames of reference will not agree on whether two events happen at the same time, and in some cases, such as the one we illustrated, the observers will not agree on which event happened first. Think of the implications for causality!
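The size of the disagreement can be estimated from the second postulate alone. The sketch below works in Brice’s frame and assumes that each clock is a distance `d` from Alice as measured by Brice (the value of `d`, the train speeds, and the function names are illustrative assumptions). The pulse that has to chase Alice closes the gap at a rate $c - v$, while the pulse she moves towards closes it at $c + v$.

```python
C = 299_792_458.0  # speed of light in vacuum (m/s)

def pulse_travel_times(d, v):
    """Travel times, in Brice's frame, for the two pulses to reach Alice.

    d: distance between Alice and each clock, as measured by Brice (assumed)
    v: speed of Alice and the platform relative to Brice
    """
    t_away = d / (C - v)     # pulse from the clock Alice is moving away from
    t_toward = d / (C + v)   # pulse from the clock Alice is moving towards
    return t_away, t_toward

def emission_gap(d, v):
    """How much earlier the chasing pulse must be emitted (closed form)."""
    return 2.0 * d * v / (C**2 - v**2)

d = 100.0  # metres (assumed)
for v in (30.0, 0.5 * C):  # an ordinary train, and a very fast one
    t_away, t_toward = pulse_travel_times(d, v)
    print(f"v = {v:.3e} m/s: emission gap = {emission_gap(d, v):.3e} s")
```

For everyday train speeds the gap is far too small to notice, which is why the relativity of simultaneity is not part of ordinary experience.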
23.4 Time dilation
Einstein was famous for his “thought experiments”, which allow us to understand the consequences of a theory by reasoning through experiments that would be impractical to actually carry out. The experiment with Alice and Brice described above is one example: the speed of light is so high that Brice would never notice that one clock emitted its pulse slightly earlier than the other.
Imagine that we build a clock using a pulse of light travelling (oscillating) back and forth between two mirrors, separated by a distance, $L$, as illustrated in Figure 8.
Since the speed of light is $c$, the time that it will take for the pulse of light to travel back and forth between the two mirrors, namely the period of the clock, is given by:
$$\Delta t = \frac{2L}{c}$$
where the speed of light, $c$, is given by the total distance travelled by the pulse of light divided by the time taken to do so:
$$c = \frac{2L}{\Delta t}$$
Now, imagine placing this clock on a spaceship that travels with speed $v$, perpendicular to the direction of the movement of the light. The clock is illustrated in Figure 9, as seen from the ground.
From the perspective of a person watching the clock go by, the pulse of light travels a larger distance over one clock period, since the mirrors move to the right as the pulse of light moves up and down. However, by Einstein’s second postulate, the pulse of light must still travel with the same speed, $c$, so it must take the pulse of light longer to bounce between the two mirrors than it did when the clock is at rest. Let us determine the relationship between the period of the clock, $\Delta t$, measured when the clock is at rest, and the period of the clock, $\Delta t'$, as measured by an observer that sees the clock go by with speed, $v$.
To an observer that sees the clock move by with speed, $v$, the speed of the pulse of light, which must also be equal to $c$, is given by:
$$c = \frac{2\sqrt{L^2 + \left(\frac{v\Delta t'}{2}\right)^2}}{\Delta t'}$$
where the distance in the numerator was simply found by Pythagoras’ theorem, as the spaceship will travel a horizontal distance, $v\Delta t'$, during one period, as measured by the observer that is not moving with the spaceship. The path of the pulse is thus made of two diagonal segments, each with a vertical side $L$ and a horizontal side $v\Delta t'/2$. Squaring this relationship, we can isolate the period of the clock, $\Delta t'$, as measured by the observer that sees the clock move with speed, $v$:
$$c^2\Delta t'^2 = 4L^2 + v^2\Delta t'^2 \quad\Rightarrow\quad \Delta t' = \frac{2L}{\sqrt{c^2 - v^2}} = \frac{2L}{c}\frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}$$
Note that the term $2L/c$ is simply the period of the clock as measured in a frame of reference where the clock is stationary. Thus, we can relate the two clock periods:
$$\Delta t' = \frac{\Delta t}{\sqrt{1 - \frac{v^2}{c^2}}}$$
To re-iterate: the period of the clock, $\Delta t'$, as measured in a frame of reference that is moving relative to the clock is longer than the period of the clock, $\Delta t$, as measured in the “rest frame” of the clock (the reference frame where the clock is stationary). We call this effect “time dilation”, and it is not just some mathematical curiosity. The clock that we imagined with a pulse of light is a real clock that one could actually construct; we could use it to measure time. That clock will appear to tick slower if it is moving. Time goes by slower in a moving reference frame. If a person climbs on a ship that is moving, that person will age at a slower rate than a person that remained on Earth. By travelling at high speeds, you effectively travel into the future, as observed on Earth. The equation above allows us to relate the amount of time that went by in one reference frame to the amount of time that went by in a different frame of reference.
We define the time that is measured at rest as the “proper time”. In our example, $\Delta t$ is the proper time (proper period) for the clock, since it is defined in a frame of reference where the clock is at rest. The “dilated time”, $\Delta t'$, is measured in a frame of reference that is moving relative to the clock.
The factor by which time is dilated comes up often in Special Relativity, and is called the gamma factor:
$$\gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}$$
so that $\Delta t' = \gamma\Delta t$. As a corollary to Einstein’s postulates, we will see that nothing can ever exceed the speed of light in vacuum. The gamma factor is always greater than or equal to one (it is equal to one only when $v = 0$), since $v$ (the speed between the two different inertial frames of reference) must always be smaller than $c$. You may also recognize that the gamma factor appeared in our introductory example with the force between two wires. Here, we derived the gamma factor from kinematic considerations, whereas in the example with the two wires, it came straight out of the equations for electromagnetism.
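To get a feel for how the gamma factor behaves, here is a minimal Python sketch that evaluates $\gamma = 1/\sqrt{1 - v^2/c^2}$ for a few speeds. The function name `gamma` and the chosen sample speeds are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)

def gamma(v):
    """Gamma factor for a relative speed v (m/s) between two inertial frames."""
    if abs(v) >= C:
        raise ValueError("the relative speed must be smaller than c")
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

# Tabulate gamma for speeds expressed as a fraction of c
for fraction in (0.0, 0.1, 0.5, 0.9, 0.99):
    print(f"v = {fraction:.2f} c  ->  gamma = {gamma(fraction * C):.4f}")
```

The factor stays very close to one until the speed becomes a sizeable fraction of $c$, and then grows without bound as the speed approaches $c$.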
Time-dilation is a real effect that has been observed, for example, by placing high precision atomic clocks on an airplane to observe their period slow down. Another example of time-dilation is the fact that we observe many particles called muons at the surface of the Earth. Muons are very similar to electrons, except that they have a larger mass, and that they are unstable (they decay into an electron and neutrinos, after $2.2\,\mu\text{s}$ on average). Muons are produced in large amounts when cosmic rays (high energy particles from outside our Solar System) strike the molecules in our upper atmosphere, at altitudes of tens of kilometres. As the muons travel down towards the Earth, they decay.
Suppose that muons are produced travelling at the speed of light; in that case, they would travel a distance $(3.0\times 10^{8}\,\text{m/s})(2.2\times 10^{-6}\,\text{s}) \approx 660\,\text{m}$, on average, before decaying. However, muons are produced tens of kilometres above the surface of the Earth, travel slower than the speed of light, and yet, we are able to detect many muons at the surface of the Earth. Without time dilation, we would expect that essentially all muons would have decayed before reaching the surface of the Earth.
We can understand this in terms of time dilation; in the reference frame of the muon, the muon decays after $2.2\,\mu\text{s}$, on average. In a reference frame from which the muon appears to move with speed $v$, the “clock” that measures how long the muon has existed ticks slower. Thus, from the Earth, we observe that the muon takes longer than $2.2\,\mu\text{s}$ to decay, giving it time to reach the surface of the Earth.
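The following Python sketch puts rough numbers on this argument. It assumes a production altitude of 15 km, a muon speed of $0.999c$, a mean lifetime of $2.2\,\mu\text{s}$ in the muon’s rest frame, and the usual exponential decay law; all of these values and the function name are illustrative assumptions rather than figures from the text.

```python
import math

C = 299_792_458.0   # speed of light in vacuum (m/s)
TAU = 2.2e-6        # mean muon lifetime in its rest frame (s)

def surviving_fraction(altitude, v, dilated=True):
    """Fraction of muons that reach the ground from a given altitude.

    altitude: production height above the ground (m)
    v: muon speed as measured from the Earth (m/s)
    dilated: if False, ignore time dilation (the "classical" expectation)
    """
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    lifetime = gamma * TAU if dilated else TAU   # lifetime seen from Earth
    travel_time = altitude / v                   # flight time seen from Earth
    return math.exp(-travel_time / lifetime)

altitude, v = 15e3, 0.999 * C   # assumed values
print("without time dilation:", surviving_fraction(altitude, v, dilated=False))
print("with time dilation:   ", surviving_fraction(altitude, v, dilated=True))
```

Under these assumptions, the surviving fraction without time dilation is vanishingly small, whereas with time dilation roughly a third of the muons reach the ground.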
One interesting issue uncovered by Example 23.2 is the so-called “twin paradox”. Imagine that Alice has a twin brother, Brice, who remains on Earth. Alice travels to Proxima Centauri and back (return trip), and will have aged by about 14 months, whereas Brice will have aged by about 8.4 years (using the numbers in Example 23.2). However, Einstein’s first postulate implies that there are no special frames of reference that are at rest. We should be able to think about this situation from the perspective where Alice is at rest, and it is the Earth (with Brice on it), that moves away from her and then back. In this case, Alice is at rest, and she will conclude that it takes about 8.4 years for Brice to move away and come back, and that Brice would have aged by about 7 months. When Alice and Brice meet up again, clearly Alice cannot be both younger and older than Brice, so which one is it? (You will have to look this up; see the associated question in the “Thinking about the material” section.)
23.5 Length contraction
As we saw in the examples from the previous section, time dilation implies “length contraction”. When an object is measured in a frame of reference that is at rest relative to the object, the length of the object, $L$, is called the “rest length” or the “proper length” of the object. If that object is moving relative to an observer, the observer will measure the object to be shorter, and have a “contracted length”, $L'$, given by:
$$L' = \frac{L}{\gamma} = L\sqrt{1 - \frac{v^2}{c^2}}$$
In Example 23.2, Alice measured a contracted distance between Earth and Proxima Centauri, as she was in a frame of reference that is moving relative to the Earth-Proxima Centauri reference frame. One point that is important to note is that length contraction only occurs along the direction parallel to the direction of motion.
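A small Python sketch can make the directional nature of length contraction concrete. It assumes the contraction formula $L' = L\sqrt{1 - v^2/c^2}$ applied along the direction of motion (taken here to be $x$); the function name and the sample dimensions are illustrative assumptions.

```python
import math

def contracted_dimensions(lx, ly, lz, beta):
    """Dimensions of a box moving along x with speed beta = v/c.

    Only the dimension parallel to the motion is contracted; the two
    perpendicular dimensions are returned unchanged.
    """
    if not abs(beta) < 1.0:
        raise ValueError("beta = v/c must satisfy |beta| < 1")
    return (lx * math.sqrt(1.0 - beta**2), ly, lz)

# A box with rest dimensions 100 m x 3 m x 4 m, moving at 0.6 c along x
print(contracted_dimensions(100.0, 3.0, 4.0, 0.6))
```

Here the length along $x$ shrinks by the factor $1/\gamma$, while the height and width of the box are left untouched.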
Length contraction also allows us to discuss a famous paradox (the “barn”, or “ladder” or “barn-pole” paradox). Consider a train that has a rest length of $L$, travelling at a speed, $v$, with a corresponding gamma factor, $\gamma$. As the train goes by, from Earth, it appears to have a (contracted) length:
$$L' = \frac{L}{\gamma}$$
Suppose that there is a tunnel on Earth that is exactly $L/\gamma$ long, so that the train, when contracted, will fit in the tunnel. When the train passes, an operator briefly closes (and re-opens) the doors at the ends of the tunnel, briefly “capturing” the train, and since the train is contracted, it never hits any of the doors, and all is fine.
From the train’s frame of reference, the train has a proper length of $L$, and the tunnel is contracted to a length of:
$$\frac{1}{\gamma}\left(\frac{L}{\gamma}\right) = \frac{L}{\gamma^2}$$
Thus, from the train’s perspective, if the doors of the tunnel are closed, there is no way that the train of length $L$ can ever fit in the tunnel of length $L/\gamma^2$, as illustrated in Figure 11. So what happens when the operator on Earth closes the doors of the tunnel to briefly “capture” the train?
Clearly, people on the Earth and people on the train have to agree on whether the train was destroyed by the tunnel doors. The operator on Earth can clearly close both doors of the tunnel when the train is inside and not destroy the train. Hence, people on the train must agree that the train never collided with the doors, and that the doors were closed. The answer to this paradox lies in the fact that simultaneity is relative. The tunnel operator believes that she has closed the two doors of the tunnel at exactly the same time, precisely when the contracted train is lined up with the tunnel. However, to people on the train, in a different frame of reference, the doors did not close at the same time, since events that are simultaneous in one frame of reference are not necessarily simultaneous in a different frame of reference. To people on the train, there was never a time when the train was in the tunnel and both doors were closed at the same time!
23.6 Electric and magnetic fields and Special Relativity
In this section, we present one more example to show how Special Relativity is connected to electromagnetism. Consider a wire that carries an electric current towards the left, and a positive charge, $q$, located next to the wire, as illustrated in Figure 12.
Inside the wire, negative electrons are moving towards the right with a drift velocity, $\vec v$, while positive ions remain stationary. Since the charge $q$ has a velocity of zero, it experiences no magnetic force. Furthermore, the wire appears to be neutral, with no net electric charge.
If the charge, $q$, has a velocity, $\vec v$, towards the right, it will experience a downwards magnetic force, as illustrated in Figure 13.
Now, consider this from the perspective of the charge, $q$, as illustrated in Figure 14. The charge is moving towards the right at the same speed as the electrons in the wire. In the reference frame of the charge, $q$, the charge has a velocity of zero, and thus will experience no magnetic force. The wire still appears to have a (different) current, $I'$, as the positive ions move to the left, creating a magnetic field, $\vec B'$, out of the page.
In the “lab” frame of reference, where the electrons and the charge move towards the right at the same speed, $v$, the electrons appear closer together (length contracted) than they are in the frame of reference of the electrons (or of the charge $q$, since it is moving with the electrons). In the frame of reference of the charge $q$, the electrons thus appear to be spaced further apart (less dense). On the other hand, in the frame of reference of $q$, the positive ions, which are moving towards the left, appear closer together, as the distance between them is now contracted, as illustrated in Figure 14.
In the frame of reference of the charge $q$, the wire no longer appears neutral, but appears to have a net positive charge. This results in an electric field away from the wire that will exert a downwards force on $q$. In both frames of reference, we conclude that the charge will experience a downwards force. Whether that force is magnetic or electric depends on the frame of reference! Here, we came to the conclusion by using the notion of length contraction, but remember that length contraction itself is a consequence of Maxwell’s equations holding in different frames of reference, as we illustrated at the beginning of this chapter.
In most real-world applications, we do not see the effects of Special Relativity, as the speeds involved must be very high for the gamma factor to be appreciably greater than one. However, we see these effects in electromagnetism even though the drift speed of electrons in a wire is usually many orders of magnitude smaller than $c$. This is because, when dealing with the electric and magnetic forces (fields), even a minuscule length contraction of the electrons/ions at those speeds leads to relativistic effects. This can be thought of in terms of how strong the electric force really is; even a minute change in charge density (due to length contraction) has a sizeable relativistic effect.
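To illustrate just how small the contraction is at such speeds, the Python sketch below evaluates $\gamma - 1$ for an assumed drift speed of $10^{-4}\,\text{m/s}$ (an illustrative order of magnitude, not a value from the text). The deviation is so small that the direct formula rounds to zero in double precision, so the sketch uses `math.log1p` and `math.expm1` to keep the tiny difference.

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)

def gamma_minus_one_naive(v):
    """Direct evaluation; loses all precision when v is much smaller than c."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2) - 1.0

def gamma_minus_one(v):
    """Numerically stable gamma - 1, using gamma = exp(-0.5 * log(1 - beta^2))."""
    beta2 = (v / C) ** 2
    return math.expm1(-0.5 * math.log1p(-beta2))

v_drift = 1e-4  # assumed drift speed of electrons in a wire (m/s)
print("naive: ", gamma_minus_one_naive(v_drift))
print("stable:", gamma_minus_one(v_drift))
```

The stable result is close to $v^2/(2c^2)$, a fractional change in charge density far too small to notice directly; it matters only because the electric forces between the enormous numbers of charges in a wire are so strong.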
23.7 Lorentz transformations and space-time
23.7.1 Four-dimensional space-time
So far, we have seen that our notions of time intervals (the time between two events) and space intervals (the distance between two locations) depend on our frame of reference. We also saw how space and time are connected, for example by the fact that time-dilation must go hand-in-hand with length contraction. Additionally, we concluded that there is no absolute concept of time, and that time is relative.
In the context of Special Relativity, we introduce the concept of space-time. To describe the location of an object in space-time, we must specify both the location/position coordinates ($x$, $y$, $z$) and the time “coordinate”, $t$. We usually specify the time coordinate by multiplying it by the speed of light, $c$, so that it has dimensions of length rather than time. Thus, position in space-time is given by 4 coordinates: $(x, y, z, ct)$.
23.7.2 Space-time diagrams
It is difficult to visualize situations in three dimensions, so four dimensions is hopeless! However, we can gain a lot of insight into Special Relativity models by using “space-time diagrams”. In a space-time diagram, we use only one of the space coordinates (typically $x$) along with the time coordinate, $ct$, to define the two axes of a space-time diagram. Space-time diagrams are analogous to “position as a function of time” graphs that one would draw in kinematics, although they are fundamentally different in that, for a space-time diagram, the coordinates should be thought of as independent (one is not plotting a dependent variable as a function of an independent variable).
Figure 15 shows a space-time diagram for an object that was located at position $x_A$ at time $t_A$ (location $A$), and at position $x_B$ at time $t_B$ (location $B$). The path of an object through space-time, indicated by the line that connects $A$ and $B$, is called the “world line” of the object.
A pulse of light travelling in the $x$ direction will always have a world line that makes a $45^\circ$ angle with the horizontal (space) axis (since $x = ct$). The world line of any object that travels with a speed below the speed of light must always make an angle with the horizontal axis that is greater than $45^\circ$.
A position in space-time is usually called an “event”. We can draw a set of lines, at $45^\circ$ from the horizontal axis, that intersect at an event in space-time. Those lines define two “light cones” corresponding to: (1) locations in space-time in the past that could have had a causal effect on the event (the “past light cone”), and (2) locations in space-time in the future for which the event can have a causal effect (the “future light cone”).
Figure 16 shows the light cones associated with an event, $A$, in space-time. The past light cone is the only region of space-time in which a different event could have had an impact on the event $A$. For example, the event $A$ might be that “the object is at position $x$ at time $t$”, so that the past light cone corresponds to the only locations in space-time that the object could have been in the past. Similarly, the future light cone defines the locations in space-time upon which the event $A$ could have an effect. For example, this could define the possible locations of the object in the future. The regions outside the light cones can never have an effect on the event $A$; they are not causally connected. A signal or object would need to travel faster than the speed of light in order to have an effect on something outside of its light cone. There are locations in space-time, in the future of our Universe, that we cannot influence, no matter what we do.
When two events in space-time are within each other’s light cones, we say that the space-time interval between them (the line that you draw from one event to the other) is “time-like”. Time-like events are such that all observers, in any frame of reference, will agree that one event happened before the other. Thus, events that are causally related must have a time-like interval between them (they are connected by a line that makes an angle greater than $45^\circ$ with the horizontal axis).
Two events that are outside of each other’s light cones are said to be “space-like”. Events that are connected by space-like intervals cannot be causally related (one cannot cause the other). Observers in different frames of reference will disagree on the time ordering of space-like events. For example, when Alice observed the two clocks on the platform to emit pulses of light at the same time, Brice disagreed; those two events are connected by a space-like interval.
Finally, the space-time interval between events that are on each other’s light cone (connected by a line that makes a $45^\circ$ angle with the $x$-axis), is said to be “light-like” or “null”.
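The three cases can be summarized in a short Python sketch that classifies the interval between two events on a space-time diagram with one space dimension. It compares the spatial separation with $c$ times the time separation, which is equivalent to comparing the angle of the connecting line with $45^\circ$; the function name and the sample separations are illustrative assumptions.

```python
import math

def classify_interval(dx, dct):
    """Classify the space-time interval between two events.

    dx:  separation along the x axis
    dct: c times the separation in time (same units as dx)
    """
    if math.isclose(abs(dct), abs(dx), rel_tol=1e-12):
        return "light-like"   # on each other's light cone (45 degree line)
    if abs(dct) > abs(dx):
        return "time-like"    # inside each other's light cones
    return "space-like"       # outside each other's light cones

print(classify_interval(dx=1.0, dct=2.0))    # steeper than 45 degrees
print(classify_interval(dx=10.0, dct=0.0))   # simultaneous, separated events
print(classify_interval(dx=3.0, dct=3.0))    # connected by a light pulse
```

The second call corresponds to the two platform clocks firing at the same time in Alice’s frame: two distinct events that are simultaneous in some frame are always separated by a space-like interval.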
23.7.3 Lorentz transformations
In this section, we consider how to transform the space-time coordinates, $(x, y, z, ct)$, as measured in a frame of reference, $S$, to coordinates $(x', y', z', ct')$, as measured in a frame of reference, $S'$, that is moving with a constant speed, $v$, relative to the frame, $S$. For simplicity, we assume that frame $S'$ is moving with speed $v$ in the positive $x$ direction, as measured in frame, $S$, and that the origins of the two coordinate systems coincided at time $t = t' = 0$. Figure 17 shows an illustration of how the two frames of reference are related (note that these are actual coordinate systems, not space-time diagrams).
If we ignore any of Special Relativity, then the coordinates in are easily related to those in the frame of reference using the “Galilean transformations”:
and this corresponds to transformations that we have implicitly used before considering Special Relativity. These equations also allow us to relate the speeds measured in different frames of reference. Suppose that an object has a velocity, , as measured in the frame of reference, . We can obtain the components of the velocity vector, , as measured in the frame of reference, , by taking the time derivatives of the above equations:
which is trivial, since . The transformations above are equivalent (identical) to the rules for transforming velocity that we derived in Section %s for kinematics. In Galilean relativity, time is an absolute quantity that does not depend on the frame of reference. In Special Relativity, the time coordinate depends on the frame of reference, so we cannot simply convert a time derivative in to a derivative in . Instead, we must use the Lorentz transformations.
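We can use the formulas for length contraction and time dilation to derive the Lorentz transformations. Referring to Figure 17, $x'$ refers to the distance between a point in space-time and the origin of the $x'$ axis in frame $S'$, as measured in frame $S'$. Similarly, $x$ is the distance to the point in space-time as measured in frame $S$, from the origin of $S$. In frame $S$, the distance $x'$ will be contracted to the length $x'/\gamma$, so that the Galilean transformation for the $x$ coordinate is modified as follows:
$$\frac{x'}{\gamma} = x - vt \quad \Rightarrow \quad x' = \gamma (x - vt)$$
The $y$ and $z$ coordinates are the same in both frames of reference, since all of the length contraction takes place in the direction of the relative motion between the frames of reference, which we chose to be the $x$ direction.
We can obtain the equation for the time coordinate by considering that, in the $S'$ frame of reference, it is the $x$ coordinate that is contracted to $x/\gamma$. In the $S'$ frame of reference, the distance between the origins of the two systems is $vt'$ (note the prime on $t$). We can thus write the contracted distance, $x/\gamma$, in the $S'$ frame of reference:
$$\frac{x}{\gamma} = x' + vt'$$
We can eliminate $x'$ from the last equation using the Lorentz transformation for $x'$ that we just found:
$$\begin{aligned} \frac{x}{\gamma} &= \gamma(x - vt) + vt' \\ vt' &= \gamma v t + x\left(\frac{1}{\gamma} - \gamma\right) \\ t' &= \gamma t + \frac{\gamma x}{v}\left(\frac{1}{\gamma^2} - 1\right) \\ &= \gamma t + \frac{\gamma x}{v}\left(1 - \frac{v^2}{c^2} - 1\right) \\ &= \gamma\left(t - \frac{vx}{c^2}\right) \end{aligned}$$
where we wrote the $\gamma$ factor out explicitly in the fourth line. We can summarize the Lorentz transformations as follows:
$$x' = \gamma(x - vt), \qquad y' = y, \qquad z' = z, \qquad t' = \gamma\left(t - \frac{vx}{c^2}\right)$$
and the inverse relations are easily found:
$$x = \gamma(x' + vt'), \qquad y = y', \qquad z = z', \qquad t = \gamma\left(t' + \frac{vx'}{c^2}\right)$$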
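Note that the Lorentz transformations reduce to the Galilean transformations when the speed, $v$, between frames of reference is small compared to the speed of light (so that $\gamma \approx 1$ and the term $vx/c^2$ in the time transformation is negligible).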
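As a numerical illustration, the following minimal Python sketch implements the Lorentz transformation for the $x$ and $t$ coordinates, its inverse, and the Galilean transformation, for a frame $S'$ moving at speed $v$ along the positive $x$ direction of $S$. The function names and the sample values are assumptions chosen for illustration; they do not come from the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(v, c=C):
    """Gamma factor for a relative speed v between frames (requires |v| < c)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def lorentz(x, t, v, c=C):
    """Coordinates (x', t') in S', which moves at +v along x relative to S."""
    g = gamma(v, c)
    return g * (x - v * t), g * (t - v * x / c**2)


def lorentz_inverse(x_prime, t_prime, v, c=C):
    """Coordinates (x, t) in S, recovered from (x', t') measured in S'."""
    g = gamma(v, c)
    return g * (x_prime + v * t_prime), g * (t_prime + v * x_prime / c**2)


def galilean(x, t, v):
    """Non-relativistic transformation: x' = x - v t and t' = t."""
    return x - v * t, t


if __name__ == "__main__":
    # Everyday speed (30 m/s): the two transformations are nearly identical.
    print(lorentz(100.0, 2.0, 30.0), galilean(100.0, 2.0, 30.0))
    # Relativistic speed (0.6 c), in units where c = 1: they clearly differ.
    print(lorentz(1.0, 2.0, 0.6, c=1.0), galilean(1.0, 2.0, 0.6))
```

Passing `c=1.0` works in units where speeds are fractions of the speed of light, which keeps the numbers easy to check by hand.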
Einstein’s second postulate states that the speed of light is independent of the frame of reference. Consider two points in space-time corresponding to the emission ($A$) and the absorption ($B$) of a pulse of light. In the reference frame, $S$, the distance squared in space between these two events must be equal to the distance (squared) that light travelled between the time of emission and absorption:
$$(x_B - x_A)^2 + (y_B - y_A)^2 + (z_B - z_A)^2 = c^2 (t_B - t_A)^2$$
where $(x_A, y_A, z_A, t_A)$ and $(x_B, y_B, z_B, t_B)$ are the space-time coordinates of events $A$ and $B$. The above equation must hold in all frames of reference (e.g. adding a prime to each coordinate), since it is a statement that the speed of light is $c$.
We can define $s^2$ as the “space-time interval” between events $A$ and $B$:
$$s^2 = c^2 (t_B - t_A)^2 - (x_B - x_A)^2 - (y_B - y_A)^2 - (z_B - z_A)^2$$
which turns out to be “Lorentz invariant” (meaning that this value is the same in all reference frames). The space-time interval can be thought of as a “distance” in space-time that is the same in all reference frames. If the events $A$ and $B$ correspond to the emission and absorption of light, then $s^2 = 0$, and we say that the interval between $A$ and $B$ is light-like or null. If $s^2 > 0$, the events are separated by a time-like interval, and if $s^2 < 0$, the events are separated by a space-like interval. Since $s^2$ does not depend on the frame of reference, all observers will agree on whether events are separated by time-like or space-like intervals.
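The invariance of the space-time interval can be checked numerically. The sketch below assumes the sign convention $s^2 = c^2\Delta t^2 - \Delta x^2 - \Delta y^2 - \Delta z^2$ (so that time-like intervals have $s^2 > 0$) and works in units where $c = 1$. The function names, the tolerance, and the sample separations are assumptions made for illustration.

```python
import math


def interval_squared(dx, dy, dz, dt, c=1.0):
    """Space-time interval s^2 = (c dt)^2 - dx^2 - dy^2 - dz^2 (assumed sign convention)."""
    return (c * dt) ** 2 - dx**2 - dy**2 - dz**2


def classify(s2, tol=1e-12):
    """Label an interval; tol is an arbitrary tolerance for floating-point round-off."""
    if abs(s2) <= tol:
        return "light-like"
    return "time-like" if s2 > 0 else "space-like"


def boost(dx, dt, v, c=1.0):
    """Lorentz-transform the separations (dx, dt) to a frame moving at +v along x."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * (dx - v * dt), g * (dt - v * dx / c**2)


if __name__ == "__main__":
    # Time-like pair: s^2 is unchanged by the boost and dt keeps its sign.
    dx_p, dt_p = boost(3.0, 5.0, 0.6)
    print(interval_squared(3.0, 0, 0, 5.0), interval_squared(dx_p, 0, 0, dt_p), dt_p)
    # Space-like pair: s^2 is unchanged, but the time ordering flips (dt' < 0).
    dx_p, dt_p = boost(5.0, 3.0, 0.8)
    print(interval_squared(5.0, 0, 0, 3.0), interval_squared(dx_p, 0, 0, dt_p), dt_p)
```

The second case illustrates why observers can disagree on the ordering of space-like events while still agreeing on the value of $s^2$.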
We can visualize the effect of Lorentz transformations on space-time diagrams, as in Figure 18, which shows the space-time diagrams for a reference frame, $S$, and a second reference frame, $S'$, moving with speed $v$ in the $x$ direction relative to $S$.
The effect of the Lorentz transformation on a space-time diagram is to tilt both the space and time axes “inwards”[46], by an angle, α, given by:
$$\tan\alpha = \frac{v}{c}$$
Figure 18 shows a light-like interval between two points, $A$ and $B$, and how to determine the space-time coordinates in the two reference frames. You can think of space-time as the sheet of paper on which events happen. You can then draw different coordinate systems on that piece of paper to describe the position (in space and time) of different events.
23.7.4Lorentz addition of velocities¶
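In the previous section, we reviewed the Galilean velocity transformations that allow us to convert a velocity, $\vec u$, as measured in one frame of reference, to a velocity, $\vec u'$, measured in a different frame of reference. We now derive the equivalent relations based on the Lorentz transformations. Again, we assume that frame $S'$ moves in the positive $x$ direction with speed $v$, relative to frame $S$.
The $x$ component of the velocity vector, $\vec u'$, for some object in the $S'$ frame of reference is given by:
$$u_x' = \frac{dx'}{dt'}$$
In Galilean relativity, we could simply replace the derivative over $t'$ by a derivative over $t$, since the two are equivalent. This is no longer the case. However, we can use the chain rule and the Lorentz transformations to convert a derivative over $t'$ to a derivative over $t$:
$$\frac{d}{dt'} = \frac{dt}{dt'}\frac{d}{dt} = \left(\frac{dt'}{dt}\right)^{-1}\frac{d}{dt}, \qquad \frac{dt'}{dt} = \frac{d}{dt}\,\gamma\left(t - \frac{vx}{c^2}\right) = \gamma\left(1 - \frac{v u_x}{c^2}\right)$$
where we recognized that $dx/dt = u_x$. The $x$ component of the velocity, as measured in the $S'$ frame of reference, is then given by:
$$u_x' = \frac{dx'}{dt'} = \frac{1}{\gamma\left(1 - \frac{v u_x}{c^2}\right)}\frac{d}{dt}\,\gamma(x - vt) = \frac{u_x - v}{1 - \frac{v u_x}{c^2}}$$
where we made use of the Lorentz transformation: $x' = \gamma(x - vt)$. We can proceed in a similar way to determine the $y$ and $z$ components. Note that, unlike the Galilean case, all of the velocity components must transform, since the time derivative is involved for each component. Intuitively, we expect all components of velocity to be affected, since one needs to guarantee that the total speed never exceeds $c$. The velocity transformations for all components are given by the following:
$$u_x' = \frac{u_x - v}{1 - \frac{v u_x}{c^2}}, \qquad u_y' = \frac{u_y}{\gamma\left(1 - \frac{v u_x}{c^2}\right)}, \qquad u_z' = \frac{u_z}{\gamma\left(1 - \frac{v u_x}{c^2}\right)}$$
and the reverse transformations are given by:
$$u_x = \frac{u_x' + v}{1 + \frac{v u_x'}{c^2}}, \qquad u_y = \frac{u_y'}{\gamma\left(1 + \frac{v u_x'}{c^2}\right)}, \qquad u_z = \frac{u_z'}{\gamma\left(1 + \frac{v u_x'}{c^2}\right)}$$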
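The following minimal Python sketch applies the relativistic velocity transformation and checks two properties that the text relies on: light moves at speed $c$ in both frames, and combining two speeds below $c$ never gives a speed above $c$. It works in units where $c = 1$. The function names and sample velocities are assumptions made for illustration.

```python
import math


def transform_velocity(ux, uy, uz, v, c=1.0):
    """Velocity components in S', which moves at +v along x relative to S."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    denom = 1.0 - v * ux / c**2
    return (ux - v) / denom, uy / (g * denom), uz / (g * denom)


def speed(ux, uy, uz):
    """Magnitude of a velocity vector."""
    return math.sqrt(ux**2 + uy**2 + uz**2)


if __name__ == "__main__":
    # A light pulse travelling along y in S still has speed c = 1 in S'.
    print(speed(*transform_velocity(0.0, 1.0, 0.0, 0.6)))
    # An object at 0.9 c in S, seen from a frame moving at 0.9 c the other way.
    print(transform_velocity(0.9, 0.0, 0.0, -0.9)[0])
    # The reverse transformation is the same formula with v replaced by -v.
    u_prime = transform_velocity(0.3, 0.4, 0.1, 0.5)
    print(transform_velocity(*u_prime, -0.5))
```

In the Galilean picture the second case would give a speed of $1.8c$; the relativistic result stays below $c$.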
23.8Relativistic momentum and energy¶
In this section, we show how to define momentum and energy in a way that is consistent with the postulates of Special Relativity. We expect that, since time and space depend on the frame of reference of the observer, so too will the momentum and the energy of an object. Consider an object of mass $m_0$, moving in frame of reference $S$, with velocity $\vec u$ (we reserve $v$ to represent the speed between two inertial frames of reference) in the $x$ direction. At some time, $t$, the object will be at position $x$ along the $x$ axis. We define the relativistic momentum as:
$$p = m_0 \frac{dx}{dt'}$$
where $t'$ is the time as measured in the rest frame of the object. By defining momentum in terms of the proper time of the object, all observers will agree on the value of the time interval $dt'$. In the frame of reference, $S$, (with time $t$) this corresponds to:
$$p = m_0 \frac{dx}{dt'} = m_0 \frac{dx}{dt}\frac{dt}{dt'} = m_0 u \frac{dt}{dt'}$$
where $u = dx/dt$ is the speed of the particle in frame $S$. We can use time dilation to re-express the derivative:
$$\begin{aligned} \Delta t &= \gamma \Delta t' \\ \frac{\Delta t}{\Delta t'} &= \gamma \\ \frac{dt}{dt'} &= \gamma \end{aligned}$$
where in the last line, we simply took the limit of an infinitesimally short time interval. Therefore, the relativistic momentum of the particle in frame $S$ can be defined as
$$\vec p = \gamma m_0 \vec u = \frac{m_0 \vec u}{\sqrt{1 - \frac{u^2}{c^2}}}$$
where γ is calculated with the same speed, $u$, since that is the speed of the reference frame of the object relative to $S$. Note that as the speed, $u$, of the particle approaches the speed of light, the factor of γ approaches infinity. This means that an object with a mass can never reach the speed of light, as it would have an infinite momentum. In order to define momentum in a way that resembles the classical definition, one can think of the mass of the object as depending on the speed of the object. We define the rest mass, $m_0$, of the object as the mass that is measured when the object is at rest. We can then model the mass of the object as increasing with its speed:
$$m(u) = \gamma m_0 = \frac{m_0}{\sqrt{1 - \frac{u^2}{c^2}}}$$
so that the relativistic momentum would be defined as:
$$\vec p = m(u)\,\vec u$$
In this case, we can think of the mass of the object as increasing with its speed. The object would acquire infinite mass if it were to reach the speed of light.
With the relativistic definition of momentum, Newton’s Second Law can be written as:
$$\vec F = \frac{d\vec p}{dt} = \frac{d}{dt}\left(\gamma m_0 \vec u\right)$$
Recall that we defined kinetic energy in Section 7.3 by defining the change in kinetic energy of an object as the net work done on that object. We use the same formalism here to redefine kinetic energy using relativistic dynamics.
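The work done by the net force, $F$, on an object that goes from position $x_1$ to position $x_2$ (along the $x$ axis) is given by
$$W = \int_{x_1}^{x_2} F\,dx = \int_{x_1}^{x_2} \frac{dp}{dt}\,dx = \int \frac{dp}{dt}\,u\,dt$$
where we recognized that an infinitesimal segment, $dx$, along the path of the object is given by $dx = u\,dt$. The time infinitesimals, $dt$, cancel, and we are left with:
$$W = \int u\,dp$$
which we can integrate by parts. We can integrate this over the speed, $u$, and we assume that the object started with a speed of $u = 0$ at the beginning of the path and has a speed, $U$, at the end of the path:
$$\begin{aligned} W &= \Big[u\,p\Big]_0^U - \int_0^U p\,du = \gamma m_0 U^2 - \int_0^U \frac{m_0 u}{\sqrt{1 - \frac{u^2}{c^2}}}\,du \\ &= \gamma m_0 U^2 + m_0 c^2\left[\sqrt{1 - \frac{u^2}{c^2}}\right]_0^U \\ &= \gamma m_0 U^2 + \frac{m_0 c^2}{\gamma} - m_0 c^2 \\ &= \gamma m_0 c^2\left(\frac{U^2}{c^2} + \frac{1}{\gamma^2}\right) - m_0 c^2 \\ &= \gamma m_0 c^2 - m_0 c^2 \end{aligned}$$
where γ is evaluated at the final speed, $U$. Since the object started at rest (with a speed $u = 0$) the above integral corresponds to what we would call the kinetic energy of the object, with speed $U$. Writing the speed of the object simply as $u$ again, the kinetic energy is:
$$K = (\gamma - 1)\,m_0 c^2, \qquad \gamma = \frac{1}{\sqrt{1 - \frac{u^2}{c^2}}}$$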
This form for the relativistic kinetic energy of the object is not at all similar to the form that we obtained in classical physics. As the speed of the object approaches the speed of light, the γ factor approaches infinity, as does the kinetic energy. Thus, it would take an infinite amount of work to accelerate an object to the speed of light, and again, we see that it is impossible for anything with mass to ever reach the speed of light. The formula above, however, should always be correct, even in the non-relativistic limit, when $u \ll c$. We can approximate the gamma factor using the binomial expansion for the case where $|x| \ll 1$:
$$(1 + x)^n \approx 1 + nx + \frac{n(n-1)}{2}x^2 + \dots$$
So that, when $n = -\frac{1}{2}$ (and $x = -u^2/c^2$), the gamma factor is approximated by:
$$\gamma = \left(1 - \frac{u^2}{c^2}\right)^{-\frac{1}{2}} \approx 1 + \frac{1}{2}\frac{u^2}{c^2}$$
In this limit, the relativistic kinetic energy reduces to:
$$K = (\gamma - 1)\,m_0 c^2 \approx \left(1 + \frac{1}{2}\frac{u^2}{c^2} - 1\right) m_0 c^2 = \frac{1}{2} m_0 u^2$$
which is the classical definition of kinetic energy. The kinetic energy is also zero when the speed is zero.
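The non-relativistic limit can also be checked numerically. The sketch below compares the relativistic kinetic energy, $(\gamma - 1)m_0c^2$, with the classical value, $\frac{1}{2}m_0u^2$, for several speeds, in units where $m_0 = 1$ and $c = 1$. The function names and the list of speeds are assumptions made for illustration.

```python
import math


def gamma(u, c=1.0):
    """Gamma factor for an object moving at speed u (requires |u| < c)."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)


def kinetic_relativistic(m0, u, c=1.0):
    """Relativistic kinetic energy, K = (gamma - 1) m0 c^2."""
    return (gamma(u, c) - 1.0) * m0 * c**2


def kinetic_classical(m0, u):
    """Classical kinetic energy, K = (1/2) m0 u^2."""
    return 0.5 * m0 * u**2


if __name__ == "__main__":
    for u in (0.01, 0.1, 0.5, 0.9, 0.99):
        ratio = kinetic_relativistic(1.0, u) / kinetic_classical(1.0, u)
        print(f"u = {u:4.2f} c   K_rel / K_classical = {ratio:.4f}")
```

The ratio is close to 1 at small speeds and grows without bound as $u$ approaches $c$, consistent with the discussion above.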
The kinetic energy has two terms in it:
$$K = \gamma m_0 c^2 - m_0 c^2$$
The first term increases with speed and behaves as we would expect. The second term is constant, and depends only on the rest mass of the object (we call this term the rest mass energy). We can think of this in slightly different terms. Let us define the total energy, $E$, of the object as:
$$E = \gamma m_0 c^2 = K + m_0 c^2$$
so that the total energy is just the rest mass energy plus the kinetic energy. This highlights a key aspect of Special Relativity. An object will have energy, $E = m_0 c^2$, even when it is at rest. That energy, at rest, is called the rest mass energy, and corresponds to energy that an object has by virtue of having mass. This is, of course, Einstein’s famous equation:
$$E = mc^2 \qquad (\text{with } m = m_0 \text{ for an object at rest})$$
This equation implies that mass can be thought of as a form of energy. Nuclear reactors function by converting a small amount of mass of uranium atoms into energy (in the form of heat), that is then used to produce high pressure steam to rotate a turbine.
Einstein’s relation is often used to express the mass of subatomic particles in units of energy divided by $c^2$. For example, an electron has a mass of $m_e = 0.511\ \text{MeV}/c^2$ in these units.
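As a short worked example of this unit conversion, the sketch below converts a mass in kilograms to $\text{MeV}/c^2$ using $E = mc^2$. The constants are the standard SI and CODATA values; the function name is an assumption made for illustration.

```python
C = 299_792_458.0                 # speed of light in m/s (exact in SI)
JOULES_PER_EV = 1.602176634e-19   # one electron-volt in joules (exact in SI)
M_ELECTRON_KG = 9.1093837015e-31  # electron mass in kg (CODATA 2018 value)


def mass_kg_to_mev_per_c2(mass_kg):
    """Rest mass energy m c^2, converted from joules to MeV."""
    rest_energy_joules = mass_kg * C**2
    return rest_energy_joules / JOULES_PER_EV / 1.0e6


if __name__ == "__main__":
    print(f"electron mass: {mass_kg_to_mev_per_c2(M_ELECTRON_KG):.4f} MeV/c^2")
```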
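Finally, it is interesting to examine the relationship between the momentum and the energy of a relativistic object. Consider the quantity $p^2c^2$:
$$\begin{aligned} p^2 c^2 &= \gamma^2 m_0^2 u^2 c^2 = m_0^2 c^4\,\gamma^2\frac{u^2}{c^2} \\ &= m_0^2 c^4\left(\gamma^2 - 1\right) \\ &= \gamma^2 m_0^2 c^4 - m_0^2 c^4 \\ &= E^2 - m_0^2 c^4 \end{aligned}$$
where we used $\gamma^2 u^2/c^2 = \gamma^2 - 1$ and recognized that $\gamma^2 m_0^2 c^4$ is simply the energy, $E$, squared. This is generally called the “energy-momentum” relation and written:
$$E^2 = p^2 c^2 + m_0^2 c^4$$
An interesting consequence of this relationship is that particles with no mass will still have a momentum. For example, the photon, which is a particle of light and must thus have a mass of zero (or it could not move at the speed of light), will have a momentum given by:
$$p = \frac{E}{c}$$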
Thus, one can use light to impart momentum to something. This is how a solar sail, a proposed propulsion mechanism for space travel, operates.
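The energy-momentum relation can be verified numerically from the definitions $E = \gamma m_0 c^2$ and $p = \gamma m_0 u$. The sketch below works in units where $c = 1$; the function names and sample values are assumptions made for illustration.

```python
import math


def gamma(u, c=1.0):
    """Gamma factor for an object moving at speed u (requires |u| < c)."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)


def total_energy(m0, u, c=1.0):
    """Total energy, E = gamma m0 c^2."""
    return gamma(u, c) * m0 * c**2


def momentum(m0, u, c=1.0):
    """Relativistic momentum, p = gamma m0 u."""
    return gamma(u, c) * m0 * u


def photon_momentum(energy, c=1.0):
    """Momentum of a massless particle, p = E / c."""
    return energy / c


if __name__ == "__main__":
    m0, u = 2.0, 0.8
    E, p = total_energy(m0, u), momentum(m0, u)
    # E^2 - p^2 c^2 should equal m0^2 c^4 (here with c = 1).
    print(E**2 - p**2, m0**2)
    print(photon_momentum(3.0))
```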
23.9Closing remarks¶
In this chapter, we introduced the first hints of how the laws of physics become counter-intuitive and quite bizarre. One can wrap one’s head around Newton’s Second Law, $\vec F = m\vec a$, and develop some intuition as to how an object may behave. However, it is difficult to imagine how people age slower if they travel faster, and how cars become shorter when they are moving. Yet, as far as we can tell, this is the best way to describe the Universe around us.
This all goes back to our original statements about physics. The goal is to come up with rules that allow us to describe Nature. It’s nice when those rules make sense, but, unfortunately, that is not a requirement. It does appear that the rules that describe Nature do not make sense, at least not based on our common experience, living in a macroscopic world where speeds are much less than the speed of light. With Special Relativity, we introduced the modern framework for modelling dynamics. We have not introduced Quantum Mechanics, which describes how elementary particles behave.
Quantum Mechanics is even less intuitive than Special Relativity, as it implies that particles act as if they are in multiple places at the same time. Even worse, Quantum Mechanics requires us to abandon the concept of determinism that is critical in Classical Mechanics; in Quantum Mechanics, we can only ever determine probabilities. For example, we can only determine the probability that a particle will be at a particular location at a particular time, but we cannot use kinematics and dynamics to predict where it will be at some time based on the forces acting upon it.
If you decide to pursue further studies in physics, you will get to learn more about these theories, which are quite marvellous. It should not bother you that physics is not intuitive, as that is not the purpose. The exciting part of physics is that, even if Nature behaves in an exquisitely weird way, it does appear that this can all be described with a rather limited set of mathematical equations. One can argue that there is beauty in the fact that succinct mathematics can describe a large number of seemingly unrelated phenomena, as Newton’s Universal Theory of Gravity was able to describe both the motion of a falling apple and the orbit of the moon.
23.10Summary¶
The Theory of Special Relativity is based on Einstein’s two postulates:
- The laws of physics are the same in all inertial reference frames. There is no experiment that can be performed to determine whether one is at rest or moving with constant velocity.
- The speed of light propagating in vacuum is the same in all inertial reference frames. Any observer in an inertial frame of reference, regardless of their velocity, will measure that light has a speed of $c$ when it propagates in vacuum.
These postulates are required in order for the equations from electromagnetism to be valid in all inertial frames of reference. However, they lead to very counter-intuitive results. For example, if two events, $A$ and $B$, are simultaneous in one frame of reference, an observer in a different frame of reference will observe event $A$ to happen earlier/later than event $B$ (earlier or later will depend on the direction of motion of the moving observer).
The Theory of Special Relativity allows us to relate observations made in one inertial frame of reference, $S$, to observations made in a different inertial frame of reference, $S'$, that is moving with constant velocity, $\vec v$, relative to $S$. We always choose to define the $x$ axes in the $S$ and $S'$ frames of reference so that they are both co-linear with the velocity of $S'$, $\vec v$, which is defined to be in the positive $x$ direction in frame $S$. Furthermore, we assume that the origins of both frames of reference coincided at time $t = t' = 0$.
We define the gamma factor, γ, based on the speed, $v$, of $S'$ relative to $S$:
$$\gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}$$
The gamma factor is always greater than or equal to 1. If a time interval, $\Delta t$, is measured in frame $S$, then a “dilated” time interval, $\Delta t'$, will be measured in frame $S'$:
$$\Delta t' = \gamma \Delta t \geq \Delta t$$
since $\gamma \geq 1$. We call the time that is measured in a frame of reference that we consider “at rest” the “proper time” in that frame of reference. For example, a muon decays in about $2.2\ \mu\text{s}$ (its mean lifetime) when at rest. If a muon moves at high speed, in the frame of reference where the muon is moving, it will take longer (time dilation) for the muon to decay. The time $2.2\ \mu\text{s}$ is the “proper time” for the muon decay (since it is measured when the muon is at rest). As a consequence of time dilation, observers in different frames of reference will measure different lengths due to “length contraction”. If an object has a “proper length”, $L$, in a frame of reference, $S$, that is at rest relative to the object, the object will have a contracted length, $L'$, in a reference frame, $S'$, moving with speed, $v$, relative to $S$:
$$L' = \frac{L}{\gamma}$$
Note that only the dimension of the object that is co-linear with the velocity vector, $\vec v$, is contracted.
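The time dilation and length contraction formulas can be tried out on the muon example. The sketch below assumes a muon speed of $0.99c$ and a proper lifetime of $2.2\ \mu\text{s}$; the speed, the function names, and the 1000 m proper length are assumptions chosen for illustration, not values taken from the text.

```python
import math

C = 299_792_458.0              # speed of light in m/s
MUON_PROPER_LIFETIME = 2.2e-6  # approximate muon lifetime at rest, in seconds


def gamma(v, c=C):
    """Gamma factor for a relative speed v between frames (requires |v| < c)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def dilated_time(proper_time, v, c=C):
    """Time interval measured in a frame where the clock moves at speed v."""
    return gamma(v, c) * proper_time


def contracted_length(proper_length, v, c=C):
    """Length, along the direction of motion, of an object moving at speed v."""
    return proper_length / gamma(v, c)


if __name__ == "__main__":
    v_muon = 0.99 * C  # assumed speed of the muon in the lab frame
    lab_lifetime = dilated_time(MUON_PROPER_LIFETIME, v_muon)
    print(f"gamma        = {gamma(v_muon):.3f}")
    print(f"lab lifetime = {lab_lifetime * 1e6:.2f} microseconds")
    print(f"lab distance = {v_muon * lab_lifetime:.0f} m")
    print(f"1000 m seen by the muon = {contracted_length(1000.0, v_muon):.1f} m")
```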
We also noted that Special Relativity is intimately connected to electromagnetism. In particular, we described how what we model as a magnetic force in one frame of reference might be modelled as an electric force in a different frame of reference.
In order to describe the motion of objects, we found that we need to define a four-dimensional space-time, where positions in space-time are labelled by 4 “coordinates”, $(x, y, z, t)$, instead of the usual 3 (space) position coordinates. This is a result of the fact that time is no longer absolute and depends on the frame of reference (e.g. time dilation).
In space-time, we think in terms of events that occur at specific locations in space and instants in time. We can visualise space-time using “space-time diagrams”, where one axis corresponds to space ($x$), and the other axis corresponds to time ($ct$). The path of an object through space-time is called its “world line”.
For a given event in space-time, we can define past and future “light cones”. Only events in the past light-cone could have had a causal effect on the event. Similarly, only events in the future light-cone can ever be influenced by that event. Events that can be causally connected (within each other’s light cones) are said to be “time-like”. Events that are outside of each other’s light cones are said to be “space-like”. If two events are time-like, all observers will agree on the order in which the events happened, preserving the notion of causality. Different observers can disagree on the order in which space-like events occurred.
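The Lorentz transformations allow us to convert the coordinates of events in one frame of reference, $S$, to those in a frame, $S'$, moving with constant speed, $v$, relative to $S$:
$$x' = \gamma(x - vt), \qquad y' = y, \qquad z' = z, \qquad t' = \gamma\left(t - \frac{vx}{c^2}\right)$$
and the inverse relations are easily found:
$$x = \gamma(x' + vt'), \qquad y = y', \qquad z = z', \qquad t = \gamma\left(t' + \frac{vx'}{c^2}\right)$$
Certain quantities, which are measured to be the same in all frames of reference, are said to be “Lorentz invariant”. In particular, we can define the space-time interval, $s^2$, between two events in space-time as:
$$s^2 = c^2 \Delta t^2 - \Delta x^2 - \Delta y^2 - \Delta z^2$$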
One can think of this as a sort of “distance” in space-time, that does not depend on the frame of reference.
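If an object has a velocity vector, $\vec u = (u_x, u_y, u_z)$, as measured in frame of reference $S$, then its velocity, $\vec u'$, in a frame, $S'$, moving with speed, $v$, relative to $S$, is given by:
$$u_x' = \frac{u_x - v}{1 - \frac{v u_x}{c^2}}, \qquad u_y' = \frac{u_y}{\gamma\left(1 - \frac{v u_x}{c^2}\right)}, \qquad u_z' = \frac{u_z}{\gamma\left(1 - \frac{v u_x}{c^2}\right)}$$
and the reverse transformations are given by:
$$u_x = \frac{u_x' + v}{1 + \frac{v u_x'}{c^2}}, \qquad u_y = \frac{u_y'}{\gamma\left(1 + \frac{v u_x'}{c^2}\right)}, \qquad u_z = \frac{u_z'}{\gamma\left(1 + \frac{v u_x'}{c^2}\right)}$$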
In order for momentum and energy to be conserved in Special Relativity, these need to be redefined. If a particle with rest mass, $m_0$, has a velocity, $\vec u$, in an inertial frame of reference, its relativistic momentum, $\vec p$, is defined to be:
$$\vec p = \gamma m_0 \vec u$$
where the gamma factor is evaluated using the speed, $u$:
$$\gamma = \frac{1}{\sqrt{1 - \frac{u^2}{c^2}}}$$
This relativistic definition of momentum is equivalent to the classical definition when $u \ll c$. We can think of relativistic momentum in the same way as classical momentum, if we model the mass of the object as increasing with its speed:
$$m(u) = \gamma m_0$$
where $m_0$ is the mass of the object measured when the object is at rest (its “rest mass”). An object with a rest mass can never reach the speed of light, as this would correspond to it having infinite momentum (or infinite mass).
With the relativistic definition of momentum, one can still use Newton’s Second Law in the form:
$$\vec F = \frac{d\vec p}{dt}$$
We define the total energy, $E$, of an object as:
$$E = K + m_0 c^2$$
which has a contribution from its kinetic energy, $K$, and from its mass (the second term). The energy that an object has by virtue of having a mass is called “rest mass energy”, which implies that mass and energy can really be thought of as the same thing; one can convert mass into energy and vice versa (as in a nuclear reactor).
The kinetic energy of an object moving with speed, $u$, is given by:
$$K = (\gamma - 1)\,m_0 c^2$$
where the gamma factor is obtained using the speed, $u$. This relativistic definition of kinetic energy is equivalent to the classical definition when $u \ll c$. The total energy of a particle can also be written as:
$$E = \gamma m_0 c^2$$
Since energy and mass are simply related by a constant, one can use units of energy to describe the mass of a particle. It is common in particle physics to express the mass of particles in units of electron-volts divided by $c^2$ (for example, $\text{MeV}/c^2$).
Finally, we saw that the relativistic momentum and energy of an object are related:
$$E^2 = p^2 c^2 + m_0^2 c^4$$
In particular, particles of light, which have no mass but have kinetic energy, have non-zero momentum:
$$p = \frac{E}{c}$$
23.11Thinking about the material¶
23.12Sample problems and solutions¶
23.12.1Problems¶
23.12.2Solutions¶
Solution 23.1
In order to determine the amount of energy released in each reaction, we need to determine the difference in mass between the two sides of the equation:
On the left-hand side, the total mass is:
whereas on the right-hand side, the total mass is:
Thus, the total energy released in each reaction is given by:
where we showed the answer in both and . Although it may not seem like that much energy per reaction, keep in mind that there are of order reactions per second in the Sun, corresponding to a power output of order , enough to keep us warm in the summer.
Solution 23.2
- a. From the total energy, we can calculate the gamma factor, which will give us the velocity of the proton (in the reference frame of the scientist):
- b. In the frame of the lab, when one second goes by, the proton will travel a distance:
- c. In order to find out how far the proton travels in the lab when one second of proper time goes by in the proton’s frame of reference, we need to determine how much time went by in the lab’s frame of reference.
The gamma factor for the proton can be obtained from the speed that we determined in part a), or from the total energy directly:
Thus, when $1\ \text{s}$ elapses in the proton’s frame of reference, a dilated time, $\gamma \times 1\ \text{s}$, elapses in the lab frame of reference:
In the lab frame, the proton will travel a distance:
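The three steps of this solution can be sketched in Python. The problem statement and its numerical values are not reproduced here, so the total energy used below (3000 MeV) is a purely hypothetical placeholder, and the function names are assumptions; only the proton rest energy of about 938.272 MeV is a standard value.

```python
import math

C = 299_792_458.0                # speed of light in m/s
PROTON_REST_ENERGY_MEV = 938.272  # proton rest mass energy, m_p c^2, in MeV


def gamma_from_total_energy(total_energy_mev, rest_energy_mev=PROTON_REST_ENERGY_MEV):
    """Gamma factor from E = gamma m0 c^2."""
    return total_energy_mev / rest_energy_mev


def speed_from_gamma(g, c=C):
    """Speed from gamma = 1 / sqrt(1 - u^2/c^2), solved for u."""
    return c * math.sqrt(1.0 - 1.0 / g**2)


if __name__ == "__main__":
    total_energy_mev = 3000.0  # HYPOTHETICAL value, not taken from the problem
    g = gamma_from_total_energy(total_energy_mev)  # part a: gamma factor
    u = speed_from_gamma(g)                        # part a: speed in the lab
    distance_lab_second = u * 1.0                  # part b: 1 s of lab time
    distance_proper_second = u * g * 1.0           # part c: 1 s of proper time
    print(f"gamma = {g:.4f}, u/c = {u / C:.5f}")
    print(f"distance in 1 s of lab time:    {distance_lab_second:.4e} m")
    print(f"distance in 1 s of proper time: {distance_proper_second:.4e} m")
```

The distance in part c is larger than in part b by exactly the factor γ, because one second of proper time corresponds to γ seconds in the lab frame.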