Appendix A - Calculus
A.1 Overview
This appendix gives a very brief introduction to calculus with a focus on the tools needed in physics.
A.2 Functions of real numbers
In calculus, we work with functions and their properties, rather than with variables as we do in algebra. We are usually concerned with describing functions in terms of their slope, the areas (or volumes) that they enclose, their curvature, their roots (where they have a value of zero) and their continuity. The functions that we will examine are a mapping from one or more independent real numbers to one real number. By convention, we will use $x$, $y$, and $z$ to indicate independent variables, and $f$ and $g$ to denote functions. For example, if we say:

$$f(x) = x^2$$
$$f(2) = 4$$

we mean that $f$ is a function that can be evaluated for any real number, $x$, and the result of evaluating the function is to square the number $x$. In the second line, we evaluated the function with $x=2$. Similarly, we can have a function, $h(x, y)$, of multiple variables.
We can easily visualize a function of 1 variable by plotting it, as in Figure A.1.
Plotting a function of 2 variables is a little trickier, since we need to do it in three dimensions (one axis for $x$, one axis for $y$, and one axis for the value of the function). Figure A.2 shows an example of plotting a function of 2 variables.
Unfortunately, it becomes difficult to visualize functions of more than 2 variables, although one can usually look at projections of those functions to try and visualize some of the features (for example, contour maps are 2D projections of 3D surfaces, as shown in the xy plane of Figure A.2). When you encounter a function, it is good practice to try and visualize it if you can. For example, ask yourself the following questions:
- Does the function have one or more maxima and/or minima?
- Does the function cross zero?
- Is the function continuous everywhere?
- Is the function always defined for any value of the independent variables?
A.3 Derivatives
Consider the function $f(x)=x^2$ that is plotted in Figure A.1. For any value of $x$, we can define the slope of the function as the “steepness of the curve”. For values of $x>0$ the function increases as $x$ increases, so we say that the slope is positive. For values of $x<0$, the function decreases as $x$ increases, so we say that the slope is negative. A synonym for the word slope is “derivative”, which is the word that we prefer to use in calculus. The derivative of a function $f(x)$ is given the symbol $f'(x)$ to indicate that we are referring to the slope of $f(x)$ when plotted as a function of $x$.
We need to specify which variable we are taking the derivative with respect to when the function has more than one variable but only one of them should be considered independent. For example, the function $f(x)=ax^2$ will have different values if $x$ and $a$ are changed, so we have to be precise in specifying that we are taking the derivative with respect to $x$. The following notations are equivalent ways to say that we are taking the derivative of $f(x)$ with respect to $x$:

$$\frac{df}{dx} = \frac{d}{dx}f(x) = f'(x)$$
The notation with the prime ($f'(x)$) can be useful to indicate that the derivative itself is also a function of $x$.
The slope (derivative) of a function tells us how rapidly the value of the function is changing when the independent variable is changing. For $f(x)=x^2$, as $x$ gets more and more positive, the function gets steeper and steeper; the derivative is thus increasing with $x$. The sign of the derivative tells us if the function is increasing or decreasing, whereas its absolute value tells us how quickly the function is changing (how steep it is).
We can approximate the derivative by evaluating how much $f(x)$ changes when $x$ changes by a small amount, say, $\Delta x$. In the limit of $\Delta x \to 0$, we get the derivative. In fact, this is the formal definition of the derivative:

$$\frac{df}{dx} = \lim_{\Delta x \to 0}\frac{\Delta f}{\Delta x} = \lim_{\Delta x \to 0}\frac{f(x+\Delta x)-f(x)}{\Delta x}$$

where $\Delta f$ is the small change in $f(x)$ that corresponds to the small change, $\Delta x$, in $x$. This makes the notation for the derivative more clear: $dx$ is $\Delta x$ in the limit where $\Delta x \to 0$, and $df$ is $\Delta f$, in the same limit of $\Delta x \to 0$.
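To make the limit concrete, here is a minimal numerical sketch (in Python, not part of the original text): it evaluates the difference quotient for the example function $f(x)=x^2$ at a chosen point and shows it approaching the exact slope as $\Delta x$ shrinks.

```python
# Numerical illustration of the limit definition of the derivative:
# df/dx ≈ (f(x + dx) - f(x)) / dx, which approaches the true slope as dx -> 0.

def f(x):
    return x**2  # example function; its exact derivative is 2x

x = 1.5
for dx in [1.0, 0.1, 0.01, 0.001]:
    slope = (f(x + dx) - f(x)) / dx
    print(f"dx = {dx:>6}: slope ≈ {slope:.4f}")

# As dx shrinks, the estimates approach the exact value f'(1.5) = 3.0.
```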
As an example, let us determine the function $f'(x)$ that is the derivative of $f(x)=x^2$. We start by calculating $\Delta f$:

$$\Delta f = f(x+\Delta x) - f(x) = (x+\Delta x)^2 - x^2 = 2x\Delta x + \Delta x^2$$

We now calculate $\frac{\Delta f}{\Delta x}$:

$$\frac{\Delta f}{\Delta x} = \frac{2x\Delta x + \Delta x^2}{\Delta x} = 2x + \Delta x$$

and take the limit $\Delta x \to 0$:

$$f'(x) = \lim_{\Delta x\to 0}\frac{\Delta f}{\Delta x} = \lim_{\Delta x\to 0}(2x+\Delta x) = 2x$$
We have thus found that the function, $f'(x)=2x$, is the derivative of the function $f(x)=x^2$. This is illustrated in Figure A.3. Note that:
- For $x>0$, $f'(x)$ is positive and increasing with increasing $x$, just as we described earlier (the function is increasing and getting steeper).
- For $x<0$, $f'(x)$ is negative and decreasing in magnitude as $x$ increases. Thus $f(x)$ decreases and gets less steep as $x$ increases.
- At $x=0$, $f'(x)=0$, indicating that, at the origin, the function is (momentarily) flat.
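As a quick numerical cross-check of these observations (an illustration added here, not from the text), the snippet below compares a finite-difference estimate of the slope of $f(x)=x^2$ with $2x$ at a few points on either side of the origin.

```python
# Check that the slope of f(x) = x^2 matches f'(x) = 2x at a few points,
# covering the three regimes noted above (x < 0, x = 0, x > 0).

def f(x):
    return x**2

def numerical_derivative(g, x, dx=1e-6):
    # symmetric difference quotient: a good approximation of the limit definition
    return (g(x + dx) - g(x - dx)) / (2 * dx)

for x in [-2.0, -0.5, 0.0, 0.5, 2.0]:
    print(f"x = {x:+.1f}: numerical ≈ {numerical_derivative(f, x):+.4f}, exact 2x = {2*x:+.4f}")
```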
A.3.1 Common derivatives and properties
It is beyond the scope of this document to derive the functional form of the derivative for any function using equation (A.4). Table A.1 below gives the derivatives for common functions. In all cases, $x$ is the independent variable, and all other variables should be thought of as constants:
Table A.1: Common derivatives of functions.

| Function, $f(x)$ | Derivative, $f'(x)$ |
|---|---|
| $f(x)=a$ | $f'(x)=0$ |
| $f(x)=x^n$ | $f'(x)=nx^{n-1}$ |
| $f(x)=\sin(x)$ | $f'(x)=\cos(x)$ |
| $f(x)=\cos(x)$ | $f'(x)=-\sin(x)$ |
| $f(x)=\tan(x)$ | $f'(x)=\dfrac{1}{\cos^2(x)}$ |
| $f(x)=e^x$ | $f'(x)=e^x$ |
| $f(x)=\ln(x)$ | $f'(x)=\dfrac{1}{x}$ |
If two functions of one variable, $f(x)$ and $g(x)$, are combined into a third function, $h(x)$, then there are simple rules for finding the derivative, $h'(x)$, based on the derivatives $f'(x)$ and $g'(x)$. These are summarized in Table A.2 below.
Table A.2: Derivatives of combined functions.

| Function, $h(x)$ | Derivative, $h'(x)$ |
|---|---|
| $h(x)=f(x)g(x)$ (The product rule) | $h'(x)=f'(x)g(x)+f(x)g'(x)$ |
| $h(x)=\dfrac{f(x)}{g(x)}$ (The quotient rule) | $h'(x)=\dfrac{f'(x)g(x)-f(x)g'(x)}{g(x)^2}$ |
| $h(x)=f(g(x))$ (The Chain Rule) | $h'(x)=f'(g(x))g'(x)$ |
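If you want to convince yourself of these rules numerically, the following sketch (an added illustration; the functions $\sin(x)$ and $x^3+1$ are arbitrary choices) spot-checks the product rule and the Chain Rule against a finite-difference derivative.

```python
# Spot-check of the product rule and Chain Rule from Table A.2.
import math

def numerical_derivative(g, x, dx=1e-6):
    return (g(x + dx) - g(x - dx)) / (2 * dx)

f = math.sin                 # f(x) = sin(x),  f'(x) = cos(x)
g = lambda x: x**3 + 1       # g(x) = x^3 + 1, g'(x) = 3x^2

x = 0.7
product = lambda t: f(t) * g(t)
chain = lambda t: f(g(t))

# Product rule: h'(x) = f'(x) g(x) + f(x) g'(x)
print(numerical_derivative(product, x), math.cos(x) * g(x) + math.sin(x) * 3 * x**2)
# Chain Rule: h'(x) = f'(g(x)) g'(x)
print(numerical_derivative(chain, x), math.cos(g(x)) * 3 * x**2)
```

Each printed pair should agree to several decimal places.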
A.3.2 Partial derivatives and gradients
So far, we have only looked at the derivative of a function of a single independent variable and used it to quantify how much the function changes when the independent variable changes. We can proceed analogously for a function of multiple variables, $f(x,y)$, by quantifying how much the function changes along the direction associated with a particular variable. This is illustrated in Figure A.4 for the function $f(x,y)=x^2-y^2$, which looks somewhat like a saddle.
Suppose that we wish to determine the derivative of the function $f(x,y)$ at $x=-2$ and $y=-2$. In this case, it does not make sense to simply determine the “derivative”, but rather, we must specify in which direction we want the derivative. That is, we need to specify in which direction we are interested in quantifying the rate of change of the function.
One possibility is to quantify the rate of change in the $x$ direction. The solid line in Figure A.4 shows the part of the function surface where $y$ is fixed at $-2$, that is, the function evaluated as $f(x,-2)$. The point on the figure shows the value of the function when $x=-2$ and $y=-2$. By looking at the solid line at that point, we can see that as $x$ increases, the value of the function is gently decreasing. The derivative of $f(x,y)$ with respect to $x$ when $y$ is held constant, evaluated at $x=-2$ and $y=-2$, is thus negative. Rather than saying “The derivative of $f(x,y)$ with respect to $x$ when $y$ is held constant” we say “The partial derivative of $f(x,y)$ with respect to $x$”.
Since the partial derivative is different from the ordinary derivative (as it implies that we are holding independent variables fixed), we give it a different symbol: we use $\partial$ instead of $d$:

$$\frac{\partial f}{\partial x}$$
Calculating the partial derivative is very easy, as we just treat all variables as constants except for the variable with respect to which we are differentiating[1]. For the function $f(x,y)=x^2-y^2$, we have:

$$\frac{\partial f}{\partial x} = 2x \qquad \frac{\partial f}{\partial y} = -2y$$
At $x=-2$, the partial derivative of $f(x,y)$ with respect to $x$ is indeed negative, consistent with our observation that, along the solid line, at the point shown, the function is decreasing.
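The same conclusion can be checked numerically. The sketch below (added for illustration, and assuming the saddle-shaped example function $f(x,y)=x^2-y^2$ used above) estimates each partial derivative by nudging one variable while holding the other fixed.

```python
# Partial derivatives evaluated numerically: vary one variable while holding
# the other fixed. The saddle-shaped function f(x, y) = x^2 - y^2 is assumed.

def f(x, y):
    return x**2 - y**2

def partial_x(g, x, y, dx=1e-6):
    return (g(x + dx, y) - g(x - dx, y)) / (2 * dx)   # y held constant

def partial_y(g, x, y, dy=1e-6):
    return (g(x, y + dy) - g(x, y - dy)) / (2 * dy)   # x held constant

x, y = -2.0, -2.0
print(partial_x(f, x, y))   # ≈ 2x = -4 (negative, as observed along the solid line)
print(partial_y(f, x, y))   # ≈ -2y = +4
```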
A function will have as many partial derivatives as it has independent variables. Also note that, just like a normal derivative, a partial derivative is still a function. The partial derivative with respect to a variable tells us how steep the function is in the direction in which that variable increases and whether it is increasing or decreasing.
Since the partial derivatives tell us how the function changes in a particular direction, we can use them to find the direction in which the function changes the most rapidly. For example, suppose that the surface from Figure A.4 corresponds to a real physical surface and that we place a ball at a point on that surface. We wish to know in which direction the ball will roll. The direction that it will roll in is the opposite of the direction in which $f(x,y)$ increases the most rapidly (i.e. it will roll in the direction in which $f(x,y)$ decreases the most rapidly). The direction in which the function increases the most rapidly is called the “gradient” and denoted by $\nabla f(x,y)$.
Since the gradient is a direction, it cannot be represented by a single number. Rather, we use a “vector” to indicate this direction. Since $f(x,y)$ has two independent variables, the gradient will be a vector with two components. The components of the gradient are given by the partial derivatives:

$$\nabla f(x,y) = \frac{\partial f}{\partial x}\hat x + \frac{\partial f}{\partial y}\hat y$$
where $\hat x$ and $\hat y$ are the unit vectors in the $x$ and $y$ directions, respectively (sometimes, the unit vectors are denoted $\hat i$ and $\hat j$). The direction of the gradient tells us in which direction the function increases the fastest, and the magnitude of the gradient tells us how much the function increases in that direction.
The gradient is itself a function, but it is not a real function (in the sense of a real number), since it evaluates to a vector. It is a mapping from real numbers to a vector. As you take more advanced calculus courses, you will eventually encounter “vector calculus”, which is just the calculus for functions of multiple variables to which you were just introduced. The key point to remember here is that the gradient can be used to find the vector that points in the direction of maximal increase of the corresponding multi-variate function. This is precisely the quantity that we need in physics to determine in which direction a ball will roll when placed on a surface (it will roll in the direction opposite to the gradient vector).
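A short numerical sketch of this idea (added here, again assuming the example function $f(x,y)=x^2-y^2$): the gradient is assembled from the two partial derivatives, and its negative gives the direction in which the ball would roll.

```python
# The gradient collects the partial derivatives into a vector that points in
# the direction of steepest increase. Re-using f(x, y) = x^2 - y^2 from above.

def f(x, y):
    return x**2 - y**2

def gradient(g, x, y, h=1e-6):
    dfdx = (g(x + h, y) - g(x - h, y)) / (2 * h)
    dfdy = (g(x, y + h) - g(x, y - h)) / (2 * h)
    return (dfdx, dfdy)

grad = gradient(f, -2.0, -2.0)
print("gradient:", grad)                                   # ≈ (-4, 4)
print("a ball placed here rolls toward:", (-grad[0], -grad[1]))
```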
A.3.3 Common uses of derivatives in physics
The simplest case of using a derivative is to describe the speed of an object. If an object covers a distance $\Delta x$ in a period of time $\Delta t$, its “average speed”, $v_{avg}$, is defined as the distance covered by the object divided by the amount of time it took to cover that distance:

$$v_{avg} = \frac{\Delta x}{\Delta t}$$
If the object changes speed (for example it is slowing down) over the distance $\Delta x$, we can still define its “instantaneous speed”, $v$, by measuring the amount of time, $\Delta t$, that it takes the object to cover a very small distance, $\Delta x$. The instantaneous speed is defined in the limit where $\Delta x \to 0$:

$$v = \lim_{\Delta t \to 0}\frac{\Delta x}{\Delta t} = \frac{dx}{dt}$$

which is precisely the derivative of $x(t)$ with respect to $t$. $x(t)$ is a function that gives the position, $x$, of the object along some axis as a function of time, $t$. The speed of the object is thus the rate of change of its position.
Similarly, if the speed is changing with time, then we can define the “acceleration”, $a$, of an object as the rate of change of its speed:

$$a = \frac{dv}{dt}$$
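As an added illustration (the position function below is made up, corresponding to constant acceleration), the speed and acceleration can be estimated by applying a finite-difference derivative to $x(t)$ once and then twice.

```python
# Speed and acceleration as successive derivatives of position.
# The position function x(t) below is a made-up example (uniform acceleration).

def x(t):
    return 2.0 + 3.0 * t + 0.5 * 1.5 * t**2   # x0 + v0*t + (1/2)*a*t^2, with a = 1.5

def derivative(g, t, dt=1e-6):
    return (g(t + dt) - g(t - dt)) / (2 * dt)

t = 4.0
v = derivative(x, t)                                              # speed dx/dt
a = derivative(lambda s: derivative(x, s, dt=1e-4), t, dt=1e-4)   # acceleration dv/dt
print(v, a)   # expect v = 3.0 + 1.5*t = 9.0 and a ≈ 1.5
```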
A.4 Anti-derivatives and integrals
In the previous section, we were concerned with determining the derivative of a function $f(x)$. The derivative is useful because it tells us how the function $f(x)$ varies as a function of $x$. In physics, we often know how a function varies, but we do not know the actual function. In other words, we often have the opposite problem: we are given the derivative of a function, and wish to determine the actual function. For this case, we will limit our discussion to functions of a single independent variable.
Suppose that we are given a function $f(x)$ and we know that this is the derivative of some other function, $F(x)$, which we do not know. We call $F(x)$ the anti-derivative of $f(x)$. The anti-derivative of a function $f(x)$, written $F(x)$, thus satisfies the property:

$$\frac{dF}{dx} = f(x)$$
Since we have a symbol for indicating that we take the derivative with respect to $x$ (namely, $\frac{d}{dx}$), we also have a symbol, $\int \,dx$, for indicating that we take the anti-derivative with respect to $x$:

$$F(x) = \int f(x)\,dx$$

Earlier, we justified the symbol $\frac{df}{dx}$ for the derivative by pointing out that it is like $\frac{\Delta f}{\Delta x}$ but for the case when $\Delta x \to 0$. Similarly, we will justify the anti-derivative sign, $\int f(x)\,dx$, by showing that it is related to a sum of $f(x)\Delta x$, in the limit $\Delta x \to 0$. The $\int$ sign looks like an “S” for sum.
While it is possible to exactly determine the derivative of a function $f(x)$, the anti-derivative can only be determined up to a constant. Consider for example a different function, $\tilde F(x)=F(x)+C$, where $C$ is a constant. The derivative of $\tilde F(x)$ with respect to $x$ is given by:

$$\frac{d\tilde F}{dx} = \frac{d}{dx}\big(F(x)+C\big) = \frac{dF}{dx} + 0 = f(x)$$

Hence, the function $\tilde F(x)=F(x)+C$ is also an anti-derivative of $f(x)$. The constant $C$ can often be determined using additional information (sometimes called “initial conditions”). Recall the function, $f(x)=x^2$, shown in Figure A.3 (left panel). If you imagine shifting the whole function up or down, the derivative would not change. In other words, if the origin of the axes were not drawn on the left panel, you would still be able to determine the derivative of the function (how steep it is). Adding a constant, $C$, to a function is exactly the same as shifting the function up or down, which does not change its derivative. Thus, when you know the derivative, you cannot know the value of $C$, unless you are also told that the function must go through a specific point (a so-called initial condition).
In order to determine the derivative of a function, we used equation (A.4). We now need to derive an equivalent prescription for determining the anti-derivative. Suppose that we have the two pieces of information required to determine $F(x)$ completely, namely:

- the function $f(x)$ (its derivative).
- the condition that $F(x)$ must pass through a specific point, $F(x_0)=F_0$.
The procedure for determining the anti-derivative is illustrated in Figure A.5. We start by drawing the point that we know the function $F(x)$ must go through, $(x_0, F_0)$. We then choose a value of $\Delta x$ and use the derivative, $f(x)$, to calculate $\Delta F$, the amount by which $F(x)$ changes when $x$ changes by $\Delta x$. Using the derivative evaluated at $x_0$, we have:

$$\Delta F \approx f(x_0)\Delta x$$

We can then estimate the value of the function at the next point, $x_1=x_0+\Delta x$, as illustrated by the black arrow in Figure A.5:

$$F(x_1) = F(x_0+\Delta x) \approx F(x_0) + \Delta F = F(x_0) + f(x_0)\Delta x$$

Now that we have determined the value of the function at $x_1$, we can repeat the procedure to determine the value of the function at the next point, $x_2=x_1+\Delta x$. Again, we use the derivative evaluated at $x_1$, $f(x_1)$, to determine $\Delta F$, and add that to $F(x_1)$ to get $F(x_2)$, as illustrated by the grey arrow in Figure A.5:

$$F(x_2) \approx F(x_1) + f(x_1)\Delta x = F(x_0) + f(x_0)\Delta x + f(x_1)\Delta x$$

Using the summation notation, we can generalize the result and write the function evaluated at any point, $x_N=x_0+N\Delta x$:

$$F(x_N) \approx F(x_0) + \sum_{i=0}^{N-1} f(x_i)\Delta x$$

The result above will become exactly correct in the limit $\Delta x \to 0$:

$$F(x_N) = F(x_0) + \lim_{\Delta x \to 0}\sum_{i=0}^{N-1} f(x_i)\Delta x$$
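This step-by-step procedure is easy to carry out numerically. The sketch below (an added illustration; the choice $f(x)=\cos(x)$, whose anti-derivative is $\sin(x)+C$, is arbitrary) accumulates $f(x_i)\Delta x$ starting from a known point, exactly as described above.

```python
import math

# Step-by-step construction of an anti-derivative: start from a known point
# (x0, F0) and repeatedly add Delta F ≈ f(x_i) * dx.

def f(x):
    return math.cos(x)   # example derivative; one anti-derivative is F(x) = sin(x) + C

x0, F0 = 0.0, 0.0        # the point that F(x) is known to pass through
dx = 0.001
N = 1000                 # build F(x) out to x_N = x0 + N*dx = 1.0

x, F = x0, F0
for i in range(N):
    F += f(x) * dx       # Delta F, the arrows in Figure A.5
    x += dx

print(F, math.sin(1.0))  # the two values agree to about 3 decimal places
```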
Let us take a closer look at the sum. Each term in the sum is of the form $f(x_i)\Delta x$, and is illustrated in Figure A.6 for the same case as in Figure A.5 (that is, Figure A.6 shows the function $f(x)$ that we know, and Figure A.5 shows the function $F(x)$ that we are trying to find).

As you can see, each term in the sum corresponds to the area of a rectangle between the function $f(x)$ and the $x$ axis (with a piece missing). In the limit where $\Delta x \to 0$, the missing pieces (shown by the hashed areas in Figure A.6) will vanish and $f(x_i)\Delta x$ will become exactly the area between $f(x)$ and the $x$ axis over a length $\Delta x$. The sum of the rectangular areas will thus approach the area between $f(x)$ and the $x$ axis between $x_0$ and $x_N$:

$$\lim_{\Delta x \to 0}\sum_{i=0}^{N-1} f(x_i)\Delta x = \text{Area between } f(x) \text{ and the } x \text{ axis from } x_0 \text{ to } x_N$$
Re-arranging equation (A.29) gives us a prescription for determining the anti-derivative:

$$F(x_N) - F(x_0) = \lim_{\Delta x \to 0}\sum_{i=0}^{N-1} f(x_i)\Delta x$$

We see that if we determine the area between $f(x)$ and the $x$ axis from $x_0$ to $x_N$, we can obtain the difference between the anti-derivative evaluated at those two points, $F(x_N)-F(x_0)$.
The difference between the anti-derivative, $F(x)$, evaluated at two different values of $x$ is called the integral of $f(x)$ and has the following notation:

$$\int_{x_0}^{x_N} f(x)\,dx = F(x_N) - F(x_0)$$

As you can see, the integral has labels that specify the range over which we calculate the area between $f(x)$ and the $x$ axis. A common notation to express the difference $F(x_N)-F(x_0)$ is to use brackets:

$$\int_{x_0}^{x_N} f(x)\,dx = \Big[F(x)\Big]_{x_0}^{x_N} = F(x_N) - F(x_0)$$

Recall that we wrote the anti-derivative with the same symbol earlier:

$$F(x) = \int f(x)\,dx$$
The $\int$ symbol without the limits is called the indefinite integral. You can also see that when you take the (definite) integral (i.e. the difference between $F(x)$ evaluated at two points), any constant that is added to $F(x)$ will cancel. Physical quantities are always based on definite integrals, so when we write the constant $C$ it is primarily for completeness and to emphasize that we have an indefinite integral.
As an example, let us determine the integral of $f(x)=2x$ between $x_0$ and $x_N$, as well as the indefinite integral of $f(x)=2x$, which is the case that we illustrated in Figures A.5 and A.6. Using equation (A.32), we have:

$$\int_{x_0}^{x_N} 2x\,dx = \lim_{\Delta x \to 0}\sum_{i=0}^{N-1} f(x_i)\Delta x$$

where we have:

$$\Delta x = \frac{x_N - x_0}{N} \qquad x_i = x_0 + i\Delta x$$

Note that $N$ is the number of times we have $\Delta x$ in the interval between $x_0$ and $x_N$. Thus, taking the limit of $\Delta x \to 0$ is the same as taking the limit $N \to \infty$. Let us illustrate the sum for the case where $N=4$, and thus when $\Delta x = \frac{1}{4}(x_N-x_0)$, corresponding to the illustration in Figure A.6:

$$\begin{aligned}
\sum_{i=0}^{3} f(x_i)\Delta x &= f(x_0)\Delta x + f(x_1)\Delta x + f(x_2)\Delta x + f(x_3)\Delta x\\
&= \Delta x\,\big(f(x_0) + f(x_1) + f(x_2) + f(x_3)\big)
\end{aligned}$$

where in the second line, we noticed that we could factor out the $\Delta x$ because it appears in each term. Since we only used 4 points, this is a pretty coarse approximation of the integral, and we expect it to be an underestimate (as the missing area represented by the hashed lines in Figure A.6 is quite large).

If we repeat this for a larger value of $N$, we should obtain a more accurate answer.

Writing this out again for the general case of $N$ points so that we can take the limit $N\to\infty$, and factoring out the $\Delta x$:

$$\lim_{N\to\infty}\sum_{i=0}^{N-1} f(x_i)\Delta x = \lim_{N\to\infty}\Delta x\sum_{i=0}^{N-1} f(x_i) = \lim_{N\to\infty}\,N\Delta x\,\frac{1}{N}\sum_{i=0}^{N-1} f(x_i)$$

Now, consider the combination:

$$\frac{1}{N}\sum_{i=0}^{N-1} f(x_i)$$

that appears above. This corresponds to the arithmetic average of the values of $f(x)$ evaluated between $x_0$ and $x_N$ (sum the values and divide by the number of values). In the limit where $N\to\infty$, this becomes the average value of the function, $\bar f$, over the interval. The average value of $f(x)=2x$ in the interval between $x_0$ and $x_N$ is simply given by the value of $f(x)$ at the midpoint of the interval:

$$\lim_{N\to\infty}\frac{1}{N}\sum_{i=0}^{N-1} f(x_i) = \bar f = f\!\left(\frac{x_0+x_N}{2}\right) = x_0 + x_N$$

Putting everything together:

$$\lim_{N\to\infty}\sum_{i=0}^{N-1} f(x_i)\Delta x = \lim_{N\to\infty}\,N\Delta x\,\frac{1}{N}\sum_{i=0}^{N-1} f(x_i) = (x_N - x_0)(x_0 + x_N) = x_N^2 - x_0^2$$

where in the last line, we substituted in the values of $N\Delta x = x_N - x_0$ and $\bar f = x_0 + x_N$. Writing this as the integral:

$$\int_{x_0}^{x_N} 2x\,dx = x_N^2 - x_0^2$$

we can immediately identify the anti-derivative and the indefinite integral:

$$F(x) = x^2 + C$$
$$\int 2x\,dx = x^2 + C$$

This is of course the result that we expected, and we can check our answer by taking the derivative of $F(x)$:

$$\frac{dF}{dx} = \frac{d}{dx}\big(x^2 + C\big) = 2x$$

We have thus confirmed that $F(x)=x^2+C$ is the anti-derivative of $f(x)=2x$.
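As an added numerical check of this result (using the same example function $f(x)=2x$ and arbitrary limits $x_0=1$, $x_N=3$), the sum of the rectangle areas approaches $x_N^2 - x_0^2$ as $N$ grows.

```python
# Numerical check: the sum of f(x_i) * dx for f(x) = 2x approaches
# x_N^2 - x_0^2 as N grows, as found above.

def f(x):
    return 2 * x

x0, xN = 1.0, 3.0
for N in [4, 8, 100, 10000]:
    dx = (xN - x0) / N
    total = sum(f(x0 + i * dx) * dx for i in range(N))   # left-edge rectangles, as in Fig. A.6
    print(f"N = {N:>5}: sum = {total:.5f}")

print("exact:", xN**2 - x0**2)   # x_N^2 - x_0^2 = 8.0
```

The coarse $N=4$ sum underestimates the area, just as argued above, and the estimates converge to the exact answer as $N$ increases.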
A.4.1 Common anti-derivatives and properties
Table A.3 below gives the anti-derivatives (indefinite integrals) for common functions. In all cases, $x$ is the independent variable, and all other variables should be thought of as constants:
Table A.3: Common indefinite integrals of functions.

| Function, $f(x)$ | Anti-derivative, $F(x)$ |
|---|---|
| $f(x)=a$ | $F(x)=ax+C$ |
| $f(x)=x^n$ | $F(x)=\dfrac{1}{n+1}x^{n+1}+C$ |
| $f(x)=\dfrac{1}{x}$ | $F(x)=\ln(\vert x\vert)+C$ |
| $f(x)=\sin(x)$ | $F(x)=-\cos(x)+C$ |
| $f(x)=\cos(x)$ | $F(x)=\sin(x)+C$ |
| $f(x)=e^x$ | $F(x)=e^x+C$ |
Note that, in general, it is much more difficult to obtain the anti-derivative of a function than it is to take its derivative. A few common properties to help evaluate indefinite integrals are shown in Table A.4 below.
Table A.4: Some properties of indefinite integrals.

| Anti-derivative | Equivalent anti-derivative |
|---|---|
| $\displaystyle\int \big(f(x)+g(x)\big)\,dx$ (sum) | $\displaystyle\int f(x)\,dx + \int g(x)\,dx$ |
| $\displaystyle\int \big(f(x)-g(x)\big)\,dx$ (subtraction) | $\displaystyle\int f(x)\,dx - \int g(x)\,dx$ |
| $\displaystyle\int a f(x)\,dx$ (multiplication by constant) | $\displaystyle a\int f(x)\,dx$ |
| $\displaystyle\int f'(x)g(x)\,dx$ (integration by parts) | $\displaystyle f(x)g(x) - \int f(x)g'(x)\,dx$ |
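These properties can also be verified numerically for definite integrals. The sketch below (added for illustration; $\sin(x)$ and $e^x$ are arbitrary choices, and the integrals are approximated with a simple midpoint rule) checks the sum rule and integration by parts.

```python
# Spot-check of two properties from Table A.4 using numerical (definite) integrals.
import math

def integrate(g, a, b, n=10000):
    # simple midpoint-rule approximation of the definite integral of g from a to b
    dx = (b - a) / n
    return sum(g(a + (i + 0.5) * dx) for i in range(n)) * dx

a, b = 0.0, 1.0
f, fp = math.sin, math.cos      # f and its derivative f'
g, gp = math.exp, math.exp      # g and its derivative g'

# Sum rule: integral of (f + g) equals integral of f plus integral of g
print(integrate(lambda x: f(x) + g(x), a, b),
      integrate(f, a, b) + integrate(g, a, b))

# Integration by parts: integral of f'(x) g(x) = [f g] - integral of f(x) g'(x)
lhs = integrate(lambda x: fp(x) * g(x), a, b)
rhs = f(b) * g(b) - f(a) * g(a) - integrate(lambda x: f(x) * gp(x), a, b)
print(lhs, rhs)
```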
A.4.2 Common uses of integrals in physics - from a sum to an integral
Integrals are extremely useful in physics because they are related to sums. If we assume that our mathematician friends (or computers) can determine anti-derivatives for us, using integrals is not that complicated.
The key idea in physics is that integrals are a tool for easily performing sums. As we saw above, integrals correspond to the area underneath a curve, which is found by summing the (different) areas of an infinite number of infinitely small rectangles. In physics, it is often the case that we need to take the sum of an infinite number of small things that keep varying, just like the areas of those rectangles.
Consider, for example, a rod of length, $L$, and total mass $M$, as shown in Figure A.7. If the rod is uniform in density, then if we cut it into, say, two equal pieces, those two pieces will weigh the same. We can define a “linear mass density”, $\mu$, for the rod, as the mass per unit length of the rod:

$$\mu = \frac{M}{L}$$

The linear mass density has dimensions of mass over length and can be used to find the mass of any length of rod. For example, if we are given the mass and the length of a uniform rod, we can directly determine its mass density.

Knowing the mass density, we can now easily find the mass, $m$, of a piece of rod that has a length of, say, $l$. Using the mass density, the mass of that piece of rod is given by:

$$m = \mu l$$
Now suppose that we have a rod of length $L$ that is not uniform, as in Figure A.7, and that does not have a constant linear mass density. Perhaps the rod gets wider and wider, or it has holes in it that make it not uniform. Imagine that the mass density of the rod is instead given by a function, $\mu(x)$, that depends on the position along the rod, where $x$ is the distance measured from one side of the rod.

Now, we cannot simply determine the mass of the rod by multiplying $\mu(x)$ and $L$, since we do not know which value of $\mu(x)$ to use. In fact, we have to use all of the values of $\mu(x)$, between $x=0$ and $x=L$.
The strategy is to divide the rod up into $N$ pieces of length $\Delta x$. If we label our pieces of rod with an index $i$, we can say that the piece that is at position $x_i$ has a tiny mass, $\Delta m_i$. We assume that $\Delta x$ is small enough so that $\mu(x)$ can be taken as constant over the length of that tiny piece of rod. Then, the tiny piece of rod at $x_i$, has a mass, $\Delta m_i$, given by:

$$\Delta m_i = \mu(x_i)\Delta x$$

where $\mu(x_i)$ is evaluated at the position, $x_i$, of our tiny piece of rod. The total mass, $M$, of the rod is then the sum of the masses of the tiny rods, in the limit where $\Delta x \to 0$:

$$M = \lim_{\Delta x\to 0}\sum_{i} \Delta m_i = \lim_{\Delta x\to 0}\sum_{i} \mu(x_i)\Delta x$$

But this is precisely the definition of the integral (equation (A.29)), which we can easily evaluate with an anti-derivative:

$$M = \int_0^L \mu(x)\,dx = G(L) - G(0)$$

where $G(x)$ is the anti-derivative of $\mu(x)$.
For a specific rod, we would be given the actual function $\mu(x)$; we would then look up or determine its anti-derivative, $G(x)$ (Table A.3), and evaluate $G(L)-G(0)$ to obtain the total mass of the rod.
With a little practice, you can solve this type of problem without writing out the sum explicitly. Picture an infinitesimal piece of the rod of length $dx$ at position $x$. It will have an infinitesimal mass, $dm$, given by:

$$dm = \mu(x)\,dx$$

The total mass of the rod is then the sum (i.e. the integral) of the mass elements:

$$M = \int dm$$

and we really can think of the $\int$ sign as a sum, when the things being summed are infinitesimally small. In the above equation, we still have not specified the range in $x$ over which we want to take the sum; that is, we need some sort of index for the mass elements to make this a meaningful definite integral. Since we already know how to express $dm$ in terms of $dx$, we can substitute our expression for $dm$ using one with $dx$:

$$M = \int_0^L \mu(x)\,dx$$

where we have made the integral definite by specifying the range over which to sum, since we can use $x$ to “label” the mass elements.
One should note that coming up with the above integral is physics. Solving it is math. We will worry much more about writing out the integral than evaluating its value. Evaluating the integral can always be done by a mathematician friend or a computer, but determining which integral to write down is the physicist’s job!
A.5 Summary
The derivative of a function, $f(x)$, with respect to $x$ can be written as:

$$\frac{df}{dx} = \frac{d}{dx}f(x) = f'(x)$$

and measures the rate of change of the function with respect to $x$. The derivative of a function is generally itself a function. The derivative is defined as:

$$\frac{df}{dx} = \lim_{\Delta x \to 0}\frac{f(x+\Delta x) - f(x)}{\Delta x}$$
Graphically, the derivative of a function represents the slope of the function, and it is positive if the function is increasing, negative if the function is decreasing, and zero if the function is flat. Derivatives can generally be determined analytically for the smooth functions that we encounter in physics.
A partial derivative measures the rate of change of a multi-variate function, $f(x,y)$, with respect to one of its independent variables. The partial derivative with respect to one of the variables is evaluated by taking the derivative of the function with respect to that variable while treating all other independent variables as if they were constant. The partial derivative of a function (with respect to $x$) is written as:

$$\frac{\partial f}{\partial x}$$

The gradient of a function, $f(x,y)$, is a vector in the direction in which that function is increasing most rapidly. It is given by:

$$\nabla f(x,y) = \frac{\partial f}{\partial x}\hat x + \frac{\partial f}{\partial y}\hat y$$
Given a function, $f(x)$, its anti-derivative with respect to $x$, $F(x)$, is written:

$$F(x) = \int f(x)\,dx$$

$F(x)$ is such that its derivative with respect to $x$ is $f(x)$:

$$\frac{dF}{dx} = f(x)$$

The anti-derivative of a function is only ever defined up to a constant, $C$. We usually write this as:

$$\int f(x)\,dx = F(x) + C$$

since the derivative of $F(x)+C$ will also be equal to $f(x)$. The anti-derivative is also called the “indefinite integral” of $f(x)$.
The definite integral of a function $f(x)$, between $x=a$ and $x=b$, is written:

$$\int_a^b f(x)\,dx$$

and is equal to the difference in the anti-derivative evaluated at $x=b$ and $x=a$:

$$\int_a^b f(x)\,dx = F(b) - F(a)$$

where the constant $C$ no longer matters, since it cancels out. Physical quantities only ever depend on definite integrals, since they must be determined without an arbitrary constant.
Definite integrals are very useful in physics because they are related to a sum. Given a function $f(x)$, one can relate the sum of terms of the form $f(x_i)\Delta x$ over a range of values from $x=a$ to $x=b$ to the integral of $f(x)$ over that range:

$$\lim_{\Delta x\to 0}\sum_{i} f(x_i)\Delta x = \int_a^b f(x)\,dx = F(b) - F(a)$$
A.6 Thinking about the Material
A.7 Sample problems and solutions
A.7.1 Problems
A.7.2 Solutions
Solution A.1
Solution A.2
We are given the speed of the train as a function of time, $v(t)$, which is the rate of change of its position:

$$v(t) = \frac{dx}{dt}$$
We need to find how its position, $x(t)$, changes with time, given the speed. In other words, we need to find the anti-derivative of $v(t)$ to get the function for the position as a function of time, $x(t)$:

$$x(t) = \int v(t)\,dt$$

where the anti-derivative is determined only up to an arbitrary constant, $C$. The distance covered, $\Delta x$, between time $t_1$ and time $t_2$ is simply the difference in position at those two times:

$$\Delta x = x(t_2) - x(t_1)$$