Preliminary Mathematics for online MSc programmes in Data Analytics
Unit 2: Differentiation in 1D (minima and maxima)
Differentiation
Introduction to differentiation
We are often interested in the rate at which some variable is changing. For example, we may be interested in the rate at which the temperature is changing in a chemical reaction or in the rate at which the pressure in a vessel is changing. Rapid rates of change of a variable may indicate that a system is not operating normally and is approaching critical values.
Rates of change may be positive, zero, or negative. A positive rate of change means that the variable is increasing; a zero rate of change means that the variable is not changing; a negative rate of change means that the variable is decreasing.
Consider the function for , shown below.
Between and , the function is decreasing rapidly. Across this interval the rate of change of the function is large and negative. Between and the function is still decreasing but not as rapidly as before. Across this interval the rate of change of the function is small and negative. There is a small interval over which the function does not seem to change at all. Across that interval the rate of change is zero. Between and the function is increasing rapidly; the rate of change is large and positive.
It is often not sufficient to describe a rate of change as "large and positive" or "small and negative". A precise value is needed. The technique for calculating the rate of change of any function is called differentiation. Use of differentiation provides a precise value or expression for the rate of change of a function.
Average rate of change across an interval
We have already seen that a function can have different rates of change at different points on its graph. Let's first define and calculate the average rate of change of a function across an interval; later on we will also define the rate of change at a point. The figure below shows a function $y = f(x)$, two possible argument values, $x_1$ and $x_2$, and their respective outputs, $f(x_1)$ and $f(x_2)$.
Consider that $x$ is increasing from $x_1$ to $x_2$. The change in $x$ is $x_2 - x_1$. As $x$ increases from $x_1$ to $x_2$, the function increases from $f(x_1)$ to $f(x_2)$. The change in $y$ is $f(x_2) - f(x_1)$. Then the average rate of change of $y$ across the interval $[x_1, x_2]$ is
$$\frac{f(x_2) - f(x_1)}{x_2 - x_1}.$$
Another way to think of the average rate of change of a function is to visualise it as the slope of a line that passes through two points on the function. This line, called a secant line, can be drawn on a graph of the function so that we can quantify its slope. A secant line passing through the points $(x_1, f(x_1))$ and $(x_2, f(x_2))$ has a vertical rise of $f(x_2) - f(x_1)$ and a horizontal run of $x_2 - x_1$. The slope of the line between the points $(x_1, f(x_1))$ and $(x_2, f(x_2))$ is $\frac{f(x_2) - f(x_1)}{x_2 - x_1}$ (which is exactly the same as the average rate of change).
Let's calculate the average rate of change of across the following intervals
(a) to (b) to
For the first interval the change in is equal to . When , ; while when , . Thus, the change of is . So, the average rate of change across the interval is . What does this mean though? It means that across the interval , on average the value increases by for every unit increase in .
This is a good time for you to try out the second interval. (The average rate of change turns out to be 2.)
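The definition of the average rate of change can be sketched in a few lines of Python. The helper name `average_rate_of_change` and the example function `square` are illustrative assumptions; `square` is not the function used in the worked example above.

```python
def average_rate_of_change(f, x1, x2):
    """Slope of the secant line through (x1, f(x1)) and (x2, f(x2))."""
    if x1 == x2:
        raise ValueError("x1 and x2 must differ")
    return (f(x2) - f(x1)) / (x2 - x1)


def square(x):
    # Illustrative function only (an assumption, not taken from the notes).
    return x ** 2


# Change in y is 9 - 1 = 8, change in x is 3 - 1 = 2.
rate = average_rate_of_change(square, 1, 3)
print(rate)  # 4.0
```

Swapping `square` for any other single-argument function gives the slope of the corresponding secant line.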
Rate of change at a point
We often need to know the rate of change of a function at a point, and not simply an average rate of change across an interval. Let's assume that $x_2$ is really close to $x_1$. To better reflect this in our notation, we will call what we used to call $x_1$, $x$, and what we used to call $x_2$, $x + h$, with $h$ being a very small number.
As mentioned earlier, the average rate of change of $y$ across the interval $[x, x + h]$ is
$$\frac{f(x + h) - f(x)}{(x + h) - x} = \frac{f(x + h) - f(x)}{h}.$$
What do you think would happen if we assumed that the distance, $h$, between the two points was made increasingly small (in Mathematics notation $h \to 0$)?
If we assumed that, it would mean that the second point $x + h$ is really close to $x$. This is exactly what we will assume in order to find the rate of change at the point $x$. Let's say that we assumed that $h \to 0$. If we now focus again on the graph above, the distance between the two points $x$ and $x + h$ would get smaller, and likewise the difference between their respective outputs, $f(x)$ and $f(x + h)$, would also get smaller. We can define those differences as $\delta x$ and $\delta y$ respectively. The term $\delta x$ reads as "delta x" and represents a small change in the $x$ direction. In our case $\delta x = h$ and $\delta y = f(x + h) - f(x)$.
Thus, the rate of change at a point $x$ is
$$\lim_{\delta x \to 0} \frac{\delta y}{\delta x} = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}.$$
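To get a feel for what letting the distance between the two points shrink does, the quotient can be evaluated numerically for smaller and smaller steps. This is only a sketch: the names `difference_quotient` and `square`, the point 3.0 and the step sizes are all assumptions chosen for illustration.

```python
def difference_quotient(f, x, h):
    """Average rate of change of f across the interval [x, x + h]."""
    return (f(x + h) - f(x)) / h


def square(x):
    # Illustrative function only (an assumption, not taken from the notes).
    return x ** 2


# As h shrinks, the quotient settles towards a single value (6 for this choice).
for h in (1.0, 0.1, 0.01, 0.001):
    print(h, difference_quotient(square, 3.0, h))
```

The printed values get closer and closer to 6, which is the rate of change of this particular function at the point 3.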
Let's look at a couple of examples first and then focus on terminology and notation.
One of the simplest functions to consider is a linear function. Let's assume that we have .
What should we do if we want to find the rate of change at any point of the function? (We want to essentially answer the question "What is the change in the direction when the change in the direction is small")
Let's use the definition we saw earlier and calculate the rate of change at any point of the function (think of it as looking at the two points and with ).
Wait. The rate of change for the function at any point is 2? What does that mean?
It means that the value increases by for every small increase, , in . So it doesn't matter which value we are looking at (e.g. or ); the value will always increase by for every small increase, , in (i.e. or where ).
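As a worked illustration of the definition for a linear function, assume the hypothetical function $f(x) = 2x + 3$ (the slope 2 matches the result quoted above; the intercept 3 is an assumption made purely for this sketch):

$$
\begin{aligned}
\lim_{h \to 0} \frac{f(x + h) - f(x)}{h}
  &= \lim_{h \to 0} \frac{\bigl(2(x + h) + 3\bigr) - (2x + 3)}{h} \\
  &= \lim_{h \to 0} \frac{2h}{h} \\
  &= 2.
\end{aligned}
$$

The intercept cancels in the numerator, so any linear function with slope 2 gives the same rate of change at every point.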
For nonlinear functions, a one unit increase in the value of $x$ leads to different increases in $y$, depending on where we start.
Consider a quadratic function .
Before we use the previous definition and calculate the rate of change at any point, let's try something else.
What will happen to the values:

if and we increase it by unit (i.e. )? The values will increase by (i.e. ).

if and we increase it by unit (i.e. )? The values will increase by (i.e. ).

if and we increase it by unit (i.e. )? The values will increase by (i.e. ).
Thus, in a quadratic function a unit increase in leads to different increases in the values.
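The same experiment can be run in Python. The sketch below assumes the hypothetical quadratic $y = x^2$ and the starting values 1, 2 and 3; both are illustrative choices and may differ from the ones used in the list above.

```python
def square(x):
    # Hypothetical quadratic chosen for illustration.
    return x ** 2


# Increase x by one unit from several starting values and record the change in y.
for x in (1, 2, 3):
    increase = square(x + 1) - square(x)
    print(f"x = {x}: y increases by {increase}")
# x = 1: y increases by 3
# x = 2: y increases by 5
# x = 3: y increases by 7
```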
Let's now use the definition to find out what is happening in the $y$ values when $x$ is increased by $h$ with $h \to 0$ (instead of being increased by $1$).
So, the rate of change for the function at a point is . This means that the value increases by for every small increase, , in . Because the rate of change along a quadratic function changes constantly (according to the value of $x$ we are looking at), it has to be computed separately at each possible value of $x$. The rate of change is thus a local phenomenon: its value at one point does not give us any information about the rate of change globally.
Note that the rate of change, , for the function is itself a function of .
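As a worked illustration of the definition for a quadratic, assume the hypothetical function $f(x) = x^2$ (an assumption made for this sketch):

$$
\begin{aligned}
\lim_{h \to 0} \frac{f(x + h) - f(x)}{h}
  &= \lim_{h \to 0} \frac{(x + h)^2 - x^2}{h} \\
  &= \lim_{h \to 0} \frac{2xh + h^2}{h} \\
  &= \lim_{h \to 0} (2x + h) \\
  &= 2x.
\end{aligned}
$$

For this choice of function the result depends on $x$, which is what it means for the rate of change to be itself a function of $x$.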
Terminology and notation
The process of finding the rate of change of a given function is called differentiation. The function is said to be differentiated. If $y \equiv f(x)$ (read "$y$ is equivalent to $f(x)$") is a function of $x$, we say that $y$ is differentiated with respect to $x$. The rate of change of a function is also known as the derivative of the function.
There is a notation for writing down the derivative of a function. If the function is $y = f(x)$, we denote the derivative of $y$ by
$$\frac{dy}{dx}, \qquad \frac{df(x)}{dx} \qquad \text{or} \qquad f'(x)$$
(read "dee y (by) dee x", "dee f of x dee x" and "f prime").
This is the point where you should start asking yourselves "Wait a minute, do I have to compute $\lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$ every time I need to find the derivative of a function at a point $x$?". Thankfully, the answer is no.
Table of derivatives
Table 1 lists some of the common functions used in Mathematics and Statistics and their corresponding derivatives. The symbols and are constants while the symbol represents a variable.
Function  Derivative 

constant  0
Find the derivative of .
We note that is of the form where . This means that .
Find the derivative of .
This function is constant, hence its derivative is zero.
Find the derivative of .
This function is of the form with and , hence its derivative is .
Find the derivative of .
We first rewrite the function as . This means that the function is of the form with and . This means that .
Find the derivative of .
We first rewrite the function as . This means that the function is of the form with and . This means that .
Find the derivative of .
This function is of the form with , hence its derivative is .
Ok, that is a good start but what do we do with functions like , and ?
The first function involves adding two functions (the first one being of the form while the second one is a constant function).
The second function, , involves multiplying two functions ( and ) while the last one, , involves dividing two functions ( and ).
We need to introduce some simple rules to enable us to extend the range of functions that we can differentiate.
Rules of differentiation

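Differentiation is linear: For any functions $f(x)$ and $g(x)$ and any real numbers $a$ and $b$, the derivative of the function $a f(x) + b g(x)$ with respect to $x$ is
$$\frac{d}{dx}\bigl[a f(x) + b g(x)\bigr] = a f'(x) + b g'(x).$$

Product rule: For any functions $f(x)$ and $g(x)$, the derivative of the function $f(x) g(x)$ with respect to $x$ is
$$\frac{d}{dx}\bigl[f(x) g(x)\bigr] = f'(x) g(x) + f(x) g'(x).$$

Quotient rule: For any functions $f(x)$ and $g(x)$, the derivative of the function $\frac{f(x)}{g(x)}$, where $g(x) \neq 0$, with respect to $x$ is
$$\frac{d}{dx}\left[\frac{f(x)}{g(x)}\right] = \frac{f'(x) g(x) - f(x) g'(x)}{g(x)^2}.$$

Chain rule: The derivative of a composite function $f(g(x))$ with respect to $x$ is
$$\frac{d}{dx}\bigl[f(g(x))\bigr] = f'(g(x))\, g'(x).$$
What is a composite function, you ask? It is a function that takes another function as its argument. So, instead of having a function $f$ that has $x$ as its input, we have a function $f$ which takes $g(x)$ as its input. Thus, it becomes $f(g(x))$.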
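The four rules above can be sanity-checked numerically. The sketch below is not a proof: it compares each rule against a central-difference estimate of the derivative at a single point. The helper `derivative`, the choice of sine and exponential as the two functions, the constants `a` and `b`, and the point `x = 0.7` are all assumptions made for illustration.

```python
import math


def derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x); an approximation, not exact."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Illustrative functions with known derivatives: (sin)' = cos and (exp)' = exp.
f, df = math.sin, math.cos
g, dg = math.exp, math.exp
a, b = 3.0, -2.0
x = 0.7

# Linearity: (a f + b g)' = a f' + b g'
linear_numeric = derivative(lambda t: a * f(t) + b * g(t), x)
linear_rule = a * df(x) + b * dg(x)

# Product rule: (f g)' = f' g + f g'
product_numeric = derivative(lambda t: f(t) * g(t), x)
product_rule = df(x) * g(x) + f(x) * dg(x)

# Quotient rule: (f / g)' = (f' g - f g') / g^2
quotient_numeric = derivative(lambda t: f(t) / g(t), x)
quotient_rule = (df(x) * g(x) - f(x) * dg(x)) / g(x) ** 2

# Chain rule: (f(g(x)))' = f'(g(x)) g'(x)
chain_numeric = derivative(lambda t: f(g(t)), x)
chain_rule = df(g(x)) * dg(x)

for name, numeric, rule in [
    ("linearity", linear_numeric, linear_rule),
    ("product", product_numeric, product_rule),
    ("quotient", quotient_numeric, quotient_rule),
    ("chain", chain_numeric, chain_rule),
]:
    print(f"{name}: numeric = {numeric:.6f}, rule = {rule:.6f}")
```

Each pair of numbers should agree to several decimal places; small differences come from the finite step size used in the estimate.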
Function  Derivative 

Find the derivative of .
This function is of the form with , , and . Hence, and , which yields
(We could have also used , , and .)
Find the derivative of .
This function is of the form with , , and . Hence, and