The pattern hiding in x², x³, x⁴
You already have two derivatives, earned the long way through limits:

$$\frac{d}{dx}x^2 = 2x, \qquad \frac{d}{dx}x^3 = 3x^2$$

Look at the exponents. Two becomes one. Three becomes two. The old exponent comes out the front as a multiplier. If the pattern holds, then

$$\frac{d}{dx}x^4 = 4x^3$$
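It holds. Not just for whole numbers either: it holds for negative powers such as $x^{-1}$, for fractional powers such as $x^{1/2}$, and for irrational powers such as $x^{\pi}$.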
A picture for why
Take the square with side $x$. Its area is $x^2$. Now grow the side by a tiny amount $h$. The new area is $(x+h)^2$, and the change in area is everything that just got added: two thin strips along the existing sides, each of size $x \cdot h$, plus a tiny corner square of size $h^2$.
Divide by $h$ to get the rate of change of area per unit of side:

$$\frac{(x+h)^2 - x^2}{h} = \frac{2xh + h^2}{h} = 2x + h$$

Send $h \to 0$. The corner square vanishes; the two strips survive. The rate is $2x$. That is $\frac{d}{dx}x^2 = 2x$, derived from a picture instead of an algebraic limit.
The same picture works in three dimensions for $x^3$: a cube grows three thin slabs of area $x^2$, giving $3x^2$. In general, $x^n$ grows $n$ “slabs” each of size $x^{n-1}$, hence $n x^{n-1}$.
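If you would rather watch the corner square vanish numerically, here is a minimal sketch. The function name `growth_rate` and the sample side length are illustrative choices, not part of the lesson's widgets.

```python
def growth_rate(n, x, h):
    """Change in x**n per unit of extra side length h (a plain difference quotient)."""
    return ((x + h) ** n - x ** n) / h

x = 3.0  # illustrative side length
for h in (0.1, 0.01, 0.001):
    # Square: the two strips contribute 2x, the corner contributes the leftover h.
    # Cube: the three slabs contribute 3x^2, edges and corner contribute the rest.
    print(h, growth_rate(2, x, h), growth_rate(3, x, h))
```

As `h` shrinks, the first column of rates should settle toward 2x = 6 and the second toward 3x² = 27; the leftovers are exactly the corner pieces from the picture.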
The power rule
For any real number $n$:

$$\frac{d}{dx}x^n = n x^{n-1}$$
That is one line. It handles every power you will ever take, positive or negative, integer or fraction or irrational.
No new tricks. The same shape, every time.
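One way to convince yourself that the same shape really covers negative, fractional, and irrational exponents is to compare the rule against a numerical slope. This is only a sketch: the helper names `power_rule` and `numeric_slope`, the exponents, and the sample point are assumptions made for illustration.

```python
import math

def power_rule(n, x):
    """Slope of x**n predicted by the power rule."""
    return n * x ** (n - 1)

def numeric_slope(f, x, h=1e-6):
    """Central-difference estimate of the slope of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 2.0  # illustrative sample point (positive, so every exponent is defined)
for n in (-1, 0.5, math.pi, 4):
    estimate = numeric_slope(lambda t: t ** n, x)
    print(n, power_rule(n, x), estimate)
```

The two numbers on each printed line should agree to several decimal places, whatever kind of exponent `n` is.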
Power, drilled
What is at ?
Constants and sums
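Two rules so obvious they barely need a name. The derivative is linear: it slides through addition, and it slides past constant multipliers.

$$\frac{d}{dx}c = 0, \qquad \frac{d}{dx}\big[c\,f(x)\big] = c\,f'(x), \qquad \frac{d}{dx}\big[f(x) + g(x)\big] = f'(x) + g'(x)$$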
In words: a constant has zero slope (it never changes), a scalar multiplier passes straight through, and the derivative of a sum is the sum of the derivatives.
Combine those with the power rule and you can differentiate any polynomial by inspection. The constant term dies (a constant has zero slope). Each remaining term loses one from its exponent and picks up the old exponent as a coefficient.
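The term-by-term recipe is mechanical enough to write down in a few lines. The sketch below stores a polynomial as a list of coefficients; the polynomial itself is a made-up example, and `differentiate` and `evaluate` are hypothetical helper names.

```python
def differentiate(coeffs):
    """coeffs[k] is the coefficient of x**k. Returns the derivative's coefficients."""
    # Each term c * x**k becomes (k * c) * x**(k - 1); the constant term (k = 0) drops out.
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Value of the polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

# Hypothetical polynomial: 7 + 2x - 5x^2 + 4x^3
p = [7, 2, -5, 4]
dp = differentiate(p)       # coefficients of 2 - 10x + 12x^2
print(dp, evaluate(dp, 1))  # slope at x = 1 is 2 - 10 + 12 = 4
```

The list comprehension is the power rule and the scalar rule applied to each term; summing in `evaluate` is the sum rule.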
Polynomial, by inspection
Differentiate term by term, then evaluate the derivative at .
What is ?
Sine and cosine, derived from the unit circle
Two more derivatives, neither of which you can get from the power rule:

$$\frac{d}{dx}\sin x = \cos x, \qquad \frac{d}{dx}\cos x = -\sin x$$
You already saw this in the previous lesson when you swept across $\sin x$ on the tracer and the slope-curve was $\cos x$. The geometric reason: $\sin x$ and $\cos x$ are the height and width of a point at angle $x$ on the unit circle. A tiny rotation $dx$ moves that point a distance $dx$ along the circle, perpendicular to the radius. The change in height (i.e., in $\sin x$) is $\cos x \, dx$. The change in width (in $\cos x$) is $-\sin x \, dx$. The minus sign is the same minus sign you see in the formula for the derivative of cosine.
If you want the full geometric derivation, it lives in module 3. For now the two formulas are tools to use.
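Until then, a small numerical experiment can stand in for the derivation. The sketch below rotates a point on the unit circle by a tiny angle and measures how its height and width change per unit of rotation; `step_on_circle` and the sample angle are illustrative assumptions.

```python
import math

def step_on_circle(angle, d_angle):
    """Change in height and width of the unit-circle point, per unit of rotation."""
    d_height = (math.sin(angle + d_angle) - math.sin(angle)) / d_angle
    d_width = (math.cos(angle + d_angle) - math.cos(angle)) / d_angle
    return d_height, d_width

angle = 1.0  # illustrative angle, in radians
d_height, d_width = step_on_circle(angle, 1e-6)
print(d_height, math.cos(angle))   # height changes at a rate close to cos(angle)
print(d_width, -math.sin(angle))   # width changes at a rate close to -sin(angle)
```

Angles must be in radians for the two columns to match; that is the unit in which a rotation of size `d_angle` moves the point a distance `d_angle` along the circle.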
The slope of cosine at the start
Use the rule above.
What is the slope of $\cos x$ at $x = 0$?
The exponential eˣ is the function that is its own derivative
There is one function in mathematics whose derivative is itself:

$$\frac{d}{dx}e^x = e^x$$

That is the property that distinguishes $e$ from every other base. For any other base $b$,

$$\frac{d}{dx}b^x = \ln(b)\, b^x$$
For $b = 2$, the multiplier is $\ln 2 \approx 0.693$. For $b = 3$, it is $\ln 3 \approx 1.099$. There is exactly one base where the multiplier is exactly $1$, and the derivative becomes the function itself. That base is $e$.
You can hunt for it by eye. Drag the base $b$ on the right panel below until the tangent at $x = 0$ has slope $1$. The left panel shows $(1 + 1/n)^n$ climbing to the same number from a completely different direction.
The two paths meet at $e \approx 2.718$. That is not a coincidence. It is one constant earning its name from two definitions at once.
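The same hunt can be run without a mouse. This is a rough sketch rather than how the widget works: it bisects on the base until the numerical slope of $b^x$ at $x = 0$ reaches 1, and separately evaluates the compounding expression $(1 + 1/n)^n$. The function names and the starting bracket of 2 to 3 are assumptions.

```python
import math

def slope_at_zero(b, h=1e-6):
    """Central-difference slope of b**x at x = 0."""
    return (b ** h - b ** (-h)) / (2 * h)

def find_slope_one_base(lo=2.0, hi=3.0, tol=1e-9):
    """Bisect on the base until the slope at x = 0 is 1 (assumes the answer lies in [lo, hi])."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if slope_at_zero(mid) < 1.0:
            lo = mid   # slope too shallow, so the base must be larger
        else:
            hi = mid   # slope too steep, so the base must be smaller
    return (lo + hi) / 2

def compound(n):
    """The other path to the same constant: (1 + 1/n)**n."""
    return (1 + 1 / n) ** n

print(find_slope_one_base(), compound(10**6), math.e)
```

Both computed values should land close to `math.e`, with the compounding path converging more slowly.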
Find the slope-1 base
On the widget above, drag $b$ on the right panel until the slope of $b^x$ at $x = 0$ equals exactly $1$.
What value of $b$ makes this happen? (Round to two decimals.)
The logarithm
One more, also new:

$$\frac{d}{dx}\ln x = \frac{1}{x}$$
The justification will be cleanest after we have the chain rule, but the intuition is short. The natural logarithm is the inverse of $e^x$. Inverses reflect across the line $y = x$, which swaps “input” and “output” everywhere. If $e^x$ at the point $(0, 1)$ has slope $1$, then $\ln x$ at the reflected point $(1, 0)$ also has slope $1$. Repeat at $(1, e)$ on the exponential, where the slope is $e$, and at the reflected point $(e, 1)$ on the logarithm, where the slope is $1/e$. The slopes are reciprocals of each other. At $x$, $\ln x$ has slope $1/x$.
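The reciprocal relationship is easy to spot-check numerically. In this sketch, `numeric_slope` is a hypothetical central-difference helper and the three sample points are arbitrary; for each point on the exponential it measures the slope there and the slope of the logarithm at the mirrored point.

```python
import math

def numeric_slope(f, x, h=1e-6):
    """Central-difference estimate of the slope of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

for a in (0.0, 1.0, 2.0):
    y = math.exp(a)                          # point (a, y) on the exponential
    exp_slope = numeric_slope(math.exp, a)   # slope of e^x at x = a
    log_slope = numeric_slope(math.log, y)   # slope of ln at the mirrored point (y, a)
    print((a, y), exp_slope, log_slope, exp_slope * log_slope)
```

The last number on each line should be very close to 1: the two slopes multiply to one, which is another way of saying the slope of $\ln x$ at $x$ is $1/x$.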
What you have
Seven differentiation rules. With these, you can take the derivative of every elementary function you will meet in this course except those built by multiplying functions together or composing them. Those two cases — products and compositions — get their own rules next.
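$$\frac{d}{dx}c = 0, \qquad \frac{d}{dx}x^n = n x^{n-1}, \qquad \frac{d}{dx}\sin x = \cos x, \qquad \frac{d}{dx}\cos x = -\sin x$$

$$\frac{d}{dx}e^x = e^x, \qquad \frac{d}{dx}b^x = \ln(b)\, b^x, \qquad \frac{d}{dx}\ln x = \frac{1}{x}$$

And the sum/scalar rules that glue them together. Most of the “differentiation” that runs inside a neural-network library is, ultimately, these seven formulas applied a trillion times.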