Polynomials and degree
A polynomial is a sum of terms, each a number times a whole-number power of the variable:
$$p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$$
The degree is the largest power with a nonzero coefficient. Degree 1 is a line, $y = mx + b$, last lesson’s object. This lesson is degree 2:
$$y = ax^2 + bx + c$$
The $x^2$ term bends the graph. A degree-2 polynomial is a quadratic, and its graph is a parabola: a single smooth U (or, when $a$ is negative, an upside-down U).
One curve, three equations
The parabola is the real object. The equation is just how you write it down, and there are three useful ways, each anchored to something different.
Standard form, $y = ax^2 + bx + c$: the default, good for reading off $c$ (the $y$-intercept).
Vertex form, $y = a(x - h)^2 + k$: anchored at the vertex $(h, k)$, the parabola’s turning point. This form says “take the simplest parabola $y = ax^2$ and shift it to sit at $(h, k)$.”
Factored form, $y = a(x - r_1)(x - r_2)$: anchored at the roots $r_1, r_2$, the spots where the curve crosses the $x$-axis.
Drag the vertex to move the parabola; drag the steepness handle to bend it. All three equations update at once. They are never in conflict, because they are three descriptions of the one curve you’re holding.
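To see the three forms agree without the widget, here is a minimal Python sketch. The helper names `vertex_to_standard` and `vertex_to_roots` and the values `a, h, k = 1, 2, -1` are illustrative assumptions, not part of the lesson. It expands vertex form into standard form, solves for the roots, and checks that all three expressions give the same height at a few sample points.

```python
import math

def vertex_to_standard(a, h, k):
    """Expand a(x - h)^2 + k into the coefficients (a, b, c) of ax^2 + bx + c."""
    return a, -2 * a * h, a * h * h + k

def vertex_to_roots(a, h, k):
    """Real roots of a(x - h)^2 + k = 0, or None if the curve misses the x-axis."""
    t = -k / a            # (x - h)^2 must equal this value
    if t < 0:
        return None
    s = math.sqrt(t)
    return h - s, h + s

a, h, k = 1, 2, -1                      # illustrative values, not from the lesson
_, b, c = vertex_to_standard(a, h, k)   # b = -4, c = 3
r1, r2 = vertex_to_roots(a, h, k)       # roots 1.0 and 3.0

for x in (-1.0, 0.0, 2.5):
    standard = a * x**2 + b * x + c
    vertex_form = a * (x - h)**2 + k
    factored = a * (x - r1) * (x - r2)
    # Three descriptions, one curve: the heights must match.
    assert math.isclose(standard, vertex_form)
    assert math.isclose(vertex_form, factored)
```

When the parabola never reaches the $x$-axis, `vertex_to_roots` returns `None`, and only the standard and vertex forms exist over the real numbers.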
The vertex sits at -b/(2a)
A parabola is perfectly symmetric about a vertical line through its vertex. That line sits at
$$x = -\frac{b}{2a}$$
so the vertex’s $x$-coordinate is $-\frac{b}{2a}$, and its height is whatever you get by plugging that back in. Every parabola written in standard form hands you its turning point through this one formula.
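As a quick sketch of that recipe in Python (the function name `vertex` and the sample quadratic $x^2 - 4x + 3$ are illustrative choices, not taken from the lesson):

```python
def vertex(a, b, c):
    """Return (x, y) of the turning point of ax^2 + bx + c, assuming a != 0."""
    x = -b / (2 * a)              # the axis of symmetry
    y = a * x * x + b * x + c     # plug it back in for the height
    return x, y

print(vertex(1, -4, 3))   # (2.0, -1.0)
```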
Find the turning point
The quadratic has and .
Find the $x$-coordinate of its vertex using $x = -\frac{b}{2a}$. What is it?
Factoring: roots by sum and product
For a quadratic $x^2 + bx + c$ (with $a = 1$), factoring means finding two numbers $r_1, r_2$ so that
$$x^2 + bx + c = (x - r_1)(x - r_2)$$
Multiply the right side out and match: the two roots multiply to $c$ and add to $-b$. That pairing is called Vieta’s relation, and it turns factoring into a small search: find two numbers with the right product and the right sum.
For : you need two numbers multiplying to and adding to . That’s and . So , and the roots are and .
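The "small search" can be written out directly. This is a rough sketch that assumes integer coefficients and the convention $x^2 + bx + c = (x - r_1)(x - r_2)$, so the roots multiply to $c$ and add to $-b$. The function name `factor_monic` and the sample quadratic $x^2 - 5x + 6$ are made up for illustration and are not the lesson's worked example.

```python
def factor_monic(b, c):
    """Find integers (r1, r2) with r1 * r2 == c and r1 + r2 == -b, or None."""
    if c == 0:
        # x^2 + bx = x(x + b), so the roots are 0 and -b.
        return tuple(sorted((0, -b)))
    for r1 in range(-abs(c), abs(c) + 1):
        if r1 != 0 and c % r1 == 0:      # r1 must divide c
            r2 = c // r1                 # forced by the product
            if r1 + r2 == -b:            # check the sum
                return tuple(sorted((r1, r2)))
    return None

print(factor_monic(-5, 6))   # (2, 3): x^2 - 5x + 6 = (x - 2)(x - 3)
print(factor_monic(1, 1))    # None: x^2 + x + 1 has no integer roots
```

A result of `None` does not mean the quadratic has no roots, only that none of them are integers, which is the case the next section handles.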
Zero product, not zero and
Once factored, finding the roots uses one property: if a product is zero, at least one factor is zero.
So $(x - r_1)(x - r_2) = 0$ forces $x - r_1 = 0$ or $x - r_2 = 0$. That’s an or, giving two separate roots, $x = r_1$ and $x = r_2$. It is not an and; when the roots differ, no single $x$ makes both factors zero at once. The parabola crosses the axis at each root in turn, not at both simultaneously.
Factor and read the roots
Factor into by finding two numbers with product and sum .
What is the smaller of the two roots?
Completing the square, once
Factoring is lovely when the roots are tidy integers. They usually aren’t. So instead of guessing, derive a formula that always works, by completing the square.
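Start from $ax^2 + bx + c = 0$. Divide by $a$, move the constant, and add exactly the right number to make the left side a perfect square:
$$x^2 + \frac{b}{a}x + \frac{b^2}{4a^2} = \frac{b^2}{4a^2} - \frac{c}{a} \quad\Longrightarrow\quad \left(x + \frac{b}{2a}\right)^2 = \frac{b^2 - 4ac}{4a^2}$$
Take the square root of both sides and solve for $x$. Nothing here is a trick; it’s the rewrites from lesson 1, applied with care. What falls out is the quadratic formula:
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$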
It is completing the square, frozen into a reusable result. Factoring is just this formula’s easy special case, the one where the roots happen to be nice.
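Because the formula is a reusable result, it translates almost line for line into code. The sketch below is one possible version, restricted to real roots; the name `solve_quadratic` and the choice to return an empty tuple when the quantity under the square root is negative are assumptions made for this example.

```python
import math

def solve_quadratic(a, b, c):
    """Real roots of ax^2 + bx + c = 0 (a != 0), smaller first; () if there are none."""
    under_root = b * b - 4 * a * c
    if under_root < 0:
        return ()                        # no real square root to take
    root = math.sqrt(under_root)
    r1 = (-b - root) / (2 * a)           # the minus branch of the ±
    r2 = (-b + root) / (2 * a)           # the plus branch of the ±
    return tuple(sorted((r1, r2)))

print(solve_quadratic(1, -5, 6))    # (2.0, 3.0)
print(solve_quadratic(1, -2, -1))   # 1 - sqrt(2) and 1 + sqrt(2): not tidy, still found
```

The first call is a quadratic that also factors by hand; the second is one where guessing integers would fail but the formula still works.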
The discriminant counts the roots
Look at the piece under the square root:
$$\Delta = b^2 - 4ac$$
This is the discriminant, and its sign alone tells you the root structure before you compute anything.
- $\Delta > 0$: the square root is a real nonzero number, the $\pm$ splits it, two real roots. The parabola crosses the $x$-axis twice.
- $\Delta = 0$: the square root is zero, the $\pm$ does nothing, one repeated root. The parabola just kisses the axis at its vertex.
- $\Delta < 0$: the square root of a negative number is not real, no real roots. The parabola floats entirely above or below the axis.
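The three cases above can be sketched as a tiny classifier. The function name `count_real_roots` and the sample coefficients are illustrative. The exact comparison with zero is reliable for integer coefficients; with floating-point inputs a small tolerance would be the safer choice.

```python
def count_real_roots(a, b, c):
    """Number of distinct real roots of ax^2 + bx + c, read off the discriminant's sign."""
    disc = b * b - 4 * a * c
    if disc > 0:
        return 2      # crosses the x-axis twice
    if disc == 0:
        return 1      # touches the axis at the vertex
    return 0          # never reaches the axis

print(count_real_roots(1, -5, 6))   # 2
print(count_real_roots(1, -2, 1))   # 1
print(count_real_roots(1, 0, 1))    # 0
```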
You can see why in the widget: as the parabola slides off the $x$-axis, the discriminant badge flips sign in step with the crossings disappearing.
Compute the discriminant
For , compute the discriminant $\Delta = b^2 - 4ac$.
What is $\Delta$? (Its sign tells you this parabola never touches the $x$-axis.)
Where this goes next
A quadratic is the simplest curved function, and the parabola is the shape of the simplest loss surface you can optimize. When module 10 teaches gradient descent, the first bowl it rolls a ball down is a parabola, and the minimum it hunts for is the vertex you’ve been dragging around.
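As a preview only, here is a rough sketch of that ball rolling downhill. It assumes a fact this lesson has not derived, namely that the slope of $ax^2 + bx + c$ at $x$ is $2ax + b$. The function name `descend`, the step size `rate`, and the sample quadratic are made up for illustration.

```python
def descend(a, b, c, x=0.0, rate=0.1, steps=200):
    """Step downhill on ax^2 + bx + c (a > 0) using the assumed slope 2ax + b.

    c shifts the curve up or down but does not move the minimum,
    so it never appears in the update.
    """
    for _ in range(steps):
        x -= rate * (2 * a * x + b)   # move against the slope
    return x

x_min = descend(1, -4, 3)
print(abs(x_min - 2.0) < 1e-9)   # True: it settles at the vertex, x = -b/(2a) = 2
```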
g(f(x)) is one layer wired into the next. log(∏) = ∑(log) is why a million tiny probabilities don’t sink training. Everything that follows is those two facts at scale, and the parabola is the curve they first bend.