The cocoercive bound is arguably the most useful inequality in smooth, convex optimization. Unlike the quadratic upper bound alone, which gives a clean visual picture of \(L\)-smoothness, cocoercivity provides a necessary and sufficient condition for a differentiable function to be both \(L\)-smooth and convex. Below we derive it geometrically in the 1-dimensional setting, but this generalizes easily to any dimension by using the gradient instead of the derivative and inner products instead of multiplication.
For a convex, \(L\)-smooth function \(f\), two key bounds hold simultaneously. From smoothness, the quadratic upper bound (in orange) gives, for any \(y\) and \(z\),
\[
f(z) \;\leq\; f(y) + f'(y)(z - y) + \frac{L}{2}(z - y)^2,
\]
which we write as
\[
f(y) + f'(y)(z - y) + \frac{L}{2}(z - y)^2 - f(z) \;\geq\; 0.
\]
This represents the distance from the upper bounding quadratic (at \(y\)) to the function itself (at \(z\)).
From convexity, the linear lower bound (in blue) gives, for any \(z\) and \(x\),
\[
f(z) \;\geq\; f(x) + f'(x)(z - x),
\]
which we write as
\[
f(z) - f(x) - f'(x)(z - x) \;\geq\; 0.
\]
Adding these two inequalities, the gap for fixed \(x\) and \(y\) from the linear lower bound to the quadratic upper bound,
\[
d_{x,y}(z) \;:=\; f(y) + f'(y)(z - y) + \frac{L}{2}(z - y)^2 - f(x) - f'(x)(z - x),
\]
is nonnegative everywhere. Minimizing \(d_{x,y}\) over \(z\) by setting \(d_{x,y}'(z) = 0\) yields the minimizer \(z^\star = y - \frac{1}{L}\bigl(f'(y) - f'(x)\bigr)\). Evaluating \(d_{x,y}(z^\star) \geq 0\) and simplifying gives the cocoercive bound:
\[
f(y) \;\geq\; f(x) + f'(x)(y - x) + \frac{1}{2L}\bigl(f'(y) - f'(x)\bigr)^2.
\]
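For readers who want the intermediate algebra, here is a sketch of that simplification; the shorthand \(g := f'(y) - f'(x)\) is introduced only for brevity and appears nowhere else in the derivation. Substituting \(z^\star - y = -\frac{1}{L}g\) into \(d_{x,y}\),
\[
\begin{aligned}
d_{x,y}(z^\star)
&= f(y) - \frac{1}{L} f'(y)\, g + \frac{1}{2L} g^2 - f(x) - f'(x)\Bigl(y - \tfrac{1}{L} g - x\Bigr) \\
&= f(y) - f(x) - f'(x)(y - x) - \frac{1}{L} g^2 + \frac{1}{2L} g^2 \\
&= f(y) - f(x) - f'(x)(y - x) - \frac{1}{2L} g^2 \;\geq\; 0,
\end{aligned}
\]
which rearranges to the cocoercive bound above since \(g^2 = \bigl(f'(y) - f'(x)\bigr)^2\).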
In the above graph this minimum distance between the quadratic and linear bounds is highlighted in green. Finally, since \(x\) and \(y\) were chosen arbitrarily, we can swap their roles and add the two resulting inequalities; the function values cancel, leaving the symmetric form
\[
\bigl(f'(y) - f'(x)\bigr)(y - x) \;\geq\; \frac{1}{L}\bigl(f'(y) - f'(x)\bigr)^2.
\]
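As a quick sanity check, here is a minimal numerical sketch (not part of the derivation above) that tests both forms of the bound on a concrete example; it assumes \(f(x) = \log(1 + e^x)\), which is convex and \(L\)-smooth with \(L = 1/4\), and uses NumPy only for vectorized evaluation over random pairs of points.

```python
import numpy as np

# Assumed example: f(x) = log(1 + e^x), convex and L-smooth with L = 1/4.
def f(x):
    return np.logaddexp(0.0, x)        # log(1 + e^x), numerically stable

def fprime(x):
    return 1.0 / (1.0 + np.exp(-x))    # f'(x) = sigmoid(x)

L = 0.25
rng = np.random.default_rng(0)
x, y = rng.uniform(-10, 10, size=(2, 100_000))

# Cocoercive bound: f(y) >= f(x) + f'(x)(y - x) + (1/(2L)) (f'(y) - f'(x))^2
lhs = f(y)
rhs = f(x) + fprime(x) * (y - x) + (fprime(y) - fprime(x)) ** 2 / (2 * L)
print("cocoercive bound holds:", np.all(lhs >= rhs - 1e-12))

# Symmetric form: (f'(y) - f'(x)) (y - x) >= (1/L) (f'(y) - f'(x))^2
g = fprime(y) - fprime(x)
print("symmetric form holds:  ", np.all(g * (y - x) >= g ** 2 / L - 1e-12))
```

Both checks should print `True`; the small tolerance only guards against floating-point rounding, not against violations of the inequality itself.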