Dynamic Programming – An Introduction #1

Dynamic programming is a super power.

Once you understand what it is, how it works and grasp the many applications it has in both computing and in mathematical optimisation you will feel … special.

Maybe even … elite.

First a bit of history.

Richard Bellman published a very influential set of books and papers in the 1950s and 60s using the technique he named “Dynamic Programming”. Despite the weird name his technique was able to solve some very complex problems in operations research – which is the name for applied mathematics in problems that people in industry, government and defence care about.

DP is based on Bellman’s Principle of Optimality:

An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.

Imagine you are in the town of START which has two roads to the town of END – the High road and the Low road. Now 50 km down the High road is the town of A and 20 km down the Low road is the town of B. Furthermore someone clever has already worked out the shortest path from A to End and B to End which have many towns along the way – perhaps the paths even overlap. Have look at the figure:

Bellman’s optimality principle means in practice that the shortest path from Start to End is the minimum of :

High Road: $$Distance(Start, End)_A =dist(A) + Distance(A, End)$$ or 380 km.
Low Road: $$Distance(Start, End)_B = dist(B) + Distance(B, End)$$ or 410 km.

Bellman Equation: $$Distance(Start,End) = \min_i \{ d_{start,i} + Distance(i, End) $$

The proof is almost by definition. Assume there was a shorter path. Then we get an immediate contradiction: we asserted a priori that we knew the shortest paths from A and B to End. If there exists a shorter path then someone is lying.

I can’t decide if Bellman’s principle is profound or just obvious. Both I guess. But it is an enormous enabler because it allows us to write down a recursive equation for our optimisation problem known as the Bellman Equation. Solving the Bellman equation for a problem may be very difficult in practice but it is the starting point for analysing the problem in depth.

Dynamic Programming problems generally have these kind of features:

“Optimal Substructure”: The solution to the problem can be modelled recursively using solutions to smaller problems. This is the same idea as Bellman’s Optimality Principle and his equation. I also want to say that this is NOT the same thing as “divide and conquer” which relies on solving disjoint sub-problems independently.
Solutions generally involve a sequence of decision operations with each decision depending on the state at that time.
Often you can represent the problem as a graphical model in which you are looking for either the longest or shortest path from beginning to end.

All we have done so far is just give an introduction the main ideas of DP and some of the language involved. So far you won’t have noticed any glowing new super powers but we are on the high road towards dynamic programming enlightenment.

Next I want to show a very simple example of practical DP. From there I will build up to some more complex problems.