Video lectures of MIT course 6-00Fall-2008 “Introduction to Computer Science and Programming” are available as part of open courseware. One of the topics is “Dynamic Programming” where the knapsack problem is discussed. The programming language used in the course is Python. I wish however that scala or haskell was used instead.

The technique used in Dynamic Programming was invented by Bellman. The Professor said that Bellman was under US Government contract to work on something else and so he gave an obscure name “Dynamic Programming” so as to not arouse suspicion — and the name stuck. Reminds me of a Saturday Night Live skit: Dynamic Programming is neither Dynamic nor Programming — discuss among yourselves!

For the 0/1 knapsack problem, the brute force solution is an exhaustive enumeration and this is an exponential time algorithm. The professor then discusses a slightly better algorithm — where some paths are eliminated early — and uses the following Python code to implement it.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 | # 0/1 knapsack problem (python) numCalls = 0 def maxVal(w, v, i, aW): global numCalls numCalls += 1 if i == 0: if w[i] <= aW: return v[i] else: return 0 without_i = maxVal(w, v, i-1, aW) if w[i] > aW: return without_i else: with_i = v[i] + maxVal(w, v, i-1, aW - w[i]) return max(with_i, without_i) w1 = [5,3,2] v1 = [9,7,8] w2 = [1,1,5,5,3,3,4,4] v2 = [15,15,10,10,4,4,5,5] m1 = maxVal(w1, v1, len(v1) -1, 5 ) print (m1, numCalls) numCalls = 0 m2 = maxVal(w2, v2, len(v2) -1, 8 ) print (m2, numCalls) |

For the first problem where the weights are [5,3,2] and values are [9,7,8], the maxvalue of items that can be placed in the knapsack is 15 and the number of calls made is 7. For the second the maxvalue is 40 and number of calls is 85. Though this is better than brute force, it is still an exponential time algorithm.

I have coded a slightly modified version in the scala language. It gives the same answer but makes only 6 and 65 calls respectively. Not sure if this is better as there are potentially extra checks in each call.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 | // 0/1 knapsack problem (scala) object knapsack { def max(a:Int, b:Int):Int = if (a>=b) a else b def knapMax(weights:List[Int], values:List[Int], weightAvailable:Int):(Int,Int) = { var numCalls = 0 def knap(nextIndex:Int, weightAvailable:Int, currentValue:Int):Int = { numCalls += 1 val remainingWeight = weightAvailable - weights(nextIndex) if ( nextIndex == 0 ) if ( remainingWeight >= 0 ) currentValue + values(nextIndex) else currentValue else max (knap(nextIndex-1, weightAvailable, currentValue), if ( remainingWeight > 0 ) knap(nextIndex-1, (weightAvailable - weights(nextIndex)), (currentValue + values(nextIndex))) else if ( remainingWeight == 0 ) currentValue + values(nextIndex) else -1) } val result = knap(weights.length - 1, weightAvailable, 0) (result, numCalls) } } knapsack.knapMax(List (5,3,2), List (9,7,8), 5) knapsack.knapMax(List (1,1,5,5,3,3,4,4), List (15,15,10,10,4,4,5,5), 8 ) |

To visualize the algorithm, I wrote a simple (unoptimized) program in haskell. This doesn’t compute the max value but creates a tree data structure that shows how the algorithm works.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 | -- 0/1 knapsack problem (haskell) import Text.PrettyPrint import System.Cmd (system) data Tree a = Leaf a | Branch a (Tree a) (Tree a) deriving (Eq, Show) -- Create a tree knapsack weightsAndValues weightAllowed = knap startIndex weightAllowed 0 where knap nextIndex weightAvailable currentValue | weightAvailable <= 0 || nextIndex == -1 = Leaf (nextIndex, weightAvailable, currentValue) | otherwise = Branch (nextIndex, weightAvailable, currentValue) (knap (nextIndex-1) weightAvailable currentValue) (knap (nextIndex-1) (weightAvailable - weights !! nextIndex) (currentValue + values !! nextIndex)) (weights, values) = unzip weightsAndValues startIndex = length weights -1 |

We need a way of displaying this tree. To do this I will first convert the Tree to a Graphviz dot file format and then use the dot program to convert that into a png file.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | -- Convert the tree into a dot format treeToDot t = text "digraph g" <+> lbrace $+$ dotNodes t $+$ rbrace where dotNodes (Leaf i) = printInfo i dotNodes (Branch i tl tr) = printInfo i <+> text " -> " <+> dotNodes tl <+> printInfo i <+> text " -> " <+> dotNodes tr printInfo (a,b,c) | a /= -1 && b == 0 = val <+> val <+> text "[color=blue, style=filled]" | b < 0 = val <+> val <+> text "[color=red, style=filled]" | otherwise = val where val = (doubleQuotes.text.show) (a,b,c) -- Use the dot program to convert the .dot file to png knapToPng weightsAndValues weightAllowed fileName = do writeFile (fileName ++ ".dot") $ show $ treeToDot $ knapsack weightsAndValues weightAllowed system ("dot -Tpng " ++ fileName ++ ".dot -o " ++ fileName ++ ".png") main1 = knapToPng (zip [5,3,2] [9,7,8]) 5 "dp1" |

When you load this inside ghci and call main1, it will create a png file which I have displayed below. Each node of the tree shows 3 numbers which are the nextIndex, weightAvailable and currentValue. You start of with (2,5,0). The left fork from any node represents *not taking* the item corresponding to the index; the right fork represents *taking* the item. Note that we have 3 items and so the index values are 2,1 and 0. For e.g., the left and right forks from (2,5,0) represent *taking* and *not taking*, respectively, the item corresponding to index 2 which is the item with weight 2 and value 8.

Red color represents nodes where the weightAvailable is negative, which means that this path will not yield a solution — this can happen anywhere and not just at the leaf nodes. Blue color represents nodes where the weightAvailable is 0, which means that we cannot take any more items. Note that what the python and scala programs do is to take the max of the value from the left fork and the right fork, which recurses into the next level and so on until we terminate — the max values propagate from bottom to top.

As mentioned earlier, though the algorithm is slightly better than brute force, it still needs exponential time. One of the reasons for the inefficiency is due to the fact that the same subproblem is solved several times. This can be observed by looking at the tree corresponding to the problem with weights [1,1,5,5,3,3,4,4] values [15,15,10,10,4,4,5,5] and max allowed weight of 8.

main2 = knapToPng (zip [1,1,5,5,3,3,4,4] [15,15,10,10,4,4,5,5]) 8 “dp2”

You will see that there are multiple arrows from some nodes to other nodes, indicating that the same sub-problem is being solved several times. You can see it clearly by opening this png file in a separate tab. This duplication of effort can be avoided by trading space for time using the technique of memoizing: the first time we solve a problem we store the results, the next time we encounter the same problem we look up the answer. Watch the above video and the others in the series where the Professor discusses memoizing.

The video lectures on this and other topics from Mit Open Courseware are a real boon and I wish other Universities (hint: Stanford) would follow suit.