Video lectures of MIT course 6-00Fall-2008 “Introduction to Computer Science and Programming” are available as part of open courseware. One of the topics is “Dynamic Programming” where the knapsack problem is discussed. The programming language used in the course is Python. I wish however that scala or haskell was used instead.
The technique used in Dynamic Programming was invented by Bellman. The Professor said that Bellman was under US Government contract to work on something else and so he gave an obscure name “Dynamic Programming” so as to not arouse suspicion — and the name stuck. Reminds me of a Saturday Night Live skit: Dynamic Programming is neither Dynamic nor Programming — discuss among yourselves!
For the 0/1 knapsack problem, the brute force solution is an exhaustive enumeration and this is an exponential time algorithm. The professor then discusses a slightly better algorithm — where some paths are eliminated early — and uses the following Python code to implement it.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 | # 0/1 knapsack problem (python) numCalls = 0 def maxVal(w, v, i, aW): global numCalls numCalls += 1 if i == 0: if w[i] <= aW: return v[i] else: return 0 without_i = maxVal(w, v, i-1, aW) if w[i] > aW: return without_i else: with_i = v[i] + maxVal(w, v, i-1, aW - w[i]) return max(with_i, without_i) w1 = [5,3,2] v1 = [9,7,8] w2 = [1,1,5,5,3,3,4,4] v2 = [15,15,10,10,4,4,5,5] m1 = maxVal(w1, v1, len(v1) -1, 5 ) print (m1, numCalls) numCalls = 0 m2 = maxVal(w2, v2, len(v2) -1, 8 ) print (m2, numCalls) |
For the first problem where the weights are [5,3,2] and values are [9,7,8], the maxvalue of items that can be placed in the knapsack is 15 and the number of calls made is 7. For the second the maxvalue is 40 and number of calls is 85. Though this is better than brute force, it is still an exponential time algorithm.
I have coded a slightly modified version in the scala language. It gives the same answer but makes only 6 and 65 calls respectively. Not sure if this is better as there are potentially extra checks in each call.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 | // 0/1 knapsack problem (scala) object knapsack { def max(a:Int, b:Int):Int = if (a>=b) a else b def knapMax(weights:List[Int], values:List[Int], weightAvailable:Int):(Int,Int) = { var numCalls = 0 def knap(nextIndex:Int, weightAvailable:Int, currentValue:Int):Int = { numCalls += 1 val remainingWeight = weightAvailable - weights(nextIndex) if ( nextIndex == 0 ) if ( remainingWeight >= 0 ) currentValue + values(nextIndex) else currentValue else max (knap(nextIndex-1, weightAvailable, currentValue), if ( remainingWeight > 0 ) knap(nextIndex-1, (weightAvailable - weights(nextIndex)), (currentValue + values(nextIndex))) else if ( remainingWeight == 0 ) currentValue + values(nextIndex) else -1) } val result = knap(weights.length - 1, weightAvailable, 0) (result, numCalls) } } knapsack.knapMax(List (5,3,2), List (9,7,8), 5) knapsack.knapMax(List (1,1,5,5,3,3,4,4), List (15,15,10,10,4,4,5,5), 8 ) |
To visualize the algorithm, I wrote a simple (unoptimized) program in haskell. This doesn’t compute the max value but creates a tree data structure that shows how the algorithm works.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 | -- 0/1 knapsack problem (haskell) import Text.PrettyPrint import System.Cmd (system) data Tree a = Leaf a | Branch a (Tree a) (Tree a) deriving (Eq, Show) -- Create a tree knapsack weightsAndValues weightAllowed = knap startIndex weightAllowed 0 where knap nextIndex weightAvailable currentValue | weightAvailable <= 0 || nextIndex == -1 = Leaf (nextIndex, weightAvailable, currentValue) | otherwise = Branch (nextIndex, weightAvailable, currentValue) (knap (nextIndex-1) weightAvailable currentValue) (knap (nextIndex-1) (weightAvailable - weights !! nextIndex) (currentValue + values !! nextIndex)) (weights, values) = unzip weightsAndValues startIndex = length weights -1 |
We need a way of displaying this tree. To do this I will first convert the Tree to a Graphviz dot file format and then use the dot program to convert that into a png file.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | -- Convert the tree into a dot format treeToDot t = text "digraph g" <+> lbrace $+$ dotNodes t $+$ rbrace where dotNodes (Leaf i) = printInfo i dotNodes (Branch i tl tr) = printInfo i <+> text " -> " <+> dotNodes tl <+> printInfo i <+> text " -> " <+> dotNodes tr printInfo (a,b,c) | a /= -1 && b == 0 = val <+> val <+> text "[color=blue, style=filled]" | b < 0 = val <+> val <+> text "[color=red, style=filled]" | otherwise = val where val = (doubleQuotes.text.show) (a,b,c) -- Use the dot program to convert the .dot file to png knapToPng weightsAndValues weightAllowed fileName = do writeFile (fileName ++ ".dot") $ show $ treeToDot $ knapsack weightsAndValues weightAllowed system ("dot -Tpng " ++ fileName ++ ".dot -o " ++ fileName ++ ".png") main1 = knapToPng (zip [5,3,2] [9,7,8]) 5 "dp1" |
When you load this inside ghci and call main1, it will create a png file which I have displayed below. Each node of the tree shows 3 numbers which are the nextIndex, weightAvailable and currentValue. You start of with (2,5,0). The left fork from any node represents not taking the item corresponding to the index; the right fork represents taking the item. Note that we have 3 items and so the index values are 2,1 and 0. For e.g., the left and right forks from (2,5,0) represent taking and not taking, respectively, the item corresponding to index 2 which is the item with weight 2 and value 8.
Red color represents nodes where the weightAvailable is negative, which means that this path will not yield a solution — this can happen anywhere and not just at the leaf nodes. Blue color represents nodes where the weightAvailable is 0, which means that we cannot take any more items. Note that what the python and scala programs do is to take the max of the value from the left fork and the right fork, which recurses into the next level and so on until we terminate — the max values propagate from bottom to top.
![Item Weights [5,3,2] Item Values [9,7,8] Allowed Weight 5 Item Weights [5,3,2] Item Values [9,7,8] Allowed Weight 5](http://blog.srinivasan.biz/wp-content/uploads/2010/02/dp1.png)
As mentioned earlier, though the algorithm is slightly better than brute force, it still needs exponential time. One of the reasons for the inefficiency is due to the fact that the same subproblem is solved several times. This can be observed by looking at the tree corresponding to the problem with weights [1,1,5,5,3,3,4,4] values [15,15,10,10,4,4,5,5] and max allowed weight of 8.
main2 = knapToPng (zip [1,1,5,5,3,3,4,4] [15,15,10,10,4,4,5,5]) 8 “dp2″
![Item Weights [1,1,5,5,3,3,4,4] Item Values [15,15,10,10,4,4,5,5] Allowed Weight 8 Item Weights [1,1,5,5,3,3,4,4] Item Values [15,15,10,10,4,4,5,5] Allowed Weight 8](http://blog.srinivasan.biz/wp-content/uploads/2010/02/dp2.png)
You will see that there are multiple arrows from some nodes to other nodes, indicating that the same sub-problem is being solved several times. You can see it clearly by opening this png file in a separate tab. This duplication of effort can be avoided by trading space for time using the technique of memoizing: the first time we solve a problem we store the results, the next time we encounter the same problem we look up the answer. Watch the above video and the others in the series where the Professor discusses memoizing.
The video lectures on this and other topics from Mit Open Courseware are a real boon and I wish other Universities (hint: Stanford) would follow suit.





February 9th, 2010 at 12:47 pm
Stanford does offer online courses. Check this one. http://see.stanford.edu/see/courses.aspx
March 18th, 2010 at 11:52 pm
Hi Babu,
This may be an unusual request, but I’m looking for a “dumbed down” explanation of this algorithm so that I can make a PHP implementation of the 0-1 Knapsack solve.
I’ve found a number of solutions in various other languages, but none that I am familiar with being primarily a web programmer. One in particular that was ideal for my application was written in Java, and although I recognize some of the control structures, the language primitives are still too foreign for me to make sense of.
http://www.alesdar.org/oldSite.....pplet.html
I need to be able to take a list of items with a weight and value (both positive integers) and return the best optimal combination for a specified maximum weight. The above linked applet works perfectly for my needs (though I don’t need the additional GA algorithm that it’s comparing against).
Any help understanding the code or a simplified explanation would be of tremendous assistance. I will happy post the final PHP code for others encountering the same issue…
Brian
March 19th, 2010 at 12:58 am
OK I think I was just trying too hard to understand the Java and got discouraged… The Python example was a lot easier to understand and translate to PHP…
For those who are interested, here is a direct translation of the first (Python) version above in PHP:
0-1 Knapsack Problem Solve
March 19th, 2010 at 11:27 pm
Here is the memoization optimized version in PHP:
<? ## 0-1 Knapsack Problem Solve (with memoization optimization) # $w = weight of item # $v = value of item # $i = index # $aW = Available Weight # $m = memo array function knapSolveFast($w,$v,$i,$aW,$m) { global $numcalls, $m; $numcalls ++; // echo "Called with i=$i, aW=$aW"; // Return memo if we have one if (isset($m[$i][$aW])) { return $m[$i][$aW]; } else { if ($i == 0) { if ($w[$i] $aW) { $m[$i][$aW] = $without_i; return $without_i; } else { $with_i = $v[$i] + knapSolveFast($w, $v, ($i-1), ($aW - $w[$i]),$m); $res = max($with_i, $without_i); $m[$i][$aW] = $res; return $res; } } } $w3 = array(1, 1, 1, 2, 2, 2, 4, 4, 4, 44, 96, 96, 96); $v3 = array(1, 1, 1, 2, 2, 2, 4, 4, 4, 44, 96, 96, 96); $numcalls = 0; $m = array(); $m3 = knapSolveFast($w3, $v3, sizeof($v3) -1, 54,$m); print_r($w3); echo ""; echo "Max: $m3 ($numcalls calls)"; ?>April 1st, 2010 at 5:54 am
here it is: http://see.stanford.edu/see/courses.aspx
April 4th, 2010 at 10:36 am
Neil, Raveendra also mentioned this. It is a start by Stanford but they have a long way to go. I would call it Open Sample Ware as they have only 10 courses.
Programming Methodology
Programming Abstractions
Programming Paradigms
Introduction to Robotics
Natural Language Processing
Machine Learning
The Fourier Transform and its Applications
Introduction to Linear Dynamical Systems
Convex Optimization I
Convex Optimization II
May 13th, 2010 at 11:33 pm
Hi Babu. I’m a fan of your blog and writing, but especially impressed with the intelligent discussion you continue in your comments.
Thanks for keeping this blog. It’s very inspiring.
July 14th, 2010 at 4:38 pm
Hi Babu, I would also recommend Dynamic Programming Practice Problems for understanding DP. It has flash videos of some common DP problems such as Balanced Partition problem explained in a lucid manner.