
Week 7 - Dynamic Programming

Notes

Matrix-chain multiplication

Given a chain of n matrices \langle A_1, ..., A_i, ..., A_n \rangle , where A_i has dimension p_{i-1} \times p_i, parenthesize it in a way that minimizes the number of scalar multiplications.

Optimal structure

Within an optimal parenthesization, the two sub-chains produced by the outermost split must themselves be parenthesized optimally.

Proof sketch: if some sub-chain had a cheaper parenthesization, substituting it would yield a solution cheaper than the optimum, a contradiction.

Recursively defining

  1. Definition: for 1 \leq i \leq j \leq n, let m[i, j] be the minimum number of scalar multiplications needed to compute the sub-chain A_{i \dots j}

  2. The optimal solution is made of optimal solutions of sub-chains

    • When the problem is trivial: i = j, a single matrix needs no multiplication, so m[i, j] = m[i, i] = 0
    • When the problem is not trivial: i < j, let the optimal outermost split be between A_{i...k} and A_{k+1...j}; the cost is the sum of the two sub-chain costs plus the cost of multiplying the two resulting matrices, which is
    m[i, j] = m[i, k] + m[k + 1, j] + p_{i-1} p_k p_j
  3. Every split point must be considered to guarantee optimality, so we take the minimum over all i \leq k < j:

m[i, j] = \min_{i \leq k < j} \{m[i, k] + m[k + 1, j] + p_{i-1} p_k p_j\}

In conclusion, we recursively define the minimum cost of parenthesizing the product as

\begin{align*} m[i, j] = \begin{cases} 0 &\text{if }i = j\\\\ \min_{i \leq k < j} \{m[i, k] + m[k + 1, j] + p_{i-1} p_k p_j\} &\text{if }i < j\end{cases} \end{align*}

Overlapping subproblems

The subproblems overlap: each m[i, j] is shared by every larger chain that contains the sub-chain [i, j], so a naive recursion would recompute the same subproblem exponentially many times.
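Memoization exploits this overlap: each m[i, j] is computed once and reused. A minimal top-down sketch (illustrative; separate from the bottom-up implementation in the class below):

```python
from functools import lru_cache

def matrix_chain_memo(p):
    """Top-down memoized matrix-chain cost.
    p holds the n+1 dimensions, so A_i has shape p[i-1] x p[i]."""
    n = len(p) - 1

    @lru_cache(maxsize=None)            # each (i, j) solved only once
    def m(i, j):
        if i == j:                      # single matrix: no multiplication
            return 0
        return min(m(i, k) + m(k + 1, j) + p[i - 1] * p[k] * p[j]
                   for k in range(i, j))

    return m(1, n)
```

On the textbook chain p = [30, 35, 15, 5, 10, 20, 25], this returns the same optimum 15125 as the bottom-up table below.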

Computing Optimal

Running time: O(n^3), from three nested loops over chain length, start index, and split point.

class ch15:
    """Algorithms in chapter 15
    """

    def matrix_up_np(self, p):
        """matrix-chain product
        Given a chain <A_1, ..., A_i ..., A_n> of n matrices, where A_i has 
            dimension p_{i-1} \times p_i, fully parenthesize in a way that 
            minimizes the number of scalar multiplications.

        Arguments:
            p {list} -- list of the n+1 chain dimensions p_0, ..., p_n
        """
        import numpy as np

        n = len(p) - 1
        m = np.zeros((n+1, n+1))        # m -> optimal number
        s = np.zeros((n+1, n+1))        # s -> optimal k

        for i in range(1, n+1):
            m[i, i] = 0

        for l in range(2, n+1):         # l -> chain length
            for i in range(1, n-l+2):   # head position
                j = i + l - 1           # tail position
                m[i, j] = float('inf')

                for k in range(i, j):
                    q = m[i, k] + m[k + 1, j] + p[i - 1]*p[k]*p[j]

                    if q < m[i, j]:
                        m[i, j] = q
                        s[i, j] = k

        print(m[1:, 1:])                # start from 1
        print(s[1:, 1:])
        return m, s

    def matrix_print(self, s, i, j):
        """print parentheses

        Arguments:
            s {np.array} -- chart of 'k'
            i {int} -- from
            j {int} -- to
        """
        i, j = int(i), int(j)

        if i == j:
            print('A_'+str(i), end=' ')
        else:
            print('(', end='')
            # print(i, j)
            self.matrix_print(s, i, s[i, j])
            self.matrix_print(s, s[i, j] + 1, j)
            print(')', end='')

>>> quiz = ch15()
>>> p = [30, 35, 15, 5, 10, 20, 25] # same as the textbook sample
>>> _, s = quiz.matrix_up_np(p)
[[    0. 15750.  7875.  9375. 11875. 15125.]
 [    0.     0.  2625.  4375.  7125. 10500.]
 [    0.     0.     0.   750.  2500.  5375.]
 [    0.     0.     0.     0.  1000.  3500.]
 [    0.     0.     0.     0.     0.  5000.]
 [    0.     0.     0.     0.     0.     0.]]
[[0. 1. 1. 3. 3. 3.]
 [0. 0. 2. 3. 3. 3.]
 [0. 0. 0. 3. 3. 3.]
 [0. 0. 0. 0. 4. 5.]
 [0. 0. 0. 0. 0. 5.]
 [0. 0. 0. 0. 0. 0.]]
>>> quiz.matrix_print(s, 1, 6)
((A_1 (A_2 A_3 ))((A_4 A_5 )A_6 ))

Longest common subsequence

Recursively Defining

\begin{align*} z[i, j] = \begin{cases} 0 &\text{ if }i = 0 \text{ or }j = 0\\\\ z[i-1, j-1] + 1 &\text{ if }X_i = Y_j\\\\ \max(z[i-1, j], z[i, j-1]) &\text{ if }X_i \neq Y_j\end{cases} \end{align*}
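The recurrence translates directly into an O(mn) table fill; a minimal sketch (sequences 0-indexed, table 1-indexed as in the recurrence):

```python
def lcs_length(X, Y):
    """Bottom-up LCS length following the z[i, j] recurrence above."""
    m, n = len(X), len(Y)
    z = [[0] * (n + 1) for _ in range(m + 1)]   # z[i][0] = z[0][j] = 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:            # X_i == Y_j
                z[i][j] = z[i - 1][j - 1] + 1
            else:
                z[i][j] = max(z[i - 1][j], z[i][j - 1])
    return z[m][n]
```

For instance, lcs_length("ABCBDAB", "BDCABA") returns 4, matching the textbook example.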

Optimal Binary Search Tree

\begin{align} E &= \sum_{i=1}^n (\text{depth}_T(k_i) + 1)\times p_i + \sum_{i=0}^n (\text{depth}_T(d_i) + 1)\times q_i \\ &= 1 + \sum_{i=1}^n \text{depth}_T(k_i)\times p_i + \sum_{i=0}^n \text{depth}_T(d_i)\times q_i \end{align}

The second line uses \sum_{i=1}^n p_i + \sum_{i=0}^n q_i = 1, since every search ends at exactly one key or dummy key.

Optimal Structure

If an optimal BST contains a subtree, that subtree must itself be an optimal BST for the keys it contains.

Proof sketch: if not, replacing the subtree with an optimal one would yield a BST better than the optimum, contradicting optimality.

Recursively Defining

  1. Definition

    • e[i, j] is the expected search cost of an optimal BST containing keys \langle k_i, \dots, k_j\rangle and dummy keys \langle d_{i-1}, \dots, d_j\rangle.
  2. Push-down cost: e[i, j] is the cost when we access the tree directly. When the tree becomes a subtree of some node, every node in it is pushed down one level, which increases the expected cost by the total probability weight of the subtree:

    w(i, j) = \sum_{r=i}^j p_r + \sum_{r=i-1}^j q_r

    Note that if k_r is the root, then

    w(i, j) = w(i, r-1) + p_r + w(r+1, j)
  3. Thus, for i \leq j, we choose some k_r with i \leq r \leq j as the root

    • The cost of the root node is p_r
    • The cost of the left subtree = its direct-access cost plus one level of push-down
    • The cost of the right subtree = its direct-access cost plus one level of push-down
    \begin{align} e[i, j] &= \text{cost of root node} + \text{cost of left subtree} + \text{cost of right subtree} \\ &= p_r + (e[i, r-1] + w(i, r-1)) + (e[r+1, j] + w(r+1, j))\\ &= e[i, r-1] + e[r+1, j] + (w(i, r-1) +p_r + w(r+1, j))\\ &= e[i, r-1] + e[r+1, j] + w(i, j) \end{align}
  4. In order to find the optimal (minimum) cost above, we shall consider all possible roots:

    e[i, j] = \min_{i\leq r\leq j} \{e[i, r-1] + e[r+1, j] + w(i, j)\}
  5. The trivial case is when there is no actual key, only the dummy key d_{i-1}; this happens when j = i - 1, and the expected cost is then q_{i-1}.

  6. Combining the trivial case with the non-trivial one, we have:

\begin{align} e[i, j] =\begin{cases} q_{i-1} &\text{ if }j = i - 1 \\\\ \min_{i\leq r\leq j} \{e[i, r-1] + e[r+1, j] + w(i, j)\} &\text{ if }i\leq j \\ \end{cases} \end{align}
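The recurrence fills bottom-up in the spirit of CLRS's OPTIMAL-BST; the sketch below is our own illustration (table names e, w, root follow the textbook):

```python
def optimal_bst(p, q):
    """Expected-cost tables for an optimal BST.
    p[1..n]: key probabilities (p[0] unused); q[0..n]: dummy-key probabilities.
    Returns e (expected costs) and root (optimal root choices), 1-indexed."""
    n = len(p) - 1
    e = [[0.0] * (n + 1) for _ in range(n + 2)]     # e[i][j], i in 1..n+1
    w = [[0.0] * (n + 1) for _ in range(n + 2)]     # w[i][j]: probability mass
    root = [[0] * (n + 1) for _ in range(n + 1)]

    for i in range(1, n + 2):                       # trivial case j = i - 1
        e[i][i - 1] = q[i - 1]
        w[i][i - 1] = q[i - 1]

    for l in range(1, n + 1):                       # l: number of keys
        for i in range(1, n - l + 2):
            j = i + l - 1
            e[i][j] = float('inf')
            w[i][j] = w[i][j - 1] + p[j] + q[j]
            for r in range(i, j + 1):               # try every root k_r
                t = e[i][r - 1] + e[r + 1][j] + w[i][j]
                if t < e[i][j]:
                    e[i][j] = t
                    root[i][j] = r
    return e, root
```

On the probabilities of CLRS Figure 15.9, e[1][n] reproduces the textbook's expected cost and root[1][n] its root choice k_2.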

Assignment - Due 11/10/2018

Credit: the question statements are taken from walkccc.

15.1-2

Show, by means of a counterexample, that the following ''greedy'' strategy does not always determine an optimal way to cut rods. Define the density of a rod of length i to be p_i / i, that is, its value per inch. The greedy strategy for a rod of length n cuts off a first piece of length i, where 1 \le i \le n, having maximum density. It then continues by applying the greedy strategy to the remaining piece of length n - i.

\begin{array}{l|ccccc} i & 1 & 2 & 3 & 4 & 5 \\ \hline p_i & 1 & 10 & 12 & 8 & 10 \\ p_i/i & 1 & 5 & 4 & 2 & 2 \\ \end{array}

Given the price table above, suppose the rod length is 3. According to the p_i/i greedy strategy, the rod is cut into \{2, 1\} for an income of 10 + 1 = 11. However, the optimal solution is not to cut at all, for an income of 12.
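The counterexample is easy to check in code; a small sketch of the greedy strategy (the function name is ours):

```python
def greedy_density_cut(p, n):
    """Greedy rod cutting by maximum density p[i] / i.
    p is a 1-indexed price list (p[0] unused); returns total income."""
    income = 0
    while n > 0:
        # choose the sellable length with the highest value per unit
        i = max(range(1, n + 1), key=lambda k: p[k] / k)
        income += p[i]
        n -= i
    return income

p = [0, 1, 10, 12, 8, 10]   # prices from the table above
greedy_density_cut(p, 3)    # cuts {2, 1}: income 11, versus 12 uncut
```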

15.1-3

Consider a modification of the rod-cutting problem in which, in addition to a price p_i for each rod, each cut incurs a fixed cost of c. The revenue associated with a solution is now the sum of the prices of the pieces minus the costs of making the cuts. Give a dynamic-programming algorithm to solve this modified problem.

Top-down method

top-down

def rod_dynamic_down(price={1:1, 2:5, 3:8, 4:9, 5:10, 6:17, 7:17, 8:30}, 
                     avail=21, pool=None, c=0):
    """Get maximum revenue of a rod of length avail, top-down with memoization

    Keyword Arguments:
        price {dict} -- dict of length and price {length:price, }
        avail {int}  -- total rod length (default: {21})
        pool  {dict} -- memoization table; created fresh per top-level call
        c     {int}  -- fixed cost per cut (default: {0})
    """
    if pool is None:                    # avoid a shared mutable default
        pool = {}

    price_len = sorted(price.keys())

    if avail < min(price_len):          # shorter than the smallest piece
        return 0

    if avail in pool:                   # dynamic - found (even if value is 0)
        return pool[avail]

    max_r = 0
                                        # in case rod <= avail
    for rod in [i for i in price_len if i <= avail]:
        cut_cost = 0 if rod == avail else c   # selling whole: no cut
        max_r = max(max_r,
                    price[rod] + rod_dynamic_down(price, avail-rod, pool, c)
                    - cut_cost)

    pool[avail] = max_r                 # dynamic - not found, memoize

    return max_r

Bottom-up method

bottom-up

def rod_dynamic_up(price={1:1, 2:5, 3:8, 4:9, 5:10, 6:17, 7:17, 8:30}, 
                   avail=21, c=0):
    """Bottom-up

    Keyword Arguments:
        price {dict} -- dict of length and price {length:price, }
        avail {int}  -- total rod length (default: {21})
        c     {int}  -- fixed cost per cut (default: {0})
    """
    record = {0: 0, }

    for total in range(1, avail+1):
        max_r = 0

        for cut in range(1, min(total, max(price)) + 1):
            cut_cost = 0 if cut == total else c   # selling whole: no cut
            max_r = max(max_r, price[cut] + record[total - cut] - cut_cost)

        record[total] = max_r

    return record[avail]
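Note that the cost c should only be charged when a cut actually happens: selling the rod whole incurs none. A self-contained sketch of the recurrence with that case explicit (independent of the functions above; the name is ours):

```python
def rod_cut_cost(price, n, c):
    """Bottom-up rod cutting with a fixed cost c per cut.
    price: dict {length: price}; selling a piece whole incurs no cut cost."""
    r = {0: 0}
    for total in range(1, n + 1):
        best = price.get(total, float('-inf'))      # option: no cut at all
        for cut in range(1, total):                 # option: cut off `cut`
            if cut in price:
                best = max(best, price[cut] + r[total - cut] - c)
        r[total] = best
    return r[n]
```

With price {1:1, 2:5, 3:8, 4:9}, a length-4 rod yields 10 when cuts are free (two pieces of length 2) but only 9 when each cut costs 1, since the cut cost cancels the gain from splitting.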

15.2-1

Find an optimal parenthesization of a matrix-chain product whose sequence of dimensions is \langle 5, 10, 3, 12, 5, 50, 6 \rangle.

15.2-3

Use the substitution method to show that the solution to the recurrence \text{(15.6)} is \Omega(2^n).

15.4-1

Determine an \text{LCS} of \langle 1, 0, 0, 1, 0, 1, 0, 1 \rangle and \langle 0, 1, 0, 1, 1, 0, 1, 1, 0 \rangle.

15.4-2

Give pseudocode to reconstruct an \text{LCS} from the completed c table and the original sequences X = \langle x_1, x_2, \ldots, x_m \rangle and Y = \langle y_1, y_2, \ldots, y_n \rangle in O(m + n) time, without using the b table.

15.4-3

Give a memoized version of \text{LCS-LENGTH} that runs in O(mn) time.

15.4-4

Show how to compute the length of an \text{LCS} using only 2 \cdot \min(m, n) entries in the c table plus O(1) additional space. Then show how to do the same thing, but using \min(m, n) entries plus O(1) additional space.

15.5-1

Write pseudocode for the procedure \text{CONSTRUCT-OPTIMAL-BST}(root) which, given the table root, outputs the structure of an optimal binary search tree. For the example in Figure 15.10, your procedure should print out the structure

\begin{align} & \text{$k_2$ is the root} \\ & \text{$k_1$ is the left child of $k_2$} \\ & \text{$d_0$ is the left child of $k_1$} \\ & \text{$d_1$ is the right child of $k_1$} \\ & \text{$k_5$ is the right child of $k_2$} \\ & \text{$k_4$ is the left child of $k_5$} \\ & \text{$k_3$ is the left child of $k_4$} \\ & \text{$d_2$ is the left child of $k_3$} \\ & \text{$d_3$ is the right child of $k_3$} \\ & \text{$d_4$ is the right child of $k_4$} \\ & \text{$d_5$ is the right child of $k_5$} \end{align}

corresponding to the optimal binary search tree shown in Figure 15.9(b).

15.5-2

Determine the cost and structure of an optimal binary search tree for a set of n = 7 keys with the following probabilities

\begin{array}{c|cccccccc} i & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ \hline p_i & & 0.04 & 0.06 & 0.08 & 0.02 & 0.10 & 0.12 & 0.14 \\ q_i & 0.06 & 0.06 & 0.06 & 0.06 & 0.05 & 0.05 & 0.05 & 0.05 \end{array}