Max sum sub-array (1-D)

Brute Force

Simply, we have to check all the possible combinations of (contiguous) subarrays.
We achieve that by a double for loop that grows the size of the subarray and moves it along while keeping track of the max sum and the respective indices that achieved the maximum value.

Runtime: \(\mathcal{O}(n^2)\)

def max_subarr(A):
    max_sum = 0
    ind_i = 0
    ind_j = 0
    for i in range(len(A)):
        total = 0
        for j in range(i, len(A)):
            total += np.sum(A[j])
            if total > max_sum:
                max_sum = total
                ind_i = i
                ind_j = j

Recursion

We could approach this problem from a recrusive perspective by dividing the problem into sub-problems.
We utilize an idea from “MergeSort” and break the array into sub-arrays of which we compute and compare their sums. then we save the sub-array that has the max sum.
We finally compare that to the sub-array that includes both halves overlapping in the middle.

Runtime: \(\mathcal{O}(n)\)

def max_subarr(A, start_ind, end_ind):
    mid = (start_ind + end_ind - 1) / 2
    l_sum, l_i, l_j = max_subarr(A, start_ind, mid)
    r_sum, l_i, r_j = max_subarr(A, mid+1, end_ind)
    i = min(minPLeft, minPRight) # Min of the sums in [start−1, ..., end−1]
    j = max(maxPLeft, maxPRight) # Max of the sums in [start, ..., end]
    c_sum = r_j - l_i # Considering the center and overlap
    M = max(l_sum, r_sum, c_sum)
    return M, i, j

Dynamic Programming

We smartly look at all the sets of subarrays that end with a given index “\(i\)”. If we can do that effeciently enough then we will be able to replicate the performance of the recursive algorithm.

Let \(A_i\) be the maximum subarray sum ending at position \(i\).
Now, we consider the sub-problem, what does \(A_{i+1}\) equal to? what if I already have the solution to \(A_i\)?
Simply,

$${\displaystyle A_{i+1}=max(A[i+1] + A_{i}, A[i+1])} {\displaystyle=max(A_{i+1},A_{i+1}+B_{i})}$$

i.e. The maximum subarray sum ending at position \(i+1\) either will be the singlton element at the \(i+1\)-st position or will be that added to the maximum subarray sum ending at position \(i\), \(A_i\).

Now, if we realize that all we need is to consider the ending indices is just to go over the whole array and grow as needed, we realize that we can do it all in one pass!
\(\implies\)

Runtime: \(\mathcal{O}(n)\)

def max_subarr(A):
    grow_el = A[0]
    max_sum = A[0]
    i = j = c = 0
    for ind in range(1, len(A)):
        i = c if max_sum < grow_el else pass
        j = j if max_sum < grow_el else pass
        grow_el = max(A[ind], grow_el + A[ind])
        max_sum = max(max_sum, grow_el)
        c = i+1 if grow_el < 0 else pass
    return max_sum

Max sum sub-matrix (2-D)

Brute Force

In the 2D case, we could still use a naive approach and try all the possible kernel sizes and just compute a convolution amongst all these kernels.
We will need to make all the possible kernel sizes, there are \(n^4\) of them. For each one of those we need to compute a convolution over the whole matrix, which in turn takes \(\mathcal{O}(n^2)\),
\(\implies\)

Runtime: \(\mathcal{O}(n^6)\)

def max_submat(A):
    max_sum = Integer.MIN_VALUE;
    max_ri = max_ci = max_rj = max_cj = -1

    for ri in range(len(A)):
        for ci in range(len(A)):
            for rj in range(len(A)):
                for cj in range(len(A)):
                    sum = 0
                    for i in range(rj):
                        for j in range(cj):
                            sum += A[i][j]
                    if max_sum < sum:
                        max_sum = sum
                        max_ri = ri
                        max_ci = ci
                        max_rj = rj
                        max_cj = cj
    return max_sum

Dynamic Programming

If we wanted to exploit the optimal substructure approach, then we could utilize a dynamic programming solution that checks every sub-matrix starting at row \(r1\) to \(r2\) and cols \(c1\) to \(c2\), calculates the sums for each sub-matrix, and updating the parameters as needed.

The summation operation on “\(n\)” numbers is actually a linear time operation, and not constant as opposed to what a lot of people actually think. To avoid the problem of summing a huge matrix, we need to compute something called a sum_Matrix, also known as Prefix Sums.
The sum-Matrix will include all the sums in the matrix in such a way that computing the sum of any arbitrary subset in the matrix is still only one operation, reducing it to constant time. Namely, for this matrix,

$$ \left[ \begin{array}{ccc} a & d & g \\ b & e & h \\ c & f & i \end{array} \right] \rightarrow \left[ \begin{array}{ccc} a & d & g \\ a+b & d+e & g+h \\ a+b+c & d+e+f & g+h+i \end{array} \right] $$

\(\implies\)

Runtime: \(\mathcal{O}(n^4)\)
Space-Complexity: \(\mathcal{O}(n^2)\)

Let us assume that we precomputed the sum-matrix and called it sum_Matrix.

def max_submat(A):
    max_sum = MIN_VALUE
    arr = [0 for 0 in range(len(A))]


    for r1 in range(A.shape[0]):
        for r2 in range(r1, A.shape[0]):
            for c1 in range(A.shape[1]):
                for c2 in range(c2, A.shape[1]):
                    max_sum = max(max_sum, max_sum_subarr(arr, n))

    return max_sum

def sum_mat(A, r1, r2, c1, c2):
    if r1 == 0 && c1 == 0;
        return sum_Matrix[r2][c2]
    else if r1 == 0:
        return sum_Matrix[r2][c2] - sum_Matrix[r2][c1-1]
    else if c1 == 0:
        return sum_Matrix[r2][c2] - sum_Matrix[r1-1][c2]
    else:
        return sum_Matrix[r2][c2] - sum_Matrix[r2][c1-1]- sum_Matrix[r1-1][c2] + sum_Matrix[r1-1][c1-1]

Dynamic Programming with max subarray solution

We could, also, utilize the solution to the “Max Sub-Array Problem (1D)”, known as kadanes’ algorithm, by finding sub-solutions to each column and then growing the matrix and keeping track of the paramteres that allows the overall sum to grow.
If we pre-compute the matrix sums (for matrices with size \(> 10^4\)) then we will only need to perform the kadanes algorithm once for every iteration over the columns, where we repeat the operation for each column, assuming we start at the subsequent one everytime.
Now, since the sub-algorithm is only \(\mathcal{O}(n)\) and we only have to run it \(n^2\) times,
\(\implies\)

Runtime: \(\mathcal{O}(n^3)\)
Space-Complexity: \(\mathcal{O}(n^2)\)

def max_submat(A):
    max_sum = MIN_VALUE
    arr = [0 for 0 in range(len(A))]

    for r in range(A.shape[0]):
        for c in range(r, A.shape[1]):
            for i in range(A.shape[0]):
                arr[i] += A[i][c]
        max_sum = max(max_sum, max_sum_subarr(arr, n))

    return max_sum

Branch and Bound Method

Here I will explore the unique approach of replacing the sliding-window search with a different search method, namely, Branch and Bound search.
The default quality function is to use a spatial pyramid kernel with levels of size 1x1, 2x2 … NxN where N can be chosen at runtime. Default is N=1, i.e. bag-of-visual-words model.
\(\implies\)

Runtime: \(\mathcal{O}(n^3) \:\:\:,\:\:\: \mathcal{\Omega}(n^2)\)

from ctypes import Structure, c_int, c_double
from numpy.ctypeslib import load_library, ndpointer

# Box_struct courtesy:
# https://github.com/npinto
class Box_struct(Structure):
        """Structure to hold left,top,right,bottom and score of a box instance.
           The fields have to coincide with the C-version in pyramid_search.h
        """
        _fields_ = [("left", c_int), 
                    ("top", c_int), 
                    ("right", c_int), 
                    ("bottom", c_int), 
                    ("score", c_double) ]

def pyramid(n_pts, w, h, x_pos, y_pos, cluster, bins, lvls, weights):):
    pyramidlib = load_library("libess.so",".")
    pyramidlib.pyramid_search.restype = Box_struct
    pyramidlib.pyramid_search.argtypes = [c_int,c_int,c_int,        
            ndpointer(dtype=c_double, ndim=1, flags='C_CONTIGUOUS'),
            ndpointer(dtype=c_double, ndim=1, flags='C_CONTIGUOUS'),
            ndpointer(dtype=c_double, ndim=1, flags='C_CONTIGUOUS'),
            c_int, c_int,                                           
            ndpointer(dtype=c_double, ndim=1, flags='C_CONTIGUOUS')]

    box = pyramidlib.pyramid_search(n_pts, w, h, 
                      x_pos, y_pos, cluster, bins, lvls, weights)
    return box

def subwindow_search(n_pts, w, h, x_pos, y_pos, cluster, weights):
    return pyramid(n_pts, w, h, x_pos, y_pos, cluster, max(cluster)+1, 1, weights)



Final Thought and Conclusions

Regarding my answers during the Interview

Further Development

Final Comments

Please note that all code and descriptions here were completely written by me.
However, credit was given for the "C++" implementation of the "Box-Struct".
All code, descriptions and explanations are under a public license
Copyright (C) 2017 MIT