Max sum sub-array (1-D)
Brute Force
Simply, we have to check all the possible combinations of (contiguous) subarrays.
We achieve that by a double for loop that grows the size of the subarray and moves it along while keeping track of the max sum and the respective indices that achieved the maximum value.
Runtime: \(\mathcal{O}(n^2)\)
def max_subarr(A):
max_sum = 0
ind_i = 0
ind_j = 0
for i in range(len(A)):
total = 0
for j in range(i, len(A)):
total += np.sum(A[j])
if total > max_sum:
max_sum = total
ind_i = i
ind_j = j
Recursion
We could approach this problem from a recrusive perspective by dividing the problem into sub-problems.
We utilize an idea from “MergeSort” and break the array into sub-arrays of which we compute and compare their sums. then we save the sub-array that has the max sum.
We finally compare that to the sub-array that includes both halves overlapping in the middle.
Runtime: \(\mathcal{O}(n)\)
def max_subarr(A, start_ind, end_ind):
mid = (start_ind + end_ind - 1) / 2
l_sum, l_i, l_j = max_subarr(A, start_ind, mid)
r_sum, l_i, r_j = max_subarr(A, mid+1, end_ind)
i = min(minPLeft, minPRight) # Min of the sums in [start−1, ..., end−1]
j = max(maxPLeft, maxPRight) # Max of the sums in [start, ..., end]
c_sum = r_j - l_i # Considering the center and overlap
M = max(l_sum, r_sum, c_sum)
return M, i, j
Dynamic Programming
We smartly look at all the sets of subarrays that end with a given index “\(i\)”. If we can do that effeciently enough then we will be able to replicate the performance of the recursive algorithm.
Let \(A_i\) be the maximum subarray sum ending at position \(i\).
Now, we consider the sub-problem, what does \(A_{i+1}\) equal to? what if I already have the solution to \(A_i\)?
Simply,
$${\displaystyle A_{i+1}=max(A[i+1] + A_{i}, A[i+1])} {\displaystyle=max(A_{i+1},A_{i+1}+B_{i})}$$
i.e. The maximum subarray sum ending at position \(i+1\) either will be the singlton element at the \(i+1\)-st position or will be that added to the maximum subarray sum ending at position \(i\), \(A_i\).
Now, if we realize that all we need is to consider the ending indices is just to go over the whole array and grow as needed, we realize that we can do it all in one pass!
\(\implies\)
Runtime: \(\mathcal{O}(n)\)
def max_subarr(A):
grow_el = A[0]
max_sum = A[0]
i = j = c = 0
for ind in range(1, len(A)):
i = c if max_sum < grow_el else pass
j = j if max_sum < grow_el else pass
grow_el = max(A[ind], grow_el + A[ind])
max_sum = max(max_sum, grow_el)
c = i+1 if grow_el < 0 else pass
return max_sum
Max sum sub-matrix (2-D)
Brute Force
In the 2D case, we could still use a naive approach and try all the possible kernel sizes and just compute a convolution amongst all these kernels.
We will need to make all the possible kernel sizes, there are \(n^4\) of them. For each one of those we need to compute a convolution over the whole matrix, which in turn takes \(\mathcal{O}(n^2)\),
\(\implies\)
Runtime: \(\mathcal{O}(n^6)\)
def max_submat(A):
max_sum = Integer.MIN_VALUE;
max_ri = max_ci = max_rj = max_cj = -1
for ri in range(len(A)):
for ci in range(len(A)):
for rj in range(len(A)):
for cj in range(len(A)):
sum = 0
for i in range(rj):
for j in range(cj):
sum += A[i][j]
if max_sum < sum:
max_sum = sum
max_ri = ri
max_ci = ci
max_rj = rj
max_cj = cj
return max_sum
Dynamic Programming
If we wanted to exploit the optimal substructure approach, then we could utilize a dynamic programming solution that checks every sub-matrix starting at row \(r1\) to \(r2\) and cols \(c1\) to \(c2\), calculates the sums for each sub-matrix, and updating the parameters as needed.
The summation operation on “\(n\)” numbers is actually a linear time operation, and not constant as opposed to what a lot of people actually think. To avoid the problem of summing a huge matrix, we need to compute something called a sum_Matrix, also known as Prefix Sums.
The sum-Matrix will include all the sums in the matrix in such a way that computing the sum of any arbitrary subset in the matrix is still only one operation, reducing it to constant time.
Namely, for this matrix,
$$ \left[ \begin{array}{ccc} a & d & g \\ b & e & h \\ c & f & i \end{array} \right] \rightarrow \left[ \begin{array}{ccc} a & d & g \\ a+b & d+e & g+h \\ a+b+c & d+e+f & g+h+i \end{array} \right] $$
\(\implies\)
Runtime: \(\mathcal{O}(n^4)\)
Space-Complexity: \(\mathcal{O}(n^2)\)
Let us assume that we precomputed the sum-matrix and called it
sum_Matrix.
def max_submat(A):
max_sum = MIN_VALUE
arr = [0 for 0 in range(len(A))]
for r1 in range(A.shape[0]):
for r2 in range(r1, A.shape[0]):
for c1 in range(A.shape[1]):
for c2 in range(c2, A.shape[1]):
max_sum = max(max_sum, max_sum_subarr(arr, n))
return max_sum
def sum_mat(A, r1, r2, c1, c2):
if r1 == 0 && c1 == 0;
return sum_Matrix[r2][c2]
else if r1 == 0:
return sum_Matrix[r2][c2] - sum_Matrix[r2][c1-1]
else if c1 == 0:
return sum_Matrix[r2][c2] - sum_Matrix[r1-1][c2]
else:
return sum_Matrix[r2][c2] - sum_Matrix[r2][c1-1]- sum_Matrix[r1-1][c2] + sum_Matrix[r1-1][c1-1]
Dynamic Programming with max subarray solution
We could, also, utilize the solution to the “Max Sub-Array Problem (1D)”, known as kadanes’ algorithm, by finding sub-solutions to each column and then growing the matrix and keeping track of the paramteres that allows the overall sum to grow.
If we pre-compute the matrix sums (for matrices with size \(> 10^4\)) then we will only need to perform the kadanes algorithm once for every iteration over the columns, where we repeat the operation for each column, assuming we start at the subsequent one everytime.
Now, since the sub-algorithm is only \(\mathcal{O}(n)\) and we only have to run it \(n^2\) times,
\(\implies\)
Runtime: \(\mathcal{O}(n^3)\)
Space-Complexity: \(\mathcal{O}(n^2)\)
def max_submat(A):
max_sum = MIN_VALUE
arr = [0 for 0 in range(len(A))]
for r in range(A.shape[0]):
for c in range(r, A.shape[1]):
for i in range(A.shape[0]):
arr[i] += A[i][c]
max_sum = max(max_sum, max_sum_subarr(arr, n))
return max_sum
Branch and Bound Method
Here I will explore the unique approach of replacing the sliding-window search with a different search method, namely, Branch and Bound search.
The default quality function is to use a spatial pyramid kernel with levels of size 1x1, 2x2 … NxN where N can be chosen at
runtime. Default is N=1, i.e. bag-of-visual-words model.
\(\implies\)
Runtime: \(\mathcal{O}(n^3) \:\:\:,\:\:\: \mathcal{\Omega}(n^2)\)
from ctypes import Structure, c_int, c_double
from numpy.ctypeslib import load_library, ndpointer
# Box_struct courtesy:
# https://github.com/npinto
class Box_struct(Structure):
"""Structure to hold left,top,right,bottom and score of a box instance.
The fields have to coincide with the C-version in pyramid_search.h
"""
_fields_ = [("left", c_int),
("top", c_int),
("right", c_int),
("bottom", c_int),
("score", c_double) ]
def pyramid(n_pts, w, h, x_pos, y_pos, cluster, bins, lvls, weights):):
pyramidlib = load_library("libess.so",".")
pyramidlib.pyramid_search.restype = Box_struct
pyramidlib.pyramid_search.argtypes = [c_int,c_int,c_int,
ndpointer(dtype=c_double, ndim=1, flags='C_CONTIGUOUS'),
ndpointer(dtype=c_double, ndim=1, flags='C_CONTIGUOUS'),
ndpointer(dtype=c_double, ndim=1, flags='C_CONTIGUOUS'),
c_int, c_int,
ndpointer(dtype=c_double, ndim=1, flags='C_CONTIGUOUS')]
box = pyramidlib.pyramid_search(n_pts, w, h,
x_pos, y_pos, cluster, bins, lvls, weights)
return box
def subwindow_search(n_pts, w, h, x_pos, y_pos, cluster, weights):
return pyramid(n_pts, w, h, x_pos, y_pos, cluster, max(cluster)+1, 1, weights)
Final Thought and Conclusions
Regarding my answers during the Interview
- During the interview I was thinking of the dynamic programming approach of trying out the matrices and growing them in sizes after having precomputed their sum values.
- I, also, tried exlporing the \(\mathcal{O}(n^3)\) after you discusses the 1D approach.
- The branch and bound method is interesting but solves the problem from a different perspective.
Further Development
-
I believe that the \(\mathcal{O}(n^3)\) utilizing Kadane algorithm could be improved by calling the algorithm only in the first loop (not the second) by smartly computing the overlapping values and going across cols then rows instead (two runs, i.e. constant).
This will lead the algorithm to be \(\mathcal{O}(n^2)\) instead but the idea needs further exploration. -
Another approach would be to rely on looking at the distribution of the numbers in the matrix (linear), then to sample smartly using an ML approach, perhaps by fitting a hough transform that detects large sum “chunks”.
Final Comments
- I will be updating this post whenever I have time.
- Code has been Unit-Tested and most but not all has been stress-tested with edge-cases.