Home » My second favorite interview question

My second favorite interview question

An analysis of my second favorite interview question

Approximate reading time: 6 minutes

Current weather: 23°C (Partly cloudy)

By Beto Dealmeida (roberto@dealmeida.net) Este, sou eu on 2020-09-18 under engineering programming interview

Recently I wrote a blog post analyzing my favorite interview question. Here's my second favorite:

Given an array of integers and an integer k, find the maximum sum that can be made with k contiguous numbers.

Another way of saying this is: "what is the maximum sum of all possible subsequences of size k?"

This is just a warmup question. Interviews are stressful and unnatural, so I like putting candidates at ease by starting with a question that they probably will know the answer. All candidates are able to come up with a brute force solution to this question, and half of the candidates will start with the optimal solution.

Initial solution

The brute force solution iterates over the sum of all possible subsequences:

def find_max_sum(numbers: List[int], k: int) -> int:
    max_sum = -float("inf")

    for i in range(len(numbers) - k):
        candidate = 0
        for j in range(k):
            candidate += numbers[i + j]

        max_sum = max(max_sum, candidate)

    return max_sum

Notice that we're not doing any kind of error checking here — for example, verifying that k is less than the size of the array. Good candidates will talk about error checking, and discuss what would be the return type. I usually don't ask them to code the error handling, being satisfied with them just mentioning it.

The time complexity of the solution is O(nk), and the space complexity is constant. Usually weak candidates will initialize max_sum to zero, and in that case I ask what would be the solution if all numbers in numbers were negative.

A better solution

Since this is a warmup question, I usally try to discuss the question with the candidate until they've come up with a moving window solution. Only then I ask them to translate the solution to code. It usually looks something like this:

def find_max_sum(numbers: List[int], k: int) -> int:
    max_sum = 0
    for i in range(k):
        max_sum += numbers[i]

    candidate = max_sum

    for i in range(k, len(numbers)):
        candidate += numbers[i]
        candidate -= numbers[i - k])
        max_sum = max(max_sum, candidate)

    return max_sum

This solution is linear in time (as fast as possible), and also uses constant space. It works by computing the sum of a window of size k, and then moving that window one step at a time, by summing the new value and subtracting the one that is no longer part of the window. For example, for a k equals to 3:

[1  2  3] 4  5  6  =>  1 + 2 + 3 = 6
 1 [2  3  4] 5  6  =>  6 + 4 - 1 = 9
 -        +
    2 [3  4  5] 6  =>  9 + 5 - 2 = 12
    -        +

The main question

Once the candidate has answered the warmup question, I move to the main question:

Given an array of integers and an integer k, find the maximum sum that can be made with 3 non-overlapping windows of size k.

Good candidates quickly realize that this is a much harder question. They will try to reuse their initial solution to the warmup question, coming up with a greed solution. I usually give them this example, which does not work with a greedy solution:

1  4  6  2  1  1  4  7  8  5  1

For the input above and k equals to 2 the answer should be:

1 [4  6] 2  1  1 [4  7][8  5] 1  =>  4 + 6 + 4 + 7 + 8 + 5 = 34

But a greedy solution would first find [7 8], then [4 6], and finally [5 1], giving as an answer 31.

Some candidates will assume that this is a dynamic programming question. It can be solved with dynamic programming, but I would always tell candidates thay they were free to use dynamic programming, but it was not necessary. In general I tend to find dynamic programming solutions too opaque, and I have a hard time getting good signal from candidates when they use it.

Brute force solution

Most candidates realize that the brute force solution has a cubic time complexity. If they can't come up with a better solution I ask them to code the brute force solution, mentioning that we can always try to improve it later. Candidates usually write an iterative solution similar to this:

def find_max_sum_3(numbers: List[int], k: int) -> int:
    max_sum = -float("inf")
    n = len(numbers)
    for i in range(n - 3 * k):
        for j in range(i + k, n - 2 * k):
            for l in range(j + 2 * k, n - k):
                candidate = sum(numbers[i:i+k]) + sum(numbers[j:j+k]) + sum(numbers[l:l+k])
                max_sum = max(max_sum, candidate)

    return max_sum

The time complexity is now O(n^3 k). A cleaner and more efficient solution uses two pointers to split the array into 3 subsequences:

def find_max_sum_3(numbers: List[int], k: int) -> int:
    max_sum = -float("inf")
    n = len(numbers)
    for i in range(n - 2 * k):
        for j in range(i + k, n - k):
            candidate = (
                find_max_sum(numbers[:i]) +
                find_max_sum(numbers[i:j]) +
                find_max_sum(numbers[j:])
            )
            max_sum = max(max_sum, candidate)

    return max_sum

I like this solution because it reuses the solution from the warmup question. The time complexity is now O(n^3), which is a small improvement. Space complexity can be improved, since slicing Python lists creates new lists, instead of views. We can modify the original solution to take indexes:

def find_max_sum(numbers: List[int], k: int, start: int, end: int) -> int:
    max_sum = 0
    for i in range(start, start + k):
        max_sum += numbers[i]

    candidate = max_sum

    for i in range(start + k, end):
        candidate += numbers[i]
        candidate -= numbers[i - k])
        max_sum = max(max_sum, candidate)

    return max_sum

def find_max_sum_3(numbers: List[int], k: int) -> int:
    max_sum = -float("inf")
    n = len(numbers)
    for i in range(n - 2 * k):
        for j in range(i + k, n - k):
            candidate = (
                find_max_sum(numbers, 0, i) +
                find_max_sum(numbers, i, j) +
                find_max_sum(numbers, j, n)
            )
            max_sum = max(max_sum, candidate)

    return max_sum

Comments

You can engage with this post on Twitter or Webmention.

Tao etc.

Musings about the path and other things