Binary Search Efficiency

Section 26.5 Binary Search Efficiency

Subsection 26.5.1 Iterative Analysis

Below is a simplified version of the binary search algorithm:

int low = 0;
int high = vec.size() - 1;

while (low <= high) {
    int mid = low + (high - low) / 2;
    T midValue = vec.at(mid);

    if (midValue == key) {
    } else if (midValue < key) {
        low = mid + 1;
    } else {
        high = mid - 1;
    }
}
return -1;

As always, the indexing, comparisons, arithmetic and assignment operations are all constant time. As is asking for the size of a vector. So we are simply left with:

🔗

// O (1)
while (low <= high) {
  // O (1)
}
// O (1)

We are simply left with the loop. How many times will it run? The loop runs until the low index is greater than the high index. Each iteration of the loop cuts the search space in half. So if we start with \(n\) items between low and high, after one iteration we have at most \(n/2\) items left to check. After two iterations we have at most \(n/4\) items left to check, and so on.

🔗

\(n ... n/2 ... n/4 ... n/8 ... n/16 ... \) is a classic logarithmic sequence. The number of steps before the search space is reduced to 1 is \(\log_2 n\text{.}\) So, we now have:

🔗

// O (1)
loop O(log n) times {
  // O (1)
}
// O (1)

The loop represents \(O(\log n) \cdot O(1) = O(\log n)\) work in total and dominates the extra constant time work that happens outside the loop. So our binary search algorithm has a time complexity of \(O(\log n)\text{.}\)

🔗

Subsection 26.5.2 Recursive Analysis

We should expect the recursive version of binary search to have the same time complexity as the iterative version. Let’s see if that is the case.

🔗

Here is the recursive version again:

🔗

int binarySearchRecursive(const vector<T>& vec, T key, int lowIndex, int highIndex) {
    if (lowIndex > highIndex) {
        return -1; // Base case: key not found
    }
    int midIndex = lowIndex + (highIndex - lowIndex) / 2;
    T midValue = vec.at(midIndex);
    if (midValue == key) {
        return midIndex; // Key found
    } else if (midValue < key) {
        return binarySearchRecursive(vec, key, midIndex + 1, highIndex);
    } else {
        return binarySearchRecursive(vec, key, lowIndex, midIndex - 1);
    }
}

All of the work is constant other than the recursive calls. Which leaves:

🔗

int binarySearchRecursive(const vector<T>& vec, T key, int lowIndex, int highIndex) {
    // O(1)
    if (midValue == key) {
        // O(1)
    } else if (midValue < key) {
        return binarySearchRecursive(vec, key, midIndex + 1, highIndex);
    } else {
        return binarySearchRecursive(vec, key, lowIndex, midIndex - 1);
    }
}

Note that although there are two recursive calls, we only do one or the other. Which ever recursive call is made, the size of the problem is reduced by half. So, we are left with a recursive call that operates on a problem of size \(n/2\text{.}\)

🔗

Intuitively, we see the same pattern as with iteration. Our recursive calls will be on a range of size \(n/2\text{,}\) then \(n/4\text{,}\) then \(n/8\text{,}\) and so on, until we run out of items. This will mean \(\log n\) recursive calls.

🔗

We can also see that the pattern matches one of the recurrence relation we defined in Section 25.12. To do binary search on \(n\) items, you have to do some constant work and then do binary search on \(n/2\) items. That corresponds to \(T(n) = T(n/2) + O(1)\text{,}\) which solves to \(O(\log n)\text{.}\)

🔗

Subsection 26.5.3 Average Case

As with linear search, the average case for binary search will end up the same as the worst case. However it is even trickier to reason about.

🔗

Imagine starting a list of items. In our first iteration, we check the middle item. That one item can be found in one step. If it is not the correct item, then we will either check the middle of the upper half or the middle of the lower half. Those two items can be found in two steps. If we still haven’t found the item, we will be checking the middle of one of the quarters of the original list. On this third iteration, there are 4 things that could potentially be found.

🔗

Figure 26.5.1. The items that can be found in the first three iterations of a binary search on 15 items. The first iteration checks index 7. The second iteration checks indexes 3 or 11. The third iteration checks indexes 1, 5, 9, or 13.

🔗

The next iteration (the fourth) would get us to 8 possible items. Assuming there were more items, the fifth iteration would get us to 16 more items, and so on. Because the number of items that can be found doubles with each iteration, half of the items will not be reached until the final iteration. (Once we have managed to reach half the potential items, one more doubling will reach all the remaining items.) These items at the last iteration will require \(\log n\) steps to find.

🔗

So there are \(\frac{n}{2}\) items that take the full \(\log n\) time to be found. If we assume all the other items are “free” (cost nothing), we still have \(\frac{n \cdot \log n}{2}\) total work represented by these items from the last iteration. The average work per item will be:

🔗

\begin{equation*} \frac{\text{total work}}{\text{number of items}} = \frac{\frac{n \cdot \log n}{2}}{n} = \frac{\log n}{2} \end{equation*}

Of course, \(\frac{\log n}{2}\text{,}\) is \(O(\log n)\text{.}\) So even assuming half the items were “free”, the average work is still \(O(\log n)\text{.}\)

🔗

Binary search has a time complexity of \(O(\log n)\) in the worst and average cases.

🔗

You have attempted of activities on this page.

🔗

Prev Top Next