Skip to main content
Logo image

Section 12.1 Searching Algorithms

Computers can store vast amounts of data. And lots of programs exist to find things in that data. For instance, if we have a dictionary of the million-plus words in English we might want to be able to check whether a given word is in the dictionary. At a much smaller scale, if we manage or our todo list with a computer program, that program might need to be able to find the item that is due soonest.
Both of these are examples of search algorithms. In both cases there is some set of dataโ€”all the words in our dictionary or the items in or todo listโ€”and we want the program to find a particular item or tell us that itโ€™s not present.
In this chapter we will cover two of the most important kinds of search algorithms: linear (or sequential) and binary. For the AP CSA exam you will need to be able to implement linear searches (good news, we already learned how to do that back in Subsectionย 6.3.3ย Searching) and to analyze the behavior of linear and binary searches.
The following video is also on YouTube at https://youtu.be/DHLCXXX1OtE. It introduces the concept of searching including sequential search and binary search.

Subsection 12.1.2 Linear search with 2D arrays

We can also apply the linear search algorithm to data in a 2D array. There are two ways to think about this, which makes sense since there are two dimensions.
One way is if we think of the 2D array as a grid of individual values and we are searching for one specific value. For instance, if the grid represented a chess board and we were searching for the position of the black queen, we could loop over all the row of the 2D array and and then loop through the values in each row, checking each position to see if it held black queen. We are essentially searching linearly through the rows and then again within each row.
The code below demonstrates this with a 2D array of integers. Click on the Code Lens button to step through this code. Then, change it to work with a 2D array of Strings. Remember to use the equals method to compare Strings.

Activity 12.1.6.

What will the following code print? Click on the Code Lens button to step through this code. Can you change the code to work for a String 2D array instead of an int array? Note that the indices row and col will still be ints. Remember to use the equals method to compare Strings.
The other way to think about searching a 2D array is to think of the array as a table of data, such we might load from a CSV file as we discussed in Sectionย 11.1ย Files. When we have that kind of date, often what weโ€™re searching for is a row within that table that matches some criteria. In this case weโ€™d loop over the rows of the 2D array and then for each row test whether it matches our criteria. That could be as simple as checking whether some value at a given position in the row is a particular value. But it might also require us to do a linear search within the row such as if weโ€™re looking for a row that contains a particular value in any position.

Subsection 12.1.4 Algorithmic run time

At a high level, linear search and binary search are two different algorithms that solve the same problem. How do we choose between them? Sometimes the choice is easy: if our array or list is not sorted we pretty much have to use a linear search.
But suppose our data is sorted and we could use either. Should we always use a binary search? What does it mean when we say binary search is faster than linear search? Is that always true?
Computer scientists like to answer these questions by analyzing algorithms in terms of their run time which isnโ€™t measured by running a program and timing it but rather by characterizing, slightly abstractly, how many steps the algorithm will have to execute before it finishes. In particular computer scientist think about the worst case run times of algorithms, meaning how many steps will the algorithm take if given the worst possible input.
With searching, the abstract step is something like, how many items does the algorithm have to compare to the thing it is searching for and the worst case is usually if the thing we are searching for isnโ€™t there to be found.
A linear search, searching an array would have to look at every item in the array before it could say that the item wasnโ€™t present. By contrast a binary search would only have to look at a handful of items before the search space would collapse and it could determine that the search had failed.
Finally, the thing computer scientists most often care about is not even how many steps an algorithm takes on any given input but how the worst case run time changes as the size of the problem changes. For searching it makes sense that it would take longer to search a bigger array or list. And indeed both linear and binary searches take longer, the more things they have to search. But we can actually compare how they change.
If we define run time of a search algorithm as the number of elements the algorithm has to compare to the target and the worst case run time occurring when the target isnโ€™t present, then itโ€™s easy to analyze linear search: the worst case run time is the size of the array being searched, because the algorithm has to look at every element of the array before it can be sure the target isnโ€™t present.
Binary search on the other hand, doesnโ€™t always need to look at every element. It only needs to look at the middle elements of ever-shrinking search spaces until the search space is just one element. After comparing the middle element of the array to the target, it will reduce the search space by half which means half of the elements will never be looked at. And half of the remaining half will be eliminated with one more comparison.
The table below shows how both the linear and binary search run times change. Notice how doubling the size of of the input n adds only one more comparison with binary search. So even though for a two-element array linear search and binary search take the same number of steps, because the number of steps increases so much slower as the size of the input grows we say that binary search is faster than linear search.
Table 12.1.2.
N Linear Search Binary Search
2 2 comparisons 2 comparisons
4 4 3
8 8 4
16 16 5
100 100 7
Run times can be described with mathematical functions that describe the rough mathematical relationship between the size of the input and the run time of the algorithm. For instance, since linear searchโ€™s run time grows by a constant amount with every increase in the input size, we can describe it with the function \(f(n) = n\text{.}\) As \(n\) (the input size) gets bigger, so does the run time. Because we are analyzing things abstractly, we donโ€™t worry about coefficientsโ€”if every additional element we are searching actually adds two steps rather than one, thatโ€™s still a linear function. Computer scientists notate algorithmic run times by putting the function that describes the growth of an algorithmโ€™s run time in parentheses preceded by an \(O\text{,}\) which stands for โ€œorder of approximationโ€. This is called big-O notation. Thus, the big-O notation for linear search is \(O(n)\text{.}\)
Binary searchโ€™s run time grows logarithmically since the number of steps is basically log base 2 of the size of the input, so \(f(n) = \log_{2}n\text{.}\) But again, we donโ€™t care about coefficients and the base of the logarithm is really just a coefficient, so a computer scientist would describe binary searchโ€™s run time as \(O(\log n)\text{.}\)
You wonโ€™t be asked about big-O notation on the AP exam, but you should be able to calculate how many steps binary search takes for a given input size. You can figure it out by counting how many times you can divide it the size in half. Or you can start at 1 and keep a count of how many times you can double it with the powers of two (1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, etc.) until you reach a number that is slightly above the original size.

Activity 12.1.9.

Which will cause the shortest execution of a binary search looking for a value in an array of integers?
  • The value is the first one in the array
  • This would be true for sequential search, not binary.
  • The value is in the middle of the array
  • If the value is in the middle of the array the binary search will return after one iteration of the loop.
  • The value is the last one in the array
  • How would that be the shortest in a binary search?
  • The value isnโ€™t in the array
  • This is true for the longest execution time, but we are looking for the shortest.

Activity 12.1.10.

Which of the following conditions must be true in order to search for a value using binary search?
I. The values in the array must be integers.
II. The values in the array must be in sorted order.
III. The array must not contain duplicate values.
  • I only
  • You can use a binary search on any type of data that can be compared, but the data must be in order.
  • I and II
  • You can use a binary search on any type of data that can be compared.
  • II only
  • The only requirement for using a Binary Search is that the values must be ordered.
  • II and III
  • The array can contain duplicate values.

Activity 12.1.11.

How many times would the loop in the binary search run for an array int[] arr = {2, 10, 23, 31, 55, 86} with search(arr,55)?
  • 2
  • It will first compare with the value at index 2 and then index 4 and then return 4.
  • 1
  • This would be true if we were looking for 23.
  • 3
  • This would be true if we were looking for 31.

Activity 12.1.12.

If you had an ordered array of size 500, what is the maximum number of iterations required to find an element with binary search?
  • approximately 15 times
  • How many times can you divide 500 in half?
  • approximately 9 times
  • You can divide 500 in half, 9 times, or you can observe that 2^9 = 512 which is slightly bigger than 500.
  • 500 times
  • How many times can you divide 500 in half?
  • 2 times
  • How many times can you divide 500 in half?

Subsection 12.1.5 Coding Challenge: Search Run Times

Letโ€™s go back to the spellchecker that we created with arrays. Here is a version of the spellchecker below that reads the dictionary file into an ArrayList. The advantage of using an ArrayList instead of an array for the dictionary is that we do not need to know or declare the size of the dictionary in advance.
In the spellchecker challenge, we used linear search to find a word in the dictionary. However, the dictionary file is actually in alphabetical order. We could have used a much faster binary search algorithm! Letโ€™s see how much faster we can make it.
Write a linear search method and a binary search method to search for a given word in the dictionary using the code in this lesson as a guide. You will need to use size and get(i) instead of [] to get an element in the ArrayList dictionary at index i. You will need to use the equals and compareTo methods to compare Strings. Have the methods return a count of how many words they had to check before finding the word or returning.

Project 12.1.13.

This spellchecker uses an ArrayList for the dictionary. Write a linearSearch(word) and a binarySearch(word) method. Use get(i), size(), equals, and compareTo. Return a count of the number of words checked.
Run your code with the following test cases and record the run time for each word in this Google document (do File/Make a Copy) also seen below to record your answers.
What do you notice? Which one was faster in general? Were there some cases where each was faster? How fast were they with misspelled words? Record your answers in the window below.

Subsection 12.1.6 Summary

  • (AP 4.14.A.1) Linear search algorithms are standard algorithms that check each element in order until the desired value is found or all elements in the array or ArrayList have been checked. Linear search algorithms can begin the search process from either end of the array or ArrayList.
  • (AP 4.14.A.2) When applying linear search algorithms to 2D arrays, each row must be accessed then linear search applied to each row of the 2D array.
  • The binary search algorithm starts at the middle of a sorted array or ArrayList and eliminates half of the array or ArrayList in each iteration until the desired value is found or all elements have been eliminated.
  • (AP 4.17.B.1 preview) Data must be in sorted order to use the binary search algorithm.
  • (AP 4.17.B.2 preview) Binary search is typically more efficient than linear search.
  • (AP 4.17.B.3 preview) The binary search algorithm can be written either iteratively or recursively. (The recursive solution will be presented in lesson 4.17).
  • Informal run-time comparisons of program code segments can be made using statement execution counts.
You have attempted of activities on this page.