Big O Notation and Search Algorithms
Updated: Oct 14, 2020
A few weeks ago, I was assigned a task to create a python number guessing game, and It got me thinking, What if the computer guessed the numbers?. I looked on the internet and found something called binary and linear search. Here’s how it works.
Let's say we have a randomly generated number, we need our program to guess the number. We have a minimum value the random number can be, and the maximum.
So if you want to find the number you would probably search through every number from the minimum number to the maximum number using a for loop. On every iteration we can check if it is equal to the random number. This is called linear search.
Here’s some code in python:
from random import randint min_val = 0 max_val = 100 random_num = randint(min_val, max_val) attempts = 0 for i in range(min_val, _max_val): if i == random_num: print("The number was: " + str(random_num)) break attempts += 1 print("The number of guesses was: " + str(attempts))
This works, but in most cases it’s not as efficient as binary search.
Now this is where I came across big O notation.
Big O notation is used to show how the run time grows as the input size grows.
Something like getting the first item of an array always takes the same amount of time no matter the size of the array, so it’s big O notation would be O(1).
If we loop through an array, the time it takes depends on the size of the array, as the size goes up, the time it takes goes up. The ratio of time to size of the array is 1:1, the big O notation would be O(n), n being the size of the array.
Here’s some examples of different big O notations on a graph:
In our linear search example you can see that the for loop goes through every number between the maximum value and the minimum until it finds the random number. On average it goes through about half of the max value before it finds the number. So the big O notation would be O(n/2).
Ok, let's take a look at binary search. In binary search we don’t need to go through every possible number. Let’s say our minimum value is 0 and our maximum value is 100. The random number will be 32. First we find the halfway point between the two numbers, in this case it’s 50. Now we check if 50 is bigger or smaller than the random number. 50 is bigger than 32. Now we know the number is not bigger than 50, so it must be between 0 and 50. Next, let's find the halfway point between 0 and 50. In this case it’s 25. We check if it’s lower or higher than the random number. 25 is lower than 32. So we now know the random number is between 25 and 50. And we keep going, eventually we will find our random number! And that’s how binary search works!
And here’s a binary search algorithm I made in python:
from random import randint min_val = 0 max_val = 100 random_num = randint(min_val, max_val) attempts = 0 while True: guess = round((min_val + max_val) / 2) if guess == randnum: print("The number was: " + str(random_num)) break elif guess > randnum: print("Too big: " + str(guess)) max_val = guess elif guess < randnum: print("Too small: " + str(guess)) min_val = guess attempts += 1 print("The number of guesses was: " + str(attempts))
The big O notation of binary search (on average) is O(log n). Let me explain what that means. A logarithmic function is the opposite of an exponential function. When you say something grows exponentially, it’s being multiplied. When something grows logarithmically, it is being divided. In this case the time it takes doesn’t linearly increase with the amount of possible numbers. So using binary search for a number between 1 and 50 it doesn’t make much of a difference to using linear search, but with huge numbers, using binary search makes a big difference.
If you would like to experiment with the code feel free to copy and paste it!