Types of searches
- finds info in a data structure
Sequential/Linear search
- only option if not sorted and want to know if the list contains an item
- takes O(n) time
- searches each item in the data structure one at a time to see if it matches what it is looking for
- example:
def ss(number_list, n): found = False for i in number_list: if i == n: found = True break return found
Binary search
- can use if have a sorted list
- more efficient that linear search (because takes less iterations), takes O(logn) time
- a series of iterations that divide a sorted list in half to drill down to a specific element
- algorithm:
1 find middle card (if this is what you are searching for then you are done, otherwise...)
- determine if you are searching for something higher or lower
- then we will only search in the half our item is in
- repeat steps i-iii until find out item, or search is completed and our item is not in the list
- example:
def binary_search(the_list, target_item) first = 0 last = len(the_list) - 1 found = False while first <= last and not found: mid = (first + last) // 2 if the_list[mid] == item: found = True else: if item < the_list[mid]: last = mid - 1 else: first = mid + 1 return found
Finding Duplicates
- looping through to check for each item one at a time has a time complexity of O(n^2), which takes too long for large databases or lists
- can use Python sets ---
set()
- a set is a data structure that is a collection of unique(=> no duplicates), unordered, immutable objects
- example:
printsa_set = set() a_set.add('Susan Adams') a_set.add('Kwame Goodall') a_set.add('Jill Hampton') print(a_set)
{'Jill Hampton', 'Kwame Goodall', 'Susan Adams'}
- items are added to the beginning of a set, not the end
- above example continued:
printsa_set.add('Susan Adams') print(a_set)
{'Jill Hampton', 'Kwame Goodall', 'Susan Adams'}
- 'Susan Adams' was not added again and the list remains 3 items long
- to use
set()
to find the duplicates:- add items to set one by one
- if the length of the set does not change, then we know the item we tried to add was a duplicate
- example:
printsdef return_duplicates(a_list): dups = [] a_set = set() for item in a_list: length_one = len(a_set) a_set.add(item) length_two = lenlen(a_set) if length_one == length_two: dups.append(item) return dups a_list = ['Susan Adams', 'Jill Hampton', 'Kwame Goodall', 'Susan Adams'] dups = return_duplicates(a_list) print(dups)
['Susan Adams']
- algorithm:
- create an empty set and a new list
- use a for-loop to step through each item in the list
- get length of set
- attempt to add an item from the list into the set
- get length of set
- check if the length has changes
- append the item to the new list