Types of searches

  • finds info in a data structure
  • only option if not sorted and want to know if the list contains an item
  • takes O(n) time
  • searches each item in the data structure one at a time to see if it matches what it is looking for
  • example:
    def ss(number_list, n):
      found = False
      for i in number_list:
          if i == n:
              found = True
              break
      return found
  • can use if have a sorted list
  • more efficient that linear search (because takes less iterations), takes O(logn) time
  • a series of iterations that divide a sorted list in half to drill down to a specific element
  • algorithm: 1 find middle card (if this is what you are searching for then you are done, otherwise...)
    1. determine if you are searching for something higher or lower
    2. then we will only search in the half our item is in
    3. repeat steps i-iii until find out item, or search is completed and our item is not in the list
  • example:
    def binary_search(the_list, target_item)
        first = 0
        last = len(the_list) - 1
        found = False
        while first <= last and not found:
            mid = (first + last) // 2
            if the_list[mid] == item:
                found = True
            else:
                if item < the_list[mid]:
                    last = mid - 1
                else:
                    first = mid + 1
        return found

Finding Duplicates

  • looping through to check for each item one at a time has a time complexity of O(n^2), which takes too long for large databases or lists
  • can use Python sets --- set()
    • a set is a data structure that is a collection of unique(=> no duplicates), unordered, immutable objects
    • example:
      a_set = set()
      a_set.add('Susan Adams')
      a_set.add('Kwame Goodall')
      a_set.add('Jill Hampton')
      print(a_set)
      prints {'Jill Hampton', 'Kwame Goodall', 'Susan Adams'}
    • items are added to the beginning of a set, not the end
    • above example continued:
      a_set.add('Susan Adams')
      print(a_set)
      prints {'Jill Hampton', 'Kwame Goodall', 'Susan Adams'}
      • 'Susan Adams' was not added again and the list remains 3 items long
    • to use set() to find the duplicates:
      • add items to set one by one
      • if the length of the set does not change, then we know the item we tried to add was a duplicate
    • example:
      def return_duplicates(a_list):
          dups = []
          a_set = set()
          for item in a_list:
              length_one = len(a_set)
              a_set.add(item)
              length_two = lenlen(a_set)
              if length_one == length_two:
                  dups.append(item)
          return dups
      a_list = ['Susan Adams', 'Jill Hampton', 'Kwame Goodall', 'Susan Adams']
      dups = return_duplicates(a_list)
      print(dups)  
      prints ['Susan Adams']
    • algorithm:
      1. create an empty set and a new list
      2. use a for-loop to step through each item in the list
      3. get length of set
      4. attempt to add an item from the list into the set
      5. get length of set
      6. check if the length has changes
      7. append the item to the new list

Copyright © 2022