Skip to content Skip to sidebar Skip to footer

Python Quicksort - List Comprehension Vs Recursion (partition Routine)

I watched the talk Three Beautiful Quicksorts and was messing around with quicksort. My implementation in python was very similar to c (select pivot, partition around it and recurs

Solution 1:

  1. Why is list comprehension so much faster?

    Because list comprehension implies C loop which is much faster than slow general way of using Python's for block.

  2. Some enlightenment on the limit on recursion in python. I first set it to 100000 in what cases should I be careful?

    In case you run out of memory.

  3. Trying to sort 1000000 elements hogged memory of my laptop (with the recursion method). What should I do if I want to sort so many elements? What kind of optimizations are possible?

    Python's recursion gives such an overhead because every function call allocates a lot of stack memory space on each call.

    In general, iteration is the answer (will give better performance in statistically 99% of use cases).

    Talking about memory structures, if you have simple data structures, like chars, integers, floats: use built-in array.array which is much more memory efficient than a list.


Solution 2:

Have you tried writing a non-recursive implementation of partition? I suspect that the performance difference is purely the partition implementation. You are recursing for each element in your implementation.

Update

Here's a quick implementation. It is still not super fast or even efficient, but it is much better than your original recursive one.

>>> def partition(data):
...  pivot = data[0]
...  less, equal, greater = [], [], []
...  for elm in data:
...   if elm < pivot:
...    less.append(elm)
...   elif elm > pivot:
...    greater.append(elm)
...   else:
...    equal.append(elm)
...  return less, equal, greater
...
>>> def qsort2(data):
...  if data:
...   less, equal, greater = partition(data)
...   return qsort2(less) + equal + qsort2(greater)
...  return data
... 

I also think that there are a larger number of temporary lists generated in the "traditional" version.


Solution 3:

Try to compare the list comprehension to an in-place algorithm when the memory goes really big. The code below get a near execution time when sorting 100K integers numbers, but you will probably get stucked in the list comprehension solution when sorting 1M integers. I've made the tests using a 4Gb machine. The full code: http://snipt.org/Aaaje2

class QSort:
def __init__(self, lst):
    self.lst = lst

def sorted(self):
    self.qsort_swap(0, len(self.lst))
    return self.lst

def qsort_swap(self, begin, end):
    if (end - begin) > 1:
       pivot = self.lst[begin]
       l = begin + 1
       r = end
       while l < r:
           if self.lst[l] <= pivot:
               l += 1
           else:
               r -= 1
               self.lst[l], self.lst[r] = self.lst[r], self.lst[l]

       l -= 1
       self.lst[begin], self.lst[l] = self.lst[l], self.lst[begin]    
       # print begin, end, self.lst
       self.qsort_swap(begin, l)
       self.qsort_swap(r, end)     

Post a Comment for "Python Quicksort - List Comprehension Vs Recursion (partition Routine)"