Sorting STL Containers

posted by Stephan Brumme, updated

std::sort

Yes, there is std::sort. Yes, it's pretty much the most efficient way to sort generic STL containers. No, it's not perfect.

Improving std::sort

What if the data is already almost sorted ? std::sort doesn't benefit a lot from processing already sorted data. However, it's quite common to have (partially) sorted data.

std::sort requires RandomAccessIterator which are sometimes not available for custom containers.

What about std::list ? The syntax is completely different:
hide std::list<int> mylist; // ... fill mylist with data ... // compiler error: std::sort(mylist.begin(), mylist.end()); // must use this instead: mylist.sort();
That's why I wrote some sorting routines on my own to overcome all those limitations: The sorting algorithms are ordered more or less by overall performance (slowest first). As you can see faster algorithms have stricter requirements:
Algorithm Stable ? Iterator Notes
Bubble Sort yes forward iterator code shown here requires bidirectional iterator, just remove both instances of --last; (will be slower then)
Selection Sort no forward iterator
Insertion Sort yes bidirectional iterator
Shell Sort no bidirectional iterator std::distance and std::advance benefit heavily from random access iterators
Heap Sort no random-access iterator my n-ary Heap Sort implementation works with bidirectional iterators, too
Merge Sort yes random-access iterator requires additional memory
Merge Sort (in-place) yes forward iterator no additional memory but slower than normal merge sort
Quick Sort no bidirectional iterator std::distance and std::advance benefit heavily from random access iterators
std::sort no random-access iterator
std::stable_sort yes random-access iterator
The algorithms are nothing new, they can be found in every decent book on algorithms and, of course, online (e.g. Wikipedia).

The live test shows that under certain circumstances even a simple bubble-sort can outperform std::sort.

Live Tests

My server (2.5GHz Quad-Xeon, GCC -O3) sorts several typical containers instantly while you type in a container size.
Javascript/Ajax needs to be enabled. The containers are identical for each sorting algorithm, even the Random Data containers.
Repeated sorts of the same container size may produce slightly different timings depending on the overall server load.

Container Size: (max. 1000000, some restrictions above 30000)

For complex data types the invokation of operator= and operator< might be expensive.
For example, copying a std::string often includes a slow memory allocation.

This little interactive test let's you play around with the various algorithms.

Container Size: (max. 10000)

comp. stands for comparisons:

Update October 13, 2013: You can supply your own comparison functors to each sorting routine (default is std::less of course). If your compiler supports C++11, std::move will be used where appropriate (scroll down to the last section for details).

Update February 24, 2014: I added Merge Sort and Heap Sort (incl. n-ary Heap Sort).

Git users: scroll down to the repository link
Download  sort.h
Latest release: February 27, 2014, size: 14.9 kBytes, 550 lines

CRC32: c0098aac
MD5: f56f1d1ac3c1659ee387961bdc6bfade
SHA1: ac47d0a06a31e39edcd1fef817ac0d882728ad58
SHA256:c27487c2f9d08bd33be9bf9770f2069845784c31c570146f985359d6d026b9ac

Stay up-to-date:git clone http://create.stephan-brumme.com/stl-sort/.git

If you encounter any bugs/problems or have ideas for improving future versions, please write me an email: create@stephan-brumme.com

Changelog

Bubble Sort

Bubble sort is the most simple sorting algorithm.
It's actually not that bad if you sort only a few elements or if they are almost sorted.

To boost bubble sort's performance it's essential to include two optimizations:
  1. After each pass, the last element is in its final position. Therefore we can decrement the end iterator after each pass.
  2. After each pass, we check whether we swapped any elements. If not, the container is sorted.
Especially the second optimization gives a huge speed-up on almost sorted containers.
hide bubbleSort /// Bubble Sort, allow user-defined less-than operator template <typename iterator, typename LessThan> void bubbleSort(iterator first, iterator last, LessThan lessThan) { if (first == last) return; // usually "last" points beyond the last element // now it points directly to that last element --last; // only one element => it's sorted if (first == last) return; bool swapped; do { // reset swapped flag swapped = false; iterator current = first; while (current != last) { // bubble up iterator next = current; ++next; // two neighbors in wrong order ? swap them ! if (lessThan(*next, *current)) { std::iter_swap(current, next); swapped = true; } ++current; } // last element is already sorted now --last; // remove this line if you only have a forward iterator // last will move closer towards first } while (swapped && first != last); } /// Bubble Sort with default less-than operator template <typename iterator> void bubbleSort(iterator first, iterator last) { bubbleSort(first, last, std::less<typename std::iterator_traits<iterator>::value_type>()); }

Selection Sort

Selection sort does have a constant runtime, independent on the previous sorting order of the container. It has the lowest number of writes among the algorithms shown here (but is not optimal while Cycle sort is). It's a good compromise for sorting on a Flash/EEPROM device.
hide selectionSort /// Selection Sort, allow user-defined less-than operator template <typename iterator, typename LessThan> void selectionSort(iterator first, iterator last, LessThan lessThan) { for (iterator current = first; current != last; ++current) { // find smallest element in the unsorted part and remember its iterator in "minimum" iterator minimum = current; iterator compare = current; ++compare; // walk through all still unsorted elements while (compare != last) { // new minimum found ? save its iterator if (lessThan(*compare, *minimum)) minimum = compare; // next element ++compare; } // add minimum to the end of the already sorted part if (current != minimum) std::iter_swap(current, minimum); } } /// Selection Sort with default less-than operator template <typename iterator> void selectionSort(iterator first, iterator last) { selectionSort(first, last, std::less<typename std::iterator_traits<iterator>::value_type>()); }

Insertion Sort

Insertion sort beats bubble sort in pretty much every aspect. The only reason to prefer bubble sort over insertion sort is that I can write a bug-free bubble sort even after three beers.
Somehow I fail to do the same with insertion sort even though it's not that complex, too.
hide insertSort /// Insertion Sort, allow user-defined less-than operator template <typename iterator, typename LessThan> void insertionSort(iterator first, iterator last, LessThan lessThan) { if (first == last) return; // skip first element, consider it as sorted iterator current = first; ++current; // insert all remaining unsorted elements into the sorted elements while (current != last) { // insert "compare" into the already sorted elements typename std::iterator_traits<iterator>::value_type compare = __std_move__(*current); // find location inside sorted range, beginning from the right end iterator pos = current; while (pos != first) { // stop if left neighbor is not smaller iterator left = pos; --left; if (!lessThan(compare, *left)) break; // shift that left neighbor one position to the right *pos = __std_move__(*left); pos = __std_move__( left); // same as --pos } // found final position if (pos != compare) *pos = __std_move__(compare); // sort next element ++current; } } /// Insertion Sort with default less-than operator template <typename iterator> void insertionSort(iterator first, iterator last) { insertionSort(first, last, std::less<typename std::iterator_traits<iterator>::value_type>()); }

Shell Sort

Shell sort is based on insertion sort. It's faster when working with large containers but slower when the data is pre-sorted.
hide shellSort /// Shell Sort, allow user-defined less-than operator template <typename iterator, typename LessThan> void shellSort(iterator first, iterator last, LessThan lessThan) { const size_t numElements = std::distance(first, last); if (numElements <= 1) return; // sequence taken from Wikipedia (Marcin Ciura) static const size_t OptimalIncrements[] = { 68491, 27396, 10958, 4383, 1750, 701, 301, 132, 57, 23, 10, 4, 1, 0 }; size_t increment = OptimalIncrements[0]; size_t incrementIndex = 0; // increment must not be bigger than the number of elements to be sorted while (increment >= numElements) increment = OptimalIncrements[++incrementIndex]; // stumble through all increments in descending order while (increment > 0) { iterator stripe = first; size_t offset = increment; std::advance(stripe, offset); while (stripe != last) { // these iterators are always "increment" apart iterator right = stripe; iterator left = stripe; std::advance(left, -int(increment)); // value to be sorted typename std::iterator_traits<iterator>::value_type compare = *right; // note: stripe is simply the same as first + offset // but operator+() is expensive for non-random access iterators size_t posRight = offset; // only look at values between "first" and "last" while (true) { // found right spot ? if (!lessThan(compare, *left)) break; // no, still not correct order: shift bigger element to the right *right = __std_move__(*left); // go one step to the left right = left; posRight -= increment; if (posRight < increment) break; std::advance(left, -int(increment)); } // found sorted position if (posRight != offset) *right = __std_move__(compare); // next stripe ++stripe; ++offset; } // smaller increment increment = OptimalIncrements[incrementIndex++]; } } /// Shell Sort with default less-than operator template <typename iterator> void shellSort(iterator first, iterator last) { shellSort(first, last, std::less<typename std::iterator_traits<iterator>::value_type>()); }

Heap Sort

A short version of Heap sort can be written in just two lines of C++ code because of the power of the STL:
hide heapSort /// Heap Sort, allow user-defined less-than operator template <typename iterator, typename LessThan> void heapSort(iterator first, iterator last, LessThan lessThan) { // just use STL code std::make_heap(first, last, lessThan); std::sort_heap(first, last, lessThan); } /// Heap Sort with default less-than operator template <typename iterator> void heapSort(iterator first, iterator last) { // just use STL code std::make_heap(first, last); std::sort_heap(first, last); }

n-ary Heap Sort

STL needs a random-access iterator. It's heap is a 2-ary heap, that means each node has at most two children. I wanted to explore the performance behavior of n-ary heap sort where each node may have more than two children. An 8-ary heap seems to be the sweet spot for integers. Larger values need an excessive amount of comparisons during traversal while smaller don't fully exploit the performance gain of data locality.

In contrast to the STL based heap sort, a simpler bidirectional iterator is sufficient for my naryHeapSort code:
hide naryHeapSort template <unsigned int Width, typename iterator, typename LessThan> void naryHeapSort(iterator first, iterator last, LessThan lessThan) { // width must be at least two if (Width < 2) { naryHeapSort<2>(first, last, lessThan); return; } size_t numElements = std::distance(first, last); if (numElements < 2) return; // based on n-ary heap sort pseudo code from http://de.wikipedia.org/wiki/Heapsort struct SiftDown { void operator() (iterator first, size_t pos, size_t stop, LessThan lessThan) { std::advance(first, pos); iterator parent = first; iterator child = first; typename std::iterator_traits<iterator>::value_type value = __std_move__(*parent); while (pos * Width + 1 < stop) { // locate children size_t increment = pos * (Width - 1) + 1; pos += increment; std::advance(child, increment); // figure out how many children we have to check size_t numChildren = Width; if (numChildren + pos > stop) numChildren = stop - pos; // find the biggest of them if (numChildren > 1) { iterator scan = child; ++scan; size_t maxPos = 0; for (size_t i = 1; i < numChildren; i++, scan++) // element in "scan" bigger than current best ? if (lessThan(*child, *scan)) { maxPos = i; child = scan; } pos += maxPos; } // is no child bigger than the parent ? => done if (!lessThan(value, *child)) { *parent = __std_move__(value); return; } // move biggest child one level up, parent one level down and continue *parent = __std_move__(*child); parent = child; } *child = __std_move__(value); } } heapify; // build heap where the biggest elements are placed in front size_t firstLeaf = (numElements + Width - 2) / Width; for (size_t i = firstLeaf; i > 0; i--) heapify(first, i - 1, numElements, lessThan); // take heap's largest element and move it to the end // => build sorted sequence beginning with last (= largest) element for (size_t i = numElements - 1; i > 0; i--) { --last; std::iter_swap(first, last); // re-adjust shrinked heap heapify(first, 0, i, lessThan); } } /// n-ary Heap Sort with default less-than operator template <unsigned int Width, typename iterator> void naryHeapSort(iterator first, iterator last) { naryHeapSort<Width, iterator>(first, last, std::less<typename std::iterator_traits<iterator>::value_type>()); }
The unusual local struct is pretty much the same as a C++11 lambda. I could have written an external function, too, but I wanted to keep the namespace clean.

Merge Sort

Unlike all other sorting algorithm shown here, Merge sort requires significant additional memory.
hide mergeSort /// Merge Sort, allow user-defined less-than operator template <typename iterator, typename LessThan> void mergeSort(iterator first, iterator last, LessThan lessThan, size_t size = 0) { // determine size if not known yet if (size == 0 && first != last) size = std::distance(first, last); // by the way, the size parameter can be omitted but // then we are required to compute it each time which can be expensive // for non-random access iterators // one element is always sorted if (size <= 1) return; // divide into two partitions size_t firstHalf = size / 2; size_t secondHalf = size - firstHalf; iterator mid = first; std::advance(mid, firstHalf); // recursively sort them mergeSort(first, mid, lessThan, firstHalf); mergeSort(mid, last, lessThan, secondHalf); // merge sorted partitions std::inplace_merge(first, mid, last, lessThan); } /// Merge Sort with default less-than operator template <typename iterator> void mergeSort(iterator first, iterator last) { mergeSort(first, last, std::less<typename std::iterator_traits<iterator>::value_type>()); }
To overcome Merge Sort's limitations, I wrote an in-place version - that means, aside from a few variables no additional memory need.
It only differs in the final merge step which needs only a forward iterator.
The in-place Merge Sort outperforms the original version when the container is already sorted but falls behind in pretty much every other situation.
Nevertheless, it's still faster than any other algorithm for forward iterators.
hide mergeSortInPlace /// in-place Merge Sort, allow user-defined less-than operator template <typename iterator, typename LessThan> void mergeSortInPlace(iterator first, iterator last, LessThan lessThan, size_t size = 0) { // determine size if not known yet if (size == 0 && first != last) size = std::distance(first, last); // by the way, the size parameter can be omitted but // then we are required to compute it each time which can be expensive // for non-random access iterators // one element is always sorted if (size <= 1) return; // divide into two partitions size_t firstHalf = size / 2; size_t secondHalf = size - firstHalf; iterator mid = first; std::advance(mid, firstHalf); // recursively sort them mergeSortInPlace(first, mid, lessThan, firstHalf); mergeSortInPlace(mid, last, lessThan, secondHalf); // merge partitions (left starts at "first", right starts and "mid") // move iterators towards the end until they meet iterator right = mid; while (first != mid) { // next value of both partitions in wrong order (smaller one belongs to the left) if (lessThan(*right, *first)) { // this value must be moved to the right partition typename std::iterator_traits<iterator>::value_type misplaced = __std_move__(*first); // this value belongs to the left partition *first = __std_move__(*right); // misplaced value must be inserted at correct position in the right partition iterator scan = right; iterator next = scan; ++next; // move smaller one position to the left while (next != last && lessThan(*next, misplaced)) *scan++ = __std_move__(*next++); // found the spot ! *scan = __std_move__(misplaced); } ++first; } } /// in-place Merge Sort with default less-than operator template <typename iterator> void mergeSortInPlace(iterator first, iterator last) { mergeSortInPlace(first, last, std::less<typename std::iterator_traits<iterator>::value_type>()); }

Quick Sort

Quick sort is often a major part of stl::sort.
hide quickSort /// Quick Sort, allow user-defined less-than operator template <typename iterator, typename LessThan> void quickSort(iterator first, iterator last, LessThan lessThan) { size_t numElements = std::distance(first, last); // already sorted ? if (numElements <= 1) return; iterator pivot = last; --pivot; // choose middle element as pivot (good choice for partially sorted data) if (numElements > 2) { iterator middle = first; std::advance(middle, numElements/2); std::iter_swap(middle, pivot); } // scan beginning from left and right end and swap misplaced elements iterator left = first; iterator right = pivot; while (left != right) { // look for mismatches while (!lessThan(*pivot, *left) && left != right) ++left; while (!lessThan(*right, *pivot) && left != right) --right; // swap two values which are both on the wrong side of the pivot element if (left != right) std::iter_swap(left, right); } // move pivot to its final position if (pivot != left && lessThan(*pivot, *left)) std::iter_swap(pivot, left); // subdivide quickSort(first, left, lessThan); quickSort(++left, last, lessThan); // *left itself is already sorted } /// Quick Sort with default less-than operator template <typename iterator> void quickSort(iterator first, iterator last) { quickSort(first, last, std::less<typename std::iterator_traits<iterator>::value_type>()); }

Additional Test Code

I wanted to restrict my code to use only what is really needed. Therefore I wrote a short helper class that implements only the less-than operator. All sort algorithms shown on this page successfully operate on such a trimmed down data type, too:
hide Simple data type that only comes with less-than /// provide only operator< class Number { public: /// set value Number(int x = 0) : value(x) { } /// needed for sorting algorithms bool operator<(const Number& other) const { return value < other.value; } private: /// actually just a simple integer int value; };
Similar code was written to ensure that my routines accept forward and/or bidirectional iterators - unlike std::sort which requires random-access iterators (click to open).
show std::vector based forward/bidirectional/random iterator // ////////////////////////////////////////////////////////// // container.h // Copyright (c) 2014 Stephan Brumme. All rights reserved. // see http://create.stephan-brumme.com/disclaimer.html // #pragma once #include <iterator> #include <vector> // forward declaration class Container; //#define BIDIRECTIONALITERATOR #define RANDOMACCESSITERATOR /// can be forward iterator, bidirectional or random access (see #define above) class Iterator : #if !defined(BIDIRECTIONALITERATOR) && !defined(RANDOMACCESSITERATOR) public std::iterator<std::forward_iterator_tag, int> #elseif defined(BIDIRECTIONALITERATOR) && !defined(RANDOMACCESSITERATOR) public std::iterator<std::bidirectional_iterator_tag, int> #else public std::iterator<std::random_access_iterator_tag, int> #endif { public: typedef int value_type; /// construct empty iterator Iterator() : iterator() {} /// construct new iterator Iterator(const std::vector<int>::iterator& ite) : iterator(ite) {} /// only copy position, skip container void operator=(const Iterator& other) { iterator = other.iterator; } /// step forward (prefix) Iterator& operator++() { ++iterator; return *this; } /// step forward (postfix) Iterator operator++(int) { Iterator result(*this); ++iterator; return result; } #if defined(BIDIRECTIONALITERATOR) || defined(RANDOMACCESSITERATOR) /// step backward (prefix) Iterator& operator--() { --iterator; return *this; } /// step backward (postfix) Iterator operator--(int) { Iterator result(*this); ++iterator; return result; } #endif #if defined(RANDOMACCESSITERATOR) /// leap forward Iterator operator+ (std::iterator_traits<Iterator>::difference_type howFar) const { return iterator + howFar; } /// leap backward Iterator operator- (std::iterator_traits<Iterator>::difference_type howFar) const { return iterator - howFar; } /// leap forward Iterator operator+=(std::iterator_traits<Iterator>::difference_type howFar) { iterator += howFar; return *this; } /// leap backward Iterator operator-=(std::iterator_traits<Iterator>::difference_type howFar) { iterator -= howFar; return *this; } /// ordered iterators bool operator<(const Iterator& other) const { return iterator < other.iterator; } /// ordered iterators bool operator>(const Iterator& other) const { return iterator > other.iterator; } /// measure distance std::iterator_traits<Iterator>::difference_type operator-(const Iterator& other) const { return iterator - other.iterator; } #endif /// pointing to same element ? bool operator==(const Iterator& other) const { return iterator == other.iterator; } /// pointing to different element ? bool operator!=(const Iterator& other) const { return iterator != other.iterator; } /// dereference int& operator*() { return *iterator; } /// dereference int operator*() const { return *iterator; } private: /// has random-access iterator std::vector<int>::iterator iterator; }; /// only implements iterators and element access class Container { public: /// construct vector Container(int initialSize = 0) : container(initialSize) {} /// return iterator pointing to first element Iterator begin() { return container.begin(); } /// return iterator pointing beyond last element Iterator end() { return container.end(); } /// read/write access to an element int& operator[](size_t pos) { return container[pos]; } /// read/write access to an element bool operator!=(const Container& other) const { return container != other.container; } private: /// actually just a simple vector std::vector<int> container; };
The second live test (the one that counts comparisons and assignments) operates on a custom data type (again, click to open):
show count comparisons and assignments /// count assignments and comparisions class Number { public: /// set value Number(int x = 0) : value(x) { } /// copy constructor is counted as an assignment Number(const Number& other) : value(other.value) { ++numAssignments; } static int numLessThan; /// count comparisons bool operator<(const Number& other) const { ++numLessThan; return value < other.value; } static int numAssignments; /// count assignments void operator=(const Number& other) { ++numAssignments; value = other.value; } bool operator==(const Number& other) const { return value == other.value; } /// reset counters static void reset() { numLessThan = numAssignments = 0; } private: /// actually just a simple integer int value; }; int Number::numLessThan = 0; int Number::numAssignments = 0;

Enable C++11's std::move if available

The code sample contained a user-defined macro __std_move__. It is automatically replaced by std::move if your C++ compiler supports C++11 or a simple copy/assignment else:
hide Enable C++11's std::move if available // use std::move if C++11 is enabled #if (__cplusplus__ >= 201103L) || __GXX_EXPERIMENTAL_CXX0X__ #undef __std_move__ #define __std_move__(x) std::move(x) #else #define __std_move__(x) (x) #endif
homepage