Interactive demos require Javascript. Anything else works.

Sorting STL Containers

posted by Stephan Brumme, updated

std::sort

Yes, there is std::sort. Yes, it's pretty much the most efficient way to sort generic STL containers. No, it's not perfect.

Improving std::sort

What if the data is already almost sorted ? std::sort doesn't benefit a lot from processing already sorted data. However, it's quite common to have (partially) sorted data.

std::sort requires RandomAccessIterator which are sometimes not available for custom containers.

What about std::list ? The syntax is completely different:
hide std::list<int> mylist; // ... fill mylist with data ... // compiler error: std::sort(mylist.begin(), mylist.end()); // must use this instead: mylist.sort();
That's why I wrote some sorting routines on my own to overcome all those limitations: The sorting algorithms are ordered more or less by overall performance (slowest first). As you can see faster algorithms have stricter requirements:
Algorithm Stable ? Iterator Notes
Bubble Sort yes forward iterator code shown here requires bidirectional iterator, just remove both instances of --last; (will be slower then)
Selection Sort no forward iterator
Insertion Sort yes bidirectional iterator
Shell Sort no bidirectional iterator std::distance and std::advance benefit heavily from random access iterators
Heap Sort no random-access iterator my n-ary Heap Sort implementation works with bidirectional iterators, too
Merge Sort yes random-access iterator requires additional memory
Merge Sort (in-place) yes forward iterator no additional memory but slower than normal merge sort
Quick Sort no bidirectional iterator std::distance and std::advance benefit heavily from random access iterators
Intro Sort no bidirectional iterator std::distance and std::advance benefit heavily from random access iterators
std::sort no random-access iterator
std::stable_sort yes random-access iterator
The algorithms are nothing new, they can be found in every decent book on algorithms and, of course, online (e.g. Wikipedia).

The live test shows that under certain circumstances even a simple bubble-sort can outperform std::sort.

Live Tests

My server (2.5GHz Quad-Xeon, GCC -O3) sorts several typical containers instantly while you type in a container size.
Javascript/Ajax needs to be enabled. The containers are identical for each sorting algorithm, even the Random Data containers.
Repeated sorts of the same container size may produce slightly different timings depending on the overall server load.

Container Size: (max. 1000000, some restrictions above 30000)

For complex data types the invokation of operator= and operator< might be expensive.
For example, copying a std::string often includes a slow memory allocation.

This little interactive test let's you play around with the various algorithms.

Container Size: (max. 10000)

comp. stands for comparisons:

Update October 13, 2013: You can supply your own comparison functors to each sorting routine (default is std::less of course). If your compiler supports C++11, std::move will be used where appropriate (scroll down to the last section for details).

Update February 24, 2014: I added Merge Sort and Heap Sort (incl. n-ary Heap Sort).

Update November 20, 2020: I added Intro Sort and switched completely to C++11.

Git users: scroll down to the repository link
Download  sort.h
Latest release: November 20, 2020, size: 16.1 kBytes, 607 lines

CRC32: 3b538aa5
MD5: 34a22fa244b88f52e63109e427308589
SHA1: d210d460f125ff1505d3e43ead9bce491ff315f4
SHA256:b81426d097cb20c73b4f2d5be5ff2dd1eea8dce76c500afa0ff3b95aec4734c7

Stay up-to-date:git clone https://create.stephan-brumme.com/stl-sort/git

GitHub mirror:https://github.com/stbrumme/stl-sort

If you encounter any bugs/problems or have ideas for improving future versions, please write me an email: create@stephan-brumme.com

License

This code is licensed under the zlib License:
This software is provided 'as-is', without any express or implied warranty. In no event will the authors be held liable for any damages arising from the use of this software. Permission is granted to anyone to use this software for any purpose, including commercial applications, and to alter it and redistribute it freely, subject to the following restrictions: 1. The origin of this software must not be misrepresented; you must not claim that you wrote the original software. If you use this software in a product, an acknowledgment in the product documentation would be appreciated but is not required. 2. Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software. 3. This notice may not be removed or altered from any source distribution.zlib License

Changelog

Bubble Sort

Bubble sort is the most simple sorting algorithm.
It's actually not that bad if you sort only a few elements or if they are almost sorted.

To boost bubble sort's performance it's essential to include two optimizations:
  1. After each pass, the last element is in its final position. Therefore we can decrement the end iterator after each pass.
  2. After each pass, we check whether we swapped any elements. If not, the container is sorted.
Especially the second optimization gives a huge speed-up on almost sorted containers.
hide bubbleSort /// Bubble Sort, allow user-defined less-than operator template <typename iterator, typename LessThan> void bubbleSort(iterator first, iterator last, LessThan lessThan) { if (first == last) return; // usually "last" points beyond the last element // now it points directly to that last element --last; // only one element => it's sorted if (first == last) return; bool swapped; do { // reset swapped flag swapped = false; auto current = first; while (current != last) { // bubble up auto next = current; ++next; // two neighbors in wrong order ? swap them ! if (lessThan(*next, *current)) { std::iter_swap(current, next); swapped = true; } ++current; } // last element is already sorted now --last; // remove this line if you only have a forward iterator // last will move closer towards first } while (swapped && first != last); } /// Bubble Sort with default less-than operator template <typename iterator> void bubbleSort(iterator first, iterator last) { bubbleSort(first, last, std::less<typename std::iterator_traits<iterator>::value_type>()); }

Selection Sort

Selection sort does have a constant runtime, independent on the previous sorting order of the container. It has the lowest number of writes among the algorithms shown here (but is not optimal while Cycle sort is). It's a good compromise for sorting on a Flash/EEPROM device.
hide selectionSort /// Selection Sort, allow user-defined less-than operator template <typename iterator, typename LessThan> void selectionSort(iterator first, iterator last, LessThan lessThan) { for (iterator current = first; current != last; ++current) { // find smallest element in the unsorted part and remember its iterator in "minimum" auto minimum = current; auto compare = current; ++compare; // walk through all still unsorted elements while (compare != last) { // new minimum found ? save its iterator if (lessThan(*compare, *minimum)) minimum = compare; // next element ++compare; } // add minimum to the end of the already sorted part if (current != minimum) std::iter_swap(current, minimum); } } /// Selection Sort with default less-than operator template <typename iterator> void selectionSort(iterator first, iterator last) { selectionSort(first, last, std::less<typename std::iterator_traits<iterator>::value_type>()); }

Insertion Sort

Insertion sort beats bubble sort in pretty much every aspect. The only reason to prefer bubble sort over insertion sort is that I can write a bug-free bubble sort even after three beers.
Somehow I fail to do the same with insertion sort even though it's not that complex, too.
hide insertSort /// Insertion Sort, allow user-defined less-than operator template <typename iterator, typename LessThan> void insertionSort(iterator first, iterator last, LessThan lessThan) { if (first == last) return; // skip first element, consider it as sorted auto current = first; ++current; // insert all remaining unsorted elements into the sorted elements while (current != last) { // insert "compare" into the already sorted elements auto compare = std::move(*current); // find location inside sorted range, beginning from the right end auto pos = current; while (pos != first) { // stop if left neighbor is not smaller auto left = pos; --left; if (!lessThan(compare, *left)) break; // shift that left neighbor one position to the right *pos = std::move(*left); pos = std::move( left); // same as --pos } // found final position if (pos != current) *pos = std::move(compare); // sort next element ++current; } } /// Insertion Sort with default less-than operator template <typename iterator> void insertionSort(iterator first, iterator last) { insertionSort(first, last, std::less<typename std::iterator_traits<iterator>::value_type>()); }

Shell Sort

Shell sort is based on insertion sort. It's faster when working with large containers but slower when the data is pre-sorted.
hide shellSort /// Shell Sort, allow user-defined less-than operator template <typename iterator, typename LessThan> void shellSort(iterator first, iterator last, LessThan lessThan) { auto numElements = std::distance(first, last); if (numElements <= 1) return; // sequence taken from Wikipedia (Marcin Ciura) static const int OptimalIncrements[] = { 68491, 27396, 10958, 4383, 1750, 701, 301, 132, 57, 23, 10, 4, 1, 0 }; size_t increment = OptimalIncrements[0]; size_t incrementIndex = 0; // increment must not be bigger than the number of elements to be sorted while (increment >= numElements) increment = OptimalIncrements[++incrementIndex]; // stumble through all increments in descending order while (increment > 0) { auto stripe = first; auto offset = increment; std::advance(stripe, offset); while (stripe != last) { // these iterators are always "increment" apart auto right = stripe; auto left = stripe; std::advance(left, -int(increment)); // value to be sorted auto compare = *right; // note: stripe is simply the same as first + offset // but operator+() is expensive for non-random access iterators auto posRight = offset; // only look at values between "first" and "last" while (true) { // found right spot ? if (!lessThan(compare, *left)) break; // no, still not correct order: shift bigger element to the right *right = std::move(*left); // go one step to the left right = left; posRight -= increment; if (posRight < increment) break; std::advance(left, -int(increment)); } // found sorted position if (posRight != offset) *right = std::move(compare); // next stripe ++stripe; ++offset; } // smaller increment increment = OptimalIncrements[incrementIndex++]; } } /// Shell Sort with default less-than operator template <typename iterator> void shellSort(iterator first, iterator last) { shellSort(first, last, std::less<typename std::iterator_traits<iterator>::value_type>()); }

Heap Sort

A short version of Heap sort can be written in just two lines of C++ code because of the power of the STL:
hide heapSort /// Heap Sort, allow user-defined less-than operator template <typename iterator, typename LessThan> void heapSort(iterator first, iterator last, LessThan lessThan) { // just use STL code std::make_heap(first, last, lessThan); std::sort_heap(first, last, lessThan); } /// Heap Sort with default less-than operator template <typename iterator> void heapSort(iterator first, iterator last) { // just use STL code std::make_heap(first, last); std::sort_heap(first, last); }

n-ary Heap Sort

STL needs a random-access iterator. It's heap is a 2-ary heap, that means each node has at most two children. I wanted to explore the performance behavior of n-ary heap sort where each node may have more than two children. An 8-ary heap seems to be the sweet spot for integers. Larger values need an excessive amount of comparisons during traversal while smaller don't fully exploit the performance gain of data locality.

In contrast to the STL based heap sort, a simpler bidirectional iterator is sufficient for my naryHeapSort code:
hide naryHeapSort /// n-ary Heap Sort, allow user-defined less-than operator template <size_t Width, typename iterator, typename LessThan> void naryHeapSort(iterator first, iterator last, LessThan lessThan) { // width must be at least two if (Width < 2) { naryHeapSort<2>(first, last, lessThan); return; } auto numElements = std::distance(first, last); if (numElements < 2) return; // based on n-ary heap sort pseudo code from http://de.wikipedia.org/wiki/Heapsort struct SiftDown { void operator() (iterator first, size_t pos, size_t stop, LessThan lessThan) { std::advance(first, pos); auto parent = first; auto child = first; auto value = std::move(*parent); while (pos * Width + 1 < stop) { // locate children auto increment = pos * (Width - 1) + 1; pos += increment; std::advance(child, increment); // figure out how many children we have to check auto numChildren = Width; if (numChildren + pos > stop) numChildren = stop - pos; // find the biggest of them if (numChildren > 1) { iterator scan = child; ++scan; size_t maxPos = 0; for (size_t i = 1; i < numChildren; i++, scan++) // element in "scan" bigger than current best ? if (lessThan(*child, *scan)) { maxPos = i; child = scan; } pos += maxPos; } // is no child bigger than the parent ? => done if (!lessThan(value, *child)) { *parent = std::move(value); return; } // move biggest child one level up, parent one level down and continue *parent = std::move(*child); parent = child; } *child = std::move(value); } } heapify; // build heap where the biggest elements are placed in front size_t firstLeaf = (numElements + Width - 2) / Width; for (size_t i = firstLeaf; i > 0; i--) heapify(first, i - 1, numElements, lessThan); // take heap's largest element and move it to the end // => build sorted sequence beginning with last (= largest) element for (auto i = numElements - 1; i > 0; i--) { --last; std::iter_swap(first, last); // re-adjust shrinked heap heapify(first, 0, i, lessThan); } } /// n-ary Heap Sort with default less-than operator template <size_t Width, typename iterator> void naryHeapSort(iterator first, iterator last) { naryHeapSort<Width, iterator>(first, last, std::less<typename std::iterator_traits<iterator>::value_type>()); }
The unusual local struct is pretty much the same as a C++11 lambda. I could have written an external function, too, but I wanted to keep the namespace clean.

Merge Sort

Unlike all other sorting algorithm shown here, Merge sort requires significant additional memory.
hide mergeSort /// Merge Sort, allow user-defined less-than operator template <typename iterator, typename LessThan> void mergeSort(iterator first, iterator last, LessThan lessThan, size_t size = 0) { // determine size if not known yet if (size == 0 && first != last) size = std::distance(first, last); // by the way, the size parameter can be omitted but // then we are required to compute it each time which can be expensive // for non-random access iterators // one element is always sorted if (size <= 1) return; // divide into two partitions auto firstHalf = size / 2; auto secondHalf = size - firstHalf; auto mid = first; std::advance(mid, firstHalf); // recursively sort them mergeSort(first, mid, lessThan, firstHalf); mergeSort(mid, last, lessThan, secondHalf); // merge sorted partitions std::inplace_merge(first, mid, last, lessThan); } /// Merge Sort with default less-than operator template <typename iterator> void mergeSort(iterator first, iterator last) { mergeSort(first, last, std::less<typename std::iterator_traits<iterator>::value_type>()); }
To overcome Merge Sort's limitations, I wrote an in-place version - that means, aside from a few variables no additional memory need.
It only differs in the final merge step which needs only a forward iterator.
The in-place Merge Sort outperforms the original version when the container is already sorted but falls behind in pretty much every other situation.
Nevertheless, it's still faster than any other algorithm for forward iterators.
hide mergeSortInPlace /// in-place Merge Sort, allow user-defined less-than operator template <typename iterator, typename LessThan> void mergeSortInPlace(iterator first, iterator last, LessThan lessThan, size_t size = 0) { // determine size if not known yet if (size == 0 && first != last) size = std::distance(first, last); // by the way, the size parameter can be omitted but // then we are required to compute it each time which can be expensive // for non-random access iterators // one element is always sorted if (size <= 1) return; // divide into two partitions auto firstHalf = size / 2; auto secondHalf = size - firstHalf; auto mid = first; std::advance(mid, firstHalf); // recursively sort them mergeSortInPlace(first, mid, lessThan, firstHalf); mergeSortInPlace(mid, last, lessThan, secondHalf); // merge partitions (left starts at "first", right starts and "mid") // move iterators towards the end until they meet auto right = mid; while (first != mid) { // next value of both partitions in wrong order (smaller one belongs to the left) if (lessThan(*right, *first)) { // this value must be moved to the right partition auto misplaced = std::move(*first); // this value belongs to the left partition *first = std::move(*right); // misplaced value must be inserted at correct position in the right partition auto scan = right; auto next = scan; ++next; // move smaller one position to the left while (next != last && lessThan(*next, misplaced)) *scan++ = std::move(*next++); // found the spot ! *scan = std::move(misplaced); } ++first; } } /// in-place Merge Sort with default less-than operator template <typename iterator> void mergeSortInPlace(iterator first, iterator last) { mergeSortInPlace(first, last, std::less<typename std::iterator_traits<iterator>::value_type>()); }

Quick Sort

Quick sort is often a major part of stl::sort.
hide quickSort /// Quick Sort, allow user-defined less-than operator template <typename iterator, typename LessThan> void quickSort(iterator first, iterator last, LessThan lessThan) { auto numElements = std::distance(first, last); // already sorted ? if (numElements <= 1) return; auto pivot = last; --pivot; // choose middle element as pivot (good choice for partially sorted data) if (numElements > 2) { auto middle = first; std::advance(middle, numElements/2); std::iter_swap(middle, pivot); } // scan beginning from left and right end and swap misplaced elements auto left = first; auto right = pivot; while (left != right) { // look for mismatches while (!lessThan(*pivot, *left) && left != right) ++left; while (!lessThan(*right, *pivot) && left != right) --right; // swap two values which are both on the wrong side of the pivot element if (left != right) std::iter_swap(left, right); } // move pivot to its final position if (pivot != left && lessThan(*pivot, *left)) std::iter_swap(pivot, left); // subdivide quickSort(first, left, lessThan); quickSort(++left, last, lessThan); // *left itself is already sorted } /// Quick Sort with default less-than operator template <typename iterator> void quickSort(iterator first, iterator last) { quickSort(first, last, std::less<typename std::iterator_traits<iterator>::value_type>()); }

Intro Sort

Intro sort is a modified QuickSort which switches to Insertion Sort of the number of elements is small (e.g. 16).
hide quickSort /// Intro Sort, allow user-defined less-than operator template <typename iterator, typename LessThan> void introSort(iterator first, iterator last, LessThan lessThan) { // switch to Insertion Sort if the (sub)array is small auto numElements = std::distance(first, last); if (numElements <= 16) { // already sorted ? if (numElements <= 1) return; // micro-optimization for exactly 2 elements if (numElements == 2) { if (lessThan(*(first + 1), *first)) std::iter_swap(first + 1, first); return; } // between 3 and 16 elements insertionSort(first, last, lessThan); return; } auto pivot = last; --pivot; // choose middle element as pivot (good choice for partially sorted data) auto middle = first; std::advance(middle, numElements/2); std::iter_swap(middle, pivot); // scan beginning from left and right end and swap misplaced elements auto left = first; auto right = pivot; while (left != right) { // look for mismatches while (!lessThan(*pivot, *left) && left != right) ++left; while (!lessThan(*right, *pivot) && left != right) --right; // swap two values which are both on the wrong side of the pivot element if (left != right) std::iter_swap(left, right); } // move pivot to its final position if (pivot != left && lessThan(*pivot, *left)) std::iter_swap(pivot, left); // subdivide introSort(first, left, lessThan); introSort(++left, last, lessThan); // *left itself is already sorted } /// Intro Sort with default less-than operator template <typename iterator> void introSort(iterator first, iterator last) { introSort(first, last, std::less<typename std::iterator_traits<iterator>::value_type>()); }

Additional Test Code

I wanted to restrict my code to use only what is really needed. Therefore I wrote a short helper class that implements only the less-than operator. All sort algorithms shown on this page successfully operate on such a trimmed down data type, too:
hide Simple data type that only comes with less-than /// provide only operator< class Number { public: /// set value Number(int x = 0) : value(x) { } /// needed for sorting algorithms bool operator<(const Number& other) const { return value < other.value; } private: /// actually just a simple integer int value; };
Similar code was written to ensure that my routines accept forward and/or bidirectional iterators - unlike std::sort which requires random-access iterators (click to open).
show std::vector based forward/bidirectional/random iterator // ////////////////////////////////////////////////////////// // container.h // Copyright (c) 2014,2020 Stephan Brumme. All rights reserved. // see http://create.stephan-brumme.com/disclaimer.html // #pragma once #include <iterator> #include <vector> // stored data type typedef int Value; // you can choose between // - forward iterator, // - bidirectional iterator and // - random access iterator // uncomment at most one line to switch between those three iterators //#define BIDIRECTIONALITERATOR #define RANDOMACCESSITERATOR // fallback if neither BIDIRECTIONALITERATOR nor RANDOMACCESSITERATOR is defined #if !defined(BIDIRECTIONALITERATOR) && !defined(RANDOMACCESSITERATOR) #define FORWARDITERATOR #endif // forward declaration class Container; /// can be forward iterator, bidirectional or random access (see #define above) class Iterator : #if !defined(BIDIRECTIONALITERATOR) && !defined(RANDOMACCESSITERATOR) public std::iterator<std::forward_iterator_tag, int> #elseif defined(BIDIRECTIONALITERATOR) && !defined(RANDOMACCESSITERATOR) public std::iterator<std::bidirectional_iterator_tag, int> #else public std::iterator<std::random_access_iterator_tag, int> #endif { public: typedef Value value_type; /// construct empty iterator Iterator() : iterator() {} /// construct new iterator Iterator(const std::vector<Value>::iterator& ite) : iterator(ite) {} /// only copy position, skip container void operator=(const Iterator& other) { iterator = other.iterator; } /// step forward (prefix) Iterator& operator++() { ++iterator; return *this; } /// step forward (postfix) Iterator operator++(int) { Iterator result(*this); ++iterator; return result; } #if defined(BIDIRECTIONALITERATOR) || defined(RANDOMACCESSITERATOR) /// step backward (prefix) Iterator& operator--() { --iterator; return *this; } /// step backward (postfix) Iterator operator--(int) { Iterator result(*this); ++iterator; return result; } #endif #if defined(RANDOMACCESSITERATOR) /// leap forward Iterator operator+ (std::iterator_traits<Iterator>::difference_type howFar) const { return iterator + howFar; } /// leap backward Iterator operator- (std::iterator_traits<Iterator>::difference_type howFar) const { return iterator - howFar; } /// leap forward Iterator operator+=(std::iterator_traits<Iterator>::difference_type howFar) { iterator += howFar; return *this; } /// leap backward Iterator operator-=(std::iterator_traits<Iterator>::difference_type howFar) { iterator -= howFar; return *this; } /// ordered iterators bool operator<(const Iterator& other) const { return iterator < other.iterator; } /// ordered iterators bool operator>(const Iterator& other) const { return iterator > other.iterator; } /// measure distance std::iterator_traits<Iterator>::difference_type operator-(const Iterator& other) const { return iterator - other.iterator; } #endif /// pointing to same element ? bool operator==(const Iterator& other) const { return iterator == other.iterator; } /// pointing to different element ? bool operator!=(const Iterator& other) const { return iterator != other.iterator; } /// dereference int& operator*() { return *iterator; } /// dereference int operator*() const { return *iterator; } private: /// has random-access iterator std::vector<Value>::iterator iterator; }; /// only implements iterators and element access class Container { public: /// construct vector explicit Container(size_t initialSize = 0) : container(initialSize) {} /// return iterator pointing to first element Iterator begin() { return container.begin(); } /// return iterator pointing beyond last element Iterator end() { return container.end(); } /// read/write access to an element int& operator[](size_t pos) { return container[pos]; } /// read/write access to an element bool operator!=(const Container& other) const { return container != other.container; } private: /// actually just a simple vector std::vector<Value> container; };
The second live test (the one that counts comparisons and assignments) operates on a custom data type (again, click to open):
show count comparisons and assignments /// count assignments and comparisions class Number { public: /// set value Number(int x = 0) : value(x) { } /// copy constructor is counted as an assignment Number(const Number& other) : value(other.value) { ++numAssignments; } static int numLessThan; /// count comparisons bool operator<(const Number& other) const { ++numLessThan; return value < other.value; } static int numAssignments; /// count assignments void operator=(const Number& other) { ++numAssignments; value = other.value; } bool operator==(const Number& other) const { return value == other.value; } /// reset counters static void reset() { numLessThan = numAssignments = 0; } private: /// actually just a simple integer int value; }; int Number::numLessThan = 0; int Number::numAssignments = 0;
homepage