Multi-Core Sorting

posted by Stephan Brumme

The old-fashioned way

Most traditional sorting algorithms (I already blogged about their STL-compatible versions) work in a serial manner and are perfectly suited for single-core machines.
My current computer has a Intel Core i7 860 CPU which comes with 4 cores and hyperthreading, giving me a total of 8 logical cores. That means, CPU utilization with traditional sorting algorithms (i.e. only able to run on one 1 out of 8 cores) cannot exceed 12.5%. That sucks.


Luckily, the GCC team made parallelized versions of pretty all important STL algorithms available. They still call their implementation experimental but I haven't had any problems so far.
Everything is based on OpenMP which has been supported since GCC 4.2. Visual C++ and Intel C++ support OpenMP as well but I haven't tested GCC's library with these commercial compilers yet.

How to use it

There are two ways to switch to GCC's parallel mode:
  1. link against -D_GLIBCXX_PARALLEL or
  2. use the __gnu_parallel namespace
In both cases you have to enable compiler switch -fopenmp and include parallel/algorithm.
hide // g++ -O3 -fopenmp -march=native parallelSort.cpp -o parallelSort #include <algorithm> std::sort(data.begin(), data.end()); #include <parallel/algorithm> __gnu_parallel::sort(data.begin(), data.end());

(read more ...)


posted by Stephan Brumme

What is CLein ?

CLein is a tiny, portable freeware program which displays all relevant information about the OpenCL drivers installed on your system.
It contains a small benchmark to compare the computing power of your host CPU and your OpenCL device(s), too.

If you have no idea what OpenCL is then read this article on Wikipedia.

(read more ...)

browse through all postings:
newer stuff:

2016/Aug/17   smalLZ4 - optimal LZ4 compression
2016/Aug/04   xxHash - a hash algorithm as fast as memcpy
2015/Aug/11   Tiny Regular Expression Matcher
2015/Feb/04   Fast CRC32
2014/Aug/25   Practical String Searching
2014/Aug/19   The Mersenne Twister Pseudo Random Number Generator
2014/Jul/11   Parallel Sieve of Eratosthenes
2014/Jun/23   Windows and Linux Code Timing
2014/Jun/14   Portable C++ Hashing Library
2014/Feb/27   Sorting STL Containers
2013/Sep/18   Portable Memory Mapping C++ Class
2013/May/03   Testing Endian Aware Code
2013/Apr/24   Software License
2013/Feb/15   Read-Only Git Access Available
2011/Dec/05   Fowler-Noll-Vo Hash (FNV1a)
2011/Dec/01   Drawing Antialiased Circles and Ellipses
2011/Nov/27   Uncompressing DEFLATE Data
2011/Nov/23   Streaming MPEG-4 Movies From Your Own Server
2011/Nov/08   Binary Search Revisited
2011/Oct/20   Reverse MD5 Hash
2011/Oct/17   Windows 7 Screenshots
2011/Oct/16   Hide Strings In Your Executable
2011/Oct/11   Generate QR Codes With PHP and Google Charts API
2011/Oct/02   Feed.rss and Sitemap.xml
2011/Sep/28   My Experiences with AdSense
2011/Sep/27   Emulating MemCache in Pure PHP
2011/Sep/14   Simple Ajax
2011/Sep/13   Sparklines
2011/Sep/12   smallpt - Global Illumination in Javascript