Binary Search Revisited
posted by Stephan Brumme
Same same but different ?!
Some time ago I had to write code for a binary search of sorted data. Obviously there are libraries which can do the heavy lifting for me, such as C++'s STL.When I looked up the source code of STL's
binary_search
I was surprised to find that it
slightly differs from what I expected: it always performs log2(N) less-than comparisons and
only after that checks for equality exactly once:

while at least one element
{
find element in the middle
if it's smaller than x
strip all elements before and including middle element
else
strip all elements after middle element
}
return true if remaining element matches x, else false
The STL guys usually prefer fast algorithms - why did they do it ?
This clearly calls for some serious googling ... and the vast majority of web sites, including Wikipedia, combine the less-than comparison with an check for equality and indeed allow for an early exit if they find a match:

while at least one element
{
find element in the middle
if it matches x, return true <== NEW
if it's smaller than x
strip all elements before and including middle element
else
strip all elements after middle element
}
return false <== DIFFERENT
check if less than | check if equal | |
---|---|---|
STL | ≈ log2(N) | 1 |
"rest of world" | ≈ log2(N) - 1 | ≈ log2(N) - 1 |
There is only one way to find the best
binary_search
: WE NEED BENCHMARKS !
Download
Git users: scroll down to the repository link Latest release: April 24, 2013, size: 10.2 kBytes, 381 linesCRC32:
8ccb4820
MD5:
3e13d18e5c402b03dd670e18d11c8510
SHA1:
1dce5af123bf2f76c97bd1fd21c9c35d444dd32c
SHA256:
47efec0b9b5facca2a8ccf29d0b7a36972140ee61478f16de48054c57455e36b
Stay up-to-date:git clone https://create.stephan-brumme.com/binary-search/git
GitHub mirror:https://github.com/stbrumme/binary-search
If you encounter any bugs/problems or have ideas for improving future versions, please write me an email: create@stephan-brumme.com
License
This code is licensed under the zlib License:This software is provided 'as-is', without any express or implied warranty. In no event will the authors be held liable for any damages arising from the use of this software. Permission is granted to anyone to use this software for any purpose, including commercial applications, and to alter it and redistribute it freely, subject to the following restrictions: 1. The origin of this software must not be misrepresented; you must not claim that you wrote the original software. If you use this software in a product, an acknowledgment in the product documentation would be appreciated but is not required. 2. Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software. 3. This notice may not be removed or altered from any source distribution.zlib License
Simplifying the Code
Visual C++'s and GCC's STL implementations ofbinary_search
rely on lower_bound
(I'm pretty sure all STL libraries do).
However, like most STL stuff the code is hardly readable.
After inlining lower_bound
, stripping off all the debugging stuff and
replacing the weird variables and type names by something more understandable,
binary_search
looks like this:

/// same algorithm as STL
template <typename Iterator, typename Value>
bool binary_search(Iterator begin, Iterator end, Value value)
{
// narrow down range of candidates
auto numElements = std::distance(begin, end);
// repeat until a single element is left
while (numElements > 0)
{
// pivot element
auto middlePos = numElements >> 1;
auto middle = begin;
std::advance(middle, middlePos);
// >>> UPDATE check for equality
//if (value == *middle)
// return true;
// <<< UPDATE
if (*middle < value)
{
// value must be on the right side of pivot element
begin = middle;
++begin;
numElements = numElements - middlePos - 1;
}
else
{
// value must be on the left side of pivot element (or it is the pivot element)
numElements = middlePos;
}
}
// out of bounds ?
if (begin == end)
return false;
// >>> UPDATE return always false if checking for equality in while-loop
// return false;
// <<< UPDATE
// hit ?
return value == *begin;
}
Note: you need a C++11 compiler because of the super-convenient
auto
keyword.
Benchmark
The main benchmark parameters are the container type (std::vector
,
std::deque
or std::list
) and:
/// data type
typedef int Number;
/// number of elements
const size_t numElements = 1000*1000;
for (size_t i = 0; i < numElements; i++)
sorted.push_back(Number(i*step));
step
(and notFound=step/2
)
is responsible for creating a certain gap between neighboring numbers.
These gaps are helpful to get a better estimate of the algorithm's runtime because they allow for
looking for numbers that are NOT in the container in an evenly distributed manner.
index | 0 | 1 | 2 | 3 | 4 | 5 | ... |
---|---|---|---|---|---|---|---|
sorted[index] | 0 | 10 | 20 | 30 | 40 | 50 | ... |
0
(succeeds), 5
(fails),
10
(succeeds), 15
(fails),
20
(succeeds), 25
(fails),
30
(succeeds), 35
(fails), etc.:

for (size_t i = 0; i < numElements; i++)
{
// hit
Number succeed = Number(i*step);
if (!std::binary_search(sorted.begin(), sorted.end(), succeed))
std::cout << "error @" << succeed << std::endl;
// miss
Number fail = Number(i*step+notFound);
if ( std::binary_search(sorted.begin(), sorted.end(), fail))
std::cout << "error @" << fail << std::endl;
}
while
-loop about 20 times.
All compiler optimizations I could think of were enabled.That's what I get:
1 million int |
Visual C++ 2010 | GCC 4.7 |
---|---|---|
std::binary_search | 131ms | 122ms |
my binary search | 107ms | 96ms |
early check for equality | 160ms | 97ms |
recursive | 242ms | 110ms |
hybrid (8) | 85ms | 133ms |
int
egers (compiled with Visual C++).My implementation of
std::binary_search
isn't as nested as the STL and
lacks some iterator checking.For GCC it doesn't really matter which variant of my binary search implementations is chosen but Visual C++ seriously loses ground when enabling the early check for equality.
I added
#define _SECURE_SCL 0
to disable Visual C++'s checked iterators
in release mode, too.Let's have a look at
float
(very similar results for double
):
1 million float |
Visual C++ 2010 | GCC 4.7 |
---|---|---|
std::binary_search | 251ms | 119ms |
my binary search | 217ms | 96ms |
early check for equality | 200ms | 126ms |
recursive | 290ms | 129ms |
hybrid (16) | 158ms | 132ms |
Even More Simplifying the Code
A recursive implementation of the binary search is quite short:
/// recursive implementation
template <typename Iterator, typename Value>
bool binary_search_recursive(Iterator begin, Iterator end, Value value)
{
// empty ?
if (begin == end)
return false;
// stop recursion ?
auto numElements = std::distance(begin, end);
if (numElements == 1)
return value == *begin;
// pivot element
auto middlePos = numElements >> 1;
auto middle = begin;
std::advance(middle, middlePos);
// keep looking on the right or left side of pivot element ?
if (value < *middle)
return binary_search_recursive(begin, middle, value);
else
return binary_search_recursive(middle, end, value);
}
Hybrid Approach
When the container holds small element types which are laid out next to each other in the memory (such asint
), the L1/L2 caches' influence plays a major role.My hybrid algorithm switches from binary subdivision to linear search after a given
thresholdLinear
was reached:

/// linear search in inner loop
template <size_t thresholdLinear, typename Iterator, typename Value>
bool binary_search_hybrid(Iterator begin, Iterator end, Value value)
{
// narrow down range of candidates
auto numElements = std::distance(begin, end);
// repeat until only a few elements are left
while (numElements > int(thresholdLinear))
{
// pivot element
auto middlePos = numElements >> 1;
auto middle = begin;
std::advance(middle, middlePos);
if (*middle < value)
{
// value must be on the right side of pivot element
begin = middle;
++begin;
numElements = numElements - middlePos - 1;
}
else
{
// value must be on the left side of pivot element (or it is the pivot element)
numElements = middlePos;
}
}
// skip all smaller values
while (begin != end && *begin < value)
++begin;
// hit ?
return begin != end && *begin == value;
}
while
compares against thresholdLinear
,
not 0
. Lines 30-35 are new, everything else should be familiar.The best threshold value heavily depends on the data type. For 32-bit
int
s I found thresholdLinear
=
8
to be the best parameter.Your mileage may vary.
my::binary_search_hybrid<8>(sorted.begin(), sorted.end(), lookFor));
The World Is Not A Vector
std::binary_search
accepts arbitrary forward iterators and makes it
easy to replace the container (benchmarks above: std::vector
).Here are the results of STL's double-ended queue (
std::deque
):
std::deque , 1 million int |
Visual C++ 2010 | GCC 4.7 |
---|---|---|
std::binary_search | 352ms | 206ms |
my binary search | 335ms | 204ms |
early check for equality | 482ms | 215ms |
recursive | 510ms | 540ms |
hybrid (8) | 309ms | 198ms |
std::deque
and utilizes the CPU caches best.When looking at
std::list
we have to keep in mind that
even though only a forward iterator is required, a lot of iterating is done
in std::advance
and std::distance
.Therefore instead of 1000000 only 10000
int
s are pushed into an std::list
:
std::list , ten thousand int |
Visual C++ 2010 | GCC 4.7 |
---|---|---|
std::binary_search | 692ms | 903ms |
my binary search | 683ms | 878ms |
early check for equality | 687ms | 882ms |
recursive | 982ms | 1261ms |
hybrid (8) | 686ms | 884ms |
Unlike all tables before, I didn't highlight the fastest algorithm - because it's not shown in this table !
A simple
std::find
is faster than all presented algorithm when working
on std::list
.
It can be improved for sorted data, too: abort when the iterator points at values larger than
the value we are looking for.
And here's the code of find
and find_sorted
:

/// same algorithm as STL's find
template <typename Iterator, typename Value>
Iterator find(Iterator begin, Iterator end, Value value)
{
// compare all values until found
while (begin != end)
{
if (*begin == value)
return begin; // found !
++begin;
}
// not found
return end;
}

/// find, optimized for sorted data
// requires: iff !(x<y) && !(x>y) then x == y
template <typename Iterator, typename Value>
Iterator find_sorted(Iterator begin, Iterator end, Value value)
{
// skip all smaller values
while (begin != end && *begin < value)
++begin;
// hit ?
if (begin != end && *begin == value)
return begin;
else
return end;
}
Iterator
instead of
bool
(just like the original std::find
).
std::list , 10'000 int |
Visual C++ 2010 | GCC 4.7 |
---|---|---|
find | 253ms | 292ms |
find_sorted | 163ms | 184ms |
std::vector
with only 10000 int
s
is still the clear winner:
std::vector , ten thousand int |
Visual C++ 2010 | GCC 4.7 |
---|---|---|
hybrid binary search | 1.2ms | 0.8ms |
find | 192ms | 160ms |
find_sorted | 125ms | 62ms |
int
in a std::list
takes several minutes for find
and find_sorted
to complete.
The World Is Not A Simple Number
My benchmarks were completely memory-based and the objects only simple numbers (int
or float
).
Performance was mainly influenced by instruction branching and caching.If data is stored on high-latency storage, especially storage with slow seeking times like a hard disk or some network-based solution, then the advantage of an early equality test truly kicks in. The hybrid approach computes the most comparisons and suffers severely if comparisons are expensive (e.g. large objects instead of primitive data types such as
int
).If your container is very small then
find
can be surprisingly efficient
because it gets support from modern CPU's intelligent prefetching.
For less than approx. 50 int
s, find_sorted
is as fast as a binary search and for less than approx. 25 int
s,
even find
becomes competitive.
The simple std::find
has a few advantages, too:
no need for the less-than operator and - more importantly - the input data doesn't have to be sorted.