Windows and Linux Code Timing
posted by Stephan Brumme
Whenever I have to do some performance testing, I try the most simple way: write a short program and run it within my Linux bash shell which comes with a built-in
time
command:
$ time ./testcode
real 0m0.788s
user 0m0.696s
sys 0m0.080s
The timing resolution isn't that great, either, and often lies above 10ms.
That's why I wrote a simple C routine to measure performance inside the code. My goal was to have numbers that are accurate down to about 1 microsecond (10-6).
I didn't opt for extremely high timing precision (count CPU ticks) because I rarely need that degree of precision and really want to avoid the tricky details of CPU frequency scaling, cache stalls etc.
Another feature is portability; simply recompile the code and run it on Windows and Linux. The code works fine with GCC as well as Visual C++.
All my timing code contains is a single function
seconds()
returning a double
.
You can invoke it in your programs like this:
double start = seconds();
// ... your code that should be measured ...
double duration = seconds() - start;
hide
A short test program attempts to determine the timer resolution:
// include OS specific timing library
#ifdef _MSC_VER
// Windows
#include <Windows.h>
#else
// Linux
#include <time.h>
#endif
/// return a timestamp with sub-second precision
/** QueryPerformanceCounter and clock_gettime have an undefined starting point (null/zero)
and can wrap around, i.e. be nulled again. **/
double seconds()
{
#ifdef _MSC_VER
static LARGE_INTEGER frequency;
if (frequency.QuadPart == 0)
::QueryPerformanceFrequency(&frequency);
LARGE_INTEGER now;
::QueryPerformanceCounter(&now);
return now.QuadPart / double(frequency.QuadPart);
#else
struct timespec now;
clock_gettime(CLOCK_MONOTONIC, &now);
return now.tv_sec + now.tv_nsec / 1000000000.0;
#endif
}
show
timing.c
// //////////////////////////////////////////////////////////
// timing.c
// Copyright (c) 2014 Stephan Brumme. All rights reserved.
// see http://create.stephan-brumme.com/disclaimer.html
//
// only for printf in the test code in main()
#include <stdio.h>
// include OS specific timing library
#ifdef _MSC_VER
// Windows
#include <Windows.h>
#else
// Linux
#include <time.h>
#endif
/// return a timestamp with sub-second precision
/** QueryPerformanceCounter and clock_gettime have an undefined starting point (null/zero)
and can wrap around, i.e. be nulled again. **/
double seconds()
{
#ifdef _MSC_VER
static LARGE_INTEGER frequency;
if (frequency.QuadPart == 0)
::QueryPerformanceFrequency(&frequency);
LARGE_INTEGER now;
::QueryPerformanceCounter(&now);
return now.QuadPart / double(frequency.QuadPart);
#else
struct timespec now;
clock_gettime(CLOCK_MONOTONIC, &now);
return now.tv_sec + now.tv_nsec / 1000000000.0;
#endif
}
// test code
int main(int argc, char** argv)
{
// find minimum gap between two consecutive timestamps => timing resolution
const int NumTests = 10;
int i;
for (i = 0; i < NumTests; i++)
{
double time1 = seconds();
double time2 = time1;
// wait until timestamp changes
while (time2 == time1)
time2 = seconds();
// compute difference ...
double microseconds = (time2 - time1) * 1000*1000;
double megahertz = 1 / microseconds;
// ... and display it
printf("%f microseconds => %f megahertz\n", microseconds, megahertz);
}
return 0;
}
Download
Git users: scroll down to the repository link Latest release: June 23, 2014, size: 1555 bytes, 65 linesCRC32:
8cdf40fa
MD5:
2d4012f200d62a14aa3e31b374509c10
SHA1:
80ed042505b97eb6969f8b28971ae07239fa3732
SHA256:
c1ab05c2ec925fb067ff90377d4a594af23f7195d22b128c3140fae4409233b5
Stay up-to-date:git clone https://create.stephan-brumme.com/windows-and-linux-code-timing/git
If you encounter any bugs/problems or have ideas for improving future versions, please write me an email: create@stephan-brumme.com
Changelog
- version 1
- latest and greatest
- June 23, 2014
- initial release
Linux
My Linux web server gives me these results:
hide
You can improve precision by replacing
$ uname -a
Linux icpu3665 3.2.58-infong-14157 #1 SMP Fri Jun 6 14:05:34 CEST 2014 x86_64 GNU/Linux
$ gcc -O3 timing.c -o timing -lrt && ./timing
1.223998 microseconds => 0.816995 megahertz
0.917999 microseconds => 1.089326 megahertz
0.900003 microseconds => 1.111107 megahertz
0.453001 microseconds => 2.207502 megahertz
0.462003 microseconds => 2.164488 megahertz
0.450003 microseconds => 2.222207 megahertz
0.453002 microseconds => 2.207497 megahertz
0.444001 microseconds => 2.252249 megahertz
0.450001 microseconds => 2.222219 megahertz
0.444003 microseconds => 2.252238 megahertz
CLOCK_MONOTONIC
by its siblings CLOCK_MONOTONIC_RAW
and
CLOCK_MONOTONIC_COARSE
.If you really need to avoid the occasional zero-wraparound, then the Linux call
gettimeofday
is worth looking into.As a nice side effect
seconds()
will return the number of seconds since January 1, 1970.That means, then you not only can compute time differentials but also use
seconds
for log outputs:
hide
Still pretty good timing resolution:
// changed include:
#include <sys/time.h>
// change code in seconds():
// lower resolution but never wraps around:
struct timeval now;
gettimeofday(&now, NULL);
return now.tv_sec + now.tv_usec / 1000000.0;
hide
Fully working code is available for download:
0.953674 microseconds => 1.048576 megahertz
0.953674 microseconds => 1.048576 megahertz
0.953674 microseconds => 1.048576 megahertz
1.192093 microseconds => 0.838861 megahertz
0.953674 microseconds => 1.048576 megahertz
1.192093 microseconds => 0.838861 megahertz
0.953674 microseconds => 1.048576 megahertz
0.953674 microseconds => 1.048576 megahertz
0.953674 microseconds => 1.048576 megahertz
0.953674 microseconds => 1.048576 megahertz
show
wallclock.c
// //////////////////////////////////////////////////////////
// wallclock.c
// Copyright (c) 2014 Stephan Brumme. All rights reserved.
// see http://create.stephan-brumme.com/disclaimer.html
//
// only for printf in the test code in main()
#include <stdio.h>
// include OS specific timing library
#ifdef _MSC_VER
// Windows
#include <Windows.h>
#else
// Linux
#include <sys/time.h>
#endif
/// return a timestamp with sub-second precision
/** QueryPerformanceCounter has an undefined starting point (null/zero)
and can wrap around, i.e. be nulled again.
gettimeofday (Linux only) returns the number of seconds since January 1, 1970 **/
double seconds()
{
#ifdef _MSC_VER
static LARGE_INTEGER frequency;
if (frequency.QuadPart == 0)
::QueryPerformanceFrequency(&frequency);
LARGE_INTEGER now;
::QueryPerformanceCounter(&now);
return now.QuadPart / double(frequency.QuadPart);
#else
// lower resolution but never wraps around:
struct timeval now;
gettimeofday(&now, NULL);
return now.tv_sec + now.tv_usec / 1000000.0;
#endif
}
// test code
int main(int argc, char** argv)
{
// find minimum gap between two consecutive timestamps => timing resolution
const int NumTests = 10;
int i;
for (i = 0; i < NumTests; i++)
{
double time1 = seconds();
double time2 = time1;
// wait until timestamp changes
while (time2 == time1)
time2 = seconds();
// compute difference ...
double microseconds = (time2 - time1) * 1000*1000;
double megahertz = 1 / microseconds;
// ... and display it
printf("%f microseconds => %f megahertz\n", microseconds, megahertz);
}
return 0;
}
Download
Git users: scroll down to the repository link Latest release: June 23, 2014, size: 1655 bytes, 67 linesCRC32:
fced30b9
MD5:
4c85bae1c200ee26f4b2a9765cdfd18e
SHA1:
2f7f5fb74bde9b33049277e2c9d4c7f6ffa9c555
SHA256:
78109424ee91949ba3b506fd4cf9ed7a9b1f4d9c04a346f0b7c05bbd543fc25a
Stay up-to-date:git clone https://create.stephan-brumme.com/windows-and-linux-code-timing/git
If you encounter any bugs/problems or have ideas for improving future versions, please write me an email: create@stephan-brumme.com
Windows
My Windows 7 desktop computer is a bit more consistent:
hide
0.301854 microseconds => 3.312860 megahertz
0.301858 microseconds => 3.312820 megahertz
0.301858 microseconds => 3.312820 megahertz
0.301854 microseconds => 3.312860 megahertz
0.301854 microseconds => 3.312860 megahertz
0.301858 microseconds => 3.312820 megahertz
0.301858 microseconds => 3.312820 megahertz
0.301858 microseconds => 3.312820 megahertz
0.301854 microseconds => 3.312860 megahertz
0.301858 microseconds => 3.312820 megahertz
time
reloaded
Most Linux systems have two time
commands:
One is built-in into most shells (as mentioned before) and then there is /usr/bin/time
.Its output is much more detailed:
$ /usr/bin/time ./myprogram
4.90user 10.33system 1:36.24elapsed 15%CPU (0avgtext+0avgdata 9600maxresident)k
4934176inputs+7270568outputs (2major+775minor)pagefaults 0swaps