I am trying to measure the time taken by a set of statements. The following is pseudocode. The code is implemented in C++ on a Xilinx chipset with a custom RTOS, so the traditional C++ clock functions do not work here.
I do not need help with the actual time measurement, but rather with the math for calculating the actual execution time.
    one = clock.getTime();
    /* statements 1 to 10 */
    two = clock.getTime();
    fTime = two - one;
Now I know the time taken by the statements. This time also includes the time taken by getTime() too, right?
    one = clock.getTime();
    clock.getTime();
    two = clock.getTime();
    cTime = two - one; // just the measurement itself; the minimum value I get is 300 microseconds
Now this block gives me the time taken by getTime().
Finally, my question is:
What is the actual time taken by the statements?
- fTime - cTime
- fTime - (2* cTime)
- Other equation ?
asked May 28, 2014 at 13:05
Your time measurement shifts each timestamp by some offset.
If the platform is stable enough, that shift dt is the same for every getTime() call, so

    one = t1 + dt
    two = t2 + dt

After subtraction the shift cancels itself out:

    two - one = (t2 + dt) - (t1 + dt) = t2 - t1

so there is no need to correct for the time measurement shift in this case.
Problems start on superscalar/vector architectures where the code execution time is variable
- due to different cache misses
- different prefetch invalidations
and so on; then you have to play with cache invalidation. Also, if your getTime() waits for an interrupt or a HW event, that can add a few extra error terms.
In that case, measure many times and take the average or the smallest result, something like the sketch below.
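A rough sketch, assuming your clock.getTime() returns microseconds as an unsigned value and that the measured block can safely run in a loop:

    const int N = 100;                 // number of repeated measurements
    unsigned long best = ~0UL;         // smallest duration seen so far
    unsigned long sum  = 0;            // accumulated durations for the average

    for (int i = 0; i < N; ++i)
    {
        unsigned long t0 = clock.getTime();
        /* statements 1 to 10 */
        unsigned long t1 = clock.getTime();

        unsigned long dt = t1 - t0;
        if (dt < best) best = dt;      // the minimum filters out interrupt/cache outliers
        sum += dt;                     // the average smooths random jitter
    }

    unsigned long minTime = best;
    unsigned long avgTime = sum / N;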
Typically when benchmarking we perform the measured task many, many times in between the timer calls, and then divide by the number of task executions.
This is to:
- smooth over irrelevant variations, and
- avoid the measured duration falling below the actual clock resolution, and
- leave the timing overhead as a totally negligible fraction of the time.
In that way, you don't have to worry about it any more.
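A minimal sketch of that approach, assuming the same clock.getTime() (returning microseconds) and that statements 1 to 10 can be repeated back to back:

    const int N = 10000;               // number of task executions between the two timer calls

    one = clock.getTime();
    for (int i = 0; i < N; ++i)
    {
        /* statements 1 to 10 */
    }
    two = clock.getTime();

    // the single pair of getTime() calls is spread over N executions,
    // so the timing overhead per execution becomes negligible
    double perExecution = double(two - one) / N;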
That being said, this can introduce problems of its own, particularly with respect to things like branch predictors and caches, as they may be "warmed up" by your first few executions, impacting the veracity of your results in a way that wouldn't happen on a single run.
Benchmarking properly can be quite tricky and there is a wealth of material available to explain how to do it properly.