# ¿Cómo quitar el tiempo que tarda el reloj del tiempo total de ejecución medido?

I am trying to measure the time taken by a set of statements. The following is a pseudocode. The code is implemented in C++ on a Xilinx chipset, with a custom RTOS, so the traditional c++ clock functions do not work here.

I do not need help on the actual time measurement, but more on the math on how to calculate the actual execution time.

``````one = clock.getTime();
/*statement
*     1      *
*     to     *
*     10     */
two = clock.getTime();
fTime = two - one;
``````

Now I know the time taken by the statements. This time is also includes the time taken by getTime() too right?

``````one = clock.getTime();
clock.getTime();
two = clock.getTime();
cTime = two - one;     //Just measure and the min value i get is 300 microseconds.
``````

Now this block gives me the time taken by getTime().

Finalmente, mi pregunta es:

What is the actual time taken by the statements?

1. fTime - cTime
2. fTime - (2* cTime)
3. Other equation ?

preguntado el 28 de mayo de 14 a las 13:05

I would say it is closer to your equation 2, though `getTime` is a very fast operation, unless you are looking at very high performance code, I doubt you will see a big difference between `fTime` e `fTime - (2*cTime)` -

I don't know if this makes sense but if you need very precise results you could: Configure a GPIO port (output), Set some port to low, execute your statements, Set the port to high and finally measure with an oscilloscope -

Es muy important to run the 2nd snippet muchos times and only use the smallest value you get. No point in the extra getTime call. You can never be completely sure that you got an accurate value, do watch out for getting negative results from the 1st snippet. -

@Emilien I do not have an oscilloscope to measure. -

@HansPassant The second snippet gave me a min value of 300 microseconds. Why would the first snippet give me negative values? -

## 2 Respuestas

Your time measurement shifts the time

if it is on stable enough platform then the shift is the same for all times so

``````one = t1 + dt
two = t2 + dt
``````

after substraction the shift eliminates itself so

``````two-one = (t2+dt)-(t1+dt) = t2-t1
``````

so there is no need to make corrections for time measurement shift in this case.

Problems starts on multi-scalar/vector architectures where the code execution is variable

• due to different cache miss-es
• different prefetch invalidation

and so on then you have to play with cache invalidation. Also if your `getTime()` waits for interrupt or HW event that can also add few error T's

In that case measure many times and get the avg or smallest IMPORTANTE result something like:

Respondido el 20 de junio de 20 a las 10:06

That's not true if part of `dt` is work that `getTime` does before it actually measures the time. Then this impacts on `two` pero no en `one`. Same problem in reverse for work done inside `getTime` post-measurement. Unless you know that the overhead on both sides of the actual measurement is exactly equal, you can't strip it away like this. In general you can't reliably elide the overhead, which is why we typically do not bother trying. - Carreras de ligereza en órbita

@LightnessRacesinOrbit luckily the part of `dt` you are refer to is a very small percentage of the measurement itself. Just to be sure we are meaning the same things you mean stuff like: return value, stack, and code/data CACHE and their overhead difference between first and second `getTime` call).... - Spektre

That depends on what you're measuring, but you can make it assuredly so by performing the task 100,000 times (pick a number sensibly based on how long your task takes) instead of just once, which is the common approach for this and for a few other reasons also. - Carreras de ligereza en órbita

Typically when benchmarking we perform the measured tasks muchas muchas veces in between the timer calls, and then divide by the number of task executions.

This is to:

• smooth over irrelevant variations, and
• avoid the measured duration falling below the actual clock resolution, and
• leave the timing overhead as a totally negligible fraction of the time.

In that way, you don't have to worry about it any more.

That being said, this can introduce problems of its own, particularly with respect to things like branch predictors and caches, as they may be "warmed up" by your first few executions, impacting the veracity of your results in a way that wouldn't happen on a single run.

Benchmarking properly can be quite tricky and there is a wealth of material available to explain how to do it properly.

Respondido 19 Oct 18, 12:10

No es la respuesta que estás buscando? Examinar otras preguntas etiquetadas or haz tu propia pregunta.