How to measure register usage in OpenCL
I noticed the Nvidia Visual Profiler for CUDA prints a line that shows register use:
Register Ratio = 0.75 ( 24576 / 32768 ) [48 registers per thread]
Is it possible to generate a line like that in OpenCL?
I have not seen any OpenCL way to query the number of registers, or the use of those registers.
asked Nov 27 '13 at 01:11
As DarkZeros mentioned, it is implementation-defined, and for a very good reason.
OpenCL makes no assumptions about the architecture, so there is no general way to define a single register ratio, let alone make predictions based on it. For example, AMD hardware has two kinds of registers: scalar and vector. They are disjoint in the sense that they spill independently, among other things.
On a CPU the situation is completely different again, and the compiler can even combine several work-items into a single thread.
To analyze AMD hardware, you need the AMD APP SDK (http://developer.amd.com/tools-and-sdks/heterogeneous-computing/amd-accelerated-parallel-processing-app-sdk/) and the kernel analyzer included with it. For Intel, you need the Intel OpenCL SDK (http://software.intel.com/en-us/vcsource/tools/opencl-sdk).
That is implementation-dependent, since OpenCL is designed to abstract the user from the device as much as possible. For the NVIDIA-specific case, you can pass
-cl-nv-verbose at compile time to get the per-thread register usage in the build log. (You then have to do the math yourself to get the total register usage.) - DarkZeros
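To make "do the math" concrete, here is a minimal C sketch. It assumes the build log (which you would fetch with clGetProgramBuildInfo and CL_PROGRAM_BUILD_LOG after building with -cl-nv-verbose) contains a line of the form "ptxas info : Used N registers"; that log format is an assumption and may differ between driver versions. The helper names parse_registers_per_thread and register_ratio are hypothetical, and the 512 resident threads used below are only inferred from the question's numbers (24576 / 48).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: pull the per-thread register count out of a build
 * log, assuming a line shaped like "ptxas info : Used 48 registers, ...".
 * Returns -1 if no such line is found. */
int parse_registers_per_thread(const char *build_log)
{
    const char *p = strstr(build_log, "Used ");
    while (p != NULL) {
        int regs = 0;
        int consumed = 0;
        /* %n is only stored if the literal " registers" also matched. */
        if (sscanf(p, "Used %d registers%n", &regs, &consumed) == 1 &&
            consumed > 0) {
            return regs;
        }
        p = strstr(p + 1, "Used ");
    }
    return -1;
}

/* Hypothetical helper: fraction of the register file occupied when
 * `resident_threads` threads each use `regs_per_thread` registers. */
double register_ratio(int regs_per_thread, int resident_threads,
                      int register_file_size)
{
    assert(register_file_size > 0);
    return (double)regs_per_thread * resident_threads / register_file_size;
}
```

With the numbers from the question, register_ratio(48, 512, 32768) gives 24576 / 32768 = 0.75. The register file size and the number of resident threads are device- and launch-specific, so you have to supply them yourself; OpenCL has no portable query for them.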