Execution time of sparse matrix multiplication

I am examining the Java version of the sparse matrix multiplication program from the JGF benchmark suite. I ran it at several CPU frequencies and also profiled it. I classify it as a memory-intensive program, because its cache locality is poor and it accesses memory heavily. For this kind of program, the execution time should increase only slightly at a lower CPU frequency, because at the higher frequency many CPU cycles are wasted in memory stalls anyway. In my experiments, however, the execution time scales in proportion to the change in CPU frequency. What is the reason?

The dimension of the matrix (array) is 500000, and the program was run on an i7-920, which has a three-level cache hierarchy: 32 KB L1 data and 32 KB L1 instruction cache per core, 256 KB L2 per core, and an 8 MB shared L3.
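To make the problem size concrete, here is a minimal sketch of a sparse matrix-vector multiply kernel. It assumes the matrix is stored in coordinate form (parallel `val`, `row`, `col` arrays) with `double` elements; the class and method names are hypothetical, and the actual benchmark source may differ.

```java
public class SparseMatmultSketch {
    // For every stored nonzero k: y[row[k]] += val[k] * x[col[k]].
    // The reads of x and the updates of y are indexed indirectly,
    // which is what makes the access pattern irregular.
    static void multiply(double[] y, double[] val, int[] row, int[] col, double[] x) {
        for (int k = 0; k < val.length; k++) {
            y[row[k]] += x[col[k]] * val[k];
        }
    }

    // Bytes occupied by the two dense vectors x and y of length n
    // (assumption: both are double[]; array headers are ignored).
    static long vectorBytes(int n) {
        return 2L * n * Double.BYTES;
    }

    public static void main(String[] args) {
        // Tiny 2x2 example with three nonzeros.
        double[] val = {2.0, 3.0, 4.0};
        int[] row = {0, 1, 1};
        int[] col = {1, 0, 1};
        double[] x = {10.0, 20.0};
        double[] y = new double[2];
        multiply(y, val, row, col, x);
        System.out.println(java.util.Arrays.toString(y)); // prints [40.0, 110.0]

        // Footprint of x and y for the dimension given in the question.
        System.out.println(vectorBytes(500000)); // prints 8000000
    }
}
```

Under these assumptions the two dense vectors alone occupy about 8 MB, roughly the size of the shared L3, before counting the nonzero values and index arrays.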

I also collected execution statistics with perf:

Performance counter stats for 'java -cp . JGFSparseMatmultBenchSizeC':

  83925.084119 task-clock-msecs         #      1.001 CPUs
         2,045 context-switches         #      0.000 M/sec
            28 CPU-migrations           #      0.000 M/sec
        29,687 page-faults              #      0.000 M/sec
223,130,573,396 cycles                  #   2658.688 M/sec  (scaled from 66.68%)
66,679,432,987 instructions             #      0.299 IPC    (scaled from 83.33%)
12,779,607,690 branches                 #    152.274 M/sec  (scaled from 83.32%)
    11,389,605 branch-misses            #      0.089 %      (scaled from 83.32%)
11,056,332,293 cache-references         #    131.740 M/sec  (scaled from 83.34%)
 3,847,329,243 cache-misses             #     45.842 M/sec  (scaled from 83.35%)

   83.816412311  seconds time elapsed
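The ratios behind the "low IPC, heavy cache misses" reading can be recomputed from the raw counters above. This is only a sketch of the arithmetic: the counter values are copied from the perf output, while the class and method names are hypothetical.

```java
public class PerfRatios {
    // Plain ratio of two event counts.
    static double ratio(long numerator, long denominator) {
        return (double) numerator / denominator;
    }

    public static void main(String[] args) {
        // Counter values copied from the perf output in the question.
        long cycles = 223_130_573_396L;
        long instructions = 66_679_432_987L;
        long cacheReferences = 11_056_332_293L;
        long cacheMisses = 3_847_329_243L;

        // Instructions retired per cycle.
        System.out.printf("IPC = %.3f%n", ratio(instructions, cycles));
        // Fraction of counted cache references that missed.
        System.out.printf("cache miss ratio = %.3f%n", ratio(cacheMisses, cacheReferences));
        // Cache misses per 1000 instructions.
        System.out.printf("misses per 1000 instructions = %.1f%n",
                1000.0 * ratio(cacheMisses, instructions));
    }
}
```

The IPC comes out at about 0.30, matching the figure perf reports, and roughly 35% of the counted cache references are misses.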

asked Feb 2 '12 at 11:02

Perhaps the cache is big enough to make memory access irrelevant. You didn't give any info on problem size and cache size.

How did you determine that cache locality is bad? A good package should optimize cache usage. Also, what is your data like? If it is very structured or repetitive, the implementation may detect this and further optimize for fewer memory accesses.

I profiled the execution statistics with perf. There are heavy cache misses, and the IPC of this program is low.

1 Answer

Integer objects that represent values close to 0 may be cached by the JVM to save memory; maybe that plays some role here.
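The caching mentioned here can be observed with a short sketch. It applies only to boxed `Integer` objects obtained through `Integer.valueOf` (or autoboxing), not to primitive `int` values; whether the benchmark uses boxed integers at all is not stated in the question. The class and method names are hypothetical.

```java
public class IntegerCacheDemo {
    // True if two separate valueOf calls return the same object,
    // i.e. the value was served from the Integer cache.
    static boolean sameObject(int v) {
        return Integer.valueOf(v) == Integer.valueOf(v);
    }

    public static void main(String[] args) {
        // Values from -128 to 127 are always cached.
        System.out.println(sameObject(100));
        // Outside that range the result depends on the JVM and its settings.
        System.out.println(sameObject(100000));
    }
}
```

The first line always prints `true`; the second is implementation-dependent, so no particular output should be relied on.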

answered Feb 2 '12 at 17:02
