Equation Solution  
    High Performance by Design


How Good a Speedup is Good?

[Posted by Jenn-Ching Luo on Apr. 02, 2011 ]

        The previous post, "Grandpa on Multicores (1)", showed the parallel performance of a sparse equation solver. Some readers may ask how good a speedup has to be before it counts as good. There does not seem to be a standard for deciding good or bad.

        We have not seen many multicore speedups posted on the internet.

        How good is good? This post does not have a definite answer, but it cites a speedup from the article "Measuring Speedup is Challenging with Intel Turbo Boost Technology" (hereinafter "that article") to give an idea of how much speed has been improved on multicores elsewhere. That article ran its tests on an Intel Core i7 820QM mobile microprocessor, which offers 4 physical cores and, with Hyper-Threading technology, a total of 8 hardware threads. Our interest is the speedup: that article reports a speedup of 1.361X on 4 physical cores (8 hardware threads).

        Now we go back to the previous post, "Grandpa on Multicores (1)", and copy one set of performance results from it:

number of cores   factorization time (sec)   speedup   efficiency (%)
       1                  588.22               1.00        100.00
       2                  298.65               1.97         98.48
       3                  200.73               2.93         97.68
       4                  152.29               3.86         96.56
       5                  121.07               4.86         97.17
       6                  107.09               5.49         91.55
       7                   93.49               6.29         89.99
       8                   80.32               7.32         91.54

We can see that grandpa achieves 3.86X on 4 cores and 7.32X on 8 cores.
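
The speedup and efficiency columns appear consistent with the usual definitions: speedup is the 1-core time divided by the p-core time, and efficiency is the speedup divided by p. The following is a minimal Python sketch under that assumption, using three factorization times from the table; the function names are ours, for illustration only.

```python
def speedup(t_serial, t_parallel):
    """Ratio of the 1-core time to the multi-core time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, cores):
    """Speedup per core, expressed as a percentage."""
    return 100.0 * speedup(t_serial, t_parallel) / cores


# Factorization times (sec) copied from the table: cores -> time
times = {1: 588.22, 4: 152.29, 8: 80.32}

for cores, t in times.items():
    s = speedup(times[1], t)
    e = efficiency(times[1], t, cores)
    print(f"{cores} cores: speedup {s:.2f}X, efficiency {e:.2f}%")
```

For the 4-core and 8-core rows this gives 3.86X at 96.56% and 7.32X at 91.54%, matching the table.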

        Next, we compare grandpa with that article. Grandpa ran efficiently on multicores: on 4 cores, grandpa reached a speedup of 3.86X, while that article achieved a speedup of 1.361X on 4 cores. This simple comparison gives us an idea of how good grandpa is.
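
To put the two 4-core figures on the same scale, each speedup can be divided by the core count to get a per-core efficiency. This is only a rough sketch: it assumes the usual definition of efficiency (speedup divided by cores), the helper name is ours, and the two speedups come from different machines and workloads.

```python
def efficiency_percent(speedup, cores):
    """Per-core efficiency as a percentage (speedup / cores * 100)."""
    return 100.0 * speedup / cores


# Speedups on 4 cores as quoted in the text
grandpa = efficiency_percent(3.86, 4)
article = efficiency_percent(1.361, 4)

print(f"grandpa: {grandpa:.1f}% per core")
print(f"article: {article:.1f}% per core")
```

Grandpa comes out near 96.5% (from the rounded speedup) and that article near 34%.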

        Certainly, the primary purpose of that article is not to demonstrate a particular speedup, but to point out a potential difficulty in measuring speedup when "Intel Turbo Boost Technology" is enabled. That article showed a speedup of only 1.065X on 4 cores when "Intel Turbo Boost Technology" was enabled. Turbo Boost lets a core run at a higher clock frequency when only one core is in use, so the single-core baseline is faster and the measured speedup is smaller. That is one reason why that article claims "Measuring Speedup is Challenging with Intel Turbo Boost Technology".

        This site will show more of grandpa's performance in numerical analysis on multicores in the coming posts.