Equation Solution High Performance by Design 




Powerful Multicores Need Powerful Methods
[Posted by JennChing Luo on Apr. 21, 2009 ]
Multicores (or multiprocessors) provide extra computing power. Some users may think the extra computing power may allow us to have a bruteforce search, instead of having a rational analysis. Bruteforce search on a powerful multicore system may be defeated by a rational analysis on a single core.
I can give you an example. Bandwidth of a sparse matrix can be reduced by a proper permutation of columns and rows of matrix. For a matrix of order nxn, there exist n! possible permutations. Theoretically, we can have a bruteforce search to find the minimal bandwidth among the n! possible permutations. However, bruteforce search is impractical, because bruteforce search may have too many possibilities. For example, 1) When n=10, we have 3628800 possibilities; 2) When n=20, we have 2.43x10^{18} possibilities; 3) When n=100000, how many possibilities we might have? 100000! What a value! Bruteforce search is not always possible no matter how many cores are available. As compared with a rational analysis, bruteforce search is very timeconsuming. Bruteforce search on multicores may not catch up a rational analysis on a single core. For example, the web page (http://www.equation.com/servlet/equation.cmd?fa=jclexact) has a small example of n=37. A rational analysis could find the minimal bandwidth in 0.02 seconds; While bruteforce search required 74.58 seconds. Indeed, the bruteforce search that found the minimal bandwidth among the n! possible permutations could be implemented in parallel. The point is how many cores are required to complete the search in 0.02 seconds. It required 74.58/0.02=3729 cores for the bruteforce search to catch up a rational analysis that took only 0.02 seconds. Another example, in the same reference of web page, of size n=48, a rational analysis can find minimal bandwidth in 0.03 seconds; While bruteforce search was aborted after 4 hours (not finding the minimal bandwidth yet), which is about 480000 times more cpu time. That means even we have 480000 cores, a bruteforce search could not catch up a rational analysis on a single core on this example. Powerful multicores also need a rational method, and do not allow us to have a bruteforce search. 
