Subdomain local FE solver implementation using iterative solver in domain decomposition method


Abstract (English):
As the hardware architecture of modern supercomputers grows more complicated, it becomes increasingly difficult to achieve high performance in real production-level application codes. To obtain not only inter-node parallel performance using MPI but also higher intra-node thread-parallel performance, more efficient utilization of processor cache memory and SIMD vectorization should be considered. In the domain decomposition method (DDM), one obvious performance bottleneck is the subdomain-wise local FE solver. This presentation demonstrates performance benchmarks of subdomain local FE solvers implemented using preconditioned conjugate gradient solvers. The strength of this implementation is that the working set of the local solver just fits in the processor cache memory. The solvers are tested on modern multi-core scalar-based supercomputers, such as the RIKEN K Computer, Fujitsu PRIMEHPC FX100, Intel Haswell and Knights Corner.
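
To illustrate the kind of subdomain local solver the abstract refers to, the following is a minimal sketch of a preconditioned conjugate gradient iteration. It rests on assumptions not stated in the abstract: the preconditioner is a simple Jacobi (diagonal) scaling, the subdomain matrix is stored as a small dense list of lists, and the function name `pcg_jacobi` and its parameters are hypothetical. It is not the authors' implementation and makes no attempt at cache blocking or SIMD vectorization.

```python
def pcg_jacobi(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a small dense SPD matrix A.

    Illustrative sketch only: Jacobi (diagonal) preconditioning is an
    assumption; the abstract does not name the preconditioner used.
    Returns the solution and the number of iterations performed.
    """
    n = len(b)

    def matvec(v):
        # Dense matrix-vector product; a real FE code would use a sparse
        # or element-by-element kernel here.
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    inv_diag = [1.0 / A[i][i] for i in range(n)]  # Jacobi preconditioner M^-1
    x = [0.0] * n                                 # initial guess x0 = 0
    r = list(b)                                   # residual r0 = b - A x0 = b
    z = [inv_diag[i] * r[i] for i in range(n)]    # preconditioned residual
    p = list(z)                                   # first search direction
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0              # avoid dividing by zero

    for it in range(max_iter):
        # Stop when the relative residual norm is below the tolerance.
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, it
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x, max_iter


if __name__ == "__main__":
    # Hypothetical test problem: 1D Laplacian stiffness matrix tridiag(-1, 2, -1).
    n = 8
    A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
          for j in range(n)] for i in range(n)]
    b = [1.0] * n
    x, iterations = pcg_jacobi(A, b)
    print(iterations, x)
```

Each iteration touches only the matrix and the vectors `x`, `r`, `z`, `p` and `Ap`, so for a small enough subdomain the whole working set can stay resident in cache, which is the property the abstract highlights.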

Publisher:
International Union of Theoretical and Applied Mechanics, 2016
Conference Title:
24th International Congress of Theoretical and Applied Mechanics
Conference Venue:
Montreal (CA)
Conference Dates:
2016-08-21 / 2016-08-26
Rights:
The text is protected under Copyright Act No. 121/2000 Coll.



 Record created 2016-11-15, last modified 2016-11-15


Original version of the author's contribution as presented on CD, page 3104, code TS.FS02-1.06.
