Creating a high-performance CP2K 2022.1 build that works on the AMD Zen2 microarchitecture

CP2K is a quantum chemistry and solid-state physics software package that can perform atomistic simulations of solid-state, liquid, molecular, periodic, material, crystal, and biological systems. Although it can be compiled successfully with the Intel oneAPI compilers, Intel oneAPI MKL, and Intel MPI, the resulting executables cannot run simulations of large systems on AMD Zen2 CPUs. For instance, the SCF minimization fails and prints a “Segmentation fault” message to standard error. Our debugging points to a memory-handling problem inside the Intel MPI library that occurs only on AMD Zen2 based CPUs; it is not observed on the Intel Xeon CPU family. On the other hand, most users need CP2K builds that run as fast as possible and scale well across multiple processors, which is not exactly the case when the GCC compiler set is employed. Hence, the Intel oneAPI compiler set remains the most desirable choice for building CP2K, even on the AMD Zen2 microarchitecture, and the build recipes need to be fixed to avoid the problem reported above.


One alternative way to create a high-performance CP2K 2022.1 build is to base it on alternative MPI, FFTW3, ScaLAPACK, and BLAS libraries. CP2K usually employs the Intel oneAPI compiler set in conjunction with MKL, which provides FFTW3, ScaLAPACK, and BLAS. Both FFTW3 and ScaLAPACK provided by MKL are linked against the Intel MPI library. They therefore depend on Intel MPI, and the alternative CP2K build cannot employ them, because that would make it impossible to escape Intel MPI. Skipping MKL is not a perfect solution, since the libraries it includes are very well optimized and supported, but it is the only way to go.
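To check which MPI library a given binary or shared library was linked against, one can inspect its dynamic dependencies. The following is a minimal sketch, not part of our recipe: the function name mpi_deps and the example path are hypothetical, and the matched patterns (libmpi, mkl_blacs) are assumptions about how such libraries are typically named.

```shell
#!/bin/sh
# Filter ldd-style output (read from stdin) and keep only the lines that
# mention MPI-related shared objects. The patterns are assumptions about
# common library names, not an exhaustive list.
mpi_deps() {
    grep -E 'libmpi|mkl_blacs' || true
}

# Usage (hypothetical path):
#   ldd /path/to/cp2k.psmp | mpi_deps
```

If the output lists a libmpi that resolves to the Intel MPI installation directory, the build has not escaped Intel MPI.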


One way to replace the MKL-provided libraries required by CP2K is to rely on the CP2K toolchain installer to compile OpenBLAS (which replaces BLAS), FFTW3, and ScaLAPACK from the sub-packages included in the source code package. That method works, but the configure scripts of those sub-packages do not fully optimize the binary code for Zen2. When compiled separately (outside CP2K) with the respective optimizations, FFTW3 might run a little faster (~5%), and the same holds for ScaLAPACK (~3%). Compiling OpenBLAS separately is not necessary; CP2K may rely on its own OpenBLAS sub-package to provide BLAS.
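As an illustration of compiling FFTW3 outside the CP2K toolchain, the sketch below prints a possible configure command. It is not the exact command from our recipe: the installation prefix, the choice of icx and ifort, and the optimization flags are assumptions. The --enable-* switches are standard FFTW3 configure options, and -march=core-avx2 selects the AVX2/FMA instruction set that Zen2 supports.

```shell
#!/bin/sh
# Sketch: configure FFTW3 separately with the Intel compilers.
# PREFIX and the flags below are assumptions, not the author's recipe.
PREFIX="${PREFIX:-$HOME/opt/fftw3-zen2}"
CFLAGS_ZEN2="-O3 -march=core-avx2"   # AVX2 and FMA are available on Zen2

fftw_configure_cmd() {
    # Print the configure command instead of running it, so it can be reviewed.
    echo ./configure CC=icx F77=ifort \
        "CFLAGS=\"$CFLAGS_ZEN2\"" \
        --prefix="$PREFIX" \
        --enable-shared --enable-openmp \
        --enable-sse2 --enable-avx --enable-avx2 --enable-fma
}

fftw_configure_cmd
# To build for real, run the printed command inside the unpacked FFTW3
# source tree, followed by: make -j "$(nproc)" && make install
```

Printing the command first makes it easy to compare against what the CP2K toolchain installer would run before committing to a full build.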


Libint needs to be compiled separately. The CP2K toolchain installer cannot compile it because of a failure in the installer script (which needs further investigation).


Intel MPI might be replaced by a custom build of either OpenMPI or MPICH, whose source code is compiled using the Intel oneAPI compiler set. Note that Intel MPI is based on MPICH. MPICH performs a little better than OpenMPI on our cluster, but since OpenMPI is easy to debug, we employed it as the MPI library for CP2K.
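After installing a custom OpenMPI build, it is worth confirming that the launcher found on PATH really is OpenMPI and not Intel MPI. The sketch below is an assumption-laden helper, not part of our recipe: the function name is_openmpi is hypothetical, and it relies on OpenMPI's mpirun reporting "Open MPI" in its --version output. The configure line in the comment shows the usual way of passing the Intel compilers to OpenMPI; the prefix is a placeholder.

```shell
#!/bin/sh
# A custom OpenMPI build with the Intel compilers is typically configured as:
#   ./configure CC=icx CXX=icpx FC=ifort --prefix=/path/to/openmpi
# (the prefix is a placeholder; icc/icpc may be used instead of icx/icpx).

# Reads the output of "mpirun --version" from stdin and succeeds only if it
# identifies the launcher as Open MPI.
is_openmpi() {
    grep -q 'Open MPI'
}

# Usage:
#   mpirun --version | is_openmpi && echo "Open MPI detected"
```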


Our initial version of the build recipe for CP2K 2022.1 is available on-line at:


https://gitlab.discoverer.bg/vkolev/recipes/-/blob/main/cp2k/2022/1/cp2k-2022.1.intel.openmpi.recipe


One can start enhancing the recipe by adding more external libraries to the toolchain installation (starting with FFTW3) and running tests to verify that this speeds up the simulations.
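When comparing two builds, the gain can be expressed as a relative speedup between the wall times of the same benchmark. The helper below is a hypothetical sketch (the function name speedup_percent and the sample numbers are illustrative only); the wall times would come from runs such as mpirun -np 16 cp2k.psmp -i bench.inp -o bench.out, where the process count and file names are placeholders.

```shell
#!/bin/sh
# Relative speedup in percent of a candidate build over a baseline build,
# given their wall times in seconds: (baseline - candidate) / baseline * 100.
speedup_percent() {
    awk -v base="$1" -v cand="$2" \
        'BEGIN { printf "%.1f\n", (base - cand) / base * 100 }'
}

# Illustrative numbers only, not measured results:
speedup_percent 100 95
```

A positive value means the candidate build is faster; repeating each run several times helps separate real gains from run-to-run noise.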
