#1 Force theorem gets an error with MPI by Dongzhe 19.04.2021 17:50

Dear,

After the nice FLEUR workshop last week, I compiled fleur_MPI on our HPC cluster, and "ctest -I 1,35" passed successfully.

These are the modules loaded:
1) intel/18.2
2) intelmpi/18.2
3) netcdf/4.6.1
4) hdf5/1.10.2-intelmpi
5) gcc/7.3.0
6) cmake/3.11.2
7) python/3.7.6-nointel
8) fftw/3.3.8

I have successfully run non-collinear SCF calculations including SOC with "Number of MPI-tasks: 36". However, if I run the force theorem calculations in parallel, the code stops after the first SCF step and produces the following error message:

elementName Forcetheorem_Loop
attributeNames calculationTypeNo
attributeValues MAE 1
elementList
ERROR: xml hierarchy too deep!

On the other hand, if I run with a single core, no problems... It seems like the force theorem part does not support the MPI parallelization.

Thank you in advance for your suggestions and comments.
Best regards,
Dongzhe Li

#2 RE: Force theorem gets an error with MPI by wortmann 20.04.2021 11:50

Thanks a lot for the report, we will look at this. I will also move this discussion to the Magnetism section.

Daniel

#3 RE: Force theorem gets an error with MPI by Gregor 20.04.2021 17:07

Thank you for reporting the issue. I was able to reproduce and fix the problem in the newest development version.

Your diagnosis was correct. It was an issue related to the MPI parallelization in combination with all kinds of force theorem calculations. This combination should work now (in the development version).

Best, Gregor.

#4 RE: Force theorem gets an error with MPI by Dongzhe 20.04.2021 18:58

Thank you for fixing the problem quickly.

Now I have another issue: the code does not compile with HDF5.
I loaded the "hdf5/1.10.2-intelmpi" module, then "FC=mpif90 CC=mpicc ./configure.sh -hdf5 true"

I get the following error message:

HDF5 Library found:FALSE
CMake Warning at cmake/tests/test_HDF5.cmake:66 (message):
You asked for HDF5 but cmake couldn't find it. We will try to download and
compile HDF5 along with FLEUR
Call Stack (most recent call first):
cmake/CompilerConfig.txt:15 (include)
CMakeLists.txt:18 (include)


-- Found Git: /usr/bin/git (found version "1.8.3.1")
CMake Error at cmake/tests/test_HDF5.cmake:72 (message):
HDF5 source could not be downloaded.

We tried: 'git submodule init external/hdf5-git && git submodule update'
and resulted in error
Call Stack (most recent call first):
cmake/CompilerConfig.txt:15 (include)
CMakeLists.txt:18 (include)

Any hints to get rid of it? Many thanks.
Best regards,

#5 RE: Force theorem gets an error with MPI by Gregor 20.04.2021 19:52

This is difficult to answer because there may be many reasons. The module may have HDF5 compiled without --enable-fortran. Maybe you have to specify the paths to the library and/or the include directory explicitly or you have to adapt your LD_LIBRARY_PATH.

EDIT: Maybe your compilers for Fleur differ from those that were used for HDF5.

EDIT2: Actually, from your configure line it is already clear that you don't use the Intel compilers for Fleur. But those would probably be more compatible with your HDF5 version. And remove that "-hdf5 true". It doesn't work on that computer since you are behind some firewall.
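
One way to test the first guess above (HDF5 built without --enable-fortran) is to look for the Fortran module file hdf5.mod in the HDF5 installation, since it is normally only installed when the Fortran interface was built. The following is a minimal sketch, not something from the FLEUR build system; the function name hdf5_has_fortran is made up for illustration, and the installation path has to come from your site's module.

```shell
# Sketch: report whether an HDF5 install tree ships the Fortran module file.
# hdf5_has_fortran is a hypothetical helper name; the prefix is site-specific.
hdf5_has_fortran() {
    prefix="$1"
    # hdf5.mod is normally only present if HDF5 was built with --enable-fortran
    if find "$prefix" -name 'hdf5.mod' 2>/dev/null | grep -q .; then
        echo "Fortran bindings found"
    else
        echo "no Fortran bindings under $prefix"
        return 1
    fi
}
```

"module show hdf5/1.10.2-intelmpi" prints the paths the module actually sets; pass the installation prefix shown there to the function. Note that a .mod file written by one Fortran compiler usually cannot be read by another, so finding the file is necessary but not sufficient.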

#6 RE: Force theorem gets an error with MPI by wortmann 21.04.2021 12:28

This problem arises because git could not download the HDF5 sources. There are several possible reasons, most probably a connection issue of some kind, e.g. your computer does not allow a git clone or you cannot access the server. You could try to run the command
`git submodule init external/hdf5-git` in your fleur source directory to get more direct info. You could also verify that a `git clone https://github.com/HDFGroup/hdf5.git` works correctly.
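
If you only want to know whether the server is reachable, without cloning the whole repository, `git ls-remote` is a lighter check. A small sketch, where can_reach_remote is a made-up helper name and not part of FLEUR:

```shell
# Sketch: check whether git can reach a remote without cloning anything.
# can_reach_remote is a hypothetical helper; it returns 0 if the remote
# answers and non-zero otherwise (e.g. blocked by a firewall).
can_reach_remote() {
    url="$1"
    git ls-remote --heads "$url" >/dev/null 2>&1
}
```

For example, `can_reach_remote https://github.com/HDFGroup/hdf5.git && echo reachable || echo blocked`. Drop the redirection inside the function to see git's own error message.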

Daniel

#7 RE: Force theorem gets an error with MPI by Gregor 21.04.2021 13:52

There are actually two problems here. On the one hand, the automatic download of HDF5 is somehow blocked; on the other hand, the HDF5 from the module is not found. I think one should try to fix the issue with the module.

Check whether you have an mpiifort (from Intel) compiler and use that one.
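
A quick way to do this check is sketched below. With Intel MPI, the mpif90 wrapper normally calls gfortran, while mpiifort calls the Intel Fortran compiler; `mpif90 -show` prints the underlying command if you want to confirm that on your machine. The helper name pick_mpi_fortran is made up for illustration.

```shell
# Sketch: prefer the Intel MPI Fortran wrapper when it is on PATH,
# otherwise fall back to the generic mpif90 wrapper.
# pick_mpi_fortran is a hypothetical helper name.
pick_mpi_fortran() {
    if command -v mpiifort >/dev/null 2>&1; then
        echo mpiifort
    else
        echo mpif90
    fi
}
```

It could then be used in the same style as the configure line from post #4, e.g. `FC=$(pick_mpi_fortran) ./configure.sh`, with CC set to the matching C wrapper.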
