CalculiX using SPOOLES with MPI

Hello,

I would like to try running CalculiX in a distributed computing environment (i.e. a cluster). It seems there are some source files (v2.17) with preprocessor definitions to include the MPI version of SPOOLES and to initialise MPI. These defines can be activated during compilation by setting the flag -DCALCULIX_MPI in the makefile.
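For reference, my understanding is that these guards wrap the MPI setup roughly like the sketch below (a minimal illustration only; the variable names are mine and this is not the actual CalculiX source):

/* Minimal sketch of how a -DCALCULIX_MPI guard typically wraps the
   MPI setup around the solver; names here are illustrative only. */
#ifdef CALCULIX_MPI
#include <mpi.h>
#endif
#include <stdio.h>

int main(int argc, char **argv)
{
    int myrank = 0, nproc = 1;

#ifdef CALCULIX_MPI
    /* only compiled in when -DCALCULIX_MPI is passed to the compiler */
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);
#endif

    printf("process %d of %d started\n", myrank, nproc);

    /* ... read the input deck, call the SPOOLES (MPI) solver, ... */

#ifdef CALCULIX_MPI
    MPI_Finalize();
#endif
    return 0;
}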

As I could not find any official instructions, I would like to know if someone knows the intention behind all this, and/or whether it is possible to run CalculiX with MPI?

Furthermore, I managed to compile the code with this flag set (see attached makefile), but had trouble running the MPI version.

As a first test, I tried to run the MPI version of CalculiX on a single node by issuing the command:

>$ mpirun -np 2 ccx_2.17_MT_MPI spirello_model

The output from this command is pasted below. It shows that the program did not run because it failed while reopening the *.dat file (presumably the second concurrent process could not delete it). This *.dat file seems to be generated even when there is no *PRINT input card.
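My guess is that both MPI processes call the same file-opening routine, and the second one then fails to delete/reopen spirello_model.dat. What I would naively expect to be needed is something like the sketch below, i.e. only rank 0 touching the output files (purely illustrative; open_dat_on_rank0 is a made-up name, not the actual openfile.c code):

/* Illustrative guard: let only MPI rank 0 delete and reopen the .dat file.
   This is a sketch of the suspected fix, not the actual CalculiX code. */
#ifdef CALCULIX_MPI
#include <mpi.h>
#endif
#include <stdio.h>

FILE *open_dat_on_rank0(const char *name)
{
    int myrank = 0;
#ifdef CALCULIX_MPI
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
#endif
    if (myrank != 0)
        return NULL;   /* the other ranks do not write the .dat file */

    remove(name);      /* discard a stale file from a previous run */
    return fopen(name, "w");
}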

Any feedback/help would be greatly appreciated!

Thank you in advance,
Regards,
Jorge

PS: I could not attach the Makefile. This is the first time I am using the forum, so I might be doing something wrong, but it seems really odd that the interface does not allow attaching text files… I have therefore included a JPEG snapshot of it.

============================================================

$ mpirun -np 2 ccx_2.17_MT_MPI spirello_model

*ERROR in openfile: could not delete file spirello_model.dat


CalculiX Version 2.17, Copyright(C) 1998-2020 Guido Dhondt
CalculiX comes with ABSOLUTELY NO WARRANTY. This is free
software, and you are welcome to redistribute it under
certain conditions, see gpl.htm

You are using an executable made on Mi 12. Mai 23:27:22 CEST 2021

Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.

The numbers below are estimated upper bounds

number of:

nodes: 74813
elements: 13716
one-dimensional elements: 0
two-dimensional elements: 0
integration points per element: 8
degrees of freedom per node: 3
layers per element: 1

distributed facial loads: 0
distributed volumetric loads: 0
concentrated loads: 0
single point constraints: 5667
multiple point constraints: 2
terms in all multiple point constraints: 150
tie constraints: 17
dependent nodes tied by cyclic constraints: 0
dependent nodes in pre-tension constraints: 0

sets: 81
terms in all sets: 57557

materials: 4
constants per material and temperature: 9
temperature points per material: 1
plastic data points per material: 0

orientations: 9124
amplitudes: 5
data points in all amplitudes: 5
print requests: 0
transformations: 0
property cards: 0

*WARNING in usermpc: node 38426
is very close to the
rotation axis through the
center of gravity of
the nodal cloud in a
mean rotation MPC.
This node is not taken
into account in the MPC

STEP 1

Static analysis was selected

*INFO in gentiedmpc:
failed nodes (if any) are stored in file
WarnNodeMissTiedContact.nam
This file can be loaded into
an active cgx-session by typing
read WarnNodeMissTiedContact.nam inp


mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

Process name: [[52683,1],1]
Exit code: 201

In an ideal world, one could combine MPI and multi-threading, so that each instance running on each node is itself multi-threaded. So in theory, why should it be slower? Also, I guess that on a cluster it might be easier to get the amount of memory required to run very big models (on the order of 1e6 equations or so).
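For instance, with Open MPI I would expect a hybrid run to be launched with something along these lines (assuming the multi-threaded part still honours OMP_NUM_THREADS; untested):

>$ mpirun -np 2 -x OMP_NUM_THREADS=4 ccx_2.17_MT_MPI spirello_model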

Benchmarking CalculiX (with SPOOLES) against the number of cores shows that the performance gain (i.e. the speed-up of a calculation) falls off rapidly as cores are added, and (at least on my system) beyond half of the available threads there is no improvement at all.
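(I presume this is just the usual Amdahl-type behaviour: if only a fraction p of the run parallelises, the speed-up on N cores is at best 1 / ((1 - p) + p/N), which flattens out quickly; and the second half of the threads on my machine are probably hyper-threaded logical cores, which add little for a memory-bound solver.)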

The alternative free solver PaStiX might be much better in terms of speed and memory requirements, but getting it to work is really a nightmare! After many hours I managed to compile it (with some help) against system libraries, but could not yet get CalculiX to work with it. I hope that for the next release of CalculiX some effort will be put into a more standard build procedure to ease this process. Also, I do not like the idea of forking and forcing the use of an outdated version of PaStiX, as this library also evolves and gets bug fixes and new features…

Thank you for your feedback…


I agree with you on the build procedure for future releases and on the nightmare of building with PaStiX.

Info from Mathieu Faverge regarding the use of the standard (base) PaStiX in CalculiX:

I have been looking at the changes applied by the CalculiX team on PaStiX all morning, and my first impression is that you need the pastixStepsReset because everything has been “changed to its inverse” in the code, but I can’t understand why it was done like this instead of following the internal logic.
I’ll try to look a little more into it, but we really need to find a way to integrate some of the work, which is really nice, but we can’t do it this way. We will see with Peter Wauligman and Tobias Opel too.

Is there any hint about the date of the next CalculiX release?