Can CalculiX run across multiple nodes?

Dear All,

I’ve been trying to run coupled conjugate heat transfer (CHT) cases using CalculiX, OpenFOAM, and preCICE on an HPC system. My question is how to run CCX efficiently there, both on a single node and across multiple nodes.

For additional details, please see my post on the preCICE forum: How to run CCX on HPC.

I successfully installed precice_ccx and its dependencies (including SPOOLES and SPOOLES-MT) on the HPC system, some built from source and others installed with Spack. I made sure that everything is compiled with the same GCC and MPI versions. I was able to run coupled simulations using SPOOLES with the following part of my SLURM script:

cd solid
export OMP_NUM_THREADS=50
export CCX_NPROC_EQUATION_SOLVER=50
taskset -c 64-127 ccx_preCICE -i solid -precice-participant Solid > log.ccx
cd ..

Although it reports running on 50 cores, the speed-up is lower than expected. (Please see my preCICE post for the complete SLURM script.)
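
The snippet above hard-codes 50 threads while taskset pins the process to cores 64-127. One variation would be to derive both thread settings from the SLURM allocation so they cannot drift apart. This is only a minimal sketch: it assumes the job requests --cpus-per-task (SLURM then sets SLURM_CPUS_PER_TASK), the helper name set_ccx_threads is hypothetical, and it is not taken from the complete script in the linked post.

```shell
#!/bin/bash
# Hypothetical helper: take the thread count from the SLURM allocation
# instead of hard-coding it. SLURM_CPUS_PER_TASK is only set when the job
# requests --cpus-per-task, so fall back to 1 thread otherwise.
set_ccx_threads() {
    local n="${SLURM_CPUS_PER_TASK:-1}"
    export OMP_NUM_THREADS="$n"              # OpenMP threads
    export CCX_NPROC_EQUATION_SOLVER="$n"    # threads for the equation solver
}

set_ccx_threads
echo "threads: OMP=$OMP_NUM_THREADS, solver=$CCX_NPROC_EQUATION_SOLVER"
```

Calling set_ccx_threads before the ccx_preCICE line would replace the two export lines. Whether this improves the speed-up is a separate question that the sketch does not address.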

My questions:

  1. How can I properly run CCX on an HPC using a SLURM script?
  2. What is the best configuration or setup to achieve maximum speed-up (scalability) for CCX on an HPC?
  3. How can I report execution time or clock time at each timestep?
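
Regarding question 3, a generic workaround that does not depend on CalculiX itself is to prefix every line of the solver's output with the elapsed wall-clock time. The following is a minimal sketch, assuming bash; the function name stamp_lines is hypothetical, the resolution is whole seconds, and it only helps if the solver writes something to its log at each timestep.

```shell
#!/bin/bash
# Hypothetical helper: prefix each line read from stdin with the number of
# seconds elapsed since the function started (bash's built-in SECONDS counter).
stamp_lines() {
    local start=$SECONDS
    local line
    # The "|| [ -n ... ]" part keeps a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '[%6ds] %s\n' "$((SECONDS - start))" "$line"
    done
}

# Demonstration with placeholder input (not real solver output):
printf 'first line\nsecond line\n' | stamp_lines
```

In the job script this could be used as `ccx_preCICE -i solid -precice-participant Solid 2>&1 | stamp_lines > log.ccx`, so that the time between consecutive log lines can be read off afterwards.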

I would greatly appreciate any guidance, suggestions, or examples from those who have tackled similar issues.

Sincerely,
Umut