CalculiX with Intel's MPI cluster sparse solver

I just started testing this, and I got this message after installing it:

Can't open ccx_2.18step.c: No such file or directory at ./date.pl line 18.
icc: remark #10441: The Intel(R) C++ Compiler Classic (ICC) is deprecated and will be removed from product release in the second half of 2023. The Intel(R) oneAPI DPC++/C++ Compiler (ICX) is the recommended compiler moving forward. Please transition to use this compiler. Use '-diag-disable=10441' to disable this message.

So far I am running the test suite and will report shortly.
For reference, OS=Ubuntu 22.04 LTS (WSL2) and compiler = Intel’s oneAPI HPC Toolkit (version 2022.3.1).

Thank you!

I have the following version:

a@DESKTOP-7B3I8SK:/mnt/e/q9$ mpiifort --version
ifort (IFORT) 2021.7.1 20221019
Copyright (C) 1985-2022 Intel Corporation. All rights reserved.

And I use the following command:

. /opt/intel/oneapi/setvars.sh
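
Just to double-check that the oneAPI environment is really the one being picked up, one can look at which compilers end up on the PATH, for example:

# Optional sanity check after sourcing setvars.sh
which icc mpiicc mpiifort
icc --version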

Not sure. It seems like some compiler conflict, perhaps with your system compiler. Try compiling an even simpler hello-world kind of program:

/* The Parallel Hello World Program */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
   int node;

   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &node);   /* rank of this process */

   printf("Hello World from Node %d\n", node);

   MPI_Finalize();
   return 0;
}
[feacluster@instance-3 ~]$ mpiicc hello_world.c
[feacluster@instance-3 ~]$ ./a.out
Hello World from Node 0
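
To actually see output from more than one rank, you would normally launch it through the MPI launcher instead of running ./a.out directly; something like this (the rank count is arbitrary):

mpirun -n 4 ./a.out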

If it created the executable, you can ignore those warnings.
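
If the deprecation remark itself is bothersome, the compiler already tells you how to silence it; as a sketch, assuming the Makefile used by the install script has a CFLAGS-style variable, the flag could be appended there:

# Sketch only: the variable name CFLAGS is an assumption about the Makefile in use
CFLAGS += -diag-disable=10441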


Looks like it is working just fine! Thank you so much @feacluster !!!

Same problem. The beginning and the end of the message are below. I will try this on a Google Cloud machine; maybe there will not be such a compiler conflict there.

Thank you very much indeed!

icc: remark #10441: The Intel(R) C++ Compiler Classic (ICC) is deprecated and will be removed from product release in the second half of 2023. The Intel(R) oneAPI DPC++/C++ Compiler (ICX) is the recommended compiler moving forward. Please transition to use this compiler. Use '-diag-disable=10441' to disable this message.
In file included from /usr/include/stdio.h(43),
from hello.c(2):
/usr/include/x86_64-linux-gnu/bits/types/struct_FILE.h(95): error: identifier "size_t" is undefined
size_t __pad5;
^

In file included from /usr/include/stdio.h(43),
from hello.c(2):
/usr/include/x86_64-linux-gnu/bits/types/struct_FILE.h(98): error: identifier "size_t" is undefined
char _unused2[15 * sizeof (int) - 4 * sizeof (void *) - sizeof (size_t)];
^

In file included from hello.c(2):
/usr/include/stdio.h(52): error: identifier "__gnuc_va_list" is undefined
typedef __gnuc_va_list va_list;
^

In file included from hello.c(2):
/usr/include/stdio.h(292): error: "size_t" is not a type name
extern FILE *fmemopen (void *__s, size_t __len, const char *__modes)
^

In file included from hello.c(2):
/usr/include/stdio.h(298): error: "size_t" is not a type name
extern FILE *open_memstream (char **__bufloc, size_t *__sizeloc) __THROW __wur;
^

In file included from hello.c(2):
/usr/include/stdio.h(309): error: "size_t" is not a type name
int __modes, size_t __n) __THROW;

In file included from hello.c(2):
/usr/include/stdio.h(675): error: "size_t" is not a type name
extern size_t fwrite_unlocked (const void *__restrict __ptr, size_t __size,
^

In file included from hello.c(2):
/usr/include/stdio.h(675): error: "size_t" is not a type name
extern size_t fwrite_unlocked (const void *__restrict __ptr, size_t __size,
^

In file included from hello.c(2):
/usr/include/stdio.h(676): error: "size_t" is not a type name
size_t __n, FILE *__restrict __stream);
^

In file included from /usr/include/stdio.h(864),
from hello.c(2):
/usr/include/x86_64-linux-gnu/bits/stdio.h(39): error: identifier "__gnuc_va_list" is undefined
vprintf (const char *__restrict __fmt, __gnuc_va_list __arg)
^

compilation aborted for hello.c (code 2)

Which gcc version do you have? If old, maybe try upgrading it:

[feacluster@micro ~]$ gcc --version
gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-4)

I am not sure if you are asking me or jbr, but just in case, my answer is below. It does not look like a very old version, but I’ll check.

a@DESKTOP-7B3I8SK:/mnt/e/q9$ gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

How about something simpler like this:

[feacluster@instance-3 ~]$ cat hello_world2.c
#include <iostream>

int main()
{
    std::cout << "Hello, World!\n";
}
[feacluster@instance-3 ~]$ icpc hello_world2.c

Try also:

sudo apt install build-essential
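
As a quick way to tell whether the problem is in the system headers or in the Intel compiler setup, you could also build the same small file with the GNU compiler first; this is only a sanity test:

# If this fails too, the system headers are the problem, not icc/icpc
g++ hello_world2.c -o hello_gnu
./hello_gnu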


I have managed to install the software on Google Cloud. I will try to install it on the local machine later, but even if I fail, it does not matter.

Thank you so much!

There were some deviations in the file error.14070:

deviation in file beamfsh1.dat
line: 103 reference value: -5.943977e-09 value: -1.334100e-09
absolute error: 4.609877e-09
largest value within same block: 1.276463e-07
relative error w.r.t. largest value within same block: 3.611446 %

beamhtfc2.dat and beamhtfc2.dat.ref do not have the same size !!!

deviation in file beamptied5.dat
line: 13 reference value: 2.004731e+06 value: 1.890823e+06
absolute error: 1.139080e+05
largest value within same block: 2.968844e+06
relative error w.r.t. largest value within same block: 3.836780 %

deviation in file beamptied6.dat
line: 14 reference value: 2.090934e+06 value: 2.005467e+06
absolute error: 8.546700e+04
largest value within same block: 3.071561e+06
relative error w.r.t. largest value within same block: 2.782527 %

beamread.dat and beamread.dat.ref do not have the same size !!!

beamread2.dat and beamread2.dat.ref do not have the same size !!!

beamread3.frd does not exist
beamread4.dat and beamread4.dat.ref do not have the same size !!!

deviation in file induction2.frd
line: 20203 reference value: 2.198320e+01 value: 1.998630e+01
absolute error: 1.996900e+00
largest value within same block: 2.257380e+01
relative error w.r.t. largest value within same block: 8.846096 %

deviation in file membrane2.frd
line: 135 reference value: 2.617620e-08 value: 2.024470e-09
absolute error: 2.415173e-08
largest value within same block: 2.833140e-07
relative error w.r.t. largest value within same block: 8.524722 %

deviation in file ringfcontact4.dat
line: 26 reference value: 2.033520e+07 value: 2.063512e+07
absolute error: 2.999200e+05
largest value within same block: 2.033520e+07
relative error w.r.t. largest value within same block: 1.474881 %

deviation in file segment.frd
line: 5467 reference value: 8.550770e+10 value: -8.550340e+10
absolute error: 1.710111e+11
largest value within same block: 8.551870e+10
relative error w.r.t. largest value within same block: 199.969246 %

deviation in file sens_freq_disp_cyc.frd
line: 3536 reference value: -3.302250e+03 value: -3.253000e+03
absolute error: 4.925000e+01
largest value within same block: 1.069350e+04
relative error w.r.t. largest value within same block: 0.460560 %

deviation in file sens_modalstress.frd
line: 3266 reference value: -1.000000e+00 value: -9.648930e-01
absolute error: 3.510700e-02
largest value within same block: 1.000000e+00
relative error w.r.t. largest value within same block: 3.510700 %

deviation in file simplebeampipe3.dat
line: 131 reference value: 8.716587e+04 value: -2.006259e+04
absolute error: 1.072285e+05
largest value within same block: 8.716587e+04
relative error w.r.t. largest value within same block: 123.016566 %
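
For a quick overview, the file names and relative errors can be pulled out of that log with a simple grep along these lines (the file name is the one from above):

# Summarize error.14070: deviating files, size mismatches and relative errors
grep -E "deviation in file|relative error|do not have the same size|does not exist" error.14070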

I think it looks correct. I get the same files with differences. I think most are due to differences in the eigensolver and natural frequencies. See discussion on that topic here:

What kind of Google Cloud machines are you going to run on? To see a speedup you will need to use two dedicated machines with all the available CPUs on each.
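
With Intel MPI the launch would look roughly like the line below; the host names, rank counts and the name of the MPI executable are placeholders, so adjust them to whatever the install script produced:

# Sketch only: 2 hosts x 8 ranks each; host names, executable and job name are placeholders
mpirun -n 16 -ppn 8 -hosts host1,host2 ./ccx_2.18_MPI -i jobname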


Thank you very much for the explanation.

As for the Google Cloud machines, I have yet to decide what exactly I need. So far I just wanted to try your install procedure, so I set up an instance group where the number of instances could vary from 1 to 10. The instances were E2 machines with a 50 GB boot disk (oneAPI Base required at least 23 GB). At the moment execution time is not the priority, but I need a lot of memory. My desktop at home has only 64 GB of memory, and that is not enough for my problem.

Any plans to port it to v2.20?

Only works with v2.18. Will update it for newer versions if there is interest.

No plans yet unless someone has a compelling reason. :) Were you able to run this on a cluster and see any speedup?


So I have finally run a buckling job on a Google Cloud cluster: one instance, a custom N2 machine with 8 vCPUs, 120 GB RAM, and a 96 GB boot disk. At least 95 GB of RAM was used, and the job took about 100 minutes.
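
(For anyone repeating this: while such a job is running, memory consumption can be watched from a second terminal with something as simple as the line below.)

# Refresh the memory usage report every 30 seconds while the job runs
watch -n 30 free -h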

Thank you very much for the script and your help!


That is good. But if you are only running on one instance, there is probably no need to run the MPI version. The regular executable with the Pardiso solver should work just as well.


Does it not matter that there are multiple processors?

Both the MPI and non-MPI versions can use all the processors. The only difference is that the MPI version can also use multiple hosts. So if you install option (2) below, you are limited to the processors on a single host (see the sketch after the list).

(1) Spooles ( Not recommended. 2-3X slower than Pardiso and cannot solve models with more than a million degrees of freedom )
(2) Pardiso ( Must have the Intel compiler. If not, it is available for free from Intel.com. Does not require administrative privileges )
(3) Pardiso MPI ( Same requirements as above, but needs the HPC kit also. Only works with v2.18. Will update it for newer versions if there is interest. )
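
In practice the difference is just in how the job is launched. For option (2), a sketch (the executable name and the use of OMP_NUM_THREADS for the thread count are assumptions):

# Option (2): one host, all cores via threads
export OMP_NUM_THREADS=8
./ccx_2.18 -i jobname

For option (3), the job is started through mpirun across the hosts, as in the earlier example.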


I might note that I have run problems requiring more memory than I have by increasing my available page file size and putting the page file on an SSD. I have a very fast SSD, but watching the I/O bandwidth has indicated that this is probably not necessary. Most FEM problems only access modest portions of memory at any one time, or at nearly the same time. On my 64 GB machine, working virtual memory set sizes up to 160% of physical memory have been usable; beyond that, heavy paging slows things down a lot. PaStiX does not use memory as efficiently as Pardiso, so problems do not work as well once they get larger than physical memory, and they reach that limit sooner. I have not been using v2.18 yet.
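
(On the Linux/cloud side of this thread, the equivalent of enlarging the page file would be adding a swap file on a fast disk; a minimal sketch, with the size and path as placeholders:)

# Create and enable a 64 GB swap file (size and path are placeholders)
sudo fallocate -l 64G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile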
