Spooles uses only one core

Nobody-86 · August 14, 2022, 3:06pm

Hello there,

I noticed that CalculiX doesn’t use all cores, at least not for every operation. I have set the number of cores to 6 via the environment variable OMP_NUM_THREADS=6. In the output CalculiX also writes:

Using up to 6 cpu(s) for the stress calculation.
Using up to 6 cpu(s) for the symmetric stiffness/mass contributions.

However, the SPOOLES solver uses only one core:

Factoring the system of equations using the unsymmetric spooles solver
Using 1 cpu for spooles.

What can I do so that SPOOLES also uses 6 cores?
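
To compare what each stage reports, a minimal Python sketch like the one below could scan the solver output for the "Using ... cpu" lines quoted above. The function names and the regular expression are illustrative assumptions inferred only from those quoted lines, not from any documented log format.

```python
import re

# Pattern inferred from the log lines quoted in this post (an assumption,
# not a documented format), e.g.
#   "Using up to 6 cpu(s) for the stress calculation."
#   "Using 1 cpu for spooles."
CPU_LINE = re.compile(r"Using (?:up to )?(\d+) cpu(?:\(s\))? for (?:the )?(.+?)\.?\s*$")


def cpus_per_stage(output):
    """Map each reported stage name to the number of CPUs it announced."""
    stages = {}
    for line in output.splitlines():
        match = CPU_LINE.search(line)
        if match:
            stages[match.group(2)] = int(match.group(1))
    return stages


def stages_below(output, requested):
    """List the stages that announced fewer CPUs than were requested."""
    return [name for name, count in cpus_per_stage(output).items() if count < requested]


if __name__ == "__main__":
    sample = """\
Using up to 6 cpu(s) for the stress calculation.
Using up to 6 cpu(s) for the symmetric stiffness/mass contributions.
Factoring the system of equations using the unsymmetric spooles solver
Using 1 cpu for spooles.
"""
    print(stages_below(sample, requested=6))
```

Feeding it the captured output of a run makes it easy to spot which stage, if any, stays on a single core.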

Calc_em · August 14, 2022, 3:10pm

Check the documentation chapter “How to perform CalculiX calculations in parallel”. It describes what can and needs to be done with Spooles in terms of parallelization.

JohnM · August 14, 2022, 5:20pm

If you have a specific reason for sticking with Spooles, ignore this comment, but I highly recommend you try the PaStiX or Pardiso solver.

Nobody-86 · August 14, 2022, 8:06pm

Hi,

@Calc_em: Thanks. The chapter says that SPOOLES has to be compiled with a multithreading option. How can I check whether this is the case for me? I have a precompiled version of CalculiX and honestly I don’t even know where I got it from. It is a bit older; the archive is called CL34-win64.zip.

@JohnM: No, there is actually no specific reason to use SPOOLES. I just use it because it “came with it”. As far as I know, PaStiX and Pardiso have to be compiled by the user, and I haven’t managed to do that yet.

JohnM · August 14, 2022, 10:03pm

If you are using Windows, there are precompiled versions on the CalculiX website, and you’re in luck: 2.20 is out. The link is easy to miss; it just says “files”. If you run ccx_dynamic.exe, it will default to PaStiX and switch to Pardiso for modal analysis. It runs in parallel.
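
Launching the precompiled executable with a fixed thread count could be scripted roughly as below. This is only a sketch under assumptions: the job name "beam" is hypothetical, the "-i jobname" call form is assumed to apply to this build, and OMP_NUM_THREADS is simply the variable mentioned earlier in this thread.

```python
import os
import subprocess


def build_launch(executable, job_name, threads):
    """Return (command, environment) for one solver run; nothing is executed here."""
    env = dict(os.environ)                  # copy, so the caller's environment is untouched
    env["OMP_NUM_THREADS"] = str(threads)   # variable named earlier in this thread
    command = [executable, "-i", job_name]  # "-i jobname" call form is an assumption
    return command, env


def run_job(executable, job_name, threads):
    """Run the solver and return its captured standard output as text."""
    command, env = build_launch(executable, job_name, threads)
    result = subprocess.run(command, env=env, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # "beam" is a hypothetical job (beam.inp in the current directory).
    command, env = build_launch("ccx_dynamic.exe", "beam", threads=6)
    print(command, env["OMP_NUM_THREADS"])
```

Setting the variable in a copied environment keeps the thread count local to that one run instead of changing it system-wide.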

JohnM · August 14, 2022, 11:25pm

http://www.dhondt.de/calculix_2.20_4win.zip

xyont · August 14, 2022, 11:57pm

Spooles is an old solver (1998) that has been tested for a long time, performs well and is reputable. It is better used as a reference for the newer solvers (e.g. PaStiX or Pardiso), in which some bugs or problems may still exist.

xyont · August 15, 2022, 12:41am

You can find a precompiled executable on GitHub; however, an update to the latest version of CalculiX is not available.

vicmw · August 15, 2022, 7:13am

Multithreaded Spooles produces wrong results for eigenvalue problems half the time. I don’t know if that’s Spooles itself or some dependency or CCX but I don’t go near multithreaded Spooles anymore.

xyont · August 15, 2022, 8:42am

I have heard about this with the MT version. It is left to the user to verify the results for every solver. Even for the TAUCS solver in commercial FE software, the developers still issue a notice.

Nobody-86 · August 15, 2022, 7:57pm

Thank you very much for all the answers. I downloaded the new version and it seems that now all cores are used for each operation. I have not generated results yet, but I will run some test models.

Hyp · August 17, 2022, 6:29am

Hi,

I am using Ubuntu 20.04 with the ccx_2.20 executable from http://www.dhondt.de/. By setting OMP_NUM_THREADS I get the stress calculation and the symmetric stiffness/mass contributions running on several cores, but not Spooles (only 1 core). This is the same as Nobody-86 described (but for Windows, I guess).

So, are there compiled versions for Linux anywhere out there that I could use in a similar easy way?

It would be even better to get Pastix and/or Pardiso included as I know Spooles cannot swap to the hard-drive and the other ones seem to be faster.

Thanks a lot in advance!

JohnM · August 17, 2022, 12:51pm

Unless you are using Ubuntu to create a Linux environment for some sort of coupled analysis, with OpenFOAM for example, I would suggest trying the Windows environment. The Pardiso and PaStiX solvers are both available, run on multiple processors and work well.

Hyp · August 17, 2022, 12:59pm

Hey, thanks for your feedback. I have a working Windows environment, but lately it was a little short of RAM for a bigger problem. As I have access to a Linux server, I wanted to try the calculation there. In the meantime I found an older thread, FEA Cluster, where a Linux Pardiso executable can be downloaded. I will try this out. In addition, I ordered some more RAM for my Windows machine.

xyont · August 17, 2022, 1:14pm

Right, it seems better to invest in more RAM and processor speed than in the number of cores or disk speed.

Some reports have shown that MKL Pardiso degrades in out-of-core mode. There is also no guarantee that it works when RAM is insufficient, due to problems in the read/write process.

MichaelPE · August 17, 2022, 8:51pm

I had some luck setting up a large page file on a fast SSD to ease RAM needs. It works well enough up to about 170% of the maximum working set (i.e. a working set of up to 110 GB for 64 GB of RAM), since much of the data and code is not used simultaneously. This is somewhat less true for PaStiX than Pardiso; for me PaStiX becomes slower than Pardiso for problems greater than 1,100,000 nodes.
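
The 170% figure above works out as follows. This is a small sketch in which the 1.70 ratio is just the rule of thumb reported in this post for one machine, not a general limit, and the function names are made up for illustration. For 64 GB the product is 108.8 GB, which the post rounds to about 110 GB.

```python
def max_working_set_gb(ram_gb, ratio=1.70):
    """Largest working set that still ran acceptably, using the ratio from this post."""
    return ram_gb * ratio


def paged_out_gb(working_set_gb, ram_gb):
    """Part of the working set that cannot fit in RAM and must live in the page file."""
    return max(0.0, working_set_gb - ram_gb)


if __name__ == "__main__":
    ram = 64.0
    limit = max_working_set_gb(ram)
    print(f"working set limit: {limit:.1f} GB")
    print(f"spilled to page file at that limit: {paged_out_gb(limit, ram):.1f} GB")
```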

Also, for Spooles as run inside Mecway, Victor provided a change to the Mecway 13 init file so that the internal Spooles would make swap files. I successfully ran a 4,000,000+ node problem, though it took nearly 24 hours.

xyont · August 17, 2022, 9:42pm

Thanks for sharing your experience. Does PaStiX stop running and fail to finish at some limit?

If I may ask, some details would help:

processor type: AMD or Intel based
analysis type: static, dynamic, cyclic
large deformation: activated or not
material: elastic or plastic
contact: exists or not, single/multi part, large/small sliding, friction

Thanks in advance.

MichaelPE · August 17, 2022, 10:01pm

In reply to xyont (August 17):

xyont: Does PaStiX stop running and fail to finish at some limit?
MichaelPE: About 500,000 nodes. Over that the 8i compilation is needed.

xyont: If I may ask, some details would help.

processor type (AMD or Intel based): AMD 3700X
analysis type (static, dynamic, cyclic): Static non-linear
large deformation (activated or not): not?
material (elastic or plastic): Compression only
contact (exists or not, single/multi part, large/small sliding, friction): No.

xyont · August 17, 2022, 11:07pm

I missed something: it concerns the MKL version, since many in the scientific community have reported poor performance on non-Intel processors. Some critical features (AVX) are deactivated or locked by the publisher.

I work with medium mesh sizes, but the models are often complex enough that they struggle and are hard to finish: static/cyclic, large-deformation elastoplastic, with many multipart contacts and large sliding. In my experience PaStiX converges faster than MKL.

I used MKL Pardiso for a long time before PaStiX was implemented, then tried to keep away from it after the locked-features issue.

xyont · December 26, 2023, 2:02pm

Hi, there’s an update of CalculiX with the Spooles MT solver from CFturbo. It seems many patches have been added.
