Strange results for a simple beam

Hello,

I modeled a simple beam and got very different results using two ccx executables.
The good (I assume) results came from the one that comes with PrePoMax, and the bad results came from the executable found here.
I also compiled the current version from Dhondtguido’s repo, but the results are identically bad.
Can somebody explain what is happening? Thank you in advance!

The input file:

*INCLUDE,INPUT=msh.inp

*MATERIAL,NAME=ABSISO
*ELASTIC,TYPE=ISO
2560, 0.4, 294

*SOLID SECTION,ELSET=ignore,MATERIAL=ABSISO

*NSET, NSET=FIXED
1, 2, 3, 4, 9, 10, 11, 12, 93, 94, 95, 96, 97
*NSET, NSET=FORCE
6

*STEP
*STATIC

*BOUNDARY
FIXED, 1, 1, 0
FIXED, 2, 2, 0
FIXED, 3, 3, 0
FIXED, 4, 4, 0
FIXED, 5, 5, 0
FIXED, 6, 6, 0

*CLOAD
FORCE, 3, -10

*NODE FILE
U

*EL FILE
S, E

*END STEP 

The output of the bad session:

************************************************************

CalculiX Version 2.22, Copyright(C) 1998-2024 Guido Dhondt
CalculiX comes with ABSOLUTELY NO WARRANTY. This is free
software, and you are welcome to redistribute it under
certain conditions, see gpl.htm

************************************************************

You are using an executable made on Wed Sep 11 19:41:34 2024

  The numbers below are estimated upper bounds

  number of:

   nodes:          441
   elements:          186
   one-dimensional elements:            0
   two-dimensional elements:            0
   integration points per element:            4
   degrees of freedom per node:            3
   layers per element:            1

   distributed facial loads:            0
   distributed volumetric loads:            0
   concentrated loads:            1
   single point constraints:           78
   multiple point constraints:            1
   terms in all multiple point constraints:            1
   tie constraints:            0
   dependent nodes tied by cyclic constraints:            0
   dependent nodes in pre-tension constraints:            0

   sets:            4
   terms in all sets:          572

   materials:            1
   constants per material and temperature:            2
   temperature points per material:            1
   plastic data points per material:            0

   orientations:            0
   amplitudes:            2
   data points in all amplitudes:            2
   print requests:            0
   transformations:            0
   property cards:            0


 STEP            1

 Static analysis was selected

 Decascading the MPC's

 Determining the structure of the matrix:
 Using up to 1 cpu(s) for setting up the structure of the matrix.
 number of equations
 1284
 number of nonzero lower triangular matrix elements
 38958

 Using up to 1 cpu(s) for the stress calculation.

 Using up to 1 cpu(s) for the symmetric stiffness/mass contributions.

 Factoring the system of equations using the symmetric spooles solver
 Using 1 cpu for spooles.

 Using up to 1 cpu(s) for the stress calculation.


 Job finished

________________________________________

Total CalculiX Time: 0.169244
________________________________________

And the good one:

************************************************************

CalculiX Version 2.22, Copyright(C) 1998-2024 Guido Dhondt
CalculiX comes with ABSOLUTELY NO WARRANTY. This is free
software, and you are welcome to redistribute it under
certain conditions, see gpl.htm

************************************************************

You are using an executable made on Sun Aug  4 19:44:24     2024

  The numbers below are estimated upper bounds

  number of:

   nodes:          441
   elements:          186
   one-dimensional elements:            0
   two-dimensional elements:            0
   integration points per element:            4
   degrees of freedom per node:            3
   layers per element:            1

   distributed facial loads:            0
   distributed volumetric loads:            0
   concentrated loads:            1
   single point constraints:           78
   multiple point constraints:            1
   terms in all multiple point constraints:            1
   tie constraints:            0
   dependent nodes tied by cyclic constraints:            0
   dependent nodes in pre-tension constraints:            0

   sets:            4
   terms in all sets:          572

   materials:            1
   constants per material and temperature:            2
   temperature points per material:            1
   plastic data points per material:            0

   orientations:            0
   amplitudes:            2
   data points in all amplitudes:            2
   print requests:            0
   transformations:            0
   property cards:            0


 STEP            1

 Static analysis was selected

 Decascading the MPC's

 Determining the structure of the matrix:
 Using up to 1 cpu(s) for setting up the structure of the matrix.
 number of equations
 1284
 number of nonzero lower triangular matrix elements
 38958

 Using up to 1 cpu(s) for the stress calculation.

 Using up to 1 cpu(s) for the symmetric stiffness/mass contributions.

Not reusing csc.
+-------------------------------------------------+
+     PaStiX : Parallel Sparse matriX package     +
+-------------------------------------------------+
  Version:                                   6.0.1
  Schedulers:
    sequential:                            Enabled
    thread static:                         Started
    thread dynamic:                       Disabled
    PaRSEC:                               Disabled
    StarPU:                               Disabled
  Number of MPI processes:                       1
  Number of threads per process:                 1
  Number of GPUs:                                0
  MPI communication support:              Disabled
  Distribution level:                     2D( 128)
  Blocking size (min/max):             1024 / 2048

  Matrix type:  General
  Arithmetic:   Float
  Format:       CSC
  N:            1284
  nnz:          79200

+-------------------------------------------------+
  Ordering step :
    Ordering method is: Scotch
    Time to compute ordering:              0.0067
+-------------------------------------------------+
  Symbolic factorization step:
    Symbol factorization using: Fax Direct
    Number of nonzeroes in L structure:      90510
    Fill-in of L:                         1.142803
    Time to compute symbol matrix:        0.0057
+-------------------------------------------------+
  Reordering step:
    Split level:                                 0
    Stoping criteria:                           -1
    Time for reordering:                  0.0010
+-------------------------------------------------+
  Analyse step:
    Number of non-zeroes in blocked L:      181020
    Fill-in:                              2.285606
    Number of operations in full-rank LU   :    14.02 MFlops
    Prediction:
      Model:                             AMD 6180  MKL
      Time to factorize:                  0.0089
    Time for analyze:                     0.0026
+-------------------------------------------------+
  Factorization step:
    Factorization used: LU
    Time to initialize internal csc:      0.0049
    Time to initialize coeftab:           0.0005
    Time to factorize:                    0.0014  ( 9.73 GFlop/s)
    Number of operations:                      14.02 MFlops
    Number of static pivots:                     0
    Time to solve:                        0.0003
    - iteration 1 :
         total iteration time                   0.000311
         error                                  0.00014417
    - iteration 2 :
         total iteration time                   0.000368
         error                                  1.6783e-06
    - iteration 3 :
         total iteration time                   0.00049
         error                                  1.1417e-08
    - iteration 4 :
         total iteration time                   0.000346
         error                                  3.3638e-11
    - iteration 5 :
         total iteration time                   0.000374
         error                                  1.1274e-13
    Time for refinement:                  0.0303
    - iteration 1 :
         total iteration time                   0.000491
         error                                  1.0474e-13
    Time for refinement:                  0.0054
________________________________________

CSC Conversion Time: 0.001502
Init Time: 0.259605
Factorize Time: 0.020848
Solve Time: 0.041411
Clean up Time: 0.000000
---------------------------------
Sum: 0.323366

Total PaStiX Time: 0.323366
CCX without PaStiX Time: 0.098722
Share of PaStiX Time: 0.766111
Total Time: 0.422088
Reusability: 0 : 1
________________________________________

 Using up to 1 cpu(s) for the stress calculation.


 Job finished

________________________________________

Total CalculiX Time: 0.456052
________________________________________

I’ve uploaded the input files here: ccx_bug.zip

Try setting the PASTIX_MIXED_PRECISION environment variable to 0, or use Pardiso instead of PaStiX as the matrix solver.
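For anyone who wants to try the environment-variable suggestion without changing their shell setup, here is a minimal Python sketch. The helper names (solver_env, run_ccx) and the executable path are hypothetical, and it assumes the ccx build accepts the usual "-i jobname" invocation; adjust to your installation.

```python
import os
import subprocess


def solver_env(mixed_precision=False, base_env=None):
    """Return a copy of the environment with PASTIX_MIXED_PRECISION set.

    The variable name comes from the suggestion above; the caller's
    environment is copied, never modified in place.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["PASTIX_MIXED_PRECISION"] = "1" if mixed_precision else "0"
    return env


def run_ccx(ccx_path, jobname, env):
    """Run CalculiX on <jobname>.inp and return its captured stdout.

    Assumption: the executable is started as `ccx -i jobname`
    (job name given without the .inp extension).
    """
    result = subprocess.run(
        [ccx_path, "-i", jobname],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Example call (hypothetical path and job name, not executed here):
# log = run_ccx("/usr/local/bin/ccx", "beam", solver_env(mixed_precision=False))
```

The variable only has an effect in a PaStiX-enabled build; a SPOOLES-only executable will simply ignore it.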

I get the “good” result using both the SPOOLES and PASTIX solvers…

Having built CalculiX with both the SPOOLES and PaStiX solvers (you can find the story of the latter in this topic), I have found that the options used to build the libraries CalculiX depends on can matter a lot. For example, in that topic I eventually discovered that I had to build OpenBLAS without threading but with locking, because CalculiX calls it from multiple threads.

Your “bad” version is using SPOOLES. As you can see in this repo, I had to apply 10 patches to the SPOOLES source code. Most of those patches originated from the FreeBSD ports tree; I added a few more to silence warnings. So the issue could be caused by the way the SPOOLES or BLAS libraries were built. Unfortunately, you cannot easily tell that from the resulting binary.
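Which solver a given run used can be read from the ccx screen output, as in the two logs above: the "bad" run prints "Factoring the system of equations using the symmetric spooles solver", while the "good" run prints the PaStiX banner. A small sketch that checks a saved log for those two markers (the function name detect_solver is hypothetical, and only the strings visible in this thread are matched):

```python
def detect_solver(log_text):
    """Guess the linear solver from CalculiX screen output.

    Only the two markers seen in the logs of this thread are checked;
    any other solver is reported as "unknown".
    """
    lowered = log_text.lower()
    if "pastix" in lowered:
        return "PaStiX"
    if "spooles" in lowered:
        return "SPOOLES"
    return "unknown"


# Example with a line copied from the "bad" session:
line = " Factoring the system of equations using the symmetric spooles solver"
print(detect_solver(line))
```

This identifies the solver only, not the options the solver or BLAS library was built with.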

Thank you! I recompiled CalculiX without multi-threading and it produces the good results now.
I suspected that multi-threading would bite me; the single-threaded build is good enough for now.

I only use SPOOLES for now: I did not want to overcomplicate the build process, and I’m only solving toy problems. But I’ll keep your suggestion in mind for later.