Hi. I would like to simulate plastic deformation of metal. For this I need:
- solvers that scale well to many cores, like 32 or 64
- a stress-strain model that takes the influence of strain rate into account, like Johnson-Cook
- remeshing or an adaptive mesh (adaptive during the simulation)
and other things like nonlinear contact. I read that CalculiX has everything needed for simulations of plastic deformation of metal.
So to begin, I need answers to a few questions.
- Is CalculiX an independent solver, or does CalculiX contain solvers like Spooles and PaStiX?
- Which platform is better for faster computations, Linux or Windows?
- I understand I need a preprocessor and a postprocessor. So which software exactly should I download: CalculiX 2.2 from calculix.de, or PrePoMax, which contains the CalculiX solver?
- In which preprocessor can I activate mesh refinement (automatic during the simulation) using a button? Or would I have to type code to activate the refine mesh option?
- In which preprocessor can I activate the Johnson-Cook model, or would I have to type code and edit the .inp file?
Should it work like this? Download and install PrePoMax and CalculiX, then prepare the model (without mesh refinement and the Johnson-Cook model), then edit the .inp file by typing code, then open the previously edited file and run the simulation?
CalculiX is a finite element method solver; it uses libraries like Spooles, PARDISO and PaStiX to solve the large linear systems of equations typical for this method.
I don’t think there’s any rule like this. The performance of FEA solvers depends on many conditions but not really on the operating system. But keep in mind that OS may influence the choice of a pre- and postprocessor.
It’s up to you but PrePoMax is very intuitive. However, some features are not supported by it yet.
In PrePoMax you can add it using the Keyword Editor. It's just 2 lines to add.
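For reference, the two keyword lines in question might look like the sketch below. The LIMIT value and the controlling field name (S for von Mises stress) are placeholder examples here, so check the *REFINE MESH entry of the ccx manual for the exact options:

```
** Refine the mesh wherever the chosen field exceeds LIMIT
** (example values; see the *REFINE MESH section of the ccx manual)
*REFINE MESH, LIMIT=100.
S
```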
It might be best to wait for the next release of CalculiX (it should be coming soon) and use the JC model via keywords. Again, it's just a few lines to add with the Keyword Editor in PrePoMax.
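For orientation, a Johnson-Cook definition in Abaqus-style keyword syntax (which CalculiX input follows) looks roughly like the sketch below. All material constants are placeholder values, and whether the HARDENING=JOHNSON COOK and *RATE DEPENDENT cards are accepted depends on the CalculiX release, so verify against the manual of the version you install:

```
*MATERIAL, NAME=STEEL
*ELASTIC
210000., 0.3
** A, B, n, m, melting temperature, transition temperature (placeholders)
*PLASTIC, HARDENING=JOHNSON COOK
350., 275., 0.36, 1.0, 1800., 293.
** C, reference strain rate (placeholders)
*RATE DEPENDENT, TYPE=JOHNSON COOK
0.022, 1.
```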
As for the number of cores: I use an AMD CPU, and while CalculiX will use all cores available and assigned to it, performance maxes out with 4 to 6 cores in use. This may be a memory bandwidth issue, so a moderately high-end Intel with more memory channels would probably be faster and use more cores effectively. Also, the AVX-512 used on high-end Intel processors is reported to be faster than the AVX2 on most AMD processors. For large problems, though, lots of memory is the most important thing.
Thank you for your answer. I figured the best choice would be to wait for the new PrePoMax.
Thank you for your answer. So if a processor has 8-channel memory, would scalability be linear? For example, if I use 16 cores, will performance be 16x better than with 1 core, or less?
The new version of PrePoMax is already available; we are waiting for the new version of CalculiX.
No idea. It probably depends on many internal characteristics of the processor, like cache size. Probably faster, but not linearly; there are definitely diminishing returns with more cores. This is all just speculation about what the choke point was on my AMD 3700X system, as it was clearly not the number of cores: six cores were fastest, 4 cores very slightly slower, and 8 cores quite a bit slower. Odd numbers of cores were always slower. My problems were nonlinear with up to about 4,000,000 nodes and some use of virtual memory for the larger ones. Speed depends most strongly on the number of nodes, or rather degrees of freedom, and on how fast the solution converges, which is sensitive to the material model.
You mean the version called 2.21?
For better performance: try to install the maximum RAM supported and fill all the memory slots, no matter whether the processor is AMD or Intel. This lets the problem be solved in-core and avoids paging to disk (out-of-core).
In the case of large multipart contact analysis including plasticity, the PaStiX solver is known to be faster than MKL PARDISO in in-core mode.
Linux is recommended by many mainstream commercial FE solvers (Nastran, Ansys, Diana) due to its stability and performance; many report speeds up to 30% faster than on Windows.
PrePoMax is a good to great pre- and postprocessor; however, it runs on Windows only. Fortunately, this workflow can easily be adapted to the Netgen GUI mesher in a Linux environment. CalculiX CGX can directly read the resulting mesh, including the surface definitions for boundary conditions and contact.
Pre- and postprocessing do not dominate the total time in the case of automatic tetrahedral models and multipart contact analysis with plasticity. Modeling, meshing and input file preparation can be done in minutes to hours, but solver convergence can take hours to days depending on complexity.
The Mecway forum seems to have more on the speed of various systems. Below is my input from a few years back.
Max       PARDISO/MKL        PaStiX
Threads   Time (tot / CCX)   Time (tot / CCX)
2         1:13 / 1:07        -
4          :55 /  :49        :38 / :31
5          :56 /  :49        -
6          :52 /  :45.5      -
8          :52 /  :45.7      :35 / :28
This PaStiX tends to have undefined memory problems on large problems before the system runs out of available RAM. But it is fast. I tend to run with 4 threads, as the cores run 28% less for the same problem. Intel chips of various varieties may behave differently.
As you can see, 4 threads is only 37% faster than 2 threads, and 6 threads is only 47% faster.
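Those percentages can be reproduced from the CCX column of the PARDISO/MKL times in the table, assuming the clock times are read as seconds (1:07 = 67 s); a quick sketch:

```python
# Speedups from the benchmark table above, using the CCX solver times
# of the PARDISO/MKL column (assumed interpretation: 1:07 = 67 s, etc.).
ccx_seconds = {2: 67.0, 4: 49.0, 5: 49.0, 6: 45.5, 8: 45.7}

base = ccx_seconds[2]
for threads in (4, 6):
    pct_faster = (base / ccx_seconds[threads] - 1.0) * 100.0
    print(f"{threads} threads: {pct_faster:.0f}% faster than 2 threads")
```

This prints 37% for 4 threads and 47% for 6 threads, matching the figures quoted above.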
Note that PARDISO and PaStiX are solvers used by CalculiX. MKL is a set of free proprietary DLLs from Intel for high-performance computing, needed by CalculiX when using PARDISO. Mecway is, in this case, a graphics-based pre- and postprocessor, though it can do a lot internally.
PaStiX needs a recompile with i8 (64-bit integers) for large problems, but for me its speed advantage fades away there, as the memory needed is larger than for PARDISO. PaStiX can also use a GPU via CUDA, which can speed up some things if your Nvidia GPU has high double-precision performance. I don't use this feature, as my high-performance GPU is an AMD. Much of PaStiX's speedup, however, comes from using single precision for the first iterations before switching to double.
I read that the CalculiX solver is very similar to the Abaqus solver, so I expect scalability similar to Abaqus.
Did I understand correctly? If I want to use, for example, two processors with 16 physical cores each and the PaStiX solver, I need to recompile it before using it for problems containing 5,000,000 to 10,000,000 elements?
Yes, it will likely be named this way.
Definitely not. They only share similar keyword syntax, but CalculiX was built from scratch, not based on Abaqus source code. Abaqus has been developed for many years by a large company, while CalculiX is pretty much one man's work. Thus, their scalability can be totally different, especially when it comes to nonlinear analyses with contact, and even more so in the case of explicit dynamics.
If I had, for example, a Xeon Gold with 8-channel memory and a lot of RAM, would linear scalability of performance up to 32 physical cores be possible in nonlinear plastic deformation problems with mesh refinement? Or near-linear scalability in such problems?
Ah! Different use of the term node. I was using the term for a control point on an FEA element; this promo sheet uses node for a computer in a cluster of computers. For a single computer, the 4,000,000 nodes I referred to is probably between 1,000,000 and 3,000,000 elements. Memory needs probably increase with about the cube of the number of nodes, perhaps a minimum of 256 GB, up to 2,000 GB for the problem size you are looking at. My largest problem is about 4,080,000 FEA nodes for a 3D solid model of a bridge span comprising about 980,000 20-node elements. Most of my models are of 3-span units with a coarser mesh and only 1,872,000 nodes. 3D elements can have between 4 and 20 nodes, though the nodes are shared with adjacent elements. I am using 20-node brick elements because they fit what I am modeling, seem efficient with little error, and have few issues. They do not refine locally well, something tetrahedral elements are better at, though those usually need a finer mesh. If you really need 10,000,000 elements, the server node described in your attachment is probably about right, with as much memory as you can afford.
I understand, but do you think we can expect linear scalability of performance when I have, for example, a Xeon with 8-channel memory?
Indeed, or the element count can be reduced by using linear hexahedral elements with incompatible modes.
But, unfortunately, not every model can be meshed automatically with presets and a single click. Hex-dominant meshing attracts many users and promises to simplify the task.
It seems linear tetrahedral elements seriously need improvement to fill this gap.
Performance will never scale linearly with the number of processors, never, forget it. Between 8 and 12 cores, the time-vs-core-count curve will stabilize, and additional processors will add very little improvement.
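The plateau described here is what Amdahl's law predicts: if a fraction p of the run is parallelizable, n cores can never give more than 1/(1-p) speedup overall. A small illustration (the 90% parallel fraction below is an assumed figure for illustration, not measured from CalculiX):

```python
def amdahl_speedup(p, n):
    """Amdahl's law: overall speedup on n cores if fraction p is parallel."""
    return 1.0 / ((1.0 - p) + p / n)

# Even a 90%-parallel code tops out well below the core count:
for n in (1, 2, 4, 8, 16, 32):
    print(f"{n:2d} cores: {amdahl_speedup(0.9, n):.2f}x")
```

With p = 0.9 the speedup saturates below 10x no matter how many cores are added, which is consistent with the 8 to 12 core plateau seen in practice.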
I understand, so the scalability of this processor is marketing bullshit?
Can I also get a link to the article?