I had a look at the source code. You probably know this already, but the two routines allocation.f and calinput.f are slow when a model has many element sets (because of their loops).
The following timings are for a model with 280k element sets (times in seconds):
total_time;2385.17334
readinput;1.39676
fort_allocation;170.82869
fort_calinput;527.75238
init_var;0.00001
descascade;0.00000
det_struct_mat;6.15488
linstatic_stress1;0.18480
linstatic_stiffness;2.77373
linstatic_stress2;5.35357
spooles_factoring_MT;1642.99561
spooles_solve;14.96334
spooles_cleanup;8.86842
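To see where the time goes, the list above can be turned into percentages of total_time. This is only a small sketch: the helper name `time_shares` is made up for illustration, and it assumes the `name;seconds` layout shown above with `total_time` as one of the entries.

```python
# Timings copied from the list above, in "name;seconds" form.
TIMINGS = """\
total_time;2385.17334
readinput;1.39676
fort_allocation;170.82869
fort_calinput;527.75238
init_var;0.00001
descascade;0.00000
det_struct_mat;6.15488
linstatic_stress1;0.18480
linstatic_stiffness;2.77373
linstatic_stress2;5.35357
spooles_factoring_MT;1642.99561
spooles_solve;14.96334
spooles_cleanup;8.86842
"""


def time_shares(text):
    """Return {phase: percent of total_time} for 'name;seconds' lines.

    Hypothetical helper, not part of the solver or its tooling.
    """
    times = {}
    for line in text.strip().splitlines():
        name, seconds = line.split(";")
        times[name.strip()] = float(seconds)
    total = times.pop("total_time")
    return {name: 100.0 * t / total for name, t in times.items()}


shares = time_shares(TIMINGS)

# Print the phases from most to least expensive.
for name, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:22s} {pct:6.2f} %")
```

With these numbers, fort_allocation and fort_calinput together take roughly 29% of the run, and the SPOOLES factorization roughly 69%.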
The approach using temperature values instead of element sets would work and is a lot faster, but there are some issues with it that I don't understand, so I have to live with the poor performance for now.
I have an idea for improving the performance with user-defined materials, but I haven't implemented it yet. I will report back once I have.