Introduction
If you think that this picture shows nice waiting seats at a modern, creatively designed bus station, you are very wrong. In fact it shows a truly iconic supercomputer of the 1980s, the Cray X-MP. Its peak performance was 0.8e9 FLOP/s, the fastest in the world from 1983 to 1985. I was fascinated as a schoolboy!
For comparison, an iPhone 11 from 2020 has around 5e12 FLOP/s. It also wins big on memory over the veteran from 40 years ago.
If you now think that performance and memory are a non-issue in these days of seemingly abundant hardware, you are very wrong again. Why so? Because neither Wirth’s law
Software is getting slower more rapidly than hardware is becoming faster.
nor Andy and Bill’s law
What Andy giveth, Bill taketh away.
applies in the context of FEA codes such as Abaqus, as I have shown in my previous post. Of course both are funny and often very true adages!
The real reason hardware still matters is that the typical reference for an ambitious (but still pretty standard) simulation is the overnight run: if hardware and software get faster, you will readily use up the gain by including more physical complexity or, simply, more finite elements.
Enter NAFEMS
Because of the misconception that computing resources are abundant (another counter-example is gaming), it seems that mesh refinement studies are something of a lost art. Rest assured that every support engineer in training I get my hands on as “drill sergeant” during the “boot camp” will be familiar with this NAFEMS benchmark. And so should you!
The problem description states that the reference solution is 92.7 MPa, easily verified by rebuilding the model geometrically in Abaqus/CAE and then using a “ridiculously” fine mesh (say, 0.002 mm edge length) of CPS8 elements. This results in 1.36 million elements, 8.19 million DOFs, and (mostly) a waste of 2.14e12 FLOPs (not FLOP/s; note the difference: one is a plural, the other a rate).
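To make the difference between FLOPs (work) and FLOP/s (rate) concrete, here is a minimal sketch that divides one by the other. It assumes the peak rates quoted in the introduction are sustained for the whole run, which no real solver achieves, so the numbers are idealised lower bounds and not measured run times. The function name is made up for this illustration.

```python
# Work (FLOPs) divided by rate (FLOP/s) gives an idealised run time.
ANXIETY_MESH_FLOPS = 2.14e12      # total operations, as quoted in the text

PEAK_RATES = {                    # peak FLOP/s, as quoted in the introduction
    "Cray X-MP": 0.8e9,
    "iPhone 11": 5e12,
}


def ideal_runtime_seconds(work_flops, rate_flops_per_s):
    """Lower bound on run time, assuming the peak rate is sustained throughout."""
    return work_flops / rate_flops_per_s


for machine, rate in PEAK_RATES.items():
    seconds = ideal_runtime_seconds(ANXIETY_MESH_FLOPS, rate)
    print(f"{machine}: at least {seconds:.3g} s")
```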
Enter Wetware
Now let us see how few FLOPs we can get away with if our target is to land within 1 per cent of the reference solution. Attention! Instead of the “anxiety mesh” (as I like to call a mesh that is way too fine because a clueless analyst was in fear of getting wrong results) we will now use our brain and think about the problem. Remember little Gauss from my previous post?
For the mesh sizes in such a study I usually use Renard’s numbers, a rounded geometric series progressing from 1 to 10 in 10 steps: 1, 1.2, 1.5, 2.0, 2.5, 3.0, 4.0, 5.0, 6.0, 8.0, 10.0. Multiply by powers of 10 as needed. These numbers come in handy for other parametric studies as well, e.g. geometric dimensions (their historic origin), explicit time increments, numerical fiddle factors … I employ them so often that I know them by heart.
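If you do not know them by heart, the series is easy to reproduce. The sketch below computes the exact geometric series with ratio 10^(1/10) and sets it beside the rounded values listed above. The rounded list is copied from the text, not derived by a rounding rule, and the helper names are my own.

```python
def renard_exact(steps=10):
    """Exact geometric series from 1 to 10 with constant ratio 10**(1/steps)."""
    return [10 ** (i / steps) for i in range(steps + 1)]


# Rounded values exactly as listed in the text.
RENARD_ROUNDED = [1, 1.2, 1.5, 2.0, 2.5, 3.0, 4.0, 5.0, 6.0, 8.0, 10.0]


def candidate_sizes(decade_exponent):
    """Scale the rounded series by a power of 10, e.g. -2 for 0.01 to 0.1."""
    return [r * 10.0 ** decade_exponent for r in RENARD_ROUNDED]


for exact, rounded in zip(renard_exact(), RENARD_ROUNDED):
    deviation = 100.0 * (rounded / exact - 1.0)
    print(f"exact {exact:6.3f}   rounded {rounded:5.1f}   deviation {deviation:+5.1f} %")
```

The rounded values stay within a few per cent of the exact series, which is more than good enough for choosing the next mesh size to try.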
CPS4R
We will start with a trusted workhorse, a linear reduced-integration element. It turns out that with a mesh size of 0.004 mm we can achieve our target accuracy. This requires 341000 elements, 684000 DOFs and 2.79e10 FLOPs.
CPS3
Maybe we can get away with even less effort by using cheaper elements, such as the linear triangle? Indeed, a mesh size of 0.02 mm is enough: 28200 elements, 28700 DOFs and 1.74e8 FLOPs.
CPS8R
If you paid attention in your finite element courses at university, or participated in our 2-day seminar about Element Selection in Abaqus, you should know that second-order shape functions are recommended for stress concentration problems such as ours. The result with the reduced-integration CPS8R is stunning: 0.08 mm mesh size, 836 elements, 5270 DOFs and 3.28e7 FLOPs will suffice.
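The element counts for the quadrilateral meshes are no surprise: in a uniform 2D mesh the number of elements grows with 1/h², where h is the mesh size. The following sketch extrapolates from the “anxiety mesh” figures and compares the estimates with the counts reported for CPS4R and CPS8R. It is a rough plausibility check under the assumption of a perfectly uniform mesh, not a replacement for actually meshing the part.

```python
REFERENCE_SIZE = 0.002         # "anxiety mesh" edge length, from the text
REFERENCE_ELEMENTS = 1.36e6    # element count at that size, from the text


def estimated_elements(mesh_size):
    """Estimate the element count of a uniform 2D quad mesh: N scales with 1/h**2."""
    return REFERENCE_ELEMENTS * (REFERENCE_SIZE / mesh_size) ** 2


# Mesh size -> element count as reported for the quadrilateral runs.
reported = {0.004: 341000, 0.08: 836}

for size, count in reported.items():
    estimate = estimated_elements(size)
    print(f"h = {size}: estimated {estimate:.0f}, reported {count}")
```

The small differences come from the mesher having to fit whole elements into a curved boundary, which matters more the coarser the mesh gets.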
CPS8R, but with biased local mesh seeds
To requote my finite element teacher here: “A model is the better, the more is known about the result.” Using this, we can capitalize on the previous section and now really exploit the fact that we know both that high stresses will occur and where: we will use a biased mesh with refinement towards the point of stress concentration and a very large global mesh size. And, no surprise, with a mesh size of 0.5 mm, biased to point D (see the documentation of this benchmark problem) with 0.08 mm (as before), a mere 53 elements, 376 DOFs and a measly 4.00e5 FLOPs, we get 91.9 MPa.
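To see why biasing saves so many elements, consider a single edge seeded from 0.08 at the hot spot to 0.5 at the far end. The sketch below assumes a geometric grading between the two sizes, which is one common way to bias seeds; it is not claimed to be what Abaqus/CAE does internally. The element count of 6 along the edge is a hypothetical value chosen for illustration, and the function name is my own.

```python
import math


def graded_sizes(min_size, max_size, count):
    """Element edge lengths growing geometrically from min_size to max_size."""
    if count < 2:
        raise ValueError("need at least two elements to grade an edge")
    growth = (max_size / min_size) ** (1.0 / (count - 1))
    return [min_size * growth ** i for i in range(count)]


# Sizes from the text; the count of 6 is a hypothetical illustration value.
sizes = graded_sizes(min_size=0.08, max_size=0.5, count=6)
edge_length = sum(sizes)

# Number of elements needed if the whole edge were seeded at the fine size.
uniform_count = math.ceil(edge_length / 0.08)

print("graded sizes:", ", ".join(f"{s:.3f}" for s in sizes))
print(f"graded: {len(sizes)} elements cover an edge of length {edge_length:.3f}")
print(f"uniform at 0.08: {uniform_count} elements for the same edge")
```

In 2D the saving compounds in both directions, which is how the element count can drop from 836 to 53.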
Summary
By using our brain we were able to land within 1 per cent of the reference solution, but with a numerical effort smaller by a factor of 5 million compared to the “anxiety mesh” (which is so fine that with element edges shown the results are completely obscured, see above). The iPhone is “only” about 6000 times faster than the Cray. Wetware beats hardware. You first heard it here.
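For those who like to check the arithmetic, the figures quoted throughout this post can be put side by side in a few lines. All numbers below are copied from the text; only the variable names are mine.

```python
REFERENCE_STRESS = 92.7   # MPa, reference solution
BIASED_STRESS = 91.9      # MPa, biased CPS8R mesh

FLOPS = {                 # total operations per run, as quoted in the text
    "CPS8 anxiety mesh": 2.14e12,
    "CPS4R": 2.79e10,
    "CPS3": 1.74e8,
    "CPS8R": 3.28e7,
    "CPS8R biased": 4.00e5,
}

relative_error = abs(BIASED_STRESS - REFERENCE_STRESS) / REFERENCE_STRESS
wetware_gain = FLOPS["CPS8 anxiety mesh"] / FLOPS["CPS8R biased"]
hardware_gain = 5e12 / 0.8e9   # iPhone 11 versus Cray X-MP peak rates

print(f"error of biased mesh: {100 * relative_error:.2f} %")
for name, work in FLOPS.items():
    print(f"{name:18s} {work:9.3g} FLOPs, saving factor {FLOPS['CPS8 anxiety mesh'] / work:.3g}")
print(f"wetware gain:  {wetware_gain:.3g}")
print(f"hardware gain: {hardware_gain:.3g}")
```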
P.S.: Of course even with the “anxiety mesh” the model runs within minutes on a laptop. That is not my point. The immense savings are worth having for larger problems, so think about your next overnight run.
Connect with Axel in the community.