Design & Simulation | December 14, 2022

Hardware, Software, and Wetware: A Story of Performance Tuning

An intellectual roller coaster ride! A story of Performance Tuning.
Axel Reichert

Introduction

Have you ever heard of Rudy Rucker? If not: he is the great-great-great-grandson of Hegel, the famous German philosopher. That is not too interesting in itself, more of a piece of trivia. It gets more exciting once you learn that Rucker (an American) is a mathematician, a computer scientist, and, as a science fiction author, one of the founders of “cyberpunk”. His non-fiction works include “Infinity and the Mind”, “The Fourth Dimension”, and “Mind Tools”, all located at the intersection of mathematics, computer science, philosophy, and formal logic. All are highly inspiring and thus highly recommended, but be prepared for an intellectual roller coaster ride! His fictional “Ware Tetralogy” includes the two novels “Software” and “Wetware”, hence the title of this article.

Enter Gauss

Now you all know the story of little Gauss at school, where the teacher tried to occupy the children of the class by having them sum the numbers from 1 to 100. Before the teacher had a chance to leave the classroom, Carl Friedrich was done and presented the correct solution of 5050 on his slate. He had “seen” that

1 + 2 + … + 50 + 51 + … + 99 + 100

could be regrouped to

(1 + 100) + (2 + 99) + … + (50 + 51)

and so just had to multiply the sum of each pair (101) by the number of pairs (50). By the way, it was not Gauss who discovered this. The fact had already been known in pre-Greek mathematics.
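The regrouping can be checked in a few lines. This is only a sketch of the pairing idea in plain Python 3, run outside Abaqus/CAE; the names `n`, `pairs`, `pair_sums`, and `total` are made up for illustration.

```python
# Pair the first number with the last, the second with the second-to-last, ...
n = 100
pairs = [(k, n + 1 - k) for k in range(1, n // 2 + 1)]

# Every pair adds up to the same value, so a set collapses to one element.
pair_sums = {a + b for a, b in pairs}

# Sum of each pair times number of pairs.
total = len(pairs) * (n + 1)

print(pairs[0], pairs[-1])            # (1, 100) (50, 51)
print(len(pairs), pair_sums, total)   # 50 {101} 5050
```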

Perhaps you would feel tempted nowadays to use a computer for this. Try this in the Python prompt of Abaqus/CAE:

del sum  # to use the pure Python version of "sum", not the Abaqus one
sum(range(1 + 10**2))

gives, of course, 5050. Now try

sum(range(1 + 10**8))

It takes quite some time before the prompt returns with

5000000050000000L

Some performance tuning? First, Hardware. Getting a faster machine might save a few percent. Second, Software. Using a faster programming language such as C (Python is terribly slow) might give you a speedup factor of 100. Much better. But once we increase the exponent from 8 to 100 (to sum up to a googol) in the example above, we will again be out of luck. Third, Wetware. Using the “little Gauss”, as German mathematicians call it in mock reference to the “big Gauss” (his fundamental theorem of algebra), the answer is trivial and obtained very quickly:

50000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000005000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

(That is a 5, 99 zeros, another 5, and another 99 zeros.)

So Wetware beats Software beats Hardware.
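For the record, here is what the Wetware solution might look like once it is handed back to the Software. This is a minimal sketch in plain Python 3; the function name `gauss_sum` is an invention for this example. Python integers have arbitrary precision, so the googol case is exact.

```python
def gauss_sum(n):
    """Sum of 1..n via the pairing trick, n * (n + 1) / 2, in exact integers."""
    return n * (n + 1) // 2

# Agrees with the brute-force results quoted above.
assert gauss_sum(100) == sum(range(1, 101)) == 5050
assert gauss_sum(10**8) == 5000000050000000

# Summing up to a googol returns immediately.
googol_sum = gauss_sum(10**100)
digits = str(googol_sum)
print(len(digits))                                  # 200
print(digits == "5" + "0" * 99 + "5" + "0" * 99)    # True
```

The loop needs a number of additions proportional to n, while the closed form needs one multiplication, one addition, and one division regardless of n.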

Enter Abaqus

Last week I had a model on my desk about which the customer claimed that it prevented him from using newer Abaqus versions, because “they are slower”. Now that sounded pretty unlikely; after all, every update seminar in every year shows dozens of slides about performance enhancements in various areas of the code.

The job was run on 2, 4, and 8 cores (more did not make sense given the small model size) on identical hardware with the customer’s version, 6.14, which is 8 years old (!), and the current 2022. The run times in hours:

Cores   Abaqus 6.14   Abaqus 2022
2       8.15          7.03
4       4.62          3.98
8       2.68          2.30

So the current version is about 14 percent faster, regardless of the number of cores. Case closed? Not yet: I wanted to see what Wetware would be able to achieve.
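The 14 percent can be reproduced from the run times above if “faster” is read as the reduction in wall-clock time relative to 6.14. That reading is an assumption on my part. The names `runtimes`, `percent_saved`, and `savings` are made up for this sketch.

```python
# Run times in hours from the table: cores -> (Abaqus 6.14, Abaqus 2022)
runtimes = {2: (8.15, 7.03), 4: (4.62, 3.98), 8: (2.68, 2.30)}

def percent_saved(old, new):
    """Reduction in wall-clock time, in percent of the old run time."""
    return 100.0 * (old - new) / old

savings = {cores: round(percent_saved(old, new), 1)
           for cores, (old, new) in runtimes.items()}
print(savings)  # {2: 13.7, 4: 13.9, 8: 14.2}
```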

.inp file

The customer’s model was nothing special, a simple and rather small middle-of-the-road Abaqus analysis:

  • ~10 *PART
  • 2 *ELEMENT, TYPE=C3D8R
  • *ELEMENT, TYPE=C3D10M for the rest
  • *ELASTIC, *PLASTIC
  • 10 *TIE
  • 5 *CONTACT PAIR
  • *FRICTION 0.15
  • *STATIC, NLGEOM=YES
  • Non-zero *BOUNDARY on a *PRE-TENSION SECTION node
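An inventory like the one above does not have to be compiled by hand. The following is a minimal sketch in plain Python 3, relying only on the input file convention that keyword lines start with “*” and comment lines with “**”. The function name `count_keywords` and the sample text are hypothetical and not taken from the customer’s model.

```python
from collections import Counter

def count_keywords(inp_text):
    """Count Abaqus keyword lines (starting with '*', but not '**' comments)."""
    counts = Counter()
    for line in inp_text.splitlines():
        line = line.strip()
        if line.startswith("*") and not line.startswith("**"):
            # The keyword is everything before the first comma; case is ignored.
            keyword = line.split(",")[0].strip().upper()
            counts[keyword] += 1
    return counts

# Hypothetical excerpt for illustration only.
sample = """\
** hypothetical excerpt, not the customer's model
*Part, name=BOLT
*Element, type=C3D10M
*End Part
*Tie, name=T1, adjust=yes
*Tie, name=T2, adjust=yes
*Contact Pair, interaction=FRIC
*Friction
0.15
"""

counts = count_keywords(sample)
print(counts["*TIE"], counts["*CONTACT PAIR"])  # 2 1
```

For a real model one would read the text with `open(path).read()` and pass it to the same function.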

.dat file

  • For 2 *TIEs the secondary surface was less fine than the main surface
  • 150000 elements
  • 1000000 DOFs

.msg file

  • 50 increments
  • 440 iterations

.sta file

  • Too large initial time increment resulting in cut-back
  • Several increments with 10 equilibrium iterations or more

Call in the “CUND” Squad

  • Changed C3D10M to C3D10 where element quality criteria permitted
  • Set ADJUST=NO on all *TIEs (used to be a problem many years back, but no more)
  • Swapped 2 *TIEs
  • Converted *CONTACT PAIR, TIED to *TIE
  • *CONTACT CONTROLS, STABILIZE
  • Used my “Qonvergence Quartet” (CUND):
    • *CONTACT
    • *STEP, UNSYMM=YES
    • *STEP, NLGEOM=YES (was already active)
    • *DYNAMIC, APPLICATION=QUASI-STATIC

Results

  • 38 increments (formerly 50)
  • 208 iterations (formerly 440)
  • 1.66 h (formerly 2.30 h)

So the better model ran 28 percent faster. Again, Wetware beats Software. Case closed.
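The same arithmetic as before, applied to the before and after figures of the 8-core run with Abaqus 2022. This is a sketch in plain Python 3, and the dictionary names are made up for illustration.

```python
before = {"increments": 50, "iterations": 440, "hours": 2.30}
after = {"increments": 38, "iterations": 208, "hours": 1.66}

# Relative reduction of each figure, in percent of the original value.
reduction = {key: round(100.0 * (before[key] - after[key]) / before[key], 1)
             for key in before}
print(reduction)  # {'increments': 24.0, 'iterations': 52.7, 'hours': 27.8}

# Average iterations per increment, a rough indicator of convergence behavior.
print(round(before["iterations"] / before["increments"], 1))  # 8.8
print(round(after["iterations"] / after["increments"], 1))    # 5.5
```

The iteration count drops by more than half, while the run time drops by 28 percent, so the average iteration in the reworked model appears to cost somewhat more than before.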

Connect with Axel in the community.



