TRACE User Guide  TRACE Version 9.7.5
Best Practice Guidelines

The guidelines in the following sections can be helpful in choosing the right settings and models for generic cases. Those might not be optimal for any specific geometry, mesh or simulation goal, but should provide a sensible setup for most cases.

Specifically covered in this guide are:

General solver settings

TRACE is a density based solver for the Navier-Stokes equations designed specifically for application on turbomachinery. It works best for Mach numbers between 0.1 and 2.5. With the optional low-Mach preconditioning it is possible to extend this range below 0.1, e.g. for the simulation of cavities. The most general use case is the steady simulation of a turbomachinery component. The major settings for this simulation type will be briefly described in this section and can also provide a basis for choosing sensible settings in other solver modes.

Spatial discretization scheme for structured grids

Several discretization schemes are available. The recommended setting is a 2nd order Fromm scheme, which provides the best compromise between stability and accuracy. The first order scheme should not be used for results, but can give a good starting point for a 2nd order scheme for an otherwise unstable simulation. For cases with larger spatial temperature gradients, e.g. when computing flows with film cooling, small oscillations of the temperature may appear. If those oscillations are a relevant problem, they can be suppressed by using the vanAlbada1 limiter. This limiter is slightly less accurate than the default limiter for 2nd order schemes, VanAlbadaSqr. Therefore, we recommend switching limiters only if there is a problem.

Scheme First order Fully upwind Fromm Third Order
Pros best stability very stable
medium accuracy
good stability
good accuracy
highest accuracy
Cons very low accuracy medium accuracy - low stability
Order 1st order 2nd order 2nd order 3rd order
Limiter - MinMod VanAlbadaSqr none
Entropy-Fix 0.075 0.02 - 0.1 transition: 0.001
low-Re: 0.01
wall function: 0.075
0.001 - 0.01

Spatial discretization scheme for unstructured grids

For unstructured grids, either a first or second order scheme can be selected. Again, second order is recommended, first order may be useful for stability at the start of the simulation. The entropy fix value can be taken from the table above. A gradient reconstruction method must also be selected. LeastSquaresFA is recommended for better accuracy and stability. All other options provide only a small speed-up at the expense of accuracy and stability. The Venkatarkrishnan limiter is recommended. The limiter can be made more restrictive by lowering the argument of SetGradientLimiter if the solution shows spatial oscillations. The default value is 100.

Pseudotime discretization

Two schemes are available, Gauss-Seidel and predictor-corrector. The predictor-corrector scheme is recommended, with a CFL number of 50 - 250. A sensible initial guess would be 100. Setting a high initial CFL number might prove unstable. Therefore, a CFL ramp should be used with an increase of 0.5 - 1 per time step. If the simulation is unstable, lowering the CFL number is recommended. CFL numbers above 250 rarely provide any additional speed-up.

The Gauss-Seidel usually does not allow for high CFL numbers, 25 - 50 is a good initial guess. Due to those low CFL numbers the predictor-corrector scheme is faster in most cases. But for equal CFL numbers Gauss-Seidel is slightly faster. It may be useful to give it a try for simulations with already low CFL numbers. Otherwise, use the predictor-corrector scheme.

Some control file options are recommended to accelerate the iterations and improve convergence. More information can be found in the linked documentation:

Steady boundary conditions

At inlets and outlets the "Steady 2D" boundary condition is the best option for most simulations done with TRACE. It is the default option in GMC and is suitable for all cases where stripes of cell faces (bands) can be defined either circumferentially aligned at constant radii for rotational cases or in y-direction for translational cases. Furthermore, the solution has to be periodic along those bands. If those prerequisites are not met, the "Steady 1D Characteristics" option should be used. This boundary condition can lead to reflections at the boundaries. If a 2D field is to be prescribed at an inlet or an outlet, the Riemann boundary condition has to be used.

At viscous walls some y+ limits should be known. In general, for low Reynolds grids the y+ should be 1 or less. For wall functions a y+ of 30 or higher is considered optimal, though lower values should be modeled adequately.

For more advanced options (e.g. roughness, bleed, etc.) please refer to the respective entries in the user guide.

Unsteady solver settings

Use a converged steady-state simulation to start an unsteady simulation. The recommended spatial discretization settings are identical to those used in the steady state simulations.

Time discretization

For most cases the BDF2 time discretization scheme is recommended. In GMC it is available under the misnomer "Euler Backward 2nd order". A less robust, but possibly faster and more accurate alternative is an implicit Runge-Kutta scheme of 3rd or 4th order. A single iteration will be more expensive but fewer should be necessary. For both schemes the predictor-corrector should be used. The explicit Runge-Kutta scheme is recommended for scale resolving simulations or when very small time steps are required.

For the implicit solver schemes a dual time stepping approach is used. Therefore, a time step and a pseudo time step have to be given. The physical time step can either be given directly or specified as a number of time steps per period. The latter is recommended for turbomachinery configurations. A period is defined by size of the segments given in the mesh and not by the number of blades in the machine. This makes it difficult to recommend any value, since it is inherently grid dependent. A high number of time steps is generally desirable for accuracy, but can significantly increase computational cost. To give some very rough approximates, for a URANS simulation designed to be accurate, about 60 - 100 time steps per simulated blade interaction are required. If multiple blades per row or very large passages are simulated, that number has to increase proportionally. If a fast, but more dissipative, simulation is intended fewer time steps might be sufficient. Although a higher number generally increases stability.

A good initial guess for the CFL number when using BDF2 is around 50 and can then be changed as necessary. In general, the lowest pseudotime step computed in any cell with that CFL number multiplied with the number of sub iterations should be larger than the physical time step. For many cases it is not useful to calculate more than 20 - 40 sub-iterations. If TRACE gives a warning that the pseudo time step is too small in several cells, it is usually more effective to increase the number of physical time steps than to calculate more sub-iterations. In general the setup of an URANS simulation, without any experience for similar cases, requires some tweaking. Therefore, it is advisable to check that the residuals of the sub-iterations decrease several orders of magnitude. If not, first increase the number of physical time steps per period, then increase the sub-iterations.

Unsteady boundary conditions

Unsteady boundary conditions are strongly recommended for unsteady time domain simulations, although steady boundary conditions are also available. Steady boundary conditions cause reflections since they do not allow temporal fluctuations of boundary states. The prerequisites for 2D boundary conditions are identical to those found in Steady boundary conditions. In general, the unsteady 2D frequency domain boundary conditions are most accurate and consistent with those used in harmonic balance. However, they require temporally periodic flows and for very large cases the memory requirements may prohibit their use. If the unsteady 2D frequency domain boundary conditions can not be used, the unsteady 2D time domain boundary conditions are a suitable alternative. If 2D boundary conditions can not be used at all, unsteady 1D boundary conditions are also available. Those are recommended for scale resolving simulations.

Turbulence and transition modeling

Several turbulence models are implemented in TRACE. This guide will focus on the two most used models, Wilcox's \(k-\omega\) model and Menter's SST model. Both are equally viable in TRACE, but the SST model is generally considered somewhat better at predicting size and location of separation bubbles.

Both models have problems predicting turbulence in stagnation points, like those at leading edges. Therefore, it is strongly recommended to use a stagnation point anomaly fix. To model some effects of rotation on turbulence three different extensions are available. For off-design compressor simulations the model by Bardina can be recommended. Modeling compression effects is only relevant for Ma > 1.6. In general, the solution method ILU is recommended.

Two kinds of transition models are available in TRACE. The algebraic MultiMode model was specifically designed for transition on airfoils in turbomachinery, but can't give accurate predictions for different flows. For the model to work correctly, the grid can not be splitted into blocks arbitrarily. The complete boundary layer must fit within the wall adjacent block. Usually, the blade would be divided into a suction and pressure side panel with only one corresponding block, respectively. The transport equation based Gamma-ReTheta model is in common use by many CFD solvers and offers better results for more general flows. Additionally, it can be used without any special treatment when splitting the computational domain. Several model extension are available for Gamma-ReTheta, though none are universally recommended for a generic simulation. The Gamma model is derived from Gamma-ReTheta and is designed to be a more stable alternative. It can be tested if the standard model proves to be unstable.

More complex models are available in TRACE if the two equation turbulence models do not provide a satisfactory result for complex flow phenomena. This includes the two equation Hellsten EARSM and the seven equation Reynolds stress models SSG/LRR- \(\omega\), Jakirlic/Hanjalic- \(\omega^h\) and Wilcox Stess- \(\omega\). Those models can give better results for flows with high anisotropy in the Reynolds stresses, e.g. for strong secondary flows, separations or rotations. This comes at the cost of an increase in computational effort and frequently with a reduced stability, compared with the standard two equation models. In all those metrics the Hellsten EARSM resides in between the standard two-equation models and the full RSMs.

For general turbomachinery configurations the turbulent heat flux is calculated with sufficient accuracy by a constant Prandtl number of 0.9. If that approaches proves unsatisfactory, more complex but also less robust models can be tested. This means choosing a Reynolds stress model in conjunction with one of the more complex heat flux models, e.g. SSG/LRR- \(\omega\) and Younis.

Harmonic Balance solver

The Harmonic Balance (HB) solver offers unsteady simulations with considerably less computational cost than a traditional URANS simulation. This method can only be applied to problems that are periodic in time and where all relevant frequencies are known in advance. Any phenomena with a frequency not considered can't be correctly resolved. A large part of the speed-up results from a severely reduced computational domain. A single passage per row is usually sufficient to resolve most unsteady effects in turbomachinery. In total, a speed-up of one to two orders of magnitude can be achieved.

It can be challenging to optimize a HB setup for computational cost, accuracy and stability. Some workflows and settings have been found to work quite well for many cases, but deviating from those may be required for your case. HB is commonly used for two different applications. To determine the aeroelastic stability of a blade or to capture the unsteady interaction of several blade rows.

Fluid structure interaction

In the first case only a single blade passage of one specific row is usually resolved. One or more vibration modes are imprinted on the blade via PREP. It is recommended to restart from a converged steady state simulation. A low CFL number of approximately 3 should be a good starting point. Including the unsteadiness of turbulence with computeTpVarHarmonics is also recommended. If the simulation is unstable, the \(log(\omega)\) turbulence model variants and the filtering of the transport equations (SetFilter) can improve the stability.

General HB setup

For HB simulations aimed at resolving unsteady aerodynamic phenomena, a different setup is recommended. The simulations described here usually include interactions between several blade rows. By default all transport models are resolved only for the time-averaged flow. To include harmonic frequencies in the turbulence and transition models, the computeTpVarHarmonics option must be enabled in the control file. This may increase the accuracy of the simulation but potentially lead to convergence issues. It can also affect the computational cost as more harmonics must be solved. In addition, inaccurate initial solutions seem to affect the HB solver more than the time domain solver. Therefore, increasing the complexity of the simulation step by step can prevent stability problems.

Consequently, it is recommended to start with a simple setup and increase the complexity whenever the previous simulation provided a stable solution. A good starting point is a setup that includes only direct downstream interactions with 1-3 harmonics and only time-averaged transport models. If this converges, add more harmonics and upstream effects as necessary and restart from the previous solution. Not using computeTpVarHarmonics may provide stability at this stage and will still give reasonable results for many cases. However, if there is a strong unsteadiness in the turbulence model, e.g. due to periodic flow separation, it may be necessary to resolve this unsteadiness with computeTpVarHarmonics. If this is used, the \(log(\omega)\) turbulence model versions can improve the stability. The steady state calculations should then also use these model variants to allow a smooth restart. As they are just a different formulation, they should give the same results as the base models once grid convergence is achieved.

The CFL number must be chosen very low compared to the time domain solver. A CFL number between 1 and 10 can be considered normal. In the absence of information for a specific case, start with 5 and increase if possible or decrease if necessary. A ramp when starting or restarting has been found to be beneficial in some cases. The recommended spatial discretisation settings can be taken from the table in Spatial discretization scheme for structured grids.

For an accurate solution, close to an equivalent URANS solution, several harmonics usually have to be considered. The exact number may be different for each row and is entirely dependent on your problem, but usually 3-5 harmonics is a good guess. With the approach outlined above, this can be checked by comparing the changes to the solutions for each harmonic when a new harmonic is added. The computational cost of the HB solver scales with the number of harmonics, which may be an upper limit for your case and resources. In general, more harmonics are required for an HB simulation where unsteady effects are relevant in the turbulence model. With an active transition model this number may increase again.

This leads to the following workflow:

  • Run a steady state simulation to obtain a sensible initial solution
  • Restart with HB, only consider downstream effects
  • Restart again with computeTpVarHarmonics, if desired/necessary
  • Increase the number of harmonics considered
  • Add all desired interactions (i.e. upstream effects, clocking...)

Control file commands are useful to change some settings of the Harmonic Balance solver. Here is an example control file that emphasizes stability over speed. This should be optimized after a successful simulation!

all immediate SetTemporalDampingFactor 1.e-2
all immediate SetTemporalDampingFactorTP 1.e-2
all immediate RobustHarmonicUpdate ResidualWeighted
all immediate ImplictBCImplementation NLHB
all immediate ImplicitSystemRelaxationFactor 0.5
all iteration 200 RampCflNum 0.1 5 1000
all immediate SetCFLNum 0.1

Several options are available if stability problems are encountered at any step:

  • decrease relaxation parameter 'all immediate ImplicitSystemRelaxationFactor 0.5' (see ImplictBCImplementation)
  • less effective, but sometimes useful, decrease the CFL number 'all immediate SetCFLNum 2'
  • use implicit boundary conditions, if available 'all immediate ImplictBCImplementation NLHB' (see ImplictBCImplementation)
  • use the residual weighted update strategy 'all immediate RobustHarmonicUpdate ResidualWeighted' (see RobustHarmonicUpdate)
  • limit pseudotime step size: all immediate SetPseudoTimeSteppingMode block/family (see SetPseudoTimeSteppingMode)
    • pseudotime step may become very small

At interfaces:

  • merge the lower and upper bands 'all immediate ChgPanelAtt -FAMILY row01Inlet -MERGE_BANDS 0.03' (see MERGE_BANDS)
  • relaxation at interfaces: all immediate ChgPanelAtt -FAMILY <PanelGroup> -RELAXATION_FACTOR 0.01 (default 0.1) (see RELAXATION_FACTOR)
  • relaxation at edges of interfaces: all immediate ChgPanelAtt -FAMILY <PanelGroup> -RELAXATION_INTERACTION <bandIndexStart> <bandIndexEnd> <factor1> <factor2> (see RELAXATION_INTERACTION)
    • limits the unsteady effects that are transported across interfaces close to hub and tip. Use 3-7 bands, depending on grid resolution:
    • at the hub: all immediate ChgPanelAtt -FAMILY <PanelGroup> -RELAXATION_INTERACTION 1 5 0 1
    • at the tip: all immediate ChgPanelAtt -FAMILY <PanelGroup> -RELAXATION_INTERACTION -5 0 1 0

More involved changes that may affect the converged solution:

  • block <number> immediate SetTemporalDampingFactor 1.e-2 (default 1.e-3) (see SetTemporalDampingFactor)
  • change number of harmonics considered:
    • decreasing usually helps for the Navier-Stokes solver
    • increasing can help with under-resolved transport models. Use this if instabilities arise directly after computeTpVarHarmonics is used.
  • reduce model complexity, if possible for your simulation goal, e.g.:
    • neglect upstream effects / clocking effects
    • time averaged transport models
  • block <number> immediate ChgSpaceAccuracy 1stOrder (see ChgSpaceAccuracy)

Discontinuous Galerkin spatial discretization scheme

The Discontinuous Galerkin (DG) method (see Discontinuous Galerkin Method) offers a high-order accuracy on unstructured grids and advantageous dissipation and dispersion characteristics for scale-resolving simulations. An efficient variant has been implemented in TRACE, namely the Discontinuous Galerkin Spectral Element Method (DGSEM), which is based on the collocation of solution and integrations nodes. However, DGSEM is limited to hexahedral elements and, hence, only unstructured hexahedral grids can be simulated with TRACE-DG. To prevent numerical artifacts close to boundaries, the geometry and adjacent element layers, i.e. the boundary layer, have to be discretized with high-order polynomials of degrees greater than one. These high-order meshes can be created either with PyMesh or with commercial meshing tools.

The DG solver can be applied for direct numerical simulations (DNS) and large eddy simulations (LES) of translational configurations featuring one or multiple rows. (Note that no RANS model is implemented in DG!) For LES, the implicit LES approach should be chosen with DG so that the dissipation provided by the Riemann solver mimics a subgrid-scale model. Here, an accuracy order of six is a good choice to obtain an optimal characteristic. The stabilization of the scheme for scale-resolving simulations is realized with a split-form/entropy-conservative discretization of the volume terms. When flows with shocks are considered, the DG shock-capturing scheme have to be explicitly activated via GMC.

Moreover, the compact stencil of the DG scheme in combination with an explicit time-integration schemes offers a good parallel scalability. A 3rd- or 4th-order Runge-Kutta scheme is the recommended choice. The restriction of the time-step size for explicit time integration schemes is rather strict and depends on the chosen time integration scheme and the basis node-set and accuracy order of the DG scheme. The CFL number can be checked beforehand via the POST-task '–localCFL' or on the screen output during the simulation. Generally, the CFL-number should be below one, although it might vary for the specific solver setting. Hence, large time step sizes are a likely reason for a crash of a simulation, and should be reduced to stabilize the run.

DG can be activated via GMC since version 9.4.3. For LES and DNS applications, additionally required control-file commands are (see Commands for DG solver):

all immediate usebr1scheme
all immediate UseDGStrongForm
all immediate UseSplitForm KennedyGruber

Here, Kennedy-Gruber split-form is applied for the volume integrals, which results in a kinetic-energy preserving discretization of the volume terms. Entropy-stability can be achieved for the spatial operator by using the Chandrashekar volume discretization and interface flux activatable via

all immediate UseDGStrongForm
all immediate UseSplitForm CHANDRASHEKAR
all immediate SetRiemannSolver CHANDRASHEKAR

The entropy-stable variant is slightly more expensive than the Kennedy-Gruber split-form. Remark that for smooth problems, i.e. no turbulence or shocks, the split-form should be deactivated, since spurious oscillations can occur. Moreover, the viscous discretization scheme of Bassi-Rebay 1 is recommended in combination with split-form/entropy-stable scheme.

The accuracy order can be set to an arbitrary value from 1st to 16th via the control file command SetSpatialAccuracy. This can be exploited to simulate transient phases, e.g. after initializing with a RANS solution, using a low accuracy order (less DOF and, hence, faster) and active the desired order of accuracy for the restart of the final run to sample statistics. The restart on solutions with different polynomial degrees is supported.

Benchmark a TRACE simulation

To measure and report a statistically significant performance of a simulation there is a special benchmark mode in TRACE. All screen and file output is deactivated in this mode and only timings of the iterations are reprted on screen. The simple iterations are internally analyzed after [22] until statistical convergence is reached. All other convergence/divergence criteria and maximal number of timesteps from CGNS are ignored. You can still specify a wall time limit by using -rtl. At the end of the benchmarking a machine readable file is written with all samples and the statistical analysis. It is recommended to bind TRACE's threads to hardware cores as shown below:

mpirun --report-bindings -bind-to hwthread -map-by core TRACE -cgns TRACE.cgns -cntl TRACE_control.input --benchmark

For --benchmark suboptions see Command line options.

Using -tpp while binding threads is discouraged. It is not possible to change TRACE's metadata after the initial timestep to achieve homogeneous measurements. This is checked in the beginning and leads to an error.