Implicit Discontinuous Galerkin Solution on Unstructured Mesh for Turbine Blade Secondary Flow

To enhance solution accuracy with high order methods, an implicit solver using the Discontinuous Galerkin (DG) discretization on unstructured mesh has been developed. The DG scheme is favored chiefly due to its distinctive feature of achieving a higher order accuracy by simple internal sub-divisions of a given mesh cell. It thus can relieve the burden of mesh generation for local refinement. The developed method has been assessed in several test cases. Firstly an examination on the mesh-dependency with different DG orders for discretization has demonstrated the expected mesh convergence-order of accuracy correlation. The flow around a cylinder solved with up to the 5 th order accuracy demonstrates the need for a high order geometrical representation corresponding to the high-order flow field discretization. A calculated turbulent flat plate boundary layer demonstrates the capability of the high order DG in capturing the laminar sub-layer for a given coarse mesh (y+>20) without a wall function, in a clear contrast to a conventional 2 nd order scheme. The focal test case of the present work is a high pressure (HP) turbine cascade with the flow losses being strongly influenced by both the laminar separation induced transition on the blade suction surface and the secondary flow development in the endwall region. Fully turbulent RANS solutions consistently over-predict the losses regardless, which can be significantly reduced by an empirically specified transition at a mid-chord point. Of particular interest is a contrasting behavior of pure laminar flow solutions. A steady laminar flow solution is difficult to converge, and even when converging, tends to give an unrealistically large separation. On the other hand when the laminar solution is run in a time accurate unsteady mode, a qualitatively different flow pattern emerges. Separated shear layer is then shown to lead to coherently shed unsteady vortices with a net entrainment from the main stream to near-wall region. The resultant time-averaged near-wall flow then effectively becomes a reattached boundary layer. Both overall and distributed losses computed from unsteady laminar solutions are shown to be consistently much better than the fully turbulent RANS solutions. It is remarkable and quite unexpected that for such a seemingly complex 3D flow problem, the straightforward unsteady laminar solutions with no empirical input seem to be comparable to (or slightly better than) the tripped RANS with empirical input.

solutions. It is remarkable and quite unexpected that for such a seemingly complex 3D flow problem, the straightforward unsteady laminar solutions with no empirical input seem to be comparable to (or slightly better than) the tripped RANS with empirical input.

INTRODUCTION
Computational fluid dynamics (CFD) has played a significant role in turbomachinery design and analysis in past few decades. However, the conventional CFD methods as widely applied are mostly based on 2 nd order discretization schemes, which tend to be dissipative when high resolution of practical turbomachinery problems is pursued. Recent development in applications of high fidelity flow models has indicated the requirement for developing and applying new high order methods. Highorder schemes based on finite difference upwind-biased weighted essentially nonoscillatory (WENO), e.g. [1] tend to be limited to structured grids and require rather large neighboring stencils for higher order discretization. Another high-order finite volume method (FVM) using an upwind flux scheme with Monotonic Upstream-Centered Scheme for Conservative Law (MUSCL) reconstruction has been applied for predicting the transitional flow over turbine cascade [2] while the extension to highorders is also restrictive.
On the other hand, there exists a family of high-order discretization schemes following a distinctively different strategy of achieving the high-order accuracy totally relying on local internal elements without extensive stencils on extra mesh points. The feature of internal sub-division is particularly appealing given the difficulties in generating quality fine meshes for complex geometries. Among these high-order schemes, the DG discretization has gained huge interest in recent years due to its very nature of easy extension to high-orders by increasing the order of piece-wise polynomials and maintaining the conservation laws without restrictions of inter-element continuity. DG is firstly introduced to solve the neuron transport problems [3] and further extended to solve linear hyperbolic problems. The breakthrough of solving the non-linear hyperbolic equations, particularly the compressible Euler equations, in a general framework is developed [4], which employs a Total Variation Diminishing (TVD) to obtain the high-order temporal discretization. The extension to the Navier-Stokes equations using a DG method is first made by Bassi and Rebay [5] introducing the BR1 scheme to treat the discretization of viscous terms and the BR2 scheme is further put forward to eliminate the extended stencils. A stable flux reconstruction (FR) method, recovering the nodal DG scheme, is more recently proposed [6].
The DG approach has been extensively applied to external aerodynamics, and also started to be developed and applied to internal turbomachinery flows, though to a much less extent. The DG method solving the Reynolds-Averaged Navier-Stokes (RANS) with the k-model has been applied to simulate the main flow features of turbomachinery flow [7][8][9].
High-fidelity turbulence eddy resolved (instead of modelled) methods, like large eddy simulation (LES), have been applied to study the endwall flows in the context of low pressure turbine (LPT) cascade [10][11][12]. Turbulence eddy resolved simulations have also been conducted more recently using the DG based methods, typically at relatively low Reynolds numbers. The direct numerical simulation (DNS) using a DG discretization has been lately applied to a LPT cascade to capture the flow separation and transition at low Reynolds numbers [13,14], on a two dimensional (2D) geometry with a small spanwise length without considering the endwall effects. Another preliminary study of LES of a full-span LPT cascade with the DG by introducing an extra wall-resolved model has been carried out [15], but only has the benchmark test case of a turbulent channel flow been validated. An entropy-stable DG spectral-element method has also been employed to simulate a LPT cascade [16,17] with only blade profile loss being taken into account, again due to lack of the endwall effects.
It is worth pointing out that most previous high order DG studies have been confined to 2D (quasi 3D) configurations. It is well known that a major source of flow losses in a blade passage, particularly in high pressure turbines (HPT), is in the endwall region dominated by secondary flows [18]. To the authors' knowledge, the endwall secondary flow development in a 3D HPT blade passage is yet to be studied by a highorder scheme.
Furthermore, it is of the present interest to examine a 3D turbine configuration where flow losses are influenced by both the secondary flow in the endwall region and the blade surface laminar flow separation with a subsequent transition. The instabilities of the separated laminar shear-layer subsequently lead to a transition to a turbulent state followed by a reattachment. For a RANS based flow solver, the boundary layer transition can be treated by 2D boundary layer based transition models from very simple separation bubble length correlated models e.g. [19], and those more sophisticated transport-equation based transitional models [20]. It is noted that strong 3D secondary flow effects in the endwall region should challenge those 2D boundary layer based transition models. The lack of high-order DG solutions for the flows with largely 2D blade surface boundary layer separation interacting with 3D endwall secondary flows also provides a motivation to examine the applicability of a DG solver to the complex 3D flow interactions.
Given the background introduced above, the present study has been initiated to develop an implicit high-order DG solver on unstructured meshes for turbomachinery applications. An implicit formulation seems to be suited for an unstructured mesh solver, for which some well-established solution techniques for explicit methods (e.g. multigrid methods) are less effective.
In the following sections, the DG methodology will be presented first. Some basic validation and sensitivity studies are followed to illustrate the distinctive features and characteristics of the DG method. Flow losses in a HP turbine cascade subject to the influences of the secondary flow and the laminar separation will be examined and compared to the corresponding experimental data. Finally, some further considerations on the modelling of the related flow physics are presented and the corresponding observations are made.

Discontinuous Galerkin Discretization
The computational domain is partitioned into a set of non-overlapping elements k with boundary . The approximate solutions to the flow equations in space x and time space where , designed for all-speed flows is adopted [21], which combines the efficiency from the flux-vector splitting and the accuracy from the flux-difference splitting methods with enhanced robustness. In order to maintain the compactness of a high-order scheme, the viscous flux is discretized using the BR2 scheme. For turbulence closure, the one equation Spalart-Allmaras (SA) turbulence model is fully coupled with the flow equations, and further modified to avoid the negative working variables and prevent the oscillations when applying a high-order discretization across the discontinuity [22].

Basis Functions
The approximate solution and test functions are represented as the expansions of p-th order polynomials, Journal of Turbomachinery where the ˆ( )

Implicit Time Integration
The spatially discretized equations can be integrated easily in time using an explicit scheme such as the multi-stage Runge-Kutta scheme [4]. The time step size of the explicit scheme is restricted by the corresponding small CFL, normally less than 1, due to the numerical stability requirement.
An implicit temporal integration scheme is developed as a primary solution method in the present work. The implicit scheme has an advantage of using very large time steps (CFL~3 (10 ) O in the present work). Given the challenges of developing effective multi-grid methods for unstructured meshes, the implicit formulation proves to be an important way of accelerating solutions for both steady and unsteady flows. It is worth noting that for very complex geometry configurations, highly non-uniform meshes might be generated. The time step limit set by the minimum mesh size of a highly non-uniform and also highly skewed unstructured mesh can be too restrictive, particularly for unsteady flow solutions where a uniform time step needs to be taken for the temporal consistency.
The spatial discretization in Eqn. (2) leads to a system of ordinary differential where M is the global block-diagonal mass matrix including all the mass matrices of elements.
where the superscript n represents the time step level. Since residuals are non-linear terms in the governing equations, Eqn. (5) is a non-linear system of algebraic equations.
For the sake of resolving this kind of non-linear system equations, the linearization of residuals is performed, Note that the global Jacobian matrix The non-linear system Eqn. (5) can be solved using a Newton method as in the following iterative steps, , the Newton iteration is initialized with a previous time level solution n U and iterations are performed to reach the sufficient small convergence tolerance as In each Newton iteration, Ax b is solved using a preconditioned GMRES iterative algorithm with a block incomplete LU factorization with zero fill-in (ILU0) [23]. Compared to other preconditioning schemes like block Jacobian or block Gauss-Seidel, ILU0 is the most robust one with a smaller number of iterations especially when applied to the low Mach flows on highly stretched meshes.
The overall DG discretization and implicit temporal integration process is illustrated in Fig. 2, where three nested loops are included. The outermost loop is the time step loop, following by the inner Newton iterations, while the GMRES iterative scheme sits in the innermost loop. For the no-slip wall, the velocity and eddy viscosity are all prescribed to zero and an adiabatic condition is also imposed here requiring that the heat flux is set to zero in the energy equation. For a slip wall boundary condition, the flow is tangential to the wall. A far-field boundary condition requires that the boundary is far away enough from the solid body and the boundary state is specified by given freestream variables.

COMPUTATIONAL RESULTS
This section covers three validation cases, 2D inviscid and viscous flows around a cylinder, a 2D turbulent flat plate boundary layer and a 3D high pressure turbine cascade.

Flow around a cylinder
The first validation study is an inviscid flow around a cylinder with an inflow velocity at 35 m/s. The geometry of cylinder is 2D but a 3D mesh is generated by extruding the 2D geometry one layer in the third direction to accommodate the current 3D solver. The mesh generator adopted is Gmsh [24], an open-source software for generating a 3D finite element mesh. The physical mesh near the cylinder surface is shown in Fig. 3(a) and there are only 12 node points to represent the geometry of the cylinder. The effect of using a high-order boundary representation is examined by comparing high-order DG simulations with such a coarse wall geometry mesh. Figs. 3(b), 3(c) and 3(d) exhibit high-order curved boundary with iso-parametric elements represented by corresponding p-th order polynomials as adopted in the spatial discretization [25]. The p-th order of the boundary polynomials is denoted by "BPp". The DG flow field simulations by p-th order polynomials are denoted as "DGp", where the corresponding order of the discretization accuracy is p+1.  we can see that a high-order field discretization may not be very meaningful without a corresponding high-order boundary geometrical representation. The computational cost from DG1 to DG3 on the coarse mesh is given in Table 1.
Here, the degrees of freedom for a 3D tetrahedral element are defined as N N Jacobian matrix needs to be calculated and inverted in each implicit time step, where the CPU time is increased significantly as the DG orders increase. It should be noted that at the current stage, the developed code is far from being optimized, indicated by the slower than expected scaling of the computational speed when the DG order is increased.  [26], which is also recently used for validation purposes [27]. The mesh refined in the near wall region for capturing the laminar boundary layer is shown in Fig. 8. There are still only 12 node points to represent the cylinder geometry and so the high-order boundary representation is also adopted for high-order DG simulations. In Fig. 9, the instantaneous entropy contours are presented for the solutions from a 2 nd order DG1 to a 5 th order DG4. As clearly shown in Fig. 9

Turbulent Boundary Layer over Flat Plate
The second test case is for a turbulent boundary layer over a flat plate with an inflow velocity 34.6 m/s and a Reynolds number of 6x10 6  and a spacing ratio is chosen as 1.6. This nearwall mesh would be extraordinarily coarse for a typical 2 nd order spatial discretization but it can be employed for high-order DG computations. To compare the near wall behavior computed with different orders of accuracy, the non-dimensional velocity u profiles against y are plotted in Fig. 12 for the DG solutions from the 2 nd order to the 5 th order, all with the direct nonslip wall condition.
The first mesh element y for the 2 nd order solution is markedly higher than that typically require for a direct nonslip wall condition. It is nearly 20 and thus u is substantially over-predicted by the 2 nd order scheme for this mesh. The computed results get closer to the theoretical law of wall as the order of accuracy increases.
Overall, the 5 th order (DG4) results match very well with the theoretical velocity profile in both the viscous sub layer and the log law layer. Hence, we can see that good results can be achieved by high-order DG solutions on extremely coarse meshes where a conventional 2 nd order scheme would struggle. It must be reminded that all these solutions of different DG orders are obtained on the same physical mesh. The results thus also serve as a demonstration that the burden of generating high quality near wall fine meshes for complex geometries may be considerably alleviated by the high order methods for turbulent flow simulations without invoking the wall function. Fig. 12 Computed velocity profiles of turbulent boundary layer at Rex = 6x10 6

High Pressure Turbine Cascade
As stated in the introduction, one primary motivation of the present work is to examine the predictability of 3D blade aerodynamic performance influenced by both endwall secondary flow and blade surface bubble type separation. To this end, the high pressure turbine cascade experimentally studied at Durham University (commonly known as 'Durham Cascade', [28]) is selected.

Durham Turbine Cascade and RANS Turbulent Flow Solutions
Durham cascade has been used extensively in the past for studying secondary flow loss generation and optimization of enwall geometry for loss reduction, and also as a standard test case for validating turbomachinery CFD methods [29][30][31][32]. At the experimental conditions with a Reynolds number of 4x10 5 , the flow in the blade passage is shown to be transitional subject to a bubble type separation on suction surface [30].
The main parameters are given in Table 2  The present DG solver in the RANS mode has been applied to predict the main flow and secondary loss. This case has been firstly attempted with a fully turbulent solution with the Sparlart-Allamas turbulence model, as presented in [34]. Some of the preliminary fully turbulent RANS results are presented here as a reference point.
A 2D blade mesh is firstly made and then extruded in the third direction, approximately 50,000 node points, 300,000 mesh cells are generated using Gmsh for the RANS simulations of this half-span blade passage with one endwall, where y < 0.7 on the blade surface and z < 0.5 on the end wall.
The pitchwise mass-averaged total pressure loss coefficient Cp0 at 128% ax C is presented in Fig. 13. Clearly the agreement between the DG1 and the DG2 solutions shows a reasonable order of accuracy convergence, and both show a clear overprediction of flow losses, which is consistent with previous RANS predictions [33,35].   For a transitional RANS, once turbulent flow is tripped (by a transitional model or a simple tripping), the turbulence eddy viscosity convected to the trialing-edge will impact on the onset and evolution of the trailing edge unsteady vortex shedding, e.g. as observed by Ning and He [36]. Typical RANS based transition models are based on boundary layer flows, and it is unknown how these models would cope with unsteady flow around a blunt trailing edge. More specifically for the present case, if the transition is triggered by the suction surface laminar separation, a steady turbulent boundary layer should be predicted by the transition model. The trailing edge flow will then still be subject to turbulence eddy-viscosity convected from the upstream turbulent boundary layer, which might potentially suppress the onset and evolution of vortex shedding. It is known that the trailing edge vortex shedding strongly affects the trailing edge basepressure and thus the losses e.g. [18], [36], [37].
A further modeling consideration in the present work stems from a recent work by He and Yi [37], which indicates that an essentially unsteady laminar solution, even with seemingly much coarser temporal and spatial resolutions than a typical LES, may predict an effectively reattached separation bubble on a compressor blade, in agreement with the experiment. The key requirement seems to be an unsteady laminar solution with sufficient temporal resolution to capture the evolution of the large scale unsteady vortices in the separated shear layer. For the present case, we thus attempt a direct fully laminar flow model. Accordingly, a consistent numerical constraint in this present case is that the temporal and spatial resolution should be sufficiently fine to resolve both the evolution of the unstable laminar separated shear layer on the blade suction surface as well as that of the trailing edge vortex shedding.

Sensitivity Study on Temporal Resolution of Unsteady Laminar Solution
Given the above modelling consideration and intent, we need to first obtain some guidance on the corresponding temporal resolution required for an adequate unsteady laminar solution. A finer mesh has been generated to ensure that the spatial scales of the shear layer instability and the trailing edge vortex shedding are adequately captured, as shown in Fig. 16. For the shear layer instability, a typical RANS boundary layer mesh can only provide a reasonable resolution in the wall normal direction, so further refinement in the streamwise is added as indicated in Fig. 16 with a close up around the suction surface separation region. Similarly around the trailing edge and in the wake region, a sufficiently fine mesh is required to support the local resolution to capture the initiation and development of the vortex shedding as illustrated in Fig. 16 for the close up around the T.E. and wake. The total mesh cells for the half-span turbine blade passage is 1.2 million with 5 2 10 node points and this mesh resolution is still much coarser than one for a LES for the half-span passage domain. It is also pointed out that high-order DG simulations are equivalent to subdividing mesh points and increasing the DG order for a given mesh should have a similar effect to that of a mesh refinement.
The periodic boundary conditions are applied at the pitch-wise boundaries of the computational domain. The endwall and mid-span are subject to the no-slip condition and the symmetric condition, respectively.   St  The qualitatively contrasting characteristics between the steady and unsteady laminar flow solutions may be interpreted simply as shown in Fig. 20. For the steady laminar solution, the instabilities of separated shear-layers are suppressed by the lack of temporal resolution. A steady (or largely steady due to numerical convergence difficulties) flow field is then produced where the separated shear layers may eventually reattach, leading to a large separation bubble (Fig. 20a). On the other hand, for an unsteady solution, as long as the resolution is sufficient to capture the laminar shear layer instabilities, separated shear layers are unable to grow into a large vortex before being shed. Once a vortex is shed, the local flow becomes reattached before the separation starts another vortex. We thus have coherent unsteadily shed vortices (Fig.   20b). The small unsteady vortices also seem to generate a net entrainment of flow from the main stream to near-wall region. The time-averaged net flow effectively behave similarly to a reattached turbulent boundary layer. carried out with the time-step size corresponding to that of CFLm=500 for this 2D case.

3D Unsteady Laminar Solution of Durham Cascade
We now look at the 3D unsteady laminar calculations of the Durham HP turbine cascade. The time averaged solutions are obtained over time steps to cover at least for one passage through-flow time. The time-averaged total pressure losses have been measured at downstream at a location of 128% axial chord (x=128% ax C ). We first look at the sensitivity of the DG solution with respect to the orders of accuracy.
The results of time-averaged total pressure loss coefficients Cp0 from the DG1 and DG2 solutions compared to the measured experiment data [38] are presented in Fig. 21. We can see that for the mesh adopted, the solutions become largely independent of the order of accuracy of the DG discretization. The measured data of the experiment illustrated in Fig. 21  The overall mass-averaged total pressure loss values at plane 128% ax C are given in Table 4. The unsteady laminar solution shows a significant loss reduction compared to the RANS solution, and is much closer to the experimental value.

CONCLUSIONS
An implicit solver using the Discontinuous Galerkin discretization on unstructured meshes has been developed, validated and applied for analysis of turbine blade secondary flow. The DG scheme is distinctive in that a higher order accuracy is achieved by simple internal sub-divisions of a given mesh cell. This, in combination with the implicit time-integration formulation, is well suited for an unstructured mesh solver.
The preliminary test for a flow around cylinder is conducted to verify the method implementation, and also to illustrate the importance of a higher-order geometrical representation. A test for a flat plate turbulent boundary layer highlights the advantage of the simple internal sub-division of a mesh cell in relaxing the fine near wall mesh generation as required by a conventional low-order scheme.
The developed DG solver is then applied to a high pressure turbine cascade A more interesting and remarkable observation is that the flow losses predicted by a direct unsteady laminar solution are consistently better than those produced by the fully turbulent RANS solution. Furthermore, it is noted that straightforward unsteady laminar solutions with no empirical input seem to be comparable to (or slightly better than) the tripped RANS with empirical input. In relation to underlining flow physics as the different models predicted, a steady laminar solution seems to suppress instabilities of separated shear layer, leading to a very large separation bubble. The unsteady laminar solutions, on the other hand, show that unstable separated shear layers lead to coherently shed unsteady vortices with marked net entrainment from main stream to near-wall flow. The resultant time-averaged near-wall flow then seems to become effectively a reattached steady boundary layer. It should be pointed out that the present observation on the computed unsteady laminar separation behavior on a typical coarse RANS mesh for a turbine cascade is similar to that of a recent work for a compressor cascade using a different CFD solver [37]. Given the marked differences in the flow performance computed and the corresponding computational resources required, these findings should deserve further investigations.