Creating LBM kernels

The following list describes common parameters for the creation functions. They have to be passed as keyword parameters.

Method parameters

General:

  • stencil='D2Q9': stencil name e.g. ‘D2Q9’, ‘D3Q19’. See pystencils.stencils.get_stencil() for details

  • method='srt': name of lattice Boltzmann method. This determines the selection and relaxation pattern of moments/cumulants, i.e. which moment/cumulant basis is chosen, and which of the basis vectors are relaxed together

    • srt: single relaxation time (lbmpy.methods.create_srt())

    • trt: two relaxation time, first relaxation rate is for even moments and determines the viscosity (as in SRT), the second relaxation rate is used for relaxing odd moments, and controls the bulk viscosity. (lbmpy.methods.create_trt())

    • mrt: orthogonal multi relaxation time model, relaxation rates are used in this order for : shear modes, bulk modes, 3rd order modes, 4th order modes, etc. Requires also a parameter ‘weighted’ that should be True if the moments should be orthogonal w.r.t. weighted scalar product using the lattice weights. If False the normal scalar product is used. For custom definition of the method, a ‘nested_moments’ can be passed. For example: [ [1, x, y], [x*y, x**2, y**2], … ] that groups all moments together that should be relaxed with the same rate. Literature values of this list can be obtained through lbmpy.methods.creationfunctions.mrt_orthogonal_modes_literature(). See also lbmpy.methods.create_mrt_orthogonal()

    • mrt_raw: non-orthogonal MRT where all relaxation rates can be specified independently i.e. there are as many relaxation rates as stencil entries. Look at the generated method in Jupyter to see which moment<->relaxation rate mapping (lbmpy.methods.create_mrt_raw())

    • trt-kbc-n<N> where <N> is 1,2,3 or 4. Special two-relaxation rate method. This is not the entropic method yet, only the relaxation pattern. To get the entropic method, see parameters below! (lbmpy.methods.create_trt_kbc())

  • relaxation_rates: sequence of relaxation rates, number depends on selected method. If you specify more rates than method needs, the additional rates are ignored. For SRT and TRT models it is possible ot define a single relaxation_rate instead of a list, the second rate for TRT is then determined via magic number.

  • compressible=False: affects the selection of equilibrium moments. Both options approximate the incompressible Navier Stokes Equations. However when chosen as False, the approximation is better, the standard LBM derivation is compressible.

  • equilibrium_order=2: order in velocity, at which the equilibrium moment/cumulant approximation is truncated. Order 2 is sufficient to approximate Navier-Stokes

  • force_model=None: possible values: None, 'simple', 'luo', 'guo', 'buick', 'schiller', or an instance of a class implementing the same methods as the classes in lbmpy.forcemodels. For details, see lbmpy.forcemodels

  • force=(0,0,0): either constant force or a symbolic expression depending on field value

  • maxwellian_moments=True: way to compute equilibrium moments/cumulants, if False the standard discretized LBM equilibrium is used, otherwise the equilibrium moments are computed from the continuous Maxwellian. This makes only a difference if sparse stencils are used e.g. D2Q9 and D3Q27 are not affected, D319 and DQ15 are

  • cumulant=False: use cumulants instead of moments

  • initial_velocity=None: initial velocity in domain, can either be a tuple (x,y,z) velocity to set a constant velocity everywhere, or a numpy array with the same size of the domain, with a last coordinate of shape dim to set velocities on cell level

  • output={}: a dictionary mapping macroscopic quantites e.g. the strings ‘density’ and ‘velocity’ to pystencils fields. In each timestep the corresponding quantities are written to the given fields.

  • velocity_input: symbolic field where the velocities are read from (for advection diffusion LBM)

  • density_input: symbolic field or field access where to read density from. When passing this parameter, velocity_input has to be passed as well

  • kernel_type: supported values: ‘default_stream_collide’ (default), ‘collide_only’, ‘stream_pull_only’. With ‘default_stream_collide’, streaming pattern and even/odd time-step (for in-place patterns) can be specified by the streaming_pattern and timestep arguments. For backwards compatibility, kernel_type also accepts ‘stream_pull_collide’, ‘collide_stream_push’, ‘esotwist_even’, ‘esotwist_odd’, ‘aa_even’ and ‘aa_odd’ for selection of the streaming pattern.

  • streaming_pattern: The streaming pattern to be used with a ‘default_stream_collide’ kernel. Accepted values are ‘pull’, ‘push’, ‘aa’ and ‘esotwist’.

  • timestep: Timestep modulus for the streaming pattern. For two-fields patterns, this argument is irrelevant and by default set to Timestep.BOTH. For in-place patterns, Timestep.EVEN or Timestep.ODD must be speficied.

Entropic methods:

  • entropic=False: In case there are two distinct relaxation rate in a method, one of them (usually the one, not determining the viscosity) can be automatically chosen w.r.t an entropy condition. For details see lbmpy.methods.entropic

  • entropic_newton_iterations=None: For moment methods the entropy optimum can be calculated in closed form. For cumulant methods this is not possible, in that case it is computed using Newton iterations. This parameter can be used to force Newton iterations and specify how many should be done

  • omega_output_field=None: you can pass a pystencils Field here, where the calculated free relaxation rate of an entropic or Smagorinsky method is written to

LES methods:

  • smagorinsky=False: set to Smagorinsky constant to activate turbulence model, omega_output_field can be set to write out adapted relaxation rates

Fluctuating LB:

  • fluctuating: enables fluctuating lattice Boltzmann by randomizing collision process. Pass dictionary with parameters to lbmpy.fluctuatinglb.add_fluctuations_to_collision_rule

Optimization Parameters

Simplifications / Transformations:

  • cse_pdfs=False: run common subexpression elimination for opposing stencil directions

  • cse_global=False: run common subexpression elimination after all other simplifications have been executed

  • split=False: split innermost loop, to handle only 2 directions per loop. This reduces the number of parallel load/store streams and thus speeds up the kernel on most architectures

  • builtin_periodicity=(False,False,False): instead of handling periodicity by copying ghost layers, the periodicity is built into the kernel. This parameters specifies if the domain is periodic in (x,y,z) direction. Even if the periodicity is built into the kernel, the fields have one ghost layer to be consistent with other functions.

Field size information:

  • pdf_arr=None: pass a numpy array here to create kernels with fixed size and create the loop nest according to layout of this array

  • field_size=None: create kernel for fixed field size

  • field_layout='c': 'c' or 'numpy' for standard numpy layout, 'reverse_numpy' or 'f' for fortran layout, this does not apply when pdf_arr was given, then the same layout as pdf_arr is used

  • symbolic_field: pystencils field for source (pdf field that is read)

  • symbolic_temporary_field: pystencils field for temporary (pdf field that is written in stream, or stream-collide)

CPU:

  • openmp=True: Can be a boolean to turn multi threading on/off, or an integer specifying the number of threads. If True is specified OpenMP chooses the number of threads

  • vectorization=False: controls manual vectorization using SIMD instrinsics. If True default vectorization settings are use. Alternatively a dictionary with parameters for vectorize function can be passed. For example {'instruction_set': 'avx', 'assume_aligned': True, 'nontemporal': True}. Nontemporal stores are only used if assume_aligned is also activated.

GPU:

  • target='cpu': 'cpu' or 'gpu', last option requires a CUDA enabled graphics card and installed pycuda package

  • gpu_indexing='block': determines mapping of CUDA threads to cells. Can be either 'block' or 'line'

  • gpu_indexing_params='block': parameters passed to init function of gpu indexing. For 'block' indexing one can e.g. specify the block size {'block_size' : (128, 4, 1)}

Other:

  • openmp=True: only applicable for cpu simulations. Can be a boolean to turn multi threading on/off, or an integer specifying the number of threads. If True is specified OpenMP chooses the number of threads

  • double_precision=True: by default simulations run with double precision floating point numbers, by setting this parameter to False, single precision is used, which is much faster, especially on GPUs

Terminology and creation pipeline

Kernel functions are created in three steps:

  1. Method:

    the method defines the collision process. Currently there are two big categories: moment and cumulant based methods. A method defines how each moment or cumulant is relaxed by storing the equilibrium value and the relaxation rate for each moment/cumulant.

  2. Collision/Update Rule:

    Methods can generate a “collision rule” which is an equation collection that define the post collision values as a function of the pre-collision values. On these equation collection simplifications are applied to reduce the number of floating point operations. At this stage an entropic optimization step can also be added to determine one relaxation rate by an entropy condition. Then a streaming rule is added which transforms the collision rule into an update rule. The streaming step depends on the pdf storage (source/destination, AABB pattern, EsoTwist). Currently only the simple source/destination pattern is supported.

  3. AST:

    The abstract syntax tree describes the structure of the kernel, including loops and conditionals. The ast can be modified e.g. to add OpenMP pragmas, reorder loops or apply other optimizations.

  4. Function:

    This step compiles the AST into an executable function, either for CPU or GPUs. This function behaves like a normal Python function and runs one LBM time step.

The function create_lb_function() runs the whole pipeline, the other functions in this module execute this pipeline only up to a certain step. Each function optionally also takes the result of the previous step.

For example, to modify the AST one can run:

ast = create_lb_ast(...)
# modify ast here
func = create_lb_function(ast=ast, ...)
create_lb_function(ast=None, optimization={}, **kwargs)

Creates a Python function for the LB method

create_lb_ast(update_rule=None, optimization={}, **kwargs)

Creates a pystencils AST for the LB method

create_lb_method(**params)

Creates a LB method, defined by moments/cumulants for collision space, equilibrium and relaxation rates.

create_lb_method_from_existing(method, modification_function)

Creates a new method based on an existing method by modifying its collision table.

Parameters
  • method – old method

  • modification_function – function receiving (moment, equilibrium_value, relaxation_rate) as arguments, i.e. one row of the relaxation table, returning a modified version

switch_to_symbolic_relaxation_rates_for_omega_adapting_methods(method_parameters, kernel_params, force=False)

For entropic kernels the relaxation rate has to be a variable. If a constant was passed a new dummy variable is inserted and the value of this variable is later on passed to the kernel