DataHandling

class DataHandling

Manages the storage of arrays and maps them to a symbolic field. Two versions are available: a simple, pure Python implementation for single-node simulations, SerialDataHandling, and a distributed version using waLBerla, ParallelDataHandling.

Keep in mind that the data can be distributed, so use the ‘access’ method whenever possible and avoid the ‘gather_array’ function, which collects (parts of) the distributed data on a single process.

abstract property default_target: Target

Target Enum indicating the target of the computation

abstract property dim: int

Dimension of the domain, either 2 or 3

abstract property shape: Tuple[int, ...]

Shape of outer bounding box.

abstract property periodicity: Tuple[bool, ...]

Returns tuple of booleans for x,y,(z) directions with True if domain is periodic in that direction.

abstract add_array(name, values_per_cell, dtype=<class 'numpy.float64'>, latex_name=None, ghost_layers=None, layout=None, cpu=True, gpu=None, alignment=False, field_type=FieldType.GENERIC)

Adds a (possibly distributed) array to the handling that can be accessed using the given name.

For each array a symbolic field is available via the ‘fields’ dictionary

Parameters:
  • name (str) – unique name that is used to access the field later

  • values_per_cell – shape of the dim+1 coordinate. DataHandling supports zero or one index dimensions, i.e. scalar fields and vector fields. This parameter gives the shape of the index dimensions. A value of 1 means no index dimension is created.

  • dtype – data type of the array as numpy data type

  • latex_name (Optional[str]) – optional, name of the symbolic field, if not given ‘name’ is used

  • ghost_layers (Optional[int]) – number of ghost layers; if not specified, the default value given in the constructor is used

  • layout (Optional[str]) – memory layout of array, either structure of arrays ‘SoA’ or array of structures ‘AoS’. This is only important if values_per_cell > 1

  • cpu (bool) – allocate field on the CPU

  • gpu (Optional[bool]) – allocate field on the GPU, if None, a GPU field is allocated if default_target is ‘GPU’

  • alignment – either False for no alignment, or the number of bytes to align to

Return type:

Field

Returns:

pystencils field, that can be used to formulate symbolic kernels
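The difference between the two layout options can be illustrated with a small pure-Python sketch. It does not call pystencils: linear_index is a hypothetical helper, and the row-major ordering of the spatial coordinates is an assumption made only for illustration. What matters is which values end up adjacent in memory.

```python
def linear_index(x, y, f, shape, values_per_cell, layout):
    """Offset of value f of cell (x, y) in a flat buffer (hypothetical helper)."""
    nx, ny = shape
    cell = x * ny + y                  # assumed row-major numbering of the cells
    if layout == "AoS":                # array of structures: values of one cell are adjacent
        return cell * values_per_cell + f
    if layout == "SoA":                # structure of arrays: cells of one value are adjacent
        return f * nx * ny + cell
    raise ValueError(f"unknown layout {layout!r}")


# Second value (f=1) of the first cell in a 2 x 3 domain with two values per cell
aos_offset = linear_index(0, 0, 1, (2, 3), 2, "AoS")
soa_offset = linear_index(0, 0, 1, (2, 3), 2, "SoA")
```

With values_per_cell equal to 1 both branches return the same offset, which is why the layout only matters for fields with an index dimension.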

add_arrays(description, dtype=<class 'numpy.float64'>, ghost_layers=None, layout=None, cpu=True, gpu=None, alignment=False, field_type=FieldType.GENERIC)

Adds multiple arrays using a string description similar to pystencils.fields()

>>> from pystencils.datahandling import create_data_handling
>>> dh = create_data_handling((20, 30))
>>> x, y = dh.add_arrays('x, y(9)')
>>> print(dh.fields)
{'x': x: double[22,32], 'y': y(9): double[22,32]}
>>> assert x == dh.fields['x']
>>> assert dh.fields['x'].shape == (22, 32)
>>> assert dh.fields['y'].index_shape == (9,)

Parameters:
  • description (str) – String description of the fields to add

  • dtype – data type of the array as numpy data type

  • ghost_layers (Optional[int]) – number of ghost layers; if not specified, the default value given in the constructor is used

  • layout (Optional[str]) – memory layout of array, either structure of arrays ‘SoA’ or array of structures ‘AoS’. This is only important if values_per_cell > 1

  • cpu (bool) – allocate field on the CPU

  • gpu (Optional[bool]) – allocate field on the GPU, if None, a GPU field is allocated if default_target is ‘GPU’

  • alignment – either False for no alignment, or the number of bytes to align to

Return type:

Tuple[Field]

Returns:

Fields representing the just created arrays
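To make the doctest output above easier to read, the following pure-Python sketch shows how a description such as 'x, y(9)' might be split into names and values per cell, and why a (20, 30) domain is reported with shape (22, 32). Both helpers are hypothetical and handle only this simple form of the description; the single ghost layer on each side is an assumption inferred from the doctest, not a documented default.

```python
import re
from typing import List, Tuple


def parse_description(description: str) -> List[Tuple[str, int]]:
    """Hypothetical parser: 'x, y(9)' -> [('x', 1), ('y', 9)]."""
    result = []
    for item in description.split(","):
        match = re.fullmatch(r"\s*(\w+)\s*(?:\((\d+)\))?\s*", item)
        if match is None:
            raise ValueError(f"cannot parse {item!r}")
        name, count = match.group(1), match.group(2)
        result.append((name, int(count) if count else 1))
    return result


def padded_shape(domain_shape: Tuple[int, ...], ghost_layers: int) -> Tuple[int, ...]:
    """Spatial shape including ghost layers on both sides of every dimension."""
    return tuple(n + 2 * ghost_layers for n in domain_shape)


parsed = parse_description("x, y(9)")
shape = padded_shape((20, 30), ghost_layers=1)
```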

abstract has_data(name)

Returns true if a field or custom data element with this name was added.

abstract add_array_like(name, name_of_template_field, latex_name=None, cpu=True, gpu=None)

Adds an array with the same parameters (number of ghost layers, values_per_cell, dtype) as an existing array.

Parameters:
  • name – name of new array

  • name_of_template_field – name of array that is used as template

  • latex_name – see ‘add_array’ method

  • cpu – see ‘add_array’ method

  • gpu – see ‘add_array’ method

abstract add_custom_data(name, cpu_creation_function, gpu_creation_function=None, cpu_to_gpu_transfer_func=None, gpu_to_cpu_transfer_func=None)

Adds custom (non-array) data to domain.

Parameters:
  • name (str) – name to access data

  • cpu_creation_function – function returning a new instance of the data that should be stored

  • gpu_creation_function – optional, function returning a new instance, stored on GPU

  • cpu_to_gpu_transfer_func – function that transfers cpu to gpu version, getting two parameters (gpu_instance, cpu_instance)

  • gpu_to_cpu_transfer_func – function that transfers gpu to cpu version, getting two parameters (gpu_instance, cpu_instance)
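The roles of the creation and transfer functions can be sketched with a toy registry. CustomDataStore is a hypothetical stand-in written for this illustration, not the pystencils implementation, and plain Python lists play the role of both the CPU and the GPU instance. The point to note is the argument order of the transfer functions: (gpu_instance, cpu_instance) in both directions.

```python
class CustomDataStore:
    """Toy stand-in for the custom-data part of a data handling (hypothetical)."""

    def __init__(self):
        self.cpu, self.gpu, self._transfer = {}, {}, {}

    def add_custom_data(self, name, cpu_creation_function,
                        gpu_creation_function=None,
                        cpu_to_gpu_transfer_func=None,
                        gpu_to_cpu_transfer_func=None):
        self.cpu[name] = cpu_creation_function()
        if gpu_creation_function is not None:
            self.gpu[name] = gpu_creation_function()
            self._transfer[name] = (cpu_to_gpu_transfer_func, gpu_to_cpu_transfer_func)

    def to_gpu(self, name):
        cpu_to_gpu, _ = self._transfer[name]
        cpu_to_gpu(self.gpu[name], self.cpu[name])   # (gpu_instance, cpu_instance)

    def to_cpu(self, name):
        _, gpu_to_cpu = self._transfer[name]
        gpu_to_cpu(self.gpu[name], self.cpu[name])   # same argument order


def copy_to_gpu(gpu_instance, cpu_instance):
    gpu_instance[:] = cpu_instance


def copy_to_cpu(gpu_instance, cpu_instance):
    cpu_instance[:] = gpu_instance


store = CustomDataStore()
store.add_custom_data("counters", lambda: [0, 0], lambda: [0, 0],
                      copy_to_gpu, copy_to_cpu)
store.cpu["counters"][0] = 5
store.to_gpu("counters")       # the "GPU" copy now mirrors the CPU copy
```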

add_custom_class(name, class_obj, cpu=True, gpu=False)

Adds non-array data by passing a class object with optional ‘to_gpu’ and ‘to_cpu’ member functions.

abstract property fields: Dict[str, Field]

Dictionary mapping data name to symbolic pystencils field - use this to create pystencils kernels.

abstract property array_names: Sequence[str]

Sequence of all array names.

abstract property custom_data_names: Sequence[str]

Sequence of all custom data names.

abstract ghost_layers_of_field(name)

Returns the number of ghost layers for a specific field/array.

Return type:

int

abstract values_per_cell(name)

Returns values_per_cell of array.

Return type:

Tuple[int, ...]

abstract iterate(slice_obj=None, gpu=False, ghost_layers=None, inner_ghost_layers=True)

Iterate over local part of potentially distributed data structure.

Return type:

Iterable[Block]

abstract gather_array(name, slice_obj=None, all_gather=False, ghost_layers=False)

Gathers part of the domain on a local process. Whenever possible use ‘access’ instead, since this method copies the distributed data to a single process, which is inefficient and may exhaust the available memory.

Parameters:
  • name – name of the array to gather

  • slice_obj – slice expression of the rectangular sub-part that should be gathered

  • all_gather – if False only the root process receives the result, if True all processes

  • ghost_layers – number of outer ghost layers to include (only available for serial version of data handling)

Return type:

Optional[ndarray]

Returns:

gathered field that does not include any ghost layers, or None if gathered on another process
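The effect of all_gather on the return value can be sketched without MPI. In this simulation, which is only an illustration and not how the parallel data handling is implemented, local_parts[r] holds the rows owned by process r, and gather is a hypothetical function evaluated from the point of view of one rank.

```python
def gather(local_parts, rank, all_gather=False, root=0):
    """Simulated gather: full data on the root (or on everyone), None elsewhere."""
    if not all_gather and rank != root:
        return None
    # Concatenate the rows of all processes in rank order
    return [row for part in local_parts for row in part]


parts = [[[1, 2]], [[3, 4]], [[5, 6]]]      # three processes, one row each
on_root = gather(parts, rank=0)
on_other = gather(parts, rank=1)
on_other_all = gather(parts, rank=1, all_gather=True)
```

Every process that receives a result holds a copy of the whole gathered region, which is the memory cost the description above warns about.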

abstract run_kernel(kernel_function, *args, **kwargs)

Runs a compiled pystencils kernel.

Uses the arrays stored in the DataHandling class for all array parameters. Additional arguments are passed directly to the kernel function and override any parameters taken from the DataHandling.

Return type:

None
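The override rule described above amounts to a dictionary merge in which explicitly passed values win. The sketch below uses a hypothetical helper, build_kernel_arguments, and a plain list of parameter names in place of a compiled kernel; it only illustrates the precedence, not the actual argument handling.

```python
def build_kernel_arguments(stored_arrays, parameter_names, **kwargs):
    """Hypothetical helper mimicking the described argument resolution."""
    # Start with every kernel parameter that matches a stored array
    arguments = {name: stored_arrays[name]
                 for name in parameter_names if name in stored_arrays}
    arguments.update(kwargs)       # explicitly passed values take precedence
    return arguments


stored = {"src": [1, 2, 3], "dst": [0, 0, 0]}
resolved = build_kernel_arguments(stored, ["src", "dst", "omega"],
                                  omega=1.8, dst=[9, 9, 9])
```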

abstract get_kernel_kwargs(kernel_function, **kwargs)

Returns the input arguments of a kernel

abstract swap(name1, name2, gpu=False)

Swaps data of two arrays

abstract to_cpu(name)

Copies GPU data of array with specified name to CPU. Works only if ‘cpu=True’ and ‘gpu=True’ were used in the ‘add_array’ method.

abstract to_gpu(name)

Copies CPU data of array with specified name to GPU. Works only if ‘cpu=True’ and ‘gpu=True’ were used in the ‘add_array’ method.

abstract all_to_cpu()

Copies data from GPU to CPU for all arrays that have a CPU and a GPU representation.

abstract all_to_gpu()

Copies data from CPU to GPU for all arrays that have a CPU and a GPU representation.

abstract is_on_gpu(name)

Checks if this data was also allocated on the GPU - does not check if this data item is in sync.

abstract create_vtk_writer(file_name, data_names, ghost_layers=False)

VTK output for one or multiple arrays.

Parameters:
  • file_name – base file name without extension for the VTK output

  • data_names – list of array names that should be included in the VTK output

  • ghost_layers – True if ghost layer information should be written out as well

Return type:

Callable[[int], None]

Returns:

a function that can be called with an integer time step to write the current state, e.g. create_vtk_writer(‘some_file’, [‘velocity’, ‘density’])(1)
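The "create once, call per time step" pattern can be sketched with a closure. make_writer is a hypothetical stand-in that writes nothing; it only records which output it would produce. The naming scheme with a zero-padded step number is an assumption for illustration and not the documented file naming.

```python
def make_writer(file_name, data_names):
    """Hypothetical stand-in returning a callable that takes the time step."""
    written = []

    def write(step):
        # A real writer would dump the arrays in data_names here;
        # this sketch only records the intended output name.
        written.append(f"{file_name}_{step:08d}")

    write.written = written        # exposed for inspection in this sketch
    return write


writer = make_writer("some_file", ["velocity", "density"])
for step in range(3):
    writer(step)
```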

abstract create_vtk_writer_for_flag_array(file_name, data_name, masks_to_name, ghost_layers=False)

VTK output for an unsigned integer field, where bits are interpreted as flags.

Parameters:
  • file_name – see create_vtk_writer

  • data_name – name of an array with uint type

  • masks_to_name – dictionary mapping integer masks to a name in the output

  • ghost_layers – see create_vtk_writer

Return type:

Callable[[int], None]

Returns:

functor that can be called with time step
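How the bits of a cell value map to named outputs through masks_to_name can be sketched as follows. decode_flags is a hypothetical helper, and the flag names are made up for the example.

```python
def decode_flags(cell_value, masks_to_name):
    """Return, per flag name, 1 if any bit of its mask is set in the cell, else 0."""
    return {name: int(bool(cell_value & mask))
            for mask, name in masks_to_name.items()}


masks_to_name = {0b001: "fluid", 0b010: "wall", 0b100: "inflow"}
decoded = decode_flags(0b101, masks_to_name)
```

Here the cell value 0b101 has the first and third bit set, so 'fluid' and 'inflow' are reported as set and 'wall' is not.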

abstract synchronization_function(names, stencil=None, target=None, **kwargs)

Synchronizes ghost layers for distributed arrays.

In the serial scenario this has to be called for correct periodicity handling.

Parameters:
  • names – what data to synchronize: name of array or sequence of names

  • stencil – stencil as string defining which neighbors are synchronized, e.g. ‘D2Q9’ or ‘D3Q19’; if None, a full synchronization (i.e. D2Q9 or D3Q27) is done

  • target – Target, either ‘CPU’ or ‘GPU’

  • kwargs – implementation specific, optional optimization parameters for communication

Return type:

Callable[[], None]

Returns:

function object to run the communication
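Why synchronization is needed even in the serial case can be seen in one dimension: for a periodic domain, the ghost cells on each side have to mirror the interior cells on the opposite side. The following sketch uses a plain list and a hypothetical function, synchronize_periodic; it illustrates the idea only and is not the pystencils implementation.

```python
def synchronize_periodic(cells, ghost_layers=1):
    """Fill the ghost cells of a 1D periodic array in place."""
    g = ghost_layers
    cells[:g] = cells[-2 * g:-g]   # left ghosts  <- rightmost interior cells
    cells[-g:] = cells[g:2 * g]    # right ghosts <- leftmost interior cells
    return cells


# Interior values 1..4 surrounded by one ghost cell on each side
line = synchronize_periodic([0, 1, 2, 3, 4, 0])
wide = synchronize_periodic([0, 0, 1, 2, 3, 4, 0, 0], ghost_layers=2)
```

A stencil that reads the left neighbor of the first interior cell now sees the last interior value, which is what periodicity requires.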

reduce_float_sequence(sequence, operation, all_reduce=False)

Takes a sequence of floating point values on each process and reduces it element-wise.

If all_reduce is True, all processes get the result, otherwise only the root process. Possible operations are ‘sum’, ‘min’ and ‘max’.

Return type:

array
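Element-wise reduction means that the i-th entries of all processes are combined, so the result has the same length as each input sequence. The sketch below simulates this with a list of per-process sequences; reduce_sequences is a hypothetical helper and no MPI is involved.

```python
def reduce_sequences(per_process, operation):
    """Combine the i-th entries of all per-process sequences."""
    operations = {"sum": sum, "min": min, "max": max}
    return [operations[operation](values) for values in zip(*per_process)]


per_process = [[1.0, 5.0], [3.0, 2.0]]      # two processes, two values each
total = reduce_sequences(per_process, "sum")
smallest = reduce_sequences(per_process, "min")
largest = reduce_sequences(per_process, "max")
```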

reduce_int_sequence(sequence, operation, all_reduce=False)

See function reduce_float_sequence - this is the same for integers

Return type:

array

fill(array_name, val, value_idx=None, slice_obj=None, ghost_layers=False, inner_ghost_layers=False)

Sets all cells to the same value.

Parameters:
  • array_name (str) – name of the array that should be modified

  • val – value to set the array to

  • value_idx (Union[int, Tuple[int, ...], None]) – If an array stores multiple values per cell, this index chooses which of these values to fill. If None, all values are set

  • slice_obj – if passed, only the defined slice is filled

  • ghost_layers – True if the outer ghost layers should also be filled

  • inner_ghost_layers – True if the inner ghost layers should be filled. Inner ghost layers occur only in parallel setups for distributed memory communication.

Return type:

None

min(array_name, slice_obj=None, ghost_layers=False, inner_ghost_layers=False, reduce=True)

Returns the minimum value inside the domain or slice of the domain.

For meaning of arguments see documentation of DataHandling.fill().

Returns:

if reduce is False, the minimum of the locally stored part of the domain; otherwise the global minimum on the root process and None on all other processes
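The three possible return values can be sketched with simulated processes. minimum is a hypothetical function evaluated from the point of view of one rank, with local_parts[r] holding the values stored on process r.

```python
def minimum(local_parts, rank, reduce=True, root=0):
    """Simulated min: local value, global value on root, or None."""
    if not reduce:
        return min(local_parts[rank])            # only the local part is inspected
    if rank != root:
        return None                              # non-root processes get no result
    return min(min(part) for part in local_parts)


parts = [[4.0, 7.0], [1.5, 9.0]]
local_on_0 = minimum(parts, rank=0, reduce=False)
global_on_root = minimum(parts, rank=0)
global_on_other = minimum(parts, rank=1)
```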

max(array_name, slice_obj=None, ghost_layers=False, inner_ghost_layers=False, reduce=True)

Returns the maximum value inside the domain or slice of the domain.

For argument description see DataHandling.min()

save_all(file)

Saves all field data to a file on disk.

load_all(file)

Loads all field data from a file on disk.

Works only if save_all was called with exactly the same field sizes, layouts etc. When run in parallel, save and load have to be called with the same number of processes. Use for checkpointing only; to store results use VTK output.
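The checkpointing workflow, including the requirement that sizes match, can be sketched with the standard library. The functions below are stand-ins that reuse the names save_all and load_all on a plain dictionary of lists; the pickle-based file format and the size check are assumptions for illustration, not the format or validation used by pystencils.

```python
import os
import pickle
import tempfile


def save_all(arrays, file):
    """Write every array of the dictionary to one file (format is an assumption)."""
    with open(file, "wb") as handle:
        pickle.dump(arrays, handle)


def load_all(arrays, file):
    """Overwrite the registered arrays with the contents of a checkpoint."""
    with open(file, "rb") as handle:
        loaded = pickle.load(handle)
    # Refuse checkpoints written for a different set of arrays or sizes
    if {k: len(v) for k, v in loaded.items()} != {k: len(v) for k, v in arrays.items()}:
        raise ValueError("checkpoint does not match the registered arrays")
    for name, values in loaded.items():
        arrays[name][:] = values


arrays = {"density": [1.0, 2.0, 3.0]}
path = os.path.join(tempfile.mkdtemp(), "checkpoint.bin")
save_all(arrays, path)
arrays["density"][:] = [0.0, 0.0, 0.0]   # the simulation continues and changes the data
load_all(arrays, path)                   # roll back to the checkpoint
```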

log(*args, level='INFO')

Similar to print with additional information (time, rank).

log_on_root(*args, level='INFO')

Logs only on root process. For serial setups this is equivalent to log

property is_root

Returns True for exactly one process in the simulation

property world_rank

Rank (number) of the current process