(ir_objects)=
# Constants, Memory Objects, and Functions

## Memory Objects: Symbols and Buffers

### The Memory Model

In order to reason about memory accesses, mutability, invariance, and aliasing, the *pystencils* backend uses
a very simple memory model. There are three types of memory objects:

- Symbols ({any}`PsSymbol`), which act as registers for data storage within the scope of a kernel
- Field buffers ({any}`PsBuffer`), which represent a contiguous block of memory the kernel has access to, and
- the *unmanaged heap*, a global catch-all memory object that all pointers not belonging to a field
  buffer point into.

All of these objects are disjoint, and cannot alias each other.
Each symbol exists in isolation,
field buffers do not overlap,
and raw pointers are assumed not to point into memory owned by a symbol or field buffer.
Instead, all raw pointers point into unmanaged heap memory, and are assumed to *always* alias one another:
every write to unmanaged memory through one raw pointer is assumed to affect the memory pointed to by
any other raw pointer.
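
The aliasing rules above can be condensed into a single predicate. The following is a minimal sketch in plain Python, not part of pystencils: `Kind`, `MemObject`, and `may_alias` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Kind(Enum):
    SYMBOL = auto()
    BUFFER = auto()
    HEAP = auto()  # the single unmanaged heap


@dataclass(frozen=True)
class MemObject:
    """Hypothetical stand-in for a memory object; not a pystencils class."""

    kind: Kind
    name: str


def may_alias(a: MemObject, b: MemObject) -> bool:
    """Return True if accesses to `a` and `b` may touch the same memory."""
    # All raw pointers point into the unmanaged heap and always alias.
    if a.kind is Kind.HEAP and b.kind is Kind.HEAP:
        return True
    # Symbols and buffers are disjoint: only the same object can overlap itself.
    return a == b


x = MemObject(Kind.SYMBOL, "x")
f = MemObject(Kind.BUFFER, "f")
g = MemObject(Kind.BUFFER, "g")
heap = MemObject(Kind.HEAP, "unmanaged")

print(may_alias(f, g))        # distinct buffers never overlap
print(may_alias(heap, heap))  # two raw pointers are assumed to alias
print(may_alias(x, heap))     # a raw pointer never points at a symbol
```

Because every raw pointer maps to the same `heap` object in this sketch, any two raw-pointer accesses are treated as potentially conflicting.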

### Symbols

In the pystencils IR, instances of {any}`PsSymbol` represent what is generally known as "virtual registers".
These are memory locations that are private to a function, cannot be aliased or pointed to, and will finally reside
either in physical registers or on the stack.
Each symbol has a name and a data type. The data type may initially be {any}`None`, in which case it should soon after be
determined by the {any}`Typifier`.

Unlike their front-end counterpart {any}`sympy.Symbol <sympy.core.symbol.Symbol>`,
{any}`PsSymbol` instances are mutable;
their properties can and often will change over time.
As a consequence, they are not comparable by value:
two {any}`PsSymbol` instances with the same name and data type will in general *not* be equal.
In fact, most of the time, it is an error to have two distinct symbol instances with the same name and data type active at once.

#### Creating Symbols

During kernel translation, symbols never exist in isolation, but should always be managed by a {any}`KernelCreationContext`.
Symbols can be created and retrieved using {any}`add_symbol <KernelCreationContext.add_symbol>`
and {any}`find_symbol <KernelCreationContext.find_symbol>`.
A symbol can also be duplicated using {any}`duplicate_symbol <KernelCreationContext.duplicate_symbol>`,
which assigns a new name to the symbol's copy.
The {any}`KernelCreationContext` keeps track of all existing symbols during a kernel translation run
and makes sure that no name or data type conflicts arise.

Never call the constructor of {any}`PsSymbol` directly unless you really know what you are doing.
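
The bookkeeping described above can be pictured with a toy model. This is a simplified sketch under stated assumptions, not the actual {any}`KernelCreationContext`: `ToySymbol` and `ToyContext` are hypothetical, data types are plain strings, and the naming scheme used for duplicates is invented for illustration.

```python
class ToySymbol:
    """Mutable symbol; compared by identity because __eq__ is not overridden."""

    def __init__(self, name, dtype=None):
        self.name = name
        self.dtype = dtype


class ToyContext:
    """Hypothetical registry that owns every symbol of one translation run."""

    def __init__(self):
        self._symbols = {}

    def add_symbol(self, name, dtype=None):
        sym = self._symbols.get(name)
        if sym is None:
            sym = ToySymbol(name, dtype)
            self._symbols[name] = sym
            return sym
        # The name already exists: reuse the instance, but reject type conflicts.
        if dtype is not None:
            if sym.dtype is None:
                sym.dtype = dtype
            elif sym.dtype != dtype:
                raise TypeError(f"conflicting data types for symbol {name!r}")
        return sym

    def find_symbol(self, name):
        return self._symbols.get(name)

    def duplicate_symbol(self, sym):
        # Assumed naming scheme: append the first unused numeric suffix.
        i = 0
        while f"{sym.name}__{i}" in self._symbols:
            i += 1
        return self.add_symbol(f"{sym.name}__{i}", sym.dtype)


ctx = ToyContext()
a = ctx.add_symbol("a", "float64")
print(ctx.add_symbol("a") is a)        # the context hands back the same instance
print(ToySymbol("a", "float64") == a)  # a hand-built twin is a different symbol
print(ctx.duplicate_symbol(a).name)    # the copy receives a fresh name
```

The second line of the usage shows why bypassing the context is risky: a directly constructed symbol with the same name and type is unknown to the registry and does not compare equal to the managed one.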

#### Symbol Properties

Symbols can be annotated with arbitrary information using *symbol properties*.
Each symbol property type must be a subclass of {any}`PsSymbolProperty`.
It is strongly recommended to implement property types using frozen
[dataclasses](https://docs.python.org/3/library/dataclasses.html).
For example, this snippet defines a property type that models pointer alignment requirements:

```{code-block} python

from dataclasses import dataclass

from pystencils.codegen.properties import UniqueSymbolProperty

@dataclass(frozen=True)
class AlignmentProperty(UniqueSymbolProperty):
    """Require this pointer symbol to be aligned at a particular byte boundary."""

    byte_boundary: int

```

Inheriting from {any}`UniqueSymbolProperty` ensures that at most one property of this type can be attached to
a symbol at any time.
Properties can be added, queried, and removed using the {any}`PsSymbol` properties API listed below.
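
To illustrate why frozen dataclasses suit this purpose and what "at most one property of this type" means, here is a self-contained toy model. It does not use pystencils; `ToyProperty`, `ToyUniqueProperty`, `AnnotatedSymbol`, and the method names are hypothetical, so consult the API listing for the actual interface.

```python
from dataclasses import dataclass


class ToyProperty:
    """Hypothetical base class for all property types."""


class ToyUniqueProperty(ToyProperty):
    """Hypothetical marker: at most one property of each subclass per symbol."""


@dataclass(frozen=True)
class Alignment(ToyUniqueProperty):
    byte_boundary: int


class AnnotatedSymbol:
    def __init__(self, name):
        self.name = name
        # Frozen dataclasses are hashable, so properties can live in a set.
        self._properties = set()

    def add_property(self, prop):
        if isinstance(prop, ToyUniqueProperty):
            for other in self._properties:
                if type(other) is type(prop) and other != prop:
                    raise ValueError(f"{type(prop).__name__} is already set")
        self._properties.add(prop)

    def get_properties(self, prop_type):
        return {p for p in self._properties if isinstance(p, prop_type)}

    def remove_property(self, prop):
        self._properties.remove(prop)


ptr = AnnotatedSymbol("data_ptr")
ptr.add_property(Alignment(64))
ptr.add_property(Alignment(64))  # adding an equal property again is harmless
print(ptr.get_properties(Alignment))
```

Attempting `ptr.add_property(Alignment(32))` at this point raises a `ValueError` in the sketch, since a second, different alignment would contradict the first.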

Many symbol properties are more relevant to consumers of generated kernels than to the code generator itself.
The above alignment property, for instance, may be added to a pointer symbol by a vectorization pass
to document its assumption that the pointer be properly aligned, in order to emit aligned load and store instructions.
It then becomes the responsibility of the runtime system embedding the kernel to check this prerequisite before calling the kernel.
To make sure this information becomes visible, any properties attached to symbols exposed as kernel parameters will also
be added to their respective {any}`Parameter` instance.

### Buffers

Buffers, represented by the {any}`PsBuffer` class, are contiguous, n-dimensional, linearized cuboid blocks of memory.
Each buffer has a fixed name and element data type,
and will be represented in the IR via three sets of symbols:

- The *base pointer* is a symbol of pointer type which points into the buffer's underlying memory area.
  Each buffer has at least one, its primary base pointer, whose pointed-to type must be the same as the
  buffer's element type. There may be additional base pointers pointing into subsections of that memory.
  These additional base pointers may also have deviating data types, as is for instance required for
  type erasure in certain cases.
  To communicate its role to the code generation system,
  each base pointer needs to be marked as such using the {any}`BufferBasePtr` property.
- The buffer *shape* defines the size of the buffer in each dimension. Each shape entry is either a {any}`symbol <PsSymbol>`
  or a {any}`constant <PsConstant>`.
- The buffer *strides* define the step size to go from one entry to the next in each dimension.
  Like the shape, each stride entry is also either a symbol or a constant.

The shape and stride symbols must all have the same data type, which will be stored as the buffer's index data type.
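
As a worked example of how shape and strides linearize an n-dimensional index, consider the following sketch. It assumes that strides are counted in elements rather than bytes and uses a row-major layout for the demonstration; both are assumptions made for illustration, and `linear_offset` and `contiguous_strides` are hypothetical helpers, not pystencils functions.

```python
def linear_offset(index, strides):
    """Offset (in elements) of `index` relative to the base pointer."""
    if len(index) != len(strides):
        raise ValueError("index and strides must have the same rank")
    return sum(i * s for i, s in zip(index, strides))


def contiguous_strides(shape):
    """Row-major strides in elements: the last dimension varies fastest."""
    strides = []
    step = 1
    for extent in reversed(shape):
        strides.append(step)
        step *= extent
    return tuple(reversed(strides))


shape = (4, 3, 2)
strides = contiguous_strides(shape)
print(strides)                            # (6, 2, 1)
print(linear_offset((1, 2, 1), strides))  # 1*6 + 2*2 + 1*1 = 11
```

Since each stride entry may be a symbol rather than a constant, the same sum of products can also be built symbolically and evaluated only at run time.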

#### Creating and Managing Buffers

Similarly to symbols, buffers are typically managed by the {any}`KernelCreationContext`, which associates each buffer
with a front-end {any}`Field`. Buffers for fields can be obtained using {any}`get_buffer <KernelCreationContext.get_buffer>`.
The context makes sure to avoid name conflicts between buffers.

## Constants

In the pystencils IR, numerical constants are represented by the {any}`PsConstant` class.
It interacts with the type system (in particular, with {any}`PsNumericType` and its subclasses)
to facilitate bit-exact storage, arithmetic, and type conversion of constants.

Each constant has a *value* and a *data type*. As long as the data type is `None`,
the constant is untyped and its value may be any Python object.
To add a data type, an instance of {any}`PsNumericType` must either be set in the constructor,
or be applied by converting an existing constant using {any}`interpret_as <PsConstant.interpret_as>`.
Once a data type is set, the set of legal values is constrained by that type.

To ensure the correctness of the internal representation, `PsConstant` calls {any}`PsNumericType.create_constant`.
This method must be overridden by subclasses of `PsNumericType`; it either returns an object
that represents the numerical constant according to the rules of the data type,
or raises an exception if that is not possible.
The fixed-width integers, the IEEE-754 floating point types, and the corresponding vector variants that are
implemented in pystencils use NumPy for this purpose.

The same protocol is used for type conversion of constants, using {any}`PsConstant.reinterpret_as`.
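
The interplay between a constant and its data type can be sketched as follows. This toy model replaces the NumPy-based implementation with a plain range check and is only meant to show the protocol: `ToyIntType` and `ToyConstant` are hypothetical classes, and their behavior is an assumption made for illustration.

```python
class ToyIntType:
    """Hypothetical fixed-width integer type."""

    def __init__(self, width, signed=True):
        self.width = width
        self.signed = signed

    def create_constant(self, value):
        """Return `value` if the type can represent it exactly, else raise."""
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{value!r} is not an integer")
        if self.signed:
            lo, hi = -(1 << (self.width - 1)), (1 << (self.width - 1)) - 1
        else:
            lo, hi = 0, (1 << self.width) - 1
        if not lo <= value <= hi:
            raise ValueError(f"{value} does not fit into {self.width} bits")
        return value


class ToyConstant:
    """Hypothetical constant: untyped until a data type is applied."""

    def __init__(self, value, dtype=None):
        self.dtype = dtype
        # Untyped constants accept any Python object; typed ones are validated.
        self.value = value if dtype is None else dtype.create_constant(value)

    def interpret_as(self, dtype):
        return ToyConstant(self.value, dtype)


untyped = ToyConstant(300)
typed = untyped.interpret_as(ToyIntType(16))
print(typed.value, typed.dtype.width)  # 300 16
```

In this sketch, `untyped.interpret_as(ToyIntType(8))` raises a `ValueError`, because 300 lies outside the range of a signed 8-bit integer.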

## Functions

The pystencils IR models two primary kinds of functions: IR functions and external functions.

IR functions serve only as an intermediate representation of a function,
and must be lowered to a concrete implementation at some point during kernel translation.
This is typically done by the {any}`SelectFunctions` pass, in combination with the active {any}`Platform`
class.

IR function classes derive from the common base class {any}`PsIrFunction`.
The following groups of IR functions exist:

 - Pure mathematical functions (such as `sqrt`, `sin`, `exp`, ...) through {any}`PsMathFunction`;
 - Special numerical constants (such as $\pi$, $e$, $\pm \infty$) as 0-ary functions through {any}`PsConstantFunction`;
 - Intrinsic GPU functions (such as fast division and square roots) through {any}`PsGpuIntrinsicFunction`;
 - Random number generator invocations through {any}`PsRngEngineFunction`.

External functions with a fixed C-like signature are modelled by the {any}`CFunction` class.
They are used to inject platform-specific runtime APIs, vector intrinsics, and user-defined
external functions into the IR.
These are the only functions allowed to remain in a kernel by the time it is exported
as C code.

### Side Effects

`PsMathFunction` and `PsConstantFunction` represent *pure* functions.
Their occurrences may be moved, optimized, or eliminated by the code generator.
For `CFunction`, on the other hand, side effects are conservatively assumed,
such that these cannot be freely manipulated.

## Literals

In the pystencils IR, a *literal* is an expression string, with an associated data type,
that is taken literally and printed out verbatim by the code generator.
Literals are represented by the {any}`PsLiteral` class
and are used to represent compiler built-ins
(like the CUDA variables `threadIdx`, `blockIdx`, ...),
preprocessor macros (like `INFINITY`),
and other pieces of code that could not otherwise be modelled.
Literals are assumed to be *constant* with respect to the kernel,
and their evaluation is assumed to be free of side effects.

## API Documentation

```{eval-rst}

.. automodule:: pystencils.codegen.properties
  :members:

.. automodule:: pystencils.backend.memory
  :members:

.. automodule:: pystencils.backend.constants
  :members:

.. automodule:: pystencils.backend.literals
  :members:

.. automodule:: pystencils.backend.functions
  :members:

```