Qualifiers
OpenCL C supports four types of qualifiers: function qualifiers, address space qualifiers, access qualifiers, and type qualifiers.
Function Qualifiers
OpenCL C adds the kernel (or __kernel) function qualifier. This qualifier is used to specify that a function in the program source is a kernel function. The following example demonstrates the use of the kernel qualifier:
kernel void
parallel_add(global float *a, global float *b, global float *result)
{
    ...
}

// The following is an illegal kernel declaration and will result
// in a compile-time error. The kernel function has a return type
// of int instead of void.
kernel int
parallel_add(global float *a, global float *b, global float *result)
{
    ...
}
The following rules apply to kernel functions:
- The return type must be void. If the return type is not void, it will result in a compilation error.
- The function can be executed on a device by enqueuing a command to execute the kernel from the host.
- The function behaves as a regular function if it is called from a kernel function. The one caveat concerns kernel functions that declare variables with the local qualifier: calling such a function from another kernel function is implementation-defined and should not be relied on.
The following example shows a kernel function calling another kernel function that has variables declared with the local qualifier. The behavior is implementation-defined so it is not portable across implementations and should therefore be avoided.
kernel void
my_func_a(global float *src, global float *dst)
{
    local float l_var[32];
    ...
}

kernel void
my_func_b(global float *src, global float *dst)
{
    my_func_a(src, dst);   // implementation-defined behavior
}
A better way to implement this example that is also portable is to pass the local variable as an argument to the kernel:
kernel void
my_func_a(global float *src, global float *dst, local float *l_var)
{
    ...
}

kernel void
my_func_b(global float *src, global float *dst, local float *l_var)
{
    my_func_a(src, dst, l_var);
}
Kernel Attribute Qualifiers
The kernel qualifier can be used with the keyword __attribute__ to declare the following additional information about the kernel:
- __attribute__((work_group_size_hint(X, Y, Z))) is a hint to the compiler that specifies the work-group size most likely to be used, that is, the value passed as the local_work_size argument to clEnqueueNDRangeKernel.
- __attribute__((reqd_work_group_size(X, Y, Z))) specifies the work-group size that must be used, that is, the value passed as the local_work_size argument to clEnqueueNDRangeKernel. This gives the compiler an opportunity to perform optimizations that depend on knowing the work-group size.
- __attribute__((vec_type_hint(<type>))) is a hint to the compiler on the computational width of the kernel, that is, the size of the data type the kernel is operating on. This serves as a hint to an auto-vectorizing compiler. The default value of <type> is int, indicating that the kernel is scalar in nature and the auto-vectorizer can therefore vectorize the code across the SIMD lanes of the vector unit for multiple work-items.
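The attribute syntax described above can be sketched as follows. This is a minimal illustration rather than code from a real application: the kernel names scale_fixed and scale_hinted, their arguments, and the work-group size of 64 are all assumptions chosen for the example.

```c
// Assumption for illustration: the host always launches this kernel with
// a local_work_size of 64 x 1 x 1. An enqueue with a different
// local_work_size fails with CL_INVALID_WORK_GROUP_SIZE.
kernel __attribute__((reqd_work_group_size(64, 1, 1)))
void scale_fixed(global float *data, float factor)
{
    size_t i = get_global_id(0);
    data[i] *= factor;
}

// Hints only: the host may still choose another work-group size.
// vec_type_hint(float4) tells an auto-vectorizing compiler that the
// kernel already operates on 4-wide float vectors.
kernel __attribute__((work_group_size_hint(64, 1, 1)))
       __attribute__((vec_type_hint(float4)))
void scale_hinted(global float4 *data, float factor)
{
    size_t i = get_global_id(0);
    data[i] *= factor;   // the scalar factor is widened to a float4
}
```

The required-size form is the stronger contract: it constrains the host, whereas the two hints leave the launch configuration unconstrained.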
Address Space Qualifiers
Work-items executing a kernel have access to four distinct memory regions. Each region is named by an address space qualifier, which is used as a type qualifier: global (or __global), local (or __local), constant (or __constant), or private (or __private).
If the type of an object is qualified by an address space name, the object is allocated in the specified address space. If the address space name is not specified, then the object is allocated in the generic address space. The generic address space name (for arguments to functions in a program, or local variables in a function) is private.
A few examples that describe how to specify address space names follow:
// declares a pointer p in the private address space that points to
// a float object in address space global
global float *p;

// declares an array of integers in the private address space
int f[4];

// for my_func_a function we have the following arguments:
//
// src - declares a pointer in the private address space that
//       points to a float object in address space constant
//
// v - allocate in the private address space
//
int
my_func_a(constant float *src, int4 v)
{
    float temp;   // temp is allocated in the private address space.
}
Arguments to a kernel function that are declared as pointers must point to one of the following address spaces only: global, local, or constant. Not specifying an address space name for such arguments will result in a compilation error. This limitation does not apply to non-kernel functions in a program.
A few examples of legal and illegal use cases are shown here:
kernel void my_func(int *p)           // illegal because generic address space
                                      // name for p is private.

kernel void my_func(private int *p)   // illegal because memory pointed to by
                                      // p is allocated in private.

void my_func(int *p)                  // generic address space name for p is private.
                                      // legal as my_func is not a kernel function

void my_func(private int *p)          // legal as my_func is not a kernel function
Global Address Space
This address space name is used to refer to memory objects (buffers and images) allocated from the global memory region. This memory region allows read/write access to all work-items in all work-groups executing a kernel. This address space is identified by the global qualifier.
A buffer object can be declared as a pointer to a scalar, vector, or user-defined struct. Some examples are:
global float4 *color;    // an array of float4 elements

typedef struct {
    float3 a;
    int2 b[2];
} foo_t;

global foo_t *my_info;   // an array of foo_t elements
The global address qualifier should not be used for image types.
Pointers to the global address space are allowed as arguments to functions (including kernel functions) and variables declared inside functions. Variables declared inside a function cannot be allocated in the global address space.
A few examples of legal and illegal use cases are shown here:
void
my_func(global float4 *vA, global float4 *vB)
{
    global float4 *p;   // legal
    global float4 a;    // illegal
}
Constant Address Space
This address space name is used to describe variables that are allocated in global memory and accessed inside kernels as read-only variables. This memory region allows read-only access to all work-items in all work-groups executing a kernel. This address space is identified by the constant qualifier.
Image types cannot be allocated in the constant address space. The following example shows imgA allocated in the constant address space, which is illegal and will result in a compilation error:
kernel void my_func(constant image2d_t imgA) { ... }
Pointers to the constant address space are allowed as arguments to functions (including kernel functions) and variables declared inside functions.
Variables in kernel function scope (i.e., the outermost scope of a kernel function) can be allocated in the constant address space. Variables in program scope (i.e., global variables in a program) can be allocated only in the constant address space. All such variables are required to be initialized, and the values used to initialize these variables must be compile-time constants. Writing to such a variable will result in a compile-time error.
Also, storage for all string literals declared in a program will be in the constant address space.
A few examples of legal and illegal use cases follow:
// legal - program scope variables can be allocated only
// in the constant address space
constant float wtsA[] = { 0, 1, 2, ... };   // program scope

// illegal - program scope variables can be allocated only
// in the constant address space
global float wtsB[] = { 0, 1, 2, ... };

kernel void
my_func(constant float4 *vA, constant float4 *vB)
{
    constant float4 *p = vA;   // legal
    constant float a;          // illegal - not initialized
    constant float b = 2.0f;   // legal - initialized with a
                               // compile-time constant
    p[0] = (float4)(1.0f);     // illegal - the memory p points to
                               // cannot be modified

    // the string "opencl version" is allocated in the constant
    // address space, so the pointer to it must be a constant pointer
    constant char *c = "opencl version";
}
Local Address Space
This address space name is used to describe variables that need to be allocated in local memory and are shared by all work-items of a work-group but not across work-groups executing a kernel. This memory region allows read/write access to all work-items in a work-group. This address space is identified by the local qualifier.
A good analogy for local memory is a user-managed cache. Local memory can significantly improve performance if a work-item or multiple work-items in a work-group are reading from the same location in global memory. For example, when applying a Gaussian filter to an image, multiple work-items read overlapping regions of the image. The overlap region size is determined by the width of the filter. Instead of reading multiple times from global memory (which is an order of magnitude slower), it is preferable to read the required data from global memory once into local memory and then have the work-items read multiple times from local memory.
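The read-once, reuse-many pattern described above can be sketched with a one-dimensional filter, which keeps the indexing short. This is an illustrative sketch, not the Gaussian filter itself: the kernel name blur3, the 3-tap average, and the argument names are hypothetical, and it assumes that the global size equals n, that n is a multiple of the work-group size, and that the host sizes tile to hold the work-group size plus two floats.

```c
// Hypothetical 3-tap averaging filter that stages its input in local memory.
// Assumptions: global size == n, n is a multiple of the work-group size,
// and tile has room for (work-group size + 2) floats.
kernel void blur3(global const float *src,
                  global float *dst,
                  local float *tile,
                  int n)
{
    int gid = get_global_id(0);
    int lid = get_local_id(0);
    int lsz = get_local_size(0);

    // Each work-item loads its own element once from global memory.
    // Slots 0 and lsz + 1 hold the overlap (halo) elements.
    tile[lid + 1] = src[gid];
    if (lid == 0)
        tile[0] = src[max(gid - 1, 0)];             // clamp at the left edge
    if (lid == lsz - 1)
        tile[lsz + 1] = src[min(gid + 1, n - 1)];   // clamp at the right edge

    // Every work-item in the group must finish loading before any reads.
    barrier(CLK_LOCAL_MEM_FENCE);

    // The three reads per work-item now hit local memory, not global memory.
    dst[gid] = (tile[lid] + tile[lid + 1] + tile[lid + 2]) / 3.0f;
}
```

On the host side, a local pointer argument such as tile is given a size and a NULL value through clSetKernelArg. The wider the filter, the larger the overlap region and the more global reads this staging saves.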
Pointers to the local address space are allowed as arguments to functions (including kernel functions) and variables declared inside functions. Variables declared inside a kernel function can be allocated in the local address space but with a few restrictions:
- These variable declarations must occur at kernel function scope.
- These variables cannot be initialized.
Note that variables in the local address space that are passed as pointer arguments to or declared inside a kernel function exist only for the lifetime of the work-group executing the kernel.
A few examples of legal and illegal use cases are shown here:
kernel void
my_func(global float4 *vA, local float4 *l)
{
    local float4 *p;                // legal
    local float4 a;                 // legal
    a = 1;
    local float4 b = (float4)(0);   // illegal - b cannot be initialized

    if (...)
    {
        local float c;              // illegal - must be allocated at
                                    // kernel function scope
        ...
    }
}
Private Address Space
This address space name is used to describe variables that are private to a work-item and cannot be shared between work-items in a work-group or across work-groups. This address space is identified by the private qualifier.
Variables inside a kernel function not declared with an address space qualifier, all variables declared inside non-kernel functions, and all function arguments are in the private address space.
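A short sketch may make the defaults above concrete. The function names square_sum and norms and their arguments are hypothetical; the point is only which objects end up in the private address space.

```c
// Non-kernel function: the arguments x and y and the variable s are all
// private to the calling work-item.
float square_sum(float x, float y)
{
    float s = x * x + y * y;
    return s;
}

kernel void norms(global const float2 *in, global float *out)
{
    size_t i = get_global_id(0);   // no qualifier, so i is private
    private float2 v = in[i];      // the same default, spelled out explicitly
    out[i] = square_sum(v.x, v.y);
}
```

The pointers in and out are themselves private variables; only the memory they point to lives in the global address space.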
Casting between Address Spaces
A pointer in an address space can be assigned to another pointer only in the same address space. Casting a pointer in one address space to a pointer in a different address space is illegal. For example:
kernel void
my_func(global float4 *particles)
{
    // legal - particle_ptr & particles are in the
    // same address space
    global float *particle_ptr = (global float *)particles;

    // illegal - private_ptr and particle_ptr are in different
    // address spaces
    float *private_ptr = (float *)particle_ptr;
}
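When data held in one address space is needed through a pointer in another, one option is to copy the value rather than cast the pointer. The following is a minimal sketch of that idea; the kernel name copy_particle and the out argument are hypothetical.

```c
kernel void copy_particle(global float4 *particles, global float *out)
{
    size_t i = get_global_id(0);

    // Copies the value from global memory into a private variable.
    // No pointer changes address space here.
    float4 p = particles[i];

    // legal - pp and p are both in the private address space
    float4 *pp = &p;

    out[i] = pp->x + pp->y + pp->z + pp->w;
}
```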
Access Qualifiers
Access qualifiers can be specified for kernel arguments that are an image type. They declare whether the image is read-only (read_only or __read_only) or write-only (write_only or __write_only). The restriction exists because current GPUs do not allow reading from and writing to the same image in a kernel: image reads are cached in a texture cache, but writes to an image do not update the texture cache.
In the following example imageA is a read-only 2D image object and imageB is a write-only 2D image object:
kernel void my_func(read_only image2d_t imageA, write_only image2d_t imageB) { ... }
Images declared with the read_only qualifier can be used with the built-in functions that read from an image. However, these images cannot be used with built-in functions that write to an image. Similarly, images declared with the write_only qualifier can be used only to write to an image and cannot be used to read from an image. The following examples demonstrate this:
kernel void
my_func(read_only image2d_t imageA,
        write_only image2d_t imageB,
        sampler_t sampler)
{
    float4 clr;
    float2 coords;
    int2 icoords;   // write_imagef takes integer coordinates

    clr = read_imagef(imageA, sampler, coords);   // legal
    clr = read_imagef(imageB, sampler, coords);   // illegal
    write_imagef(imageA, icoords, clr);           // illegal
    write_imagef(imageB, icoords, clr);           // legal
}
imageA is declared to be a read_only image so it cannot be passed as an argument to write_imagef. Similarly, imageB is declared to be a write_only image so it cannot be passed as an argument to read_imagef.
The read-write qualifier (read_write or __read_write) is reserved. Using this qualifier will result in a compile-time error.
Type Qualifiers
The type qualifiers const, restrict, and volatile as defined by the C99 specification are supported. These qualifiers cannot be used with the image2d_t and image3d_t types. The restrict qualifier can be applied only to pointer types.
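As a brief sketch of how these qualifiers combine with address space qualifiers, consider the following. The kernel name saxpy and its arguments are hypothetical, and the restrict qualifiers encode an assumption that the host never passes overlapping buffers for x and y.

```c
// const:    x is only read through this pointer, and a is never modified.
// restrict: x and y are assumed not to alias each other.
kernel void saxpy(global const float * restrict x,
                  global float * restrict y,
                  const float a)
{
    size_t i = get_global_id(0);
    y[i] = a * x[i] + y[i];
}

// volatile: every access to *flag is performed as written and is not
// cached in a register or optimized away by the compiler.
kernel void poll_flag(volatile global int *flag, global int *seen)
{
    seen[get_global_id(0)] = *flag;
}
```

Note that restrict follows the `*` because it qualifies the pointer itself, whereas const before float qualifies the data the pointer refers to.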