Vector Data Types
For the scalar integer and floating-point data types described in Table 4.1, OpenCL C adds support for vector data types. The vector data type is defined with the type name, that is, char, uchar, short, ushort, int, uint, float, long, or ulong followed by a literal value n that defines the number of elements in the vector. Supported values of n are 2, 3, 4, 8, and 16 for all vector data types. Optionally, vector data types are also defined for double and half. These are available only if the device supports the double-precision and half-precision extensions. The supported vector data types are described in Table 4.2.
Table 4.2. Built-In Vector Data Types
Type | Description
charn | A vector of n 8-bit signed integer values
ucharn | A vector of n 8-bit unsigned integer values
shortn | A vector of n 16-bit signed integer values
ushortn | A vector of n 16-bit unsigned integer values
intn | A vector of n 32-bit signed integer values
uintn | A vector of n 32-bit unsigned integer values
longn | A vector of n 64-bit signed integer values
ulongn | A vector of n 64-bit unsigned integer values
floatn | A vector of n 32-bit floating-point values
doublen | A vector of n 64-bit floating-point values
halfn | A vector of n 16-bit floating-point values
Variables declared to be a scalar or vector data type are always aligned to the size of the data type used in bytes. Built-in data types must be aligned to a power of 2 bytes in size. A built-in data type that is not a power of 2 bytes in size must be aligned to the next-larger power of 2. This rule does not apply to structs or unions.
For example, a float4 variable will be aligned to a 16-byte boundary and a char2 variable will be aligned to a 2-byte boundary. For 3-component vector data types, the size of the data type is 4 * sizeof(component). This means that a 3-component vector data type will be aligned to a 4 * sizeof(component) boundary.
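The alignment rule above can be sketched in plain host-side C. This is only an illustration: next_pow2 and vector_alignment are hypothetical helper names, not part of OpenCL, and the sketch assumes the component size is itself a power of 2 (1, 2, 4, or 8 bytes), as it is for the built-in scalar types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of OpenCL): smallest power of 2 >= v. */
size_t next_pow2(size_t v)
{
    size_t p = 1;
    while (p < v)
        p <<= 1;
    return p;
}

/* Hypothetical helper: alignment in bytes of a vector with n components,
   each component_size bytes wide. Rounding n * component_size up to a
   power of 2 reproduces the 3-component case, where the type occupies
   4 * component_size bytes. */
size_t vector_alignment(size_t component_size, unsigned n)
{
    assert(n == 2 || n == 3 || n == 4 || n == 8 || n == 16);
    return next_pow2(component_size * n);
}
```

Under these assumptions, vector_alignment(4, 4) gives 16 for float4 and vector_alignment(1, 2) gives 2 for char2, matching the examples in the text; vector_alignment(4, 3) also gives 16 for float3.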
The OpenCL compiler is responsible for aligning data items appropriately as required by the data type. The only exception is for an argument to a kernel function that is declared to be a pointer to a data type. For such functions, the compiler can assume that the pointee is always appropriately aligned as required by the data type.
For application convenience and to ensure that the data store is appropriately aligned, the data types listed in Table 4.3 are made available to the application.
Table 4.3. Application Data Types
Type in OpenCL Language | API Type for Application
char | cl_char
uchar | cl_uchar
short | cl_short
ushort | cl_ushort
int | cl_int
uint | cl_uint
long | cl_long
ulong | cl_ulong
float | cl_float
double | cl_double
half | cl_half
charn | cl_charn
ucharn | cl_ucharn
shortn | cl_shortn
ushortn | cl_ushortn
intn | cl_intn
uintn | cl_uintn
longn | cl_longn
ulongn | cl_ulongn
floatn | cl_floatn
doublen | cl_doublen
halfn | cl_halfn
Vector Literals
Vector literals can be used to create vectors from a list of scalars, vectors, or a combination of scalar and vectors. A vector literal can be used either as a vector initializer or as a primary expression. A vector literal cannot be used as an l-value.
A vector literal is written as a parenthesized vector type followed by a parenthesized comma-delimited list of parameters. A vector literal operates as an overloaded function. The forms of the function that are available are the set of possible argument lists for which all arguments have the same element type as the result vector, and the total number of elements is equal to the number of elements in the result vector. In addition, a form with a single scalar of the same type as the element type of the vector is available. For example, the following forms are available for float4:
(float4)( float, float, float, float )
(float4)( float2, float, float )
(float4)( float, float2, float )
(float4)( float, float, float2 )
(float4)( float2, float2 )
(float4)( float3, float )
(float4)( float, float3 )
(float4)( float )
Operands are evaluated by standard rules for function evaluation, except that no implicit scalar widening occurs. The operands are assigned to their respective positions in the result vector as they appear in memory order. That is, the first element of the first operand is assigned to result.x, the second element of the first operand (or the first element of the second operand if the first operand was a scalar) is assigned to result.y, and so on. If only a single scalar operand is given, that operand is replicated across all lanes of the result vector.
The following example shows a vector float4 created from a list of scalars:
float4 f = (float4)(1.0f, 2.0f, 3.0f, 4.0f);
The following example shows a vector uint4 created from a scalar, which is replicated across the components of the vector:
uint4 u = (uint4)(1); // u will be (1, 1, 1, 1)
The following examples show more complex combinations of a vector being created using a scalar and smaller vector types:
float4 f = (float4)((float2)(1.0f, 2.0f), (float2)(3.0f, 4.0f));
float4 f = (float4)(1.0f, (float2)(2.0f, 3.0f), 4.0f);
The following examples describe how not to create vector literals. All of these examples should result in a compilation error.
float4 f = (float4)(1.0f, 2.0f);
float4 f = (float2)(1.0f, 2.0f);
float4 f = (float4)(1.0f, (float2)(2.0f, 3.0f));
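The rule for which vector literal forms are available can be modeled as a small check in plain C. This is a sketch, not how any OpenCL compiler is implemented: literal_form_is_valid is a hypothetical name, each argument is described only by its width (1 for a scalar), and the sketch assumes every argument already has the same element type as the result vector.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical checker: result_n is the number of elements in the result
   vector; arg_widths[i] is the number of elements in argument i
   (1 for a scalar). Element types are assumed to match already. */
bool literal_form_is_valid(unsigned result_n,
                           const unsigned *arg_widths, size_t nargs)
{
    assert(arg_widths != NULL && nargs > 0);

    /* Special form: a single scalar, replicated across all lanes. */
    if (nargs == 1 && arg_widths[0] == 1)
        return true;

    /* General form: the argument widths must add up to the result width. */
    unsigned total = 0;
    for (size_t i = 0; i < nargs; ++i)
        total += arg_widths[i];
    return total == result_n;
}
```

With this model, the widths {2, 2} and {1, 2, 1} are accepted for a 4-element result, while {1, 1} and {1, 2} are rejected, which mirrors the legal and illegal float4 literals shown above.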
Vector Components
The components of vector data types with 2, 3, or 4 components (also called elements) can be addressed as <vector>.xyzw. Table 4.4 lists the components that can be accessed for various vector types.
Table 4.4. Accessing Vector Components
Vector Data Types | Accessible Components
char2, uchar2, short2, ushort2, int2, uint2, long2, ulong2, float2 | .xy
char3, uchar3, short3, ushort3, int3, uint3, long3, ulong3, float3 | .xyz
char4, uchar4, short4, ushort4, int4, uint4, long4, ulong4, float4 | .xyzw
double2, half2 | .xy
double3, half3 | .xyz
double4, half4 | .xyzw
Accessing components beyond those declared for the vector type is an error. The following describes legal and illegal examples of accessing vector components:
float2 pos;
pos.x = 1.0f; // is legal
pos.z = 1.0f; // is illegal

float3 pos;
pos.z = 1.0f; // is legal
pos.w = 1.0f; // is illegal
The component selection syntax allows multiple components to be selected by appending their names after the period (.). A few examples that show how to use the component selection syntax are given here:
float4 c;
c.xyzw = (float4)(1.0f, 2.0f, 3.0f, 4.0f);
c.z = 1.0f;
c.xy = (float2)(3.0f, 4.0f);
c.xyz = (float3)(3.0f, 4.0f, 5.0f);
The component selection syntax also allows components to be permuted or replicated as shown in the following examples:
float4 pos = (float4)(1.0f, 2.0f, 3.0f, 4.0f);
float4 swiz = pos.wzyx; // swiz = (4.0f, 3.0f, 2.0f, 1.0f)
float4 dup = pos.xxyy; // dup = (1.0f, 1.0f, 2.0f, 2.0f)
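To make the permutation and replication behavior concrete, the following plain C sketch emulates a 4-component selection on a struct. Float4, get_component, and swizzle4 are hypothetical names introduced only for this illustration; in OpenCL C the selection is built into the language and needs no helper.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the OpenCL C float4 type. */
typedef struct { float x, y, z, w; } Float4;

/* Read the component of v named by 'x', 'y', 'z', or 'w'. */
float get_component(Float4 v, char name)
{
    switch (name) {
    case 'x': return v.x;
    case 'y': return v.y;
    case 'z': return v.z;
    case 'w': return v.w;
    default:
        assert(!"component name must be x, y, z, or w");
        return 0.0f;
    }
}

/* Build a 4-component result from the components of v named in pattern.
   "wzyx" reverses the vector; "xxyy" replicates x and y. */
Float4 swizzle4(Float4 v, const char *pattern)
{
    Float4 r;
    assert(strlen(pattern) == 4);
    r.x = get_component(v, pattern[0]);
    r.y = get_component(v, pattern[1]);
    r.z = get_component(v, pattern[2]);
    r.w = get_component(v, pattern[3]);
    return r;
}
```

Each result lane simply reads whichever source component the pattern names at that position, which is why a component may appear more than once on the right-hand side.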
Vector components can also be accessed using a numeric index to refer to the appropriate elements in the vector. The numeric indices that can be used are listed in Table 4.5.
Table 4.5. Numeric Indices for Built-In Vector Data Types
Vector Components | Usable Numeric Indices
2-component | 0, 1
3-component | 0, 1, 2
4-component | 0, 1, 2, 3
8-component | 0, 1, 2, 3, 4, 5, 6, 7
16-component | 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, A, b, B, c, C, d, D, e, E, f, F
All numeric indices must be preceded by the letter s or S. In the following example f.s0 refers to the first element of the float8 variable f and f.s7 refers to the eighth element of the float8 variable f:
float8 f;
In the following example x.sa (or x.sA) refers to the eleventh element of the float16 variable x and x.sf (or x.sF) refers to the sixteenth element of the float16 variable x:
float16 x;
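The mapping from a numeric index character to an element position is hexadecimal, and can be sketched in plain C as follows. numeric_index is a hypothetical helper written for this illustration, and it assumes a character set such as ASCII in which the letters a to f (and A to F) are contiguous.

```c
#include <assert.h>

/* Hypothetical helper: convert the character that follows 's' or 'S'
   into a zero-based element position, or return -1 if that index is
   not usable for an n-component vector.
   Assumes 'a'..'f' and 'A'..'F' are contiguous, as in ASCII. */
int numeric_index(char digit, unsigned n)
{
    int index;

    assert(n == 2 || n == 3 || n == 4 || n == 8 || n == 16);

    if (digit >= '0' && digit <= '9')
        index = digit - '0';
    else if (digit >= 'a' && digit <= 'f')
        index = 10 + (digit - 'a');
    else if (digit >= 'A' && digit <= 'F')
        index = 10 + (digit - 'A');
    else
        return -1;

    return (index < (int)n) ? index : -1;
}
```

For a float16, both 'a' and 'A' map to position 10 (the eleventh element) and 'f' maps to position 15 (the sixteenth element); for a float8, '8' is rejected because only indices 0 through 7 are usable.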
The numeric indices cannot be intermixed with the .xyzw notation. For example:
float4 f;
float4 v_A = f.xs123; // is illegal
float4 v_B = f.s012w; // is illegal
Vector data types can use the .lo (or .even) and .hi (or .odd) suffixes to get smaller vector types or to combine smaller vector types into a larger vector type. Multiple levels of .lo (or .even) and .hi (or .odd) suffixes can be used until they refer to a scalar type.
The .lo suffix refers to the lower half of a given vector. The .hi suffix refers to the upper half of a given vector. The .odd suffix refers to the odd elements of a given vector. The .even suffix refers to the even elements of a given vector. Some examples to illustrate this concept are given here:
float4 vf;
float2 low = vf.lo; // returns vf.xy
float2 high = vf.hi; // returns vf.zw
float x = low.lo; // returns low.x
float y = low.hi; // returns low.y
float2 odd = vf.odd; // returns vf.yw
float2 even = vf.even; // returns vf.xz
For a 3-component vector, the suffixes .lo (or .even) and .hi (or .odd) operate as if the 3-component vector were a 4-component vector with the value in the w component undefined.
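The four half-selecting suffixes can be emulated for a 4-component vector in plain C. This is a sketch under stated assumptions: Vec4, Vec2, and the vec4_* functions are hypothetical names used only here, and element positions are counted from zero, so the even elements are x and z and the odd elements are y and w.

```c
#include <assert.h>

/* Hypothetical stand-ins for the OpenCL C float4 and float2 types. */
typedef struct { float x, y, z, w; } Vec4;
typedef struct { float x, y; } Vec2;

/* .lo: lower half of the vector (x, y). */
Vec2 vec4_lo(Vec4 v)   { return (Vec2){ v.x, v.y }; }

/* .hi: upper half of the vector (z, w). */
Vec2 vec4_hi(Vec4 v)   { return (Vec2){ v.z, v.w }; }

/* .even: elements at even positions 0 and 2 (x, z). */
Vec2 vec4_even(Vec4 v) { return (Vec2){ v.x, v.z }; }

/* .odd: elements at odd positions 1 and 3 (y, w). */
Vec2 vec4_odd(Vec4 v)  { return (Vec2){ v.y, v.w }; }
```

Applied to a 3-component vector treated as 4 components, the upper half and the odd elements would each include the undefined w value, so only their first element is meaningful.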