3.1 Swizzling and Write Masking

Swizzling refers to the extraction, rearrangement, and possible duplication of elements of tuple types. On many GPUs swizzling is free or at least cheap, and smart use of swizzling can make your code more efficient. However, inappropriate use of swizzling can also make your code incredibly hard to read, and may make it hard for the compiler to optimize it.

Swizzling is expressed in Sh using the function call operator “()” with a sequence of integer arguments. For example, suppose a is an instance of an RGB color ShColor3f. Then the expression a(2,1,0) represents the corresponding color with components in BGR order.

GPUs only directly support 4-tuples, while Sh supports n-tuples. Swizzles of tuples of length greater than the hardware-supported tuple length will be decomposed into a number of swizzled and write-masked move operations.

Components are numbered starting at 0, and the components of the output tuple are listed in the order specified in the swizzle. Components can be ignored or duplicated, and swizzling can also change the tuple size. For instance, suppose an LA (luminance-alpha) color b is represented using ShColor2f. A corresponding 4-channel RGBA color could be represented using the expression b(0,0,0,1), or a greyscale RGB color could be expressed using b(0,0,0).
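
The indexing rule above can be made concrete with a small stand-alone C++ model. This is a minimal sketch, not Sh itself: the Tuple template below is a hypothetical stand-in for types such as ShColor3f and ShColor2f, and it only mimics how the indices passed to “()” select, reorder, and duplicate components.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an Sh tuple type; this is NOT the Sh class.
template <std::size_t N>
struct Tuple {
    std::array<float, N> v;

    // Swizzle: component k of the result is component idx[k] of this tuple.
    // The number of indices determines the size of the result.
    template <typename... I>
    Tuple<sizeof...(I)> operator()(I... idx) const {
        static_assert(sizeof...(I) > 0, "a tuple swizzle needs at least one index");
        const std::size_t ix[] = { static_cast<std::size_t>(idx)... };
        Tuple<sizeof...(I)> r{};
        for (std::size_t k = 0; k < sizeof...(I); ++k)
            r.v[k] = v.at(ix[k]);  // at() rejects out-of-range indices
        return r;
    }
};

// Exercises the three swizzles discussed in the text.
inline void swizzle_demo() {
    Tuple<3> a = {{1.0f, 0.5f, 0.25f}};  // RGB color
    Tuple<3> bgr = a(2, 1, 0);           // reorder to BGR
    assert(bgr.v[0] == 0.25f && bgr.v[1] == 0.5f && bgr.v[2] == 1.0f);

    Tuple<2> b = {{0.5f, 1.0f}};         // LA color: luminance, alpha
    Tuple<4> rgba = b(0, 0, 0, 1);       // duplicate luminance, keep alpha
    assert(rgba.v[0] == 0.5f && rgba.v[2] == 0.5f && rgba.v[3] == 1.0f);

    Tuple<3> grey = b(0, 0, 0);          // drop alpha, change size from 2 to 3
    assert(grey.v[0] == 0.5f && grey.v[2] == 0.5f);
}
```

In this model the size of the result is fixed by the number of indices, which is why the same 2-tuple can yield either a 3-tuple or a 4-tuple.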

Repeated swizzling is permitted. For instance, a(2,1,0)(2,1,0) on a 3-tuple just returns the original component order, since it reverses the components twice. Swizzles are applied left to right. In general, these kinds of expressions should be avoided, since they look like matrix swizzles (discussed below). However, repeated swizzles are occasionally useful if you need to compute a complex permutation and it is easier to express it as a sequence of simpler permutations. The compiler will of course collapse such decomposed swizzles into a single operation.
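
To see why a chain of swizzles can always be collapsed, note that the second index list simply indexes into the result of the first. The following is a sketch of that composition rule only; compose_swizzles is a hypothetical helper written for this illustration, and nothing here describes how the Sh compiler actually performs the collapse.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Collapse a(first)(second) into one index list, so that
// a(first)(second) selects the same components as a(result).
// Hypothetical helper for illustration; not part of the Sh API.
inline std::vector<int> compose_swizzles(const std::vector<int>& first,
                                         const std::vector<int>& second) {
    std::vector<int> result;
    result.reserve(second.size());
    for (int s : second) {
        // Each index in `second` refers to a component of the output of `first`.
        if (s < 0 || static_cast<std::size_t>(s) >= first.size())
            throw std::out_of_range("second swizzle indexes past the first swizzle's result");
        result.push_back(first[static_cast<std::size_t>(s)]);
    }
    return result;
}

inline void compose_demo() {
    // Reversing twice yields the identity swizzle.
    assert((compose_swizzles({2, 1, 0}, {2, 1, 0}) == std::vector<int>{0, 1, 2}));
    // a(1,2,0)(2,0) selects components 0 and 1 of a, in that order.
    assert((compose_swizzles({1, 2, 0}, {2, 0}) == std::vector<int>{0, 1}));
}
```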

The () notation only expresses swizzling when used on the right side of an assignment or in an expression used as an argument to a control construct. However, the same notation may also be used on the left side of an assignment statement, where it expresses something different: write masking, or selective component update.

For write masking, the numbers indicate which components should be written. The number of components named in a write mask on the left of an assignment must equal the number of components in the result of the expression on the right. It does not make sense to duplicate a component in a write mask (this would imply writing twice to the same location), so assignments such as t(1,1) = x(0,1) are illegal and will throw a program definition time exception.
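
The two rules for write masks (matching sizes and no repeated component) can be written down as a short check. This is a sketch under stated assumptions: check_write_mask and mask_rejected are hypothetical names, and the exception types are standard C++ ones chosen for the example, not the exceptions Sh itself throws.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <stdexcept>
#include <vector>

// Hypothetical check for illustration; Sh's own checks and exception types may differ.
// dst_size: size of the tuple being written
// mask:     the write mask indices
// src_size: number of components produced by the right-hand side
inline void check_write_mask(std::size_t dst_size, const std::vector<int>& mask,
                             std::size_t src_size) {
    if (mask.size() != src_size)
        throw std::invalid_argument("write mask and right-hand side differ in size");
    std::set<int> seen;
    for (int m : mask) {
        if (m < 0 || static_cast<std::size_t>(m) >= dst_size)
            throw std::out_of_range("write mask index outside the destination tuple");
        if (!seen.insert(m).second)  // insert() reports whether m was new
            throw std::invalid_argument("write mask repeats a component");
    }
}

// Returns true if the mask is rejected by check_write_mask.
inline bool mask_rejected(std::size_t dst_size, const std::vector<int>& mask,
                          std::size_t src_size) {
    try {
        check_write_mask(dst_size, mask, src_size);
    } catch (const std::logic_error&) {
        return true;
    }
    return false;
}

inline void write_mask_demo() {
    assert(mask_rejected(3, {1, 1}, 2));   // like t(1,1) = x(0,1): repeated component
    assert(!mask_rejected(3, {1, 0}, 2));  // permuted mask is fine
    assert(mask_rejected(3, {0, 1}, 3));   // two components named, three supplied
}
```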

Reordering components in a write mask performs a permutation of the elements written. The way to think of this is that the writemask computes a sequence of references to elements in the order requested. Swizzles can be applied to computed values as well as individual tuples, so a(0,1) = (b + c)(1,0) is perfectly reasonable: it adds tuples b and c, extracts and swaps the first two components of the sum, and writes the result to the first two components of tuple a. Given the interpretation of permuted writemasks, this could also be written as a(1,0) = b + c, assuming b and c are both two-tuples.
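
The equivalence of the two forms can be checked with a small model in which a write mask is a list of destination positions. This is a sketch only: swizzle, write_masked, and add are hypothetical helpers operating on plain std::vector<float> values, not Sh tuples, and the mask is assumed to be valid (no repeats, same size as the source).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<float> Vec;

// Read side: component k of the result is t[idx[k]].
inline Vec swizzle(const Vec& t, const std::vector<int>& idx) {
    Vec r;
    for (int i : idx) r.push_back(t.at(static_cast<std::size_t>(i)));
    return r;
}

// Write side: component k of src is stored into dst[mask[k]].
// Assumes mask has no repeats and mask.size() == src.size().
inline void write_masked(Vec& dst, const std::vector<int>& mask, const Vec& src) {
    for (std::size_t k = 0; k < mask.size(); ++k)
        dst.at(static_cast<std::size_t>(mask[k])) = src.at(k);
}

// Componentwise sum of two tuples of equal size.
inline Vec add(const Vec& x, const Vec& y) {
    Vec r;
    for (std::size_t k = 0; k < x.size(); ++k) r.push_back(x[k] + y.at(k));
    return r;
}

inline void permuted_mask_demo() {
    Vec b = {1.0f, 2.0f}, c = {10.0f, 20.0f};

    Vec a1 = {0.0f, 0.0f, 7.0f};
    write_masked(a1, {0, 1}, swizzle(add(b, c), {1, 0}));  // a(0,1) = (b + c)(1,0)

    Vec a2 = {0.0f, 0.0f, 7.0f};
    write_masked(a2, {1, 0}, add(b, c));                   // a(1,0) = b + c

    assert(a1 == a2);                                      // both forms agree
    assert(a1[0] == 22.0f && a1[1] == 11.0f && a1[2] == 7.0f);  // component 2 untouched
}
```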

The “[]” operator may be used for swizzling one element. This is equivalent to treating the tuple as an array of scalars. On the right hand side, it reads from a particular component of a tuple, and on the left hand side, it writes to a particular component. Like the “()” swizzle, this operator can be applied to computed values as well as variables, so a[0] = (b + c)[1] computes the sum of tuples b and c, extracts component 1 from the intermediate result, and writes it to component 0 of a.

Currently swizzle and writemask arguments must be program definition time constants. They can be generated by C++ code but not by Sh expressions. In the future, we might permit shader-computed swizzles, but such swizzles would be relatively expensive on current hardware.
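
As an illustration of indices generated by C++ code, the host program can build an index list with an ordinary loop before the swizzle is applied. The sketch below shows only that host-side step. The names reversal_indices, rotation_indices, and apply_swizzle are hypothetical, apply_swizzle is a plain C++ stand-in for a swizzle, and how such a list would be handed to Sh is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side C++: index list that reverses an n-tuple.
inline std::vector<int> reversal_indices(int n) {
    std::vector<int> idx;
    for (int k = n - 1; k >= 0; --k) idx.push_back(k);
    return idx;
}

// Host-side C++: index list that rotates an n-tuple left by `shift`
// (assumes n > 0 and shift >= 0).
inline std::vector<int> rotation_indices(int n, int shift) {
    std::vector<int> idx;
    for (int k = 0; k < n; ++k) idx.push_back((k + shift) % n);
    return idx;
}

// Stand-in for a swizzle whose indices are fixed numbers by the time it runs.
inline std::vector<float> apply_swizzle(const std::vector<float>& t,
                                        const std::vector<int>& idx) {
    std::vector<float> r;
    for (int i : idx) r.push_back(t.at(static_cast<std::size_t>(i)));
    return r;
}

inline void generated_indices_demo() {
    std::vector<float> t = {10.0f, 20.0f, 30.0f, 40.0f};
    assert((apply_swizzle(t, reversal_indices(4)) ==
            std::vector<float>{40.0f, 30.0f, 20.0f, 10.0f}));
    assert((apply_swizzle(t, rotation_indices(4, 1)) ==
            std::vector<float>{20.0f, 30.0f, 40.0f, 10.0f}));
}
```

The point of the sketch is that the loops run in the host program, so every index is an ordinary constant by the time the swizzle is evaluated.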

Matrices may also be swizzled, but they require two swizzle operations in sequence. The first (leftmost) swizzle is a swizzle on rows, the second (rightmost) swizzle is a swizzle on columns. This notation permits permutation of rows and columns as well as the extraction of submatrices.

On matrices the “[]” and “()” operators have slightly different interpretations. Applying the “[]” operator to a matrix extracts a row of the matrix and makes it accessible as an ShAttrib. Applying an additional “[]” operator to this tuple extracts an element of this tuple, and thus an element of the matrix, as expected. For instance, M[1][2] extracts the scalar at row 1, column 2. You could also extract row 1 of the matrix using just M[1], or column 1 using transpose(M)[1]. Both rows and columns are numbered starting at 0.

On matrices, the “()” operators must always be applied in pairs, since the first swizzle operator applied results in a special type whose only legal operation is a column swizzle. To make matrix expressions simpler, a swizzle with no arguments is interpreted as doing nothing (the identity swizzle). For instance, the expression M(3,2,1,0)(3,2,1,0) reverses both the rows and the columns of a 4 × 4 matrix, so the element at row i, column j of the result comes from row 3-i, column 3-j of M. Note that this rotates the matrix by 180 degrees; it is not a transpose, for which a transpose function is provided. The expression M(3,2,1,0)() reverses only the rows and leaves the columns alone, while M()(3,2,1,0) reverses the columns and leaves the rows alone.
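
The paired row and column swizzles can be modeled as two index lists, where element (i, j) of the result is M[rows[i]][cols[j]]. The following is a sketch of that rule only: swizzle_matrix and make_indexed_4x4 are hypothetical helpers on nested std::vector values, and an empty list plays the role of the argument-free identity swizzle.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<float> > Matrix;

// Model of M(rows)(cols): element (i, j) of the result is M[rows[i]][cols[j]].
// An empty index list stands for the identity swizzle, like "()" in the text.
// Hypothetical helper for illustration; not the Sh implementation.
inline Matrix swizzle_matrix(const Matrix& m, std::vector<int> rows, std::vector<int> cols) {
    if (rows.empty())
        for (std::size_t i = 0; i < m.size(); ++i) rows.push_back(static_cast<int>(i));
    if (cols.empty() && !m.empty())
        for (std::size_t j = 0; j < m[0].size(); ++j) cols.push_back(static_cast<int>(j));
    Matrix r(rows.size(), std::vector<float>(cols.size()));
    for (std::size_t i = 0; i < rows.size(); ++i)
        for (std::size_t j = 0; j < cols.size(); ++j)
            r[i][j] = m.at(static_cast<std::size_t>(rows[i]))
                       .at(static_cast<std::size_t>(cols[j]));
    return r;
}

// 4 x 4 matrix with m[i][j] = 10*i + j, so each entry encodes its own position.
inline Matrix make_indexed_4x4() {
    Matrix m(4, std::vector<float>(4));
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            m[i][j] = static_cast<float>(10 * i + j);
    return m;
}

inline void matrix_swizzle_demo() {
    Matrix m = make_indexed_4x4();

    Matrix rows_rev = swizzle_matrix(m, {3, 2, 1, 0}, {});       // M(3,2,1,0)()
    assert(rows_rev[0][1] == m[3][1]);

    Matrix cols_rev = swizzle_matrix(m, {}, {3, 2, 1, 0});       // M()(3,2,1,0)
    assert(cols_rev[1][0] == m[1][3]);

    Matrix both = swizzle_matrix(m, {3, 2, 1, 0}, {3, 2, 1, 0}); // M(3,2,1,0)(3,2,1,0)
    assert(both[0][1] == m[3][2]);  // element (i, j) comes from (3-i, 3-j)

    Matrix sub = swizzle_matrix(m, {0, 1}, {2, 3});              // upper-right 2 x 2 block
    assert(sub.size() == 2 && sub[0].size() == 2 && sub[1][0] == m[1][2]);
}
```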

When used on the left hand side of an assignment, the “[]” operator can be used to select an element (or row) to write to and the “()” notation can be used to specify a submatrix to assign new values to. The same restrictions apply as mentioned for tuple swizzles. In particular, in a matrix writemask as in a tuple writemask, repeating a component index is an error.

Our swizzling and writemasking syntax differs from the syntax used in most other real-time shading languages, which use expressions like a.rgba. This alphabetic syntax is not supported in Sh, for several reasons:

1. It would be quite painful to define in C++. We could predefine a (large) number of member variables pointing back at each object instance, each representing a different swizzle, but the cost would be horrific.
2. The alphabetic swizzle names depend on the type and number of arguments. How would you swizzle a 9-tuple?
3. The alphabetic syntax interferes with the syntax for member access. To be fair, our syntax interferes with the syntax for constructors, but it’s unambiguous in practice.
4. Using numerical arguments means that we can use computed values as swizzle arguments.

The [] and () operators are also used on texture objects for lookup and on program objects for function application. See Chapters 4 and 5, respectively.


Note: This manual is available as a bound book from AK Peters, including better formatting, in-depth examples, and about 200 pages not available on-line.