-// The ALU instructions register blocks are enumerated according to the order
-// in which fglrx. I assume there is space for 64 instructions, since
-// each block has space for a maximum of 64 DWORDs, and this matches reported
-// native limits.
-//
-// The basic functional block seems to be one MAD for each color and alpha,
-// and an adder that adds all components after the MUL.
-// - ADD, MUL, MAD etc.: use MAD with appropriate neutral operands
-// - DP4: Use OUTC_DP4, OUTA_DP4
-// - DP3: Use OUTC_DP3, OUTA_DP4, appropriate alpha operands
-// - DPH: Use OUTC_DP4, OUTA_DP4, appropriate alpha operands
-// - CMP: If ARG2 < 0, return ARG1, else return ARG0
-// - FLR: use FRC+MAD
-// - XPD: use MAD+MAD
-// - SGE, SLT: use MAD+CMP
-// - RSQ: use ABS modifier for argument
-// - Use OUTC_REPL_ALPHA to write results of an alpha-only operation (e.g. RCP)
-// into color register
-// - apparently, there's no quick DST operation
-// - fglrx set FPI2_UNKNOWN_31 on a "MAD fragment.color, tmp0, tmp1, tmp2"
-// - fglrx set FPI2_UNKNOWN_31 on a "MAX r2, r1, c0"
-// - fglrx once set FPI0_UNKNOWN_31 on a "FRC r1, r1"
-//
-// Operand selection
-// First stage selects three sources from the available registers and
-// constant parameters. This is defined in INSTR1 (color) and INSTR3 (alpha).
-// fglrx sorts the three source fields: Registers before constants,
-// lower indices before higher indices; I do not know whether this is necessary.
-// fglrx fills unused sources with "read constant 0"
-// According to specs, you cannot select more than two different constants.
-//
-// Second stage selects the operands from the sources. This is defined in
-// INSTR0 (color) and INSTR2 (alpha). You can also select the special constants
-// zero and one.
-// Swizzling and negation happens in this stage, as well.
-//
-// Important: Color and alpha seem to be mostly separate, i.e. their sources
-// selection appears to be fully independent (the register storage is probably
-// physically split into a color and an alpha section).
-// However (because of the apparent physical split), there is some interaction
-// WRT swizzling. If, for example, you want to load an R component into an
-// Alpha operand, this R component is taken from a *color* source, not from
-// an alpha source. The corresponding register doesn't even have to appear in
-// the alpha sources list. (I hope this alll makes sense to you)
-//
-// Destination selection
-// The destination register index is in FPI1 (color) and FPI3 (alpha) together
-// with enable bits.
-// There are separate enable bits for writing into temporary registers
-// (DSTC_REG_* /DSTA_REG) and and program output registers (DSTC_OUTPUT_* /DSTA_OUTPUT).
-// You can write to both at once, or not write at all (the same index
-// must be used for both).
-//
-// Note: There is a special form for LRP
-// - Argument order is the same as in ARB_fragment_program.
-// - Operation is MAD
-// - ARG1 is set to ARGC_SRC1C_LRP/ARGC_SRC1A_LRP
-// - Set FPI0/FPI2_SPECIAL_LRP
-// Arbitrary LRP (including support for swizzling) requires vanilla MAD+MAD */
+ * The ALU instructions register blocks are enumerated according to the order
+ * in which fglrx. I assume there is space for 64 instructions, since
+ * each block has space for a maximum of 64 DWORDs, and this matches reported
+ * native limits.
+ *
+ * The basic functional block seems to be one MAD for each color and alpha,
+ * and an adder that adds all components after the MUL.
+ * - ADD, MUL, MAD etc.: use MAD with appropriate neutral operands
+ * - DP4: Use OUTC_DP4, OUTA_DP4
+ * - DP3: Use OUTC_DP3, OUTA_DP4, appropriate alpha operands
+ * - DPH: Use OUTC_DP4, OUTA_DP4, appropriate alpha operands
+ * - CMPH: If ARG2 > 0.5, return ARG0, else return ARG1
+ * - CMP: If ARG2 < 0, return ARG1, else return ARG0
+ * - FLR: use FRC+MAD
+ * - XPD: use MAD+MAD
+ * - SGE, SLT: use MAD+CMP
+ * - RSQ: use ABS modifier for argument
+ * - Use OUTC_REPL_ALPHA to write results of an alpha-only operation
+ * (e.g. RCP) into color register
+ * - apparently, there's no quick DST operation
+ * - fglrx set FPI2_UNKNOWN_31 on a "MAD fragment.color, tmp0, tmp1, tmp2"
+ * - fglrx set FPI2_UNKNOWN_31 on a "MAX r2, r1, c0"
+ * - fglrx once set FPI0_UNKNOWN_31 on a "FRC r1, r1"
+ *
+ * Operand selection
+ * First stage selects three sources from the available registers and
+ * constant parameters. This is defined in INSTR1 (color) and INSTR3 (alpha).
+ * fglrx sorts the three source fields: Registers before constants,
+ * lower indices before higher indices; I do not know whether this is
+ * necessary.
+ *
+ * fglrx fills unused sources with "read constant 0"
+ * According to specs, you cannot select more than two different constants.
+ *
+ * Second stage selects the operands from the sources. This is defined in
+ * INSTR0 (color) and INSTR2 (alpha). You can also select the special constants
+ * zero and one.
+ * Swizzling and negation happens in this stage, as well.
+ *
+ * Important: Color and alpha seem to be mostly separate, i.e. their sources
+ * selection appears to be fully independent (the register storage is probably
+ * physically split into a color and an alpha section).
+ * However (because of the apparent physical split), there is some interaction
+ * WRT swizzling. If, for example, you want to load an R component into an
+ * Alpha operand, this R component is taken from a *color* source, not from
+ * an alpha source. The corresponding register doesn't even have to appear in
+ * the alpha sources list. (I hope this all makes sense to you)
+ *
+ * Destination selection
+ * The destination register index is in FPI1 (color) and FPI3 (alpha)
+ * together with enable bits.
+ * There are separate enable bits for writing into temporary registers
+ * (DSTC_REG_* /DSTA_REG) and and program output registers (DSTC_OUTPUT_*
+ * /DSTA_OUTPUT). You can write to both at once, or not write at all (the
+ * same index must be used for both).
+ *
+ * Note: There is a special form for LRP
+ * - Argument order is the same as in ARB_fragment_program.
+ * - Operation is MAD
+ * - ARG1 is set to ARGC_SRC1C_LRP/ARGC_SRC1A_LRP
+ * - Set FPI0/FPI2_SPECIAL_LRP
+ * Arbitrary LRP (including support for swizzling) requires vanilla MAD+MAD
+ */