Forward DCT (also controls coefficient quantization)

A forward DCT routine is given a pointer to an input sample array and a pointer to a work area of type DCTELEM[]; the DCT is performed in place in that work area. Type DCTELEM is int for 8-bit samples and INT32 for 12-bit samples. (NOTE: Floating-point DCT implementations use an array of type FAST_FLOAT instead.)

The input data is fetched from the sample array starting at a specified column. (Any row offset needed is applied to the array pointer before it is passed to the FDCT code.) Note that the number of samples fetched by the FDCT routine is DCT_h_scaled_size * DCT_v_scaled_size.

The DCT outputs are returned scaled up by a factor of 8; they therefore have a range of +-8K for 8-bit data and +-128K for 12-bit data. This convention improves accuracy in integer implementations and saves some work in floating-point ones.

Each IDCT routine has its own ideas about the best dct_table element type.