Playing around with OpenCL from C# using OpenCL.NET.
- FpConfig
- Dump out device floating point configuration info (CL_DEVICE_SINGLE_FP_CONFIG)
- RunKernel
- Run a simple kernel
- SaveBinaries
- Build a program and save the CL binaries to files (CL_PROGRAM_BINARY_SIZES, CL_PROGRAM_BINARIES)
- WorkGroupInfo
- Dump out kernel work group info (CL_KERNEL_WORK_GROUP_SIZE, CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE, CL_KERNEL_COMPILE_WORK_GROUP_SIZE, CL_KERNEL_LOCAL_MEM_SIZE, CL_KERNEL_PRIVATE_MEM_SIZE)
- ReductionScalar
- Implement reduction using scalar float (see section 10.2, "Numerical Reduction", in OpenCL in Action)
- ReductionVector
- Implement reduction using vector float4 (see section 10.2, "Numerical Reduction", in OpenCL in Action)
- ReductionVectorComplete
- Like ReductionVector, but do the final summing on the OpenCL device using a second kernel (see section 10.3, "Synchronizing work-groups", in OpenCL in Action)
- Note: when I run this against my "Intel(R) Core(TM) i7-4720HQ CPU @ 2.60GHz" device, it gives wrong results
- UPDATE: See this discussion on the Intel forums for an explanation of why this happens (race condition)
- I get the same results using the C code from the book's download (VS2015 project can be found here)
- I get the same results in C++ too (fixed code can be found here)
- OpenCL - The open standard for parallel programming of heterogeneous systems
- OpenCL (Wikipedia)
- OpenCL 1.2 Reference Pages
- (I am using version 1.2 - current version is 2.1)
- OpenCL.NET
- OpenCL.NET (NuGet)
- OpenCL in Action (Manning Publications Co.)
- Simon McIntosh-Smith
- Head of the Microelectronics Group and Bristol University Business Fellow
- Senior Lecturer in High Performance Computing and Architectures
- OpenCL: A Hands-on Introduction
- COMPILING OPENCL KERNELS
- OpenCL™ Optimization Case Study: Simple Reductions