- OpenCL Programming by Example
- Ravishekhar Banger, Koushik Bhattacharyya
- 2025-04-04 22:05:46
Platform versions
OpenCL is designed to support devices with different capabilities under a single platform, including devices that conform to different versions of the OpenCL specification. When writing an OpenCL application, you need to query the implementation for the versions it supports. There are two main version identifiers to consider:
- Platform Version: Indicates the version of the OpenCL runtime supported.
- Device Version: Indicates the device capabilities and attributes. The version a device reports as conformant cannot be greater than the platform version.
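Both version identifiers are returned as strings that begin with "OpenCL <major>.<minor>", followed by vendor-specific text. The following is a minimal sketch of how the numeric part could be extracted and compared. The type ocl_version and the functions parse_cl_version and version_le are hypothetical helpers, not part of the OpenCL API.

```c
#include <stdio.h>

/* Hypothetical helper type: numeric part of an OpenCL version string. */
typedef struct {
    int major;
    int minor;
} ocl_version;

/* Parses strings of the form "OpenCL <major>.<minor> <vendor text>".
 * Returns 0 on success, -1 if the string does not match that form. */
int parse_cl_version(const char *s, ocl_version *out)
{
    if (s == NULL || out == NULL)
        return -1;
    if (sscanf(s, "OpenCL %d.%d", &out->major, &out->minor) != 2)
        return -1;
    return 0;
}

/* Returns 1 if version a is less than or equal to version b, else 0. */
int version_le(ocl_version a, ocl_version b)
{
    if (a.major != b.major)
        return a.major < b.major;
    return a.minor <= b.minor;
}
```

An application could pass the strings obtained for CL_PLATFORM_VERSION and CL_DEVICE_VERSION to parse_cl_version and then use version_le to confirm that the device version does not exceed the platform version.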
Query platforms
Now let's write an OpenCL program to get the platform details. Use the get_platform_property example in this chapter.
The OpenCL standard specifies APIs to determine the platform configuration. To query the platform versions and other details of the OpenCL implementation, the following two APIs are used:
cl_int clGetPlatformIDs(cl_uint num_entries,
                        cl_platform_id *platforms,
                        cl_uint *num_platforms);
cl_int clGetPlatformInfo(cl_platform_id platform,
                         cl_platform_info param_name,
                         size_t param_value_size,
                         void *param_value,
                         size_t *param_value_size_ret);
clGetPlatformIDs is used to obtain the platforms available in the system, and there can be more than one. If you install two OpenCL runtimes, for example the AMD APP SDK and the Intel OpenCL runtime for the CPU, you should see two platforms in the system. Because the number of platforms is not known in advance, it is usually better not to pre-allocate a fixed amount of memory for storing them. Instead, before getting the actual platform IDs, the application should query the number of OpenCL implementations available in the system. This is done using the following OpenCL call:
clError = clGetPlatformIDs(0, NULL, &num_platforms);
This call returns the total number of available platforms. Once we have obtained the number of available platforms we can allocate memory and query for the platform IDs for the various OpenCL implementations as follows:
platforms = (cl_platform_id *)malloc(num_platforms * sizeof(cl_platform_id));
clError = clGetPlatformIDs(num_platforms, platforms, NULL);
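The count-then-fill convention used by these two calls can be tried out without an OpenCL runtime. The following is a minimal sketch under that assumption: mock_get_ids is a hypothetical stand-in with the same calling shape as clGetPlatformIDs (integer handles replace cl_platform_id), and get_all_ids is a hypothetical wrapper that adds the error checks the snippets above leave out.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an OpenCL enumeration call such as
 * clGetPlatformIDs. It reports how many IDs exist through num_out and,
 * when ids is non-NULL, writes at most num_entries of them.
 * Returns 0 on success and -1 on invalid arguments. */
int mock_get_ids(unsigned num_entries, int *ids, unsigned *num_out)
{
    static const int available[] = { 101, 102, 103 };
    const unsigned total = sizeof(available) / sizeof(available[0]);
    unsigned i;

    if (ids == NULL && num_out == NULL)
        return -1;
    if (ids != NULL && num_entries == 0)
        return -1;
    if (ids != NULL) {
        for (i = 0; i < num_entries && i < total; ++i)
            ids[i] = available[i];
    }
    if (num_out != NULL)
        *num_out = total;
    return 0;
}

/* Count-then-fill: ask for the count, allocate, then ask for the IDs.
 * On success returns a malloc'd array that the caller must free and
 * stores its length in *count. Returns NULL on error or if nothing
 * was found. */
int *get_all_ids(unsigned *count)
{
    unsigned n = 0;
    int *ids;

    *count = 0;
    if (mock_get_ids(0, NULL, &n) != 0 || n == 0)
        return NULL;

    ids = malloc(n * sizeof(*ids));
    if (ids == NULL)
        return NULL;

    if (mock_get_ids(n, ids, NULL) != 0) {
        free(ids);
        return NULL;
    }
    *count = n;
    return ids;
}
```

In a real program the same structure applies with clGetPlatformIDs in place of the stand-in: check the returned error code after each call, handle a count of zero, and check the result of malloc before the second call.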
Once the list of platforms is obtained, you can query the attributes of each platform in a loop. In the example, we query the following parameters using the API clGetPlatformInfo:
CL_PLATFORM_NAME
CL_PLATFORM_VENDOR
CL_PLATFORM_VERSION
CL_PLATFORM_PROFILE
CL_PLATFORM_EXTENSIONS
Example:
clError = clGetPlatformInfo (platforms[index], CL_PLATFORM_NAME, 1024, &queryBuffer, NULL);
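Of the parameters queried, CL_PLATFORM_EXTENSIONS returns a single string containing a space-separated list of extension names. The following is a minimal sketch of how an application might test for one extension in that string. The function has_extension is a hypothetical helper, not an OpenCL API.

```c
#include <string.h>

/* Returns 1 if name appears as a whole space-separated token in list,
 * 0 otherwise. A plain strstr() is not enough, because one extension
 * name can be a prefix of another. */
int has_extension(const char *list, const char *name)
{
    size_t len;
    const char *p;

    if (list == NULL || name == NULL || name[0] == '\0')
        return 0;

    len = strlen(name);
    p = list;
    while ((p = strstr(p, name)) != NULL) {
        /* The match must start at a token boundary and end at one. */
        int starts_token = (p == list) || (p[-1] == ' ');
        int ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return 1;
        p += 1;
    }
    return 0;
}
```

For example, after filling queryBuffer with the CL_PLATFORM_EXTENSIONS string, a call such as has_extension(queryBuffer, "cl_khr_icd") reports whether that extension is listed.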
The get_device_property example for this chapter, which queries device properties, defaults to the first available platform and queries the properties of all the devices on that platform. It obtains the platform as follows:
clError = clGetPlatformIDs(1, &platform, &num_platforms);
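Note the difference between the calls to clGetPlatformIDs in the two examples discussed. The first example passes zero entries and a NULL pointer to obtain only the count, whereas this call requests at most one platform ID directly.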
In this section we wrote a small program to print the platform details. Take a look at how we allocate memory for the platforms and how we get the details of each platform. As an exercise, try installing multiple OpenCL implementations on your system and see how many OpenCL platforms are enumerated by the function clGetPlatformIDs.
Multiple OpenCL implementations can be installed on the same system. You may wonder how the application picks the appropriate runtime. The answer is the OpenCL Installable Client Driver (ICD), which we will study in a later section.
Query devices
We shall now continue with getting the attributes and resource limitations of an OpenCL device. The last program printed all the available platform information. In this example we enhance that code to print some basic device attributes and resource information for the first available platform. We will implement a function, PrintDeviceInfo(), which prints the device-specific information. The following two OpenCL APIs are used in the example:
cl_int clGetDeviceIDs(cl_platform_id platform,
                      cl_device_type device_type,
                      cl_uint num_entries,
                      cl_device_id *devices,
                      cl_uint *num_devices);
cl_int clGetDeviceInfo(cl_device_id device,
                       cl_device_info param_name,
                       size_t param_value_size,
                       void *param_value,
                       size_t *param_value_size_ret);
In the same way as we did for platforms, we first determine the number of devices available, and then allocate memory for each device found in the platform.
clError = clGetDeviceIDs (platform, CL_DEVICE_TYPE_ALL, 0, NULL, &num_devices);
The preceding call gives the number of available devices of type CL_DEVICE_TYPE_ALL. You can instead use CL_DEVICE_TYPE_CPU or CL_DEVICE_TYPE_GPU if you want the number of available CPU or GPU devices.
To print the attributes of each device, we have added the PrintDeviceInfo function:
void PrintDeviceInfo(cl_device_id device)
{
    char queryBuffer[1024] = "";
    cl_uint queryInt = 0;  /* CL_DEVICE_MAX_COMPUTE_UNITS is a cl_uint */
    cl_int clError;        /* return codes are stored but not checked here */

    clError = clGetDeviceInfo(device, CL_DEVICE_NAME, sizeof(queryBuffer), queryBuffer, NULL);
    printf("CL_DEVICE_NAME: %s\n", queryBuffer);
    queryBuffer[0] = '\0';
    clError = clGetDeviceInfo(device, CL_DEVICE_VENDOR, sizeof(queryBuffer), queryBuffer, NULL);
    printf("CL_DEVICE_VENDOR: %s\n", queryBuffer);
    queryBuffer[0] = '\0';
    clError = clGetDeviceInfo(device, CL_DRIVER_VERSION, sizeof(queryBuffer), queryBuffer, NULL);
    printf("CL_DRIVER_VERSION: %s\n", queryBuffer);
    queryBuffer[0] = '\0';
    clError = clGetDeviceInfo(device, CL_DEVICE_VERSION, sizeof(queryBuffer), queryBuffer, NULL);
    printf("CL_DEVICE_VERSION: %s\n", queryBuffer);
    queryBuffer[0] = '\0';
    clError = clGetDeviceInfo(device, CL_DEVICE_MAX_COMPUTE_UNITS, sizeof(queryInt), &queryInt, NULL);
    printf("CL_DEVICE_MAX_COMPUTE_UNITS: %u\n", queryInt);
}
Note that each param_name passed to clGetDeviceInfo returns a different data type. In the routine PrintDeviceInfo, you can see that the CL_DEVICE_MAX_COMPUTE_UNITS param_name returns an unsigned integer (cl_uint), whereas the CL_DRIVER_VERSION param_name returns a character buffer.
The preceding function prints the following information about the device:
CL_DEVICE_NAME
CL_DEVICE_VENDOR
CL_DRIVER_VERSION
CL_DEVICE_VERSION
CL_DEVICE_MAX_COMPUTE_UNITS
The following are the maximum numbers of compute units reported by different types of platforms when you query for a GPU device:
For APU-like processors:
AMD A10 5800K - 6
AMD Trinity has 6 SIMD engines (compute units) and each has 64 processing elements.
Intel HD 4000 - 16
Intel HD 4000 has 16 compute units and each is a single thread processor.
For discrete graphics:
NVIDIA GTX 680 - 8
The NVIDIA GTX 680 has a total of eight compute units; each compute unit has 192 processing elements.
AMD Radeon HD 7870 - 32
The AMD Radeon HD 7870 GPU has 32 compute units and each has 64 processing elements.
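The total number of processing elements on a device is the number of compute units multiplied by the processing elements per unit. The following is a small sketch of that arithmetic using two of the figures quoted above. The struct gpu_layout and both functions are hypothetical helpers, and the numbers are taken from the text rather than queried from hardware.

```c
#include <stdio.h>

/* Hypothetical record for the figures quoted in the text. */
struct gpu_layout {
    const char *name;
    unsigned compute_units;
    unsigned elements_per_unit;
};

/* Total processing elements = compute units x elements per unit. */
unsigned total_processing_elements(struct gpu_layout g)
{
    return g.compute_units * g.elements_per_unit;
}

/* Prints one line per device with its computed total. */
void print_layouts(void)
{
    const struct gpu_layout layouts[] = {
        { "AMD A10 5800K", 6, 64 },
        { "NVIDIA GTX 680", 8, 192 },
    };
    size_t i;

    for (i = 0; i < sizeof(layouts) / sizeof(layouts[0]); ++i) {
        printf("%s: %u compute units, %u processing elements\n",
               layouts[i].name,
               layouts[i].compute_units,
               total_processing_elements(layouts[i]));
    }
}
```

With these figures, the two devices differ by only two compute units but by a factor of four in processing elements, so the compute unit count on its own says little about the relative size of two devices.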
Having more compute units in a GPU device does not necessarily make the processor faster. The number of compute units varies across different computer architectures and across different hardware vendors. Even within a single vendor there are different families, such as the NVIDIA Kepler and Fermi architectures, or the AMD Radeon HD 6XXX and Radeon HD 7XXX architectures. The OpenCL specification is targeted at programming these different kinds of devices from different vendors. As an enhancement to the sample program, print the device-related attributes and resource sizes for some of the param_name instances listed as follows:
CL_DEVICE_TYPE
CL_DEVICE_MAX_CLOCK_FREQUENCY
CL_DEVICE_IMAGE_SUPPORT
CL_DEVICE_SINGLE_FP_CONFIG
Besides these, there are many more device attributes that can be queried. Take a look at the different param_name instances provided in the OpenCL specification 1.2, table 4.3. You should try out all the param_name instances and try to understand each device property.