| Stable release | 1.0 |
|---|---|
| Written in | C, C++, and Fortran |
| Operating system | Cross-platform |
| Platform | Cross-platform |
| Type | API |
| Website | http://www.openacc.org/ |
OpenACC is a programming standard for parallel computing developed by Cray, CAPS, Nvidia and PGI. The standard is designed to simplify parallel programming of heterogeneous CPU/GPU systems.[1]
As in OpenMP, the programmer annotates C, C++ and Fortran source code with compiler directives (pragmas) and additional functions to identify the regions that should be accelerated.[2] Unlike OpenMP, the annotated code can run not only on the CPU but also on the GPU.
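As a minimal sketch of such an annotation, the C function below marks a single loop for acceleration. The function name `saxpy` and its signature are illustrative assumptions, not part of the standard; the `parallel loop` directive and the `copyin`/`copy` data clauses are taken from the OpenACC specification. A compiler without OpenACC support simply ignores the pragma and runs the loop on the CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative helper (name and signature are assumptions):
   computes y[i] = a * x[i] + y[i] for i in [0, n). */
void saxpy(int n, float a, const float *x, float *y)
{
    assert(x != NULL && y != NULL);

    /* Request that the loop be offloaded. copyin moves x to the
       accelerator; copy moves y there and back again afterwards.
       The programmer asserts the iterations are independent. */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Because the directive is only a hint layered on ordinary C, the same source file builds and produces the same results with or without an OpenACC-capable compiler.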
OpenACC members have worked within the OpenMP standard group to merge OpenACC into the OpenMP specification, creating a common specification that extends OpenMP to support accelerators in a future release of OpenMP.[3][4] These efforts resulted in a technical report[5] for comment and discussion, timed so that the discussion period includes the annual Supercomputing Conference (November 2012, Salt Lake City), and intended to address non-NVIDIA accelerator support with input from hardware vendors who participate in OpenMP.[6]
At ISC’12, OpenACC was demonstrated to work on NVIDIA, AMD and Intel accelerators, although no performance data was given.[7]
On November 12, 2012, at the SC12 conference, a draft of the OpenACC version 2.0 specification was presented.[8] Proposed new capabilities include new controls over data movement (such as better handling of unstructured data and improved support for non-contiguous memory), and support for explicit function calls and separate compilation (allowing the creation and reuse of libraries of accelerated code).
Support for OpenACC is available in compilers from PGI (from version 12.6), Cray, and CAPS.[7][9]
To use the OpenACC runtime API, the user includes "openacc.h" in C or "openacc_lib.h" in Fortran,[10] and can then call the acc_init() function to initialise the runtime for a given device type.
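The setup step might look like the sketch below. The helper names `accel_start` and `accel_stop` are hypothetical; `acc_init()`, `acc_shutdown()`, the `acc_device_default` constant and the `_OPENACC` macro come from the OpenACC specification. Guarding on `_OPENACC` is an assumption made here so the file still builds with a compiler that lacks OpenACC support.

```c
#ifdef _OPENACC
#include <openacc.h>   /* OpenACC runtime API declarations */
#endif

/* Hypothetical helper: initialise the runtime up front so that the
   start-up cost is not charged to the first accelerated region.
   Returns 1 when built with OpenACC enabled, 0 otherwise. */
int accel_start(void)
{
#ifdef _OPENACC
    acc_init(acc_device_default);
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: release the device when the work is done. */
void accel_stop(void)
{
#ifdef _OPENACC
    acc_shutdown(acc_device_default);
#endif
}
```

A program would typically call `accel_start()` once near the top of `main` and `accel_stop()` before exiting.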
OpenACC defines some pragmas (directives), for example:
    #pragma acc parallel
    #pragma acc kernels
    #pragma acc data
    #pragma acc loop
    #pragma acc cache
    #pragma acc update
    #pragma acc declare
    #pragma acc wait
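To show how some of these directives combine, the following sketch wraps two accelerated loops in one `data` region so that the array stays on the device between them. The function `scale_and_shift` is an illustrative assumption; the `data`, `parallel loop`, `copy` and `present` constructs are from the specification.

```c
/* Illustrative helper (name is an assumption): multiplies every
   element of v by scale, then adds shift, in two separate loops. */
void scale_and_shift(int n, float *v, float scale, float shift)
{
    /* Copy v to the device once on entry and back once on exit. */
    #pragma acc data copy(v[0:n])
    {
        /* present() states that v is already on the device, so no
           extra transfer is needed for this loop. */
        #pragma acc parallel loop present(v[0:n])
        for (int i = 0; i < n; ++i) {
            v[i] *= scale;
        }

        #pragma acc parallel loop present(v[0:n])
        for (int i = 0; i < n; ++i) {
            v[i] += shift;
        }
    }
}
```

Without the enclosing `data` region, each loop would be free to transfer the array on its own, which is usually the dominant cost for small kernels.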
The standard also defines runtime API functions: acc_get_num_devices(), acc_set_device_type(), acc_get_device_type(), acc_set_device_num(), acc_get_device_num(), acc_async_test(), acc_async_test_all(), acc_async_wait(), acc_async_wait_all(), acc_init(), acc_shutdown(), acc_on_device(), acc_malloc(), acc_free().
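As one example of the runtime API, the sketch below asks how many accelerator devices are attached. The wrapper name `count_accelerators` is hypothetical; `acc_get_num_devices()` and `acc_device_not_host` are defined by the specification. The fallback of returning 0 when the code is built without OpenACC is an assumption of this sketch.

```c
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Hypothetical helper: number of attached non-host accelerator
   devices, or 0 when compiled without OpenACC support. */
int count_accelerators(void)
{
#ifdef _OPENACC
    return acc_get_num_devices(acc_device_not_host);
#else
    return 0;
#endif
}
```

A program could use the result to decide whether to take an accelerated code path or stay on the host.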
OpenACC generally takes care of work organisation for the target device; however, this can be overridden through the use of gangs and workers. A gang consists of workers and operates over a number of processing elements (as with a workgroup in OpenCL).
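A minimal sketch of overriding the default work organisation follows. The function `row_sums` and the values 32 and 4 are arbitrary assumptions chosen for illustration; `num_gangs`, `num_workers`, `loop gang`, `loop worker` and `reduction` are clauses from the specification. Rows are shared out among gangs, and the workers within a gang cooperate on the columns of one row.

```c
/* Illustrative helper (name is an assumption): out[r] receives the
   sum of row r of a rows x cols matrix m stored in row-major order. */
void row_sums(int rows, int cols, const float *m, float *out)
{
    /* 32 gangs of 4 workers each: illustrative values only. */
    #pragma acc parallel num_gangs(32) num_workers(4) \
        copyin(m[0:rows*cols]) copyout(out[0:rows])
    {
        /* Distribute the rows across gangs. */
        #pragma acc loop gang
        for (int r = 0; r < rows; ++r) {
            float sum = 0.0f;

            /* Workers in a gang split the columns and combine
               their partial sums through the reduction. */
            #pragma acc loop worker reduction(+:sum)
            for (int c = 0; c < cols; ++c) {
                sum += m[r * cols + c];
            }
            out[r] = sum;
        }
    }
}
```

Suitable gang and worker counts depend on the device, so values like these would normally be tuned or left to the compiler.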