Memory-related properties

Memory on the GPU has a hierarchical architecture. It consists of L1 cache, L2 cache, global memory, texture memory, and shared memory. The cudaDeviceProp structure provides many properties that help identify the memory available on the device. memoryClockRate and memoryBusWidth give the clock rate (in kHz) and the bus width (in bits) of the memory, respectively. Memory speed is very important because it affects the overall speed of your program. totalGlobalMem returns the size, in bytes, of the global memory available on the device. totalConstMem returns the total constant memory available on the device. sharedMemPerBlock returns the total shared memory that a single block can use. The total number of registers available per block can be identified by using regsPerBlock. The size of the L2 cache can be identified using the l2CacheSize property. The following code snippet shows how to use memory-related properties in a CUDA program:

// device_Property is assumed to be a cudaDeviceProp structure that has
// already been filled in by cudaGetDeviceProperties().
printf(" Total amount of global memory: %.0f MBytes (%llu bytes)\n",
       (float)device_Property.totalGlobalMem / 1048576.0f,
       (unsigned long long)device_Property.totalGlobalMem);
// memoryClockRate is reported in kHz, so scale it to MHz.
printf(" Memory Clock rate: %.0f MHz\n", device_Property.memoryClockRate * 1e-3f);
printf(" Memory Bus Width: %d-bit\n", device_Property.memoryBusWidth);
// l2CacheSize is 0 when the device has no L2 cache.
if (device_Property.l2CacheSize)
{
    printf(" L2 Cache Size: %d bytes\n", device_Property.l2CacheSize);
}
// totalConstMem and sharedMemPerBlock are size_t values, so print them with %zu.
printf(" Total amount of constant memory: %zu bytes\n", device_Property.totalConstMem);
printf(" Total amount of shared memory per block: %zu bytes\n", device_Property.sharedMemPerBlock);
printf(" Total number of registers available per block: %d\n", device_Property.regsPerBlock);
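
To see why memoryClockRate and memoryBusWidth matter, the two values can be combined into a rough theoretical peak bandwidth figure. The following is a minimal sketch rather than part of the program above: the helper name peakMemoryBandwidthGBps is hypothetical, and the factor of 2 assumes double-data-rate memory, which may not hold for every device. It takes the clock rate in kHz and the bus width in bits, the units in which cudaDeviceProp reports them.

```cpp
#include <cassert>

// Hypothetical helper: estimates the theoretical peak memory bandwidth in GB/s.
// memoryClockRateKHz: memory clock in kHz (as in cudaDeviceProp::memoryClockRate)
// memoryBusWidthBits: bus width in bits (as in cudaDeviceProp::memoryBusWidth)
// Assumption: double-data-rate memory, hence the factor of 2.
double peakMemoryBandwidthGBps(int memoryClockRateKHz, int memoryBusWidthBits)
{
    assert(memoryClockRateKHz >= 0 && memoryBusWidthBits >= 0);
    double clockHz = memoryClockRateKHz * 1e3;           // kHz -> Hz
    double bytesPerTransfer = memoryBusWidthBits / 8.0;  // bits -> bytes
    return 2.0 * clockHz * bytesPerTransfer / 1e9;       // bytes/s -> GB/s
}
```

With the device_Property structure used above, the call would be peakMemoryBandwidthGBps(device_Property.memoryClockRate, device_Property.memoryBusWidth). For example, a 1,000,000 kHz clock on a 256-bit bus gives 64 GB/s under these assumptions. Measured bandwidth is normally lower than this estimate.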