- Hands-On GPU-Accelerated Computer Vision with OpenCV and CUDA
- Bhaumik Vaidya
- 2021-08-13 15:48:25
Local memory and registers
Local memory and registers are private to each thread. Registers are the fastest memory available to a thread. When the variables of a kernel do not fit in the register file, the compiler places them in local memory. This is called register spilling. Local memory is a per-thread region of global memory, so access to it is slow compared to registers. Because local memory is cached in the L1 and L2 caches, however, register spilling might not affect your program adversely.
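The following is a minimal sketch of a kernel that is likely to use local memory, not an example from the book. It assumes that a per-thread array indexed with a value known only at run time cannot be kept in registers, which is the usual compiler behavior but is not guaranteed. The names `gpu_private_scratch`, `run_private_scratch`, `THREADS`, and `SCRATCH` are hypothetical and chosen only for illustration.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

#define THREADS 4    // threads in the single block (illustrative value)
#define SCRATCH 64   // elements in each thread's private array (illustrative value)

// Each thread fills a private array and reads it back with an index that is
// known only at run time. Arrays like this usually cannot live in registers,
// so the compiler typically places them in local memory (an assumption about
// compiler behavior, not a guarantee).
__global__ void gpu_private_scratch(const int *d_idx, int *d_out)
{
    int scratch[SCRATCH];                       // private to each thread
    for (int i = 0; i < SCRATCH; ++i)
        scratch[i] = i * (threadIdx.x + 1);
    d_out[threadIdx.x] = scratch[d_idx[threadIdx.x]];   // run-time index
}

// Hypothetical host helper: runs the kernel and copies the results to h_out.
// Returns 0 on success and 1 on any CUDA error.
int run_private_scratch(const int *h_idx, int *h_out)
{
    int *d_idx = NULL, *d_out = NULL;
    size_t bytes = THREADS * sizeof(int);
    if (cudaMalloc(&d_idx, bytes) != cudaSuccess) return 1;
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) { cudaFree(d_idx); return 1; }
    cudaMemcpy(d_idx, h_idx, bytes, cudaMemcpyHostToDevice);
    gpu_private_scratch<<<1, THREADS>>>(d_idx, d_out);
    // This copy waits for the kernel and also reports any launch error
    cudaError_t err = cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_idx);
    cudaFree(d_out);
    return err == cudaSuccess ? 0 : 1;
}

int main(void)
{
    int h_idx[THREADS] = {3, 10, 0, 63};   // index each thread will read
    int h_out[THREADS];
    if (run_private_scratch(h_idx, h_out) != 0) {
        printf("CUDA error\n");
        return 1;
    }
    for (int t = 0; t < THREADS; ++t)
        printf("thread %d read %d\n", t, h_out[t]);
    return 0;
}
```

Whether the array really ends up in local memory depends on the compiler version and the target architecture. Compiling with `nvcc -Xptxas -v` prints, for each kernel, the number of registers used and the bytes of stack frame, spill stores, and spill loads, which is one way to check.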
A simple program that shows how a variable local to each thread is used is shown as follows:
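#include <stdio.h>
#define N 5   // number of threads in the block

// Each thread computes and prints its own private copy of t_local.
__global__ void gpu_local_memory(int d_in)
{
    int t_local;   // private to each thread
    t_local = d_in * threadIdx.x;
    printf("Value of Local variable in current thread is: %d \n", t_local);
}

int main(int argc, char **argv)
{
    printf("Use of Local Memory on GPU:\n");
    // Launch one block of N threads
    gpu_local_memory<<<1, N>>>(5);
    // Wait for the kernel to finish so that its printf output is flushed
    cudaDeviceSynchronize();
    return 0;
}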
The t_local variable is local to each thread and is stored in a register. Computations in the kernel function that use this variable are therefore as fast as they can be. When the program runs, each of the five threads prints its own value of t_local, which is d_in * threadIdx.x: 0, 5, 10, 15, and 20.
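To confirm from the host that every thread holds its own copy of t_local, the value can be written to global memory and copied back instead of printed. The following is a minimal sketch of that variation, not code from the book; `gpu_local_to_global` and `collect_local_values` are hypothetical names introduced here.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

#define N 5   // number of threads in the block, as in the program above

// Same computation as gpu_local_memory, but each thread stores its private
// t_local in global memory so that the host can read it.
__global__ void gpu_local_to_global(int d_in, int *d_out)
{
    int t_local = d_in * threadIdx.x;   // one private copy per thread
    d_out[threadIdx.x] = t_local;
}

// Hypothetical host helper: launches the kernel and copies N results to h_out.
// Returns 0 on success and 1 on any CUDA error.
int collect_local_values(int d_in, int *h_out)
{
    int *d_out = NULL;
    size_t bytes = N * sizeof(int);
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) return 1;
    gpu_local_to_global<<<1, N>>>(d_in, d_out);
    // This copy waits for the kernel and also reports any launch error
    cudaError_t err = cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err == cudaSuccess ? 0 : 1;
}

int main(void)
{
    int h_out[N];
    if (collect_local_values(5, h_out) != 0) {
        printf("CUDA error\n");
        return 1;
    }
    for (int i = 0; i < N; ++i)
        printf("thread %d: t_local = %d\n", i, h_out[i]);
    return 0;
}
```

Each thread writes to a different element of d_out, so no synchronization between threads is needed. The register that held t_local is gone once the kernel finishes; only the copy in global memory survives.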
