- Hands-On GPU-Accelerated Computer Vision with OpenCV and CUDA
- Bhaumik Vaidya
- 2021-08-13 15:48:25
Global memory
All blocks have read and write access to global memory. This memory is slow, but it can be accessed from anywhere in your device code. Caching is used to speed up access to global memory. All memory allocated using cudaMalloc is global memory. The following simple example demonstrates how you can use global memory from your program:
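#include <stdio.h>
#include <cuda_runtime.h>

#define N 5  // Number of elements in the array

// Kernel: each thread writes its own thread index into global memory.
__global__ void gpu_global_memory(int *d_a)
{
    d_a[threadIdx.x] = threadIdx.x;
}

int main(int argc, char **argv)
{
    int h_a[N];  // Host array that receives the result
    int *d_a;    // Pointer to the array in device global memory

    // Allocate the array in global memory on the device.
    cudaMalloc((void **)&d_a, sizeof(int) * N);
    // Launch one block of N threads. The kernel fills the array itself,
    // so the uninitialized host array is not copied to the device first.
    gpu_global_memory<<<1, N>>>(d_a);
    // Copy the result back from device memory to host memory.
    cudaMemcpy((void *)h_a, (void *)d_a, sizeof(int) * N, cudaMemcpyDeviceToHost);
    printf("Array in Global Memory is: \n");
    for (int i = 0; i < N; i++)
    {
        printf("At Index: %d --> %d \n", i, h_a[i]);
    }
    // Release the device memory.
    cudaFree(d_a);
    return 0;
}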
This code demonstrates how you can write to global memory from your device code. The memory is allocated using cudaMalloc from the host code, and a pointer to this array is passed as a parameter to the kernel function. The kernel function populates this memory chunk with the thread ID values. The array is then copied back to host memory for printing. The result is shown as follows:
Array in Global Memory is:
At Index: 0 --> 0
At Index: 1 --> 1
At Index: 2 --> 2
At Index: 3 --> 3
At Index: 4 --> 4
As we are using global memory, this operation will be slow. Advanced techniques for speeding it up will be explained later on. In the next section, we will explain local memory and registers, which are private to each thread.
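As a follow-up to the global memory example, the sketch below illustrates the earlier statement that all blocks can read and write global memory. It is a minimal variation and not code from the book: the kernel name gpu_write_block_id, the CHECK_CUDA macro, and the values of N and THREADS_PER_BLOCK are assumptions chosen for illustration. Two blocks write into the same array allocated with cudaMalloc, and each CUDA runtime call is checked for errors.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

#define N 8                  // Total number of elements (illustrative value)
#define THREADS_PER_BLOCK 4  // Threads per block (illustrative value)

// Hypothetical helper macro: abort with a message if a CUDA runtime call fails.
#define CHECK_CUDA(call)                                          \
    do {                                                          \
        cudaError_t err = (call);                                 \
        if (err != cudaSuccess) {                                 \
            fprintf(stderr, "CUDA error: %s (line %d)\n",         \
                    cudaGetErrorString(err), __LINE__);           \
            exit(EXIT_FAILURE);                                   \
        }                                                         \
    } while (0)

// Each thread stores the index of its block in one element of the array.
// Threads from different blocks write to the same global memory allocation.
__global__ void gpu_write_block_id(int *d_a, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;  // Global element index
    if (idx < n)                                      // Guard against extra threads
    {
        d_a[idx] = blockIdx.x;
    }
}

int main(void)
{
    int h_a[N];
    int *d_a = NULL;

    // Allocate the array in device global memory.
    CHECK_CUDA(cudaMalloc((void **)&d_a, sizeof(int) * N));

    // Round up so that every element is covered by a thread.
    int blocks = (N + THREADS_PER_BLOCK - 1) / THREADS_PER_BLOCK;
    gpu_write_block_id<<<blocks, THREADS_PER_BLOCK>>>(d_a, N);
    CHECK_CUDA(cudaGetLastError());  // Report kernel launch errors

    // cudaMemcpy waits for the kernel to finish before copying.
    CHECK_CUDA(cudaMemcpy(h_a, d_a, sizeof(int) * N, cudaMemcpyDeviceToHost));

    for (int i = 0; i < N; i++)
    {
        printf("At Index: %d --> written by block %d\n", i, h_a[i]);
    }

    CHECK_CUDA(cudaFree(d_a));
    return 0;
}
```

With these values, indices 0 to 3 are written by block 0 and indices 4 to 7 by block 1, so the host sees the contributions of both blocks in a single array after one copy.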