
Dataset class

Any custom dataset class, such as our Dogs dataset class, has to inherit from the PyTorch Dataset class. The custom class has to implement two methods, namely __len__(self) and __getitem__(self, idx). Any custom class acting as a Dataset class should look like the following code snippet:

from torch.utils.data import Dataset

class DogsAndCatsDataset(Dataset):
    def __init__(self):
        pass

    def __len__(self):
        pass

    def __getitem__(self, idx):
        pass

We do any initialization, if required, inside the __init__ method; in our case, for example, reading the index of the table and reading the filenames of the images. The __len__(self) method is responsible for returning the number of elements in our dataset. The __getitem__(self, idx) method returns the element at index idx every time it is called. The following code implements our DogsAndCatsDataset class:

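from glob import glob

import numpy as np
from PIL import Image
from torch.utils.data import Dataset

class DogsAndCatsDataset(Dataset):

    def __init__(self, root_dir, size=(224, 224)):
        # root_dir is passed to glob, so it should be a pattern matching the image files
        self.files = glob(root_dir)
        self.size = size

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        img = np.asarray(Image.open(self.files[idx]).resize(self.size))
        # The label is the name of the directory that contains the image
        label = self.files[idx].split('/')[-2]
        return img, label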
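
To make the behaviour of __len__ and __getitem__ concrete, here is a minimal sketch that exercises a lightly adapted copy of the class on a throwaway directory. The directory layout (one sub-directory per label), the blank placeholder images, and the use of sorted() and os.path are assumptions made for the illustration; they are not taken from the original dataset.

```python
import os
import tempfile
from glob import glob

import numpy as np
from PIL import Image
from torch.utils.data import Dataset


class DogsAndCatsDataset(Dataset):
    def __init__(self, root_dir, size=(224, 224)):
        # root_dir is a glob pattern; sorted() only makes the order reproducible
        self.files = sorted(glob(root_dir))
        self.size = size

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        img = np.asarray(Image.open(self.files[idx]).resize(self.size))
        # Portable variant of split('/')[-2]: name of the parent directory
        label = os.path.basename(os.path.dirname(self.files[idx]))
        return img, label


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: <tmp>/dog/0.png and <tmp>/cat/0.png (blank images)
    for name in ("dog", "cat"):
        os.makedirs(os.path.join(tmp, name))
        Image.new("RGB", (64, 48)).save(os.path.join(tmp, name, "0.png"))

    dogsdset = DogsAndCatsDataset(os.path.join(tmp, "*", "*.png"))
    n_items = len(dogsdset)                            # calls __len__
    img, label = dogsdset[0]                           # calls __getitem__(0)
    labels = [dogsdset[i][1] for i in range(n_items)]  # one label per file
```

Each image comes back as a NumPy array resized to the requested size, and its label is the name of the folder it was found in.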

Once the DogsAndCatsDataset class is defined, we can create an object and iterate over it, as shown in the following code:

# dogsdset is an instance of DogsAndCatsDataset
for image, label in dogsdset:
    pass  # Apply your DL on the dataset.
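
The loop works even though the class defines no __iter__ method. As far as we can tell, Python falls back to calling __getitem__ with 0, 1, 2, and so on until an IndexError is raised, which the list lookup in self.files does at the end. The sketch below illustrates that fallback with a hypothetical toy container called Squares, in plain Python and without PyTorch:

```python
class Squares:
    """Hypothetical toy container that defines only __len__ and __getitem__."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, idx):
        # Raising IndexError is what tells the for loop to stop
        if not 0 <= idx < self.n:
            raise IndexError(idx)
        return idx * idx


# No __iter__ is defined, yet iteration works through __getitem__
values = [v for v in Squares(4)]
```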

Applying a deep learning algorithm to one instance of data at a time is not efficient. We need batches of data, as modern GPUs perform best when they operate on a batch. The DataLoader class helps to create batches by abstracting away a lot of complexity.
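
As a rough illustration of batching, the following sketch wraps a dataset in torch.utils.data.DataLoader. ToyImageDataset is a hypothetical stand-in that returns random arrays instead of reading image files, and the batch size of 4 is an arbitrary choice for the example.

```python
import numpy as np
from torch.utils.data import Dataset, DataLoader


class ToyImageDataset(Dataset):
    """Hypothetical stand-in for DogsAndCatsDataset using random pixel data."""

    def __init__(self, n_items=10, size=(224, 224)):
        rng = np.random.default_rng(0)
        self.images = rng.integers(0, 256, size=(n_items, *size, 3), dtype=np.uint8)
        self.labels = ["dog" if i % 2 == 0 else "cat" for i in range(n_items)]

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return self.images[idx], self.labels[idx]


# The default collate function stacks the arrays into one tensor per batch
# and keeps the string labels as a plain sequence.
loader = DataLoader(ToyImageDataset(), batch_size=4, shuffle=False)

batch_shapes = [tuple(images.shape) for images, labels in loader]
first_labels = list(next(iter(loader))[1])
```

With 10 items and a batch size of 4, the loader yields two full batches and a final batch of 2, since drop_last is False by default.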
