- Deep Learning with R for Beginners
- Mark Hodnett Joshua F. Wiley Yuxi (Hayden) Liu Pablo Maldonado
- 485字
- 2021-06-24 14:30:46
Image Classification Using Convolutional Neural Networks
It is not an exaggeration to say that the huge growth of interest in deep learning can be mostly attributed to convolutional neural networks. Convolutional neural networks (CNNs) are the main building blocks of image classification models in deep learning, and have replaced most techniques that were previously used by specialists in the field. Deep learning models are now the de facto method to perform all large-scale image tasks, including image classification, object detection, detecting artificially generated images, and even attributing text descriptions to images. In this chapter, we will look at some of these techniques.
Why are CNNs so important? To explain why, we can look at the history of the ImageNet competition. The ImageNet competition is an open large-scale image classification challenge that has one thousand categories. It can be considered as the unofficial world championship for image classification. Teams, mostly fielded by academics and researchers, compete from around the world. In 2011, an error rate of around 25% was the benchmark. In 2012, a team led by Alex Krizhevsky and advised by Geoffrey Hinton achieved a huge leap by winning the competition with an error rate of 16%. Their solution consisted of 60 million parameters and 650,000 neurons, five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1,000-way softmax layer to do the final classification.
Other researchers built on their techniques in subsequent years, with the result that the original ImageNet competition is essentially considered solved. In 2017, almost all teams achieved an error rate of less than 5%. Most people consider that the 2012 ImageNet victory heralded the dawn of the new deep learning revolution.
In this chapter, we will look at image classification using CNN. We are going to start with the MNIST dataset, which is considered as the Hello World of deep learning tasks. The MNIST dataset consists of grayscale images of size 28 x 28 of 10 classes, the numbers 0-9. This is a much easier task than the ImageNet competition; there are 10 categories rather than 1,000, the images are in grayscale rather than color and most importantly, there are no backgrounds in the MNIST images that can potentially confuse the model. Nevertheless, the MNIST task is an important one in its own right; for example, most countries use postal codes containing digits. Every country uses automatic address routing solutions that are more complex variations of this task.
We will use the MXNet library from Amazon for this task. The MXNet library is an excellent library introduction to deep learning as it allows us to code at a higher level than other libraries such as TensorFlow, which we cover later on in this book.
The following topics will be covered in this chapter:
- What are CNNs?
- Convolutional layers
- Pooling layers
- Softmax
- Deep learning architectures
- Using MXNet for image classification
- LibGDX Game Development Essentials
- Hands-On Machine Learning with Microsoft Excel 2019
- Mastering Ninject for Dependency Injection
- 文本挖掘:基于R語言的整潔工具
- R數(shù)據(jù)科學實戰(zhàn):工具詳解與案例分析(鮮讀版)
- 數(shù)據(jù)庫系統(tǒng)原理及應用教程(第4版)
- 智能數(shù)據(jù)分析:入門、實戰(zhàn)與平臺構(gòu)建
- 信息學競賽寶典:數(shù)據(jù)結(jié)構(gòu)基礎
- 大數(shù)據(jù)治理與安全:從理論到開源實踐
- Hadoop 3實戰(zhàn)指南
- Visual Studio 2012 and .NET 4.5 Expert Development Cookbook
- 云原生架構(gòu):從技術演進到最佳實踐
- Practical Convolutional Neural Networks
- Foxtable數(shù)據(jù)庫應用開發(fā)寶典
- SQL必知必會(第4版)