History of CNNs

Researchers have tried for decades to make machines recognize pictures. Mimicking the visual recognition system of the human brain in a computer is a challenge: human vision is the most complex sensory cognitive system of the brain and the hardest to mimic. We will not discuss biological neurons here, such as those of the primary visual cortex, but rather focus on artificial neurons; throughout this book, we introduce neural networks without appealing to brain analogies.

Objects in the physical world are three dimensional, whereas pictures of those objects are two dimensional. In 1963, computer scientist Larry Roberts, who is also known as the father of computer vision, described the possibility of extracting 3D geometrical information from 2D perspective views of blocks in his research dissertation, work commonly known as BLOCK WORLD. This was the first breakthrough in computer vision. Many researchers worldwide in machine learning and artificial intelligence followed this work and studied computer vision in the context of BLOCK WORLD.

Human beings can recognize blocks regardless of changes in orientation or lighting. In his dissertation, Roberts argued that it is important to understand simple edge-like shapes in images. He extracted these edge-like shapes from blocks so that the computer could recognize two blocks as the same irrespective of orientation.

Vision starts with simple structures; this is the beginning of computer vision as an engineering model. David Marr, an MIT computer vision scientist, gave us the next important concept: vision is hierarchical. He wrote a very influential book named VISION, whose central idea is simple: an image consists of several layers. These two principles, that vision starts with simple structures and that it is hierarchical, form the basis of deep learning architecture, although they do not tell us what kind of mathematical model to use.

In the 1970s, the first visual recognition algorithm, known as the generalized cylinder model, came from the AI lab at Stanford University. The idea is that the world is composed of simple shapes and any real-world object is a combination of these simple shapes. Around the same time, another model, known as the pictorial structure model, was published by SRI Inc. The concept is the same as in the generalized cylinder model, but the parts are connected by springs, which introduces the concept of variability. The first use of a visual recognition algorithm in a digital camera came from Fujifilm in 2006.