- Building Computer Vision Projects with OpenCV 4 and C++
- David Millán Escrivá, Prateek Joshi, Vinícius G. Mendonça, Roy Shilkrot
- 2021-07-02 12:28:27
How do humans understand image content?
If you look around, you will see a lot of objects. You encounter many different objects every day, and you recognize them almost instantaneously without any effort. When you see a chair, you don't wait for a few minutes before realizing that it is in fact a chair. You just know that it's a chair right away.
Computers, on the other hand, find this task very difficult. Researchers have spent many years trying to understand why computers are not as good at it as we are.
To answer that question, we need to understand how humans do it. Visual data is processed in the ventral visual stream, the pathway in our visual system that is associated with object recognition. It is essentially a hierarchy of areas in our brain that helps us recognize objects.
Humans can recognize different objects effortlessly, and can cluster similar objects together. We can do this because we have developed some sort of invariance toward objects of the same class. When we look at an object, our brain extracts the salient points in such a way that factors such as orientation, size, perspective, and illumination don't matter.
A chair that is double the normal size and rotated by 45 degrees is still a chair. We recognize it easily because of the way we process it; machines cannot do that as easily. Humans tend to remember an object by its shape and important features, so we can still recognize it regardless of how it is placed.
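To make the idea of invariance concrete, here is a minimal sketch in plain C++ (no OpenCV required). It is not a method from this book and not a model of human vision. It assumes a shape is given as a handful of 2D points and uses a deliberately simple, hypothetical descriptor: the sorted pairwise distances between points, divided by the largest distance. The names `Point2D`, `scaleAndRotate`, `invariantDescriptor`, and `sameShape` are illustrative only.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 2D point type used only for this illustration.
struct Point2D {
    double x;
    double y;
};

// Scale every point by `scale` and rotate it by `degrees` around the origin.
std::vector<Point2D> scaleAndRotate(const std::vector<Point2D>& shape,
                                    double scale, double degrees) {
    const double pi = std::acos(-1.0);
    const double rad = degrees * pi / 180.0;
    const double c = std::cos(rad);
    const double s = std::sin(rad);
    std::vector<Point2D> out;
    out.reserve(shape.size());
    for (const Point2D& p : shape) {
        out.push_back({scale * (c * p.x - s * p.y),
                       scale * (s * p.x + c * p.y)});
    }
    return out;
}

// Sorted pairwise distances, normalized by the largest distance.
// Rotation preserves distances and uniform scaling multiplies all of them
// by the same factor, so the normalized list does not change.
std::vector<double> invariantDescriptor(const std::vector<Point2D>& shape) {
    assert(shape.size() >= 2);  // need at least one pair of points
    std::vector<double> dists;
    for (std::size_t i = 0; i < shape.size(); ++i) {
        for (std::size_t j = i + 1; j < shape.size(); ++j) {
            dists.push_back(std::hypot(shape[i].x - shape[j].x,
                                       shape[i].y - shape[j].y));
        }
    }
    std::sort(dists.begin(), dists.end());
    const double largest = dists.back();
    assert(largest > 0.0);  // all points coinciding is not a usable shape
    for (double& d : dists) {
        d /= largest;
    }
    return dists;
}

// Two point sets count as the same shape if their descriptors agree
// within a small tolerance.
bool sameShape(const std::vector<Point2D>& a, const std::vector<Point2D>& b,
               double tol = 1e-9) {
    if (a.size() != b.size()) {
        return false;
    }
    const std::vector<double> da = invariantDescriptor(a);
    const std::vector<double> db = invariantDescriptor(b);
    for (std::size_t i = 0; i < da.size(); ++i) {
        if (std::fabs(da[i] - db[i]) > tol) {
            return false;
        }
    }
    return true;
}
```

Calling `sameShape(chair, scaleAndRotate(chair, 2.0, 45.0))` on a point set `chair` returns true, because the descriptor ignores scale and rotation. The sketch says nothing about perspective or illumination, and different shapes can occasionally share the same set of distances, so treat it as an illustration of the concept rather than a usable recognizer.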
In our visual system, we build up hierarchical invariances with respect to position, scale, and viewpoint, and these make our recognition very robust. If you look deeper into our visual system, you will see that humans have cells in their visual cortex that respond to simple shapes such as curves and lines.
As we move further along the ventral stream, we find cells that respond to more complex objects such as trees, gates, and so on. The neurons along the ventral stream tend to have progressively larger receptive fields, and the complexity of their preferred stimuli increases as well.