
Why is it difficult for machines to understand image content?

We now understand how visual data enters the human visual system, and how our system processes it. The issue is that we still don't fully understand how our brain recognizes and organizes this visual data. In machine learning, we just extract some features from images and ask the computers to learn them using algorithms. We still have to contend with variations in shape, size, perspective, angle, illumination, occlusion, and so on.

For example, the same chair looks very different to a machine when it is viewed from the profile. Humans can easily recognize that it's a chair, regardless of how it's presented to us. So, how do we explain this to our machines?

One way to do this would be to store all the different variations of an object, including sizes, angles, perspectives, and so on. But this process is cumbersome and time-consuming, and it's actually not possible to gather data that encompasses every single variation. Even if it were, the machine would consume a huge amount of memory and a lot of time to build a model that could recognize these objects.

Even with all this, if an object is partially occluded, computers still won't recognize it, because they treat the occluded view as a new object. So, when we build a computer vision library, we need to build the underlying functional blocks that can be combined in many different ways to formulate complex algorithms.

OpenCV provides a lot of these functions, and they are highly optimized. So once we understand what OpenCV is capable of, we can use it effectively to build interesting applications.
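To make this concrete, here is a minimal sketch of how such building blocks can be chained together using OpenCV's Python bindings. The filename chair.jpg is just a placeholder for any local image; each call (imread, cvtColor, GaussianBlur, Canny) is one of the small, optimized functions mentioned above, combined here into a simple edge-detection pipeline.

    import cv2

    # Load an image from disk (OpenCV reads it in BGR order); chair.jpg is a placeholder path
    image = cv2.imread("chair.jpg")

    # Convert to grayscale, discarding color information
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Smooth with a 5x5 Gaussian kernel to suppress noise before edge detection
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)

    # Extract edges with the Canny detector using lower/upper thresholds of 50 and 150
    edges = cv2.Canny(blurred, 50, 150)

    # Display the result until a key is pressed
    cv2.imshow("Edges", edges)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

Each step on its own is simple, but swapping, reordering, or extending these blocks is how more complex computer vision algorithms are put together.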

Let's go ahead and explore that in the next section.
