官术网_书友最值得收藏!

What constitutes computer vision?

In order to begin the discussion on computer vision, observe the following image:

Even if we have never done this activity before, we can clearly tell that the image is of people skiing in the snowy mountains on a cloudy day. This information that we perceive is quite complex and can be sub divided into more basic inferences for a computer vision system.

The most basic observation that we can get from an image is of the things or objects in it. In the previous image, the various things that we can see are trees, mountains, snow, sky, people, and so on. Extracting this information is often referred to as image classification, where we would like to label an image with a predefined set of categories. In this case, the labels are the things that we see in the image. 

A wider observation that we can get from the previous image is landscape. We can tell that the image consists of Snow, Mountain, and Sky, as shown in the following image:

Although it is difficult to create exact boundaries for where the Snow, Mountain, and Sky are in the image, we can still identify approximate regions of the image for each of them. This is often termed as segmentation of an image, where we break it up into regions according to object occupancy. 

Making our observation more concrete, we can further identify the exact boundaries of objects in the image, as shown in the following figure:

In the image, we see that people are doing different activities and as such have different shapes; some are sitting, some are standing, some are skiing. Even with this many variations, we can detect objects and can create bounding boxes around them. Only a few bounding boxes are shown in the image for understanding—we can observe much more than these. 

While, in the image, we show rectangular bounding boxes around some objects, we are not categorizing what object is in the box. The next step would be to say the box contains a person. This combined observation of detecting and categorizing the box is often referred to as object detection. 

Extending our observation of people and surroundings, we can say that different people in the image have different heights, even though some are nearer and others are farther from the camera. This is due to our intuitive understanding of image formation and the relations of objects. We know that a tree is usually much taller than a person, even if the trees in the image are shorter than the people nearer to the camera. Extracting the information about geometry in the image is another sub-field of computer vision, often referred to as image reconstruction. 

主站蜘蛛池模板: 海南省| 布尔津县| 定安县| 奉节县| 阿坝县| 金溪县| 平湖市| 保德县| 扎囊县| 寿光市| 新昌县| 文化| 治县。| 元江| 湛江市| 怀远县| 繁昌县| 云龙县| 浮梁县| 阿克| 陈巴尔虎旗| 庆阳市| 合水县| 琼结县| 延边| 察雅县| 蕉岭县| 竹溪县| 偃师市| 常宁市| 蒙城县| 兴国县| 昭觉县| 库尔勒市| 原平市| 夏河县| 防城港市| 郴州市| 望奎县| 和平区| 顺平县|