This page is part of CVprimer.com, a wiki devoted to computer vision. It focuses on low-level computer vision, digital image analysis, and applications. It is designed as an online textbook, but the exposition is informal. It is geared towards software developers, especially beginners, and CS students. The wiki contains mathematics, algorithms, code examples, source code, compiled software, and some discussion. If you have any questions or suggestions, please contact me directly.

Dimension


The Roomba has vision, however rudimentary. It does not detect vertical changes, so it is fair to say that its vision is 1-dimensional. Observe that the parameter corresponding to this dimension is binary: it either touches an obstacle or it does not. Another 1D vision system is radar (with a continuous parameter).

Photography provides 2D vision.

Here's a way to define 3D: “I can see the object AND I know how far it is”. The object is a 2D picture and the distance (depth) is the 3rd dimension. Without that, it’s not 3D. It does not matter whether the picture is curved as in panoramic shots.

Generally, conversion of a single 2D image into 3D does not work. The simple reason is the lack of information.

To capture a 3D image you need a 3D camera.

What is a 3D camera? Any camera takes 2D pictures, so all you need to add is the 3rd dimension. Time could be that, so a video camera is a 3D camera. Or you could combine several cameras in a row; that row is the 3rd dimension (in fact, just two cameras will do). In either case, you can find the distance via stereo vision. Or you could simply add a distance-measuring device such as radar.
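
As a rough illustration of how two cameras yield distance, here is a minimal sketch of the standard pinhole stereo relation, depth = focal length × baseline / disparity. It assumes the two images are already rectified and that the same point has been matched in both. The function name and the numbers in the usage note are made up for illustration and are not part of this wiki's code.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Distance (in meters) to a point seen by two side-by-side cameras.

    focal_px     -- focal length of the cameras, in pixels (assumed equal)
    baseline_m   -- distance between the two camera centers, in meters
    disparity_px -- horizontal shift of the point between the two images
    """
    if disparity_px <= 0:
        # Zero disparity means the point is effectively at infinity.
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px
```

For instance, with a hypothetical focal length of 700 pixels and cameras 0.1 m apart, `depth_from_disparity(700.0, 0.1, 35.0)` gives the distance of a point that shifts by 35 pixels between the two views. The closer the object, the larger the shift.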

Calibration can also solve the problem of the 3rd dimension. The solution, however, is only as complete as your collection of objects of known size. Imagine going through a forest...
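
The idea of recovering distance from objects of known size can be sketched as follows. This is only an illustration under the pinhole camera assumption: an object of real height H that appears h pixels tall is at a distance of roughly f × H / h, where f is the focal length in pixels. The table of known heights and all names below are hypothetical.

```python
# Hypothetical collection of objects whose real size we know (meters).
KNOWN_HEIGHTS_M = {
    "door": 2.0,
    "person": 1.7,
}

def distance_from_known_size(label, image_height_px, focal_px):
    """Estimate distance to a recognized object, or None if its size is unknown."""
    real_height_m = KNOWN_HEIGHTS_M.get(label)
    if real_height_m is None or image_height_px <= 0:
        # No known size for this object: the 3rd dimension cannot be recovered.
        return None
    return focal_px * real_height_m / image_height_px
```

A door that appears 100 pixels tall with a focal length of 800 pixels comes out at about 16 m, while an unlisted object such as a tree returns `None`. That is the forest problem: without a known size, there is nothing to calibrate against.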

Of course, here we refer to the spatial dimensions of images. Time, color, and spectrum are referred to as image parameters.
