This page is a part of Computer Vision Primer, a wiki devoted to computer vision. It focuses on low-level computer vision, digital image analysis, and applications. It is designed as an online textbook, but the exposition is informal. It is geared towards software developers, especially beginners, and CS students. The wiki contains mathematics, algorithms, code examples, source code, compiled software, and some discussion. If you have any questions or suggestions, please contact me directly.

Main Page


Welcome To Computer Vision Primer!


A software developer's resource of computer vision methods


Computer Vision For Beginners: A Developer’s Platform

Current image analysis and computer vision technology is a very large collection of disparate “tools” in the form of “toolboxes”, “cookbooks”, or code libraries. It follows this outdated manual paradigm:

1. An image is given.
2. The image is processed and analyzed with mathematical tools.
3. Analysis produces data about the contents of the image.

Image analysis tools include “edge detection”, “thresholding”, “segmentation”, “Fourier transform”, “wavelets”, "the Laplacian of the Gaussian" (my favorite), and on and on, all drowning in a sea of image processing tools. It takes serious training and experience to put these pieces together. The mathematics behind these methods goes well beyond what is covered in a typical undergraduate degree in computer science: Fourier and wavelet transforms, partial differential equations, probability and statistics, discrete topology and geometry, etc.
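To give a feel for what one of these tools looks like in code, here is a minimal sketch of the Laplacian of the Gaussian. It simply samples the standard continuous formula, LoG(x, y) = (r² - 2σ²) / (2πσ⁶) · exp(-r² / (2σ²)) with r² = x² + y², on a small square grid. The function name `log_kernel` and the parameters `sigma` and `radius` are chosen for this illustration only; this is not the code used elsewhere in this wiki.

```python
import math

def log_kernel(sigma, radius):
    """Sample the Laplacian of the Gaussian on a (2*radius + 1) x (2*radius + 1) grid.

    `sigma` is the Gaussian's standard deviation in pixels; `radius` is how far
    the grid extends from the center. Both names are illustrative assumptions.
    """
    s2 = sigma * sigma
    kernel = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            r2 = x * x + y * y
            # Laplacian of G(x, y) = exp(-r2 / (2*s2)) / (2*pi*s2)
            value = (r2 - 2 * s2) / (2 * math.pi * s2 ** 3) * math.exp(-r2 / (2 * s2))
            row.append(value)
        kernel.append(row)
    return kernel

kernel = log_kernel(sigma=1.0, radius=4)
# Print the middle row: negative at the center, positive further out.
print(" ".join(f"{v:+.4f}" for v in kernel[4]))
```

Convolving an image with such a kernel gives a response close to zero over flat regions and a sign change across edges, which is why the operator is popular for edge and blob detection. Even this small example already relies on calculus, which illustrates the point about the mathematical background these tools assume.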

A computer vision platform should allow the software developer to concentrate on the user’s needs instead of on custom development of, or experimentation with, mathematical methods and algorithms. Our goal is to take care of the "mathematical tools" step above so that the developer faces this:

The image analysis part is hidden from the developer, who is simply supplied with the output data.

Initially, we'll be able to handle only the fundamentals: objects in the image, their locations, measurements, their topology, etc. This is what may be called low-level computer vision. This data will allow the developer to concentrate on high-level computer vision: what these objects represent in the context of the project.
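As a rough illustration of the kind of output data meant here, the sketch below thresholds a tiny grayscale image, finds its connected objects, and reports each object's area, centroid, and bounding box. Everything in it is an assumption made for the example: the function name `find_objects`, the threshold of 128, the use of 4-connectivity, and the dictionary keys are hypothetical and are not the interface of the SDK.

```python
from collections import deque

def find_objects(image, threshold=128):
    """Return one record per bright connected region of a grayscale image.

    `image` is a list of rows of pixel values; a pixel is foreground when its
    value is >= `threshold`. Neighbors are the 4 pixels sharing an edge.
    """
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    objects = []
    for r in range(height):
        for c in range(width):
            if seen[r][c] or image[r][c] < threshold:
                continue
            # Breadth-first flood fill collects the pixels of one object.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            ys = [p[0] for p in pixels]
            xs = [p[1] for p in pixels]
            objects.append({
                "area": len(pixels),                                   # pixel count
                "centroid": (sum(ys) / len(ys), sum(xs) / len(xs)),    # (row, column)
                "bounding_box": (min(ys), min(xs), max(ys), max(xs)),  # top, left, bottom, right
            })
    return objects

image = [
    [0,   0,   0, 0,   0, 0],
    [0, 200, 200, 0,   0, 0],
    [0, 200, 200, 0, 255, 0],
    [0,   0,   0, 0, 255, 0],
]
for obj in find_objects(image):
    print(obj)
```

With records like these in hand, the developer can write the high-level logic (for example, "ignore anything smaller than 3 pixels") without touching the analysis itself.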

For that, we have our free software development kit (SDK). If you also want to understand how everything works, this wiki gives you a unique chance to do so: we have complete and detailed expositions and source code.

In fact, this wiki is self-contained and requires only high school math...

How To Read This Wiki

The wiki contains 100 articles, and there are a number of ways you can read it. Depending on your interests, this is how you can start. Each article has further links (UC stands for “under construction”).

If you are just curious about computer vision...

If you are interested in photo editing etc...

If you are a scientist interested in image analysis for biology, medicine etc...

If you are a student taking a computer vision class or image processing class...

Read the book assigned by your professor. Ignore this wiki if you want a good grade... (Unless you are one of my students, then proceed to The Mathematics of Computer Vision.)

If you are a beginner software developer interested in elementary computer vision...

If you are a software developer interested in advanced computer vision...

If you are a computer vision researcher...

Also take a look at these slides:
