Computer Vision for Dummies » 2008

November 28, 2008

Image search engines still keep launching: Incogna

Filed under: Image search, reviews, rants — Peter @ 1:19 am

The screenshot tells the whole story. The image of a table in the upper left corner is the query image. The rest are supposed to be “similar”. What is the image filled with numbers doing here, you ask? Hmm… Oh yes, it’s a table of numbers!

Previous posts on the topic are here.


November 23, 2008

A Graph, Non-Tree Representation of the Topology of a Gray Scale Image, paper

Filed under: updates, computer vision/machine vision/AI, mathematics — Peter @ 4:27 pm

This is a new version of a paper I wrote a few months ago. It describes the algorithm behind Pixcavator.

There are a couple of issues here. The first is reflected in the new title. It was unclear to a lot of people that the topology graph isn’t a tree. To deal with that I first added 5-6 mentions of the fact that “this graph isn’t a tree”. Then I added a section called The topology graph is not a tree. Finally I renamed the paper as well. At that point I just kept rewriting. The new version is almost unrecognizable. The image below illustrates the structure of the topology graph.

The second issue is the complexity of the algorithm as a function of the number of pixels N in the image. I show that my algorithm is O(N²) in the worst case, which is the comb image on the right. In Fast computation of a contrast invariant image representation, P. Monasse and F. Guichard build a graph too (theirs is a tree) and claim that the complexity is O(N log N). Their proof looked unsatisfactory, so I decided to apply their algorithm to my image. Odd…

Suppose the image is not quantized, so there may be as many as N levels of gray in the image. For each level, the level curve has to be traversed (step 4a). These level curves grow from a very short one to the whole comb, whose perimeter is roughly N. The total number of pixels to be visited is then 1+2+3+…+N = N(N+1)/2. Therefore the complexity is O(N²), not O(N log N).
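
The arithmetic in this argument can be checked with a minimal Python sketch. It assumes, as the argument does, that the k-th level curve is k pixels long; the function names are made up for illustration and do not come from either paper.

```python
import math


def total_visits(num_levels: int) -> int:
    """Pixels visited if the k-th level curve is k pixels long (assumption)."""
    return sum(range(1, num_levels + 1))


def closed_form(num_levels: int) -> int:
    """The same total written as N(N+1)/2."""
    return num_levels * (num_levels + 1) // 2


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        visits = total_visits(n)
        # The explicit sum and the closed form must agree.
        assert visits == closed_form(n)
        # Print N, the number of visits, and N*log2(N) for comparison.
        print(n, visits, round(n * math.log2(n)))
```

The last column grows much more slowly than the middle one, which is the gap between O(N log N) and O(N²) that the argument points at.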

Seems clear to me but maybe there is someone who can see a flaw in this argument? What am I missing? I’d appreciate any feedback.


November 17, 2008

A few thoughts and opinions on MATLAB.

Filed under: Uncategorized, image processing/image analysis software, reviews, mathematics — Peter @ 12:59 am

In a recent post I made a couple of comments about the choice of MATLAB as the software for Digital Image Processing Using MATLAB by Gonzalez, Woods, and Eddins. I listed “MATLAB is ubiquitous” as a pro, and “MATLAB is expensive” and “MATLAB is good for education and possibly research, problematic for industry” as two cons. Coincidentally, there was a discussion on Hacker News centered on MATLAB. Here are some of the comments:

… designed for engineers (I mean engineers, not computer programmers) to explore matrix models interactively, then save their work as scripts - you were never meant to use m-files for general purpose programming… What you’re paying for with MATLAB is access to the Mathworks Toolboxes. If you need them then it’s absolutely worth every penny.

Matlab is, to the programmer with experience in almost any other language, a tremendous horror… That being said, if you have the mathematical chops to rearrange your problem into something solvable via matrix transformations, you can probably write it quickly and elegantly in Matlab without worrying too greatly about execution speed. Better, the built in toolboxes have already solved huge (engineering) problem spaces… Prototype the math in Matlab, implement in a language that doesn’t suck.

… the matrix/vector/tensor core is very elegant and powerful.

You’re probably unlikely to write a real application in Matlab… when you pay for Matlab, you pay for the assurance that the implementation of the tools provided is correct and therefore your research is based on a proven foundation.

My biggest complaint about Matlab (besides the licensing) is that it’s just a horrendously bad programming language (if you can call it a language at all)… you have to buy a toolbox for everything…

Matlab is really really, annoyingly powerful. You’ve got almost anything to try your ideas, implement an academic paper. But, it is slow (execution time). Prototype with Matlab, implement with C++.

Even though image processing/analysis wasn’t discussed, it is good to see that my assessment is confirmed.

I can add a few things though.

MATLAB does some “rearranging” of image manipulation (games) into matrix operations, but images can’t be analyzed as matrices, as they aren’t subject to matrix operations. They are in fact tables.

Matrices, yes. Linear algebra? Linear maps, kernels, images? Not as much. Quotient spaces? No. How am I supposed to compute homology of images?

Also, there are no pointers. So how am I supposed to implement graph representations of images?

So, MATLAB is mathematically powerful but only if you understand mathematics very narrowly.

Eventually I’ll add this to the wiki under MATLAB.


November 10, 2008

Books on computer vision, part 2

Filed under: computer vision/machine vision/AI, reviews, mathematics — Peter @ 3:10 pm

As I mentioned in the last post, I am at the initial stages of writing a book on elementary computer vision. It makes sense at this point to provide a rationale for such a book.

Current textbooks either have extensive prerequisites or take too long to get the student to use what’s been learned in real-life computer vision projects.

Let’s consider an example. Suppose we have freshman or sophomore students in a technical discipline. They have to take their first course in image processing. What are they capable of doing at the end of a typical course? They know about image representation and how to handle image files. They know how to increase contrast and remove noise. They are familiar with image restoration, image enhancement, and image compression. All good, but this choice of topics draws students toward photo editing and away from scientific and industrial applications.

I am talking about the image processing vs. image analysis dilemma. The former produces images and the latter produces data. More on this here.
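
To make the distinction concrete, here is a minimal Python sketch. The two functions and the tiny test image are hypothetical illustrations, not taken from any textbook or from Pixcavator: thresholding takes an image and returns an image (processing), while counting 4-connected objects takes an image and returns a number (analysis).

```python
from collections import deque


def threshold(image, level):
    """Image processing: gray scale image in, binary image out."""
    return [[1 if pixel >= level else 0 for pixel in row] for row in image]


def count_objects(binary):
    """Image analysis: binary image in, number of 4-connected objects out."""
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # New object found: flood-fill it so it is counted only once.
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


# A made-up 3x5 gray scale image with two bright blobs.
gray = [
    [200, 210, 10, 10, 10],
    [205, 10, 10, 180, 10],
    [10, 10, 10, 190, 185],
]
binary = threshold(gray, 128)   # still an image
print(count_objects(binary))    # data: prints 2
```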

As image processing is a time-consuming topic, the students may only get a little taste of image analysis (image segmentation and related topics about image content).

The result is that in order to make their skills applicable to scientific image analysis, they will need to take a more advanced course on the subject. Such a course would require (some combination of) calculus 1-3, linear algebra, probability. Even then, 3D images, especially their topology, are rarely addressed.

So, there may be a need for something even more elementary than Digital Image Processing Using MATLAB by Gonzalez, Woods, and Eddins, discussed last time.


November 3, 2008

Books on computer vision

Filed under: computer vision/machine vision/AI, reviews, mathematics — Peter @ 2:28 pm

As I have mentioned before, I am thinking about writing a book on elementary computer vision and image analysis. Of course, I’ll follow what’s already in the wiki. It will take a while, and in the process I am researching some of the better books related to the subject. I think Digital Image Processing Using MATLAB by Gonzalez, Woods, and Eddins is one of the best and closest to what I have in mind. Here is a short analysis.

| Pros | Cons |
| --- | --- |
| “[T]extbook format not a software manual”. | |
| Comprehensive coverage of image processing. | A loose collection of “tools”. More about image processing than image analysis. No video analysis. No 3D analysis. |
| Many illustrations. | |
| Some mathematics is explained. | Required: good understanding of calculus, some linear algebra. |
| Many examples of MATLAB code. | |
| Website: a lot of supplementary material (even PowerPoint slides for instructors). | |
| Many projects online. | No exercises in the book. |
| Based on MATLAB which is ubiquitous. | MATLAB is expensive. MATLAB is good for education and possibly research, problematic for industry. |
| Accessible to “individuals with a basic background in digital image processing, mathematical analysis, and computer programming, all at the level typical of that found in a junior/senior curriculum in a technical discipline.” | These requirements make it an intermediate book. |