It's difficult to use CV libraries (such as OpenCV) effectively, because approaching new problems often requires an understanding of the underlying theory and methodology. Example code can only go so far toward teaching the user how to evaluate and adapt to new constraints. simplecv.org is down for me right now, but I hope that it addresses this concern!
The principles of computer vision are not too difficult to grok, requiring only knowledge of geometry and basic linear algebra. Two popular (dated, but still relevant) introductory texts are referenced below [1] [2]. However, papers describing techniques that work in practice are often dense reads involving concepts from signal processing, control theory, and probability (e.g., [3]).
I personally enjoy learning about simple techniques that are also very effective in practice (e.g. Viola-Jones detection [4], SIFT points [5], RANSAC [6]), and I hope that simplecv.org might go over these in detail too.
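Of those three, RANSAC in particular is small enough to sketch in a few lines. Below is a toy line-fitting version in plain NumPy; the function name, iteration count, and inlier tolerance are my own choices for illustration, not anyone's library API:

```python
import numpy as np

def ransac_line(points, n_iters=200, inlier_tol=0.5, rng=None):
    """Fit a 2D line y = m*x + b robustly with RANSAC.

    Repeatedly sample two points, fit the line through them, and keep the
    model supported by the most inliers (points within inlier_tol of it).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    best_model, best_inliers = None, 0
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if x1 == x2:  # skip degenerate vertical samples for this toy version
            continue
        m = (y2 - y1) / (x2 - x1)
        b = y1 - m * x1
        # Perpendicular distance of every point to the candidate line
        dist = np.abs(m * points[:, 0] - points[:, 1] + b) / np.hypot(m, 1.0)
        inliers = int((dist < inlier_tol).sum())
        if inliers > best_inliers:
            best_model, best_inliers = (m, b), inliers
    return best_model, best_inliers

# Points on y = 2x + 1, plus a few gross outliers mixed in
x = np.linspace(0, 10, 50)
pts = np.column_stack([x, 2 * x + 1])
pts = np.vstack([pts, [[1, 40], [2, -30], [8, 90]]])
(m, b), n_in = ransac_line(pts)
print(round(m, 2), round(b, 2), n_in)  # slope ~2, intercept ~1, 50 inliers
```

A least-squares fit on the same data would be dragged off the true line by the outliers; RANSAC simply ignores them, which is why it shows up everywhere in practical geometry estimation.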
I've worked on image recognition and find OpenCV too simple. I don't like algorithms that work on the entire image without taking into account what is being recognized. For example, if you have a combination of thick and thin edges, you only want to erode the thick ones, and only until they are thin and smooth. If you posterize an image, you don't want to cross regions separated by long smooth edges. And you don't want to blur away noise in the image, because when the noise is localized you can use it to identify the object (similar to the way Shazam uses high frequencies to identify songs).
OpenCV, SimpleCV, etc. are very useful libraries for toying around with images. You can get interesting results without a lot of effort. But the more serious you get about image recognition, the more you find that you can't apply them globally across the image. Finding the yellow car in the parking spot is a good example of both the library's usefulness and its simplistic capabilities: it recognizes a yellow patch in the image; it doesn't recognize a car in any general way. When you're ready to write code to recognize a car, you'll probably find you can't use the OpenCV libraries.
What has worked amazingly well for me is to create a model of the image areas and then apply transformations based on the model. If you have a sharp foreground and a blurry background, don't run recognition algorithms that rely on sharp edges on the blurry background.
A brief note from experience: Viola-Jones is only well suited to objects that appear in a specific orientation (e.g. faces), not to rotation-invariant detection (e.g. cars as seen from satellites).
This is kind of a pattern with CV: it's still far more "art" than "science" at this point.
Not hard at all. That's why the whole thing was handed out as a student summer project at MIT's AI Lab (now CSAIL) back in '66. Yes, nearly fifty years later it's safe to say that some progress has been made, but a solution to the general problem is still "just a few months away".
Unfortunately, computer vision libraries and toolsets are the worst offenders at making each algorithm a 'hammer' when the problem at hand is not a 'nail'. A good example is "Detecting a Car" (http://tutorial.simplecv.org/en/latest/examples/parking.html), where the mean color is used to determine whether a car is in the image. A nice little exercise, but it oversimplifies the process. Each lets the user step in a few puddles and then advertises that you'll be able to swim in the ocean.
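For reference, the tutorial's approach amounts to something like the following NumPy-only sketch. The function, crop coordinates, and tolerance here are my own illustration, not the tutorial's actual code:

```python
import numpy as np

def spot_occupied(img, spot, empty_rgb, tol=60.0):
    """Crude 'is the car there?' check in the spirit of the tutorial:
    compare the mean RGB colour of the parking-spot crop against the
    known colour of the empty spot. spot = (y0, y1, x0, x1); the
    tolerance is an arbitrary guess that would need tuning."""
    y0, y1, x0, x1 = spot
    mean = img[y0:y1, x0:x1].reshape(-1, 3).mean(axis=0)
    return np.linalg.norm(mean - np.asarray(empty_rgb, float)) > tol

# Toy 100x100 image: grey asphalt with a yellow "car" painted in the spot
img = np.full((100, 100, 3), 90.0)
img[40:60, 40:60] = (230.0, 200.0, 30.0)  # yellow patch
print(spot_occupied(img, (40, 60, 40, 60), empty_rgb=(90, 90, 90)))  # True
print(spot_occupied(img, (0, 20, 0, 20), empty_rgb=(90, 90, 90)))    # False
```

Which makes the oversimplification concrete: a shadow, a grey car, or a shifted camera defeats it immediately, because nothing in it knows what a car is.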
This statement is an oversimplification. Computer vision libraries are essentially Swiss army knives. If a person without a reasonable understanding of computer vision research thinks he can solve the problem, that is the person's problem. Then again, I wouldn't blame the naive software engineer who thinks he can solve these problems, because our (extremely complex) visual process makes it all seem very simple. Even Marvin Minsky thought he could get his undergrad Gerry Sussman to “spend the summer linking a camera to a computer and getting the computer to describe what it saw”. Almost 50 years later, we aren't very close to that. Sure, we can solve certain problems in most scenarios (like face detection), or relatively more complex problems in limited-variance situations (like machine vision problems). This is the equivalent of a person who has learned C++ recently using iteration/recursion/design patterns for everything he sees.
I honestly can't tell if you agree or disagree. My sentiment was that CV libraries are good Swiss army knives, but, to continue the analogy, you wouldn't tell someone that said Swiss army knife is all they need to survive in the wilderness.
Honestly, I thought that was a good introductory tutorial to their library.
It's pretty clear that it's only trying to look for a specific car, in a specific spot. And it's obvious that they're taking advantage of the car's bright color to do this, which isn't generally applicable.
It's a toy algorithm, but it's approachable for somebody with no CV experience.
I agree that the example would be better if it raised questions about the weaknesses of this approach and the next steps you might need for a more general solution. But it is still a legitimate example. I'm an amateur, but I'm pretty sure the world of production-grade vision systems is full of hacks like this. A good example is a nose detector on a depth sensor: just look for the closest point to the camera. John Russ's Image Processing Handbook (amazon.com/Image-Processing-Handbook-Sixth-Edition/dp/1439840458) is full of examples where simple techniques are good enough under controlled circumstances.
True object recognition would require strong AI. Every real life vision system is a compromise that makes many assumptions about its input. You'd be foolish to reach for something like SIFT if you really know that the car is always yellow and it always parks in the same 200x200px square.
OpenCV seems like a good platform for academics to publish their algorithms as code. It's nice that, to try a new approach, I don't need to figure out how to get/compile/install a new tool. But we need something better for doing real work... something that simply offers the most basic building blocks: easy GPU offloading, easy inputting/outputting of movies and images, cross-platform shader tools, colorspace conversion, image filters, etc.
Isn't the most important part of any algorithm whether it will fit the purpose? For the situation they describe (purely as an example) their algorithm seems it will probably work quite well. You're not going to build a humanoid robot using that algorithm, but it is addressing the problem they specified so directly that anything more would be over-engineering the solution. Most real-life problems are going to be addressable using similar techniques.
> You're not going to build a humanoid robot using that algorithm, but it is addressing the problem they specified so directly that anything more would be over-engineering the solution
Clearly.
> Most real-life problems are going to be addressable using similar techniques.
Make it judgmental ("Good five o'clock shadow", "Too long, shave it or grow it out", "Too patchy, shave it.") and that would make a useful, albeit niche, smartphone app.
I'm looking to measure dimensions of objects using computer vision libraries. I think it is impossible to do this without a reference scale in the same image.
You can calibrate the camera, estimating intrinsic params such as the focal length. But for size you would need two images, and you'd need to know where each was taken relative to the other (the extrinsic params). This is not impossible, just more difficult than a simple calibration and a photo snap.
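In the special case where you do know the distance to the object (from a depth sensor, a second view, or the setup itself), the arithmetic is just the pinhole model. A minimal sketch, with made-up numbers:

```python
# Under the pinhole model, a fronto-parallel object of real width W at
# distance Z projects to w_px = fx * W / Z pixels, where fx is the focal
# length in pixels obtained from calibration. Knowing Z lets you invert:
def object_width(w_px, fx, z):
    """Real-world width from pixel width, focal length in pixels, and
    distance (answer comes out in the same unit as z). Assumes the
    object plane is parallel to the image sensor."""
    return w_px * z / fx

# e.g. an object 400 px wide, fx = 800 px, 2.5 m away -> 1.25 m wide
print(object_width(400, 800, 2.5))  # 1.25
```

Without a known Z you are back to the two-view setup described above, or to putting a reference object of known size in the scene.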
On top of that, you will have to be able to identify the object you are trying to measure (unless it is a manually driven process).
What about follicle width? There is variation from person to person, and even on a single person, but I am sure you could find an average that is suitable.
I have been working on CV with an emphasis on mobile phones/robotics. The big issue is the lack of consistent documentation. It is either scattered all over the web, or outdated. Been writing a document to address this, but it is still not done.
If you want to play with CV, just get a Raspberry Pi and the camera module, and have fun. The RPi makes it easy to move things around without much trouble. You can even add CV to many different appliances with the RPi.
I love OpenCV. Even with the slightly outdated and incomplete wrapper in Ruby, you can still have a lot of fun with it.
This was posted on HN some time ago and remains one of my favorite creative uses of OpenCV: "So I Suck At 24: Automating Card Games Using OpenCV and Python"
I agree. I recently used OpenCV to process some photos of documents, and it was awesome. I particularly liked the Python 'cv2' bindings. Not bad. It was also my first time really using NumPy in a deep way, and it performed decently.
But, that said, after reading all the OpenCV posts on Stack Overflow and also going through a ton of OpenCV tutorials, I found that the library couldn't meet my needs, in a lot of ways. (Apparently, processing text documents from camera images is a lot harder than I thought!) So, now I'm off reading research papers on image processing, like I don't have anything better to do.
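For what it's worth, the usual first step on document photos is a four-point perspective rectification. This NumPy sketch solves for the same 3x3 homography that cv2.getPerspectiveTransform computes, assuming you've already located the page corners (which, as noted, is the hard part); the corner coordinates below are invented for the example:

```python
import numpy as np

def perspective_transform(src, dst):
    """Solve for the 3x3 homography H mapping 4 src points to 4 dst
    points (what cv2.getPerspectiveTransform does), via the standard
    8-equation linear system with H[2,2] fixed to 1."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.asarray(A, float), np.asarray(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_h(H, pt):
    """Apply a homography to one point (homogeneous divide included)."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return x / w, y / w

# Skewed page corners -> a straight-on 200x300 page
src = [(30, 40), (220, 60), (240, 330), (20, 310)]
dst = [(0, 0), (200, 0), (200, 300), (0, 300)]
H = perspective_transform(src, dst)
print(apply_h(H, (30, 40)))  # ≈ (0, 0)
```

In practice you'd hand H to cv2.warpPerspective to resample the image; the resampling, binarization, and OCR steps are where the real difficulty starts.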
This library seems great for fast exploration of new concepts, so I installed it and tested it. I went through a small tutorial and tried to load an image from the internet and display it; unfortunately, it didn't work. The connection failed, and the image window showed up with no content. This is rather disappointing; I was really expecting this library to work out of the box.
Those who are giving brickbats to SimpleCV might be right, but it seems extremely useful to me.
In our robotics courses we often build robots that pick up balls and the like. So far we have used either OpenCV or Matlab for image processing, but these generally turned out to be pathetic solutions, given OpenCV's steep learning curve.
I'm not well versed in the usage of OpenCV, but I just can't get past the fact that most CV applications will require a more thorough understanding of the subject (both mathematically and in software terms) in order to create something more than a 'nice toy'. (I'm waiting to be corrected here.)
This was the text in my Computer Vision course. I agree it did a good job and OpenCV was a lot of fun to work with. I would definitely pick it up again if I had interest in this field.
The site seems to have been thoroughly slashdotted ... er, fireballed ... er, hackernewsificated ... honestly, hasn't someone come up with a word yet for when HN does this?
[1]: http://www.robots.ox.ac.uk:5000/~vgg/hzbook/
[2]: http://dl.acm.org/citation.cfm?id=551277
[3]: http://robots.stanford.edu/papers/montemerlo.fastslam-tr.pdf
[4]: http://en.wikipedia.org/wiki/Viola%E2%80%93Jones_object_dete...
[5]: http://en.wikipedia.org/wiki/Scale-invariant_feature_transfo...
[6]: http://en.wikipedia.org/wiki/RANSAC