I've been interested in computer vision for a long time, but I hadn't had any free time to make progress until this holiday season. Over Christmas and New Year's I experimented with various methods in OpenCV to detect road signs and other objects of interest to OpenStreetMap. After some failed experiments with thresholding and feature detection, the excellent /r/computervision suggested using the dlib C++ library because it has more consistently good documentation and the pre-built tools are faster.
After a day or two of figuring out how to compile the examples, I finally made some progress:
- Clone `dlib` from GitHub to your local machine:

      git clone git@github.com:davisking/dlib.git

- Install the `libjpeg` dependency:

      brew install libjpeg

- As of this writing, `dlib` won't compile due to weirdness with the system-installed `libjpeg`, so the developer suggests modifying line 277 of `dlib/CMakeLists.txt` to look like this:

      if (JPEG_FOUND AND LIBJPEG_IS_GOOD AND NOT APPLE)

- Compile the example programs that come with `dlib` (one of which is the classifier training program):

      mkdir dlib/examples/build
      cd dlib/examples/build
      cmake ..
      cmake --build .

- You'll also want to compile the `imglab` tool so you can mark up images to tell the system what you're searching for:

      mkdir dlib/tools/imglab/build
      cd dlib/tools/imglab/build
      cmake ..
      cmake --build .
I used the imglab exe to make the file with the boxes. When I run the code to build the SVM file, it sometimes fails partway through. I tried changing the width and height to random values and it worked, but that will increase the chance of misclassifications. How do the bounding boxes affect the training process?
There's absolutely no error message. The last checkpoint is when it counts the number of images, and then it crashes.
So is there a certain aspect ratio that has to be maintained when drawing the bounding box over the object? On certain occasions the default window size of 80 x 80 does not seem to work unless it is changed to 50 x 50. What features should the boxes have in common: similar height, width, aspect ratio, area, etc.?
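One way to look into this is to list the width, height, aspect ratio, and area of every labelled box and see which ones stand out from the rest. The following is only a minimal sketch, not part of dlib. It assumes the imglab file contains `image` elements with a `file` attribute, each holding `box` elements with `width` and `height` attributes. The function names, the sample data, and the 50% tolerance are all made up for illustration.

```python
import xml.etree.ElementTree as ET
from statistics import median

# Hypothetical sample in the layout assumed above; replace with your own file contents.
SAMPLE = """<dataset>
<images>
  <image file='sign1.jpg'><box top='10' left='20' width='80' height='80'/></image>
  <image file='sign2.jpg'><box top='15' left='30' width='78' height='82'/></image>
  <image file='sign3.jpg'><box top='40' left='12' width='50' height='50'/></image>
  <image file='sign4.jpg'><box top='5' left='60' width='120' height='40'/></image>
</images>
</dataset>"""


def box_stats(xml_text):
    """Return one dict per box with its width, height, aspect ratio, and area."""
    root = ET.fromstring(xml_text)
    stats = []
    for image in root.iter("image"):
        for box in image.iter("box"):
            w = int(box.get("width"))
            h = int(box.get("height"))
            if w <= 0 or h <= 0:
                continue  # skip degenerate boxes instead of dividing by zero
            stats.append({
                "file": image.get("file"),
                "width": w,
                "height": h,
                "aspect": w / h,
                "area": w * h,
            })
    return stats


def aspect_outliers(stats, tolerance=0.5):
    """Return boxes whose aspect ratio is more than `tolerance` (relative) from the median."""
    if not stats:
        return []
    mid = median(s["aspect"] for s in stats)
    return [s for s in stats if abs(s["aspect"] - mid) / mid > tolerance]


if __name__ == "__main__":
    all_boxes = box_stats(SAMPLE)
    for s in all_boxes:
        print(s["file"], s["width"], s["height"], round(s["aspect"], 2), s["area"])
    print("aspect-ratio outliers:", [s["file"] for s in aspect_outliers(all_boxes)])
```

To run it on a real file, read the file into a string and pass it to `box_stats`. Boxes that show up as outliers, or that are much smaller than the others, are the first ones I would redraw or check before changing the window size.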