You can see it when you look out your window...

In which we make OpenCV play nice with Anaconda under Windows and roll libjpeg-turbo in along the way.

You may have noticed, if you have been reading my blog, that I like to prototype in Python.  I find Python to be almost as fun as Lisp when it comes to building up a solution without knowing exactly what the solution is going to look like.  The reason I typically start in Python is the wealth of libraries; however, managing a Python install can become a real pain.  Thanks to the people over at Continuum Analytics, you can be spared most of it.

Continuum Analytics produces a fantastic distribution known as Anaconda (or Miniconda depending on the route you choose).  I won't go into all the reasons I think you should give it a try, but if you program in Python, you should check them out.  The only shortcoming I have come across is that the Windows build does not include OpenCV, but we will remedy that today.  Along the way, I will also show you how to build OpenCV against libjpeg-turbo for better jpeg performance.

Things you will need:

  1. CMake: Used to generate makefiles/solutions to build OpenCV.
  2. Visual Studio: This article is geared towards VS, but you should be able to follow roughly the same steps for other compilers.
  3. OpenCV: The reason for this excursion.  I downloaded the self-extracting Windows package.  You can just grab the source if you like.
  4. libjpeg-turbo: An optimized cross-platform library for working with jpegs.
  5. Anaconda (or Miniconda): My (and soon to be yours) favorite Python distribution.


Caution:

Make sure your versions match, i.e., if you install 64-bit Anaconda, install the 64-bit version of libjpeg-turbo and build OpenCV for 64 bits.

If you use Miniconda and your default Python installation is Python 3 (as I have done), you will need to create a Python 2 environment.  Please see here.

To the Bat Cave!

  1. Run the self-extracting OpenCV archive if necessary, and extract OpenCV to a place you will remember, e.g. C:\OpenCV
  2. Run the installer for libjpeg-turbo
  3. If your default Python install is Python 3
    1. Right click on the cmake/bin directory while holding down Shift
    2. Click "Open command window here"
    3. Activate your Python 2 environment with
      activate ENV_NAME
  4. Launch cmake-gui (from the command prompt in step 3 if necessary)
  5. Set "Where is the source code:" to the path in step 1
  6. Set "Where to build the binaries:" to where you normally build libs, e.g. C:\Libs\OpenCV
  7. Click Configure and wait for it to complete
  8. Check the Advanced checkbox
  9. Locate the option BUILD_JPEG and uncheck it
  10. Click the Add Entry button and add
    • Name: JPEG_LIBRARY
    • Type: FILEPATH
    • Value: path to the static libjpeg-turbo64 library, e.g. C:/libjpeg-turbo64/lib/jpeg-static.lib
  11. Click the Add Entry button and add
    • Name: JPEG_INCLUDE_DIR
    • Type: PATH
    • Value: path to the libjpeg-turbo64 include directory, e.g. C:/libjpeg-turbo64/include
  12. If you see a message in the Configure output that says PYTHON_INCLUDE_DIR and/or PYTHON_LIBRARY are not found, you will need to set them by hand by clicking in the Value column and entering the path. On my system (Windows 8.1 Pro) these are C:/Users/lemoneer/Miniconda3/envs/python-2/include and C:/Users/lemoneer/Miniconda3/envs/python-2/libs/python27.lib respectively.
  13. Click Configure again and wait for it to complete
  14. Verify all is well in the output of Configure
  15. Click Generate
  16. Open the generated solution file in your build directory from step 6
  17. Change the build to Release
  18. Build the Solution
  19. Expand CMakeTargets in the Solution Explorer
  20. Right click on INSTALL
  21. Click Build
  22. Once the build is complete, update your PATH to include the OpenCV dlls
  23. You may Clean the Build at the solution level to remove intermediate files you do not need (this won't affect the install).
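
If you would rather drive CMake from the command line than cmake-gui, the configuration above can be sketched as a single invocation run from your build directory.  Treat this as untested shorthand; the generator name is an assumption and must match your Visual Studio version and bitness:

cmake -G "Visual Studio 11 Win64" ^
      -D BUILD_JPEG=OFF ^
      -D JPEG_LIBRARY=C:/libjpeg-turbo64/lib/jpeg-static.lib ^
      -D JPEG_INCLUDE_DIR=C:/libjpeg-turbo64/include ^
      C:\OpenCV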


It is worth noting that there are other build options worth considering, such as building the examples.  Enable/disable to your heart's content, then click Configure followed by Generate and Build.

If all goes well, you will have OpenCV dlls you can use from C/C++ as well as from your Python 2 environment.
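
One quick sanity check from your Python 2 environment: cv2.getBuildInformation() reports what the build was linked against, so you can confirm the JPEG entry points at your libjpeg-turbo static lib rather than OpenCV's bundled libjpeg.

import cv2

print cv2.__version__              # the version you just built
print cv2.getBuildInformation()    # look for the JPEG line in the output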

‘Cause I got something for you. It is shiny, it is clean.

In which we use OpenCV to track a laser.

Building on my article about using HSV thresholding to isolate features in video frames, below is a video showing the use of OpenCV to track a red laser.  The majority of the algorithm is the same as presented in the previous article; the only addition is actually determining the area of interest after thresholding.  To avoid spoiling anyone's fun, I am not going to post the code for now.

Hint: think about how to find the largest area in white after thresholding.

If you would like to compare algorithms or just want to know what I did, drop me a note via the contact form.

In preliminary testing, the algorithm seems robust enough to handle changing lighting conditions as well as differing background colors; however, the angle of reflection of the laser can cause issues.  For best results, the laser should be relatively perpendicular to the image plane.  As the laser approaches being parallel to the image plane, less laser light is reflected back to the camera.  The same applies to irregular objects or surfaces that scatter the laser at angles away from the camera.  This can be seen at the end of the video when the laser is reflected off the lamp shade.

Testing was done using a Logitech C920.  Depending on the laser used and ambient lighting conditions, turning off auto white balance may prevent the intensity of the laser from being interpreted as white.

As far as applications, I leave that to the reader.

I suggest you gentlemen invent a way to put a square peg in a round hole.

In which we optimize OpenCV on the BeagleBone Black.

If you have been paying attention to the BBB group on Google Groups, you may have discovered a lively thread on webcams [1]. As part of this thread, I have been working with Michael Darling to realize the best performance possible when using OpenCV to process an MJPG stream from a webcam on the BBB.  OpenCV relies on libjpeg when loading the MJPG frames into OpenCV Mats, and libjpeg is not the fastest of jpeg libraries.  However, there is another option: libjpeg-turbo [2].

While it is technically possible to compile OpenCV with libjpeg-turbo support on the BBB, you will have fewer issues and spend less time compiling if you use a cross compiler on a more powerful computer.  Michael has written a guide to cross compiling OpenCV.  Below you will find the guide as a webpage for online viewing or a pdf for download as well as the latest code to capture frames from a webcam and convert them to OpenCV Mats.  The guide is currently a draft, and we would appreciate any feedback you can provide.  Many thanks to Michael for taking the time to write this up.

Note: The code in the guide and the latest code are slightly different.

  1. -o is used to indicate which frames to convert to OpenCV Mats and requires an integer argument.
        -o 1 will convert every frame
        -o 2 will convert every 2nd (every other) frame, and so on.
        The default is 1.
  2. -p is similar to -o in the original framegrabber.  However, it doesn't actually output anything.  It just controls whether any frames are to be converted.
  3. Captured count and processed count variables have been renamed and moved to the top.
  4. Formatting has been corrected.
  5. Setting of the Mat size now uses the width and height variables.


The guide will be updated to reflect these changes in a future release, but in the meantime, you will need to adjust the command line arguments specified in the guide, namely replace -o with -p.

Guide [DRAFT]
    BBB_30fps.html
    BBB_30fps.pdf (204.32 kb)
Code
    framegrabberCV.c (19.77 kb)

[1] problems with webcams
[2] libjpeg-turbo

I get the news I need on the weather report

In which we publish a MJPEG stream from the BeagleBone Black.

Continuing with the series of posts on OpenCV, webcams, and MJPEG, today we will look at streaming an MJPEG capture from the BBB.  Before I get into it though, you should know that I did try FFMPEG/avconv and VLC to stream video from the BBB using RTP, but the several seconds of latency made it unsuitable for my needs.  You should also know that I do not claim that this is the one true way to stream video from the BBB.

The libraries used:

  • ZeroMQ[1] and CZMQ[2] - used to create pub/sub connections between the BBB and software running on a desktop
  • OpenCV[3] - used to display the MJPEG stream


The subscriber was compiled and tested under Windows 7 using Visual Studio 2012; however, the code should compile under Linux with very few, if any, modifications.

Background:

I am working on a project that requires that a video stream from the BBB be consumed in N places, where N is a minimum of 2.  The stream will be processed using OpenCV, and because of the nature of this project, I need as little latency in the video stream as possible.

Theory of Operation:

The BBB will capture frames in MJPEG format from a webcam via a modified version of framegrabber.  The modified version of framegrabber can run indefinitely and outputs the frames as a series of ZeroMQ messages over a publish socket.  The clients will subscribe to the publish socket on the BBB using ZeroMQ and load each frame received into OpenCV.

The ZeroMQ pub/sub configuration allows many clients to connect to the published stream.  No synchronization is used between the publisher and subscriber; the stream is treated as continuous, and the subscribers are free to connect and disconnect at will.
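
To make the subscriber side concrete, here is a minimal Python sketch along the lines of the clients linked below.  The address and port are placeholders for whatever your publisher binds to, and it assumes each ZeroMQ message holds exactly one jpeg frame:

import cv2
import numpy as np
import zmq

context = zmq.Context()
subscriber = context.socket(zmq.SUB)
subscriber.setsockopt(zmq.SUBSCRIBE, '')       # subscribe to every message
subscriber.connect('tcp://192.168.1.10:5563')  # placeholder address/port

while True:
    jpeg_bytes = subscriber.recv()             # one MJPEG frame per message
    frame = cv2.imdecode(np.fromstring(jpeg_bytes, np.uint8), 1)
    if frame is None:
        continue                               # skip frames that fail to decode
    cv2.imshow('stream', frame)
    if cv2.waitKey(1) == 27:                   # esc to quit
        break

cv2.destroyAllWindows()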

Results:

Single subscriber 640x480 - CPU use on the BBB ~4.3% and memory use ~0.8%

Multiple subscribers 640x480 - CPU use on the BBB ~6.6% and memory use ~0.8%

Single subscriber 1920x1080 - CPU use on the BBB ~23.2% and memory use ~3.5%

Using this setup, I have been able to stream frames with a resolution as high as 1920x1080 with little to no latency, but there is a limitation: the network.  When using this over WiFi with high resolutions or several clients running on one machine, I noticed the frame rate would drop the further I went from the router.  If you watch the output of the top command on the BBB as you get further from the router, you will see framegrabber's memory use begin to climb.  This is due to the publish socket buffering the data.  As you walk back towards the router, you will see the memory use drop until it, and the frame rate, stabilizes.  During this stabilization period, you will probably experience delayed video that is displayed at a higher frame rate than normal as the buffer is flushed.

There are several things you can do to reduce or eliminate this latency.

  1. If possible, use a wired connection
  2. Use an 802.11 N router and clients
  3. Make sure your WiFi router is optimally located
  4. Adjust the QoS settings of your router to give higher or highest priority to the traffic on the port you publish over


To reduce the amount of time it takes for subscribers to catch up once their connection has improved, the high water mark on the socket can be reduced.  This has the effect of dropping frames once too many are buffered and essentially reduces the amount of buffered data a subscriber has to process to get in sync.
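
In pyzmq terms, that is a one-line change on the subscriber sketched above, made before connecting.  The value here is a guess; tune it to your frame size and your tolerance for dropped frames:

subscriber.setsockopt(zmq.RCVHWM, 2)           # keep at most ~2 frames queued (older ZeroMQ versions use zmq.HWM)
subscriber.connect('tcp://192.168.1.10:5563')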

The reader may find it interesting that it does not matter whether the BBB publisher or the subscribers are started first.  The publisher will simply dump data until at least one subscriber connects, and the subscribers will wait on the publisher.  In addition, you can kill the publisher while subscribers are connected, restart it with new (or the same) settings, and the subscribers will continue on.  You can verify this by changing the resolution after the subscriber or subscribers have connected.

Code:

framegrabberPub.c (17.96 kb) Publisher - you will need zhelpers.h

compile with

gcc framegrabberPub.c -lzmq -o framegrabberPub 

framegrabberSub.c (3.79 kb) C client

[BONUS]
framegrabberSub.py (2.52 kb) Python client

The Python client will display the stream with little latency until garbage collection occurs.  When this happens, the display will freeze and the buffered data on the BBB will increase.  Once garbage collection completes, the display will eventually synchronize much like the WiFi issue detailed above.

[UPDATE]
If your wireless router is capable of broadcasting at both 2.4 and 5 GHz at the same time, you can improve performance when using a WiFi connection for both the publisher and the subscriber by having one connect at 2.4 GHz and the other connect at 5 GHz.

[UPDATE]
Added a link to zhelpers.h needed to compile the publisher.

[1] http://zeromq.org/
[2] http://czmq.zeromq.org/
[3] http://opencv.org/

I said, "Do you speak-a my language?"

In which we learn how to turn a buffer of bytes into an image OpenCV can work with.

If you followed my previous post, you may have realized you can produce, with very few modifications, a stream of MJPEG (jpeg) images captured from the webcam. You may also be left wondering how to work with these images using OpenCV. It might actually be easier than you imagine.

The little bit of Python code below will display a single image. The code loads an image from a file into a buffer, converts the buffer into an OpenCV image, and displays the image. If you are receiving the image as a buffer already, you can forgo the loading step.

import cv2
from cv2 import cv
import numpy as np

# let's load a buffer from a jpeg file.
# this could come from the output of frameGrabber
with open("IMAGE.jpg", mode='rb') as file:
    fileContent = file.read()

# convert the binary buffer to a numpy array of unsigned bytes
# this is a requirement of the OpenCV Python binding
pic = np.fromstring(fileContent, np.uint8)

# here is the real magic
# OpenCV can actually decode a wide variety of image formats
img = cv2.imdecode(pic, cv.CV_LOAD_IMAGE_COLOR)

cv2.namedWindow("image")
while True:
    cv2.imshow("image", img)
    
    key = cv2.waitKey(20)
    if key == 27: # exit on ESC        
        break
    
cv2.destroyAllWindows()

For those of you working in C, you can convert a buffer into an image with the following (tested in Visual Studio 2012):

#include <stdio.h>
#include <malloc.h>
#include "opencv2\core\core_c.h"
#include "opencv2\highgui\highgui_c.h"

// used to load a jpeg image into a buffer
int load_buffer(const char *filename, char **result) 
{ 
	int size = 0;
	FILE *f = fopen(filename, "rb");
	if (f == NULL) 
	{ 
		*result = NULL;
		return -1; // -1 means file opening fail 
	} 
	fseek(f, 0, SEEK_END);
	size = ftell(f);
	fseek(f, 0, SEEK_SET);
	*result = (char *)malloc(size+1);
	if (size != fread(*result, sizeof(char), size, f)) 
	{ 
		free(*result);
		fclose(f);
		return -2; // -2 means file reading fail 
	} 
	fclose(f);
	(*result)[size] = 0;
	return size;
}

int main() 
{ 
	char *buffer; 
	int size;
	CvMat mat;
	IplImage *img;

	// load jpeg file into buffer
	// this could come from the output of frameGrabber
	size = load_buffer("IMG.jpg", &buffer);
	if (size < 0) 
	{ 
		puts("Error loading file");
		return 1;
	}

	// wrap the raw jpeg bytes in a 1 x size single-channel matrix
	// so cvDecodeImage knows exactly how many bytes are valid
	mat = cvMat(1, size, CV_8UC1, (void*)buffer);
	// magic sauce, decode the image
	img = cvDecodeImage(&mat, 1);

	// show the image
	cvShowImage("image", img );

	// wait for a key
	cvWaitKey(0);

	// release the image and free the file buffer
	cvReleaseImage(&img);
	free(buffer);


	return 0;
}

For further information see the OpenCV Docs.

OpenCVjpeg.py (2.24 kb)
OpenCVjpeg.c (2.78 kb)

you gotta keep 'em separated

In which we threshold video frames using HSV values.

In my previous post, I provided a tool that allows one to determine the H, S, and V values of a pixel or pixels in an image for use in setting a threshold on an image or video frame.  Today, I will show you what these values are good for, namely setting a threshold on frames in a video or webcam stream.

You may download the Python code here: HSVthresholder.
The code is copyright © 2013 Matthew Witherwax and released under the BSD license.

To use the application, simply supply a command line argument for either a video file or a webcam using -v.  In the case of a webcam, specify dev# where # is the number of the device.  If you only have one webcam, you should be able to pass dev0; otherwise, you may start at dev1 and increment until you find the webcam you would like to use.  In the case of a video file, the application will loop the video indefinitely until you exit with Esc.  For other options, pass -h on the command line.
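
For example (the video file name is just an illustration):

python HSVthresholder.py -v dev0           # threshold the first webcam
python HSVthresholder.py -v capture.avi    # loop a video file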

The general flow of the program is:

  1. Capture frame from video or webcam
  2. Dilate image
  3. Convert to HSV
  4. Threshold image
  5. Erode image
  6. Display image


You can find more information about dilation and erosion here [1] and the function inRange() used for thresholding here [2].
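
Putting the flow together, here is a stripped-down sketch that assumes a webcam at dev0 and uses hypothetical HSV bounds; the structuring elements and iteration counts are the same ones used in the snippets below:

import cv2
import numpy as np

capture = cv2.VideoCapture(0)                    # 1. capture from webcam dev0
lower = np.array([170, 100, 100], np.uint8)      # hypothetical HSV bounds
upper = np.array([179, 255, 255], np.uint8)      # tune these for your target color

while True:
    grabbed, frame = capture.read()
    if not grabbed:
        break
    element = cv2.getStructuringElement(cv2.MORPH_CROSS, (5, 5))
    frame = cv2.dilate(frame, element, iterations=5)       # 2. dilate
    frameHSV = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)      # 3. convert to HSV
    frameHSV = cv2.inRange(frameHSV, lower, upper)         # 4. threshold
    element = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
    frameHSV = cv2.erode(frameHSV, element, iterations=2)  # 5. erode
    cv2.imshow('threshold', frameHSV)                      # 6. display
    if cv2.waitKey(20) == 27:                              # esc to exit
        break

capture.release()
cv2.destroyAllWindows()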

All the interesting bits in the code are documented with comments, but I would like to draw attention to two operations with parameters you may wish to tune.

In both dilating and erosion, a structuring element is supplied.  Possible values to adjust are the element itself, which may be one of cv2.MORPH_RECT, cv2.MORPH_CROSS, or cv2.MORPH_ELLIPSE, as well as the size, the second argument to cv2.getStructuringElement.  In addition, you may adjust the number of times the operation is applied by adjusting the iterations parameter.  While tuning these parameters, you might find it beneficial to create a third named window to display the frame after the dilation so you can see what effects your changes have.

Dilating

# here we dilate the image so we can better threshold the colors
if self.dilate:
	element = cv2.getStructuringElement(cv2.MORPH_CROSS,(5,5))
	frame = cv2.dilate(frame, element, iterations=5)


Erosion

# here we erode the image to remove noise
if self.erode:
	element = cv2.getStructuringElement(cv2.MORPH_RECT,(5,5))
	frameHSV = cv2.erode(frameHSV, element, iterations=2)

And here are some screenshots that show the application isolating the red grip on a pen.  The red grip shows up as white while the rest of the image is blacked out.  If you look closely, you will see some stray white dots.  Depending on your needs, you may either ignore these or tune the parameters until they disappear.  Now that the grip has been isolated, you can track it, but that is a topic for another post.

[1] http://docs.opencv.org/doc/tutorials/imgproc/erosion_dilatation/erosion_dilatation.html
[2] http://docs.opencv.org/modules/core/doc/operations_on_arrays.html#cv2.inRange

HSVthresholder.py (6.38 kb)

Was that Imperial Red, Lust, or Crimson? ...I am pretty sure it was just red.

In which we find the H, S, and V values for an image's pixels.

Recently I have been working on an application that requires me to locate a small colored dot in an image or video frame.  In order to accomplish this, I have been using OpenCV.  For reasons outside the scope of this post, I am working with the images/frames in the HSV color space.  You can find more information about HSV in general and OpenCV's HSV implementation here [1].  There you will also find a nifty tool - ColorWheelHSV - to help you visualize OpenCV's HSV implementation.  However, if you are like me, you might find it difficult to determine proper H, S, and V values by eyeballing your target and the output of ColorWheelHSV.

Enter HSVpixelpicker.

The Python script below will allow you to determine the H, S, and V values of a given pixel in an image as OpenCV sees it.  Simply pass the image file name with -f on the command line, click on the pixel you are interested in, and the H, S, and V values will be printed to the console.  Using this tool, you can check the exact value of the pixel or pixels you are interested in.  The more samples you have to test, the better the idea you will get of the H, S, and V values you are dealing with.  You can then refer to ColorWheelHSV to select your ranges.

The code is copyright © 2013 Matthew Witherwax and released under the BSD license.

from optparse import OptionParser
import cv2
from cv2 import cv

class HSVpixelpicker:
    def __init__(self):
        # original image
        self.image = None
        # image converted to HSV
        self.imageHSV = None
        
        # create window to show image
        cv2.namedWindow('image')
        # wire click handler
        cv.SetMouseCallback('image', self.on_mouse, 0)
        
    # handles left clicking on the image
    # gets the pixel under the cursor and prints its HSV values
    def on_mouse(self, event, x, y, flag, param):
        if event == cv.CV_EVENT_LBUTTONDOWN:
            pixel = self.imageHSV[y][x]
            print 'H:', pixel[0], '\tS:', pixel[1], '\tV:', pixel[2]
        
    def open_image(self, filename):
        self.image = cv2.imread(filename)
        self.imageHSV = cv2.cvtColor(self.image,cv2.COLOR_BGR2HSV)
        
    def show(self, filename):
        self.open_image(filename)
        while True:
            cv2.imshow('image', self.image)
            
            # show image until user presses the esc key
            if cv.WaitKey(10) == 27:
                break
            
        # clean up
        cv2.destroyAllWindows()

if __name__ == "__main__":
    parser = OptionParser()
    parser.add_option('-f', '--file', action='store',
        type='string', dest='filename')
    (options, args) = parser.parse_args()

    hsv_picker = HSVpixelpicker()
    hsv_picker.show(options.filename)

[1] http://www.shervinemami.info/colorConversion.html

HSVpixelpicker.py (3.03 kb)