Object Tracking


The camera shows the tracking of different vehicles [1].

Image showing a number of cars and trucks on a highway. The object detection software is able to detect the cars and identify the type (car, truck) by putting a rectangle around it and labelling it.

Computer Vision and OpenCV

Introduction to Computer Vision and OpenCV

Computer vision is used to detect objects and measure their physical characteristics. By examining pixels and their colour intensities, a series of filters can pick out the edges, shapes, and regions that identify an object, loosely mirroring how humans see. The challenge lies in recognizing complex objects such as a horse, rather than simple shapes. The image above shows the output of an object detection program.

An image is a grid (array) of pixels, each storing an intensity value. A typical RGB photo has three layers (channels): red, green, and blue. Each channel has 256 levels, where 0 is no intensity and 255 is full intensity. In a greyscale image, 0 is black, 255 is white, and the numbers in between are shades of grey. The 256 levels are equivalent to 2^8; hence, it is called an 8-bit image [2].
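The pixel layout described above can be sketched with a NumPy array, which is also how OpenCV represents images in Python. This is a minimal illustration only; the 2x2 size and the variable names are arbitrary choices, not part of the golf project.

```python
import numpy as np

# A tiny 2x2 RGB image: shape is (height, width, channels), 8 bits per channel.
image = np.zeros((2, 2, 3), dtype=np.uint8)
image[0, 0] = (255, 0, 0)      # pure red: full red, no green or blue
image[0, 1] = (255, 255, 255)  # white: all three channels at maximum
image[1, 0] = (128, 128, 128)  # mid grey: equal values in every channel
# image[1, 1] is left at (0, 0, 0), which is black

levels = 2 ** 8  # number of intensity levels in one 8-bit channel
print(image.shape, image.dtype, levels)
```

Each channel value is stored in one byte, which is why the valid range is 0 to 255.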

One of the most common libraries for computer vision programmers is OpenCV. It is open source, written mainly in C++, and can be used from C++ or Python. OpenCV provides functions for resizing, drawing shapes, warping perspective, colour and shape detection, and more. Since it is a library and not a standalone program, it must be used from a development environment that supports Python or C++ (e.g. Microsoft Visual Studio, PyCharm). OpenCV, along with any other libraries the code relies on, must be imported or included at the start of the code. OpenCV can process photos, videos, and live video feeds; each is accessed with different functions.

For the golf project, our team used OpenCV to track the golf ball across a number of frames from the video feed. We then used the tracked positions to calculate the spin speed, velocity, and spin angle. More information can be found here: Golf Project.


Installation

To install OpenCV, go to the OpenCV Open Source page and download the most recent release for your device. More information on installation can be found on the website. The OpenCV build will need to be linked to the development environment that you are using; the Object Tracking: C++ page has more information on this.

Alternatively, MATLAB can be used, in which case the corresponding extension will need to be installed. The Object Tracking: MATLAB sub-page goes into more detail on the syntax for the different programs and what to look out for.


Filters

Filters are a good way to narrow the search for an object. An example is binarizing the image of the golf ball: the image is filtered so that pixels within a certain range of values are shown as black and everything else as white, producing a black and white image. OpenCV provides functions for these operations.

Some common functions for editing a photo include conversion to black and white and Canny edge detection. There are also functions to dilate an image, which increases the thickness of the lines, and to erode it, which decreases the thickness (for white lines on a black background). For instance, if there is a partial outline around a circle, then dilating the image would make the partial outline thicker and possibly connect the lines, creating a complete circle. This enables better object identification.

As well, the threshold function turns the image into black and white (two values only, such as 0 and 1, or 0 and 255 in an 8-bit image), which makes object recognition easier.

RGB vs HSV

RGB stands for Red Green Blue, and HSV stands for Hue Saturation Value. These are two different colour spaces. There are other spaces such as HSL, but these are the two main spaces for computer vision purposes. HSV separates the colour itself (hue) from its brightness (value), so it works well when you are looking for a range of shades of a colour rather than one exact RGB value.

Image of binarization of an apple. There are two images of an apple in this photo. One is the original image in black and white and the other is the thresholded image, which is black and grey (two solid colours with no shading).

The threshold function can be used on an image of an apple to focus on specific attributes [3].



Shapes and Text

Shapes and text have drawing functions as well. The key arguments are the image to draw on (img), the points (centre, vertices) relative to the image coordinate system, the colour (a scalar with one 0-255 value per channel, three channels for a colour image), and the thickness.


Object Tracking

Introduction to Detection and Tracking

It is usually more efficient to detect an object once and then track it than to detect it in every frame. Tracking uses information from the previous frames (e.g. direction, speed) to predict where the object will be next. For a person walking in a straight line, each step forward would be predicted to continue in the same direction. Detection alone can run into problems, such as when an object leaves the frame or is briefly covered by another object (e.g. tracking a person in a video of a crowded mall). Tracking must therefore leave room for handling some level of occlusion (disruption).
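The idea of predicting from the previous frames can be sketched in a few lines of plain Python. This is a simplified constant-velocity model, not the method used in the golf project; the function names and the use of None for a missed detection are assumptions made for illustration.

```python
def predict_next(previous, current):
    """Predict the next (x, y) position assuming constant velocity between frames."""
    vx = current[0] - previous[0]
    vy = current[1] - previous[1]
    return (current[0] + vx, current[1] + vy)


def track(detections):
    """Follow one object; use the prediction when a detection is missing (None)."""
    path = list(detections[:2])  # the first two frames must contain detections
    for detection in detections[2:]:
        predicted = predict_next(path[-2], path[-1])
        path.append(detection if detection is not None else predicted)
    return path


# A person walking right at 5 px/frame, hidden behind another object for two frames.
observed = [(0, 10), (5, 10), (10, 10), None, None, (25, 10)]
print(track(observed))
```

When the object is hidden, the tracker coasts on its prediction and picks up the real detection again once it reappears.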

Object tracking is key for the golf ball project. The goal of object tracking is to locate and follow an object over time, across successive frames of a video. A good tracker should be fast, accurate, and robust when in use.

There are different types of tracking filters that can be used for different situations. Their libraries will need to be installed or downloaded for the development environment being used.

Kalman Filter

The Kalman filter is a signal processing algorithm used to predict the location of a moving object based on prior motion information (e.g. the Apollo missions to the Moon) [4]. It is only suited to linear systems, as problems occur for non-linear systems [4].
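A minimal sketch of a linear Kalman filter in NumPy is shown below, tracking the position and velocity of an object moving along one axis. The time step, noise covariances, and simulated measurements are all assumptions chosen for illustration, and the variable names follow common textbook notation rather than any particular library.

```python
import numpy as np

dt = 1.0                                 # time between frames (assumed)
F = np.array([[1.0, dt], [0.0, 1.0]])    # state transition: constant velocity
H = np.array([[1.0, 0.0]])               # only the position is measured
Q = 1e-4 * np.eye(2)                     # process noise covariance (assumed)
R = np.array([[0.25]])                   # measurement noise covariance (assumed)

x = np.array([[0.0], [0.0]])             # initial state: position, velocity
P = 10.0 * np.eye(2)                     # large initial uncertainty

# Simulated noisy position measurements of an object moving 2 units per frame.
rng = np.random.default_rng(0)
true_positions = 2.0 * np.arange(1, 31)
measurements = true_positions + rng.normal(0.0, 0.5, size=true_positions.size)

for z in measurements:
    # Predict: move the state forward one frame and grow the uncertainty.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update: correct the prediction using the new measurement.
    y = np.array([[z]]) - H @ x          # innovation (measurement residual)
    S = H @ P @ H.T + R                  # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    x = x + K @ y
    P = (np.eye(2) - K @ H) @ P

estimated_position, estimated_velocity = x[0, 0], x[1, 0]
print(round(estimated_position, 2), round(estimated_velocity, 2))
```

Although only the position is measured, the filter also recovers the velocity, which is what lets it predict where the object will be in the next frame.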

Unscented Kalman Filter

The challenge with many filters is that they usually depend on a linear path. In reality, many situations do not follow a linear path. The Unscented Kalman Filter takes non-linear motion into account when predicting future paths [4].
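The core step of the Unscented Kalman Filter is the unscented transform: a small set of sample points (sigma points) is pushed through the non-linear function, and the mean and covariance are rebuilt from the results. The sketch below shows only that step, not a full filter. The kappa value, the polar-to-Cartesian example function, and all names are assumptions for illustration.

```python
import numpy as np


def unscented_transform(mean, cov, func, kappa=1.0):
    """Propagate a Gaussian (mean, cov) through func using 2n+1 sigma points."""
    n = mean.size
    # Columns of the matrix square root set how far the sigma points spread.
    sqrt_cov = np.linalg.cholesky((n + kappa) * cov)
    sigma_points = [mean]
    for i in range(n):
        sigma_points.append(mean + sqrt_cov[:, i])
        sigma_points.append(mean - sqrt_cov[:, i])
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    weights[0] = kappa / (n + kappa)

    # Push every sigma point through the function, then recombine.
    transformed = np.array([func(p) for p in sigma_points])
    new_mean = weights @ transformed
    diffs = transformed - new_mean
    new_cov = (weights[:, None] * diffs).T @ diffs
    return new_mean, new_cov


def polar_to_cartesian(state):
    """Non-linear function: (range, bearing) to (x, y)."""
    r, theta = state
    return np.array([r * np.cos(theta), r * np.sin(theta)])


mean = np.array([10.0, np.pi / 4])
cov = np.diag([0.5 ** 2, 0.1 ** 2])
xy_mean, xy_cov = unscented_transform(mean, cov, polar_to_cartesian)
print(xy_mean)
print(xy_cov)
```

For a linear function this transform reproduces the exact mean and covariance, so it agrees with the standard Kalman filter in that case and extends it to non-linear ones.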


References

[1] F. Shaikh, “Building an Object Detection Model from Scratch in Python,” Analytics Vidhya, 24-May-2020. [Online]. Available: https://www.analyticsvidhya.com/blog/2018/06/understanding-building-object-detection-model-python/. [Accessed: 28-Jan-2021].

[2] “The Computer Vision Pipeline, Part 2: input images,” Manning, 18-Aug-2019. [Online]. Available: https://freecontent.manning.com/the-computer-vision-pipeline-part-2-input-images/. [Accessed: 11-Feb-2021].

[3] “Basic Thresholding Operations,” OpenCV. [Online]. Available: https://docs.opencv.org/3.4/db/d8e/tutorial_threshold.html. [Accessed: 13-Apr-2021].

[4] “Learning the Unscented Kalman Filter,” MATLAB Central File Exchange. [Online]. Available: https://www.mathworks.com/matlabcentral/fileexchange/18217-learning-the-unscented-kalman-filter. [Accessed: 13-Apr-2021].

Contributors:

Mayurakhi Khan (last update: 1118 days ago)