Object Detection and Counting

Object detection, and recognition in particular, can be done with different techniques, for example a combination of OpenCV functions. For me, it was more interesting to build a quick model in R than to spend weeks writing long C++ or .NET code for it. I started with a people counter as a practical application of object detection and took some footage of people passing by the office.

The first step is to extract images from the video using FFmpeg. Then choose a background image and build a matrix of differences between an image with an object in it and the background. As can be found on my blog, I have created an R library for raster image processing and vectorization, fasteraster, which can be used for object detection. The idea was to vectorize the matrix of differences into gradient-detected zones.
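
The background-difference step can be tried on a made-up matrix before touching real frames. This is only a sketch: the 8x6 grey "background", the bright 3x3 "object" and the variable names are invented for illustration, and the 0.001 level is borrowed from the value the source code passes to raster2vector.

```r
# Made-up data: a flat grey background and a frame with a bright 3x3 object on it
back <- matrix(0.5, nrow = 8, ncol = 6)
frame <- back
frame[3:5, 2:4] <- 0.9

# Matrix of squared differences between the frame and the background
difference <- (frame - back)^2

# Cells that changed by more than the chosen level belong to the object
mask <- difference > 0.001
print(sum(mask))                      # number of cells covered by the object
print(which(mask, arr.ind = TRUE))    # their row/column positions
```

In the real model the difference matrix is not thresholded by hand but passed to fasteraster, which turns the zones above the level into polygons.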

This picture shows variant #1, where the pictures were represented as matrices of plain RGB values. Comparing them with the background gave a strong detection of the person’s shadow (see the source code):

[Image: object detection - shadows]

So the next idea (variant #2 in the code) was to calculate ratios between colors (Red / Green, Green / Blue) and then compare them to the background. This removed the shadow detection but introduced another issue: a lot of detections in dark areas, probably caused by the poor color detection capabilities of the CMOS camera:

[Image: object detection - dark areas]

Then I decided to subtract the colors (Red – Green, Green – Blue) and it worked just fine. I also added filtering of the detected zones by weight and showed them on the video. As you can see, there is another problem: when a black object moves across a black background, it is split into two or three parts.
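
The three color comparisons tried so far can be put side by side on a few hand-made pixels. Everything in this sketch is an assumption made for illustration: the pixel values, the shadow modelled as the background dimmed to 70%, the sensor noise on a dark pixel, and the single 0.001 level applied to all three variants.

```r
# Three ways to describe a pixel given as c(red, green, blue)
features <- list(
  rgb   = function(p) p,                                # variant #1
  ratio = function(p) c(p[1] / p[2], p[2] / p[3]),      # variant #2
  diff  = function(p) c(p[1] - p[2], p[2] - p[3])       # final variant
)

# Squared distance between a pixel and the background in feature space
score <- function(f, pixel, back) sum((f(pixel) - f(back))^2)

lit  <- c(0.50, 0.50, 0.45)   # greyish pavement (made-up values)
dark <- c(0.02, 0.02, 0.02)   # dark corner of the frame (made-up values)

cases <- list(
  shadow     = list(pixel = 0.7 * lit,           back = lit),
  person     = list(pixel = c(0.20, 0.30, 0.60), back = lit),
  dark_noise = list(pixel = c(0.03, 0.02, 0.01), back = dark)
)

# TRUE means "detected as an object"
detected <- sapply(features, function(f)
  sapply(cases, function(k) score(f, k$pixel, k$back) > 0.001))
print(detected)
```

With these numbers the plain RGB comparison reports the shadow, the ratios report the noisy dark pixel, and the color differences report only the person, which matches the behaviour described above.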

In this case, I just added code to join the areas and calculate the new center of the joined object. I also added a track line and two green margins to detect that an object has passed both of them in the same direction.
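
Joining the parts comes down to a weighted average of their centers. The sketch below uses a made-up zone matrix; the column layout (id, weight, x, y) is inferred from how the source code indexes the result of rasterZoneAnalyzer, so treat it as an assumption.

```r
# Two fragments of one object: id, weight (area), center x, center y
zone <- rbind(c(1, 30, 40, 20),
              c(2, 10, 44, 32))

w  <- zone[, 2]
cx <- sum(w * zone[, 3]) / sum(w)   # weighted mean of the x centers
cy <- sum(w * zone[, 4]) / sum(w)   # weighted mean of the y centers
print(c(x = cx, y = cy))
```

The heavier fragment pulls the joined center towards itself, so a small detached piece does not move the track much.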

As one can see, the model itself took a page of code, and most of it was for the visualization. However, the following items were not included in the model:

  • background image – it has to adjust to the weather, time of day and other conditions (like somebody leaving a bag in the middle of the observation area).
  • counting the objects – simply check that the vectors crossed the green margins.
  • multiple object detection – needs an identification algorithm based on path approximation.
  • joined object recognition – needs clustering of shape medians to split the joined area into smaller ones by average weight and path approximation.
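
The counting item can be sketched as a small state machine over the x coordinates of the track. This is not part of the model: the function name count_passes is hypothetical, and only the margin positions 30 and 70 are taken from the source code.

```r
# Count how many times a track crosses both margins in the same direction.
# x is the vector of object center x positions, one per frame (NA = no object).
count_passes <- function(x, left = 30, right = 70) {
  x <- x[!is.na(x)]
  passes <- 0
  pending <- 0   # +1: entered moving right, -1: entered moving left, 0: nothing yet
  for (i in seq_along(x)[-1]) {
    a <- x[i - 1]
    b <- x[i]
    if (a < left  && b >= left)  pending <- 1
    if (a > right && b <= right) pending <- -1
    if (a < right && b >= right && pending == 1) {
      passes <- passes + 1
      pending <- 0
    }
    if (a > left && b <= left && pending == -1) {
      passes <- passes + 1
      pending <- 0
    }
  }
  passes
}

print(count_passes(seq(10, 90, by = 5)))     # walks left to right
print(count_passes(seq(90, 10, by = -5)))    # walks right to left
print(count_passes(c(10, 40, 50, 40, 10)))   # enters and turns back
```

A person who enters the zone and turns back crosses the same margin twice in opposite directions, so no pass is counted.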

Object detection R source code

library(png);         #readPNG
library(raster);      #raster, aggregate
library(fasteraster); #raster2vector, rasterZoneAnalyzer

#working resolution and the range of frames to process
X <- 48 * 2;
Y <- 27 * 2;
from <- 140;
to <- 200;

#read a frame, downscale it and return the matrices used for comparison
matrixFromFrame <- function(idx)
{
  v <- readPNG(sprintf("in/%03d.png", idx));
  rgb <- lapply(1:3, function(x) as.matrix(aggregate(raster(v[ , , x]), fact = 5)));
  rgb <- lapply(rgb, function(x) t(x)[1:X, Y:1]);
  #1 return(rgb);
  #2 return(list(rgb[[1]] / rgb[[2]], rgb[[2]] / rgb[[3]]));
  return(list(rgb[[1]] - rgb[[2]], rgb[[2]] - rgb[[3]], (rgb[[1]] + rgb[[2]] + rgb[[3]]) / 3));
}

#detect the object on a frame, draw it and update the track
processFrame <- function(idx, back)
{
#  png(file = sprintf("out/final%03d.png", idx), width = 640, height = 480);
  rggb <- matrixFromFrame(idx);
  #squared difference from the background in both color channels
  diff <- (rggb[[1]] - back[[1]]) ^ 2 + (rggb[[2]] - back[[2]]) ^ 2;
  pol <- raster2vector(diff, 0.001, 100, 100);
  plot(0, type = "l", xlim = c(1, X), ylim = c(1, Y));
  rasterImage(readPNG(sprintf("in/%03d.png", idx), native = TRUE), 1, 1, X, Y);
  #the two green margins
  abline(v = 30, col = 'green');
  abline(v = 70, col = 'green');
  lapply(pol, function(x) lines(rbind(x, x[1,]), col = 'blue'));
  zone <- rasterZoneAnalyzer(diff, 0.001, 100, 100);
  #drop zones that are too light
  zone <- zone[zone[ , 2] > 10, , drop = FALSE];
  #text(zone[ , 3], zone[ , 4], labels = zone[ , 2], col = 'red');
  #join the zones: weighted center of the joined object
  track[[idx - from + 1, 1]] <<- sum(zone[, 2] * zone[, 3]) / sum(zone[, 2]);
  track[[idx - from + 1, 2]] <<- sum(zone[, 2] * zone[, 4]) / sum(zone[, 2]);
  lines(track, col = 'red');
  points(track, col = 'red', pch = 20);
#  dev.off();
}

track <- matrix(nrow = to - from + 1, ncol = 2);
back <- matrixFromFrame(100);
lapply(from:to, function(x) processFrame(x, back));

Andy Bosyi,
Information Technology & Data Science
Linkedin: http://www.linkedin.com/in/andybosyi

The Natural Ear for Digital Sound Processing – as an alternative to the Fourier Transformation

This is a primitive prototype of the natural ear. How I came to it, and how it can be better than the Fast Fourier Transformation (FFT) in Digital Sound Processing (DSP), is what this article is about.

Some of the software development projects I was involved in used Fourier Transformation for waveform analysis. The projects included sound tone recognition for gun targets and DTMF signals. But before that, I was keen to get a “picture” of human speech and musical harmony. Recently I started an app that will play an instrument while I am playing lead guitar. The problem was to teach the computer to listen to my tempo and keep the musical rhythm in order. To accomplish this I applied Fourier Transformation to the first seconds of the Pink Floyd composition “Marooned”. Then I compared the “picture” to the same composition performed by me, and the results were poor until I increased the FFT block size to as much as 8192 to recognize notes at least up to the 6th octave.

This showed the first problem with Fourier Transformation: for a really good analysis you need to increase the block size (and with it the number of frequency bins) and, as a result, performance goes down, especially for real-time processing.
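
The trade-off can be put into numbers. The sketch below assumes a 44100 Hz sample rate (the value used in the R code at the end of the article) and equal temperament; the block sizes other than 8192 and the variable names are chosen only for comparison.

```r
rate  <- 44100                        # assumed sample rate, Hz
block <- c(1024, 2048, 4096, 8192)    # FFT block sizes to compare

bin_width  <- rate / block            # frequency resolution, Hz per bin
latency_ms <- 1000 * block / rate     # audio needed to fill one block, ms

# Distance from a note of frequency f to the next semitone up (equal temperament)
semitone_gap <- function(f) f * (2^(1 / 12) - 1)

print(data.frame(block,
                 bin_width  = round(bin_width, 2),
                 latency_ms = round(latency_ms, 1)))
print(round(semitone_gap(c(A2 = 110, A3 = 220, A4 = 440)), 2))
```

At 8192 samples a bin is about 5.4 Hz wide, narrower than the roughly 6.5 Hz between A2 and the next semitone, while at 4096 samples a bin (about 10.8 Hz) is wider than that gap. The price is that every block needs about 186 ms of audio before it can be analysed.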

The second problem of Fourier Transformation analysis for music is that the same instrument, depending on the timbre, can generate a different set of overtones. These overtone frequencies, analyzed by FFT, created peaks that were irrelevant to what we actually hear. To generalize the result I summed the frequency bins into twelve semitones. The picture was better, but now the very first note was recognized as C, while it was in fact B.
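
Summing the bins into twelve semitones can be sketched as follows. The mapping assumes equal temperament with A4 = 440 Hz; the helper name pitch_class, the example note (B2 at about 123.47 Hz) and the magnitudes are made up for illustration and are not taken from the recording.

```r
note_names <- c("C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B")

# Name of the semitone closest to frequency f, ignoring the octave
pitch_class <- function(f, a4 = 440) {
  semis <- round(12 * log2(f / a4))       # whole semitones away from A4
  note_names[((semis + 9) %% 12) + 1]     # A is 9 semitones above C
}

# A note with four overtones: fundamental B2 and its multiples
harmonics <- 123.47 * 1:5
classes <- pitch_class(harmonics)
print(classes)

# Made-up magnitudes for the five peaks, summed per semitone
mags <- c(1.0, 0.6, 0.8, 0.3, 0.5)
folded <- tapply(mags, classes, sum)
print(folded)
```

The octave multiples of the fundamental add up in the B bin, but the third and fifth harmonics land in F# and D#, so a single note still leaves energy in several semitone bins.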

This forced me to read more about the nature of sound, hearing and the human ear. I thought that the cause might be a third problem with Fourier Transformation: it is sensitive to the signal phase. The human ear does not recognize the phase of individual harmonics, only their frequencies.

I created a model in the R language (you can find the code at the end of the article) that generates input signals for a set of frequencies.

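Then I used some formulas I had put together fifteen years ago (back then the same experiment failed because of poor PC performance) to create a model of a pendulum. The object receives an incoming signal and oscillates if the signal contains a frequency equal to its own.

With w as the auto-oscillation frequency of the pendulum and 44100 as the sample rate, the period in samples and the coefficient of elasticity, as implemented in the R code at the end of the article, are:

$$T = \frac{44100}{w}, \qquad K = \left(\frac{2\pi}{T}\right)^2$$

The fading coefficient that does not depend on the auto-oscillation frequency of the pendulum:

$$\Phi = \frac{2\arctan(T)}{\pi}$$

The position of the pendulum, where $s_n$ is the input sample:

$$x_n = x_{n-1} + \left(v_{n-1} + s_n - s_{n-1} - K x_{n-1}\right)\Phi$$

Velocity and energy:

$$v_n = x_n - x_{n-1}, \qquad E_n = \frac{v_n^2}{2} + \frac{K x_n^2}{2}$$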

This is the reaction of the pendulum to a signal of the same frequency:

green – input signal
blue – pendulum oscillation
red – pendulum energy

For an input signal that differs slightly from the frequency of the pendulum, the amplitude and energy are significantly smaller than in the previous result:

Combined plot for nine different signals – the central one has been recognized:

After that, I built a set of pendulums for different frequencies to cover five octaves of twelve notes each. This is the resulting energy for 60 pendulums listening to the first chords of “Marooned”:
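
A bank like this can be sketched in base R. The update rule is copied from the R code at the end of the article, but the rest is an assumption made for illustration: the function name pendulum_energy, the choice of C2 as the lowest note, equal temperament with A4 = 440 Hz, and a pure 440 Hz sine as input instead of the recording.

```r
# Energy trace of one pendulum tuned to freq (Hz), driven by the samples in s
pendulum_energy <- function(freq, s, rate = 44100) {
  period <- rate / freq             # T in the article's code
  K   <- (2 * pi / period)^2        # coefficient of elasticity
  Phi <- 2 * atan(period) / pi      # fading coefficient
  x <- 0
  v <- 0
  lastS <- 0
  E <- numeric(length(s))
  for (n in seq_along(s)) {
    lastX <- x
    x <- x + (v + s[n] - lastS - K * x) * Phi
    v <- x - lastX
    E[n] <- (v * v) / 2 + (K * x * x) / 2
    lastS <- s[n]
  }
  E
}

# 5 octaves x 12 notes = 60 pendulums, starting at C2 (MIDI note 36, an assumption)
midi  <- 36:95
freqs <- 440 * 2^((midi - 69) / 12)

# Test input: 4000 samples of a 440 Hz sine at 44100 Hz
s <- sin(2 * pi * 440 * (1:4000) / 44100)

# Mean energy of every pendulum over the second half of the signal
energy  <- sapply(freqs, function(f) mean(pendulum_energy(f, s)[2001:4000]))
loudest <- which.max(energy)
cat("strongest pendulum:", round(freqs[loudest], 2), "Hz\n")
```

Averaging over the second half of the signal skips the start-up transient, so the comparison is made on the settled oscillation of each pendulum.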

As a result, the main tone was detected correctly. I think that the ability of the human ear to omit the phase information of the input signal is crucial for music recognition. I used this model to create a C++ library named Cochlea to listen to, detect and synchronize music in real time. That will be described in the next article.

R code

library(plyr);  #aaply
library(tuneR); #sine

#define a class that imitates a pendulum and has two methods - init and tick
pendulum <- setRefClass("pendulum",
  fields = list( v = "numeric",
                 x = "numeric",
                 K = "numeric",
                 T = "numeric",
                 Phi = "numeric",
                 E = "numeric",
                 lastS = "numeric"),
  methods = list(
    #define the initial state and calculate coefficients
    init = function(w)
    {
      #period of auto-oscillation in samples
      T <<- 44100 / w;
      #coefficient of elasticity
      K <<- (2 * pi / T) * (2 * pi / T);
      #fading coefficient
      Phi <<- 2 * atan(T) / pi;
      #initial state
      v <<- 0;
      x <<- 0;
      lastS <<- 0;
    },
    #pass the position of the stimulating lever
    tick = function(s)
    {
      lastX <- x;
      x <<- x + (v + s - lastS - K * x) * Phi;
      v <<- x - lastX;
      E <<- (v * v) / 2 + (K * x * x) / 2;
      lastS <<- s;
      return(c(x, E));
    }
  )
);

#create one pendulum and init with 700 as frequency of auto-oscillation
p <- pendulum();
p$init(700);

#init a vector of waveforms with frequencies from 500 to 900
m <- aaply(seq(500, 900, 50), 1, function(x) sine(x, 1500)@left);

#clear end of the waveform
m[, 1001:1500] <- 0;

#apply the pendulum tick to the vector of waveforms
m <- t(m);
r <- aaply(m, c(1, 2), p$tick, .progress = "time");

#index of the waveform to plot
i <- 5;

#show results
plot(m[, i] * 100, type = "l", col = "dark green");
lines(r[ , i, 1], type = "l", col = "blue");
lines(r[ , i, 2], type = "l", col = "red");


Vectorization of raster to polygons

I would never have started writing vectorization code if a free library had been available. However, I was recently involved as a tech lead in an interesting project related to geometry. We needed to calculate complex projections of numerous shapes. Having started with a “clean” mathematical solution, we quickly ended up with a huge number of calculations related to polygon triangulation – O(log n!). After a week I realized that we were in real trouble: for real scenarios the process ran for minutes. Then I decided to turn to discretization (as was stated at the beginning); we did the job and got the result in the form of a matrix. It was Friday, and on the following Monday we had to present the results. But in vector form.

A search for raster vectorization among R packages turned up the raster package. Its function rasterToPolygons looked good, but it produced too many points for the polygons. If the R package I need does not exist, I have to create my own. Well, still having half of the weekend left, I did some C++ coding and created a function that does the job in one pass.

An initial bitmap with an enclave, vectorized with the option that disallows exclaves, and the result:

[Images: initial image; vectorization result #1]

and an example from the volcano dataset:

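 library(fasteraster);

 inp = volcano;
 # vectorize the height matrix at levels from 120 to 200 with a step of 20
 res = raster2vector(inp, 120, 200, 20);
 # show the source raster
 image(inp, col = rev(grey.colors(100)), useRaster = TRUE)
 # draw the resulting polygons, closing each one with its first point
 plot(0, type = "l", xlim = c(0, nrow(inp)), ylim = c(0, ncol(inp)))
 a = lapply(res, function(x) lines(rbind(x, x[1,])))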


[Images: volcano image; vectorization result #2]

You can find the source package here: fasteraster_1.0.4.tar

and the Linux 64-bit binary here: fasteraster_1.0.4_R_x86_64-pc-linux-gnu.tar

or get a fresh version right from CRAN: https://cran.r-project.org/package=fasteraster
