Lviv AI & Big Data Day brought together Data Science people from Ukraine

A ticket to AI & Big Data Day was worth its cost. Data Science has only just begun to touch the big IT business in Lviv, and most of the conference participants were true amateurs of this magic, in the good sense of the word.

NLP and Deep Learning

The first speaker was Volodymyr Getmanskyi, who covered NLP and NLU in detail. What was interesting is that in this area, one of the first Data Science technologies able to show business value quickly, they still build models from scratch, because using monsters like IBM Watson does not add much value when solving really specific problems.

Continuing with the speakers from Eleks, I found it very interesting to hear from Sergey Shelpuk and Olha Romaniuk about the practical modeling of a system that recognizes what people wear. They built and trained a Convolutional Neural Network of 154 layers in a Conv-Conv-MaxPool design (actually an adjusted architecture from the winner of the ImageNet challenge). Notably, the main training session lasted three days using TensorFlow on two Nvidia 1080 GPUs and one CPU.

Data Science in Business and Legal

I visited the business section of the conference to listen to Mamed Khalilov talk about problems in AI startups. His main message was: hey, my high-tech friends who are starting a business in AI, forget about ML (or any other buzzword) when stating goals for yourself and your team or when marketing services or a product to your clients; focus on the business value and use ML as a tool to achieve it. And, of course, knowledge of the business field and expertise in understanding the data come before the latest deep learning trends and other AI techniques.

Ivan Horodyskyy gave us a good heads-up on near-future legal issues with AI and robotics in the EU. Interestingly, the definition of a robot drew on the Golem, Frankenstein, Čapek's robots and Asimov's Laws. The main question was liability: as it turns out, a robot cannot be liable, and everything falls on the user / trainer / owner / manufacturer / designer chain. Fortunately, the EU institutions clearly state that the object of the regulations must have a physical body, so a software product is not affected as long as it is not placed on some hardware.

Computer Vision

My favorite talk at the conference was by Dmytro Peleshko, who works on computer vision problems. His ideas and results in background detection, classification, object tracking and recognition inspired me to take new steps in my own work on object movement tracking. We had a very good discussion about the intersection and identification of multiple objects in a distributed camera system, during which I shared my solution to the shadow suppression problem.

I liked the conference very much; the only thing to adjust is the frequency of such events. One year is a long time if we measure it by Moore's Law.

Andy Bosyi,
Information Technology & Data Science
Linkedin: http://www.linkedin.com/in/andybosyi

Object Detection and Counting

Object detection, and especially recognition, can be done using different techniques, such as a combination of OpenCV functions. For me, it was more interesting to build a quick model in R than to spend weeks writing long C++ or .NET code for it. I started with a people counter as a practical application of object detection and took some footage of people passing by the office.

The first step is to extract images from the video using FFmpeg. Then choose a background image and create a matrix of differences between an image with an object in it and the background image. As described on my blog, I have created an R library for raster image processing and vectorization, fasteraster, which can be used for object detection; the idea was to vectorize the matrix of differences into zones detected by gradient.

This picture shows variant #1, where the frames were represented by plain RGB values. Comparing them with the background strongly detected the person's shadow as an object (see variant #1 in the source code):

[Figure: object detection - shadows]

So the next idea (variant #2 in the code) was to calculate ratios between colors (Red / Green, Green / Blue) and then compare them with the background. This removed the shadow detection but introduced another issue: many dark areas were detected, probably because of the poor color detection capabilities of the CMOS camera:

[Figure: object detection - dark areas]

Then I decided to subtract the colors (Red - Green, Green - Blue), and it worked just fine. I also added filtering of the detected zones by weight and showed the zones on the video. The video revealed another problem: when a black object moves across a black background, it gets split into two or three parts.
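Why the subtraction variant suppresses shadows while the ratio variant is noisy in dark areas can be illustrated on single pixels. The following is a minimal sketch, not the code used for the video: the pixel values and the helper names (raw, ratio, delta, dist2) are assumptions chosen for illustration.

```r
# Three feature variants for one RGB pixel (channel values in [0, 1])
raw   <- function(p) p                             # variant #1: plain RGB
ratio <- function(p) c(p[1] / p[2], p[2] / p[3])   # variant #2: R / G, G / B
delta <- function(p) c(p[1] - p[2], p[2] - p[3])   # variant #3: R - G, G - B

# Squared distance between a frame pixel and a background pixel in feature space
dist2 <- function(f, frame, back) sum((f(frame) - f(back)) ^ 2)

variants <- list(raw = raw, ratio = ratio, delta = delta)

# Hypothetical pixels (assumed values, not measured from the footage)
lit    <- c(0.60, 0.60, 0.60)  # grey pavement in the light
shadow <- c(0.30, 0.30, 0.30)  # the same pavement in a shadow
dark   <- c(0.02, 0.02, 0.02)  # a dark area of the background
noisy  <- c(0.03, 0.02, 0.01)  # the same dark area with sensor noise
person <- c(0.70, 0.30, 0.20)  # a coloured object

# Rows: what changed in the frame; columns: feature variant
res <- rbind(
  shadow = sapply(variants, dist2, frame = shadow, back = lit),
  noise  = sapply(variants, dist2, frame = noisy,  back = dark),
  person = sapply(variants, dist2, frame = person, back = lit)
)
print(round(res, 4))
```

In this toy setup the raw RGB distance reacts strongly to the shadow, the ratio features react strongly to small sensor noise in a dark area, and the channel differences stay near zero for both while still responding to a coloured object.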

In this case, I just added code to join the areas and calculate the new center of the joined object. I also added a track line and two green margins to detect that the object passed both of them in the same direction:

As one can see, the model itself took a page of code, and most of it was for the visualization. However, the following items were not included in the model:

  • background image – it has to adjust to the weather, time of day and other conditions (such as somebody leaving a bag in the middle of the observation area).
  • counting the objects – simply check which track vectors crossed the green margins.
  • multiple object detection – needs an identification algorithm based on path approximation.
  • joined object recognition – needs clustering of shape medians to split the joined area into smaller ones by average weight and path approximation.
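The counting step from the list above could look roughly like the sketch below. It rests on assumptions: the function name crossing_direction is hypothetical, the margins at x = 30 and x = 70 are taken from the abline calls in the source code, and a single object per track is assumed.

```r
# Decide whether a track of x-coordinates passed both margins in one direction.
# Returns 1 (left to right), -1 (right to left) or 0 (no complete pass).
crossing_direction <- function(x, left = 30, right = 70) {
  x <- x[is.finite(x)]            # drop frames where nothing was detected (NA / NaN)
  if (length(x) < 2) return(0L)
  first <- x[1]
  last <- x[length(x)]
  if (first < left && last > right) return(1L)
  if (first > right && last < left) return(-1L)
  0L
}

# Hypothetical tracks
walk_right <- seq(10, 90, by = 5)
walk_left  <- rev(walk_right)
turned     <- c(10, 40, 50, 20)   # entered the zone and went back

counts <- c(crossing_direction(walk_right),
            crossing_direction(walk_left),
            crossing_direction(turned))
print(counts)
```

Summing the positive and negative results separately over all tracks would give the number of people who passed in each direction.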
Object detection R source code
library("png");
library("raster");
library("fasteraster");

X <- 48 * 2 ;
Y <- 27 * 2;
from <- 140;
to <- 200;

matrixFromFrame <- function (idx)
{
  v <- readPNG(sprintf("in/%03d.png", idx));
  rgb <- lapply(1:3, function(x) as.matrix(aggregate(raster(v[ , , x]), fact = 5)));
  rgb <- lapply(rgb, function(x) t(x)[1:X, Y:1]);
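  # variant #1 (disabled): plain RGB channels
  # return(rgb);
  # variant #2 (disabled): channel ratios R / G and G / B
  # return(list(rgb[[1]] / rgb[[2]], rgb[[2]] / rgb[[3]]));
  # variant #3 (active): channel differences R - G and G - B, plus mean brightness
  return(list(rgb[[1]] - rgb[[2]], rgb[[2]] - rgb[[3]], (rgb[[1]] + rgb[[2]] + rgb[[3]]) / 3));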
}

processFrame <- function(idx, back)
{
#  png(file = sprintf("out/final%03d.png", idx), width = 640, height = 480);
  rggb <- matrixFromFrame(idx);
  diff <- (rggb[[1]] - back[[1]]) ^ 2 + (rggb[[2]] - back[[2]]) ^ 2;
  pol <- raster2vector(diff, 0.001, 100, 100);
  plot(0, type = "l", xlim = c(1, X), ylim = c(1, Y));
  rasterImage(readPNG(sprintf("in/%03d.png", idx), native = TRUE), 1, 1, X, Y);
  abline(v = 30, col = 'green');
  abline(v = 70, col = 'green');
  lapply(pol, function(x) lines(rbind(x, x[1,]), col = 'blue'));
  zone <- rasterZoneAnalyzer(diff, 0.001, 100, 100);
  zone <- zone[zone[ , 2] > 10, , drop = FALSE];
  #text(zone[ , 3], zone[ , 4], labels = zone[ , 2], col = 'red');
  track[[idx - from + 1, 1]] <<- sum(zone[, 2] * zone[, 3]) / sum(zone[, 2]);
  track[[idx - from + 1, 2]] <<- sum(zone[, 2] * zone[, 4]) / sum(zone[, 2]);
  lines(track, col = 'red');
  points(track, col = 'red', pch = 20);
#  dev.off();
}

track <- matrix(nrow = to - from + 1, ncol = 2);
back <- matrixFromFrame(100);
lapply(from:to, function(x) processFrame(x, back));

The Natural Ear for Digital Sound Processing – as an alternative to the Fourier Transformation

This is a primitive prototype of the natural ear. How I came to it, and how it can be better than the Fast Fourier Transformation (FFT) in Digital Sound Processing (DSP), is what this article is about.

Some of the software development projects I was involved in used the Fourier Transformation for waveform analysis. The projects included sound tone recognition for gun targets and for DTMF signals. But even before that, I was keen to get a “picture” of human speech and musical harmony. Recently I started an app that will play some instrument while I am playing lead guitar. The problem was to teach the computer to listen to my tempo and keep the musical rhythm in order. To accomplish this, I applied the Fourier Transformation to the first seconds of the Pink Floyd composition “Marooned”. Then I compared the “picture” with the same composition performed by me, and the results were poor until I chose an FFT block size as large as 8192 to recognize notes at least up to the 6th octave.

This showed the first problem with the Fourier Transformation: for a really good analysis you need to increase the block size (and with it the number of frequency bins) and, as a result, performance goes down, especially for real-time processing.
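The trade-off can be put into numbers. The sketch below assumes a 44100 Hz sample rate (the value used in the R code at the end of the article) and equal-tempered tuning; it computes the width of one FFT bin and the lowest frequency at which two neighbouring semitones are still at least one bin apart.

```r
fs <- 44100                          # sample rate in Hz (assumed)
block <- c(1024, 2048, 4096, 8192)   # FFT block sizes to compare

# Width of one frequency bin in Hz
bin_width <- fs / block

# Relative distance between two neighbouring semitones (about 5.9 %)
semitone_step <- 2 ^ (1 / 12) - 1

# Below this frequency two neighbouring semitones fall closer than one bin apart
min_freq <- bin_width / semitone_step

resolution <- data.frame(block, bin_width, min_freq)
print(round(resolution, 2))
```

With a block of 8192 samples one bin is about 5.4 Hz wide, so semitones can be separated down to roughly 90 Hz; with 1024 samples the limit is above 700 Hz. The price is that each block of 8192 samples spans about 0.19 s of sound.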

The second problem of Fourier Transformation analysis for music: the same instrument, depending on its timbre, can generate a different set of overtones. These overtone frequencies, analyzed by the FFT, created peaks that were irrelevant to what we actually hear. To generalize the result, I summed the frequency bins into twelve semitones. The picture was better, but now the very first note was recognized as C, while in fact it was B.
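Summing the frequency bins into twelve semitones can be sketched as follows. This is an illustration under assumptions, not the original analysis: the input is a synthetic B3 tone with one overtone instead of the recording, the 60 to 4000 Hz range is arbitrary, and tuning is equal-tempered with A4 = 440 Hz.

```r
fs <- 44100                        # sample rate in Hz (assumed)
n <- 8192                          # FFT block size
t <- (0:(n - 1)) / fs

# Hypothetical test tone: B3 (246.94 Hz) with a weaker third harmonic
f0 <- 246.94
x <- sin(2 * pi * f0 * t) + 0.5 * sin(2 * pi * 3 * f0 * t)

# Magnitude spectrum for bins 1 .. n/2 - 1 and the frequency of each bin
spec <- Mod(fft(x))[2:(n / 2)]
freq <- (1:(n / 2 - 1)) * fs / n

# Keep a musically relevant range
keep <- freq >= 60 & freq <= 4000

# Nearest note number (MIDI convention: A4 = 440 Hz = 69), folded to 0..11
note <- round(69 + 12 * log2(freq[keep] / 440))
pitch_class <- factor(note %% 12, levels = 0:11)

# Sum the energy of all bins that belong to the same semitone
note_names <- c("C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B")
energy <- as.numeric(tapply(spec[keep] ^ 2, pitch_class, sum))
names(energy) <- note_names

detected <- names(which.max(energy))
print(detected)
```

On a clean synthetic tone the strongest semitone is the fundamental; on a real recording strong overtones and spectral leakage can move the maximum to another note, which is the kind of error described above.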

This forced me to read more about the nature of sound, hearing and the human ear. I thought that the cause might be a third problem with the Fourier Transformation: it is sensitive to the signal phase. The human ear does not recognize the phase of individual harmonics, only their frequencies.

I created a model in the R language (you can find the code at the end of the article) that generates input signals for a set of frequencies:

Then I used some formulas I had put together fifteen years ago (back then the same experiment failed because of poor PC performance) to create a model of a pendulum. The pendulum receives an incoming signal and oscillates if the signal contains a frequency equal to its own:

 

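Frequency (restated from the R code at the end of the article: the period $T$ in samples for an auto-oscillation frequency $w$ in Hz at a 44100 Hz sample rate, and the coefficient of elasticity $K$):

$T = 44100 / w, \quad K = (2\pi / T)^2$

The fading coefficient that does not depend on the auto-oscillation frequency of the pendulum:

$\Phi = 2 \arctan(T) / \pi$

The position of the pendulum after the input sample $s_n$:

$x_n = x_{n-1} + (v_{n-1} + s_n - s_{n-1} - K x_{n-1}) \, \Phi$

Velocity and energy:

$v_n = x_n - x_{n-1}, \quad E_n = v_n^2 / 2 + K x_n^2 / 2$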

This is the reaction of the pendulum to a signal of the same frequency:

green – input signal
blue – pendulum oscillation
red – pendulum energy

For an input signal whose frequency differs slightly from that of the pendulum, the amplitude and energy are significantly smaller than in the previous result:

Combined plot for nine different signals – the central one has been recognized:

After that, I built a set of pendulums for different frequencies to cover five octaves of twelve notes each. This is the resulting energy for 60 pendulums listening to the first chords of “Marooned”:

As a result, the main tone was detected correctly. I think that the ability of the human ear to ignore the phase information of the input signal is crucial for music recognition. I used this model to create a C++ library named Cochlea to listen to, detect and synchronize music in real time. That will be described in the next article.
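A bank of 60 pendulums can be sketched in vectorized form using the same update rule as the pendulum class in the R code. This is a sketch under assumptions rather than the code behind the plot: the note range (C2 to B6), the synthetic 220 Hz input and the function name listen are all chosen for illustration, and the mean energy over the whole signal is used as the score.

```r
fs <- 44100                               # sample rate in Hz

# 60 pendulums: five octaves of twelve semitones, C2 .. B6 (assumed range)
midi <- 36:95
freq <- 440 * 2 ^ ((midi - 69) / 12)      # equal temperament, A4 = 440 Hz
period <- fs / freq                       # period in samples
K <- (2 * pi / period) ^ 2                # coefficient of elasticity
Phi <- 2 * atan(period) / pi              # fading coefficient

# Feed the signal s to all pendulums at once; return the mean energy of each
listen <- function(s) {
  x <- numeric(length(freq))
  v <- numeric(length(freq))
  energy <- numeric(length(freq))
  last_s <- 0
  for (n in seq_along(s)) {
    last_x <- x
    x <- x + (v + s[n] - last_s - K * x) * Phi    # position
    v <- x - last_x                               # velocity
    energy <- energy + (v * v) / 2 + (K * x * x) / 2
    last_s <- s[n]
  }
  energy / length(s)
}

# Hypothetical input: 0.2 s of a pure 220 Hz tone (A3)
s <- sin(2 * pi * 220 * (0:(fs / 5 - 1)) / fs)
e <- listen(s)

note_names <- c("C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B")
best <- midi[which.max(e)]
detected <- paste0(note_names[best %% 12 + 1], best %/% 12 - 1)
print(detected)
```

Each pendulum keeps only its own position and velocity, so the cost per input sample is one small vector update, and no block of samples has to be collected before the analysis can start.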

R code
library(plyr);
library(tuneR);

#define a class that imitates a pendulum and has two methods - init and tick
pendulum <- setRefClass(
  "pendulum",
  fields = list( v = "numeric",
                 x = "numeric",
                 K = "numeric",
                 T = "numeric",
                 Phi = "numeric",
                 E = "numeric",
                 lastS = "numeric"),
  methods = list(
    #define the initial state and calculate coefficients
    init = function(w)
    {
      #period
      T <<- 44100 / w;
      #coefficient of elasticity
      K <<- (2 * pi / T) * (2 * pi / T);
      #fading coefficient
      Phi <<- 2 * atan(T) / pi;
      #initial state
      v <<- 0;
      x <<- 0;
      lastS <<- 0;
    },
    
    #pass the position of the stimulating lever
    tick = function(s)
    {
      lastX <- x;
      #position
      x <<- x + (v + s - lastS - K * x) * Phi;
      #velocity
      v <<- x - lastX;
      #energy
      E <<- (v * v) / 2 + (K * x * x) / 2;
      lastS <<- s;
      return(c(x, E));
    }
  )
)

#create one pendulum and init with 700 as frequency of auto-oscillation
p <- pendulum();
p$init(700);

#init a vector of waveforms with frequencies from 500 to 900
m <- aaply(seq(500, 900, 50), 1, function(x) sine(x, 1500)@left);

# clear end of the waveform
m[, 1001:1500] <- 0;

#apply the pendulum tick to the vector of waveforms
m <- t(m);
r <- aaply(m, c(1, 2), p$tick, .progress = "time");

#index of the waveform to  plot
i <- 5;

#show results
plot(m[, i] * 100, type = "l", col = "dark green");
lines(r[ , i, 1], type = "l", col = "blue");
lines(r[ , i, 2], type = "l", col = "red");

Vectorization of raster to polygons

I would never have started writing any vectorization code if a free library had existed. However, recently I was involved as a tech lead in an interesting project related to geometry. We needed to calculate complex projections of numerous shapes. Having started with a “clean” mathematical solution, we quickly ended up with a huge number of calculations related to polygon triangulation, O(log n!). We spent a week, and I found that we were in real trouble: for real scenarios the process ran for minutes. Then I decided to turn to discretization (as had been proposed at the beginning); we did the job and got the result in the form of a matrix. It was Friday, and on the following Monday we had to present the results. But in vector form.

A search for raster vectorization among R packages turned up this package, and its function rasterToPolygons looked good, except that it produced too many points for the polygons. If the R package I need does not exist, then I have to create my own. Well, still having half of the weekend left, I did some C++ coding and created this function, which does the job in one pass.

An initial bitmap with an enclave, vectorized with the option that disallows exclaves, and the result:

[Figures: initial image; vectorization result #1]

and an example from the volcano dataset:

 library(fasteraster);
 library(datasets);
 inp = volcano;
 res = raster2vector(volcano, 120, 200, 20);
 image(inp, col = rev(grey.colors(100)), useRaster = TRUE)
 plot(0, type = "l", xlim = c(0, nrow(inp)), ylim = c(0, ncol(inp)))
 a = lapply(res, function(x) lines(rbind(x, x[1,])))

produces:

[Figures: volcano image; vectorization result #2]

You can find the source package here: fasteraster_1.0.4.tar

and the Linux 64-bit binary here: fasteraster_1.0.4_R_x86_64-pc-linux-gnu.tar

or get a fresh version right from CRAN: https://cran.r-project.org/package=fasteraster

For Whom the Bell Tolls

* * *

There is no man at all
Who exists entirely of himself,
For even a piece of the continent
Is just a small clod of earth.

An island loses its rock,
Which the water bites away;
So with the death of every person
I, too, grow smaller.

If you are one with all mankind,
Living in joy and in sorrow,
Then do not ask for whom the bell tolls:
It tolls for you as well.

John Donne (1621),
free translation

* * *

Mister No

* * *

Escaping from the world, chasing one's own tail,
With the heart nailed down to the stubble field of life:
These are all the last glimmers of childhood
That the Great Mister No steals from us.

 * * *