Sound Analysis with a GUI for visualisation of spectrograms.

Ever thought of a project that gives you a GUI for analysing spectrograms: one that shows a visual representation of sound waves, lets you zoom in on parts of a wave for analysis, and lets you classify the waves and separate them out? We could also extend this GUI into an Android- or iOS-based app, which would come in handy for audio capture and analysis. So, moving ahead.

What is a spectrogram, you ask?

A spectrogram is a visual representation of the frequency content of a signal as it varies with time. Sound classification, or audio tagging, has various applications.

I can think of a few use cases for this technology.

Imagine you have a .wav file containing the sound of an engine with some anomalies in it. A sound classification model could classify the waves, and the next step would be to recognise the anomalous ones and separate them out.

Visualising the waves is not the hardest part. The major challenges lie in these questions:

  • How would you make up for the loss of information?
  • How do you recognise the loss of information?
  • How do you perform feature extraction?
  • How do you train your model?

So far, not many people are part of the community that deals with sound analysis or feature extraction from sound waves, which requires solid knowledge of DSP (digital signal processing). DSP is a dreaded subject for many engineering students, including me: I studied Electronics and Instrumentation with very little background in signal processing and was, of course, terrified of it. Despite my fears, I was still curious to explore this path.

So, with all my fears intact and a curious mind, I thought I would try to create something that might be part of a small revolution.

To start with, I would like to focus on the following aspects:

Which tools are we going to use for information retrieval from audio files? The two below are also widely regarded as industry standards.

  • Python
  • TensorFlow

We also need to focus on a few theoretical aspects:

  • What are the types of audio features for ML? But first, what is an audio feature?

Audio features are descriptions of sound. You may then ask: why do we need these audio features?

We need them to train audio systems.

In how many ways can we categorise audio features? These are the ways to do it:

  • Level of Abstraction
  • Temporal Scope
  • Music Aspect
  • Signal Domain
  • ML Approach

Without going too deep into the concepts, let us go straight to feature extraction. The signal domain category has two aspects: the time domain and the frequency domain.

Some features live in the time domain, e.g. amplitude envelope, RMSE (root-mean-square energy) and zero-crossing rate.
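As a rough illustration of these three time-domain features, here is a minimal sketch in plain Python. The function names, the frame size and the hop length are my own assumptions for the example, and the input is a synthetic sine wave rather than a real recording. In a real project an array library would replace the list arithmetic.

```python
import math


def split_into_frames(signal, frame_size, hop):
    """Cut a list of samples into overlapping frames; a trailing partial frame is dropped."""
    return [signal[i:i + frame_size]
            for i in range(0, len(signal) - frame_size + 1, hop)]


def amplitude_envelope(frame):
    """Largest absolute sample value in the frame."""
    return max(abs(s) for s in frame)


def rms_energy(frame):
    """Root-mean-square energy of the frame."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


if __name__ == "__main__":
    # Hypothetical test signal: a 440 Hz sine sampled at 8000 Hz.
    sample_rate = 8000
    signal = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(1024)]
    for index, frame in enumerate(split_into_frames(signal, frame_size=256, hop=128)):
        print(index, amplitude_envelope(frame), rms_energy(frame), zero_crossing_rate(frame))
```

Each feature is computed per frame, so a whole recording turns into one short series per feature, which is the shape of data a model can be trained on.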

In the frequency domain, we get the band energy ratio, spectral centroid and spectral flux.
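The frequency-domain features can be sketched in the same spirit. This is only an illustration under my own assumptions: the function names are hypothetical, the spectrum comes from a naive DFT (fine for short frames, far too slow for real audio), and spectral flux is taken as the Euclidean distance between consecutive magnitude spectra, which is one of several definitions in use.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (O(N^2), for illustration only)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]


def spectral_centroid(mags, sample_rate, frame_size):
    """Magnitude-weighted mean frequency in Hz."""
    total = sum(mags)
    if total == 0:
        return 0.0
    bin_hz = sample_rate / frame_size
    return sum(k * bin_hz * m for k, m in enumerate(mags)) / total


def band_energy_ratio(mags, split_bin):
    """Energy below the split bin divided by energy at or above it."""
    low = sum(m * m for m in mags[:split_bin])
    high = sum(m * m for m in mags[split_bin:])
    return low / high if high > 0 else float("inf")


def spectral_flux(previous_mags, mags):
    """Euclidean distance between two consecutive magnitude spectra."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(previous_mags, mags)))


if __name__ == "__main__":
    # Hypothetical frame: a sine that sits exactly on bin 8 of a 64-point DFT.
    frame = [math.sin(2 * math.pi * 8 * n / 64) for n in range(64)]
    mags = magnitude_spectrum(frame)
    print(spectral_centroid(mags, sample_rate=6400, frame_size=64))
    print(band_energy_ratio(mags, split_bin=16))
```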

To build a spectrogram, we apply the FFT to short, successive frames of the time-domain signal to move into the frequency domain. We can then read how energy is distributed across frequency and time from the intensity of the colours on the graph.

Below is an example spectrogram, just to give you an idea of what it should look like.
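To make the FFT step concrete, here is a minimal sketch of how the numbers behind a spectrogram could be computed: cut the signal into overlapping frames, apply a Hann window, run an FFT on each frame and convert the magnitudes to decibels. The function names, frame size and hop length are assumptions of mine, and the small recursive FFT is there only to keep the example dependency-free.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out


def spectrogram(signal, frame_size=256, hop=128):
    """Return a list of rows (one per frame) of dB magnitudes for bins 0..frame_size/2."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size) for n in range(frame_size)]
    rows = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_size)]
        spectrum = fft(frame)[:frame_size // 2 + 1]
        # The small offset avoids log10(0) for silent bins.
        rows.append([20 * math.log10(abs(c) + 1e-12) for c in spectrum])
    return rows


if __name__ == "__main__":
    # Hypothetical input: a 1000 Hz sine sampled at 8000 Hz.
    sample_rate = 8000
    signal = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(1024)]
    rows = spectrogram(signal)
    loudest_bin = rows[0].index(max(rows[0]))
    print(loudest_bin * sample_rate / 256, "Hz")
```

Each row is one time slice and each column one frequency bin, so plotting the rows side by side as a colour map gives the kind of picture described in this post.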

Spectrogram of the spoken words “nineteenth century”. Frequencies are shown increasing up the vertical axis, and time on the horizontal axis. The legend to the right shows that the colour intensity increases with the density.

Now, to start with the GUI:

This is a snippet where I generate a small NodeJS project that will help with audio analysis. Don't expect a complete copy-paste tutorial, though: I skipped some steps, and I expect fellow developers and readers to know NodeJS, Express and AngularJS.

Create a Node server. For that, I would suggest creating a folder first.

Type the command below:

mkdir SoundAINodeJS

This creates the folder. Go inside it and type the command below:

npm init

This initialises your project, meaning that a package.json file is created. For simplicity, keep pressing Enter to accept the defaults.

Then add the libraries you need to download from the internet as dependencies:

{
  "name": "node-crud",
  "version": "0.0.0",
  "private": true,
  "scripts": {
    "start": "node ./bin/www"
  },
  "dependencies": {
    "body-parser": "^1.19.0",
    "cookie-parser": "~1.4.3",
    "debug": "^2.6.9",
    "ejs": "~2.5.5",
    "express": "^4.17.1",
    "hiplot": "^0.1.19",
    "mongoose": "^5.10.14",
    "morgan": "^1.10.0",
    "nodemon": "^2.0.6",
    "paginate": "^0.2.0",
    "serve-favicon": "^2.5.0",
    "socket.io": "^3.0.1"
  }
}

Once your project is created, cd into the folder and type:

npm install

The above command downloads all the libraries listed in package.json and installs them into the node_modules folder.

Now add the remaining folders to create a similar structure. Don’t be bothered by the name “mywebsite_copy”; your folder should have the name SoundAINodeJS.

For the sound analysis part, you can use a Jupyter notebook to run the Python analysis code. You can also use PyCharm, although I found Jupyter notebooks to be a convenient quick-and-dirty method.

Now for the sound analysis itself. Just to give you a heads-up on the analysis part:

A snippet of the code for individual sound wave analysis.
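As a possible starting point for that analysis, here is a minimal sketch of loading a .wav file into a list of samples with Python's standard `wave` module. It is not the full analysis code: the helper names are hypothetical, it assumes 16-bit mono PCM, and it writes its own short test tone so that it runs without any recording at hand.

```python
import math
import os
import struct
import tempfile
import wave


def write_sine_wav(path, freq_hz=440.0, sample_rate=8000, seconds=0.5):
    """Write a 16-bit mono sine tone at half of full scale (test data only)."""
    n = int(sample_rate * seconds)
    samples = [int(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * i / sample_rate))
               for i in range(n)]
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack("<%dh" % n, *samples))


def read_mono_wav(path):
    """Return (sample_rate, samples) with samples scaled to roughly [-1.0, 1.0)."""
    with wave.open(path, "rb") as wav_file:
        if wav_file.getnchannels() != 1 or wav_file.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit mono files")
        sample_rate = wav_file.getframerate()
        n = wav_file.getnframes()
        raw = wav_file.readframes(n)
    return sample_rate, [v / 32768.0 for v in struct.unpack("<%dh" % n, raw)]


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "tone.wav")
    write_sine_wav(path)
    rate, samples = read_mono_wav(path)
    print(rate, len(samples), max(abs(s) for s in samples))
    os.remove(path)
```

The returned list of samples is the time-domain signal that feature extraction and the FFT operate on.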

— — — — — — — — (Post Under Construction) — — — — — — — — —

Shreya Mitra-Cloud/Electronics/Software Engr

An Engineer with a masters degree in E&I , trying to meddle with various topics and keeping myself informed MBA@WHU | Columbia Business school