Wikipedia describes independent component analysis as “a computational method for separating a multivariate signal into additive subcomponents supposing the mutual statistical independence of the non-Gaussian source signals”. (Clearly, this was written as part of their campaign to make technical articles accessible.)
In normal people words, ICA is a form of blind source separation — a method of unmixing signals after they have been mixed together, without knowing exactly how they were mixed. It’s not as bad as Wikipedia makes it sound. It’s just the signal processing equivalent of this:
One of the problems I always have with learning stuff like this is the lack of clear examples. They exist, but they’re not generally very good. (And why do researchers always work with awful noisy 3-second 8 kHz recordings?) So, upon getting working results, I wrote up this little example. This is in Python and requires the MDP (python-mdp in Ubuntu) and Audiolab packages (sudo easy_install scikits.audiolab).
In order for ICA to work, it requires at least one different recording for each signal you want to unmix. So if you have two musical instruments playing together in a room, and want to unmix them to get separate recordings of each individual instrument, you’ll need two different recordings of the mixture to work with (like a stereo microphone). If you have three instruments playing together, you’ll need three microphones to separate out all three original signals, etc. So, first, create the mix:
- Find or make two mono sound files. I just used clips of music.
- Mix them together to a stereo track, with both sounds mixed into both channels, but with each panned a little differently, so the two channels are not identical. They should sound all jumbled together, but the left channel should sound slightly different from the right.
- Save in a format that libsndfile can read, like FLAC or WAV (not mp3):
Mixed music: [audio clip]
Alternatively, just mix them in Python:
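from scikits.audiolab import wavread
# Load the two mono sources (they need the same length and sample rate)
sig1, fs1, enc1 = wavread('file1.wav')
sig2, fs2, enc2 = wavread('file2.wav')
# Each mix contains both sources, but at different levels
mixed1 = sig1 + 0.5 * sig2
mixed2 = sig2 + 0.6 * sig1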
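The two mixing lines are an instantaneous mix, which can also be written as a 2x2 matrix applied to the sources. Here is a minimal NumPy sketch of that view, assuming Python 3 and using synthetic stand-in signals instead of audio files (the names S, A, and X are illustrative only). It shows that anyone who knows the matrix can undo the mix with a plain inverse; ICA has to estimate that inverse without being told the matrix.

```python
import numpy as np

# Synthetic stand-ins for the two mono files (assumption: same length, same rate)
t = np.linspace(0.0, 1.0, 8000, endpoint=False)
sig1 = np.sin(2 * np.pi * 440 * t)       # a pure tone
sig2 = 2 * (t * 3 % 1.0) - 1             # a 3 Hz sawtooth

S = np.column_stack([sig1, sig2])        # shape (samples, 2): one source per column

# Same weights as mixed1 = sig1 + 0.5 * sig2 and mixed2 = sig2 + 0.6 * sig1
A = np.array([[1.0, 0.5],
              [0.6, 1.0]])
X = S @ A.T                              # shape (samples, 2): one mix per column

# With A known, unmixing is just a matrix inverse
recovered = X @ np.linalg.inv(A).T
```

If the two rows of A were identical, the matrix would have no inverse, which is why the two channels have to contain the sources at different levels.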
So now you have the mixed signals, and you can pretend you don’t know how they were mixed. To unmix them automatically, run something like this in Python:
from mdp import fastica
from scikits.audiolab import flacread, flacwrite
from numpy import abs, max
# Load in the stereo file
recording, fs, enc = flacread('mix.flac')
# Perform FastICA algorithm on the two channels
sources = fastica(recording)
# The output levels of this algorithm are arbitrary, so normalize them to 1.0.
sources /= max(abs(sources), axis = 0)
# Write back to a file
flacwrite(sources, 'sources.flac', fs, enc)
The output has each signal in its own channel.
You can hear some crosstalk, but it’s pretty good.
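For anyone curious what a FastICA-style routine does internally, here is a rough NumPy sketch of the textbook symmetric fixed-point iteration with a tanh nonlinearity. It is not MDP's implementation, and fastica_sketch is a hypothetical name; it assumes Python 3, as many samples as you like in rows, and one channel per column. The test signals are seeded random noise (one uniform, one Laplace) standing in for real non-Gaussian sources.

```python
import numpy as np

def fastica_sketch(X, n_iter=200, tol=1e-8, seed=0):
    """Symmetric FastICA with tanh. X has shape (samples, channels)."""
    X = X - X.mean(axis=0)
    n, c = X.shape
    # Whiten: rotate and scale so the channels are uncorrelated with unit variance
    eigvals, eigvecs = np.linalg.eigh(X.T @ X / n)
    Z = X @ eigvecs / np.sqrt(eigvals)

    def decorrelate(W):
        # W <- (W W^T)^(-1/2) W, which keeps the rows of W orthonormal
        u, _, vt = np.linalg.svd(W)
        return u @ vt

    rng = np.random.default_rng(seed)
    W = decorrelate(rng.standard_normal((c, c)))
    for _ in range(n_iter):
        G = np.tanh(Z @ W.T)             # nonlinearity applied to current estimates
        # Fixed-point update E[z g(w.z)] - E[g'(w.z)] w, for all rows of W at once
        W_new = decorrelate(G.T @ Z / n - np.diag((1 - G ** 2).mean(axis=0)) @ W)
        # Converged when every row points the same way as before (up to sign)
        done = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1)) < tol
        W = W_new
        if done:
            break
    return Z @ W.T

# Stand-in sources: seeded noise, not audio (assumption for the demo)
rng = np.random.default_rng(1)
S = np.column_stack([rng.uniform(-1, 1, 20000), rng.laplace(size=20000)])
X = S @ np.array([[1.0, 0.5], [0.6, 1.0]]).T

est = fastica_sketch(X)
est /= np.max(np.abs(est), axis=0)       # output scale is arbitrary, so normalize

# Absolute correlation between each true source (rows) and each estimate (columns)
corr = np.abs(np.corrcoef(S.T, est.T)[:2, 2:])
```

The order and sign of the recovered channels are arbitrary, so corr is the easiest way to see which output corresponds to which source.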
For more than two sources, I just read them in separately and combined them in Python:
from numpy import array
rec1, fs, enc = flacread('Mixdown (1).flac') # Mono file
rec2, fs, enc = flacread('Mixdown (2).flac')
rec3, fs, enc = flacread('Mixdown (3).flac')
# Stack the mono recordings into one (samples, channels) array
sources = fastica(array([rec1, rec2, rec3]).transpose())
flacwrite() has no problem writing multi-channel files.
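Combining the recordings with array([...]) only gives a clean 2-D array when every recording has exactly the same number of samples. If the files differ by a few samples, something like the following sketch could trim them first. stack_mono is a hypothetical helper, not part of MDP or Audiolab, and the arrays below are stand-ins for the loaded files.

```python
import numpy as np

def stack_mono(*recordings):
    """Trim 1-D recordings to the shortest one; return shape (samples, channels)."""
    n = min(len(r) for r in recordings)
    return np.column_stack([np.asarray(r)[:n] for r in recordings])

# Stand-ins for three mono files of slightly different lengths
rec1, rec2, rec3 = np.zeros(1000), np.ones(1003), np.full(998, 0.5)
stacked = stack_mono(rec1, rec2, rec3)
```

Trimming from the end assumes the recordings start in sync; if they do not, the instantaneous-mixing assumption is already broken.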
Mixed speech: [audio clip]
After demixing, there’s very little crosstalk, though the noise floor increases considerably. This seems to be the case when the mixes are very similar.
Although this method was recommended to me for real-life audio signals and microphones, as I’ve described above, it turns out that ICA doesn’t actually work well when the signals occur at different delays in the different sensor channels; it assumes instantaneous mixing (that the signals are in perfect sync with each other in all the different recordings). Delay would happen in a real-life situation with performers and microphones, since each source is a different distance from each microphone. This is exactly the application I had in mind, though, so I don’t really have any further interest in ICA…
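The effect of delay can be illustrated without any audio. The sketch below (Python 3 and NumPy assumed) asks for the best possible pair of fixed weights that recovers one source from two channels, first with instantaneous mixing and then with one source arriving five samples late in the second channel. The sources are seeded white noise, an assumption that makes the effect stark; real audio is correlated across short delays, so the damage would be more gradual. unmix_error is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)
s1 = rng.uniform(-1, 1, 5000)
s2 = rng.laplace(size=5000)

def unmix_error(x1, x2, target):
    """RMS error of the least-squares weights w with [x1 x2] @ w ~ target."""
    X = np.column_stack([x1, x2])
    w, *_ = np.linalg.lstsq(X, target, rcond=None)
    return np.sqrt(np.mean((X @ w - target) ** 2))

# Instantaneous mixing: both channels hear both sources in perfect sync
err_sync = unmix_error(s1 + 0.5 * s2, 0.6 * s1 + s2, s1)

# Delayed mixing: the second channel hears s1 five samples late
s1_late = np.roll(s1, 5)   # circular shift, a stand-in for a real acoustic delay
err_delay = unmix_error(s1 + 0.5 * s2, 0.6 * s1_late + s2, s1)
```

With synchronized channels the error is at floating-point rounding level, while with the delay no pair of weights can cancel the other source without letting the delayed copy leak in. Since ICA only ever searches for such fixed weights, it cannot do better than this.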
March 30th, 2010 at 8:20 am
As you said, it unmixes the signals into statistically independent parts only when we have two sources mixed together.
Can’t this unmix an EEG signal which contains ECG or EMG artefacts?
I searched a lot of IEEE papers, but none illustrated the proper way to solve this problem. Kindly publish another example showing ECG or EMG artefact removal from EEG data.
Thanks!
March 30th, 2010 at 9:01 am
I’d imagine that taking multiple channels of EEG and then demixing them with ICA would separate out the different parts you want, but I’m not an expert on this stuff, sorry. It might not be the right kind of problem for ICA to solve. Search for “blind source separation” instead of ICA to see if there are better algorithms.
May 31st, 2010 at 9:49 pm
I was just trying out some similar examples of using fastica and I found your site. Totally agree about the lack of examples.
Anyway, the thing I’m wondering is why you need N channels to produce N separated signals. Why can’t ICA just work on one signal and separate out the components? Seems a bit roundabout to have to copy a second channel and then pan it.
I want to build an app that takes in a song and separates out three signals (vocals, melody, and percussion). It would be very useful to DJs because pretty much no such software exists currently.
May 31st, 2010 at 10:06 pm
How would it know what the two signals were if it only had one to work with? It uses the difference between the two mixed signals to figure out what the two original signals were.
May 31st, 2010 at 10:34 pm
I guess I should read more about it… I thought by doing a form of PCA it was separating out frequencies that covaried in amplitude together. I didn’t realize that it’s doing some kind of comparison between sensors.
It’s definitely possible to do this with just one signal, but maybe ICA is not what I’m looking for. I would suppose it would work in the same way your brain separates out a conversation from a bunch of people talking in the same room (granted you have two ears to sample from but I’m pretty sure you would still have the ability if you were deaf in one ear).
May 31st, 2010 at 10:40 pm
lol I just tried this with Biggie’s “Juicy” and it separates Puff Daddy saying “uh uh yea thats right” really annoyingly into one channel and everything else into the other channel. It sounds hilarious.
June 1st, 2010 at 12:01 pm
Well, in order to extract two signals from one signal, you need a model of what type of signal to expect. If one signal is all low frequencies and the other all high frequencies, you could separate them with a simple filter, for instance. But if you don’t know anything specific about the signals, you’re not going to be able to separate them.
Yes, you could do this the way humans do it. All you’d have to do is write software to simulate a human brain. I’d be very interested in getting a copy of this, if you do it.
November 30th, 2011 at 9:44 pm
Hi, I am trying to use your code on a real-world recording. I use two mono microphones to record simultaneously into a stereo mic in.
So the left channel of the audio file now contains the signal from mic 1 and the right channel from mic 2.
I want to ask how should I prepare the file so that “Mix them together to a stereo track, with both sounds mixed into both channels, but with each panned a little differently”?
Thanks.
December 1st, 2011 at 1:41 am
I think I used Adobe Audition and mixed the two tracks together to a stereo mix, with each panned differently. Then Left and Right both contain both signals, but not at the same levels.
December 1st, 2011 at 1:06 pm
What does it mean by “with each panned differently”?
Do you think it should work on a real-world recording with one microphone for each channel?
December 1st, 2011 at 1:12 pm
I mean, for instance, that Mic 1 is 100% in Left channel, and 50% in Right channel, while Mic 2 is 100% in Right channel, and 50% in Left channel. You need both signals in both channels, but not at the same level.
Yes it should work fine for real-world microphone recordings, as long as they were recorded independently and mixed without any delay. Mixing them like this is not realistic, though. In real life, if you are recording 2 sources with 2 microphones at the same time, there will be slight delay differences between the microphones, which ICA does not handle as well.
April 1st, 2012 at 1:53 am
Don’t give up so fast: some modified ICA algorithms that do work
April 4th, 2012 at 8:55 pm
I’ve since discovered the DUET algorithm, which seems very similar to my original idea: breaking up the two signals with STFT, comparing phase and amplitude differences to guess at their origin in space, then clustering nearby points and reconstructing each signal from only those STFT components.
May 12th, 2012 at 3:00 am
Hi,
Your demonstration here is just great, I really appreciate it. Thanks a lot!
But a simple problem arises when I try to use fastica in Matlab. Here is the code:
>> w1=wavread('C:\Users\BillyChan\Desktop\1.wav');
>> w2=wavread('C:\Users\BillyChan\Desktop\2.wav');
>> mix1=0.8*w1+0.2*w1;
>> mix2=0.8*w1+0.2*w1;
>> test=mix1';
>> test=[test ; mix2'];
>> [icasig, A, W]=fastica(test,'numOfIC',2);
What I want to do here is separate the two independent signals, i.e. w1 and w2, from the mixed signal, just as you did in this demonstration. But what I actually got in icasig is only one signal, not two. What happened? Am I doing the right thing? I would really appreciate it if you could give me some hints.
Thanks a lot!
May 12th, 2012 at 11:02 am
Well, as you wrote it, mix1 and mix2 are both identical to w1, so there’s nothing to separate. It should be something like
mix1=0.8*w1+0.2*w2; mix2=0.1*w1+0.7*w2;
Both sources should be in both mixes, but at different levels.
May 13th, 2012 at 7:08 am
Oops…I made a mistake in the code. >_<
Thanks a lot!