Wikipedia describes independent component analysis as “a computational method for separating a multivariate signal into additive subcomponents supposing the mutual statistical independence of the non-Gaussian source signals”. (Clearly, this was written as part of their campaign to make technical articles accessible.)
In normal people words, ICA is a form of blind source separation — a method of unmixing signals after they have been mixed together, without knowing exactly how they were mixed. It’s not as bad as Wikipedia makes it sound. It’s just the signal processing equivalent of this:
One of the problems I always have with learning stuff like this is the lack of clear examples. They exist, but they’re not generally very good. (And why do researchers always work with awful noisy 3-second 8 kHz recordings?) So, upon getting working results, I wrote up this little example. This is in Python and requires the MDP (python-mdp in Ubuntu) and Audiolab (sudo easy_install scikits.audiolab) packages.
In order for ICA to work, it requires at least one different recording for each signal you want to unmix. So if you have two musical instruments playing together in a room, and want to unmix them to get separate recordings of each individual instrument, you’ll need two different recordings of the mixture to work with (like a stereo microphone). If you have three instruments playing together, you’ll need three microphones to separate out all three original signals, etc. So, first, create the mix:
- Find or make two mono sound files. I just used clips of music.
- Mix them together to a stereo track, with both sounds mixed into both channels, but with each panned a little differently, so the two channels are not identical. They should sound all jumbled together, but the left channel should sound slightly different from the right.
- Save in a format that libsndfile can read, like FLAC or WAV (not mp3):
- Mixed music
Alternatively, just mix them in Python:
sig1, fs1, enc1 = wavread('file1.wav')
sig2, fs2, enc2 = wavread('file2.wav')
mixed1 = sig1 + 0.5 * sig2
mixed2 = sig2 + 0.6 * sig1
So now you have the mixed signals, and you can pretend you don’t know how they were mixed. To unmix them automatically, run something like this in Python:
from mdp import fastica
from scikits.audiolab import flacread, flacwrite
from numpy import abs, max

# Load in the stereo file
recording, fs, enc = flacread('mix.flac')

# Perform FastICA algorithm on the two channels
sources = fastica(recording)

# The output levels of this algorithm are arbitrary, so normalize them to 1.0.
sources /= max(abs(sources), axis=0)

# Write back to a file
flacwrite(sources, 'sources.flac', fs, enc)
The output has each signal in its own channel:
You can hear some crosstalk, but it’s pretty good:
For more than two sources, I just read them in separately and combined them in Python:
rec1, fs, enc = flacread('Mixdown (1).flac')  # Mono file
rec2, fs, enc = flacread('Mixdown (2).flac')
rec3, fs, enc = flacread('Mixdown (3).flac')
sources = fastica(array([rec1, rec2, rec3]).transpose())
flacwrite() has no problem writing multi-channel files.
After demixing, there’s very little crosstalk, though the noise floor increases considerably. This seems to be the case when the mixes are very similar.
Although this method was recommended to me for real-life audio signals and microphones, as I’ve described above, it turns out that ICA doesn’t actually work well when the signals occur at different delays in the different sensor channels; it assumes instantaneous mixing (that the signals are in perfect sync with each other in all the different recordings). Delay would happen in a real-life situation with performers and microphones, since each source is a different distance from each microphone. This is exactly the application I had in mind, though, so I don’t really have any further interest in ICA…
As you said, it unmixes the signals into statistically independent components only when we have two sources mixed together.
Can’t this unmix an EEG signal which contains ECG or EMG artifacts in it?
I searched through many IEEE papers, but no one illustrated a proper way to solve this problem. Kindly publish another example showing ECG or EMG artifact removal from EEG data.
I’d imagine that taking multiple channels of EEG and then demixing them with ICA would separate out the different parts you want, but I’m not an expert on this stuff, sorry. It might not be the right kind of problem for ICA to solve. Search for “blind source separation” instead of ICA to see if there are better algorithms.
I was just trying out some similar examples of using fastica and I found your site. Totally agree about the lack of examples.
Anyway, the thing I’m wondering is why you need N channels to produce N separated signals. Why can’t ICA just work on one signal and separate out the components? Seems a bit roundabout to have to copy a second channel and then pan it.
I want to build an app that takes in a song and separates out three signals (vocals, melody, and percussion). It would be very useful to DJs because pretty much no such software exists currently.
How would it know what the two signals were if it only had one to work with? It uses the difference between the two mixed signals to figure out what the two original signals were.
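To make that concrete, here is a toy sketch (the sources and mixing levels are made up): with two mixes of two sources, the mixing can be written as a 2×2 matrix, and unmixing amounts to finding its inverse. ICA has to estimate that inverse blindly; here we cheat and invert the known matrix just to show the structure of the problem.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 8000)
s1 = np.sign(np.sin(2 * np.pi * 5 * t))   # square-ish source
s2 = rng.laplace(size=t.size)             # noisy non-Gaussian source
S = np.c_[s1, s2]                         # one column per source

A = np.array([[1.0, 0.5],
              [0.6, 1.0]])                # mixing matrix (hypothetical levels)
X = S @ A.T                               # two mixed "channels"

# With A known, unmixing is just inversion; ICA's job is estimating
# this inverse from X alone, using the statistics of the mixes.
S_hat = X @ np.linalg.inv(A).T
print(np.allclose(S_hat, S))              # → True
```

With only one mix (one column of X), the matrix is not invertible and there is nothing for the algorithm to compare.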
I guess I should read more about it… I thought by doing a form of PCA it was separating out frequencies that covaried in amplitude together. I didn’t realize that it’s doing some kind of comparison between sensors.
It’s definitely possible to do this with just one signal, but maybe ICA is not what I’m looking for. I suppose it would work in the same way your brain separates out a conversation from a bunch of people talking in the same room (granted, you have two ears to sample from, but I’m pretty sure you would still have the ability if you were deaf in one ear).
lol I just tried this with Biggie’s “Juicy” and it separates Puff Daddy saying “uh uh yea thats right” really annoyingly into one channel and everything else into the other channel. It sounds hilarious.
Well, in order to extract two signals from one signal, you need a model of what type of signal to expect. If one signal is all low frequencies and the other all high frequencies, you could separate them with a simple filter, for instance. But if you don’t know anything specific about the signals, you’re not going to be able to separate them.
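For instance, the low/high frequency case can be sketched with a simple crossover filter (the frequencies and cutoff here are made up):

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 8000
t = np.arange(fs) / fs
low = np.sin(2 * np.pi * 100 * t)    # low-frequency source
high = np.sin(2 * np.pi * 3000 * t)  # high-frequency source
mixed = low + high                   # single-channel mix

# A 1 kHz lowpass separates them because we *know* the model:
# one source lives below the cutoff, the other above it.
b, a = butter(4, 1000 / (fs / 2))
low_est = filtfilt(b, a, mixed)      # zero-phase filtering
high_est = mixed - low_est

print(np.corrcoef(low, low_est)[0, 1])    # close to 1
print(np.corrcoef(high, high_est)[0, 1])  # close to 1
```

Without that prior knowledge of the signals, a single channel gives you nothing to separate on.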
Yes, you could do this the way humans do it. All you’d have to do is write software to simulate a human brain. I’d be very interested in getting a copy of this, if you do it. 😀
Hi, I am trying to use your code on a real-world recording. I use two mono microphones to record simultaneously into a stereo mic input.
So the left channel of the audio file now contains the signal from mic 1 and the right channel from mic 2.
I want to ask how should I prepare the file so that “Mix them together to a stereo track, with both sounds mixed into both channels, but with each panned a little differently”?
I think I used Adobe Audition and mixed the two tracks together to a stereo mix, with each panned differently. Then Left and Right both contain both signals, but not at the same levels.
What does it mean by “with each panned differently”?
Do you think it should work on real-world recording with one microphone for each channel?
I mean, for instance, that Mic 1 is 100% in Left channel, and 50% in Right channel, while Mic 2 is 100% in Right channel, and 50% in Left channel. You need both signals in both channels, but not at the same level.
Yes it should work fine for real-world microphone recordings, as long as they were recorded independently and mixed without any delay. Mixing them like this is not realistic, though. In real life, if you are recording 2 sources with 2 microphones at the same time, there will be slight delay differences between the microphones, which ICA does not handle as well.
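The instantaneous-mixing assumption can be illustrated with a toy sketch (tone frequency and delay are made up): a delayed copy of a source is no longer just a scaled copy, so no single unmixing matrix can undo it.

```python
import numpy as np

fs = 8000
s = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)  # a source signal

# Instantaneous mixing: a channel sees the source with zero delay,
# just at a different level. This is what ICA assumes.
x_inst = 0.7 * s

# Real microphones: each source arrives with a different delay.
# (np.roll is a stand-in for a true delay; it is exact here because
# an integer number of cycles fits in the buffer.)
delay = 20                            # samples, ~2.5 ms of extra path
x_real = 0.7 * np.roll(s, delay)

print(np.corrcoef(s, x_inst)[0, 1])   # essentially 1
print(np.corrcoef(s, x_real)[0, 1])   # noticeably below 1
```

The delayed channel decorrelates from the source, which is why plain FastICA degrades on real multi-microphone recordings.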
Don’t give up so fast: there are some modified ICA algorithms that do work.
I’ve since discovered the DUET algorithm, which seems very similar to my original idea: Breaking up the two signals with STFT and comparing phase and amplitude differences to guess at their origin in space, and then cluster nearby points and reconstruct the signal from only those STFT components.
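A toy sketch of the DUET idea, using level differences only (the full algorithm also clusters phase/delay differences and uses a 2D histogram rather than a hard threshold; all signals and parameters here are made up):

```python
import numpy as np
from scipy.signal import stft, istft

fs = 8000
t = np.arange(2 * fs) / fs
s1 = np.sin(2 * np.pi * 440 * t)       # source 1: 440 Hz tone
s2 = np.sin(2 * np.pi * 1500 * t)      # source 2: 1500 Hz tone

# Two mixes with different level ratios (instantaneous, for simplicity)
x1 = 1.0 * s1 + 0.3 * s2
x2 = 0.3 * s1 + 1.0 * s2

f, frames, X1 = stft(x1, fs, nperseg=512)
_, _, X2 = stft(x2, fs, nperseg=512)

# The per-bin amplitude ratio between channels hints at which
# source dominates each time-frequency point.
ratio = np.abs(X1) / (np.abs(X2) + 1e-12)
mask1 = ratio > 1                      # bins where source 1 dominates

# Keep only each source's bins and resynthesize
_, y1 = istft(X1 * mask1, fs, nperseg=512)
_, y2 = istft(X2 * ~mask1, fs, nperseg=512)

n = min(len(y1), len(s1))
print(np.corrcoef(s1[:n], y1[:n])[0, 1])  # close to 1
print(np.corrcoef(s2[:n], y2[:n])[0, 1])  # close to 1
```

This works because the two tones rarely occupy the same STFT bins; DUET makes the same sparsity assumption about speech and music.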
Your demonstration here is just great, I really appreciate it. Thanks a lot!
But a simple problem arises when I try to use fastica in Matlab. Here is the code:
>> test = [test; mix2'];
>> [icasig, A, W] = fastica(test, 'numOfIC', 2);
Here what I want to do is to separate the two independent signals, i.e. w1 and w2, from the mixed signal, just as in this demonstration. But what I really got in icasig is only one signal, not two. What happened? Am I doing the right thing? I would really appreciate it if you can give me some hints.
Thanks a lot!
Well, as you wrote it, mix1 and mix2 are both identical to w1, so there’s nothing to separate. It should be something like
mix1 = 0.8*w1 + 0.2*w2;
mix2 = 0.1*w1 + 0.7*w2;
Both sources should be in both mixes, but at different levels.
Oops…I made a mistake in the code. >_<
Thanks a lot!
I’m a PhD student in computer science, and I’m working on blind source separation. I read about the SOBI algorithm, but I have some difficulties understanding it. If you know this algorithm, can you help me understand how to get the covariance matrix, please?
Just this question
Hi, we are working in Matlab with a FastICA code (similar to the one that Billy Chan used, I suppose), and we’re having some problems with the output signal.
[icasig] = fastica(mistura', 'numofIC', 2);
Our output signal is not similar to any of the expected outputs. And we get the following warnings:
Warning: Data clipped during write to file:sinal2.wav
> In wavwrite>PCM_Quantize at 280
In wavwrite>write_wavedat at 302
In wavwrite at 139
Do you know what’s the problem?
Thanks a lot!
It says “data clipped”, so I would guess that your data is clipped. 🙂 Did you normalize the signal level before writing to the wav file?
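The fix is the same normalization step as in the post’s own snippet: scale the output so its peak is within [-1, 1] before writing. A sketch with a made-up ICA output array:

```python
import numpy as np

# Pretend ICA output: one column per separated source, with
# arbitrary levels that exceed full scale
sources = np.array([[1.7, -0.2],
                    [-2.3, 0.9],
                    [0.4, -1.1]])

# Normalize each channel's peak to 1.0 so the wav writer doesn't clip
sources = sources / np.max(np.abs(sources), axis=0)

print(np.max(np.abs(sources), axis=0))  # → [1. 1.]
```

PCM wav files can only represent values in [-1, 1]; anything beyond that gets clipped, which is exactly what the warning says.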
hi every one..
I am doing my final year project on the FastICA algorithm. Can anyone please give me the program for FastICA in Matlab?
It would be very helpful…
I’ve managed to get working code for ICA on Matlab, but what would be the main alteration from ICA to PCA, or preferably does anyone have example code for PCA, so I can compare the two?
Hi, Thank you for your good information.
The Video is wonderful Concept to me.
I have only one mixed signal.
But is it possible to do ICA analysis in that case?
So, can I separate out the original signal??
I hope so. ^^;
Thanks in advance.
No, you need more than one mix of the signal for ICA.
Great example! I use (an adapted version of) it in a signal processing class I teach. Just reading through the comments the first one from Puneet Mishra struck me as pretty hilarious at first (the guy pretty much asks the author to do his PhD for him), but then I realized some of the people reading this may not know that the subject of “unmixing signals” is actually a very nontrivial problem. Please be aware that all of this is very much an active area of research and much is unknown.
Basically, ICA, PCA and all other linear techniques only work in very specific, often artificial, examples. Everybody knows for instance that you cannot “unmix” your milk from your coffee by stirring backwards. The reason it works in the YouTube video above is that they have a very viscous fluid (the video mentions that the Reynolds number is < 1 and they have laminar flow, the fluid dynamics term for linearity).
This is also the reason that the demo above works, but that it does not work for the practical purpose the author had in mind (unmixing real world recordings).
As for Puneet's question: EEG is an incredibly complex multivariate recording of a nonlinear source with unknown (high) dimensionality. How ECG and EMG artifacts mix into those recordings is very complex, nonlinear and largely unknown. The artifacts can perhaps be filtered out (partially) by taking into account the specific time and frequency characteristics ("patterns") of the ECG and EMG, but that will require advanced, custom-made algorithms. Out-of-the-box ICA will definitely not work.
Hi, how do I get the matrix W from the FastICANode algorithm in Python? I want to know the matrix W, but I don’t know how to get this array in Python. Thanks
Hi, how can one use ICA for signal denoising and dimension reduction? Thanks
Hi, how can one use ICA for signal denoising and dimension reduction? Please help with how to write the code in Matlab. I am a new user.
Really informative piece.
I have implemented ICA algorithm using maximum likelihood estimate. I used your provided .flac file and was able to separate the sources.
I have following questions for you.
1) How did you generate the .flac/.wav file? I tried doing so by getting 2 mp3 files and converting them to .wav files, but each of the .wav files had multiple channels and they weren’t the same length. I guess we can clip the length, but is it fine to just pick one of the many channels?
It says how I made it in the text right above it.
You can also just mix them in your matlab/python software, as I also described above it.
There is another algorithm which uses second-order statistics: AMUSE.
Here have your implementation in Python: http://dspandmath.blogspot.com.br/2015/12/blind-source-separation-with-python.html
I am trying to get your code running with my audio files. But I get an error saying
wavwrite(np.array([mixed1, mixed2]).T, 'mixed.wav', fs2, enc2, dtype=object)
ValueError: setting an array element with a sequence.
The code works fine with your inputs. I recorded two audio files simultaneously from two mics, but I get this error. Why is it throwing an error with my audio files but not yours? The file formats are the same: fs = 44100 and enc = pcm16. Please let me know.
why does it say dtype=object?
post all of your code
The only difference is that the files are slightly different sizes. One is 1,701 kB and the other is 1,696 kB. How can I overcome this?
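One way to handle mismatched lengths, instead of editing the files externally, is to truncate both signals to the shorter length in Python (a sketch with dummy arrays standing in for wavread output):

```python
import numpy as np

sig1 = np.zeros(44100)   # stand-in for one wavread result
sig2 = np.zeros(43900)   # a slightly shorter recording

# Truncate both to the shorter length so they can be stacked
n = min(len(sig1), len(sig2))
sig1, sig2 = sig1[:n], sig2[:n]

# Now stacking produces a clean 2D float array, no dtype=object needed
mixed = np.array([sig1, sig2]).T
print(mixed.shape)       # → (43900, 2)
```

NumPy can only stack equal-length arrays into a proper 2D array; unequal lengths produce the ragged-sequence error above.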
That error got solved when I used a WAV cutter and cut both audio WAV files to the same size. I put that dtype=object by mistake; the code runs fine without it. But now I am using the wave files generated by your code and I get the following error.
File “C:\Python26\fastica.py”, line 21, in
wavwrite(np2.array([mixed1, mixed2]).T, ‘mixed3.wav’,fs1,enc1)
File “C:\Python26\lib\site-packages\scikits\audiolab\pysndfile\matapi.py”, line 47, in basic_writer
hdl = Sndfile(filename, ‘w’, uformat, nc, fs)
UnboundLocalError: local variable ‘nc’ referenced before assignment
I don’t know what that nc error is, but I’ve been using PySoundFile lately; I always had issues getting scikits.audiolab to work reliably.
Full code here:
from mdp import fastica
from scikits.audiolab import wavread, wavwrite
from numpy import abs, max
from array import array
import numpy as np2
import time

start_time = time.time()
nc = 0
sig1, fs1, enc1 = wavread('mixed1 1_2.wav')
sig2, fs2, enc2 = wavread('mixed2 1_2.wav')
mixed1 = sig1
mixed2 = sig2
#mixed = np.array([mixed1,mixed2]).T
#mixed = mixed.tolist()
#wavwrite(mixed, 'mixed.wav',fs2, enc2)
wavwrite(np2.array([mixed1, mixed2]).T, 'mixed3.wav',fs1,enc1)
#wavwrite(np.array([mixed1, mixed2]).T, 'mixed.wav',dtype = list)
# Load in the stereo file
recording, fs, enc = wavread('mixed3.wav')
# Perform FastICA algorithm on the two channels
sources = fastica(recording)
# The output levels of this algorithm are arbitrary, so normalize them to 1.0.
sources /= max(abs(sources), axis = 0)
# Write back to a file
wavwrite(sources, 'sources.wav', fs, enc)
I can mail you my audio files. My email id is email@example.com
Hey @Enno and @Endolith:
As you said, delays are different when you’re trying to do FastICA on a real-world clip, and the algorithm assumes the two recordings are in perfect sync, hence it doesn’t work. Then how come the audio clips you’re using are in perfect sync (as they’re also recordings, after all)… I am a bit confused.
I’m taking 2 different monaural recordings and then making 2 different monaural mixes of them. So the mixes contain the same material, in sync, but at different levels. The only difference is the amount of each in the mix.
Thanks for such a quick reply… Again, to clarify: what is actually in sync here?
Imagine that signal 1 is a sine wave that starts at 0, and signal 2 is an impulse at 5 seconds. Mix 1 is 0.5 times the sine wave + 2 times the impulse. Mix 2 is 3 times the sine wave + 1 times the impulse. Both mixes have the impulse at 5 seconds, but at different amplitudes, and both mixes have the sine wave starting at 0, but at different amplitudes.
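That description can be written out directly (a toy sketch; the sample rate is made up):

```python
import numpy as np

fs = 100
t = np.arange(10 * fs) / fs
sine = np.sin(2 * np.pi * t)      # signal 1: sine wave starting at t = 0
impulse = np.zeros_like(t)
impulse[5 * fs] = 1.0             # signal 2: impulse at 5 seconds

# Both events appear in both mixes at the same instants,
# only the levels differ: this is "in sync".
mix1 = 0.5 * sine + 2.0 * impulse
mix2 = 3.0 * sine + 1.0 * impulse

# Subtracting each mix's sine component leaves the impulse
# at the same sample index in both mixes
print(np.argmax(np.abs(mix1 - 0.5 * sine)))  # → 500
print(np.argmax(np.abs(mix2 - 3.0 * sine)))  # → 500
```

If one mix had the impulse arriving a few samples later (as a real microphone farther away would), the instantaneous-mixing assumption would break.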
OK… I am slowly getting the idea. Will I be able to do FastICA on two separate voice recordings after mixing them together?
And I had one more doubt: FastICA is giving only one output. What do I do if I want to get the other one?
Friends, do you know the FastICANode algorithm for Python? I used it, but I still don’t understand some things. For example, if this code needs signal whitening, what is the meaning of signal whitening?
Read this Janet