Android audio applications

Signal generator 1.0.2

  • Outputs sine, pink noise, and white noise
  • Volume and frequency sliders
    • Controls are hard to grab at the edges
    • Volume control is independent from the phone’s volume control
    • Doesn’t go below 100 Hz
  • Sine waves are distorted and badly aliased at high frequencies.  THD+N measurements:
    • 100 Hz: 0.149%
    • 1 kHz: 0.08%
    • 10 kHz: 7.1%
    • 20 kHz: 20.15%
    • Spectrum for 997 Hz: [figure: signal generator 997 Hz]
  • Although the frequency control says “1.00 kHz” at startup, it’s actually playing 765 Hz.  Frequencies are correct once you start moving the slider around, though.
  • White noise level is higher than sine wave — sine wave never reaches the peak output of the phone, even at “0 dB”
  • There doesn’t seem to be any way to enter levels other than 0 dB manually, since it doesn’t let you type a minus sign.
  • White noise sounds like it’s repeating every 1.4 seconds
  • Noise is the same in both channels
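For comparison, THD+N figures like the ones above can be roughly estimated in software from a captured recording. The following is a minimal sketch, not how the measurements in this post were made: it assumes the capture is already a NumPy array, that the test frequency is known, and it defines THD+N as the RMS of everything except the fundamental divided by the total RMS. The function name `thd_plus_n` and the synthetic test tone are made up for illustration.

```python
import numpy as np

def thd_plus_n(signal, fs, f0):
    """Rough THD+N: RMS of everything except the fundamental, over total RMS."""
    t = np.arange(len(signal)) / fs
    # Least-squares fit of the fundamental (sine and cosine parts) plus a DC offset
    basis = np.column_stack([
        np.sin(2 * np.pi * f0 * t),
        np.cos(2 * np.pi * f0 * t),
        np.ones_like(t),
    ])
    coeffs, *_ = np.linalg.lstsq(basis, signal, rcond=None)
    residual = signal - basis @ coeffs      # harmonics + noise
    ac_signal = signal - coeffs[2]          # total signal with DC removed
    return np.sqrt(np.mean(residual ** 2) / np.mean(ac_signal ** 2))

# Synthetic check: 997 Hz tone with a third harmonic 40 dB down (1 %)
fs = 48000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 997 * t) + 0.01 * np.sin(2 * np.pi * 3 * 997 * t)
print("THD+N: %.3f%%" % (100 * thd_plus_n(tone, fs, 997)))
```

The synthetic tone should come out very close to 1%. A real analyzer also band-limits the measurement and tracks the fundamental frequency itself, so numbers from a sketch like this are only a sanity check.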

Frequency generator 200909150

  • Outputs sine, square, triangle, sawtooth
  • Playing more than one sine wave causes clipping, but you can decrease the phone’s volume control
  • Default setting is 440, 444, and 448 Hz, but when played together, the waveform changes abruptly once per second
  • Sine wave THD+N:
    • 40 Hz: 1.00%
    • 440 Hz: 0.056%
    • 10 kHz: 0.051%
    • 20 kHz: 0.222%
  • I don’t know how the square, triangle and sawtooth are generated, but it ain’t right.
    • This is what the “square wave” looks like at 10 kHz: [figure: freq gen square 10 kHz]
    • And this is the spectrum: [figure: frequency generator 10 kHz square]
    • Nooot even close

(Measured using an ExtUSB headphone cable, Adobe Audition, and Audio Precision.)

A simple FastICA example

Wikipedia describes independent component analysis as “a computational method for separating a multivariate signal into additive subcomponents supposing the mutual statistical independence of the non-Gaussian source signals”. (Clearly, this was written as part of their campaign to make technical articles accessible.)

In normal people words, ICA is a form of blind source separation — a method of unmixing signals after they have been mixed together, without knowing exactly how they were mixed. It’s not as bad as Wikipedia makes it sound. It’s just the signal processing equivalent of this:

One of the problems I always have with learning stuff like this is the lack of clear examples. They exist, but they’re not generally very good. (And why do researchers always work with awful noisy 3-second 8 kHz recordings?) So, upon getting working results, I wrote up this little example.  This is in Python and requires the MDP (python-mdp in Ubuntu) and Audiolab packages (sudo easy_install scikits.audiolab).

ICA needs at least as many different recordings of the mixture as there are signals you want to unmix. So if you have two musical instruments playing together in a room, and want to unmix them to get separate recordings of each individual instrument, you’ll need two different recordings of the mixture to work with (like a stereo microphone). If you have three instruments playing together, you’ll need three microphones to separate out all three original signals, etc. So, first, create the mix:

  1. Find or make two mono sound files. I just used clips of music.
  2. Mix them together to a stereo track, with both sounds mixed into both channels, but with each panned a little differently, so the two channels are not identical. They should sound all jumbled together, but the left channel should sound slightly different from the right.
  3. Save in a format that libsndfile can read, like FLAC or WAV (not mp3):
    • Mixed music
    • [audio:http://www.endolith.com/wordpress/wp-content/uploads/2009/11/Mixed-NIN-and-Mazzy-Star.mp3]

Alternatively, just mix them in Python:

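from scikits.audiolab import wavread

# Load two mono files; they need the same sample rate and length
sig1, fs1, enc1 = wavread('file1.wav')
sig2, fs2, enc2 = wavread('file2.wav')
# Each mixed channel contains both sources, in different proportions
mixed1 = sig1 + 0.5 * sig2
mixed2 = sig2 + 0.6 * sig1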
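In matrix terms, the two lines of arithmetic above are an instantaneous linear mix: each mixed channel is a weighted sum of the sources. Here is a small sketch of the same thing with NumPy only, using synthetic stand-in signals (a sine and a sawtooth, both made up for illustration) instead of audio files. It shows that if you did know the mixing matrix, unmixing would just be a matrix inverse; ICA's job is to estimate that inverse without being told the matrix.

```python
import numpy as np

# Synthetic stand-ins for the two mono files (hypothetical test signals)
fs = 8000
t = np.arange(fs) / fs
sig1 = np.sin(2 * np.pi * 440 * t)
sig2 = 2 * ((7 * t) % 1) - 1              # sawtooth

sources = np.column_stack([sig1, sig2])   # shape (samples, 2)

# Row i of A holds the gain of each source in mixed channel i
A = np.array([[1.0, 0.5],
              [0.6, 1.0]])
mixed = sources @ A.T                     # columns are mixed1 and mixed2

# Knowing A, unmixing is a matrix inverse
recovered = mixed @ np.linalg.inv(A).T
print(np.allclose(recovered, sources))
```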

So now you have the mixed signals, and you can pretend you don’t know how they were mixed. To unmix them automatically, run something like this in Python:

from mdp import fastica
from scikits.audiolab import flacread, flacwrite
from numpy import abs, max

# Load in the stereo file
recording, fs, enc = flacread('mix.flac')

# Perform FastICA algorithm on the two channels
sources = fastica(recording)

# The output levels of this algorithm are arbitrary, so normalize them to 1.0.
sources /= max(abs(sources), axis = 0)

# Write back to a file
flacwrite(sources, 'sources.flac', fs, enc)

The output has each signal in its own channel:

You can hear some crosstalk, but it’s pretty good:

[audio:http://www.endolith.com/wordpress/wp-content/uploads/2009/11/Unmixed-Mazzy.mp3]
[audio:http://www.endolith.com/wordpress/wp-content/uploads/2009/11/Unmixed-NIN.mp3]
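To give a feel for what a FastICA routine does internally, here is a compact NumPy-only sketch of the symmetric fixed-point iteration with a tanh nonlinearity, run on synthetic signals. This is an illustration under my own assumptions, not MDP's implementation: the names `fastica_sketch` and `sym_decorrelate`, the test signals, and the mixing gains are all made up for the example.

```python
import numpy as np

def sym_decorrelate(w):
    """Make the rows of w orthonormal: w <- (w w^T)^(-1/2) w."""
    s, u = np.linalg.eigh(w @ w.T)
    return (u / np.sqrt(s)) @ u.T @ w

def fastica_sketch(x, n_iter=200, tol=1e-9, seed=0):
    """x has shape (samples, channels); returns estimated sources, same shape."""
    rng = np.random.default_rng(seed)
    x = x - x.mean(axis=0)
    # Whiten: rotate and scale so the channels are uncorrelated with unit variance
    eigvals, eigvecs = np.linalg.eigh(np.cov(x, rowvar=False))
    z = x @ (eigvecs / np.sqrt(eigvals))
    n = z.shape[1]
    w = sym_decorrelate(rng.standard_normal((n, n)))
    for _ in range(n_iter):
        y = z @ w.T                      # current source estimates
        g = np.tanh(y)
        g_prime = 1.0 - g ** 2
        # Fixed-point update for every row of w at once
        w_new = (g.T @ z) / len(z) - g_prime.mean(axis=0)[:, None] * w
        w_new = sym_decorrelate(w_new)
        # Converged when every row points the same way as before (up to sign)
        change = np.max(np.abs(np.abs(np.sum(w_new * w, axis=1)) - 1.0))
        w = w_new
        if change < tol:
            break
    return z @ w.T

# Hypothetical test: a sine and a sawtooth, mixed into two channels
fs = 8000
t = np.arange(fs) / fs
sources = np.column_stack([np.sin(2 * np.pi * 440 * t), 2 * ((7 * t) % 1) - 1])
mixed = sources @ np.array([[1.0, 0.5], [0.6, 1.0]]).T

estimated = fastica_sketch(mixed)
estimated /= np.max(np.abs(estimated), axis=0)   # output scale is arbitrary
```

The order and sign of the recovered channels are arbitrary, just like their levels, so comparing against the originals has to be done by correlation rather than by position.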

For more than two sources, I just read them in separately and combined them in Python:

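from numpy import array

rec1, fs, enc = flacread('Mixdown (1).flac') # Mono file
rec2, fs, enc = flacread('Mixdown (2).flac')
rec3, fs, enc = flacread('Mixdown (3).flac')

# Stack into one array with a column per recording, like the stereo file above
sources = fastica(array([rec1,rec2,rec3]).transpose())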

flacwrite() has no problem writing multi-channel files.

Mixed speech:

[audio:http://www.endolith.com/wordpress/wp-content/uploads/2009/11/Mix.mp3]

After demixing, there’s very little crosstalk, though the noise floor increases considerably.  This seems to be the case when the mixes are very similar:

[audio:http://www.endolith.com/wordpress/wp-content/uploads/2009/11/Source-1.mp3] [audio:http://www.endolith.com/wordpress/wp-content/uploads/2009/11/Source-2.mp3] [audio:http://www.endolith.com/wordpress/wp-content/uploads/2009/11/Source-3.mp3]
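One plausible explanation for the raised noise floor, offered as an assumption rather than something measured here: when the mixes are nearly identical, the mixing matrix is close to singular, so its inverse has large entries and amplifies any noise that is independent in each mixed channel (quantization or lossy encoding, for instance). The sketch below compares two made-up 2×2 mixing matrices; `noise_gain` is a hypothetical helper, not part of MDP.

```python
import numpy as np

def noise_gain(mixing):
    """Factor by which independent, equal-level noise in each mixed channel
    grows in each unmixed output, when unmixing with the exact inverse."""
    unmixing = np.linalg.inv(mixing)
    return np.sqrt(np.sum(unmixing ** 2, axis=1))

distinct = np.array([[1.0, 0.5], [0.6, 1.0]])    # hypothetical, clearly different mixes
similar = np.array([[1.0, 0.9], [0.95, 1.0]])    # hypothetical, nearly identical mixes

for name, mixing in [("distinct", distinct), ("similar", similar)]:
    gains_db = 20 * np.log10(noise_gain(mixing))
    print(name, np.round(gains_db, 1), "dB")
```

The gains are relative to the sources being recovered at their original level, so a larger number means a worse signal-to-noise ratio in the unmixed output.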

Although this method was recommended to me for real-life audio signals and microphones, as I’ve described above, it turns out that ICA doesn’t actually work well when the signals arrive with different delays in the different sensor channels. It assumes instantaneous mixing: that the signals are in perfect sync with each other in all the different recordings.  Delays are unavoidable in a real-life situation with performers and microphones, since each source is a different distance from each microphone. That is exactly the application I had in mind, though, so I don’t really have any further interest in ICA…
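The effect of delay can be illustrated without running ICA at all. The sketch below cheats by using the true sources to find the best possible unmixing matrix by least squares, then checks how much error is left. With instantaneous mixing the error is essentially zero; with the crosstalk delayed by 20 samples (about 14 cm of extra path at 48 kHz), no single matrix can undo the mix. The signals, gains, and the helper `best_unmix_error` are assumptions made up for the example, and white noise is close to a worst case: real music is more self-similar over short delays, so it would suffer less.

```python
import numpy as np

def best_unmix_error(mixed, sources):
    """Relative RMS error of the best possible instantaneous (matrix) unmixing,
    found by least squares with the true sources known."""
    w, *_ = np.linalg.lstsq(mixed, sources, rcond=None)
    residual = mixed @ w - sources
    return np.sqrt(np.mean(residual ** 2) / np.mean(sources ** 2))

rng = np.random.default_rng(0)
s1, s2 = rng.laplace(size=(2, 48000))    # hypothetical broadband sources
sources = np.column_stack([s1, s2])

# Instantaneous mixing: both sources arrive in sync in both channels
instant = np.column_stack([s1 + 0.5 * s2, s2 + 0.6 * s1])

# Same gains, but the crosstalk arrives 20 samples late
# (circular shift, for simplicity)
delay = 20
delayed = np.column_stack([s1 + 0.5 * np.roll(s2, delay),
                           s2 + 0.6 * np.roll(s1, delay)])

print("instantaneous:", best_unmix_error(instant, sources))
print("delayed:      ", best_unmix_error(delayed, sources))
```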