[clug] Audio file formats

Tue May 20 07:09:48 GMT 2008

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Kristy A. Bennett wrote:
| Okay, CLUG-speak hasn't changed.  I understood about 56% of what you
| were trying to convey as usual Paul!
|
| Paul Wayper wrote:
|> So can you tell us what you're doing this for and what your end user
|> will have to play the file back?  The more I understand what you're
|> trying to do the better my advice will be :-)
| I have had some requests from my business consultancy clients for some
| materials that they can refer back to particularly on a range of topics
| from IT issues, web security (or moreover web insecurity) through to
| strategic HRM & marketing.  At the moment I provide hard copies of
| information to them but there is an obvious demand out there for voice.
| Many discussions have oriented around acquiring further information
| through audio as it doesn't take as long to digest and can go into more
| depth over the same time frame required to read a paper.  Add to that
| the commute, morning walk or gym application and it's past the post.  At
| this point I am looking for the best possible means to provide both,
| what is in effect, a subscribed podcast with weekly releases as well as
| CD's for them to throw in the car.

OK, well this sounds like you need a reasonable quality podcasting setup.  If
this is a commercial venture, you should be looking to buy some reasonable
equipment for it.  I'd put together a list similar to:

* One or two reasonable quality voice microphones with stands.  Shure, Alesis
and Yamaha are good brands.
* One four-track mixing board with at least two microphone inputs.  There are
quite good small mixers made by Behringer under the Eurodesk brand.
* One decent quality sound import/export box - preferably a USB or Firewire
box that does at least 48kHz in 16-bit stereo.  I like the look of the Edirol
UA-1EX but haven't coughed up the money for one yet.
* Cables to tie it all together.
* More scratch hard disk space for all your audio.

1) Make sure your room is nice and quiet, and you (and your co-presenter or
guest) have comfortable chairs close to microphones.
2) Set your levels so that the loudest sound the microphone is likely to hear
peaks _just_ below -3dB, that is, just under the red zone on the level meter.
3) Record the sound in Audacity at the highest sampling rate and bit depth
your sound box supports.
4) Save the audio as a rough cut now.
5) Process your sound to improve its quality.  At the very least you should
normalise your sound.  You can also use the "Noise Removal..." effect in
Audacity if you've got background noise - just select a section of audio where
that background noise is the only sound and set that as the noise profile,
then select the whole recording and run the noise removal on it.  You can also
cut out those unnecessary 'um's and pauses if you're doing a question and
answer session.  Trim the head and tail and put your intro and outro on.
6) Save that as a WAV file in 44100Hz 16-bit stereo for the CD.
7) Use LAME to produce a high-quality and a small-size MP3 file of it for your
website.  To get these I would recommend using:

lame --abr 160 -q 1 -b 32 -B 320 -c "${file}.wav" "${file}-high.mp3"
lame --abr 16 -a -q 2 -b 8 -B 160 --resample 11025 -c "${file}.wav" "${file}-low.mp3"
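
For a weekly release it may be easier to wrap the two commands in a small
shell function.  The following is only a sketch: the names mp3_names and
encode_episode are made up for illustration, it assumes lame is on your PATH,
and it assumes the input file name ends in .wav.

```shell
# Print the two output names for an input WAV, one per line.
# (Hypothetical helper, not part of LAME.)
mp3_names() {
    base="${1%.wav}"
    printf '%s\n' "${base}-high.mp3" "${base}-low.mp3"
}

# Encode one WAV file with the same LAME settings as above.
# The low-quality encode only runs if the high-quality one succeeded.
encode_episode() {
    wav="$1"
    base="${wav%.wav}"
    lame --abr 160 -q 1 -b 32 -B 320 -c "$wav" "${base}-high.mp3" &&
        lame --abr 16 -a -q 2 -b 8 -B 160 --resample 11025 -c "$wav" "${base}-low.mp3"
}

# Usage: encode_episode episode01.wav
```

Quoting the file names means episode titles with spaces in them won't trip up
the shell.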

On my test piece of music, we have:
file.wav: 72MB
file-high.mp3: 7.4MB = ~10x reduction.
file-low.mp3: 798KB = ~90x reduction.

The latter is still quite clear and would be perfectly listenable on an iPod.
I've had phone conversations that have sounded worse.
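
To check the reduction figures for your own files, a little shell arithmetic
does the job.  This is a sketch only: the helper name ratio is made up, and the
sizes are the rounded ones quoted above, taking 72MB as 72000KB.

```shell
# Print how many times smaller the encoded file is than the original.
# Both sizes must be in the same unit (KB here).
ratio() {
    awk -v orig="$1" -v enc="$2" 'BEGIN { printf "%.1f\n", orig / enc }'
}

ratio 72000 7400   # WAV vs file-high.mp3
ratio 72000 798    # WAV vs file-low.mp3
```

With those rounded sizes it prints 9.7 and 90.2.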

If you want to go with 'standard' WAV files, then I would do the following:

sndfile-resample -to 11025 -c 4 "${file}.wav" /tmp/tempfile.wav
sndfile-convert -ms-adpcm /tmp/tempfile.wav "${file}-low.wav"
rm /tmp/tempfile.wav

(/tmp/tempfile.wav is used because the sndfile tools cannot read from stdin or
write to stdout.  You can safely use linear interpolation (-c 4) here instead
of the better quality -c 0 because going from 44100Hz to 11025Hz just throws
out three out of four samples, and no actual re-engineering of the waveform
need take place.)
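
If several people share the machine, or you run two conversions at once, a
fixed name in /tmp can collide.  A possible variation, assuming your system has
mktemp (the function name make_low_wav is made up for this example):

```shell
# Resample to 11025Hz and convert to MS ADPCM via a private temp directory.
# Arguments: input.wav output.wav
make_low_wav() {
    in="$1"
    out="$2"
    tmpdir=$(mktemp -d) || return 1
    sndfile-resample -to 11025 -c 4 "$in" "$tmpdir/resampled.wav" &&
        sndfile-convert -ms-adpcm "$tmpdir/resampled.wav" "$out"
    status=$?
    # Clean up whether or not the conversion worked.
    rm -rf "$tmpdir"
    return "$status"
}

# Usage: make_low_wav episode01.wav episode01-low.wav
```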

On the above file, this results in:

file-low.wav: 4.6MB = ~16x reduction.

This is, however, noticeably inferior in quality - the sound has fewer high
notes and a more noticeable crackle to it.  But, if people can't play MP3
files or you're unwilling to write them, then this is your best bet.

Of course, for Ogg Vorbis encoding at the same bit rates I would use:

oggenc -b 160 "${file}.wav" -o "${file}-high.ogg"
oggenc -b 16 --downmix --resample 11025 "${file}.wav" -o "${file}-low.ogg"

These give:

file-high.ogg: 6.9MB = ~10x reduction
file-low.ogg: 966KB = ~80x reduction

At a given bit rate, Ogg Vorbis generally gives better sound quality than MP3.

What was I procrastinating about doing now?  Hope this helps, anyway :-)

Have fun,

Paul

P.S. Don't forget to set the artist, album, title and similar tags in the MP3
or Ogg files.
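
Both encoders can write the tags while encoding.  A sketch, with made-up
function names and placeholder tag values; it uses LAME's --tt/--ta/--tl
options and oggenc's -t/-a/-l options for title, artist and album:

```shell
# Encode a WAV to MP3 and set title, artist and album tags.
# Arguments: input.wav output.mp3 title artist album
encode_tagged_mp3() {
    lame --abr 160 -q 1 --tt "$3" --ta "$4" --tl "$5" "$1" "$2"
}

# The same for Ogg Vorbis.
# Arguments: input.wav output.ogg title artist album
encode_tagged_ogg() {
    oggenc -b 160 -t "$3" -a "$4" -l "$5" "$1" -o "$2"
}

# Usage (placeholder values):
# encode_tagged_mp3 episode01.wav episode01.mp3 "Episode 1" "Your Name" "Your Podcast"
# encode_tagged_ogg episode01.wav episode01.ogg "Episode 1" "Your Name" "Your Podcast"
```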
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFIMnk8u7W0U8VsXYIRAiimAKDR3tWDNVDw0oTq2iboT+eQuhAopACeLqIw
nS1ry4eHDMZGakla1KTqCPQ=
=Hnij
-----END PGP SIGNATURE-----