Linux Audio
From Freespire
What's Wrong With Linux Audio
The state of things
Software
The way it was (OSS)
In the beginning (of Linux audio) there was the Open Sound System (OSS), and it was good. Well, not really: it was simply better than the silence that preceded it. OSS was just a set of device files in /dev (such as /dev/dsp) with an API made up of ordinary file operations like open, read, write & ioctl.
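To make the "file operations" point concrete, here is a minimal sketch of what talking to OSS looks like from a program. It assumes a Linux system that exposes /dev/dsp (real OSS or ALSA's OSS emulation); the ioctl request numbers are the Linux values from <sys/soundcard.h>, and the helper names sine_pcm and play_via_oss are made up for this illustration.

```python
import fcntl
import math
import os
import struct

# ioctl request codes and sample format from <sys/soundcard.h> (Linux values).
SNDCTL_DSP_SPEED = 0xC0045002
SNDCTL_DSP_SETFMT = 0xC0045005
SNDCTL_DSP_CHANNELS = 0xC0045006
AFMT_S16_LE = 0x10  # signed 16-bit little-endian samples


def sine_pcm(freq_hz, seconds, rate=44100, amplitude=0.3):
    """Return mono signed 16-bit little-endian PCM for a sine tone."""
    count = int(rate * seconds)
    peak = int(32767 * amplitude)
    samples = (int(peak * math.sin(2 * math.pi * freq_hz * i / rate))
               for i in range(count))
    return b"".join(struct.pack("<h", s) for s in samples)


def play_via_oss(pcm, rate=44100, device="/dev/dsp"):
    """Configure the device with ioctl calls, then write raw samples to it."""
    # open() fails (typically with EBUSY) if another program holds the device.
    fd = os.open(device, os.O_WRONLY)
    try:
        for request, value in ((SNDCTL_DSP_SETFMT, AFMT_S16_LE),
                               (SNDCTL_DSP_CHANNELS, 1),
                               (SNDCTL_DSP_SPEED, rate)):
            fcntl.ioctl(fd, request, struct.pack("i", value))
        view = memoryview(pcm)
        while len(view) > 0:
            # write() may accept only part of the buffer; keep going.
            view = view[os.write(fd, view):]
    finally:
        os.close(fd)
```

On a machine with a free /dev/dsp, play_via_oss(sine_pcm(440, 1.0)) should play a one-second tone. Note that nothing here mixes anything: the program simply owns the device until it closes it.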
The way it is (ALSA)
Then came the Advanced Linux Sound Architecture (ALSA), which was better. It came with a library, libasound (the "Asound library"), that was more thoroughly documented than OSS and made writing sound applications cleaner and more robust. But while ALSA became the de facto sound system for the Linux kernel, many applications continued to use OSS in order to remain portable (OSS is available on BSD & Solaris; ALSA is not). To make sure old (and some new) software continued working, the ALSA developers also implemented an OSS emulation layer, which provides the old device files those applications are looking for.
Hardware
The Good
Higher-quality sound cards usually implement mixing in hardware. When the driver exposes this feature to applications, several of them can play sound simultaneously.
The Bad
Unfortunately, most sound cards on the market today don't implement hardware mixing. This means that only one application can have the sound device open at a time.
The Ugly
ALSA has solved this problem with a plug-in called 'dmix' (for direct mixing) that performs software-level mixing and once again allows multiple applications to play sound. But since this plug-in is implemented at the level of the Asound library, it only helps applications that use ALSA. OSS applications are left out in the cold. As a matter of fact, if an application is using OSS, ALSA will be blocked, even if dmix is active. And if ALSA is in use, OSS will be blocked.
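The core of software mixing is simple enough to sketch: add the corresponding samples of each stream and clip the result to the sample range. The following is an illustration of that idea only, not dmix's actual implementation; it assumes every stream is already signed 16-bit little-endian PCM at the same rate and channel count, and the name mix_s16le is made up for this example.

```python
import struct

INT16_MIN, INT16_MAX = -32768, 32767


def mix_s16le(*streams):
    """Mix signed 16-bit little-endian PCM buffers by summing and clipping."""
    longest = max((len(s) for s in streams), default=0) // 2
    mixed = [0] * longest
    for stream in streams:
        count = len(stream) // 2  # ignore a trailing odd byte, if any
        samples = struct.unpack(f"<{count}h", stream[:count * 2])
        for i, sample in enumerate(samples):
            mixed[i] += sample
    # Sums can exceed 16 bits, so clamp instead of letting them wrap around.
    clipped = [max(INT16_MIN, min(INT16_MAX, s)) for s in mixed]
    return struct.pack(f"<{longest}h", *clipped)
```

Shorter streams are treated as silence once they run out, so applications can start and stop independently of one another.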
While dmix enables multiple ALSA applications to play sound simultaneously, only one user can have access to the sound device at a time.
Several people have tried implementing 'sound servers' (esound, aRts & JACK, to name three), which solve the problem by opening a single connection to the sound device and mixing the audio from any application that chooses to connect to them. However, they have the same problem as ALSA's dmix: if an application doesn't use the server, there is no benefit.
What Developers have done about it
In the past
In older versions of the OS, KDE's aRts sound server was used, and OSS applications were launched via 'artsdsp', a program that uses LD_PRELOAD trickery to intercept communication with the OSS device and re-route it to aRts. This worked, except that aRts has very high latency, which made video playback irritating to watch and VoIP difficult to use.
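For readers unfamiliar with the trick: the dynamic linker loads any library named in the LD_PRELOAD environment variable before the program's own libraries, so a shim library can supply its own versions of open, write and ioctl and redirect audio device traffic. The shim itself would be a compiled shared library; the launcher side is just environment setup, roughly as sketched below. This is not the artsdsp source, and the function names and shim path are hypothetical.

```python
import os


def preload_env(shim_path, base_env=None):
    """Return a copy of the environment with shim_path first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # Keep anything already preloaded, but let our shim resolve symbols first.
    env["LD_PRELOAD"] = f"{shim_path}:{existing}" if existing else shim_path
    return env


def run_with_shim(shim_path, argv):
    """Replace this process with argv, started with the shim preloaded."""
    os.execvpe(argv[0], argv, preload_env(shim_path))
```

A call such as run_with_shim("/path/to/shim.so", ["some-oss-program"]) would then start the program with the shim in place. The approach only works for dynamically linked programs that reach the device through the intercepted library calls, which is one reason some applications can't be handled this way.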
More recently
Current versions of the OS have switched to the JACK sound server, with artsdsp modified to become jackdsp, which lets the OS maintain a certain level of OSS compatibility. The primary benefit of the JACK server is that its latency is low enough to be unnoticeable, which makes video playback and VoIP work well.
Time for a change
Despite all the effort put into this issue over the years, it has never really been solved. The biggest problem is that sound servers sometimes refuse to start or get killed while in use. There is also the fact that some applications just can't be appeased by LD_PRELOAD trickery, and we don't always have the source code, or the time, to modify them to talk directly to our chosen sound system.
What can be done
Long term
- One solution would be for someone to create (and/or decide on) a single sound server that all applications that wanted to play sound would be required to use. Any application (other than the server) that attempted direct communication with the sound device would be refused. As Douglas Adams said: "this is, of course, impossible".
- You might think that forcing all applications to move to ALSA would also solve the problem, but you'd be only half right. It would solve the problem for the first user, but not for the second, third, and so on who tried to play sound.
Short term
- Leave things as they are. JACK works pretty well.
- Switch to an ALSA-based solution. With a combination of the dmix plug-in and the aoss wrapper (ALSA's answer to artsdsp) we should be able to maintain the status quo and eliminate the problem of the finicky sound servers. The only downside is that Freespire has implemented an extension to JACK that allows us to disconnect it from the sound device while keeping the applications connected to it happy. This makes the sound device available to applications that absolutely must have direct access. A way has not yet been found to do this with a pure ALSA solution.
Optimum Solution
Create a new proxy ALSA sound driver. This driver would be a kernel module like all the others and would appear to the system as just another sound card. When loaded, it would (via internal, kernel-level APIs) find the first real sound device and establish a connection to it. Then, as applications opened connections to the proxy driver, it would adjust the sample rate & bit-depth of each stream as necessary and mix the streams together before feeding them to the real sound driver. Since this would effectively implement a sound server at the kernel level, it would work for all applications, regardless of sound API or user. This solution is probably the most technically challenging of them all, but it also promises the greatest reward, since it should require significantly less maintenance than any of the other solutions we've tried or considered.
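The per-stream work such a proxy would do can be sketched in user space. The example below is an illustration of the data path only, not kernel code and not an existing driver: it widens 8-bit samples to 16-bit, resamples each stream to the hardware rate by linear interpolation (a deliberately simple choice; a real driver would likely want a better resampler), and mixes the results. All function names and the 48000 Hz default are assumptions made for this example.

```python
def u8_to_s16(samples):
    """Widen unsigned 8-bit samples (0..255) to the signed 16-bit range."""
    return [(s - 128) << 8 for s in samples]


def resample_linear(samples, src_rate, dst_rate):
    """Resample by linear interpolation between neighbouring input samples."""
    if not samples or src_rate == dst_rate:
        return list(samples)
    out_len = len(samples) * dst_rate // src_rate
    out = []
    for n in range(out_len):
        pos = n * src_rate / dst_rate  # position measured in input samples
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append(round(samples[i] + (nxt - samples[i]) * frac))
    return out


def mix(streams):
    """Sum equal-rate 16-bit streams sample by sample, clipping overflow."""
    length = max((len(s) for s in streams), default=0)
    return [max(-32768, min(32767, sum(s[i] for s in streams if i < len(s))))
            for i in range(length)]


def proxy_mix(clients, hw_rate=48000):
    """clients: (samples, rate, bits) tuples; returns 16-bit samples at hw_rate."""
    prepared = []
    for samples, rate, bits in clients:
        if bits == 8:
            samples = u8_to_s16(samples)
        prepared.append(resample_linear(samples, rate, hw_rate))
    return mix(prepared)
```

Because every client is converted to one common format before mixing, the real driver only ever sees a single stream, whichever API or user the audio came from.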

