[Desktop_architects] Making Sound On Linux Just Work

Greg Wright gwright at real.com
Tue Dec 13 13:44:15 PST 2005


Just some comments off the top of my head below.....

 >
 >                       Making Sound Just Work
 >                      ------------------------
 >
 > One of the "second tier" of requirements mentioned several times at
 > the OSDL Portland Linux Desktop Architects workshop was "making audio
 > on Linux just work". Many people find it easy to leave this
 > requirement lying around in various lists of goals and requirements,
 > but before we can make any progress on defining a plan to implement
 > the goal, we first need to define it rather more precisely.
 >
 > This list is intended to avoid any implementation details, and is
 > focused entirely on a task oriented analysis of the issues. Your input
 > is sought to complete, improve and clarify this analysis.
 >
 > DEFINING THE GOAL
 > =================
 >
 > The list below is a set of tasks that a user could reasonably expect
 > to perform on a computer running Linux that has access to zero, one
 > or more audio interfaces, as well as zero one or more network
 > interfaces.
 >
 > The desired task should either work, or produce a sensible and
 > comprehensible error message explaining why it failed. For example,
 > attempting to control input gain on a device that has no hardware
 > mixer should explain that the device has no controls for input gain.
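To make the "comprehensible error" principle concrete, here is a minimal sketch. The device record and the function names are hypothetical, not any real API; the point is only that the failure is reported in task-level terms:

```python
# Sketch: a hypothetical gain-control call that fails with a
# task-level explanation rather than a raw errno.
class NoHardwareMixerError(Exception):
    """Raised when a device exposes no input-gain control."""

def set_input_gain(device, gain_db):
    # 'device' is a hypothetical dict describing one audio interface.
    if not device.get("has_hw_mixer"):
        raise NoHardwareMixerError(
            f"{device['name']} has no hardware mixer; "
            "input gain cannot be controlled on this device.")
    device["input_gain_db"] = gain_db

spdif = {"name": "RME Digi96", "has_hw_mixer": False}
try:
    set_input_gain(spdif, -6.0)
except NoHardwareMixerError as e:
    print(e)
```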
 >
 >  CONFIGURATION (see also MIXING below)
 >
 >         - identify what audio h/w exists on the system
 >         - identify what network audio destinations are available
 >         - choose some given audio h/w or network endpoint
 >           as the default for input
 >         - ditto for output

Of course, knowing the capabilities of each h/w device will be
important in order to choose which one you want to use (multi-channel
support? audio quality? hardware decoders? etc.). That is probably a
given in the above requirements.
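As a sketch of how capability data might drive the default-device choice: all the fields below and the selection policy are assumptions, just enough to show capabilities feeding the decision.

```python
from dataclasses import dataclass

@dataclass
class AudioDevice:
    # Hypothetical capability record for one audio interface.
    name: str
    channels: int
    max_rate: int            # highest supported sample rate, Hz
    hw_decoders: tuple = ()  # e.g. ("ac3",) for hardware decode

def pick_default_output(devices, want_channels=2):
    # Prefer devices that satisfy the channel count, then the
    # highest sample rate; fall back to the first device found.
    usable = [d for d in devices if d.channels >= want_channels]
    if not usable:
        return devices[0] if devices else None
    return max(usable, key=lambda d: d.max_rate)

cards = [AudioDevice("onboard", 2, 48000),
         AudioDevice("hdsp", 26, 96000)]
print(pick_default_output(cards, want_channels=6).name)  # hdsp
```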

 >         - enable/disable given audio h/w
 >         - easily (auto)load any kernel modules required for
 >           given functionality
 >
 >  PLAYBACK
 >
 >          - play a compressed audio file
 >               * user driven (e.g. play(1))
 >               * app driven (e.g. {kde,gnome_play}_audiofile())

Whatever compressed formats (codecs) are available on the system
should be available to every app that uses the audio subsystem. I
should not have to use one app to play back codec A and another to
play back codec B; I should just be able to use whichever app I like
the most.
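One way to get there is a single format-detection layer shared by every app; here is a sketch keyed on container magic bytes (the signatures are real, but the shared decoder registry is hypothetical):

```python
def sniff_audio_format(header: bytes) -> str:
    # Identify a file by its leading magic bytes; these are the
    # actual signatures the containers start with.
    if header.startswith(b"RIFF") and header[8:12] == b"WAVE":
        return "wav"
    if header.startswith(b"fLaC"):
        return "flac"
    if header.startswith(b"OggS"):
        return "ogg"
    if header.startswith(b"ID3") or header[:2] == b"\xff\xfb":
        return "mp3"
    return "unknown"

# A shared registry: every app asks the subsystem, not its own list.
decoders = {}  # format name -> decode function, filled by plugins

print(sniff_audio_format(b"RIFF\x00\x00\x00\x00WAVEfmt "))  # wav
print(sniff_audio_format(b"fLaC\x00\x00\x00\x22"))          # flac
```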

 >         - play a PCM encoded audio file (specifics as above)
 >         - hear system sounds
 >         - VOIP
 >         - game audio
 >         - music composition
 >         - music editing
 >         - video post production


           - A/V sync should be spot on for video playback. This is
             not a given with most of the current audio subsystems
             (ESound, for example).
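To make the sync point concrete, a sketch of the usual drift calculation: the audio clock is derived from samples actually consumed by the hardware, and compared against the video PTS. The threshold value here is an assumption (roughly one video frame).

```python
def av_drift_ms(samples_played, sample_rate, video_pts_ms):
    # Audio clock position derived from how many samples the
    # hardware has actually consumed, vs the current video PTS.
    audio_clock_ms = samples_played * 1000.0 / sample_rate
    return video_pts_ms - audio_clock_ms

SYNC_THRESHOLD_MS = 40.0  # assumption: roughly one video frame

drift = av_drift_ms(samples_played=48_000, sample_rate=48_000,
                    video_pts_ms=1025.0)
print(drift)                           # 25.0
print(abs(drift) > SYNC_THRESHOLD_MS)  # False: no resync needed
```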


 >
 >  RECORDING
 >
 >          - record from hardware inputs
 >             * use default audio interface
 >             * use other audio interface
 >             * specify which h/w input to use
 >             * control input gain
 >         - record from other application(s)
 >         - record from live (network-delivered) audio
 >           streams
 >             * PCM/lossless compression (WAV, FLAC etc)
 >             * lossy compression (mp3, ogg etc)

Any thoughts on protected content here? I feel we need to start
thinking about adding content protection into the system. This
includes kernel-level secure processes, secure hardware, DRM, etc. I
know it isn't very popular, but in the future, if we want access to
content owners' high-value content (DVDs, music, HDTV, etc.), we will
have to have this. If we don't, we could find ourselves as
second-class citizens, unable to play back much of the desired
content. We should at least start thinking about it. In Longhorn
(Vista), a huge amount of effort has gone into this. I feel they must
be doing it because they are fearful of the PC being left behind as a
method of playing back this kind of content. If MS, with all their
money and influence, doesn't think they can get access to high-value
content without ensuring at least some content protection, I am not
sure how we could.

More later,
--greg.


 >
 >
 >  MIXING
 >
 >         - control h/w mixer device (if any)
 >
 >              * allow use of a generic app for this
 >              * NOTE to non-audio-focused readers: the h/w mixer
 >                is part of the audio interface that is used
 >                to control signal levels, input selection
 >                for recording, and other h/w specific features.
 >                Some pro-audio interfaces do not have a h/w mixer,
 >                most consumer ones do. It has almost nothing
 >                to do with "hardware mixing" which describes
 >                the ability of the h/w to mix together multiple
 >                software-delivered audio data streams.
 >
 >         - multiple applications using soundcard simultaneously
 >         - control application volumes independently
 >         - provide necessary apps for controlling specialized
 >              hardware (e.g. RME HDSP, ice1712, ice1724, liveFX)
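Where the hardware cannot mix multiple streams itself, a sound server has to sum them in software. A minimal sketch with per-stream volume and hard clipping of 16-bit samples:

```python
def mix16(streams_with_gain):
    # streams_with_gain: list of (samples, gain) pairs, where
    # samples are signed 16-bit ints and gain is 0.0..1.0.
    length = max(len(s) for s, _ in streams_with_gain)
    out = []
    for i in range(length):
        acc = sum(int(s[i] * g) for s, g in streams_with_gain
                  if i < len(s))
        out.append(max(-32768, min(32767, acc)))  # hard clip
    return out

music = [20000, -20000, 30000]
beep  = [15000,  15000]
print(mix16([(music, 1.0), (beep, 0.5)]))
# [27500, -12500, 30000]
```

Independent per-application volume is just the gain factor here; a real server would also resample and convert formats before summing.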
 >
 >  ROUTING
 >
 >         - route audio to specific h/w among several installed devices
 >         - route audio between applications
 >         - route audio across network
 >         - route audio without using h/w (regardless of whether or
 >           not h/w is available; e.g. streaming media)
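A sketch of the routing idea as a source-to-sink fan-out, where a sink may be a card, another app, or a network endpoint, and the router does not care which (all names hypothetical):

```python
# Routes map a source (app or input) to one or more sinks; a sink
# can be a card, another app, a network host, or a file -- the
# router treats them all the same.
routes = {}

def connect(source, sink):
    routes.setdefault(source, []).append(sink)

def deliver(source, buf, sinks_out):
    # Fan a buffer out to every sink connected to 'source'.
    for sink in routes.get(source, []):
        sinks_out.setdefault(sink, []).append(buf)

connect("music_player", "hw:0")
connect("music_player", "rtp://192.168.1.10")  # network, no h/w needed
delivered = {}
deliver("music_player", b"\x00\x01", delivered)
print(sorted(delivered))  # ['hw:0', 'rtp://192.168.1.10']
```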
 >
 >  MULTIUSER
 >
 >         - which of the above should work in a multi-user scenario?
 >
 >  FORMATS
 >
 >         - basically, the task list is covered by the above list,
 >           but there are some added criteria:
 >
 >         - audio data formats divide into:
 >              * direct sample data (e.g. RIFF/WAV, AIFF)
 >              * losslessly compressed (e.g. FLAC)
 >              * lossy compression (e.g. Vorbis, MP3)
 >         - apps that can handle a given division should all
 >           handle the same set of formats, with equal prowess.
 >           i.e. apps don't have to handle lossy compression
 >           formats, but if they do, they should all handle
 >           the same set of lossy compression formats. Principle:
 >           minimize user surprise.
 >         - user should see no or limited obstacles to handling
 >           proprietary formats
 >
 >  MISC
 >
 >         - use multiple soundcards as a single logical device
 >         - use multiple sub-devices as a single logical device
 >           (sub-devices are independent chipsets on
 >           a single audio interface; many soundcards
 >           have analog i/o and digital i/o available
 >           as two different sub-devices)
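A sketch of the logical-device idea: map logical channels onto (device, channel) pairs so that, for example, two stereo cards look like one 4-channel device. The device names are hypothetical:

```python
# Sketch: present two 2-channel devices as one 4-channel logical
# device by mapping logical channels onto (device, channel) pairs.
logical_map = [("hw:0", 0), ("hw:0", 1), ("hw:1", 0), ("hw:1", 1)]

def split_frame(frame):
    # frame: one sample per logical channel; returns per-device
    # channel->sample maps ready to hand to each real card.
    out = {}
    for sample, (dev, ch) in zip(frame, logical_map):
        out.setdefault(dev, {})[ch] = sample
    return out

print(split_frame([10, 20, 30, 40]))
# {'hw:0': {0: 10, 1: 20}, 'hw:1': {0: 30, 1: 40}}
```

The hard part in practice is clock drift between the cards, which this sketch ignores entirely.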
 >



