[Accessibility-ia2] media a11y

Silvia Pfeiffer silviapfeiffer1 at gmail.com
Thu Jun 16 22:34:20 PDT 2011

On Fri, Jun 17, 2011 at 2:28 PM, Peter Korn <peter.korn at oracle.com> wrote:

> Silvia,
> Reading through your explanation, it seems to me the key issue is the
> division of labor around the rendering of the external file description
> text.  I believe you have made the assumption that the AT will render this -
> at its own pace - while the general approach of accessibility APIs is that
> of exposing GUI/UI elements to AT.
> The issue of whether the description text is in the formal DOM or a "shadow
> DOM" is, I think, a red herring.  The user agent has access to the text, and
> the user agent presents all of the media to the user/AT in some fashion.
> For example, a key facet of the accessibility API is exposing the bounding
> rectangle of all rendered text - something that is also not in the DOM but
> is known to the user agent.  There is no reason why the user agent cannot
> likewise expose the description text in some form to the user/AT.

I was under the impression that this exposure is done through the a11y API.
But I will have to step back on this, because I do not understand enough
about how AT interact with the UA and Web pages.

> But, returning to what I suspect is the crux of the matter...  I think we
> shouldn't attempt to decide which approach is better before coding up a
> sample of each approach and examining them.  I propose three explorations:
>    1. Have the "audio-description-aware" video renderer also render the
>    audio descriptions
>       - Use one of the new generation of web-based TTS engines, and simply
>       have an option in the render to turn description TTS on
>       - Note: this doesn't handle Braille; but... how many folks who want
>       these descriptions are interested in having an audio stream that they can
>       hear be interrupted/paused so that they can move their hands to their
>       Braille display to read the description, only to then press a key to have
>       the audio stream continue?  For Braille-only folks, why would they want
>       anything other than the description text - perhaps intermingled with the
>       caption text - in "book form"?

Ah, I think there is a misunderstanding. As I have always understood it,
text descriptions should work with the video in a completely lean-back way:
the user starts the video with text descriptions turned on and can then
stop interacting with it. AT and the video player play from start to end,
just as for any other person watching a video, except that AT puts the
player on hold when a gap is too short for it to finish reading a text
description, and resumes the player when it is done reading.
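
To make the timing concrete, here is a minimal sketch in Python of how long
the player would have to stay on hold for one cue. Everything in it is an
assumption for illustration: the Cue class, the function names and the fixed
speaking rate of 180 words per minute are hypothetical, and a real AT would
more likely signal when it has actually finished speaking than estimate the
duration up front.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """A timed text description cue (hypothetical type, times in seconds)."""
    start: float
    end: float
    text: str


def speech_duration(text: str, words_per_minute: float = 180.0) -> float:
    """Rough estimate of the time TTS needs; the rate is an assumed value."""
    return len(text.split()) * 60.0 / words_per_minute


def hold_time(cue: Cue, words_per_minute: float = 180.0) -> float:
    """Seconds the player must stay paused at cue.end so speech can finish."""
    overrun = speech_duration(cue.text, words_per_minute) - (cue.end - cue.start)
    return max(0.0, overrun)


# A nine-word description in a two-second gap needs about one extra second.
long_cue = Cue(10.0, 12.0, "A man walks slowly across the empty town square")
print(hold_time(long_cue))
```

A cue whose text fits into its gap yields a hold time of zero, so the video
plays through without interruption.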

As for braille-only folks: we can assume that they would prefer a text-only
transcript that does not play in parallel with the video if made available.
Lacking that, they could also want the video played back with captions and
descriptions turned on and the AT managing the video playback in the
background. Suboptimal, I know, but sometimes possibly the only thing

>    2. Use our existing IA2 API, with a new pattern:
>       - Have the "audio-description-aware" video renderer expose an
>       AccessibleAction "pause/resume" or some such.
>       - Modify some IA2 AT (e.g. NVDA) to recognize this situation, and
>       make use of the "pause/resume" action
>       - At various moments in the audio/video stream, fire an event with
>       the updated description for that video component
>       - The modified NVDA would receive this event, call the
>       "pause/resume" action, render the description text (in speech and/or
>       Braille), and then call "pause/resume" again

This is indeed basically the spec that is in the wiki page.
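
The event flow in this option can be sketched as a small Python simulation.
None of this is real IA2 or NVDA code: FakeVideoRenderer, FakeScreenReader,
the listener registration and the "pause/resume" action name are stand-ins
chosen only to mirror the steps listed above.

```python
from typing import Callable, List


class FakeVideoRenderer:
    """Stand-in for a description-aware renderer; not a real IA2 object."""

    def __init__(self) -> None:
        self.playing = True
        self._listeners: List[Callable[["FakeVideoRenderer", str], None]] = []

    def add_description_listener(
        self, listener: Callable[["FakeVideoRenderer", str], None]
    ) -> None:
        self._listeners.append(listener)

    def do_action(self, name: str) -> None:
        """Hypothetical "pause/resume" action: toggles playback."""
        if name != "pause/resume":
            raise ValueError(f"unknown action: {name}")
        self.playing = not self.playing

    def reach_cue(self, text: str) -> None:
        """Simulate playback reaching a cue by firing the description event."""
        for listener in self._listeners:
            listener(self, text)


class FakeScreenReader:
    """Stand-in for a modified AT; appending to a log replaces speech/Braille."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def on_description(self, renderer: FakeVideoRenderer, text: str) -> None:
        renderer.do_action("pause/resume")  # put playback on hold
        self.log.append(f"playing={renderer.playing}: {text}")  # "render" it
        renderer.do_action("pause/resume")  # let playback continue


renderer = FakeVideoRenderer()
reader = FakeScreenReader()
renderer.add_description_listener(reader.on_description)
renderer.reach_cue("A door opens.")
print(reader.log, renderer.playing)
```

One thing the sketch suggests: with a single toggle action the AT has to
track the playback state itself, so separate pause and resume actions may
turn out to be more robust.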

>    3. Try an experimental new API
>       - Call this API something like AccessibleMultimedia; potentially
>       model it on AccessibleStreamable (in fact, we might have this be options
>       "2a" and "2b" - one that uses Streamable with a general pause/resume, and
>       another completely new)
>       - Modify some AT (e.g. NVDA) to recognize this new API
>       - Do whatever you think would be right for this new API
> Then interested parties can examine the source code of both approaches,
> play with the resulting applications, and discuss this in a much more
> concrete fashion.  For example, my Braille user assumptions in #1 may be
> totally off base; having coded up examples that use Braille in #2 and #3 we
> can try this with users and find out from them what they want.

I think these are great ideas! It would be awesome if anybody were to
implement even just one of these options!


> Regards,
> Peter
> On 6/16/2011 3:35 PM, Silvia Pfeiffer wrote:
> Hi Pete,
> Thanks for your review and questions on the video side of things. I'm
> hoping the combined expertise here will be able to define the best way
> to deal with video and the track specification of HTML5 [1]. I may,
> however, need to go into detail on how tracks and cues are handled in
> HTML5 before we can come to the right solution.
> [1] http://www.whatwg.org/specs/web-apps/current-work/multipage/the-iframe-element.html#the-track-element
> On Fri, Jun 17, 2011 at 5:36 AM, Pete Brunet <pete at a11ysoft.com> wrote:
>  I've read through the discussion and have these comments and questions:
> What is the use case to justify an API (and associated synchronization
> complexity) for access to cues that is not solved by captions for those who
> can't hear the audio and audio descriptions of visual media for those who
> can't see the video?
>  The <track> element in HTML5 allows association of external text files
> that provide lists of timed caption cues, subtitle cues, text
> description cues and chapter cues to audio and video elements. This is
> on top of what can come from within an audio or video file, which can
> contain captions and subtitles as text, as well as audio descriptions
> as audio and sign language as video.
> Why I approached Alexander was to find out how to deal with text
> descriptions. Text descriptions are something new that you may not
> have seen in traditional accessibility approaches for audio and video:
> they provide the text that is usually spoken in audio descriptions as
> actual timed text cues. The files are essentially the same as caption
> files with cues that have a start and an end time and some text.
> However, it is expected that these text descriptions are read out by a
> screen reader or handed to a braille device to be communicated to
> those who can't see.
> In addition, it should probably be possible to also expose caption
> cues (and subtitle cues for that matter) to AT for those that can
> neither hear nor see and want to consume them through braille. This
> was, however, not my main use case.
> Note that none of the text cues are part of the DOM of the Web page
> but only live in the shadow DOM. Therefore, I guess, some method of
> exposure is required.
>  Since it's early in the discussion of this issue I think this topic needs to
> be separated from the rest of the discussion.  Alex can you move that to a
> separate section like you did for the Registry API?
> At least at this point I'm not in favor of the media control methods.
> Developers should provide accessible GUI controls.  The developer would have
> to implement the access in any case and having access through the GUI would
> eliminate adding the code for these new methods on both sides of the
> interface.  If the app developer does a correct implementation of the GUI
> there would be no extra coding required in ATs.
>  I guess the idea here was that there may be situations where AT needs
> to overrule what is happening in the UI, for example when there are
> audio and video resources that start autoplaying on a newly opened
> page. However, I am not quite clear on this point either.
> The key problem that I saw with text descriptions and video controls
> is that we have quite a special case with text descriptions since the
> author of the text descriptions can identify the breaks in the video
> timeline into which a description cue needs to be fitted, and they can
> provide the text that needs to be spoken in this break, but they
> cannot know how long it will take to actually voice or braille this
> text. Therefore, AT in this case needs to control the video's playback
> timeline and possibly put it on hold when the end time of the cue is
> reached until AT has finished with the text of the cue. I would think
> that this may be one of the only cases where AT actually has to
> control the display of the Web page rather than just being a mere
> observer.
> Best Regards,
> Silvia.
> _______________________________________________
> Accessibility-ia2 mailing list
> Accessibility-ia2 at lists.linuxfoundation.org
> https://lists.linux-foundation.org/mailman/listinfo/accessibility-ia2
> --
> Peter Korn | Accessibility Principal
> Phone: +1 650 5069522
> 500 Oracle Parkway | Redwood City, CA 94065