Aural CSS: Support for CSS 2 Aural Style Sheets / CSS 3 Speech Module

Questions about support for aural CSS (Cascading Style Sheets) have been popping up in various corners of the Web lately, so I thought I would compile what I know as a supplementary page to my Screen Readers and Abbreviations tests.

If you find this information to be incomplete or inaccurate, please let me know so that I can update this page.

Note: Some new sources of information will be added to this page pending review. In the meantime, you may like to follow the links I have included at the bottom of the page and read for yourself.

Introduction

CSS includes 'aural' (or 'speech') properties that give web designers and developers control over how HTML (and XML) is synthesised as speech by CSS-aware software. However, these properties enjoy very limited support in current web browsers, screen readers and other assistive technology software where they could be of benefit.
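To give a rough idea of what these properties look like in practice, here is a minimal sketch of a CSS 2 aural style sheet. The selectors, the class name and all of the values are illustrative assumptions rather than rules taken from a real site, and given the level of support described on this page, most software would simply ignore them.

```css
/* Illustrative sketch only: CSS 2 aural properties inside the 'aural' media type. */
@media aural {
  /* Headings: a generic male voice at a lower pitch, with a pause before each one. */
  h1, h2, h3 {
    voice-family: male;
    pitch: low;
    pause-before: 1s;
  }

  /* Emphasised text: stronger stress (the scale runs from 0 to 100; the initial value is 50). */
  em {
    stress: 70;
  }

  /* Block quotations: a different voice, positioned to the listener's left. */
  blockquote {
    voice-family: female;
    azimuth: left;
  }

  /* Do not speak decorative content (hypothetical class name). */
  .decoration {
    speak: none;
  }
}
```

The rules are wrapped in a media block so that they only apply to speech output and leave the visual presentation untouched.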

Opinion

Unfortunately, such limited support makes aural style sheets practically useless. Without an improved level of support from software vendors, web designers and developers are unlikely to use them. We have a paradox, however, as vendors are unlikely to prioritise support for something that is not used and brings them no benefit. Instead, current screen reader software (such as JAWS) and speaking browsers (such as Home Page Reader) analyse words to determine how they should be pronounced using their own non-CSS-based algorithms.

However, even if support were better than it is, using the aural properties is another matter. The average web designer or developer would still need the skill to write an appropriate and considerate aural style sheet, selecting voices and perhaps positioning them spatially. If you think about how few designers actually use print style sheets, how many might actually implement aural style sheets?

It's also worth considering whether or not aural style sheets are as useful as they sound at first. Should speech properties be in the hands of web developers at all? Screen reader software typically allows users to set preferences for speech speed, voices, custom pronunciation dictionaries, etc. Such settings should remain under users' control rather than being part of building a web site. Anything that could potentially override a user's settings could be harmful and insensitive.

That said, there may be some specific cases where aural style sheets would be useful. As an example, screen readers may struggle to pronounce the name of the Australian airline Qantas. Some speech synthesisers pronounce the name as “kan-tass” rather than “kwon-tass”. A web developer could use an aural style sheet to specify how speech synthesisers should pronounce the word. However, you must remember that the aural style sheet would only affect the web site to which it is applied, and that screen readers are used not only to access many web sites, but to access various computer software, too. A pronunciation set in the screen reader's own settings applies everywhere, so it is more useful to users than one set for a single web site.
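A hedged sketch of how the Qantas case might be written, assuming the phonemes property and the @phonetic-alphabet rule as drafted in the CSS 3 Speech module (phonemes appears only in the CSS 3 column of the properties table on this page). The class name and the IPA transcription are assumptions for illustration, and the draft syntax may change before it is finalised.

```css
/* Assumes the draft CSS 3 Speech 'phonemes' property; the syntax is subject to change. */
@phonetic-alphabet "ipa";

@media speech {
  /* Hypothetical class applied to each occurrence of the name,
     e.g. <span class="qantas">Qantas</span> */
  .qantas {
    phonemes: "ˈkwɒntəs"; /* assumed IPA approximation of "kwon-tass" */
  }
}
```

Note that every occurrence of the word has to be marked up for the rule to match, which is one more reason a pronunciation dictionary in the screen reader itself is the more practical fix.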

Taking these considerations into account, I do not believe aural style sheets are important or useful enough to see their support improve in speech synthesisers.

A little history

Aural CSS first appeared in the CSS 2 Specification, the current official W3C Recommendation for CSS. The CSS 2.1 Specification – currently a "last call" Working Draft that will become the next official W3C Recommendation – extends the specification to include a new property, but deprecates the 'aural' media type and reserves the favoured 'speech' media type. The CSS 3 Speech module reworks and replaces the 'aural' properties specified in CSS 2, section 19 (Aural style sheets) and CSS 2.1, Appendix A (Aural style sheets). To quote some relevant sections of the CSS specifications:

Summary of Known Support

Note about Safari with VoiceOver: It has been suggested that using Safari with VoiceOver offers support for aural CSS. It seems that this is just rumour. My own initial testing using Safari 3.1.2 with VoiceOver indicates that the speak property is not supported.

Note about iCab: It has also been implied that iCab should support CSS 2 Aural style sheets as it claims full CSS 2.1 support. I currently have no information to confirm support.

Note about Window-Eyes: GW Micro are quoted as having said in December 2003 that they have no plans to support aural style sheets in Window-Eyes (see addendum to Shortened forms on the Web).

Details of Aural CSS Properties

The following table shows which properties are available in the different CSS specifications.

Table of Aural CSS Properties
CSS property	CSS 2	CSS 2.1	CSS 3
azimuth	y	y	n
cue	y	y	y
cue-after	y	y	y
cue-before	y	y	y
elevation	y	y	n
mark	n	n	y
mark-after	n	n	y
mark-before	n	n	y
pause	y	y	y
pause-after	y	y	y
pause-before	y	y	y
phonemes	n	n	y
pitch	y	y	n
pitch-range	y	y	n
play-during	y	y	n
rest	n	n	y
rest-after	n	n	y
rest-before	n	n	y
richness	y	y	n
speak	y	y	y
speak-header	n	y	n
speak-numeral	y	y	n
speak-punctuation	y	y	n
speech-rate	y	y	n
stress	y	y	n
voice-balance	n	n	y
voice-duration	n	n	y
voice-family	y	y	y
voice-pitch	n	n	y
voice-pitch-range	n	n	y
voice-rate	n	n	y
voice-stress	n	n	y
voice-volume	n	n	y
volume	y	y	n

Table of Known Support for Aural CSS Properties
CSS property	Opera 9	FireVox	Emacspeak
azimuth	n	?	?
cue	y	?	?
cue-after	y	?	?
cue-before	y	?	?
elevation	n	?	?
mark	?	?	?
mark-after	?	?	?
mark-before	?	?	?
pause	y	?	?
pause-after	y	?	?
pause-before	y	?	?
phonemes	y	?	?
pitch	n	?	?
pitch-range	n	?	?
play-during	n	?	?
rest	?	?	?
rest-after	?	?	?
rest-before	?	?	?
richness	n	?	?
speak	y	?	?
speak-header	n	?	?
speak-numeral	n	?	?
speak-punctuation	n	?	?
speech-rate	n	?	?
stress	n	?	?
voice-balance	y	?	?
voice-duration	y	?	?
voice-family	y	?	?
voice-pitch	y	/	?
voice-pitch-range	y	?	?
voice-rate	y	/	?
voice-stress	y	?	?
voice-volume	y	/	?
volume	n	?	?

CSS 2

Note: The speak-date and speak-time properties were referenced in a W3C note in 1997, but never made it into a specification.

CSS 2.1

The new property speak-header is introduced, and the 'aural' media type is deprecated in favour of the 'speech' media type.
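As a minimal sketch of both changes together, the rule below targets the 'speech' media type while also listing the deprecated 'aural' type for older CSS 2 software, and asks for table headers to be spoken before every cell. This is an assumption about how an author might write it, not a tested recipe: CSS 2.1 only reserves 'speech', and whether any software honours the rule is not confirmed on this page.

```css
/* Sketch: prefer 'speech' (CSS 2.1), with 'aural' listed for older CSS 2 software. */
@media speech, aural {
  /* speak-header accepts 'once' (the initial value) or 'always'. */
  th {
    speak-header: always;
  }
}
```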

CSS 3