Aural CSS: Support for CSS 2 Aural Style Sheets / CSS 3 Speech Module
Last updated: May 12, 2010 (created on November 21, 2006)
Questions about support for aural CSS (Cascading Style Sheets) have been popping up in various corners of the Web lately, so I thought I would compile what I know as a supplementary page to my Screen Readers and Abbreviations tests.
If you find this information to be incomplete or inaccurate, please let me know so that I can update this page.
Note: Some new sources of information will be added to this page pending review. In the meantime, you may like to follow the links I have included at the bottom of the page and read for yourself.
- Introduction
- Summary of Known Support
- Details of Aural CSS Properties
  - CSS 2 (superseded)
  - CSS 2.1 (W3C Candidate Recommendation, 08 September 2009)
  - CSS 3 (W3C Working Draft, 16 December 2004)
- Related Links
- New Sources Pending Addition
Introduction
CSS includes 'aural' (or 'speech') properties that allow web designers and developers control over the way in which HTML (and XML) is synthesised as speech by CSS-aware software. However, these properties enjoy very limited support in current web browsers, screen readers and in other assistive technology software where the properties may be of benefit.
If you want a good introduction to aural style sheets, the “Aural stylesheets” section of Joe Clark's book, Building Accessible Websites, is very informative.
Opinion
Unfortunately, such limited support makes aural style sheets practically useless. Without an improved level of support from software vendors, web designers and developers are unlikely to use them. We have a paradox, however, as vendors are unlikely to prioritise support for something that is not used and has no benefit to them. Instead, current screen reader software (such as JAWS) and speaking browsers (such as Home Page Reader) analyse words to determine how they should be pronounced using their own non-CSS-based algorithms.
However, even if support were better than it is, using the aural properties is another matter. The average web designer or developer would still need the skill to write an appropriate and considerate aural style sheet, selecting voices, and perhaps positioning them spatially. If you think about how few designers actually use print style sheets, how many might actually implement aural style sheets?
It's also worth considering whether or not aural style sheets are as useful as they sound at first. Should speech properties be in the hands of web developers at all? Screen reader software typically allows users to set preferences for speech speed, voices, custom pronunciation dictionaries, etc. Such settings should remain under users' control rather than being part of building a web site. Anything that could potentially override a user's settings could be harmful and insensitive.
Admittedly, there may be some specific cases where aural style sheets would be useful. As an example, screen readers may struggle to pronounce the name of the Australian airline Qantas. Some speech synthesisers pronounce the name as “kan-tass” rather than “kwon-tass”. An aural style sheet could be used by a web developer to specify how speech synthesisers should pronounce the word. However, you must remember that the aural style sheet would only affect the web site to which it is applied, and that screen readers are used not only to access many web sites, but to access various computer software, too. A user's own screen reader settings, which apply everywhere, are more useful than those set for a single web site.
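As a rough illustration of the Qantas case, the following is a minimal sketch that assumes the `phonemes` property and the `@phonetic-alphabet` rule as described in the 2004 CSS 3 Speech Working Draft. The `.qantas` class name is hypothetical, the IPA string is only an approximation of “kwon-tass”, and it is an assumption that any given speech synthesiser would honour the rule.

```css
/* Assumption: the 2004 CSS 3 Speech draft, where "ipa" is the phonetic alphabet. */
@phonetic-alphabet "ipa";

/* .qantas is a hypothetical class wrapping the airline's name in the markup,
   e.g. <span class="qantas">Qantas</span>. */
.qantas {
  phonemes: "kwɒntəs"; /* approximate IPA for "kwon-tass" */
}
```

Even where this works, it only affects the one site that ships the style sheet, which is the limitation described above.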
Taking these considerations into account, I do not believe aural style sheets are important or useful enough to see their support improve in speech synthesisers.
A little history
Aural CSS first appeared in the CSS 2 Specification, the current official W3C Recommendation for CSS. The CSS 2.1 Specification, currently a W3C Candidate Recommendation that is expected to become the next official W3C Recommendation, extends the specification to include a new property, but deprecates the 'aural' media type and reserves the 'speech' media type in its place. The CSS 3 Speech module reworks and replaces the 'aural' properties specified in CSS 2 (chapter 19, Aural style sheets) and CSS 2.1 (Appendix A, Aural style sheets). To quote some relevant sections of the CSS specifications:
“UAs are not required to implement the properties of this chapter in order to conform to CSS 2.1.”
And:
“We expect that in a future level of CSS there will be new properties and values defined for speech output. Therefore CSS 2.1 reserves the 'speech' media type (see chapter 7, "Media types"), but does not yet define which properties do or do not apply to it.
“The properties in this appendix apply to a media type 'aural', that was introduced in CSS 2. The type 'aural' is now deprecated.”
CSS 2.1, Appendix A. Aural style sheets, A.1 The media types 'aural' and 'speech'
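To make the two media types concrete, here is a minimal sketch of how rules might be scoped to each. The file name `aural.css` is hypothetical. Because CSS 2.1 does not define which properties apply to 'speech', the property used inside the `@media speech` block is an assumption borrowed from the CSS 3 Speech draft.

```css
/* CSS 2: pull in a separate sheet for aural rendering only.
   "aural.css" is a hypothetical file name. */
@import url("aural.css") aural;

/* CSS 2 / CSS 2.1: scope rules to the (now deprecated) 'aural' media type. */
@media aural {
  h1 { pause-after: 1s; }
}

/* CSS 2.1 only reserves 'speech'; which properties apply is left undefined,
   so this rule is an assumption based on the CSS 3 Speech draft. */
@media speech {
  h1 { pause-after: 1s; }
}
```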
Summary of Known Support
The CSS 3 Speech module is currently supported in:
- Opera (Windows XP or 2000 only) – support for some CSS 3 speech properties. I've quickly tested this and it mostly works! Note that some of the supported properties require a `-xv-` prefix to work in Opera, e.g. `-xv-voice-balance: right`.
- FireVox extension for Firefox – support for keyword values of a few CSS 3 speech properties. However, having tested this with Firefox 3, support is currently broken (Nov 2008). Learn more about FireVox support: Technical details about Speech property support in CLC-4-TTS.
Note about FireVox and Firefox: Firefox does not parse aural/speech CSS properties, so FireVox support is achieved by parsing the CSS directly.
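As a sketch of how the Opera prefix might be used in practice, a rule can declare both the prefixed and the unprefixed form of a property. Only `voice-balance` is shown because it is the one prefixed property named above. Which other properties need the prefix is not assumed here.

```css
blockquote {
  -xv-voice-balance: right; /* Opera-prefixed form, as mentioned above */
  voice-balance: right;     /* unprefixed CSS 3 Speech property */
}
```

Software that does not recognise a declaration should simply ignore it, so listing both forms is a common defensive pattern for prefixed properties.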
CSS 2 Aural Style Sheets are currently supported in:
- Emacspeak audio desktop – The only known support is as part of Emacspeak, an audio desktop for Linux. A few of the speech-enabled applications on the Emacspeak desktop have support for aural CSS properties. The 'w3' command line application is a standards-compliant Web browser with aural CSS support. Unfortunately, it is not really a viable solution for general consumption.
- pwWebSpeak – While this non-visual browser can still be downloaded, it has not been in development since 2001. It allegedly had some support for aural CSS. I have not yet tested this.
- Fonix SpeakThis – This hosted text-to-speech service would deliver your website in MP3 format and allegedly supported aural CSS. Fonix seem to have replaced SpeakThis with other services and I have found little reference to the software since 2003.
Note about Safari with VoiceOver: It has been suggested that using Safari with VoiceOver offers support for aural CSS. It seems that this is just rumour. I have not tested it thoroughly, but my own initial testing using Safari 3.1.2 with VoiceOver indicates that the `speak` property is not supported.
Note about iCab: It has also been implied that iCab should support CSS 2 Aural style sheets as it claims full CSS 2.1 support. I currently have no information to confirm support.
Note about Window-Eyes: GW Micro are quoted as having said in December 2003 that they have no plans to support aural style sheets in Window-Eyes (see addendum to Shortened forms on the Web).
References:
- CSS support in Opera
- Technical details about Speech property support in CLC-4-TTS
- CSS 3 Speech Styles Demo
- [Accessibility_sig] speech style sheets?
Details of Aural CSS Properties
The following table shows which properties are available in the different CSS specifications.
CSS property | CSS 2 | CSS 2.1 | CSS 3 |
---|---|---|---|
azimuth | y | y | n |
cue | y | y | y |
cue-after | y | y | y |
cue-before | y | y | y |
elevation | y | y | n |
mark | n | n | y |
mark-after | n | n | y |
mark-before | n | n | y |
pause | y | y | y |
pause-after | y | y | y |
pause-before | y | y | y |
phonemes | n | n | y |
pitch | y | y | n |
pitch-range | y | y | n |
play-during | y | y | n |
rest | n | n | y |
rest-after | n | n | y |
rest-before | n | n | y |
richness | y | y | n |
speak | y | y | y |
speak-header | n | y | n |
speak-numeral | y | y | n |
speak-punctuation | y | y | n |
speech-rate | y | y | n |
stress | y | y | n |
voice-balance | n | n | y |
voice-duration | n | n | y |
voice-family | y | y | y |
voice-pitch | n | n | y |
voice-pitch-range | n | n | y |
voice-rate | n | n | y |
voice-stress | n | n | y |
voice-volume | n | n | y |
volume | y | y | n |
The following table shows the current support for aural/speech CSS properties.
Key:
- n: not supported
- y: is supported
- /: partial support
- ?: not tested / level of support unknown (but unlikely if CSS 2)
CSS property | Opera 9 | FireVox | Emacspeak |
---|---|---|---|
azimuth | n | ? | ? |
cue | y | ? | ? |
cue-after | y | ? | ? |
cue-before | y | ? | ? |
elevation | n | ? | ? |
mark | ? | ? | ? |
mark-after | ? | ? | ? |
mark-before | ? | ? | ? |
pause | y | ? | ? |
pause-after | y | ? | ? |
pause-before | y | ? | ? |
phonemes | y | ? | ? |
pitch | n | ? | ? |
pitch-range | n | ? | ? |
play-during | n | ? | ? |
rest | ? | ? | ? |
rest-after | ? | ? | ? |
rest-before | ? | ? | ? |
richness | n | ? | ? |
speak | y | ? | ? |
speak-header | n | ? | ? |
speak-numeral | n | ? | ? |
speak-punctuation | n | ? | ? |
speech-rate | n | ? | ? |
stress | n | ? | ? |
voice-balance | y | ? | ? |
voice-duration | y | ? | ? |
voice-family | y | ? | ? |
voice-pitch | y | / | ? |
voice-pitch-range | y | ? | ? |
voice-rate | y | / | ? |
voice-stress | y | ? | ? |
voice-volume | y | / | ? |
volume | n | ? | ? |
CSS 2
CSS 2 is now superseded by CSS 2.1. The following properties were defined in the CSS 2 W3C Recommendation, 12 May 1998 (revised 11 April 2008): http://www.w3.org/TR/2008/REC-CSS2-20080411/aural.html
19 properties:
- azimuth
- cue
- cue-after
- cue-before
- elevation
- pause
- pause-after
- pause-before
- pitch
- pitch-range
- play-during
- richness
- speak
- speak-numeral
- speak-punctuation
- speech-rate
- stress
- voice-family
- volume
Note: The `speak-date` and `speak-time` properties were referenced in a W3C note in 1997, but never made it into a specification.
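For a feel of how these CSS 2 properties combine, here is a minimal sketch of an aural style sheet that selects a voice and positions speakers spatially. The class names are hypothetical, and given the level of support described earlier, it is an assumption that any particular user agent would act on these rules.

```css
@media aural {
  h1 {
    voice-family: male; /* generic voice family */
    stress: 20;
    richness: 90;
    pause-before: 1s;
  }

  /* Hypothetical classes: place two speakers at different positions. */
  .narrator { azimuth: center; elevation: level; }
  .aside    { azimuth: far-right; volume: soft; speech-rate: fast; }

  /* Hypothetical class: read initialisms letter by letter. */
  abbr.initialism { speak: spell-out; }
}
```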
CSS 2.1
W3C Candidate Recommendation, 08 September 2009: http://www.w3.org/TR/CSS21/aural.html
The new property `speak-header` is introduced and the 'aural' media type is deprecated in favour of the 'speech' media type.
20 properties:
- azimuth
- cue
- cue-after
- cue-before
- elevation
- pause
- pause-after
- pause-before
- pitch
- pitch-range
- play-during
- richness
- speak
- speak-header
- speak-numeral
- speak-punctuation
- speech-rate
- stress
- voice-family
- volume
CSS 3
W3C Working Draft, 16 December 2004: http://www.w3.org/TR/css3-speech/#property-index
22 properties:
- cue
- cue-after
- cue-before
- mark
- mark-after
- mark-before
- pause
- pause-after
- pause-before
- phonemes
- rest
- rest-after
- rest-before
- speak
- voice-balance
- voice-duration
- voice-family
- voice-pitch
- voice-pitch-range
- voice-rate
- voice-stress
- voice-volume
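By way of comparison with the CSS 2 properties, here is a minimal sketch using a few of the CSS 3 Speech properties listed above. It sticks mostly to keyword values, as defined in the 2004 Working Draft. The `.aside` class is hypothetical, and values may differ in later drafts.

```css
h1 {
  voice-family: female;
  voice-pitch: high;
  voice-rate: slow;
  voice-volume: loud;
  rest-after: 1s;
}

/* Hypothetical class: a quieter voice panned to the left. */
.aside {
  voice-balance: left;
  voice-stress: reduced;
}
```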
Related Links
- Test: Abbreviations (and Screen Reader Support for CSS 2 Aural Style Sheets / CSS 3 Speech Module)
- Reference: CSS 3 Speech Module (latest version)
- Reference: CSS 2, 19 Aural style sheets
- Reference: CSS 2.1, Appendix A. Aural style sheets
- Discussions: “Support for aural style sheets” on Accessify Forum
- Discussions: Comments on “Long Live the `Q` Tag” on A List Apart
- Links: dotjay's aural CSS bookmarks on del.icio.us
New Sources Pending Addition
There are a few pages of information I've found that I still need to read through and/or digest, but you can take a look yourself in the meantime:
- Opera.Vox (CSS 3 speech!) – Some useful information and aural CSS test cases with specific focus on Opera and its support for the CSS 3 Speech Module.
- CodeStyle.org: Aural media browser conformance – A test suite and compatibility guide for the 'aural' media type.
- richinstyle.com: Aural CSS tests
- Accessites.org: Can You Hear Me Now? – This article seems a little too hopeful to me. It tries to get people to start using aural style sheets. While that might be an honourable cause (it would be good to build up interest), the article lets itself down. It makes it sound as if the aural properties in CSS 2 are becoming a useful tool when they probably never will be. Unfortunately, the article concentrates on discussing the use of CSS 2 properties, which only seem to enjoy reasonable support from Emacspeak (see earlier notes). The `speak-header` property mentioned was introduced in CSS 2.1, which isn't yet an official recommendation, so that particular property will probably never see the light of day. I'm not sure whether or not Emacspeak supports it. Also, those CSS 2 properties don't feature in CSS 3, as the CSS 3 Speech Module deprecates many of the CSS 2 aural properties. As a result, I think if software begins to support the aural CSS properties in any useful way, we'll be seeing support for the CSS 3 properties and not the older CSS 2 properties. The article also references Opera as supporting aural CSS. Opera only claims support for a selection of properties defined by the CSS 3 Speech Module, not the CSS 2 aural properties. The article received hype from sites like SitePoint.com (Jul 3, 2006 News Wire), so I'm a little worried that it puts across inaccurate information. Anyway, Accessites have updated the page (15 Feb 2007) with a note to this effect. It did serve to remind me about Fonix SpeakThis (a hosted text-to-speech service that would deliver your website in MP3 format), but I think referencing it now is somewhat out of date as SpeakThis seems to have died a death. Fonix seem to have replaced it with other services and I have found little reference to the software since 2003.
- It's worth noting that current screen readers appear to support CSS. For example, using `display:none;` in CSS with a 'screen' media type will be parsed by a browser and the HTML element targeted by that CSS will (should) be removed from the rendered page (the element stays in the DOM but is not rendered). Hence, the model of the page in the screen reader's virtual buffer will not contain that HTML element either. It has been suggested that screen readers should ignore any CSS meant for other media (screen media, for example), but interpret any aural CSS, meaning a shift away from reliance on browsers and towards behaving as a speech browser rather than a screen reader.
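The distinction in that last point can be sketched as follows. The class names are hypothetical, and how a given screen reader treats either rule is an assumption, not a tested result.

```css
/* Intended for visual rendering only, yet in practice content hidden this way
   tends to be missing from a screen reader's virtual buffer as well. */
@media screen {
  .print-notice { display: none; }
}

/* What a speech-aware user agent would be expected to act on instead:
   the content stays visible on screen but is not spoken. */
@media aural {
  .decoration { speak: none; }
}
```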