Re: headers attribute debate

<note
class="inTransmittal">

One of your participants asked about the accessibility relevance of
the 'headers' attribute as supported in HTML 4.01.

Here is a review of that issue.

This review has received consensus support in the PFWG. (member link)
http://www.w3.org/2007/06/06-pf-irc#T16-36-35

As ever, we would be glad to discuss this with you in a more
interactive fashion should you wish further clarification or to
take a different direction.

Al
/chair, PFWG

</note>

Reference:
http://www.w3.org/mid/1c8dbcaa0705271606o7ba8f4e7ybbbdfd9bc6f0559a@mail.gmail.com

** summary

The function provided by the 'headers' attribute is to associate
table cells with information required for the understanding of the
cell contents; information that is provided 'centrally' in header
cells because it applies to more than one cell. Tables are different
from the bulk of web content where there is one path to ancestors and
such common information. In tables, there are common characteristics
both by row and by column.

This function is shared with a) an association algorithm, and b) the 'scope'
attribute.
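
As a rough illustration of the association described above, here is a
minimal Python sketch. The two-row timetable, the ids, and the names
CellCollector and header_texts are all invented for this example; it is
not the method of any particular user agent or screen reader, only one
way the id references in 'headers' could be followed.

```python
from html.parser import HTMLParser

# Hypothetical sample table, invented for illustration only.
SAMPLE = """
<table>
  <tr><th id="h-stop">Stop</th><th id="h-dep">Departs</th></tr>
  <tr><th id="r-brno">Brno</th><td headers="r-brno h-dep">08:15</td></tr>
</table>
"""


class CellCollector(HTMLParser):
    """Collect every th/td with its id, its headers list, and its text."""

    def __init__(self):
        super().__init__()
        self.cells = []     # one dict per cell, in document order
        self._open = None   # the cell currently receiving text, if any

    def handle_starttag(self, tag, attrs):
        if tag in ("th", "td"):
            a = dict(attrs)
            self._open = {
                "id": a.get("id"),
                "headers": (a.get("headers") or "").split(),
                "text": "",
            }
            self.cells.append(self._open)

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._open = None

    def handle_data(self, data):
        if self._open is not None:
            self._open["text"] += data.strip()


def header_texts(html, cell_text):
    """Return the texts of the cells that the named cell's 'headers' points at."""
    parser = CellCollector()
    parser.feed(html)
    by_id = {c["id"]: c for c in parser.cells if c["id"]}
    cell = next(c for c in parser.cells if c["text"] == cell_text)
    # Unresolvable references are skipped rather than treated as errors.
    return [by_id[ref]["text"] for ref in cell["headers"] if ref in by_id]
```

Calling header_texts(SAMPLE, "08:15") follows the two ids on the data
cell and returns the text of the row header and the column header, in
the order the author listed them.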

1. The function is needed.

Metadata for Content Adaptation Workshop:

<quote
cite="http://www.w3.org/2004/06/DI-MCA-WS/execreport.html">

The relationship between (fragments of) content should be captured in metadata.
[...]
Where possible, metadata should be derived from the existing markup,...

</quote>

WAI-ARIA States and Properties:

We would consider most related headers to fall within the meaning of
the aaa:labelledby attribute (occasionally aaa:describedby). The
existing 'headers' attribute provides this function in the context of
HTML tables.

http://www.w3.org/TR/aria-state/#labelledby

The point is that the eye at nominal layout can rapidly identify the
headers that pertain to a data cell, whereas the ear cannot.  People
operating without vision need their assistive technology to have
access to this information in a way they can mechanically recognize.
That is a job for markup.

2. The markup works.

From the User Agent Working Group:

<quote
cite="http://lists.w3.org/Archives/Public/wai-xtech/2007Jun/0021.html">

The 'headers' attribute is supported by the major screen readers used in the
world (JAWS, WindowEyes, ??HAL/SuperNova-still waiting for a reply).
WindowEyes uses the headers and id attribute combination. WindowEyes does
*not* use the scope attribute. JAWS has support for headers/id, row and
column span, and the 'axis' attribute.

Assistive technologies, browser extensions, and tools that use DOM access
also support the headers attribute and expose that information through their
accessibility APIs and to their end users with disabilities and to
developers. Examples of this include Firefox extensions like FireVox and the
University of Illinois Firefox accessibility extension, and developer tools
like Parasoft's WebKing and IBM's RAVEN tool
(http://www.alphaworks.ibm.com/tech/raven).

In addition, platform accessibility APIs such as IAccessible2 on Windows,
ATK/AT-SPI on Linux, and the Java accessibility API all have functions for
getting the row and column headers. The headers attribute, scope attribute,
and TH all provided explicit, engineered ways for browsers to get row and
column headers and expose that information to assistive technologies through
the accessibility APIs. Without these, the browsers and assistive
technologies are forced to resort to heuristics such as font styling and
location (topmost and leftmost cells), which is insufficient for complex
tables with spanned and multiple row/column headers.

</quote>

An independent review by the U.S. Department of Justice for the Federal
Statistics workshop on accessible tables found that 'headers' was effective
in getting needed header information to consumers, as compared with
'scope' which was not well supported.

http://workshops.fedstats.gov/Nakata_Fedstats.ppt

'headers' is systematically applied by sites developed with the
relevant Oracle tools.  Oracle is a major presence in databases,
so this is a big slice of the relevant user base.

http://lists.w3.org/Archives/Public/wai-xtech/2007May/0072.html

More 'yes we use it' responses:

http://lists.w3.org/Archives/Public/wai-xtech/2007May/0063.html
http://lists.w3.org/Archives/Public/wai-xtech/2007May/0064.html

3. 'headers' vs. 'scope'

'headers' was placed in the language because it handles all tables,
regular and irregular. 'scope' handles only regular cases.

http://juicystudio.com/article/html-scope-headers-debate.php#overlaidtable

Train timetable with some interior header data that applies to the
left, some to the right:

http://lists.w3.org/Archives/Public/www-archive/2007May/att-0083/czsched.htm

'headers' was taken up by assistive technology because the
client-side processing is simple. 'scope' is much more indirect and
hence didn't make the first round of adoption. This has set up a
feedback loop: the authors who care use 'headers', users succeed
with it, and so AT coding sticks with it.

There has been some related commentary on the need to constrain
featuritis, that the language not grow with overly-narrow features.

It is strange that this be offered as a reason for preferring 'scope'
to 'headers.' On a pure language-complexity basis, 'scope' is more
heavyweight. It adds several new terms, an attribute name and
multiple values. 'headers' adds just one term and otherwise re-uses
core features (ID).

It should be made clear that the Accessibility Initiative does care
about both language simplicity and authorability of features. The
PFWG charter identifies both authorability and "small footprint" as
values to be sought in markup language designs. This is why we
advocate for the 'backplane' re-engineering efforts to distill common
functions and provide common solutions for them, and why we have
invested in developing Authoring Tool Accessibility Guidelines and
techniques.

4.  80%-rule-NOT

Some commenters have suggested that in order to sustain a small
language there have to be some screening factors, and frequency of
use in the as-is Web is the screening factor to use.

The WAI position on this is roughly "that is like saying that the
builder of a high-rise building should decide whether or not to
include fire-stairs based on whether the previous buildings at that
street address had burned down or not."

We don't build fire stairs just enough to evacuate 80% of the occupants.

Accessibility features address failure modes that are infrequent, but
critical when they occur.

Popularity among authors should be used to select between _functionally
complete_ alternative strategies for supporting required functionality.

5. The path forward.

Starting from scratch, the broader @labelledby and @describedby
relationships are still needed, even with @scope and/or @headers,
because the needed semantic information is not limited to table
cells. Inside tables, these together with better algorithms could
make either or both of @scope and @headers deprecatable, suitable to
migrate out of use.

Leaf-by-leaf markup, with a bottom-up facility, will still be needed to
cover irregular cases.  Such schema-terse, instance-verbose technology
can certainly be used less frequently if structural
reforms are introduced, such as nestable row-groups in
place of the present awkward 'tbody' at one level only.

But we are not starting from scratch. There is an immense user base
of many sources as well as many sinks for HTML. So a measured
migration is the fastest change that would be appropriate.

In any case, the function is needed, and 'headers' markup is currently
delivering the needed function. If we are to work this attribute out
of the system, it must be because it is being replaced with something
better, and via an orderly transition.

6. summary:
(some more discursive details follow signature)

The genuine requirement that html4:td.headers addresses is broader
than just for table cells; a re-engineered solution could deliver
both superior usability and authorability. But we are not starting
from scratch. There is a disability constituency that currently uses
and depends on this feature: anyone offering to remove it should be
expected to demonstrate that the replacement works better and is in
service.

So from an accessibility perspective, dropping 'headers' because
'scope' could afford the same semantics in 'most of the cases' is a
wrong decision; now or, taken in isolation, for the future. But
'scope vs. headers' is not the right frame of reference for the
future. As the requirement isn't limited to tables, we look forward
to a better solution, gracefully migrated to, once the requirements
get looked at in the right breadth of view.

And if we can together set up the sampling capability, we'd be glad
to work on alternative strategies in terms of how one would recode
current 'live' examples.

** details.

7. AT have small markets; they can only afford easy algorithms.

The reason that 'headers' got picked up rapidly and 'scope' didn't
was in part the following performance comparison:

The screen reader had a table cell in its sights, and had received
a 'hunh?' query from its user.  It needed to contextualize this table
cell.  To answer this query by 'scope' the AT would have to search
the table for TH cells (often misused for styling) and then check
the 'scope' on each.  If the author used 'headers' there was an
attribute on the object at hand pointing to a short list of what
more to say.  Need I say more?
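
To make the comparison concrete, here is a small Python sketch of the
two lookups. It is a simplification under stated assumptions: the grid
is a hypothetical in-memory model (a list of rows of dicts), spans are
ignored, and the 'scope' walk only checks the cell's own row and column
rather than the whole table. The function names are mine, not those of
any AT product.

```python
def make_grid():
    """A hypothetical 2x3 table: one header row, one header column."""
    return [
        [{"tag": "th", "id": "c0", "text": ""},
         {"tag": "th", "id": "c1", "scope": "col", "text": "Mon"},
         {"tag": "th", "id": "c2", "scope": "col", "text": "Tue"}],
        [{"tag": "th", "id": "r1", "scope": "row", "text": "Open"},
         {"tag": "td", "headers": ["r1", "c1"], "text": "9:00"},
         {"tag": "td", "headers": ["r1", "c2"], "text": "10:00"}],
    ]


def build_index(grid):
    """Map id -> cell, built once per table."""
    return {cell["id"]: cell for row in grid for cell in row if "id" in cell}


def context_by_headers(grid, row, col, index):
    """Read the id list off the cell at hand; one lookup per listed header."""
    return [index[ref]["text"] for ref in grid[row][col].get("headers", [])]


def context_by_scope(grid, row, col):
    """Walk the cell's row and column, checking 'scope' on each TH found.

    Returns the header texts and the number of cells that had to be visited.
    """
    found = []
    visited = 0
    for c in range(col):            # cells to the left in the same row
        visited += 1
        cell = grid[row][c]
        if cell["tag"] == "th" and cell.get("scope") == "row":
            found.append(cell["text"])
    for r in range(row):            # cells above in the same column
        visited += 1
        cell = grid[r][col]
        if cell["tag"] == "th" and cell.get("scope") == "col":
            found.append(cell["text"])
    return found, visited
```

Both functions reach the same two headers for the bottom-right cell of
this toy table, but the 'scope' version has to inspect neighbouring
cells to get there, and the amount of walking grows with the table.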

8. Yes, more could be done with algorithmics.  At the FedStats
workshop I was surprised to realize that what they characterized
as 'complex tables' were not, in my over-math-educated mind,
complex.  They were fully regular relations, with tree-form indices
associated with the rows and/or columns.  There was a hierarchical
structure to the categories represented by rows or columns and
groups of rows or columns.  There was no irregularity in the data
structure.  Just further structure above a flat list in the row and
column collections.

On the other hand, the 'irregular tables' for which we forced 'headers'
into the markup do exist in the wild. These tables have "variant
record structure" where some of the fields are re-cast to new headers
on the fly within the table. Regularity is a convenience of language;
less common in the wild. Consider the periodic table of elements in
Chemistry. It's not truly periodic, it is quasi-periodic with
progressively longer and longer rows. It's a blend of table and tree
with important properties from each. The same goes for the tree-table
structure we are working on in WAI-ARIA, which is a commonplace of file
browsing today.

There are cases where two triangular arrays are packed into one
rectangular display, and there is a critical diagonal -- on one side
one refers to the left hand header, on the other side to the right;
and similarly for top/bottom headers.

http://juicystudio.com/article/html-scope-headers-debate.php#overlaidtable
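
A toy Python model of such an overlaid table may help. Everything here
is an assumption made for illustration: the id naming (left-N, right-N,
top-N, bottom-N), the choice that the upper triangle reads the left and
top headers, and the function names. The point is only that the right
answer changes from cell to cell, while a rule that applies a header to
a whole row or column cannot tell the two triangles apart.

```python
def headers_for(row, col):
    """Header ids for the interior cell at (row, col) of the overlay."""
    if row == col:
        return []                               # the diagonal separates the two arrays
    if col > row:
        return [f"left-{row}", f"top-{col}"]    # upper triangle (assumed)
    return [f"right-{row}", f"bottom-{col}"]    # lower triangle (assumed)


def row_and_column_wide_headers(row, col):
    """What a whole-row / whole-column rule yields: every header on both axes."""
    return [f"left-{row}", f"right-{row}", f"top-{col}", f"bottom-{col}"]


def headers_attribute(row, col):
    """The space-separated value an authoring tool could write into 'headers'."""
    return " ".join(headers_for(row, col))
```

For any off-diagonal cell, the per-cell list is a strict subset of the
row-and-column-wide list, so the coarser rule always announces two
headers that do not apply.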

In the wild we also found train timetables that were weirder than
this. They overlaid two mathematical relations (Eastbound and
Westbound timetables, for example) re-using some fields. These can be
modeled by casting cells in the roles of data and headers, but not
with the coarse granularity afforded by 'scope.'

Train timetable with some interior header data that applies to the
left, some to the right:

http://lists.w3.org/Archives/Public/www-archive/2007May/att-0083/czsched.htm

Rationale at the time of HTML4:
http://www.w3.org/WAI/PF/guide.html#TABLE

Philosophy behind 'headers' as 'lowest level language' that covers all
cases:
http://www.w3.org/2002/Talks/06/24-US_FedStatsWorkshop/slide1-0.html

9.  There's room for further re-factorization in the space of
required information.
See in particular the strong appeal for anywhere-in-the-page LINK and META
capability.

http://lists.w3.org/Archives/Public/w3c-wai-hc/1997OctDec/0063.html

One way to avoid language bloat is to build in effective metadata
capabilities. Let's not overlook that lever as we work to ensure that
what is easy *is* easy, but what is hard *is* possible.

10. Current awareness, demographic research.

The WAI would benefit from better tools to gauge the state of accessibility
in the as-is Web.  Organizations active in the HTML WG have data and tools
that could make this possible.  If possible, we should start an exploratory
conversation about methods of asking the Web questions of interest.
Raw feature presence or absence is too coarse.  But searches that
find a collection of table-bearing pages with, say, 10% irregular examples
would, after combined mechanical and manual screening, give us a much
better set of samples to try algorithmic proposals against.  Algorithmic
proposals should include both user agent proposals as to how to extract
metadata from the markup *and* authoring tool algorithms as to when
and how to challenge the author so that the metadata extracted by the
user agent is more frequently accurate.

There has been some suggestion that 'headers', where used, has mostly
been used wrong.  Is there any way to see if the used-wrong vs. used-right
difference has a significant correlation with the tools used?  This would
be of interest to the toolsmiths and WAI Education and Outreach.
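
As a starting point for the mechanical half of such screening, here is
a hedged Python sketch. The three buckets are my own working
definitions, not an agreed measure of 'used wrong': a reference is
counted as ok if it names a TH id, as points_at_td if it names a TD id
(legitimate in HTML 4, but worth a manual look), and as dangling if it
names no cell at all. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class HeadersAudit(HTMLParser):
    """Record cell ids and every id reference found in 'headers' attributes."""

    def __init__(self):
        super().__init__()
        self.th_ids = set()
        self.td_ids = set()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("th", "td"):
            return
        a = dict(attrs)
        if a.get("id"):
            (self.th_ids if tag == "th" else self.td_ids).add(a["id"])
        self.refs.extend((a.get("headers") or "").split())


def audit(html):
    """Classify each 'headers' reference in one document."""
    parser = HeadersAudit()
    parser.feed(html)
    parser.close()
    report = {"ok": 0, "points_at_td": 0, "dangling": 0}
    for ref in parser.refs:
        if ref in parser.th_ids:
            report["ok"] += 1
        elif ref in parser.td_ids:
            report["points_at_td"] += 1
        else:
            report["dangling"] += 1
    return report
```

Run over a sample of pages and grouped by the generator META value
where one is present, counts like these could give a first, rough
answer to the tools question; the manual screening would still be
needed to judge whether the references that do resolve are the right
ones.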

The HTML4-era examples are old.  We will be looking for more current
examples to share via the HTML WG Wiki.

Received on Wednesday, 6 June 2007 18:11:19 UTC