While migrating content to Drupal 8, it would be great if the HTML could be cleaned up a bit. This is all part of Authoring Tool Accessibility Guidelines (ATAG) 2.0 Principle B.1: Fully automatic processes produce accessible content
- Upgrading outdated HTML4 tags to their HTML5 equivalents
- Alt text for images & titles for SVG files should be preserved
B.1.1.2 Content Auto-Generation During Authoring Sessions
"If the authoring tool provides the functionality for automatically generating web content during an authoring session, then at least one of the following is true: (Level A to meet WCAG 2.0 Level A success criteria; Level AA to meet WCAG 2.0 Level A and AA success criteria; Level AAA to meet all WCAG 2.0 success criteria)
(a) Accessible: The content is accessible web content (WCAG) without author input; or
(b) Prompting: During the automatic generation process, authors are prompted for any required accessibility information (WCAG); or
(c) Automatic Checking: After the automatic generation process, accessibility checking is automatically performed; or
(d) Checking Suggested: After the automatic generation process, the authoring tool prompts authors to perform accessibility checking."
https://www.w3.org/TR/2015/NOTE-IMPLEMENTING-ATAG20-20150924/#sc_b112
B.1.2.1 Restructuring and Recoding Transformations
"If the authoring tool provides restructuring transformations or re-coding transformations, and if equivalent mechanisms exist in the web content technology of the output, then at least one of the following is true: (Level A to meet WCAG 2.0 Level A success criteria; Level AA to meet WCAG 2.0 Level A and AA success criteria; Level AAA to meet all WCAG 2.0 success criteria)
(a) Preserve: Accessibility information (WCAG) is preserved in the output; or
(b) Warning: Authors have the default option to be warned that accessibility information (WCAG) may be lost (e.g. when saving a vector graphic into a raster image format); or
(c) Automatic Checking: After the transformation, accessibility checking is automatically performed; or
(d) Checking Suggested: After the transformation, the authoring tool prompts authors to perform accessibility checking."
https://www.w3.org/TR/2015/NOTE-IMPLEMENTING-ATAG20-20150924/#sc_b121
B.1.2.2 Copy-Paste Inside Authoring Tool
"If the authoring tool supports copy and paste of structured content, then any accessibility information (WCAG) in the copied content is preserved when the authoring tool is both the source and destination of the copy-paste and the source and destination use the same web content technology."
https://www.w3.org/TR/2015/NOTE-IMPLEMENTING-ATAG20-20150924/#sc_b122
B.1.2.4 Text Alternatives for Non-Text Content are Preserved
"If the authoring tool provides web content transformations that preserve non-text content in the output, then any text alternatives for that non-text content are also preserved, if equivalent mechanisms exist in the web content technology of the output."
https://www.w3.org/TR/2015/NOTE-IMPLEMENTING-ATAG20-20150924/#sc_b124
Comments
Comment #2
mgifford
Comment #3
mgifford
Think this should be part of Core.
Comment #4
mgifford
Changing component.
Comment #5
andrewmacpherson commented
Interesting idea @mgifford. I think this could be achievable to a degree, but some parts look very ambitious.
Updating HTML4 to HTML5 might be achievable, though there will doubtless be some gotchas like <acronym>. To convert the HTML automatically, we'd need some kind of library to be available to PHP during migration. PHP can use tidy, but the PHP module may not be installed on all systems (e.g. on Debian, installing the php-7.1 meta package does not cause the php-7.1-tidy package to be installed). Ideally we'd want to use some library which can be pulled in by Composer.
For B.1.1.2, automated accessibility checking is ambitious IMO. The migration row data has large chunks of HTML as unparsed strings, but most modern accessibility checkers want a rendered DOM. Automated migrations can take a long time, so I don't think option (b) prompting is feasible. I think option (d) "checking suggested" is all we'll manage, possibly with migrate messages for individual records.
I updated the summary to make it clear this was about ATAG.
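To make the <acronym> case concrete, here is a minimal sketch of a tag rename, written in Python only for brevity. The function name and the mapping are hypothetical, and nothing like this exists in the Migrate module. It works on the raw string with a regular expression, so it would also rewrite matches inside comments or attribute values; a real implementation would want a proper HTML parser.

```python
import re

# Hypothetical mapping of obsolete HTML4 elements to HTML5 replacements.
# Only <acronym> is named in this thread; extend the mapping as needed.
RENAMES = {"acronym": "abbr"}

# Matches the start of an opening or closing tag, e.g. "<acronym" or "</ACRONYM".
_TAG_START = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)\b")


def rename_obsolete_tags(html, renames=RENAMES):
    """Rename obsolete elements, leaving attributes and content untouched."""

    def swap(match):
        slash, name = match.group(1), match.group(2)
        replacement = renames.get(name.lower())
        # Leave any tag that is not in the mapping exactly as it was.
        return f"<{slash}{replacement}" if replacement else match.group(0)

    return _TAG_START.sub(swap, html)


if __name__ == "__main__":
    print(rename_obsolete_tags('<ACRONYM title="World Wide Web">WWW</ACRONYM>'))
```

The title attribute is carried over unchanged, which is the B.1.2.1 option (a) "Preserve" behaviour for this one element.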
Comment #6
mgifford
It totally could be huge. But it could also be a matter of searching for outdated code & flagging it as a potential problem.
Images without alt text and <FONT> tags should be easy enough to count & keep track of. It would really be most useful if someone with a big migration were able to invest in this idea so that their site would be cleaner and everyone else would benefit.
Also, just running the HTML against something like aXe or pa11y should be possible. Heck, just running the HTML through a tidy tool would be great for the formatting alone. It would be great to strip out remaining MS Word formatting too.
I wish the W3C released a JavaScript conversion tool with every major release of HTML/CSS...
I was just thinking of the Core conversation and ways to incorporate more Core initiatives.
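As a rough illustration of the counting idea above, here is a minimal sketch using only Python's standard html.parser. The class name, the audit() helper and the list of obsolete tags are assumptions made for the example, not part of any Drupal API; a real migration would more likely do this in PHP and report the counts as migrate messages.

```python
from collections import Counter
from html.parser import HTMLParser

# Assumed list of obsolete elements to flag; only <FONT> is named in the thread.
OBSOLETE_TAGS = {"font", "center", "acronym"}


class LegacyMarkupCounter(HTMLParser):
    """Counts obsolete elements and images that have no alt attribute."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag in OBSOLETE_TAGS:
            self.counts[f"obsolete <{tag}>"] += 1
        # An empty alt="" is valid for decorative images, so only a
        # missing attribute is flagged.
        if tag == "img" and "alt" not in dict(attrs):
            self.counts["img without alt"] += 1


def audit(html):
    """Return a dict of problem counts for one chunk of migrated HTML."""
    parser = LegacyMarkupCounter()
    parser.feed(html)
    parser.close()
    return dict(parser.counts)


if __name__ == "__main__":
    print(audit('<FONT color="red">Hi</FONT><img src="a.png"><img src="b.png" alt="">'))
```

This only flags problems and fixes nothing, which lines up with option (d) "checking suggested" rather than a full accessibility check against a rendered DOM.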
Comment #7
heddn
I'm not sure this should block migrate stability. Nice ideas, BTW.
Comment #8
mgifford
Yup. Not a blocker. Just an opportunity.
Comment #19
quietone commented
The Migrate Drupal UI module was approved for removal in #3371229: [Policy] Migrate Drupal and Migrate Drupal UI after Drupal 7 EOL.
This is Postponed. The status is set according to two policies: "Remove a core extension and move it to a contributed project" and "Extensions approved for removal".
The deprecation work is in #3463321: Deprecate Migrate Drupal UI and the removal work in #3522601: [meta] Tasks to remove Migrate Drupal UI module.
Migrate Drupal UI will not be moved to a contributed project. It will be removed from core after the Drupal 12.x branch is open.
Comment #20
quietone commented
The Migrate Drupal and Migrate Drupal UI modules are deprecated and will be removed from Drupal 12. For these modules, effort is now focused on bug fixes and necessary tasks. Therefore, this feature request is closed as won't fix.
Thanks to all for working on this!
Comment #22
quietone commented
And tag cleanup