Problem/Motivation

The content_first_audit module currently lacks checks for several common HTML authoring mistakes that affect accessibility and content quality:

  • Empty block-level tags used as spacers. Editors sometimes insert empty <p>, <h1> to <h6>, <li>, <strong>, and similar block or phrasing elements (containing only whitespace) as a visual line-break trick instead of using <br> or CSS margins. This produces semantically empty markup that hurts accessibility. Generic inline containers such as <span> are intentionally excluded from this check because they are often used as style hooks with no required text content.

  • Images missing the alt attribute. Every <img> element must carry an alt attribute. An empty value (alt="") is acceptable only for decorative images identified by role="presentation" or aria-hidden="true". Any image without alt, or with an empty alt and no decorative marker, should be flagged.

  • Disallowed child elements inside headings. Block-level elements such as <p>, <div>, or <ul> nested directly inside <h1> to <h6> are invalid per the HTML specification, which allows only phrasing content in headings, and can break assistive technologies.

Steps to reproduce

  1. Install and enable content_first_audit.
  2. Create a node whose body field contains any of the following patterns:
    • <p>&nbsp;</p> or <p> </p> used as a blank line.
    • <img src="photo.jpg"> with no alt attribute.
    • <img src="deco.png" alt=""> without role="presentation" or aria-hidden="true".
    • <h2><p>Section title</p></h2>.
  3. Run the content audit for that node.
  4. Observe that none of the above issues are currently reported.

Proposed resolution

Add three new audit checks inside content_first_audit:

  1. EmptyBlockTag audit: parse the field HTML with a DOM library and flag any element whose tag is in a configurable deny-list (defaulting to p, h1, h2, h3, h4, h5, h6, li, strong, em, blockquote, td, th) and whose trimmed textContent is empty. Trimming must treat non-breaking spaces (&nbsp;) as whitespace, since <p>&nbsp;</p> is one of the target patterns.
  2. ImageAltAttribute audit: select all <img> elements and flag any that:
    • are missing the alt attribute entirely, OR
    • have alt="" but are not marked decorative (role="presentation" or aria-hidden="true").
  3. InvalidHeadingContent audit: select all <h1> to <h6> elements and flag any that contain direct block-level child elements (p, div, ul, ol, table, etc.) not permitted by the HTML specification. Consider using the W3C Markup Validator or an equivalent library for authoritative element-nesting rules.
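
The detection rules for the three checks can be sketched in a single pass over the markup. The example below is a language-neutral illustration written in Python with the standard-library html.parser; the module itself is a Drupal (PHP) project, so this is not its implementation. The names AuditParser and audit(), and the (check name, detail) tuple format, are hypothetical, and the set of block tags checked inside headings covers only the elements named above, not the full "etc.".

```python
from html.parser import HTMLParser

# Defaults copied from the issue text; a real audit would make these configurable.
EMPTY_DENY_LIST = {"p", "h1", "h2", "h3", "h4", "h5", "h6", "li",
                   "strong", "em", "blockquote", "td", "th"}
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
BLOCK_IN_HEADING = {"p", "div", "ul", "ol", "table"}  # assumption: only the tags named above
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class AuditParser(HTMLParser):
    """Hypothetical single-pass sketch; collects (check_name, detail) tuples."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # &nbsp; arrives as U+00A0
        self.violations = []
        self.open = []  # stack of [tag, has_text] for elements still open

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self._check_img(dict(attrs))
        # Only a *direct* child of a heading is checked, as the rule states.
        if self.open and self.open[-1][0] in HEADINGS and tag in BLOCK_IN_HEADING:
            parent = self.open[-1][0]
            self.violations.append(("InvalidHeadingContent", f"<{tag}> inside <{parent}>"))
        if tag not in VOID_TAGS:
            self.open.append([tag, False])

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <img ... />: check it, never push it.
        if tag == "img":
            self._check_img(dict(attrs))

    def handle_data(self, data):
        # str.strip() also removes U+00A0, so "&nbsp;" counts as whitespace.
        if data.strip():
            for entry in self.open:
                entry[1] = True

    def handle_endtag(self, tag):
        # Close the nearest matching open element; stray end tags are ignored.
        for i in range(len(self.open) - 1, -1, -1):
            if self.open[i][0] == tag:
                has_text = self.open[i][1]
                del self.open[i:]
                if tag in EMPTY_DENY_LIST and not has_text:
                    self.violations.append(("EmptyBlockTag", f"<{tag}> has no text"))
                break

    def _check_img(self, attrs):
        src = attrs.get("src", "")
        decorative = (attrs.get("role") == "presentation"
                      or attrs.get("aria-hidden") == "true")
        if "alt" not in attrs:
            self.violations.append(("ImageAltAttribute", f"{src}: missing alt"))
        elif not attrs["alt"] and not decorative:
            self.violations.append(("ImageAltAttribute", f"{src}: empty alt, not decorative"))


def audit(html):
    """Run the three sketched checks over an HTML fragment."""
    parser = AuditParser()
    parser.feed(html)
    parser.close()
    return parser.violations
```

Because the EmptyBlockTag rule as written looks at text content only, a paragraph that wraps nothing but an image, such as <p><img src="a.png" alt="A"></p>, would also be reported by this sketch. An implementation may want to exempt elements that contain media.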

Each audit should follow the existing plugin architecture: implement ContentAuditPluginInterface, be tagged as a service, and return structured violations with the offending HTML snippet, a human-readable description, and a severity level.
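
The shape of a structured violation can be illustrated as follows. This is a hedged, language-neutral sketch in Python rather than the module's PHP code: the issue does not give the method signatures of ContentAuditPluginInterface, so the ContentAudit protocol, its audit() method, the Violation fields, and the two Severity values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from html.parser import HTMLParser
from typing import Protocol


class Severity(Enum):
    # Assumption: the issue asks for "a severity level" but names no levels.
    WARNING = "warning"
    ERROR = "error"


@dataclass(frozen=True)
class Violation:
    snippet: str        # the offending HTML snippet
    description: str    # human-readable description
    severity: Severity


class ContentAudit(Protocol):
    """Hypothetical stand-in for ContentAuditPluginInterface."""

    def audit(self, html: str) -> list[Violation]: ...


class _ImgScanner(HTMLParser):
    """Collects the raw start tag of every <img> that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag == "img" and "alt" not in dict(attrs):
            self.found.append(self.get_starttag_text())


class MissingAltAudit:
    """Toy audit covering only the 'alt missing entirely' case."""

    def audit(self, html: str) -> list[Violation]:
        scanner = _ImgScanner()
        scanner.feed(html)
        scanner.close()
        return [
            Violation(tag, "Image is missing the alt attribute.", Severity.ERROR)
            for tag in scanner.found
        ]
```

Returning the raw start tag as the snippet lets an editor locate the problem in the field body without re-parsing it.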

Comments

eduardo morales alberti

Issue summary: View changes

eduardo morales alberti

Status: Active » Fixed

Added to the merged train, ready

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.