The test coverage for theme output across core is patchy, and in some places not well implemented.
Tests that compare one string of HTML output to another are fragile, and their failures do not necessarily represent a failure for the user (for example, because an element's attributes are in a different order).
Improve coverage of themed output
Test core themes separately
Use more robust analysis tools, such as CSS selectors or XPath/SimpleXML, to perform assertions against rendered output.
More - TBD
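To illustrate the difference between string comparison and structural assertions, here is a minimal sketch. It uses Python's standard library purely for illustration and is not core's test API; the markup, the `/node/1` path, and the helper names `find_elements` and `link_is_rendered` are all hypothetical. It also assumes the rendered output is well-formed XML. The same idea should carry over to PHP, for instance with SimpleXML's `xpath()` method.

```python
import xml.etree.ElementTree as ET

# Two hypothetical renderings of the same link; only the attribute order differs.
EXPECTED = '<div class="item"><a href="/node/1" class="more" title="Read more">Read more</a></div>'
ACTUAL = '<div class="item"><a class="more" title="Read more" href="/node/1">Read more</a></div>'


def find_elements(markup, xpath):
    """Parse well-formed markup and return the elements matching the XPath."""
    root = ET.fromstring(markup)
    return root.findall(xpath)


def link_is_rendered(markup):
    """Check for exactly one 'more' link to /node/1 with the expected text."""
    matches = find_elements(markup, ".//a[@class='more'][@href='/node/1']")
    return len(matches) == 1 and matches[0].text == "Read more"


# A raw string comparison reports a difference the user would never see.
strings_match = EXPECTED == ACTUAL

# A structural assertion ignores attribute order and checks what matters.
structure_matches = link_is_rendered(ACTUAL)
```

Here `strings_match` is false even though both renderings are equivalent for the user, while `structure_matches` is true. A test written this way would only fail when the link itself is missing, duplicated, or changed.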
User interface changes