Hi,
I'm trying to import the following XHTML file (which is converted from a DITA XML sample file):
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xml:lang="en-us" lang="en-us">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<meta name="DC.Format" content="XHTML"/>
<link rel="stylesheet" type="text/css" href="../commonltr.css"/>
<title>Changing the oil in your car</title>
</head>
<body id="changeoil">
<h1 class="title topictitle1">Changing the oil in your car</h1>
<div class="body taskbody">
<p class="shortdesc">
Once every 6000 kilometers or three months, change the oil in your car.
</p>
<div class="section context">
<p class="p">Changing the oil regularly
will help keep the engine in good condition.
</p>
</div>
<p class="li stepsection">To change the oil:</p>
<ol class="ol steps">
<li class="li step"><span class="ph cmd">Remove the old oil filter.</span></li>
<li class="li step"><span class="ph cmd">Drain the old oil.</span></li>
<li class="li step"><span class="ph cmd">Install a new oil filter and gasket.</span></li>
<li class="li step"><span class="ph cmd">Add new oil to the engine.</span></li>
<li class="li step"><span class="ph cmd">Check the air filter and replace or clean it.</span></li>
<li class="li step"><span class="ph cmd">Top up the windshield washer fluid.</span></li>
</ol>
</div>
</body>
</html>
However, the HTML file isn't recognized as a page and therefore doesn't show in blue in step 2. And I get the following error message:
I think (due to file suffix 'document') that 'sites/default/files/garage/tasks/changingtheoil.html' is not a html page I can process.
Is this due to some misconfiguration (I tried various combinations of the settings, all with the same result) or a bug?
Any pointers welcome.
Frank
Comment | File | Size | Author |
---|---|---|---|
#6 | add-xml-mime-type-2448437-5.patch | 464 bytes | Frank Ralf |
Comments
Comment #1
dman commented:
I'd expect that as the file is called 'changingtheoil.html' on the system, then it would be picked up as usual.
Inspecting the stuff that happens inside import_html_guess_file_class() and _import_html_file_classes() ... I'm going to start by guessing that the MIME type registry on your system may not be recognising it as type "text/html", but as something else.
Are you able to see what the result is if you run the PHP function:
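The snippet referred to here is not reproduced in this copy of the thread. As a rough sketch of that kind of check, assuming PHP's fileinfo extension is enabled, the following reports the MIME type that content sniffing assigns to a file. The helper name `example_detect_mime()` is hypothetical and not necessarily what was posted; the path is the one from the error message above.

```php
<?php
/**
 * Hypothetical helper (not part of import_html): report the MIME type
 * that PHP's fileinfo extension detects from a file's contents.
 *
 * Returns the MIME type string, or FALSE if the file cannot be read.
 */
function example_detect_mime($path) {
  if (!is_readable($path)) {
    return FALSE;
  }
  // FILEINFO_MIME_TYPE returns only the type, e.g. "text/html",
  // without a charset suffix.
  $finfo = new finfo(FILEINFO_MIME_TYPE);
  return $finfo->file($path);
}

// Path taken from the error message in the original report.
var_dump(example_detect_mime('sites/default/files/garage/tasks/changingtheoil.html'));
```

Because this kind of detection inspects the file's contents rather than its name, a leading XML declaration may change the reported type even though the file ends in .html.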
Comment #2
Frank Ralf commented:
Hi dman,
Thanks for the quick reply. I ran your script on my hosted server and it did indeed return "application/xml" instead of "text/html" for the MIME type. When I comment out the XML and DOCTYPE declarations, the returned MIME type is "text/html" and the module works properly.
So is this an error or a misconfiguration on the server side? How can I amend this?
TIA
Frank
JFTR, what the W3C says:
Comment #3
Frank Ralf commented:
I've had a closer look at the functions you mentioned. import_html_guess_file_class() doesn't cater for the "application/xml" MIME type, so the generic "document" class is returned:
So I'd suggest either adding the following more specific code to that function:
Or giving the file extension precedence over the MIME type when guessing the file format.
Frank
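The code referred to in this comment is not reproduced in this copy of the thread. Purely as an illustration of the two options, and not the module's actual implementation, a classifier along these lines would cover both. The function name `example_guess_file_class()`, the extension list, and the handling of `application/xhtml+xml` are assumptions; only the class names 'page' and 'document' and the `application/xml` MIME type come from the thread.

```php
<?php
/**
 * Hypothetical sketch, not the actual import_html_guess_file_class().
 * Classifies a file as 'page' or 'document' from its path and MIME type.
 */
function example_guess_file_class($path, $mime) {
  // Option 2: let the file extension take precedence over the MIME type.
  $page_extensions = array('htm', 'html', 'xhtml');
  $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));
  if (in_array($extension, $page_extensions, TRUE)) {
    return 'page';
  }

  // Option 1: accept XML MIME types as pages alongside text/html.
  switch ($mime) {
    case 'text/html':
    case 'application/xhtml+xml':
    case 'application/xml':
      return 'page';
  }

  // Anything else falls through to the generic class.
  return 'document';
}

// The file from the report: .html extension, sniffed as application/xml.
echo example_guess_file_class('garage/tasks/changingtheoil.html', 'application/xml'), "\n"; // page
```

In this sketch either option alone would classify the reported file as a page. Checking the extension first is the broader change, since it also overrides content sniffing for any other .html file.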
Comment #4
dman commented:
OK, I guess we can do that.
I'm an XHTML standards freak, but I still never thought that 'application' was a good description for an xhtml document.
import_html can happily run on pure XML documents without too much effort, though, so it's fine to support application/xml for that reason.
I think your additional case would be a fine patch.
Comment #5
Frank Ralf commented:
Here's the patch ;-)
Frank
Comment #6
Frank Ralf commented:
Re-uploaded the patch in UTF-8 format without BOM.
Comment #8
dman commented:
Thanks! Committed to 7.x-1.x. I'll see if I can get it into 7.x-2.x dev as well.
Comment #10
dman commented
Comment #11
Frank Ralf commented:
Thanks! I'm closing this issue then ;-)