When creating a node by copying and pasting from something like Word, the pasted content will often already include the doctype and html tags. The current code in node_prepare wraps everything in fake html elements without checking whether they already exist, resulting in a ton of error messages.

I made the following change so that the wrapping only happens when an html tag isn't already found in the input, by putting an if/else around it:

function htmltidy_fragment($input, $format, &$errors, &$warnings) {
  if ($input) {
    // Only wrap the input if it does not already contain an <html> tag.
    if (!strstr($input, '<html') && !strstr($input, '<HTML')) {
      // Pretend it's a full document. This declaration just suppresses one of
      // the warnings.
      $html = '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">';
      // Put a new line after the fake headers so our content starts at the
      // beginning of a line. This way we can get correct line/column info by
      // just subtracting one from the line number.
      $html .= "<html><head><title></title></head><body>\n";
      $html .= $input;
      $html .= '</body></html>';
    }
    else {
      $html = $input;
    }
    $output = htmltidy_string($html, $format, $errors, $warnings);
...

Sorry this is not a real patch; I don't have a CVS checkout to diff against right now.
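
One possible refinement, offered only as a sketch: the two strstr() calls above catch '<html' and '<HTML' but would miss mixed-case tags such as '<Html'. Assuming PHP 5 or later (stripos() is not available in PHP 4), a single case-insensitive search covers every casing. The function names htmltidy_has_html_tag and htmltidy_wrap_fragment are hypothetical and are not part of the module; they only isolate the check and the wrapping so they can be tried on their own.

```php
<?php
// Hypothetical helper, not part of the htmltidy module: reports whether the
// input already contains an opening <html> tag, in any letter case.
function htmltidy_has_html_tag($input) {
  // stripos() returns FALSE when the needle is absent. Position 0 is a valid
  // match, so the comparison against FALSE has to be strict.
  return stripos($input, '<html') !== FALSE;
}

// Hypothetical helper: wraps a fragment in a minimal document, but leaves
// input that already looks like a full document untouched.
function htmltidy_wrap_fragment($input) {
  if (htmltidy_has_html_tag($input)) {
    return $input;
  }
  $html = '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">';
  // Newline after the fake headers so the content starts at the beginning
  // of a line, as in the change above.
  $html .= "<html><head><title></title></head><body>\n";
  $html .= $input;
  $html .= '</body></html>';
  return $html;
}
```

With these, the if/else in htmltidy_fragment could shrink to a single call such as $html = htmltidy_wrap_fragment($input); before the result is handed to htmltidy_string().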

Comments

michaelfavia

Status: Needs review » Fixed

Thank you.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.