What this module is and what it does

This module is a bridge between Drupal and the latest version of the PHP Simple HTML DOM Parser (simplehtmldom) library, so that Drupal developers can easily write their modules using its API.

The library gives you a simple way to parse the HTML DOM tree wherever you need to parse HTML:

  • in your Drupal input filters
  • in hook_alter() implementations
  • when migrating HTML sites to Drupal
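
As an illustration of the input filter case, here is a minimal Drupal 7 sketch that adds rel="nofollow" to every link in filtered text. The module name "mymodule", the filter key and the callback name are hypothetical; the sketch also assumes the simplehtmldom library has already been loaded, and it returns the text unchanged when the simple_html_dom class is not available.

```php
<?php
/**
 * Implements hook_filter_info().
 *
 * "mymodule" and "mymodule_nofollow" are hypothetical names for this sketch.
 */
function mymodule_filter_info() {
  return array(
    'mymodule_nofollow' => array(
      'title' => t('Add rel="nofollow" to links'),
      'process callback' => '_mymodule_nofollow_process',
    ),
  );
}

/**
 * Filter process callback: sets rel="nofollow" on every <a> element.
 */
function _mymodule_nofollow_process($text, $filter = NULL, $format = NULL, $langcode = NULL, $cache = NULL, $cache_id = NULL) {
  // Assumption: the library is loaded elsewhere; bail out if it is not.
  if (!class_exists('simple_html_dom')) {
    return $text;
  }

  $dom = new simple_html_dom();
  $dom->load($text);

  // Attributes are exposed as properties on each element.
  foreach ($dom->find('a') as $link) {
    $link->rel = 'nofollow';
  }

  $result = $dom->save();

  // Release resources to avoid a memory leak in some versions.
  $dom->clear();
  unset($dom);

  return $result;
}
```

The class_exists() guard is a design choice for this sketch: a filter that cannot find the library passes the text through rather than breaking page rendering.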

Easy HTML parsing

  • Have you ever wanted to run str_replace() over only the plain-text parts of an HTML document, or just extract that text?
  • Now you can do it with a Drupal module.
  • You can also get and set all kinds of HTML tags, their attributes and their inner text.
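
The plain-text replacement described above could be sketched as follows. The function name example_replace_in_text() is hypothetical, and the sketch assumes the simplehtmldom library is already loaded; without it, the input is returned unchanged.

```php
<?php
/**
 * Runs str_replace() over the text nodes of an HTML string only.
 *
 * Hypothetical helper for illustration; tag names and attribute values
 * are not touched because only 'text' nodes are visited.
 */
function example_replace_in_text($html, $search, $replace) {
  // Assumption: the library is loaded elsewhere; bail out if it is not.
  if (!class_exists('simple_html_dom')) {
    return $html;
  }

  $dom = new simple_html_dom();
  $dom->load($html);

  // find('text') returns the plain-text fragments of the document.
  foreach ($dom->find('text') as $text_node) {
    $text_node->innertext = str_replace($search, $replace, $text_node->innertext);
  }

  $result = $dom->save();

  // Release resources to avoid a memory leak in some versions.
  $dom->clear();
  unset($dom);

  return $result;
}
```

For example, calling example_replace_in_text('<p class="cat">A cat sat.</p>', 'cat', 'dog') is meant to change the visible word while leaving the class attribute as it was, which a plain str_replace() over the whole string would not do.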

Installation and Usage

For 7.x-2.x and 6.x-2.x, first download the simplehtmldom library (see README.txt).
Then download and enable the module.

Usage example for Drupal 7 (use in your module's code):

  // Create a DOM object.
  $html_obj = new simple_html_dom();

  // Load HTML from a string (the markup here is only a placeholder).
  $html_obj->load('<html><body><p>Hello!</p></body></html>');

  // Remove all plain text fragments.
  foreach ($html_obj->find('text') as $plain_text_obj) {
    $plain_text_obj->innertext = '';
  }

  // Display the results.
  echo $html_obj;

  // Release resources to avoid a memory leak in some versions.
  $html_obj->clear();
  unset($html_obj);

Here are the docs:

The library is released under the MIT License, which according to Wikipedia is compatible with GPL 2.0, so it can be embedded in Drupal modules. However, starting with version 2.x the library is no longer included in the module package.

Authors and maintainers:

Author: rsvelko from the team of Segments.at - your Drupal partner.
Co-maintainer: Konstantin Komelin

Versions 6.x-2.x and 7.x-2.x:

These versions do not bundle the simplehtmldom library; it is now an external dependency. This lets us use whichever version of the simplehtmldom library we like. Please follow the instructions in README.txt to install the module properly.

Support for these versions was initially sponsored by Explorable.com

If you're using the Theme developer module, please note that it is currently incompatible with simplehtmldom 7.x-2.x. Use 7.x-1.12 instead.

