I created a filter quite some time ago that extends the codefilter module with syntax highlighting. I would like to make it available to the Drupal community.

The syntax highlighting engine used is GeSHi (http://qbnz.com/highlighter/), which is written in PHP, so it is guaranteed to work wherever Drupal works. I also added a config page that lets admins enable or disable highlighting and set the default language. You can also change the behaviour on a per-item basis by passing arguments in the code tag.

To have the code highlighted in a language other than the default, do this:

<code language="html">

If you want to turn it off temporarily do this:

<code highlight="false">
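
For anyone curious how a filter might read those two arguments, here is a rough sketch. It is not taken from the module: the function name sketch_parse_code_tag and the 'php' default are assumptions made up for illustration, and the real filter may parse the tag quite differently.

```php
<?php
// Hypothetical helper, not the module's own code: reads the attributes of an
// opening <code ...> tag and decides how the block should be handled.
function sketch_parse_code_tag($tag, $default_lang = 'php') {
  $settings = array('language' => $default_lang, 'highlight' => TRUE);
  // Collect every name="value" pair found in the tag.
  if (preg_match_all('/(\w+)\s*=\s*"([^"]*)"/', $tag, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
      $name = strtolower($m[1]);
      if ($name === 'language') {
        $settings['language'] = strtolower($m[2]);
      }
      elseif ($name === 'highlight') {
        // Only the literal value "false" switches highlighting off.
        $settings['highlight'] = (strtolower($m[2]) !== 'false');
      }
    }
  }
  return $settings;
}
```

A tag without either argument simply comes back with the default language and highlighting switched on.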

I would love to have this in the Drupal repository, but I am not sure how to go about it yet; I will look into that in due time. In the meantime, if you want to try it out, I set up a project page for it at http://www.filbar.org/project/geshifilter.

Comments and suggestions welcome.

Vince
vfilby@gmail.com

Comments

orion2012’s picture

Thanks for taking the time with this. I was just about to attempt the same thing with GeSHi. I've tested with several languages and all appears to be working tremendously. I'll touch back if I find any bugs.
Thanks again!

mhutch’s picture

Your module looks good. I'd be glad to see this go up on drupal.org.

I wrote and released something similar myself about six months ago, though it was much more minimalistic. I never got around to finishing it off with all the character escaping stuff, and I guess I'll probably take it down soon.

Will you be adding proper CSS support? If I recollect correctly, GeSHi generates stylesheets dynamically, for which it needs to know what languages are highlighted in the page, and I didn't find an easy way to do this in Drupal. I started to extract some of the relevant CSS from GeSHi but didn't get far looking into including different per-language stylesheets (for browser caching). Feel free to merge any of my code into yours, though there isn't much.

There are also some other GeSHi features that would be nice to expose, like line numbering.

Anyway, great job so far!

Geary’s picture

Hey guys,

I have yet another version of the GeSHi filter. :-)

Michael, several weeks ago I took your code and added a few features that I wanted in my blog for articles like this one. I added zebra striping along with a way to switch languages in the middle of a code block and keep the striping correct. The striping let me play with the appearance, using a proportional font with word wrap. Also I added the escaping so it could work with other filters, in particular Markdown.

I got it in good enough shape to use, but not really "finished". For one thing, I changed the code to generate inline CSS instead of using a separate stylesheet, because that way the styling works in more RSS aggregators. But I wanted to add something to check whether a feed or normal HTML is being generated, and use a stylesheet for HTML and inline styles for feeds.

Also, as I experimented with ways of getting it to work cleanly with Markdown and my other filters, I ended up using a <geshi> tag instead of a <code> tag. So this is how the code looks:

<geshi language>
code in language
<geshi newlanguage />
code in newlanguage
</geshi>
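
To make the idea of keeping the striping correct across a language switch concrete, here is a minimal sketch. It is not Mike's code: the function name sketch_striped_lines and the row layout are invented for illustration, and it only tracks language and stripe per line rather than calling GeSHi.

```php
<?php
// Hypothetical sketch: walk the body of a <geshi lang> block, switch language
// at each <geshi newlang /> marker, and number the lines continuously so the
// odd/even striping carries on across the switch.
function sketch_striped_lines($body, $first_lang) {
  $rows = array();
  $lang = $first_lang;
  $n = 0;
  foreach (explode("\n", $body) as $line) {
    // A switch marker changes the language and produces no output row.
    if (preg_match('/^\s*<geshi\s+(\w+)\s*\/>\s*$/', $line, $m)) {
      $lang = $m[1];
      continue;
    }
    $rows[] = array(
      'lang' => $lang,
      'stripe' => ($n % 2 === 0) ? 'even' : 'odd',
      'text' => $line,
    );
    $n++;
  }
  return $rows;
}
```

Because the counter is not reset at a marker, the first line after a switch continues the odd/even pattern instead of starting over.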

I meant to get back to it and try a more conventional <code> tag, but got busy with other things.

What say we all work together on this filter and get it to where it does what each of us wants? Vince, it looks like your filter is the most fully developed. Would you like to take a look at my code and see if my features can be merged in? I was meaning to take a look at merging the code myself, I'm just swamped with some other things right now. So if you don't get to it I'll be happy to take a look at it later. I just don't want to lose my zebra striping and visual tweaks after putting so much work into them. :-)

I have CVS access and would be happy to check in the combined code. I was meaning to check in my code, but seeing that you're working on it too it makes sense to coordinate. You'll want to apply for CVS access as well; there's something in the developer handbook that tells you what to do.

One other question is whether this GeSHi code should be combined into the codefilter module that already exists. It looks like we've each done GeSHi-only modules. Should this code be added to, or perhaps replace, the existing codefilter module, or be checked in as a separate geshi module? I'm inclined to keep it separate for fear of breaking something somebody is doing with the existing module, but that may be too conservative. What do you think?

If you would like to take a look at my code, warts and all, you can find it here.

Thanks!

-Mike

mhutch’s picture

Nice! Like the additional features, though the syntax for switching language isn't very 'xml-ish'. Hopefully the new version of GeSHi should remove the need for this.

I agree there needs to be a unified GeSHiCodeFilter module, or there will probably be more in our group :-) The best thing to aim for would be to fully expose all GeSHi features and any additional features people need like zebra striping as filter options. The architecture of the module that is checked into CVS would ideally allow these to be incrementally added, so that they can be submitted as patches by whoever needs them and is willing to write the code. I'd like to be able to contribute features when I need them, but don't have the time to work on the module right now.

I'm not sure how CSS should be treated, because as you say, some RSS aggregators could have issues. I actually was having trouble trying to determine which stylesheets were needed in a particular page, as you obviously don't want to load all the language stylesheets all the time. I don't think that Drupal has any easy way to make the page stylesheets dependent on the content. The only way I could see was to hit the database in hook_menu to determine which were needed for a particular node, then insert a CSS include into the header. But I'd prefer to avoid hitting the DB.

Update: Using hook_nodeapi, I think we can add a key to the node which will be cached, so we don't have to hit the DB too often.

I think the existing CodeFilter should be left alone, as some users may not want to have the GeSHi dependency.
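
As a rough illustration of working out which per-language stylesheets a piece of content needs, something along these lines could produce the list to store with the node. This is only a sketch under assumptions: the function name sketch_languages_used is made up, it assumes the <code language="..."> syntax from the original post, and it does not touch Drupal's node or cache APIs at all.

```php
<?php
// Hypothetical sketch: find which languages a piece of content uses, so that
// only the matching per-language stylesheets would need to be included.
function sketch_languages_used($text, $default_lang = 'php') {
  $langs = array();
  // Look at every opening <code ...> tag in the text.
  if (preg_match_all('/<code\b([^>]*)>/i', $text, $tags)) {
    foreach ($tags[1] as $attrs) {
      if (preg_match('/\blanguage\s*=\s*"([^"]+)"/i', $attrs, $m)) {
        $langs[strtolower($m[1])] = TRUE;
      }
      else {
        // No explicit language: the block uses the default.
        $langs[$default_lang] = TRUE;
      }
    }
  }
  $langs = array_keys($langs);
  sort($langs);
  return $langs;
}
```

The result is a sorted, de-duplicated list of language names, which is small enough to keep alongside the node.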

Michael

mhutch’s picture

I have started on a unified filter, with a new 4.7 release that merges some of Vince's code. This is a quick and dirty release, as I won't be able to work on it for a while due to exams, but I realised that other people might want to use it before then.

It's available at http://compsoc.dur.ac.uk/~mjh/project/GeSHicodefilter

In other news, there's a fourth GeSHicodefilter module, though not released ;-)

marble’s picture

I think you're missing an underscore in geshicodefilter_filter, in the line that starts form_set_error. I tried to add an issue on your website, but it didn't like a project of <none> and didn't have anything else in the dropdown to choose...

marble’s picture

Also, I had to comment out the following:

  //check language is enabled
  $types=variable_get('geshicodefilter_types_'.$format, array()); 
  if(!in_array($lang, $types)) {
    $lang = variable_get('geshicodefilter_default_lang_'.$format, 'c');
  }

because as far as I can see, there's nowhere that geshicodefilter_types_* gets set, so $lang always ends up as 'c' after this.
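
One possible way to keep the check without hitting this problem is to treat an empty list as "nothing configured yet". The sketch below is only a guess at a workaround, not the module's fix: sketch_resolve_language is a made-up name, and the settings are passed in as plain arguments instead of being read with variable_get().

```php
<?php
// Hypothetical stand-in for the check above; $enabled plays the role of the
// geshicodefilter_types_* setting and $default_lang that of the default.
function sketch_resolve_language($lang, array $enabled, $default_lang) {
  // An empty list is treated as "nothing configured yet", so the requested
  // language is kept rather than being replaced by the default.
  if (empty($enabled) || in_array($lang, $enabled, TRUE)) {
    return $lang;
  }
  return $default_lang;
}
```

With an empty list, a request for 'java' stays 'java'; the default is only used once a list exists and the requested language is not in it.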

davidude’s picture

I have noticed that if you type too long a line, "codefilter.module" shortens your code to the width of the page using trim() or wordwrap() or something similar, as long as you use the "<code>" tags. However, if you use "<?php", Geshifilter and codefilter don't wrap or trim the text to the page width. What is missing in their code?

Also I noticed that if there is a glossary term in the code it will be added and that will mess up the block of code.

Here is what I am talking about. You might want to look at the page in IE and Firefox.

Thanks!

mhutch’s picture

To fix the glossary terms, you should change the order in which the filters are applied.

If I remember correctly, with "<?php" the built in php highlighting is used, so the behaviour is different.

dvessel’s picture

I really wish I could use this but all those 'font' tags. eh.. I understand that it can be changed but I need php5? The documentation on this is insane!! Just for code highlighting and it's so huge!

http://qbnz.com/highlighter/geshi-doc.html

-joon
www.dvessel.com

pitpit’s picture

Hi,

You may want to protect the "geshi" directory against external access.

For example, on your website:
http://www.filbar.org/modules/geshifilter/geshi/contrib/example.php

or

http://www.filbar.org/modules/geshifilter/geshi/contrib/cssgen.php

pitpit’s picture

The two GeSHi filters provided here are bug-prone with Drupal 4.7.

I recoded the filter to be more 4.7-compliant, fixed some bugs, and added some new features:

  • CSS include issue fixed
  • directory no longer hardcoded, which allows copying GeSHi anywhere
  • possibility to change the container that goes around the code (pre, div or nothing)
  • possibility to change the styling mode (in-line styles or external CSS classes)
  • possibility to enable line numbers
  • default language and allowed languages are set per input format.

Download this release at http://www.ubisum.com/node/20.

Comments and issues are welcome!

vfilby> It would be good to post geshifilter as a new module on drupal.org and in CVS. If you agree, I can do it.

mhutch’s picture

Looks like more and more versions of this are popping up. We need a central repository to avoid duplication of effort -- I've already written a couple of those features for my version -- so I'd be happy to see it go up on drupal.org.

Thanks for the new features and bugfixes!

pitpit’s picture

vfilby doesn't seem to have come back to drupal.org (he posted his last message 9 weeks ago), so we will have to manage without him.

My CVS access doesn't work anymore. I'm waiting for an admin to fix it, and then we'll be able to create the repository and post the module on a Drupal project page.

The question is which of the 3 releases will serve as the first official release. I made a lot of alterations and I'm not sure you will agree with all of them (e.g. I deprecated the "lang" param in favour of the "type" param...).

So we have to discuss it. Please test and check my release, and tell me what you think.

--------
DPDev
UbiSum?

pitpit’s picture

geshifilter is now maintained here.

--------
DPDev
UbiSum?

guardian’s picture

Hmm. I started coding a GeSHi filter before Vincent Filby released his own module. I considered contributing it, but I did not have the time to update it to the 4.7 version.

Anyway I'm glad there is now an official module.

Something I faced when using my module: I started using <code> tags to highlight code with GeSHi, but it did not work that well for Java code that contained <code> tags inside the Javadoc comments (for those who don't know, Javadoc supports plain HTML as part of code comments, and it is recommended to use <code> tags around variable or function names, so the situation I'm describing here is far from rare).

There is no trivial solution using regular expressions to handle the fact that there may be <code> tags nested inside the ones you're using to trigger GeSHi.

In the end, I decided to use <geshi lang="java"> markup. It's not compatible with the "standard" code filter but I don't care.
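
The nesting problem is easy to reproduce with a small test. The snippet below is only a demonstration with a made-up helper (sketch_extract_block) and sample input, not code from any of the modules discussed here.

```php
<?php
// Hypothetical demonstration of the nesting problem; the helper name and the
// sample input are illustrative only.
function sketch_extract_block($text, $tag) {
  $t = preg_quote($tag, '#');
  // Lazy match from the opening tag to the first matching closing tag.
  $pattern = '#<' . $t . '\b[^>]*>(.*?)</' . $t . '>#s';
  return preg_match($pattern, $text, $m) ? $m[1] : NULL;
}

$javadoc = "/** Returns <code>x</code>. */\nint getX();";

// With <code> as the trigger, matching stops at the nested closing tag.
$cut = sketch_extract_block('<code>' . $javadoc . '</code>', 'code');

// With a distinct trigger tag, the whole snippet is captured.
$whole = sketch_extract_block('<geshi lang="java">' . $javadoc . '</geshi>', 'geshi');
```

Here $cut ends right before the Javadoc's own closing tag, while $whole contains the full snippet, which is why a separate trigger tag sidesteps the issue.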

My two cents,
G.

mhutch’s picture

I had a quick look at your code, and apart from non-Drupal coding style in some places it seems good. Consider my module deprecated in favour of yours :)

I think the "lang" param is better than "type", because people usually talk about programming "languages", whereas in the context of programming, "type" has a completely different meaning.

I'll do a proper code review in a week or two's time - I'm very busy ATM with Summer of Code (though not Drupal...).

dvessel’s picture

What do you think about this?

http://www.w3.org/TR/html4/struct/dirlang.html#h-8.1.1

States that "Computer languages are explicitly excluded from language codes".

I think class would be a better way to go or, if it's not much trouble, give the option to put in your own attribute through the input filter. It would certainly make everyone happy. :)

btw. A new version was released that uses <blockcode type="lang">... It also works with <code> now with highlighting and keeps it inline. Yay!

–joon
http://www.dvessel.com

gilcot’s picture

What do you think about this?

http://www.w3.org/TR/html4/struct/dirlang.html#h-8.1.1

States that "Computer languages are explicitly excluded from language codes".

...

btw. A new version was released that uses <blockcode type="lang">... It also works with <code> now with highlighting and keeps it inline. Yay!

Well, go to http://www.w3.org/TR/html4/index/attributes.html and notice:

  • lang applies to %LanguageCode; and is #IMPLIED for all elements except APPLET, BASE, BASEFONT, BR, FRAME, FRAMESET, IFRAME, PARAM, SCRIPT
  • type is
    • an #IMPLIED for UL, OL, LI
    • a TEXT that indicates an INPUT control
    • an #IMPLIED %ContentType; for A, LINK, OBJECT, PARAM
    • a #REQUIRED %ContentType; of script or style language for SCRIPT and STYLE
  • language, finally, is an #IMPLIED CDATA for SCRIPT

Also, the page won't validate because of the module output: only
id, class, lang, dir, title, style, onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onkeypress, onkeydown, onkeyup
are allowed as BLOCKQUOTE attributes (but beware, Drupal should validate as XHTML 1), while CODE accepts:
id, class, lang, dir, title, style, onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onkeypress, onkeydown, onkeyup :)
Last, PRE is intended for preformatted text like source code (same attributes as CODE). Why not use it? Its semantics are more accurate here than blockquote's.

I think class would be a better way to go

mhutch’s picture

Well, actually, it doesn't matter whether the syntax is valid HTML, because it is run through the filter and is not present in the actual page output. We could even use BBCode-style tags to avoid confusion if need be. I would stick with "lang" because it is more intuitive -- people talk about "programming languages".

vfilby’s picture

Hey all, I have been busy here. I see that my code has been merged into a new module; I guess I should update my own site to point to it. My email is given above, so please feel free to notify me if you need help/assistance/etc.

Believe it or not I was denied CVS access to drupal! That is why the modules are all on my own site. Perhaps I should see if Dries will let me in now.

Cheers,
V

sepeck’s picture

There are several reasons one can be declined. One is that the application reason was not clear. Generally people are willing to discuss and revisit applications that clear up any confusion. So without seeing your original application/reason for being turned down, it's best to re-apply. As you have applied once, you may just want to send an email to the infrastructure list. I think the form isn't all that flexible yet.

Also, CVS and the project module have recently been updated: a project owner can give others commit rights to individual projects, and you can only commit to the specific projects you are authorized for.

-Steven Peck
---------
Test site, always start with a test site.
Drupal Best Practices Guide -|- Black Mountain

amir abbas’s picture

i read somewhere that using geshi is not safe and it has security problem
is it correct ?

--
amir abbas
www.persia-cms.com

nigel’s picture

Hi, I'm the GeSHi author.

GeSHi is not insecure. So far, in its nearly three years of existence, there have been two security reports by security researchers; both have been found to be either incorrect or caused by users not following instructions.

On a side note: so good to see so many people writing GeSHi filters, keep it up! :D

--GeSHi Author - http://qbnz.com/highlighter/ http://geshi.org/ (may become a drupal!)

david007’s picture

Thanks for taking the time with this. I was just about to attempt the same thing with GeSHi.
____________________________
Insurance Center

josedanielestrada’s picture

I've had a few small problems with GeSHi and TinyMCE. Try them together and you'll see what I'm talking about.

Thanks from Costa Rica! ;)

BananaTools.com

ruadhan’s picture

Hi,

I have installed the GeSHi filter module in Drupal 5.7. It seems to work with some languages, but not with others. For instance, it does not highlight HTML, and the highlighting for PHP looks odd.

Any advice would be greatly appreciated.

shenmeng’s picture

I want to know how to change the font size. Can anyone tell me how? Thanks!