Extracting PHP from HTML files with AntLR3
Mon 14, 2007 10:34
As part of my investigation work on PHP, I needed a preprocessor capable of extracting all the PHP from a PHP file, discarding all HTML it encounters.
Initially I thought it would be a hard job, but later found that AntLR3 makes it really easy!
I wrote something like this:
options { filter=true; }
PHP : '<?php' .* '?>' { System.out.println(getText()); };

This happens to work really well :) Now I can continue my PHP parser…
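For anyone without AntLR3 at hand, the behaviour of that filter-mode rule can be roughly approximated in plain Java with a reluctant regex. This is only a sketch, not the generated lexer: it assumes the rule matches from `<?php` up to the first `?>`, and the class name `NaivePhpExtractor` is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: mimics a filter-mode lexer that keeps only PHP blocks.
public class NaivePhpExtractor {
    // Reluctant match from "<?php" to the first "?>"; DOTALL lets '.' cross newlines.
    private static final Pattern PHP_BLOCK =
            Pattern.compile("<\\?php.*?\\?>", Pattern.DOTALL);

    // Returns every PHP block found in the source; everything else (the HTML) is skipped.
    public static List<String> extract(String source) {
        List<String> blocks = new ArrayList<>();
        Matcher m = PHP_BLOCK.matcher(source);
        while (m.find()) {
            blocks.add(m.group());
        }
        return blocks;
    }

    public static void main(String[] args) {
        String page = "<html><?php echo 1; ?><body><?php echo 2; ?></body></html>";
        for (String block : extract(page)) {
            System.out.println(block);
        }
    }
}
```

Like the grammar rule, this stops at the first `?>` it sees, whether or not that `?>` sits inside a PHP string.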
Update: Of course, my code doesn’t work when the PHP contains a string like '?>'. Here’s a new version that should work :)
SINGLE_QUOTED_STRING : '\'' ('\\\\' | '\\\'' | ~('\''))* '\'' ;
DOUBLE_QUOTED_STRING : '"' ('\\\\' | '\\"' | ~('"'))* '"' ;
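To show why the string rules matter, here is a small hand-written Java sketch of the same idea: while inside a PHP block, quoted strings are skipped as a unit, so a `?>` inside a string does not end the block. It is an illustration under assumptions, not the lexer AntLR3 would generate: the opening tag is taken to be `<?php`, the names `PhpBlockScanner` and `skipString` are hypothetical, and comments and heredocs are not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical scanner: extracts PHP blocks, treating quoted strings as opaque.
public class PhpBlockScanner {
    private static final String OPEN = "<?php";

    public static List<String> extract(String src) {
        List<String> blocks = new ArrayList<>();
        int pos = 0;
        while (true) {
            int start = src.indexOf(OPEN, pos);
            if (start < 0) {
                break; // no more PHP blocks; the remaining text is HTML
            }
            int j = start + OPEN.length();
            int end = -1;
            while (j < src.length()) {
                char c = src.charAt(j);
                if (c == '\'' || c == '"') {
                    j = skipString(src, j); // jump past the whole string literal
                } else if (c == '?' && j + 1 < src.length() && src.charAt(j + 1) == '>') {
                    end = j + 2; // closing tag found outside any string
                    break;
                } else {
                    j++;
                }
            }
            if (end < 0) {
                end = src.length(); // unterminated block: take the rest of the input
            }
            blocks.add(src.substring(start, end));
            pos = end;
        }
        return blocks;
    }

    // Returns the index just past the closing quote, or src.length() if the string never closes.
    private static int skipString(String src, int open) {
        char quote = src.charAt(open);
        int j = open + 1;
        while (j < src.length()) {
            char c = src.charAt(j);
            if (c == '\\') {
                j += 2; // skip the backslash and the character it escapes
            } else if (c == quote) {
                return j + 1;
            } else {
                j++;
            }
        }
        return src.length();
    }

    public static void main(String[] args) {
        String page = "<p><?php echo '?>'; ?></p>";
        for (String block : extract(page)) {
            System.out.println(block);
        }
    }
}
```

The backslash handling in `skipString` plays the role of the `'\\\\'` and `'\\\''` alternatives in the rules above: an escaped quote or escaped backslash is consumed as a pair and cannot close the string.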