As part of my research on PHP, I needed a preprocessor that extracts all the PHP code from a PHP file and discards any HTML it encounters.

Initially I thought it would be a hard job, but I later found that ANTLR 3 makes it really easy!

I wrote something like this:

lexer grammar FuzzyPHP;

options { filter=true; }

PHP : '<?php' .* '?>' { System.out.println(getText()); };

This happens to work really well :) Now I can continue with my PHP parser…
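For anyone without ANTLR at hand, here is roughly what such a filter rule does, as a minimal plain-Java sketch. It assumes the rule matches everything from a `<?php` opening tag up to the first `?>`; the class name `NaivePhpExtractor` and the `extract` method are made up for illustration and are not part of ANTLR or of the generated lexer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: imitates a filter-mode lexer with a single rule.
final class NaivePhpExtractor {
    // Assumption: a block starts at "<?php" and ends at the first "?>".
    // ".*?" is non-greedy; DOTALL lets '.' match newlines as well.
    private static final Pattern PHP_BLOCK =
            Pattern.compile("<\\?php.*?\\?>", Pattern.DOTALL);

    // Returns every PHP block found in the source; all other text is skipped.
    static List<String> extract(String source) {
        List<String> blocks = new ArrayList<>();
        Matcher m = PHP_BLOCK.matcher(source);
        while (m.find()) {
            blocks.add(m.group());
        }
        return blocks;
    }

    public static void main(String[] args) {
        String page = "<html><?php echo 1; ?><b>hi</b><?php echo 2; ?></html>";
        // Prints the two PHP blocks, one per line, without the HTML.
        extract(page).forEach(System.out::println);
    }
}
```

Like the one-rule lexer, this stops at the first `?>` it sees, wherever that happens to be.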

Update: Of course, my code doesn’t work when the PHP contains a string like '?>'. Here’s a new version that should work :)

PHP : '… SINGLE_QUOTED_STRING | DOUBLE_QUOTED_STRING)) '?>' { out.println(getText()); };

SINGLE_QUOTED_STRING : '\'' ('\\\\' | '\\\'' | ~('\''))* '\'' ;

DOUBLE_QUOTED_STRING : '"' ('\\\\' | '\\"' | ~('"'))* '"' ;
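The idea behind the two string rules can be sketched by hand in plain Java: while scanning a PHP block, jump over complete quoted strings (honouring backslash escapes) so that a `?>` inside a string does not end the block. This is only an illustration under assumptions, not what ANTLR generates: `StringAwarePhpExtractor`, `extract` and `skipString` are hypothetical names, and the `<?php` opening tag is assumed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written scanner mirroring the string-aware rule.
final class StringAwarePhpExtractor {
    private static final String OPEN = "<?php"; // assumed opening tag

    // Returns every PHP block; "?>" inside quoted strings is ignored.
    static List<String> extract(String src) {
        List<String> blocks = new ArrayList<>();
        int pos = 0;
        while (true) {
            int start = src.indexOf(OPEN, pos);
            if (start < 0) break;              // no more PHP blocks
            int j = start + OPEN.length();
            int end = -1;
            while (j < src.length()) {
                char c = src.charAt(j);
                if (c == '\'' || c == '"') {
                    j = skipString(src, j);    // jump over the whole string
                } else if (c == '?' && j + 1 < src.length()
                        && src.charAt(j + 1) == '>') {
                    end = j + 2;               // real closing tag found
                    break;
                } else {
                    j++;
                }
            }
            if (end < 0) break;                // unterminated block: give up
            blocks.add(src.substring(start, end));
            pos = end;
        }
        return blocks;
    }

    // Returns the index just past the closing quote of the string that
    // opens at 'open', or src.length() if the string never closes.
    static int skipString(String src, int open) {
        char quote = src.charAt(open);
        int j = open + 1;
        while (j < src.length()) {
            char c = src.charAt(j);
            if (c == '\\') {
                j += 2;                        // skip the escaped character
            } else if (c == quote) {
                return j + 1;
            } else {
                j++;
            }
        }
        return src.length();
    }

    public static void main(String[] args) {
        // Prints: <?php echo '?>'; ?>
        extract("<p><?php echo '?>'; ?></p>").forEach(System.out::println);
    }
}
```

Just like the grammar, this sketch knows nothing about PHP comments or heredocs, so a stray quote inside a comment would still confuse it.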

Refactoring

Published at Mon 06 August, 2007 07:22

Everyone knows that Java sucks. But now I’m finishing my one-year project on program comprehension. The application is written in Java, so you can imagine I have tons of classes and many KLOCs spread over the project tree.

Unfortunately, I have reached the phase where the code is impossible to maintain! Poor planning? No! JAVA SUCKS! So now I’m refactoring all the way down, rewriting and rewriting and rewriting and rewriting…

This will be a really bad week :(

About

My name is Ruben Fonseca. I'm a Computer Science and Systems Engineer from Portugal who loves FLOSS.
