PHP: Replacing short open tags with proper ones recursively in a big code base

We recently took on a horrible code base at work, with lots of open tags in the code like this:

<? calculateVat(123..

As far as I know this way of opening PHP code is deprecated and soon won’t be supported at all so I thought I’d just use sed to fix this but it wasn’t quite that simple.

Sed has no way of doing look-aheads with regular expressions meaning we can’t tell it to not turn <?php into <?<?php .. ! So we have to use perl (or something else that has ‘proper’ regexp):

# Convert <? (without a trailing space) to <?php (with a trailing space):

find . -name "*.php" -print0 | 
xargs -0 perl -pi -e 's/<\?(?!php|=|xml|mso| )/<\?php /g'

# Convert <? (with a trailing space) to <?php (retaining the trailing space):

find . -name "*.php" -print0 | xargs -0 perl -pi -e 's/<\? /<\?php /g'

Note, this could probably be improved by not using xargs (xargs has issues with spaces and funny characters in the path) – you’d probably want to use find’s exec command with the curly braces {}…

Anyway, this should fix up your entire codebase, but please CHECK the results afterwards, I only realised it was turning <?xml into <?php xml after checking..

Comments welcome 🙂