Migrating Garoon codebase to PHP 7

(※注 冒頭のみ日本語訳を行いました。本編はベトナムメンバーによる英語の記事になります。)
Xin chào! Cybozu Vietnam で Garoon を開発している Huy Nguyen, Vien Tran, Long Huynh, Dieu Ho, Tai Vu, Anh Nguyen です。ベトナムメンバー初のCybozu Inside Outの投稿です。Garoon で PHP 7 への移行をした経験を記事にしました。

One year ago, Garoon developement team started to migrate the codebase of Cybozu Garoon to PHP 7. Garoon is a product with large codebase, formerly written in PHP 4 and then later migrated to PHP 5. The migration really was a challenge to the team, but it was worth doing because PHP 7 brought a host of improvements. One of the benefits is performance improvements. Benchmark of Garoon using PHP 7 showed a 33 percent of speed improvement compared to PHP 5.6.

And also, active support of PHP 5.6 ended on 19 Jan 2017 according to php.net. Source: http://php.net/supported-versions.php

In this post, I will share things that the Garoon development team met and show you the way we migrated the codebase to PHP 7.

Tools to check incompatible changes

When we upgraded our code from PHP 5 to PHP 7, we had to check our codebase for all the “backward incompatible changes” to make sure our codebase could be successfully interpreted in PHP 7. There are some tools that help you scan your code through and point you out which code needs to be fixed in order to be able to run in PHP 7, among which are Phan and Lint.

Phan

Phan is a static analyzer for PHP. It’s licensed under the MIT License.

Prepare the environment

You have to prepare a CentOS machine, with PHP 7 installed. Then install composer running the code below:

curl \-sS https://getcomposer.org/installer \| php \-\- \--install-dir=/usr/local/bin \--filename=composer

Install Phan and the php-ast extension

First, clone the source code of php-ast:

git clone https://github.com/nikic/php-ast.git

Next, build the php-ast extension and then add it to php. Run each line of the code below in turn:

cd php-ast
phpize
./configure
make install
echo 'extension=ast.so' > /etc/php.ini

Then, install Phan, and run through the code below:

git clone https://github.com/etsy/phan.git
cd phan
composer install
./test
ln \-s /phan/phan /usr/local/bin/phan

Run Phan to check backward incompatible changes

We created a bash file (e.g., testPhan.sh) with the content like the following:

COMMAND="phan \--backward-compatibility-checks \--ignore-undeclared \--quick"
find /repo/source \-type f \-regex '.*\.\(php\)' \-print0 \| xargs \-0 \-n1 \-P 1 \-I % bash \-c "$COMMAND % "

You can change the path to the soure code by changing “/repo/source”. Then run:

sh testPhan.sh

Lint

Lint is a command line option provided by PHP, which is used to check PHP syntax.

On the above CentOS machine, we ran the following command to check our PHP source code with Lint:

find . -name "*.php" -print0 | xargs -0 -n1 -P8 php -l

E_STRICT notices severity changes

E_STRICT notices severity was changed in the PHP 7, and all of the E_STRICT notices were reclassified to other levels.

Situation New level/behaviour
Indexing by a resource E_NOTICE
Abstract static methods Notice removed, triggers no error
“Redefining” a constructor Notice removed, triggers no error
Signature mismatch during inheritance E_WARNING
Same (compatible) property in two used traits Notice removed, triggers no error
Accessing static property non-statically E_NOTICE
Only variables should be assigned by reference E_NOTICE
Only variables should be passed by reference E_NOTICE
Calling non-static methods statically E_DEPRECATED

Some of these changes affected our source code.

Signature mismatch during inheritance

The “Signature mismatch during inheritance” notice usually occurs in the following cases: The first case is a case where the number of the parameters of a function/method is different during inheritance. For example:

<?php
class Foo{
    function method($a){

    }
}
class Bar extends Foo{
    function method(){

    }
}

The above example outputs:

Warning: Declaration of Bar::method() should be compatible with Foo::method($a) in /in/YoDHq on line 11

You can click here to run this example and see the output.

The second is a case where the type of a parameter of a function/method is different during inheritance. For example:

<?php
class Foo{
    function method($a){

    }
}
class Bar extends Foo{
    function method(array $a){

    }
}

The above example outputs:

Warning: Declaration of Bar::method(array $a) should be compatible with Foo::method($a) in /in/fSIbS on line 11

You can click here to run this example and see the output.

We used Phan to check our code and fixed the problems that it reported as PhanSignatureMismatch and PhanSignatureMismatchInternal.

How to fix such issues depends on the logic that we modified in the base class or subclass.

Only variables should be assigned by reference

Example:

<?php
error_reporting(E_ALL);
function foo(){
    return 'string';
}
$temp =& foo();

The above example will output:

Notice: Only variables should be assigned by reference in /in/vXUIZ on line 6

You can click here to run the example and see the output.

In the above example, the ‘&’ operator causes an E_NOTICE saying “Only variables should be assigned by reference”. In case your code is compiled in a PHP 7 environment, an error will be detected, and so you must fix it by removing ‘&’ as follows:

<?php
error_reporting(E_ALL);
function foo(){
    return 'string';
}
$temp = foo();

However, since in our code ‘&’ was used in so many places and it would take us so much time to fix them all, I would like to introduce a tool which runs on CentOS to solve this problem. First of all, in order to write E_NOTICE messages to a php_error.log, you can do the following steps:

1. Add the following script to the code of your web application and then run it:

set_error_handler(function($errno, $errstr, $errfile, $errline){
  file_put_contents('/tmp/php_error.log', implode(',', [$errno, $errstr, $errfile, $errline]) . PHP_EOL, FILE_APPEND);
});

2. The “php_error.log” will look like the following:

2048,Only variables should be assigned by reference,/var/www/project/code/xxx1.php,12
2048,Only variables should be assigned by reference,/var/www/project/code/xxx2.php,121
2048,Only variables should be assigned by reference,/var/www/project/code/xxx3.php,26
2048,Only variables should be assigned by reference,/var/www/project/code/xxx.php,194

The tool will run the following steps:

  1. Find the latest folder created with the prefix “ system-private-* ” under tmp.
  2. Grep the string “assigned by reference” in the php_error.log file, which exists inside tmp, and then output the file name and the line number to a temporary result.
  3. Find “=&”, “= &” and “= & ” and replace them with “=” following the result from the step 2.
#!/bin/bash

systemd_private_name=$(ls -t1F /tmp/ | grep systemd-private-* | head -n1)
php_error_path="/tmp/${systemd_private_name}tmp/php_error.log"

if [ ! -f $php_error_path ]; then
    echo "File is not found"
    exit;
fi


grep "assigned by reference"  $php_error_path  | sort | uniq -c | while IFS=',' read -ra line ; do
    sed -ri ${line[3]}'s/(\$.+)(\s+=\s*&\s*)(.+())/\1 = \3/g'  ${line[2]}
done

Only variables should be passed by reference

In PHP 7, using a function argument could return an error when the function argument was passed by reference. This issue can occur if you use a function of a PHP framework or a method you added.

To fix this issue we declared a variable to store the value of a function, which had been directly passed at a position called. What follows is an example:

Example: Strict Standards will be thrown out if you put exploded array in array_pop:

<?php
   $fruit = array_pop(explode(",", "Orange,Apple,strawberry"));
   echo $fruit;

Fixed code:

<?php
   $fruits = explode(",", "Orange,Apple,strawberry");
   $fruit = array_pop($fruits);
   echo $fruit;

Note that in this example you must assign a variable to the function explode and then pass the reference to the variable to array_pop to avoid a Strict Standard warning.

The methods in PHP framework

Example:

<?php
   $file_name = "abc.txt";
   $file_extension = end(explode('.', $file_name));

The above example will output:

Notice: Only variables should be passed by reference in /in/Cv65t on line 3

You can click here to run the example and see the output.

Fixed code:

<?php
   $tmp = explode('.', $file_name);
   $file_extension = end($tmp);

We used Phan to check our code and fixed the problems which were reported as PhanTypeNonVarPassByRef. However, Phan only retrieved methods in the PHP framework: as for the methods we added, Phan could not detect them as PhanTypeNonVarPassByRef. The example below is to explain this point.

The method you added

Example:

<?php
function getArray() {
    return [1, 2, 3];
}

function squareArray(array &$a) {
    foreach ($a as &$v) {
        $v **= 2;
    }
}

// Generates a warning in PHP 7.
squareArray((getArray()));

The above example will output:

Notice: Only variables should be passed by reference in in /in/HYONN on line 14

You can click here to run the example and see the output.

To detect a method you define in your project, you can search for it with regular expression using an editor (e.g, notepad++).

function.*\&\s+\$.*\)$

The above example will output:

E:\yourproject\comment_util.php (3 hits)
    Line 30:     function getFollowByUser( & $user, & $article )
    Line 61:     function _setNameRowSet( & $rowset, $inverse = FALSE )
    Line 119:     function _row2object( & $row )

After you list your reference methods from the search result, you must manually modify function calls.

Calling non-static methods statically

Static calls to methods that are not declared static are deprecated in PHP 7 and may be removed in the future.

We have an example:

<?php
class foo {
    function bar() {
        echo 'I am not static!';
    }
}

foo::bar();

The above example will output:

Deprecated: Non-static method foo::bar() should not be called statically in - on line 8
I am not static!

We used Phan to check our code and fixed the problems that were reported as PhanStaticCallToNonStatic. The easiest way to fix such a problem is change all non-static functions to static functions.

PHP 4 constructor

PHP 4 constructors are methods that have the same name as the class they are defined in.

PHP 7 will emit E_DEPRECATED whenever a PHP 4 constructor is defined. When the method name matches the class name, the class is not in a namespace, and a PHP 5 constructor (__construct) is not present then an E_DEPRECATED will be emitted.

Example:

<?php

class Filter {
    function Filter() {

    }
}

new Filter();

Output:

Deprecated: Methods with the same name as their class will not be constructors in a future version of PHP; Filter has a deprecated constructor in /in/IDpi7 on line 3

You can click here to run the above example and see the output.

You should use PHP 5 constructor like the below

<?php

class Filter {
    function __construct() {

    }
}

new Filter();

If your project is large, manually fixing the constructors will take a lot of time. There’s a useful tool called PHP-CS-Fixer that can help you automatically fix them. The PHP-CS-Fixer tool helped us fix over 500 files in our Garoon project.

I will show you how to use the tool to fix PHP 4 constructors. The below code runs with PHP-CS-Fixer 2.2.1.

1. First we need to create a configuration for PHP-CS-Fixer

.php_cs
<?php

$source_dir = 'c:\repo\projectX\source';
$finder = PhpCsFixer\Finder::create()
    ->in($source_dir)
    ->files()->name('*.csp')->name('*.php')
;

return PhpCsFixer\Config::create()
    ->setRiskyAllowed(true)
    ->setRules(array(
        'no_php4_constructor' => true
    ))
    ->setFinder($finder)
;

Note: Our Garoon project uses “.csp” as the file extension of PHP source files, which is why we add “.csp” to the configuration of PHP-CS-Fixer.

2. Run PHP-CS-Fixer

php php-cs-fixer.phar fix --config=.php_cs --verbose

For details about this tool, see: https://github.com/FriendsOfPHP/PHP-CS-Fixer

Internal functions Changes

There are some internal functions that were changed in PHP 7, and the substr() function is one of them.

For example:

<?php
$foo = substr("foo",3);
if($foo !== FALSE){
    echo "PHP 7\n";
    echo gettype($foo); //string
}
else{
    echo "PHP 5\n";
    echo gettype($foo); //boolean
}

The type of the variable $foo in the above example will be a boolean in PHP 5 and a string in PHP 7.

You can click here to run the above example and see the output.

To fix this issue, we checked the length of a string using the strlen function.

<?php
$foo = substr("foo",3);
if(strlen($foo) == 0){
    echo "Run on both PHP 5 and PHP 7\n";
    echo gettype($foo); //boolean on PHP 5 and string on PHP 7
}

You can click here to run the above example and see the output.

Conclusion

In the Garoon project, performance was improved and the codebase became more readable after the migration from PHP 5.6 to PHP 7: for example, the performance of the instance of Garoon we are using in Cybozu Inc. was shown to have been improved by 33 percent. We separated the migration into multiple parts and carried them out one by one. This way helped us cover possible issues while the migration was done in a short period of time.

Change log

2017-09-15

  • Modify the code and the output on “Only variables should be assigned by reference”