Using Rocketeer for Easy Deployment to VPS

In this post, I will be referring to Laravel deployments in particular; however, Rocketeer was designed to be framework agnostic, so the general principles should be transferable to any deployment scenario.

The primary goal of Rocketeer is to interface with your source code manager (SCM) and transfer code from your SCM to your deployment folder. It is important that you understand this: Rocketeer does NOT transfer code directly via ssh/scp. You MUST use an SCM (git/svn). In the Laravel context, it also performs other tasks such as running migrations and installing composer dependencies.

The flexibility provided by Rocketeer is particularly enticing in a VPS scenario. There are a couple of gotchas that you need to be aware of, especially when using your own VPS to host the git repository. Listed below are the steps I take to deploy Laravel apps to my VPS.

Step 1: Prep the Laravel app in question – include the dependency on Rocketeer, add the provider, and finally add and commit all code to your local git repository.

Step 2: Prep the remote VPS:

a. I will proceed with the assumption that you have hardened your VPS along the same lines as outlined in this great post by Bryan Kennedy. This means that you will ssh into your box using a predefined username and a key file (as opposed to root+password).

b. Having followed step (a), you will have added your local box's ssh key to your host's “.ssh/authorized_keys” file. Since Rocketeer will attempt to open an ssh channel from the server to do a git clone, you must ADDITIONALLY generate a key pair on your server and add its public key to the authorized_keys file. This is easily done by executing the following on your host:

ssh-keygen -t rsa -C "email@domain.com"

Accept the defaults. Then append the contents of the generated .ssh/id_rsa.pub to your .ssh/authorized_keys file.

If the above is not properly done, the error message thrown by Rocketeer during the deploy process is:

Unable to clone the repository
Cloning into '/var/www/project1/releases/20140330095757'...
Permission denied (publickey).
fatal: The remote end hung up unexpectedly
Deployment was canceled by task "Deploy"
Execution time: 3.492s

c. Ensure you have composer (globally) and git installed on your VPS.

d. Set up a directory structure convention for your projects (you will do this for each new project). Let’s say

  • Your project files get deployed to /var/www/project1
  • Your git repo is at /var/git/project1.git

Important: Since files will be deployed using the username you use to ssh, please also make sure that the username is in the www-data group, and that the www-data group has rwx permissions on /var/www. Do the same for the git folder as well.
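
The group setup above could be scripted along these lines. This is only a sketch: prep_dir is a hypothetical helper name of mine, and the script exercises it on a throwaway temp folder so that it runs without root. On the VPS you would use the commented sudo commands instead, with 'username' replaced by your ssh user.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: create a folder (if missing) and give its group rwx access.
prep_dir() {
    mkdir -p "$1"
    chmod g+rwx "$1"
}

# On the real VPS (needs root; 'username' is your ssh user):
#   sudo usermod -a -G www-data username
#   sudo chgrp -R www-data /var/www /var/git
#   sudo chmod -R g+rwx /var/www /var/git

# Demo on a temp folder so the sketch is safe to run anywhere.
demo_root=$(mktemp -d)
prep_dir "$demo_root/www/project1"

# Characters 5-7 of the mode string are the group permission bits.
group_bits=$(ls -ld "$demo_root/www/project1" | cut -c5-7)
echo "group permissions: $group_bits"
```

Remember that a new group membership only takes effect on the next login, so log out and ssh back in before deploying.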

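e. Set up a bare git repository on your VPS (/var/git/project1.git) and set the remote server option in your local repo. The repository should be bare (no working tree), because by default git refuses a push to the checked-out branch of a non-bare repository.

cd /var/git/project1.git
git init --bare

On your local repo: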

$ git remote add origin ssh://username@my.server.org/var/git/project1.git

Note how I specified my VPS hosted git repository. ‘username’ is the username that has ssh permissions to my vps. Rocketeer will login as this username and attempt to do a git clone in the deployment directory.

Run the following command to ensure that the remote server is properly set:

$ git remote -v
origin  ssh://username@my.server.org/var/git/project1.git (fetch)
origin  ssh://username@my.server.org/var/git/project1.git (push)

Note that you can use an IP address if you do not have a hostname associated with your VPS. After the remote server is set, you should be able to do:

$ git push origin master

to push your code to your server repository.
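
To see the whole round trip in one place, here is a small sketch that rehearses it locally. It assumes git is installed; a temp folder stands in for the VPS, a plain file path stands in for the ssh:// URL, and the index.php file and "Example" committer name are made up for the demo. The final clone is the same kind of operation Rocketeer performs into its releases folder.

```shell
#!/bin/sh
set -eu

tmp=$(mktemp -d)

# "Server" side: a bare repository, like /var/git/project1.git on the VPS.
git init -q --bare "$tmp/project1.git"

# "Local" side: a working repository with a single commit on master.
git init -q "$tmp/work"
cd "$tmp/work"
git symbolic-ref HEAD refs/heads/master   # name the branch master whatever the git default is
echo "hello" > index.php
git add index.php
git -c user.name="Example" -c user.email="email@domain.com" commit -q -m "Initial commit"

# Point origin at the bare repository and push, as in the steps above.
git remote add origin "$tmp/project1.git"
git push -q origin master

# What a deploy does conceptually: clone the repository into a release folder.
mkdir -p "$tmp/releases"
release="$tmp/releases/20140330095757"
git clone -q --branch master "$tmp/project1.git" "$release"

cloned_file=$(cat "$release/index.php")
echo "release contains: $cloned_file"
```

If the push or the clone fails in this rehearsal, the problem is with git itself rather than with ssh keys or Rocketeer, which helps narrow down errors like the "Permission denied (publickey)" message shown earlier.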

Step 3: You are now ready to deploy apps to your VPS using Rocketeer. Rocketeer uses the new Laravel 4.1 RemoteManager component (https://github.com/illuminate/remote). This requires a more recent version of PHP.

Fill in the appropriate details in app/config/remote.php (Rocketeer pulls information from this config file into its own config.php file):

'connections' => array(
    'production' => array(
        'host'      => 'my.server.org',
        'username'  => 'username',
        'password'  => '',
        'key'       => '/home/vagrant/.ssh/id_rsa',
        'keyphrase' => '',
        'root'      => '/var/www',
    ),
),

On running php artisan, you should see the following Rocketeer-specific commands:

deploy
 deploy:check Check if the server is ready to receive the application
 deploy:cleanup Clean up old releases from the server.
 deploy:current Display what the current release is
 deploy:deploy Deploy the website.
 deploy:flush Flushes Rocketeer's cache of credentials
 deploy:ignite Creates Rocketeer's configuration
 deploy:rollback Rollback to the previous release, or to a specific one
 deploy:setup Set up the remote server for deployment
 deploy:teardown Remove the remote applications and existing caches
 deploy:test Run the tests on the server and displays the output
 deploy:update Update the remote server without doing a new release.

Start off the process (this is a once-per-project setup of config options) by typing:

$ php artisan deploy:ignite
No repository is set for the repository, please provide one :ssh://username@my.server.org/var/git/project1.git
Configuration published for package: anahkiasen/rocketeer
What is your application's name ? (project1)
The Rocketeer configuration was created at anahkiasen/rocketeer
Execution time: 7.4155s

Go through the app/config/packages/anahkiasen/rocketeer/remote.php file and ensure that the settings are correct.

I had to make a couple of changes. First, I changed

'root_directory'   => '/home/www/',

to

'root_directory'   => '/var/www/',

Also, I commented out the ‘composer self-update’ task. I have composer installed globally in the /usr/local/bin directory, and the username that Rocketeer uses to ssh does not have the permissions on that folder that the self-update process needs:

// The process that will be executed by Composer
'composer' => function ($task) {
    return array(
        //$task->composer('self-update'),
        $task->composer('install --no-interaction --no-dev --prefer-dist'),
    );
},

Run the deploy:check command to verify that you are good to go.

$ php artisan deploy:check
Checking presence of git
Checking presence of Composer
Checking presence of mcrypt extension
Checking presence of mysql extension
Checking presence of pdo_mysql extension
Your server is ready to deploy
Execution time: 3.1507s

If any deficiencies are noted, please fix them prior to proceeding.

Once you are ready to deploy, type:

$ php artisan deploy

or

$ php artisan deploy:deploy

If all goes well, you should get a success message from Rocketeer and a copy of your code on the host VPS.

To troubleshoot the deploy process, type in

$ php artisan deploy --verbose

The --verbose switch will display a wealth of information to help you debug the source of the error.

We have barely scratched the surface of Rocketeer (although what we covered is the most common use case). Be sure to read the wiki to understand what exactly happens during a “deploy” and to implement more intricate deployment scenarios.

On a related note, I would also like to mention that git deployments can be accomplished using git post-receive hooks. There is a great tutorial explaining the process at DigitalOcean. Personally, I prefer the Rocketeer way as it is easier to implement and also offers a simple version switching mechanism.
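
Version switching of this kind is commonly built from timestamped release folders plus a symlink that points at the active one. The sketch below shows that general pattern only; it is not Rocketeer's actual code, and the "current" link name and the second timestamp are assumptions of mine for illustration.

```shell
#!/bin/sh
set -eu

# Throwaway stand-in for /var/www/project1.
root=$(mktemp -d)/project1
mkdir -p "$root/releases/20140330095757" "$root/releases/20140331101500"

# "Deploy": point the current link at the newest release folder.
ln -sfn "$root/releases/20140331101500" "$root/current"

# "Rollback": repoint the same link at the previous release.
ln -sfn "$root/releases/20140330095757" "$root/current"

active=$(basename "$(readlink "$root/current")")
echo "active release: $active"
```

Because only the link changes, switching versions is close to instant and the older releases stay on disk until they are cleaned up.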

 

Laravel Hash::make() explained

First, let us run through a couple of observations in Laravel 4:

return Hash::make('test');
$2y$10$ooPG9s1lcwUGYv1nqeyNcO0ccYJf8hlhm5dJXy7xoamvgiczXHB7S
return Hash::make('test');
$2y$10$QRgaiS6bpATKKQeT22zGKuHq.edDfXQc2.4B3v.zaN.GtGwoyQuMy
return Hash::make('test', array('rounds'=>12));
$2y$12$Az4ZMmEhUxYQ3CcgfnaTt.C1MYxmFfjNxpjgPtye0uKoMBnirw8TC

These are the results returned for me; of course, you will get different results. But the takeaway points are:

1. Hash::make() returns a different hash each time, even for the same input. This is quite curious.
2. The output is always a 60-character string.
3. The initial characters of the hash are metadata (first 7 chars).

This blog post will attempt to demystify some of the inner workings that cause Hash::make() to behave this way. So, how does Laravel do this? How then is the password check performed? And finally, what is the advantage that this offers?

How?

Internally, Hash::make() hashes the password using bcrypt, which is built on the Blowfish cipher. For PHP >= 5.5, the native password_hash() and password_verify() functions are used. For earlier PHP versions, a compatibility library, ircmaxell/password_compat, is pulled in by composer. In fact, reviewing the source code of the password_compat library provides a lot of insight into the inner workings of password_hash() and password_verify().

According to the php documentation (http://www.php.net/manual/en/function.crypt.php), “Blowfish hashing with a salt as follows: “$2a$”, “$2x$” or “$2y$”, a two digit cost parameter, “$”, and 22 characters from the alphabet “./0-9A-Za-z”.”

So, in our trial run 1 above:
$2y$ indicates the Blowfish-based algorithm (bcrypt) with a salt
10 is the default “cost” factor, followed by a “$” separator
A 22-character “salt” is (randomly) generated and appended to the previous components
This is followed by the 31-character hashed password.

The php crypt function (that is internally used to implement bcrypt) is then called:

crypt($password, $hash);

where $password is the string to be hashed, and $hash is the concatenated value of “$2y$” . “10” . “$” . “22 random characters from the alphabet ./0-9A-Za-z”. This function returns the 60-character hash string associated with the password.

The cleverness of this is that the algorithm, salt and cost are embedded into the hash and so can be easily parsed out into individual components for reconstruction/verification (please see the relevant sections of the php crypt source code at https://github.com/php/php-src/blob/master/ext/standard/crypt.c#L258). Because of this, you don’t need to store the salt/cost separately in a database table.
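
To make the layout concrete, here is a small shell sketch that slices the first sample hash from the top of this post by character position. It does no hashing at all; it only shows where each component sits in the 60-character string. The variable names are mine.

```shell
#!/bin/sh

# First sample hash from the observations above.
hash='$2y$10$ooPG9s1lcwUGYv1nqeyNcO0ccYJf8hlhm5dJXy7xoamvgiczXHB7S'

algo=$(printf '%s' "$hash" | cut -c1-4)      # algorithm marker
cost=$(printf '%s' "$hash" | cut -c5-6)      # two-digit cost factor
salt=$(printf '%s' "$hash" | cut -c8-29)     # 22-character salt (char 7 is the "$" separator)
digest=$(printf '%s' "$hash" | cut -c30-60)  # 31-character hashed password

echo "total length : ${#hash}"
echo "algorithm    : $algo"
echo "cost         : $cost"
echo "salt         : $salt"
echo "digest       : $digest"
```

Running the same slicing on the third sample hash gives a cost of 12, which matches the 'rounds' => 12 option that was passed to Hash::make().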

Password check

For checking the password (wrapped by password_verify() for PHP >= 5.5), an internal function semantically equivalent to:

return crypt($password, $hash)==$hash;

is used. The original hash that was generated at hashing time is passed to the function (this is key). The supplied password is salted and run through the crypt function, which regenerates the original hash (provided the same password is used). Note that internally, the crypt function only cares about the first 29 characters of the passed-in hash (7 metadata + 22 salt). Remember that the crypt function implements a one-way hash – there is no way to retrieve the password from the stored hash. The only way to verify password equivalence is to hash the candidate using the same salt and compare the results. Both the Hash::check() and Auth::attempt() methods in Laravel run the same check.

Why?

The conventional method of using md5 or sha1 to generate password hashes is insufficient for modern security requirements. With the advances in computing power, it has become trivial to crack passwords stored as unsalted md5/sha1 hashes, either with a Rainbow Table (http://en.wikipedia.org/wiki/Rainbow_table) or by plain brute force. bcrypt addresses both weaknesses: the random per-password salt makes precomputed tables useless, and the cost factor makes every guess deliberately slow. So, you now have a one-way hash function that is both secure and easy to implement.

Nginx config for hosting multiple projects in sibling folders

I have recently begun the process of migrating (Laravel) apps to Nginx+PHP-FPM.

Needless to say, it is definitely more challenging than configuring good ol’ Apache! For starters, you absolutely must know regular expressions fairly well. Nginx makes extensive use of regexes to figure out matching rules for urls. The second paradigm shift is learning that Nginx does configs on a “per-application” basis. What I mean is, you cannot just set up the server one time, drop apps into the web folder, and hope things work (this is especially true for apps that use clean urls via htaccess rewrites). So, prior to setting up any web application, expect to work a bit on tweaking the Nginx config. The reward is a nimbler web server that performs much better under load.

I put together a config file that serves multiple Laravel applications stored in sibling folders (on a single server). So, http://192.168.33.10.xip.io/project1 will serve up application 1, and http://192.168.33.10.xip.io/project2 will serve up application 2.

/vagrant is the root web folder, and /project1 and /project2 are sibling folders within it, each containing a full Laravel application.

Hope you find this useful!

Moving From Zend Framework to Laravel 4

Let me preface this blog with the following: I have been a programmer for the past 12 yrs and have used php for nearly 8 yrs, ZF1 for nearly 3 yrs (10+projects) and ZF2 since its inception. So, I understand the framework(s) quite well and have given it a fair amount of trial time.

About 2 months ago, I stumbled upon the Laravel framework. A couple of hours into studying Laravel, I had an epiphany: This is what all other frameworks should strive to be! Easy, functional, modern and completely out of the way. While its CodeIgniter underpinnings are easily discernible based on the folder structure and config files, it is quite a radical departure in terms of the underlying code. Laravel 4 in particular embraces all prevailing best practices in the PHP world.

In comparison, both ZF1 and ZF2 are complicated and have a steep learning curve. I kid you not, I was up and running with Laravel 4 in about 3 hours! (Properly understanding my first “hello world” with ZF2 took me a week.)
In my humble opinion, ZF2 is over-engineered, demanding more attention than the actual web application at hand. The reliance on (deeply nested) arrays for everything from route configs to parameters makes coding a chore. See, while arrays are speedy and flexible, they are a debugging nightmare. You get pretty much zero IDE support (autocomplete, code completion etc.). You have to remember verbatim all the required keys (I kind of got around this using Netbeans code templates. But the templates became so many in number that remembering those presented a problem!).

Two projects later, I can confidently say that there is nothing that I can do in ZF2 that I can’t do using Laravel 4 (and in less time). The fact that L4 ties into composer/packagist means that pretty much any open source php project (on packagist) can be utilized in a Laravel project. In fact, L4 uses ‘monolog’ for logging, ‘swiftmailer’ for emailing and ‘symfony’ for the HTTP core and command line interface. All while providing a very ‘laravelesque’ approach to coding. There really are no limits. Very cool indeed!

Although I feel a certain closeness with ZF due to the sheer amount of time I spent with it, I feel the time is right to switch to L4 for future projects. I think L4 has done a LOT of things right, and deserves credit for it. The Eloquent ORM is very easy to work with. Routing in Laravel is an absolute joy! It does much of the heavy lifting when it comes to RESTful interfaces.
Form management is beautifully implemented. It is trivial to implement custom form/html controls. The DI mechanism is very expressive. So are Filters and Events. The Blade templating engine has an uncanny resemblance to the Razor engine used by ASP.NET MVC (which I really like!).
Although it looks like L4 uses an abundance of static methods, in reality it harnesses the __callStatic() php magic method to actually load objects from the DI container. The L4 command line tool “artisan” is also very well executed. Unit testing is very simple and works right out of the box (no lengthy setups required prior to running phpunit).

L4 has indeed improved my productivity quite dramatically 🙂

Jenkins Continuous Integration, Zend Framework and Netbeans

PHP sure has matured into a serious programming language from its humble beginnings. As more and more complex software is being built using PHP, it makes sense to use sound software engineering principles to ensure the robustness of built applications.

A good CI tool like Jenkins will help you get an inside look at your code in the form of detailed metrics. In the tutorial that follows, I describe the full install process of Jenkins on a windows workstation running wampserver. I will also detail the process of exercising code written using the Zend Framework. And, since I use Netbeans for all my development, I will describe a couple of nifty Netbeans plugins that make writing standardized code a lot easier.

Part 1: Preparing for Install

Jenkins server has the role of a conductor that orchestrates myriad php tools which in turn do the real heavy-lifting. Our first order of business is to install a working PEAR setup. If you don’t have it installed, please read my earlier blog about how you can set it up relatively easily on WAMP.

In case you already have pear installed, please make sure that all your packages are up-to-date (“pear list-upgrades” will display outdated packages, and “pear upgrade” will perform the upgrade). You should see a screen that looks like:

image

The following are the PHP tools that will be utilized by Jenkins. Please go ahead and install them as described:

Name | Purpose | How to Install
PHPUnit | For unit testing PHP code | pear config-set auto_discover 1 && pear install pear.phpunit.de/PHPUnit
phing | Build tool written in PHP (similar to ANT) | pear install pear.phing.info/phing
PHP_CodeSniffer | Ensure coding standards | pear install -a PHP_CodeSniffer
PHPDocumentor | Automatically generating documentation/API from docblock comments in your code | pear install PHPDocumentor
PHPcpd | Copy/Paste Detector (CPD) for PHP code | pear install phpunit/phpCPD
PHPmd | PHP mess detector. Analyze code to identify poorly written sections | pear install -a pear.phpmd.org/PHP_PMD
PHPDepend | Generates various software metrics | pear install pear.pdepend.org/PHP_Depend
PHPCodeBrowser | Generates browsable code annotated with defects identified by codesniffer, cpd and md | pear install -a phpunit/PHP_CodeBrowser
PHPLoc | Generates statistics for Lines of Code and other software metrics | pear install -a phpunit/phploc

Once you have installed all the above packages, they are immediately available on the Windows path, as the installers typically create a .bat file in the same directory as the php executable (which is already on the Windows path thanks to the pear install process). You should therefore be able to open up a command window and type the name of any of the tools above.
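
Rather than typing each name by hand, you could check the whole list in one go. The sketch below assumes a Unix-like shell (Git Bash on Windows, or a Linux build box) rather than the plain Windows command window, and check_tools is a hypothetical helper name. The demo call uses "sh" and a deliberately fake name so that the result is predictable.

```shell
#!/bin/sh

# Hypothetical helper: print the names of the given commands that are NOT on the PATH.
check_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Strip the leading space before printing.
    printf '%s\n' "${missing# }"
}

# Demo: "sh" exists everywhere, the second name does not.
report=$(check_tools sh definitely-not-a-real-tool)
echo "missing: $report"
```

For the real check you would pass the command names of the tools installed above, for example: check_tools phpunit phing phpcs phpcpd phpmd pdepend phpcb phploc. An empty result means everything was found.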

Part 2: Install Jenkins

Next, head over to the Jenkins website and grab the native windows installer. Run the setup process. By default, Jenkins is setup at http://localhost:8080. As you can see, installation of Jenkins is really effortless. This is one of the key features that make it hugely popular.

(This part is optional; it is only needed if you want to change the default port.)

On my workstation, I run IIS on port 80 and Wampserver on 8080. I prefer to run Jenkins on port 8888. Making the change is easy. Head over to the Jenkins directory (c:\Program Files\Jenkins by default) and edit the file named “Jenkins.xml”. Change the port to 8888 in the <arguments> section as shown below:

<arguments>-Xrs -Xmx256m -Dhudson.lifecycle=hudson.lifecycle.WindowsServiceLifecycle -jar
"%BASE%\jenkins.war" --httpPort=8888</arguments>

The install process sets up a Windows service named “Jenkins” that is set to autostart. Like any other Windows service, you can start, stop or restart it (right click on “My Computer” -> select “Manage” -> “Services and Applications” -> “Services” and locate “Jenkins” on the list).

image

If all went well, when you restart the service and point your browser to http://localhost:8888 (or http://localhost:8080 if you did not change the default port) you should now see the Jenkins home page.

You can now proceed to install the required plugins. This process is also trivially easy because the Jenkins web interface does an excellent job of managing it. Click on the “Manage Jenkins” link, and then click on the “Manage Plugins” option. Click on the “Available” tab. Check all the plugins that need to be installed (list below) and hit the “Install without restart” option. Jenkins will retrieve each plugin from the web, set it up and prompt you to restart!

List of recommended plugins:

Jenkins Plugin | Purpose
Checkstyle | Generates a summary (and detail) of code style errors returned by the PHP_CodeSniffer tool
Clover_PHP | Picks up Coverage_HTML and Coverage_Clover output from the phpunit tool
DRY | Summarizes output about repeated code generated by the PHPCPD tool
HTML Publisher | (Post build) Used to create a snapshot of the documentation (generated by the PHPDocumentor tool) and the code browser (generated by PHPCodeBrowser). The plugin creates and maintains a copy of these two reports for EVERY BUILD. So you can always look at a prior build and analyze the progress.
JDepend | Report on output generated by the PDepend tool
Plot | Generate graphs from CSV output generated by the PHPLoc tool
PMD | Report on output generated by the phpmd tool (PHP Mess Detector)
Violations | Generates reports and graphs from output generated by PHP_CodeSniffer, PHPCpd and PHPMD
Mercurial | Helps Jenkins acquire source code from the Mercurial Source Control System (e.g. Bitbucket)
Phing | Integrates the Phing build tool into Jenkins

Sebastian Bergmann (creator of PHPUnit and various other tools above) actually has a website dedicated to explaining the nuances of integrating Jenkins with PHP (http://jenkins-php.org/). You might want to peruse his site if you haven’t already. In part 3 of this tutorial, I borrow heavily from his website.

Part 3: Setup Jenkins for your PHP/Zend Framework Project

There are three distinct phases in any Jenkins job (and these can be batched to run at specified intervals):

  1. Source code retrieval– You will typically want Jenkins to automatically retrieve source code from a repository (I use Mercurial on Bitbucket.. but there are plugins available for almost any source control solution you may be comfortable with).
  2. Build– This is the part where the source code retrieved in step 1 is rigorously analyzed. We will use a phing build file to specify all the actions we want performed on the source code. This step typically generates a bunch of reports in predefined folders.
  3. Post Build – This step takes reports and logs generated by step 2 above and presents pretty reports to the end-user.

Because of the sheer volume of available tools, processes and options, setting up a Jenkins project that presents meaningful and complete data to the user can be quite challenging. To simplify the adoption of Jenkins for PHP users, Sebastian Bergmann has created a template that users can base their projects on. This template prefills all the common build and post-build options so the user does not have to bother about them. Keep in mind that setting up a Jenkins job for a project is a one-time deal.

  1. Download the template from https://github.com/sebastianbergmann/php-jenkins-template/downloads. The template essentially contains an elaborate XML file that details the required build options.
  2. Extract the contents of the zip file into a folder named ‘php-jenkins-template’ in the c:\Program Files\Jenkins\Jobs folder
    image
  3. Fire up the Jenkins home page (http://localhost:8888) and click on the “Manage Jenkins” menu option. Click on the “Reload Configuration from Disk” link. This will force Jenkins to recognize the newly copied php template
  4. On the Jenkins home page, click on the “New Job” menu option
  5. Type in your application name in the “Job Name” field. Select the “Copy existing job” option at the end and type ‘php-jenkins-template’ (It has a neat auto-complete box that prompts you with available matches!)
    image
  6. Now, on the homepage, you should see your newly set up job. This job has inherited all settings from the template.
    image

We will be using “phing” to bring all our tools together and execute them sequentially. By default, phing looks for a file named “build.xml” in the project root and executes it. The build file that I use is listed below:

<?xml version="1.0" encoding="UTF-8"?>

<project name="name-of-project" default="build">
    <property name="basedir" value="." />
    <target name="build"
            depends="prepare,lint,phploc,pdepend,phpmd,phpcs,phpcpd,phpdoc,phpunit,phpcb"/>

    <target name="clean" description="Cleanup build artifacts">
        <delete dir="${basedir}/build/api"/>
        <delete dir="${basedir}/build/code-browser"/>
        <delete dir="${basedir}/build/coverage"/>
        <delete dir="${basedir}/build/logs"/>
        <delete dir="${basedir}/build/pdepend"/>
    </target>

    <target name="prepare" depends="clean"
            description="Prepare for build">
        <echo msg="${basedir}" />
        <mkdir dir="${basedir}/build/api"/>
        <mkdir dir="${basedir}/build/code-browser"/>
        <mkdir dir="${basedir}/build/coverage"/>
        <mkdir dir="${basedir}/build/logs"/>
        <mkdir dir="${basedir}/build/pdepend"/>
    </target>

    <target name="lint">
        <phplint>
            <fileset dir="${basedir}">
                <include name="**/*.php" />
                <exclude name="**/tests/**" />
            </fileset>
        </phplint>
    </target>

    <target name="phploc" description="Measure project size using PHPLOC">
        <exec executable="phploc">
            <arg value="--log-csv" />
            <arg value="${basedir}/build/logs/phploc.csv" />
            <arg value="--exclude"/>
            <arg path="${basedir}/tests" />
            <arg value="--suffixes"/>
            <arg value="php" />
            <arg path="${basedir}" />
        </exec>
    </target>

    <target name="pdepend">
        <phpdepend>
            <fileset dir="${basedir}">
                <include name="**/*.php" />
                <exclude name="**/tests/**" />
            </fileset>
            <logger type="jdepend-xml" outfile="${basedir}/build/logs/jdepend.xml"/>
            <logger type="jdepend-chart" outfile="${basedir}/build/pdepend/dependencies.svg"/>
            <logger type="overview-pyramid" outfile="${basedir}/build/pdepend/overview-pyramid.svg"/>
        </phpdepend>
    </target>

    <target name="phpmd"
            description="Generate pmd.xml using PHPMD">
        <phpmd rulesets="codesize,unusedcode,naming,design">
            <fileset dir="${basedir}">
                <include name="**/*.php" />
                <exclude name="**/tests/**" />
            </fileset>
            <formatter type="xml" outfile="${basedir}/build/logs/pmd.xml"/>
        </phpmd>
    </target>

    <target name="phpcs">
        <phpcodesniffer standard="Zend" allowedFileExtensions="php">
            <fileset dir="${basedir}">
                <include name="**/*.php" />
                <exclude name="**/tests/**" />
            </fileset>
            <formatter type="default" usefile="false"  />
            <formatter type="checkstyle" outfile="${basedir}/build/logs/checkstyle.xml" />
        </phpcodesniffer>
    </target>

    <target name="phpcpd">
        <phpcpd>
            <fileset dir="${basedir}">
                <include name="**/*.php" />
            </fileset>
            <formatter type="pmd" outfile="${basedir}/build/logs/pmd-cpd.xml"/>
        </phpcpd>
    </target>

    <target name="phpdoc" description="AutoGenerate php docs">
        <phpdoc title="My API Docs"
                sourcecode="yes"
                destdir="${basedir}/build/api"
                output="HTML:Smarty:PHP">
            <fileset dir="${basedir}">
                <include name="**/*.php" />
                <exclude name="**/tests/**" />
            </fileset>
        </phpdoc>
    </target>

    <target name="phpunit" description="Run unit tests with PHPUnit">
        <exec command="phpunit --configuration=${basedir}/tests/phpunit.xml
            --log-junit ${basedir}/build/logs/junit.xml
            --coverage-clover ${basedir}/build/logs/clover.xml
            --coverage-html ${basedir}/build/coverage"/>
    </target>

    <target name="phpcb"
            description="Aggregate tool output with PHP_CodeBrowser">
        <exec executable="phpcb">
            <arg value="--log" />
            <arg path="${basedir}/build/logs" />
            <arg value="--source" />
            <arg path="${basedir}/application" />
            <arg value="--source" />
            <arg path="${basedir}/library" />
            <arg value="--output" />
            <arg path="${basedir}/build/code-browser" />
        </exec>
    </target>
</project>

This file must be placed in the root of your project. So, if you are coding using Zend Framework, it should be at the same level as “application”, “library” and “public” folders.

The code is fairly self-documenting. A property named “basedir” is declared. This is used throughout the script as ${basedir}. The project has a default build target of “build”. “build” in turn depends on multiple targets that are setup, and all are executed in sequence. “prepare” is the first target executed and that is in charge of creating the required folders.

The “tests” folder is excluded from most metric computations as that is where our unit test cases reside. Note the “--configuration” option in the “phpunit” target – this indicates the location of the phpunit.xml file, which in turn contains the location of the bootstrap file and the folders containing code to test. It is pretty much the stock phpunit.xml file that gets generated with a new Zend Framework project, with the <whitelist/> section added. The “<whitelist/>” section is required – without it, you will have inflated code-coverage reports that cover the entire Zend Framework!

<phpunit bootstrap="./bootstrap.php">
    <testsuite name="Application Test Suite">
        <directory>./application</directory>
    </testsuite>
    <testsuite name="Library Test Suite">
        <directory>./library</directory>
    </testsuite>
    <filter>
        <whitelist>
            <directory suffix=".php">../application</directory>
            <directory suffix=".php">../library</directory>
            <exclude>
                <directory suffix=".phtml">../application</directory>
            </exclude>
        </whitelist>
    </filter>
</phpunit>

I prefer to have the actual options for code-coverage generation inside the phing job and NOT in the phpunit.xml file. The main reasons for this are:

  1. Generating a code coverage report takes time. And this grows along with the length of the code. While a delay in a jenkins job is perfectly acceptable, any delay to the TDD process is not. phpunit.xml is a file that you will run very often from your IDE (if you practice TDD) and the quicker it runs, the better.
  2. Netbeans has an awesome internal code coverage generator that generates wonderful code-coverage reports within the IDE (if you haven’t checked this feature out, I would strongly recommend that you do)
  3. Artifacts generated during code coverage analysis (various html files) do not really belong in the project. The “build/coverage” folder generated by the phing task lies outside the project and is therefore preferred.

Next, we modify our template so that it works with our source code repository and also use phing instead of ANT.

Phing instead of ANT: I prefer not to download and install another binary executable (ant) when we have a perfectly capable native php tool (phing) available for the job.

In Jenkins, click on the newly created job and then on the “Configure” option on your left.

a. Change the tags in the description to use <img/> instead of (I find this works better with the most recent release of Jenkins – 1.454)

b. Provide the source of your code in the “Source Code Management” section (Note how the password is passed in the url)

image

c. Under the “build” section, delete the existing ‘Ant’ target by clicking on the “Delete” button. Click on the “Add build step” button and select “Invoke Phing targets” (this option is made available by the phing plugin for Jenkins)

image

It will add a blank line. Nothing needs to be entered there. By default, Jenkins will look in the Windows PATH for phing and execute it. Phing looks in the workspace directory for a build.xml file, finds it, and dutifully executes all the targets! Hit the save button. This completes the Jenkins setup for our project. Choose the “Build Now” option from the menu and take a good look at all the wonderful reports generated by Jenkins.

image

Part 4: Netbeans and PHP – Still The Perfect Match

Your development process has no doubt become much more organized through the use of the Jenkins CI server. However, you could be bogged down by error reports if you do not have a solid development methodology to boot. And this starts with the IDE.

A couple of years ago, I wrote about how Netbeans surpassed all my expectations from an IDE, in spite of it being free! To this day, it remains a clear winner for PHP development.

With the addition of a few plugins, you will be able to develop better code and thus have fewer surprises when you submit your code for integration. These have served me very well and I strongly recommend you utilize each one of them.

1. Unix line endings : Windows line endings (CR-LF) cause problems on Unix machines (which use LF). To avoid nasty surprises later on, it is best to install this plugin and set line endings to LF.

2. Path Tools : This plugin makes it incredibly easy to drop into console or explorer. Highlight the file or folder and click on the appropriate toolbar option!

3. phpMD / PHP CodeSniffer Plugin: This plugin uses the phpmd (mess detector) and phpcs (code sniffer) tools and tracks code violations early on and displays them right within the IDE window! Using this plugin is an easy way to standardize your codebase. Please refer to the following website for detailed instructions on how to set up and activate this plugin:  http://www.summasolutions.net/blogposts/applying-zend-coding-standard-netbeans-phpmd-codesniffer

Regression Analysis with PHP

Introduction:

Regression analysis is one of the core tools of statistics that helps investigate relationships between variables. If there is only a single explanatory variable, it is termed “Simple Regression” – It is often difficult to come across such situations in real life. Computing a simple regression involves fitting a straight line through the scatter graph obtained from the sample. From coordinate geometry basics, recall that a straight line has the equation:

y=c + mx

where m = slope of the line, and c is the y intercept.

There can of course be many straight lines through a scatter plot. Regression analysis picks ONE line in particular: the criterion used is to minimize the sum of squares of the errors (the vertical distances from the points in the scatter plot to the fitted line).
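As a minimal sketch of the least-squares criterion for a single explanatory variable, the slope and intercept can be computed directly from the sample means. The function name `simpleRegression` and the sample data are assumptions made up for illustration; they are not part of the library described later.

```php
<?php
/**
 * Fit y = c + m*x by ordinary least squares.
 * Returns ['slope' => m, 'intercept' => c].
 * (Hypothetical helper for illustration only.)
 */
function simpleRegression(array $x, array $y): array
{
    $n = count($x);
    if ($n < 2 || $n !== count($y)) {
        throw new InvalidArgumentException('Need two or more paired observations.');
    }

    $meanX = array_sum($x) / $n;
    $meanY = array_sum($y) / $n;

    $sxy = 0.0; // sum of (x - meanX) * (y - meanY)
    $sxx = 0.0; // sum of (x - meanX)^2
    for ($i = 0; $i < $n; $i++) {
        $dx = $x[$i] - $meanX;
        $sxy += $dx * ($y[$i] - $meanY);
        $sxx += $dx * $dx;
    }
    if ($sxx == 0.0) {
        throw new InvalidArgumentException('All x values are identical.');
    }

    $slope = $sxy / $sxx;
    return ['slope' => $slope, 'intercept' => $meanY - $slope * $meanX];
}

// Made-up points that lie exactly on y = 1 + 2x.
$fit = simpleRegression([1, 2, 3, 4], [3, 5, 7, 9]);
printf("y = %.2f + %.2fx\n", $fit['intercept'], $fit['slope']);
```

Because the sample points lie exactly on a line, the fitted slope and intercept recover that line; with noisy data the same formulas give the line with the smallest sum of squared errors.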

If there is more than one variable on which the prediction depends, it is called “Multiple Regression” – It accounts for multiple additional factors (separately) so that the effect of each (independent) variable on the dependent variable can be assessed.

The equation for a (linear) multiple regression can be expressed as:

y = a + b x_1 + c x_2 + …

Mathematically, the approach is identical to “Simple Regression”. For example, in solving a regression with 2 independent variables, we need 3 dimensions. So, instead of estimating a straight line, we select a single “plane”, again such that the sum of squares of errors is minimized.

We can, in fact, extend this to an arbitrarily large number of independent variables. Fortunately, computers have no problem crunching the numbers, using matrix algebra to solve the simultaneous equations and arrive at the results.

Using Matrix Algebra to Solve Multiple Regression:

Let X be the data matrix of the predictor (independent) variables.
Let Y be the data vector representing the criterion (dependent) variable.
and, Let ‘b’ be the data vector representing the regression coefficients.

The formula for computing b (the coefficient vector) using matrix algebra is:

b = (X’X)^{-1}X’Y

The proof of this equation is quite simple:

1. The simplified equation for a (simple) linear regression in matrix terms is Y=Xb+e
2. Assume that the average error (e) will equal 0. The equation becomes Y=Xb and we need to find the value of ‘b’
3. Multiply both sides of the equation by X’ (transpose of X): X’Y = X’Xb
4. We are trying to get rid of X’X on the RHS, so multiply both sides of the equation by (X’X)^{-1}, the inverse of X’X:
(X’X)^{-1}X’Y = (X’X)^{-1}(X’X)b
5. Any matrix multiplied by its inverse is the identity matrix I:
(X’X)^{-1}X’Y = Ib
6. Ib = b = (X’X)^{-1}X’Y

[Note: the above equation contains the Moore-Penrose pseudoinverse, which ensures that the inversion is performed on a SQUARE matrix (X’X). For a matrix A with linearly independent columns, the pseudoinverse is A^{+} = (A’A)^{-1}A’

To solve the equation Y = Xb, the naive solution would be to multiply both sides by X^{-1}:
X^{-1}Y = X^{-1}Xb, and thus
X^{-1}Y = b

However, recall that matrix inverses are only defined for square matrices. We cannot guarantee that the matrix of independent variables will be square. In practice it almost never is, because the sample size (rows) needs to be much larger than the number of coefficients (columns) to give reliable results. The use of a pseudoinverse solves this problem nicely.]
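The same formula can also be reached by minimizing the sum of squared errors directly. The sketch below uses the symbols defined above (X, Y, b, e) and assumes that X’X is invertible, i.e. that the columns of X are linearly independent. S(b) is a name introduced here for the sum of squared errors.

```latex
\begin{aligned}
S(b) &= e'e = (Y - Xb)'(Y - Xb) = Y'Y - 2\,b'X'Y + b'X'Xb \\
\frac{\partial S}{\partial b} &= -2\,X'Y + 2\,X'Xb = 0 \\
X'Xb &= X'Y \\
b &= (X'X)^{-1}X'Y
\end{aligned}
```

The third line is the set of simultaneous equations (often called the normal equations) that the matrix algebra solves.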

Example:

A typical workflow involving regression analysis goes like this:

  1. Formulate a hypothesis about the relationship between variables of interest
  2. Gather data from a representative sample
  3. Compute regression parameters and prove (or disprove) the hypothesis

A common example of multiple regression analysis is in college student admissions. The admissions committee bases its decision on numerous factors that it believes will influence the student’s GPA.

For example, a college could hypothesize that the GPA that will be attained by the admitted student is dependent on his/her high school GPA, SAT score (V+Q) and letters of recommendation (the strength of recommendation letters will of course need to be quantified).

A sample of students currently attending the college is then surveyed for their current college GPA, high school GPA, SAT scores and Recommendation letter strength.

The predictor equation for college GPA (dependent variable) is :

Y’ = a + b_1 X_1 + b_2 X_2 + b_3 X_3

where Y’ = Predicted GPA
{b_1, b_2, b_3} = Regression coefficients
a = Intercept
X_1 = HS GPA, X_2 = SAT Score and X_3 = Strength of Recommendation letters

So, once the coefficients and intercept are determined, it is quite easy to plug in values for X1, X2, X3 and “predict” the future GPA of a student.
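A minimal sketch of that plug-in step follows. The function name `predictGpa` and every number in it are hypothetical values chosen for illustration; they are NOT fitted from any real admissions data.

```php
<?php
/**
 * Evaluate Y' = a + b1*X1 + b2*X2 + b3*X3 for one applicant.
 * $coefficients is [a, b1, b2, b3]. (Hypothetical helper.)
 */
function predictGpa(array $coefficients, float $hsGpa, float $satScore, float $letterStrength): float
{
    [$a, $b1, $b2, $b3] = $coefficients;
    return $a + $b1 * $hsGpa + $b2 * $satScore + $b3 * $letterStrength;
}

// Assumed intercept and coefficients, made up for this example.
$coefficients = [0.5, 0.6, 0.0005, 0.1];

// Applicant with HS GPA 3.5, SAT 1200 and letter strength 4 (on an assumed scale).
printf("Predicted GPA: %.2f\n", predictGpa($coefficients, 3.5, 1200, 4.0));
```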

Implementation:

Although there are numerous statistical packages that can easily compute regression parameters, they are all geared towards “offline” computation. That is, data is collected first, and then the numbers are crunched to arrive at the result. I was recently involved in a project that required online regression arithmetic, i.e., displaying the relevant regression parameters at the end of an online survey. I decided to write my own PHP library for this purpose (complete source code is attached at the end of this blog).

Each observation can be written as an equation as follows:

For explanatory purposes, consider a simple linear regression. In matrix terms, the equation looks like:

image

We can then solve for {b0,b1}… Note how the X matrix needs to be filled with 1’s in the first column. The exact same approach works for multiple regression too – We just use larger matrices!

image

‘n’ is the sample size (number of observations)
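For readers without the figures, the matrix layout can be sketched from the definitions above. This is a reconstruction under the assumption that each observation follows y_i = b_0 + b_1 x_i + e_i; it is not a copy of the original images.

```latex
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
=
\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \end{bmatrix}
+
\begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}
```

With k independent variables, X gains one column per variable (n rows, k + 1 columns) and b gains one entry per variable, while Y and e keep n rows.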

Please refer to the Word document attached in the download for a detailed derivation of why solving the above matrix equations minimizes the sum of squared errors.

The various formulae used are:

  1. b = (X’X)^{-1}X’Y  (Regression coefficients. This is a kX1 array, where k is the number of columns in X)
  2. SSR = b’X’Y - (1/n)(Y’UU’Y) (Sum of squares due to regression, a scalar. U is an nX1 vector of 1’s)
  3. SSE = Y’Y - b’X’Y (Sum of squares due to errors, a scalar)
  4. SSTO = SSR + SSE (Total sum of squares, a scalar)
  5. dfTotal = sample_size - 1 (Total degrees of freedom)
  6. dfModel = num_independent - 1 (Model degrees of freedom. For this to hold, num_independent must count the columns of X, including the column of 1’s)
  7. dfResidual = dfTotal - dfModel (Residual degrees of freedom)
  8. MSE = SSE/dfResidual  (Mean square error, a scalar)
  9. SE = (X’X)^{-1}*(MSE), then take the square root of the elements on the diagonal (Standard errors of the coefficients)
  10. t-stat = b[i][j]/SE[i][j]
  11. R^2 = SSR/SSTO
  12. F = (SSR/dfModel)/(SSE/dfResidual)
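The formulas above can be sketched with a few plain-array helpers. This is an illustration under stated assumptions, not the Lib_Matrix or Lib_Regression code: the function names (`transpose`, `multiply`, `inverse`, `regress`) and the tiny data set are made up here, and the inverse uses straightforward Gauss-Jordan elimination.

```php
<?php
// Minimal matrix helpers on nested arrays (NOT the Lib_Matrix API).
function transpose(array $m): array
{
    $t = [];
    foreach ($m as $i => $row) {
        foreach ($row as $j => $value) {
            $t[$j][$i] = $value;
        }
    }
    return $t;
}

function multiply(array $a, array $b): array
{
    $product = [];
    $inner = count($b);
    $cols = count($b[0]);
    foreach ($a as $i => $row) {
        for ($j = 0; $j < $cols; $j++) {
            $sum = 0.0;
            for ($k = 0; $k < $inner; $k++) {
                $sum += $row[$k] * $b[$k][$j];
            }
            $product[$i][$j] = $sum;
        }
    }
    return $product;
}

// Gauss-Jordan elimination with partial pivoting on the augmented matrix [M | I].
function inverse(array $m): array
{
    $n = count($m);
    for ($i = 0; $i < $n; $i++) {
        for ($j = 0; $j < $n; $j++) {
            $m[$i][$n + $j] = ($i === $j) ? 1.0 : 0.0;
        }
    }
    for ($col = 0; $col < $n; $col++) {
        $pivot = $col;
        for ($r = $col + 1; $r < $n; $r++) {
            if (abs($m[$r][$col]) > abs($m[$pivot][$col])) {
                $pivot = $r;
            }
        }
        if (abs($m[$pivot][$col]) < 1e-12) {
            throw new RuntimeException('Matrix is singular.');
        }
        [$m[$col], $m[$pivot]] = [$m[$pivot], $m[$col]];
        $scale = $m[$col][$col];
        for ($j = 0; $j < 2 * $n; $j++) {
            $m[$col][$j] /= $scale;
        }
        for ($r = 0; $r < $n; $r++) {
            if ($r === $col) {
                continue;
            }
            $factor = $m[$r][$col];
            for ($j = 0; $j < 2 * $n; $j++) {
                $m[$r][$j] -= $factor * $m[$col][$j];
            }
        }
    }
    return array_map(fn(array $row) => array_slice($row, $n), $m);
}

function regress(array $x, array $y): array
{
    $n = count($x);     // sample size
    $k = count($x[0]);  // columns of X, including the column of 1's
    $xt = transpose($x);
    $xtxInv = inverse(multiply($xt, $x));
    $xty = multiply($xt, $y);
    $b = multiply($xtxInv, $xty);                      // formula 1
    $bXty = multiply(transpose($b), $xty)[0][0];       // b'X'Y
    $sumY = array_sum(array_column($y, 0));            // U'Y
    $ssr = $bXty - ($sumY * $sumY) / $n;               // formula 2
    $sse = multiply(transpose($y), $y)[0][0] - $bXty;  // formula 3
    $dfModel = $k - 1;                                 // formula 6
    $dfResidual = ($n - 1) - $dfModel;                 // formulas 5 and 7
    $mse = $sse / $dfResidual;                         // formula 8
    $se = [];
    $t = [];
    for ($i = 0; $i < $k; $i++) {
        $se[$i] = sqrt($xtxInv[$i][$i] * $mse);        // formula 9
        $t[$i] = $b[$i][0] / $se[$i];                  // formula 10
    }
    return [
        'b' => array_column($b, 0), 'sse' => $sse, 'ssr' => $ssr,
        'ssto' => $ssr + $sse, 'se' => $se, 't' => $t,   // formula 4
        'r2' => $ssr / ($ssr + $sse),                    // formula 11
        'f' => ($ssr / $dfModel) / $mse,                 // formula 12
    ];
}

// Made-up data: a column of 1's plus one predictor.
$result = regress([[1, 1], [1, 2], [1, 3], [1, 4]], [[2], [4], [5], [7]]);
printf("b0 = %.4f, b1 = %.4f, R2 = %.4f, F = %.2f\n",
    $result['b'][0], $result['b'][1], $result['r2'], $result['f']);
```

Explicitly inverting X’X keeps the sketch close to the formulas; it fails on data that fits perfectly (SSE of zero makes the t-stat division undefined) and can be numerically fragile on ill-conditioned data.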

The heart of the library is the Lib_Matrix class. It handles all required matrix manipulation operations. A fluent interface has been provided where possible to make code more intuitive and readable. The Regression computation is performed in the “Lib_Regression” class. I have also attached the complete Unit test suite (100% code coverage) to help you better understand the API.

| File Name | Purpose |
| --- | --- |
| Matrix.php | Core Matrix Manipulation |
| Regression.php | Compute various regression parameters |
| MatrixTest.php | Complete unit test suite for both matrix and regression classes |
| Excel Regression.xlsx | Regression performed on the test case testRegressionPassingArrays() using MSexcel |
| MyReg.csv | Sample CSV file used by function testRegressionUsingCSV() |

An example API use scenario is described below:

//independent variables.
$x = array(
            array(1, 8, 2),
            array(1, 40.5, 24.5),
            array(1, 4.5, .5),
            array(1, .5, 2),
            array(1, 4.5, 4.5),
            array(1, 7, 8),
            array(1, 24.5, 40.5),
            array(1, 4.5, 2),
            array(1, 32, 24.5),
            array(1, .5, 4.5),
        );
//dependent variables
 $y = array(array(4.5),
            array(22.5),
            array(2),
            array(.5),
            array(18),
            array(2),
            array(32),
            array(4.5),
            array(40.5),
            array(2));

$reg = new Lib_Regression();
$reg->setX($x);
$reg->setY($y);

//NOTE: passing true to the compute method generates standardized coefficients
$reg->Compute();    //go!

var_dump($reg->getSSE());
var_dump($reg->getSSR());
var_dump($reg->getSSTO());
var_dump($reg->getRSQUARE());
var_dump($reg->getF());
var_dump($reg->getRSQUAREPValue());
var_dump($reg->getCoefficients());
var_dump($reg->getStandardError());
var_dump($reg->getTStats());
var_dump($reg->getPValues());

Click here to download the regression library (GitHub)

References:

http://marketing.byu.edu/htmlpages/books/pcmds/REGRESS.html

http://davidmlane.com/hyperstat/prediction.html

http://luna.cas.usf.edu/~mbrannic/files/regression/regma.htm

http://www.geog.ucsb.edu/~joel/g210_w07/lecture_notes/lect09/oh07_09_2.html

http://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/pvalues.htm#rtdist

memcached on 64 bit Windows

Download memcached 1.4.4-14 from here (note that later versions of memcached have been modified and DO NOT work as Windows services).

Unzip to a folder on your computer (say d:\memcached)

Open an ADMINISTRATOR command window, and navigate to the memcached folder:

image

Typing “memcached -h” will display all the available options

1. Install memcached service on your system by typing

memcached.exe -d install

2. Tell memcached to start

memcached.exe -d start

You are done. When you view your Task Manager, you will notice the memcached service running:

image

This service is set up to autostart with Windows, so you will not need to repeat the above.

Now, if you want to access/use memcached via PHP (or Zend Framework), you need to install the php_memcache extension.

php_memcache.dll comes bundled in a zip file available here (Many thanks to Anindya for making this available). Extract and save the dll file in your php ext directory.

Edit your php.ini and add this line:

extension=php_memcache.dll

Restart Apache. Navigate to your phpinfo page and you should see a section for memcache.
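To confirm the extension can actually talk to the service, a short round-trip check along these lines may help. It is a sketch that assumes memcached is listening on its default port 11211 on the local machine; the function name `memcacheRoundTrip` and the key `greeting` are made up for illustration.

```php
<?php
/**
 * Store one value in memcached and read it back.
 * Returns null when the memcache extension is not loaded,
 * false when the server is unreachable or the value does not match.
 * (Hypothetical helper; host and port are assumed defaults.)
 */
function memcacheRoundTrip(string $host = '127.0.0.1', int $port = 11211): ?bool
{
    if (!class_exists('Memcache')) {
        return null;
    }

    $cache = new Memcache();
    if (!@$cache->connect($host, $port)) {
        return false;
    }

    // Flags 0 (no compression); the entry expires after 60 seconds.
    $cache->set('greeting', 'hello from memcached', 0, 60);
    $ok = $cache->get('greeting') === 'hello from memcached';
    $cache->close();

    return $ok;
}

var_dump(memcacheRoundTrip());
```

A result of `bool(true)` means the extension is loaded and the service answered; `NULL` points to a php.ini problem, and `bool(false)` to the service not running or being blocked.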