2503ICT: Programming with PHP


Introduction

We now commence the study of how to write three-tier dynamic Web applications. Recall that the three tiers are:

In general, each tier may be running on a separate host.

PHP (PHP: Hypertext Preprocessor) is a free, open-source, mature, widely available, efficient, well-documented HTML-embedded server-side scripting language, specifically designed for implementing the server-side tier of Web applications.

It is used for implementing some of the Web's largest applications, e.g., yahoo.com and facebook.com.

Two other widely used server-side technologies are Microsoft's ASP.NET (the successor to Active Server Pages) and Sun's (now Oracle's) Java Enterprise Edition (JEE). New languages and frameworks are gaining popularity and are discussed later.

Two of the main sites containing information about PHP are www.php.net and www.zend.com.

Here are some introductory tutorials:

Many of these tutorials, and others, recommend deprecated and poor PHP programming style. We'll try to present better style in our notes and examples, and note when we use bad style.

Here are some recommended texts, all in the library in paper and/or online form.

The PHP Manual (mirrored here) is a critical resource.

The MySQL Manual (mirrored here) is another critical resource.

This introduction is based on a set of examples linked from an examples page. Some of these (old) examples may use HTML 4.01 rather than HTML5, but you must use HTML5. Moreover, some of these (initial) examples do not use HTML templates to separate application logic from HTML content and do not use a database for managing data. We'll correct these limitations in future lectures.

PHP Summary

PHP is an HTML-embedded scripting language. It is a modern, dynamically-typed, object-oriented language with Java-like control structures and an extensive library to support Web programming. It is relatively easy to learn and very powerful, but it is not as clean and regular as other languages such as Java, Python and Ruby.

You already know some PHP! Here is a simple PHP script, hello.php:

<!DOCTYPE html>
<html lang="en">
  <head> ... </head>
  <body>
    <p>Hello, world!</p>
  </body>
</html>

Admittedly, this PHP script is not very interesting. More useful PHP scripts have PHP code blocks embedded in HTML documents or do not contain HTML at all.

PHP constants are declared using the define() function:

define("LIMIT", "5");
echo LIMIT;

Only scalar values (booleans, integers, doubles and strings) and, from PHP 7, arrays may be constant values.

PHP variables are preceded by a dollar character, e.g., $current, $values[2].

The most important PHP data structures are arrays (unbounded maps from integers to values, cf., Java lists) and associative arrays (unbounded maps from strings to values, cf., Java maps). (Associative arrays are called dictionaries in some other languages.) Both are accessed using the conventional bracket notation:

$a["key"] = $b[0];   # $a is an associative array, $b is an (indexed) array

Control structures for selection and iteration have several syntactic forms, including the familiar Java forms using braces. Functions may or may not return values.

PHP applications are structured as a set of scripts. Each script should perform a single, simple, documented task. Normally the task is either to display an HTML page or to update information stored on the server. The simplest PHP script is a plain, ordinary HTML document. Collections of related function (and constant) definitions are typically stored in a definitions file that is included in each script that uses the defined functions (and constants).

PHP scripts may be executed either by requesting their URL in a Web browser or by running them directly from the command line on the server.

PHP Examples

Greetings example

Here is a greetings example based on an example in the tutorial on the PHP Web site. The example shows how to transfer information from an HTML form to a PHP script, transform the information, and output the transformed information as an HTML document. The example is trivial, but it is our first example where the output depends on the input. Note that this example uses bad style: it does not use templates and it does not validate user input.

Key points to note include:

Note that:

Some other points:

Factoriser example

Here is a factoriser example that acts as a simple compute server. This example also demonstrates how to transfer information from an HTML form to a PHP script, perform a nontrivial computation, and return the result in an HTML page. Note that this example uses bad style - it does not use templates - but it does validate user input.
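The following is a minimal sketch of how such a compute server might be written; it is not the course's factoriser script. The form field name number and the function names prime_factors() and factorise_reply() are assumptions made for illustration. It validates the input with filter_var() before computing, and encodes the reply with htmlspecialchars() before printing it.

```php
<?php
// Return the prime factors of $n (n >= 2) in ascending order, by trial division.
function prime_factors(int $n): array
{
    $factors = [];
    for ($d = 2; $d * $d <= $n; $d++) {
        while ($n % $d === 0) {
            $factors[] = $d;
            $n = intdiv($n, $d);
        }
    }
    if ($n > 1) {
        $factors[] = $n;   // whatever remains is itself prime
    }
    return $factors;
}

// Validate the raw form input and build the text of the reply.
function factorise_reply(string $input): string
{
    $n = filter_var($input, FILTER_VALIDATE_INT, ['options' => ['min_range' => 2]]);
    if ($n === false) {
        return 'Please enter a whole number greater than 1.';
    }
    return "$n = " . implode(' x ', prime_factors($n));
}

// 'number' is a hypothetical form field name; 360 is used when it is absent.
$input = $_GET['number'] ?? '360';
echo '<p>' . htmlspecialchars(factorise_reply($input)) . '</p>', "\n";
```

Keeping validation and computation in separate functions makes each easy to test from the command line, independently of the HTML form.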

Personal details example

The previous two examples used very simple HTML form elements. This personal details example illustrates how many different kinds of HTML5 form elements are processed in PHP. It's an important example to understand form processing. Again, it does not use templates or validate user input.

PM database example

This (PHP) database of Australian Prime Ministers illustrates how relational data may be stored as an array of associative arrays and one way of handling user-defined queries in PHP scripts. It's another important example. Alternative ways of giving and handling user-defined queries are explored in laboratory exercises.
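As a rough illustration of the idea (not the course's actual script), the sketch below stores a few rows as an array of associative arrays and answers a simple query by filtering them. The field names and the function search() are assumptions chosen for this example.

```php
<?php
// A tiny "table": each row is an associative array. Field names are illustrative.
$pms = [
    ['name' => 'Gough Whitlam',  'from' => 1972, 'to' => 1975, 'party' => 'Labor'],
    ['name' => 'Malcolm Fraser', 'from' => 1975, 'to' => 1983, 'party' => 'Liberal'],
    ['name' => 'Bob Hawke',      'from' => 1983, 'to' => 1991, 'party' => 'Labor'],
    ['name' => 'Paul Keating',   'from' => 1991, 'to' => 1996, 'party' => 'Labor'],
    ['name' => 'John Howard',    'from' => 1996, 'to' => 2007, 'party' => 'Liberal'],
];

// Return the rows whose $field contains $query (case-insensitive substring match).
function search(array $rows, string $field, string $query): array
{
    $matches = [];
    foreach ($rows as $row) {
        if (stripos((string) $row[$field], $query) !== false) {
            $matches[] = $row;
        }
    }
    return $matches;
}

// Example query: all rows whose party contains "labor".
foreach (search($pms, 'party', 'labor') as $pm) {
    echo "{$pm['name']} ({$pm['from']}-{$pm['to']})\n";
}
```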

First guestbook example

This guestbook application is a first dynamic Web application that stores information in a file on the server. It thus allows users to provide content for other users to see. Note the interaction between the PHP script index.php and the HTML form enter.html. This application demonstrates a standard way to read and write text files from a PHP script.

Note that this implementation stores just the messages and authors in the text file, without any presentation information. This is good practice. Calls to the function htmlspecialchars() encode HTML characters such as "<" and "&" as the HTML entities "&lt;" and "&amp;" before outputting message and author strings, to prevent embedded HTML elements and JavaScript scripts from being interpreted on the client. This is an example of data cleansing or data sanitisation to protect the system against malicious or inadvertent user input. Note what happens if these calls to htmlspecialchars() are omitted.
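The effect of htmlspecialchars() is easy to see on a small case. In this sketch the message text is a made-up example of hostile input, not taken from the guestbook application.

```php
<?php
// A hypothetical guestbook message containing markup.
$message = '<script>alert("hi")</script> Fish & chips';

// Encode the HTML special characters just before output.
$safe = htmlspecialchars($message, ENT_QUOTES, 'UTF-8');

// Prints: <p>&lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt; Fish &amp; chips</p>
echo "<p>$safe</p>\n";
```

The browser displays the encoded text exactly as the user typed it, rather than running the script.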

Note that this implementation stores the function definitions in a separate, included file, which is good practice.

Note also that this implementation suffers from the reload-redo problem: refreshing the index page may cause the last update to be redone. This is a very bad error! Later we'll see how to avoid this error using separate one-component or action scripts.

Later we'll also see better ways to implement this guest book application, by storing the messages in an SQLite or MySQL database.

PHP techniques

Included definition files

Store collections of related constant, variable and function definitions in separate files, and include these files in each script that uses them with the statements include, require, include_once and require_once, e.g.,

include_once "definitions.php";

The difference between include and require is that a script containing the former continues (with a warning) if the file is not present, but a script containing the latter stops immediately with a fatal error. The difference between include_once and include is that include_once will not include a file again if it has previously been included; require_once behaves likewise. This makes a difference when the file can initialise (and re-initialise) variables, and it avoids the error caused by defining the same function twice.
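The behaviour of include_once can be observed with a throwaway file. This is only a demonstration sketch: the temporary file and the variable $loads are invented for the purpose, and a real definitions file would live alongside the scripts that use it.

```php
<?php
// Create a throwaway definitions file (for demonstration only).
// Each time the file is executed it adds 1 to $loads.
$file = tempnam(sys_get_temp_dir(), 'defs');
file_put_contents($file, '<?php $loads = ($loads ?? 0) + 1;');

include_once $file;   // first inclusion: the file runs, so $loads is 1
include_once $file;   // already included: skipped, $loads is still 1
include $file;        // a plain include always runs the file: $loads is 2

echo "The file was executed $loads times.\n";
unlink($file);        // remove the temporary file
```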

Action scripts

A very important PHP programming technique is the use of so-called action scripts. Often you want to run a script to perform some action - such as inserting an item into a shopping cart - and then return immediately to the calling, or referring, or some other, page. Indeed, we may wish to run this action script from different pages and return to the calling page whatever it is. To do this, we have to (a) avoid displaying any information associated with the action script, and (b) somehow return to the calling page whose name we may not be able to embed in the action script.

In fact, an action script must be used whenever the script updates the information on the server. This will avoid the reload-redo problem.

This action script version of the guest book application illustrates how action scripts are written and used. Here, index.php is the calling script, and add_action.php is the action script. The key PHP feature that makes this work is the following three lines of add_action.php:

// Redirect to script index.php
header("Location: index.php");
exit;

The PHP function header() sends its argument as a raw HTTP header in the response to the HTTP request that caused the current script to be executed. In this case, the Location header tells the browser to request the script index.php, so the action script returns no HTML of its own. Calls to the function header() must be made before any other output is generated, even a blank line.

Note that this version of the guestbook application does not have the reload-redo problem.

More generally, we may wish to redirect control back to the referring script, whatever it was. This may be done as follows:

// Redirect back to the referring script
$referer = $_SERVER['HTTP_REFERER'];
header("Location: $referer");
exit;

Here, the PHP variable $_SERVER['HTTP_REFERER'] stores the URL of the calling, or referring, script. If this response is the first output of the action script, the effect is to immediately redirect control back to the calling page, which is exactly what is required.
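Because the Referer header is supplied by the browser, it may be missing or may name another site. One possible precaution, sketched below, is to fall back to a known page in those cases. The function redirect_target() and the default page are assumptions for illustration, and the host comparison is deliberately simple (it ignores port numbers).

```php
<?php
// Choose where an action script should send the browser next.
// Falls back to $default when no Referer header was sent or when the
// referring URL names a different host from the one serving this script.
function redirect_target(array $server, string $default = 'index.php'): string
{
    $referer = $server['HTTP_REFERER'] ?? '';
    $host    = $server['HTTP_HOST'] ?? '';
    if ($referer === '' || parse_url($referer, PHP_URL_HOST) !== $host) {
        return $default;
    }
    return $referer;
}

// In an action script this would be used before any output, as follows:
//   header('Location: ' . redirect_target($_SERVER));
//   exit;

// Demonstration with made-up server values:
echo redirect_target([
    'HTTP_HOST'    => 'example.com',
    'HTTP_REFERER' => 'http://example.com/cart.php',
]), "\n";
```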

The function header() can be used to output other forms of HTTP responses. There are also many other useful PHP variables like $_SERVER['HTTP_REFERER'], for example $_SERVER['PHP_SELF'], which is the path of the current script.

Note again that the function header() must be called before anything (even a blank line) is output by the script.

It is good practice to use an action script to process form data whenever the form uses method POST. Otherwise, there is a risk of incorrectly repeating the server update operation when the "reload" button in the browser is pressed, and hence causing the reload-redo problem.

Hence, action scripts are frequently required.

Input validation

Input validation is the process of checking that data entered by users is valid, e.g., that names are nonempty, that email addresses have valid syntax, that numeric fields are numbers in the correct range, that username/password pairs correspond to registered users, and so on.

Input validation must be performed on the server.

Input validation may also be performed on the client (using JavaScript) to reduce communication load, to reduce server load, and to provide faster feedback to users. HTML5 provides direct support for performing input validation on the client.

Techniques for performing input validation and for reporting errors are described later.
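As a preview, here is a minimal sketch of server-side validation using PHP's filter_var() function. The field names (name, email, age), the age range and the error messages are assumptions for this example only.

```php
<?php
// Validate hypothetical form fields and return a list of error messages.
// An empty list means the input is acceptable.
function validate_details(array $form): array
{
    $errors = [];

    if (trim($form['name'] ?? '') === '') {
        $errors[] = 'Name must not be empty.';
    }

    if (filter_var($form['email'] ?? '', FILTER_VALIDATE_EMAIL) === false) {
        $errors[] = 'Email address is not valid.';
    }

    $age = filter_var($form['age'] ?? '', FILTER_VALIDATE_INT,
                      ['options' => ['min_range' => 0, 'max_range' => 150]]);
    if ($age === false) {
        $errors[] = 'Age must be a whole number between 0 and 150.';
    }

    return $errors;
}

// Demonstration with made-up input; a real script would pass $_POST.
print_r(validate_details(['name' => ' ', 'email' => 'fred@example.com', 'age' => '200']));
```

Returning a list of messages, rather than stopping at the first problem, lets the script report every error to the user at once.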

Input sanitisation

Input sanitisation is the process of ensuring that user-entered data cannot damage stored information, reveal private information, or otherwise cause harm. It is necessary because users may be malicious or careless.

There are three main forms of input sanitisation.

  1. Preventing user input from being rendered as HTML or executed as JavaScript on the client. This can be done, e.g., by applying the function htmlspecialchars() to user input immediately before printing it.
  2. Preventing user input from being executed as unintended SQL queries to the database server. This can be done, e.g., by passing user input to prepared statements as bound parameters, or by applying the function mysqli_real_escape_string() to user input before including it in SQL queries. (The older function mysql_escape_string() was removed in PHP 7.)
  3. Preventing user input from being executed as unintended shell commands on the server. This can be done, e.g., by applying the function escapeshellcmd() (or escapeshellarg() for individual arguments) to user input immediately before using it in shell commands. This is less important in our simple applications.

In every case, it is important to truncate user input to a maximum expected length before taking any further action. Passing unexpectedly long strings to any computer program is a common way to break that program or cause it to perform in unexpected ways. In PHP, we truncate strings to some maximum length as follows:

$input = substr($input, 0, MAX_LENGTH);

PHP language and library

Arrays

PHP has indexed arrays (indexed from 0) and associative arrays (keys are arbitrary strings, cf. Java maps). To iterate over indexed arrays, use the Java construct:

for ($i = 0; $i < count($array); $i++) { $s = $s + $array[$i]; ... }

or

foreach ($array as $value) { $s = $s + $value; ... }

To iterate over associative arrays, use the PHP construct:

foreach ($array as $key => $value) { $s = $s + $value; ... }

To access and update indexed arrays, use the [] construct:

$family = array('Fred', 'Wilma'); // $family[0] is 'Fred', $family[1] is 'Wilma'
$family[] = 'Pebbles';            // $family[2] is 'Pebbles'

To extract a slice of an indexed array, use the array_slice function:

$slice = array_slice($array, $offset, $length);

Other useful array functions include range, array_chunk, array_splice, in_array, sort (indexed arrays), ksort (associative arrays by key) and asort (associative arrays by value).

Initialise, access, update and add elements of an associative array as follows:

$person = array('name' => 'Fred', 'age' => 35, 'spouse' => 'Wilma');
$spouse = $person['spouse']; // accesses value associated with given key
$person['spouse'] = 'Marg';  // updates value associated with given key
$person['pet'] = 'Fido';     // adds new key => value association

Useful associative array functions include array_key_exists, array_keys, array_values, in_array.
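The short sketch below exercises several of the array functions named in this section. The data (numbers and prices) is made up purely to show what each call returns.

```php
<?php
$numbers = range(1, 6);                   // [1, 2, 3, 4, 5, 6]
$middle  = array_slice($numbers, 1, 3);   // [2, 3, 4]
$pairs   = array_chunk($numbers, 2);      // [[1, 2], [3, 4], [5, 6]]
$hasFour = in_array(4, $numbers);         // true

// Hypothetical prices, used only to show the sort functions.
$prices = ['pear' => 3, 'apple' => 5, 'kiwi' => 1];

ksort($prices);                           // sort by key
$byName = array_keys($prices);            // ['apple', 'kiwi', 'pear']

asort($prices);                           // sort by value, keeping the keys
$byPrice = array_keys($prices);           // ['kiwi', 'pear', 'apple']

$hasPlum = array_key_exists('plum', $prices);   // false

echo implode(', ', $byPrice), "\n";       // prints: kiwi, pear, apple
```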

Strings

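Strings are sequences of characters that can be indexed like arrays. Variables (but not constants or arbitrary expressions) included in double-quoted strings are evaluated, e.g., "My name is $name." evaluates to 'My name is Rodney.' (assuming $name == "Rodney"), but the single-quoted string 'My name is $name.' evaluates to 'My name is $name.'. A simple, single-level array access is also evaluated, so "My first initial is $name[0]." evaluates to "My first initial is R.". More complex expressions that begin with a variable, such as nested array accesses and method calls, can be expanded in double-quoted strings by enclosing them in braces, e.g., "My first initial is {$name[0]}." evaluates to "My first initial is R.". Braces that do not start with a dollar character are left alone, so "{1+2} equals {$a[0]}" evaluates to "{1+2} equals 3" (assuming $a[0] == 3); to include the value of an arbitrary expression, use concatenation, e.g., (1+2) . " equals {$a[0]}" evaluates to "3 equals 3".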
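A quick way to see how PHP treats a particular string is to run it. The following sketch collects the main cases; the variable names and values are chosen only for illustration.

```php
<?php
$name   = 'Rodney';
$person = ['name' => 'Fred', 'pets' => ['Dino']];

$double  = "My name is $name.";                 // My name is Rodney.
$single  = 'My name is $name.';                 // My name is $name.
$initial = "My first initial is $name[0].";     // My first initial is R.
$braced  = "My first initial is {$name[0]}.";   // My first initial is R.

// Nested array accesses need braces.
$nested  = "{$person['pets'][0]} belongs to {$person['name']}.";

// Braces without a leading $ are ordinary text; use concatenation instead.
$literal = "{1+2} is not evaluated";
$sum     = (1 + 2) . " is evaluated";

foreach ([$double, $single, $initial, $braced, $nested, $literal, $sum] as $s) {
    echo $s, "\n";
}
```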

Useful string functions include strlen, substr, strpos, strtok, explode, implode, join, preg_match, preg_split and parse_url.

Useful functions for encoding strings include htmlspecialchars, htmlentities, strip_tags, addcslashes, stripcslashes, escapeshellcmd, escapeshellarg and mysqli_real_escape_string. (The older functions mysql_escape_string and mysql_real_escape_string were removed in PHP 7.)

Tokenise strings as follows:

$token = strtok($string, $separators);
while ($token !== false) {   // strict test, so that a token such as "0" does not end the loop
    echo "$token<br>";
    $token = strtok($separators);
}

This is analogous to the use of scanners for tokenising in Java.

Alternatively, you could split a string into an array of tokens as follows:

$tokens = explode(" ", $string);

This separates the string into an array of tokens separated by single spaces. This may not be what you expect. Function preg_split() below is normally what you really want.

Conversely, you can combine the (string) elements of an array into a single string with the function join() as follows:

$sentence = join(" ", $words);

For example, join("...", array("This", "is", "a", "sentence")) evaluates to the string "This...is...a...sentence".
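The difference between splitting on a single space and splitting on runs of white space shows up as soon as the input contains two spaces in a row. A small sketch, with a made-up input string:

```php
<?php
$string = "to be  or not";               // note the two spaces after "be"

$bySpace = explode(" ", $string);        // ['to', 'be', '', 'or', 'not']
$byRun   = preg_split('/\s+/', $string); // ['to', 'be', 'or', 'not']
$joined  = implode("-", $byRun);         // 'to-be-or-not'

print_r($bySpace);                       // shows the empty token at index 2
echo $joined, "\n";
```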

Regular expressions

Regular expressions are patterns that greatly generalise the concept of a substring. They represent complex patterns that may or may not occur in a given string. Regular expressions can be used to describe the structure of a name (first name, optional middle initial, last name), postcode (four digits), phone numbers (optional area code, seven digit number, optional extension), URL (optional mode followed by "//", optional name and password, host, optional file path, optional query), and so on. Regular expressions are widely used in operating system shells, in compilers, in Web applications, and many other application areas. All modern languages provide extensive support for creating and matching regular expressions.

PHP has provided two regular expression libraries, PCRE and POSIX. Use the PCRE (preg_) functions: the POSIX (ereg) functions were deprecated in PHP 5.3 and removed in PHP 7.

The key constructs used to define regular expressions in PHP are the following:

Here are some example expressions:

A cheatsheet summarising regular expression patterns is available.

The main PHP function for matching regular expressions is preg_match():

    int preg_match(string pattern, string subject [, optional arg [, ...]])

Function preg_match() returns 1 if the regular expression pattern matches the string subject, 0 if it does not, and false if an error occurs. Function preg_match_all() may be used to find all matches.

A common use for regular expressions and the function preg_match() is checking that form input fields have the correct structure.
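For instance, the postcode and name structures mentioned earlier might be checked as in the sketch below. The patterns are deliberately simple assumptions (ASCII letters only, one optional middle initial) and the function names are invented for this example.

```php
<?php
// Return true if $postcode consists of exactly four digits.
function is_postcode(string $postcode): bool
{
    // The D modifier makes $ match only at the very end of the subject.
    return preg_match('/^\d{4}$/D', $postcode) === 1;
}

// Return true for "First Last" or "First M. Last" (ASCII letters only).
function is_full_name(string $name): bool
{
    return preg_match('/^[A-Z][a-z]+( [A-Z]\.)? [A-Z][a-z]+$/D', $name) === 1;
}

var_dump(is_postcode('4111'));                  // bool(true)
var_dump(is_postcode('41a1'));                  // bool(false)
var_dump(is_full_name('Fred J. Flintstone'));   // bool(true)
```

Comparing the result with 1 using === treats both a failed match (0) and an error (false) as invalid input.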

The main PHP function for splitting a string at all instances of a regular expression is preg_split():

    array preg_split(string pattern, string subject [, optional arg [, ...]])

Function preg_split() breaks the string subject into substrings at occurrences of pattern. For example the call preg_split("/[\s,;:.]+/", $sentence) splits the string $sentence into an array of white-space-or-punctuation separated words.

See this word-counting example, which counts the number of characters, words and lines in an input text, as an example of the use of the preg_split() function.
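A word counter along these lines can be sketched in a few lines. This is not the linked example itself: the function count_text() is an assumption, characters are counted as bytes, and lines are counted as newline characters.

```php
<?php
// Count the characters (bytes), words and lines (newline characters) in $text.
function count_text(string $text): array
{
    // PREG_SPLIT_NO_EMPTY drops the empty strings produced by leading
    // or trailing white space.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

    return [
        'chars' => strlen($text),
        'words' => count($words),
        'lines' => substr_count($text, "\n"),
    ];
}

print_r(count_text("The quick brown fox\njumps over the lazy dog.\n"));
// chars is 45, words is 9, lines is 2
```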

See Text Processing in the manual for a complete list of string processing and regular expression functions.

Files

Entire files may be included using the functions include and require, but file-processing applications normally open files for reading, appending or writing, obtain a file pointer, and read or write a line at a time from or to the file pointer. For example, to append a file $input to file $output:

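$fp_in = fopen($input, "r");     // open the source file for reading
$fp_out = fopen($output, "a");   // open the destination file for appending
while (! feof($fp_in)) {
    $line = fgets($fp_in, 4096);
    if ($line !== false) {       // fgets() returns false when nothing is left to read
        fputs($fp_out, $line);
    }
}
fclose($fp_in);
fclose($fp_out);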

Note that the end-of-file condition is tested before reading each line, whereas in Java it is tested after reading each line.

To create a new file, or to overwrite an existing file, open it with access mode "w".

See Filesystem Functions in the manual for the complete list of file processing functions.

The design of data in text files is a critical task that requires care. It should be done so that components of the files can easily be extracted, so that the structure of the files is unambiguous, and so that the files may easily be used by different applications.

One common possibility is to use one line for each logical record, with some (nonblank) character (such as ':' or ';') used to separate the logical fields of each record. Occurrences of the separator within a field need to be preceded by an escape character (normally '\'), and occurrences of the escape character itself need to be escaped in the same way.

If a record may contain fields with multiple lines, use specified tokens to separate or terminate such multiple-line fields.
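The one-line-per-record format with an escaped separator might be implemented as sketched below. The function names encode_record() and decode_record() are assumptions, and fields are assumed not to contain newlines.

```php
<?php
// Join fields into one line, escaping backslashes and separators inside fields.
function encode_record(array $fields, string $sep = ':'): string
{
    $escaped = [];
    foreach ($fields as $field) {
        // Escape the escape character first, then the separator.
        $escaped[] = str_replace(['\\', $sep], ['\\\\', '\\' . $sep], $field);
    }
    return implode($sep, $escaped);
}

// Split a line produced by encode_record() back into its fields.
function decode_record(string $line, string $sep = ':'): array
{
    $fields = [''];
    $last   = 0;
    $length = strlen($line);
    for ($i = 0; $i < $length; $i++) {
        $ch = $line[$i];
        if ($ch === '\\' && $i + 1 < $length) {
            $fields[$last] .= $line[++$i];   // escaped character: copy it literally
        } elseif ($ch === $sep) {
            $fields[] = '';                  // unescaped separator: start a new field
            $last++;
        } else {
            $fields[$last] .= $ch;
        }
    }
    return $fields;
}

// Made-up record whose fields contain both the separator and a backslash.
$record = ['Fred: hello', 'C:\\temp', 'plain'];
$line   = encode_record($record);
echo $line, "\n";
print_r(decode_record($line));
```

A plain explode(':', $line) would split at the escaped separators too, which is why decoding scans the line character by character.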

Classes and objects

PHP provides limited (but improving) support for object-oriented programming. For example:

<?php
// File: counter.php
// A class that defines a counter.
class Counter {
  // Member variables
  public $count = 0;
  public $start = 0;

  // Constructor (it must be named __construct; PHP 8 no longer treats a
  // method named after the class as a constructor)
  function __construct($start = 0) {
    $this->count = $start;
    $this->start = $start;
  }

  // Methods
  function startAt($i) {
    $this->count = $i;
    $this->start = $i;
  }

  function increment() { $this->count++; }

  function reset() { $this->count = $this->start; }

  function show() { return $this->count; }
}
?>

This class could be used as follows:

<?php
include "counter.php";

$counter1 = new Counter;      // starts at the default, 0
$counter2 = new Counter(20);  // starts at 20
$counter1->startAt(10);
$counter1->increment();
print "<p>Counter is now: {$counter1->show()}</p>";  // Counter is now: 11
?>

Classes can be extended in the normal way:

<?php
class ColoredCounter extends Counter {
  public $color = 0;

  function __construct(int $color) {
    parent::__construct();   // call the superclass constructor explicitly
    $this->color = $color;
  }

  // ...
}
?>

Note that the constructor of the superclass must be called explicitly.

From version 5.0 of PHP, it is possible to distinguish private and public members of a class and to apply other useful object-oriented programming concepts, apparently borrowed largely from Java.
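As an illustration of member visibility, the sketch below hides a counter's state behind public methods. The class name PrivateCounter is invented for this example, and the scalar and return type declarations assume PHP 7.1 or later.

```php
<?php
// A counter whose state can only be changed through its public methods.
class PrivateCounter
{
    private $count;   // not accessible from outside the class
    private $start;

    public function __construct(int $start = 0)
    {
        $this->start = $start;
        $this->count = $start;
    }

    public function increment(): void { $this->count++; }

    public function reset(): void { $this->count = $this->start; }

    public function value(): int { return $this->count; }
}

$counter = new PrivateCounter(10);
$counter->increment();
echo $counter->value(), "\n";   // prints 11

// $counter->count = 99;        // would fail: cannot access private property
```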

From version 5.0 of PHP, object reference assignment is handled as in Java, which avoids a source of confusion for Java programmers in earlier versions of PHP.
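The following sketch shows what this means in practice: assigning an object copies a handle to the same object, and an independent copy must be requested with clone. The class Box is a made-up example.

```php
<?php
class Box
{
    public $value = 0;
}

$a = new Box;
$b = $a;           // $a and $b now refer to the same object
$b->value = 42;    // the change is visible through $a as well

$c = clone $a;     // an independent (shallow) copy
$c->value = 7;     // does not affect $a

echo $a->value, " ", $c->value, "\n";   // prints: 42 7
```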

To determine the class of an object, use a construct such as the following:

if (is_object($var)) {
    $class_name = get_class($var);
    ...
}

Classes are useful for encapsulating related groups of functions used for database access and user authentication, and generally for programming with objects while hiding their representations. But use them wisely as they can complicate PHP code unnecessarily. (Assignments in this course can mostly be done without using classes at all.)