Using PHP with MySQL (cont.)


We continue the study of how to write three-tier dynamic Web applications using PHP and MySQL.

Contents

Dates and times

Managing dates and times in computer programs is very complicated. There are different calendars, different time zones, different daylight savings rules, leap years (and leap seconds), and many different implementations. Here is a suggestion for one way to manage dates and times with PHP and MySQL, based on representing epochs (time instants) as Unix timestamps in PHP and as SQL timestamps (which are completely different) in MySQL, and performing date/time calculations in PHP (despite the extensive date/time functions available in MySQL).

An instant of time (such as 6:00pm on 15 May 2003) is called an epoch. Epochs are represented in PHP as Unix timestamps (the integral number of seconds since 1 Jan 1970) and in MySQL as visual numerical encodings. For example, the epoch at 21:30 on 15 May 2003 is represented by the Unix timestamp 1052998200 and by the MySQL timestamp 20030515213000 (or possibly by "20030515213000"). So it's necessary to convert Unix timestamps to MySQL timestamps when inserting rows into MySQL tables and to convert MySQL timestamps to Unix timestamps when retrieving rows from MySQL tables. These two conversions can be done in PHP as follows:

$epoch = time(); // current time
$insert = "insert into items values (..., from_unixtime($epoch), ...)";
...
$query = "select ..., unix_timestamp(time) as time, ... from items where ...";
...
while ($row = mysql_fetch_array($result)) {
    $epoch = $row["time"]; // stored time
    ...
}

Here, function time() (also used below when discussing cookies) is a PHP function that returns the current epoch ("now") as a Unix timestamp. Table items has a column time, whose type is timestamp. Alternatively, you could define the column time to have type bigint and avoid the two conversions completely. However, this would make it impossible (or at least more difficult) to use any of the extensive MySQL date/time functions if you wished to do so.

An advantage of thinking in times of epochs is that it's easy to add a number of seconds to an epoch to move forward in time and to subtract two epochs to find the difference between them.

That leaves the problem of converting more familiar forms of dates into Unix timestamps on input and converting Unix timestamps into readable forms of dates on output.

Input can be handled by the amazing PHP function strtodate(string). This function takes a string representing a time and returns the corresponding Unix timestamp. Examples of possible arguments to strtotime() are "now", "tomorrow", "+2 days" (from now), "1 week", "20 May" (this year), "20 May 2003", "2003-5-20", or "20 May 2003 8:30pm", "tomorrow 22:00", and similar combinations. Wow! Other useful input functions are mktime() for converting numerical representations of dates to timestamps and checkdate() for checking that numerical dates are meaningful. See the online documentation of these functions for details.

Output can be handled by the PHP function date(format,timestamp) which uses the string format to return a readable string representation of timestamp. If the second argument is omitted, the current time ("now") is used.A similar output function is strftime(). See the online documentation of these functions for details.

Converting a number of seconds (e.g., the difference between two Unix timestamps) is most easily done by hand using simple arithmetic.

This simple event management program illustrates these techniques and a possible format for function date() in a simple application that enters and displays events, the time of each event, and the remaining time to each event. The data model used is defined in file events.sql.

Exercise. Find an equally simple way of representing and processing dates when times within dates are not required.

User authentication and authorisation

The first task is to decide which parts of your site you wish to restrict to which users. Be generous. Users should be able to browse shopping sites (and University Web sites, and learning@griffith) anonymously. It is only when users wish to view truly confidential information or perform significant transactions that authentication (who is this user?) and authorisation (does this user have permission?) is required.

User authentication is currently done most commonly on the Web using passwords. There are two methods a server can use to require that a user submits a username/password to enter a (part of a) Web site.

  1. By sending a 401 "authorisation required" response back to the client before a username/password is first sent. This results in a small dialog box that the user must complete to gain entry.
  2. By requiring users to submit their username/password through a normal HTML form before entering the site.

Both methods are widely used, but method 2 is probably preferable.

With both methods, the server has to be able look up a file or database to test that the username/password corresponds to some registered user.

With method 1, the Web server administrator, or site maintainer, can specify that a username/password is required to read pages in a particular directory. The username and password entered are assigned to the PHP variables $_SERVER["PHP_AUTH_USER"] and $_SERVER["PHP_AUTH_PW"], respectively. This simple script indicates how these variables can be tested and set, and this script demonstrates that the values set are remembered.

Note that these username/password pairs are sent in clear text. To send them in encrypted form, either use "WWW-Authenticate: Digest" instead of "WWW-Authenticate: Basic" or use protocol https instead of protocol http.

If you know a username/password pair is required, you can enter it directly into the URL in the form http://username:password@host/path, and many bookmarks are stored in this form. Again, to encrypt the username/password pair, use https instead of http.

With the second method, the username and password must be read from a form's fields as normal before checking the user is authorised to proceed. Passwords should be entered in an element of type "password". This hides the password from a reader, but still transmits it in clear text. Again, to encrypt the username/password pair, use protocol https in the form's action.

With both methods, the password (and username) must be stored in encrypted form (as databases should be assumed to be insecure). Suppose the table of users is declared in SQL as follows:

CREATE TABLE users (
  username varchar(10) NOT NULL,
  password varchar(10) NOT NULL,
  PRIMARY KEY (username),
  KEY (password));

The username must be unique (in many applications, a user's unique email address is used as the primary key); the (encrypted) password should also be unique to support indexing for efficient checking.

The PHP one-way function crypt() encrypts the plain text password, optionally using a second parameter, so that it cannot be decrypted:

string crypt(string plaintext, [, string salt])

Using the first two characters of the username as the salt ensures that even if two users choose the same password it is encrypted differently for each of them.

The following function uses this idea to update a password in the users table. A similar method could be used to insert a username/password pair in the first place.

<?php
function updatePassword($connection, $username, $password) {
  $salt = substr($username, 0, 2);
  $storedPassword = crypt($password, $salt);
  $query = "UPDATE users" .
           "  SET password = '$storedPassword'" .
           "  WHERE username = '$username'";
  $results = mysql_query($query, $connection) or showerror();
}
?>

To authenticate a user against this table, the following function could be used.

<?php
// Returns true if $username/$password matches, else false
function authenticate($connection, $username, $password) {
  if (! isset($username) || ! isset($password))
    return false;
  $salt = substr($username, 0, 2);
  $encryptedPassword = crypt($password, $salt);
  $query = "SELECT password FROM users" .
           " WHERE username = '$username'" .
           " AND password = '$encryptedPassword'";
  $result = mysql_query($query, $connection) or showerror();
  return (mysql_num_rows($result) == 1);
}
?>

A useful function like this might be put in a separate file authentication.php for inclusion in several different scripts. For example, it could then be used as follows.

<?php
include "authentication.php";

if (! ($$connection = mysql_connect($host, $user, $password)))
    die("Could not connect to database.");
if (! (mysql_selectdb($database, $connection)))
    showerror();
if (! authenticate($connection, $_SERVER["PHP_AUTH_USER"], 
                                $_SERVER["PHP_AUTH_PW"])) {
    header("HTTP/1.1 401 Authorisation required");
    header("WWW-Authenticate: Basic realm=\"Private realm\"");

    print "Message for username/password dialog box";
} else {
?>
<html><body>
HTML response for authorised users
</body></html>
<?php
}
?>

A one-way function like crypt() is only useful for encrypting data such as passwords and PINs. It's not suitable, e.g., for credit card numbers, as these need to be visible to banks and credit card companies. A different kind of encryption scheme is used for such purposes.

Registration and login

In deciding where and how to ask users to register or login, follow best practice (e.g., amazon.com, ebay.com.au). Place registration links and login forms at standard places on the page. As stated above, don't require users to login unnecessarily. When you do require users to login, take them to the natural page afterwards (e.g., if required to login to bid, the following page should be the normal "after bid" page.

Registration scripts should send a system-generated password to the user's email address to prevent others maliciously registering the user. It must always be possible for a user to request his or her password (if forgotten) or to request a new password (which is safer). It must always be possible for a user to change his or her password (and other details) on a personal details page.

Notice how badly the wedding gift registry example from the previous lecture handles registration and login!

Validation and error reporting

Review the notes on input validation from the previous lecture.

If user input to HTML forms has errors, appropriate complete messages should be displayed with the redisplayed form, not on a separate message page.

To do this, HTML form pages need to be PHP scripts, not HTML pages. Each such script must have one or more error message variables, which are empty when the script is called normally, and nonempty when the script is recalled after the form-processing script has detected an error.

The structure of such a page must thus have the following structure:

<?php
$message = $_GET["message"];
if (! empty($message)) {
    echo $message;
}
?>
<form ...> ... </form>

The form-processing script must have the corresponding structure:

<?php
$error = "";
// Suppose $name is a required field
$name = $_GET["name"];
if (empty($name) || $name = "") {
    $error .= "A name value must be provided\n<br>";
}
// Check other fields
...
if ($error != "") {
    $referer = $_SERVER['HTTP_REFERER'];
    header("Location: $referer?message=$error");
    exit;
}
?>

Remember: The header() function must be called before this script outputs any other text at all.

A better solution would return the initial form values, via the form-processing script, back to the form, so the user doesn't have to reenter them. To do this, the header() call in the form-processing script should have the form:

header("Location: $referer?message=$error&name=$name&address=$address");
and the form script should have the form:
<?php $name = $_GET['name']; $address = $_GET['address']; ?>
<form method="post" action="script.php">
  <input type='text' name='name' value=<?='$name'?>>"
  <input type='text' name='address' value=<?='$address'?>>"
  ...
</form>
A better implementation of this solution uses session variables (described below) instead of (potentially very long) query components in the form URL. In the form-processing script, if one or more errors occur, store an array of error messages as one session variable and an array of form values as a second session variable. In the form itself, if the error array is not empty, use the error array elements to display messages above selected form elements and the form values in the second session variable as initial values of the form elements themselves. See the groups example described in a later lecture.

State maintenance

HTTP is a stateless protocol, so special techniques are required to maintain state information (such as the identity of the user or the contents of the shopping cart) between successive HTML pages or PHP scripts. Two of the most important techniques are the use of cookies and (PHP) session variables.

Cookies

Cookies are small, limited duration, text files stored on the client, and are used to maintain state associated with a particular Web site. They may be accessed directly using JavaScript on the client (to avoid unnecessary network traffic and server load) or transferred to PHP scripts on the server.

It is quite easy to create, access and update cookies using PHP scripts. Here is a simple script that creates and uses cookies.

<?php
$count = $_COOKIE['count'];
$start = $_COOKIE['start'];
if (! isset($count)) {
  $count = 0;
  $start = time();
  // Send a cookie "start=current time" to the client
  setcookie("start", $start, time()+600, "/", "", 0);
} else {
  $count++;
}
// Send a cookie "count=current count" to the client
setcookie("count", $count, time()+600, "/", "", 0);
?>
<html>
<body>
  count = <?=$count?>.<br/>
  start = <?=$start?>.
  <?php $duration = time() - $start; ?>
  <p>
  This page has been accessed <?= $count ?> times.
  <p>
  This session has lasted <?= $duration ?> seconds.
  <hr>
  <a href="show.php?file=cookie.php">Source</a>
</body>
</html>

Note that the setcookie() function must be called before the script outputs anything at all, even a blank line (cf. the header("Location: ..."); function).

URLs

State may be maintained by encoding state information in URLs used to access PHP scripts. For example, in the guest book example that printed messages two per page, state was maintained using values in URLs.

Hidden components

State may also be maintained in invisible (zero-sized) frames in a page or in hidden fields in a form.

Session variables

This is the most important way in practice to maintain complex state information in PHP-based applications.

A session is a way to identify and manage the state - the session variables - for a particular user while the user is visiting different pages on a single site.

The session variables are stored together in an associative array on the server. This array is accessed by a session identifier, which is normally stored as a cookie.

Here is a simple script that performs the same function as the previous one, but using session variables instead of cookies.

<?php
  // Initialise a session.   This call either creates a new session
  // (and sets a cookie) or re-establishes an existing one.
  session_start();

  // get session variables
  $count = $_SESSION['count'];
  $start = $_SESSION['start'];
  if (! isset($count)) {
    // initialise session variables
    $count = 0;
    $start = time();
    // register session variables
    $_SESSION['count'] = $count;
    $_SESSION['start'] = $start;
  } else {
    // update and save session variables
    $count++;
    $_SESSION['count'] = $count;
  }

  $sessionId = session_id();
?>

<html>
<body>
<?php $duration = time() - $start; ?>
<p>
This page maintains session <?= $sessionId ?>. 
<p>
It has been accessed <?= $count ?> times. <br>
And has lasted <?= $duration ?> seconds.
</body>
</html>
In general, session management involves starting a session (which sends a cookie with name PHPSESSID and value session_id() to the client), registering several variables (stored in the associative array $_SESSION) on the server, using and updating these variables across multiple visits to multiple pages, and eventually finishing the session by calling the function session_destroy(). (Undestroyed sessions are automatically destroyed by the server after a period of disuse.)

At the start of each script, it is necessary to call session_start() and get the session variables explicitly from the $_SESSION array. Before the end of each script, it's necessary to save the changed session variables explicitly in the $_SESSION array. (Some additional care is required if objects are stored in the $_SESSION array.)

As described, sessions rely on the transfer of cookies to identify the session identifier. But some users turn cookies off from security concerns. In this case, we can pass a session identifier around through URLs as before.

<?php
  session_start();

  // Generate a session-specific URL
  orderURL = "order.php?PHPSESSID=" . session_id();
?>

<a href="<?= $orderURL ?>">Create order</a>

To simplify this process, PHP sets the constant SID to a string of the form PHPSESSID=be1231231acaeda123aada12309809a7. The above fragment can then be written more simply as

<?php
  session_start();
?>

<a href="order.php?<?= SID ?>">Create Order</a>

For very large scale applications, in which different requests to the same page may be processed by different servers, it is necessary to store state information in a shared database (see Williams & Lane, 2E, Appendix F).

Database transaction processing

When multiple users are concurrently accessing and updating the same database, incorrect results can be returned and incorrect updates can be performed unless special care is taken.

Examples

In these examples, two or more transactions - sequences of operations - are performed concurrently. An interleaved set of transactions - a sequence of operations from multiple transactions - is called a schedule.

Example 1: Banking

Suppose X = 100.

T1: Read X, Add 10 to X, Write X
T2: Read X, Add 20 to X, Write X

Schedule: Read X (1), Add 10 to X (1), Read X (2), Add 20 to X (2), Write X (2), Write X (1)

After this schedule, X has value 110. But after T1 - T2 (or T2 - T1), it should have value 130.

Example 2: Summation

Suppose the database contains data items X1 to X4 with values 10 to 40. Transaction T1 computes the sum of all data items in the database. Transaction T2 deletes item X1 and inserts item X5 with value 50. The sum computed by T1 should be either 100 or 140. But consider the following schedule.

Read X1 (1), Read X2 (1), Read X3 (1), Delete X1 (2), Insert X5 (2), Read X4 (1), Read X5 (1), Write Sum.

This schedule writes the Sum 150 to the database. But this does not correspond to any possible state of the database.

Example 3: User registration

Suppose transactions T1 and T2 each insert a new user (with the same name) into the database. Each transaction should have the following structure:

Get name from form;
if (name is not in the database) {
    ...
    Insert the name (and details) into the database;
}
If T1 and T2 are executed concurrently, it is possible that each will determine the (same) name is not in the database, and each will then insert the (same) name into the database, resulting in two occurrences of the name in the database.

Principles

Transactions must be executed atomically (all or nothing), leave the database in a consistent state, act in isolation (do not affect other transactions), and be durable (when completed, stay completed). A database management system that ensures transactions satisfy these conditions is called ACID-compliant.

To ensure the ACID properties are satisfied, schedules must be serialisable, i.e., even if operations of two ar more transactions (e.g., T1, T2, T3) are interleaved, the overall effect must be the same as if they were executed in some serial order (e.g., T1 - T3 - T2).

Mechanisms

Serialisation is normally implemented using locks. Locks restrict access to data items (tuples, pages, tables). A "read lock" on an item allows other concurrent processes to read the item but prevents other processes from writing the item. A "write lock" on an item prevents other processes from reading or writing the item.

To ensure serialisability, the two-phase locking protocol must be used:

Acquire all locks before releasing any locks.

To ensure schedules avoid deadlocks, are recoverable, and have other desirable properties, the strict two-phase locking protocol must be used:

Acquire all locks before releasing any locks, and maintain all write locks until the end of the transaction.

Implementation

In MySQL versions before 4.0, read and write locks had to be explicitly obtained and released. These locking and unlocking commands are sent to the database server from the PHP script. The commands are:

LOCK TABLE name READ
LOCK TABLE name WRITE
UNLOCK TABLE name
UNLOCK TABLES
Note that the only data items that can be locked are complete tables. Commercial databases perform locking at the page or row level.

In MySQL versions from 4.0 onwards, the higher level SQL commands may be used:

START TRANSACTION
COMMIT
ROLLBACK

The normal pattern of usage in a PHP script is the following:

<?php
$result = mysql_query("start transaction", $connection);
// Read and write various related tables...
if (everything is OK) {
    $result = mysql_query("commit", $connection);
    ...
} else {
    $result = mysql_query("rollback", $connection);
    ...
}
?>
Much easier! (But you still need to be careful to include all required operations inside the transaction.)

I hope these brief descriptions of different PHP programming techniques will help you find your own way around various examples, texts and manuals more easily.


Last updated: $Date: 2008/04/21 00:28:34 $, by Rodney Topor