Programming with PHP and MySQL (or SQLite)


Contents

Background

We now continue the study of how to write three-tier dynamic Web applications. Recall that the three tiers are:

In general, each tier may be running on a separate host.

We describe how to use both SQLite and MySQL databases.

SQLite is a simple, fast, reliable, widely used, free RDBMS that does not require installation or maintenance. It's widely used in embedded systems and is suitable for use in small- to medium-sized applications.

MySQL is a more fully-functioned, sometimes faster, reliable, widely used, free RDBMS that does require installation and maintenance. MySQL was recently (2008) purchased by Sun, and more recently (2010) Sun was purchased by Oracle, but MySQL will (supposedly) remain freely available.

PostgreSQL is a serious competitor to MySQL; this competition spurs improvements to both systems.

Here are some introductory tutorials on PHP and MySQL:

Here are three recommended texts, all in the library in paper and/or online form.

The PHP Manual (mirrored here) is a critical resource. Note particularly the sections on SQLite functions and MySQL functions. The version of PHP installed and documented on dwarf is 5.2.3, which is quite a recent version.

The SQLite Documentation is a critical resource. SQLite 3.3.6 is installed on dwarf.

The MySQL Manual (mirrored here) is another critical resource. MySQL 5.0.77 is installed on dwarf.

Recall. To use PHP on dwarf, you must place all HTML pages and PHP scripts inside a directory public_html in your home directory. The directory public_html (and any enclosed directories) must have 705 permission. All HTML pages and PHP scripts must have 604 permission. All PHP scripts must end with the suffix .php. To access a PHP script test.php inside the public_html directory, give the URL http://dwarf.ict.griffith.edu.au/~s123456/test.php to your browser.

Note. See Lab 4 for information on MySQL accounts and passwords, and for command line access to SQLite and MySQL databases, and for Web access to MySQL databases using phpMyAdmin.

Programming with databases

First review the previous introductions to HTML, PHP, Smarty and SQL.

Possible approaches

There are at least two different ways to build a database backed Web site with PHP or similar languages:

  1. Use a data(base) abstraction layer such as PDO which allows you access any of several RDMBSs in a uniform way. This allows you to retarget your application between SQLite, MySQL or Oracle with minimal effort.
  2. Use an RDBMS-specific set of function calls such as the SQLite or MySQL (or Oracle) function calls.

We illustrate both approaches below.

Using PDO

(This section may be omitted on a first reading.)

PDO is one of several PHP data abstraction layers that allow database-backed Web applications to be written independently of which particular RDBMS is used.

SQLite 3

Here is SQLite-backed guest book application implemented using PDO and a SQLite 3 database to store messages on the server.

This example checks that all database function calls succeed, uses Smarty, stores function definitions in a separate file, and protects against cross-site scripting attacks (by escaping user input on output), all of which are good practice. It does not, however, protect against SQL injection attacks.

The data is stored in a SQLite 3 database, messages.db, which is a regular file. This database file and the directory containing it must both have permission 707.

The SQL "create table" statements are stored in a separate file, messages.sql, in the application directory for convenient access.

To initialise the database, give the Linux command "sqlite3 messages.db" from the dwarf command line, and then load the SQL file with the SQLite command ".read messages.sql".

To protect against SQL injection attacks, use the PDO::prepare() and PDOStatement::execute() statements described in the PDO section of the PHP manual.

MySQL

Here is MySQL-backed guest book application implemented using PDO and a MySQL database to store messages on the server.

Here, to initialise the database, launch the MySQL server from the dwarf command line, and then load the SQL file with the MySQL command "source messages.sql".

Note. The implementation of this application is identical to the preceding SQLite-backed application, except for the definitions of function open_connection() and file messages.sql.

Using the MySQL API in PHP

From now on, our examples will use the more widely-used MySQL-specific database access functions provided in PHP.

A first example

Here is a first database-backed guest book application that stores messages in a MySQL database on the server and uses the MySQL API to access the database. This example checks that all database function calls succeed, stores function definitions in a separate file, and uses Smarty, all of which are good practice.

The application uses the function mysql_fetch_array() to iterate through the result set of a query; each call returns a row of the result as a PHP associative array, which allows programmers to refer to the columns of each row by name. Note also that Smarty requires arrays not result sets, so it is necessary to construct arrays corresponding to result sets, an unfortunate complication.

Again, the SQL "create table" statements are stored in a separate file, here messages.sql, in the application directory for convenient access.

The data itself is stored in a MySQL database owned by a particular user. Details of the user and database are specified in the (private) file mysql.php, which contains definitions of the constants HOST, NAME, PASSWORD and DATABASE necessary to use the previously created tables. It has the form:

<?php
define("HOST", "mysql.ict.griffith.edu.au"); // The database server's host
define("USER", "xxxxxx"); // The database owner's name
define("PASSWORD", "yyyyyy"); // The database owner's password
define("DATABASE", "zzzzzz"); // The database's name
?>

In practice, this file should be stored outside public_html for security, but for teaching and assessment purposes it must be stored inside your application directory, e.g., in the includes subdirectory.

Note that the Smarty templates uses an escape filter to prevent any HTML or JavaScript in user-entered text from being interpreted. However, this application does not protect the system from ``SQL injection attacks'' (see below).

This application corresponds roughly to the one in Autrey's tutorial, except that it uses better style and Smarty templates.

A second example

Here is a slightly cleaned-up implementation of Yank's jokes database example.

The main new feature it illustrates is the ability to delete rows from a database table by passing row ids in the query part of a URL.

The structure of this example is a little complicated because it uses combined scripts and a single one-component script to both add and delete jokes.

General principles

Every interaction between PHP (or another server-side scripting language) and a database mannagement system (whether using a data abstraction layer or a database-specific API) has the form

  1. Establish a connection to the database server.
  2. Select a database on the database server.
  3. Send a query to the database server.
  4. Process the results of the query one row at a time.
  5. Close the connection to the database server.

Each of these operations may fail or result in an error, and error checking must be performed.

When using PDO, the first two operations are combined.

Normally, each such sequence of operations (a transaction) is encapsulated in a PHP function that does a single well-defined task.

Often, unlike these first two examples, the results of database queries are returned to the client in HTML tables. We'll see examples of this later.

Common patterns

Many patterns occur repeatedly in different contexts:

Here is a simple incomplete application that illustrates the first two of these patterns.

We illustrate these patterns in this simple context.

Our illustrations of the first few patterns assume a table items defined as follows:

drop table if exists items;

create table items (
    id int not null auto_increment primary key,
    summary varchar(80) not null,
    details text default ''
);

List-detail pattern

The list-detail pattern consists of two scripts:

Obviously, you can use different names for the scripts (as long as their purposes are clear) and you can interpret "summary" and "details" differently in each application.

item_list.php

$item_list = get_items(); // an array of all items (more generally, of some items)

$smarty = new Smarty;
$smarty->assign("item_list", $item_list);
$smarty->display("item_list.tpl");

item_list.tpl

<ul>
{foreach $item_list as $item}
    <li><a href="item_detail.php?id={$item.id}">{$item.summary}</a></li>
{/foreach}
</ul>

The item id is the chain that connects the item_list.php script with the item_detail.php script.

item_detail.php

$id = $_GET['id'];

$item = get_item($id); // the item with that primary key

$smarty = new Smarty;
$smarty->assign("item", $item);
$smarty->display("item_detail.tpl");

item_detail.tpl

<h1>{$item.summary}</h1>

<p>{$item.details}</p>

CRUD pattern

CRUD stands for Create-Retrieve-Update-Delete. We discuss retrieval below.

(For simplicity, we ignore input validation in this section.)

add_item form

(Forms seem to look best as tables, but not shown here.)

<form method="post" action="add_item_action.php">
    Summary: <input type="text" id="summary"> <br>
    Details: <textarea id="details"></textarea> <br>
    <input type="submit" value="Add item">
</form>

add_item_action.php

$summary = $_POST['summary'];
$details = $_POST['details'];

$id = add_item($summary,$details);

if ($id) {
    header("Location: item_detail.php?id=$id");
    exit;
}

Definition of add_item()

function add_item($summary,$details) {
    $connection = mysql_open();
    $summary = mysql_escape_string($summary);
    $details = mysql_escape_string($details);
    $query = "insert into items (summary, details) " .
             "values ('$summary', '$details')";
    $result = mysql_query($query, $connection) or show_error();
    $id = mysql_insert_id();
    mysql_close($connection) or show_error();
    return $id;
}

Modified item_detail.tpl

<h1>{$item.summary}</h1>

<p>{$item.details}</p>

<p><a href="update_item.php?id={$item.id}">Update</a></p>

<p><a href="delete_item.php?id={$item.id}">Delete</a></p>

It's always best to put update and delete operations on the details page, not on the list page. Why?

update_item.php

$id = $_GET['id'];
$item = get_item($id);

$smarty = new Smarty;
$smarty->assign("item", $item);
$smarty->display("update_item.tpl");

update_item.tpl

<form method="post" action="update_item_action.php">
    <input type="hidden" id="id" value="{$item.id}">
    Summary: <input type="text" id="summary" value="{$item.summary}"><br>
    Details: <textarea id="details">{$item.details}</textarea><br>
    <input type="submit" value="Update item">
</form>

update_item_action.php

$id = $_POST['id'];
$summary = $_POST['summary'];
$details = $_POST['details'];

update_item($id,$summary,$details);

header("Location: item_detail.php?id=$id");
exit;

Definition of update_item()

function update_item($id,$summary,$details) {
    $connection = mysql_open();
    $summary = mysql_escape_string($summary);
    $details = mysql_escape_string($details);
    $query = "update items " .
             "set summary = '$summary', details = '$details' " .
             "where id = $id";
    $result = mysql_query($query, $connection) or show_error();
    mysql_close($connection) or show_error();
}

delete_item.php

$id = $_GET['id'];

delete_item($id);

header("Location: item_list.php");
exit;

Definition of delete_item()

function delete_item($id) {
    $connection = mysql_open();
    $query = "delete from items where id = $id";
    $result = mysql_query($query, $connection) or show_error();
    mysql_close($connection) or show_error();
}

Example

Here is the simple incomplete application that combines the list-detail and CRUD patterns into a single, small application. The source code of the application is available in the zipped archive list-detail.zip. You can download this archive, unzip it, complete it, transfer it to dwarf, test it, and modify it for your own purposes.

User-defined queries

Now, let's return to the guest book example. Study this database-backed guest book application with queries. In this version, the structure of the application has been changed so that you only see the entries if you ask for them. More importantly, you can now query the guest book by name and/or email to see only the entries with the given name and/or email value.

The application now consists of four main scripts: index.php (home page), search.html (display search form), results.php (run query and display resulting entries), and add_message.php (display new message form and process form data to add new message). Note that browsing the entries is equivalent to executing a query with no name or email specified. Study the relationship between these four scripts. Note that add_message.php is a combined script. Documenting the relationship between the scripts in a dynamic Web application is important.

This version of the application demonstrates:

Note that this version includes an alternative search form using a selection that allows users to search for one of name and email rather than both of name and email.

Exercise. Write a third search form and corresponding PHP script that contains a single text field and returns all messages that contain the field value in either name or email.

Here is a transition diagram that documents the relationship between the different scripts that make up this application and the information that is passed between scripts. Transition diagrams are discussed in more detail below.

Selection pattern

Often when creating or updating an item, you want to select one field value from a given list of values, without requiring the user to enter the value (and possibly making an error). The list of values may be retrieved from the database. We use the selection element in a form to do this.

For example, in an "add item" form, we might write:

<select id="category">
    <option value=-1> Enter category </option>
    <option value=0> Books </option>
    <option value=1> Music </option>
    <option value=2> Video </option>
</select>

By default, the first option is the one displayed initially.

In the action that receives the form data, it is the value that is associated with the key category in the global array $_POST.

See the personal details example to understand how this works.

In practice, the list of values comes from a database. For example, suppose each item also has an integer field category, a reference to the key of a separate categories table:

drop table if not exists categories;

create table categories (
    id int not null auto_increment primary key,
    name varchar(40) not null unique
);
Then, we would need to look up the database to find the list of options in the add_item.php script:
$categories = get_categories();

$smarty = new Smarty;
$smarty->assign("categories", $categories);
$smarty->display("add_item.tpl");

And we would use this list in the add_item.tpl template:

<select id="category">
    <option value=-1> Enter category </option>
    {foreach $categories as $category}
        <option value={$category.id}> {$category.name} </option>
    {/foreach}
</select>

If this form occurred in the update_item.tpl template, then we would have to identify the option for the current item. Again, this assumes we have passed the value of the item to be updated into the template.

<select id="category">
    {foreach $categories as $category}
        {if ($category.id == $item.category)}
            <option value={$category.id} selected> {$category.name} </option>
        {else}
             <option value={$category.id}> {$category.name} </option>
        {/if}
    {/foreach}
</select>

Obviously, there's no need for an "Enter category" option when updating.

If you are using XHTML, you must write selected="selected", instead of simply selected.

Data validation

Data entered by users, for example in forms, must be validated to ensure required data is present and that all data, as much as possible, has the correct form. This simplifies all subsequent code, by avoiding repeated tests that data has the correct form.

Data may be validated on the client or on the server. Client validation has the advantage of faster response and reduced network traffic, but the disadvantage of being unreliable (users can bypass forms) and having less access to required information (e.g., database information). Therefore, all data must be validated on the server, and may also be validated on the client to reduce load on shared resources (network, server) and to provide faster response to users.

In each case it's important to remember that user data can't be trusted (e.g., it may contain arbitrary HTML, JavaScript, PHP and SQL code scripts that could do damage if executed on the server or the client). So data must be cleaned before being processed. Basic advice on data cleansing is provided above.

Validation involves two tasks: checking form data for correctness, and reporting errors to users. It's possible to check each field separately and report errors one at a time, but it is much better (from the user's viewpoint) to check all fields first and to report all errors at the same time.

It's also desirable to report errors in the context where they occur, and to provide user-entered data so the user does not have to re-enter valid data. The best way to do this is using session variables, which have not yet been described.

Validation on server

PHP stores the field names and values of a form in the associative array $_POST or $_GET (depending on which method was used). So the form data may be processed in a script as follows. This script accumulates errors generated by different fields in a single string, errors, which is then displayed to the user. The script assumes the form has name, postcode and email fields.

<?php
  // Initialise the error string
  $errors = "";

  // Clean the form values
  $form_vars = array();  // cleaned copy of $_POST
  foreach ($_POST as $var => $value)
    $form_vars[$var] = clean($value);

  // Validate the name
  if (empty($form_vars["name"]))
    $errors .= "\n<br>The name cannot be blank.";

  // Validate the postcode
  if (empty($form_vars["postcode"]) ||
    !preg_match("/^[0-9]{4}$/", $form_vars["postcode"]))
    $errors .= "\n<br>The postcode must contain four digits.";

  // Validate the email address (approximately)
  if (empty($form_vars["email"]) ||
    !preg_match("/^[a-zA-Z0-9_]+@(([a-zA-Z0-9_]+).)+[a-zA-Z]+$/", $form_vars["email"]))
    $errors .= "\n<br>The email address has the wrong form.";

  if (!empty($errors)) {
      $smarty = new Smarty;
      $smarty->assign("errors", $errors);
      $smarty->display("form_error_page.tpl");
  } else {
      // Process the valid form data...
  }

Function clean() could do different things depending on the application, but the simplest version just truncates its argument:

function clean ($input) {
    $input = substr($input, 0, MAX_LENGTH);
    return $input;
}

Script form_error_page.tpl should display the error string errors and a link to go back to the form page.

Function preg_match()() is used for recognising regular expressions in data; it's very useful, and there are many variations and related functions, e.g., preg_match("/.../i",target) for case-insensitive matching.

Validating dates is important and PHP provides many functions to support date processing.

Additional techniques are required for form components such as selections which may have multiple values selected.

Such a script could potentially look up a database to check that entered data was consistent with stored data.

An improved method would pass the generated string back to the form and display it at the top of the form. This requires forms to be PHP scripts, not HTML documents. For example, if the form were in the template form.tpl, then we should modify the above code at the end as follows:

if ($errors) {
    $referer = $_SERVER['HTTP_REFERER'];
    header("Location: $referer?errors=$errors");
    exit;
} else {
    ...
}

(It would be a mistake to display the form template directly here, even with the error message.)

We should make the HTML page containing the form a PHP script, form.php, say, which contains the code:

$errors = $_GET['errors'];
...
$smarty->assign("errors", $errors);
$smarty->display("form.tpl");

And we should modify the template form.tpl that contains the form to include the error string, if any, immediately before the form:

{if $errors}
  <p>{$errors}</p>
{/if}

<form ...> ... </form>

This method still has the problem that all messages are displayed together. It's better to put the error message for each invalid field next to that field. This requires a separate error string for each field, e.g., name_error, postcode_error and email_error. The value of each of these variables must be passed to the form script separately:

header("Location: 
$referer?name_error=$name_error&postcode_error=$postcode_error&email_error=$email_error");
And each error must be displayed in the template separately:
<form ...>
  {if $name_error} <p>{$name_error}</p> {/if}
  <input type="text" id="name">
  ...
</form>

An even better method would ensure that when you return to the form some or all of the fields still contain what the user entered, to avoid irritating retyping. The best way to do this is using session variables, which we cover later.

Validation on client

This will be studied in more depth later when JavaScript is introduced. Here is a very simple script that enables the client to validate user data and submit it to the server if it is OK. The essence of the script is the following.

<html>
<head>
<script type="text/javascript">
<!--
function containsSpace(s) {
  if (s.indexOf(" ") >= 0) {  // s contains a space character
      alert("The field must not contain any spaces.");
      return false; // don't submit form to server
      }
  }
  return true; // submit form to server as usual
}
-->
</script>
</head>

<body>
<form onSubmit="return(containsSpace(this.user.value));"
      method="post" action="test.php">
  User: <input type="text" id="user" size=10>
  <input type="submit" value="Submit">
</form>
</body>
</html>

In practice, JavaScript function definitions should be stored in external files, like external style sheets.

There is no need for you to use JavaScript to perform client-side input validation...yet.

Data sanitisation

In addition to ensuring that some input data has the correct form, we also have to ensure that unrestricted input data (e.g., names, addresses, comments) cannot damage the server, access restricted information, or cause other unintended effects. How to prevent such unintended effects requires input data cleansing or sanitisation.

A first general principle is to restrict the length of the input using the substr() function:

$email = substr($_GET['email'], 0, MAX_LENGTH);

It's also often important to remove leading and trailing spaces with the trim() function, and to convert strings to upper or lower case before comparison with the strtoupper() and strtolower() functions.

The next general principle is to escape "dangerous" characters in the input depending on how it is to be used.

Now, one should not simply apply htmlspecialchars(), escapeshellcmd() and mysql_escape_string() to every user-entered string. First, they are not always all required. Second, the order is important, and tricky to get right.

Here is a simple set of practical guidelines:

Developments

The original MySQL API used in these notes and examples has now been deprecated in favour of the MySQL Improved Extension (which may be used in either an object-oriented or procedural style). Developers are encouraged and will soon be required to use the improved extension, but students are not required to do so for this course.

The function mysql_escape_string() (resp., mysqli_escape_string() is deprecated in favour of the better function mysql_real_escape_string() (resp., mysqli_real_escape_string().

However, the best way to sanitise user input to prevent SQL injection attacks is to use SQL prepared statements and the functions mysqli_prepare(), mysqli_stmt_bind_param(), mysqli_stmt_execute() and so on. Again, it's OK to just use the function mysql_escape_string() for this course.