Using Perl DBI to publish your data on the web

Contents

Introduction

This Perl DBI tutorial describes how to include data from your database in HTML pages by using the Common Gateway Interface (CGI).

There are many ways to publish your data in a web page. This tutorial uses CGI, Perl, and HTML::Template. There are many other methods you can use including mod_perl (which provides performance optimisation, connection caching, and so on).

Prerequisites

Before you start this tutorial, you need to ensure you have satisfied all the prerequisites:

  1. Perl

    We used Perl 5.8, but you only need the minimum required by the DBI and DBD::ODBC modules . Use perl --version to find out your Perl version.

  2. DBI module

    We used DBI 1.45, but this tutorial should work with anything after 1.40. To find out if you have a recent enough version of DBI, run:

      perl -e 'use DBI 1.40;'

    If you get an error like "DBI version 1.40 required--this is only version 1.30 at -e line 1," you need to upgrade DBI.

    If you get an error saying "DBI cannot be found in @INC," you've probably not got DBI installed.

    Go to CPAN to get an up-to-date version of the DBI module.

  3. DBD::ODBC

    We used DBD::ODBC 1.11. To check you have the DBD::ODBC module installed:

      perl -e 'use DBD::ODBC;'
    

    If you don't have DBD::ODBC, refer to Enabling ODBC support in Perl with Perl DBI and DBD::ODBC for installation instructions.

    To display the DBD::ODBC version:

    perl -MDBD::ODBC -e 'print $DBD::ODBC::VERSION;'

    To display all drivers that DBI knows about and their versions:

     perl -MDBI -e 'DBI->installed_versions;'

    Go to CPAN to get an up-to-date version of the DBD::ODBC module.

    You don't have to use DBD::ODBC for this tutorial. Any other DBD driver under Perl should work.

  4. ODBC driver and ODBC Driver Manager

    If you use DBD::ODBC, you'll also need an ODBC driver. We recommend you use an ODBC Driver Manager under DBD::ODBC. Refer to Enabling ODBC support in Perl with Perl DBI and DBD::ODBC. We provide ODBC drivers for many databases and operating systems and all our ODBC drivers include the unixODBC Driver Manager.

    You probably have the unixODBC Driver Manager installed if you have the odbcinst command. For Easysoft ODBC drivers, the unixODBC Driver Manager is located in /usr/local/easysoft/unixODBC and the odbcinst command in the bin subdirectory of that path. We used unixODBC 2.2.9 in this tutorial. You can find out your unixODBC version with:

    odbcinst --version
  5. Web server

    To follow this tutorial, we recommend that you use a web server like Apache. (Although you could redirect the Perl script output to a file and view the file in a web browser instead, if you don't have a web server.)

    If you use a web server, you'll need to enable CGI and tell the web server that files ending in the .pl extension are CGI scripts. Consult your web server documentation for details.

The basics of CGI

The Common Gateway Interface (CGI) is a method by which a web server runs programs on the server. These programs return HTML output to the web server. The output is then supplied to the web browser. The CGI program is in the URL requested by the web browser. The CGI program generates dynamic HTML pages, differentiating such pages from static HTML pages.

CGI also defines a method by which a browser can supply information to the program to be run. This information can be GETDATA (elements in the URL that are interpreted as variable declarations) or submitted form data (more of this later).

In Perl, the most basic program is:

#!/usr/bin/perl -w
print "content-type: text/html\n\n";
print "
<html>
<body>
Hello\n";

The first thing to note here is the content-type line. This tells the web server that content returned by this CGI script is of MIME type text/html. Next, two newline characters split the header information from the HTML output. The first thing a CGI program must do is provide the CONTENT-TYPE to the web server and the web server recognises the end of the HTTP header information by the double newline. In fact, this is probably one of the most common mistakes in CGI: not outputting the CONTENT-TYPE first. For example:

#!/usr/bin/perl -w
print "
<html>
<body>
Hello\n";

generally outputs a server error like:

Error message:
malformed header from script.
Bad header=<html><body>Hello</body></html:
/http/htdocs/tut4_2.pl 

The second thing to note is that the CGI program (xxx.pl) should be executable and contain the #! line to indicate it needs to be passed to the Perl interpreter. Alternatively, you can associate the .pl extension with CGI in your web server configuration files. Generally, if you don't do this, your web server will treat the document as static content and display the content of your Perl script instead of the result of running it.

Basic program that displays data by using CGI

So now you understand the basics of CGI, we now can combine a small CGI script with some Perl to display the contents of a database table from your web server. Since the content-type is text/html all the Perl output must be HTML. Suppose you have the table TEST containing 2 columns, productcode (integer) and product (char(32)):

#!/usr/bin/perl -w
use DBI;

# output the content-type so the web server knows

print "content-type: text/html\n\n";
print '<html><head><title>Basic CGI</title><head><body>';

#
# you should change the connect method call to use the DBD you are
# using. The following examples all use DBD::ODBC and hence the
# first argument to the connect method is 'dbi:ODBC:DSN'.
#

my $dbh = DBI->connect('dbi:ODBC:test', 'dbuser', 'dbauth');
my $sql = q/select productcode, product from TEST/;
my $sth = $dbh->prepare($sql);
$sth->execute;
print '<table border="1">\n';

# table headings are SQL column names

print "<tr><th>$sth->{NAME}->[0]</th><th>$sth->{NAME}->[1]</th></tr>";
while (my @row = $sth->fetchrow_array) {
    print "<tr><td>$row[0]</td><td>$row[1]</td></tr>\n";
}
print "</table>\n";
print "</body></html>\n";

This does the job and you can write all your Perl CGI like this, but:

Improving the basic program

Adding the CGI module to the basic program

The CGI module, which you can find on CPAN, can help with a lot of the CGI processing. Here we introduce it rather simply to take care of the mime-type output, the <HTML>, <BODY> tags, the title, and the tags to end the HTML.

Our simple program becomes:

#!/usr/bin/perl -w
use DBI;
use CGI;

my $cgi = new CGI;

# output the content-type so the web server knows

print $cgi->header;

# output the start HTML sequence with a title

print $cgi->start_html(-title=>'Basic CGI');

#
# you should change the connect method call to use the DBD you are
# using. The following examples all use DBD::ODBC and hence the
# first argument to the connect method is 'dbi:ODBC:DSN'.
#

my $dbh = DBI->connect('dbi:ODBC:test', 'dbuser', 'dbauth');
my $sql = q/select productcode, product from TEST/;
my $sth = $dbh->prepare($sql);
$sth->execute;
print $cgi->table({border=1});

# table headings are SQL column names

print "<tr><th>$sth->{NAME}->[0]</th><th>$sth->{NAME}->[1]</th></tr>";
while (my @row = $sth->fetchrow_array) {
    print "<tr><td>$row[0]</td><td>$row[1]</td></tr>\n";
}
print $cgi->end_table;
print $cgi->end_html;

Here, we have used the CGI as follows:

The CGI module is a lot more powerful than this and you could output the whole table in one go by combining CGI's Tr method and DBI's fetchrow_xxx method. The next section shows a totally different method, which allows you to separate your HTML from the data you extract from the database.

Adding the HTML::Template module to the basic program

The HTML::Template module (find it on CPAN) allows you to separate HTML from the data and metadata you extract from your database. This has the advantage that you can have multiple people working independently on the Perl script and the HTML.

First, you create a template that contains HTML and markers that HTML::Template substitutes with data and metadata that you extract from your database. Then you load the template and set the variables you want to substitute in the template and finally you output the combination of your template and the substituted variables.

Our template for the basic program we started with is:

<html>
<title><TMPL_VAR name=TITLE></title>
<body>
<table border="1">
<tr>
<TMPL_LOOP NAME=HEADINGS>
<th><TMPL_VAR NAME=HEADINGS></th>
</TMPL_LOOP>
</tr>
<TMPL_LOOP NAME=ROWS>
<tr><td><TMPL_VAR NAME=PRODUCTCODE></td><td><TMPL_VAR NAME=PRODUCT></td></tr>
</TMPL_LOOP>
</table>
</body>
</html>

We save this template as tut4_5.tpl.

TMPL_VAR NAME=xxx names the variables we're going to substitute. When we want to do a substitution in the template, we pass the name of the TMPL_VAR and its value to HTML::Template. For loops (TMPL_LOOP) we need to pass the name of the loop (for example, HEADINGS) and then an array reference. Each element of the array is a reference to a hash where the keys are the names of the TMPL_VAR variables in the loop.

Our program now becomes:

#!/usr/bin/perl -w
use strict;
use DBI;
use HTML::Template;

# output the content-type so the web server knows

my $template = HTML::Template->new(filename => 'tut4_5.tpl');
$template->param(TITLE=>'Basic CGI');

#
# you should change the connect method call to use the DBD you are
# using. The following examples all use DBD::ODBC and hence the
# first argument to the connect method is 'dbi:ODBC:DSN'.
#

my $dbh = DBI->connect('dbi:ODBC:test', 'db_user', 'db_pass');
my $sql = q/select productcode, product from TEST/;
my $sth = $dbh->prepare($sql);
$sth->execute;

#
# create an array where each element is a reference to a hash
# where the keys are the name of the TMPL_VAR we want to substitute.
# In this case the TMPL_LOOP is HEADINGS  and the name of the hash key
# is HEADING. i.e. we have
# $headings[0] = {HEADING=>'value'}
# $headings[1] = {HEADING=>'value'}
#

my @headings;
foreach (@{$sth->{NAME}}) {
        my %rowh;
        $rowh{HEADING} = $_;
        push @headings, \%rowh;
}
$template->param(HEADINGS=>\@headings);

#
# For the rows we know the TMPL_VAR names are PRODUCT and PRODUCTCODE.
# The loop TMPL_LOOP is called ROWS.
# For each ROW we need a hash where the keys are PRODUCTCODE and PRODUCT.
# @rows is an array with each element a reference to a ROW and it has as
# many elements as there are rows in the query result.
#

my @rows;
while (my @data_row = $sth->fetchrow_array) {
        my %row;
        $row{PRODUCTCODE} = $data_row[0];
        $row{PRODUCT} = $data_row[1];
        push @rows, \%row;
}
$template->param(ROWS=>\@rows);
print "Content-Type: text/html\n\n", $template->output;

Although, the Perl looks more complex, we have managed to completely separate the HTML from the data. So long as the SQL query or TMPL_LOOP/TMPL_VAR names don't change, one person can edit the style in the template as much as they want without needing to reference the Perl script.

This script can be simplified even further. If the TMPL_VAR variable names match the column names in the query, we can use the fetchrow_arrayref method, but we have avoided this here to illustrate the basic principle.

Altering the program to produce different results

Altering the program to produce different results based on the URL

So far, we have produced a basic CGI program that displays HTML output based on results from an SQL query. But what if we want to use the same script to produce different results depending on the URL?

CGI supports what are called GETDATA arguments. This is where the URL includes the names of CGI variables and their values. For example, http://myserver/myscript.pl?var=value. In this example, the Perl script can obtain the var=value and use it in the SQL query to produce a result dependent on the URL. The CGI module can automatically handle the GETDATA arguments such that you can either have the variable var created automatically for you (we advise against this) or you can ask the CGI module for the value of var. In this way, you can tailor your database query to the requested URL. For example, a user enters http://myserver/myscript.pl?productcode=fred to get a list of the product codes called fred. Extending the HTML::Template example from the previous section, we need to add code to query the value of the productcode GETDATA argument then substitute this into our query:

#!/usr/bin/perl -w
use strict;
use DBI;
use HTML::Template;
use CGI;   # we add CGI to get hold of the GETDATA arguments

my $cgi = new CGI;

# output the content-type so the web server knows

my $template = HTML::Template->new(filename => 'tut4_6.tpl');
$template->param(TITLE=>'Basic CGI');

#
# Get productcode GETDATA argument
# What happens here is if the URL ends in "?productcode=xxx" then
# $cgi->param('productcode') returns "xxx"
#

my $productcode = $cgi->param('productcode');
$productcode = '%' if (!defined($productcode));
my $dbh = DBI->connect('dbi:ODBC:test', 'dbuser', 'dbpass');

# change the SQL to select the productcode = the value of the GETDATA
# argument productcode

my $sql = q/select productcode, product from TEST/;
$sql .= q/ where f1 like ?/;

#
# the rest of this script is the same as the previous example
# except we pass the parameter for the SQL query
#

my $sth = $dbh->prepare($sql);
$sth->execute($productcode);

my @headings;
foreach (@{$sth->{NAME}}) {
        my %rowh;
        $rowh{HEADING} = $_;
        push @headings, \%rowh;
}
$template->param(HEADINGS=>\@headings);
my @rows;
while (my @data_row = $sth->fetchrow_array) {
        my %row;
        $row{PRODUCTCODE} = $data_row[0];
        $row{PRODUCT} = $data_row[1];
        push @rows, \%row;
}
$template->param(ROWS=>\@rows);
print "Content-Type: text/html\n\n", $template->output;

Altering the program to handle form data

What is form data? An HTML form contains fields that let the user enter text, click on check boxes, or select from a list and then submit the form by clicking on the form submit button. The form submit button is linked to a URL that the browser sends as an HTTP GET request and the web server supplies the form data on stdin (standard input) for the CGI program. The CGI program reads stdin, parses the form data, and acts on it. Alternatively, you can use the CGI module to do this for you; we recommend this method.

If you use the CGI module, form data is available by using the param method. If you choose not to use the CGI module, the form data is supplied to stdin as:

fieldname=value&field2name=value

We recommend that you use the CGI module to handle form data because parsing form data on stdin is more complex then you might think given this description. For example, certain characters in form data are encoded in a hex format. For example a space character is encoded as %20.

Suppose you have this form in your HTML:

<form method='post' action='/cgi/myperl.pl'>
Product Code<input type='text' name='productcode' maxlength='64' size='32'>
<input type='image' name='btnsubmit' src='/images/btn_submit.gif' border='0'>
</form>

The form defines the method post, which causes the form data to be supplied on stdin for the CGI script. The form action is the URL to submit the form's contents to (this is the address of your Perl CGI script). The input types define the name of the form fields (productcode in this case) which you pass to CGI's param method. When the user clicks on submit, the browser issues an HTTP GET request to the URL in the action field.

The example form lets a user enter a value for the product code. When the user submits the form, the example SQL's WHERE clause returns only those rows matching that product code.

Our previous Perl CGI script remains the same as the one in Altering the program to produce different results based on the URL because the CGI module transparently handles form data and GETDATA requests.