The text and programs here are from the first edition of CGI Programming 101, which has been replaced by the 2nd edition.

Chapter 3: CGI Environment Variables


Environment variables are a series of hidden values that the web server sends to every CGI you run. Your CGI can read them and use the data they carry. In Perl, environment variables are stored in a hash called %ENV.

Variable Name     Value
DOCUMENT_ROOT     The root directory of your server
HTTP_COOKIE       The visitor's cookie, if one is set
HTTP_HOST         The hostname of your server
HTTP_REFERER      The URL of the page that called your script
HTTP_USER_AGENT   The browser type of the visitor
HTTPS             "on" if the script is being called through a secure server
PATH              The system path your server is running under
QUERY_STRING      The query string (see GET, below)
REMOTE_ADDR       The IP address of the visitor
REMOTE_HOST       The hostname of the visitor (if your server has reverse-name-lookups on; otherwise this is the IP address again)
REMOTE_PORT       The port on the visitor's machine that the request came from
REMOTE_USER       The visitor's username (for .htaccess-protected pages)
REQUEST_METHOD    GET or POST
REQUEST_URI       The interpreted pathname of the requested document or CGI (relative to the document root)
SCRIPT_FILENAME   The full pathname of the current CGI
SCRIPT_NAME       The interpreted pathname of the current CGI (relative to the document root)
SERVER_ADMIN      The email address for your server's webmaster
SERVER_NAME       Your server's fully qualified domain name (e.g. www.cgi101.com)
SERVER_PORT       The port number your server is listening on
SERVER_SOFTWARE   The server software you're using (such as Apache 1.3)

Some servers set other environment variables as well; check your server documentation for more information. Notice that some environment variables give information about your server, and will never change from CGI to CGI (such as SERVER_NAME and SERVER_ADMIN), while others give information about the visitor, and will be different every time someone accesses the script.

Not all environment variables get set for every CGI. REMOTE_USER is only set for pages in a directory or subdirectory that's password-protected via a .htaccess file. (See Appendix D to learn how to password protect a directory.) Even then, REMOTE_USER is just the username the visitor logged in with; it's not the person's email address. There is no reliable way to get a person's email address, short of asking them outright for it (with a form).

The %ENV hash is automatically set for every CGI, and you can use any or all of it as needed. For example, if you wanted to print out the URL of the page that called your CGI, you'd do:
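The original one-line example is not reproduced in this copy. A minimal sketch of the idea follows; the "(none)" fallback for a missing referer and the plain-text output are assumptions added for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# HTTP_REFERER is absent when the visitor typed the URL directly,
# so substitute a placeholder rather than print an undefined value.
my $referer = $ENV{'HTTP_REFERER'};
my $caller  = (defined $referer && length $referer) ? $referer : '(none)';

print "Content-type: text/plain\n\n";
print "Caller = $caller\n";
```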

It's very simple to print out all of the environment variables, and some of the values in the %ENV hash will be useful to you later, so let's try it. Create a new file, and name it env.cgi. Edit it as follows:
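The listing itself did not survive in this copy (the source code link below has the original). As a stand-in, here is one way env.cgi could be written; the helper name env_lines and the plain-text output are assumptions, not the author's code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return one "NAME = value" line per variable, sorted by variable name.
sub env_lines {
    my (%env) = @_;
    return map { "$_ = $env{$_}" } sort keys %env;
}

# Plain text keeps the sketch short: no HTML markup or escaping needed.
print "Content-type: text/plain\n\n";
print "$_\n" foreach env_lines(%ENV);
```

Passing the hash into a small function, rather than reading %ENV inside it, makes the sorting easy to try at the command line with made-up values.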

Source code: http://www.cgi101.com/class/ch3/env.txt
Working example: http://www.cgi101.com/class/ch3/env.cgi

Save the above CGI, chmod it, and call it up in your web browser. Remember, if you get a server error, you'll want to go back and try running the script at the command line in the Unix shell, to see just where the problem is. (But note, if you run env.cgi in the shell, you'll get an entirely different set of environment variables.)

In this example we've sorted the keys for the ENV hash so they'll print out alphabetically, using the sort function. Perl's sort function, by default, compares the string value of each element of an array - which means it doesn't work properly for sorting numbers. Fortunately, sorting can be customized. We'll cover numeric and custom sorting in Chapter 8.
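To see what string comparison does to a list of numbers, here is a quick sketch. The sample list is made up for illustration, and the numeric version is only a preview of the comparison-block form of sort.

```perl
use strict;
use warnings;

my @nums = (10, 9, 2, 33);

# Default sort compares elements as strings, so "10" sorts before "2".
my @as_strings = sort @nums;

# A comparison block using <=> sorts by numeric value instead.
my @as_numbers = sort { $a <=> $b } @nums;

print "@as_strings\n";   # prints: 10 2 33 9
print "@as_numbers\n";   # prints: 2 9 10 33
```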

A Simple Query Form

There are two ways to send data from an HTML form to a CGI: GET and POST. These methods determine how the form data is sent to the server. In the GET method, the input values from the form are sent as part of the URL, and saved in the QUERY_STRING environment variable. With POST, data is sent as an input stream to the program. We'll cover POST in the next chapter, but for now, let's look at the GET method.

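You can set the QUERY_STRING value in a number of ways. For example, here are a number of direct links to the env.cgi script:

http://www.cgi101.com/class/ch3/env.cgi?test1
http://www.cgi101.com/class/ch3/env.cgi?test2
http://www.cgi101.com/class/ch3/env.cgi?test3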

Try opening each of these in your web browser. Notice that the QUERY_STRING is set to whatever appears after the question mark in the URL itself. In the above examples, it's set to "test1", "test2", and "test3", respectively. This can be carried one step further, by setting up a simple form, using the GET method. Here's the HTML for such an example:
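The form markup is not reproduced in this copy. As a stand-in, the Perl sketch below prints a minimal GET form that submits to env.cgi. The field name sample_text, the prompt text, and the script name makeform.pl are all made-up placeholders.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the HTML for a one-field form that sends its data to env.cgi
# with the GET method. The caller chooses the field name.
sub form_html {
    my ($field) = @_;
    return <<"END_HTML";
<html><head><title>Test Form</title></head>
<body>
<form action="env.cgi" method="GET">
Enter some text here:
<input type="text" name="$field" size="30">
<input type="submit">
</form>
</body></html>
END_HTML
}

print form_html('sample_text');
```

If you save this as makeform.pl, running "perl makeform.pl > form.html" in the shell writes the form to a file you can open in your browser.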

Create the above form (call it form.html), and call it up in your browser. Type something into the field and hit return. You'll get the same env.cgi output, but this time you'll notice that the query string has two parts, joined by an equals sign. It should look something like fieldname=value.

The value on the left is the actual name of the form field. The value on the right is whatever you typed into the input box, BUT you may notice if you had any spaces in the string you typed, they've been replaced with +. Similarly, various punctuation and other special non-alphanumeric characters are escaped out with a %-code. This is called URL-encoding, and it happens with data submitted through either GET or POST methods.

Your Perl script can convert this information back, but it's often easier to use the POST method when sending long or complex data. GET is mainly useful for short, one-field queries, especially for things like database searches.
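Converting the data back takes two steps: turn each + into a space, then turn each %-code into the character it stands for. A minimal sketch, where the function name url_decode is an assumption:

```perl
use strict;
use warnings;

# Undo URL-encoding. The + signs are handled first, so that an encoded
# plus sign (%2B) still comes out as a literal "+".
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;                               # + becomes a space
    $text =~ s/%([0-9a-fA-F]{2})/chr(hex($1))/ge;   # %XX becomes one character
    return $text;
}

print url_decode('hello+world%21'), "\n";   # prints: hello world!
```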

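You can also send multiple input data values with GET, by giving the form more than one input field. For example, a form with fields named fname and lname, filled in with "joe" and "smith", will be passed to the env.cgi script as the following query string:

fname=joe&lname=smith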

The values are separated by a &-sign. To parse this, you'll want to split the query string with Perl's split function:
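The code for this step is missing from this copy. Here is a sketch of the two splits, using the @values array described in the text; the wrapper function query_lines and the variable names inside the loop are assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn "fname=joe&lname=smith" into lines like "fname = joe".
sub query_lines {
    my ($query) = @_;
    my @values = split(/&/, $query);     # first split: one item per field
    my @lines;
    foreach my $item (@values) {
        next unless length $item;        # skip empty pieces such as "a=1&&b=2"
        # Second split: name on the left, data on the right. The limit
        # of 2 keeps any further "=" signs inside the data.
        my ($fieldname, $data) = split(/=/, $item, 2);
        $data = '' unless defined $data;
        push @lines, "$fieldname = $data";
    }
    return @lines;
}

my $query = defined $ENV{'QUERY_STRING'} ? $ENV{'QUERY_STRING'} : '';
print "Content-type: text/plain\n\n";
print "$_\n" foreach query_lines($query);
```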

split lets you break up a string into an array of different strings, breaking on a specific character. In the first case, we've split on the &-sign. This gives us two values: "fname=joe" and "lname=smith", which are stored in the array named @values. Then, with a foreach loop, we further split each string on the = sign, and print out the field name and the data that was entered into that field in the form.

Some warnings about GET: it is not at all a secure method of sending data, so don't use it for sending password info, credit card data or other sensitive information. Since the data is passed through as part of the URL, it'll show up in the web server's logfile (complete with all the data), and if that logfile is readable by any user (as most are), you're giving the info away to anyone who might happen to be looking. Private information should always be sent with the POST method, which we'll cover in the next chapter. (Of course, if you're asking visitors to send sensitive information like credit card numbers, you should also use a secure server, in addition to the POST method.)

GETs are most useful because they can be embedded in a link without needing a form element. This is often used in conjunction with databases, or instances where you want a single CGI to handle a clearly defined set of options. For example, you might have a database of articles, each with a unique article ID. You could write a single article.cgi to serve up the article, and the CGI would simply look at the query string to figure out which article to display. For example, clicking on a link such as

article.cgi?22

would display article #22.
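As a rough sketch of how such an article.cgi might start, assuming the query string holds nothing but the numeric article ID (as in a hypothetical link to article.cgi?22), the script can check the value before using it. The function name article_id and the messages are placeholders, and the actual database lookup is left out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the article ID if the query string is all digits, otherwise
# return nothing. Anyone can type anything after the ?, so never trust it.
sub article_id {
    my ($query) = @_;
    return unless defined $query && $query =~ /\A(\d+)\z/;
    return $1;
}

my $id = article_id($ENV{'QUERY_STRING'});

print "Content-type: text/plain\n\n";
if (defined $id) {
    print "This is where article #$id would be displayed.\n";
} else {
    print "No valid article ID was given.\n";
}
```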

Remote Host ID

You've probably seen web pages that greet you with a message like "Hello, visitor from (yourhost)!", where (yourhost) is your actual hostname or IP address. Here is an example of how to do that:
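The listing is missing from this copy (see the source code link below for the original). A minimal sketch of the idea, where the function name greeting, the fallback phrase, and the plain-text output are assumptions:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the greeting. If the server supplied no hostname, fall back
# to a generic phrase instead of printing an undefined value.
sub greeting {
    my ($host) = @_;
    $host = 'somewhere on the Internet' unless defined $host && length $host;
    return "Hello, visitor from $host!";
}

print "Content-type: text/plain\n\n";
print greeting($ENV{'REMOTE_HOST'}), "\n";
```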

Source code: http://www.cgi101.com/class/ch3/rhost.txt
Working example: http://www.cgi101.com/class/ch3/rhost.cgi

This particular CGI creates a new page, but you'll probably want to use a server-side include (SSI), instead, to embed the information in another page. See Chapter 9 for more on SSIs.

One caveat: this won't work if your server isn't configured to do host name lookups. An alternative would be to display the visitor's IP address:
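One possible variation, sketched here as an assumption rather than the original script, is to use the hostname when the server provides one and fall back to REMOTE_ADDR otherwise. The function name visitor_id is a placeholder.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Prefer the hostname; use the IP address when no hostname is available.
sub visitor_id {
    my ($host, $addr) = @_;
    return $host if defined $host && length $host;
    return $addr if defined $addr && length $addr;
    return 'unknown';
}

print "Content-type: text/plain\n\n";
print "Hello, visitor from ",
      visitor_id($ENV{'REMOTE_HOST'}, $ENV{'REMOTE_ADDR'}), "!\n";
```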

Working example: http://www.cgi101.com/class/ch3/rhostip.cgi

Last Page Visited

This is a variation on the remote host ID script - only here, we show the last page you visited.
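The listing is missing from this copy (see the source code link below for the original). A sketch of how it might look, where the function name referer_message and the wording of both messages are assumptions:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Describe the referring page, or say so when there isn't one.
sub referer_message {
    my ($referer) = @_;
    return "You did not follow a link to get here."
        unless defined $referer && length $referer;
    return "The last page you visited was: $referer";
}

# Plain text avoids having to escape the URL for display in HTML.
print "Content-type: text/plain\n\n";
print referer_message($ENV{'HTTP_REFERER'}), "\n";
```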

Source code: http://www.cgi101.com/class/ch3/refer.txt
Working example: http://www.cgi101.com/class/ch3/refer.cgi

The HTTP_REFERER value only gets set when a visitor actually clicks on a link to your page. If they type the URL directly, HTTP_REFERER is not set. Browsers and proxies can also omit or alter it, so don't rely on it for anything important.

Checking Browser Type

This script does some pattern-checking to see what browser the visitor is using, and displays a different message depending on browser type.
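The listing is missing from this copy (see the source code link below for the original). A sketch of the pattern checks follows; the function name browser_message and the message text are assumptions. Note that the MSIE test has to come before the Mozilla test.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pick a message based on the browser's user-agent string.
sub browser_message {
    my ($agent) = @_;
    $agent = '' unless defined $agent;
    if ($agent =~ /MSIE/) {
        # Checked first, because IE's string also contains "Mozilla".
        return "You seem to be using Internet Explorer.";
    } elsif ($agent =~ /Mozilla/) {
        return "You seem to be using a Mozilla-type browser.";
    } else {
        return "You seem to be using another kind of browser.";
    }
}

print "Content-type: text/plain\n\n";
print browser_message($ENV{'HTTP_USER_AGENT'}), "\n";
```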

Source code: http://www.cgi101.com/class/ch3/browser.txt
Working example: http://www.cgi101.com/class/ch3/browser.cgi

This is a tricky example because IE actually includes "Mozilla" in the browser type line, so we have to try matching "MSIE" first, before matching "Mozilla". The =~ is a pattern matching operator; it checks to see if /pattern/ is contained somewhere in the string. You can also use the =~ operator to replace patterns; we'll see an example of that in the next chapter.

Resources

Visit http://www.cgi101.com/class/ch3/ for source code and links from this chapter.


Copyright © 2000 by Jacqueline D. Hamilton.