The text and programs here are from the first edition of CGI Programming 101, which has since been replaced by the 2nd edition.

Chapter 1: Getting Started


Our programming language of choice for this class is Perl. Perl is a simple language, easy to learn, yet powerful enough to accomplish the most difficult tasks. It is widely available, and is probably already installed on your Unix server. Perl is an interpreted language, meaning you don't need to compile your script - you simply write your script and run it (or have the web server run it). The script itself is just plain text; the Perl interpreter does all the work. The advantage to this is that you can copy your script, with few or no changes, to any machine with a Perl interpreter. The disadvantage is that you won't discover any bugs in your script until you run it.

You can edit your Perl scripts either on your local machine (using your favorite text editor - Notepad, SimpleText, etc.) or in the Unix shell. If you're using Unix, try pico - it's a very simple, easy-to-use text editor. Just type pico filename to create or edit a file. Type man pico for more information and instructions for pico. If you're not familiar with the Unix shell, see Appendix B for a Unix tutorial and command reference.

If you edit on your local machine, upload the finished scripts to the Unix server. Be sure to use a plain text editor, such as Notepad (on PC) or BBEdit (on Mac), and turn off special characters such as smartquotes - CGI files must be ordinary text. Also, it is imperative that you upload your CGI as text, NOT binary! If you upload it as binary, the lines may arrive with stray control characters at their ends (typically carriage returns from PC-style line endings), and these will break your script. You can save yourself a lot of time and grief by uploading everything as text, unless you're uploading pictures (for example, GIFs or JPEGs) or other binary data. HTML and Perl CGIs are not binary; they are plain text.

Once your script is uploaded to the Unix server, you'll want to be sure to move it to your public_html directory (or whatever directory you have set up for web pages). Then, you will also need to change the permissions on the file so that it is "executable" (or runnable) by the system. In Unix, this command is:
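
A minimal sketch of that command, assuming the usual numeric mode 755 (owner: read, write, execute; everyone else: read and execute). The file name myscript.cgi is a placeholder, not a name from this chapter:

```shell
# myscript.cgi is a placeholder; substitute the name of your own script.
# touch only creates an empty file so this sketch can run on its own.
touch myscript.cgi

# 7 = read+write+execute for the owner; 5 = read+execute for group and for others.
chmod 755 myscript.cgi

# List the file to confirm the new permissions (the mode column should read -rwxr-xr-x).
ls -l myscript.cgi
```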

This sets the file permissions so that you can read, write, and execute the file, and all other users (including the webserver) can read and execute it. See Appendix B for a full description of chmod and its options.

Most FTP programs also allow you to change file permissions; if you use your FTP client to change permissions, be sure that the file is readable and executable by everyone, and writable only by the owner (you).

One final note: Perl scripts, Unix commands, and filenames are all case-sensitive. Please keep this in mind as you write your first Perl scripts, because in Unix, "perl" is not the same as "PERL".

Basics of a Perl Script

You probably are already familiar with HTML, and so you know that certain things are necessary in the structure of an HTML document, such as the <HEAD> and <BODY> tags, and that other tags like links and images have a certain allowed syntax. Perl is very similar; it has a clearly defined syntax, and if you follow those syntax rules, you can write Perl as easily as you do HTML.

If you're creating scripts on Unix, you'll need one statement as the first line of every script, telling the server that this is a Perl script and where to find the Perl interpreter. In most scripts the statement will look like this:

    #!/usr/bin/perl

For now, there should generally not be anything else on the same line with this statement. (There are some flags you can use there, but we'll go into those later.) If you aren't sure where Perl lives on your system, try typing these commands:

If the system can find it, it will tell you the path name to Perl. That path is what you should put in the above statement.

After the above line, you'll write your Perl code. Most lines of Perl code must end in a semicolon (;), except the opening and closing lines of loops and conditional blocks. We'll cover those later.

Let's write a simple first program. Enter the following lines into a new file, and name it "first.pl".

Source code: http://www.cgi101.com/class/ch1/first.txt

Save the file. Now, in the Unix shell, you'll need to set the permissions as described earlier, by typing chmod 755 first.pl.

This changes the file permissions to allow you to run the program. You will have to do this every time you create a new script; however, if you're editing an existing script, the permissions will remain the same and won't need to be changed again.

Now, in the Unix shell, you'll need to type this to run the script:

If all goes well, you should see it print Hello, world! to your screen.

NOTE: This program is not a CGI, and won't work if you try it from your browser. But it's easily changed into a CGI; see below.

Basics of a CGI Script

A CGI program is still a Perl script. But one important difference is that a CGI usually generates a web page (for example, a form-processing CGI such as a guestbook usually returns a "thank you for writing" page). If you are writing a CGI that's going to generate an HTML page, you must include this statement somewhere in the script, before you print out anything else:

This is a content header that tells the receiving web browser what sort of data it is about to receive - in this case, an HTML document. If you forget to include it, or if you print something else before printing this header, you'll get an "Internal Server Error" when you try to access the CGI. A good rule of thumb is to put the Content-type line at the top of your script (just below the #!/usr/bin/perl line).

Now let's take our original first.pl script, and make it into a CGI script that displays a web page. If you are running this on a Unix server that lets you run CGIs in your public_html directory, you will probably need to rename the file to first.cgi, so that it ends in the .cgi extension. Here is what it should look like:

Source code: http://www.cgi101.com/class/ch1/firstcgi.txt
Working example: http://www.cgi101.com/class/ch1/firstcgi.cgi

Save this file and run it in the Unix shell, like you ran the other one, by typing "./first.cgi". Notice how the script just prints out a bunch of HTML? This is what it should do. More importantly, if there's an error in your script, the Perl interpreter will tell you the line at or near which the error occurred. This is good to remember, because when you're writing longer, more complex scripts you may have errors, and the error message you get on the web (500 Server Error) is not at all useful for debugging.

Now let's call this CGI from your browser. You don't need to call it from a web page. Just move it into your public_html or cgi-bin directory, and type the direct URL for the CGI. For example:

Try it in your own web directory. It should return a web page with the "Hello, world!" phrase on it. (If it doesn't, see "Debugging a Script," below.)

Another way to write the above CGI, without using multiple print statements, is as follows:

Source code: http://www.cgi101.com/class/ch1/firstcgi2.txt
Working example: http://www.cgi101.com/class/ch1/firstcgi2.cgi

This is the "here-doc" syntax. Note that there are no spaces between the << and the EndOfHTML in the statement that opens it, which is written print <<EndOfHTML;

Also, despite the fact that the script appears indented on this page, each line should start in column 1 in your CGI - especially the "EndOfHTML" line that's by itself. If there's even one space before the EndOfHTML, you'll get an error, and your script won't run.

This manner of displaying HTML will become more useful with future CGIs, because it doesn't require you to escape embedded quotes, like you would with a normal print statement:

Note that the quotes around the URL have to be escaped with a backslash in front of them for this to work, since Perl strings are enclosed in "quotes" - the only way to embed another quote inside a quoted string is to escape it, like so: "John \"Q.\" Public".

Debugging a Script

A number of problems can happen with your CGI, and unfortunately the default response of the webserver when it encounters an error (the dreaded "Internal Server Error") is not very useful for figuring out what happened.

If you see the code for the actual Perl script instead of the desired output page from your CGI, this means one of two things: either you didn't rename the file with the .cgi extension (perhaps you left it named "first.pl"), or your web server isn't configured to run CGIs (at least not in your directory). You'll need to ask your webmaster how to run CGIs on your server. And if you ARE the webmaster, check your server's documentation to see how to enable CGIs in user directories.

If you get an Internal Server Error, there's a bug in your script. There are numerous ways to hunt down the bugs; perhaps the easiest is to modify your script and add the following line near the top:
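
One widely used line that fits this description is use CGI::Carp qw(fatalsToBrowser); from the CGI::Carp module distributed with CGI.pm. That this is the line meant here is an assumption, and on recent Perl versions CGI.pm is no longer bundled and may need to be installed separately. A sketch with a placeholder file name:

```shell
cat > carp_demo.cgi <<'EOF'
#!/usr/bin/perl
# Send fatal error messages to the browser as well as to the server log.
use CGI::Carp qw(fatalsToBrowser);
print "Content-type:text/html\n\n";
print "<p>Hello, world!</p>\n";
EOF

# Confirm the line sits near the top of the script.
head -n 3 carp_demo.cgi
```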

This will display error messages directly in your browser window, instead of sending them only to the server log.

You can also try running the CGI from the command line in the Unix shell. The following will check the syntax of your script without actually running it:

You might also try the -w flag (for "warnings"), to report any unsafe Perl constructs:

This will report any syntax errors in your script, and warn you of improper usage. For example:

This tells you there's a problem at or around line 9; make sure you didn't forget a closing semicolon on the previous line, and check for any other typos. Also be sure you saved and uploaded the file as text - hidden control characters or smartquotes can cause syntax errors, too.

If the perl -cw command indicates that your syntax is ok, debugging will take a little more work. This means your script is breaking as it runs; possibly the input data is causing a problem. One way to get more info is to look at the server log files. First you'll have to find out where the logs are... some usual locations are /usr/local/etc/httpd/logs/error_log, or /var/log/httpd/error_log. Then, to view the end of the log, type tail followed by the path to the log file, for example tail /var/log/httpd/error_log.

The last line of the file should be your error message. Here are some example errors from the error log:

A "malformed header" or "premature end of script headers" can either mean that you printed something before printing the "Content-type:text/html" line, or your script died. An error usually appears in the log indicating where the script died, as well; in the above example the @-sign in an email address ("@yahoo") wasn't escaped with a backslash. Of course, such an error would also appear if you ran perl -cw on the script; this example just shows how it would look in the server log.

Resources

Visit http://www.cgi101.com/class/ch1/ for source code and links from this chapter.


Copyright © 2000 by Jacqueline D. Hamilton.