Mike Gossland's Perl Tutorial Course for Windows



Chapter 4. Handling Files in Perl

Reading Files

Often you will want to read input from a specific file instead of from the keyboard. To do this, you open the specific file for reading, read the lines of input and then close the file again.

To open a file for reading, you can put the "<" character in front of the file name in the open statement; it is a mnemonic for "something coming from the file". However, because reading is a very common operation and is also safe, it is the default, so the "<" is optional. Therefore, you can use either of these statements:

open INPUT, "<input.txt"; or

open INPUT, "input.txt";

For an ordinary file name like this one, the two statements are equivalent.

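If the named file does not exist, the open fails: it returns a false value and puts the reason in the special variable $!. Perl does not print an error message on its own, so it is up to you to test the result of open, typically by adding "or die" with a message to the end of the statement.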
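
A failed open does not stop the script by itself, so it helps to see the check written out. The following is a minimal sketch, not the only way to do it; the file name no_such_file.txt and the variable $message are made up for illustration, and the file is assumed not to exist.

```perl
use strict;
use warnings;

# Hypothetical file name, assumed not to exist, to show the failure path.
my $file = "no_such_file.txt";
my $message;

# open returns a false value on failure and stores the reason in $!.
if ( open INPUT, "<$file" ) {
    $message = "Opened $file";
    close INPUT;
}
else {
    $message = "Cannot open $file: $!";
}
print "$message\n";

# The usual short form stops the script with the same information:
#   open INPUT, "<$file" or die "Cannot open $file: $!";
```

The exact text after the colon comes from the operating system, so it can differ from one machine to another.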

Once you've opened the INPUT filehandle, you can start to read lines with the <INPUT> operation. If you put the read operation on its own inside the test of a while loop, each line of input will go into the default variable, $_. For example, the following script will print out the contents of input.txt to the screen.

open INPUT, "<input.txt";
while ( <INPUT> ) {
  print;
}
close INPUT;
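
In practice you usually do something with each line rather than just print it. The sketch below is one possible variation: it writes a small file first so that it runs on its own (the name sample.txt and the array @numbered are assumptions for illustration), then reads it back a line at a time, removing the newline with chomp and using the built-in variable $. for the current line number.

```perl
use strict;
use warnings;

# Create a small sample file so the sketch runs on its own.
# The name sample.txt is an assumption for illustration.
open OUT, ">sample.txt" or die "Cannot write sample.txt: $!";
print OUT "first\nsecond\nthird\n";
close OUT;

# Read one line at a time; each line lands in $_.
my @numbered;
open INPUT, "<sample.txt" or die "Cannot open sample.txt: $!";
while ( <INPUT> ) {
    chomp;                       # remove the trailing newline from $_
    push @numbered, "$.: $_";    # $. holds the current line number
}
close INPUT;

print "$_\n" for @numbered;
```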

Reading lines into an array

The line

while ( <INPUT> ) {

is equivalent to

while ( $_ = <INPUT> ) {

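(In both forms Perl really tests whether the value read is defined, so the loop does not stop early on a line such as "0".) When you assign <INPUT> to a scalar variable as above, the operation is put into a scalar context. In a scalar context, it reads in only a single line at a time. If you put <INPUT> into a list context, then all the remaining lines are read in at once: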

open INPUT, "<input.txt";
@lines = <INPUT>;
close INPUT;
#Now @lines holds all the lines, one line in each element.
print "Last line is:\n";
print $lines[-1];
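
Once the lines are in an array, the usual array tools apply to the whole file. Here is a small self-contained sketch under the same assumptions as before (sample.txt is a made-up file that the script creates itself): it strips the newlines from every element with a single chomp and counts the lines.

```perl
use strict;
use warnings;

# Create a small sample file so the sketch runs on its own.
# The name sample.txt is an assumption for illustration.
open OUT, ">sample.txt" or die "Cannot write sample.txt: $!";
print OUT "alpha\nbeta\ngamma\n";
close OUT;

# List context: every line is read in one operation.
open INPUT, "<sample.txt" or die "Cannot open sample.txt: $!";
my @lines = <INPUT>;
close INPUT;

chomp @lines;                  # strip the newline from every element
my $count = scalar @lines;     # an array in scalar context gives its length

print "The file has $count lines\n";
print "First line: $lines[0]\n";
print "Last line: $lines[-1]\n";
```

Reading into an array is convenient, but it holds the whole file in memory, so the line-at-a-time loop is the safer choice for very large files.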

Reading a whole file into one variable

Sometimes, you'd rather read the whole content of the file into a single variable, rather than into an array of lines. This is a particularly good move when you need to do a multi-line pattern match or substitution, because then you can match to the entire content at once.

The thing to understand is that the <> operator reads input until it reaches an "end of record character" (Perl's own name for it is the input record separator, and it can be longer than one character). By default, this end of record character is a newline, \n. So, by default, each read stops after a newline.

The Perl variable which specifies the end of record character is $/. By default, $/ = "\n";

If you change the value of $/, you can change input behaviour. You can even undefine the end of record character to read in the whole file in one operation. Use the undef operator:

open INPUT, "<input.txt";
undef $/;
$content = <INPUT>;
close INPUT;
$/ = "\n";     #Restore for normal behaviour later in script

With the above script, the variable $content will hold the entire file.
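
Because $/ is a global variable, forgetting to restore it changes every later read in the script. One common alternative, sketched here rather than prescribed, is to change it with local inside a block, so that the old value comes back automatically when the block ends. As before, sample.txt is a made-up file that the script writes itself so that it can run on its own.

```perl
use strict;
use warnings;

# Create a small sample file so the sketch runs on its own.
# The name sample.txt is an assumption for illustration.
open OUT, ">sample.txt" or die "Cannot write sample.txt: $!";
print OUT "one\ntwo\nthree\n";
close OUT;

my $content;
{
    local $/;    # $/ is undefined only until the end of this block
    open INPUT, "<sample.txt" or die "Cannot open sample.txt: $!";
    $content = <INPUT>;    # one read returns the whole file
    close INPUT;
}
# Here $/ is "\n" again, so line-by-line reading works as usual.

# With the whole file in one string, a pattern can span a line break.
if ( $content =~ /one\ntwo/ ) {
    print "Found \"one\" with \"two\" on the next line\n";
}
```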

Each of these methods of reading a file (a line at a time, into an array of lines, or into a single variable) has its own strengths and weaknesses. We'll use all of them in the upcoming section on editing files. But first, let's learn how to list the files that are in a directory.