Mike Gossland's Perl Tutorial Course for Windows




Chapter 4. Handling Files in Perl

Editing Files

We have seen many elements of Perl: matching and substitution, reading and writing files, and a number of functions. It is time to put them all together and edit files.


Let's say you have this nonsensical data file. Copy and paste the text into Notepad and save it as test.txt in your scripts directory:

cry through leap been by full take
many again track every many aim quite
able how plus all all life toad than end through 
if would that fire quite away that away smile 
take every away quiet quiet toad strong how 
old every when cry quiet how be pale quiet 
smell leap hope quite sit to able how but by

Let's say you are asked to change the word "away" to "yellow", but only on lines where "away" occurs at least twice, and only for the second occurrence on the line. The changed text must then be saved under the original filename, test.txt. Here's a script to do all those things. It reads the whole file into an array first, so the file can safely be reopened for writing.

#Specify the file
$file = "test.txt";
#Open the file and read data
#Die with grace if it fails
open (FILE, "<$file") or die "Can't open $file: $!\n";
@lines = <FILE>;
close FILE;

#Open same file for writing, reusing STDOUT
open (STDOUT, ">$file") or die "Can't open $file: $!\n";

#Walk through lines, putting into $_, and substitute 2nd away
for ( @lines ) {
  s/(.*?away.*?)away/$1yellow/;
  print;
}
#Finish up
close STDOUT;
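
To see why the substitution touches only the second "away", it can help to try the same pattern on single strings without opening any files. The following is a minimal sketch; the sub name replace_second_away is hypothetical and is not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the tutorial's substitution to one string
# and returns the result.
sub replace_second_away {
    my ($line) = @_;
    # (.*?away.*?) captures, non-greedily, everything up to and including
    # the first "away" plus whatever lies before the next one. The second
    # "away" is then replaced. With fewer than two occurrences the pattern
    # cannot match, so the string comes back unchanged.
    $line =~ s/(.*?away.*?)away/$1yellow/;
    return $line;
}

# Two lines from the sample data: one with two "away"s, one with a single one.
print replace_second_away("if would that fire quite away that away smile\n");
print replace_second_away("take every away quiet quiet toad strong how\n");
```

Only the first of the two lines should come back changed. Note that the pattern matches the letters "away" anywhere, including inside a longer word; wrapping each "away" in \b would restrict it to whole words.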

Now let's suppose you want to work with this comma-separated numeric data.

Copy the text and save it as test.txt, replacing the file from the previous example:

"Date","Time","O","H","L","C","V","OI"
12/18/2018,1600,1562,1562,1562,1562,0,0
12/19/2018,1600,1800,1800,1800,1800,0,0
12/20/2018,1600,1589,1589,1589,1589,0,0
12/23/2018,1600,1121,1121,1121,1121,0,0
12/24/2018,1600,1298,1298,1298,1298,0,0
12/26/2018,1600,1544,1544,1544,1544,0,0
12/27/2018,1600,1451,1451,1451,1451,0,0
12/30/2018,1600,1402,1402,1402,1402,0,0
12/31/2018,1600,1281,1281,1281,1281,0,0
01/02/2019,1600,784,784,784,784,0,0
01/03/2019,1600,1859,1859,1859,1859,0,0
01/06/2019,1600,1391,1391,1391,1391,0,0
01/07/2019,1600,1476,1476,1476,1476,0,0

Let's say you want to remove the Time, V, and OI columns, change commas to tabs, and append the words " over 1400" to the line if the value in the C column is over 1400. You want to save the output to a different file, testout.txt, and print out a copy of it when you are finished. Here's a script to do that:

$in_file = "test.txt";
$out_file = "testout.txt";

open (IN, "<$in_file") or die "Can't open $in_file: $!\n";
open (OUT, ">$out_file") or die "Can't open $out_file: $!\n";

while ( $line = <IN> ) {
  @fields = split /\s*,\s*/, $line;
  $line = join "\t", $fields[0], @fields[2..5];  #an array slice!
  print OUT $line;
  print OUT " over 1400" if $fields[5] > 1400;
  print OUT "\n";
}

close IN;
close OUT;

#read in output file and print to screen to confirm
open (TEST, "<$out_file") or die "Can't open $out_file: $!\n";
while ( <TEST> ) {
  print;
}
close TEST;
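
The work done on each line inside the while loop can also be pulled out into a small function and tried on one record at a time. This is only a sketch under a couple of assumptions: the name convert_line is hypothetical, and the numeric check on the C column is an addition so that the quoted header row is never compared as a number.

```perl
use strict;
use warnings;

# Hypothetical helper: turns one comma-separated record into the
# tab-separated form produced by the script above.
sub convert_line {
    my ($line) = @_;
    chomp $line;                                      # drop the trailing newline
    my @fields = split /\s*,\s*/, $line;              # Date,Time,O,H,L,C,V,OI
    my $out = join "\t", $fields[0], @fields[2..5];   # keep Date, O, H, L, C
    # Assumption: only append the note when the C column is a plain integer.
    $out .= " over 1400" if $fields[5] =~ /^\d+$/ && $fields[5] > 1400;
    return $out;
}

# Two records from the sample data, one above and one below the threshold.
print convert_line("12/18/2018,1600,1562,1562,1562,1562,0,0\n"), "\n";
print convert_line("12/23/2018,1600,1121,1121,1121,1121,0,0\n"), "\n";
```

Keeping the line handling in one function makes it easy to check the column numbers before running the script against the real file.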

We have one last thing to cover in this section, and that is editing all the files in a directory tree, or recursive editing.