File Locking in Perl
- Reasons for Locking Files
- Introducing flock()
- Semaphore Files
- Locking Demonstrated
- Caveats & Modules
Accessing resources such as files and programs is one of the most common tasks when writing software. Often, multiple programs or multiple instances of one program want to access the same resource. Locking files is a good way to ensure that only one process uses a resource at a time. This article explains file locking, common mistakes, and an idiomatic way to lock files. Before using the techniques described here, make sure that you fully understand Perl's flock() function, or understand the UNIX flock(2) system call from the appropriate documentation.
Reasons for Locking Files
When many processes share a resource, sooner or later two of them will try to alter it at the same time. When this happens, updates can be lost or the data can be left half-written. Data integrity is important. No one wants a file containing incorrect data, or worse, a file that has become corrupted. This is why we want to lock files: to ensure that only a single process at a time can access a given resource.
Let's look at a very common application that uses text files as a data store: access counters for web sites. Here's a common way that people implement access counters with Perl:
    #!/usr/bin/perl -w
    use strict;

    my $COUNTER = 'count.dat';

    print qq{Content-Type: text/html\n\n};

    # Read the current count.
    open(DATA, $COUNTER) or die "Can't open $COUNTER ($!)";
    my $count = <DATA>;
    close DATA;

    $count++;

    # Write the new count back.
    open(DATA, "+<$COUNTER") or die "Can't open $COUNTER ($!)";
    print DATA $count;
    close DATA;

    print qq{You are number $count!};
Basic as this script may be, it has a few problems. To explain what is wrong, let's begin with the flow this program follows during execution.
- Open file for reading
- Read data from file
- Close file
- Open file for update
- Write updated data to file
- Close file
Do you see what's wrong? If not, try thinking of multiple instances of this program running all at the same time. The problem here is that each instance opens, reads and writes to the same file whenever it feels like it. This means that multiple instances can read the same count at the same time, and update the file concurrently. We are now vulnerable to the count not being accurate, and possibly being reset to 1. This problem is commonly called a race condition, because a condition exists where multiple instances are racing to use a resource. As with any race, there will be only one winner, and that winner may not have the desired information.
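To make the race concrete, here is a minimal sketch that replays one unlucky interleaving in a single process, so the result is the same on every run. It is an illustration under stated assumptions rather than part of the counter script: the helpers read_count and write_count are hypothetical names, a scratch file from File::Temp stands in for count.dat, and the starting value 41 is arbitrary. Two pretend instances, A and B, both read the count before either one writes it back.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: open, read and close, as the counter script does.
sub read_count {
    my ($path) = @_;
    open(my $in, '<', $path) or die "Can't open $path ($!)";
    my $count = <$in>;
    close $in;
    $count = 0 unless defined $count;
    chomp $count;
    return $count;
}

# Hypothetical helper: open for update, write and close.
sub write_count {
    my ($path, $count) = @_;
    open(my $out, '+<', $path) or die "Can't open $path ($!)";
    print {$out} $count;
    close $out;
}

# Scratch file standing in for count.dat (an assumption for this demo).
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "41";
close $fh;

# Instances A and B both read before either one writes.
my $seen_by_a = read_count($path);
my $seen_by_b = read_count($path);

# Each instance now writes its own idea of the next value.
write_count($path, $seen_by_a + 1);
write_count($path, $seen_by_b + 1);

my $final = read_count($path);
print "Two hits recorded, but the counter went from 41 to $final\n";
```

Both instances saw 41, so both wrote 42, and one visit was never counted. On a real server the interleaving depends on timing, which is why the error shows up only some of the time.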