Yesterday I posted about not overwriting a file until you knew you had completely and correctly stored its data. Part of that was about data security and another part was completeness for other consumers. I used a temporary file and rename to move the new file into place. That’s not the entire story though.
Some of this you can glean from reading your filesystem code. For instance, the link(2) manual page explains quite a bit. I’m going to back up to some basics for this post. Perl, being part of the Unix toolbox, is close to the underlying system calls. Understanding what those calls do helps you use Perl appropriately.
Linking to data
Create your original data file. The filesystem assigns it some sort of identifier (“inode”, “fileId”, whatever). I’ll use the word “inode” throughout this, but your filesystem might call it something different. This is not the filename. We’ll get to that in a moment. You get this number from the “inode” element (index 1) of the list that stat returns.
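Here is a minimal sketch of that lookup, assuming a Unix-like filesystem. The file name inode-demo.txt is made up for the illustration, and the program creates and removes it so it can run on its own:

```perl
use v5.14;
use warnings;

# Hypothetical file name, created here so the example is self-contained
my $file = 'inode-demo.txt';

open my $fh, '>', $file
    or die "Could not write <$file>: $!\n";
say {$fh} 'some data';
close $fh;

# stat returns a list: element 1 is the inode number,
# element 3 is the number of links (names) for that inode
my( $inode, $nlink ) = ( stat $file )[1,3];
say "$file has inode $inode and $nlink link(s)";

unlink $file or die "Could not remove <$file>: $!\n";
```

The actual inode number depends on your filesystem and changes from run to run.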
A filename “links” to the inode; this is why there are link and unlink functions in Perl.
You can link several names to the same data. No matter which file name you use you’re playing with the same data. The names are just labels.
The data at inode 215132 stick around as long as there are links to it or some process has it open through a filehandle (and maybe a bit longer). There are various reasons you might link several names to the same data, but those would be distracting here. Read “Use cases for hardlinks?” for some ideas.
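The open filehandle part is easy to try for yourself. This is a small sketch that assumes a Unix-like system (some other systems refuse to remove a file that is still open), and the file name open-handle-demo.txt is invented for the example:

```perl
use v5.14;
use warnings;

my $file = 'open-handle-demo.txt';   # hypothetical name

open my $out, '>', $file
    or die "Could not write <$file>: $!\n";
say {$out} 'still here';
close $out;

# Open for reading before removing the name
open my $in, '<', $file
    or die "Could not read <$file>: $!\n";

# Remove the only name that links to the inode
unlink $file or die "Could not remove <$file>: $!\n";
say "Name exists after unlink: ", ( -e $file ? 'yes' : 'no' );

# The open filehandle still reaches the data
my $line = <$in>;
print "Read through the open handle: $line";

# Once this handle closes, the filesystem is free to reclaim the space
close $in;
```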
Here’s a little Perl program that demonstrates more than one name pointing to the same inode:
    #!perl
    use v5.14;
    use warnings;

    my $original    = 'rocks.txt';
    my $second_name = 'geology.txt';
    unlink $original, $second_name; # start fresh

    my @names = qw(Fred Barney Wilma Betty);

    open my $fh, '>:utf8', $original
        or die "Could not write <$original>: $!\n";
    say {$fh} $_ foreach @names;
    close $fh;

    say "$original has inode " . (stat $original)[1];
    say "File has " . (stat $original)[3] . " links";

    link( $original, $second_name );
    say "$second_name has inode " . (stat $second_name)[1];
    say "File has " . (stat $second_name)[3] . " links";

    unlink $original
        or die "Could not remove <$original>: $!\n";
    say "=== after unlink";
    say "$second_name has inode " . (stat $second_name)[1];
    say "File has " . (stat $second_name)[3] . " links";

    open my $in, '<:encoding(UTF-8)', $second_name
        or die "Could not read <$second_name>: $!\n";
    print $_ foreach <$in>;
    close $in;
The output shows that both files have the same inode and that the data are still there even after the original filename disappears:
    rocks.txt has inode 8666857151
    File has 1 links
    rocks.txt has inode 8666857151
    geology.txt has inode 8666857151
    File has 2 links
    === after unlink
    geology.txt has inode 8666857151
    File has 1 links
    Fred
    Barney
    Wilma
    Betty
Renaming files
When you rename a file, you aren’t moving data around. You change the text in the filename but not the data on disk or where the filesystem put it. The rename merely changes the link, which is why rename only works when the old and new names are on the same partition (that is, the same filesystem):
rename 'rocks.txt' => 'characters.txt';
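One way to see the same-partition restriction from Perl is to compare device numbers, which are element 0 of the stat list. The helper same_device below is hypothetical, as are the file names, and matching device numbers are only a reasonable hint that rename will work, not a guarantee. Checking the return value of rename is still the real test:

```perl
use v5.14;
use warnings;
use File::Basename qw(dirname);

# Hypothetical helper: true if the directories holding both paths
# report the same device number (element 0 of stat)
sub same_device {
    my( $from, $to ) = @_;
    my $from_dir = dirname($from);
    my $to_dir   = dirname($to);
    my $from_dev = ( stat $from_dir )[0];
    my $to_dev   = ( stat $to_dir )[0];
    return( defined($from_dev) && defined($to_dev) && $from_dev == $to_dev );
}

# Made-up file names so the example can run on its own
my( $old, $new ) = ( 'rename-demo-old.txt', 'rename-demo-new.txt' );
open my $fh, '>', $old
    or die "Could not write <$old>: $!\n";
close $fh;

if( same_device( $old, $new ) ) {
    rename $old => $new
        or die "Could not rename <$old> to <$new>: $!\n";
    say "Renamed $old to $new";
}
else {
    say "Different devices: rename would fail, so copy the data instead";
}

unlink $old, $new;   # clean up whichever name is left
```

If you would rather not check up front, move from File::Copy tries a rename first and falls back to copying the data and removing the original.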
This is where the problem comes in when I want to rename some other file, such as my temporary file, into the original filename. I create the temporary file, but that’s a new inode.
When I rename my tempfile to the original name, the new inode gets that name and the original inode loses that name.
Many times this won’t be a problem, but what if some other program or person had made a hard link to the original inode and expected the current data to be in that inode? The other links point to the original inode while the new data are in a different inode.
Here’s the Perl program that shows the flow of inodes and names:
    use v5.14;
    use warnings;

    my $original    = 'rocks.txt';
    my $second_name = 'geology.txt';
    unlink $original, $second_name;

    my @names = qw(Fred Barney Wilma Betty);

    open my $fh, '>:utf8', $original
        or die "Could not write <$original>: $!\n";
    say {$fh} $_ foreach @names;
    close $fh;

    say "$original has inode " . (stat $original)[1];
    say "File has " . (stat $original)[3] . " links";

    link $original, $second_name;
    say "$second_name has inode " . (stat $second_name)[1];
    say "File has " . (stat $second_name)[3] . " links";

    # the new data go into a temporary file, which gets its own inode
    use File::Temp qw(tempfile);
    my( $tempfh, $tempfile ) = tempfile();
    say {$tempfh} uc($_) foreach @names;
    close $tempfh;
    say "$tempfile has inode " . (stat $tempfile)[1];

    # rename fails across filesystems, so check the result
    rename $tempfile => $original
        or die "Could not rename <$tempfile> to <$original>: $!\n";
    say "=== After rename";
    say "$original has inode " . (stat $original)[1];
    say "$second_name has inode " . (stat $second_name)[1];
    say "File has " . (stat $original)[3] . " links";

    say "=== In $original";
    open my $in, '<:encoding(UTF-8)', $original
        or die "Could not read <$original>: $!\n";
    print $_ foreach <$in>;
    close $in;

    say "=== In $second_name";
    open my $in2, '<:encoding(UTF-8)', $second_name
        or die "Could not read <$second_name>: $!\n";
    print $_ foreach <$in2>;
    close $in2;
The output shows the progression. Both rocks.txt and geology.txt start off with the same inode (so, the same data). The temporary file has a different inode and different data. After the rename, rocks.txt points to the temporary file’s inode while geology.txt points to the original inode. The new data are in rocks.txt but geology.txt still points to the original data. Anyone going through geology.txt doesn’t see the updates:
    rocks.txt has inode 8666869539
    File has 1 links
    geology.txt has inode 8666869539
    File has 2 links
    /var/folders/jf/7sn23hrs11jcrn2w39wm6k_r0000gn/T/6vytLYAT2c has inode 8666869542
    === After rename
    rocks.txt has inode 8666869542
    geology.txt has inode 8666869539
    File has 1 links
    === In rocks.txt
    FRED
    BARNEY
    WILMA
    BETTY
    === In geology.txt
    Fred
    Barney
    Wilma
    Betty
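If you want to know ahead of time whether a rename would strand other names like this, the link count from stat tells you. This is only a sketch: link_count is a hypothetical helper, the guard-*.txt names are invented, and another process could still add a link between your check and your rename:

```perl
use v5.14;
use warnings;

# Hypothetical helper: how many names link to this file's inode?
# Returns 0 if stat fails (for instance, the file does not exist).
sub link_count {
    my( $path ) = @_;
    my $nlink = ( stat $path )[3];
    return $nlink // 0;
}

my $original    = 'guard-rocks.txt';     # made-up names for the demo
my $second_name = 'guard-geology.txt';
unlink $original, $second_name;          # start fresh

open my $fh, '>', $original
    or die "Could not write <$original>: $!\n";
say {$fh} 'Fred';
close $fh;

link $original, $second_name
    or die "Could not link <$second_name>: $!\n";

my $count = link_count( $original );
if( $count > 1 ) {
    say "$original has $count links; a rename would leave the other names on the old data";
}
else {
    say "$original has no other links; nothing else points at the old inode";
}

unlink $original, $second_name;          # clean up
```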
Copying files
Instead of renaming the file, you can copy the file. That copies the contents of one inode into another. If the destination already has an inode (the file exists), that inode is reused:
    # replace the rename with these lines
    use File::Copy qw(copy);
    copy $tempfile => $original;
After the copy, rocks.txt links to the same inode, and both rocks.txt and geology.txt still point to the same data:
    rocks.txt has inode 8666869995
    File has 1 links
    geology.txt has inode 8666869995
    File has 2 links
    /var/folders/jf/7sn23hrs11jcrn2w39wm6k_r0000gn/T/LATmehGV5z has inode 8666869998
    === After rename
    rocks.txt has inode 8666869995
    geology.txt has inode 8666869995
    File has 2 links
    === In rocks.txt
    FRED
    BARNEY
    WILMA
    BETTY
    === In geology.txt
    FRED
    BARNEY
    WILMA
    BETTY