Wednesday, December 17, 2008

Perl glob performance

While working on a project, I noticed that getting the list of filenames from my file server was slow. I was using 'glob' to list the files and then keeping the 10 most recent (the file names are sequential, so a reverse sort puts the newest first). The glob call was taking a long time: the directory held about 4000 files, and business rules meant I could not remove any of them. I asked on the Perl IRC channel, and the recommendation was to use opendir/readdir instead of glob, because glob reportedly does a stat on every file, which makes it slower. Replacing (1) with (2) made this part of my code about 10 times faster. One difference to keep in mind: readdir returns bare file names (including the "." and ".." entries), while glob returns each name with the directory path prepended.

(1)

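use strict;
use warnings;

# glob returns full paths for everything matching the pattern.
my @files = glob "d:/path/to/directory/*";
my $i = 1;
# Names are sequential, so a reverse sort puts the newest first.
foreach my $file (reverse(sort @files)) {
    print $file . "\n";
    $i++;
    # Stop after printing 10 names.
    if ($i > 10) {
        last;
    }
}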


(2)
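use strict;
use warnings;

my $dir = "d:/path/to/directory/";
opendir(my $dh, $dir) or die "Cannot open $dir: $!";
# readdir returns bare names and includes "." and "..", so drop those two.
my @files = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
closedir $dh;
my $i = 1;
# Names are sequential, so a reverse sort puts the newest first.
foreach my $file (reverse(sort @files)) {
    print $file . "\n";
    $i++;
    # Stop after printing 10 names.
    if ($i > 10) {
        last;
    }
}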
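
The readdir approach can also be wrapped in a small function. This is only a sketch, not the code from my project: newest_files is a made-up helper name, and the temporary directory with 15 sample files is there just so the example runs on its own. It assumes, as above, that the file names sort sequentially.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: return up to $count names from $dir, highest name first.
# Assumes the names are sequential, so string order matches age.
sub newest_files {
    my ($dir, $count) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    # readdir returns bare names, including the "." and ".." entries.
    my @names = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;
    my @sorted = reverse sort @names;
    # Keep at most $count entries (fewer if the directory is small).
    my $last = $count < @sorted ? $count - 1 : $#sorted;
    return @sorted[0 .. $last];
}

# Demo data (an assumption for illustration): 15 empty, sequentially named files.
my $demo_dir = tempdir(CLEANUP => 1);
for my $n (1 .. 15) {
    my $path = sprintf('%s/file%03d.txt', $demo_dir, $n);
    open(my $fh, '>', $path) or die "Cannot create $path: $!";
    close $fh;
}

my @newest = newest_files($demo_dir, 10);
print "$_\n" for @newest;
```

Because readdir returns bare names, prepend the directory yourself if the rest of the script needs full paths.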


I found this change useful so I thought I would share it.
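
If you want to check whether the same change helps on your system, a rough comparison can be made with the core Benchmark module. This is a sketch under assumptions: a local temporary directory with 4000 empty files stands in for my file server, and list_with_glob and list_with_readdir are names made up for the example. A local disk will probably not show the same gap as a network share, so treat the numbers as a starting point only.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Temp qw(tempdir);

# Assumption: a throwaway directory stands in for the real server directory.
# Its path must not contain whitespace, because glob splits its pattern
# on whitespace.
my $dir = tempdir(CLEANUP => 1);
for my $n (1 .. 4000) {
    my $path = sprintf('%s/file%05d.txt', $dir, $n);
    open(my $fh, '>', $path) or die "Cannot create $path: $!";
    close $fh;
}

# Approach (1): list the directory with glob; returns the number of entries.
sub list_with_glob {
    my @files = glob "$dir/*";
    return scalar @files;
}

# Approach (2): list the directory with opendir/readdir, skipping "." and "..".
sub list_with_readdir {
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @files = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;
    return scalar @files;
}

# Run each approach for at least one CPU second and print a comparison table.
cmpthese(-1, {
    glob    => \&list_with_glob,
    readdir => \&list_with_readdir,
});
```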
