10.4.07

File::Find::Duplicates

Over the Easter weekend, I amassed some 6000-odd photos into a 12 GByte partition. There were some duplicates, and I mused as to how to find them. As usual, I figured someone had had this problem before and solved it, and a Google search turned up a Perl module, File::Find::Duplicates, an add-on to File::Find.

Just copied the example out of the documentation and bingo, off it went and found 181 duplicates within a minute.

Haven't looked into how it works, but it seems that if it finds two files of the same size, it then md5sums them (or uses some other hash) to see if they are the same file. Cool!
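
For the curious, here is a rough sketch of that size-then-hash idea in Python. This is only my guess at the general approach, not the module's actual code: the function names `file_md5` and `find_duplicates` are made up for illustration, and MD5 is assumed as the hash.

```python
import hashlib
import os
from collections import defaultdict


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root whose size and MD5 both match."""
    # Pass 1: group regular files by size (cheap, no file contents read).
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only the files that share a size with another file.
    duplicates = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_md5(path)].append(path)
        duplicates.extend(
            sorted(group) for group in by_hash.values() if len(group) > 1
        )
    return duplicates
```

Calling `find_duplicates` on the top directory of the photo partition would give back one list of paths per set of identical files. Checking sizes first means most files never need to be read at all, which would go some way to explaining why a run over thousands of photos can finish so quickly.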
