Mp3 Filter Hints and Tips

Choosing the right comparison type

So you've tried a couple of things with Mp3 Filter and you're quite comfortable with it, but you still don't know which type of comparison you should use. In this section, I'll list the pros and cons of each type, so you can make up your own mind.

  • Dash separated fields: This type is the one I use most of the time. It is the fastest type, and if your matching percentage is 80 or over, false positives are rare. However, improperly named files can cause problems, and can even cause false duplicates (I have a couple of tracks with "Audiotrack" in the filename here). Also, if the fields are not in the same order, the files will never match ("My Artist - My Title" will not match "My Title - My Artist").
    • Pros: Lightning fast, quite secure.
    • Cons: Can't detect dupes if filenames are incorrect.
  • One field: This type is the one most likely to generate false positives, but it is sometimes necessary for finding very hard-to-find duplicates. Using this type at 50% and raising the similar word threshold parameter to 1 is a good way to find obscure matches. You must realise that you will have to deal with a lot of false duplicates, though.
    • Pros: Very fast, can find obscure results.
    • Cons: A high rate of false duplicates.
  • ID3 tags: If you know that your files have valid ID3 tags, this is probably the type you should use. The results are as secure as with dash separated fields (if not more), and since it knows where the title and artist fields are, it can theoretically find more duplicates. However, this comparison type is significantly slower than the filename comparison types, because it has to open every file and read its tag.
    • Pros: More results, and more secure results.
    • Cons: Doesn't work if your files have no tags. Significantly slower.
  • File content: This method gives 99.99% secure results. However, it only works if the files are exactly the same. Well, in fact, it only compares 64 KB of each file, so a complete mp3 would match an incomplete copy of the same file.
    • Pros: The most secure results you'll ever see.
    • Cons: Slow. Doesn't work as soon as there is a difference.
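
To see why a complete mp3 can match an incomplete one under the file content type, here is a rough Python sketch of a partial-content comparison. It is not Mp3 Filter's actual code: the function names are made up, and I am assuming that the 64 KB in question is the first 64 KB of the file, which is only a guess.

```python
import hashlib

# Assumption: the 64 KB that gets compared is the START of the file.
CHUNK_SIZE = 64 * 1024


def partial_digest(path, size=CHUNK_SIZE):
    """Return a hash of the first `size` bytes of the file (hypothetical helper)."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read(size)).hexdigest()


def same_leading_content(path_a, path_b, size=CHUNK_SIZE):
    """True when both files start with exactly the same `size` bytes."""
    return partial_digest(path_a, size) == partial_digest(path_b, size)
```

With a check like this, a truncated download that still contains the first 64 KB is indistinguishable from the full file, while a single changed byte inside that range (a re-encode, for example) makes the match fail.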

Properly cleaning a collection

Start strict: For your first scan, use strict parameters. That way, when you refine your search later, you will not have obvious duplicates mixed in with false positives. This reduces the risk of accidentally deleting files you didn't intend to. For your first scan, I suggest an ID3 tag scan at 90-100%. If you have a lot of free time, you can also perform a file content scan before this.

Refine your search: Now that the obvious duplicates are out, you can start the actual hunting. Lower your parameters. Personally, I would make one run at 70% with ID3 tags and another at 70% with dash separated fields. You will have to be careful with these results, because you will get a couple of false positives, but most of the results should be true duplicates.

Seek and destroy: Lower your parameters again, and set the similar word threshold to 1. You'll likely get a hell of a lot of results, but the goal here is to find the true duplicates in a sea of false ones. Once you're done with this, you can consider your collection pretty clean.
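
The three passes above can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the similarity score comes from Python's difflib and is only a stand-in for Mp3 Filter's matching percentage, the function names are made up, and the similar word threshold is not modelled at all. The point is just the order of operations: strict pass first, remove what you confirmed, then loosen.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Stand-in score from 0 to 100. NOT Mp3 Filter's real algorithm."""
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_pairs(names, threshold):
    """Return every pair of names scoring at or above the threshold."""
    return [(a, b) for a, b in combinations(names, 2)
            if similarity(a, b) >= threshold]


def staged_scan(names, thresholds, confirm):
    """Run passes from strict to loose, dropping confirmed dupes in between.

    `confirm(a, b)` stands for you looking at the pair and deciding
    whether it is a true duplicate.
    """
    remaining = list(names)
    report = {}
    for threshold in thresholds:
        confirmed = [(a, b) for a, b in find_pairs(remaining, threshold)
                     if confirm(a, b)]
        report[threshold] = confirmed
        # The second file of each confirmed pair is treated as deleted.
        removed = {b for _, b in confirmed}
        remaining = [name for name in remaining if name not in removed]
    return report, remaining
```

Calling staged_scan(names, (90, 70, 50), confirm) means the 70% and 50% passes only ever see files that survived the stricter pass, so the obvious duplicates are no longer there to clutter the looser results.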

Managing a music collection

Create a master collection: You should have a directory that is organized by artist and contains no duplicates. If you don't have one, use Mp3 Filter to purge your duplicate files, and then organize your files using an mp3 renamer (most tag editors have a rename feature). Once you have it, always set that directory to reference priority.

Never download directly into your master collection: Once you have a directory where you know there are no duplicates, don't mess it up! Download new music files into a separate directory. Once in a while, run a duplicate scan between your master collection (at reference priority, of course) and your download directory, delete the duplicates, reorganize the remaining files by artist, and integrate everything into your master collection.
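
As a rough illustration of that routine, here is a Python sketch that checks a download directory against a master collection. It is built on assumptions: I take reference priority to mean that files in the master directory are never proposed for deletion, the function names are made up, and an exact match on a tidied-up filename stands in for Mp3 Filter's fuzzy comparison.

```python
from pathlib import Path


def normalise(path):
    """Lower-case the file name (without extension) and collapse whitespace."""
    return " ".join(path.stem.lower().split())


def dupes_in_downloads(master_dir, download_dir):
    """List download files whose tidied-up name already exists in the master.

    Only files from download_dir are ever returned; the master directory
    is treated as the reference and is never touched.
    """
    known = {normalise(p) for p in Path(master_dir).rglob("*.mp3")}
    return sorted(p for p in Path(download_dir).rglob("*.mp3")
                  if normalise(p) in known)
```

Whatever dupes_in_downloads("Master", "Downloads") returns would be the candidates to delete from the download directory; the files that are left are the ones to rename and move into the master collection.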