Comparison parameters
Fields description
- Comparison type:
This field determines the way Mp3 Filter compares each file.
- Dash separated fields:
Mp3 Filter will consider each field in each
filename separately. The minimum percentage threshold must be met in every field.
If "Field order doesn't matter" is checked, "The Artist - Its Song" would match
"Its Song - The Artist" and even "Its Song". If the box is unchecked, neither would match.
Additionally, if the box is unchecked, songs must have exactly the same number of fields to match.
- One field:
Fields are ignored. The filename is treated as one big field. This option will give more
results than separated fields, but is more likely to give false positives.
- Use Tag:
Instead of comparing the filenames, Mp3 Filter will compare their ID3 tags.
This comparison type considers fields. Thus, Artist AND Title (and, optionally, Album)
must each meet the minimum percentage for the 2 songs to match.
- File content:
This comparison type ignores filenames and tags. It compares the content of the file itself.
It uses an MD5 checksum of a 64 KB part of every file to do so. Why only 64 KB, you might ask?
Because if 64 KB of two files match, it practically guarantees that the files are the same.
Besides, if one of the files is incomplete, the 2 files will still match.
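
To illustrate the File content comparison, here is a minimal Python sketch of a partial MD5 checksum. It is not Mp3 Filter's actual code. The text above does not say which 64 KB part of the file is used, so this sketch assumes the first 64 KB (65536 bytes). The names partial_md5 and same_content are made up for this example.

```python
import hashlib

CHUNK_SIZE = 64 * 1024  # 64 KB, the size mentioned above


def partial_md5(path, offset=0, size=CHUNK_SIZE):
    """Return the MD5 hex digest of `size` bytes of the file, starting at `offset`.

    Assumption: offset=0 (the start of the file). The real program may
    read a different part of the file.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        return hashlib.md5(f.read(size)).hexdigest()


def same_content(path_a, path_b):
    """Two files are treated as duplicates when their partial checksums are equal."""
    return partial_md5(path_a) == partial_md5(path_b)
```

Under that assumption, a truncated download still matches the complete file as long as both contain at least the 64 KB that gets hashed.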
- Minimum matching %:
This threshold determines the proportion of words that must match for
2 files to be considered duplicates. A good example of how this
works is described in the glossary.
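
As a rough illustration of a word-matching percentage, here is a Python sketch. The exact formula Mp3 Filter uses is the one described in the glossary and is not reproduced here. This sketch assumes the percentage is the number of shared words divided by the word count of the longer name, and that the comparison ignores case. The names match_percentage and is_duplicate are hypothetical.

```python
def match_percentage(name_a, name_b):
    """Percentage of words shared by two names (assumed formula, see above)."""
    words_a = name_a.lower().split()
    words_b = name_b.lower().split()
    if not words_a or not words_b:
        return 0.0
    remaining = list(words_b)
    shared = 0
    for word in words_a:
        if word in remaining:
            remaining.remove(word)  # each word can only be matched once
            shared += 1
    return 100.0 * shared / max(len(words_a), len(words_b))


def is_duplicate(name_a, name_b, min_percent):
    """True when the two names reach the minimum matching percentage."""
    return match_percentage(name_a, name_b) >= min_percent
```

With this formula, "the artist its song" and "the artist its song live" share 4 words out of 5, so they would match at a 75% threshold but not at 90%.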
- Similar words threshold:
I advise you to be careful with this option, because it can result in a lot of
false positives. It also slows down the scan process a lot. However, it
can also help you find duplicates that just couldn't be found otherwise. This parameter determines the
number of letters that can be different for 2 words to match. If this parameter is greater than zero,
the "Levenshtein distance" (google it for more info) of the 2 words must be less than or equal to this
parameter (the weights of insertion, substitution and deletion are the same in my implementation). What
the heck does this mean? It means that if you set this parameter to 1, the word "chair" would
match "chairs", but would also match "char", "chir", "cuair", etc.
That's why you should be careful with that option. Well, try it! You'll see.
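
For the curious, here is a short Python sketch of the Levenshtein distance with equal weights for insertion, substitution and deletion, as described above. It is a textbook version, not Mp3 Filter's own code. The function names levenshtein and words_match are made up for this example, and requiring an exact match when the threshold is zero is an assumption.

```python
def levenshtein(a, b):
    """Edit distance where insertion, deletion and substitution each cost 1."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or exact match)
            ))
        previous = current
    return previous[-1]


def words_match(a, b, threshold):
    """Assumed rule: exact match when threshold is 0, else distance <= threshold."""
    if threshold <= 0:
        return a == b
    return levenshtein(a, b) <= threshold
```

With a threshold of 1, words_match("chair", "chairs", 1) is true, and so are the comparisons with "char", "chir" and "cuair". "chair" and "chart" are 2 edits apart, so they only match with a threshold of 2 or more.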