Speed Test Guide (1.x)
If you need a duplicate file finder, search speed is probably your primary concern. This guide shows how the test environment affects the results and helps you judge the software fairly.
First, this kind of software traverses all the files in the directories you select. Normally, the directory and file information is read from the hard disk during the first traversal and then cached in memory, so subsequent traversals read it directly from memory. This can be unfair to whichever program is tested first. To avoid that, you can (1) reboot the system after every test, or (2) traverse the directories you want to search before testing. For example, in Windows Explorer, right-click the directory and choose Properties from the pop-up menu; Explorer will then traverse the selected directory.
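Option (2) can also be scripted. The following is only a minimal sketch in Python, not part of CloneSensor: the function name warm_cache is made up for this guide, and it assumes that walking the tree and reading each file's metadata once is enough for the operating system to cache the directory information.

```python
import os
import sys


def warm_cache(root):
    """Walk `root` once and stat every file, so that the directory and
    file information has already been read from disk before any test.
    Returns (number of subdirectories seen, number of files seen)."""
    dir_count = 0
    file_count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        dir_count += len(dirnames)
        for name in filenames:
            try:
                os.stat(os.path.join(dirpath, name))
                file_count += 1
            except OSError:
                # Skip files that vanish or cannot be accessed.
                pass
    return dir_count, file_count


if __name__ == "__main__" and len(sys.argv) > 1:
    dirs, files = warm_cache(sys.argv[1])
    print(f"Traversed {dirs} directories and {files} files")
```

Run it once on the test directory before starting each program, so that every program starts from the same warmed-up state.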
Second, search time does not increase linearly with the number of files. For example, a simple but "stupid" algorithm finds duplicates by comparing each file to all the others. With that approach, 100 files need only about 10,000 comparisons, but 1,000,000 files need about 1,000,000,000,000 comparisons! That is a trillion; maybe it will never finish :-). So optimizing the search algorithm is the major problem for this kind of software. Therefore, if you want to see the real capability of a duplicate file finder, make the test sample large enough, and you may discover huge differences in search speed among the programs.
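To make the growth concrete, here is a small Python sketch. It is an illustration only and does not describe CloneSensor's actual algorithm, which this guide does not disclose. The function names are hypothetical. The second function shows one widely used way to cut down the work: only files of the same size can be identical, so only those need to be compared with each other.

```python
import os
from collections import defaultdict


def comparisons_all_against_all(n):
    """Number of comparisons when each of n files is checked against
    all n files, as in the naive approach described above."""
    return n * n


def same_size_groups(paths):
    """Group files by size and keep only groups with two or more files.
    Files with a unique size cannot have a duplicate, so they can be
    skipped without reading their contents."""
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)
    return [group for group in by_size.values() if len(group) > 1]


if __name__ == "__main__":
    for n in (100, 1_000_000):
        print(n, "files:", comparisons_all_against_all(n), "comparisons")
```

Files inside each remaining group still have to be compared by content to confirm that they are true duplicates.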
Third, you should temporarily disable your real-time virus monitor before you start searching for duplicate files, because the monitor intercepts every file read and scans it for viruses, which slows the search down greatly.
Here is an example to help you choose a suitable sample.

System: CPU: AMD Sempron 2500+, Hard disk: Seagate 7200RPM 250GB
Environment: Just after boot, with the real-time virus monitor disabled.
Sample: offline image websites, 6,455 directories, 199,204 files, 8.48GB in total, including 14,537 duplicate files, 665MB in total.
Memory Usage: I used IceSword 1.12 to measure CloneSensor's memory usage. It showed that CloneSensor consumed 5.5MB of memory after startup and reached a peak of 62MB by the time the search finished. This means that you can search up to 500,000 files without using virtual memory if you have about 142MB of free physical memory. (Any software slows down greatly when it uses virtual memory.)
Result: the search results came out in less than 10 minutes, at about 350 files per second. Processing 350 files every second?! So crazy! Is it magic? Test it yourself! :) CloneSensor's search algorithm has gone through four rounds of evolution. With its specially optimized algorithm, CloneSensor keeps its high speed even as the number of files grows, and it still delivers the results of a true byte-by-byte comparison. You can validate the results with any duplicate file finder that claims true byte-by-byte comparison.
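The figures above can be cross-checked with a little arithmetic. The Python sketch below uses only the numbers from this example. It assumes that the search memory grows linearly with the number of files, which is a simplification rather than a measured fact. The function names are hypothetical.

```python
def search_minutes(file_count, files_per_second):
    """Time needed to process file_count files at a steady rate."""
    return file_count / files_per_second / 60


def extra_memory_mb(startup_mb, peak_mb, sample_files, target_files):
    """Scale the memory added during the search (peak minus startup)
    from the sample size to a target number of files, assuming
    linear growth."""
    return (peak_mb - startup_mb) * target_files / sample_files


if __name__ == "__main__":
    minutes = search_minutes(199_204, 350)
    memory = extra_memory_mb(5.5, 62.0, 199_204, 500_000)
    print(f"Search time: {minutes:.1f} minutes")
    print(f"Estimated extra memory for 500,000 files: {memory:.0f}MB")
```

Under these assumptions the search time stays under 10 minutes and the estimate for 500,000 files comes out at about 142MB, in line with the figures quoted in the example.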
