A preliminary cross-check done already
Computing resources available in clouds are diverse in their performance, price and lifetime. To save cost, I had to continuously seek inexpensive resources and create instances on a daily basis. The total number of instances ever created exceeded five hundred. I did the following checks as a minimal verification of the resources:
- Every time a new instance was created, a benchmark program that counts a small subset was executed, and the run time and the result were recorded. If the answer was wrong, that resource was never used. I did see a few such resources, and it is noteworthy that some versions of the Docker environment with multiple RTX-4090s produced wrong answers due to a failure of inter-GPU atomic transactions. I therefore avoided having multiple 4090s work together and used them separately instead.
- As a postmortem verification, an independent re-count was done for every instance at a sampling rate of once per day. If any wrong result was found, all results produced by that resource were considered unreliable.
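The two checks above could be sketched roughly as follows. This is only an illustration of the policy, not the code actually used: the `Instance` class, the function names `admit`, `audit` and `usable_results`, and the way results are keyed by a work-unit id are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical record for one cloud instance."""
    name: str
    accepted: bool = False   # passed the initial benchmark
    trusted: bool = True     # no failed postmortem sample so far
    results: dict = field(default_factory=dict)  # work-unit id -> count


def admit(instance, benchmark_count, expected_count):
    """Accept a new instance only if its benchmark count is correct."""
    instance.accepted = (benchmark_count == expected_count)
    return instance.accepted


def audit(instance, unit_id, recount):
    """Compare one sampled work unit with an independent re-count.

    A single mismatch marks the whole instance as untrusted,
    and the flag is never reset by later successful samples.
    """
    if instance.results[unit_id] != recount:
        instance.trusted = False
    return instance.trusted


def usable_results(instance):
    """Return the results only if the instance passed both checks."""
    if instance.accepted and instance.trusted:
        return dict(instance.results)
    return {}
```

Note that the sketch discards every result of an instance after one failed sample, which mirrors the rule stated in the second item rather than trying to isolate the faulty work units.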
The full cross-check in progress
If every subtotal is calculated twice and the two results match, the counts should be considered correct (provided the code is correct). The re-counting is in progress and was 8% complete as of 2023/09/05.
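As a minimal sketch of such a comparison, assume the subtotals of the original run and of the re-count are each stored in a dictionary keyed by a subtotal id. The storage layout and the function name `cross_check` are hypothetical and chosen only for illustration.

```python
def cross_check(first, second):
    """Compare two independent sets of subtotals.

    first:  subtotal id -> count from the original run (complete)
    second: subtotal id -> count from the re-count (may be partial)

    Returns (confirmed ids, mismatched ids, fraction re-counted).
    """
    confirmed, mismatched = [], []
    for key, count in first.items():
        if key not in second:
            continue  # not re-counted yet
        if second[key] == count:
            confirmed.append(key)
        else:
            mismatched.append(key)
    checked = len(confirmed) + len(mismatched)
    progress = checked / len(first) if first else 0.0
    return confirmed, mismatched, progress
```

A mismatch only shows that at least one of the two runs is wrong. Deciding which one is wrong requires a further re-count of that subtotal.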
Errors Found (updated on 2023.09.05)
During the thorough cross-check, it was discovered that a portion of the results generated by one instance was incorrect. The instance ran with two RTX-4090s for 60 hours and generated 3,771 sub-subtotals. Of the 3,771 sub-subtotals, only 11 were incorrect, and these errors occurred sporadically over a period of 5.5 hours. All incorrect results were generated by only one of the two RTX-4090s. It is unlikely that these errors are due to logical flaws or coding mistakes. Hardware defects or instability are the most probable causes.
Correcting these errors increased the count by 864 (36×24).
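The kind of summary given above (which device produced the errors, over what time span, and the net correction) could be derived from a list of mismatching records along the following lines. The record fields `device`, `hour`, `original` and `recount` and the function name `summarize_errors` are hypothetical, and the sketch does not reproduce the actual data.

```python
from collections import Counter


def summarize_errors(mismatches):
    """Summarize mismatching sub-subtotals.

    Each record is a dict with hypothetical keys:
      'device'   - id of the GPU that produced the original value
      'hour'     - time of the original computation, in hours
      'original' - count from the original run
      'recount'  - count from the verified re-count

    Returns (errors per device, time span in hours, net correction).
    """
    per_device = Counter(m["device"] for m in mismatches)
    hours = [m["hour"] for m in mismatches]
    span = max(hours) - min(hours) if hours else 0.0
    correction = sum(m["recount"] - m["original"] for m in mismatches)
    return per_device, span, correction
```

If all errors fall on a single device within a short time window, as was the case here, that pattern points toward a hardware problem rather than a flaw in the code.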
While these errors have not damaged my confidence in the logic and the code I used, it is possible that other errors of a similar nature are still contained in the result. Therefore, the results should be considered unconfirmed until the thorough cross-check is completed.
