Response to Reviews of "Assessing Conformer Energies"
In the revised manuscript, we have gone through the text thoroughly to improve the quality of the writing and fix a few minor typos.

The key omission in the paper is any attempt to provide confidence in the deductions made about the differences in accuracy between the methods compared. Confidence intervals on each of the estimators, estimates of success rates and their errors, and pairwise hypothesis tests, at a minimum, must be added before publication. With this data in hand the new version ... and the new version can make quantitative estimates of the differences between the methods. See comments below.

Detailed Report

When superiority of one set of results is asserted over another it is simply not acceptable to report raw performance numbers and state that the biggest/smallest is the best. All the results in this paper are sample results and therefore have an associated error, which must be reported. Given this error, when two methods are compared, measures of the significance and the impact of the difference must be reported (hypothesis tests, effect sizes, confidence intervals, etc.).

Across computational chemistry, there are countless benchmark studies similar to ours, often with far fewer data points, that compare multiple methods without discussing statistical significance or confidence intervals. Indeed, we are unaware of other published papers on quantum chemistry, or on machine learning of quantum chemical properties, that report confidence intervals, error bars, etc., as requested by the reviewer. Nevertheless, we agree with the reviewer, provide such metrics in the revised manuscript, and hope the field will broadly adopt such practices. We note that little of our discussion or conclusions required revision
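
To illustrate the kind of analysis the reviewer requests, the following is a minimal sketch of a percentile bootstrap confidence interval on a mean absolute error (MAE) and on the paired MAE difference between two methods evaluated on the same conformers. It is not necessarily the procedure used in the revised manuscript: the function names (`mean_abs_error`, `bootstrap_ci`, `paired_mae_difference_ci`) are hypothetical, the choice of a percentile bootstrap is an assumption, and the input data are synthetic.

```python
import random
import statistics


def mean_abs_error(errors):
    """Mean absolute error of a sequence of signed errors."""
    return statistics.fmean(abs(e) for e in errors)


def bootstrap_ci(values, stat, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(values).

    Resamples `values` with replacement, recomputes `stat` each time,
    and returns the alpha/2 and 1 - alpha/2 percentiles of the results.
    """
    rng = random.Random(seed)
    n = len(values)
    estimates = sorted(
        stat([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lo_idx = max(0, int((alpha / 2) * n_resamples))
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples) - 1)
    return estimates[lo_idx], estimates[hi_idx]


def paired_mae_difference_ci(errors_a, errors_b, **kwargs):
    """Point estimate and bootstrap CI for MAE(A) - MAE(B).

    Both methods are assumed to be evaluated on the same conformers in
    the same order, so the per-conformer differences are resampled as pairs.
    """
    diffs = [abs(a) - abs(b) for a, b in zip(errors_a, errors_b)]
    return statistics.fmean(diffs), bootstrap_ci(diffs, statistics.fmean, **kwargs)


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic signed errors for two hypothetical methods on 200 conformers.
    errors_a = [rng.gauss(0.0, 1.0) for _ in range(200)]
    errors_b = [rng.gauss(0.0, 2.0) for _ in range(200)]

    print("MAE(A):", mean_abs_error(errors_a), bootstrap_ci(errors_a, mean_abs_error))
    print("MAE(B):", mean_abs_error(errors_b), bootstrap_ci(errors_b, mean_abs_error))
    print("MAE(A) - MAE(B):", paired_mae_difference_ci(errors_a, errors_b))
```

If the interval for the paired difference excludes zero, the two methods differ at roughly the chosen confidence level. Resampling in pairs is a deliberate choice here, since errors from different methods on the same conformer are typically correlated.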