Sorry moot, my fault. I didn't mean to imply you said anything about the FM being wrong at all (quite the contrary). My fail for not being clear! Let me try again.
1) For purposes of basic airplane comparisons I have no problem with your approach of using the loadouts AH allows you to configure (e.g. 25, 50, 75, 100%) to do the comparisons with. Personally, you would need to do more to convince me that the spit5/seafire chart being based on one of these loadouts is any more probable than it being based on any other weight, but this is a minor point. I'd be perfectly fine if they re-did all the AH charts using a consistent configured loadout like this.
2) However, if we are to judge a flight model, then assuming the AH chart is based on some AH-configured weight just because that configuration is more probable misses the point of what it means to judge the FM against a law of physics. Please let me know if I need to re-clarify my reasons for thinking this.
3) Why did HTC use equal weights for the spit5/seafire (& 190A-8/F-8) chart? Only they know, but I strongly think it was to check the FM. Previously I said there are 3 possible valid outcomes of the "ROC law".
Outcome A: ROC_Seafire < ROC_SpitV
Outcome B: ROC_Seafire = ROC_SpitV
Outcome C: ROC_Seafire > ROC_SpitV
If HTC is treating them as essentially the same plane with differences only in weight, then their performance should be essentially the same when their weights are equal. They are testing their FM against one of the valid outcomes of the "ROC law": when the spit5 and seafire are at equal weights the law requires ROC_Seafire = ROC_SpitV, so the check is whether the FM violates that. If the performance is different when the weights are the same, then something is wrong. Equalizing the weights is just a really easy way to do this for this case.
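To make the equal-weight check concrete, here is a rough sketch using the textbook excess-power relation ROC = (P_avail - D*V) / W. This is not HTC's flight model. The function name, the drag polar and every number in it are placeholder assumptions of mine, picked only to show the shape of the argument.

```python
def rate_of_climb(weight_n, speed_ms, power_avail_w,
                  rho=1.225, wing_area_m2=22.5, cd0=0.022, k=0.06):
    """Steady climb rate in m/s from excess power: ROC = (P_avail - D*V) / W.

    All default values are illustrative placeholders, not real Spitfire data.
    """
    q = 0.5 * rho * speed_ms ** 2            # dynamic pressure
    cl = weight_n / (q * wing_area_m2)       # lift coefficient needed to carry the weight
    drag_n = q * wing_area_m2 * (cd0 + k * cl ** 2)  # parasite + induced drag
    return (power_avail_w - drag_n * speed_ms) / weight_n


# Same airframe and engine for both planes (the assumption being tested).
AIRFRAME = dict(speed_ms=80.0, power_avail_w=800_000.0)

roc_spit = rate_of_climb(weight_n=30_000.0, **AIRFRAME)
roc_seafire_equal = rate_of_climb(weight_n=30_000.0, **AIRFRAME)    # Outcome B
roc_seafire_heavier = rate_of_climb(weight_n=32_000.0, **AIRFRAME)  # Outcome A
roc_seafire_lighter = rate_of_climb(weight_n=28_000.0, **AIRFRAME)  # Outcome C

print(roc_spit, roc_seafire_equal, roc_seafire_heavier, roc_seafire_lighter)
```

With identical airframe inputs, weight is the only thing that can separate the two climb rates, so equal weights must give equal ROC. Any FM that treats the two as the same plane and still shows a difference at equal weight has a problem somewhere.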
The laws of physics don't care one iota about the probability that a plane might weigh such-and-such. It doesn't matter. They'll render a judgement for the entirety of valid outcomes within the envelope of that law.