Thanks, Oboe.
Sure thing.
If I have the ability to do things in a more scientific way (which isn't always possible), my feeling is that the best way is to formulate the scenario design and scoring system as a model with adjustable parameters and use past data to indicate how those parameters should be set.
The goal of picking parameters is to achieve equal probability of each side winning, where the random variable in the distribution is execution (which varies from frame to frame and scenario to scenario). For example, if a scenario design is balanced and you ran that same setup N times, selecting participants at random each time, you'd get one side winning about N/2 times.
The easiest to balance are symmetric scenarios -- where each side is about the same with the same scoring. That would be like each side having the same number of each type of aircraft, and the aircraft are about equal in capability. That's never exactly the case (as aircraft side to side are different), but some scenarios come close to it.
The hardest to judge are asymmetric scenarios, where each side does not have the same aircraft and does not have the same scoring, like this one, BOG, BoB, Road to Rangoon, DGS, etc., where one side has bombers and the other does not.
The best way to pick these adjustable parameters is to run the scenario a bunch of times and tweak it over time until you see the outcome being balanced (i.e., each side wins about half the times you run it).
For a new design, you can't do that, but you can look at past scenarios that have similarities, and pull indications from those.
One thing you can often do when analyzing things is to figure out some measures that have some indicative power. For example, in a setup where both sides have fighters but only one side has bombers, it is reasonable to assume that ratios could be used: 10 fighters and 5 bombers vs. 10 fighters is probably similar to 15 fighters and 7 bombers vs. 15 fighters, where all numbers in the latter case are about 1.5 times the former case. I look at side1_fighters:side2_fighters and side1_bombers:side1_fighters as dimensionless ratios that encapsulate this.
Looking at past scenarios, I see that things worked decently (i.e., people tended to have fun on both sides, and bombers got to target and back sometimes but not always) when fighters:fighters were 1.0-1.2 and bombers:fighters were 0.33-0.36. If you go outside those ratios, you find the scenarios where bombers died nearly all the time and weren't fun to fly, or where bombers seemed too hard to stop, and the defenders didn't have as much fun.
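To make the ratio idea concrete, here is a minimal Python sketch. The function names and the example plane counts are made up for illustration; only the two ratios and the two ranges come from the text above.

```python
# Ranges that worked decently in past scenarios, as quoted in the post.
FIGHTER_RATIO_RANGE = (1.0, 1.2)    # side1_fighters : side2_fighters
BOMBER_RATIO_RANGE = (0.33, 0.36)   # side1_bombers : side1_fighters


def force_ratios(side1_fighters, side1_bombers, side2_fighters):
    """Return the two dimensionless ratios for a setup where only side 1 has bombers."""
    fighter_ratio = side1_fighters / side2_fighters
    bomber_ratio = side1_bombers / side1_fighters
    return fighter_ratio, bomber_ratio


def in_past_range(side1_fighters, side1_bombers, side2_fighters):
    """True if both ratios fall inside the ranges seen in well-received past scenarios."""
    fighter_ratio, bomber_ratio = force_ratios(side1_fighters, side1_bombers, side2_fighters)
    return (FIGHTER_RATIO_RANGE[0] <= fighter_ratio <= FIGHTER_RATIO_RANGE[1]
            and BOMBER_RATIO_RANGE[0] <= bomber_ratio <= BOMBER_RATIO_RANGE[1])


# Hypothetical setups: the second is the first with every count doubled.
small = force_ratios(33, 11, 30)
large = force_ratios(66, 22, 60)
print(small, large)                 # same ratios for both sizes
print(in_past_range(33, 11, 30))    # inside both ranges
print(in_past_range(30, 15, 30))    # bombers:fighters = 0.5, outside the range
```

Scaling every count by the same factor leaves both ratios unchanged, which is why they are handy for comparing scenarios of different sizes.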
So, that's why I started near those ratios in this Scenario.
Once you have a setup that seems like people will have fun, you have to figure out a scoring system that is balanced. I like very simple scoring systems that are still reasonable, based on how much stuff a side loses (aircraft and ground targets, i.e., kills, number of hangars destroyed, and number of bunkers destroyed in this case). Now the adjustable parameters are how much things are worth. If the design were symmetric, it wouldn't matter what you pick for those parameters as long as they are the same for each side. In this one, they are the same side to side except for points per hangar, because one side has bombers that hit hangars and the other does not -- so that's the tricky parameter to figure out, the one that affects the sides differently. Make points per hangar gigantic, and even if the allies do a bad job and don't protect their bombers well, they win because they got a few hangars here and there. Make points per hangar minuscule, and having bombers is nothing but a drag on the allies, as they are going to lose some bombers, and points per hangar isn't enough to make it worth even having them around.
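Here is a rough sketch of how sensitive the result is to that one parameter. It is a simplification: one point per kill, bunkers left out, and the frame numbers are invented purely for illustration.

```python
def points_differential(kills1, kills2, hangars_destroyed, pts_per_hangar):
    """Side 1 (the side with bombers) minus side 2, at one point per kill.

    Simplifying assumption: bunkers and any other ground targets are ignored.
    """
    return (kills1 - kills2) + hangars_destroyed * pts_per_hangar


# Invented frame: the bomber side is down 10 kills but still got 2 hangars.
poor_escort_frame = dict(kills1=20, kills2=30, hangars_destroyed=2)

for pts in (0.5, 5, 50):
    d = points_differential(pts_per_hangar=pts, **poor_escort_frame)
    outcome = "bomber side wins" if d > 0 else "defenders win" if d < 0 else "draw"
    print(f"pts/hangar = {pts}: differential = {d} ({outcome})")
```

With a tiny value the hangars barely register, with a huge value two hangars outweigh a 10-kill deficit, and somewhere in between the frame comes out even.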
So, the goal is to figure out what pts/hangar should be so that, if execution varied randomly across its distribution from one side having better execution to the other side having better execution, one side would win half the time over that distribution. Or, saying it another way, whichever side had the better execution would win. That's balanced.
One way to do that is, if you know the mean case of how it goes for bombers, you can figure out what pts/hangar needs to be so that the points differential in that mean case is zero. I did such a rough "back of the envelope" type estimate using my recollection of past bomber sorties to come up with 6 pts/hangar. That's not as good as a computer analysis, but it's better than just totally winging it.
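The back-of-the-envelope version amounts to solving (kills1) - (kills2) + (hangars) * (pts/hangar) = 0 for pts/hangar in the mean case. A minimal sketch, with mean values that are invented for illustration and are not the figures behind the 6 pts/hangar estimate:

```python
def break_even_pts_per_hangar(mean_kills1, mean_kills2, mean_hangars):
    """pts/hangar at which the mean-case points differential is zero.

    Side 1 is the side with bombers. Assumes one point per kill.
    """
    if mean_hangars <= 0:
        raise ValueError("no hangars destroyed in the mean case; no finite break-even value")
    return (mean_kills2 - mean_kills1) / mean_hangars


# Invented mean case: bomber side scores 30 kills, gives up 39, and destroys 2 hangars.
pts = break_even_pts_per_hangar(mean_kills1=30, mean_kills2=39, mean_hangars=2.0)
print(pts)
```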
A more accurate way (but a lot more work) is to simulate it on past data and see what the past data says. I can't always pull off that level of analysis, but here I was able to get it done in the time I had available. What I did was look at past asymmetric scenario frames, find the number of kills on each side (easy), and estimate how many hangars would be killed if it took about a full loadout to kill a hangar (harder). For that harder part, I went through each frame, looked at when a bomber pilot's mission started and ended (counted as one mission), and checked whether he destroyed any ground objects during that mission. If any ground objects were destroyed during that mission, I count that as a drop that would potentially kill a hangar. I put in a conversion for how likely a drop is to result in a dead hangar. 1.0 would mean every drop (even the ones where a guy hit only one building at a factory complex) would kill a hangar. 0.9 would mean that 90% of the drops result in a hangar destroyed. I used 90% for my conversion. Then, calculating a points differential d = (side 1 kills) - (side 2 kills) + (side 1 drops) * conversion * (pts/hangar), I work out what pts/hangar needs to be for d = 0 (i.e., for the frame to end in a draw) -- call this "zeroDiffPtsPerHangar". Then I look at, for the whole scenario, what zeroDiffPtsPerHangar was for each frame and what the median of zeroDiffPtsPerHangar was over all frames. That median is where half the frames would have been won by one side and half by the other side if pts/hangar were set to that same value for all frames. So, if we were to go back in time and give that scenario a scoring system of (kills1) - (kills2) + (hangars destroyed) * median, that scenario would have had one side win half the frames, the other side win half the frames, and the scenario would have been overall a draw.
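The per-frame calculation and the median step can be sketched in Python as below. This is not the actual analysis code: the frame numbers are invented, the function name is just the post's "zeroDiffPtsPerHangar" in Python style, and the handling of frames with no drops (infinity if the bomber side is behind on kills, minus infinity if it is ahead) is an assumption.

```python
import math
import statistics

CONVERSION = 0.9  # fraction of drops assumed to kill a hangar, as in the post


def zero_diff_pts_per_hangar(kills1, kills2, drops1, conversion=CONVERSION):
    """pts/hangar that makes d = kills1 - kills2 + drops1 * conversion * pts equal zero."""
    deficit = kills2 - kills1              # what bombing points must make up
    expected_hangars = drops1 * conversion
    if expected_hangars > 0:
        return deficit / expected_hangars
    # Assumption: with no drops, no finite value can change the frame's result.
    if deficit > 0:
        return math.inf
    if deficit < 0:
        return -math.inf
    return 0.0


# Invented frames for one scenario: (side 1 kills, side 2 kills, side 1 drops).
frames = [(20, 29, 3), (25, 34, 2), (18, 36, 5), (22, 40, 0), (30, 33, 4)]

thresholds = [zero_diff_pts_per_hangar(k1, k2, drops) for k1, k2, drops in frames]
median_threshold = statistics.median(thresholds)
print(sorted(thresholds), median_threshold)

# Re-score every frame with pts/hangar set to the median.
diffs = [k1 - k2 + drops * CONVERSION * median_threshold for k1, k2, drops in frames]
bomber_wins = sum(d > 1e-9 for d in diffs)
defender_wins = sum(d < -1e-9 for d in diffs)
print(bomber_wins, defender_wins)
```

Using the median rather than the mean keeps a frame with an infinite value (no successful drops at all) from dominating the result, as long as such frames are fewer than half.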
Now I can do that for a bunch of scenarios and see what all their median zeroDiffPtsPerHangar look like. Some scenarios have low medians, where the side with the bombers did well all the time (BOWL, for example, has a median of 1.4 pts/hangar -- you'd need to set pts/hangar to be 1.4 for the allies to win half the frames of BOWL and the axis to win the other half); some scenarios have high medians, where the bombers got stomped all the time (MM, for example, needs a median of 60 pts/hangar for each side to win half the frames, and Rangoon 2008 has a median of infinity because too often no bombers killed anything at target). BoB isn't a particularly good fit because it has significantly fewer defending fighters than attacking fighters (unlike the other asymmetric scenarios), and so the side with bombers doesn't need any bombing points to win based just on K1 - K2 -- it is a setup that needs a different scoring system. Several scenarios are a good fit for the (kills1 - kills2 + hangars * pts/hangar) style of scoring system, like DGS, BOG, and RTR, which have medians in the range of 3.4 to 4.6. Since the middle of that range is 4, I picked 4 for us as the best estimate of what would result in a draw if execution is equal side to side, or if execution is higher for one side half the time and higher for the other side the other half.