They are just factors. You track them all, which wouldn't be hard. It could be a relatively simple study, with a row for each test and a column for each factor of that test:
Global Test Number | Gouge Description | Gouge Iteration | Type of Steel | Grindstone Type Used | Grindstone Grit Used | Length of use Before Regrind | Kind of Use | Description of effectiveness during use
Part of setting up a viable study would be determining these factors for each test case. There could be dozens of factors. Once you have a specification for each test case, you run them repeatedly according to the study specifications, say 7 tests per gouge/grindstone combination, along with specifications for what to look for and describe when using the gouge in each test case.
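To show how small the bookkeeping really is, here is a rough sketch of generating that test sheet in Python. Everything in it beyond the column names and the "7 tests" figure is my own assumption: the factor levels are placeholders, the helper names (`build_plan`, `to_csv`) are made up for illustration, and I am treating "Gouge Iteration" as the repeat number within each combination, which may not be what a real study would mean by it.

```python
import csv
import io
import itertools

# Column order follows the header row given in the post.
COLUMNS = [
    "Global Test Number", "Gouge Description", "Gouge Iteration",
    "Type of Steel", "Grindstone Type Used", "Grindstone Grit Used",
    "Length of use Before Regrind", "Kind of Use",
    "Description of effectiveness during use",
]

# Hypothetical factor levels, placeholders only; a real study would list its own.
FACTORS = {
    "Gouge Description": ["gouge A", "gouge B"],
    "Type of Steel": ["steel 1", "steel 2"],
    "Grindstone Type Used": ["stone type X", "stone type Y"],
    "Grindstone Grit Used": [80, 180],
}

REPEATS = 7  # "say 7 tests" per combination


def build_plan(factors, repeats):
    """Return one blank test row per factor combination per repeat."""
    names = list(factors)
    rows = []
    test_number = 0
    for combo in itertools.product(*factors.values()):
        for iteration in range(1, repeats + 1):
            test_number += 1
            # Start with every column blank; observed columns get filled in by hand.
            row = dict.fromkeys(COLUMNS, "")
            row["Global Test Number"] = test_number
            row["Gouge Iteration"] = iteration  # assumption: repeat index
            row.update(zip(names, combo))
            rows.append(row)
    return rows


def to_csv(rows):
    """Render the plan as CSV text, ready to print or save."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    plan = build_plan(FACTORS, REPEATS)
    print(len(plan), "tests planned")
    print(to_csv(plan[:3]))
```

With two levels for each of four factors and 7 repeats, that comes to 2 x 2 x 2 x 2 x 7 = 112 rows. Add more factors or levels and the count multiplies quickly, which is the real cost of a study like this, not the tracking.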
I wouldn't say there are too many variables, not at all. In fact, I think they would be quite finite. There would be some subjectivity, in the description of effectiveness, for example. But you can minimize that subjectivity, and biases, with an effective specification for how the tests should be performed, and what aspects of cutting or scraping should be noted during each test case.
I think it's eminently doable. It's more a question of whether anyone is even interested in doing such a thing, and I kind of doubt it. Woodworking is a craft, an art form, with a natural medium, and a LOT of it is about the artistic, creative aspects of working with that natural substance.
As this thread demonstrates, though, there are as many opinions as...well, you know how the saying goes.

In the long run, they are never really helpful, and there is really no point in asking anything, when the final outcome is: Try everything yourself and make your own judgement.