I'm working on a decision-tree based algorithm where we consider expansions to our tree in chunks. We take all of the possible expansions and put them into chunks, ordered by a (decidedly non-perfect) estimate of how good each expansion will be. To accept an expansion, the p-value for its being superior to the previous tree must beat some threshold α, which has been corrected via Bonferroni/Sidak.

So say we've evaluated the first of N expansion chunks. We've found an expansion that beats our threshold, but only just barely; we think there's a good chance that a better expansion is out there. On the other hand, we don't care so much about getting the optimum (as opposed to a near-optimum) that we want to evaluate all of the expansions every time. Given that we have some kind of probability distribution over where the best expansion lies in the space, how do we tell when to stop?

Intuitively, it feels like we should have a particularly strict threshold for stopping after evaluating only the first chunk, a slightly looser threshold after looking at the first two, and so on, gradually relaxing to α once we've evaluated all of the chunks. Is there a statistically justified way to accomplish that?

(This is kind of like the "secretary problem," but it's actually rather different, because we can go back to old candidates that we didn't immediately accept. I've briefly looked for, but haven't found, an equivalent of optimal stopping theory in which you can return to previous options.)
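For what it's worth, the intuition of a strict early threshold relaxing toward α is close to what alpha-spending functions in group-sequential testing formalize. A minimal sketch of such a schedule (a naive power-family shape, not a calibrated Lan-DeMets boundary; `stopping_thresholds` is a hypothetical helper name, not anything from the algorithm above):

```python
def stopping_thresholds(alpha, n_chunks, gamma=2.0):
    """Spending-style schedule of p-value thresholds.

    After evaluating k of n_chunks chunks, stop if the best expansion's
    p-value beats alpha * (k / n_chunks) ** gamma.  With gamma > 1 the
    early thresholds are much stricter than alpha (so stopping after one
    chunk is hard); after all chunks the threshold is exactly alpha.
    """
    return [alpha * (k / n_chunks) ** gamma for k in range(1, n_chunks + 1)]
```

With gamma = 1 the budget is spent linearly; larger gamma mimics the O'Brien-Fleming flavour of being very conservative at early looks. Note that these thresholds are not jointly calibrated to an overall type-I error of α; that calibration is what the group-sequential machinery provides.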
This sounds like a search problem. You can see this noisy goodness value as a heuristic and use a beam search. Of course, this only trades one problem (when to stop searching) for another (choosing the beam size), but the beam size is usually estimated by cross-validation or on held-out data.

The goodness-value-as-heuristic analogy seems good, but I don't see a natural way to represent the expansion space as a tree, as beam search requires. We'd either have every expansion connected to every other expansion (which would just make the beam size a fixed stopping point), or we'd connect them based on the form of the expansion (which doesn't necessarily make sense, since expansions with very similar forms can have very different heuristic values and very different performances, depending on the situation).
(Jul 21 '10 at 18:08)
Dougal Sutherland
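For reference, the beam search being proposed can be sketched generically; a minimal version, where `expand` (a successor generator) and `score` (the noisy goodness heuristic) are hypothetical problem-specific callables:

```python
import heapq

def beam_search(start, expand, score, beam_size, max_depth):
    """Keep only the beam_size best candidates at each depth, tracking the
    best-scoring node seen anywhere; stops when nothing expands."""
    beam = [start]
    best = start
    for _ in range(max_depth):
        children = [c for node in beam for c in expand(node)]
        if not children:
            break
        # nlargest returns candidates sorted best-first
        beam = heapq.nlargest(beam_size, children, key=score)
        if score(beam[0]) > score(best):
            best = beam[0]
    return best
```

The difficulty raised above shows up entirely in `expand`: if every expansion is a successor of every other, the beam is just a fixed cutoff, which is the original stopping problem in disguise.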
Maybe it wasn't clear, but I was actually thinking of a backtracking strategy, and then I think it makes sense (you walk down possible trees and come back up), though it might be too computationally expensive.

I understand the analogy with the secretary problem, but wouldn't it be cheaper to just look at all of them? If you want the maximum, and you have access to the distribution, why not compute (via bootstrap/sampling) some statistic such as how far the maximum "goodness" is expected to be from the average "goodness", and accept a value if it's close enough to this estimated maximum? See this question http://metaoptimize.com/qa/questions/26/how-do-i-estimate-the-min-of-a-sample-from-a-wacky-distribution for more discussion of estimating extremal values.
(Jul 21 '10 at 18:20)
Alexandre Passos ♦
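The bootstrap suggestion above might look something like this sketch, assuming the goodness values seen so far are treated as an empirical distribution for the whole expansion space (the function names are hypothetical):

```python
import random

def expected_max_goodness(observed, n_total, n_boot=2000, seed=0):
    """Bootstrap estimate of the expected maximum goodness over n_total
    expansions, resampling from the goodness values observed so far."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_boot):
        total += max(rng.choice(observed) for _ in range(n_total))
    return total / n_boot

def close_enough(candidate_goodness, observed, n_total, tol):
    """Accept a candidate if it is within tol of the estimated maximum."""
    return candidate_goodness >= expected_max_goodness(observed, n_total) - tol
```

The obvious caveat is that the bootstrap can never produce a value above the observed maximum, so this tends to be optimistic about how close you already are; the linked question discusses better extreme-value estimates.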
Can you characterise the chunks in any way? For example, is it reasonable to assume they are i.i.d.? I think you need to consider the exploration/exploitation tradeoff to solve this optimally (which is what the secretary problem is about). There is work on reinforcement learning for search (though I don't know of any that makes optimality guarantees). You could also frame your problem as a bandit problem, for which you can develop an optimal solution.
(Jul 21 '10 at 23:11)
Noel Welsh
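A sketch of the bandit framing mentioned above, treating each chunk as an arm whose pulls are noisy goodness evaluations (UCB1 here; `chunk_samplers` is a hypothetical list of callables, each drawing one expansion's goodness from its chunk):

```python
import math
import random

def ucb1_best_expansion(chunk_samplers, n_rounds, seed=0):
    """Run UCB1 over the chunks for n_rounds evaluations; return the best
    goodness seen and the per-chunk evaluation counts."""
    rng = random.Random(seed)
    n = len(chunk_samplers)
    counts = [0] * n
    sums = [0.0] * n
    best = float("-inf")
    for t in range(1, n_rounds + 1):
        if t <= n:  # play every arm once before using the UCB formula
            arm = t - 1
        else:
            arm = max(range(n), key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2.0 * math.log(t) / counts[i]))
        reward = chunk_samplers[arm](rng)
        counts[arm] += 1
        sums[arm] += reward
        best = max(best, reward)
    return best, counts
```

This explores all chunks while concentrating evaluations on the promising ones. Note, though, that UCB1 optimizes the mean goodness per chunk, whereas the original problem cares about the maximum; so-called extreme-bandit variants would be a closer fit.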