Thinking about simulated annealing

October 31st, 2009
machinelearning, programming, tech
I've been reading Duda, Hart, and Stork's "Pattern Classification". Much more detail than we got to in swat's AI class or the following seminar. Some parts are tough going, and I'm definitely not working out every detail, but I think it's helpful.

I just finished reading about simulated annealing, and I think I understand now why it's much more generally applicable than simple random restart hill climbing. In the latter you pick a random point in the solution space, consider all the neighboring points, move to the best one, repeat until you're better than your neighbors. Then repeat the whole process, remembering your best places so far, until you decide to stop. People always talk about the problem with this as "getting stuck in local minima" (with the "down is good" analogy of "gradient descent" instead of the "local maxima" that the "up is good' analogy of "hill climbing" would give) but I didn't really understand. I think now I do.

Consider trying to find the lowest point in a parking lot. The lot is pretty smooth, and pretty consistent, so you could put a soccer ball in the middle and see where it stopped. But what if you chose a ball bearing? The parking lot is smooth on a soccer ball scale, but not on a ball bearing scale. So the bearing gets caught in a low bit (local minimum) while the soccer ball rolls all the way down (global minimum).

In simulated annealing, we use a ball bearing but shake the parking lot so it has a decent chance of getting out of these little pockets and making it's way all the bottom of the lot. So that we don't keep bouncing around for ever, we start off shaking the lot a large amount, and over time decrease our shaking exponentially until we're not shaking at all.

Because simulated annealing decreases the amount of shaking over a very wide range (lots to almost none) it's suitable for all sorts of terrain, from ditches to dents. Hill climbing only works if the terrain is close to smooth.

What would happen if we tried a search system closer to the one with the soccer ball? The soccer ball represents a large portion of the solution space, and at each time step it moves from one set of adjacent points to a new set, under the condition that the highest of the new set of points be lower than the highest of the current set of points. Once could imagine "simulated soccer ball rolling" where one starts off with a giant ball and then makes it smaller over time until one has a ball bearing making tiny adjustments. The biggest problem with this that I see is that rolling a soccer ball requires testing a prohibitively large number of points. Imagine how many points would need to be tested to roll a ball 1/16 the size of the earth over the earth's surface. Worse, there are lots of solutions this search misses because it is strongly biased towards solutions that lie in large flattish regions. Our giant ball rolling over arizona would likely miss the grand canyon.

This diversion with ball rolling aside, I think the part about simulated annealing that really made sense to me is: exponentially decreasing willingness to make things worse is a good fit for a huge range of possible solution spaces.

Comment via: facebook

Recent posts on blogs I like:

How Does Fiction Affect Reality?

Social norms

via Thing of Things April 19, 2024

Clarendon Postmortem

I posted a postmortem of a community I worked to help build, Clarendon, in Cambridge MA, over at Supernuclear.

via Home March 19, 2024

How web bloat impacts users with slow devices

In 2017, we looked at how web bloat affects users with slow connections. Even in the U.S., many users didn't have broadband speeds, making much of the web difficult to use. It's still the case that many users don't have broadband speeds, both …

via Posts on March 16, 2024

more     (via openring)