Thinking about simulated annealing

October 31st, 2009
machinelearning, programming, tech
I've been reading Duda, Hart, and Stork's "Pattern Classification". Much more detail than we got to in swat's AI class or the following seminar. Some parts are tough going, and I'm definitely not working out every detail, but I think it's helpful.

I just finished reading about simulated annealing, and I think I understand now why it's much more generally applicable than simple random restart hill climbing. In the latter you pick a random point in the solution space, consider all the neighboring points, move to the best one, repeat until you're better than your neighbors. Then repeat the whole process, remembering your best places so far, until you decide to stop. People always talk about the problem with this as "getting stuck in local minima" (with the "down is good" analogy of "gradient descent" instead of the "local maxima" that the "up is good' analogy of "hill climbing" would give) but I didn't really understand. I think now I do.

Consider trying to find the lowest point in a parking lot. The lot is pretty smooth, and pretty consistent, so you could put a soccer ball in the middle and see where it stopped. But what if you chose a ball bearing? The parking lot is smooth on a soccer ball scale, but not on a ball bearing scale. So the bearing gets caught in a low bit (local minimum) while the soccer ball rolls all the way down (global minimum).

In simulated annealing, we use a ball bearing but shake the parking lot so it has a decent chance of getting out of these little pockets and making it's way all the bottom of the lot. So that we don't keep bouncing around for ever, we start off shaking the lot a large amount, and over time decrease our shaking exponentially until we're not shaking at all.

Because simulated annealing decreases the amount of shaking over a very wide range (lots to almost none) it's suitable for all sorts of terrain, from ditches to dents. Hill climbing only works if the terrain is close to smooth.

What would happen if we tried a search system closer to the one with the soccer ball? The soccer ball represents a large portion of the solution space, and at each time step it moves from one set of adjacent points to a new set, under the condition that the highest of the new set of points be lower than the highest of the current set of points. Once could imagine "simulated soccer ball rolling" where one starts off with a giant ball and then makes it smaller over time until one has a ball bearing making tiny adjustments. The biggest problem with this that I see is that rolling a soccer ball requires testing a prohibitively large number of points. Imagine how many points would need to be tested to roll a ball 1/16 the size of the earth over the earth's surface. Worse, there are lots of solutions this search misses because it is strongly biased towards solutions that lie in large flattish regions. Our giant ball rolling over arizona would likely miss the grand canyon.

This diversion with ball rolling aside, I think the part about simulated annealing that really made sense to me is: exponentially decreasing willingness to make things worse is a good fit for a huge range of possible solution spaces.

Comment via: facebook

Recent posts on blogs I like:

Jealousy In Polyamory Isn't A Big Problem And I'm Tired Of Being Gaslit By Big Self-Help

The nuance is in the post, guys

via Thing of Things July 18, 2024

Trust as a bottleneck to growing teams quickly

non-trust is reasonable • trust lets collaboration scale • symptoms of trust deficit • how to proactively build trust

via benkuhn.net July 13, 2024

Coaching kids as they learn to climb

Helping kids learn to climb things that are at the edge of their ability The post Coaching kids as they learn to climb appeared first on Otherwise.

via Otherwise July 10, 2024

more     (via openring)