::  Posts  ::  RSS  ::  ◂◂RSS  ::  Contact

Thinking about simulated annealing

October 31st, 2009
machinelearning, programming  [html]

I've been reading Duda, Hart, and Stork's "Pattern Classification". Much more detail than we got to in swat's AI class or the following seminar. Some parts are tough going, and I'm definitely not working out every detail, but I think it's helpful.

I just finished reading about simulated annealing, and I think I understand now why it's much more generally applicable than simple random restart hill climbing. In the latter you pick a random point in the solution space, consider all the neighboring points, move to the best one, repeat until you're better than your neighbors. Then repeat the whole process, remembering your best places so far, until you decide to stop. People always talk about the problem with this as "getting stuck in local minima" (with the "down is good" analogy of "gradient descent" instead of the "local maxima" that the "up is good' analogy of "hill climbing" would give) but I didn't really understand. I think now I do.

Consider trying to find the lowest point in a parking lot. The lot is pretty smooth, and pretty consistent, so you could put a soccer ball in the middle and see where it stopped. But what if you chose a ball bearing? The parking lot is smooth on a soccer ball scale, but not on a ball bearing scale. So the bearing gets caught in a low bit (local minimum) while the soccer ball rolls all the way down (global minimum).

In simulated annealing, we use a ball bearing but shake the parking lot so it has a decent chance of getting out of these little pockets and making it's way all the bottom of the lot. So that we don't keep bouncing around for ever, we start off shaking the lot a large amount, and over time decrease our shaking exponentially until we're not shaking at all.

Because simulated annealing decreases the amount of shaking over a very wide range (lots to almost none) it's suitable for all sorts of terrain, from ditches to dents. Hill climbing only works if the terrain is close to smooth.

What would happen if we tried a search system closer to the one with the soccer ball? The soccer ball represents a large portion of the solution space, and at each time step it moves from one set of adjacent points to a new set, under the condition that the highest of the new set of points be lower than the highest of the current set of points. Once could imagine "simulated soccer ball rolling" where one starts off with a giant ball and then makes it smaller over time until one has a ball bearing making tiny adjustments. The biggest problem with this that I see is that rolling a soccer ball requires testing a prohibitively large number of points. Imagine how many points would need to be tested to roll a ball 1/16 the size of the earth over the earth's surface. Worse, there are lots of solutions this search misses because it is strongly biased towards solutions that lie in large flattish regions. Our giant ball rolling over arizona would likely miss the grand canyon.

This diversion with ball rolling aside, I think the part about simulated annealing that really made sense to me is: exponentially decreasing willingness to make things worse is a good fit for a huge range of possible solution spaces.

Comment via: facebook

Recent posts on blogs I like:

How Fast New York Regional Rail Could Be Part 3

In the third and last installment of my series posting sample commuter rail schedules for New York (part 1, part 2), let’s look at trains in New Jersey. This is going to be a longer post, covering six different lines, namely all New Jersey Transit lines t…

via Pedestrian Observations October 21, 2019

Strong stances

I. The question of confidence Should one hold strong opinions? Some say yes. Some say that while it’s hard to tell, it tentatively seems pretty bad (probably). There are many pragmatically great upsides, and a couple of arguably unconscionable downsides. …

via Meteuphoric October 15, 2019

What do executives do, anyway?

An executive with 8,000 indirect reports and 2000 hours of work in a year can afford to spend, at most, 15 minutes per year per person in their reporting hierarchy... even if they work on nothing else. That job seems impossible. How can anyone make any im…

via apenwarr September 29, 2019

more     (via openring)

More Posts:


  ::  Posts  ::  RSS  ::  ◂◂RSS  ::  Contact