From cube-lovers-errors@oolong.camellia.org Thu May 29 21:58:15 1997
Return-Path: cube-lovers-errors@oolong.camellia.org
Received: from oolong.camellia.org (localhost [127.0.0.1]) by oolong.camellia.org (8.6.12/8.6.12) with SMTP id VAA20780; Thu, 29 May 1997 21:58:15 -0400
Precedence: bulk
Errors-To: cube-lovers-errors@oolong.camellia.org
Date: Thu, 29 May 1997 17:24:45 -0700
From: Richard E Korf
Message-Id: <199705300024.RAA18247@denali.cs.ucla.edu>
To: Cube-Lovers@ai.mit.edu
Subject: Description of algorithm for finding minimal-move solutions to Rubik's Cube
Dear Cube-Lovers,
Here is the promised short description of my algorithm for finding optimal
solutions to Rubik's Cube. I use the face-turn metric, meaning any twist of a
face, even 180 degrees, counts as a single move. A twist of a center slice
can only be accomplished by two twists of the outside faces.
The algorithm is a heuristic search, called Iterative-Deepening-A*, or IDA*,
for any artificial intelligence (AI) folks in the group. Given a scrambled
cube, it first looks for solutions one move long, then solutions two moves long,
then three moves, etc. Each iteration searches for solutions of successively
greater length, until a solution is found. At that point it quits, returning
what must be an optimal solution, barring program bugs. This is a completely
brute-force approach to the problem. At a million twists per second, searches
to depth 10 would take almost 3 days.
To make this approach practical, we need a function that given a cube state
will efficiently calculate a lower bound on the number of moves needed to solve
it. This is called a heuristic evaluation function. For example, we can
precompute the number of moves needed to solve each edge cubie individually from
each possible position and orientation. Then given a state of the cube, we sum
the number of moves needed to solve all 12 edge cubies individually, and divide
by 4, since each move moves 4 edge cubies. This heuristic, called 3D Manhattan
Distance, has an average value of 5.5. The important thing is that this function
always return a lower bound on the number of moves needed to solve a state.
During our search we compute the Manhattan Distance of each state. If we
are looking for solutions of length 10, for example, and we have a state that is
5 moves from the initial state, and its Manhattan Distance from the solved state
is 6 moves, we don't have to search that path any deeper, since it will take at
least 11 moves to get to the goal along that path, since 6 is a lower bound on
the number of moves needed to solve the state. Adding the Manhattan Distance
heuristic to our search algorithm lets us search to depth 14 in about 3 days.
We could do the same thing with the corner cubies, and take the maximum of the
two values, but that doesn't help much.
To do better, we need a more accurate heuristic function. For that, we use
an idea call "Pattern Databases" due to Culberson and Schaeffer. See Culberson,
J.C., and J. Schaeffer, Searching with pattern databases, Proceedings of the
11th Conference of the Canadian Society for the Computational Study of
Intelligence, published in Advances in Artificial Intelligence, Gordon McCalla
(Ed.), Springer Verlag, 1996.
For example, if we consider just the corner cubies, there are only about 88
million possible states they could be in (8!x3^7). We exhaustively build and
store a table, using breadth-first search, that tells us the minimum number of
moves needed to solve just the corner cubies from every possible combination,
ignoring the edge cubies. This value ranges from 0 to 11 moves, averages 8.764
moves, and requires only 4 bits per state. (We could reduce this further using
an idea of Dan Hoey's published in this list awhile ago.) This table only has
to be computed once, taking about a half hour, and requires about 42 megabytes
of memory to store (a megabyte is 1048576 bytes).
Then, during the search, we compute the heuristic lower bound on the number
of moves to the goal by looking up the configuration of the corner cubies, and
using the number of moves stored in the table. 8.764 is a lot better than 5.5.
Finally, we divide the edge cubies into two groups of six, and compute a
similar table for each group. There are too many combinations of all 12 edge
cubies to build a single table. The final heuristic function we use is the
maximum of 3 different values, the moves needed to solve the corner cubies, and
the moves needed to solve each group of six edge cubies. The total memory for
all three tables is 82 megabytes.
Given more memory, we could built larger tables, for example, considering 7
edge cubies at a time. This would give a more accurate heuristic value, and
reduce the running time of the search algorithm. In fact, an informal analysis
of the performance of the algorithm suggests that its speed will increase
linearly with the amount of available memory. Thus, given twice as much memory,
the algorithm should run in roughly half the time. Disks and other secondary
storage are of no use, since the access time is much too slow to be worthwhile.
The current version of the program is written in C on a Sun Ultra-Sparc
Model-1 workstation with 128 megabytes of memory. It generates about 700,000
states per second. Depth 16 searches typically take less than 4 hours, depth 17
searches take about 2 days, and complete depth 18 searches take about 27 days. A
complete depth 19 search would take about a year. Each depth takes roughly
13.34847 times longer than the previous, which is the branching factor of the
problem space.
The algorithm is easily parallelized. Given 18 processors, for example, we
make all 18 possible first moves, and hand each of the resulting states to a
different processor to solve. This will give roughly linear speedup with the
number of processors, since the amount of time needed to search to the deeper
levels is very consistent from one state to the next.
Sorry for the length of this message, but I hope it will of interest to some
of you. If you'd like the full paper, just send me a message. Thanks very much.
-rich