In Part 1 we built a basic genetic solver that used mutation to solve problems. In this part we’re going to tackle a slightly more complex problem, the 8 Queens Puzzle, and then expand the solver as necessary.
Get the latest version of this post as a free chapter from my eBook Genetic Algorithms with Python
The 8 Queens Puzzle involves putting 8 queens on a standard chess board such that none are under attack. According to WikiPedia there are only 92 solutions to this puzzle and once we remove mirrorings and rotations there are only 12 unique solutions.
To start with we need to define a genome. The chess board conveniently has the same number of rows as columns (8) so we’ll use the digits 1-8 for our genes.
geneset = '12345678'
Array indexes are zero based however, so we’ll need to convert gene symbols to row and column values. To do that we’ll find the gene’s index into the set of gene symbols then use that index as a row or column, and combine those to make a Point
.
class Point:
Row = None
Col = None
def __init__(self, row, col):
self.Row = row
self.Col = col
def getPoint(rowGene, colGene, gene_set):
rowIndex = gene_set.index(rowGene)
if rowIndex == -1:
raise ValueError("'" + rowGene + "' is an invalid gene")
colIndex = gene_set.index(colGene)
if colIndex == -1:
raise ValueError("'" + colGene + "' is an invalid gene")
return Point(rowIndex, colIndex)
We also need to be able plot the Points on a board.
def getBoard(candidate, gene_set):
board = [['.'] * 8 for i in range(8)]
for index in range(0, len(candidate), 2):
point = getPoint(candidate[index], candidate[index + 1], gene_set)
board[point.Row][point.Col] = 'Q'
return board
And we want to be able to visualize the board.
def display(candidate, gene_set, startTime):
timeDiff = datetime.datetime.now() - startTime
board = getBoard(candidate.Genes, gene_set)
for i in range(8):
print(board[i][0],
board[i][1],
board[i][2],
board[i][3],
board[i][4],
board[i][5],
board[i][6],
board[i][7]
)
print("%s\t%i\t%s" % (candidate.Genes, candidate.Fitness, str(timeDiff)))
This will give us output like the following
. Q . . . . . .
. . Q . . . . .
. . . . Q Q . .
. . . . . . . Q
. . . . . . . .
Q . . . . . . .
. . . . . . Q .
. . . . Q . . .
7761124823353685 29 0:00:00.013001
The row of digits under the board is the set of genes that created the board layout. The number to the right will be the fitness, a measure of how close this set of genes is to the desired result. To drive improvement we’ll want to increase the fitness value whenever the related board position lets more queens coexist on the board. Let’s think about how we can do that.
We’ll start with counting the number of columns that have a queen. Here’s a layout that gets an optimal score but is undesirable
Q Q Q Q Q Q Q Q
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
We’ll also count the number of rows that have queens. Here’s a revised board where both counts are optimal but the layout still allows queens to attack each other.
Q . . . . . . .
. Q . . . . . .
. . Q . . . . .
. . . Q . . . .
. . . . Q . . .
. . . . . Q . .
. . . . . . Q .
. . . . . . . Q
To fix this problem we’ll include the count of the number of southeast diagonals that have a queen. Again we can find a corner case as follows:
. . . . . . . Q
. . . . . . Q .
. . . . . Q . .
. . . . Q . . .
. . . Q . . . .
. . Q . . . . .
. Q . . . . . .
Q . . . . . . .
To fix this final problem we’ll include the count of the number of northeast diagonals that have a queen.
Our fitness value will be the sum of those four counts, 8+8+8+8 being optimal.
Side note on implementation, we can calculate the northeast diagonals in Excel using the formula =$A2+B$1
which results in a grid as follows
0 1 2 3 4 5 6 7
0 0 1 2 3 4 5 6 7
1 1 2 3 4 5 6 7 8
2 2 3 4 5 6 7 8 9
3 3 4 5 6 7 8 9 10
4 4 5 6 7 8 9 10 11
5 5 6 7 8 9 10 11 12
6 6 7 8 9 10 11 12 13
7 7 8 9 10 11 12 13 14
The southeast diagonals can be calculated using =(8-1-$A2)+B$1
which we can visualize as follows:
0 1 2 3 4 5 6 7
0 7 8 9 10 11 12 13 14
1 6 7 8 9 10 11 12 13
2 5 6 7 8 9 10 11 12
3 4 5 6 7 8 9 10 11
4 3 4 5 6 7 8 9 10
5 2 3 4 5 6 7 8 9
6 1 2 3 4 5 6 7 8
7 0 1 2 3 4 5 6 7
Using the above 2 formulas along with the row and column values lets us touch each board position exactly once, which makes our fitness function run fast.
def getFitness(candidate, gene_set):
board = getBoard(candidate, gene_set)
rowsWithQueens = {}
colsWithQueens = {}
northEastDiagonalsWithQueens = {}
southEastDiagonalsWithQueens = {}
for row in range(8):
for col in range(8):
if board[row][col] == 'Q':
rowsWithQueens[row] = 1
colsWithQueens[col] = 1
northEastDiagonalsWithQueens[row + col] = 1
southEastDiagonalsWithQueens[8 - 1 - row + col] = 1
return len(rowsWithQueens) \
+ len(colsWithQueens) \
+ len(northEastDiagonalsWithQueens) \
+ len(southEastDiagonalsWithQueens)
Finally our main (integration test) method will bring all the parts together.
class EightQueensTests(unittest.TestCase):
def test(self):
geneset = '12345678'
startTime = datetime.datetime.now()
fnDisplay = lambda candidate: display(candidate, geneset, startTime)
fnGetFitness = lambda candidate: getFitness(candidate, geneset)
best = genetic.getBest(fnGetFitness, fnDisplay, 16, 8 + 8 + 8 + 8, geneset)
self.assertEqual(best.Fitness, 8 + 8 + 8 + 8)
Note that we’ve added a parameter to getBest
. This is because in Part 1 the solver determined the optimal fitness internally but now we want pass it in.
def getBest(get_fitness, display, targetLen, optimalFitness, geneSet):
...
while bestParent.Fitness < optimalFitness:
We can run our test to see if the solver can find a solution.
Q . . . . . . .
. . . . . . . .
. . . . Q . . .
. . . . Q . . .
. Q . . Q . . .
. . . . . . . .
. . . Q . . . .
Q . Q . . . . .
4574811152355583 23 0:00:00
Q . . . . . . .
. . . . . . . .
. . . . Q . . .
. . . . Q . . .
. . . . Q . . Q
. . . . . . . .
. . . Q . . . .
Q . Q . . . . .
4574811158355583 24 0:00:00
Q . . . . . . .
. . . . . . . .
. . . . Q . . .
. . Q . Q . . .
. . . . Q . . Q
. . . . . . . .
. . . Q . . . .
Q . . . . . . .
4574811158355543 25 0:00:00.001000
Q . . . . . . .
. . . . . . . .
. . . . Q . . .
. . Q . Q . . .
. . . . . . Q Q
. . . . . . . .
. . . Q . . . .
Q . . . . . . .
4574811158355743 26 0:00:00.002000
Q . . . . . . .
. . . . . . . .
Q . . . . . . .
. . Q . Q . . .
. . . . . . Q Q
. . . . . . . .
. . . Q . . . .
Q . . . . . . .
4574811158315743 27 0:00:00.003000
Q . . . . . . .
. . . . . . . Q
Q . . . . . . .
. . Q . Q . . .
. . . . . . Q .
. . . . . . . .
. . . Q . . . .
Q . . . . . . .
4574811128315743 28 0:00:00.004000
Q . . . . . . .
. . . . . . . Q
Q . . . . . . .
. . Q . Q . . .
. . . . . . Q .
. . . . . . . .
. . . Q . . . .
. Q . . . . . .
4574821128315743 29 0:00:00.005001
Q . . . . . . .
. . . . . Q . .
Q . . . . . . .
. . Q . Q . . .
. . . . . . Q .
. . . . . . . .
. . . Q . . . .
. Q . . . . . .
4574821126315743 30 0:00:00.009001
After several test runs we find that most of the time it gets close but can’t get all the way to the optimal value of 32. We need to enhance the solver’s capabilities for it to be able to handle this problem.
If at first you don’t succeed, try, try again!
We’re going to do that by introducing a second genetic line. We’ll mutate the 2nd line as long as it is improving. If it ends up with a better fitness than bestParent
then it will become the bestParent
. Otherwise, we’ll start a new genetic line again with a random gene sequence. We repeat this process over and over until we find an optimal result. Here’s the updated solver loop:
def getBest(get_fitness, display, targetLen, optimalFitness, geneSet):
random.seed()
bestParent = generateParent(targetLen, geneSet, get_fitness)
display(bestParent)
while bestParent.Fitness < optimalFitness:
parent = generateParent(targetLen, geneSet, get_fitness)
attemptsSinceLastImprovement = 0
while attemptsSinceLastImprovement < 128:
child = mutate(parent, geneSet, get_fitness)
if child.Fitness > parent.Fitness:
parent = child
attemptsSinceLastImprovement = 0
attemptsSinceLastImprovement += 1
if bestParent.Fitness < parent.Fitness:
bestParent, parent = parent, bestParent
display(bestParent)
return bestParent
Now when we run the 8 queens test we find an optimal solution every time.
. Q . . Q . . .
. . . . Q . . .
. . . . . . . Q
. . . . . . . Q
. . . . . . . .
. . . . . . . .
. . . . . . . .
Q . . . . . . .
8148381525122525 20 0:00:00
. . Q . . . . .
. . Q . . . . .
. . . . . . Q .
. . Q . . . . .
. . . . . . . .
. . . Q . . . .
Q . . . . . . .
. . . . Q . . Q
2385884313647137 28 0:00:00.013001
. . . . . . Q .
. . . . . . . .
. . . Q . . . .
. . . . . Q . Q
Q . . . . . . .
. . . . Q . . .
. Q . . . . . .
. . . . Q . . .
3417486546857251 30 0:00:00.041003
. . Q . . . . .
. . . . Q . . .
. Q . . . . . .
. . . Q . . . .
Q . . . . . . .
. . Q . . . . .
. . . . . . . Q
. . . . . Q . .
5178258613324463 31 0:00:00.143008
. . . . Q . . .
. . Q . . . . .
Q . . . . . . .
. . . . . . Q .
. Q . . . . . .
. . . . . . . Q
. . . . . Q . .
. . . Q . . . .
5247312368841576 32 0:00:00.308018
However, when we run the string duplication test from Part 1 it now struggles to find a solution. This is because we ignore our best line completely once we find it and only try to improve by mutating a new genetic line.
JoyEMlECsatqzfBVFkvIHduMg!HAFtnWcC 3 0:00:00
Nj!HBLwEHvdse wGGOKbJctg AriqlNDPP 8 0:00:00.022001
NBiBRlE thoHfFWhW !tnnFyuHAv wputM 11 0:00:00.062004
NUt alx xmusGiwTmEwprder are lKsdG 19 0:00:00.112007
Not allkthosbHnhoCHinvBr OQe losmR 21 0:00:00.852049
NotW!lT NhoJe VhRFwaDderiaCe most. 22 0:00:02.510144
Not alF tQose S!o wDndUZ yre lostA 25 0:00:05.847335
NotXLll those whopwWnder are lost. 30 0:01:33.040322
We need a way to continue to take advantage of the genes in bestParent
. One way nature handles this is through crossbreeding so we’ll introduce a crossover strategy where we take one random gene from a 2nd (the best) parent.
def crossover(parent, parent2, get_fitness):
index = random.randint(0, len(parent.Genes) - 1)
genes = list(parent.Genes)
genes[index] = parent2.Genes[index]
childGenes = (''.join(genes))
fitness = get_fitness(childGenes)
return Individual(childGenes, fitness)
And update the main loop to randomly choose whether to use mutation
or crossover
to improve the 2nd genetic line.
options = {
0: lambda p: mutate(p, geneSet, get_fitness),
1: lambda p: crossover(p, bestParent, get_fitness)
}
while bestParent.Fitness < optimalFitness:
parent = generateParent(targetLen, geneSet, get_fitness)
attemptsSinceLastImprovement = 0
while attemptsSinceLastImprovement < 128:
child = options[random.randint(0, len(options) - 1)](parent)
Now when we run the tests both are able to achieve optimal results every time.
Refactor
Now that everything works we’re going to do some code hygiene. In solving this problem we used genes “12345678” but it would have been more convenient, and faster, if we could have used raw integers 0-7. So let’s make that change. I’ll show selected changes below but you can get the full set from github.
class EightQueensTests(unittest.TestCase):
def test(self):
geneset = [0, 1, 2, 3, 4, 5, 6, 7]
...
fnDisplay = lambda candidate: display(candidate, startTime)
fnGetFitness = lambda candidate: getFitness(candidate)
...
This means both Point
and getPoint
can go away, simplifying getBoard
to:
def getBoard(candidate):
board = [['.'] * 8 for i in range(8)]
for index in range(0, len(candidate), 2):
board[candidate[index]][candidate[index + 1]] = 'Q'
return board
In display
we go the other way so we can show all the genes as a string.
def display(candidate, startTime):
timeDiff = datetime.datetime.now() - startTime
board = getBoard(candidate.Genes)
for i in range(8):
print(board[i][0],
board[i][1],
board[i][2],
board[i][3],
board[i][4],
board[i][5],
board[i][6],
board[i][7]
)
print("%s\t%i\t%s" % (''.join(map(str, candidate.Genes)), candidate.Fitness, str(timeDiff)))
To support the geneset being an array in the solver we only have to eliminate the line in crossover
, mutate
and generateParent
that converts the child gene sequence to a string.
Read more of this series in my book Genetic Algorithms with Python
Like this:
Like Loading...