|
Weasel programs in python
For context and further investigations see
weasel thread on Panda's Thumb,
weasels on parade thread at PT,
and
Wesley R. Elsberry's weasel page
(with interactive Java-version!).
Python programs below by: Anders Gorm
Pedersen
On this page:
- Python program, unlocked version (all positions mutable every generation)
- Plot of fitness from typical run of unlocked version.
- Plot of distribution of mutation types from typical run of unlocked version.
- Python program, locked version (where correct positions become locked)
- Plot of fitness from typical run of locked version.
- Plot of distribuiton of mutation types from typical run of locked version.
Download programs:
Unlocked version (all positions mutable every generation)
# Python implementation of Richard Dawkin's weasel program
# Illustrates power of cumulative selection vs entirely random search
# This version allows all positions to mutate in any generation, meaning correct letters can be reversed
# Initializations
import random
target = list("METHINKS IT IS LIKE A WEASEL") # Target sequence
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ " # Alphabet of possible symbols (uppercase letters + blank)
n_offspring = 50 # Number of offspring per generation
mut_rate = 0.09 # Mutation rate: probability that any given letter will change
best_offspring = [] # Variable containing best offspring after each generation
# Contains empty list when loop is first entered
# Construct random starting string of same length as target
parent = []
for i in range(len(target)):
parent.append(random.choice(alphabet))
# Main loop: construct mutated offspring, select best for next generation, stop when target reached
gen = 0
while best_offspring != target:
gen = gen + 1
# Keep track of the number of good, bad, and neutral mutations arising in this generation
n_beneficial = n_detrimental = n_neutral = num_mut = 0.0
# Keep track of how many kids in current generation that look exactly like their parent
num_unchanged = n_offspring
# Construct n_offspring children that may contain mutations
kid_list = []
for i in range(n_offspring):
# Copy content of parent to kid
kid = parent[:]
# Iterate over positions in kid, for each position allow possibility of mutating
kid_changed = False
for pos in range(len(kid)):
# Let residue mutate with probability mut_rate
if random.random() < mut_rate:
kid_changed = True
num_mut += 1
# Randomly select new symbol (that is different from the one already present)
old_symbol = parent[pos]
possible_new_symbols = set(alphabet) - set(old_symbol)
new_symbol = random.choice(list(possible_new_symbols))
# Mutate kid
kid[pos] = new_symbol
# Check if mutation was beneficial, detrimental, or neutral and update count
if old_symbol == target[pos]: # Correct letter has been changed: detrimental
n_detrimental += 1
elif new_symbol == target[pos]: # Incorrect letter has been changed to correct: beneficial
n_beneficial += 1
else: # Incorrect letter has been changed to other incorrect letter: neutral
n_neutral += 1
# If kid was mutated, then it no longer looks like its parent
if kid_changed:
num_unchanged -= 1
# Add (possibly) mutated kid to list of current kids
kid_list.append(kid)
# Find most fit offspring (= string most similar to target)
smallest_dif = len(target) + 1
for kid in kid_list:
# Find number of positions that differ between kid and target
dif = 0.0
for pos in range(len(target)):
if kid[pos] != target[pos]:
dif = dif + 1
# Keep best kid found so far
if dif < smallest_dif:
smallest_dif = dif
best_offspring = kid
# Use best offspring as starting point for next round of mutation
parent = best_offspring
# Print progress
fitness = (len(target)-smallest_dif)/len(target) # Fitness of most fit individual
ben_frac = n_beneficial/num_mut
detr_frac = n_detrimental/num_mut
neu_frac = n_neutral/num_mut
result_string = ""
for pos in range(len(target)):
if best_offspring[pos] == target[pos]:
result_string += best_offspring[pos]
else:
result_string += best_offspring[pos].lower()
print "%s ** Gen: %4d Dif: %3d Fit: %.4f Bene: %.4f Detr: %.4f Neu: %.4f Unchanged: %3d" % (result_string,
gen, smallest_dif, fitness,
ben_frac, detr_frac, neu_frac,
num_unchanged)
Results from typical run of unlocked version
Fitness of the best offspring in each generation, plotted as a function of generation number. Program was run with
n_offspring = 50 and with the 28 letter target sequence "METHINKS IT IS LIKE A WEASEL". Fitness is measured as
the percentage of correct positions in the offspring string. When the string is far from the peak of the fitness landscape (in
the beginning of the run) fitness rises fairly rapidly. As the string approaches the target, the fitness stays constant much of
the time, with occasional drops.
The fraction of mutations that are beneficial, detrimental, and neutral. In each generation each of the 50 mutant offspring in the
current population were compared to the target in order to determine whether the newly acquired mutations were beneficial, detrimental, or
neutral (in this toy example neutral mutations occur when one incorrect letter is substituted by another incorrect letter). The values
plotted here are the mutation counts for the three categories expressed as a fraction of all mutations in any given generation. It is
clearly visible that the fraction of mutations, which are detrimental, increases rapidly as the population approaches the peak of the
fitness landscape, while the fraction of neutral mutations decreases. The fraction of mutations that are beneficial can also be seen to
drop slightly over time, although the value is much more constant. Also notice how the time between beneficial mutations becomes much longer when the
population is closer to the peak of the fitness landscape.
Locked version (correct positions are fixed as soon as they appear)
# Python implementation of Richard Dawkin's weasel program
# Illustrates power of cumulative selection vs entirely random search
# This version locks correct sites (not allowing them to mutate) as soon as they appear
# Initializations
import random
target = list("METHINKS IT IS LIKE A WEASEL") # Target sequence
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ " # Alphabet of possible symbols (uppercase letters + blank)
n_offspring = 50 # Number of offspring per generation
mut_rate = 0.09 # Mutation rate: probability that any given letter will change
best_offspring = [] # Variable containing best offspring after each generation
# Contains empty list when loop is first entered
locked_list = [] # List of positions that are no longer allowed to mutate
free_list = range(len(target)) # List of positions in string still allowed to mutate
# Initially contains all indices of target, correct letters
# are removed from this list (and thereby locked) as they occur.
# Construct random starting string of same length as target
parent = []
for i in range(len(target)):
parent.append(random.choice(alphabet))
# Main loop: construct mutated offspring, select best for next generation, stop when target reached
gen = 0
while best_offspring != target:
gen = gen + 1
# Keep track of the number of good, bad, and neutral mutations arising in this generation
n_beneficial = n_detrimental = n_neutral = num_mut = 0.0
# Keep track of how many kids in current generation that look exactly like their parent
num_unchanged = n_offspring
# Construct n_offspring children that each have one random mutation
kid_list = []
for i in range(n_offspring):
# Copy content of parent to kid
kid = parent[:]
# Iterate over positions in free_list, for each position allow possibility of mutating
kid_changed = False
for pos in free_list:
# Let residue mutate with probability mut_rate
if random.random() < mut_rate:
kid_changed = True
num_mut += 1
# Randomly select new symbol (that is different from the one already present)
old_symbol = parent[pos]
possible_new_symbols = set(alphabet) - set(old_symbol)
new_symbol = random.choice(list(possible_new_symbols))
# Mutate kid
kid[pos] = new_symbol
# Check if mutation was beneficial, detrimental, or neutral and update count
if old_symbol == target[pos]: # Correct letter has been changed: detrimental
n_detrimental += 1
elif new_symbol == target[pos]: # Incorrect letter has been changed to correct: beneficial
n_beneficial += 1
else: # Incorrect letter has been changed to other incorrect letter: neutral
n_neutral += 1
# If kid was mutated, then it no longer looks like its parent
if kid_changed:
num_unchanged -= 1
# Add (possibly) mutated kid to list of current kids
kid_list.append(kid)
# Find most fit offspring (= string most similar to target)
smallest_dif = len(target) + 1
for kid in kid_list:
# Find number of positions that differ between kid and target
dif = 0.0
for pos in range(len(target)):
if kid[pos] != target[pos]:
dif = dif + 1
# Keep best kid found so far
if dif < smallest_dif:
smallest_dif = dif
best_offspring = kid
# Remove possible new correct position from list of free sites to prevent further mutation
tmp_list = free_list[:] # You can't change list while iterating over it
for pos in tmp_list:
if best_offspring[pos] == target[pos]:
free_list.remove(pos)
# Use best offspring as starting point for next round of mutation
parent = best_offspring
# Print progress
fitness = (len(target)-smallest_dif)/len(target) # Fitness of most fit individual
ben_frac = n_beneficial/num_mut
detr_frac = n_detrimental/num_mut
neu_frac = n_neutral/num_mut
result_string = ""
for pos in range(len(target)):
if best_offspring[pos] == target[pos]:
result_string += best_offspring[pos]
else:
result_string += best_offspring[pos].lower()
print "%s ** Gen: %4d Dif: %3d Fit: %.4f Bene: %.4f Detr: %.4f Neu: %.4f Unchanged: %3d" % (result_string,
gen, smallest_dif, fitness,
ben_frac, detr_frac, neu_frac,
num_unchanged)
Results from typical run of locked version
Fitness of the best offspring in each generation, plotted as a function of generation number. Locked version of program was run
with n_offspring = 50 and with the 28 letter target sequence "METHINKS IT IS LIKE A WEASEL". Fitness is measured
as the percentage of correct positions in the offspring string. Since only incorrect positions are now allowed to mutate
(while correct letters are locked) the fitness increases approximately linearly with time.
The fraction of mutations that are beneficial, detrimental, and neutral. In each generation each of the 50 mutant offspring in the
current population were compared to the target in order to determine whether the newly acquired mutations were beneficial, detrimental, or
neutral (in this toy example neutral mutations occur when one incorrect letter is substituted by another incorrect letter). The values
plotted here are the mutation counts for the three categories expressed as a fraction of all mutations in any given generation. In this
version there are no detrimental mutations. Proximity to fitness peak has no influence on the fraction of mutations that are neutral or
beneficial, and also no effect on waiting time between beneficial mutations.
|