GenePool: Evolutionary Optimization in OCaml

GenePool is a framework for writing evolutionary optimization algorithms in OCaml. This library is not a complete solution but rather is a generic skeleton which takes care of the plumbing and nuisances of optimization. You provide GenePool with functions that give meaning to fitness and reproduction and after a specified number of generation, GenePool returns an array of the best "genomes" it evolved. You can download the GenePool source, which is distributed under the LGPL license.

Library Interface

The interface to GenePool is extremely simple. It consists of a single function evolve whose type is:

('genome, 'fitness) ga_spec -> ga_params -> 'genome array * 'fitness

As you can tell from the signature, GenePool is polymorphic over both the representation of the genome and the type of fitness values. Typically your genome will be an array and fitness will be expressed as either an int or float. However, nothing stops you from using different types if your problem requires them.

The first parameter is a record (of type ga_spec) which encodes how evolution will proceed for your specific problem.

(* GA Specification ------------------------------------------------------------------------------- evaluate - assign a fitness to a genome mutatate - given a genome, create an altered copy crossover - given two genomes, combine them genRandom - produce a random genome from scratch seed - optional initial seed population report - optional function to output information about the best genome of each generation stop - optional stopping predicate *)
type ('genome, 'fitness) ga_spec = { evaluate : 'genome -> 'fitness; mutate : 'genome -> 'genome; crossover : 'genome * 'genome -> 'genome; genRandom: unit -> 'genome; seed: 'genome array option; report: (int -> 'genome -> 'fitness -> unit) option; stop: (int > 'genome -> 'fitness -> bool) option }

Some parameters (ie, the maximum number of generations, the number of genomes which survive each generations, etc...) are universal across all optimization algorithms. These are provided in another record, whose type is ga_params.

(* GA Parameters ------------------------------------------------------------------------------- nRandom - how many random genomes should we generate at the beginning nSurvivors - how many genomes survive of each generation nMutations - # new genomes generated by asexual reproduction in each generation nCrossovers - # new genomes by sexual reproduction in each generation timeLimit - seconds until algorithm termination maxGen - maximum number of generations until algorithm termination *)
type ga_params = { nRandom: int; nSurvivors: int; nMutations: int; nCrossovers: int; timeLimit:float; maxGen: int }

The return value of evolve is 'genome array * 'fitness which is the last generation sorted by order of their fitness and the fitness of the single best genome (ie, the genome at the 0th index of the returned generation).

Usage

To demonstrate GenePool in action, let's solve a completely trivial problem: maximizing the number of bits in a binary vector. First, we must pick a genome representation for solutions. In this simple case both the genomes and their corresponding solutions can be of the same type: bool array.

Now, let's make a ga_spec record to use as an argument to evolve. GenePool comes with a set of handy functions for manipulating array genomes. These can be found in the GenePool.ArrayGenome module.

let spec = { evaluate = Array.fold_left (fun count b -> if b then count + 1 else count) 0; mutate = ArrayGenome.mutatePoint not; crossover = ArrayGenome.randomCrossover; genRandom = ArrayGenome.mkArray 500 Random.bool; seed = None; report = Some (fun gen genome fitness -> Printf.printf "Generation %d, fitness: %d \n" gen fitness); stop = Some (fun gen genome fitness -> fitness = 500) }

Furthermore, let's create a ga_params record to specify that we want to evolve a 1000 vectors over 10 generations without a time limit.

let params = { nRandom = 1000; nSurvivors = 100; nMutations = 450; nCrossovers = 450; timeLimit = max_float; maxGen = 10 }

Putting It All Together

To make something happen, just call evolve with the two records we just created. You'll get back the best array along with its fitness score.

let genomes, bestFitness = evolve spec params let _ = Printf.printf "Final fitness: %d \n" bestFitness

You can download a copy of the above example here.

Compiling

Since the internal timer in GenePool relies on the Unix module, be sure to link with either unix.cma or unix.cmxa (depending on which compiler you use).

$ ocamlopt -o test unix.cmxa genePool.mli genePool.ml test.ml $ ./test Generation 1, fitness: 306 Generation 2, fitness: 327 Generation 3, fitness: 336 Generation 4, fitness: 341 Generation 5, fitness: 348 Generation 6, fitness: 357 Generation 7, fitness: 368 Generation 8, fitness: 378 Generation 9, fitness: 396 Generation 10, fitness: 403 Final fitness: 403

Library Helper Functions

The ArrayGenome module

(* Array genome operations ------------------------------------- mutate - modify each element of an array with probability given by first argument; (at least one position is modified) mutatePoint - modify exactly one position in the array mutateWhenPred - modify exactly one position when its index satisfies a predicate onePointCrossover - first part of new genome comes from first array, latter part comes from second array twoPointCrossover - genome is a copy of the first array with a range blitted from the second randomCrossover - genome is a copy of the first array with a shifted range blitted from the second mkArray - create an array of given length using the supplied 0-arity function (usually a pseudorandom number generator) *)
module ArrayGenome : sig val mutate : float -> ('a -> 'a) -> 'a array -> 'a array val mutatePoint : ('a -> 'a) -> 'a array -> 'a array val mutateWhenPred : float -> ('a -> 'a) -> (int -> bool) -> 'a array -> 'a array val mutatePointWhenPred : ('a -> 'a) -> (int -> bool) -> 'a array -> 'a array val onePointCrossover : ('a array * 'a array) -> 'a array val twoPointCrossover : ('a array * 'a array) -> 'a array val randomCrossover : ('a array * 'a array) -> 'a array val mkArray : int -> (unit -> 'a) -> unit -> 'a array end

The Utility module

(* Useful utility functions to make your life easier ---------------------------------------------------- selectRandom - given an array of functions and an input, apply one of those functions to an input with uniform probability. (useful for using multiple mutation or crossover functions) randIntExcept - given parameters n and e, generate a pseudorandom number in [0, n) excluding e goodRandom - given a RNG and predicate, keep generating numbers until the predicate is satisfied goodRandomInt - works like goodRandom, but generates specifically integers with values up to the first argument gaussRandom - takes mean and sigma as parameters and returns a guassian RNG *)
module Utility : sig val selectRandom : ('input -> 'output) array -> 'input -> 'output val randIntExcept : int -> int -> int val goodRandom : (unit -> 'random) -> ('random -> bool) -> 'random val goodRandomInt : int -> (int -> bool) -> int val gaussRandom : float -> float -> unit -> float end