Improving and validating existing maps

Map improving and validating commands all try to find a map whose loglikelihood is improved over an initial map. For all such commands, the initial map used is the best map in the heap. Beyond a possible map improvement, all these commands may affect the contents of the heap: all maps considered by these commands are candidate for insertion.

The first category of commands are simple heuristics command that slightly perturbate the initial map in a systematic manner:

- The famous
`flips`command applies all possible permutations in a sliding window of fixed size on the initial map. The command reports all maps found whose loglikelihood lies within a given threshold of the loglikelihood of the original map. If a new improved map is found, the command may be automatically iterated. It can also be used to explore all possible orders when a small number of markers is selected by choosing a window size equal to the number of markers. - the
`polish`command removes one marker of the initial map and tries to insert it in all possible intervals. The variation in loglikelihood is reported for all markers and all positions in a matrix. - the
`squeeze`command tries to detect and remove erroneous markers such that their removal corresponds to a large reduction in the total size of the initial map.

These commands are simple and systematic. CARTHAGENE offers much more
powerful commands that use a class of discrete optimization techniques
called ``stochastic optimization'' or more precisely ``local search
methods''. This class of methods is usually acknowledged as the most
efficient class of technique for hard discrete optimisation problems
such as the genetic marker ordering^{2.2}

These optimisation methods all follow a similar principle: one or
several current orders are stochastically ``perturbated'' and
depending on the result of the pertubation and a random choice, the
perturbated maps may replace the original maps or not. Three such
methods are integrated in CARTHAGENE: simulated annealing [LA84],
taboo search [Glo89,Glo90] and genetic
algorithm [Gol89]. The three methods use the same
perturbation mechanism: a subsection of the map is chosen and
flipped ^{2.3}. Simulated annealing
and the genetic algorithm may also use more complex perturbation
techniques.

Moreover, it is possible to assess the robustness of a given map by
looking at the map distribution using Markov Chain Monte Carlo. For
that, see the `mcmc` command. It works only in the comparative mapping approach.

These commands described next can take a long time to improve or not the current solution. The system does not provide a prediction of the time cost of this algorithms. Nevertheless, these methods can be interrupted online, without loss of information (see section 2.6.11).