Provide a Pareto frontier approximation.
The paretolkh command is invoked in one of the following forms:
- paretolkh Options
- paretolkh Resolution NbRun CollectMaps
The paretolkh command creates a Pareto frontier from scratch. The result is a collection of maps, inserted into the heap, each map having a different number of breakpoints with respect to a reference order. The current data set should be a merge by order of a biological data set and a reference order data set.
paretolkh applies the weighted sum method to combine the 2-point loglikelihood (MLE) criterion from the biological data with the number of breakpoints (BP) criterion, which is approximated when orthologous relationships are missing. The weighted objective coefficient applied to the number of breakpoints is the product of a normalization factor and a weighted factor. The normalization factor is obtained by first finding the initial ranges of the two criteria. The weighted factor depends on a positive integer that varies from one iteration to the next. The Resolution parameter controls the number of iterations of the weighted sum method. Each iteration corresponds to a mono-objective Traveling Salesman Problem with a different coefficient value, solved by Keld Helsgaun's LKH software. Each iteration uses the best map found by the previous iteration as its first starting point.
In order to improve the Pareto frontier, CARTHAGENE first generates an initial map by optimizing a single objective (the 2-point loglikelihood criterion first), and then starts from this map a sequence of LKH iterations in which the coefficient varies. This search strategy is applied twice. The second time, it starts from the best map found with the minimum number of breakpoints and varies the coefficient again.
The final map found by each iteration is called supported and is locally optimal w.r.t. the LKH neighborhood.
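The weighted sum idea described above can be illustrated with a minimal Python sketch. It is not CARTHAGENE's implementation: it scans a small, fixed list of candidate maps instead of solving a Traveling Salesman Problem with LKH, and both the normalization (ratio of the two criterion ranges) and the weight schedule (k / resolution) are assumptions made for illustration only. The names weighted_sum_sweep and Point are hypothetical.

```python
from typing import List, Tuple

# A candidate map is summarised as (loglikelihood, breakpoints).
# Higher loglikelihood is better; fewer breakpoints is better.
Point = Tuple[float, int]


def weighted_sum_sweep(candidates: List[Point], resolution: int) -> List[Point]:
    """Return the distinct candidates selected while sweeping the weight."""
    if not candidates or resolution < 1:
        raise ValueError("need at least one candidate and resolution >= 1")

    logliks = [ll for ll, _ in candidates]
    bps = [bp for _, bp in candidates]

    # ASSUMPTION: normalise breakpoints by the ratio of the criterion ranges,
    # so that both criteria are expressed on a comparable scale.
    bp_range = max(bps) - min(bps)
    norm = (max(logliks) - min(logliks)) / bp_range if bp_range else 1.0

    selected: List[Point] = []
    for k in range(resolution + 1):
        # ASSUMPTION: linear weight schedule, from likelihood only (w = 0)
        # to breakpoints only (w = 1).
        w = k / resolution

        def score(p: Point) -> Tuple[float, int, float]:
            ll, bp = p
            # Minimise the weighted sum; ties are broken by fewer
            # breakpoints, then by higher loglikelihood.
            return ((1.0 - w) * (-ll) + w * norm * bp, bp, -ll)

        best = min(candidates, key=score)
        if best not in selected:
            selected.append(best)
    return selected


if __name__ == "__main__":
    maps = [(-100.0, 8), (-104.0, 5), (-103.0, 7), (-110.0, 2), (-120.0, 1)]
    for resolution in (4, 10):
        print(resolution, weighted_sum_sweep(maps, resolution))
```

In this toy data set, a coarse sweep can skip a map that a finer sweep selects, which is the role a resolution parameter plays in such a scheme. The map (-103.0, 7) is never selected by any weight even though no other map beats it on both criteria, which shows why a weighted sum alone cannot reach every non-dominated point.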
A map in the Pareto frontier is called dominated if there exists another map in the frontier that has fewer breakpoints and a better likelihood.
The best map in the frontier is called balanced.
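The definition of a dominated map given above translates directly into a filter. The following Python sketch is illustrative only: it assumes each map is summarised as a (loglikelihood, breakpoints) pair, and the function names is_dominated and non_dominated are hypothetical, not part of CARTHAGENE.

```python
from typing import List, Tuple

# (loglikelihood, breakpoints): higher loglikelihood and fewer
# breakpoints are both better.
Point = Tuple[float, int]


def is_dominated(p: Point, frontier: List[Point]) -> bool:
    """True if another map has fewer breakpoints and a better likelihood."""
    ll, bp = p
    return any(o_bp < bp and o_ll > ll for o_ll, o_bp in frontier)


def non_dominated(frontier: List[Point]) -> List[Point]:
    """Keep only the maps that no other map in the frontier dominates."""
    return [p for p in frontier if not is_dominated(p, frontier)]


if __name__ == "__main__":
    maps = [(-100.0, 8), (-104.0, 5), (-105.0, 6), (-110.0, 2)]
    print(non_dominated(maps))
```

Here (-105.0, 6) is dominated by (-104.0, 5), which has both fewer breakpoints and a higher loglikelihood, so it is filtered out.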
The NbRun and CollectMaps parameters control the LKH method; see lkh (2.5.8). To obtain non-supported maps and a better approximation of the Pareto frontier, use a CollectMaps value greater than or equal to 0. Each map found by the weighted sum method is assessed by computing its exact multipoint likelihood (2.8) and its exact number of breakpoints.
Note that the loglikelihoods written during the iterative process have no meaning (internal use only).
Try paretolkh 10 1 0 as default parameter values.
- Options : -u to obtain the synopsis of the normal use, -h to print a one-line description, -H to print a short help.
- Resolution : Controls the number of iterations of the weighted sum method. See above. This number should be related to the number of markers.
- NbRun : Repeats the LK heuristic NbRun times. At each run, LKH tries different starting points. The number of starting points is equal to the number of markers. If the heap is not empty, the best map in the heap is used as the first starting point; the other starting points are randomly generated.
- CollectMaps : Possible values are -1, 0, 2, 3, 4 and 5. If CollectMaps is greater than or equal to 0, every tour found by LKH is inserted into the CarthaGene heap. Moreover, a positive value sets the backtrack move type of LKH. If set to -1, only locally optimal tours w.r.t. the LK neighborhood are inserted into the heap.
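When paretolkh calls are generated from a script, the argument constraints listed above can be checked before the command is issued. The Python helper below is a hypothetical sketch, not part of CARTHAGENE: it only formats the command string. The set of allowed CollectMaps values comes from the list above, whereas the lower bounds on Resolution and NbRun are assumptions.

```python
# Allowed CollectMaps values, as listed in the parameter description.
ALLOWED_COLLECT_MAPS = (-1, 0, 2, 3, 4, 5)


def paretolkh_command(resolution: int, nb_run: int, collect_maps: int) -> str:
    """Validate the three arguments and return the command line as text."""
    # ASSUMPTION: at least one weighted-sum iteration and one LKH run.
    if resolution < 1:
        raise ValueError("Resolution must be at least 1")
    if nb_run < 1:
        raise ValueError("NbRun must be at least 1")
    if collect_maps not in ALLOWED_COLLECT_MAPS:
        raise ValueError(
            f"CollectMaps must be one of {ALLOWED_COLLECT_MAPS}, "
            f"got {collect_maps}"
        )
    return f"paretolkh {resolution} {nb_run} {collect_maps}"


if __name__ == "__main__":
    # The parameter values suggested in the text.
    print(paretolkh_command(10, 1, 0))
```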