[AIM] Components of Heuristics and Hill Climbing

Day 1

Outline

  • Benchmark function optimisation
  • Components of heuristic search
  • Representation
  • Evaluation
  • Neighbourhoods
  • Basic Hill Climbing/Local Search Methods
  • Strengths and weaknesses of hill climbing

Benchmark function optimisation

Benchmark function: a function that can be used to test the performance of an optimisation approach on a related problem.

Why use benchmark functions for optimisation?

Benchmark functions serve as a testbed for comparing the performance of heuristic optimisation algorithms:
  • The global minimum should be known
  • The function should be easy to compute
  • The function should be recognised to have certain characteristics (e.g. separable vs. non-separable)

Classification of benchmark functions

  • Continuity (differentiability): discontinuous vs. continuous
  • Dimensionality
  • Modality: unimodal vs. multimodal with few local minima vs. multimodal with an exponential number of local minima
  • Separability
Explanation:
1. What is meant by modality?
A mode is a peak value. In Wikipedia, the definition is as follows:
In mathematics, unimodality means possessing a unique mode. More generally, unimodality means there is only a single highest value, somehow defined, of some mathematical object.
Therefore, unimodal means there is only one peak (which, for minimisation, might be the single lowest value).
2. What is meant by continuity or differentiability?
They are actually different properties, but a differentiable function must be continuous.
In Wikipedia:
In calculus (a branch of mathematics), a differentiable function of one real variable is a function whose derivative exists at each point in its domain. As a result, the graph of a differentiable function must have a (non-vertical) tangent line at each point in its domain, be relatively smooth, and cannot contain any breaks, bends, or cusps.

Main Components of Heuristic (search) optimisation methods

  • Representation (encoding) of candidate solutions
  • Evaluation function
  • Initialisation (e.g. random)
  • Neighbourhood relation (move operators)
  • Search process (guideline)
  • Mechanism for escaping from local optima

Representation

We have different problems, such as the 0/1 knapsack problem, personnel rostering, DNA sequencing, parameter control, …
We need a different representation depending on the problem:
  • Binary encoding: the most common choice for the knapsack problem
  • Integer representation: assignment problems, such as personnel rostering or timetabling
  • Permutation encoding: used for ordering problems such as the TSP and sequencing problems. Unlike an integer representation, a permutation cannot repeat a value; it can only reorder the values
  • Value encoding: values can be numbers, characters, or words; used in DNA sequencing and planning
  • Nonlinear encoding: tree encoding, as used in genetic programming
  • Special encoding: for example, random-key encoding (a sketch of several encodings follows below)
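As a rough illustration (the problem sizes and instances below are made up for the example), the different encodings might look like this in code:

```python
import random

# Binary encoding: one bit per item of a hypothetical 0/1 knapsack instance
n_items = 10
knapsack_solution = [random.randint(0, 1) for _ in range(n_items)]

# Integer encoding: assign each of 7 shifts to one of 3 employees (toy rostering example)
n_shifts, n_employees = 7, 3
roster_solution = [random.randrange(n_employees) for _ in range(n_shifts)]

# Permutation encoding: visiting order of 5 cities in a toy TSP instance
n_cities = 5
tsp_solution = list(range(n_cities))
random.shuffle(tsp_solution)          # each city appears exactly once

# Random-key encoding: real numbers whose sort order decodes to a permutation
keys = [random.random() for _ in range(n_cities)]
decoded_permutation = sorted(range(n_cities), key=lambda i: keys[i])
```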

Evaluation Function

Also referred to as objective, cost, fitness, penalty, etc.
  • Indicates the quality of a given solution, distinguishing between better and worse individuals
  • Serves as a major link between the algorithm and the problem being solved
  • Evaluation functions could be computationally expensive

Delta evaluation

Key idea: calculate the effect of the differences between the current search position s and a neighbour s' on the evaluation function value.
  • Evaluation function values often consist of independent contributions from solution components; hence f(s') can be computed efficiently from f(s) using only the components in which s and s' differ.
  • For example, in the TSP (delta evaluation of a move, sketched below)
  • Likewise in function optimisation
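A minimal sketch of delta evaluation for a symmetric TSP under a 2-opt move (reversing the segment between two cut points): only two edges are removed and two are added, so the new tour length can be obtained from the old one in constant time. The distance matrix and tour below are illustrative assumptions.

```python
def tour_length(tour, dist):
    """Full O(n) evaluation: sum of edge lengths around the cycle."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def two_opt_delta(tour, dist, i, j):
    """Change in tour length if the segment tour[i+1..j] is reversed (i < j).
    Only two edges are removed and two are added, so this is O(1)."""
    n = len(tour)
    a, b = tour[i], tour[i + 1]          # edge (a, b) is removed
    c, d = tour[j], tour[(j + 1) % n]    # edge (c, d) is removed
    return (dist[a][c] + dist[b][d]) - (dist[a][b] + dist[c][d])

# Toy 4-city instance (symmetric distance matrix, purely illustrative)
dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 3],
        [10, 4, 3, 0]]
tour = [0, 1, 2, 3]
delta = two_opt_delta(tour, dist, 0, 2)   # f(s') = f(s) + delta
print(tour_length(tour, dist), delta)
```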

Neighbourhood

A neighbourhood of a solution x is the set of solutions that can be reached from x by a simple operator (move operator/heuristic).

Examples of neighbourhood

Binary representation

Flipping one bit of the solution
1 0 0 0 1 1 0 0 0 0 -> 0 0 0 0 1 1 0 0 0 0
Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different.
Size: if the binary vector is of size n then the neighbourhood size is n.
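A small sketch of enumerating the bit-flip neighbourhood; a vector of length n yields exactly n neighbours, each at Hamming distance 1 from the original:

```python
def bit_flip_neighbours(solution):
    """Yield every solution obtained by flipping exactly one bit."""
    for i in range(len(solution)):
        neighbour = solution[:]            # copy the current solution
        neighbour[i] = 1 - neighbour[i]    # flip bit i
        yield neighbour

s = [1, 0, 0, 0, 1, 1, 0, 0, 0, 0]
print(len(list(bit_flip_neighbours(s))))   # 10 neighbours for a length-10 vector
```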

Integer representation

Neighbourhood: a discrete value is replaced by any other character of the alphabet.
5 7 6 6 4 3 8 4 2 -> 7 7 6 6 4 3 8 4 2
Size: if the alphabet is of size k then the neighbourhood size is (k-1)n
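The analogous sketch for an integer representation: each of the n positions can be replaced by any of the k-1 other symbols, giving (k-1)*n neighbours:

```python
def integer_neighbours(solution, k):
    """Yield every solution in which one position takes a different value from {0, ..., k-1}."""
    for i in range(len(solution)):
        for v in range(k):
            if v != solution[i]:
                neighbour = solution[:]
                neighbour[i] = v
                yield neighbour

s = [5, 7, 6, 6, 4, 3, 8, 4, 2]
print(len(list(integer_neighbours(s, k=9))))   # (9 - 1) * 9 = 72 neighbours
```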

Permutation

Adjacent pairwise interchange: 5 1 4 3 2 -> 1 5 4 3 2
Size: if permutation is of size n then the neighbourhood size is n-1
Insertion operator: take an arbitrary job in the permutation and insert it in another position 5 1 4 3 2 -> 1 4 5 3 2
Size: if the permutation is of size n then the neighbourhood size is n*(n-1)
Exchange operator: arbitrary selected two elements are swapped
5 1 4 3 2 -> 3 1 4 5 2
Inversion operator: arbitrary select two elements and invert the sequence between them
5 1 4 3 2 -> 5 3 4 1 2
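A compact sketch of these four permutation move operators, reproducing the examples above:

```python
def adjacent_interchange(p, i):
    """Swap the elements at positions i and i+1."""
    q = p[:]
    q[i], q[i + 1] = q[i + 1], q[i]
    return q

def insertion(p, i, j):
    """Remove the element at position i and insert it at position j of the shortened list."""
    q = p[:]
    q.insert(j, q.pop(i))
    return q

def exchange(p, i, j):
    """Swap the elements at two arbitrary positions i and j."""
    q = p[:]
    q[i], q[j] = q[j], q[i]
    return q

def inversion(p, i, j):
    """Reverse the segment between positions i and j (inclusive)."""
    q = p[:]
    q[i:j + 1] = reversed(q[i:j + 1])
    return q

p = [5, 1, 4, 3, 2]
print(adjacent_interchange(p, 0))  # [1, 5, 4, 3, 2]
print(insertion(p, 0, 2))          # [1, 4, 5, 3, 2]
print(exchange(p, 0, 3))           # [3, 1, 4, 5, 2]
print(inversion(p, 1, 3))          # [5, 3, 4, 1, 2]
```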

Summary: representation, evaluation and neighbourhood

  • Choosing an appropriate encoding to represent a candidate solution is crucial in heuristic optimisation
  • Initialisation could influence the performance of an optimisation algorithm
  • Evaluation function guides the search process and fast evaluation is important
  • Benchmarks are useful for performance comparison of different algorithms

Search paradigms

  • Mutational (diversification/exploration): processes a given candidate solution and generates a solution which is not guaranteed to be better than the input
  • Hill climbing (intensification/exploitation): processes a given candidate solution and generates a solution of better or equal quality

Hill Climbing Algorithm


Initial starting points may be chosen:
  • randomly
  • use a constructive heuristic
  • according to some regular pattern
  • based on other information (e.g. results of a prior search)
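A minimal skeleton of the hill-climbing loop itself (best-improvement variant, written here for maximisation); evaluate and neighbours are placeholders for a problem-specific evaluation function and move operator, e.g. the bit-flip neighbourhood sketched earlier:

```python
def hill_climb(initial, evaluate, neighbours, max_iters=1000):
    """Move to the best neighbour until no neighbour improves the current solution."""
    current, current_value = initial, evaluate(initial)
    for _ in range(max_iters):
        best, best_value = None, current_value
        for candidate in neighbours(current):
            value = evaluate(candidate)
            if value > best_value:            # maximisation; flip the comparison to minimise
                best, best_value = candidate, value
        if best is None:                      # no improving neighbour: local optimum
            return current, current_value
        current, current_value = best, best_value
    return current, current_value

# Example: maximise the number of 1s (OneMax) with the bit-flip neighbourhood sketched earlier
# solution, value = hill_climb([0] * 10, evaluate=sum, neighbours=bit_flip_neighbours)
```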

Simple Hill Climbing Heuristics (classification)

Simple hill climbing, examining all neighbours:
  • Best improvement (steepest descent/ascent)
  • First improvement (gradient descent/ascent)
Stochastic hill climbing (randomly chosen neighbours):
  • Random selection / random mutation hill climbing
Random-restart (shotgun) hill climbing is built on top of hill climbing: it repeatedly restarts the hill climber from a new random starting solution and returns the best solution found (see the sketch below).
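A sketch of the random-restart wrapper, assuming a binary representation and the hill_climb and bit_flip_neighbours sketches given earlier in this post:

```python
import random

def random_restart_hill_climbing(n_restarts, n_bits, evaluate):
    """Run hill climbing from several random binary starting points and keep the best result.
    Relies on the hill_climb and bit_flip_neighbours sketches above."""
    best, best_value = None, float('-inf')
    for _ in range(n_restarts):
        start = [random.randint(0, 1) for _ in range(n_bits)]
        solution, value = hill_climb(start, evaluate, bit_flip_neighbours)
        if value > best_value:
            best, best_value = solution, value
    return best, best_value
```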

Best improvement

Flip one bit at a time; regardless of whether the flip improves the solution, return to the current solution before flipping the next bit.
After all bits have been tried, move to the best resulting neighbour.
MAX-SAT as an example:
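A sketch of one best-improvement pass over a bit string; the evaluation function is left abstract (for MAX-SAT it would count the number of satisfied clauses):

```python
def best_improvement_step(solution, evaluate):
    """Flip every bit of the current solution in turn (each flip starts again
    from the unchanged solution) and return the single best neighbour found."""
    best, best_value = solution, evaluate(solution)
    for i in range(len(solution)):
        neighbour = solution[:]
        neighbour[i] = 1 - neighbour[i]      # flip bit i only
        value = evaluate(neighbour)
        if value > best_value:               # remember the best flip seen so far
            best, best_value = neighbour, value
    return best, best_value
```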

First improvement

Flip one bit at a time; if the flip improves the solution, accept it, otherwise reject it and restore the bit.
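The corresponding first-improvement pass, again with an abstract evaluation function:

```python
def first_improvement_step(solution, evaluate):
    """Flip bits from left to right and accept the first flip that improves the solution."""
    current_value = evaluate(solution)
    for i in range(len(solution)):
        neighbour = solution[:]
        neighbour[i] = 1 - neighbour[i]
        value = evaluate(neighbour)
        if value > current_value:            # improvement found: accept it and stop
            return neighbour, value
    return solution, current_value           # no improving flip: local optimum
```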

Davis’s bit hill climbing

Generate a random permutation of the bit positions and flip the bits in that order. The acceptance criterion is the same as for first improvement.
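A sketch of Davis's bit hill climbing under the description above (random visiting order, accepting only improving flips):

```python
import random

def davis_bit_hill_climbing(solution, evaluate):
    """Visit the bit positions in a random permutation order, accepting each flip
    only if it improves the solution (same criterion as first improvement)."""
    current, current_value = solution[:], evaluate(solution)
    order = list(range(len(solution)))
    random.shuffle(order)                    # random order in which bits are tried
    for i in order:
        neighbour = current[:]
        neighbour[i] = 1 - neighbour[i]
        value = evaluate(neighbour)
        if value > current_value:
            current, current_value = neighbour, value
    return current, current_value
```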

Random mutation hill-climbing

Bit positions to flip are drawn at random (a random integer is generated for each step, so the same bit may be chosen more than once).
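A sketch of random mutation hill climbing; a common formulation also accepts equal-quality moves, which is assumed here:

```python
import random

def random_mutation_hill_climbing(solution, evaluate, n_iters=1000):
    """At each step flip a bit chosen uniformly at random (repeats allowed);
    keep the flip only if the result is at least as good (assumed acceptance rule)."""
    current, current_value = solution[:], evaluate(solution)
    for _ in range(n_iters):
        i = random.randrange(len(current))   # bit position drawn at random
        neighbour = current[:]
        neighbour[i] = 1 - neighbour[i]
        value = evaluate(neighbour)
        if value >= current_value:
            current, current_value = neighbour, value
    return current, current_value
```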

Hill Climbing vs Random Walk

  • A hill-climbing method exploits the best available solution for possible improvement but neglects exploring a large portion of the search space.
  • A random walk (which samples new points from the search space with equal probability, e.g. by random bit flips) explores the search space thoroughly but misses exploiting promising regions.

Weaknesses of hill climbing

  • Local optimum: all neighbouring states are worse or the same; the algorithm halts even though the solution may be far from satisfactory.
  • Plateau (neutral space/shoulder): all neighbouring states are the same as the current state, i.e. the evaluation function is essentially flat; the search degenerates into a random walk.
  • Ridge/valley: the search may oscillate from side to side, making little progress. In each case the algorithm reaches a point at which no progress is being made; an obvious remedy is to restart from a different starting point.
  • As a result, hill climbing may not find the optimal solution and may get stuck at a local optimum.
