optim_pso {ppso}    R Documentation

(Parallel) optimization using Particle Swarm Optimization or Dynamically Dimensioned Search

Description

optim_pso minimizes a given function objective_function with regard to its parameters contained in the vector par towards a minimum value of f using Particle Swarm Optimization. optim_ppso_robust is the parallelized version (using multiple CPUs). optim_dds minimizes using a Dynamically Dimensioned Search, with optim_pdds_robust being the parallel version.

Usage

optim_pso(objective_function = sample_function, number_of_parameters = 2,
  number_of_particles = 40, max_number_of_iterations = 5,
  max_number_function_calls = 500, w = 1, C1 = 2, C2 = 2,
  abstol = -Inf, reltol = -Inf, max_wait_iterations = 50,
  wait_complete_iteration = FALSE,
  parameter_bounds = cbind(rep(-1, number_of_parameters),
                           rep(1, number_of_parameters)),
  initial_estimates = NULL,
  Vmax = (parameter_bounds[, 2] - parameter_bounds[, 1])/3,
  lhc_init = FALSE, do_plot = NULL, wait_for_keystroke = FALSE,
  logfile = "ppso.log", projectfile = "ppso.pro",
  save_interval = ceiling(number_of_particles/4),
  load_projectfile = "try", break_file = NULL, plot_progress = FALSE,
  tryCall = FALSE, verbose = FALSE, ...)

optim_ppso_robust(objective_function = sample_function, number_of_parameters = 2,
  number_of_particles = 40, max_number_of_iterations = 5,
  max_number_function_calls = 500, w = 1, C1 = 2, C2 = 2,
  abstol = -Inf, reltol = -Inf, max_wait_iterations = 50,
  wait_complete_iteration = FALSE,
  parameter_bounds = cbind(rep(-1, number_of_parameters),
                           rep(1, number_of_parameters)),
  initial_estimates = NULL,
  Vmax = (parameter_bounds[, 2] - parameter_bounds[, 1])/3,
  lhc_init = FALSE, do_plot = NULL, wait_for_keystroke = FALSE,
  logfile = "ppso.log", projectfile = "ppso.pro",
  save_interval = ceiling(number_of_particles/4),
  load_projectfile = "try", break_file = NULL, plot_progress = FALSE,
  tryCall = FALSE, nslaves = -1, working_dir_list = NULL,
  execution_timeout = NULL, maxtries = 10, verbose = FALSE, ...)

optim_dds(objective_function = sample_function, number_of_parameters = 2,
  number_of_particles = 1, max_number_function_calls = 500,
  r = 0.2, abstol = -Inf, reltol = -Inf, max_wait_iterations = 50,
  parameter_bounds = cbind(rep(-1, number_of_parameters),
                           rep(1, number_of_parameters)),
  initial_estimates = NULL, part_xchange = 2,
  lhc_init = FALSE, do_plot = NULL, wait_for_keystroke = FALSE,
  logfile = "dds.log", projectfile = "dds.pro",
  save_interval = ceiling(number_of_particles/4),
  load_projectfile = "try", break_file = NULL, plot_progress = FALSE,
  tryCall = FALSE, verbose = FALSE, ...)

optim_pdds_robust(objective_function = sample_function, number_of_parameters = 2,
  number_of_particles = 1, max_number_function_calls = 500,
  r = 0.2, abstol = -Inf, reltol = -Inf, max_wait_iterations = 50,
  parameter_bounds = cbind(rep(-1, number_of_parameters),
                           rep(1, number_of_parameters)),
  initial_estimates = NULL, part_xchange = 2,
  lhc_init = FALSE, do_plot = NULL, wait_for_keystroke = FALSE,
  logfile = "dds.log", projectfile = "dds.pro",
  save_interval = ceiling(number_of_particles/4),
  load_projectfile = "try", break_file = NULL, plot_progress = FALSE,
  tryCall = FALSE, nslaves = -1, working_dir_list = NULL,
  execution_timeout = NULL, maxtries = 10, verbose = FALSE, ...)

Arguments

objective_function

function whose numerical return value is to be minimized. objective_function needs to accept a vector X as argument, which contains number_of_parameters numerical values. objective_function is expected to return a numerical value and handle exceptions (out-of-bounds-parameters, etc.) on its own by returning Inf.
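
The contract described above can be illustrated with a minimal sketch. The function name safe_objective, the shifted-sphere target and the bounds of -1 to 1 are assumptions chosen for illustration; they are not part of the package.

```r
# Hypothetical objective: shifted sphere with its minimum 0 at c(0.5, -0.25).
safe_objective <- function(X) {
  # Handle invalid input on our own: non-finite or out-of-bounds -> Inf
  if (any(!is.finite(X)) || any(abs(X) > 1)) return(Inf)
  # Trap any internal error and report it as Inf as well
  tryCatch(sum((X - c(0.5, -0.25))^2), error = function(e) Inf)
}

inside  <- safe_objective(c(0.5, -0.25))  # evaluated normally
outside <- safe_objective(c(2, 0))        # rejected, returns Inf
```

A function written this way could then be passed as objective_function.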

number_of_parameters

Number of parameters to be optimized.

number_of_particles

Number of particles used in the Particle Swarm Optimization / number of parallel threads in DDS.

max_number_of_iterations

Abortion criterion: Maximum number of iterations (i.e. updates of particle positions) to perform.

max_number_function_calls

Abortion criterion: Maximum total number of calls to the objective function; ignored when set to NULL. In non-parallel mode, slightly more function calls may actually be made to finish the last iteration. When resuming from a previous run, the previously made function calls also count, i.e. only the pending number of calls needed to reach max_number_function_calls will be made. In contrast, if max_number_function_calls is negative, -max_number_function_calls MORE calls will be made, in addition to those of the previous run.
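
The bookkeeping for resumed runs can be written out as a small sketch. The helper pending_calls and its argument calls_so_far are hypothetical names; the code only restates the rule given above and is not taken from the package.

```r
# Number of further objective-function calls to expect when resuming,
# given the calls already recorded from the previous run.
pending_calls <- function(max_number_function_calls, calls_so_far) {
  if (is.null(max_number_function_calls)) return(Inf)  # criterion ignored
  if (max_number_function_calls < 0) {
    return(-max_number_function_calls)  # negative: that many MORE calls
  }
  max(0, max_number_function_calls - calls_so_far)     # positive: only the rest
}

pending_calls(500, 200)   # 300 calls remain
pending_calls(-500, 200)  # 500 additional calls
```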

abstol

Abortion criterion: minimum absolute improvement between iterations (default: -Inf)

reltol

Abortion criterion: minimum absolute relative improvement between iterations (default: -Inf)

max_wait_iterations

Number of iterations within which an improvement meeting the criteria described above (abstol, reltol) has to be achieved; otherwise the optimization is aborted.
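
How abstol, reltol and max_wait_iterations might interact is sketched below. This is one plausible reading of the descriptions above, not the package's implementation; the function has_stalled and the vector best_values (best objective value after each iteration) are assumptions.

```r
# TRUE if the best value improved by less than abstol (absolute) or
# reltol (relative) over the last max_wait_iterations iterations.
has_stalled <- function(best_values, abstol = -Inf, reltol = -Inf,
                        max_wait_iterations = 50) {
  n <- length(best_values)
  if (n <= max_wait_iterations) return(FALSE)  # not enough history yet
  old <- best_values[n - max_wait_iterations]
  new <- best_values[n]
  abs_improvement <- old - new                 # minimization: old >= new
  rel_improvement <- abs(abs_improvement / old)
  # isTRUE() guards against NaN when old is 0
  isTRUE(abs_improvement < abstol) || isTRUE(rel_improvement < reltol)
}

flat <- c(10, 5, 5, 5, 5)
has_stalled(flat, abstol = 1e-3, max_wait_iterations = 3)  # no gain in 3 iterations
has_stalled(flat, max_wait_iterations = 3)                 # defaults never trigger
```

With the default -Inf for both tolerances, no improvement can fall below the threshold, so this criterion stays inactive.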

wait_complete_iteration

TRUE: Update particle velocities after function objective_function has been evaluated for ALL particles. FALSE: Update particle velocities after each completed evaluation of function objective_function (suggested in parallel mode).

parameter_bounds

(row-named) matrix containing lower (first column) and upper (second column) boundary for each parameter. See details.

initial_estimates

(row-named) matrix of initial estimates, one estimate per column. Pending computations from a resumed project file are used first, then all initial_estimates are used.

Vmax

maximum velocity of particles in parameter space

lhc_init

set starting positions of particles using Latin Hypercube Sampling (TRUE, requires package lhc) or purely random (FALSE)

logfile

Name of logfile for optional logging of ALL model runs. NULL disables logging. If a prior optimization run is successfully resumed (see projectfile), an existing logfile is appended to.

projectfile

Name of project file for optional logging of state of optimization, which enables resuming aborted optimizations. An existing file is overwritten! NULL disables project file.

save_interval

minimum number of function evaluations to compute before the projectfile is updated.

load_projectfile

"yes": require projectfile to initialise particles; "try": load projectfile, if existent; "no": ignore existing projectfile

break_file

Name of a file that will cause the termination of the optimization when it is encountered. Useful for interrupting a running optimization gracefully. At start-up, any existing file of this name will be deleted.

plot_progress

If set to TRUE, corresponds to repeated calls of plot_optimization_progress with default parameters in an interval of save_interval. If plot_progress is a named list, it is parsed as parameters to plot_optimization_progress.

tryCall

If set to TRUE, objective_function is executed using try, which yields error messages if objective_function fails on any slave. The optimization retries, but terminates when an error is produced a second time for the same parameter set. For very fast evaluating functions, this may increase evaluation time.

...

additional arguments passed to objective_function

for optim_*pso*:

w

Inertia constant, i.e. "weight" of the particles.

C1

Cognitive component. Weighting factor for the "personal" experience of each particle.

C2

Social component. Weighting factor for the "swarm" experience of each particle.
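
The roles of w, C1, C2 and Vmax can be seen in the textbook form of the PSO update, sketched below. This is the standard formulation, not necessarily the exact update used inside optim_pso; the function pso_step and its arguments pbest (the particle's personal best) and gbest (the swarm's best) are assumptions for illustration.

```r
# One textbook PSO update for a single particle.
pso_step <- function(x, v, pbest, gbest, w = 1, C1 = 2, C2 = 2, Vmax = Inf) {
  r1 <- runif(length(x))                    # random weights, one per dimension
  r2 <- runif(length(x))
  v_new <- w * v +                          # inertia
    C1 * r1 * (pbest - x) +                 # cognitive ("personal") pull
    C2 * r2 * (gbest - x)                   # social ("swarm") pull
  v_new <- pmax(pmin(v_new, Vmax), -Vmax)   # limit velocity to [-Vmax, Vmax]
  list(x = x + v_new, v = v_new)
}

set.seed(1)
step <- pso_step(x = c(0.8, -0.6), v = c(0, 0),
                 pbest = c(0.5, -0.2), gbest = c(0, 0),
                 Vmax = c(2, 2) / 3)        # default Vmax for bounds of -1 and 1
```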

for optim_*dds*:

r

Neighbourhood size perturbation parameter. Default value: 0.2.

part_xchange

Relevant for number_of_particles > 1: mode how DDS particles (i.e. parallel DDS-threads) communicate with each other between iterations:

  • 0: no communication/relocation between particles

  • 1: relocate/update particle that is worst in both objective function AND futile iterations

  • 2: relocate all but the best particle

  • 3: relocate particle that is worst in futile iterations (but not the global best) and set to best improving particle
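
How the neighbourhood parameter r enters the search can be illustrated with the candidate-generation step described by Tolson and Shoemaker (2007). The following is a sketch of that published description, not the code used inside optim_dds; the function dds_candidate and its arguments i (index of the current function call) and m (maximum number of calls) are assumptions for illustration.

```r
# Generate one DDS candidate from the current best solution x_best.
dds_candidate <- function(x_best, bounds, i, m, r = 0.2) {
  lower <- bounds[, 1]
  upper <- bounds[, 2]
  d <- length(x_best)
  p <- 1 - log(i) / log(m)                  # fewer dimensions as i approaches m
  selected <- runif(d) < p
  if (!any(selected)) selected[sample.int(d, 1)] <- TRUE  # perturb at least one
  x_new <- x_best
  x_new[selected] <- x_best[selected] +
    rnorm(sum(selected)) * r * (upper - lower)[selected]
  # Reflect at the bounds; if the reflection overshoots, use the bound itself
  below <- x_new < lower
  above <- x_new > upper
  x_new[below] <- 2 * lower[below] - x_new[below]
  x_new[above] <- 2 * upper[above] - x_new[above]
  overshoot_low  <- below & x_new > upper
  overshoot_high <- above & x_new < lower
  x_new[overshoot_low]  <- lower[overshoot_low]
  x_new[overshoot_high] <- upper[overshoot_high]
  x_new
}

set.seed(42)
bounds <- cbind(rep(-1, 3), rep(1, 3))
candidate <- dds_candidate(c(0.9, -0.9, 0), bounds, i = 10, m = 500)
```

A larger r widens the perturbation relative to the parameter range, which makes the search more exploratory.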

for the parallel versions optim_ppso*, optim_pdds*:

nslaves

number of rmpi slaves to spawn (default -1: as many as possible, requires package Rmpi)

working_dir_list

String matrix of working directories the slaves should change to. The first column holds the hostname, the second column the directory. A hostname may be listed several times if more than one slave is run on it. If a host is listed fewer times than it has slaves, the last entry will be recycled as necessary. The entry "default" denotes the directory all slaves of unlisted hosts will change to. If missing, slaves on unlisted hosts will stay in the directory they are spawned in (don't ask me which this is). Beware: "~" apparently doesn't work, so rather use the full path.
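
The lookup rules described above can be made concrete with a small sketch. The helper resolve_wd and its argument slave_index (the running number of a slave on its host) are hypothetical; the code only restates the described rules and is not the package's implementation.

```r
working_dir_list <- rbind(
  c("default", "/tmp/defaultdir"),
  c("host1",   "/home/me/firstthread"),
  c("host1",   "/home/me/secondthread")
)

# Directory for the slave_index-th slave on a given host.
resolve_wd <- function(working_dir_list, host, slave_index) {
  dirs <- working_dir_list[working_dir_list[, 1] == host, 2]
  if (length(dirs) == 0) {
    # Unlisted host: fall back to the "default" entry, if there is one
    dirs <- working_dir_list[working_dir_list[, 1] == "default", 2]
    if (length(dirs) == 0) return(NA_character_)  # stay in spawn directory
    return(dirs[1])
  }
  dirs[min(slave_index, length(dirs))]            # recycle the last entry
}

resolve_wd(working_dir_list, "host1", 3)  # third slave reuses the last entry
resolve_wd(working_dir_list, "host9", 1)  # unlisted host uses "default"
```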

execution_timeout

If set to a factor greater than 1, a slave whose execution time has exceeded (execution_timeout times the mean of its prior calls) for more than three times is no longer used. This also applies when the slave produces its first result very slowly compared to the other slaves.

maxtries

Number of times a slave may exceed execution_timeout in a row before being excluded from further tasks. Ignored, when execution_timeout is not set.

verbose

generate screen output documenting the message passing

for didactic purposes (only for two-parameter search and fast objective function, non-parallelized):

do_plot

enable 3D-plot of response surface and search progress. "base": use basic 3D-plotting functions. "rgl": use moveable rgl-plotting (requires package rgl)

wait_for_keystroke

waiting for keystroke between iterations (e.g. for plotting). Option for changing into debug mode.

Details

If parameter_bounds (dominant) or initial_estimates has named rows, these names are used in the vector passed to the call to objective_function.
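
Named rows let the objective function address parameters by name instead of by position. The sketch below assumes hypothetical parameter names (slope, intercept) and emulates the named vector that would be passed to the function; it does not call the package.

```r
# Bounds with row names: first column lower, second column upper
parameter_bounds <- rbind(slope = c(-5, 5), intercept = c(0, 10))

# Objective that picks its parameters by name
named_objective <- function(X) {
  (X["slope"] - 2)^2 + (X["intercept"] - 3)^2
}

# Emulate a named parameter vector (here: the centre of the bounds)
X <- setNames(rowMeans(parameter_bounds), rownames(parameter_bounds))
centre_value <- unname(named_objective(X))  # (0 - 2)^2 + (5 - 3)^2 = 8
```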

recommended use:

concerning reproducibility: setting the seed using set.seed() should make results reproducible. However, in the parallel versions, the randomness introduced by the order of the results returned from the slaves cannot be controlled, so this may limit reproducibility.
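
The effect of set.seed() on a serial stochastic search can be demonstrated without the package. In the sketch below, random_search is a hypothetical stand-in for an optimizer; the same pattern (seed first, then optimize) would apply to the non-parallel functions.

```r
# Stand-in optimizer: evaluate n uniformly random points in [-1, 1]^2
random_search <- function(f, n = 100) {
  X <- matrix(runif(2 * n, -1, 1), ncol = 2)
  values <- apply(X, 1, f)
  best <- which.min(values)
  list(par = X[best, ], value = values[best])
}

sphere <- function(X) sum(X^2)

set.seed(2024)
run1 <- random_search(sphere)
set.seed(2024)
run2 <- random_search(sphere)
same <- identical(run1, run2)  # same seed, same sequence of random draws
```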

Value

The functions return a list with the elements

par

parameters (i.e. location) of minimum found

value

value of objective function at minimum

function_calls

total number of function calls performed

break_flag

criterion that caused the termination of the algorithm (c("max number of function calls reached", "max iterations reached", "converged", "all slaves gone", "user interrupt"))
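
A caller would typically check break_flag before trusting the result. The sketch below uses a mock list with made-up values that mimics the structure described above; it is not output from an actual run.

```r
# Mock result with made-up values, standing in for the returned list
result <- list(par = c(0.01, -0.02), value = -1.98,
               function_calls = 500,
               break_flag = "max number of function calls reached")

converged <- identical(result$break_flag, "converged")
if (!converged) {
  message("Stopped without convergence (", result$break_flag, ") after ",
          result$function_calls, " calls; best value so far: ", result$value)
}
```

If the run stopped because a call or iteration limit was reached, resuming from the project file with a higher limit may be worthwhile.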

Note

- still in development, use with care, all comments welcome -

Parallelization is a great thing, especially when it works. On its dark side, you just parallelize errors and multiply debugging effort. Enabling verbose=TRUE and tryCall=TRUE may help to find the error.
Master or slave crashes can cause the entire session to stall, so restarting R may be necessary. Under Windows, orphaned mpi-sessions and slaves may have to be killed manually to prevent stalling the entire system: Use "Process Explorer" (free) and kill the smpd.exe-tree and mpiexec.exe.

Author(s)

Till Francke <francke_at_uni-potsdam.de>

References

Tolson, B. A., and C. A. Shoemaker (2007) Dynamically dimensioned search algorithm for computationally efficient watershed model calibration, Water Resour. Res., 43, W01413, doi:10.1029/2005WR004723. http://www.agu.org/journals/wr/wr0701/2005WR004723/

See Also

plot_optimization_progress, sample_function

Examples

 library(ppso)
#simple application (all file I/O disabled)
	result = optim_pso(objective_function=rastrigin_function, projectfile=NULL, logfile=NULL)
	print (result)        #actual minimum -2 at (0,0) 
	result = optim_dds(objective_function=rastrigin_function, projectfile=NULL, logfile=NULL)
	print (result)


## Not run: 
#simple application with visualisation
	result = optim_pso(objective_function=rastrigin_function, projectfile=NULL, logfile=NULL, do_plot="rgl")
	print (result)
	result = optim_dds(objective_function=rastrigin_function, projectfile=NULL, logfile=NULL, do_plot="base")
	print (result)

## End(Not run)

#writing and resuming from project file
	projectfile	=tempfile()
	result = optim_pso(objective_function=rastrigin_function, projectfile=projectfile, logfile=NULL, load_projectfile="no")		#start optimization, generate project file
	print (result)

	result = optim_pso(objective_function=rastrigin_function, projectfile=projectfile, logfile=NULL, load_projectfile="yes")		#resume optimization from previous project file
	print (result)
	unlink(projectfile)	#delete project file

## Not run: 
#visualisation of progress
	projectfile	=tempfile()
	logfile	=tempfile()
	result = optim_pso(objective_function=rastrigin_function, projectfile=projectfile, logfile=logfile, load_projectfile="no", plot_progress=TRUE)		#start optimization, generate project file and log file
	unlink(c(projectfile,logfile))	#delete log file and project file



#parallel application
	result = optim_ppso_robust(objective_function=rastrigin_function, nslaves=2, max_number_function_calls=200, projectfile=NULL, logfile=NULL)
	print (result)

	result = optim_pdds_robust(objective_function=rastrigin_function, nslaves=2, max_number_function_calls=200, projectfile=NULL, logfile=NULL)
	print (result)
	
  working_dir_list=rbind( #specify working directories for four slaves
  c(host="default", wd="/tmp/defaultdir"),
  c(host="host1", wd="/home/me/firstthread"),
  c(host="host1", wd="/home/me/secondthread"),
  c(host="host2", wd="/home/me2/onlyonethread")
  )
	result = optim_pdds_robust(objective_function=rastrigin_function, nslaves=4, working_dir_list=working_dir_list)    

## End(Not run)


[Package ppso version 0.9-9994 Index]