generated with version 0.1.1
Discrete-event simulation (DES) is very useful in analyzing processes (amongst other application). Using DES one could for example give an answer to the question:
If we have x amount of resources of type A, what will the average waiting time in the process be?
Using simmer
this becomes a problem that is quite easy
to answer, but what if we rephrase the question?
What amount x of resources of type A minimizes the waiting time, while still maintaining a utilization level of ρA?
This question is less straight forward to answer. One way of going about this is to manually adjust the value of x and rerun the simulation for each value of xi, ..., xn and evaluate the results after each run. While definitely not impossilbe to do this for x, it becomes quite bothersome when doing the same exercise not only for x, but also for y and z.
Another approach would be to use a parameter optimization method.
simmer.optim
is an implementation of a parameter
optimization method specifically built as a plugin for
simmer
.
For the purpose of this vignette, we work with the following key definitions:
See the basic simmer
example below.
library(simmer)
library(dplyr)
t0<-trajectory() %>%
seize("nurse") %>%
timeout(function() rpois(1, 10)) %>%
release("nurse") %>%
seize("cardiologist") %>%
timeout(function() rpois(1, 20)) %>%
release("cardiologist")
envs <- lapply(1:100, function(i){
simmer() %>%
add_generator("patient", t0, at(seq(0,60*4, 15))) %>%
add_resource("nurse", 1) %>%
add_resource("cardiologist", 1) %>%
run()
})
In this example, a patient first sees a nurse, followed by a visit to the cadiologist. The consultation is planned to last 4 hours (or 240 minutes).
In the plot below, we can see that the process isn’t very stable. Moreover it lasts a good bit more than 4 hours.
The number of patients served in this process is:
## [1] 17
While the number of patients served before the planned end of the consultation (after 4 hours) is:
get_mon_arrivals(envs) %>%
filter(end_time < 4*60) %>%
group_by(replication) %>%
summarise(n=n()) %>%
.$n %>% mean
## [1] 10.87
And the average waiting time of the patients served before the planned end of the consultation:
get_mon_arrivals(envs) %>%
filter(end_time < 4*60) %>%
mutate(waiting_time = end_time - start_time ) %>%
.$waiting_time %>% mean
## [1] 54.23275
The hospital wants to reorganize the process. The process requirements are specified as follows:
To optimize the scenario we are going to check a number of different values for the following variables.
1:4
for each
resourcec(10, 15, 20, 25)
For this small problem, testing all the possiblities by hand will become somewhat labourious:
expand.grid(nr_nurses = 1:4,
nr_cardiologists = 1:4,
interarrival_time = c(10, 15, 20, 25)) %>% nrow()
## [1] 64
In other words, we would have to re-run the simulation 64 times to
test each possible combinations. By hand this is quite cumbersome, but
simmer.optim
tries to ease this process.
At the moment there are three types of parameter optimization methods available: grid optimization (extensively testing all combinations), differential evolution and simulated annealing. For this example we will apply the grid optimization method.
We start by taking the same definition of the simulation model, but
wrapping it in a function. We wait with calling run(env)
on
the model, as the optimization framework willt take care of this for
us.
library(simmer.optim)
sim_model<-function(){
t0<-trajectory() %>%
seize("nurse") %>%
timeout(function() rpois(1, 10)) %>%
release("nurse") %>%
seize("cardiologist") %>%
timeout(function() rpois(1, 20)) %>%
release("cardiologist")
env<-simmer() %>%
add_generator("patient", t0, function() .opt("interarrival_time")) %>%
add_resource("nurse", .opt("nr_nurses")) %>%
add_resource("cardiologist", .opt("nr_cardiologists"))
env
}
The objective in this case is the number of patients served
within 4 hours. The helper function
msr_arrivals_finished
will return this value for us.
We want to define two constraints, one constraint on the maximum
waiting time and a second one on the total incurred employee costs. For
the first one, the helper function assert_waiting_time_max
will help to assert that the maximum waiting time is satisfied. For the
second one we need to create a custom constraint evaluation function.
This custom constraint evaluation function is created below.
assert_within_budget<-function(envs, budget){
runtime_hrs <- msr_runtime(envs) / 60
number_nurses <- msr_resource_capacity(envs, "nurse")
number_cardiologists <- msr_resource_capacity(envs, "cardiologist")
(number_nurses * runtime_hrs * 50 +
number_cardiologists * runtime_hrs * 100 ) <= budget
}
Below, we use the above specifications to find the best combination
of parameters (nr_nurses
, nr_cardiologists
and
interarrival_time
) by leveraging the
grid_optim
optimization method and setting the
until
runtime to 4 hours.
set.seed(2016)
r <- simmer_optim(
model = sim_model,
method = grid_optim,
direction = "max",
objective = msr_arrivals_finished,
constraints = list(with_args(assert_within_budget, budget = 1000),
with_args(assert_avg_waiting_time_max, max_val = 30)),
params = list(nr_nurses = par_discrete(1:4),
nr_cardiologists = par_discrete(1:4),
interarrival_time = par_discrete(c(10, 15, 20, 25))
),
control = optim_control(run_args = list(until=4*60)))
r
## simmer.optim result
##
## method: grid_optim
## objective value: 18
## constraints satisfied: TRUE
##
## params:
## > nr_nurses : 1
## > nr_cardiologists : 2
## > interarrival_time : 10
We see from the output that, when we respect the constraints, the
maximum number of patients that can be served before the 4 hours are up
is 14. The output also shows the paramaters which generate this output
and which can be accessed through $params
.
## $nr_nurses
## [1] 1
##
## $nr_cardiologists
## [1] 2
##
## $interarrival_time
## [1] 10
Often you do want to optimize the parameters to for a specific run of
the simulation, but instead want to test a paramater combination on
n number of replications. We
can easily achieve this by specifying the replications
argument in the control object. In the below example we run each
parameter configuration 20 times.
set.seed(2016)
r <- simmer_optim(
model = sim_model,
method = grid_optim,
direction = "max",
objective = msr_arrivals_finished,
constraints = list(with_args(assert_within_budget, budget = 1000),
with_args(assert_avg_waiting_time_max, max_val = 30)),
params = list(nr_nurses = par_discrete(1:4),
nr_cardiologists = par_discrete(1:4),
interarrival_time = par_discrete(c(10, 15, 20, 25))
),
control = optim_control(run_args = list(until=4*60),
replications = 20))
r
## simmer.optim result
##
## method: grid_optim
## objective value: 19.2
## constraints satisfied: TRUE
##
## params:
## > nr_nurses : 1
## > nr_cardiologists : 2
## > interarrival_time : 10
As we can see, the results are different from our earlier findings. While the single run finds quite a good objective valuen, we see that when we run 20 replications of the model, the earlier found solutions was probably a very ‘optimisitic’ run, which will be achieved on most days.
Replications can become quite time consuming when run sequentially.
We you use optim_control(parallel = TRUE)
, the replications
will be run in parallel by leveraging
parallel::mclapply
.
For now, three different optimizers are available:
The grid optimization algorithm starts by building a grid with all the possible combination of the values you supply to it. In essence this comes down to an exhausive search of the parameter space. The upside of this is that it will always return the most optimum value, the downside is that it can take a long time to find this optimal value.
(see previous example)
Simulated annealing is a heuristic
method that tries to find a good solution. Finding the
optimum combination of parameters is however not guaranteed.
simmer.optim
leverages the GenSA
package to
perform simmulated annealing. Note that it is important to explicitely
pass par_discrete
type values if the value has to be
interpreted as an integer, this is due to the fact that a cardinality
constraint is applied to the parameter in the optimization procedure.
GenSA
works with lower and upper bounds on variables, so
passing e.g c(10, 20, 30)
will translate in a lower bound
of 10 and upper bound of 30, while allowing for all values in between
(converted to integer if supplied through
par_discrete()
).
simmer_optim(
model = sim_model,
method = sa_optim,
direction = "max",
objective = msr_arrivals_finished,
constraints = list(with_args(assert_within_budget, budget = 1000),
with_args(assert_avg_waiting_time_max, max_val = 30)),
params = list(nr_nurses = par_discrete(1:4),
nr_cardiologists = par_discrete(1:4),
interarrival_time = par_discrete(c(10:25))
),
control = optim_control(run_args = list(until=4*60),
sa_optim = list(maxit = 15)))
Differential evolition is a heuristic
method that tries to find a good solution. Finding the
optimum combination of parameters is however not guaranteed.
simmer.optim
leverages the RcppDE
package to
perform differential evolution. Note that it is important to explicitely
pass par_discrete
type values if the value has to be
interpreted as an integer, this is due to the fact that a cardinality
constraint is applied to the parameter in the optimization procedure.
GenSA
works with lower and upper bounds on variables, so
passing e.g c(10, 20, 30)
will translate in a lower bound
of 10 and upper bound of 30, while allowing for all values in between
(converted to integer if supplied through
par_discrete()
).
simmer_optim(
model = sim_model,
method = de_optim,
direction = "max",
objective = msr_arrivals_finished,
constraints = list(with_args(assert_within_budget, budget = 1000),
with_args(assert_avg_waiting_time_max, max_val = 30)),
params = list(nr_nurses = par_discrete(1:4),
nr_cardiologists = par_discrete(1:4),
interarrival_time = par_discrete(c(10:25))
),
control = optim_control(run_args = list(until=4*60),
de_optim = RcppDE::DEoptim.control(itermax = 10,
trace = T)))