  1. Complete the Get Started tutorial.

  2. Download the tutorial files.

Simulation Overview

This model is based on AnyLogic's Interconnected Call Centers example. Calls arrive at each of five interconnected call centers simultaneously. When a call is received, the call center decides either to accept it or to transfer it to another call center. A caller balks (abandons the call) when their wait time exceeds a randomly initialized threshold of between 20 and 25 minutes.
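
The balking rule can be sketched in a few lines of Python. This is an illustration only, not code from the AnyLogic model: the `Caller` class, its field names, and the uniform distribution are assumptions. The tutorial states only that the threshold is randomly initialized between 20 and 25 minutes.

```python
import random
from dataclasses import dataclass


def draw_patience_minutes(rng: random.Random) -> float:
    # Assumption: a uniform draw. The tutorial only gives the 20-25 minute range.
    return rng.uniform(20.0, 25.0)


@dataclass
class Caller:
    arrival_minute: float
    patience_minutes: float

    def has_balked(self, now_minute: float) -> bool:
        # The caller balks once the time spent waiting exceeds the threshold.
        return (now_minute - self.arrival_minute) > self.patience_minutes


rng = random.Random(0)
caller = Caller(arrival_minute=0.0, patience_minutes=draw_patience_minutes(rng))
print(caller.has_balked(10.0))  # 10 minutes is below every threshold in the range
print(caller.has_balked(30.0))  # 30 minutes is above every threshold in the range
```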

The rate of incoming calls is determined by a schedule. Each call center will randomly select a schedule.

The schedule determines how many calls will be received by each call center in any given hour.
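
As a rough sketch of schedule-driven arrivals, the snippet below assigns a random schedule to each of the five call centers and turns one schedule into call arrival times. The schedule names and hourly values are made up for illustration (the real schedules ship with the AnyLogic model), and spreading calls uniformly within each hour is an assumption.

```python
import random

# Hypothetical schedules: calls per hour over a 12-hour day (placeholder values).
SCHEDULES = {
    "morning_peak": [60, 80, 90, 70, 50, 40, 40, 40, 30, 30, 20, 20],
    "evening_peak": [20, 20, 30, 30, 40, 40, 50, 70, 90, 80, 60, 40],
}


def pick_schedule(rng: random.Random) -> str:
    # Each call center selects one schedule at random.
    return rng.choice(sorted(SCHEDULES))


def arrival_times_minutes(schedule: list[int], rng: random.Random) -> list[float]:
    # Assumption: each hour's calls are spread uniformly within that hour.
    times = []
    for hour, calls in enumerate(schedule):
        times.extend(hour * 60 + rng.uniform(0, 60) for _ in range(calls))
    return sorted(times)


rng = random.Random(1)
assignment = {center: pick_schedule(rng) for center in range(5)}
arrivals = arrival_times_minutes(SCHEDULES[assignment[0]], rng)
print(assignment[0], len(arrivals))
```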

We compare the Pathmind policy to three call routing heuristics:

  1. No Call Transferring - All calls are processed by the original call center.

  2. Shortest Queue - Calls are routed to the call center with the shortest queue.

  3. Most Efficient Call Center - Calls are routed to the call center that can process the call the fastest.
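
The three heuristics can be written as small routing functions. This is a minimal sketch, not the model's implementation: the `CallCenter` class and the `expected_handling_minutes` field are hypothetical stand-ins for whatever the model uses to measure queue length and processing speed.

```python
from dataclasses import dataclass


@dataclass
class CallCenter:
    index: int
    queue_length: int
    expected_handling_minutes: float  # hypothetical measure of processing speed


def no_transfer(origin: int, centers: list[CallCenter]) -> int:
    # Heuristic 1: the call stays at the center where it arrived.
    return origin


def shortest_queue(origin: int, centers: list[CallCenter]) -> int:
    # Heuristic 2: route to the center with the fewest waiting callers.
    return min(centers, key=lambda c: c.queue_length).index


def most_efficient(origin: int, centers: list[CallCenter]) -> int:
    # Heuristic 3: route to the center expected to process the call fastest.
    return min(centers, key=lambda c: c.expected_handling_minutes).index


centers = [
    CallCenter(0, queue_length=4, expected_handling_minutes=6.0),
    CallCenter(1, queue_length=1, expected_handling_minutes=8.0),
    CallCenter(2, queue_length=3, expected_handling_minutes=5.0),
    CallCenter(3, queue_length=5, expected_handling_minutes=7.0),
    CallCenter(4, queue_length=2, expected_handling_minutes=9.0),
]
print(no_transfer(0, centers), shortest_queue(0, centers), most_efficient(0, centers))
```

Each function returns the index of the call center that should handle the call, so a heuristic and a trained policy can be swapped in behind the same interface.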

The objective is to minimize both wait times and the number of balked callers. The Pathmind policy outperforms the best-performing heuristic (shortest queue), shortening wait times by 9.6%.

Pathmind Improvement Over Shortest Queue Heuristic

Wait Times - 9.6% shorter wait times

Balked Callers - 3.7% fewer balked callers

Operator Utilization - 0.4% higher operator utilization
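
Assuming these percentages are relative changes measured against the shortest queue baseline (the tutorial does not spell out the formula), the calculation would look like the sketch below. The two wait times are hypothetical values chosen only to illustrate the arithmetic. They are not results from the model.

```python
def relative_reduction_percent(baseline: float, candidate: float) -> float:
    # A positive result means the candidate is lower than the baseline.
    return (baseline - candidate) / baseline * 100.0


# Hypothetical mean wait times in minutes, for illustration only.
shortest_queue_wait = 10.0
policy_wait = 9.04
reduction = relative_reduction_percent(shortest_queue_wait, policy_wait)
print(f"{reduction:.1f}% shorter wait times")
```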


Step 1 - Run the simulation to check Pathmind Helper setup.

Go through the steps of the Check Pathmind Helper Setup guide to make sure that everything is working correctly. Completing this step will also demonstrate how the model performs using random actions instead of a policy.

Step 2 - Examine the Pathmind properties.

Observations - The policy is given 37 observations in total.

  • currentCall - The call center in which the current call originated. This is a value between 0 and 4, corresponding to one of the five call centers.

  • callCenters - The number of busy and idle operators at each call center as well as the size of the queue.

  • links - The link utilization. This is important to track because each link has a different capacity.

  • time - The hour and minute of the day.
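
To make the observation list concrete, here is a sketch of how the four groups might be flattened into a single vector. The class, function, and field names are hypothetical, and the tutorial does not give the per-group breakdown of the 37 values, so the number of links used here is a placeholder and the resulting length is not meant to match 37.

```python
from dataclasses import dataclass


@dataclass
class CenterState:
    busy_operators: int
    idle_operators: int
    queue_size: int


def build_observation(current_call: int, centers: list[CenterState],
                      link_utilization: list[float],
                      hour: int, minute: int) -> list[float]:
    obs = [float(current_call)]               # currentCall: origin index, 0 to 4
    for c in centers:                         # callCenters: three values per center
        obs.extend([float(c.busy_operators), float(c.idle_operators),
                    float(c.queue_size)])
    obs.extend(link_utilization)              # links: one value per link
    obs.extend([float(hour), float(minute)])  # time: hour and minute of the day
    return obs


centers = [CenterState(busy_operators=8, idle_operators=2, queue_size=3)] * 5
links = [0.5, 0.1, 0.9, 0.0]  # placeholder: the real number of links is not stated
obs = build_observation(3, centers, links, hour=9, minute=30)
print(len(obs))
```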

Metrics - The metrics in this model track the mean wait time, the number of balked callers, and the ratio of balked calls to total calls.
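
The three metrics could be computed roughly as follows. This is a sketch with hypothetical names. The model itself tracks these values inside AnyLogic, and whether balked callers' waits count toward the mean is not stated in the tutorial.

```python
def summarize_metrics(wait_minutes: list[float], balked: int,
                      total_calls: int) -> dict[str, float]:
    # Guard against division by zero before any calls have been handled.
    mean_wait = sum(wait_minutes) / len(wait_minutes) if wait_minutes else 0.0
    balk_ratio = balked / total_calls if total_calls else 0.0
    return {
        "mean_wait_minutes": mean_wait,
        "balked_callers": float(balked),
        "balked_ratio": balk_ratio,
    }


metrics = summarize_metrics([2.0, 4.0, 6.0], balked=1, total_calls=4)
print(metrics)
```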

Actions - This model contains one decision point with five possible actions, one for each of the five call centers. The action tells the call center where to transfer each call. Note that a call center can transfer a call to itself, which essentially means "do not transfer the caller".
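
The meaning of an action can be sketched like this. The function name and return shape are hypothetical. The only behavior taken from the tutorial is that there are five actions and that choosing the originating call center means the call is not transferred.

```python
NUM_CALL_CENTERS = 5


def apply_action(origin: int, action: int) -> tuple[int, bool]:
    # The action is the index of the call center that should take the call.
    if not 0 <= action < NUM_CALL_CENTERS:
        raise ValueError(f"action must be in 0..{NUM_CALL_CENTERS - 1}")
    # Choosing the originating center means "do not transfer the caller".
    transferred = action != origin
    return action, transferred


print(apply_action(origin=2, action=2))  # stays at call center 2
print(apply_action(origin=2, action=4))  # transferred to call center 4
```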

Event Trigger - An action is triggered for each call to decide where the caller should be transferred.

Important Note

In this model, you may notice a lot of additional code inside the "event trigger" above. This is a trick we employ to help the reinforcement learning policy learn more efficiently in cases where the number of event triggers is large. For the purposes of this tutorial, you may ignore this code as well as the items highlighted below.

Done - The simulation is set to end after 12 hours to simulate a 12-hour workday.

Step 3 - Export model and get Pathmind policy.

Complete the steps in the Exporting Models and Training guide to export your model, complete training, and download the Pathmind policy.

Reward Function

After importing the Interconnected Call Centers simulation into Pathmind, you'll be prompted to specify a goal. Set your goal to minimize aMeanWaitTimes.

Click Next. A suggested reward function will automatically populate. Now click Train Policy.

A policy will be generated after training completes. A trained policy file is included in the tutorial folder for your convenience.
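
To build intuition for what a "minimize aMeanWaitTimes" goal means, the sketch below rewards the policy when the mean wait time drops between two consecutive decision points. This is an assumption about the general idea, written in Python for illustration. It is not the reward function Pathmind generates, and the function name is hypothetical.

```python
def step_reward(before_mean_wait: float, after_mean_wait: float) -> float:
    # Minimizing: the reward is the negative change in mean wait time,
    # so a decrease is rewarded and an increase is penalized.
    return -(after_mean_wait - before_mean_wait)


print(step_reward(5.0, 4.0))  # mean wait dropped by one minute
print(step_reward(4.0, 5.5))  # mean wait rose by a minute and a half
```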

Step 4 - Run the simulation with the Pathmind policy.

Once you’ve downloaded the Pathmind policy, return to AnyLogic. Open the Pathmind Helper properties and change the Mode radio button to Use Policy. Click Browse and locate the downloaded policy file.

Now run the included Monte Carlo experiment and compare the Pathmind policy to the heuristics. You will see that the policy is able to reduce mean wait time and total balked calls by more than 13%.


The Interconnected Call Centers example represents a real-life use case in which you can leverage reinforcement learning to optimize call center operations.
