After exporting your AnyLogic simulation to Pathmind, you will be asked to write a reward function. A reward function scores the outcomes of the agent's actions, which tells the reinforcement learning agent which behavior to learn.

Getting Started

As a starting point, we recommend that you try two things.

Recommendation 1

Simply maximize or minimize a metric that you care about to see how well the policy learns.

Maximizing or minimizing a metric is usually enough to achieve your business objectives.
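To make the idea concrete, here is a minimal standalone Java sketch of a reward that maximizes one metric and a reward that minimizes another. It is an illustration only, not Pathmind's reward editor syntax: the class RewardSketch, the Metrics snapshot, and the throughput and waitTime metrics are all hypothetical names chosen for this example. The sketch assumes the reward is computed from the change in a metric between two consecutive observations.

```java
// Hypothetical sketch: RewardSketch, Metrics, throughput, and waitTime
// are illustrative names, not part of Pathmind or AnyLogic.
class RewardSketch {

    /** Snapshot of simulation metrics at one observation (assumed structure). */
    static final class Metrics {
        final double throughput; // a metric we want to maximize
        final double waitTime;   // a metric we want to minimize

        Metrics(double throughput, double waitTime) {
            this.throughput = throughput;
            this.waitTime = waitTime;
        }
    }

    /** Maximize: reward the increase in the metric since the last observation. */
    static double maximizeThroughput(Metrics before, Metrics after) {
        return after.throughput - before.throughput;
    }

    /** Minimize: negate the change so that a decrease earns a positive reward. */
    static double minimizeWaitTime(Metrics before, Metrics after) {
        return -(after.waitTime - before.waitTime);
    }

    public static void main(String[] args) {
        Metrics before = new Metrics(10.0, 4.0);
        Metrics after = new Metrics(12.0, 5.5);

        System.out.println(maximizeThroughput(before, after)); // prints 2.0
        System.out.println(minimizeWaitTime(before, after));   // prints -1.5
    }
}
```

In this sketch, throughput rose by 2.0, so the maximizing reward is positive, while wait time rose by 1.5, so the minimizing reward is negative.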

Recommendation 2

Test each reward metric independently as a sanity check. For example, first train a policy using only "reward0".

Then train a second policy using only "reward1".
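The following standalone Java sketch illustrates the single-metric sanity check described above. It assumes, purely for illustration, that "reward0" and "reward1" are the first two entries of an array of reward metrics and that each experiment rewards the change in exactly one of them. The class SingleRewardSketch and the method rewardUsingOnly are hypothetical names, and Pathmind's actual reward syntax may differ.

```java
// Hypothetical sketch: SingleRewardSketch and rewardUsingOnly are
// illustrative names. Index 0 stands in for "reward0", index 1 for "reward1".
class SingleRewardSketch {

    /**
     * Reward based on one metric only: the change in the metric at the
     * given index between two consecutive observations.
     */
    static double rewardUsingOnly(int index, double[] before, double[] after) {
        return after[index] - before[index];
    }

    public static void main(String[] args) {
        double[] before = {3.0, 8.0}; // {reward0, reward1} at the previous step
        double[] after = {5.0, 7.0};  // {reward0, reward1} at the current step

        // Experiment A: train with "reward0" only.
        System.out.println(rewardUsingOnly(0, before, after)); // prints 2.0

        // Experiment B: train with "reward1" only.
        System.out.println(rewardUsingOnly(1, before, after)); // prints -1.0
    }
}
```

Keeping each experiment to a single metric makes it easier to attribute a change in the policy's behavior to that metric alone.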

In Pathmind, you may run as many experiments in parallel as you'd like by clicking the + New Experiment button.

You can leverage this feature to quickly test the impact of each individual reward on the policy's behavior.

Start Training

Once you provide a reward function, click the Train Policy button to start training.
