When you run your Pathmind policy back in AnyLogic, you may notice that results vary from one run to the next. This is normal and expected: randomness is part of a policy's strategy to avoid getting "stuck".

Policy Stochasticity

Training

Randomness during training helps the policy explore all avenues for improvement, regardless of whether your simulation is stochastic or deterministic. During training, this is always beneficial.

Serving

When executing a policy, randomness can be good or bad depending on the nature of your simulation.

For simulations with stochastic elements, this is perfectly fine: the policy will still yield desirable results. For completely deterministic simulations, however, the policy's randomness means that otherwise identical runs can produce different outcomes, which is usually undesirable. In this scenario, you will want to "freeze" your policy so that it is deterministic as well.
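
To make the difference concrete, here is a minimal Python sketch of the two serving modes. It is an illustration only, not Pathmind's implementation: it assumes that a policy outputs a probability for each action and that "freezing" amounts to always taking the most probable action, which is one common way to make a stochastic policy deterministic. The action names, probabilities, and function names are all hypothetical.

```python
import random

# Hypothetical action probabilities a policy might output for one observation.
ACTION_PROBS = {"left": 0.6, "right": 0.3, "wait": 0.1}


def sample_action(probs, rng):
    """Stochastic serving: draw an action in proportion to its probability."""
    actions = list(probs)
    weights = [probs[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


def frozen_action(probs):
    """Frozen serving (assumed behavior): always return the most probable action."""
    return max(probs, key=probs.get)


if __name__ == "__main__":
    rng = random.Random()
    # Repeated calls on the same observation can return different actions.
    print([sample_action(ACTION_PROBS, rng) for _ in range(5)])
    # Repeated calls on the same observation always return the same action.
    print([frozen_action(ACTION_PROBS) for _ in range(5)])
```

In the sampled mode, the same observation can lead to different actions, so a deterministic simulation stops being repeatable. In the frozen mode, the same observation always maps to the same action, so a deterministic simulation stays deterministic from run to run.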

Policy Freezing

To freeze your policy, please share your experiment URL with Pathmind support. Keep in mind that freezing is only beneficial if your simulation is deterministic.
