About

More Information

Come over to USC and find us.

Summary

To sum up, the Pommerman project gave us our best lesson and experience yet in studying deep reinforcement learning. We studied a number of last year's competition approaches as well as reinforcement learning research papers, tried implementing some of them, and then learned to train an agent of our own from scratch. We overcame many challenges in order to create a capable agent.

Pommerman is not a game simple enough to train end-to-end; it required us to design both a curriculum and engineered rewards so the agent behaves the way we want without overfitting to a specific environment setup. We started guiding our agent on a small board, teaching it to find and kill a static enemy that spawns randomly across the board; next we made it learn to chase and capture an enemy that only moves randomly and places no bombs. Adding more wooden boxes and rigid boxes taught the agent which objects it can destroy and which it cannot. Lastly, we trained it to win against an enemy that fights back.

Another way we improved learning was to build a network better suited to the observation space, increasing the number of layers and hidden neurons to make the model more powerful and faster to learn. Here we added extra CNN layers to match the matrix structure of the game board.
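One common way to feed a board matrix into CNN layers is to one-hot encode each cell type into its own feature plane; the sketch below shows that encoding and the standard conv output-size formula (with kernel 3, stride 1, padding 1 the 11x11 board shape is preserved). The plane names and the 11x11 size are assumptions based on the standard Pommerman board, not our exact architecture.

```python
# Illustrative encoding of a board observation into stacked feature planes
# for convolutional layers; plane indices here are hypothetical.
BOARD = 11
PLANES = {"passage": 0, "rigid": 1, "wood": 2, "bomb": 3, "enemy": 4, "self": 5}

def to_planes(board):
    """One-hot encode an 11x11 grid of cell names into a C x 11 x 11
    tensor (as nested lists), one binary plane per cell type."""
    planes = [[[0.0] * BOARD for _ in range(BOARD)] for _ in PLANES]
    for r in range(BOARD):
        for c in range(BOARD):
            planes[PLANES[board[r][c]]][r][c] = 1.0
    return planes

def conv_out(size, kernel=3, stride=1, pad=1):
    """Spatial output size of one conv layer: (size + 2*pad - kernel)//stride + 1."""
    return (size + 2 * pad - kernel) // stride + 1
```

Keeping the spatial size constant through the extra conv layers lets the network stack depth without shrinking the board representation.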

There are many more approaches we wanted to try but could not due to lack of time. Imitation learning is one method we discovered that has great potential to speed up training.
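The imitation-learning idea can be sketched as behavior cloning: log (observation, action) pairs from an expert (for example a scripted heuristic agent; the expert here is a stand-in) and fit a policy to predict the expert's actions. The toy tabular policy below is only a sketch of the supervised step; a real agent would train a network on these pairs.

```python
# Minimal behavior-cloning sketch; the expert and states are placeholders.
from collections import Counter, defaultdict

def collect_demos(expert, states):
    """Roll the expert over observed states, recording (state, action) pairs."""
    return [(s, expert(s)) for s in states]

def fit_tabular_policy(demos):
    """Toy supervised step: majority-vote expert action per state."""
    counts = defaultdict(Counter)
    for state, action in demos:
        counts[state][action] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}

# Usage: clone a trivial expert that always moves right (hypothetical).
expert = lambda s: "right"
policy = fit_tabular_policy(collect_demos(expert, ["s0", "s1", "s0"]))
```

Warm-starting an RL agent from such demonstrations could skip much of the early random-exploration phase that made our from-scratch training slow.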