Proposal

Summary:

The goal of this project is to build an AI that fights off Minecraft mobs. The AI should learn how to fight different types of mobs and stay alive for as long as possible. At certain time intervals we will vary the types of mobs being spawned and increase the difficulty. The inputs will be the grid view of the Minecraft world around the agent, along with its current health and inventory. If this works well and we want to take the project further, we might switch to giving the AI the player view instead of the grid view. The AI outputs where the player should move, whether it should jump, whether it should attack, and whether it should use any items from its inventory. The list of actions includes, but is not limited to, moving, rotating, attacking, and jumping. Although this AI has few applications outside of Minecraft itself, it could be used by players who want to stay logged in to a Minecraft server while they are away from their keyboard and have the AI make sure that they don't get killed.
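As a rough illustration of the inputs and outputs described above, the sketch below flattens a grid view, health, and inventory into one feature vector and lists a discrete action set. The names `ACTIONS`, `CELL_TYPES`, `Observation`, and `encode` are hypothetical, the cell and item vocabularies are placeholders, and health is scaled by Minecraft's default maximum of 20. It is only one possible encoding under those assumptions, not a settled design.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical discrete action set covering the actions named in the summary.
ACTIONS = ["move_forward", "move_backward", "turn_left", "turn_right",
           "jump", "attack", "use_item"]

# Hypothetical vocabulary for what a grid cell can contain.
CELL_TYPES = ["air", "solid", "zombie", "skeleton", "spawner"]

MAX_HEALTH = 20.0  # default player health in Minecraft (10 hearts)


@dataclass
class Observation:
    grid: List[str]       # flattened grid view around the agent
    health: float         # current health, 0..MAX_HEALTH
    inventory: List[str]  # names of items currently held


def encode(obs: Observation, item_vocab: List[str]) -> List[float]:
    """Turn an observation into a flat list of floats."""
    features: List[float] = []
    # One-hot encode every grid cell.
    for cell in obs.grid:
        features.extend(1.0 if cell == kind else 0.0 for kind in CELL_TYPES)
    # Health scaled to the range 0..1.
    features.append(obs.health / MAX_HEALTH)
    # One flag per known item: is it in the inventory?
    features.extend(1.0 if item in obs.inventory else 0.0 for item in item_vocab)
    return features
```

For example, a two-cell grid with a known item list of two entries yields a vector of length 2 * 5 + 1 + 2 = 13.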

AI/ML Algorithms:

We will use reinforcement learning; more specifically, an on-policy n-step Sarsa(lambda) algorithm with value function approximation.
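To make the algorithm choice concrete, here is a minimal sketch of Sarsa(lambda) with accumulating eligibility traces and a linear value function approximator. It shows the standard one-step Sarsa(lambda) update, which is one reading of the algorithm named above and not necessarily the exact variant we will end up with. The `Corridor` environment is a hypothetical toy stand-in for the Minecraft arena, and the hyperparameters (`alpha`, `gamma`, `lam`, `epsilon`) are illustrative guesses.

```python
import random


class Corridor:
    """Toy stand-in for the arena: cells 0..4, start in the middle."""

    def __init__(self):
        self.position = 2

    def features(self):
        # One-hot encoding of the current cell.
        return [1.0 if i == self.position else 0.0 for i in range(5)]

    def reset(self):
        self.position = 2
        return self.features()

    def step(self, action):
        # Action 0 moves left, action 1 moves right.
        self.position += 1 if action == 1 else -1
        if self.position == 4:
            return self.features(), 1.0, True    # reached the safe end
        if self.position == 0:
            return self.features(), -1.0, True   # reached the dangerous end
        return self.features(), 0.0, False


def q_value(weights, features, action):
    # Linear approximation: one weight vector per action.
    return sum(w * f for w, f in zip(weights[action], features))


def epsilon_greedy(weights, features, epsilon, rng):
    if rng.random() < epsilon:
        return rng.randrange(len(weights))
    values = [q_value(weights, features, a) for a in range(len(weights))]
    return values.index(max(values))


def run_episode(env, weights, rng, alpha=0.1, gamma=0.95, lam=0.8,
                epsilon=0.1, max_steps=200):
    traces = [[0.0] * len(row) for row in weights]
    features = env.reset()
    action = epsilon_greedy(weights, features, epsilon, rng)
    total_reward = 0.0
    for _ in range(max_steps):
        next_features, reward, done = env.step(action)
        total_reward += reward
        # On-policy TD target: bootstrap from the action actually chosen next.
        target = reward
        next_action = None
        if not done:
            next_action = epsilon_greedy(weights, next_features, epsilon, rng)
            target += gamma * q_value(weights, next_features, next_action)
        delta = target - q_value(weights, features, action)
        for a, row in enumerate(traces):
            for i in range(len(row)):
                row[i] *= gamma * lam            # decay every trace
                if a == action:
                    row[i] += features[i]        # accumulate for the taken action
                weights[a][i] += alpha * delta * row[i]
        if done:
            break
        features, action = next_features, next_action
    return total_reward


def train(episodes=300, seed=0):
    rng = random.Random(seed)
    env = Corridor()
    weights = [[0.0] * 5 for _ in range(2)]
    for _ in range(episodes):
        run_episode(env, weights, rng)
    return weights
```

Calling `train()` returns one weight vector per action. In the real project the one-hot corridor features would be replaced by features built from the grid view, health, and inventory.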

Notes:

Evaluation:

The metrics we will use to evaluate our model are survival time, damage taken, damage dealt, and mobs killed. We will begin with a baseline of 0 in which every metric is equally weighted, but as the project progresses this will most likely change as we see what works and what does not. Some example rewards and punishments would be +0.01 for each second of survival, -50 for dying, -5 for taking damage, +1 for dealing damage, +5 for killing a mob, and +10 for destroying a spawner. As for the data we will evaluate on: since we are doing reinforcement learning, each completed episode will be evaluated and our performance will be updated accordingly.
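The example rewards above could be kept in a single table so the weights are easy to retune. The sketch below assumes each reward is applied once per event (for instance, -5 per hit taken, not per point of health lost), which the proposal does not yet pin down. The event names and the `step_reward` helper are hypothetical.

```python
# Example reward weights taken from the paragraph above.
REWARDS = {
    "second_survived": 0.01,
    "death": -50.0,
    "damage_taken": -5.0,       # assumed to be per hit, not per health point
    "damage_dealt": 1.0,        # assumed to be per hit landed
    "mob_killed": 5.0,
    "spawner_destroyed": 10.0,
}


def step_reward(events):
    """Sum the reward for a dict mapping event name -> number of occurrences."""
    return sum(REWARDS[name] * count for name, count in events.items())
```

For example, surviving 100 seconds, taking 2 hits, and killing 1 mob gives 100 * 0.01 - 2 * 5 + 5 = -4, which shows how strongly the damage penalty dominates the survival bonus at these weights.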

There are many qualitative ways to analyze our project and validate whether we are reaching the intended behavior. One simple way is to plot the number of training episodes against survival time or, similarly, the number of episodes against reward. In both cases, our model is working as long as the reward or survival time increases with the number of episodes. Sanity cases are also very useful here. One example of a sanity case would be whether our model can fend off mobs of a single type, such as zombies. Success on this sanity case would be measured by the agent's ability to eliminate the mobs or to run away without dying. Our moonshot goal is for the model to handle 4 mob spawners, each producing a different type of mob, at the same time.
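Per-episode reward and survival time tend to be noisy, so the trend is easier to see after smoothing. The sketch below assumes we log one number per episode; `moving_average` and `is_improving` are hypothetical helpers, and comparing the first and last windows is only a crude check, not a statistical test.

```python
def moving_average(values, window):
    """Trailing moving average; early entries average over what is available."""
    if window <= 0:
        raise ValueError("window must be positive")
    smoothed = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]   # drop the value leaving the window
        smoothed.append(running / min(i + 1, window))
    return smoothed


def is_improving(values, window):
    """Compare the mean of the last `window` episodes with the first `window`."""
    if len(values) < 2 * window:
        raise ValueError("need at least 2 * window episodes")
    first = sum(values[:window]) / window
    last = sum(values[-window:]) / window
    return last > first
```

The smoothed series can then be passed to any plotting library to produce the episodes vs. reward and episodes vs. survival time curves.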

Appointment:

Scheduled for May 2nd at 1:30 PM