The disadvantage of this method is that it repeatedly selects the action that has given the best results so far and rarely tries a new action.
New actions go untried because there is not enough data on the other alternatives to show that any of them might be better.
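The lock-in described here can be illustrated with a small sketch. It assumes the method is a purely greedy rule over running average rewards, which the text does not spell out. The names `run_greedy` and `sample_reward` and the reward values are hypothetical, chosen only for illustration.

```python
def sample_reward(action, pull_index):
    """Hypothetical reward source (an assumption, not from the text).

    Action 0 always pays 0.6. Action 1 pays 0.0 on its first pull and
    1.0 on every later pull, so it is better in the long run.
    """
    if action == 0:
        return 0.6
    return 0.0 if pull_index == 0 else 1.0


def run_greedy(reward_fn, n_actions, steps):
    """Try each action once, then always pick the best running average."""
    counts = [0] * n_actions        # how often each action was selected
    estimates = [0.0] * n_actions   # running average reward per action

    for t in range(steps):
        if t < n_actions:
            action = t  # one initial trial per action
        else:
            # Greedy choice: highest estimate so far (ties go to the lowest index).
            action = max(range(n_actions), key=lambda a: estimates[a])

        reward = reward_fn(action, counts[action])
        counts[action] += 1
        # Incremental update of the running average.
        estimates[action] += (reward - estimates[action]) / counts[action]

    return counts, estimates


counts, estimates = run_greedy(sample_reward, n_actions=2, steps=100)
print(counts)  # action 1 is tried once and never again
```

In this sketch a single unlucky first trial is all the data the rule ever collects about action 1, so the better alternative is never revisited.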