The first set of results shows that as the learning rate increases, the action with the maximum Q-value changes. The preferred action shifts from the one with no risk to actions that give better results but also carry more risk. The shift stops at the point where the risk would become too high.
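To make the role of the learning rate concrete, here is a minimal sketch of the standard tabular Q-learning update. It assumes the experiment uses ordinary Q-learning, which the text does not state. The function names, the discount factor `gamma=0.9`, and the example numbers are illustrative assumptions, not values from the experiment.

```python
def q_update(q_sa, reward, max_next_q, alpha, gamma=0.9):
    """One tabular Q-learning update for a single (state, action) pair.

    alpha is the learning rate; gamma is an assumed discount factor.
    """
    td_target = reward + gamma * max_next_q  # bootstrapped estimate of the return
    return q_sa + alpha * (td_target - q_sa)  # move the old estimate toward the target


def greedy_action(q_values):
    """Index of the action with the maximum Q-value (ties go to the lowest index)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Hypothetical numbers: the same reward moves the estimate further when alpha is larger.
small_step = q_update(q_sa=0.0, reward=1.0, max_next_q=0.0, alpha=0.1)
large_step = q_update(q_sa=0.0, reward=1.0, max_next_q=0.0, alpha=0.5)
```

With a larger `alpha`, each observed reward moves the Q-value further in a single step, so the ranking of actions by Q-value, and therefore the action returned by `greedy_action`, can change sooner. This is consistent with the behaviour described above, although the sketch does not reproduce the reported results.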