
Your Q-learning agent in its environment

Let's talk about the self-driving taxi agent that we'll be building. Recall that the Taxi-v2 environment has 500 states, and 6 possible actions that can be taken from each state.
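To see where the 500 states come from, here is a minimal sketch that assumes the standard Taxi layout: a 5x5 grid, five possible passenger locations (four landmarks plus "in the taxi"), and four destinations. The function names `encode_state` and `decode_state` are illustrative, not part of the Gym API, although the arithmetic follows the same pattern the environment uses to number its states.

```python
# Assumed layout of the Taxi environment (illustrative constants).
GRID_ROWS, GRID_COLS = 5, 5
PASSENGER_LOCATIONS = 5   # four landmarks + "passenger is in the taxi"
DESTINATIONS = 4
ACTIONS = ["south", "north", "east", "west", "pickup", "dropoff"]

N_STATES = GRID_ROWS * GRID_COLS * PASSENGER_LOCATIONS * DESTINATIONS
N_ACTIONS = len(ACTIONS)


def encode_state(taxi_row, taxi_col, passenger_loc, destination):
    """Pack the four state components into a single integer index."""
    state = taxi_row
    state = state * GRID_COLS + taxi_col
    state = state * PASSENGER_LOCATIONS + passenger_loc
    state = state * DESTINATIONS + destination
    return state


def decode_state(state):
    """Unpack an integer index back into its four components."""
    state, destination = divmod(state, DESTINATIONS)
    state, passenger_loc = divmod(state, PASSENGER_LOCATIONS)
    taxi_row, taxi_col = divmod(state, GRID_COLS)
    return taxi_row, taxi_col, passenger_loc, destination


print(N_STATES, N_ACTIONS)          # 500 6
print(encode_state(4, 4, 4, 3))     # 499, the largest state index
```

Each state index therefore identifies one combination of taxi position, passenger location, and destination, which is what lets a Q-table with 500 rows and 6 columns cover the whole problem.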

Your objective in the taxi environment is to pick up a passenger at one location, and drop them off at their desired destination in as few timesteps as possible.

You receive points for a successful drop-off and lose points for every timestep the task takes. You also lose points for incorrect actions, such as dropping a passenger off at the wrong location.

Because your goal is to reach both the pickup and drop-off locations as quickly as possible, every timestep you spend costs you one point.
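The reward scheme described above can be sketched as a small scoring function. The numeric values here are the ones documented for Gym's Taxi environment (+20 for a successful drop-off, -10 for an illegal pickup or drop-off, -1 for any other step); the outcome labels and function names are hypothetical and exist only for this illustration.

```python
# Reward values as documented for Gym's Taxi environment.
STEP_REWARD = -1             # any ordinary timestep
ILLEGAL_ACTION_REWARD = -10  # pickup/drop-off attempted at the wrong place
DROPOFF_REWARD = 20          # passenger delivered to the destination

# Hypothetical outcome labels used only in this sketch.
REWARDS = {
    "step": STEP_REWARD,
    "illegal": ILLEGAL_ACTION_REWARD,
    "delivered": DROPOFF_REWARD,
}


def episode_return(outcomes):
    """Sum the rewards for a sequence of per-timestep outcomes."""
    return sum(REWARDS[outcome] for outcome in outcomes)


# Ten ordinary steps followed by a successful drop-off.
efficient = ["step"] * 10 + ["delivered"]
# The same trip with one illegal drop-off attempt along the way.
sloppy = ["step"] * 5 + ["illegal"] + ["step"] * 5 + ["delivered"]

print(episode_return(efficient))  # 10
print(episode_return(sloppy))     # 0
```

The comparison shows why both speed and accuracy matter: a single illegal action wipes out the entire profit of an otherwise efficient trip.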

Your agent's goal in solving this problem is to find the optimal policy for getting the passenger to their destination as efficiently as possible, netting the maximum reward for itself. While it navigates the environment, it will learn the best action to take from each state, which will serve as its policy function.
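As a rough illustration of how "the best action from each state" becomes a policy, the sketch below reads a greedy policy off a table of action values. The toy table has only three states and three actions, and its values are made up for the example; the real Taxi table would have 500 rows and 6 columns.

```python
def greedy_policy(q_table):
    """Return, for each state, the index of its highest-valued action."""
    return [max(range(len(row)), key=row.__getitem__) for row in q_table]


# Hypothetical learned values: one row per state, one column per action.
toy_q_table = [
    [0.0, 1.5, -2.0],
    [0.3, 0.1, 0.9],
    [-1.0, -0.5, -3.0],
]

policy = greedy_policy(toy_q_table)
print(policy)  # [1, 2, 1]
```

The policy is never stored separately during learning; it is simply whatever the current table implies when you pick the highest-valued action in each row.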

Remember that because Q-learning is a value-based, off-policy method, its updates do not depend on the policy your agent is actually following, and we will not explicitly enumerate that policy. Instead, the Q-learning algorithm estimates the value of each state-action pair from the highest-valued action available in the next state, thereby assuming that your agent acts optimally from that point on.
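A minimal sketch of that update, assuming the standard tabular rule Q(s, a) ← Q(s, a) + α [r + γ max Q(s', a') − Q(s, a)], looks like this. The learning rate `alpha`, the discount factor `gamma`, and their values are assumptions chosen for the example, and handling of terminal states is left out for brevity.

```python
N_STATES, N_ACTIONS = 500, 6


def q_learning_update(q_table, state, action, reward, next_state,
                      alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value."""
    # Value of the best action in the next state, regardless of which
    # action the agent will actually take there.
    best_next = max(q_table[next_state])
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]


# Start with every state-action value at zero.
q_table = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

# Pretend the agent has already learned that action 2 is worth 5.0 in state 1.
q_table[1][2] = 5.0

# The agent takes action 1 in state 0, pays the step cost, and lands in state 1.
# Target = -1 + 0.9 * 5.0 = 3.5, so the entry moves halfway from 0.0 toward 3.5.
new_value = q_learning_update(q_table, state=0, action=1, reward=-1, next_state=1)
print(new_value)
```

The call to `max` is where the assumption of optimal behavior enters: the target uses the best available next action even if the agent goes on to explore a different one.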

We will continue to explore this concept in more detail with the functions that you will write for your agent. The OpenAI Gym package that we will use will provide the game environment, and you will implement the Q-learning algorithm yourself. You can then use the same environment to implement other RL algorithms and compare their performance.
